edu.berkeley.guir.ptk.util
Class PTKWebPageParser

java.lang.Object
  extended by edu.berkeley.guir.ptk.util.PTKWebPageParser

public class PTKWebPageParser
extends java.lang.Object

This is a PTK input library class that enables Web page parsing. You can get text from any public Web page using this parser.

Author:
tmatthew

Constructor Summary
PTKWebPageParser(java.lang.String url)
           
 
Method Summary
 java.lang.String findFirstString(java.lang.String left, java.lang.String right, java.lang.String after)
          Returns the first string occurring between the left and right markers, after the string "after"
 java.lang.String[] findStrings(java.lang.String left, java.lang.String right, java.lang.String after, java.lang.String before)
          Returns all strings occurring between the left and right markers, after the string "after" but before the string "before"
static void main(java.lang.String[] args)
           
 java.lang.String urlToString()
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

PTKWebPageParser

public PTKWebPageParser(java.lang.String url)
Method Detail

findStrings

public java.lang.String[] findStrings(java.lang.String left,
                                      java.lang.String right,
                                      java.lang.String after,
                                      java.lang.String before)
                               throws PTKWebPageParserException
Returns all strings occurring between the left and right markers, after the string "after" but before the string "before"

Parameters:
left - text bounding the left side of the returned text
right - text bounding the right side of the returned text
after - string after which to look for return strings (can be null)
before - string before which to look for return strings
Throws:
PTKWebPageParserException
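
The marker semantics described above can be illustrated with a small stand-alone sketch. It is not the PTK implementation: the class name MarkerExtractSketch, the extra text parameter (standing in for the downloaded page), and the choices to return an empty array when "after" is missing and to ignore a missing "before" are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of PTK: mimics the documented behavior of
// findStrings on a page that has already been loaded into a String.
final class MarkerExtractSketch {

    static String[] findStrings(String text, String left, String right,
                                String after, String before) {
        if (left.isEmpty() || right.isEmpty()) {
            throw new IllegalArgumentException("markers must be non-empty");
        }

        // Search begins just past the first occurrence of "after" (if given).
        int start = 0;
        if (after != null) {
            int a = text.indexOf(after);
            if (a < 0) {
                return new String[0]; // assumption: no "after" means no results
            }
            start = a + after.length();
        }

        // Search ends at the first occurrence of "before" past start (if given).
        int end = text.length();
        if (before != null) {
            int b = text.indexOf(before, start);
            if (b >= 0) {
                end = b; // assumption: a missing "before" is ignored
            }
        }

        List<String> found = new ArrayList<>();
        int pos = start;
        while (true) {
            int l = text.indexOf(left, pos);
            if (l < 0) {
                break;
            }
            int contentStart = l + left.length();
            int r = text.indexOf(right, contentStart);
            // Stop once a match would run past the "before" boundary.
            if (r < 0 || r + right.length() > end) {
                break;
            }
            found.add(text.substring(contentStart, r));
            pos = r + right.length();
        }
        return found.toArray(new String[0]);
    }
}
```

With the page text "<h1>News</h1><li>alpha</li><li>beta</li><hr><li>gamma</li>", left "<li>", right "</li>", after "</h1>" and before "<hr>", this sketch yields the two strings alpha and beta; passing null for both after and before yields all three items.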

findFirstString

public java.lang.String findFirstString(java.lang.String left,
                                        java.lang.String right,
                                        java.lang.String after)
                                 throws PTKWebPageParserException
Returns the first string occurring between the left and right markers, after the string "after"

Parameters:
left - text bounding the left side of the returned text
right - text bounding the right side of the returned text
after - string after which to look for return strings (can be null)
Throws:
PTKWebPageParserException
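
A first-match lookup of this kind can also be sketched with java.util.regex. This is an illustration under stated assumptions, not the PTK code: the class FirstMatchSketch, the text parameter, and the use of an empty Optional for "not found" are hypothetical. The real method declares PTKWebPageParserException, and this page does not say how it reports a missing match.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of PTK: returns the first text found
// between two literal markers, optionally searching only after "after".
final class FirstMatchSketch {

    static Optional<String> findFirst(String text, String left, String right,
                                      String after) {
        int start = 0;
        if (after != null) {
            int a = text.indexOf(after);
            if (a < 0) {
                return Optional.empty(); // assumption: no "after" means no match
            }
            start = a + after.length();
        }

        // Pattern.quote treats the markers as literals; the reluctant (.*?)
        // stops at the nearest right marker; DOTALL lets matches span lines.
        Pattern p = Pattern.compile(
                Pattern.quote(left) + "(.*?)" + Pattern.quote(right),
                Pattern.DOTALL);
        Matcher m = p.matcher(text);
        return m.find(start) ? Optional.of(m.group(1)) : Optional.empty();
    }
}
```

Quoting the markers matters for Web pages, since markers such as "(" or "?" would otherwise be read as regular-expression syntax.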

urlToString

public java.lang.String urlToString()
                             throws PTKWebPageParserException
Throws:
PTKWebPageParserException

main

public static void main(java.lang.String[] args)