edu.berkeley.guir.ptk.util
Class PTKWebPageParser
java.lang.Object
edu.berkeley.guir.ptk.util.PTKWebPageParser
- public class PTKWebPageParser
- extends java.lang.Object
This is a PTK input library class that enables Web page parsing. You can get
text from any public Web page using this parser.
- Author:
- tmatthew
Method Summary
java.lang.String findFirstString(java.lang.String left, java.lang.String right, java.lang.String after)
    Returns the first string occurring between the left and right markers, after string "after".
java.lang.String[] findStrings(java.lang.String left, java.lang.String right, java.lang.String after, java.lang.String before)
    Returns all strings occurring between the left and right markers, after string "after" but before string "before".
static void main(java.lang.String[] args)
java.lang.String urlToString()
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
PTKWebPageParser
public PTKWebPageParser(java.lang.String url)
findStrings
public java.lang.String[] findStrings(java.lang.String left,
java.lang.String right,
java.lang.String after,
java.lang.String before)
throws PTKWebPageParserException
- Returns all strings occurring between the left and right markers,
after string "after" but before string "before".
- Parameters:
left - text bounding the left side of the returned text
right - text bounding the right side of the returned text
after - string after which to look for return strings (can be null)
before - string before which to look for return strings (can be null)
- Throws:
PTKWebPageParserException
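The marker semantics described for findStrings can be illustrated on an in-memory string. The following is a minimal sketch only, not the PTK implementation: the class MarkerExtractionSketch and its page parameter are hypothetical, and the handling of a null or missing "after"/"before" marker is an assumption, since the documentation does not say when PTKWebPageParserException is thrown.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in, not part of PTK: applies the documented
// left/right/after/before semantics to a string already in memory.
final class MarkerExtractionSketch {

    // Markers "left" and "right" are assumed to be non-empty.
    static String[] findStrings(String page, String left, String right,
                                String after, String before) {
        // Assumption: a null "after" means "start at the beginning of the page".
        int start = 0;
        if (after != null) {
            int a = page.indexOf(after);
            if (a < 0) {
                return new String[0]; // assumption: no "after" marker, no results
            }
            start = a + after.length();
        }

        // Assumption: a null or missing "before" means "search to the end of the page".
        int end = page.length();
        if (before != null) {
            int b = page.indexOf(before, start);
            if (b >= 0) {
                end = b;
            }
        }

        List<String> found = new ArrayList<>();
        int pos = start;
        while (true) {
            int l = page.indexOf(left, pos);
            if (l < 0) {
                break;
            }
            int contentStart = l + left.length();
            int r = page.indexOf(right, contentStart);
            // Stop once a match would extend past the "before" boundary.
            if (r < 0 || r + right.length() > end) {
                break;
            }
            found.add(page.substring(contentStart, r));
            pos = r + right.length();
        }
        return found.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String page = "<ul><li>a</li><li>b</li></ul><p><li>c</li>";
        String[] items = findStrings(page, "<li>", "</li>", "<ul>", "</ul>");
        System.out.println(String.join(", ", items)); // prints "a, b"
    }
}
```

In this sketch, passing null for both "after" and "before" searches the whole string, so the same page would yield a, b, and c.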
findFirstString
public java.lang.String findFirstString(java.lang.String left,
java.lang.String right,
java.lang.String after)
throws PTKWebPageParserException
- Returns the first string occurring between the left and right markers,
after string "after".
- Parameters:
left - text bounding the left side of the returned text
right - text bounding the right side of the returned text
after - string after which to look for return strings (can be null)
- Throws:
PTKWebPageParserException
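A first-match lookup of this kind can be sketched with String.indexOf. This is an illustration under stated assumptions rather than the PTK code: the class FirstMatchSketch and its page parameter are hypothetical, and returning null when nothing matches is an assumption (the real method may instead throw PTKWebPageParserException).

```java
// Hypothetical stand-in, not part of PTK: returns the first substring
// enclosed by "left" and "right" that appears after the "after" marker.
final class FirstMatchSketch {

    static String findFirstString(String page, String left, String right,
                                  String after) {
        // Assumption: a null "after" means "start at the beginning of the page".
        int start = 0;
        if (after != null) {
            int a = page.indexOf(after);
            if (a < 0) {
                return null; // assumption: missing "after" marker yields no match
            }
            start = a + after.length();
        }

        int l = page.indexOf(left, start);
        if (l < 0) {
            return null;
        }
        int contentStart = l + left.length();
        int r = page.indexOf(right, contentStart);
        return r < 0 ? null : page.substring(contentStart, r);
    }

    public static void main(String[] args) {
        String page = "<title>Home</title><h1>News</h1><h1>Sports</h1>";
        // First <h1> after the closing title tag.
        System.out.println(findFirstString(page, "<h1>", "</h1>", "</title>")); // prints "News"
    }
}
```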
urlToString
public java.lang.String urlToString()
throws PTKWebPageParserException
- Throws:
PTKWebPageParserException
main
public static void main(java.lang.String[] args)