edu.berkeley.guir.ptk.input
Class CNNNewsInput

java.lang.Object
  extended by edu.berkeley.guir.ptk.PTK
      extended by edu.berkeley.guir.ptk.input.InputSource
          extended by edu.berkeley.guir.ptk.input.CNNNewsInput
All Implemented Interfaces:
java.lang.Runnable

public class CNNNewsInput
extends InputSource

Gets news headlines from CNN.com and dispatches all the headlines in one event to the PTK Server. Extends the class InputSource and implements newInput() (an inherited, abstract method). This method is responsible for getting the actual input data from the CNN Web page. To do so, it uses the PTKWebParser.

This class sets the inherited Metadata field mds in its constructors. The metadata must be set because it identifies the events this input creates to outputs and to the PTK server. Outputs that want CNN news input subscribe to the same metadata that is added to the event in the newInput method. Metadata is added using a Metadata object, which includes the ID of the metadata (which type of metadata it is) and the value of the metadata (the value that distinguishes it from other Events). Multiple Events can have Metadata with the same ID, but if it is important to tell them apart, their values should be different. For this CNN news input, the only Metadata in the event is the Event_TYPE_ID (the metadata ID), which is set to NewsConstants.NEWS (the metadata value). All events should have an Event_TYPE_ID.
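
The ID/value matching described above can be sketched as follows. This is only an illustration under stated assumptions: MetadataSketch, SketchEvent, and matches are hypothetical stand-ins rather than the real PTK Metadata and Event classes, and the strings used for Event_TYPE_ID and NewsConstants.NEWS are placeholders, since the actual constant values are not given here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins; these are NOT the real PTK Metadata or Event classes.
class MetadataSketch {
    // Assumed placeholder values for Event_TYPE_ID and NewsConstants.NEWS.
    static final String EVENT_TYPE_ID = "event_type";
    static final String NEWS = "news";

    /** An event that carries its metadata as ID -> value pairs. */
    static class SketchEvent {
        final Map<String, String> metadata = new HashMap<>();

        /** Adds one metadata item (ID plus value) and returns this event for chaining. */
        SketchEvent with(String id, String value) {
            metadata.put(id, value);
            return this;
        }
    }

    /** True if the event carries every ID/value pair named in the subscription. */
    static boolean matches(Map<String, String> subscription, SketchEvent event) {
        for (Map.Entry<String, String> wanted : subscription.entrySet()) {
            String actual = event.metadata.get(wanted.getKey());
            if (!wanted.getValue().equals(actual)) {
                return false; // ID missing, or same ID with a different value
            }
        }
        return true;
    }
}
```

In this sketch, two events that share the ID EVENT_TYPE_ID are told apart only by their values, which is why an output subscribing to NEWS would not receive an event whose value differs.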

To start the input, run this class as a Java application. The main method creates a new CNNNewsInput and starts it (which gets a new input and dispatches it to the PTK server). Events are dispatched in a loop, with the sleep time between iterations specified by this.time_between_events. To customize the time between events, call setTimeBetweenEvents(long).
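
The get-then-dispatch loop described above might look roughly like the sketch below. PollingLoopSketch is a hypothetical stand-in, not the real InputSource: the list stands in for the PTK server, the maxEvents bound is added only so the example terminates, and treating the interval as milliseconds is an assumption, since the unit of time_between_events is not stated here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-in for the loop described above; NOT the real InputSource.
class PollingLoopSketch implements Runnable {
    private final Supplier<String> newInput;                    // plays the role of newInput()
    private final List<String> dispatched = new ArrayList<>();  // stands in for the PTK server
    private final int maxEvents;                                // bound so the sketch terminates
    private volatile long timeBetweenEvents = 1000L;            // assumed to be milliseconds

    PollingLoopSketch(Supplier<String> newInput, int maxEvents) {
        this.newInput = newInput;
        this.maxEvents = maxEvents;
    }

    /** Changes the pause between two consecutive events. */
    void setTimeBetweenEvents(long millis) {
        this.timeBetweenEvents = millis;
    }

    List<String> getDispatched() {
        return dispatched;
    }

    @Override
    public void run() {
        for (int i = 0; i < maxEvents; i++) {
            dispatched.add(newInput.get());          // get new input, then "dispatch" it
            try {
                Thread.sleep(timeBetweenEvents);     // wait before fetching the next event
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();  // keep the interrupt flag and stop
                return;
            }
        }
    }
}
```

Because the sketch implements Runnable, it can be run directly with run() or handed to a Thread, mirroring the fact that CNNNewsInput implements java.lang.Runnable.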


Field Summary
static java.lang.String MY_ID
          The unique ID String for this input, found in NewsConstants, which is an application-specific class.
 
Fields inherited from class edu.berkeley.guir.ptk.input.InputSource
history, mds, my_id, my_ip, time_between_events
 
Fields inherited from class edu.berkeley.guir.ptk.PTK
debug, MAX_DEBUG, MED_DEBUG, MIN_DEBUG, NO_DEBUG
 
Constructor Summary
CNNNewsInput()
          Constructor for non-distributed applications.
CNNNewsInput(java.lang.String my_ip)
          Constructor for distributed applications.
 
Method Summary
static void main(java.lang.String[] argsv)
          This main method allows the input to run, continually getting new input and dispatching the data in events to the PTK server.
 Events newInput()
          Gets news headline input from the data source, CNN.com, and fills an Event with the headline strings.
 
Methods inherited from class edu.berkeley.guir.ptk.input.InputSource
addMetadata, addMetadataItem, addMyMetadataToEvent, dispatchEvent, finalize, getAbstractThenSendInputEvent, getMetadata, getMetadataItemsAsArray, getThenSendInputEvent, register, run, setMetadata, setTimeBetweenEvents, startInput
 
Methods inherited from class edu.berkeley.guir.ptk.PTK
getMAX, getMED, getMIN, getNO, printDebug, printDebug, printError
 
Methods inherited from class java.lang.Object
clone, equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

MY_ID

public static final java.lang.String MY_ID
The unique ID String for this input, found in NewsConstants, which is an application-specific class. The ID must be unique per PTK server instance.

See Also:
Constant Field Values
Constructor Detail

CNNNewsInput

public CNNNewsInput()
Constructor for non-distributed applications. An input that uses this constructor cannot be part of an application on another machine.


CNNNewsInput

public CNNNewsInput(java.lang.String my_ip)
Constructor for distributed applications. An input that uses this constructor can send events to remote applications.

Method Detail

newInput

public Events newInput()
Gets news headline input from the data source, CNN.com, and fills an Event with the headline strings. The Event is returned when it has NewsConstants.NUM_HEADLINES headline strings in it. This Event is what will be dispatched to the PTK server and eventually sent to outputs that subscribe to its metadata. Uses the PTKWebParser class to parse the CNN Web page.

Specified by:
newInput in class InputSource
Returns:
An Events object that contains a single Event with the news headlines. NewsConstants.NUM_HEADLINES DataString objects are created and added to the Event. Each DataString is given an ID from the array NewsConstants.ALL_IDS, which is an array of news headline IDs indexed by headline number (i.e., the ID for the first headline is in the first position of the array, etc.).
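
The pairing of headlines with IDs by position can be sketched as below. HeadlineEventSketch is a hypothetical stand-in: an ordered map replaces the real Event and DataString classes, the values of NUM_HEADLINES and ALL_IDS are invented for illustration, and throwing when too few headlines are available is an assumption, since the behavior in that case is not described here.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins; NOT the real Events, Event, DataString, or NewsConstants.
class HeadlineEventSketch {
    static final int NUM_HEADLINES = 3;                                         // assumed value
    static final String[] ALL_IDS = {"headline_0", "headline_1", "headline_2"}; // assumed IDs

    /** Builds one event as an ordered map from headline ID to headline text. */
    static Map<String, String> buildEvent(List<String> parsedHeadlines) {
        if (parsedHeadlines.size() < NUM_HEADLINES) {
            // Assumption: the sketch refuses to build a partially filled event.
            throw new IllegalArgumentException("need at least " + NUM_HEADLINES + " headlines");
        }
        Map<String, String> event = new LinkedHashMap<>();
        for (int i = 0; i < NUM_HEADLINES; i++) {
            // The i-th headline gets the i-th ID, so position in ALL_IDS encodes rank.
            event.put(ALL_IDS[i], parsedHeadlines.get(i));
        }
        return event;
    }
}
```

An output that reads such an event could then look up a particular headline by its ID rather than by scanning the text.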

main

public static void main(java.lang.String[] argsv)
This main method allows the input to run, continually getting new input and dispatching the data in events to the PTK server.

Parameters:
argsv -