com.aliasi.corpus
Class Parser<H extends Handler>

java.lang.Object
  extended by com.aliasi.corpus.Parser<H>
Type Parameters:
H - the type of handler which receives events from this parser
Direct Known Subclasses:
InputSourceParser, StringParser

public abstract class Parser<H extends Handler>
extends Object

The Parser abstract class provides methods for parsing content from an input source or character sequence and passing extracted events to a content handler. Concrete implementations will typically make assumptions about the type of the handler.

Concrete subclasses must implement both parse(InputSource) and parseString(char[],int,int). Two subclasses of this class, InputSourceParser and StringParser, may each be extended by implementing only one of these methods.
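
The two-method contract can be illustrated with a small sketch. It does not use the LingPipe classes: SketchParser, LineHandler and LineParser are hypothetical stand-ins that only mirror the two abstract methods named above, so the example compiles against the JDK alone. Here the input-source method reads all characters and delegates to the slice method, which is one plausible way to share logic between the two entry points.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.InputSource;

// Hypothetical handler type for this sketch; not LingPipe's Handler interface.
interface LineHandler {
    void handle(String line);
}

// Hypothetical stand-in mirroring the two abstract methods documented for Parser.
abstract class SketchParser<H> {
    private H handler;
    SketchParser(H handler) { this.handler = handler; }
    H getHandler() { return handler; }
    void setHandler(H handler) { this.handler = handler; }
    abstract void parse(InputSource in) throws IOException;
    abstract void parseString(char[] cs, int start, int end) throws IOException;
}

// Concrete parser: passes each non-empty line of the input to the handler.
class LineParser extends SketchParser<LineHandler> {
    LineParser(LineHandler handler) { super(handler); }

    @Override
    void parse(InputSource in) throws IOException {
        Reader reader = in.getCharacterStream();
        if (reader == null)
            throw new IOException("this sketch only supports character streams");
        // Buffer the whole input, then delegate to the slice-based method.
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[1024];
        int n;
        while ((n = reader.read(buf)) != -1)
            sb.append(buf, 0, n);
        char[] cs = sb.toString().toCharArray();
        parseString(cs, 0, cs.length);
    }

    @Override
    void parseString(char[] cs, int start, int end) {
        String text = new String(cs, start, end - start);
        for (String line : text.split("\n"))
            if (!line.isEmpty())
                getHandler().handle(line);
    }

    // Convenience for the example: collect all handled lines into a list.
    static List<String> collect(String text) {
        List<String> lines = new ArrayList<>();
        try {
            new LineParser(lines::add).parse(new InputSource(new StringReader(text)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }
}
```

In this sketch the handler is a plain callback; a real subclass would fix H to whichever handler interface its extracted events require.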

Since:
LingPipe 2.1
Version:
3.0
Author:
Bob Carpenter

Constructor Summary
Parser()
          Construct a parser with a null handler.
Parser(H handler)
          Construct a parser with the specified handler.
 
Method Summary
 H getHandler()
          Returns the current content handler.
 void parse(File file)
          Parse the specified file, passing extracted events to the handler.
abstract  void parse(InputSource in)
          Parse the specified input source, passing extracted events to the handler.
 void parse(String sysId)
          Parse the specified system identifier, passing extracted events to the handler.
abstract  void parseString(char[] cs, int start, int end)
          Parse the specified character slice as a string input.
 void parseString(CharSequence cSeq)
          Parse the specified character sequence.
 void setHandler(H handler)
          Sets the content handler to the specified value.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

Parser

public Parser()
Construct a parser with a null handler.


Parser

public Parser(H handler)
Construct a parser with the specified handler.

Parameters:
handler - Handler to receive events extracted by this parser.
Method Detail

setHandler

public void setHandler(H handler)
Sets the content handler to the specified value. The current handler is used for all content extracted by this parser.

Parameters:
handler - Handler to use for content extracted from parsed content.

getHandler

public H getHandler()
Returns the current content handler. The current handler is applied to all extracted content.

Returns:
Current content handler.

parse

public void parse(String sysId)
           throws IOException
Parse the specified system identifier, passing extracted events to the handler.

The implementation provided by this abstract class constructs an input source from the system identifier and passes it to parse(InputSource).

Parameters:
sysId - System ID from which to read.
Throws:
IOException - If there is an exception reading from the specified source.
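
As a rough illustration of the first step described above, the JDK's org.xml.sax.InputSource can be constructed directly from a system identifier. The class name SysIdSketch and the path in main are hypothetical; how a given parse(InputSource) implementation then opens the identified resource is not specified here.

```java
import org.xml.sax.InputSource;

class SysIdSketch {
    // Wrap a system identifier (typically a URL string) in an input source.
    static InputSource fromSysId(String sysId) {
        return new InputSource(sysId);
    }

    public static void main(String[] args) {
        // Hypothetical identifier; nothing is opened or read here.
        InputSource in = fromSysId("file:/tmp/corpus.txt");
        System.out.println(in.getSystemId());
        // No stream is attached yet; only the identifier is carried.
        System.out.println(in.getCharacterStream() == null);
    }
}
```

An input source built this way carries only the identifier, so the implementation of parse(InputSource) is responsible for resolving it to actual content.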

parse

public void parse(File file)
           throws IOException
Parse the specified file, passing extracted events to the handler.

The implementation provided by this abstract class constructs a string-based URL name from the specified file and passes it to parse(String).

Parameters:
file - File to parse.
Throws:
IOException - If there is an exception reading from the specified file or the file does not exist.
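
The file-to-URL step can be sketched with standard library calls. This is an assumption about one reasonable conversion, not necessarily the exact one used by this class; FileSysIdSketch and the file name are hypothetical.

```java
import java.io.File;
import java.net.MalformedURLException;

class FileSysIdSketch {
    // Convert a file to a string-based URL usable as a system identifier.
    static String toSysId(File file) {
        try {
            return file.toURI().toURL().toString();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("cannot convert to URL: " + file, e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical file name; the file does not need to exist for the conversion.
        System.out.println(toSysId(new File("corpus.txt")));
    }
}
```

Going through toURI() first escapes characters such as spaces that are not legal in a URL, which is why it is generally preferred over building the string by hand.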

parseString

public void parseString(CharSequence cSeq)
                 throws IOException
Parse the specified character sequence. Extracted content is passed to the current handler.

The character sequence is converted to a character array using Strings.toCharArray(CharSequence) and then passed as a slice to parseString(char[],int,int).

Parameters:
cSeq - Character sequence to parse.
Throws:
IOException - If there is an exception reading the characters.
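
A minimal sketch of this delegation follows. The local toCharArray method is a stand-in for the Strings.toCharArray(CharSequence) call named above, and SliceParser is a hypothetical callback standing in for parseString(char[],int,int); the assumption is that the whole sequence is passed as the slice from 0 to its length.

```java
class CharSeqSketch {
    // Stand-in for Strings.toCharArray(CharSequence): copy the sequence into a new array.
    static char[] toCharArray(CharSequence cSeq) {
        char[] cs = new char[cSeq.length()];
        for (int i = 0; i < cs.length; ++i)
            cs[i] = cSeq.charAt(i);
        return cs;
    }

    // Hypothetical callback standing in for parseString(char[],int,int).
    interface SliceParser {
        void parseString(char[] cs, int start, int end);
    }

    // Convert the sequence and pass the full array as a single slice.
    static void parseString(CharSequence cSeq, SliceParser parser) {
        char[] cs = toCharArray(cSeq);
        parser.parseString(cs, 0, cs.length);
    }
}
```

Because the array is a fresh copy, later changes to a mutable sequence such as a StringBuilder would not affect the characters being parsed.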

parse

public abstract void parse(InputSource in)
                    throws IOException
Parse the specified input source, passing extracted events to the handler. Concrete subclasses must implement this method.

Parameters:
in - Input source from which to read.
Throws:
IOException - If there is an exception reading from the specified input source.

parseString

public abstract void parseString(char[] cs,
                                 int start,
                                 int end)
                          throws IOException
Parse the specified character slice as a string input. Extracted content is passed to the current handler.

Parameters:
cs - Characters underlying slice.
start - Index of first character in slice.
end - One past the index of the last character in slice.
Throws:
IOException - If there is an exception reading the characters.
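
The start and end parameters describe a half-open range: start is included and end is excluded, so the slice has end - start characters. The sketch below shows how an implementation might read such a slice; SliceSketch is a hypothetical name, and the bounds check is an assumption of this example rather than documented behavior.

```java
class SliceSketch {
    // Return the characters cs[start], ..., cs[end-1] as a string.
    static String sliceToString(char[] cs, int start, int end) {
        if (start < 0 || end > cs.length || start > end)
            throw new IndexOutOfBoundsException(
                "start=" + start + " end=" + end + " length=" + cs.length);
        // String(char[], offset, count) takes a length, not an end index.
        return new String(cs, start, end - start);
    }

    public static void main(String[] args) {
        char[] cs = "hello world".toCharArray();
        System.out.println(sliceToString(cs, 6, 11)); // prints "world"
    }
}
```

With this convention an empty slice has start equal to end, and the whole array is the slice from 0 to cs.length.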