Package com.aliasi.corpus

Classes for parsing and handling various corpora.


Interface Summary
ChunkHandler Deprecated. Use ObjectHandler<Chunking> instead.
ClassificationHandler<E,C extends Classification> The ClassificationHandler interface specifies a single method for operating on a classification input and result.
Handler The Handler marker interface indicates that a class will implement a handle method appropriate for a particular Parser.
IntArrayHandler Deprecated. Use ObjectHandler<int[]> instead.
ObjectHandler<E> The ObjectHandler interface specifies a handler with a single method that takes a single argument of the type of the generic parameter.
StringArrayHandler Deprecated. Use ObjectHandler<String[]> instead.
TagHandler The TagHandler interface specifies a single method for operating on an array of tokens, whitespaces and tags.
TextHandler The TextHandler interface specifies a single method for operating on a slice of characters.
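
The single-method handler pattern described for ObjectHandler<E> can be illustrated with a minimal, self-contained sketch. Note that ObjectHandlerSketch, TokenCounter, and parseLines below are hypothetical stand-ins written for this example; they are not the LingPipe types, and the method name handle is assumed from the summary above rather than copied from the library source.

```java
import java.util.ArrayList;
import java.util.List;

// Self-contained sketch: nothing here is a LingPipe class.
final class HandlerSketch {

    /** Hypothetical stand-in for a handler with one method over items of type E. */
    interface ObjectHandlerSketch<E> {
        void handle(E item);
    }

    /** Example handler: records how many tokens each String[] contains. */
    static final class TokenCounter implements ObjectHandlerSketch<String[]> {
        private final List<Integer> counts = new ArrayList<>();

        @Override
        public void handle(String[] tokens) {
            counts.add(tokens.length);
        }

        List<Integer> counts() {
            return counts;
        }
    }

    /** Toy parser (an assumption): splits non-blank lines on whitespace and hands each token array to the handler. */
    static void parseLines(String text, ObjectHandlerSketch<String[]> handler) {
        for (String line : text.split("\n")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            handler.handle(trimmed.split("\\s+"));
        }
    }

    public static void main(String[] args) {
        TokenCounter counter = new TokenCounter();
        parseLines("John ran home\n\nMary slept", counter);
        System.out.println(counter.counts()); // prints [3, 2]
    }
}
```

The parser knows nothing about what the handler does with each item, which is why the deprecated StringArrayHandler-style interfaces can be replaced by one generic interface parameterized with String[], int[], or Chunking.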
 

Class Summary
ChunkHandlerAdapter A ChunkHandlerAdapter converts a BIO-coded tag handler to a chunk handler.
ChunkTagHandlerAdapter A ChunkTagHandlerAdapter converts a chunk handler to a BIO-coded tag handler.
Corpus<H extends Handler> The Corpus abstract class provides a basis for passing training and testing data to data handlers.
DiskCorpus<H extends Handler> A DiskCorpus reads data from a specified training and test directory using a specified parser.
InputSourceParser<H extends Handler> An InputSourceParser is an abstract parser based on an abstract method for parsing from an input source.
LineParser<H extends Handler> A LineParser provides an abstract adapter for line-based parsing.
Parser<H extends Handler> The Parser abstract class provides methods for parsing content from an input source or character sequence and passing extracted events to a content handler.
StringParser<H extends Handler> A StringParser is an abstract parser based on an abstract method for parsing from a character slice.
XMLParser<H extends Handler> An XMLParser adapts a handler to be used to handle text extracted from an XML source.
XValidatingObjectCorpus<E> An XValidatingObjectCorpus holds a list of items which it uses to provide training and testing items using cross-validation.
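
The cross-validation idea behind XValidatingObjectCorpus<E> can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's implementation: XValSketch and its methods are hypothetical names, and the contiguous fold boundaries used here are one simple choice that may differ from how LingPipe divides its items.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a cross-validating item holder; not a LingPipe class.
final class XValSketch<E> {
    private final List<E> items = new ArrayList<>();
    private final int numFolds;

    XValSketch(int numFolds) {
        if (numFolds < 1) {
            throw new IllegalArgumentException("numFolds must be positive");
        }
        this.numFolds = numFolds;
    }

    void add(E item) {
        items.add(item);
    }

    /** Index of the first item in the given fold (assumed contiguous partition). */
    private int foldStart(int fold) {
        return (int) ((long) fold * items.size() / numFolds);
    }

    private void checkFold(int fold) {
        if (fold < 0 || fold >= numFolds) {
            throw new IllegalArgumentException("fold out of range: " + fold);
        }
    }

    /** Items held out for testing in the given fold. */
    List<E> testItems(int fold) {
        checkFold(fold);
        return new ArrayList<>(items.subList(foldStart(fold), foldStart(fold + 1)));
    }

    /** All items outside the given fold, used for training. */
    List<E> trainItems(int fold) {
        checkFold(fold);
        List<E> train = new ArrayList<>(items.subList(0, foldStart(fold)));
        train.addAll(items.subList(foldStart(fold + 1), items.size()));
        return train;
    }

    public static void main(String[] args) {
        XValSketch<String> corpus = new XValSketch<>(2);
        for (String s : new String[] {"a", "b", "c", "d", "e"}) {
            corpus.add(s);
        }
        System.out.println(corpus.testItems(0));  // prints [a, b]
        System.out.println(corpus.trainItems(0)); // prints [c, d, e]
    }
}
```

Across all folds, every item is used for testing exactly once and for training in every other fold, which is the property that makes the evaluation cover the whole data set.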
 
