Package com.aliasi.chunk

Classes for extracting meaningful chunks (spans) of text.


Interface Summary
Chunk The Chunk interface specifies a slice of a character sequence, a chunk type and a chunk score.
Chunker The Chunker interface specifies methods for returning a chunking given a character sequence or character slice.
Chunking The Chunking interface specifies a set of chunks over a shared underlying character sequence.
ConfidenceChunker The ConfidenceChunker interface specifies a method for returning an iterator over chunks in order of confidence.
NBestChunker An NBestChunker is a chunker that can also return iterators over scored chunkings or scored chunks in order of decreasing likelihood.
TagChunkCodec A TagChunkCodec provides a means of coding chunkings as taggings and decoding (string) taggings back to chunkings.
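
To make the idea of coding a chunking as a tagging concrete, here is a minimal, self-contained sketch in plain Java. It does not use the classes in this package: SpanSketch and BioSketch are hypothetical names invented for this illustration, the tokenizer simply splits on single spaces, and the tag spelling ("B-", "I-", "O") is the conventional BIO notation, which may differ from the strings the library emits. It also assumes that chunk boundaries line up with token boundaries.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a chunk: a typed [start, end) slice of a
// character sequence. This is NOT the package's Chunk interface.
final class SpanSketch {
    final int start;   // index of the first character in the chunk
    final int end;     // index one past the last character in the chunk
    final String type; // chunk type, for example "PER"

    SpanSketch(int start, int end, String type) {
        this.start = start;
        this.end = end;
        this.type = type;
    }
}

// Hypothetical encoder: turns chunks over a text into one BIO tag per token.
final class BioSketch {

    // Tokenizes on single spaces and emits one tag per token.
    // Assumes every chunk starts and ends on a token boundary.
    static List<String> encode(String text, List<SpanSketch> chunks) {
        List<String> tags = new ArrayList<>();
        int pos = 0;
        for (String token : text.split(" ")) {
            int tokStart = pos;
            int tokEnd = pos + token.length();
            String tag = "O"; // default: token is outside every chunk
            for (SpanSketch c : chunks) {
                if (tokStart >= c.start && tokEnd <= c.end) {
                    // First token of a chunk gets B, later tokens get I.
                    tag = (tokStart == c.start ? "B-" : "I-") + c.type;
                    break;
                }
            }
            tags.add(tag);
            pos = tokEnd + 1; // skip the separating space
        }
        return tags;
    }

    public static void main(String[] args) {
        String text = "John Smith lives in New York";
        List<SpanSketch> chunks = new ArrayList<>();
        chunks.add(new SpanSketch(0, 10, "PER"));  // "John Smith"
        chunks.add(new SpanSketch(20, 28, "LOC")); // "New York"
        System.out.println(encode(text, chunks));
    }
}
```

Decoding runs in the other direction: a B tag opens a chunk of the given type, following I tags of the same type extend it, and any other tag closes it. An IO scheme drops the B tag, so it cannot separate two adjacent chunks of the same type.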

Class Summary
AbstractCharLmRescoringChunker<B extends NBestChunker,O extends LanguageModel.Process,C extends LanguageModel.Sequence> An AbstractCharLmRescoringChunker provides the basic character language-model rescoring model used by the trainable CharLmRescoringChunker and its compiled version.
BioTagChunkCodec The BioTagChunkCodec implements a chunk to tag coder/decoder based on the BIO encoding scheme and a specified tokenizer factory.
CharLmHmmChunker A CharLmHmmChunker employs a hidden Markov model estimator and tokenizer factory to learn a chunker.
CharLmRescoringChunker A CharLmRescoringChunker provides a long-distance character language model-based chunker that operates by rescoring the output of a contained character language model HMM chunker.
ChunkAndCharSeq The ChunkAndCharSeq is an immutable composite of a chunk and a character sequence.
ChunkerEvaluator The ChunkerEvaluator class provides an evaluation framework for chunkers.
ChunkFactory The ChunkFactory provides static factory methods for creating chunks from components.
ChunkingEvaluation A ChunkingEvaluation stores and reports the results of evaluating response chunkings against reference chunkings.
ChunkingImpl A ChunkingImpl provides a mutable, set-based implementation of the chunking interface.
HmmChunker An HmmChunker uses a hidden Markov model to perform chunking over tokenized character sequences.
IoTagChunkCodec The IoTagChunkCodec implements a chunk to tag coder/decoder based on the IO encoding scheme and a specified tokenizer factory.
RegExChunker A RegExChunker finds chunks that match a regular expression.
RescoringChunker<B extends NBestChunker> A RescoringChunker provides first best, n-best and confidence chunking by rescoring n-best chunkings derived from a contained chunker.
TagChunkCodecAdapters The TagChunkCodecAdapters class contains static utility methods for adapting tagging handlers to chunking handlers and vice-versa using tag-chunk codecs.
TokenShapeChunker A TokenShapeChunker uses a named-entity TokenShapeDecoder and tokenizer factory to implement entity detection through the chunk.Chunker interface.
TrainTokenShapeChunker A TrainTokenShapeChunker is used to train a token and shape-based chunker.
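
As a rough illustration of regular-expression chunking, the sketch below finds every match of a pattern and reports it as a typed, scored span over the input. It is written against java.util.regex only and is not the RegExChunker implementation: RegexChunkSketch, its nested Span class, and the chunk method are hypothetical names, and assigning one fixed type and score to every match is an assumption made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical regex-based chunker; not part of the package.
final class RegexChunkSketch {

    // Hypothetical chunk record: [start, end) offsets plus a type and a score.
    static final class Span {
        final int start;
        final int end;
        final String type;
        final double score;

        Span(int start, int end, String type, double score) {
            this.start = start;
            this.end = end;
            this.type = type;
            this.score = score;
        }

        // Returns the slice of the underlying sequence covered by this span.
        String text(CharSequence cs) {
            return cs.subSequence(start, end).toString();
        }
    }

    // Returns one span per non-overlapping match, in left-to-right order.
    // Every span receives the same caller-supplied type and score.
    static List<Span> chunk(CharSequence text, Pattern pattern,
                            String type, double score) {
        List<Span> spans = new ArrayList<>();
        Matcher m = pattern.matcher(text);
        while (m.find()) {
            spans.add(new Span(m.start(), m.end(), type, score));
        }
        return spans;
    }

    public static void main(String[] args) {
        String text = "Released 2011-06-27, updated 2012-01-05.";
        Pattern isoDate = Pattern.compile("\\d{4}-\\d{2}-\\d{2}");
        for (Span s : chunk(text, isoDate, "DATE", 1.0)) {
            System.out.println(s.start + "-" + s.end + " "
                    + s.type + " " + s.text(text));
        }
    }
}
```

Storing offsets rather than copied strings mirrors the notion of a chunk as a slice of a shared character sequence: the text of a span is recovered on demand from the original input.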

