com.aliasi.corpus.parsers
Class GeniaPosParser

java.lang.Object
  extended by com.aliasi.corpus.Parser<H>
      extended by com.aliasi.corpus.StringParser<TagHandler>
          extended by com.aliasi.corpus.parsers.GeniaPosParser

Deprecated. This class will move to the demos in 4.0.

@Deprecated
public class GeniaPosParser
extends StringParser<TagHandler>

The GeniaPosParser extracts the part-of-speech (POS) tags from the GENIA text POS corpus and sends them to the specified tag handler.

An example from the start of the GENIA POS corpus is:

 UI/LS
 -/:
 95369245/CD
 ====================
 TI/LS
 -/:
 IL-2/NN
 gene/NN
 expression/NN
 and/CC
 NF-kappa/NN
 B/NN
 activation/NN
 through/IN
 CD28/NN
 requires/VBZ
 reactive/JJ
 oxygen/NN
 production/NN
 by/IN
 5-lipoxygenase/NN
 ./.
 ====================
 AB/LS
 -/:
 Activation/NN
 of/IN
 the/DT
 CD28/NN
 surface/NN
 receptor/NN
 provides/VBZ
 a/DT
 major/JJ
 costimulatory/JJ
 signal/NN
 for/IN
 T/NN
 cell/NN
 activation/NN
 resulting/VBG
 in/IN
 enhanced/VBN
 production/NN
 of/IN
 interleukin-2/NN
 (/(
 IL-2/NN
 )/)
 and/CC
 cell/NN
 proliferation/NN
 ./.
 ====================
 In/IN
 primary/JJ
 T/NN
 lymphocytes/NNS

    ......snip.....
 
The parser handles entries by "sentence", where a sentence is the sequence of token/tag pairs between the double lines composed of equal signs (=). Some of these sentences begin with a special token; in the example above these are UI, TI, and AB. Each of these tokens is tagged with the "part-of-speech" LS and followed by a single hyphen (-) tagged as part-of-speech colon (:). The sentence that begins a citation (UI in the example) includes a PubMed identifier drawn from the MEDLINE corpus (see the com.aliasi.medline package for more information on MEDLINE). Continuing sentences in the same abstract are not marked with any prefix.
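
The sentence structure described above can be sketched in plain Java. This is only an illustration of the file format, not LingPipe's implementation: the class GeniaFormatSketch and its methods are hypothetical, and the sketch assumes one token/tag pair per line with the tag following the last slash.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of LingPipe.
class GeniaFormatSketch {

    // Splits GENIA POS text into sentences; each entry is a {token, tag} pair.
    static List<List<String[]>> parseSentences(String text) {
        List<List<String[]>> sentences = new ArrayList<List<String[]>>();
        List<String[]> current = new ArrayList<String[]>();
        for (String rawLine : text.split("\n")) {
            String line = rawLine.trim();
            if (line.isEmpty()) {
                continue;
            }
            if (line.matches("=+")) {
                // A line of equal signs closes the current sentence.
                if (!current.isEmpty()) {
                    sentences.add(current);
                    current = new ArrayList<String[]>();
                }
                continue;
            }
            // Assumption: the tag follows the last slash, so "(/(" yields "(" and "(".
            int slash = line.lastIndexOf('/');
            if (slash < 0) {
                continue; // not a token/tag line, e.g. a snip marker
            }
            current.add(new String[] {
                line.substring(0, slash), line.substring(slash + 1)
            });
        }
        if (!current.isEmpty()) {
            sentences.add(current);
        }
        return sentences;
    }

    // True if the sentence starts with an LS-tagged token followed by "-" tagged ":".
    static boolean hasSectionPrefix(List<String[]> sentence) {
        return sentence.size() >= 2
            && "LS".equals(sentence.get(0)[1])
            && "-".equals(sentence.get(1)[0])
            && ":".equals(sentence.get(1)[1]);
    }

    public static void main(String[] args) {
        String sample = "UI/LS\n-/:\n95369245/CD\n====================\nIn/IN\nprimary/JJ\n";
        for (List<String[]> sentence : parseSentences(sample)) {
            System.out.println(sentence.size() + " tokens, prefixed="
                + hasSectionPrefix(sentence));
        }
    }
}
```

Splitting on the last slash rather than the first keeps tokens that themselves contain a slash intact, under the assumption that tags never contain one.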

The GENIA corpus itself and extensive information about it are available from the GENIA project.

Since:
LingPipe 2.1
Version:
3.9.1
Author:
Bob Carpenter

Constructor Summary
GeniaPosParser()
          Deprecated. Construct a GENIA part-of-speech parser with no handler specified.
GeniaPosParser(TagHandler handler)
          Deprecated. Moving to demos in 4.0.
 
Method Summary
 TagHandler getTagHandler()
          Deprecated. Use generic Parser.getHandler() instead.
 void parseString(char[] cs, int start, int end)
          Deprecated. Implementation of the parser for the GENIA corpus.
 
Methods inherited from class com.aliasi.corpus.StringParser
parse
 
Methods inherited from class com.aliasi.corpus.Parser
getHandler, parse, parse, parseString, setHandler
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

GeniaPosParser

public GeniaPosParser()
Deprecated. 
Construct a GENIA part-of-speech parser with no handler specified.


GeniaPosParser

@Deprecated
public GeniaPosParser(TagHandler handler)
Deprecated. Moving to demos in 4.0.

Construct a GENIA part-of-speech parser with the specified tag handler.

Parameters:
handler - Tag handler for the parser.

Method Detail

getTagHandler

@Deprecated
public TagHandler getTagHandler()
Deprecated. Use generic Parser.getHandler() instead.

Returns the tag handler for this parser.

Returns:
The tag handler for this parser.
Throws:
ClassCastException - If a handler that does not implement TagHandler was set using Parser.setHandler(Handler).

parseString

public void parseString(char[] cs,
                        int start,
                        int end)
Deprecated. 
Implementation of the parser for the GENIA corpus.

Specified by:
parseString in class Parser<TagHandler>
Parameters:
cs - Underlying characters.
start - Index of first character in slice.
end - Index of one past the last character in the slice.
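
The slice convention used by these parameters (start inclusive, end exclusive) can be illustrated with a short standalone sketch. The class SliceSketch and its method are hypothetical and not part of LingPipe; they only show how such a character slice maps to a string.

```java
// Hypothetical helper for illustration; not part of LingPipe.
class SliceSketch {

    // Returns the characters cs[start] .. cs[end - 1] as a string.
    static String sliceToString(char[] cs, int start, int end) {
        if (start < 0 || end > cs.length || start > end) {
            throw new IndexOutOfBoundsException(
                "start=" + start + " end=" + end + " length=" + cs.length);
        }
        // The String constructor takes an offset and a count, so the
        // exclusive end index is converted to a length here.
        return new String(cs, start, end - start);
    }

    public static void main(String[] args) {
        char[] cs = "xxIL-2/NNyy".toCharArray();
        // Characters at indices 2 through 8 form the slice.
        System.out.println(sliceToString(cs, 2, 9));
    }
}
```

With this convention the slice length is always end - start, and an empty slice is expressed as start == end.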