Class MedPostPosParser

  extended by com.aliasi.corpus.Parser<H>
      extended by com.aliasi.corpus.StringParser<TagHandler>
          extended by com.aliasi.corpus.parsers.AbstractMedTagParser
              extended by com.aliasi.corpus.parsers.MedPostPosParser

Deprecated. This class will move to the demos in 4.0.

public class MedPostPosParser
extends AbstractMedTagParser

The MedPostPosParser class provides a parser for the MedPost part-of-speech corpus. MedPost was created at the United States National Center for Biotechnology Information (NCBI) and is part of their MedTag distribution, which also includes a gene-chunked corpus.

NCBI distributes the MedPost corpus freely for public use as a "United States Government Work" (see the README file included in the distribution for full licensing information).

The labeled part-of-speech files match the pattern medtag/medpost/*.ioc relative to the directory into which the distribution was unpacked.
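The file pattern above can be resolved with the standard java.nio.file API. The following is a minimal sketch, not part of LingPipe: the class name MedPostFilesSketch and the method listIocFiles are hypothetical, and the argument is assumed to be the directory into which the MedTag distribution was unpacked.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper (not part of LingPipe) that locates the MedPost training files. */
final class MedPostFilesSketch {

    /**
     * Returns the files matching medtag/medpost/*.ioc below the given root, sorted by path.
     * Assumption: distributionRoot is the directory into which MedTag was unpacked.
     */
    static List<Path> listIocFiles(Path distributionRoot) throws IOException {
        Path dir = distributionRoot.resolve("medtag").resolve("medpost");
        List<Path> files = new ArrayList<>();
        // The glob is matched against file names in this one directory only.
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "*.ioc")) {
            for (Path p : stream) {
                files.add(p);
            }
        }
        Collections.sort(files);
        return files;
    }
}
```

Each returned path could then be handed to the parser, for example through one of the inherited parse methods.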

The beginning of the first training file, medtag/medpost/tag_mb01.ioc, is:

 A_DD MAP_NN kinase_NN activator_NN recently_RR purified_VVN and_CC cloned_VVN has_VHZ been_VBN shown_VVN to_TO be_VBI a_DD protein_NN kinase_NN (_( MAP_NN kinase_NN kinase_NN )_) that_PNR is_VBZ able_JJ to_TO induce_VVI the_DD dual_JJ phosphorylation_NN of_II MAP_NN kinase_NN on_II both_CC the_DD regulatory_JJ tyrosine_NN and_CC threonine_NN sites_NNS in_RR+ vitro_RR ._.
 Here_RR we_PN report_VVB the_DD cloning_VVGN and_CC characterization_NN of_II a_DD novel_JJ dual_JJ specific_JJ phosphatase_NN ,_, HVH2_NN ,_, which_PNR may_VM function_VVB in_RR+ vivo_RR as_II a_DD MAP_NN kinase_NN phosphatase_NN ._.
Note that sentences are marked with identifiers on their own line and followed by the text of the sentence with underscores separating words from their tags.
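To illustrate the word_TAG layout of a sentence line, here is a minimal sketch that splits one line into parallel token and tag arrays. It is not the parser's actual implementation: the class MedPostLineSketch and the method splitLine are hypothetical, and the sketch assumes that the tag follows the last underscore of each whitespace-separated item.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of LingPipe) that splits one MedPost sentence line. */
final class MedPostLineSketch {

    /** Returns { tokens, tags } for a line such as "MAP_NN kinase_NN ._.". */
    static String[][] splitLine(String line) {
        List<String> tokens = new ArrayList<>();
        List<String> tags = new ArrayList<>();
        for (String pair : line.trim().split("\\s+")) {
            if (pair.isEmpty()) {
                continue; // blank input line
            }
            // Assumption: the tag follows the LAST underscore, so "(_(" yields token "(" and tag "(".
            int sep = pair.lastIndexOf('_');
            if (sep < 0) {
                throw new IllegalArgumentException("No tag separator in: " + pair);
            }
            tokens.add(pair.substring(0, sep));
            tags.add(pair.substring(sep + 1));
        }
        return new String[][] {
            tokens.toArray(new String[0]),
            tags.toArray(new String[0])
        };
    }

    public static void main(String[] args) {
        String[][] result = splitLine("in_RR+ vitro_RR ._.");
        for (int i = 0; i < result[0].length; i++) {
            System.out.println(result[0][i] + "\t" + result[1][i]);
        }
    }
}
```

Arrays of this shape correspond to the tokens and tags arguments of parseTokensTags.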

The primary citation for MedPost is available freely from BioMedCentral:

Author:
Bob Carpenter

Constructor Summary
MedPostPosParser()
          Deprecated. Construct a MedPost corpus part-of-speech tag parser with no handler specified.
MedPostPosParser(TagHandler handler)
          Deprecated. Moving to demos in 4.0.
Method Summary
protected  void parseTokensTags(String[] tokens, String[] whitespaces, String[] tags)
          Deprecated. Passes the specified tokens, whitespaces and tags to the contained handler.
Methods inherited from class com.aliasi.corpus.parsers.AbstractMedTagParser
parseString, tagHandler
Methods inherited from class com.aliasi.corpus.StringParser
Methods inherited from class com.aliasi.corpus.Parser
getHandler, parse, parse, parseString, setHandler
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Detail


public MedPostPosParser()
Construct a MedPost corpus part-of-speech tag parser with no handler specified.


public MedPostPosParser(TagHandler handler)
Deprecated. Moving to demos in 4.0.

Construct a MedPost corpus part-of-speech tag parser with the specified tag handler.

Parameters:
handler - Tag handler.
Method Detail


protected void parseTokensTags(String[] tokens,
                               String[] whitespaces,
                               String[] tags)
Passes the specified tokens and tags to the contained handler. The specified whitespaces are ignored: this implementation passes the tokens, a null whitespace array, and the tags to the contained handler.

Specified by:
parseTokensTags in class AbstractMedTagParser
Parameters:
tokens - Tokens to handle.
whitespaces - Whitespaces to handle (ignored).
tags - Tags to handle.
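The pass-through behavior described for parseTokensTags can be sketched without the LingPipe classes. In the following illustration, SketchTagHandler is a hypothetical stand-in for the tag handler, and its three-array method signature and argument order are assumptions made for this sketch; PassThroughSketch is likewise not a LingPipe class.

```java
import java.util.Arrays;

/** Hypothetical illustration (not LingPipe code) of forwarding tokens and tags to a handler. */
final class PassThroughSketch {

    /** Stand-in for a tag handler; the signature is an assumption made for this sketch. */
    interface SketchTagHandler {
        void handle(String[] tokens, String[] whitespaces, String[] tags);
    }

    private final SketchTagHandler handler;

    PassThroughSketch(SketchTagHandler handler) {
        this.handler = handler;
    }

    /** Forwards tokens and tags unchanged; the whitespaces argument is ignored and null is sent. */
    void parseTokensTags(String[] tokens, String[] whitespaces, String[] tags) {
        handler.handle(tokens, null, tags);
    }

    public static void main(String[] args) {
        PassThroughSketch parser = new PassThroughSketch((tokens, whitespaces, tags) ->
            System.out.println(Arrays.toString(tokens)
                + " " + Arrays.toString(whitespaces)
                + " " + Arrays.toString(tags)));
        parser.parseTokensTags(
            new String[] { "MAP", "kinase" },
            new String[] { "", " ", "" },
            new String[] { "NN", "NN" });
    }
}
```

A handler written against this behavior should therefore not rely on the whitespace array being non-null.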