com.aliasi.sentences
Class IndoEuropeanSentenceModel

java.lang.Object
  extended by com.aliasi.sentences.AbstractSentenceModel
      extended by com.aliasi.sentences.HeuristicSentenceModel
          extended by com.aliasi.sentences.IndoEuropeanSentenceModel
All Implemented Interfaces:
SentenceModel, Serializable

public class IndoEuropeanSentenceModel
extends HeuristicSentenceModel
implements Serializable

An IndoEuropeanSentenceModel is a heuristic sentence model designed primarily for English. Whether it balances parentheses and whether it forces the final token to be a boundary may be set in the constructor. It uses the default implementation of possible sentence starts and the following token sets:

Possible Stops
.
..
!
?
"
''
).
Impossible Penultimates
any single letter
personal and professional titles, ranks, etc.
commas, colons, and quotes
common abbreviations
directions
corporate designators
times, months, etc.
U.S. political parties
U.S. states (not ME or IN)
shipping terms
address abbreviations
Impossible Starts
possible stops (see above)
close parentheses
,
;
:
-
--
---
%
Note that all of these sets are case insensitive.
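
The way the three token sets interact can be illustrated with a small standalone sketch. This is not LingPipe's implementation: the class name SentenceBoundarySketch, its method names, and the tiny sample of impossible penultimates are all hypothetical, and the sketch ignores whitespace, parenthesis balancing, and the inherited possibleStart check. It assumes a possible stop counts as a boundary only when the token before it is not an impossible penultimate and the token after it is not an impossible start.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Simplified illustration only; NOT the LingPipe implementation.
class SentenceBoundarySketch {

    // Possible stops, as listed in the class description.
    static final Set<String> POSSIBLE_STOPS = new HashSet<>(
        Arrays.asList(".", "..", "!", "?", "\"", "''", ")."));

    // Hypothetical sample of impossible penultimates; the real set is far larger.
    static final Set<String> IMPOSSIBLE_PENULTIMATES = new HashSet<>(
        Arrays.asList("mr", "mrs", "dr", "inc", "corp", "jan", ",", ":"));

    // Impossible starts: close parenthesis and listed punctuation, plus all possible stops.
    static final Set<String> IMPOSSIBLE_STARTS = new HashSet<>(
        Arrays.asList(")", ",", ";", ":", "-", "--", "---", "%"));
    static {
        IMPOSSIBLE_STARTS.addAll(POSSIBLE_STOPS);
    }

    // Lower-casing before lookup makes the sets case insensitive.
    static String norm(String token) {
        return token.toLowerCase(Locale.ROOT);
    }

    static boolean isImpossiblePenultimate(String token) {
        String t = norm(token);
        boolean singleLetter = t.length() == 1 && Character.isLetter(t.charAt(0));
        return singleLetter || IMPOSSIBLE_PENULTIMATES.contains(t);
    }

    // Returns the indices of tokens treated as sentence-final in this sketch.
    static List<Integer> boundaryIndices(String[] tokens, boolean forceFinalToken) {
        List<Integer> boundaries = new ArrayList<>();
        for (int i = 0; i < tokens.length; i++) {
            boolean last = (i == tokens.length - 1);
            if (last && forceFinalToken) {
                boundaries.add(i);
                continue;
            }
            if (!POSSIBLE_STOPS.contains(norm(tokens[i]))) continue;
            if (i > 0 && isImpossiblePenultimate(tokens[i - 1])) continue;
            if (!last && IMPOSSIBLE_STARTS.contains(norm(tokens[i + 1]))) continue;
            boundaries.add(i);
        }
        return boundaries;
    }

    public static void main(String[] args) {
        String[] tokens = {"Mr", ".", "Smith", "left", ".", "He", "returned", "."};
        // The period after "Mr" is rejected because "mr" is an impossible penultimate.
        System.out.println(boundaryIndices(tokens, false)); // prints [4, 7]
    }
}
```

In the sketch, the first period is skipped because of the token before it, while the other two periods pass all three checks.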

Serialization

An Indo-European sentence model is serializable. The model read back in will be an instance of IndoEuropeanSentenceModel with the same behavior as the model that was written.
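
A round trip through standard Java object serialization might look like the following sketch. The helper roundTrip and the stand-in class StandInModel are hypothetical and exist only so the example runs without LingPipe on the classpath; with the library available, an IndoEuropeanSentenceModel instance could presumably be passed to the same helper, since the class implements Serializable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Illustrative helper; not part of LingPipe.
class SerializationRoundTripSketch {

    // Hypothetical stand-in carrying the same two flags as the constructor.
    static class StandInModel implements Serializable {
        private static final long serialVersionUID = 1L;
        final boolean forceFinalToken;
        final boolean balanceParentheses;

        StandInModel(boolean forceFinalToken, boolean balanceParentheses) {
            this.forceFinalToken = forceFinalToken;
            this.balanceParentheses = balanceParentheses;
        }
    }

    // Writes the object to an in-memory buffer and reads a copy back.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Serialization round trip failed", e);
        }
    }
}
```

The copy returned by roundTrip is a distinct object whose flags match those of the original.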

Since:
LingPipe 1.0
Version:
3.9
Author:
Bob Carpenter
See Also:
Serialized Form

Constructor Summary
IndoEuropeanSentenceModel()
          Construct an Indo-European sentence model that does not force the final token to be a stop and does not balance parentheses.
IndoEuropeanSentenceModel(boolean forceFinalToken, boolean balanceParentheses)
          Construct an Indo-European sentence model that forces final tokens and balances parentheses according to the specified flags.
 
Method Summary
 
Methods inherited from class com.aliasi.sentences.HeuristicSentenceModel
balanceParens, boundaryIndices, forceFinalStop, possibleStart
 
Methods inherited from class com.aliasi.sentences.AbstractSentenceModel
boundaryIndices, boundaryIndices, verifyBounds, verifyTokensWhitespaces
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

IndoEuropeanSentenceModel

public IndoEuropeanSentenceModel()
Construct an Indo-European sentence model that does not force the final token to be a stop and does not balance parentheses.


IndoEuropeanSentenceModel

public IndoEuropeanSentenceModel(boolean forceFinalToken,
                                 boolean balanceParentheses)
Construct an Indo-European sentence model that forces final tokens and balances parentheses according to the specified flags.

Parameters:
forceFinalToken - Whether the final token is always a sentence stop.
balanceParentheses - Whether parentheses must be balanced before a sentence can stop; if true, a sentence cannot stop while an open parenthesis remains unclosed.
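
The effect of the balanceParentheses flag can be pictured with a standalone sketch that tracks parenthesis depth. The class ParenBalanceSketch and its method stopIndices are hypothetical, only ".", "!" and "?" are treated as stops, and only round parentheses are counted. LingPipe's actual handling may differ in detail.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified illustration of parenthesis balancing; NOT the LingPipe implementation.
class ParenBalanceSketch {

    static boolean isStop(String token) {
        return token.equals(".") || token.equals("!") || token.equals("?");
    }

    // Returns indices of stop tokens, skipping those inside open parentheses
    // when balanceParentheses is true.
    static List<Integer> stopIndices(String[] tokens, boolean balanceParentheses) {
        List<Integer> stops = new ArrayList<>();
        int depth = 0; // number of currently open parentheses
        for (int i = 0; i < tokens.length; i++) {
            String token = tokens[i];
            if (token.equals("(")) {
                depth++;
            } else if (token.equals(")")) {
                depth = Math.max(0, depth - 1); // ignore unmatched close parentheses
            } else if (isStop(token) && (!balanceParentheses || depth == 0)) {
                stops.add(i);
            }
        }
        return stops;
    }
}
```

For the tokens He, left, (, see, note, ., Then, ), and a final period, the sketch returns only index 8 when balancing is on and indices 5 and 8 when it is off.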