java.lang.Object
  com.aliasi.coref.BooleanMatcherAdapter
    com.aliasi.coref.matchers.SequenceSubstringMatch

public final class SequenceSubstringMatch
extends BooleanMatcherAdapter
Implements a matching function that returns the score specified in
the constructor if the normal tokens of the mention are within a
specified edit distance of the normal tokens of one of the mentions
in the mention chain. The edit distance is computed token-wise.
Subclasses of this class may redefine the basic edit costs provided by
deleteCost(String), insertCost(String),
and substituteCost(String,String). As defined in this class, the
cost is 1 for an insertion or a deletion, 0 for an exact
substitution, and 2 for a mismatch substitution.
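
The default costs described above can be illustrated with a small stand-alone sketch. This is not the LingPipe implementation; the class name TokenEditDistanceSketch and the distance helper are hypothetical, and a standard dynamic-programming edit distance over tokens is assumed.

```java
// Hypothetical stand-alone sketch; not the SequenceSubstringMatch source.
class TokenEditDistanceSketch {

    // Default costs as described above: 1 to delete or insert a token.
    static int deleteCost(String token) {
        return 1;
    }

    static int insertCost(String token) {
        return 1;
    }

    // 0 for an exact substitution, 2 for a mismatch substitution.
    static int substituteCost(String originalToken, String newToken) {
        return originalToken.equals(newToken) ? 0 : 2;
    }

    // Token-level edit distance computed with a full dynamic-programming table.
    static int distance(String[] tokens1, String[] tokens2) {
        int n = tokens1.length;
        int m = tokens2.length;
        int[][] d = new int[n + 1][m + 1];
        // First column: delete every token of tokens1.
        for (int i = 1; i <= n; i++) {
            d[i][0] = d[i - 1][0] + deleteCost(tokens1[i - 1]);
        }
        // First row: insert every token of tokens2.
        for (int j = 1; j <= m; j++) {
            d[0][j] = d[0][j - 1] + insertCost(tokens2[j - 1]);
        }
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                int del = d[i - 1][j] + deleteCost(tokens1[i - 1]);
                int ins = d[i][j - 1] + insertCost(tokens2[j - 1]);
                int sub = d[i - 1][j - 1]
                        + substituteCost(tokens1[i - 1], tokens2[j - 1]);
                d[i][j] = Math.min(sub, Math.min(del, ins));
            }
        }
        return d[n][m];
    }

    // True if the token arrays are within the given maximum distance.
    static boolean withinEditDistance(String[] tokens1, String[] tokens2,
                                      int maximumDistance) {
        return distance(tokens1, tokens2) <= maximumDistance;
    }
}
```

Under these costs a mismatch substitution (2) is never cheaper than a deletion followed by an insertion (1 + 1), so the distance between two single-token arrays that differ is 2.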

| Field Summary |
|---|
| Fields inherited from interface com.aliasi.coref.Matcher: MAX_DISTANCE_SCORE, MAX_SCORE, MAX_SEMANTIC_SCORE, NO_MATCH_SCORE |

| Constructor Summary | |
|---|---|
| SequenceSubstringMatch(int score) | Constructs a sequence substring matcher that returns the specified score in the case of a match. |

| Method Summary | |
|---|---|
| protected int | deleteCost(String token): Returns the cost to delete the specified token. |
| protected int | insertCost(String token): Returns the cost to insert the specified token. |
| boolean | matchBoolean(Mention mention, MentionChain chain): Returns true if the normal tokens in the mention are within a threshold edit distance of the normal tokens in one of the mentions in the chain. |
| protected int | substituteCost(String originalToken, String newToken): Returns the cost to substitute the new token for the original token. |
| boolean | withinEditDistance(String[] tokens1, String[] tokens2): Returns true if the specified arrays of tokens have an edit distance within the distance specified internally. |
| boolean | withinEditDistance(String[] tokens1, String[] tokens2, int maximumDistance): Returns true if the specified arrays of tokens are within the specified maximum distance, allowing for deletion, insertion and substitution costs as specified by deleteCost(String), insertCost(String), and substituteCost(String,String). |

| Methods inherited from class com.aliasi.coref.BooleanMatcherAdapter |
|---|
| match |

| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |

| Constructor Detail |
|---|

public SequenceSubstringMatch(int score)
Parameters:
score - Score to return in the case of a match.

| Method Detail |
|---|

public boolean matchBoolean(Mention mention,
                            MentionChain chain)
Returns true if the normal tokens in the mention
are within a threshold edit distance of the normal tokens in
one of the mentions in the chain.
matchBoolean in class BooleanMatcherAdapter
Parameters:
mention - Mention to test.
chain - Mention chain to test.
Returns:
true if there is a sequence substring
match between the mention and chain.

public boolean withinEditDistance(String[] tokens1,
                                  String[] tokens2)
Returns true if the specified arrays of tokens
have an edit distance within the distance specified internally.
Parameters:
tokens1 - First array of tokens to test.
tokens2 - Second array of tokens to test.
Returns:
true if the edit distance between the
arrays of tokens is within the threshold.

public boolean withinEditDistance(String[] tokens1,
                                  String[] tokens2,
                                  int maximumDistance)
Returns true if the specified arrays of tokens are
within the specified maximum distance, allowing for deletion,
insertion and substitution costs as specified by deleteCost(String),
insertCost(String), and substituteCost(String,String). To support
pairs of tokens from different sets, as well as asymmetric primitive
edit distances, insertions and deletions are costed separately, and
substitution may be order sensitive. Deletions are from the
first array of tokens, and insertions are into the second array.
Substitution costs are computed with the first argument
drawn from the first array of tokens and the second argument
drawn from the second array.
Parameters:
tokens1 - First array of tokens to match.
tokens2 - Second array of tokens to match.
maximumDistance - Maximum edit distance allowed between
token arrays.
Returns:
true if the edit distance between the
arrays is less than or equal to the specified maximum distance.

protected int deleteCost(String token)
Parameters:
token - Token to measure for deletion cost.

protected int insertCost(String token)
Parameters:
token - Token to measure for insertion cost.

protected int substituteCost(String originalToken,
                             String newToken)
Parameters:
originalToken - Original token.
newToken - New token.
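
The asymmetry between deletions from the first array and insertions into the second, described for the three-argument withinEditDistance, can be made concrete with a sketch of overridable cost hooks. The classes below are hypothetical stand-ins written for illustration (CostHooksSketch and CheapHonorificDeletion are not part of com.aliasi.coref), and the honorific list is an assumption.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in that mirrors the documented cost hooks.
class CostHooksSketch {

    protected int deleteCost(String token) {
        return 1;
    }

    protected int insertCost(String token) {
        return 1;
    }

    protected int substituteCost(String originalToken, String newToken) {
        return originalToken.equals(newToken) ? 0 : 2;
    }

    // Deletions apply to tokens1, insertions to tokens2, and substitution
    // takes its first argument from tokens1 and its second from tokens2.
    boolean withinEditDistance(String[] tokens1, String[] tokens2,
                               int maximumDistance) {
        int n = tokens1.length;
        int m = tokens2.length;
        int[][] d = new int[n + 1][m + 1];
        for (int i = 1; i <= n; i++) {
            d[i][0] = d[i - 1][0] + deleteCost(tokens1[i - 1]);
        }
        for (int j = 1; j <= m; j++) {
            d[0][j] = d[0][j - 1] + insertCost(tokens2[j - 1]);
        }
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                int del = d[i - 1][j] + deleteCost(tokens1[i - 1]);
                int ins = d[i][j - 1] + insertCost(tokens2[j - 1]);
                int sub = d[i - 1][j - 1]
                        + substituteCost(tokens1[i - 1], tokens2[j - 1]);
                d[i][j] = Math.min(sub, Math.min(del, ins));
            }
        }
        return d[n][m] <= maximumDistance;
    }
}

// Hypothetical subclass: dropping an honorific from the first array is free,
// while inserting one into the second array keeps the default cost of 1.
class CheapHonorificDeletion extends CostHooksSketch {

    private static final Set<String> HONORIFICS =
            new HashSet<>(Arrays.asList("mr", "mrs", "dr"));

    @Override
    protected int deleteCost(String token) {
        return HONORIFICS.contains(token) ? 0 : 1;
    }
}
```

With this subclass, comparing {"dr", "smith"} against {"smith"} with a maximum distance of 0 succeeds, because the deletion of "dr" is free. Swapping the arguments fails at the same threshold, because "dr" must then be inserted at cost 1. Argument order therefore matters once the costs are asymmetric.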