com.aliasi.chunk
Class ChunkingImpl

java.lang.Object
  extended by com.aliasi.chunk.ChunkingImpl
All Implemented Interfaces:
Chunking

public class ChunkingImpl
extends Object
implements Chunking

A ChunkingImpl provides a mutable, set-based implementation of the Chunking interface. The underlying character sequence or character slice is specified at construction time. Chunks may then be added using the add(Chunk) or addAll(Collection) methods.

Since:
LingPipe 2.1
Version:
3.1
Author:
Bob Carpenter

Constructor Summary
ChunkingImpl(char[] cs, int start, int end)
          Constructs a chunking implementation to hold chunks over the specified character slice.
ChunkingImpl(CharSequence cSeq)
          Constructs a chunking implementation to hold chunks over the specified character sequence.
 
Method Summary
 void add(Chunk chunk)
          Adds a chunk to this chunking.
 void addAll(Collection chunks)
          Adds all of the chunks in the specified collection to this chunking.
 CharSequence charSequence()
          Returns the character sequence underlying this chunking.
 Set<Chunk> chunkSet()
          Returns the set of chunks for this chunking.
static boolean equal(Chunking chunking1, Chunking chunking2)
          Returns true if the specified chunkings are equal.
 boolean equals(Object that)
          Returns true if the specified object is a chunking equal to this one.
 int hashCode()
          Returns the hash code for this chunking.
static int hashCode(Chunking chunking)
          Returns the hash code for the specified chunking.
 String toString()
          Returns a string-based representation of this chunking.
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
 

Constructor Detail

ChunkingImpl

public ChunkingImpl(CharSequence cSeq)
Constructs a chunking implementation to hold chunks over the specified character sequence. The sequence is stored immutably in this implementation, so later changes to the sequence provided to this constructor will not affect the constructed chunking implementation. All chunks added must be within this character sequence's bounds.

Parameters:
cSeq - Character sequence underlying the chunking.

ChunkingImpl

public ChunkingImpl(char[] cs,
                    int start,
                    int end)
Constructs a chunking implementation to hold chunks over the specified character slice. The slice is copied, so later changes to the array do not affect the constructed chunking. All chunks added to this chunking must fall within the slice's bounds, expressed relative to the slice. That is, chunk indices are offsets from the start parameter of this constructor, not absolute offsets into the character array.

Parameters:
cs - Character array.
start - Index in the array of the first character in the slice.
end - Index in the array of one past the last character in the slice.
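
The relationship between slice-relative chunk offsets and absolute array offsets can be illustrated without LingPipe on the classpath. The following is a minimal sketch that only mimics the copy-and-rebase behavior described above. The class SliceOffsetsSketch and its helper methods are hypothetical and are not part of the library.

```java
// Hypothetical illustration of slice-relative offsets; not LingPipe code.
final class SliceOffsetsSketch {

    // Copies cs[start..end) so that later changes to cs are not visible,
    // mirroring the copy described for the slice constructor.
    static String copySlice(char[] cs, int start, int end) {
        return new String(cs, start, end - start);
    }

    // Converts an absolute array offset to a slice-relative offset.
    static int toRelative(int absoluteOffset, int start) {
        return absoluteOffset - start;
    }

    public static void main(String[] args) {
        char[] cs = "xxJohn ran.yy".toCharArray();
        int start = 2;
        int end = 11;                         // slice is "John ran."
        String slice = copySlice(cs, start, end);

        // "John" occupies absolute offsets [2, 6) in the array,
        // which is [0, 4) relative to the slice.
        int relStart = toRelative(2, start);
        int relEnd = toRelative(6, start);
        System.out.println(slice.substring(relStart, relEnd));

        cs[2] = 'Z';                          // does not affect the copy
        System.out.println(slice);
    }
}
```

Under this reading, a chunk covering "John" would be added with start 0 and end 4, not 2 and 6.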
Method Detail

addAll

public void addAll(Collection chunks)
Adds all of the chunks in the specified collection to this chunking. If any element of the collection does not implement the Chunk interface, an illegal argument exception is thrown.

Parameters:
chunks - Chunks to add to this chunking.
Throws:
IllegalArgumentException - If the collection contains an object that does not implement Chunk.

add

public void add(Chunk chunk)
Adds a chunk to this chunking. The chunk must have start and end points within the bounds provided by the character sequence underlying this chunking.

Parameters:
chunk - Chunk to add to this chunking.
Throws:
IllegalArgumentException - If the end point is beyond the underlying character sequence.
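
The bounds requirement can be sketched with a small stand-in class. This is an assumption-laden illustration rather than the library's implementation: BoundedSpanSetSketch is hypothetical, spans are stored as plain strings instead of Chunk objects, and only the documented end-point check is modeled.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical stand-in for a set-based chunking; not LingPipe code.
final class BoundedSpanSetSketch {
    private final String text;
    private final Set<String> spans = new LinkedHashSet<>();

    BoundedSpanSetSketch(CharSequence cSeq) {
        this.text = cSeq.toString();          // private copy of the text
    }

    // Adds the span [start, end), rejecting it if it ends past the text.
    void add(int start, int end) {
        if (end > text.length()) {
            throw new IllegalArgumentException(
                "End " + end + " is beyond length " + text.length());
        }
        spans.add(start + "-" + end);
    }

    int size() {
        return spans.size();
    }

    public static void main(String[] args) {
        BoundedSpanSetSketch sketch = new BoundedSpanSetSketch("John ran.");
        sketch.add(0, 4);                     // "John" is within bounds
        try {
            sketch.add(5, 100);               // ends past the text
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
        System.out.println(sketch.size());
    }
}
```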

charSequence

public CharSequence charSequence()
Returns the character sequence underlying this chunking.

Specified by:
charSequence in interface Chunking
Returns:
The character sequence underlying this chunking.

chunkSet

public Set<Chunk> chunkSet()
Returns the set of chunks for this chunking. The returned set is an unmodifiable view of the chunks in this chunking; it reflects chunks added to this chunking later, but it may not be modified by the caller.

Specified by:
chunkSet in interface Chunking
Returns:
The set of chunks for this chunking.
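
The view semantics described for chunkSet() resemble those of java.util.Collections.unmodifiableSet. Whether ChunkingImpl is implemented this way is an assumption. The sketch below uses a hypothetical class, UnmodifiableViewSketch, only to show what a live but read-only view looks like to a caller.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration of a live, read-only view; not LingPipe code.
final class UnmodifiableViewSketch {
    private final Set<String> backing = new HashSet<>();
    private final Set<String> view = Collections.unmodifiableSet(backing);

    // Mutation goes through the owner, never through the view.
    void add(String item) {
        backing.add(item);
    }

    // Returns a view that tracks the backing set but rejects writes.
    Set<String> viewSet() {
        return view;
    }

    public static void main(String[] args) {
        UnmodifiableViewSketch owner = new UnmodifiableViewSketch();
        Set<String> view = owner.viewSet();
        owner.add("0-4:PERSON");
        System.out.println(view.size());      // the view sees the new item
        try {
            view.add("5-8:OTHER");            // writes through the view fail
        } catch (UnsupportedOperationException e) {
            System.out.println("View is read-only");
        }
    }
}
```

A caller that needs a stable snapshot rather than a live view can copy the returned set, for example with new HashSet<Chunk>(chunking.chunkSet()).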

equals

public boolean equals(Object that)
Description copied from interface: Chunking
Returns true if the specified object is a chunking equal to this one. Equality for chunking is defined by character sequence yield equality and chunk set equality. Character sequences are tested for equality with Strings.equalCharSequence(CharSequence,CharSequence) and chunks are compared as sets with elements tested for equality using Chunk.equals(Object). There is a utility implementation of this definition provided for chunkings in equal(Chunking,Chunking).

Specified by:
equals in interface Chunking
Overrides:
equals in class Object
Parameters:
that - Object to compare.
Returns:
true if the specified object is a chunking equal to this one.
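
The equality definition above can be sketched in plain Java. This is an illustration of the definition, not the library's code: ChunkingEqualitySketch is hypothetical, sameChars stands in for Strings.equalCharSequence(CharSequence,CharSequence), and chunks are represented as strings so the example has no LingPipe dependency.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical illustration of the equality definition; not LingPipe code.
final class ChunkingEqualitySketch {

    // Compares two character sequences by content, so a String and a
    // StringBuilder holding the same characters count as equal.
    static boolean sameChars(CharSequence a, CharSequence b) {
        if (a.length() != b.length()) return false;
        for (int i = 0; i < a.length(); ++i) {
            if (a.charAt(i) != b.charAt(i)) return false;
        }
        return true;
    }

    // Equal when the text matches and the chunk sets are equal as sets.
    static boolean equal(CharSequence text1, Set<?> chunks1,
                         CharSequence text2, Set<?> chunks2) {
        return sameChars(text1, text2) && chunks1.equals(chunks2);
    }

    public static void main(String[] args) {
        Set<String> first = new HashSet<>(
            Arrays.asList("0-4:PERSON", "5-8:OTHER"));
        Set<String> second = new LinkedHashSet<>(
            Arrays.asList("5-8:OTHER", "0-4:PERSON"));
        // Different sequence types and insertion orders, same content.
        System.out.println(
            equal("John ran", first, new StringBuilder("John ran"), second));
    }
}
```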

hashCode

public int hashCode()
Description copied from interface: Chunking
Returns the hash code for this chunking. Hash codes for chunkings are defined by:
 hashCode() 
   = Strings.hashCode(charSequence())
     + 31 * chunkSet().hashCode()
 
There is a utility implementation of this definition provided for chunkings in hashCode(Chunking).

Specified by:
hashCode in interface Chunking
Overrides:
hashCode in class Object
Returns:
The hash code for this chunking.
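
The hash code formula can be written out as a short sketch. ChunkingHashSketch is hypothetical, and charSequenceHash is a stand-in for Strings.hashCode(CharSequence) under the assumption that the library's character sequence hash agrees with String.hashCode() for the same characters.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration of the hash code definition; not LingPipe code.
final class ChunkingHashSketch {

    // Assumption: the character sequence hash matches String.hashCode(),
    // computed here as h = 31 * h + c over the characters in order.
    static int charSequenceHash(CharSequence cSeq) {
        int h = 0;
        for (int i = 0; i < cSeq.length(); ++i) {
            h = 31 * h + cSeq.charAt(i);
        }
        return h;
    }

    // Combines the text hash and the chunk set hash as in the formula.
    static int chunkingHash(CharSequence cSeq, Set<?> chunkSet) {
        return charSequenceHash(cSeq) + 31 * chunkSet.hashCode();
    }

    public static void main(String[] args) {
        Set<String> chunks = new HashSet<>(Arrays.asList("0-4:PERSON"));
        int fromString = chunkingHash("John ran", chunks);
        int fromBuilder = chunkingHash(new StringBuilder("John ran"), chunks);
        // Same characters and same chunk set give the same hash code.
        System.out.println(fromString == fromBuilder);
    }
}
```

Defining the hash over character content rather than object identity keeps it consistent with the content-based equality definition, so equal chunkings hash alike regardless of the CharSequence implementation that backs them.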

equal

public static boolean equal(Chunking chunking1,
                            Chunking chunking2)
Returns true if the specified chunkings are equal. Chunking equality is defined in Chunking.equals(Object) to be equality of character sequence yields and equality of chunk sets.

Warning: Equality is unstable if the chunkings change.

Parameters:
chunking1 - First chunking.
chunking2 - Second chunking.
Returns:
true if the chunkings are equal.

hashCode

public static int hashCode(Chunking chunking)
Returns the hash code for the specified chunking. The hash code for a chunking is defined by Chunking.hashCode().

Warning: Hash codes are unstable if the chunkings change.

Parameters:
chunking - Chunking whose hash code is returned.
Returns:
The hash code for the specified chunking.
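
The practical consequence of the warning is that a mutable object with a content-based hash code should not be changed while it is stored in a hash-based collection. The sketch below shows the general Java behavior using a java.util.ArrayList as a stand-in for a mutable chunking. UnstableHashSketch is hypothetical and does not use LingPipe classes.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of unstable hash codes; not LingPipe code.
final class UnstableHashSketch {

    // Returns whether a key can still be found in a HashSet after the
    // key has been mutated while stored in the set.
    static boolean foundAfterMutation() {
        List<String> key = new ArrayList<>();
        key.add("a");
        Set<List<String>> index = new HashSet<>();
        index.add(key);                       // stored under the old hash
        key.add("b");                         // content-based hash changes
        return index.contains(key);           // looked up under the new hash
    }

    public static void main(String[] args) {
        System.out.println(foundAfterMutation());
    }
}
```

The lookup fails because the set recorded the hash code computed before the mutation. The same hazard would apply to a ChunkingImpl that receives new chunks after being used as a set element or map key.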

toString

public String toString()
Returns a string-based representation of this chunking. This representation includes the character sequence and each chunk in the chunk set.

Overrides:
toString in class Object
Returns:
String-based representation of this chunking.