com.aliasi.util
Class CommaSeparatedValues

java.lang.Object
  extended by com.aliasi.util.CommaSeparatedValues
All Implemented Interfaces:
Serializable

public class CommaSeparatedValues
extends Object
implements Serializable

A CommaSeparatedValues object represents a two-dimensional array of strings that may be read and written in a comma-separated-value (CSV) string representation. The CSV encoding is general enough to encode arbitrary ragged two-dimensional arrays of strings.

The CSV notation is character oriented, so any data read from or written to files or streams must use a character encoding. The behavior of reads and writes for characters that cannot be decoded or encoded is determined by InputStreamReader and OutputStreamWriter, constructed with the user-specified character set.
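To make the encoding behavior concrete, the following sketch uses only the JDK classes named above. The class CharsetSketch and its methods are hypothetical names introduced for illustration; they are not part of LingPipe, and the sketch shows how InputStreamReader and OutputStreamWriter behave rather than how this class uses them internally.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical helper, not part of LingPipe.
public class CharsetSketch {

    /** Decodes bytes through an InputStreamReader built with the named charset. */
    public static String decode(byte[] bytes, String charset) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(bytes), charset)) {
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
        } catch (IOException e) {
            // Also covers an unsupported charset name.
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    /** Encodes text through an OutputStreamWriter built with the named charset. */
    public static byte[] encode(String text, String charset) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(bytes, charset)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }
}
```

With these helpers, encoding the string "caf\u00e9" as US-ASCII yields the bytes for "caf?", because OutputStreamWriter substitutes the charset's replacement byte for unmappable characters. Decoding UTF-8 bytes as US-ASCII yields the replacement character U+FFFD for each undecodable byte.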

The CSV format is row-oriented, consisting of a number of rows, followed by the end of the stream. Each row consists of a number of elements separated by commas. The rows are not required to contain the same number of elements.
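Because rows may differ in length, code that walks the resulting array should use each row's own length. A minimal sketch follows; the class name and the sample data are made up for illustration.

```java
// Hypothetical helper, not part of LingPipe.
public class RaggedRowsSketch {

    /** Returns the number of elements in each row of a possibly ragged array. */
    public static int[] rowLengths(String[][] rows) {
        int[] lengths = new int[rows.length];
        for (int i = 0; i < rows.length; i++) {
            lengths[i] = rows[i].length;
        }
        return lengths;
    }

    public static void main(String[] args) {
        // Made-up data: three rows with 3, 1, and 2 elements.
        String[][] rows = {
            {"id", "name", "email"},
            {"1"},
            {"2", "Grace"}
        };
        for (String[] row : rows) {
            // Use row.length for each row; do not assume rows[0].length.
            System.out.println(row.length + " element(s): " + String.join(" | ", row));
        }
    }
}
```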

An element may be plain or quoted. Plain elements consist of a sequence of characters not containing any double quote ("), comma (,), or newline (\n) characters. Any leading or trailing whitespace is trimmed to produce the element string. For CSV processing, a whitespace character may be either a space (' ') or a tab (Java literal '\t') character.

Quoted elements consist of a sequence of characters surrounded by double quotes. The characters between the double quotes may include comma or newline characters. Double quotes may be included, but each must be escaped with another double quote. Any whitespace before or after the surrounding quotes is ignored, but any whitespace between the element-wrapping quotes is included in the element string.
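The rules for plain and quoted elements can be sketched as a small hand-written parser. This is an illustration of the format as described above, not LingPipe's implementation. The class CsvParseSketch is hypothetical, and its handling of cases the description leaves open is an assumption: a trailing comma produces a final empty element, and a final newline does not start a new row.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical parser for illustration; not LingPipe's implementation.
public class CsvParseSketch {

    private static boolean isSpace(char c) {
        return c == ' ' || c == '\t';
    }

    /** Parses CSV text into a possibly ragged array of element strings. */
    public static String[][] parse(String csv) {
        List<String[]> rows = new ArrayList<>();
        if (csv.isEmpty()) {
            return new String[0][];
        }
        List<String> row = new ArrayList<>();
        int n = csv.length();
        int i = 0;
        while (true) {
            while (i < n && isSpace(csv.charAt(i))) i++;  // drop leading whitespace
            if (i < n && csv.charAt(i) == '"') {
                i = readQuoted(csv, i + 1, row);
            } else {
                i = readPlain(csv, i, row);
            }
            if (i >= n) {                     // end of input closes the last row
                rows.add(row.toArray(new String[0]));
                break;
            }
            if (csv.charAt(i) == '\n') {      // newline closes the current row
                rows.add(row.toArray(new String[0]));
                row = new ArrayList<>();
                if (++i >= n) break;          // assumption: final newline adds no row
            } else {
                i++;                          // comma: continue with the next element
            }
        }
        return rows.toArray(new String[0][]);
    }

    /** Reads a plain element starting at start; returns the index of its terminator. */
    private static int readPlain(String csv, int start, List<String> row) {
        int i = start;
        while (i < csv.length() && csv.charAt(i) != ',' && csv.charAt(i) != '\n') {
            if (csv.charAt(i) == '"') {
                throw new IllegalArgumentException("quote in plain element at index " + i);
            }
            i++;
        }
        int end = i;
        while (end > start && isSpace(csv.charAt(end - 1))) end--;  // drop trailing whitespace
        row.add(csv.substring(start, end));
        return i;
    }

    /** Reads a quoted element whose content begins at start; returns the terminator index. */
    private static int readQuoted(String csv, int start, List<String> row) {
        StringBuilder sb = new StringBuilder();
        int n = csv.length();
        int i = start;
        while (true) {
            if (i >= n) throw new IllegalArgumentException("unterminated quoted element");
            char c = csv.charAt(i++);
            if (c != '"') {
                sb.append(c);                 // commas, newlines, whitespace kept verbatim
            } else if (i < n && csv.charAt(i) == '"') {
                sb.append('"');               // doubled quote is an escaped quote
                i++;
            } else {
                break;                        // closing quote
            }
        }
        while (i < n && isSpace(csv.charAt(i))) i++;  // ignore whitespace after closing quote
        if (i < n && csv.charAt(i) != ',' && csv.charAt(i) != '\n') {
            throw new IllegalArgumentException("unexpected character at index " + i);
        }
        row.add(sb.toString());
        return i;
    }
}
```

The sketch throws IllegalArgumentException for ill-formed input, mirroring the exception type the constructors below declare, though the exact conditions under which LingPipe rejects input may differ.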

Since:
LingPipe 3.1
Version:
3.8
Author:
Bob Carpenter
See Also:
Serialized Form

Constructor Summary
CommaSeparatedValues(File file, String charset)
          Construct a comma-separated values array from the specified file using the specified character set.
CommaSeparatedValues(InputStream in, String charset)
          Construct a comma-separated values array from the specified input stream using the specified character set.
CommaSeparatedValues(Reader reader)
          Construct a comma-separated values array from the specified reader.
 
Method Summary
 String[][] getArray()
          Returns the underlying array for this comma-separated values object.
 void toFile(File file, String charset)
          Write this comma-separated values object to the specified file using the specified charset.
 void toStream(OutputStream out, String charset)
          Write this comma-separated values object to the specified output stream using the specified charset.
 String toString()
          Returns a string-based representation of this comma-separated values object.
 void toWriter(Writer writer)
          Write this comma-separated values object to the specified writer.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

CommaSeparatedValues

public CommaSeparatedValues(File file,
                            String charset)
                     throws IOException
Construct a comma-separated values array from the specified file using the specified character set.

Parameters:
file - File from which to read.
charset - Encoding of characters in the stream.
Throws:
IOException - If there is an underlying I/O error.
IllegalArgumentException - If the stream of characters produced by the reader is not a well-defined CSV string.

CommaSeparatedValues

public CommaSeparatedValues(InputStream in,
                            String charset)
                     throws IOException
Construct a comma-separated values array from the specified input stream using the specified character set. The stream is converted to an input reader and then buffered. The input stream will be fully read and closed after the read is complete or if there is an exception.

Parameters:
in - Input stream from which to read.
charset - Encoding of characters in the stream.
Throws:
IOException - If there is an underlying I/O error.
IllegalArgumentException - If the stream of characters produced by the reader is not a well-defined CSV string.
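The convert, buffer, read fully, and close behavior described for this constructor can be approximated with standard JDK classes. The sketch below is an assumption about the general pattern, not the LingPipe source; ReadAndCloseSketch and readFully are hypothetical names.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;

// Hypothetical helper illustrating the pattern; not the LingPipe source.
public class ReadAndCloseSketch {

    /** Reads all characters from the stream, closing it on success or failure. */
    public static String readFully(InputStream in, String charset) throws IOException {
        // Listing the stream as its own resource ensures it is closed even if
        // constructing the reader fails (for example, on an unsupported charset).
        try (InputStream stream = in;
             BufferedReader reader =
                 new BufferedReader(new InputStreamReader(stream, charset))) {
            StringBuilder sb = new StringBuilder();
            char[] buffer = new char[4096];
            int count;
            while ((count = reader.read(buffer)) != -1) {
                sb.append(buffer, 0, count);
            }
            return sb.toString();
        }
    }
}
```

Because the constructor closes the stream itself, callers should not expect to reuse the stream afterwards.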

CommaSeparatedValues

public CommaSeparatedValues(Reader reader)
                     throws IOException
Construct a comma-separated values array from the specified reader. The reader will be fully read and closed after the read is complete or if there is an exception. No further buffering is done to the reader.

Parameters:
reader - Reader from which the CSV object will be read.
Throws:
IOException - If there is an underlying I/O error.
IllegalArgumentException - If the stream of characters produced by the reader is not a well-defined CSV string.

Method Detail

getArray

public String[][] getArray()
Returns the underlying array for this comma-separated values object. Modifying this array will change the values that are written out.

Returns:
The array underlying this CSV object.
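Since the returned array is the underlying one rather than a copy, a caller who wants to edit values without affecting later writes could copy it first. A minimal sketch, using a hypothetical helper class that is not part of LingPipe:

```java
// Hypothetical helper, not part of LingPipe.
public class ArrayCopySketch {

    /** Returns a deep copy so that edits do not reach the original rows. */
    public static String[][] deepCopy(String[][] rows) {
        String[][] copy = new String[rows.length][];
        for (int i = 0; i < rows.length; i++) {
            // Strings are immutable, so cloning each row array is sufficient.
            copy[i] = rows[i].clone();
        }
        return copy;
    }
}
```

Applied to the result of getArray(), changes to the copy would leave the CSV object's contents untouched, whereas changes made directly to the returned array would appear in subsequent writes.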

toFile

public void toFile(File file,
                   String charset)
            throws IOException
Write this comma-separated values object to the specified file using the specified charset. Characters in the elements that are not encodable in the specified character set are replaced with the question mark (?) character.

Parameters:
file - File to which this CSV object is written.
charset - Character encoding to use for characters.
Throws:
IOException - If there is an underlying I/O exception.

toStream

public void toStream(OutputStream out,
                     String charset)
              throws IOException
Write this comma-separated values object to the specified output stream using the specified charset. Characters in the elements that are not encodable in the specified character set are replaced with the question mark (?) character.

Parameters:
out - Stream to which this CSV object is written.
charset - Character encoding to use for characters.
Throws:
IOException - If there is an underlying I/O exception.

toWriter

public void toWriter(Writer writer)
              throws IOException
Write this comma-separated values object to the specified writer.

Parameters:
writer - Writer to which this CSV object is written.
Throws:
IOException - If there is an underlying I/O exception.

toString

public String toString()
Returns a string-based representation of this comma-separated values object. Reading the string back in through a file or stream will reproduce the same array of values.

Overrides:
toString in class Object
Returns:
The string-based representation of this CSV array.
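
For a string form to survive being read back, every element that would otherwise be altered by parsing has to be quoted. The sketch below shows one way to do this under the format rules in the class description. It is an assumption, not the actual output of toString(), which may quote more or fewer elements; CsvEncodeSketch is a hypothetical name.

```java
// Hypothetical encoder for illustration; not LingPipe's toString() implementation.
public class CsvEncodeSketch {

    private static boolean isSpace(char c) {
        return c == ' ' || c == '\t';
    }

    /** True if the element would be changed or misread when written as a plain element. */
    static boolean needsQuotes(String s) {
        if (s.isEmpty()) {
            return false;
        }
        // Leading or trailing whitespace would be trimmed from a plain element.
        if (isSpace(s.charAt(0)) || isSpace(s.charAt(s.length() - 1))) {
            return true;
        }
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '"' || c == ',' || c == '\n') {
                return true;
            }
        }
        return false;
    }

    /** Quotes the element if required, doubling any embedded double quotes. */
    static String encodeElement(String s) {
        return needsQuotes(s) ? "\"" + s.replace("\"", "\"\"") + "\"" : s;
    }

    /** Joins elements with commas and ends each row with a newline. */
    public static String encode(String[][] rows) {
        StringBuilder sb = new StringBuilder();
        for (String[] row : rows) {
            for (int j = 0; j < row.length; j++) {
                if (j > 0) {
                    sb.append(',');
                }
                sb.append(encodeElement(row[j]));
            }
            sb.append('\n');
        }
        return sb.toString();
    }
}
```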