com.aliasi.io
Class BitOutput

java.lang.Object
  extended by com.aliasi.io.BitOutput

public class BitOutput
extends Object

A BitOutput wraps an underlying output stream to provide bit-level output. Output is written through the method writeBit(boolean), with true used for the bit 1 and false for the bit 0. The methods writeTrue() and writeFalse() are shorthand for writeBit(true) and writeBit(false) respectively.

If the number of bits written before closing the output does not land on a byte boundary, the remaining fractional byte is filled with 0 bits.

None of the methods in this class are safe for concurrent access by multiple threads.

Since:
LingPipe2.1.1
Version:
2.1.1
Author:
Bob Carpenter

Constructor Summary
BitOutput(OutputStream out)
          Construct a bit output wrapping the specified output stream.
 
Method Summary
 void close()
          Closes underlying output stream and releases any resources associated with the stream.
 void flush()
          Flushes writes to the underlying output stream.
static long leastSignificantBits(long n, int numBits)
          Returns the specified number of the least significant bits of the specified long value as a long.
static int mostSignificantPowerOfTwo(long n)
          Returns the index of the most significant bit filled for the specified long value.
static long sliceBits(long n, int leastSignificantBit, int numBits)
          Returns a slice of bits in the specified long value running from the specified least significant bit for the specified number of bits.
 void writeBinary(long n, int numBits)
          Writes the bits of a binary representation of the specified non-negative number in the specified number of bits.
 void writeBit(boolean bit)
          Writes the specified bit.
 void writeDelta(long n)
          Writes the bits for the Elias delta code for the specified positive number.
 void writeFalse()
          Writes a single false (0) bit.
 void writeFibonacci(long n)
          Writes the Fibonacci code for the specified positive number.
 void writeGamma(long n)
          Writes the bits for the Elias gamma code for the specified positive number.
 void writeRice(long n, int numFixedBits)
          Writes the bits for Rice code for the specified non-negative number with the specified number of bits fixed for the binary remainder.
 void writeTrue()
          Writes a single true (1) bit.
 void writeUnary(int n)
          Writes the bits for a unary code for the specified positive number.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

BitOutput

public BitOutput(OutputStream out)
Construct a bit output wrapping the specified output stream.

Parameters:
out - Underlying output stream.
Method Detail

writeUnary

public void writeUnary(int n)
                throws IOException
Writes the bits for a unary code for the specified positive number. The unary code for the number n is defined by:
unaryCode(n) = 0n-1 1
In words, the number n is coded as n-1 zeros followed by a one. The following table illustrates the first few unary codes:
NumberCode
11
201
3001
40001
500001

Parameters:
n - Number to code.
Throws:
IOException - If there is an I/O error writing to the underlying output stream.
IllegalArgumentException - If the number to be encoded is zero or negative.

writeBinary

public void writeBinary(long n,
                        int numBits)
                 throws IOException
Writes the bits of a binary representation of the specified non-negative number in the specified number of bits. if the number will not fit in the number of bits specified, an exception is raised.

For instance, the following illustrates one, two and three-bit codings.

Number Binary Code for Num Bits
1 2 3
01 0 00 000
11 1 01 001
210 Exception 10 010
310 Exception 11 011
4100 Exception Exception 100
5101 Exception Exception 101
6110 Exception Exception 110
7111 Exception Exception 111
81000 Exception Exception Exception

Parameters:
n - Number to code.
numBits - Number of bits to use for coding.
Throws:
IllegalArgumentException - If the number to code is negative, the number of bits is greater than 63, or the number will not fit into the specified number of bits.
IOException - If there is an error writing to the underlying output stream.

writeRice

public void writeRice(long n,
                      int numFixedBits)
               throws IOException
Writes the bits for Rice code for the specified non-negative number with the specified number of bits fixed for the binary remainder. Rice coding is a form of Golomb coding where the Golomb paramemter is a power of two (2 to the number of bits in the remainder). The Rice code is defined by unary coding a magnitude and then binary coding the remainder. It can be defined by taking a quotient and remainder:
m = 2b = (1<<b)
q = (n - 1) / m = (n - 1) >>> b
r = n - q*m - 1 = n - (q << b) - 1
both of which are defined by shifting, and then coding each in turn using a unary code for the quotient and binary code for the remainder:
riceCode(n,b) = unaryCode(q) binaryCode(r)
For example, we get the following codes with the number of fixed remainder bits set to 1, 2 and 3, with the unary coded quotient separated from the binary coded remainder by a space:
Number
n
Binary Code for Number of Remainder Bits
b=1 b=2 b=3
1 1 1 0 1 00 1 000
2 10 1 1 1 01 1 001
3 11 01 0 1 10 1 010
4 100 01 1 1 11 1 011
5 101 001 0 01 00 1 100
6 110 001 1 01 01 1 101
7 111 0001 0 01 10 1 110
8 1000 0001 1 01 11 1 111
9 1001 00001 0 001 00 01 000
10 1010 00001 1 001 01 01 001
11 1011 000001 0 001 10 01 010
12 1100 000001 1 001 11 01 011
13 1101 0000001 0 0001 00 01 100
14 1110 0000001 1 0001 01 01 101
15 1111 00000001 0 0001 10 01 110
16 10000 00000001 1 0001 11 01 111
17 10001 000000001 0 00001 00 001 000
In the limit, if the number of remaining bits to code is set to zero, the Rice code would reduce to a unary code:
riceCode(n,0) = unaryCode(n)
but this method will throw an exception with a remainder size of zero.

In the limit the other way, if the number of remaining bits is set to the width of the maximum value, the Rice code is just the unary coding of 1, which is the single binary digit 1, followed by the binary code itself:

riceCode(n,64) = unaryCode(1) binaryCode(n,64) = 1 binaryCode(n,64)

The method will throw an exception if the encoding produces a unary code that would output more bits than would fit in a positive integer (that is, more than (232-1) bits. For more information, see:

Parameters:
n - Number to code.
numFixedBits - Number of bits to use for the fixed remainder encoding.
Throws:
IOException - If there is an error writing to the underlying output stream.
IllegalArgumentException - If the number to be encoded is not positive, if the number of fixed bits is not positive, or if the unary prefix code overflows.

writeFibonacci

public void writeFibonacci(long n)
                    throws IOException
Writes the Fibonacci code for the specified positive number. Roughly speaking, the Fibonacci code specifies a number as a sum of non-consecutive Fibonacci numbers, terminating a representation with two consecutive 1 bits.

Fibonacci numbers are defined by setting

 Fib(0) = 0
 Fib(1) = 1
 Fib(n+2) = Fib(n+1) + Fib(n)
 
The first few Fibonacci numbers are:
0, 1, 1, 2, 3, 5, 8, 13, 21, ...
This method starts with the second 1 value, namely Fib(2), making the sequence a sequence of unique numbers starting with 1, 2, 3, 5,....

The Fibonacci representation of a number is a bit vector indicating the Fibonacci numbers used in the sum. The Fibonacci code reverses the Fibonacci representation and appends a 1 bit. Here are examples for the first 17 numbers:

Number Fibonacci Representation Fibonacci Code
1 1 11
2 10 01 1
3 100 001 1
4 101 101 1
5 1000 0001 1
6 1001 1001 1
7 1010 0101 1
8 10000 00001 1
9 10001 10001 1
10 10010 01001 1
11 10100 00101 1
12 10101 10101 1
13 100000 000001 1
14 100001 100001 1
15 100010 010001 1
16 100100 001001 1
17 100101 101001 1
For example, the number 11 is coded as the sum of the non-consecutive Fibonacci numbers 8 + 3, so the Fibonacci representation is 10100 (8 is the fifth number in the series above, 3 is the third). Its Fibonacci code reverses the number to 00101 and appends a 1 to yield 001011.

Fibonacci codes can represent arbitrary positive numbers up to Long.MAX_VALUE.

See Math.FIBONACCI_SEQUENCE for a definition of the Fibonacci sequence as an array of longs.

In the limit (for larger numbers), the number of bits used by a Fibonacci coding is roughly 60 percent higher than the number of bits used for a binary code. The benefit is that Fibonacci codes are prefix codes, whereas binary codes are not.

Parameters:
n - Number to encode.
Throws:
IllegalArgumentException - If the number is not positive.
IOException - If there is an I/O exception writing to the underlying stream.

writeGamma

public void writeGamma(long n)
                throws IOException
Writes the bits for the Elias gamma code for the specified positive number. The gamma code of the number n is based on its binary representation b[k-1],...,b[0]:
gammaCode(b[k-1],...,b[0]) = unaryCode(k),b[k-1],...,b[0]
In words, the position of the most significant binary digit is coded using a unary code, with the remaining digits making up the rest of the gamma code.

The Following table provides an illustration of the gamma coding of the first 17 positive integers. Each row displays the number being coded, its binary representation, and its gamma code. The gamma code is displayed as its unary coding of the number of digits in the binary representation followed by a space and then by the digits of the binary representation after the first one.

Number Binary Gamma code
111
21001 0
31101 1
4100001 00
5101001 01
6110001 10
7111001 11
810000001 000
910010001 001
1010100001 010
1110110001 011
1211000001 100
1311010001 101
1411100001 110
1511110001 111
161000000001 0000
171000100001 0001
For more information on gamma coding, see:

Parameters:
n - Number to code.
Throws:
IOException - If there is an I/O error writing to the underlying stream.
IllegalArgumentException - If the number to be encoded is zero or negative.

writeDelta

public void writeDelta(long n)
                throws IOException
Writes the bits for the Elias delta code for the specified positive number. The delta code of the number n is based on its binary representation b[k-1],...,b[0]:
deltaCode(b[k-1],...,b[0]) = gammaCode(k),b[k-1],...,b[0]
In words, the position of the most significant binary digit is coded using a gamma code, with the remaining digits making up the rest of the gamma code.

The following table illustrates the delta codes for some small numbers. Each row lists the number, its binary representation, and its delta code. The delta code is written as the initial gamma code of its most significant digit's position and the remaining bits in the binary representation. Note that the delta codes are longer for small numbers, but shorter for large numbers.

Number Binary Delta code
111
210010 0
311010 1
4100011 00
5101011 01
6110011 10
7111011 11
8100000100 000
9100100100 001
10101000100 010
11101100100 011
12110000100 100
13110100100 101
14111000100 110
15111100100 111
161000000101 0000
171000100101 0001
For more information on delta coding, see:

Parameters:
n - Number to code.
Throws:
IOException - If there is an I/O error writing to the underlying stream.
IllegalArgumentException - If the number to be encoded is zero or negative.

close

public void close()
           throws IOException
Closes underlying output stream and releases any resources associated with the stream. This method first flushes the output stream, which sets any remaining bits in the byte currently being written to 0.

The close method calls the OutputStream.close() method on the contained output stream.

Throws:
IOException - If there is an I/O exception writing the next byte or closing the underlying output stream.

flush

public void flush()
           throws IOException
Flushes writes to the underlying output stream. First, this method sets any bits remaining in the current byte to 0. It then calls OutputStream.flush() on the underlying output stream.

Throws:
IOException - If there is an exception writing to or flushing the underlying output stream.

writeBit

public void writeBit(boolean bit)
              throws IOException
Writes the specified bit. The boolean true is used for the bit 1 and false for 0.

Parameters:
bit - Value to write.
Throws:
IOException - If there is an exception writing to the underlying output stream.

writeTrue

public void writeTrue()
               throws IOException
Writes a single true (1) bit.

Throws:
IOException - If there is an exception writing to the underlying output stream.

writeFalse

public void writeFalse()
                throws IOException
Writes a single false (0) bit.

Throws:
IOException - If there is an exception writing to the underlying output stream.

leastSignificantBits

public static long leastSignificantBits(long n,
                                        int numBits)
Returns the specified number of the least significant bits of the specified long value as a long. For example, leastSignificantBits(13,2) = 3, because 13 is 1011 in binary and the two least significant digits are 11.

Parameters:
n - Value whose least significant bits are returned.
numBits - The number of bits to return.
Returns:
The least significant number of bits.
Throws:
IllegalArgumentException - If the number of bits is less than 1 or greater than 64.

sliceBits

public static long sliceBits(long n,
                             int leastSignificantBit,
                             int numBits)
Returns a slice of bits in the specified long value running from the specified least significant bit for the specified number of bits. The bits are indexed in increasing order of significance from 0 to 63. So for the binary 110, the bit indexed 0 is 0, the bit indexed 1 is 1 and the bit indexed 2 is 1. For example, sliceBits(57,2,3) = 6, because 57 is 111001 in binary and the three bits extending to the left from position 2 are 110, which is 2.

Parameters:
n - Value to be sliced.
leastSignificantBit - Index of least significant bit in the result.
numBits - Number of bits including least significant bit to return.
Throws:
IllegalArgumentException - If the number of bits is less than zero or greater than 64, or if the least significant bit index is less than 0 or greater than 63.

mostSignificantPowerOfTwo

public static int mostSignificantPowerOfTwo(long n)
Returns the index of the most significant bit filled for the specified long value. For example,
 mostSignificantPowerOfTwo(1) = 0
 mostSignificantPowerOfTwo(2) = 1
 mostSignificantPowerOfTwo(4) = 2
 mostSignificantPowerOfTwo(8) = 3
 

This result of this method may be defined in terms of the built-in method Long.numberOfLeadingZeros(long), added in Java 1.5, by:

 mostSignificantPowerOfTwo(n) = Math.max(0,63-Long.numberOfLeadingZeros(n))
 

Parameters:
n - The specified value.
Returns:
The most significant power of 2 of the specified value.