What is LingPipe?

LingPipe is a toolkit for processing text using computational linguistics. LingPipe is used to do tasks like:

To get a better idea of the range of possible LingPipe uses, visit our tutorials and sandbox.


LingPipe's architecture is designed to be efficient, scalable, reusable, and robust. Highlights include:

Latest Release: LingPipe 4.1.0

Intermediate Release

The latest release of LingPipe is LingPipe 4.1.0, a feature release that also patches some bugs. It is fully backward compatible with LingPipe version 4.0.1.

Character, Token, and Document Suffix Arrays

The largest addition in LingPipe 4.1 is suffix arrays. The package com.aliasi.suffixarray contains classes for suffix arrays of characters, of tokens, or of tokenized documents, with links from the suffix array back to the documents. Suffix arrays support finding arbitrary-length repeated strings in a large text collection.
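
To illustrate why suffix arrays make repeated strings easy to find, here is a minimal, self-contained sketch of a character suffix array. It does not use the com.aliasi.suffixarray classes; the class and method names (SuffixArraySketch, suffixArray, longestRepeatedSubstring) are made up for this example, and the naive sort is only suitable for short inputs.

```java
import java.util.Arrays;

public class SuffixArraySketch {

    // Returns the start positions of all suffixes of text, ordered by suffix.
    // Naive comparison sort: fine for illustration, too slow for large corpora.
    static Integer[] suffixArray(final String text) {
        Integer[] sa = new Integer[text.length()];
        for (int i = 0; i < sa.length; ++i) {
            sa[i] = i;
        }
        Arrays.sort(sa, (a, b) -> text.substring(a).compareTo(text.substring(b)));
        return sa;
    }

    // Length of the common prefix of the suffixes starting at i and j.
    static int commonPrefixLength(String text, int i, int j) {
        int n = 0;
        while (i + n < text.length() && j + n < text.length()
               && text.charAt(i + n) == text.charAt(j + n)) {
            ++n;
        }
        return n;
    }

    // Repeated strings show up as shared prefixes of adjacent sorted suffixes,
    // so the longest repeat is the longest such shared prefix.
    static String longestRepeatedSubstring(String text) {
        Integer[] sa = suffixArray(text);
        int bestStart = 0;
        int bestLength = 0;
        for (int k = 1; k < sa.length; ++k) {
            int length = commonPrefixLength(text, sa[k - 1], sa[k]);
            if (length > bestLength) {
                bestLength = length;
                bestStart = sa[k];
            }
        }
        return text.substring(bestStart, bestStart + bestLength);
    }

    public static void main(String[] args) {
        System.out.println(longestRepeatedSubstring("abracadabra"));
    }
}
```

Because the suffixes are sorted, every occurrence of a repeated string sits in one contiguous run of the array, which is what makes the scan over adjacent entries sufficient.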

Serialization for Language Models

We also added serializability to a number of the language model implementations, which helps them play nicely with our classifiers, taggers, etc.

TF/IDF Classifier Access Methods

We added methods to TF/IDF classifiers to access the raw IDF values for terms and raw IDF values for term/document pairs.

Line Tagging Parser

The line tagging parser was updated to handle more general end-of-line markers across platforms.
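
As a rough illustration of platform-independent line splitting, the sketch below breaks text on Windows (\r\n), classic Macintosh (\r), and Unix (\n) line endings. It is not the line tagging parser's actual implementation; the class name LineSplitSketch and the method splitLines are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class LineSplitSketch {

    // The two-character Windows ending is listed first so that "\r\n"
    // is consumed as one line break rather than two.
    static final Pattern END_OF_LINE = Pattern.compile("\\r\\n|\\r|\\n");

    // A limit of -1 keeps trailing empty lines instead of discarding them.
    static List<String> splitLines(String text) {
        return Arrays.asList(END_OF_LINE.split(text, -1));
    }

    public static void main(String[] args) {
        for (String line : splitLines("first\r\nsecond\rthird\nfourth")) {
            System.out.println("[" + line + "]");
        }
    }
}
```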

Single-Link Clustering Bug

We fixed a bug in single-link clustering that caused elements farther away than the distance bound from all other elements to disappear.
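
The expected behavior can be sketched with a small, self-contained example: under a distance bound, two elements share a cluster when a chain of pairs within the bound connects them, and an element beyond the bound from everything else should come back as a singleton cluster rather than vanish. This is an illustrative union-find version over one-dimensional points, not LingPipe's clusterer; the names SingleLinkSketch, cluster, and maxDistance are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SingleLinkSketch {

    // Finds the representative of i's cluster, shortening paths as it goes.
    static int find(int[] parent, int i) {
        while (parent[i] != i) {
            parent[i] = parent[parent[i]];
            i = parent[i];
        }
        return i;
    }

    // Links every pair of points within maxDistance, then groups by representative.
    static List<List<Double>> cluster(double[] points, double maxDistance) {
        int[] parent = new int[points.length];
        for (int i = 0; i < parent.length; ++i) {
            parent[i] = i;
        }
        for (int i = 0; i < points.length; ++i) {
            for (int j = i + 1; j < points.length; ++j) {
                if (Math.abs(points[i] - points[j]) <= maxDistance) {
                    parent[find(parent, i)] = find(parent, j);
                }
            }
        }
        // Every point is added to some cluster, so isolated points
        // survive as singletons.
        Map<Integer, List<Double>> byRoot = new LinkedHashMap<>();
        for (int i = 0; i < points.length; ++i) {
            byRoot.computeIfAbsent(find(parent, i), k -> new ArrayList<>()).add(points[i]);
        }
        return new ArrayList<>(byRoot.values());
    }

    public static void main(String[] args) {
        System.out.println(cluster(new double[] {1.0, 1.5, 2.0, 10.0}, 0.6));
    }
}
```

A quick sanity check for any bounded clusterer is that the cluster sizes add up to the number of input elements.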

Tests Fork

If you run our top-level API tests through Ant, you'll find they're much slower, about four times slower. This isn't because LingPipe is slower, but because we rewrote the test call to fork a new process for each test. This allows the tests to succeed out of the box with under 1MB memory on the Macintosh OS X platform with Apple's Java.

Migration from LingPipe 3 to LingPipe 4

LingPipe 4.1.0 is not backward compatible with LingPipe 3.9.3.

However, programs that compile in LingPipe 3.9.3 without deprecation warnings should compile and run in LingPipe 4.1.0.

Downloading Last 3.9 Version: LingPipe 3.9.3

The last 3.9 version of LingPipe before the major refactoring is available at: