What is LingPipe?
LingPipe is a suite of Java libraries for the linguistic analysis of human language.
Feature Overview
LingPipe's information extraction and data mining tools:
- track mentions of entities (e.g. people or proteins);
- link entity mentions to database entries;
- uncover relations between entities and actions;
- classify text passages by language, character encoding, genre, topic, or sentiment;
- correct spelling with respect to a text collection;
- cluster documents by implicit topic and discover significant trends over time; and
- provide part-of-speech tagging and phrase chunking.
Architecture
LingPipe's architecture is designed to be efficient, scalable, reusable, and robust. Highlights include:
- Java API with source code and unit tests;
- multi-lingual, multi-domain, multi-genre models;
- training with new data for new tasks;
- n-best output with statistical confidence estimates;
- online training (learn-a-little, tag-a-little);
- thread-safe models and decoders for concurrent-read exclusive-write (CREW) synchronization; and
- character encoding-sensitive I/O.
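The combination of online training and concurrent-read exclusive-write (CREW) synchronization listed above can be sketched in plain Java. This is only an illustration of the locking pattern, not LingPipe's own code: the class `CrewTagger`, its methods `train` and `tag`, and the `"UNKNOWN"` fallback are hypothetical names invented for this example, and the "model" is just a map from tokens to tags. It assumes the standard `java.util.concurrent.locks.ReentrantReadWriteLock`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical class for illustration only; not part of the LingPipe API.
class CrewTagger {
    // Toy "model": a lookup table from token to tag.
    private final Map<String, String> tagByToken = new HashMap<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    // Exclusive write ("learn a little"): waits until no reader or
    // other writer holds the lock, then updates the model.
    void train(String token, String tag) {
        lock.writeLock().lock();
        try {
            tagByToken.put(token, tag);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Concurrent read ("tag a little"): any number of threads may hold
    // the read lock at once, as long as no writer is active.
    String tag(String token) {
        lock.readLock().lock();
        try {
            String tag = tagByToken.get(token);
            return tag != null ? tag : "UNKNOWN";
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

With this pattern, many threads can call `tag` simultaneously, while a call to `train` briefly excludes all of them so that readers never observe a half-updated model.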
Latest Release: LingPipe 3.9.3
Minor Release
The latest release of LingPipe is LingPipe 3.9.3. This release replaces LingPipe 3.9.2, with which it is fully backward compatible.
Upgrade Path to LingPipe 4.0
LingPipe 3.9.3 is scheduled to be the final 3.x release. LingPipe 4.0 will remove all currently deprecated classes and methods. Tempted as we were to enforce some kind of naming consistency, we decided to follow Java itself and maintain backward compatibility instead.
The major refactoring for 3.9 and 3.9.1, combined with backward compatibility, has left a rather clunky API with lots of duplicate functionality.
Updated Library Jars
Updates to the latest version include:
- junit-4.8.2.jar
- servlet-api-6.0.26.jar