Software and Tools
The list below is grouped into a few broad categories. Some items belong to more than one category; each is listed under the heading it fits best.
Text Processing Tools:
- ABNER: Biomedical Named Entity Recogniser
- ANNIE: A Nearly-New IE system, bundled with GATE
- BALIE: Baseline Information Extraction - Java toolkit
- CALAIS Web Service
- CLAIR: Computational Linguistics and Information Retrieval - Perl toolkit
- FreeLing: C++ toolkit
- Carabao Language Kit: A multipurpose text-to-knowledge suite with extensive customization
- IBM LanguageWare: UIMA-compliant annotators, Eclipse plug-ins or Web Services
- LingPipe: Java toolkit
- LT-XML2
- MALLET: Java toolkit
- MinorThird: Java toolkit
- NLTK: Natural Language Toolkit - Python toolkit
- OpenNLP: Java toolkit(s)
- RASP: Robust Accurate Statistical Parsing
- Stanford NLP Tools: Java toolkit
Text Processing Frameworks:
- GATE: General Architecture for Text Engineering - Java toolkit
- OpenPipeline: Open Source Java software for crawling, parsing, analyzing and routing documents.
- UIMA: Unstructured Information Management Architecture - Java framework
Machine Learning Tools:
- BOW: C toolkit
- RapidMiner
- GATE Machine Learning layer
- Weka
Search Tools:
- Apache Lucene (also see Lucene Wiki)
- Swish-e
- WebGlimpse
- Namazu
- ht://Dig
- Harvest
- Lemur Project: search with language models
page revision: 15, last edited: 21 Jan 2009 17:45