TreeTagger: a part-of-speech tagger

The TreeTagger is a tool for annotating text with part-of-speech and lemma information. It was developed within the TC project at the Institute for Computational Linguistics of the University of Stuttgart. The TreeTagger has been successfully used to tag German, English, French, Italian, Spanish, Greek, and Old French texts, and it is easily adaptable to other languages if a lexicon and a manually tagged training corpus are available.

Sample output:

word          pos     lemma
The           DT
TreeTagger    NP
is            VBZ     be
easy          JJ
to            TO
use           VB
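Output in this shape is easy to post-process. The following is a minimal sketch, not part of the TreeTagger itself. It assumes the tagger output has been captured as text with one token per line and tab-separated columns (word, part-of-speech tag, and an optional lemma). The names `TaggedToken`, `parse_tagged_lines`, and `SAMPLE` are hypothetical and exist only for this illustration.

```python
from typing import List, NamedTuple, Optional


class TaggedToken(NamedTuple):
    """One line of tagger output (hypothetical helper type)."""
    word: str
    pos: str
    lemma: Optional[str]  # None when the line carries no lemma column


def parse_tagged_lines(text: str) -> List[TaggedToken]:
    """Parse tab-separated 'word<TAB>pos[<TAB>lemma]' lines into tokens.

    Assumption: columns are separated by tabs and blank lines can be skipped.
    """
    tokens = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines between entries
        fields = [field.strip() for field in line.split("\t")]
        if len(fields) < 2:
            raise ValueError(f"line {line_no}: expected at least word and tag")
        word, pos = fields[0], fields[1]
        # The lemma column is optional; an empty cell counts as missing.
        lemma = fields[2] if len(fields) > 2 and fields[2] else None
        tokens.append(TaggedToken(word, pos, lemma))
    return tokens


# Sample data mirroring the output shown above (lemma given only for "is").
SAMPLE = "The\tDT\nTreeTagger\tNP\nis\tVBZ\tbe\neasy\tJJ\nto\tTO\nuse\tVB\n"

if __name__ == "__main__":
    for token in parse_tagged_lines(SAMPLE):
        print(token.word, token.pos, token.lemma or "-")
```

Keeping the lemma optional lets the same function handle output with or without a lemma column.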

