Part-of-Speech Tagging for Twitter: Annotation, Features, and Experiments

Kevin Gimpel, Nathan Schneider, Brendan O'Connor, Dipanjan Das, Daniel Mills, Jacob Eisenstein, Michael Heilman, Dani Yogatama, Jeffrey Flanigan, Noah A. Smith
Carnegie Mellon University


Abstract

We address the problem of part-of-speech tagging for English data from the popular microblogging service Twitter. We develop a tagset, annotate data, develop features, and report tagging results nearing 90% accuracy. The data and tools have been made available to the research community with the goal of enabling richer text analysis of Twitter and related social media data sets.

Full paper: http://www.aclweb.org/anthology/P/P11/P11-2008.pdf