Judging Grammaticality with Tree Substitution Grammar Derivations

Matt Post
Johns Hopkins University

Abstract

In this paper, we show that local features computed from the derivations of tree substitution grammars (TSGs), such as the identity of particular fragments and counts of large and small fragments, are useful in binary grammatical classification tasks. Such features outperform n-gram features and various model scores by a wide margin. Although they fall short of the performance of the hand-crafted feature set of Charniak and Johnson (2005), developed for parse tree reranking, they do so with an order of magnitude fewer features. Furthermore, since the TSGs employed are learned in a Bayesian setting, the use of their derivations can be viewed as the automatic discovery of tree patterns useful for classification. On the BLLIP dataset, we achieve an accuracy of 89.9% in discriminating between grammatical text and samples from an n-gram language model.
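
As a rough illustration of the kind of derivation features the abstract describes, the following Python sketch turns a TSG derivation into a sparse feature vector. It is not the paper's implementation. The bracketed fragment notation, the function names `fragment_size` and `derivation_features`, the feature names, the example derivation, and the size threshold separating "small" from "large" fragments are all assumptions made here for illustration. Fragment identities are recorded as counts, which is one possible encoding.

```python
from collections import Counter


def fragment_size(fragment: str) -> int:
    """Size of a fragment, taken here (an assumption) to be the number of
    internal nodes, i.e. the number of opening brackets in the bracketed
    string. Frontier nonterminals appear as bare symbols and add nothing."""
    return fragment.count("(")


def derivation_features(fragments, large_threshold=2):
    """Map a derivation (a list of bracketed fragment strings) to features.

    large_threshold is a hypothetical cutoff: fragments with at least this
    many internal nodes are counted as large, all others as small.
    """
    features = Counter()
    for frag in fragments:
        # Identity feature: which fragment was used, and how often.
        features[f"frag={frag}"] += 1
        # Aggregate features: counts of large and small fragments.
        if fragment_size(frag) >= large_threshold:
            features["num_large"] += 1
        else:
            features["num_small"] += 1
    return features


# A made-up derivation, listed as the fragments it is built from.
derivation = [
    "(S NP (VP (VBD said) SBAR))",
    "(NP (DT the) NN)",
    "(NN dog)",
    "(SBAR S)",
]
feats = derivation_features(derivation)
```

A feature dictionary of this shape could then be handed to any standard binary classifier to separate grammatical sentences from ungrammatical ones.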

Full paper: http://www.aclweb.org/anthology/P/P11/P11-2038.pdf