Effective Use of Function Words for Rule Generalization in Forest-Based Translation

Xianchao Wu, Takuya Matsuzaki, Jun'ichi Tsujii
Computer Science, The University of Tokyo


Abstract

In this paper, we propose an effective use of function words to generate generalized translation rules for forest-based translation. Given aligned forest-string pairs, we extract composed tree-to-string translation rules that account for multiple interpretations of both aligned and unaligned target function words. To constrain the otherwise exhaustive attachment of function words, we bind them only to nearby syntactic chunks yielded by a target dependency parser. As a result, the proposed approach not only captures source-tree-to-target-chunk correspondences but also uses forest structures, which compactly encode an exponential number of parse trees, to properly generate target function words during decoding. Extensive experiments on large-scale English-to-Japanese translation showed a significant improvement of 1.8 BLEU points over a strong forest-to-string baseline system.

Full paper: http://www.aclweb.org/anthology/P/P11/P11-1003.pdf