Word Alignment Combination over Multiple Word Segmentation

Ning Xi, Guangchao Tang, Boyuan Li, Yinggong Zhao
State Key Laboratory for Novel Software Technology of Nanjing University


Abstract

In this paper, we present a new word alignment combination approach for language pairs in which one language has no explicit word boundaries. Instead of combining word alignments from different models (Xiang et al., 2010), we combine word alignments over multiple monolingually motivated word segmentations. Our approach is based on a link confidence score defined over multiple segmentations, so the combined alignment is more robust to inappropriate word segmentation. Our combination algorithm is simple, efficient, and easy to implement. In the Chinese-English experiment, our approach effectively improved word alignment quality as well as translation performance on all segmentations simultaneously, which shows that word alignment can benefit from the complementary knowledge provided by the diversity of multiple monolingually motivated segmentations.




Full paper: http://www.aclweb.org/anthology/P/P11/.pdf