Multilingual BERT with Tag and Align Attention Net (META Net)
Missing named entities and named entity recognition errors are among the top issues in multilingual generation scenarios such as machine translation, accounting for 5.46% and 4.38% of errors, respectively (Hassan et al. 2019). Bilingual constraints (the category and the number of detected entities must be equal for a bilingual pair) can be defined as SoftTag and SoftAlign (Che et al. 2013). We propose the Multilingual BERT with Tag and Align Attention Net (META Net), which adds SoftTag and SoftAlign attention layers to learn these bilingual constraints. Experiments are conducted on English-Chinese and English-German with the OntoNotes 5.0, CoNLL2003, and WMT18 aligned corpora. All NER results are better than the SOTA monolingual model; in particular, Person entity F1 in Chinese is significantly improved, by 3.3%. We also draw the following findings from case studies. (1) Strong features, such as capitalized person names in English and locations with common suffixes in Chinese, can be learned through bilingual NER transfer. (2) Sentence-level pairs are derived from document-level or paragraph-level pairs, which often causes ambiguity and omission and ultimately introduces noise into the bilingual text.
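A minimal PyTorch-style sketch of the idea described above, under stated assumptions: the abstract only says that SoftTag and SoftAlign attention layers are added on top of multilingual BERT to learn the bilingual constraints, so the exact layer forms, the class names (SoftTagAttention, SoftAlignAttention, METANetSketch), and the hyperparameters (hidden_size, num_tags, number of heads) are hypothetical and chosen for illustration only.

```python
# Hypothetical sketch of META Net's added layers; not the paper's reference code.
import torch
import torch.nn as nn


class SoftTagAttention(nn.Module):
    """Attend from token states to a shared tag-embedding table (assumed form of SoftTag)."""
    def __init__(self, hidden_size: int, num_tags: int):
        super().__init__()
        self.tag_embeddings = nn.Parameter(torch.randn(num_tags, hidden_size))

    def forward(self, token_states: torch.Tensor) -> torch.Tensor:
        # token_states: (batch, seq_len, hidden)
        scores = token_states @ self.tag_embeddings.t()   # (batch, seq_len, num_tags)
        weights = scores.softmax(dim=-1)                   # soft tag distribution per token
        return weights @ self.tag_embeddings               # tag-aware token states


class SoftAlignAttention(nn.Module):
    """Cross-lingual attention from source tokens to target tokens (assumed form of SoftAlign)."""
    def __init__(self, hidden_size: int, num_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(hidden_size, num_heads, batch_first=True)

    def forward(self, src: torch.Tensor, tgt: torch.Tensor) -> torch.Tensor:
        aligned, _ = self.attn(query=src, key=tgt, value=tgt)
        return aligned


class METANetSketch(nn.Module):
    """SoftTag/SoftAlign heads plus a per-token tag classifier over multilingual BERT outputs."""
    def __init__(self, hidden_size: int = 768, num_tags: int = 9):
        super().__init__()
        self.soft_tag = SoftTagAttention(hidden_size, num_tags)
        self.soft_align = SoftAlignAttention(hidden_size)
        self.classifier = nn.Linear(hidden_size * 3, num_tags)

    def forward(self, src_states: torch.Tensor, tgt_states: torch.Tensor) -> torch.Tensor:
        # src_states / tgt_states: multilingual BERT encoder outputs for the bilingual pair,
        # each of shape (batch, seq_len, hidden).
        tag_ctx = self.soft_tag(src_states)
        align_ctx = self.soft_align(src_states, tgt_states)
        features = torch.cat([src_states, tag_ctx, align_ctx], dim=-1)
        return self.classifier(features)                   # per-token NER tag logits
```

The intent of the sketch is only to show where the two attention layers sit relative to the encoder and the NER classifier; how the bilingual equality constraints are enforced during training (e.g., via the loss) is not specified in the abstract and is not modeled here.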