Invention Grant
US08838433B2 Selection of domain-adapted translation subcorpora (In force)

Selection of domain-adapted translation subcorpora
Abstract:
An architecture is discussed that provides the capability to subselect the most relevant data from an out-domain corpus, to be used either in isolation or in combination with in-domain data. The architecture is a domain adaptation technique for machine translation that selects the most relevant sentences from a larger general-domain corpus of parallel translated sentences. The methods for selecting the data include the monolingual cross-entropy measure, monolingual cross-entropy difference, bilingual cross-entropy, and bilingual cross-entropy difference. Translation models are trained on both the in-domain data and an out-domain subset, and the models can be interpolated together to boost performance on in-domain translation tasks.
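The abstract names the selection criteria without giving formulas. As a rough illustration only, the monolingual cross-entropy difference criterion could be sketched as below. This is not the patented implementation: it assumes the common formulation in which each general-domain sentence s is scored by H_in(s) - H_gen(s) and the lowest-scoring sentences are kept, and it substitutes toy add-one-smoothed unigram language models for real n-gram models. All function names and the sample sentences are hypothetical.

```python
import math
from collections import Counter


def train_unigram(sentences):
    """Count whitespace tokens; returns (counts, total_token_count)."""
    counts = Counter(tok for s in sentences for tok in s.split())
    return counts, sum(counts.values())


def cross_entropy(sentence, model, vocab_size):
    """Per-token cross-entropy in bits under an add-one-smoothed unigram model."""
    counts, total = model
    tokens = sentence.split()
    if not tokens:
        return 0.0
    log_prob = sum(
        math.log2((counts[tok] + 1) / (total + vocab_size)) for tok in tokens
    )
    return -log_prob / len(tokens)


def select_by_cross_entropy_difference(in_domain, general, keep):
    """Rank general-domain sentences by H_in(s) - H_gen(s); keep the lowest."""
    vocab = {tok for s in in_domain + general for tok in s.split()}
    v = len(vocab) + 1  # assumption: one extra slot reserved for unseen tokens
    lm_in = train_unigram(in_domain)
    lm_gen = train_unigram(general)
    ranked = sorted(
        general,
        key=lambda s: cross_entropy(s, lm_in, v) - cross_entropy(s, lm_gen, v),
    )
    return ranked[:keep]


# Hypothetical toy data: a medical in-domain set and a mixed general corpus.
in_domain = ["the patient received a dose", "the dose was reduced"]
general = [
    "the patient was given a reduced dose",
    "stock prices fell sharply today",
    "the match ended in a draw",
]
selected = select_by_cross_entropy_difference(in_domain, general, keep=1)
print(selected)
```

Under the same assumptions, a bilingual variant would compute this score separately on the source and target sides of each sentence pair and sum the two before ranking.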