Paper

Peer-reviewed
January 2016

A Constraint Approach to Pivot-Based Bilingual Dictionary Induction

ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING
  • Mairidan Wushouer
  • Donghui Lin
  • Toru Ishida
  • Katsutoshi Hirayama

Volume
15
Issue
1
Language
English
Publication type
Research paper (scientific journal)
DOI
10.1145/2723144
Publisher
ASSOC COMPUTING MACHINERY

High-quality bilingual dictionaries are very useful, but such resources are rarely available for lower-density language pairs, especially closely related ones. A well-known solution is to use a third (pivot) language to link the two languages: given only two input bilingual dictionaries, A-B and B-C, a new dictionary A-C is induced automatically. This approach, however, has never been shown to exploit the complete structures of the input bilingual dictionaries, which is a key failing because the dropped meanings degrade the result. This article proposes a constraint approach to pivot-based dictionary induction for the case where languages A and C are closely related. We create constraints from language similarity and model the structures of the input dictionaries as a Boolean optimization problem, which is then formulated within the Weighted Partial Max-SAT framework, an extension of Boolean Satisfiability (SAT). The formulas are encoded in CNF (Conjunctive Normal Form), the predominant input language of modern SAT/MAX-SAT solvers, and evaluated by a solver to produce the target (output) bilingual dictionary. We also discuss alternative formalizations for comparison. We designed a tool that implements our method with the Sat4j library as the default solver, and conducted an experiment in which the output bilingual dictionary achieved better quality than that of the baseline method.
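
As a rough illustration of the idea in the abstract, the sketch below composes two toy dictionaries through a pivot language and then selects translation pairs by solving a tiny Weighted Partial Max-SAT instance. It is not the article's encoding: the one-to-one hard constraint, the use of shared pivot counts as soft-clause weights, the toy words (`a1`, `b1`, `c1`, ...) and all function names are assumptions made for this example, and an exhaustive search stands in for a real solver such as Sat4j.

```python
from itertools import product


def pivot_candidates(ab, bc):
    """Compose A-B and B-C: (a, c) is a candidate if a and c share a pivot word b."""
    shared = {}
    for a, pivots in ab.items():
        for b in pivots:
            for c in bc.get(b, ()):
                shared.setdefault((a, c), set()).add(b)
    return shared


def solve_weighted_partial_maxsat(n_vars, hard, soft):
    """Brute-force solver, only practical for very small instances.

    Clauses are lists of non-zero ints in DIMACS style: k means variable k
    is true, -k means it is false. Every hard clause must hold; the total
    weight of the satisfied soft clauses (weight, clause) is maximised.
    """
    def satisfied(clause, assignment):
        return any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)

    best_weight, best_assignment = -1, None
    for assignment in product([False, True], repeat=n_vars):
        if not all(satisfied(c, assignment) for c in hard):
            continue
        weight = sum(w for w, c in soft if satisfied(c, assignment))
        if weight > best_weight:
            best_weight, best_assignment = weight, assignment
    return best_weight, best_assignment


def induce(ab, bc):
    """Pick A-C pairs under an assumed one-to-one constraint."""
    shared = pivot_candidates(ab, bc)
    pairs = sorted(shared)
    var = {pair: i + 1 for i, pair in enumerate(pairs)}

    # Hard clauses (assumption): two selected pairs may not share a word,
    # so each word in A maps to at most one word in C and vice versa.
    hard = []
    for i, p in enumerate(pairs):
        for q in pairs[i + 1:]:
            if p[0] == q[0] or p[1] == q[1]:
                hard.append([-var[p], -var[q]])

    # Soft clauses (assumption): prefer pairs linked by more pivot words.
    soft = [(len(shared[p]), [var[p]]) for p in pairs]

    _, assignment = solve_weighted_partial_maxsat(len(pairs), hard, soft)
    return [p for p in pairs if assignment[var[p] - 1]]


# Toy input dictionaries (hypothetical words).
ab = {"a1": ["b1", "b2"], "a2": ["b2"]}
bc = {"b1": ["c1"], "b2": ["c1", "c2"]}
print(induce(ab, bc))
```

On this toy input the pivot step alone proposes all four A-C pairs; the constraints keep only `a1`-`c1` (two shared pivots) and `a2`-`c2`. A real instance would be far too large for exhaustive search and would be handed to a Max-SAT solver in CNF form.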

Links
DOI
https://doi.org/10.1145/2723144
Web of Science
https://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcAuth=JSTA_CEL&SrcApp=J_Gate_JST&DestLinkType=FullRecord&KeyUT=WOS:000373912300004&DestApp=WOS_CPL
Identifiers
  • DOI : 10.1145/2723144
  • ISSN : 2375-4699
  • eISSN : 2375-4702
  • Web of Science ID : WOS:000373912300004
