
A high efficient word alignment approach based on heuristic rule and lexicon
(Original title: 一种高效的基于启发式规则和词典相结合的双语词对齐方法)
Cited by: 2
Abstract: Bilingual word alignment is the task of identifying word-level translation correspondences within a pair of mutually translated sentences. It is a very useful yet fairly difficult research topic in natural language processing (NLP). Building on an experimental analysis of the current mainstream word alignment methods and taking various factors into account, this paper proposes a method that combines heuristic statistical rules with a dictionary. The method makes full use of existing resources and also takes subsequent applications into account. Experiments show that it achieves good alignment results even when the training corpus is small.
Source: Journal of Shenyang Institute of Aeronautical Engineering (《沈阳航空工业学院学报》), 2010, No. 5, pp. 73-77 (5 pages)
Keywords: natural language processing (NLP); bilingual word alignment; anchor; heuristic rule; high efficiency
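
The abstract does not spell out the heuristic rules or the dictionary lookup, so the following is only a minimal illustrative sketch of the general idea of combining a lexicon with a heuristic, not the authors' method. It assumes a toy lexicon that maps each source word to a set of translations, treats lexicon matches as anchors, and then applies one hypothetical rule: an unaligned source word is linked to a target word only if exactly one free target word lies between its neighbouring anchors. The function name `align`, the lexicon format, and the rule itself are all assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple

Alignment = List[Tuple[int, int]]  # (source index, target index) pairs


def align(src: List[str], tgt: List[str],
          lexicon: Dict[str, Set[str]]) -> Alignment:
    """Toy aligner: lexicon anchors first, then an anchor-window heuristic.

    Illustrative only; the rule below is an assumption, not the paper's rule.
    """
    links: Alignment = []
    used_tgt: Set[int] = set()

    # Step 1: lexicon matches become anchor links.
    for i, s in enumerate(src):
        translations = lexicon.get(s, set())
        candidates = [j for j, t in enumerate(tgt)
                      if j not in used_tgt and t in translations]
        if candidates:
            # Prefer the candidate closest in relative sentence position.
            best = min(candidates,
                       key=lambda j: abs(i / len(src) - j / len(tgt)))
            links.append((i, best))
            used_tgt.add(best)

    # Step 2 (hypothetical heuristic): link an unaligned source word when
    # exactly one free target word lies between its neighbouring anchors.
    anchored = dict(links)  # source index -> target index, anchors only
    for i in range(len(src)):
        if i in anchored:
            continue
        left = max((anchored[k] for k in anchored if k < i), default=-1)
        right = min((anchored[k] for k in anchored if k > i),
                    default=len(tgt))
        window = [j for j in range(left + 1, right) if j not in used_tgt]
        if len(window) == 1:
            links.append((i, window[0]))
            used_tgt.add(window[0])

    return sorted(links)


if __name__ == "__main__":
    source = ["我", "喜欢", "红", "苹果"]
    target = ["I", "like", "red", "apples"]
    toy_lexicon = {"我": {"I"}, "喜欢": {"like", "love"},
                   "苹果": {"apple", "apples"}}
    print(align(source, target, toy_lexicon))
```

In this toy run, "红" has no lexicon entry, but it sits between two anchors with a single free target word ("red") in the window, so the heuristic links them. With an empty lexicon there are no anchors and the sketch returns no links, which shows why a rule of this kind depends on the dictionary step.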

