
Application of the Simplex Algorithm to Re-ranking in Statistical Machine Translation (单纯形算法在统计机器翻译Re-ranking中的应用). Cited by: 2

Re-ranking for Statistical Machine Translation Using Simplex Algorithm
Abstract: In recent years, discriminative re-ranking has been applied to many areas of natural language processing (NLP), including parsing, part-of-speech tagging, and machine translation, and has improved results under each task's own evaluation metric. Taking statistical machine translation (SMT) as an example, this paper explains in detail the principle and procedure of re-ranking translation candidates with the Simplex Algorithm, the implementation and use of the algorithm, and the feature selection method used in the re-ranking experiments. It reports results on Chinese-to-English translation with NIST-2002 as the development set and NIST-2005 as the test set: re-ranking raises BLEU by 1.26% absolute on the development set and 1.16% absolute on the test set.
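The re-ranking idea described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not say which simplex variant is used, so the code assumes the derivative-free downhill simplex (Nelder-Mead) method and implements a simplified version of it. The names `score`, `rerank`, `loss`, and `nelder_mead`, the two-feature toy n-best lists, and the per-candidate quality values (a stand-in for a sentence-level BLEU-like score) are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical data layout: each candidate is (feature vector, quality score),
# where the quality score stands in for a sentence-level BLEU-like metric.
Candidate = Tuple[List[float], float]
NBest = List[Candidate]


def score(weights: Sequence[float], features: Sequence[float]) -> float:
    """Linear model score: weighted sum of the candidate's features."""
    return sum(w * f for w, f in zip(weights, features))


def rerank(weights: Sequence[float], nbest: NBest) -> int:
    """Return the index of the highest-scoring candidate (first one on ties)."""
    return max(range(len(nbest)), key=lambda i: score(weights, nbest[i][0]))


def loss(weights: Sequence[float], dev: List[NBest]) -> float:
    """Negative mean quality of the top-ranked candidates on the dev set."""
    return -sum(nb[rerank(weights, nb)][1] for nb in dev) / len(dev)


def nelder_mead(f: Callable[[List[float]], float], x0: Sequence[float],
                step: float = 1.0, iters: int = 200) -> List[float]:
    """Simplified downhill simplex search (reflect, expand, contract, shrink)."""
    n = len(x0)
    # Initial simplex: x0 plus one point offset by `step` along each axis.
    simplex = [list(x0)] + [
        [x0[j] + (step if j == i else 0.0) for j in range(n)] for i in range(n)
    ]
    for _ in range(iters):
        simplex.sort(key=f)
        best, worst = simplex[0], simplex[-1]
        centroid = [sum(p[j] for p in simplex[:-1]) / n for j in range(n)]

        def along(t: float) -> List[float]:
            # Point centroid + t * (centroid - worst).
            return [c + t * (c - w) for c, w in zip(centroid, worst)]

        reflected = along(1.0)
        if f(reflected) < f(best):
            expanded = along(2.0)
            simplex[-1] = expanded if f(expanded) < f(reflected) else reflected
        elif f(reflected) < f(simplex[-2]):
            simplex[-1] = reflected
        else:
            contracted = along(-0.5)  # inside contraction only (simplification)
            if f(contracted) < f(worst):
                simplex[-1] = contracted
            else:
                # Shrink every other vertex halfway toward the best vertex.
                simplex = [best] + [
                    [(b + p) / 2 for b, p in zip(best, pt)] for pt in simplex[1:]
                ]
    return min(simplex, key=f)


# Toy development set: two source sentences, two candidates each.
dev = [
    [([1.0, 0.0], 0.2), ([0.0, 1.0], 0.6)],
    [([0.9, 0.1], 0.1), ([0.2, 0.8], 0.5)],
]
w0 = [1.0, 0.0]  # initial weights prefer the lower-quality candidates
tuned = nelder_mead(lambda w: loss(w, dev), w0, step=2.0)
print("loss before:", loss(w0, dev), "loss after:", loss(tuned, dev))
print("chosen candidates:", [rerank(tuned, nb) for nb in dev])
```

Because the objective depends only on which candidate ranks first, it is piecewise constant in the weights, which is one reason a derivative-free search is a natural fit here. In this sketch the initial step size matters: if every vertex of the starting simplex selects the same candidates, the search sits on a plateau and only shrinks.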
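Authors: Fu Lei (付雷), Liu Qun (刘群)
Source: Journal of Chinese Information Processing (《中文信息学报》), indexed in CSCD and the Peking University Core Journals list, 2007, No. 3, pp. 28-33 (6 pages)
Funding: National Natural Science Foundation of China (grant 60573188)
Keywords: artificial intelligence; machine translation; discriminative re-ranking; simplex algorithm; statistical machine translation (SMT)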

References (14)

  • 1. Ashish Venugopal and Stephan Vogel. Considerations in Maximum Mutual Information and Minimum Classification Error Training for Statistical Machine Translation [A]. In: EAMT 2005 Conference Proceedings [C].
  • 2. B. Chen, R. Cattoni, N. Bertoldi, M. Cettolo, M. Federico. The ITC-irst SMT System for IWSLT-2005 [A].
  • 3. Franz Josef Och. Minimum Error Rate Training in Statistical Machine Translation [A]. In: Proc. of ACL 2003 [C].
  • 4. Franz Josef Och and Hermann Ney. Discriminative Training and Maximum Entropy Models for Statistical Machine Translation [A]. In: Proceedings of the 40th Annual Meeting of the ACL [C]. Philadelphia, July 2002, pp. 295-302.
  • 5. I. Dan Melamed. A Word-to-Word Model of Translational Equivalence [A]. In: Proc. of the 35th Conference of the Association for Computational Linguistics (ACL'97) [C]. Madrid, 1997, pp. 490-497.
  • 6. Libin Shen and A. K. Joshi. An SVM Based Voting Algorithm with Application to Parse Reranking [A]. In: Proc. of CoNLL 2003 [C].
  • 7. Libin Shen, Anoop Sarkar, Franz Josef Och. Discriminative Reranking for Machine Translation [A]. In: Proc. of HLT-NAACL 2004 [C].
  • 8. M. Cettolo, M. Federico, N. Bertoldi, R. Cattoni and B. Chen. A Look inside the ITC-irst SMT System [A]. In: Proceedings of the 10th MT Summit [C]. Phuket, Thailand, 2005.
  • 9. M. Collins and N. Duffy. New Ranking Algorithms for Parsing and Tagging: Kernels over Discrete Structures, and the Voted Perceptron [A]. In: Proceedings of ACL 2002 [C].
  • 10. P. F. Brown, S. A. Della Pietra, V. J. Della Pietra, R. L. Mercer. The Mathematics of Statistical Machine Translation [J]. Computational Linguistics, 1993, 19(2).


