Automatic Recognition of Chinese Organization Names Based on Cascaded Conditional Random Fields (基于层叠条件随机场模型的中文机构名自动识别) Cited by: 113
Abstract Automatic recognition of Chinese organization names is a relatively difficult problem in natural language processing. This paper proposes a new algorithm for Chinese organization name recognition based on cascaded conditional random fields. In the proposed algorithm, simple named entities such as person names and location names are first recognized by a low-level conditional random field model. The results are then passed to the high-level model, where they support the decisions of the organization-name conditional random field model in recognizing complex organization names. An effective feature template and an automatic feature selection algorithm are designed for the organization-name model. In an open test on a large-scale real-world corpus, recall reaches 90.05% and precision reaches 88.12%, outperforming other Chinese organization name recognition algorithms.
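The cascade described in the abstract can be illustrated with a minimal sketch. It only shows the data flow: low-level labels are computed first and then exposed as features to the high-level organization-name model. Everything named here is a hypothetical stand-in and not taken from the paper: `LOW_LEVEL_LEXICON`, `ORG_SUFFIXES`, the feature names, and the toy decision rule replace the trained conditional random fields and the paper's actual feature template.

```python
from typing import Dict, List

# Hypothetical stand-in for the trained low-level CRF (person/location names).
LOW_LEVEL_LEXICON: Dict[str, str] = {"南京": "LOC", "张三": "PER"}
# Hypothetical list of words that often end an organization name.
ORG_SUFFIXES = {"大学", "公司"}


def low_level_tagger(tokens: List[str]) -> List[str]:
    """Label simple entities; a real system would decode with a CRF here."""
    return [LOW_LEVEL_LEXICON.get(tok, "O") for tok in tokens]


def high_level_features(tokens: List[str], low_tags: List[str], i: int) -> dict:
    """Features for token i, including the labels passed up from the low level."""
    return {
        "word": tokens[i],
        "low_tag": low_tags[i],
        "prev_low_tag": low_tags[i - 1] if i > 0 else "BOS",
        "is_org_suffix": tokens[i] in ORG_SUFFIXES,
    }


def high_level_tagger(tokens: List[str], low_tags: List[str]) -> List[str]:
    """Toy rule standing in for high-level CRF decoding:
    a location followed directly by an organization suffix forms an ORG span."""
    labels = ["O"] * len(tokens)
    for i in range(len(tokens)):
        feats = high_level_features(tokens, low_tags, i)
        if feats["is_org_suffix"] and feats["prev_low_tag"] == "LOC":
            labels[i - 1] = "B-ORG"
            labels[i] = "I-ORG"
    return labels


def cascade(tokens: List[str]) -> List[str]:
    """Run the low-level pass, then feed its output to the high-level pass."""
    low_tags = low_level_tagger(tokens)
    return high_level_tagger(tokens, low_tags)


if __name__ == "__main__":
    print(cascade(["张三", "在", "南京", "大学", "工作"]))
```

In a trained system, the dictionaries returned by `high_level_features` would be fed to a CRF toolkit instead of the hand-written rule, so the low-level labels act as evidence for the high-level model without determining its output on their own.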
Source Acta Electronica Sinica (《电子学报》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list (北大核心); 2006, No. 5, pp. 804-809 (6 pages)
Funding National 863 High-Tech Research and Development Program of China (No. 2004AA117010-05); Jiangsu Provincial Department of Education Fund (No. 03KJD520117)
Keywords named entity; Chinese organization name recognition; conditional random fields