Abstract
Automatic recognition of Chinese organization names is a difficult problem in natural language processing. This paper presents a new algorithm for Chinese organization name recognition based on cascaded conditional random fields (CRFs). In the proposed algorithm, simple named entities such as person names and location names are first recognized by the lower-layer CRF model. The results are then passed to the higher-layer model, where they support the decisions of the organization-name CRF model in recognizing complex organization names. An effective feature template and an automatic feature selection algorithm are designed for the organization-name CRF model. In an open test on a large-scale real-world corpus, the algorithm achieves a recall of 90.05% and a precision of 88.12%, outperforming other Chinese organization name recognition algorithms.
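The cascade described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the tag names (`PER`, `LOC`, `O`), the functions `upper_layer_features` and `sentence_features`, and the feature names are all hypothetical, and the paper's actual feature templates are not given in the abstract. The sketch only shows how labels predicted by a lower-layer model could be exposed as features to a higher-layer organization-name model.

```python
from typing import Dict, List


def upper_layer_features(tokens: List[str], lower_tags: List[str], i: int) -> Dict[str, str]:
    """Build a feature dict for position i of the higher-layer model.

    lower_tags is the assumed output of the lower-layer model:
    "PER" for person names, "LOC" for location names, "O" otherwise.
    """
    if len(tokens) != len(lower_tags):
        raise ValueError("tokens and lower_tags must have the same length")

    feats = {"word": tokens[i], "lower_tag": lower_tags[i]}

    # One-token context window; sentence boundaries get explicit markers.
    if i > 0:
        feats["prev_word"] = tokens[i - 1]
        feats["prev_lower_tag"] = lower_tags[i - 1]
    else:
        feats["prev_word"] = "<BOS>"
        feats["prev_lower_tag"] = "<BOS>"

    if i < len(tokens) - 1:
        feats["next_word"] = tokens[i + 1]
        feats["next_lower_tag"] = lower_tags[i + 1]
    else:
        feats["next_word"] = "<EOS>"
        feats["next_lower_tag"] = "<EOS>"

    return feats


def sentence_features(tokens: List[str], lower_tags: List[str]) -> List[Dict[str, str]]:
    """Feature dicts for every position in one segmented sentence."""
    return [upper_layer_features(tokens, lower_tags, i) for i in range(len(tokens))]


# Illustrative input: a segmented sentence with assumed lower-layer tags.
tokens = ["南京", "大学", "位于", "江苏"]
lower_tags = ["LOC", "O", "O", "LOC"]
features = sentence_features(tokens, lower_tags)
```

In this example the token 大学 ("university") sees a `LOC` tag on the preceding token 南京 ("Nanjing"), which is the kind of evidence a higher-layer model could use to label the pair as an organization name. The resulting feature dictionaries could be passed to any CRF trainer that accepts per-token feature mappings.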
Source
Acta Electronica Sinica (《电子学报》)
EI
CAS
CSCD
PKU Core (Peking University Core Journals)
2006, No. 5, pp. 804-809 (6 pages)
Funding
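National 863 High-Tech Research and Development Program of China (No. 2004AA117010-05)
Fund of the Jiangsu Provincial Department of Education (No. 03KJD520117)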
Keywords
named entity
Chinese organization name recognition
conditional random fields