Abstract
To address the low correction accuracy of existing methods for correcting grammatical mistranslations in English-Chinese translation, this paper proposes a correction method based on K-means clustering. The collected English-Chinese translation grammar data are preprocessed, and the TF-IDF algorithm is used to extract grammatical features from the preprocessed data, forming a grammatical feature sample set. K-means clustering then identifies the mistranslation features within this sample set, and these features are fed as input parameters into the constructed mistranslation correction model, which corrects the grammatical mistranslations. The results show that the method detects the mistranslation features in the grammatical feature sample set: the number of mistranslation features detected is almost identical to the actual number of mistranslation categories in the corresponding data set, indicating strong overall detection performance. Through grammatical mistranslation correction, the method separates mistranslated grammar from correct grammar, with an overall correction accuracy above 98%.
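The feature-extraction and clustering steps named in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's preprocessing, feature definitions, IDF variant, or correction model, so everything below is an assumption made for illustration: the function names `tfidf_vectors` and `kmeans`, the smoothed IDF formula, whitespace tokenisation, and the toy grammar-tag strings are all hypothetical. The sketch only groups samples; deciding which cluster holds mistranslation features, and correcting them, is outside its scope.

```python
import math
import random
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocab, vectors) of L2-normalised TF-IDF weights.

    Assumption: whitespace tokenisation and a smoothed IDF,
    idf(t) = ln((1 + N) / (1 + df(t))) + 1. The paper may use another variant.
    """
    tokenised = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenised for tok in toks})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each token.
    df = Counter(tok for toks in tokenised for tok in set(toks))
    idf = {tok: math.log((1 + n_docs) / (1 + df[tok])) + 1.0 for tok in vocab}
    vectors = []
    for toks in tokenised:
        counts = Counter(toks)
        total = len(toks) or 1  # guard against empty documents
        vec = [counts[tok] / total * idf[tok] for tok in vocab]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        vectors.append([v / norm for v in vec])
    return vocab, vectors


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


if __name__ == "__main__":
    # Hypothetical grammar-tag strings standing in for preprocessed samples.
    samples = [
        "subject verb object tense_match",
        "subject verb object tense_match article_ok",
        "verb subject object tense_mismatch",
        "verb subject tense_mismatch article_missing",
    ]
    vocab, vectors = tfidf_vectors(samples)
    labels, _ = kmeans(vectors, k=2)
    for sample, label in zip(samples, labels):
        print(label, sample)
```

In a setting like the one the abstract describes, the cluster labels would then be compared against known mistranslation categories, and the samples in the mistranslation clusters passed on to a separate correction step.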
Authors
WU Nanhui (吴南辉)
SHEN Yansong (沈炎松)
School of International Cooperation & Exchange, Zhangzhou Institute of Technology, Zhangzhou, Fujian 363000, China; School of Electronic Information, Zhangzhou Institute of Technology, Zhangzhou, Fujian 363000, China
Source
Journal of Zhangzhou Institute of Technology (《漳州职业技术学院学报》)
2022, No. 2, pp. 67-75 (9 pages)
Funding
Education and Scientific Research Project for Young and Middle-aged Teachers of Fujian Province (JZ180805).
Keywords
K-means clustering
English-Chinese translation
grammatical mistranslation
TF-IDF algorithm
feature extraction
mistranslation correction model