
基于投影寻踪的中文网页分类算法 (Cited by: 11)

Chinese Web-page Classification Based on Projection Pursuit
Abstract: With the rapid growth of information on the World Wide Web, users increasingly need automatic Web-page classifiers. To improve classification accuracy, this paper proposes a new Chinese Web-page classification algorithm based on projection pursuit (PP). A genetic algorithm first searches for the best projection direction. Web pages, already represented as n-dimensional vectors, are then projected onto a one-dimensional space. Finally, the classical KNN (k-nearest neighbor) algorithm classifies the projected pages. By reducing the data to one dimension before classification, the method is intended to overcome the curse of dimensionality. Experimental results show that the proposed algorithm is feasible and effective.
Source: Journal of Chinese Information Processing (《中文信息学报》), indexed in CSCD and the Peking University Core Journals list, 2005, No. 4, pp. 60-67 (8 pages)
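Funding: Key Science and Technology Project of the Ministry of Education (03070); Natural Science Foundation of Jiangxi Province (0311041); Youth Growth Fund of Jiangxi Normal University (1090)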
Keywords: computer application; Chinese information processing; projection pursuit; Web-page classification; genetic algorithm; KNN algorithm
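
The three steps named in the abstract (a genetic algorithm searches for a projection direction, the n-dimensional page vectors are projected to one dimension, and KNN classifies the projected values) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the projection index, the GA operators, or the value of k, so the Fisher-style index (between-class over within-class scatter), the tournament/blend/mutation scheme, the parameter defaults, and the toy data with the class names "sports" and "finance" are all hypothetical.

```python
import math
import random
from collections import Counter


def normalize(v):
    """Scale v to unit length so it encodes a direction only."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def project(direction, vec):
    """Project an n-dimensional vector onto a direction (dot product)."""
    return sum(d * x for d, x in zip(direction, vec))


def projection_index(direction, X, y):
    """ASSUMED index: between-class over within-class scatter in 1-D."""
    z = [project(direction, x) for x in X]
    mean_all = sum(z) / len(z)
    between = within = 0.0
    for label in set(y):
        zc = [zi for zi, yi in zip(z, y) if yi == label]
        mean_c = sum(zc) / len(zc)
        between += len(zc) * (mean_c - mean_all) ** 2
        within += sum((zi - mean_c) ** 2 for zi in zc)
    return between / (within + 1e-12)  # small constant avoids division by zero


def ga_best_direction(X, y, pop_size=30, generations=40, sigma=0.2, seed=0):
    """Toy real-coded GA (assumed operators): elitism, tournament
    selection, blend crossover, and Gaussian mutation."""
    rng = random.Random(seed)
    n = len(X[0])

    def fitness(d):
        return projection_index(d, X, y)

    pop = [normalize([rng.gauss(0, 1) for _ in range(n)]) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        next_pop = ranked[:2]  # elitism: carry over the two best directions
        while len(next_pop) < pop_size:
            p1 = max(rng.sample(ranked, 3), key=fitness)  # tournament of size 3
            p2 = max(rng.sample(ranked, 3), key=fitness)
            a = rng.random()
            child = [a * u + (1 - a) * v + rng.gauss(0, sigma)
                     for u, v in zip(p1, p2)]
            next_pop.append(normalize(child))
        pop = next_pop
    return max(pop, key=fitness)


def knn_predict_1d(train_z, train_y, z, k=3):
    """Majority vote among the k training points nearest to z on the line."""
    nearest = sorted(zip(train_z, train_y), key=lambda p: abs(p[0] - z))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Toy data: 4-dimensional stand-ins for term-weight vectors of two classes.
data_rng = random.Random(1)
X = ([[data_rng.gauss(1.0, 0.3), data_rng.gauss(0.0, 0.3),
       data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(20)] +
     [[data_rng.gauss(0.0, 0.3), data_rng.gauss(1.0, 0.3),
       data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(20)])
y = ["sports"] * 20 + ["finance"] * 20

direction = ga_best_direction(X, y)
train_z = [project(direction, x) for x in X]
prediction = knn_predict_1d(train_z, y, project(direction, [1.0, 0.0, 0.0, 0.0]))
print(prediction)
```

Because every candidate direction is normalized, the GA searches over orientations only. Once the direction is fixed, the nearest-neighbor search runs on scalars, which is where the dimensionality reduction described in the abstract pays off.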

