
Internet information processing (面向互联网的信息处理) Cited by: 7
Abstract With the rapid growth of massive information on the internet, information processing has evolved from a single, text-centered task into multi-modal processing of text, speech, images, and other media. Meanwhile, users' needs have shifted from information acquisition through keyword search toward intelligent interaction based on semantic understanding, such as automatic question answering and decision support. Starting from the characteristics of information processing for internet services and the changes in user needs, this paper describes the challenges and development trends of internet information processing. It first introduces the characteristics and basic methods of massive internet information processing, then discusses the challenges and trends in processing text, speech, image, and location information. Finally, it describes how to model internet users along three dimensions (attributes, status, and interests) to support personalized services and business analysis.
Source Scientia Sinica Informationis (《中国科学:信息科学》), CSCD, 2013, No. 12, pp. 1624-1640 (17 pages)
Funding Supported by the National High Technology Research and Development Program of China (Grant No. 2011AA01A207)
Keywords internet information processing, semantic understanding, intelligent interaction, user modeling

References (37)

  • 1. Li G J. The scientific value of big data research. Communications of the CCF, 2012, 8: 9-15 (in Chinese).
  • 2. Friedman J. Stochastic gradient boosting. Comput Stat Data Anal, 38: 367-378.
  • 3. Li Z C, Yu H J, Liu Y Q, Ma S P. A survey of web spam and anti-spam techniques. Journal of Shandong University (Natural Science), 2011, 46(5): 1-8 (in Chinese). Cited by: 9
  • 4. Etzioni O, Cafarella M, Downey D, et al. Web-scale information extraction in KnowItAll. In: Proceedings of the 13th International Conference on World Wide Web, New York, 2004. 100-110.
  • 5. Banko M, Cafarella M, Soderland S, et al. Open information extraction from the web. In: Proceedings of International Joint Conference on Artificial Intelligence, Hyderabad, 2007. 2670-2676.
  • 6. Carlson A, Betteridge J, Kisiel B, et al. Toward an architecture for never-ending language learning. In: Proceedings of the AAAI Conference on Artificial Intelligence, Atlanta, 2010. 1306-1313.
  • 7. McCord M C, Murdock J W, Boguraev B K. Deep parsing in Watson. IBM J Res Dev, 2012, 56: 1-15.
  • 8. Mendes P N, Jakob M, Bizer C. DBpedia for NLP: a multilingual cross-domain knowledge base. In: Proceedings of the International Conference on Language Resources and Evaluation, Istanbul, 2012. 1813-1817.
  • 9. Hinton G, Osindero S, Teh Y. A fast learning algorithm for deep belief nets. Neural Comput, 2006, 18: 1527-1554.
  • 10. Bengio Y, Lamblin P, Popovici D, et al. Greedy layer-wise training of deep networks. In: Advances in Neural Information Processing Systems 19, Vancouver, 2006. 153-160.

Secondary references (29)

  • 1. YU H J, LIU Y Q, ZHANG M, RU L Y, MA S P. Search engine user behavior analysis based on large-scale log analysis [J]. Journal of Chinese Information Processing, 2007, 21(1): 109-114 (in Chinese). Cited by: 118
  • 2. China Internet Network Information Center (CNNIC). The 26th statistical report on internet development in China [R]. 2010-07: 3-15 (in Chinese).
  • 3. SILVERSTEIN C, HENZINGER M, MARAIS H, et al. Analysis of a very large web search engine query log [J]. ACM SIGIR Forum, 1999, 33(1): 6-12.
  • 4. GYONGYI Z, GARCIA-MOLINA H. Web spam taxonomy [C]// AIRWeb'05. Chiba, Japan: [s. n.], 2005: 1-9.
  • 5. GKANOGIANNIS A, KALAMBOUKIS T. A novel supervised learning algorithm and its use for spam detection in social bookmarking systems [C]// European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases. [S. l.]: [s. n.], 2008: 1-8.
  • 6. MARKINES B, CATTUTO C, MENCZER F. Social spam detection [C]// AIRWeb'09. New York: ACM Press, 2009: 41-48.
  • 7. HENZINGER M, MOTWANI R, SILVERSTEIN C. Challenges in web search engines [J]. ACM SIGIR Forum, 2002, 36(2): 11-22.
  • 8. SAHAMI M, MITTAL V, BALUJA S, et al. The happy searcher: challenges in web information retrieval [C]// Proceedings of 8th Pacific Rim International Conference on Artificial Intelligence. Berlin, Heidelberg: Springer-Verlag, 2004, 3157: 3-12.
  • 9. FETTERLY D, MANASSE M, NAJORK M. Spam, damn spam, and statistics [C]// Proceedings of the 7th International Workshop on the Web and Databases. New York: ACM Press, 2004: 1-6.
  • 10. BAEZA-YATES R, RIBEIRO-NETO B. Modern information retrieval [M]. London: Addison-Wesley, 1999.

Co-citing documents (8)

Co-cited documents (34)

Citing documents (7)

Secondary citing documents (19)
