
Research Progress on Big Data Usability (大数据可用性的研究进展), cited by 66

State-of-the-Art of Research on Big Data Usability
Abstract: The rapid development of information technology has ushered in the era of big data. Big data has become an important asset of the information society, providing unprecedentedly rich information with which people can perceive, understand, and control the physical world more deeply. However, as data volumes grow, dirty data accumulates as well. Dirty data lowers the quality of big data, greatly reduces its usability, and seriously troubles the information society. In recent years, data usability problems have drawn attention from both academia and industry; in-depth studies have been conducted, and a series of research results has been obtained. This paper introduces the basic concepts of data usability, discusses its challenges and research problems, reviews the research results in the area, and explores future research directions for big data usability.
Source: Journal of Software (《软件学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2016, No. 7, pp. 1605-1625 (21 pages)
Funding: National Basic Research Program of China (973 Program) (2012CB316200); National Natural Science Foundation of China (U1509216, 61472099)
Keywords: big data; data usability; data quality; data cleaning; data management

