Abstract
This paper proposes a data model based on keyword descriptions that can represent both structured and unstructured data. It also proposes a partial-match-based indexing method for heterogeneous data. The main idea is to precompute certain queries and store their results, and both index construction and querying follow the partial-match principle. During construction, pruning and a sorting strategy based on keyword counts substantially shorten the build time. During querying, the method relies mainly on keyword counts and a stratified (layered) retrieval scheme, which substantially reduces users' retrieval time. Experimental results show that the index handles the heterogeneous data indexing problem well and performs well.
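The abstract gives no algorithmic details, so the following is only a minimal sketch of the general idea it names: items described by keyword sets, precomputed keyword-combination queries stored by level (keyword count), and a partial-match lookup that walks the levels from the highest count down. The names `build_index`, `search`, `max_level`, and `limit`, and the sample items, are hypothetical and not taken from the paper; the paper's own pruning and keyword-count sorting strategies are not reproduced here.

```python
from collections import defaultdict
from itertools import combinations


def build_index(items, max_level=2):
    """Precompute keyword-combination queries and store their results.

    items: dict mapping an item id to its set of descriptive keywords
           (hypothetical stand-in for a keyword-based data model).
    Returns {level: {frozenset(keywords): set(item_ids)}}, where level is
    the number of keywords in the stored query.
    """
    index = {level: defaultdict(set) for level in range(1, max_level + 1)}
    for item_id, keywords in items.items():
        for level in range(1, max_level + 1):
            # Only combinations that occur in some item are ever stored,
            # so queries with empty results take no space.
            for combo in combinations(sorted(keywords), level):
                index[level][frozenset(combo)].add(item_id)
    return index


def search(index, query_keywords, limit=10):
    """Partial-match lookup, visiting levels from most to fewest keywords.

    Returns (item_id, level) pairs; level means the item matched at least
    that many query keywords (capped at the highest stored level).
    """
    query = sorted(set(query_keywords))
    results, seen = [], set()
    top = min(len(query), max(index))
    for level in range(top, 0, -1):
        hits = set()
        for subset in combinations(query, level):
            hits |= index[level].get(frozenset(subset), set())
        for item_id in sorted(hits - seen):
            results.append((item_id, level))
        seen |= hits
        if len(results) >= limit:
            break  # enough results from the higher levels; skip lower ones
    return results[:limit]


# Hypothetical mix of structured and unstructured items.
items = {
    "doc1": {"engine", "blade", "report"},
    "row7": {"engine", "blade"},
    "img3": {"engine", "photo"},
}
index = build_index(items, max_level=2)
print(search(index, ["blade", "report", "photo"]))
```

In this sketch an item that shares two query keywords is returned before items that share only one, and the lower levels are consulted only if the higher levels did not yield enough results. Storing every keyword combination grows quickly with `max_level`, which is presumably why a real implementation would precompute only selected queries, as the abstract indicates.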
Authors
梁英飞
童海红
刘巍
LIANG Yingfei; TONG Haihong; LIU Wei (Information Archive Center, AECC Harbin Dongan Engine Corporation Ltd., Harbin 150066, China)
Source
《沈阳航空航天大学学报》
2018, Issue 2, pp. 60-66 (7 pages)
Journal of Shenyang Aerospace University
Keywords
heterogeneous data
partial match
data model
stratified index