期刊文献+

利用MapReduce的异常轨迹检测并行算法 被引量:7

A Parallel Algorithm for Detecting Trajectory Outliers Based on Map Reduce
原文传递
导出
摘要 异常轨迹检测是移动对象数据挖掘的一个重要研究领域。TRAOD(TRAjectory Outlier Dectection Algorithm)算法是一种经典的异常轨迹检测算法,但它对于海量轨迹数据的异常检测效率低。为提高海量轨迹数据集的异常检测效率,本文提出了一种利用Map Reduce的异常轨迹检测并行算法(Parallel algorithm for TRAjectory Outlier Detection,PTRAOD),并在此基础上提出了网格索引的异常轨迹检测并行算法(Grid-based Parallel algorithm for TRAjectory Outlier Dectection,GPTRAOD)。GPTRAOD算法在PTRAOD算法的基础上,利用网格索引实现区域查询,进一步提高算法效率。将PTRAOD算法和GPTRAOD算法在Hadoop平台上加以实现,结果表明:本文提出的2个并行检测算法,能实现异常轨迹的检测;GPTRAOD算法的效率优于PTRAOD算法;GPTRAOD算法具有较高的可扩展性和较好的加速比。 Trajectory outlier detection is significantly important in the field of data mining for moving object. TRAOD (TRAjectory Outlier Dectection Algorithm), a classic algorithm for detecting trajectory outliers, focuses on a new two-level trajectory partitioning strategy to enhance the efficiency of algorithm. The main advantage of TRAOD algorithm is the ability to detect outlying sub-trajectories. However, it has a low efficiency on abnormal- ity detection for massive trajectory data. In order to improve the efficiency for mining trajectory outliers from massive datasets, a parallel algorithm for detecting trajectory outliers based on MapReduce framework, which is called PTRAOD (Parallel algorithm for TRAjectory Outlier Detection), is presented. It redesigns the TRAOD al- gorithm based on the MapReduce framework, and encapsulates the steps of TRAOD into its Map and Reduce functions. PTRAOD algorithm takes full advantages of the features from Hadoop platform. It firstly distributes the trajectory data into distributed computing nodes. While distributing the data, it also takes the load-balance in- to consideration. And after all, each node runs the same algorithms to detect abnormal trajectories. Based on PTRAOD algorithm, a grid-based parallel algorithm for detecting trajectory outliers, called GPTRAOD (Grid- based Parallel algorithm for TRAjectory Outlier Detection), is then proposed. GPTRAOD algorithm makes use of the grid index to realize regional query and reduce unnecessary calculations. At first, GPTRAOD algorithm di- vides the map into a series of equal-sized grids, whose size is determined with respect to each specific data. Then, the grid index is established to implement the regional query. Finally, the algorithm finds out the abnormal trajectory segments and judges whether the trajectories that contains the abnormal trajectory segments are abnor- mal. In general, GPTRAOD algorithm takes advantages of the gird index to realize regional query on the basis of PTRAOD algorithm, which furthermore can search abnormal trajectory on the cloud computing platform. To as- sess the performances of the proposed algorithms, extensive experiments were conducted. The experimental re- suits demonstrate that the proposed two parallel detection algorithms can both successfully achieve the trajectory outlier detection. The efficiency of PTRAOD algorithm is higher than TRAOD algorithm, while GPTRAOD al- gorithm has the higher scalability and better speedup ratio than PTRAOD algorithm. In addition, with the rapidly expanding of datasets, GPTRAOD algorithm shows obvious advantages and increasing potentials.
出处 《地球信息科学学报》 CSCD 北大核心 2015年第5期523-530,共8页 Journal of Geo-information Science
基金 国家自然科学基金项目"云计算环境下顾及用户关系的手机用户时空轨迹模式挖掘方法研究"(41471371)
关键词 异常轨迹检测 网格索引 并行数据挖掘 MAPREDUCE trajectory outlier detection grid index parallel data mining MapReduce
  • 相关文献

参考文献22

  • 1Liao L, Patterson D J, Fox D, et al. Learning and infer- ring transportation routines[J]. Artificial Intelligence, 2007,171(5):311-331.
  • 2Yuan J, Zheng Y, Zhang C, et al. T-drive: Driving direc- tions based on taxi trajectories[C]. In: Proceedings of the 18th SIGSPATIAL International Conference on Advances in Geographic Information Systems. New York: ACM, 2010:99-108.
  • 3Giannotti F, Nanni M, Pinelli F, et al. Trajectory pattern mining[C]. In: Proceedings of the 13th ACM SIGKDD In- ternational Conference on Knowledge Discovery and Da- ta Mining. New York: ACM, 2007:330-339.
  • 4Yu Y, Cao L, Rundensteiner E A, et al. Detecting moving object outliers in massive-scale trajectory streams[C]. In: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2014:422-431.
  • 5Li X, Li Z, Han J, et al. Temporal outlier detection in ve- hicle traffic data[C]. In: Proceedings of the 25th Interna- tional Conference on Data Engineering. Washington: IEEE Computer Society, 2009:1319-1322.
  • 6Ge Y, Xiong H, Liu C, et al. A taxi driving fraud detection system[C]. 2011 IEEE 11 International Conference on Data Mining (ICDM), 2011:181-190.
  • 7Barnett V, Lewis T. Outliers in statistical data[M]. New York: Wiley, 1994.
  • 8Knorr E M, Ng R T. Finding intentional knowledge of dis- tance-based outliers[C]. In: Proceedings of the 25th Inter- national Conference on Very Large Data Bases. Edin- burgh, Scotland: Springer-Verlag, 1999,99:211-222.
  • 9Breunig M M, Kriegel H P, Ng R T, et al. LOF: Identify- ing density-based local outliers[C]. In: Proceedings of 2000 ACM SIGMOD International Conference on Man- agement of Data. New York: ACM, 2000,29(2):93-104.
  • 10Aggarwal C C, Yu P S. Outlier detection for high dimen- sional data[C]. In: Proceedings of 2001 ACM SIGMOD International Conference on Management of Data. Cali- fornia: ACM, 2001,30(2):37-46.

二级参考文献10

  • 1Knorr EM, Ng RT, Tucakov V. Distance-Based outliers: Algorithms and applications. VLDB Journal, 2000,8(3):237-253.
  • 2Ramaswamy S, Rastogi R, Shim K. Efficient algorithms for mining outliers from large data sets. In: Chen WD, Jeffrey FN, Philip AB, eds. Proc. of the SIGMOD 2000. New York: ACM, 2000. 427-438.
  • 3Breunig MM, Kriegel HP, Ng RT, Sander J. LOF: Identifying density-based local outliers. In: Chen WD, Jeffrey FN, Philip AB, eds. Proc. of the SIGMOD 2000. New York: ACM, 2000.93-104.
  • 4Papadimitriou S, Kitagawa H, Gibbons PB, Faloutsos C. LOCI: Fast outlier detection using the local correlation integral. In: Dayal U, Ramamritham K, Vijayaraman TM, eds. Proc. of the ICDE 2003. New York: IEEE Computer Society, 2003.315-326.
  • 5Aggarwal CC, Yu PS. Outlier detection for high dimensional data. In: Aref WG, ed. Proc. of the SIGMOD 2001. New York: ACM, 2001.37-46.
  • 6Lee J, Han J, Li X. Trajectory outlier detection: A partition-and-detect framework. In: Proe. of the ICDE 2008. New York: IEEE Computer Society, 2008. 140-149.
  • 7Chen J, Maylor K. Leung, Gao Y. Noisy logo recognition using line segment hausdorff distance. Pattern Recognition, 2003,36(4): 943-955.
  • 8Huttenlocher DP, Klanderman GA, Rucklidge WA. Comparing images using the hausdorff distance. IEEE Trans. on Pattern Analysis and Machine Intelligence, 1993,15(9):850-863.
  • 9Huttenlocher DP, Kcdem K, Sharir M. The upper envelope of voronoi surfaces and its applications. Discrete and Computational Geometry, 1993,9(1):267-291,.
  • 10Beckmann N, Kriegel HP, Schneider R, Seeger B. The R*-tree: An efficient and robust access method for points and rectangles. In: Hector GM, ed. Proc. of the SIGMOD'90. New York: ACM, 1990. 322-331.

共引文献14

同被引文献42

引证文献7

二级引证文献29

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部