Abstract
To address the high communication cost and load imbalance of the default round-robin scheduling strategy in the Storm stream computing platform, a Task Scheduling Strategy based on Topology Structure (TS^2) is proposed. First, worker nodes with sufficient available Central Processing Unit (CPU) resources are selected and exactly one process is allocated to each of them, which eliminates inter-process communication cost within a node and optimizes process deployment. Then, the topology structure is analyzed to find the component with the largest degree, and the threads of that component are assigned first. Finally, subject to the maximum number of threads that a node can carry, associated tasks are deployed to the same node as far as possible, which reduces inter-node communication cost, improves cluster load balance, and optimizes thread deployment. Experimental results show that, in terms of system latency, TS^2 achieves average optimization rates of 16.91% and 5.69% compared with the Storm default scheduling strategy and the offline scheduling strategy, respectively, effectively improving the real-time performance of the system. Compared with the Storm default scheduling strategy, TS^2 also reduces inter-node communication cost by 15.75% on average and improves average throughput by 14.21% on average.
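The three steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation and does not use the Storm API: the function names `component_order` and `schedule`, the example topology, and the breadth-first visiting order are all assumptions made for illustration. Each node is modeled as holding a single worker process, the topology is assumed to be connected, and the per-node thread limit `capacity` is assumed to be an even share of the total thread count.

```python
from collections import defaultdict, deque
from math import ceil


def component_order(edges):
    """Order components breadth-first, starting from the one with the largest degree."""
    neighbours = defaultdict(set)
    for src, dst in edges:
        neighbours[src].add(dst)
        neighbours[dst].add(src)
    # Degree = number of distinct neighbours; ties are broken by name.
    start = min(neighbours, key=lambda c: (-len(neighbours[c]), c))
    order, seen, queue = [], {start}, deque([start])
    while queue:
        comp = queue.popleft()
        order.append(comp)
        for nxt in sorted(neighbours[comp]):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return order


def schedule(edges, parallelism, nodes, capacity):
    """Fill one node at a time so that neighbouring components tend to share a node."""
    placement = {node: [] for node in nodes}
    node_iter = iter(nodes)
    current = next(node_iter)
    for comp in component_order(edges):
        for i in range(parallelism[comp]):
            if len(placement[current]) >= capacity:
                current = next(node_iter, None)
                if current is None:
                    raise ValueError("not enough node capacity for all threads")
            placement[current].append(f"{comp}-{i}")
    return placement


# Hypothetical topology: component names and thread counts are made up.
edges = [("spout", "split"), ("split", "count"),
         ("split", "filter"), ("count", "report")]
parallelism = {"spout": 2, "split": 3, "count": 2, "filter": 1, "report": 1}
nodes = ["n1", "n2", "n3"]
capacity = ceil(sum(parallelism.values()) / len(nodes))  # even share per node

placement = schedule(edges, parallelism, nodes, capacity)
for node, threads in placement.items():
    print(node, threads)
```

Capping each node at an even share of the threads is one simple way to combine co-location with load balance; the paper's actual criterion for the maximum number of threads per node may differ.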
Authors
LIU Su
YU Jiong
LU Liang
LI Ziyang
College of Information Science and Engineering, Xinjiang University, Urumqi, Xinjiang 830046, China
Source
Journal of Computer Applications (《计算机应用》)
CSCD
Peking University Core Journal
2018, No. 12, pp. 3481-3489 (9 pages)
Funding
National Natural Science Foundation of China (61462079, 61562078, 61562086)
National Key Technology R&D Program of China (2015BAH02F01)
Keywords
Storm
stream computing
task scheduling
topology structure
communication cost