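Abstract: Distributed storage systems, as carriers of data storage, are widely used in big-data applications. Compared with replication, erasure-coded storage offers higher space efficiency while still guaranteeing data reliability, and is therefore increasingly adopted in storage systems. In EB-scale erasure-coded distributed storage systems, metadata management is costly, and the efficiency of querying metadata such as location information affects I/O latency and throughput. Centralized data placement algorithms that record location information must access the metadata server frequently, which limits performance optimization, so decentralized data placement algorithms based on Hash mapping are increasingly used. However, during node changes and data recovery, decentralized placement algorithms for erasure codes suffer from difficult location changes, large data migration volumes, and low concurrency between data recovery and migration. This paper proposes a consistent Hash data placement algorithm based on stripe (SCHash). SCHash places data in units of stripes: by converting the block-to-node mapping into a stripe-to-node-group mapping, it reduces the amount of data migrated when nodes change, which lowers the proportion of data that must move during recovery and raises recovery bandwidth. Based on SCHash, a stripe-based concurrent I/O scheduling recovery strategy is also designed. It increases I/O parallelism by avoiding the selection of data blocks on the same node for I/O operations, and shortens data recovery time by scheduling the execution order of recovery I/O and migration I/O. Compared with the APHash data placement algorithm, SCHash reduces data migration during recovery by 46.71% to 85.28%. Recovery bandwidth improves by 48.16% for reconstruction within the stripe and by 138.44% for reconstruction on nodes outside the stripe.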
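The idea of mapping stripes to node groups, rather than blocks to nodes, can be illustrated with a minimal sketch. This is not the SCHash algorithm itself, whose details the abstract does not give: the `StripeRing` class, the fixed disjoint node groups, the virtual-node count, and the RS(4,2)-style group size of 6 are all assumptions made for illustration. It only shows the general consistent-hashing property that adding a group relocates whole stripes, and only those that land on the new group.

```python
import bisect
import hashlib


def _h(key: str) -> int:
    """Stable 64-bit hash (hashlib, so results do not vary between runs)."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class StripeRing:
    """Hypothetical ring mapping stripe ids to node groups (not blocks to nodes)."""

    def __init__(self, vnodes: int = 64):
        self.vnodes = vnodes   # virtual points per group (assumed value)
        self._points = []      # sorted hash points on the ring
        self._owner = {}       # hash point -> group name
        self.groups = {}       # group name -> ordered list of node names

    def add_group(self, name, nodes):
        self.groups[name] = list(nodes)
        for v in range(self.vnodes):
            p = _h(f"{name}#{v}")
            self._owner[p] = name
            bisect.insort(self._points, p)

    def locate(self, stripe_id: str):
        """Return (group name, nodes); block i of the stripe goes to nodes[i]."""
        i = bisect.bisect_right(self._points, _h(stripe_id)) % len(self._points)
        name = self._owner[self._points[i]]
        return name, self.groups[name]


# Four groups of 6 nodes each (6 = k + m, e.g. 4 data + 2 parity blocks).
ring = StripeRing()
for g in range(4):
    ring.add_group(f"g{g}", [f"n{g}-{i}" for i in range(6)])

stripes = [f"stripe-{s}" for s in range(2000)]
before = {s: ring.locate(s)[0] for s in stripes}

# Add a fifth group and see which stripes change location.
ring.add_group("g4", [f"n4-{i}" for i in range(6)])
after = {s: ring.locate(s)[0] for s in stripes}
moved = [s for s in stripes if before[s] != after[s]]
print(f"{len(moved)} of {len(stripes)} stripes moved, all to the new group")
```

Because a stripe moves as a unit, all of its blocks stay on distinct nodes of one group after a change, which is the property that makes stripe-level placement attractive for erasure codes.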
Funding: Supported by the National Key Research and Development Program of China (No. 2017YFB0202105, 2016YFB0201305, 2016YFB0200803, 2016YFB0200300) and the National Natural Science Foundation of China (No. 61521092, 91430218, 31327901, 61472395, 61432018).
Abstract: Performance models provide insightful perspectives for predicting performance and proposing optimization guidance. Although there has been much research, pinpointing the bottlenecks of various memory access patterns and achieving highly accurate predictions for both regular and irregular programs on various hardware configurations remain nontrivial. This work proposes a novel model called process-RAM-feedback (PRF) to quantify the overhead of computation and data transmission time on general-purpose multi-core processors. The PRF model predicts the instruction cost on a single core with a directed acyclic graph (DAG), and the transmission time of memory accesses between levels of the memory hierarchy with a newly designed cache simulator. Using performance modeling and a feedback optimization method, this paper applies the PRF model to analyze and optimize convolution, sparse matrix-vector multiplication, and sn-sweep as case studies that range from a typical regular kernel to irregular and data-dependent ones. Through the PRF model, it obtains optimization guidance for various sparsity structures, algorithm designs, and instruction set support on different data sizes.
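One common way to estimate instruction cost from a dependence DAG is to take the longest latency-weighted path through it. The sketch below illustrates that general technique only; the abstract does not say how PRF builds or evaluates its DAG, and the function name, instruction names, and latency values here are hypothetical.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def critical_path_cycles(latency, deps):
    """Longest latency-weighted path through an instruction dependence DAG.

    latency: instruction name -> latency in cycles (assumed values)
    deps:    instruction name -> set of instructions it must wait for
    """
    graph = {instr: deps.get(instr, ()) for instr in latency}
    finish = {}
    # static_order() yields every instruction after all of its predecessors.
    for instr in TopologicalSorter(graph).static_order():
        start = max((finish[d] for d in graph[instr]), default=0)
        finish[instr] = start + latency[instr]
    return max(finish.values(), default=0)


# Hypothetical kernel: store(a * b + a), with made-up latencies.
latency = {"load_a": 4, "load_b": 4, "mul": 3, "add": 1, "store": 1}
deps = {
    "mul": {"load_a", "load_b"},
    "add": {"mul", "load_a"},
    "store": {"add"},
}
print(critical_path_cycles(latency, deps))  # 4 + 3 + 1 + 1 = 9
```

This lower bound ignores issue width and memory stalls, which is why a model of this kind would need a separate component, such as a cache simulator, for data transmission time.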
Funding: Supported by the National High Technology Research and Development Programme of China (No. 2007AA01Z115).
Abstract: Raising processor frequency to strengthen massive data processing capability increases the marginal cost of servers and brings a series of problems such as power consumption and management complexity. Based on the field programmable gate array (FPGA), TCP offload engine (TOE), zero-copy, and other key technologies, this paper describes the design and realization of a reconfigurable accelerator board. TCP/IP protocol processing is moved onto this high-speed reconfigurable accelerator board. Packets are labeled according to protocol and, after IP quintuple (5-tuple) filtering in hardware, submitted to the upper-layer data processing software. The reconfigurable accelerator board achieves a higher speed-up than an ordinary NIC.
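The 5-tuple filtering step can be modeled in software as follows. This is a sketch of the general technique, not of the board's hardware logic: the rule format with `None` wildcards, the labels, and the helper names are assumptions, and IP fragmentation and IPv6 are ignored.

```python
import socket
import struct
from typing import NamedTuple, Optional


class FiveTuple(NamedTuple):
    src_ip: str
    dst_ip: str
    proto: int      # IP protocol number: 6 = TCP, 17 = UDP
    src_port: int
    dst_port: int


def parse_five_tuple(packet: bytes) -> Optional[FiveTuple]:
    """Extract the 5-tuple from a raw IPv4 packet; None if not TCP/UDP over IPv4."""
    if len(packet) < 20 or packet[0] >> 4 != 4:
        return None
    ihl = (packet[0] & 0x0F) * 4   # IPv4 header length in bytes
    proto = packet[9]
    if proto not in (6, 17) or ihl < 20 or len(packet) < ihl + 4:
        return None
    src_ip = socket.inet_ntoa(packet[12:16])
    dst_ip = socket.inet_ntoa(packet[16:20])
    # TCP and UDP both start with source port, then destination port.
    src_port, dst_port = struct.unpack_from("!HH", packet, ihl)
    return FiveTuple(src_ip, dst_ip, proto, src_port, dst_port)


def classify(packet: bytes, rules):
    """Return the label of the first matching rule, or None to drop the packet.

    Each rule is (pattern, label); a None field in the pattern is a wildcard.
    """
    t = parse_five_tuple(packet)
    if t is None:
        return None
    for pattern, label in rules:
        if all(p is None or p == v for p, v in zip(pattern, t)):
            return label
    return None


def make_packet(src, dst, proto, sport, dport) -> bytes:
    """Build a minimal IPv4 header plus port fields (checksum left as zero)."""
    header = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 24, 0, 0, 64, proto, 0,
                         socket.inet_aton(src), socket.inet_aton(dst))
    return header + struct.pack("!HH", sport, dport)


rules = [((None, "10.0.0.2", 6, None, 80), "http")]
pkt = make_packet("10.0.0.1", "10.0.0.2", 6, 12345, 80)
print(classify(pkt, rules))  # http
```

A hardware implementation would typically do the same match in parallel over all rules, so the first-match loop here is only a functional stand-in.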
Funding: Supported by the High Technology Research and Development Programme of China (No. 2003AA1Z2070) and the National Natural Science Foundation of China (No. 90412013).
Abstract: Under a virtualization approach based on large-scale disaggregation and sharing, the computing and storage components that are tightly coupled in a traditional server are interconnected over a network in a loosely coupled way, so that computing, storage, and service capacity can be allocated on demand at the application level. Under this new server model, the isolation and protection of user space and system space, together with the security monitoring of virtual resources, are key factors in the ultimate security guarantee. This article presents a large-scale, extensible distributed intrusion detection system for virtual computing environments, based on virtual machines. The system supports security monitoring and management of global resources and provides a uniform view of security attacks in the virtual computing environment, thereby protecting user applications and system security in the capacity services domain.