
HTDcr: a job execution framework for high-throughput computing on supercomputers

High-throughput computing (HTC) is a computing paradigm that aims to accomplish jobs by breaking them into smaller, independent components. However, it requires a large amount of computing power over a long period. Most existing HTC frameworks are job-oriented and lack support for co-scheduling with the hardware architecture and for task-level execution. In addition, most of these frameworks reach only a limited scale, and their usability needs further improvement. Herein, we present HTDcr, a job execution framework for HTC on supercomputers. This study aims to improve the throughput, task dispatching, and usability of the framework. In detail, the throughput optimizations include a carefully designed task management system, a hierarchical scheduler, and the co-optimization of the task-scheduling strategy with the application and hardware characteristics. The usability optimizations include a programmable execution workflow, mechanisms for more robust and reliable quality of service, and a fine-grained resource allocation system for the colocation of multiple jobs. According to our evaluations, HTDcr achieves outstanding scalability and high throughput on large-scale clusters for HTC workloads. We evaluate HTDcr with several microbenchmarks and real-world applications on Tianhe-2 and Sunway TaihuLight to demonstrate the effect of its design mechanisms. For instance, for two real-world applications, task scheduling that integrates the application and hardware characteristics achieves 1.7× and 1.9× speedups over the basic task-scheduling strategy.

high-throughput computing; supercomputer; task scheduling; middleware; password guessing

Jiazhi JIANG, Dan HUANG, Hu CHEN, Yutong LU, Xiangke LIAO


School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China

School of Software Engineering, South China University of Technology, Guangzhou 510006, China

National Key R&D Program of China (2021YFB0301300); National Natural Science Foundation of China (U1811461); Zhejiang Lab (2021KC0AB04); Major Program of Guangdong Basic and Applied Research (2019B030302002); Program for Guangdong Introducing Innovative and Entrepreneurial Teams (2016ZT06D211)

2024

Science China Information Sciences
Chinese Academy of Sciences


CSTPCD, EI
Impact factor: 0.715
ISSN: 1674-733X
Year, volume (issue): 2024, 67(1)