To address the problem of workflow job scheduling, a critical-path-based method is proposed to predict the execution time of a workflow and allocate resources. In the workflow execution time prediction algorithm, the directed acyclic graph (DAG) of the parallel application is used to describe the dependencies among the sub-jobs of a workflow. Based on the execution order implied by these dependencies, system resources are logically allocated to the sub-jobs. Using the characteristics and resource allocation information of each sub-job, a gradient boosting decision tree (GBDT)-based algorithm predicts the execution time of the sub-jobs, and the critical path of the workflow is then calculated. The sum of the completion times of all sub-jobs on the critical path is the predicted execution time of the workflow. If the predicted workflow execution time satisfies the user's requirements, the sub-jobs are scheduled according to the sub-job execution sequence and resource allocation scheme, and the workflow is executed. Comparative experiments show that the prediction errors of the execution time of two workflows are 5.72% and 1.57%, respectively. Compared with the default scheduling algorithm of Spark, the proposed workflow scheduling algorithm reduces the completion time of the two workflows by 15.71% and 15.44%, respectively.
Key words
workflow/time prediction/critical path/scheduling algorithm/gradient boosting decision tree
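The critical-path step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `critical_path`, the sub-job names, the durations, and the deadline are all hypothetical, and the durations stand in for values that a GBDT-based predictor would supply. The sketch only shows how the longest path through the sub-job DAG gives the predicted workflow execution time, and how that value could be compared with a user requirement.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def critical_path(durations, deps):
    """Return (total_time, path) for the longest path through a sub-job DAG.

    durations: {sub_job: predicted execution time}
    deps:      {sub_job: set of sub-jobs that must finish first}
    """
    graph = {job: deps.get(job, ()) for job in durations}
    finish = {}  # earliest finish time of each sub-job
    prev = {}    # predecessor of each sub-job on its longest incoming path

    # static_order() yields every sub-job after all of its predecessors.
    for job in TopologicalSorter(graph).static_order():
        latest = max(graph[job], key=finish.__getitem__, default=None)
        prev[job] = latest
        start = finish[latest] if latest is not None else 0.0
        finish[job] = start + durations[job]

    # The sub-job that finishes last ends the critical path; walk back from it.
    job = max(finish, key=finish.__getitem__)
    total = finish[job]
    path = []
    while job is not None:
        path.append(job)
        job = prev[job]
    return total, path[::-1]


# Hypothetical workflow: placeholder durations in place of GBDT predictions.
durations = {"load": 4.0, "clean": 3.0, "features": 6.0, "train": 5.0}
deps = {"clean": {"load"}, "features": {"load"}, "train": {"clean", "features"}}

total, path = critical_path(durations, deps)
deadline = 20.0                      # assumed user requirement
meets_deadline = total <= deadline   # schedule only if the prediction fits
print(total, path, meets_deadline)   # 15.0 ['load', 'features', 'train'] True
```

In this example the path load, features, train dominates because "features" takes longer than "clean", so the predicted workflow time is 4 + 6 + 5 = 15 time units.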