
A method for constructing a performance analysis model for high-performance applications based on a random forest classifier

Traditional performance analysis methods for high-performance applications suffer from additional overhead during analysis and from inaccurate results, which costs users more time and demands more domain knowledge. To address these problems, this paper recasts program performance analysis as a multi-class classification problem on an imbalanced, small-sample dataset with high-dimensional features. The dataset consists of 500 records covering seven types of runtime performance data, including the number of process switches, memory utilization, and disk I/O load. After preprocessing such as PCA dimensionality reduction, a random forest classifier is trained as the performance problem analysis model. Experiments show that the model can identify five classes of performance problems, including excessive memory utilization and heavy disk I/O load. To evaluate how well the model guides optimization, performance data were collected from runs of the HotSpot3D and LU-Decomposition programs, and, guided by the model's output, the two validation programs were optimized at the runtime level and the compilation level, respectively. The results indicate that the proposed method can effectively guide the optimization of program performance, with speedups of 1.056 and 5.657 for the two programs, respectively.
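The pipeline summarized in the abstract (standardize, reduce with PCA, classify with a random forest) can be sketched as follows. This is a minimal illustration using scikit-learn, not the authors' implementation. Only the record count (500), the number of metrics (7), and the number of problem classes (5) come from the abstract. The synthetic data from `make_classification`, the class proportions, the number of PCA components, the number of trees, and the use of `class_weight="balanced"` to handle class imbalance are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Figures taken from the abstract.
N_RECORDS = 500          # collected performance records
N_METRICS = 7            # process switches, memory utilization, disk I/O load, ...
N_PROBLEM_CLASSES = 5    # performance problem categories


def build_model(n_components=5, n_trees=200, seed=0):
    """Scale -> PCA -> random forest. All hyperparameters are assumed values."""
    return make_pipeline(
        StandardScaler(),
        PCA(n_components=n_components),
        RandomForestClassifier(
            n_estimators=n_trees,
            class_weight="balanced",  # assumption: reweight rare classes
            random_state=seed,
        ),
    )


def run_demo(seed=0):
    """Train and evaluate on synthetic, imbalanced stand-in data."""
    # Hypothetical stand-in for the real measurements, which are not available here.
    X, y = make_classification(
        n_samples=N_RECORDS,
        n_features=N_METRICS,
        n_informative=5,
        n_redundant=2,
        n_classes=N_PROBLEM_CLASSES,
        n_clusters_per_class=1,
        weights=[0.45, 0.25, 0.15, 0.10, 0.05],  # assumed class imbalance
        random_state=seed,
    )
    # Stratify so every class appears in both the training and test splits.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed
    )
    model = build_model(seed=seed)
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    # Macro-F1 weights every class equally, which suits imbalanced data.
    macro_f1 = f1_score(y_test, predictions, average="macro")
    return predictions, y_test, macro_f1


if __name__ == "__main__":
    _, _, macro_f1 = run_demo()
    print(f"macro-F1 on held-out synthetic data: {macro_f1:.3f}")
```

With real measurements, `X` would hold one row per record and one column per metric, and `y` the labeled performance problem. The score printed here reflects only the synthetic data and says nothing about the results reported in the paper.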

Nmon; performance analysis; variational autoencoder; clustering; random forest

Chai Xuqing (柴旭清), Qiao Yihang (乔一航), Fan Lilin (范黎林)


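College of Computer and Information Engineering, Henan Normal University, Xinxiang, Henan 453007

High Performance Computing Center, Henan Normal University, Xinxiang, Henan 453007

Henan Provincial Engineering Laboratory of Smart Business and Internet of Things Technology, Xinxiang, Henan 453007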


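National Natural Science Foundation of China (12274117); Henan Province Outstanding Youth Science Foundation (202300410226); Henan Province University Science and Technology Innovation Program (20HASTIT026)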

2024

Computer Engineering & Science (计算机工程与科学)
College of Computer, National University of Defense Technology


CSTPCD; Peking University Core Journals
Impact factor: 0.787
ISSN: 1007-130X
Year, volume (issue): 2024, 46(7)