
Optimizing the Yinyang K-means algorithm on many-core CPUs

The traditional Yinyang K-means algorithm is computationally expensive on large-scale clustering problems. Based on the architectural characteristics of typical many-core CPUs, an efficient parallel implementation of Yinyang K-means is proposed. The implementation uses a new in-memory data layout, accelerates the distance calculations in Yinyang K-means with the vector units of many-core CPUs, and applies memory access optimizations targeted at non-uniform memory access (NUMA) characteristics. Compared with an open-source multi-threaded implementation of Yinyang K-means, it achieves speedups of up to approximately 5.6 and 8.7 on ARMv8 and x86 many-core platforms, respectively. The experiments show that these optimizations successfully accelerate Yinyang K-means on many-core CPUs.
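The abstract does not spell out how Yinyang K-means avoids distance calculations, so the following is a minimal sketch of the general filtering idea, not the paper's implementation. Each point keeps an upper bound on the distance to its assigned center and one lower bound per group of centers. After the centers move, the bounds are loosened by the center drifts, and a distance is recomputed only when the bounds can no longer rule a group out. The names `dist`, `yinyang_kmeans` and `groups` are hypothetical. The local filter of the full algorithm is omitted, and the sketch is plain Python with none of the data layout, vectorization or NUMA optimizations described above.

```python
import math


def dist(a, b):
    """Euclidean distance between two equal-length coordinate sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def yinyang_kmeans(points, centers, groups, iters=20):
    """Simplified Yinyang K-means: global and group filters, no local filter.

    groups is a partition of the center indices into non-empty lists,
    e.g. [[0, 1], [2, 3]] (a hypothetical, caller-chosen grouping).
    Returns (assignment, centers, number of point-to-center distances).
    """
    n, k = len(points), len(centers)
    centers = [tuple(c) for c in centers]
    group_of = {j: g for g, members in enumerate(groups) for j in members}
    assign, ub = [0] * n, [0.0] * n
    lb = [[math.inf] * len(groups) for _ in range(n)]
    n_dist = 0

    # Initial pass: compute all distances, which gives exact bounds.
    for i, p in enumerate(points):
        d = [dist(p, c) for c in centers]
        n_dist += k
        assign[i] = min(range(k), key=d.__getitem__)
        ub[i] = d[assign[i]]
        for g, members in enumerate(groups):
            lb[i][g] = min((d[j] for j in members if j != assign[i]),
                           default=math.inf)

    for _ in range(iters):
        # Update step: each center moves to the mean of its points.
        new_centers = []
        for j in range(k):
            mine = [points[i] for i in range(n) if assign[i] == j]
            new_centers.append(
                tuple(sum(col) / len(mine) for col in zip(*mine))
                if mine else centers[j])
        drift = [dist(a, b) for a, b in zip(centers, new_centers)]
        centers = new_centers
        if max(drift) == 0.0:
            break  # converged: no center moved
        group_drift = [max(drift[j] for j in members) for members in groups]

        for i, p in enumerate(points):
            old = assign[i]
            # Loosen both bounds by the drifts (triangle inequality).
            ub[i] += drift[old]
            lb[i] = [b - gd for b, gd in zip(lb[i], group_drift)]
            if ub[i] <= min(lb[i]):
                continue  # global filter: no distance computed
            ub[i] = dist(p, centers[old])  # tighten the upper bound
            n_dist += 1
            if ub[i] <= min(lb[i]):
                continue  # global filter after tightening
            known = {old: ub[i]}  # exact distances known for this point
            examined = [g for g in range(len(groups)) if lb[i][g] < ub[i]]
            for g in examined:  # group filter: all other groups are skipped
                for j in groups[g]:
                    if j not in known:
                        known[j] = dist(p, centers[j])
                        n_dist += 1
            best = min(known, key=known.get)
            for g in examined:
                lb[i][g] = min((known[j] for j in groups[g] if j != best),
                               default=math.inf)
            if best != old:  # the old center is now an ordinary candidate
                g_old = group_of[old]
                lb[i][g_old] = min(lb[i][g_old], known[old])
            assign[i], ub[i] = best, known[best]
    return assign, centers, n_dist
```

Calling `yinyang_kmeans(points, initial_centers, [[0, 1, 2], [3, 4, 5]])` with six initial centers would cluster `points` using two groups of three centers. The returned distance count can be compared against the number of passes times `len(points) * len(centers)` that a plain Lloyd iteration would need. The distance loops that remain after filtering are the part the abstract describes as being accelerated with vector units.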

Keywords: K-means; NUMA; vectorization; many-core CPU; performance optimization

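Authors: ZHOU Tianyang (周天阳), WANG Qinglin (王庆林), LI Rongchun (李荣春), MEI Songzhu (梅松竹), YIN Shangfei (尹尚飞), HAO Ruochen (郝若晨), LIU Jie (刘杰)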


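College of Computer Science and Technology, National University of Defense Technology, Changsha, Hunan 410073, China

National Key Laboratory of Parallel and Distributed Computing, National University of Defense Technology, Changsha, Hunan 410073, China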


Funding: National Natural Science Foundation of China (62002365)

2024

Journal of National University of Defense Technology (国防科技大学学报)
National University of Defense Technology


Indexed in: CSTPCD; Peking University Core Journals (北大核心)
Impact factor: 0.517
ISSN: 1001-2486
Year, volume (issue): 2024, 46(1)