Distributed Sparse Soft Large Margin Clustering

Soft large margin clustering (SLMC) achieves better clustering performance than algorithms such as K-Means and offers a degree of interpretability. However, when the data are large-scale and stored in a distributed fashion, these methods all hit the same scalability bottleneck; for SLMC, computing the kernel matrix it relies on is expensive in time. One effective way to reduce this cost is to approximate the kernel function with a random Fourier feature transform, but the feature dimension needed for an accurate approximation is often very high, which carries a risk of overfitting. This paper embeds sparsity into kernel SLMC and combines it with the alternating direction method of multipliers (ADMM), yielding a distributed sparse soft large margin clustering algorithm (DS-SLMC) that overcomes the scalability problem and gains better interpretability through sparsification.
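The kernel-approximation idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes a Gaussian (RBF) kernel, which the abstract does not specify, and the names `rff_features`, `rbf_kernel`, and `gamma` are illustrative choices rather than anything taken from the paper. The sketch covers only the random Fourier feature step, not the sparsity term or the ADMM updates of DS-SLMC, and shows how the approximation error tends to shrink as the feature dimension grows.

```python
import numpy as np

def rff_features(X, n_features, gamma, rng):
    """Random Fourier features approximating k(x, y) = exp(-gamma * ||x - y||^2).

    Assumption: an RBF kernel; the paper's actual kernel is not stated in the abstract.
    """
    d = X.shape[1]
    # Frequencies are sampled from the kernel's spectral density, N(0, 2 * gamma * I).
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, n_features))
    # Random phases, uniform on [0, 2*pi).
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

def rbf_kernel(X, gamma):
    """Exact RBF kernel matrix, used here only as a reference."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))   # synthetic data, for illustration only
gamma = 0.1
K = rbf_kernel(X, gamma)

errors = {}
for D in (50, 500, 5000):
    Z = rff_features(X, D, gamma, rng)
    # Z @ Z.T is the approximate kernel matrix built from explicit features.
    errors[D] = float(np.max(np.abs(Z @ Z.T - K)))
    print(D, errors[D])
```

With explicit features, a linear model on `Z` can stand in for the kernel machine, so the full kernel matrix never has to be formed. The catch the abstract points to is visible here: a tight approximation needs a large `D`, which is what motivates adding sparsity.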

alternating direction method of multipliers (ADMM); soft large margin clustering (SLMC); distributed machine learning; kernel approximation

Xie Yunxuan (谢云轩), Chen Songcan (陈松灿)

College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106

2024

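Journal of Data Acquisition and Processing (数据采集与处理)
Chinese Institute of Electronics; Signal Processing Society of the China Instrument and Control Society; Weak Signal Detection Society of the China Instrument and Control Society and the Chinese Physical Society; Nanjing University of Aeronautics and Astronautics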

Indexed in: CSTPCD; Peking University Core Journals (北大核心)
Impact factor: 0.679
ISSN: 1004-9037
Year, volume (issue): 2024, 39(2)