To address the problems that traditional K-means clustering is sensitive to the sample distribution and that the limited representational power of kernel functions leads to poor clustering performance on complex problems, a deep multiple kernel K-means (DMKK-means) clustering algorithm with strong representational ability and distributional robustness is proposed, which exploits the strong expressiveness of deep kernels through multi-kernel ensembling. A deep multiple kernel network architecture with strong representational ability is constructed, and K-means clustering is performed in the new feature space; a clustering loss function based on Kullback-Leibler (KL) divergence measures the difference between the algorithm and two baseline clustering methods; the clustering algorithm is formulated as an efficient end-to-end learning problem, and the weight parameters of the deep multiple kernel network are updated with stochastic gradient descent. Experiments on multiple standard datasets show that, compared with K-means, radial basis function kernel K-means (RBFKKM), and other multiple kernel K-means clustering algorithms, the proposed algorithm achieves clear improvements in clustering accuracy, normalized mutual information, and adjusted Rand index, verifying its feasibility and effectiveness.
DMKK-means: a deep multiple kernel K-means clustering algorithm
The proposed algorithm, deep multiple kernel K-means (DMKK-means), addresses the limitations of traditional K-means clustering, which is sensitive to the sample distribution and performs poorly on complex problems because of the limited expressive power of its kernel representation. Leveraging the strong representational capability of deep kernels and a multi-kernel ensemble approach, DMKK-means constructs a highly expressive deep multiple kernel network architecture and performs K-means clustering in the new feature space. The dissimilarity between this algorithm and two baseline clustering methods is quantified by a clustering loss function based on Kullback-Leibler (KL) divergence. The clustering algorithm is modeled as an efficient end-to-end learning problem, and the weight parameters of the deep multiple kernel network are optimized by stochastic gradient descent. Experimental results on multiple standard datasets demonstrate the superiority of the proposed algorithm over K-means, radial basis function kernel K-means (RBFKKM), and other multiple kernel K-means clustering algorithms in terms of clustering accuracy, normalized mutual information, and adjusted Rand index, validating its feasibility and effectiveness.
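The pipeline summarized above can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the RBF base kernels, the two-layer kernel composition standing in for the deep multiple kernel network, and the Student-t soft assignments with a sharpened target distribution (a common choice in deep clustering) are all assumptions, and the stochastic gradient update of the kernel weights is omitted — only the forward pass and the KL clustering loss are shown.

```python
import numpy as np

def rbf_kernel(X, gamma):
    # Gaussian (RBF) kernel from pairwise squared Euclidean distances.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def deep_multiple_kernel(X, gammas, weights, depth=2):
    # Layer 1: convex combination of base RBF kernels (the multi-kernel part).
    K = sum(w * rbf_kernel(X, g) for w, g in zip(weights, gammas))
    # Deeper layers: a Gaussian kernel over the metric induced by the previous
    # layer's kernel — an illustrative stand-in for a learned deep composition.
    for _ in range(depth - 1):
        diag = np.diag(K)
        d2 = diag[:, None] + diag[None, :] - 2.0 * K
        K = np.exp(-np.maximum(d2, 0.0))
    return K

def kernel_kmeans_embed(K, n_clusters, n_iter=50, seed=0):
    # Kernel K-means via spectral embedding of K, then Lloyd's algorithm.
    vals, vecs = np.linalg.eigh(K)
    Z = vecs[:, -n_clusters:] * np.sqrt(np.maximum(vals[-n_clusters:], 0.0))
    rng = np.random.default_rng(seed)
    centers = Z[rng.choice(len(Z), n_clusters, replace=False)]
    for _ in range(n_iter):
        d = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        for c in range(n_clusters):
            if np.any(labels == c):
                centers[c] = Z[labels == c].mean(0)
    return labels, Z, centers

def kl_clustering_loss(Z, centers):
    # Student-t soft assignments and a sharpened target distribution;
    # KL(P || Q) serves as the clustering loss (assumed form).
    d2 = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    q = 1.0 / (1.0 + d2)
    q /= q.sum(1, keepdims=True)
    p = q ** 2 / q.sum(0)
    p /= p.sum(1, keepdims=True)
    return float(np.sum(p * np.log(p / q)))
```

In an end-to-end version, the kernel weights (and any deep-kernel parameters) would be treated as learnable and updated by stochastic gradient descent on this KL loss, which is what turns the fixed pipeline above into a trainable clustering model.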