Computer Engineering and Design (计算机工程与设计), 2024, Vol. 45, Issue 9: 2757-2763. DOI: 10.16208/j.issn1000-7024.2024.09.027

Active learning based on feature representation and model prediction

Jiang Haitao (姜海涛), Qiu Baozhi (邱保志), Li Xiangli (李向丽)

Author information

  • 1. School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou, Henan 450001, China

Abstract

To address the problem that existing active learning algorithms typically ignore sample feature representation information during sampling, an active learning model based on sample feature representation and model prediction is proposed. To alleviate the cold-start problem that active learning algorithms face in the early stages of model training, a labeling set initialization algorithm is also proposed. Clustering is used to extract feature representation information from the samples, and a classifier provides their model prediction information, so that the sample distribution of the initial labeling set is as close as possible to that of the original dataset. Experimental results show that the proposed model achieves higher classification accuracy than several active learning baselines, and that the labeling set initialization algorithm effectively alleviates the cold-start problem.
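The abstract does not say which clustering method or selection rule the initialization algorithm uses, so the following is only a minimal sketch of the general idea of a cluster-based initial labeling set whose distribution mirrors the dataset. It assumes plain k-means, a per-cluster quota proportional to cluster size, and selection of the points nearest each centroid. The function names `kmeans` and `init_labeled_set` are hypothetical, and the model prediction component of the paper is not covered.

```python
import math
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns (centroids, cluster index per point)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        assign = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Recompute each centroid as the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old centroid if a cluster empties out
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assign


def init_labeled_set(points, k, budget, seed=0):
    """Pick `budget` point indices to label first (assumed selection rule)."""
    centroids, assign = kmeans(points, k, seed=seed)
    sizes = [assign.count(c) for c in range(k)]
    # Quota proportional to cluster size, so the selection mirrors the data.
    exact = [budget * s / len(points) for s in sizes]
    quota = [int(e) for e in exact]
    # Hand leftover slots to the clusters with the largest fractional parts.
    order = sorted(range(k), key=lambda c: exact[c] - quota[c], reverse=True)
    for c in order[: budget - sum(quota)]:
        quota[c] += 1
    chosen = []
    for c in range(k):
        members = [i for i, a in enumerate(assign) if a == c]
        # Within a cluster, prefer the points closest to its centroid.
        members.sort(key=lambda i: math.dist(points[i], centroids[c]))
        chosen.extend(members[: quota[c]])
    return chosen
```

In practice the clustering would run on learned feature vectors (for example, embeddings from a pretrained network) rather than raw coordinates, and the returned indices would be sent to the annotator before the first training round.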

Key words

active learning / feature representation / model prediction / cold start / clustering / image classification / labeling set initialization


Funding

Joint Funds of the National Natural Science Foundation of China (U21B2037)

National Natural Science Foundation of China (62172371)

Publication year

2024
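Computer Engineering and Design
Institute 706, Second Academy of China Aerospace Science and Industry Corporation

Indexed in: CSTPCD, Peking University Core Journals
Impact factor: 0.617
ISSN: 1000-7024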