Research on Multimodal Social Media Information Popularity Prediction Based on Large Language Models

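Abstract (translated from Chinese): Existing multimodal social media popularity prediction algorithms depend heavily on features, generalize poorly, and perform badly in few-shot and cold-start settings. To address these problems, MultiSmpLLM is proposed, a multimodal social media popularity prediction model based on instruction fine-tuning and human alignment of a large language model. First, the task of multimodal social media popularity prediction for cold-start users is defined. Second, multimodal fine-tuning instructions are constructed, and the base large language model (Llama3) is instruction fine-tuned with low-rank adaptation (LoRA) and with freeze fine-tuning (Freeze), respectively. Finally, IDPOP, an improved direct preference optimization (DPO) algorithm, is proposed. It constructs preference data and adds a parameter-controlled penalty term to the DPO loss function, which addresses both the unstable, non-converging training of reinforcement learning from human feedback (RLHF) and the erroneous optimization that standard DPO produces on the social media popularity prediction task. Experimental results show that MultiSmpLLM significantly outperforms traditional multimodal prediction models and multimodal large language models such as GPT-4o.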

English title (as published): Research on multimodal social media information popularity prediction based on large language model

English abstract (as published): To address the limitations of strong feature dependency, insufficient generalization, and inadequate performance in few-shot/cold-start settings in existing multimodal social media popularity prediction algorithms, a MultiSmpLLM model based on a large language model with instruction fine-tuning and human alignment was proposed. Firstly, the task of multimodal social media popularity prediction for cold-start users was defined. Secondly, multimodal fine-tuning instructions were constructed, and the large language model (Llama3) was instruction fine-tuned using the low-rank adaptation (LoRA) and parameter freeze (Freeze) methods. Finally, an improved direct preference optimization (DPO) algorithm, IDPOP, was developed by constructing preference data and adding a parameter-tuned penalty to the DPO loss function, resolving instability and non-convergence in RLHF and incorrect optimization in standard DPO for social media popularity prediction. Experiments show that MultiSmpLLM outperforms conventional multimodal prediction models and multimodal large language models such as GPT-4o.
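
The abstract does not state the exact form of the IDPOP penalty, so the following is only a minimal sketch of the general idea of adding a parameter-scaled penalty to the DPO loss. It assumes a hinge-style penalty that activates when the policy assigns the preferred response a lower log-probability than the reference model does. The function names, the penalty form, and the default values of `beta` and `lam` are illustrative assumptions, not the paper's definitions.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_loss(pol_w: float, pol_l: float, ref_w: float, ref_l: float,
             beta: float = 0.1) -> float:
    """Standard DPO loss for one preference pair.

    pol_w / pol_l: policy log-probabilities of the preferred / rejected response.
    ref_w / ref_l: reference-model log-probabilities of the same responses.
    """
    margin = (pol_w - ref_w) - (pol_l - ref_l)
    return -log_sigmoid(beta * margin)


def penalized_dpo_loss(pol_w: float, pol_l: float, ref_w: float, ref_l: float,
                       beta: float = 0.1, lam: float = 5.0) -> float:
    """DPO loss with an assumed hinge penalty scaled by lam (hypothetical form).

    The penalty is non-zero only when the policy's log-probability of the
    preferred response has dropped below the reference model's.
    """
    penalty = max(0.0, ref_w - pol_w)
    margin = (pol_w - ref_w) - (pol_l - ref_l) - lam * penalty
    return -log_sigmoid(beta * margin)


if __name__ == "__main__":
    # Both responses became less likely, but the rejected one dropped more.
    # Plain DPO sees a positive margin; the penalized loss is larger.
    print(dpo_loss(-12.0, -16.0, -10.0, -10.0))
    print(penalized_dpo_loss(-12.0, -16.0, -10.0, -10.0))
```

In this sketch the two losses coincide whenever the preferred response is at least as likely under the policy as under the reference model, so the penalty only changes the gradient in the case where plain DPO could widen the margin by lowering both log-probabilities.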

Keywords (English, as published):

large language model; popularity prediction; instruction fine-tuning; human alignment

Authors:

王洁、王子曈、彭岩、郝博文

Affiliation:

School of Management, Capital Normal University, Beijing 100089, China

Keywords (translated from Chinese):

large language model; popularity prediction; instruction fine-tuning; human alignment

Publication year:

2024

DOI:

10.11959/j.issn.1000-436x.2024193

Journal on Communications (通信学报)

China Institute of Communications (中国通信学会)

Indexed in: CSTPCD; Peking University Core Journals (北大核心)

Impact factor: 1.265

ISSN: 1000-436X

Year, volume (issue): 2024, 45(11)