Digital Chef and Intelligent Cooking System Based on a Multimodal Large Language Model

To meet the demand for high-quality, precise cooking, a digital chef and intelligent cooking method based on a multimodal large language model is proposed. In the offline phase, visual, acoustic, and temperature sensors record a professional chef's continuous cooking operations. The collected frame-by-frame images are fused with multi-round question-and-answer texts to build a culinary expert knowledge base, and a low-rank adaptation method is used to fine-tune a pretrained multimodal large language model so that it can understand cooking intentions. In the online phase, data sensed in real time are converted into image-text inputs for the fine-tuned model, which analyzes them and generates cooking instructions that guide the user through the corresponding cooking actions. Taking pan-fried steak as an example task, an intelligent cooking hardware-software system was built and validated experimentally. The results show that the fine-tuned system effectively controls the doneness and quality of the steak and, compared with the model before fine-tuning, significantly improves the rationality and specificity of the cooking instructions.
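
The low-rank adaptation step named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the layer sizes, the rank `r`, the scale `alpha`, and the function name `lora_forward` are assumptions chosen only for illustration. It shows the general idea of keeping a pretrained weight frozen and learning a small low-rank correction on top of it.

```python
import numpy as np

def lora_forward(x, w0, a, b, alpha):
    """Linear layer with a frozen weight w0 plus a low-rank update.

    Effective weight: w0 + (alpha / r) * b @ a, where r is the rank.
    Only a and b would be trained; w0 stays fixed.
    """
    r = a.shape[0]
    base = x @ w0.T                           # frozen pretrained path
    update = (alpha / r) * ((x @ a.T) @ b.T)  # trainable low-rank path
    return base + update

rng = np.random.default_rng(0)
d_out, d_in, r = 8, 16, 2                     # hypothetical sizes

w0 = rng.normal(size=(d_out, d_in))           # frozen pretrained weight
a = rng.normal(size=(r, d_in)) * 0.01         # trainable, small random init
b = np.zeros((d_out, r))                      # trainable, zero init

x = rng.normal(size=(4, d_in))                # a batch of 4 input vectors
y = lora_forward(x, w0, a, b, alpha=4.0)

# Parameter counts: full weight vs. low-rank factors
full_params = d_out * d_in
lora_params = r * (d_in + d_out)
```

Because `b` starts at zero, the adapted layer initially reproduces the pretrained layer exactly, and fine-tuning only has to learn the `r * (d_in + d_out)` entries of the two small factors rather than the full weight matrix.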

multimodal large language model; digital chef; intelligent cooking; cooking robot; expert system; artificial intelligence

李鑫源、李柏、孙跃硕、张坦探、田永林、殷烛炎、王飞跃

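College of Mechanical and Vehicle Engineering, Hunan University, Changsha 410082, Hunan, China

State Key Laboratory of Advanced Design and Manufacturing Technology for Vehicle, Hunan University, Changsha 410082, Hunan, China

State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China

State Key Laboratory for Management and Control of Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China

School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China

Department of Engineering Science, Faculty of Innovation Engineering, Macau University of Science and Technology, Macao 999078, China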

2024

Chinese Journal of Intelligent Science and Technology

CSTPCD
ISSN:
Year, volume (issue): 2024, 6(4)