Digital chefs and intelligent cooking systems based on a multimodal large language model
A digital chef and an intelligent cooking method were proposed to achieve high-quality, precise cooking results. In the offline phase, visual, auditory, and thermal sensors recorded the continuous cooking operations of professional chefs. The collected frame-by-frame images and multi-round Q&A texts formed a culinary expert knowledge base. A low-rank adaptation method was applied to fine-tune a pretrained multimodal large language model, enabling it to understand cooking intentions. In the online phase, real-time sensory data were converted into image-text inputs for the fine-tuned model, which then generated cooking instructions to guide users through the cooking steps. A hardware-software cooking system was implemented and tested on a pan-frying steak task. Experimental results show that the fine-tuned system effectively controls the doneness and quality of the steak and significantly improves the accuracy and rationality of the cooking instructions compared with the model before fine-tuning.
multimodal large language model; digital chef; intelligent cooking; cooking robot; expert system; artificial intelligence
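The abstract describes fine-tuning a pretrained multimodal large language model with low-rank adaptation (LoRA) on image-text pairs drawn from chef recordings and multi-round Q&A. The following is a minimal sketch of such a setup using the Hugging Face Transformers and PEFT libraries; the base model, target modules, hyperparameters, and prompt format are illustrative assumptions, not the configuration reported in the paper.

```python
# Sketch: LoRA fine-tuning setup for a multimodal (vision-language) model.
# Assumptions: Hugging Face Transformers + PEFT; LLaVA-style base model chosen
# only for illustration.
from transformers import AutoProcessor, AutoModelForVision2Seq
from peft import LoraConfig, get_peft_model

base_model_id = "llava-hf/llava-1.5-7b-hf"  # hypothetical choice of base model
processor = AutoProcessor.from_pretrained(base_model_id)
model = AutoModelForVision2Seq.from_pretrained(base_model_id)

# Low-rank adaptation: inject small trainable matrices into the attention
# projections while the pretrained weights stay frozen.
lora_config = LoraConfig(
    r=16,                                   # rank of the low-rank update (illustrative)
    lora_alpha=32,                          # scaling factor
    target_modules=["q_proj", "v_proj"],    # assumed target projections
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()          # only the LoRA weights are trainable

# Each training sample would pair a recorded frame with one Q&A turn from the
# culinary expert knowledge base (prompt wording is hypothetical):
prompt = "USER: <image>\nWhat should be done to the steak at this stage? ASSISTANT:"
# inputs = processor(images=frame, text=prompt, return_tensors="pt")
# ...standard supervised fine-tuning loop over the knowledge base follows...
```

At inference time, the same processor would convert the real-time sensor frames and a user query into image-text inputs, and the fine-tuned model would generate the next cooking instruction, consistent with the online phase described in the abstract.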