
Research on a Tibetan Large Language Model Based on Parameter-Efficient Fine-Tuning

A large language model is a deep learning model with a very large number of parameters, offering powerful representation learning and generation capabilities, and it has had a profound impact on fields such as natural language processing. With continuing technical progress, large models keep achieving breakthroughs in performance and application scope and have become a research hotspot in artificial intelligence. However, their development also faces challenges, such as high training cost, parameter redundancy, and limitations in cross-lingual applications. In particular, for Tibetan, a language with distinctive linguistic characteristics, research on large models is still at an early stage and lacks corresponding models and resources. To address these issues, this paper applies LoRA-based parameter-efficient fine-tuning and builds the Tibetan-Llama2 and Tibetan-Alpaca models on the Llama2 architecture. After incremental pre-training and instruction fine-tuning on relatively large-scale data, the two models can understand and generate long Tibetan texts, demonstrate multi-task learning ability, and show broad application prospects across multiple domains.
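For readers unfamiliar with the method named in the abstract, the following is a minimal, hypothetical sketch of how LoRA-based parameter-efficient fine-tuning can be applied to a Llama2-style model with the Hugging Face PEFT library. It is not the authors' implementation; the checkpoint name, rank, scaling factor, and target modules are assumptions chosen purely for illustration.

    # Minimal sketch of LoRA-based parameter-efficient fine-tuning on a Llama2-style
    # model using the Hugging Face PEFT library. This is NOT the paper's code; the
    # checkpoint name, rank, scaling, and target modules below are illustrative
    # assumptions only.
    from transformers import AutoModelForCausalLM
    from peft import LoraConfig, TaskType, get_peft_model

    # Load a Llama2-style base model whose original weights stay frozen (assumed checkpoint name).
    model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

    # LoRA adds trainable low-rank matrices B and A to selected weight matrices
    # (effectively W + B*A), so only a small fraction of parameters is updated
    # during incremental pre-training and instruction fine-tuning.
    config = LoraConfig(
        task_type=TaskType.CAUSAL_LM,
        r=8,                                  # low-rank dimension (assumed)
        lora_alpha=32,                        # LoRA scaling factor (assumed)
        lora_dropout=0.05,
        target_modules=["q_proj", "v_proj"],  # attention projections (assumed)
    )
    model = get_peft_model(model, config)
    model.print_trainable_parameters()        # reports the small trainable share

Because only the low-rank adapters are trained, this style of fine-tuning keeps the memory and compute cost far below full-parameter training, which is what makes adapting a Llama2-scale model to a lower-resource language such as Tibetan practical.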

natural language processing; Tibetan large language model; parameter-efficient fine-tuning; incremental pre-training; instruction fine-tuning

杨毛加、柔特、才智杰、官却才让、贡去卓么


School of Computer Science, Qinghai Normal University, Xining, Qinghai 810016, China

State Key Laboratory of Tibetan Intelligent Information Processing and Application, Xining, Qinghai 810008, China


2024

Journal of Chinese Information Processing
Chinese Information Processing Society of China; Institute of Software, Chinese Academy of Sciences


Indexed in: CSTPCD, CHSSCD, PKU Core Journals
Impact factor: 0.8
ISSN: 1003-0077
Year, Volume (Issue): 2024, 38(12)