A Thangka Question Classification Model Fusing Multi-scale CNN and Bidirectional LSTM

The rise of large language models has provided new ideas for researchers in natural language processing, search engines, life science research, and other fields, but large language models suffer from high resource consumption and slow inference, which makes them difficult to apply in industrial scenarios, especially in vertical domains. To address this problem, a Thangka question classification model fusing a multi-scale convolutional neural network (CNN) with a bidirectional long short-term memory network (LSTM) is proposed. The model fuses the global and local features of the data to perform the Thangka question classification task: the global features reflect the essential characteristics of the data, the local features capture easily overlooked parts of the data, and the two are fused by concatenation to enrich the feature representation of a sentence. Experiments on the Thangka dataset and the THUCNews dataset show that the proposed model is slightly more accurate than the BERT model while shortening training time by 1/20 and inference time by 1/3. The experiments on the public dataset further show that the model is also applicable and effective for general text classification tasks.
text classification; long short-term memory; multi-scale convolutional neural network; Thangka
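The concatenation-based fusion described in the abstract can be sketched as follows. This is a minimal illustration in NumPy, not the paper's implementation: all dimensions (sequence length, embedding size, channel counts, kernel widths) are hypothetical, and the BiLSTM's global feature is stubbed with a random vector so that only the multi-scale convolution and the concatenation step are shown.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1d_relu(x, kernel):
    """Valid 1-D convolution over a token sequence, followed by ReLU.

    x      : (seq_len, emb_dim)  token embeddings
    kernel : (k, emb_dim, out_ch) convolution filter bank
    returns: (seq_len - k + 1, out_ch) feature map
    """
    k = kernel.shape[0]
    steps = x.shape[0] - k + 1
    out = np.empty((steps, kernel.shape[2]))
    for i in range(steps):
        # contract the window (k, emb_dim) against the filters (k, emb_dim, out_ch)
        out[i] = np.einsum("ke,keo->o", x[i : i + k], kernel)
    return np.maximum(out, 0.0)

# Hypothetical sentence: 12 tokens, 8-dimensional embeddings.
emb = rng.normal(size=(12, 8))

# Local features: convolutions at three kernel widths (multi-scale),
# each max-pooled over time, then concatenated -> 3 * 16 = 48 dims.
local = np.concatenate(
    [conv1d_relu(emb, rng.normal(size=(k, 8, 16))).max(axis=0) for k in (2, 3, 4)]
)

# Global feature: stand-in for the BiLSTM's final forward and backward
# hidden states (hidden size 32 each) -> 64 dims.  Stubbed here.
global_feat = rng.normal(size=(2 * 32,))

# Fusion by concatenation, as in the proposed model -> 112-dim sentence vector.
fused = np.concatenate([local, global_feat])
print(fused.shape)
```

A classifier head (e.g. a linear layer with softmax over the question categories) would then consume the fused vector; concatenation keeps both feature views intact rather than averaging them away.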