
Natural Language Question Generation for Building and Municipal Engineering Based on RetNet

Most current question generation models are based on the Transformer architecture. As text length grows, however, the Transformer's KV cache causes GPU memory usage to increase linearly and throughput to fall, which raises inference cost. To address this, a RetNet-Bert question generation model is built on RetNet. The model replaces multi-head attention with a multi-scale retention mechanism that has both a parallel form and a recurrent form, which improves inference efficiency. Experiments show that RetNet-Bert performs better on long-sequence modeling while offering parallel training, low-cost deployment, and efficient inference, and that it is highly feasible and effective for question generation over building and municipal engineering information.
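The dual parallel and recurrent form mentioned in the abstract can be illustrated with a minimal sketch. It is not the RetNet-Bert implementation: it assumes a simplified single-head retention without position rotation, normalization, or gating, and the function names and the per-head decay values in `gammas` are illustrative choices. The parallel form computes all positions at once with a causal decay mask, while the recurrent form carries a fixed-size state, so its memory does not grow with sequence length the way a KV cache does.

```python
import random


def retention_parallel(q, k, v, gamma):
    """Parallel form: out[n] = sum over m <= n of gamma**(n-m) * (q[n].k[m]) * v[m]."""
    d_v = len(v[0])
    out = []
    for i in range(len(q)):
        row = [0.0] * d_v
        for j in range(i + 1):  # causal mask: only positions j <= i contribute
            score = sum(a * b for a, b in zip(q[i], k[j])) * gamma ** (i - j)
            for c in range(d_v):
                row[c] += score * v[j][c]
        out.append(row)
    return out


def retention_recurrent(q, k, v, gamma):
    """Recurrent form: state = gamma * state + outer(k[n], v[n]); out[n] = q[n] @ state."""
    d_k, d_v = len(k[0]), len(v[0])
    # The state has a fixed d_k x d_v size, independent of sequence length.
    state = [[0.0] * d_v for _ in range(d_k)]
    out = []
    for q_t, k_t, v_t in zip(q, k, v):
        for a in range(d_k):
            for c in range(d_v):
                state[a][c] = gamma * state[a][c] + k_t[a] * v_t[c]
        out.append([sum(q_t[a] * state[a][c] for a in range(d_k)) for c in range(d_v)])
    return out


def max_abs_diff(x, y):
    """Largest element-wise absolute difference between two matrices."""
    return max(abs(a - b) for rx, ry in zip(x, y) for a, b in zip(rx, ry))


def rand_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
q = rand_matrix(6, 4, rng)
k = rand_matrix(6, 4, rng)
v = rand_matrix(6, 4, rng)

# "Multi-scale": each head uses its own decay rate (assumed values for illustration).
gammas = [1 - 2 ** (-5 - h) for h in range(3)]
diffs = [max_abs_diff(retention_parallel(q, k, v, g), retention_recurrent(q, k, v, g))
         for g in gammas]
```

Under these assumptions the two forms agree up to floating-point error for every decay rate, which is the property that lets training run in parallel and inference run step by step.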

question generation model; RetNet model; long sequence modeling; building and municipal engineering information

李陟、阎文博


China Energy Engineering Group Shanxi Electric Power Survey and Design Institute Co., Ltd., Taiyuan 030001, China


2024

Science Technology and Industry (科技和产业)
Chinese Society of Technology Economics (中国技术经济学会)


Impact factor: 0.361
ISSN: 1671-1807
Year, volume (issue): 2024, 24(23)