Natural Language Question Generation for Construction and Municipal Engineering Based on RetNet

Most current question generation models are built on the Transformer architecture. As the input text grows longer, however, the Transformer's KV cache causes GPU memory usage to grow linearly and throughput to fall, which raises inference cost. To address this, the RetNet model is used to construct a RetNet-Bert question generation model. The model replaces multi-head attention with a multi-scale retention mechanism that has both a parallel form and a recurrent form, which improves inference efficiency. Experiments show that RetNet-Bert performs better on long-sequence modeling while offering parallel training, low-cost deployment, and efficient inference, and that it is feasible and effective for generating questions from construction and municipal engineering information.
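The parallel and recurrent forms mentioned in the abstract can be illustrated with a minimal sketch. It is a simplified single-head retention written in plain Python, not the RetNet-Bert implementation: it assumes a scalar decay `gamma` and leaves out the position rotation, normalization, and gating used in the full RetNet layer. The function names and the toy inputs are illustrative assumptions.

```python
# Simplified single-head retention (illustrative only).
# Omits RetNet's position rotation, group normalization, and gating.

def retention_parallel(Q, K, V, gamma):
    """Parallel form: out[n] = sum over m <= n of gamma**(n-m) * (Q[n].K[m]) * V[m]."""
    n_tokens, d_v = len(Q), len(V[0])
    out = []
    for n in range(n_tokens):
        row = [0.0] * d_v
        for m in range(n + 1):  # causal mask: only positions m <= n contribute
            score = sum(q * k for q, k in zip(Q[n], K[m])) * gamma ** (n - m)
            for j in range(d_v):
                row[j] += score * V[m][j]
        out.append(row)
    return out


def retention_recurrent(Q, K, V, gamma):
    """Recurrent form: S_n = gamma * S_(n-1) + K_n^T V_n, then out[n] = Q_n S_n."""
    d_k, d_v = len(K[0]), len(V[0])
    # The state S has a fixed d_k x d_v size, independent of sequence length.
    S = [[0.0] * d_v for _ in range(d_k)]
    out = []
    for q, k, v in zip(Q, K, V):
        for i in range(d_k):
            for j in range(d_v):
                S[i][j] = gamma * S[i][j] + k[i] * v[j]
        out.append([sum(q[i] * S[i][j] for i in range(d_k)) for j in range(d_v)])
    return out


# Toy inputs: 3 tokens, key and value dimension 2 (made-up numbers).
Q = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
K = [[1.0, 2.0], [0.0, 1.0], [1.0, 0.0]]
V = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
par = retention_parallel(Q, K, V, gamma=0.5)
rec = retention_recurrent(Q, K, V, gamma=0.5)
```

Both functions compute the same outputs. The parallel form processes all positions at once, which suits training, while the recurrent form carries only the fixed-size state `S` from token to token instead of a cache that grows with the sequence. In a multi-scale setting, each head would use its own `gamma`.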

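Keywords: question generation model; RetNet model; long sequence modeling; construction and municipal information

Li Zhi (李陟), Yan Wenbo (阎文博)

China Energy Engineering Group Shanxi Electric Power Survey and Design Institute Co., Ltd. (中国能源建设集团山西省电力勘测设计院有限公司), Taiyuan 030001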

Science Technology and Industry (科技和产业)
Chinese Society of Technology Economics (中国技术经济学会)

Impact factor: 0.361
ISSN: 1671-1807
Year, volume (issue): 2024, 24(23)