Research on Analysis Technology of Bidding Documents Based on Fusion Model of Bi-LSTM and Conditional Random Field

With the rapid expansion of the State Grid Corporation of China's electronic bidding business, suppliers increasingly need to extract relevant information quickly and accurately from a large volume of bidding documents. This study develops a parsing technique adapted to the characteristics of State Grid bidding documents. It aims to structure and visualize the data so that suppliers can identify bidding opportunities in time and support their decision making. Effective input data are obtained by applying document structure analysis, table detection, and text error correction to the bidding documents. Five different parsing models are applied to the data, and the performance of each is evaluated on labeled data. Using State Grid bidding document samples, and after model customization and tuning, a parsing model is built that integrates bidirectional long short-term memory (Bi-LSTM) with conditional random fields (CRF). The model is trained and comparatively tested on 823 real bidding documents, and the results show that the Bi-LSTM fusion model outperforms the BERT+Bi-LSTM model on the performance metrics. In addition, the CRF layer automatically learns constraints that help ensure the accuracy of the predictions, which significantly improves parsing performance.
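To illustrate the abstract's point about the CRF layer, here is a minimal sketch of how transition scores can steer decoding toward a valid tag sequence. Everything in it is assumed for illustration: the tag set `TAGS`, the per-token `emissions` (standing in for Bi-LSTM outputs), and the hand-set `transitions` matrix (which a real CRF layer would learn from labeled data). None of these values come from the paper.

```python
# Illustrative sketch only: tags, scores, and weights are hypothetical,
# not the paper's model or data.
TAGS = ["O", "B-ITEM", "I-ITEM"]  # hypothetical BIO tag set


def greedy(emissions):
    """Pick the best tag at each position independently (no CRF)."""
    return [max(range(len(e)), key=lambda j: e[j]) for e in emissions]


def viterbi(emissions, transitions):
    """Return the highest-scoring tag path under emission + transition scores.

    emissions[t][j]: score of tag j at position t (e.g. from a Bi-LSTM).
    transitions[i][j]: score of moving from tag i to tag j (CRF parameters).
    """
    n_tags = len(transitions)
    score = list(emissions[0])
    backptrs = []
    for emit in emissions[1:]:
        new_score, ptr = [], []
        for j in range(n_tags):
            # Best previous tag for landing on tag j at this position.
            best_i = max(range(n_tags), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + emit[j])
            ptr.append(best_i)
        score = new_score
        backptrs.append(ptr)
    # Trace back from the best final tag.
    path = [max(range(n_tags), key=lambda j: score[j])]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    path.reverse()
    return path


# Made-up scores for a three-token span.
emissions = [
    [1.0, 0.8, 0.2],
    [0.1, 0.3, 1.0],
    [1.2, 0.1, 0.4],
]
# Rows: previous tag, columns: next tag. O -> I-ITEM is heavily penalized,
# mimicking the kind of constraint a CRF layer can learn.
transitions = [
    [0.5, 0.2, -10.0],
    [0.0, -0.5, 0.8],
    [0.2, -0.5, 0.5],
]

greedy_tags = [TAGS[i] for i in greedy(emissions)]
crf_tags = [TAGS[i] for i in viterbi(emissions, transitions)]
print(greedy_tags)  # ['O', 'I-ITEM', 'O']
print(crf_tags)     # ['B-ITEM', 'I-ITEM', 'O']
```

Per-token decoding yields an `I-ITEM` directly after `O`, which is not a valid BIO sequence. With the transition penalty in place, Viterbi decoding starts the span with `B-ITEM` instead.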

Keywords: bidding of State Grid; Bi-LSTM; CRF; document structure analysis; text analysis

Xu Shiyang

Materials Branch, State Grid Chongqing Electric Power Company, Chongqing 400020, China

2024

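Power Systems and Big Data (电力大数据)
Guizhou Electric Power Test and Research Institute; Guizhou Society of Electrical Engineering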

Impact factor: 0.047
ISSN: 2096-4633
Year, volume (issue): 2024, 27(4)