Automatic Recognition of Exploratory and Lookup Intents Based on Berry Picking Model

[Objective] This paper selects new classification features to improve the accuracy of automatically recognizing exploratory and lookup intents. [Methods] First, we selected 1,805 queries from the AOL query log and labelled them manually. Then, inspired by the Berry Picking model, we proposed classification features at three levels: query nature, search process, and information source. Next, we compared the performance of the proposed features across five classification models: Naive Bayes, SVM, Decision Tree, Random Forest, and Neural Network. Finally, we analyzed the classification performance of different feature sets and of each individual feature. [Results] All three types of classification features effectively distinguish exploratory from lookup intents, with the query nature features performing best. Among the five classification models, the neural network model performed best (Accuracy = 0.8172, Precision = 0.8494, Recall = 0.7747, F1 = 0.8103). [Limitations] The newly proposed features were not validated on multiple datasets. User search behavior was not fully mined to derive additional effective classification features. Moreover, because manual labelling is time-consuming and labor-intensive, the dataset finally used for exploratory/lookup intent recognition is limited in size. [Conclusions] The features proposed under the inspiration of the Berry Picking model can effectively distinguish between exploratory and lookup intents.
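The four reported metrics are related to one another: F1 is the harmonic mean of precision and recall. The following is a minimal sketch, assuming the standard binary-classification definitions of these metrics; the function names `f1_score` and `metrics_from_counts` are hypothetical and are not taken from the paper, whose evaluation code is not shown here. It recomputes F1 from the precision and recall reported for the neural network model.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def metrics_from_counts(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy, precision, recall and F1 from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1_score(precision, recall),
    }


# Values reported in the abstract for the neural network model.
reported = {"accuracy": 0.8172, "precision": 0.8494, "recall": 0.7747, "f1": 0.8103}

# Recompute F1 from the reported precision and recall.
recomputed_f1 = f1_score(reported["precision"], reported["recall"])
print(round(recomputed_f1, 4))  # 0.8103, matching the reported F1

# Illustrative counts only (not from the paper's data).
example = metrics_from_counts(tp=3, fp=1, fn=1, tn=5)
```

The recomputed value agrees with the reported F1 to four decimal places, so the precision, recall, and F1 figures are internally consistent. Accuracy cannot be checked this way because it also depends on the class balance of the labelled queries, which the abstract does not give.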

Query Intent Recognition; Exploratory Intent; Lookup Intent; Berry Picking Model

Liu Jie (刘杰), Gui Sisi (桂思思), Zhang Xiaojuan (张晓娟)

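College of Computer and Information Science, Southwest University, Chongqing 400715

College of Information Management, Nanjing Agricultural University, Nanjing 210095

School of Public Administration, Sichuan University, Chengdu 610065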

National Social Science Fund of China, Youth Project

19CTQ023

2024

Data Analysis and Knowledge Discovery
National Science Library, Chinese Academy of Sciences

Indexed in: CSTPCD, CSSCI, CHSSCD, Peking University Core Journals, EI
Impact factor: 1.452
ISSN: 2096-3467
Year, volume (issue): 2024, 8(4)
  • 47