Research on Event Monitoring Term Reduction Methods in Event Information Collection

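With the rapid growth of the Internet, obtaining information about events of interest from microblogs, post bars, forums, news websites, and other media has become a basic function of Internet information processing systems. Yet given the volume of media resources in the big data era, how to obtain that information comprehensively and quickly remains a question worth studying in depth. Targeting the low efficiency of event information collection, this paper reveals the constraint effect between events, which guides the choice of elements that make up event monitoring terms and simplest event monitoring terms. It then analyzes the intersection relations between simplest event monitoring terms and proposes a reduction method for event monitoring terms that cuts the number of terms used for search-based collection. Using event information collection on a municipal regional SaaS platform and a fire control industry SaaS platform as test cases, experiments against mainstream built-in search engines evaluate two aspects: the selection rate of event monitoring terms and the efficiency of event information collection. The results show that the proposed reduction method reduces the number of collection runs and improves the performance of event information collection.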
Method of Reducing Event Monitoring Terms for Event Crawling
With the rapid development of the Internet, crawling information about events of interest from media such as microblogs, post bars, forums, and news websites has become essential to Internet information processing systems. Faced with the media resources of the big data era, how to obtain this event information comprehensively and quickly is worth further study. We reveal the event constraint effect, which provides a guideline for the structure of event monitoring terms and simplest event monitoring terms, and we analyze the overlapping relations between simplest event monitoring terms. We propose a method of reducing event monitoring terms, which decreases the number of monitoring terms used for search-based crawling. Taking a municipal regional SaaS platform and a fire control industry SaaS platform as experimental subjects, we conduct experiments with mainstream built-in search engines to evaluate the selection ratio of event monitoring terms and the event crawling efficiency. The experimental results show that the proposed reduction method reduces the number of crawling operations and improves the performance of event crawling.

event crawling; built-in search engines; event constraint effect; event monitoring term reduction

Zhong Zhaoman (仲兆满), Li Heng (李恒), Guan Yan (管燕), Li Hui (李慧)

School of Computer Engineering, Jiangsu Ocean University, Lianyungang, Jiangsu 222005

Jiangsu Institute of Marine Resources Development (Lianyungang), Lianyungang, Jiangsu 222005

event information collection; built-in search engine; event constraint effect; event monitoring term reduction

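National Natural Science Foundation of China (72174079); Natural Science Research Project of Jiangsu Higher Education Institutions (19KJB520004); Research and Practice Innovation Program of Jiangsu Higher Education Institutions (KYCX20_2931)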

2024

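Journal of Chinese Information Processing
Chinese Information Processing Society of China; Institute of Software, Chinese Academy of Sciences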

CSTPCD; CHSSCD; Peking University Core Journals
Impact factor: 0.8
ISSN: 1003-0077
Year, volume (issue): 2024, 38(7)