Research on Deep Web Query Interface Clustering Based on Hadoop

Clustering different query interfaces effectively is one of the core issues in generating an integrated query interface in the Deep Web integration domain. With the rapid development of Internet technology, however, the number of Deep Web query interfaces is growing explosively. As a result, traditional stand-alone approaches to Deep Web query interface clustering run into bottlenecks in both time complexity and space complexity. Following a study of the Hadoop distributed platform and the MapReduce programming model, a Deep Web query interface clustering algorithm based on the Hadoop platform is designed and implemented, in which the Vector Space Model (VSM) and Latent Semantic Analysis (LSA) are employed to represent "Query Interfaces-Attributes" relationships. The experimental results show that the proposed algorithm achieves better scalability and a better speedup ratio by using the Hadoop architecture.

Keywords: Hadoop; MapReduce; Deep Web; LSA; Query Interface Clustering

Baohua Qiang, Rui Zhang, Yufeng Wang, Qian He, Wei Li, Sai Wang

Guangxi Key Lab of Trusted Software, Guilin University of Electronic Technology, Guilin 541000, China

North China University of Water Resources and Electric Power, Zhengzhou 450045, China

The 54th Research Institute of China Electronics Technology Group Corporation, Shijiazhuang 050000, China

2014

Journal of Software

ISSN: 1796-217X
Year, Volume (Issue): 2014, 9(12)