Research on a Fast Chinese Short Text Retrieval Scheme Based on HBase
With the rapid development of the information age, the amount of information that every industry must process in daily operation has multiplied. A fully distributed environment is better suited to computing on and storing such massive amounts of data. In terms of retrieval, however, the efficiency of retrieval tasks on Chinese short text data remains insufficient. To address this, this article designs a fast Chinese short text retrieval scheme based on HBase. First, the topic probability distribution of the texts is trained with BTM. Second, traditional KNN text classification is combined with latent semantic analysis to classify short texts by latent topic. Finally, the topic classification results are combined with the Top Hits on ES to construct a secondary index for the corresponding table, which avoids costly full table scans of the original text data and thereby achieves fast retrieval. Experimental comparison shows that this scheme is more efficient than the traditional HBase scheme for retrieving Chinese short text data.
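The idea of classifying by topic and then retrieving through a secondary index can be illustrated with a minimal in-memory sketch. It is not the scheme's implementation: plain dictionaries stand in for the HBase data table and index table, the topic vectors are hand-written placeholders rather than distributions trained by BTM, the ES Top Hits step is omitted, and all function names (`knn_topic`, `build_secondary_index`, `retrieve`) are hypothetical.

```python
import math
from collections import Counter, defaultdict


def cosine(a, b):
    """Cosine similarity between two equal-length topic vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def knn_topic(query_vec, labeled, k=3):
    """Majority vote over the k labeled vectors most similar to query_vec."""
    ranked = sorted(labeled, key=lambda item: cosine(query_vec, item[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def build_secondary_index(table, topic_of):
    """Map each topic label to the row keys of the texts assigned to it."""
    index = defaultdict(list)
    for row_key in table:
        index[topic_of[row_key]].append(row_key)
    return index


def retrieve(table, index, topic, keyword):
    """Look up only the rows listed under one topic instead of scanning every row."""
    return [rk for rk in index.get(topic, []) if keyword in table[rk]]


# Placeholder data: row key -> short text (stand-in for the HBase data table).
table = {
    "r1": "今天股市大涨",
    "r2": "篮球比赛结果公布",
    "r3": "股市收盘小幅下跌",
}
# Assumed topic assignment for each stored row.
topic_of = {"r1": "finance", "r2": "sports", "r3": "finance"}

# Hand-written topic vectors with labels (stand-in for trained distributions).
labeled = [
    ([0.9, 0.1], "finance"),
    ([0.8, 0.2], "finance"),
    ([0.1, 0.9], "sports"),
    ([0.2, 0.8], "sports"),
]

index = build_secondary_index(table, topic_of)
query_topic = knn_topic([0.7, 0.3], labeled, k=3)
hits = retrieve(table, index, query_topic, "股市")
```

In this sketch the query is first assigned a topic, and only the row keys stored under that topic are fetched, so the number of rows touched depends on the topic's size rather than on the size of the whole table.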