A static hot word extraction model based on scenic spot comments
Hot word extraction is of great significance to the development of scenic spots. However, current hot word extraction methods still suffer from problems such as poor word segmentation and high model training costs. To address these problems, a static hot word extraction model called CRF+TBTT is proposed based on scenic spot comments. The model uses a new algorithmic process that filters non-keywords, analyzes high-frequency words and featured words, extracts candidate words, and finally obtains accurate static hot words. Experiments on 59,107 scenic spot comments show that the CRF+TBTT model performs significantly better than competing methods, and its accuracy in extracting the top 20 hot words of a scenic spot reaches 90%. These results suggest that the new model is effective at extracting static hot words, which can help tourism departments manage and plan scenic spots effectively.
scenic spot comments; CRF+TBTT model; TextRank algorithm; TF-IDF algorithm; static hot word