Automatic Scientific Topic Ranking based on Pre-trained Neural Embedding
To accurately trace how topics in scientific research develop and change, implicit semantic features are often used to extract scientific topics. However, owing to the limitations of topic mining techniques, not all extracted topics are equally significant or meaningful. Some topics contain background terms or lack coherence among their terms, and therefore have little practical significance. Building on existing research, this paper proposes a new multi-dimensional topic quality evaluation algorithm based on word embeddings, uses the statistical features of the corpus to optimize the insignificant-topic distance scoring method according to the characteristics of scientific documents, and integrates the two into a unified topic ranking framework. Experimental results show that our method improves the overall effectiveness of topic ranking compared with existing methods, and can distinguish insignificant and poor-quality topics from legitimate ones.