Algorithm of Thematic Words Extraction from Chinese Texts Based on Semantic
To meet the demands of the information age and to improve the accuracy of automatic thematic word extraction from Chinese texts, we propose a semantics-based algorithm model for extracting thematic words from Chinese text. The model combines domain background knowledge to construct a concept semantic network that serves as both dictionary and knowledge base, and it replaces traditional literal matching with concept matching. By interpreting the subject of a Chinese text at the concept level, it overcomes the limitations of literal matching and raises natural language processing from the keyword level to the knowledge level. It also alleviates, to a certain extent, the vocabulary difference problem, in which different words express the same concept, and gives the method a limited degree of semantic understanding of natural language. Thematic words are standardized using domain knowledge. Experimental results show that the approach achieves an accuracy of 71.03% in thematic word extraction on the test documents, an improvement of about 1.87 times over the traditional approach.
natural language processing; thematic words extraction; concept semantic network
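
The contrast between literal matching and concept matching described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the concept dictionary `CONCEPT_OF`, the function names, and the pre-segmented token list are all hypothetical, and a flat term-to-concept map stands in for the concept semantic network. It only shows why mapping different surface words to one concept before counting can surface a theme that literal counting misses.

```python
from collections import Counter

# Hypothetical toy concept dictionary: surface term -> standardized concept label.
# A real concept semantic network would also carry relations between concepts.
CONCEPT_OF = {
    "电脑": "计算机",    # "computer" (colloquial)
    "计算机": "计算机",  # "computer" (formal)
    "微机": "计算机",    # "microcomputer"
    "网络": "网络",      # "network"
    "互联网": "网络",    # "internet"
}

def literal_counts(tokens):
    """Count each known surface word separately (literal matching)."""
    return Counter(t for t in tokens if t in CONCEPT_OF)

def concept_counts(tokens):
    """Map each known word to its concept first, then count (concept matching)."""
    return Counter(CONCEPT_OF[t] for t in tokens if t in CONCEPT_OF)

def thematic_words(tokens, k=1):
    """Return the k most frequent concepts as standardized thematic words."""
    return [concept for concept, _ in concept_counts(tokens).most_common(k)]

# Assumed input: a text that has already been word-segmented.
tokens = ["电脑", "网络", "微机", "互联网", "计算机"]
print(literal_counts(tokens))
print(thematic_words(tokens, k=1))
```

With literal matching every word in this toy input occurs once, so no theme stands out; after concept mapping, the three words for "computer" are merged under a single standardized label.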