A Textbook Corpus Approach to Constructing a Self-adaptive Subject Word List: Taking Economics-related Majors as an Example
[Purpose/Significance] Building specialized word lists for non-native learners of Chinese is important both for their subject learning and for the development of International Chinese Language Education as a discipline. [Methods/Processes] To address the current shortage of specialized Chinese word lists for foreign learners and the reliance on a single construction method, this paper first crawls novels, news, and forum comments from websites to build a reference corpus. Textbooks are then selected according to the Ministry of Education's specialized curriculum directory to build a corpus of specialized textbooks. Algorithms are used to select specialized subject words and to construct a word co-occurrence matrix, and agglomerative (cohesive) clustering groups the subject words into word clusters. On this basis, the semantic correlation of the subject words within each word cluster is calculated, the word with the highest semantic co-occurrence is selected as the central word of the cluster, and the word list is arranged by semantic correlation. Finally, taking the economics major as an example, a specialized subject word list for foreign students is constructed. [Results/Conclusions] The results show that the economics subject word list constructed in this paper extracts specialized vocabulary effectively and groups closely related subject words into the same word cluster. Learners can quickly retrieve relevant word clusters for adaptive learning. The method also provides a basis for constructing word lists for other subjects.
Subject Word List; Cohesion Clustering Algorithm; Semantic Co-occurrence; Central Word of the Word Cluster
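The clustering and arrangement steps described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the co-occurrence window, the linkage criterion, or the correlation measure, so sentence-level raw counts and average linkage are assumed here. The subject-word selection step against the reference corpus is omitted, and the toy `sentences`, the `vocab` list, and all function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(sentences, vocab):
    """Count how often each pair of subject words shares a sentence."""
    vocab = set(vocab)
    counts = Counter()
    for sent in sentences:
        present = sorted(set(sent) & vocab)
        for a, b in combinations(present, 2):
            counts[(a, b)] += 1  # keys are stored in alphabetical order
    return counts


def sim(counts, a, b):
    """Symmetric lookup of a pair's co-occurrence count (0 if never seen)."""
    return counts[(a, b) if a < b else (b, a)]


def agglomerate(vocab, counts, n_clusters):
    """Assumed average-linkage agglomerative clustering.

    Starts with one cluster per word and repeatedly merges the two
    clusters whose members co-occur most often on average.
    """
    clusters = [[w] for w in sorted(vocab)]
    while len(clusters) > n_clusters:
        best, pair = -1.0, None
        for i, j in combinations(range(len(clusters)), 2):
            links = [sim(counts, a, b) for a in clusters[i] for b in clusters[j]]
            score = sum(links) / len(links)
            if score > best:
                best, pair = score, (i, j)
        i, j = pair
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def arrange(cluster, counts):
    """Put the central word first, then order the rest by co-occurrence with it.

    The central word is taken to be the one with the highest total
    co-occurrence with the other words in its cluster (an assumption).
    Ties are broken alphabetically so the result is deterministic.
    """
    def total(w):
        return sum(sim(counts, w, other) for other in cluster if other != w)

    center = min(cluster, key=lambda w: (-total(w), w))
    rest = sorted(
        (w for w in cluster if w != center),
        key=lambda w: (-sim(counts, w, center), w),
    )
    return [center] + rest


# Hypothetical tokenized textbook sentences and pre-selected subject words.
sentences = [
    ["inflation", "price", "demand"],
    ["inflation", "price"],
    ["inflation", "demand"],
    ["tariff", "export", "trade"],
    ["trade", "export"],
    ["trade", "tariff"],
]
vocab = ["inflation", "price", "demand", "tariff", "export", "trade"]

counts = cooccurrence(sentences, vocab)
clusters = agglomerate(vocab, counts, n_clusters=2)
word_list = [arrange(cluster, counts) for cluster in clusters]
for entry in word_list:
    print(entry)
```

Each printed entry is one word cluster with its central word first. With a real textbook corpus, a library routine such as hierarchical clustering over a distance matrix derived from the co-occurrence counts would likely replace the quadratic loop used in this sketch.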