
On construction of a commonly used glossary based on multidimensional entropy examination

Original title: 基于多维度熵值考察的常用字表构建

Beyond the overt property of character frequency, commonly used characters should also show stability, wide distribution, and productivity in forming words and other characters. Earlier studies examined Chinese characters through corpus selection, which could not quantify each character's features along different dimensions, so glossaries were still built mainly from character frequency. Drawing on language big data from the Language Situation in China reports (《中国语言生活状况报告》, 2007-2021), this article examines and analyzes five dimensions of commonly used characters in detail: character frequency, stability, distribution, word-formation frequency, and character-formation frequency. It uses the entropy method to establish a comprehensive model for measuring the utility of Chinese characters and to construct a multidimensional glossary of commonly used characters (常用字表). The model measures and quantifies the utility of Chinese characters from several angles, and the resulting ranking differs considerably from that of previous glossaries. Once character frequency is no longer the only dimension considered, many commonly used characters with marked advantages in stability, distribution, or word-formation and character-formation capacity move to the top of the glossary, which makes the glossary more scientific and reasonable.
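The abstract names the entropy method but gives no formulas. As a rough illustration only, the sketch below follows the textbook entropy weight method: min-max normalize each indicator, compute each indicator's information entropy, turn the divergence (1 minus entropy) into a weight, and sum the weighted indicators into one utility score per character. The function name `entropy_weights`, the labels `char_a` to `char_d`, and every number in `toy_data` are hypothetical and are not taken from the article. The sketch also assumes that all five indicators are "larger is better"; the article's actual preprocessing may differ.

```python
import math

def entropy_weights(rows):
    """Return (weights, scores) for a list of equal-length indicator rows.

    Textbook entropy weight method; assumes every indicator is
    benefit-type (larger is better) and that there are at least two rows.
    """
    n = len(rows)
    columns = list(zip(*rows))

    # Step 1: min-max normalize each indicator column to [0, 1].
    norm_columns = []
    for col in columns:
        lo, hi = min(col), max(col)
        span = hi - lo
        norm_columns.append([(v - lo) / span if span else 0.0 for v in col])

    # Step 2: entropy of each indicator, scaled by 1/ln(n) so it lies in [0, 1].
    k = 1.0 / math.log(n)
    divergences = []
    for col in norm_columns:
        total = sum(col)
        if total == 0:
            entropy = 1.0  # a constant column carries no information
        else:
            entropy = 0.0
            for v in col:
                p = v / total
                if p > 0:  # treat 0 * ln(0) as 0
                    entropy -= k * p * math.log(p)
        divergences.append(1.0 - entropy)

    # Step 3: weights are the normalized divergences (equal weights as fallback).
    d_sum = sum(divergences)
    if d_sum == 0:
        weights = [1.0 / len(columns)] * len(columns)
    else:
        weights = [d / d_sum for d in divergences]

    # Step 4: composite score is the weighted sum of normalized indicators.
    scores = [sum(w * v for w, v in zip(weights, row))
              for row in zip(*norm_columns)]
    return weights, scores

# Hypothetical toy values in the order: character frequency, stability,
# distribution, word-formation frequency, character-formation frequency.
toy_data = {
    "char_a": [0.90, 0.95, 0.98, 0.80, 0.70],
    "char_b": [0.60, 0.90, 0.85, 0.75, 0.20],
    "char_c": [0.70, 0.40, 0.50, 0.10, 0.05],
    "char_d": [0.10, 0.30, 0.20, 0.05, 0.01],
}

weights, scores = entropy_weights(list(toy_data.values()))
ranking = [name for name, _ in
           sorted(zip(toy_data, scores), key=lambda pair: pair[1], reverse=True)]
print(weights)
print(ranking)
```

In this scheme an indicator whose values are spread unevenly across characters has lower entropy and therefore receives a larger weight, which is how dimensions other than raw frequency can change the final ranking.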

Keywords: commonly used characters; glossary of commonly used characters; utility of Chinese characters; entropy method

Zhang Yanmei (张艳梅), Li Rulong (李如龙), Lü Zhan (吕展)


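School of Foreign Languages, Wuhan Institute of Technology, Wuhan, Hubei 430205

Department of Chinese Language and Literature, Xiamen University, Xiamen, Fujian 361005

College of Chinese Language and Culture, Jinan University, Guangzhou, Guangdong 510610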

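Chinese keywords: 常用字 (commonly used characters); 常用字表 (glossary of commonly used characters); 汉字效用 (utility of Chinese characters); 熵值法 (entropy method)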

2024

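TCSOL Studies (华文教学与研究)
College of Chinese Language and Culture, Jinan University (暨南大学华文学院); Institute of Chinese Language Education, Jinan University (暨南大学华文教育研究所)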


CHSSCD
Impact factor: 0.429
ISSN: 1674-8174
Year, volume (issue): 2024, (2)