Building a Corpus of Chinese Medical Cases for Specific Medical Conditions: Sleep Disorders as an Example
[Purpose/significance] Non-standardized and inconsistent terminology in medical cases reduces the efficiency and accuracy of text mining. Taking traditional Chinese medicine (TCM) sleep disorder cases as an example, we studied terminology standardization and proposed a method for building a corpus of TCM medical cases. The corpus provides a standardized data foundation for machine understanding of medical cases, thereby improving the efficiency of TCM knowledge mining and helping make tacit TCM knowledge explicit. [Method/process] A large number of sleep disorder medical cases were collected. National standards were consulted to control content quality with respect to word formation and phrasing, and scientifically sound, authoritative medical cases were selected as the basis of the study. Core terms were extracted from the medical cases and controlled for word form, word meaning, and inter-word relations, and the preferred term and synonymous expressions for each meaning were identified and counted. Finally, a corpus structure based on the logic of TCM diagnosis and treatment was proposed to integrate the corpus into the TCM knowledge system, and the corpus of sleep disorder medical cases was constructed. [Result/conclusion] The principles and process of terminology standardization for TCM medical cases of a specific condition are proposed, and a corresponding corpus of standardized terms for the diagnosis and treatment of sleep disorders in TCM medical cases is constructed. The corpus supports knowledge mining of TCM medical cases and contributes to the wisdom of TCM in the new era. [Innovation/limitation] Building on existing research on basic TCM terminology, the study subdivides the field further to propose a method of terminology standardization and corpus construction for specific diseases. The number of medical cases screened in this study is limited, and future research is expected to enrich the content of the corpus further.
Chinese medical cases; corpus; sleep disorders; terminology standardization; implicit knowledge
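The term-control step described in the abstract (identifying a preferred term and its synonymous expressions for each meaning, then counting them) can be illustrated with a minimal sketch. This is not the authors' implementation: the synonym table, the function names, and the English example terms are hypothetical placeholders chosen for illustration and are not taken from the study's corpus.

```python
from collections import Counter

# Hypothetical synonym table: preferred term -> synonymous expressions.
# Illustrative placeholders only; not entries from the study's corpus.
SYNONYMS = {
    "insomnia": ["sleeplessness", "inability to sleep"],
    "dream-disturbed sleep": ["profuse dreaming", "frequent dreams"],
}


def build_lookup(synonyms):
    """Map every variant (and the preferred term itself) to its preferred term."""
    lookup = {}
    for preferred, variants in synonyms.items():
        lookup[preferred.lower()] = preferred
        for variant in variants:
            lookup[variant.lower()] = preferred
    return lookup


def normalize_terms(terms, lookup):
    """Replace each term with its preferred form; unknown terms pass through."""
    return [lookup.get(t.strip().lower(), t.strip()) for t in terms]


def count_preferred(cases, lookup):
    """Count preferred-term occurrences across cases (each case is a list of terms)."""
    counts = Counter()
    for terms in cases:
        counts.update(normalize_terms(terms, lookup))
    return counts
```

In this sketch, terms that are missing from the table are left unchanged rather than dropped, so unmapped expressions remain visible for later manual review.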