CATI: A medical context-enhanced framework for diagnosis code assignment in the UK Biobank study

Diagnosis codes are standardized codes for diseases and medical conditions. This study aims to assign diagnosis codes to patients in large-scale biobanks, in particular to address the problem of missing codes for some patients, which is crucial for downstream disease-related tasks. While recent methods primarily rely on structured biobank data for code assignment, they often overlook the valuable medical context provided by textual information in the biobanks and by the hierarchical structure of the disease coding system. To address this gap, we have developed CATI, a medical context-enhanced framework for diagnosis Code Assignment by integrating Textual details derived from key features and disease hIerarchy. The study is based on UK Biobank data and uses Phecodes and ICD-10 codes as the standard disease coding formats. We start by representing ten informative codified features using their formal names and then integrate them into CATI as text embeddings, obtained through prompt tuning on the pre-trained language model BioBERT. Recognizing the hierarchical structure of diagnosis codes, we have developed a novel convolution layer that propagates logits between adjacent diagnosis codes. Evaluation results demonstrate that CATI outperforms existing state-of-the-art methods on both Phecodes and ICD-10 codes, with at least a 5.16% improvement in average AUROC for unseen disease codes and an 8.68% rise in average AUPRC for disease codes whose number of training instances falls in (1000, 10000]. This framework contributes to the formation of well-defined cohorts for downstream studies and offers a unique perspective for addressing complex healthcare tasks by incorporating vital medical context.
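The abstract does not specify how the convolution layer propagates logits, so the following is only a minimal sketch of the general idea, not the authors' layer. It assumes a simple scheme in which each code's logit is mixed with the mean logit of its parent and child codes. The function name `propagate_logits`, the `weight` parameter, and the toy logit values are all hypothetical; the ICD-10 codes are used purely as illustrative labels.

```python
def propagate_logits(logits, edges, weight=0.5):
    """Mix each code's logit with the mean logit of its hierarchy neighbours.

    logits: dict mapping a diagnosis code to its raw logit.
    edges:  list of (parent, child) pairs describing the code hierarchy.
    weight: share of the output taken from the neighbours (assumed value).
    """
    # Build an undirected neighbour list so information flows both
    # from parent to child and from child to parent.
    neighbours = {code: [] for code in logits}
    for parent, child in edges:
        neighbours[parent].append(child)
        neighbours[child].append(parent)

    mixed = {}
    for code, value in logits.items():
        adjacent = neighbours[code]
        if adjacent:
            neighbour_mean = sum(logits[n] for n in adjacent) / len(adjacent)
            mixed[code] = (1 - weight) * value + weight * neighbour_mean
        else:
            # Codes with no neighbours in the hierarchy keep their logit.
            mixed[code] = value
    return mixed


# Toy example: E11.9 is a child of E11; I10 has no neighbour here.
logits = {"E11": 2.0, "E11.9": 0.0, "I10": -1.0}
edges = [("E11", "E11.9")]
result = propagate_logits(logits, edges, weight=0.5)
print(result)
```

In this toy case the child code E11.9 has a low logit of its own but borrows evidence from its parent E11, which is the kind of effect that could help codes with few or no training instances. A learned layer would presumably replace the fixed `weight` with trainable parameters.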

Diagnosis code assignment; Disease hierarchy; Medical context; UK Biobank; ELECTRONIC HEALTH RECORD; CLASSIFICATION

Shen, Yue; Wang, Jie; Shi, Zhihao; Chen, Nzhu; Wang, Zheng; Jiang, Yukang; Wang, Xiaopu; Cheng, Chuandong; Wang, Xueqin; Zhu, Hongtu; Ye, Jieping

Univ Sci & Technol China

Alibaba Cloud Comp

Univ North Carolina Chapel Hill

University of Science and Technology of China School of Management

2025

Artificial intelligence in medicine

SCI
ISSN:0933-3657
Year, volume (issue): 2025, 166 (Aug.)
  • 37