Objective To construct and evaluate a nomogram for visually predicting the prognosis of elderly patients with advanced lung adenocarcinoma after surgery, based on large-sample data from the Surveillance, Epidemiology, and End Results (SEER) database.
Methods SEER*Stat 8.4.0.1 software was used to extract data from 17 registries in the SEER database for 2000 to 2019. A total of 4 453 patients with lung adenocarcinoma aged ≥ 65 years who underwent surgery and were diagnosed with stage Ⅲ or Ⅳ disease according to the 7th edition of the American Joint Committee on Cancer (AJCC) staging criteria were enrolled. Patients were randomly divided into a training set (3 117 cases) and a validation set (1 336 cases) at a 7:3 ratio, and the epidemiological data and clinicopathological characteristics of the two sets were compared. LASSO regression was used for dimensionality reduction to select the best predictors from the candidate prognostic factors. Cox proportional hazards models were used for univariate and multivariate analyses of the selected variables, and a nomogram was constructed from the independent prognostic risk factors with the rms package in R to predict the 1-, 3-, and 5-year cancer-specific survival (CSS) rates. The validation set was validated with the bootstrap method using 1 000 equal-size resamples drawn with replacement, and the accuracy of the nomogram was assessed with the C-index, receiver operating characteristic (ROC) curves, and calibration curves. Results There were no statistically significant differences between the training set and the validation set in age, gender, race, tumor location, tumor grade, surgical method, number of lymph nodes dissected, radiotherapy method, tumor diameter, tumor metastasis, marital status, place of residence, TNM stage, or chemoradiotherapy (all P > 0.05).
In the training set, 18 variables were entered into LASSO regression for dimensionality reduction, and 11 optimal predictive variables were selected. Patients with the following factors had a worse prognosis: age ≥ 85 years (HR = 2.34, 95% CI: 1.803-3.037, P < 0.01), male sex (HR = 1.326, 95% CI: 1.228-1.432, P < 0.01), grade Ⅲ-Ⅳ (HR = 1.333, 95% CI: 0.844-2.105, P < 0.01), no lymph node dissection (HR = 2.261, 95% CI: 2.023-2.527, P < 0.01), tumor diameter ≥ 3.7 cm (HR = 1.445, 95% CI: 1.333-1.566, P < 0.01), bone metastasis (HR = 1.535, 95% CI: 1.294-1.819, P < 0.01), brain metastasis (HR = 1.308, 95% CI: 1.117-1.532, P < 0.01), lung metastasis (HR = 1.229, 95% CI: 1.056-1.431, P = 0.01), rural residence (HR = 1.215, 95% CI: 1.084-1.363, P < 0.01), TNM stage Ⅳ (HR = 1.155, 95% CI: 1.044-1.278, P = 0.01), and postoperative radiotherapy (HR = 1.148, 95% CI: 1.054-1.250, P < 0.01). Based on these variables, a nomogram was constructed to predict the 1-, 3-, and 5-year CSS rates of elderly patients with advanced lung adenocarcinoma after surgery. The bootstrap method with 1 000 resamples was used to validate the nomogram. The C-index was 0.654 (95% CI: 0.641-0.668) in the training set and 0.666 (95% CI: 0.646-0.685) in the validation set. ROC curves for the 1-, 3-, and 5-year CSS rates were plotted for the training set and the validation set; the areas under the curve (AUC) were 0.730 (95% CI: 0.708-0.754) and 0.689 (95% CI: 0.672-0.710), 0.687 (95% CI: 0.668-0.711) and 0.731 (95% CI: 0.697-0.765), and 0.712 (95% CI: 0.684-0.740) and 0.714 (95% CI: 0.683-0.745), respectively. The calibration curves showed a high consistency between the probabilities predicted by the model and the actual probabilities.
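For readers unfamiliar with the C-index and its bootstrap interval, the following is a minimal Python sketch rather than the authors' R code. It assumes Harrell's definition of concordance and a percentile bootstrap; the abstract does not state how the confidence intervals were derived. The function names and the eight-patient toy data are hypothetical and serve only as illustration.

```python
import random


def concordance_index(times, events, risks):
    """Harrell's C: share of comparable pairs in which the subject
    with the higher predicted risk has the shorter survival time."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # the earlier subject of a pair must have an observed event
        for j in range(n):
            if times[i] < times[j]:  # subject j survived longer than subject i
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    return concordant / comparable if comparable else float("nan")


def bootstrap_ci(times, events, risks, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the C-index (an assumed method)."""
    rng = random.Random(seed)
    n = len(times)
    stats = []
    for _ in range(n_boot):
        # Equal-size resample drawn with replacement
        idx = [rng.randrange(n) for _ in range(n)]
        c = concordance_index(
            [times[k] for k in idx],
            [events[k] for k in idx],
            [risks[k] for k in idx],
        )
        if c == c:  # drop resamples with no comparable pair (NaN)
            stats.append(c)
    stats.sort()
    lower = stats[int(alpha / 2 * len(stats))]
    upper = stats[int((1 - alpha / 2) * len(stats)) - 1]
    return lower, upper


# Hypothetical toy data: follow-up in months, event flag (1 = cancer death,
# 0 = censored), and a model risk score. None of this comes from SEER.
toy_times = [5, 8, 12, 20, 24, 30, 36, 48]
toy_events = [1, 1, 0, 1, 1, 0, 1, 0]
toy_risks = [2.1, 1.7, 1.9, 1.2, 1.4, 0.6, 0.9, 0.3]

print("C-index:", concordance_index(toy_times, toy_events, toy_risks))
print("95% CI:", bootstrap_ci(toy_times, toy_events, toy_risks))
```

A C-index of 0.5 corresponds to random ordering and 1.0 to perfect ordering, so the reported values of 0.654 and 0.666 sit between the two.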
Conclusions The nomogram constructed from the optimal predictive variables may be a convenient tool for predicting the survival of elderly patients with advanced lung adenocarcinoma after surgery.
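To show how a Cox-based nomogram turns risk factors into a survival estimate, the sketch below applies the proportional hazards relation S(t | x) = S0(t)^exp(lp), where lp is the sum of log hazard ratios for the factors present. It is an illustration only and makes two assumptions: that the hazard ratios quoted in the Results come from the multivariable model, and that baseline CSS values are available. The abstract does not report baseline survival, so the `BASELINE_CSS` numbers below are hypothetical placeholders, and the output must not be read as a prediction from the published model.

```python
import math

# Hazard ratios as reported in the abstract; the reference level of each
# factor has HR = 1. Other levels of multi-level factors are not reported.
HAZARD_RATIOS = {
    "age_ge_85": 2.34,
    "male": 1.326,
    "grade_III_IV": 1.333,
    "no_lymph_node_dissection": 2.261,
    "tumor_diameter_ge_3_7cm": 1.445,
    "bone_metastasis": 1.535,
    "brain_metastasis": 1.308,
    "lung_metastasis": 1.229,
    "rural_residence": 1.215,
    "stage_IV": 1.155,
    "postoperative_radiotherapy": 1.148,
}

# HYPOTHETICAL baseline CSS at 1, 3, and 5 years for a patient with every
# factor at its reference level. These values are NOT from the study.
BASELINE_CSS = {1: 0.80, 3: 0.50, 5: 0.35}


def linear_predictor(factors):
    """Sum of log hazard ratios for the risk factors that are present."""
    return sum(math.log(HAZARD_RATIOS[name]) for name in factors)


def predicted_css(factors, years):
    """Cox relation: S(t | x) = S0(t) ** exp(linear predictor)."""
    return BASELINE_CSS[years] ** math.exp(linear_predictor(factors))


# Hypothetical patient: male, stage IV, with bone metastasis
patient = ["male", "stage_IV", "bone_metastasis"]
for year in (1, 3, 5):
    print(year, "year CSS:", round(predicted_css(patient, year), 3))
```

A printed nomogram performs the same calculation graphically: each factor contributes points proportional to its log hazard ratio, and the total is read against the 1-, 3-, and 5-year survival axes.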