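Project Summary and Complete Python Program

Over the course of this book, we started from the fundamentals of medical AI, worked through the principles of classical machine learning algorithms and their medical applications, and covered more advanced techniques: data processing, feature engineering, model evaluation, interpretability, class-imbalance handling, and model ensembling. Chapter 16 then used an ICU early sepsis warning system to demonstrate the full workflow from problem definition to model deployment. We now consolidate this material into a single Python program that implements sepsis prediction end to end, including:

- Generating a simulated dataset whose feature distributions are modeled on MIMIC-III
- Data preprocessing and feature engineering
- Training multiple models (logistic regression, random forest, XGBoost)
- Model ensembling (stacking)
- Hyperparameter tuning and class-imbalance handling
- Model evaluation (AUC, PR AUC, classification report, confusion matrix)
- Interpretability analysis (SHAP)
- Threshold selection and decision curves
- Model persistence and a simple API example

The program runs as-is once the libraries it imports are installed (pandas, numpy, matplotlib, seaborn, scikit-learn, xgboost, imbalanced-learn, shap, joblib), and it can serve as a template for medical AI projects.

Complete Python Program

# -*- coding: utf-8 -*-
"""
Early Sepsis Warning System - Complete Implementation
Built on simulated MIMIC-III data; covers preprocessing, modeling, evaluation,
interpretation, and a deployment example.
"""
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split, StratifiedKFold, GridSearchCV
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier, StackingClassifier, VotingClassifier
from xgboost import XGBClassifier
from sklearn.metrics import (classification_report, confusion_matrix, roc_auc_score,
                             average_precision_score, precision_recall_curve, roc_curve)
from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline as ImbPipeline
import shap
import joblib
import warnings

warnings.filterwarnings('ignore')

# Set the random seed for reproducibility
RANDOM_SEED = 42
np.random.seed(RANDOM_SEED)

# ====================== 1. Synthetic data generation ======================
# To keep the program easy to run, we generate a synthetic dataset whose
# feature distributions are modeled on real MIMIC data
def generate_synthetic_data(n_samples=10000, pos_ratio=0.12):
    """Generate simulated ICU patient data with a sepsis label."""
    n_pos = int(n_samples * pos_ratio)
    n_neg = n_samples - n_pos

    # Define the feature columns
    feature_cols = [
        'age', 'gender', 'heart_rate_mean', 'sbp_mean', 'dbp_mean', 'map_mean',
        'resp_rate_mean', 'temp_mean', 'spo2_mean', 'wbc_first', 'hgb_first',
        'plt_first', 'creatinine_first', 'bun_first', 'glucose_first',
        'lactate_first', 'lactate_min', 'ph_first', 'gcs_min', 'sofa_first',
        'sofa_max', 'sofa_min', 'chf', 'diabetes', 'copd', 'liver_disease',
        'renal_disease', 'cancer', 'vent', 'vaso', 'rrt',
    ]

    # Distribution parameters (mean, standard deviation) for the majority (negative) class
    neg_params = {
        'age': (60, 15), 'gender': (0.5, 0.5), 'heart_rate_mean': (80, 15),
        'sbp_mean': (130, 20), 'dbp_mean': (70, 10), 'map_mean': (90, 15),
        'resp_rate_mean': (18, 4), 'temp_mean': (36.8, 0.5), 'spo2_mean': (97, 2),
        'wbc_first': (8, 3), 'hgb_first': (12, 2), 'plt_first': (250, 80),
        'creatinine_first': (0.9, 0.3), 'bun_first': (15, 8),
        'glucose_first': (110, 30), 'lactate_first': (1.2, 0.5),
        'lactate_min': (1.0, 0.4), 'ph_first': (7.4, 0.05), 'gcs_min': (15, 1),
        'sofa_first': (1, 1), 'sofa_max': (2, 1.5), 'sofa_min': (0.5, 0.8),
        'chf': (0.1, 0.3), 'diabetes': (0.2,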