
Importance analysis of tobacco appearance and chemical quality indicators based on SHAP value analysis

Abstract: To clarify the relative importance of appearance and chemical quality indicators of tobacco leaves across ecological regions of different aroma types, quality-gradient leaf samples were prepared in trials spanning multiple production regions and years, and a dataset was built in which appearance traits, chemical components, and sensory quality indicators are matched one-to-one. Machine learning models were combined with SHAP (SHapley Additive exPlanations) value analysis to assess the contributions of appearance quality and major chemical components to sensory quality, together with the corresponding interaction effects. The results showed that: (1) for tobacco quality data with marked spatiotemporal heterogeneity (different production regions and years), the support vector regression (SVR) model struck a relatively good balance between generalization robustness and prediction accuracy; (2) in the typical light-aroma production region, the test-set accuracy of the sensory quality prediction models ranged from 0.53 to 0.77, with high maturity and high reducing sugar significantly enhancing aroma quality, high oil content promoting aroma fullness, and high total sugar improving taste comfort; (3) in the typical strong-aroma production region, the test-set accuracy of the sensory quality prediction models ranged from 0.51 to 0.79, with high maturity significantly driving improvements in aroma quality, aroma fullness, and taste comfort, and leaf structure and oil content also acting as important drivers; (4) the feature interaction effects for the same sensory quality indicator varied by year, and the synergistic driving mechanisms between appearance traits and chemical components differed significantly between regions. By using SHAP values to reveal the drivers of sensory quality in production regions of different aroma types, this study provides a basis for targeted cultivation and curing management of tobacco leaves and for their industrial application.
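As a rough illustration of what a SHAP-style attribution measures, the sketch below computes exact Shapley values for a single sample by enumerating feature subsets. It is not the study's pipeline: `toy_model` is a hypothetical stand-in for a fitted predictor (the paper uses SVR), the feature names, coefficients, and sample values are invented for illustration, and "absent" features are simply set to an assumed all-zero baseline.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values for one sample.

    Features outside a coalition are replaced by their baseline value.
    Cost grows as 2^n, so this is only practical for a handful of features.
    """
    names = list(x)
    n = len(names)
    phi = {}
    for name in names:
        others = [f for f in names if f != name]
        total = 0.0
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude `name`
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                without = {f: (x[f] if f in subset else baseline[f]) for f in names}
                with_f = dict(without)
                with_f[name] = x[name]
                total += weight * (model(with_f) - model(without))
        phi[name] = total
    return phi


# Hypothetical stand-in for a fitted sensory-score model (NOT the paper's SVR):
# two additive effects plus one maturity x oil interaction term.
def toy_model(f):
    return 2.0 * f["maturity"] + 1.0 * f["reducing_sugar"] + 0.5 * f["maturity"] * f["oil"]


# Assumed baseline and sample values, chosen only to keep the arithmetic simple.
baseline = {"maturity": 0.0, "reducing_sugar": 0.0, "oil": 0.0}
sample = {"maturity": 1.0, "reducing_sugar": 2.0, "oil": 2.0}

phi = shapley_values(toy_model, sample, baseline)
for feature, value in phi.items():
    print(f"{feature}: {value:.3f}")

# Additivity: the contributions sum to model(sample) - model(baseline).
print(sum(phi.values()), toy_model(sample) - toy_model(baseline))
```

In this toy case the interaction term is split equally between `maturity` and `oil`, which is why a feature with no additive effect of its own can still receive a nonzero contribution. For a real model with many features, an approximate explainer from a SHAP library would normally replace the brute-force enumeration.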

     
