Abstract:
To clarify the relative importance of visual and chemical quality indicators of tobacco leaves across ecological regions of different aroma types, a dataset was constructed from multi-origin, multi-year tobacco leaf samples spanning a quality gradient. The dataset linked visual characteristics, chemical components, and sensory quality indicators. Machine learning models and SHAP (SHapley Additive exPlanations) value analysis were used to examine the contributions of visual quality and major chemical components to sensory quality, as well as their interaction effects. The results indicated: (1) When predicting tobacco leaf quality from data with significant spatiotemporal heterogeneity (across production regions and years), the Support Vector Regression (SVR) model achieved a relatively balanced trade-off between generalization robustness and prediction accuracy; (2) For the sensory quality prediction models of typical light-aroma production regions, test set accuracy ranged from 0.53 to 0.77; high maturity and high reducing sugar content significantly enhanced aroma quality, high oil content promoted aroma fullness, and high total sugar content improved taste comfort; (3) For the sensory quality prediction models of typical strong-aroma production regions, test set accuracy ranged from 0.51 to 0.79; high maturity significantly drove improvements in aroma quality, aroma fullness, and taste comfort, with leaf structure and oil content also serving as key driving factors; (4) The feature interaction effects for the same sensory quality indicator varied by year, and the synergistic driving mechanisms between visual characteristics and chemical components differed significantly between regions. By using SHAP values to reveal the driving mechanisms of sensory quality in production regions of different aroma types, this study provides a foundation for targeted tobacco cultivation, curing management, and industrial applications.
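As an illustration of how a Shapley-value attribution splits a prediction among features, including a shared interaction term, the following minimal sketch computes exact Shapley values for a toy model. The model, its coefficients, the feature names, and the all-zero baseline are hypothetical assumptions made for illustration; they are not the study's fitted SVR models or data, and a single reference point stands in for the background dataset that SHAP normally averages over.

```python
from itertools import combinations
from math import factorial

# Hypothetical feature names, chosen only to echo the indicators in the abstract.
FEATURES = ["maturity", "reducing_sugar", "oil_content"]


def toy_model(x):
    """Hypothetical sensory-score model with one interaction term (not a fitted model)."""
    return (2.0 * x["maturity"]
            + 1.0 * x["reducing_sugar"]
            + 0.5 * x["maturity"] * x["oil_content"])


def value(subset, sample, baseline):
    """Model output when only the features in `subset` take the sample's values."""
    mixed = {f: (sample[f] if f in subset else baseline[f]) for f in FEATURES}
    return toy_model(mixed)


def shapley_values(sample, baseline):
    """Exact Shapley values by enumerating every coalition of the other features."""
    n = len(FEATURES)
    phi = {}
    for f in FEATURES:
        others = [g for g in FEATURES if g != f]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k in the Shapley formula.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for s in combinations(others, k):
                with_f = value(set(s) | {f}, sample, baseline)
                without_f = value(set(s), sample, baseline)
                total += weight * (with_f - without_f)
        phi[f] = total
    return phi


sample = {"maturity": 1.0, "reducing_sugar": 1.0, "oil_content": 1.0}
baseline = {"maturity": 0.0, "reducing_sugar": 0.0, "oil_content": 0.0}
phi = shapley_values(sample, baseline)
for name, contribution in phi.items():
    print(f"{name}: {contribution:.2f}")
```

In this toy case the attributions sum to the difference between the sample prediction and the baseline prediction, and the maturity-by-oil-content interaction term is divided equally between the two features involved. Exhaustive enumeration is only practical for a handful of features; for real models an approximation method would be needed.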