Abstract:
To mitigate the impact of moldy tobacco leaves on the quality of cigarette products, a non-destructive method for detecting moldy tobacco leaves was established. Hyperspectral images of moldy tobacco leaves were acquired with a hyperspectral imager, and the spectral data were extracted. The spectral data were pre-processed with seven methods, including Min-Max Scaling (MMS), Standard Normal Variate (SNV), Multiplicative Scatter Correction (MSC), and Savitzky-Golay (SG) smoothing. Characteristic wavelengths were selected with the Successive Projections Algorithm (SPA) and Principal Component Analysis (PCA). Six classification models were built with machine learning algorithms, including Random Forest (RF) and Support Vector Machine (SVM), and their performance was compared. The results showed that: 1) SNV was the optimal spectral preprocessing method, and the RF model based on characteristic wavelengths selected by SPA performed best, achieving recognition accuracies of 98.82% and 98.64% on the training and test sets, respectively. This approach also reduced computation time relative to full-spectrum classification and achieved favorable classification results for tobacco of the same grade from different growing areas. 2) Hyperspectral imaging combined with the SPA-RF model could effectively and accurately identify the mold state of tobacco leaves. This technology supports tobacco leaf quality control and management.
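As an illustration of the SNV preprocessing step named above, the following is a minimal, dependency-free sketch and not the authors' implementation. It assumes each spectrum is a list of reflectance values and that the sample standard deviation (n - 1) is used. The function names `snv` and `snv_all` and the toy spectra are hypothetical.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard Normal Variate for one spectrum.

    Centres the spectrum on its own mean and scales it by its own
    standard deviation, so each spectrum is normalised independently.
    """
    mu = mean(spectrum)
    sigma = stdev(spectrum)  # assumption: sample standard deviation (n - 1)
    if sigma == 0:
        raise ValueError("SNV is undefined for a flat spectrum")
    return [(value - mu) / sigma for value in spectrum]


def snv_all(spectra):
    """Apply SNV to every spectrum (one list of values per pixel or sample)."""
    return [snv(spectrum) for spectrum in spectra]


# Toy data (hypothetical): the second spectrum is the first scaled by 2,
# mimicking a purely multiplicative scatter effect.
raw = [
    [0.10, 0.20, 0.30, 0.40],
    [0.20, 0.40, 0.60, 0.80],
]
corrected = snv_all(raw)
```

In this toy case the two corrected spectra coincide, because SNV removes per-spectrum offset and scale differences. The corrected spectra would then feed wavelength selection and classification, which are not shown here.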