Abstract:
To reduce the influence of subjective factors in traditional tobacco grading, the effects of three preprocessing methods, MSC (Multiplicative Scatter Correction), SNV (Standard Normal Variate transformation) and SG (Savitzky-Golay convolution smoothing), and four classification models, RF (Random Forest), ELM (Extreme Learning Machine), GBDT (Gradient Boosting Decision Tree) and SVM (Support Vector Machine), on tobacco classification accuracy were analyzed using hyperspectral datasets A and B of tobacco leaves of different grades. The hyperspectral data were then preprocessed by MSC, characteristic bands were selected with the F-Score algorithm, and the variation of classification accuracy with the number of characteristic bands was investigated. The results showed that: 1) MSC preprocessing on all bands combined with the ELM or SVM model gave better tobacco recognition performance, with classification accuracies of 84% and 80% on dataset A and 96% and 95% on dataset B, respectively. 2) When the characteristic bands selected by the F-Score algorithm accounted for 70% of all bands, the classification accuracies of the four models with MSC preprocessing were close to those obtained with all bands, and the classification accuracy of SVM on dataset B reached 96%. These results provide support for improving the recognition accuracy and the level of intelligence of tobacco grading.
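The MSC preprocessing and F-Score band selection steps summarized above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`msc`, `f_score`, `select_top_bands`), the use of the mean spectrum as the MSC reference, and the multi-class F-Score formulation (between-class scatter of the band means divided by the summed within-class variances) are all assumptions made for illustration, and the data are synthetic.

```python
import numpy as np


def msc(spectra, reference=None):
    """Multiplicative scatter correction (sketch).

    Each spectrum is regressed on a reference spectrum (assumed here to be
    the mean spectrum) and the fitted offset and slope are removed.
    """
    spectra = np.asarray(spectra, dtype=float)
    ref = spectra.mean(axis=0) if reference is None else np.asarray(reference, dtype=float)
    corrected = np.empty_like(spectra)
    for i, row in enumerate(spectra):
        # Least-squares fit: row ~ intercept + slope * ref
        slope, intercept = np.polyfit(ref, row, 1)
        corrected[i] = (row - intercept) / slope
    return corrected


def f_score(X, y):
    """Per-band F-Score (assumed multi-class form).

    Between-class scatter of the class means divided by the sum of the
    within-class sample variances. Each class needs at least two samples.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for label in np.unique(y):
        Xc = X[y == label]
        between += (Xc.mean(axis=0) - overall_mean) ** 2
        within += Xc.var(axis=0, ddof=1)
    return between / (within + 1e-12)  # small constant avoids division by zero


def select_top_bands(X, y, fraction=0.7):
    """Return the sorted indices of the top `fraction` of bands by F-Score."""
    scores = f_score(X, y)
    k = max(1, int(round(fraction * X.shape[1])))
    return np.sort(np.argsort(scores)[::-1][:k])


# Synthetic example: 40 "spectra" with 50 bands and two grades.
rng = np.random.default_rng(0)
n_samples, n_bands = 40, 50
labels = np.repeat([0, 1], n_samples // 2)
base = 2.0 + np.sin(np.linspace(0.0, 3.0, n_bands))
X_raw = base + 0.05 * rng.standard_normal((n_samples, n_bands))
X_raw[labels == 1, :10] += 0.3                    # grade-dependent bands
X_raw = (X_raw * rng.uniform(0.8, 1.2, (n_samples, 1))
         + rng.uniform(-0.2, 0.2, (n_samples, 1)))  # scatter effects

X_msc = msc(X_raw)
bands = select_top_bands(X_msc, labels, fraction=0.7)
X_selected = X_msc[:, bands]  # input for a classifier such as SVM or ELM
```

The reduced matrix `X_selected` would then be passed to whichever classifier is under comparison; the 70% fraction mirrors the band proportion reported in the abstract, while every other parameter here is illustrative.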