Abstract:
To address problems in flue-cured tobacco leaf grading such as confusion between stalk positions, a limited set of effective features, and low recognition accuracy, a dataset of leaves from five stalk positions was established: tips (T), second upper leaves (B), cutters (C), lugs (X), and flyings (P). Leaf image features were digitized by combining OpenCV image processing techniques with expert experience, and 155 features covering leaf morphology, color, and texture were manually extracted. The XGBoost ensemble learning algorithm was selected, and the Dung Beetle Optimizer (DBO) was used to tune its hyperparameters automatically, yielding a DBO-XGBoost model for recognizing the stalk positions of flue-cured tobacco leaves. The DBO-XGBoost model was validated on a tobacco grading machine. The results showed that the model achieved a recognition accuracy of 98.88% for the stalk positions of flue-cured tobacco leaves, with a macro F1-score of 0.9873. The features extracted using the threshold segmentation method, including the proportions of orange, scorched red, green, and clustered colors, significantly improved the performance of the model. In addition, the DBO-XGBoost model improved recognition accuracy by at least 2.37 percentage points over classical models such as XGBoost, Support Vector Machine, Random Forest, Decision Tree, ResNet50, MobileNetv3-s, MobileViT-s, and YOLOv8-s, with a recognition speed of 0.3 seconds per image. The DBO-XGBoost model can accurately recognize the stalk position of flue-cured tobacco leaves and is applicable to the actual flue-cured tobacco purchasing process.
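
As an illustration of the color-proportion features obtained by threshold segmentation, the following minimal sketch computes the share of leaf pixels that fall inside a color range. It is not the authors' implementation: it uses Python's standard `colorsys` module rather than OpenCV so that it runs without dependencies, and the function name `color_proportion`, the `ORANGE_RANGE` bounds, and the toy pixel values are all hypothetical choices made for illustration.

```python
import colorsys

# Hypothetical HSV bounds for "orange", each channel scaled to [0, 1].
# These thresholds are illustrative assumptions, not values from the paper.
ORANGE_RANGE = {"h": (0.05, 0.12), "s": (0.40, 1.00), "v": (0.30, 1.00)}


def color_proportion(pixels, leaf_mask, bounds):
    """Return the fraction of leaf pixels whose HSV values lie inside bounds.

    pixels    : iterable of (R, G, B) tuples with 8-bit channel values
    leaf_mask : iterable of booleans, True where the pixel belongs to the leaf
    bounds    : dict with inclusive (low, high) ranges for "h", "s", and "v"
    """
    leaf_count = 0
    hit_count = 0
    for (r, g, b), is_leaf in zip(pixels, leaf_mask):
        if not is_leaf:
            continue  # background pixels do not contribute to the ratio
        leaf_count += 1
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        in_range = (
            bounds["h"][0] <= h <= bounds["h"][1]
            and bounds["s"][0] <= s <= bounds["s"][1]
            and bounds["v"][0] <= v <= bounds["v"][1]
        )
        if in_range:
            hit_count += 1
    # Guard against an empty mask to avoid division by zero.
    return hit_count / leaf_count if leaf_count else 0.0


# Toy data: two orange leaf pixels, one green leaf pixel, one white background pixel.
pixels = [(230, 140, 30), (225, 135, 40), (60, 160, 60), (255, 255, 255)]
leaf_mask = [True, True, True, False]
orange_share = color_proportion(pixels, leaf_mask, ORANGE_RANGE)
```

In this toy example two of the three leaf pixels fall within the assumed orange range, so `orange_share` is 2/3. In a pipeline like the one described, one such ratio per color class (orange, scorched red, green, and so on) would be one entry in the feature vector passed to the classifier.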