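Identification of tobacco leaves from small production regions based on near-infrared spectral dimension transformation and convolutional neural network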

居雷; 高扬; 张鑫; 葛炯; 岳宝华; 束茹欣

doi:10.16135/j.issn1002-0861.2023.0794



Abstract

This study aims to improve the accuracy of identifying tobacco leaves from small production regions and to address the poor class prediction of near-infrared (NIR) spectroscopy when the sample set is large, the samples are highly similar, and the number of classes is high. A total of 4 625 tobacco leaf samples were collected from eight small production regions in Yunnan Province, and the one-dimensional NIR spectral data were reconstructed as two-dimensional image data. A convolutional neural network (CNN) was used to build a classification model for tobacco leaves from these regions, and its performance was compared with that of other machine-learning algorithms. The results showed that: 1) Conventional machine-learning algorithms such as principal component analysis (PCA) and support vector machine (SVM) performed only moderately when classifying tobacco leaves from multiple adjacent regions; the overall accuracies of the SVM algorithm on the training and test sets were 78.86% and 69.08%, respectively. 2) The CNN reached accuracies of 97.41% and 92.54% on the training and test sets, respectively, which were 18.55 and 23.46 percentage points higher than those of the SVM algorithm. Transforming the dimension of the NIR spectral data and combining it with the CNN algorithm extracts more sample feature information and can be applied effectively to the classification and identification of tobacco leaves from small production regions.
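The abstract does not state how the one-dimensional spectra were reconstructed as two-dimensional images. As a minimal sketch of one possible approach, the Python example below assumes a simple scheme: min-max scale the spectrum, zero-pad it to the next perfect square, and fold it row by row into a square array. The function name `spectrum_to_image` and the padding and scaling choices are illustrative assumptions, not the authors' procedure.

```python
import math

import numpy as np


def spectrum_to_image(spectrum):
    """Fold a 1-D spectrum into a square 2-D array (hypothetical helper).

    Assumed scheme, not taken from the paper:
    1. min-max scale the absorbance values to [0, 1];
    2. zero-pad the tail up to the next perfect square;
    3. reshape row by row (row-major order).
    """
    x = np.asarray(spectrum, dtype=np.float64).ravel()
    if x.size == 0:
        raise ValueError("spectrum must contain at least one point")

    # Smallest side length whose square holds every spectral point.
    side = math.isqrt(x.size - 1) + 1

    # Min-max scaling; a flat spectrum maps to all zeros.
    span = x.max() - x.min()
    scaled = (x - x.min()) / span if span > 0 else np.zeros_like(x)

    # Zero-pad, then fold into a side x side image.
    padded = np.zeros(side * side, dtype=np.float64)
    padded[: x.size] = scaled
    return padded.reshape(side, side)


# Toy example: a 10-point "spectrum" becomes a 4 x 4 image.
image = spectrum_to_image(np.arange(10))
print(image.shape)
```

An array produced this way could be fed to a two-dimensional convolutional layer as a single-channel image. Other foldings of the spectrum are equally plausible, and the choice affects which wavenumbers end up adjacent in the image.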
