Prediction of soil organic matter and total nitrogen contents in tobacco-growing areas based on machine learning

Abstract: To reveal the spatial distribution of soil organic matter (SOM) and soil total nitrogen (STN) contents in tobacco-growing soils, Duping Township of Wushan County, Chongqing, was taken as the study area. Soil parent material and topographic factors were used as predictors, and three machine learning methods, namely Random Forest (RF), Gradient Boosting Decision Tree (GBDT) and Extreme Gradient Boosting (XGBoost), were used to build and evaluate models. The best-performing model was then used for digital soil mapping, and the importance of the environmental variables was analyzed. The results showed that: 1) Soils developed from limestone of the Permian Liangshan Formation had significantly higher SOM and STN contents than soils developed from limestone of the Triassic Daye Formation. 2) The GBDT model had the best prediction accuracy: for SOM and STN contents, respectively, the coefficients of determination (R²) were 0.6167 and 0.7468, the mean absolute errors (MAE) were 4.81 g/kg and 0.25 g/kg, and the root mean square errors (RMSE) were 5.94 g/kg and 0.34 g/kg. 3) The main environmental factors affecting SOM content ranked in the order parent material > altitude > topographic wetness index > valley depth, and those affecting STN content ranked in the order parent material > slope height > altitude.
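The three accuracy metrics reported in the abstract (R², MAE and RMSE) follow their standard definitions, and a minimal sketch of how they can be computed is given here. The function names and the sample values are hypothetical and chosen only for illustration; they are not the study's data, and the abstract does not state which software the authors used.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error: average of |observed - predicted|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error: square root of the mean squared residual."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted SOM contents (g/kg), for illustration only.
observed = [20.0, 25.0, 30.0, 35.0]
predicted = [22.0, 24.0, 29.0, 37.0]

print(f"R2   = {r2(observed, predicted):.4f}")
print(f"MAE  = {mae(observed, predicted):.2f} g/kg")
print(f"RMSE = {rmse(observed, predicted):.2f} g/kg")
```

MAE and RMSE carry the unit of the target variable (g/kg here), while R² is dimensionless, which is why the abstract can compare R² across SOM and STN but reports the error metrics separately for each.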