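Estimating Soil Moisture and Organic Matter Content in Saline-Alkaline Farmland of the Hetao Plain Using Interpretable Machine Learning
DOI:
Authors:
Affiliations:

1. School of Ecology and Environment, Ningxia University; 2. School of Geography and Planning, Ningxia University

About the authors:

Corresponding author:

CLC number:

Funding:

National Key Research and Development Program of China (2021YFD1900602); National Natural Science Foundation of China (42467036); Ningxia Science and Technology Innovation Leading Talent Project (2022GKLRLX02)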


The Estimation of Soil Moisture and Organic Matter Content in Saline-Alkaline Farmland of the Hetao Plain Using Interpretable Machine Learning
Author:
Affiliation:

College of Ecology and Environmental Science, Ningxia University

Fund Project:

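    Abstract (translated from the Chinese):

    [Objective] Conventional methods for monitoring soil moisture content (SMC) and soil organic matter content (SOMC) in salinized farmland are inefficient. This study explores an estimation approach that combines hyperspectral data with interpretable machine learning, with the aim of providing a theoretical basis for rapid acquisition of saline-alkaline soil information and for soil quality evaluation in the Hetao Plain. [Methods] Ground-based hyperspectral reflectance and measured SMC and SOMC served as the data sources. The spectra were transformed by fractional-order differentiation (FOD) and used to construct spectral indices. Models were built with partial least squares regression (PLSR), support vector machine (SVM), and random forest (RF) algorithms, and the Shapley additive explanations (SHAP) method was introduced to quantify the relative contribution of each variable to the model predictions and thereby improve model interpretability. [Results] ① Spectral indices constructed after the 1.25-order differential transformation correlated most strongly with SMC and SOMC; the highest correlation coefficients were 0.5054 for the generalized difference index (GDI) with SMC and 0.6825 for the optimal spectral index (OSI) with SOMC. ② The RF model estimated SMC and SOMC far more accurately than PLSR and SVM. On the validation set, the coefficient of determination (R²), root mean square error (RMSE), and ratio of performance to deviation (RPD) were 0.734, 3.28, and 2.07 for the SMC-RF model and 0.870, 1.53, and 2.43 for the SOMC-RF model. ③ SHAP analysis showed that the nitrogen planar domain index (NPDI) and the ratio index (RI) contributed most to the estimation of SMC and SOMC, respectively. NPDI, OSI, and the difference index (DI) together accounted for 68.58% of the contribution to SMC modeling, and RI, GDI, and NPDI together accounted for 61.86% of the contribution to SOMC modeling. [Conclusion] Combining FOD with spectral indices offers a clear advantage in making effective use of hyperspectral data. The RF model showed high accuracy and robustness in estimating soil properties, and SHAP analysis effectively revealed the contribution of each variable to the target variable. Spectral indices such as NPDI, RI, OSI, and DI contributed substantially to the estimation of SMC and SOMC in salinized farmland. The findings offer a new approach to high-accuracy estimation of SMC and SOMC in salinized farmland and provide guidance for regional soil improvement and precision agricultural management.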
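The abstract names the FOD transform but not its numerical scheme. One widely used discretization is the Grünwald-Letnikov form, sketched below as an assumption rather than as the authors' implementation. The function names, the unit band spacing, the toy spectrum, and the two-band definitions of the difference index (DI = d_i - d_j) and ratio index (RI = d_i / d_j) are illustrative choices. The abstract does not define GDI, OSI, or NPDI, so they are omitted.

```python
def gl_weights(alpha, n):
    """First n Grunwald-Letnikov coefficients w_k = (-1)^k * C(alpha, k)."""
    weights = [1.0]
    for k in range(1, n):
        # Recurrence: w_k = w_{k-1} * (k - 1 - alpha) / k
        weights.append(weights[-1] * (k - 1 - alpha) / k)
    return weights


def fractional_derivative(reflectance, alpha, step=1.0):
    """Approximate the alpha-order derivative of a sampled spectrum.

    Each output band is a weighted sum of the current band and all
    preceding bands (the history is truncated at the first band).
    """
    n = len(reflectance)
    w = gl_weights(alpha, n)
    out = []
    for i in range(n):
        acc = sum(w[k] * reflectance[i - k] for k in range(i + 1))
        out.append(acc / step ** alpha)
    return out


def difference_index(spectrum, i, j):
    """Two-band difference index (assumed definition): d_i - d_j."""
    return spectrum[i] - spectrum[j]


def ratio_index(spectrum, i, j):
    """Two-band ratio index (assumed definition): d_i / d_j."""
    return spectrum[i] / spectrum[j]


# Hypothetical toy spectrum, not real reflectance data.
toy = [1.0, 3.0, 6.0, 10.0]
first_order = fractional_derivative(toy, 1.0)    # ordinary backward difference
fod_125 = fractional_derivative(toy, 1.25)       # the order highlighted in the abstract
```

With alpha = 1 the weights reduce to the ordinary backward difference, and with alpha = 0 the spectrum is returned unchanged, which gives a quick sanity check before applying a fractional order such as 1.25.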

    Abstract:

    [Objective] To address the inefficiency of traditional approaches to monitoring soil moisture content (SMC) and soil organic matter content (SOMC) in saline-alkaline farmland, this study investigates an estimation method that integrates hyperspectral data with interpretable machine learning. The goal is to provide a theoretical basis for rapid acquisition of saline-alkaline soil information and for soil quality assessment in the Hetao Plain. [Methods] Ground-based hyperspectral reflectance data and field-measured SMC and SOMC were used as the data sources. The spectra were processed with a fractional-order differential (FOD) transformation, and spectral indices were constructed from the transformed spectra. Models were developed using partial least squares regression (PLSR), support vector machine (SVM), and random forest (RF) algorithms. To improve interpretability, the Shapley Additive Explanations (SHAP) method was used to evaluate the relative contribution of each variable to the model predictions. [Results] ① Spectral indices derived from the 1.25-order differential transformation showed the highest correlation with SMC and SOMC. In particular, the generalized difference index (GDI) and the optimal spectral index (OSI) exhibited the strongest correlations with SMC and SOMC, with coefficients of 0.5054 and 0.6825, respectively. ② The RF model substantially outperformed PLSR and SVM in estimating both SMC and SOMC. On the validation sets, the SMC and SOMC RF models achieved R² values of 0.734 and 0.870, root mean square errors (RMSE) of 3.28 and 1.53, and ratios of performance to deviation (RPD) of 2.07 and 2.43, respectively. ③ SHAP analysis indicated that the nitrogen planar domain index (NPDI) and the ratio index (RI) were the most influential variables for the estimation of SMC and SOMC, respectively. The combined contribution of NPDI, OSI, and the difference index (DI) to SMC modeling reached 68.58%, while RI, GDI, and NPDI together contributed 61.86% to SOMC modeling. [Conclusion] Combining FOD with spectral indices makes more effective use of hyperspectral data. The RF model demonstrated high accuracy and robustness in estimating soil properties, while SHAP analysis effectively revealed the contribution of individual variables. Spectral indices such as NPDI, RI, OSI, and DI played significant roles in modeling SMC and SOMC in saline-alkaline farmland. These findings offer a new approach to high-precision estimation of SMC and SOMC and provide guidance for regional soil improvement and precision agriculture.
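The SHAP contributions reported in the abstract rest on Shapley values. As a rough illustration (not the authors' pipeline, which the abstract does not detail), the sketch below computes exact Shapley values for a toy three-feature model by enumerating feature coalitions, with absent features replaced by a baseline value, and then converts the absolute values to percentage shares. The toy model, its coefficients, the baseline, and the share normalization are all assumptions made for illustration. SHAP libraries use faster, model-specific estimators for tree ensembles such as RF.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample by enumerating all coalitions.

    Features outside a coalition are set to their baseline value.
    Cost grows as 2**n, so this is only practical for a few features.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight shared by every coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


def contribution_shares(phi):
    """Percentage share of each feature in the total absolute contribution."""
    total = sum(abs(p) for p in phi)
    return [100.0 * abs(p) / total for p in phi]


def toy_model(v):
    """Hypothetical linear model standing in for a fitted regressor."""
    return 2.0 * v[0] + 3.0 * v[1] - 1.0 * v[2]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
shares = contribution_shares(phi)
```

For a linear model each Shapley value equals the coefficient times the feature's offset from the baseline, and the values sum to the difference between the prediction at `x` and at `baseline`, which makes the toy case easy to check by hand.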

History
  • Received: 2025-01-08
  • Last revised: 2025-05-18
  • Accepted: 2025-05-19
  • Published online:
  • Publication date: