2025-02-25 Chinese Academy of Sciences (CAS)
<Related information>
- https://english.cas.cn/newsroom/research_news/earth/202502/t20250226_902633.shtml
- https://www.sciencedirect.com/science/article/pii/S0169136825000666
Uncover implicit associations among geochemical elements using machine learning
Shuguang Zhou, Zhizhong Cheng, Jinlin Wang, Nuo Li, Guo Jiang
Ore Geology Reviews Available online: 16 February 2025
DOI: https://doi.org/10.1016/j.oregeorev.2025.106506
Highlights
- Most of the major and trace elements in rock and stream sediments can be reliably simulated using random forest models.
- Adding feature variables can improve the simulation result of geochemical element content in the random forest model.
- The method proposed in this study can assist in detecting potential errors in geochemical data.
- The method proposed in this study provides a viable and reliable solution for imputing censored or missing values in geochemical data.
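The error-detection idea in the highlights can be illustrated with a minimal sketch. It is not the authors' procedure: the synthetic data, the injected error, the use of scikit-learn's out-of-bag predictions, and the robust z-score threshold of 5 are all assumptions chosen for illustration. The sketch predicts one element from the others and flags samples whose reported value departs sharply from the prediction.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)

# Synthetic stand-in for a geochemical table (NOT data from the paper):
# three feature elements and one target element that depends on them.
n_samples = 400
features = rng.lognormal(sigma=0.5, size=(n_samples, 3))
target = 2.0 * features[:, 0] + features[:, 1]
target += rng.normal(scale=0.1, size=n_samples)

# Inject one hypothetical data-entry error, e.g. a unit mix-up (x10).
bad_index = 7
target[bad_index] *= 10.0

# Out-of-bag predictions come only from trees that never saw the sample,
# so a corrupted value cannot "explain itself" during training.
model = RandomForestRegressor(n_estimators=300, oob_score=True, random_state=0)
model.fit(features, target)
residuals = target - model.oob_prediction_

# Robust z-score based on the median absolute deviation (MAD);
# the cutoff of 5 is an arbitrary illustrative choice.
center = np.median(residuals)
mad = np.median(np.abs(residuals - center))
robust_z = 0.6745 * (residuals - center) / mad
suspects = np.flatnonzero(np.abs(robust_z) > 5.0)

print("Samples flagged for manual review:", suspects)
```

A flagged sample is only a candidate for re-checking; a large residual can also reflect genuine geochemical anomalies, such as mineralization, rather than an analytical or transcription error.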
Abstract
The production of geochemical data serves diverse purposes, and a variety of analytical methods are utilized for analyzing geochemical element content. However, due to limitations in project funds, censored or missing values are common in geochemical data. This scarcity of data becomes more pronounced when dealing with large datasets. Regrettably, numerous data analysis techniques are unable to process datasets containing missing values, which presents a significant hurdle for researchers who depend on geochemical data. To address this issue, here we employed a random forest model to simulate the geochemical elements of rocks and stream sediments. By comparing and analyzing the effects of model parameters and feature variable selection on the simulation results of major and trace elements, the study found that with appropriate model parameters and variable selection, the simulation results for many elements are reliable, and the generalization performance of the random forest model is satisfactory. This research sheds light on the inherent correlations among various elements in nature, offers solutions to the challenges posed by missing values in geochemical data, and provides valuable technical support for disciplines such as geology, environmental science and soil science.
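The imputation workflow described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic, the 20% missing rate and the model parameters are arbitrary, and scikit-learn's `RandomForestRegressor` stands in for whatever software the study used. The idea is to train on samples where the target element was measured, check generalization on a hold-out set, and then predict the element where it is missing.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for a geochemical table (NOT data from the paper):
# three feature elements and one target element correlated with them.
n_samples = 500
features = rng.lognormal(mean=0.0, sigma=0.5, size=(n_samples, 3))
target = 2.0 * features[:, 0] + 0.5 * features[:, 1] * features[:, 2]
target += rng.normal(scale=0.1, size=n_samples)

# Pretend about 20% of the target values are missing or censored.
missing = rng.random(n_samples) < 0.2
target_observed = np.where(missing, np.nan, target)

# Train and validate only on samples with a measured target value.
known = ~missing
X_train, X_test, y_train, y_test = train_test_split(
    features[known], target_observed[known], test_size=0.25, random_state=0
)
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

# Hold-out R^2 indicates whether this element can be simulated reliably.
holdout_r2 = r2_score(y_test, model.predict(X_test))
print(f"Hold-out R^2: {holdout_r2:.2f}")

# Fill the gaps with model predictions, leaving measured values untouched.
target_imputed = target_observed.copy()
target_imputed[missing] = model.predict(features[missing])
```

Checking the hold-out score before imputing matters because, as the highlights note, most but not all elements can be simulated reliably; an element with a poor score is better left missing than filled with an unreliable estimate.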