2023-11-09 University of California, Berkeley (UCB)
A new statistical technique allows researchers to safely use predictions obtained from machine learning to test scientific hypotheses. This image shows an artistic interpretation of the technique, called prediction-powered inference, generated by the DALL-E AI system.
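◆This is a technique for testing scientific hypotheses using predictions obtained from machine learning models. In the context of a specific scientific question, it corrects the output of a large general-purpose model with a small amount of real data. This makes it possible to correct the model's potential errors without knowing the nature of those errors in advance.
◆The technique has been used to correct model bias in scientific studies, with successful applications such as estimating deforestation in the Amazon. It is applicable to a wide range of research and could become an important tool for the advancement of science.
<Related information>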
- https://news.berkeley.edu/2023/11/09/how-to-use-ai-for-discovery-without-leading-science-astray
- https://www.science.org/doi/full/10.1126/science.adi6000
Prediction-powered inference
Anastasios N. Angelopoulos, Stephen Bates, Clara Fannjiang, Michael I. Jordan, and Tijana Zrnic
Science, Published: 9 Nov 2023
DOI:https://doi.org/10.1126/science.adi6000
Editor’s summary
Over the past decade, there has been rapid progress in the development of large-scale machine learning (ML) systems that provide predictions related to various scientific phenomena. Unfortunately, the standard statistical approaches used to calculate confidence intervals and P values from gold standard data lose their statistical validity for ML-derived data. Angelopoulos et al. introduced “prediction-powered inference,” a standardized protocol for constructing valid confidence intervals and P values that enables the power and scale of ML systems to be used as predictors while ensuring responsible and reliable scientific inferences. The method has been demonstrated on a broad range of real datasets and offers a promising statistical approach for using ML to derive scientific conclusions responsibly. —Yury Suleymanov
Abstract
Prediction-powered inference is a framework for performing valid statistical inference when an experimental dataset is supplemented with predictions from a machine-learning system. The framework yields simple algorithms for computing provably valid confidence intervals for quantities such as means, quantiles, and linear and logistic regression coefficients without making any assumptions about the machine-learning algorithm that supplies the predictions. Furthermore, more accurate predictions translate to smaller confidence intervals. Prediction-powered inference could enable researchers to draw valid and more data-efficient conclusions using machine learning. The benefits of prediction-powered inference were demonstrated with datasets from proteomics, astronomy, genomics, remote sensing, census analysis, and ecology.
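As a rough illustration of the simplest case named in the abstract, a confidence interval for a mean, the following is a minimal sketch and not the authors' reference implementation. It assumes a small labeled sample with both true values and model predictions, plus a large unlabeled sample with predictions only. The function name `ppi_mean_ci` and its parameters are hypothetical. The estimate is the mean prediction on the unlabeled data minus the average prediction error measured on the labeled data, and the interval uses a normal approximation.

```python
from statistics import NormalDist, fmean, variance


def ppi_mean_ci(y_labeled, pred_labeled, pred_unlabeled, alpha=0.05):
    """Sketch of a prediction-powered confidence interval for a mean.

    y_labeled:      true values on the small labeled sample (size n)
    pred_labeled:   model predictions on that same labeled sample
    pred_unlabeled: model predictions on the large unlabeled sample (size N)
    All names here are illustrative assumptions.
    """
    n, N = len(y_labeled), len(pred_unlabeled)

    # Prediction errors on the labeled data; their mean estimates the model's bias.
    errors = [f - y for f, y in zip(pred_labeled, y_labeled)]
    rectifier = fmean(errors)

    # Point estimate: mean prediction on unlabeled data, corrected for the bias.
    theta = fmean(pred_unlabeled) - rectifier

    # Standard error combines the spread of the predictions (over N points)
    # with the spread of the prediction errors (over n points).
    se = (variance(pred_unlabeled) / N + variance(errors) / n) ** 0.5

    # Two-sided normal quantile for a (1 - alpha) interval.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return theta, (theta - z * se, theta + z * se)


# Toy data: the model over-predicts by exactly 1 on every labeled point.
theta, (lo, hi) = ppi_mean_ci(
    y_labeled=[1.0, 2.0, 3.0],
    pred_labeled=[2.0, 3.0, 4.0],
    pred_unlabeled=[2.0, 3.0, 4.0, 5.0],
)
print(theta, lo, hi)
```

In this sketch the interval width depends on the variance of the prediction errors, so the more accurate the predictions, the narrower the interval. This matches the abstract's statement that more accurate predictions translate to smaller confidence intervals.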