From brain scans to alloys: Teaching AI to make sense of complex research data

2026-01-12 Pennsylvania State University (Penn State)

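This article introduces work by a research team at Pennsylvania State University (Penn State) that developed an AI analysis method applicable across fields, from brain imaging to metal alloy data. Modern science generates large volumes of complex, high-dimensional data, such as medical images, material properties, and simulation results, yet such data are difficult for humans to grasp intuitively. Using advanced AI techniques such as self-supervised learning, the researchers built a model that automatically extracts and organizes the structures and features shared across different types of data. The approach was shown to efficiently capture disease-related patterns in brain scans and the relationship between composition and properties in alloy research. Its main strength is that it draws insights from the data itself without relying on field-specific expertise, which is expected to improve research reproducibility and accelerate new discoveries. The result suggests that this could become a foundational technology for turning complex scientific data into "understandable knowledge."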

This illustration shows how ZENN, a new kind of AI model, helps computers make sense of messy, real-world information. The flowing, multicolored surface represents the many possible patterns hiding inside complex data, even when that data comes in very different forms. Images, written text and location data, shown by the icons at the bottom, are all combined into one shared picture. By bringing these different sources together, ZENN can spot meaningful patterns and make better predictions than traditional models that struggle with inconsistent or imperfect data. Credit: Jennifer M. McCann. All Rights Reserved.

<Related information>

ZENN: A thermodynamics-inspired computational framework for heterogeneous data–driven modeling

Shun Wang, Shun-Li Shang, Zi-Kui Liu, and Wenrui Hao
Proceedings of the National Academy of Sciences, Published: January 2, 2026
DOI: https://doi.org/10.1073/pnas.2511227122

Significance

The increasing availability of complex, heterogeneous datasets poses significant challenges for traditional data-driven methods, which often assume data homogeneity and fail to account for internal disparities. Quantifying entropy and its evolution in such settings remains a fundamental problem in digital twins and data science. While traditional entropy-based approaches provide useful approximations, they are limited in handling multisource, dynamically evolving systems. To address these challenges, we introduce a zentropy-enhanced neural network (ZENN)—a framework that extends zentropy theory from quantum and statistical mechanics to data science by assigning intrinsic entropy to each dataset. ZENN simultaneously learns both Helmholtz energy and intrinsic entropy, enabling robust generalization, accurate high-order derivative prediction, and adaptability to heterogeneous, real-world data.
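
For orientation, the relations below are the form in which zentropy theory is commonly written in the thermodynamics literature. They are a sketch in assumed notation ($p_k$, $F_k$, $E_k$, $S_k$, $Z$, $k_B$, $T$), not necessarily the symbols or the exact formulation used in the ZENN paper. Each configuration $k$ carries its own intrinsic entropy $S_k$, and the total entropy adds a mixing term over the configuration probabilities:

```latex
\begin{align}
  F_k &= E_k - T S_k, \\
  Z   &= \sum_k \exp\!\left(-\frac{F_k}{k_B T}\right), \qquad
  p_k  = \frac{1}{Z}\exp\!\left(-\frac{F_k}{k_B T}\right), \\
  S   &= \sum_k p_k S_k \;-\; k_B \sum_k p_k \ln p_k, \\
  F   &= \sum_k p_k F_k \;+\; k_B T \sum_k p_k \ln p_k \;=\; -k_B T \ln Z .
\end{align}
```

When every $S_k$ is zero, the second line reduces to the ordinary Boltzmann distribution over the energies $E_k$, and the third line reduces to the Gibbs (Shannon) entropy.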

Abstract

Traditional entropy-based methods, such as cross-entropy loss in classification problems, have long been essential tools for representing the information uncertainty and physical disorder in data and for developing artificial intelligence algorithms. However, the rapid growth of data across various domains has introduced new challenges, particularly the integration of heterogeneous datasets with intrinsic disparities. To address this, we introduce a zentropy-enhanced neural network (ZENN), extending zentropy theory into the data science domain via intrinsic entropy, enabling more effective learning from heterogeneous data sources. ZENN simultaneously learns both energy and intrinsic entropy components, capturing the underlying structure of multisource data. To support this, we redesign the neural network architecture to better reflect the intrinsic properties and variability inherent in diverse datasets. We demonstrate the effectiveness of ZENN on classification tasks and energy landscape reconstructions, showing its superior generalization capabilities and robustness, particularly in predicting high-order derivatives. In image and text classification tasks, ZENN demonstrates superior generalization by introducing a learnable temperature variable that models latent multisource heterogeneity, allowing it to surpass state-of-the-art models on CIFAR-10/100, BBC News, and AG News. As a practical application in materials science, we employ ZENN to reconstruct the Helmholtz energy landscape of Fe3Pt using data generated from density functional theory and capture key material behaviors, including negative thermal expansion and the critical point in the temperature–pressure space. Overall, this work presents a zentropy-grounded framework for data-driven machine learning, positioning ZENN as a versatile and robust approach for scientific problems involving complex, heterogeneous datasets.
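
To make the idea of combining an energy term, an intrinsic entropy term, and a temperature more concrete, the following is a minimal Python sketch of the generic zentropy-style bookkeeping. It is not the ZENN architecture or training procedure from the paper. The function names (`config_probabilities`, `total_entropy`, `helmholtz_energy`) are hypothetical, the Boltzmann constant is assumed to be 1, and the temperature is passed in as a plain number rather than learned.

```python
import math


def _scaled_free_energies(energies, entropies, temperature):
    # -F_k / T with F_k = E_k - T * S_k (assumption: k_B = 1).
    return [-(e - temperature * s) / temperature
            for e, s in zip(energies, entropies)]


def config_probabilities(energies, entropies, temperature):
    """Probabilities p_k proportional to exp(-F_k / T)."""
    scaled = _scaled_free_energies(energies, entropies, temperature)
    shift = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(x - shift) for x in scaled]
    total = sum(weights)
    return [w / total for w in weights]


def total_entropy(probs, entropies):
    """S = sum_k p_k S_k - sum_k p_k ln p_k (intrinsic part plus mixing part)."""
    mixing = -sum(p * math.log(p) for p in probs if p > 0.0)
    intrinsic = sum(p * s for p, s in zip(probs, entropies))
    return intrinsic + mixing


def helmholtz_energy(energies, entropies, temperature):
    """F = -T ln Z, computed with a stable log-sum-exp."""
    scaled = _scaled_free_energies(energies, entropies, temperature)
    shift = max(scaled)
    log_z = shift + math.log(sum(math.exp(x - shift) for x in scaled))
    return -temperature * log_z


if __name__ == "__main__":
    # Illustrative numbers only; they do not come from the paper.
    energies = [0.0, 1.0, 2.0]
    entropies = [0.5, 1.0, 0.2]
    temperature = 0.7
    probs = config_probabilities(energies, entropies, temperature)
    print("probabilities:", probs)
    print("total entropy:", total_entropy(probs, entropies))
    print("Helmholtz energy:", helmholtz_energy(energies, entropies, temperature))
```

With all intrinsic entropies set to zero, `config_probabilities` reduces to a temperature-scaled softmax over negative energies, which is the familiar form behind cross-entropy classifiers. In a learned model, the energies, the intrinsic entropies, and the temperature would presumably be produced or tuned by the network rather than fixed by hand as they are here.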
