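2024-01-26 Princeton University
◆The researchers trained the tool by adapting Google Research's T5 language model to text data from the Materials Project. The method sets a new benchmark for crystal property prediction and could speed up the process of designing and testing new technologies.
<Related information>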
- https://engineering.princeton.edu/news/2024/01/26/researchers-harness-large-language-models-accelerate-materials-discovery
- https://arxiv.org/abs/2310.14029
LLM-Prop: Predicting Physical And Electronic Properties Of Crystalline Solids From Their Text Descriptions
Andre Niyongabo Rubungo, Craig Arnold, Barry P. Rand, Adji Bousso Dieng
arXiv, submitted on: 21 Oct 2023
DOI: https://doi.org/10.48550/arXiv.2310.14029
Abstract
The prediction of crystal properties plays a crucial role in the crystal design process. Current methods for predicting crystal properties focus on modeling crystal structures using graph neural networks (GNNs). Although GNNs are powerful, accurately modeling the complex interactions between atoms and molecules within a crystal remains a challenge. Surprisingly, predicting crystal properties from crystal text descriptions is understudied, despite the rich information and expressiveness that text data offer. One of the main reasons is the lack of publicly available data for this task. In this paper, we develop and make public a benchmark dataset (called TextEdge) that contains text descriptions of crystal structures with their properties. We then propose LLM-Prop, a method that leverages the general-purpose learning capabilities of large language models (LLMs) to predict the physical and electronic properties of crystals from their text descriptions. LLM-Prop outperforms the current state-of-the-art GNN-based crystal property predictor by about 4% in predicting band gap, 3% in classifying whether the band gap is direct or indirect, and 66% in predicting unit cell volume. LLM-Prop also outperforms a finetuned MatBERT, a domain-specific pre-trained BERT model, despite having 3 times fewer parameters. Our empirical results may highlight the current inability of GNNs to capture information pertaining to space group symmetry and Wyckoff sites for accurate crystal property prediction.
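The abstract reports its gains as percentages (about 4%, 3%, and 66%) but does not say how they are computed. One common reading, assumed here, is a relative change in an evaluation metric against the baseline. The sketch below shows that arithmetic only; the function name `relative_improvement` and all numeric values are hypothetical placeholders and are not results from the paper.

```python
def relative_improvement(baseline: float, candidate: float,
                         lower_is_better: bool = True) -> float:
    """Fractional gain of `candidate` over `baseline`.

    Assumption: the gain is measured relative to the baseline score.
    For error metrics (e.g. MAE) lower is better; for scores such as
    accuracy or AUC set lower_is_better=False.
    """
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    if lower_is_better:
        return (baseline - candidate) / baseline
    return (candidate - baseline) / baseline


# Hypothetical error values, chosen only to illustrate the calculation.
baseline_error = 0.50   # placeholder baseline error
candidate_error = 0.48  # placeholder candidate error
error_gain = relative_improvement(baseline_error, candidate_error)

# Hypothetical classification scores (higher is better).
score_gain = relative_improvement(0.800, 0.824, lower_is_better=False)

print(f"error metric gain: {error_gain:.1%}")
print(f"score metric gain: {score_gain:.1%}")
```

Under this reading, a regression gain and a classification gain are not directly comparable, since one is a reduction in error and the other an increase in a score.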