Data Extraction Tool May Lead to Discovery of New Polymers

2023-07-13

2023-07-12 Georgia Institute of Technology

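◆A new materials science data extraction pipeline has been introduced. By making researchers' work easier and faster, it helps them keep pace with the rapid growth of materials science research.
◆The pipeline extracts material property data from papers and feeds it to an application called Polymer Scholar. Because the application lets users search for polymers and material properties by keyword, it could streamline materials research and lead to the discovery of new polymers. The project was developed by researchers at the Georgia Institute of Technology and uses a model trained specifically to extract data from the materials science literature.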

＜Related Information＞

A general-purpose material property data extraction pipeline from large polymer corpora using natural language processing

Pranav Shetty, Arunkumar Chitteth Rajan, Chris Kuenneth, Sonakshi Gupta, Lakshmi Prerana Panchumarti, Lauren Holm, Chao Zhang & Rampi Ramprasad
npj Computational Materials, Published: 05 April 2023
DOI: https://doi.org/10.1038/s41524-023-01003-w

Abstract

The ever-increasing number of materials science articles makes it hard to infer chemistry-structure-property relations from literature. We used natural language processing methods to automatically extract material property data from the abstracts of polymer literature. As a component of our pipeline, we trained MaterialsBERT, a language model, using 2.4 million materials science abstracts, which outperforms other baseline models in three out of five named entity recognition datasets. Using this pipeline, we obtained ~300,000 material property records from ~130,000 abstracts in 60 hours. The extracted data was analyzed for a diverse range of applications such as fuel cells, supercapacitors, and polymer solar cells to recover non-trivial insights. The data extracted through our pipeline is made available at polymerscholar.org which can be used to locate material property data recorded in abstracts. This work demonstrates the feasibility of an automatic pipeline that starts from published literature and ends with extracted material property information.
