2025-03-06 National Institute of Advanced Industrial Science and Technology (AIST)
<Related information>
- https://www.aist.go.jp/aist_j/press_release/pr2025/pr20250306_2/pr20250306_2.html
- https://www.nature.com/articles/s41598-025-90988-z
Classifying microfossil radiolarians on fractal pre-trained vision transformers
Kazuhide Mimura, Takuya Itaki, Hirokatsu Kataoka & Ayumu Miyakawa
Scientific Reports, Published: 06 March 2025
DOI: https://doi.org/10.1038/s41598-025-90988-z
Abstract
While deep learning techniques, especially image classification using deep learning, continue to evolve, it has been noted that there is a large time gap in applying these techniques in geological studies. Recently, a new architecture called the vision transformer (ViT), which is an alternative to convolutional neural networks (CNN), has attracted considerable attention. In addition, it has been proposed that the pre-training of classification models using mathematically generated images instead of real images, called formula-driven supervised learning (FDSL), achieves a comparative or even higher performance in visual understanding. In this study, we applied these new techniques to the classification of microfossils (radiolarians). Compared with a previous CNN model, the ViT-based model achieved 6–8% higher average precision. On average, the precision of the FDSL pre-trained models was slightly higher than that of the models pre-trained on real images. Therefore, we propose that these techniques may be suitable for image classification in geological tasks.
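The abstract compares models by their average precision. As a minimal sketch of how such a figure could be computed, the following assumes that "average precision" means the unweighted mean of per-class precision (macro averaging); the abstract does not specify the exact definition, so this is an assumption. The function names and the class labels `taxon_a`, `taxon_b`, `taxon_c` are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def per_class_precision(y_true, y_pred):
    """Return {class: precision}, where precision = TP / (TP + FP)."""
    true_pos = defaultdict(int)
    false_pos = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            true_pos[pred] += 1
        else:
            false_pos[pred] += 1

    scores = {}
    for cls in set(y_true) | set(y_pred):
        predicted = true_pos[cls] + false_pos[cls]
        # Precision is undefined for a class that was never predicted;
        # reporting 0.0 here is a convention chosen for this sketch.
        scores[cls] = true_pos[cls] / predicted if predicted else 0.0
    return scores


def macro_precision(y_true, y_pred):
    """Unweighted mean of per-class precision (assumed meaning of 'average')."""
    scores = per_class_precision(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Hypothetical labels for illustration only.
y_true = ["taxon_a", "taxon_a", "taxon_b", "taxon_b", "taxon_c"]
y_pred = ["taxon_a", "taxon_b", "taxon_b", "taxon_b", "taxon_c"]
print(per_class_precision(y_true, y_pred))
print(macro_precision(y_true, y_pred))
```

Under this definition every class counts equally regardless of how many specimens it contains, so rare taxa influence the average as much as abundant ones. A weighted average would behave differently.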