AI more creative than most – but not all – humans: Study

2026-02-13 University of Toronto (U of T)

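A University of Toronto study found that generative AI tends to produce more creative ideas than many humans, but can fall short of the most creative humans. In the experiment, human participants and AI systems were given idea-generation tasks, and their outputs were compared on originality and diversity. The AI scored highly on average creativity and consistently produced high-quality ideas, while some humans surpassed it in standout originality. The study suggests that AI could become a powerful tool for complementing creative work, and at the same time highlights the importance of distinctly human creative strengths.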

Chart comparing mean Divergent Association Task performance of humans and various large language models (Bellemare-Pepin et al., Divergent creativity in humans and large language models)

<Related information>

Divergent creativity in humans and large language models

Antoine Bellemare-Pepin, François Lespinasse, Philipp Thölke, Yann Harel, Kory Mathewson, Jay A. Olson, Yoshua Bengio & Karim Jerbi
Scientific Reports, Published: 21 January 2026
DOI: https://doi.org/10.1038/s41598-025-25157-3

Abstract

The recent surge of Large Language Models (LLMs) has led to claims that they are approaching a level of creativity akin to human capabilities. This idea has sparked a blend of excitement and apprehension. However, a critical piece that has been missing in this discourse is a systematic evaluation of LLMs’ semantic diversity, particularly in comparison to human divergent thinking. To bridge this gap, we leverage recent advances in computational creativity to analyze semantic divergence in both state-of-the-art LLMs and a substantial dataset of 100,000 humans. These divergence-based measures index associative thinking—the ability to access and combine remote concepts in semantic space—an established facet of creative cognition. We benchmark performance on the Divergent Association Task (DAT) and across multiple creative-writing tasks (haiku, story synopses, and flash fiction), using identical, objective scoring. We found evidence that LLMs can surpass average human performance on the DAT, and approach human creative writing abilities, yet they remain below the mean creativity scores observed among the more creative segment of human participants. Notably, even the top performing LLMs are still largely surpassed by the aggregated top half of human participants, underscoring a ceiling that current LLMs still fail to surpass. We also systematically varied linguistic strategy prompts and temperature, observing reliable gains in semantic divergence for several models. Our human-machine benchmarking framework addresses the polemic surrounding the imminent replacement of human creative labor by AI, disentangling the quality of the respective creative linguistic outputs using established objective measures. While prompting deeper exploration of the distinctive elements of human inventive thought compared to those of AI systems, we lay out a series of techniques to improve their outputs with respect to semantic diversity, such as prompt design and hyper-parameter tuning.
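
To make the idea of "semantic divergence" concrete, here is a minimal sketch of a divergence-style score. The abstract does not spell out the scoring formula, so this assumes a common formulation: the mean pairwise cosine distance between word embeddings, scaled by 100. The function names and the three-dimensional toy vectors are hypothetical, made up for illustration; a real evaluation would use pretrained word embeddings and the paper's own pipeline.

```python
import math
from itertools import combinations


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def divergence_score(words, embeddings):
    """Mean pairwise cosine distance between the words' vectors, times 100.

    Assumption: this mirrors a DAT-style score; the scaling factor and
    the handling of unknown words (skipped here) are illustrative choices.
    """
    vectors = [embeddings[w] for w in words if w in embeddings]
    if len(vectors) < 2:
        raise ValueError("need at least two words with known embeddings")
    pairs = list(combinations(vectors, 2))
    return 100.0 * sum(cosine_distance(u, v) for u, v in pairs) / len(pairs)


# Hypothetical toy embeddings, not taken from any real model.
TOY_EMBEDDINGS = {
    "cat": (1.0, 0.0, 0.0),
    "dog": (0.9, 0.1, 0.0),     # deliberately close to "cat"
    "galaxy": (0.0, 1.0, 0.0),  # orthogonal to "cat"
    "spoon": (0.0, 0.0, 1.0),   # orthogonal to both
}

related = divergence_score(["cat", "dog"], TOY_EMBEDDINGS)
unrelated = divergence_score(["cat", "galaxy", "spoon"], TOY_EMBEDDINGS)
print(f"related words:   {related:.2f}")
print(f"unrelated words: {unrelated:.2f}")
```

Under these assumptions, a list of semantically distant words scores higher than a list of near-synonyms, which is the property the divergence measures described in the abstract rely on.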
