2026-03-16 Washington State University (WSU)

<Related information>
- https://news.wsu.edu/press-release/2026/03/16/ai-gets-a-d-study-shows-inaccuracies-inconsistency-in-chatgpt-answers/
- https://rbr.business.rutgers.edu/article/unstable-intelligence-genai-struggles-accuracy-and-consistency
- https://rbr.business.rutgers.edu/sites/default/files/documents/rbr-100209.pdf
Unstable Intelligence: GenAI Struggles with Accuracy and Consistency
Mesut Cicek, Sevincgul Ulu, Can Uslay, Kate Karniouchina
Rutgers Business Review Published: 2025
Abstract
This study examines the accuracy and consistency of Generative AI (GenAI) by testing ChatGPT’s ability to estimate the accuracy of 719 business research hypotheses. For critical tasks, we find GenAI performance to be inadequate in terms of accuracy and consistency. Accuracy improved only marginally from 76.5% (GPT-3.5, 2024) to 80% (GPT-5 mini, 2025), yielding an effective chance-adjusted accuracy of only 60%. Moreover, accuracy drops significantly for insignificant hypotheses, reaching only 16.4% in 2025. Crucially, consistency across ten identical prompts was poor, with over a quarter of the cases having at least one incorrect estimation. We conclude that GenAI’s linguistic fluency is not yet backed by commensurate conceptual intelligence and frequently produces unreliable output, necessitating vigilant human oversight.
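The abstract does not spell out how the chance-adjusted figure is derived. One common correction, sketched below, rescales raw accuracy so that guessing scores 0 and a perfect result scores 1. The 50% chance rate (a two-way supported / not-supported judgment) and the function name `chance_adjusted_accuracy` are assumptions made for illustration, not details taken from the abstract. Under that assumption, the reported 80% raw accuracy does map to the reported 60%.

```python
def chance_adjusted_accuracy(raw_accuracy: float, chance_rate: float = 0.5) -> float:
    """Rescale accuracy so that chance level maps to 0 and perfect accuracy to 1.

    chance_rate=0.5 is an assumption (binary true/false judgment),
    not a value stated in the abstract.
    """
    if not 0.0 <= chance_rate < 1.0:
        raise ValueError("chance_rate must be in [0, 1)")
    return (raw_accuracy - chance_rate) / (1.0 - chance_rate)


# Raw accuracies reported in the abstract.
gpt35_2024 = chance_adjusted_accuracy(0.765)
gpt5_mini_2025 = chance_adjusted_accuracy(0.80)

print(f"GPT-3.5 (2024):     {gpt35_2024:.2f}")
print(f"GPT-5 mini (2025):  {gpt5_mini_2025:.2f}")
```

Under the same assumption, the 2024 figure of 76.5% corresponds to roughly 53% after adjustment, so the gain between the two models is about seven percentage points on the adjusted scale.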


