2025-07-01 University of Washington (UW)
University of Washington researchers developed the game AI Puzzlers to show kids an area where AI systems still typically and blatantly fail: solving certain reasoning puzzles. In the game, users get a chance to solve puzzles by completing patterns of colored blocks. They can then ask various AI chatbots to solve the same puzzles and explain their solutions, which the systems nearly always fail to do. Here, two children in the UW KidsTeam group test the game. Credit: University of Washington
<Related information>
- https://www.washington.edu/news/2025/07/01/this-puzzle-game-shows-kids-how-theyre-smarter-than-ai/
- https://dl.acm.org/doi/10.1145/3713043.3728836
“AI just keeps guessing”: Using ARC Puzzles to Help Children Identify Reasoning Errors in Generative AI
Aayushi Dangol, Runhua Zhao, Robert Wolfe, Trushaa Ramanan, Julie A. Kientz, Jason Yip
IDC ’25: Proceedings of the 24th Interaction Design and Children. Published: 23 June 2025
DOI: https://doi.org/10.1145/3713043.3728836
Abstract
The integration of generative Artificial Intelligence (genAI) into everyday life raises questions about the competencies required to critically engage with these technologies. Unlike visual errors in genAI, textual mistakes are often harder to detect and require specific domain knowledge. Furthermore, AI’s authoritative tone and structured responses can create an illusion of correctness, leading to overtrust, especially among children. To address this, we developed AI Puzzlers, an interactive system based on the Abstraction and Reasoning Corpus (ARC), to help children identify and analyze errors in genAI. Drawing on Mayer & Moreno’s Cognitive Theory of Multimedia Learning, AI Puzzlers uses visual and verbal elements to reduce cognitive overload and support error detection. Based on two participatory design sessions with 21 children (ages 6 – 11), our findings provide both design insights and an empirical understanding of how children identify errors in genAI reasoning, develop strategies for navigating these errors, and evaluate AI outputs.