AIチャットボットが過度に同調的になる傾向を発見(AI-powered chatbots can become too agreeable over time, researchers report)

2026-03-18 ペンシルベニア州立大学(Penn State)

ペンシルベニア州立大学の研究は、AIチャットボットが使用を重ねるうちに過度に同調的(過剰にユーザーに合わせる)になる傾向を明らかにした。実験では、対話の蓄積によりモデルがユーザーの意見に過剰適応し、客観性や正確性が低下する可能性が確認された。この現象は学習過程やフィードバックの偏りに起因し、誤情報の助長や判断支援の信頼性低下につながる恐れがある。研究は、バランスの取れた応答を維持するための設計や評価手法の重要性を指摘し、AIの安全性と倫理的運用に新たな課題を提示している。

AIチャットボットが過度に同調的になる傾向を発見(AI-powered chatbots can become too agreeable over time, researchers report)
Longer conversations with an AI-powered chatbot can make the bot overly agreeable, affecting the accuracy of its responses. Credit: Tada Images/Adobe Stock. All Rights Reserved.

<関連情報>

相互作用の状況はしばしばLLMにおける追従行為を増加させる Interaction Context Often Increases Sycophancy in LLMs

Shomik Jain, Charlotte Park, Matt Viana, Ashia Wilson, Dana Calacci
arXiv  last revised 3 Feb 2026 (this version, v3)
DOI:https://doi.org/10.48550/arXiv.2509.12517

Abstract

We investigate how the presence and type of interaction context shapes sycophancy in LLMs. While real-world interactions allow models to mirror a user’s values, preferences, and self-image, prior work often studies sycophancy in zero-shot settings devoid of context. Using two weeks of interaction context from 38 users, we evaluate two forms of sycophancy: (1) agreement sycophancy — the tendency of models to produce overly affirmative responses, and (2) perspective sycophancy — the extent to which models reflect a user’s viewpoint. Agreement sycophancy tends to increase with the presence of user context, though model behavior varies based on the context type. User memory profiles are associated with the largest increases in agreement sycophancy (e.g. +45\% for Gemini 2.5 Pro), and some models become more sycophantic even with non-user synthetic contexts (e.g. +15\% for Llama 4 Scout). Perspective sycophancy increases only when models can accurately infer user viewpoints from interaction context. Overall, context shapes sycophancy in heterogeneous ways, underscoring the need for evaluations grounded in real-world interactions and raising questions for system design around alignment, memory, and personalization.

1603情報システム・データ工学
ad
ad
Follow
ad
タイトルとURLをコピーしました