Researchers empower LLMs with logical reasoning


2025-05-18 Tsinghua University

[Figure: Research survey classification framework as proposed in the paper]

The Tsinghua University–University of Amsterdam Joint Research Centre, in collaboration with Peking University, Carnegie Mellon University, and Mohamed bin Zayed University of Artificial Intelligence, has published "Empowering LLMs with Logical Reasoning: A Comprehensive Survey," a comprehensive survey on strengthening the logical reasoning abilities of large language models (LLMs). The paper has been accepted to the IJCAI 2025 survey track, and a tutorial will also be given at the conference. The survey focuses on two main challenges, logical question answering and logical consistency, and systematically organizes state-of-the-art methods, evaluation benchmarks, and future research directions. Key contributors include Professor Fenrong Liu of Tsinghua University and her student Fengxiang Cheng (currently a PhD candidate at the University of Amsterdam). The work is expected to contribute to addressing the hallucination problem in LLMs and to building more reliable AI systems.

<Related information>

Empowering LLMs with Logical Reasoning: A Comprehensive Survey

Fengxiang Cheng, Haoxuan Li, Fenrong Liu, Robert van Rooij, Kun Zhang, Zhouchen Lin
arXiv, last revised 24 Feb 2025 (this version, v2)
DOI:https://doi.org/10.48550/arXiv.2502.15652

Abstract

Large language models (LLMs) have achieved remarkable successes on various natural language tasks. However, recent studies have found that there are still significant challenges to the logical reasoning abilities of LLMs. This paper summarizes and categorizes the main challenges into two aspects: (1) Logical question answering: LLMs often fail to generate the correct answer on complex logical problems that require sophisticated deductive, inductive, or abductive reasoning given a collection of premises and constraints. (2) Logical consistency: LLMs are prone to producing responses contradicting themselves across different questions. For example, the state-of-the-art Macaw question-answering LLM answers Yes to both questions Is a magpie a bird? and Does a bird have wings? but answers No to Does a magpie have wings?. To facilitate this research direction, we comprehensively investigate the most cutting-edge methods and propose detailed taxonomies of these methods. Specifically, to accurately answer complex logic questions, previous methods can be categorized based on reliance on external solvers, prompts, pretraining, and fine-tuning. To avoid logical contradictions, we discuss concepts and solutions of various logical consistencies, including implication, negation, transitivity, factuality consistency, and their composites. In addition, we review commonly used benchmark datasets and evaluation metrics, and discuss promising research directions, such as extensions to modal logic to account for uncertainty, and efficient algorithms satisfying multiple logical consistencies simultaneously.
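The magpie example in the abstract can be made concrete with a minimal sketch of a transitivity-consistency check: if a model affirms both premises (magpie → bird, bird → wings) but denies the conclusion (magpie → wings), the three answers are mutually inconsistent. The `answers` dictionary here is a hypothetical stand-in for querying a QA model such as Macaw; the function name and structure are illustrative assumptions, not the survey's actual evaluation code.

```python
# Hypothetical recorded answers from a QA model (stand-in for real model queries).
answers = {
    "Is a magpie a bird?": "Yes",          # magpie -> bird
    "Does a bird have wings?": "Yes",      # bird -> wings
    "Does a magpie have wings?": "No",     # magpie -> wings (should follow)
}

def violates_transitivity(a_implies_b: str, b_implies_c: str, a_implies_c: str) -> bool:
    """Flag the inconsistent case: both premises affirmed, conclusion denied."""
    return (
        answers[a_implies_b] == "Yes"
        and answers[b_implies_c] == "Yes"
        and answers[a_implies_c] == "No"
    )

print(violates_transitivity(
    "Is a magpie a bird?",
    "Does a bird have wings?",
    "Does a magpie have wings?",
))  # -> True: the model's three answers contradict each other
```

Methods surveyed under "logical consistency" aim to detect and repair exactly this kind of contradiction, for implication, negation, transitivity, and factuality alike.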
