Language agents help large language models 'think' better, cheaper

2024-09-23 Washington University in St. Louis

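To address the high cost of running large language models (LLMs), researchers at Washington University in St. Louis have developed an autonomous agent that instructs a model's reasoning process for a given task. The agent uses a large LLM to generate high-quality step-by-step instructions once per task, then applies those instructions to a smaller, more cost-effective LLM. For complex logic and math problems, the large LLM is called only once and its instructions are handed to the smaller model, which keeps costs down while still achieving high performance.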
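The division of labor described above (one call to the large model per task, many calls to the small model per question) can be illustrated with a minimal sketch. This is not the authors' implementation: the `InstructionPipeline` class, the prompt wording, and the two stub models are hypothetical stand-ins, and each model is assumed to be a plain function that maps a prompt string to a completion string.

```python
from typing import Callable, Dict

# Assumed interface: a model is any function from prompt text to completion text.
LLM = Callable[[str], str]


class InstructionPipeline:
    """Generate task instructions once with a large model, then reuse them
    for every question answered by a smaller model (hypothetical sketch)."""

    def __init__(self, large_llm: LLM, small_llm: LLM) -> None:
        self.large_llm = large_llm
        self.small_llm = small_llm
        self._instructions: Dict[str, str] = {}  # cache keyed by task name
        self.large_calls = 0
        self.small_calls = 0

    def instructions_for(self, task_name: str, task_description: str) -> str:
        # The expensive model is consulted only on a cache miss,
        # i.e. once per task rather than once per question.
        if task_name not in self._instructions:
            prompt = (
                "Write step-by-step instructions for solving this task.\n"
                f"Task: {task_description}"
            )
            self._instructions[task_name] = self.large_llm(prompt)
            self.large_calls += 1
        return self._instructions[task_name]

    def answer(self, task_name: str, task_description: str, question: str) -> str:
        # Every question is answered by the cheap model, guided by the
        # cached instructions.
        instructions = self.instructions_for(task_name, task_description)
        prompt = f"{instructions}\n\nQuestion: {question}\nAnswer:"
        self.small_calls += 1
        return self.small_llm(prompt)


def fake_large_llm(prompt: str) -> str:
    # Stub standing in for a large, expensive model.
    return "Work through the problem step by step, then state the final answer."


def fake_small_llm(prompt: str) -> str:
    # Stub standing in for a small, cheap model.
    return f"[small model received {len(prompt)} characters]"


if __name__ == "__main__":
    pipeline = InstructionPipeline(fake_large_llm, fake_small_llm)
    for q in ["What is 2 + 3?", "What is 7 * 6?", "What is 10 - 4?"]:
        print(pipeline.answer("arithmetic", "Solve arithmetic word problems.", q))
    print("large model calls:", pipeline.large_calls)
    print("small model calls:", pipeline.small_calls)
```

In this sketch the cost saving comes entirely from the cache: the number of large-model calls grows with the number of tasks, while the number of small-model calls grows with the number of questions.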

＜Related information＞

Agent Instructs Large Language Models to be General Zero-Shot Reasoners

Nicholas Crispino, Kyle Montgomery, Fankun Zeng, Dawn Song, Chenguang Wang
arXiv last revised: 14 Aug 2024 (this version, v2)
DOI:https://doi.org/10.48550/arXiv.2310.03710

Abstract

We introduce a method to improve the zero-shot reasoning abilities of large language models on general language understanding tasks. Specifically, we build an autonomous agent to instruct the reasoning process of large language models. We show this approach further unleashes the zero-shot reasoning abilities of large language models to more tasks. We study the performance of our method on a wide set of datasets spanning generation, classification, and reasoning. We show that our method generalizes to most tasks and obtains state-of-the-art zero-shot performance on 20 of the 29 datasets that we evaluate. For instance, our method boosts the performance of state-of-the-art large language models by a large margin, including Vicuna-13b (13.3%), Llama-2-70b-chat (23.2%), and GPT-3.5 Turbo (17.0%). Compared to zero-shot chain of thought, our improvement in reasoning is striking, with an average increase of 10.5%. With our method, Llama-2-70b-chat outperforms zero-shot GPT-3.5 Turbo by 10.2%.
