A new method could increase the training efficiency of large language models

2026-02-26 Massachusetts Institute of Technology (MIT)

A research team at the Massachusetts Institute of Technology has developed a new method that substantially improves the training efficiency of large language models (LLMs). Whereas training such models has traditionally required enormous computing resources and electricity, the new approach optimizes the computational process during training, eliminating unnecessary updates and redundant computation; the team showed it can reduce compute and energy consumption while maintaining equivalent performance. This is expected to cut model-development costs and lessen environmental impact. The method can also be applied to existing LLM training pipelines, marking an important step toward sustainable AI infrastructure.

A new method could increase the training efficiency of large language models: by leveraging idle computing time, it can double the speed of model training while preserving accuracy. Image: MIT News; iStock

<Related Information>

Taming the Long-Tail: Efficient Reasoning RL Training with Adaptive Drafter

Qinghao Hu, Shang Yang, Junxian Guo, Xiaozhe Yao, Yujun Lin, Yuxian Gu, Han Cai, Chuang Gan, Ana Klimovic, Song Han
arXiv  Published: 21 Jan 2026

Abstract

The emergence of Large Language Models (LLMs) with strong reasoning capabilities marks a significant milestone, unlocking new frontiers in complex problem-solving. However, training these reasoning models, typically using Reinforcement Learning (RL), encounters critical efficiency bottlenecks: response generation during RL training exhibits a persistent long-tail distribution, where a few very long responses dominate execution time, wasting resources and inflating costs. To address this, we propose TLT, a system that accelerates reasoning RL training losslessly by integrating adaptive speculative decoding (SD). Applying speculative decoding in RL is challenging due to the dynamic workloads, evolving target model, and draft model training overhead. TLT overcomes these obstacles with two synergistic components: (1) Adaptive Drafter, a lightweight draft model trained continuously on idle GPUs during long-tail generation to maintain alignment with the target model at no extra cost; and (2) Adaptive Rollout Engine, which maintains a memory-efficient pool of pre-captured CUDAGraphs and adaptively selects suitable SD strategies for each input batch. Evaluations demonstrate that TLT achieves over 1.7× end-to-end RL training speedup over state-of-the-art systems, preserves the model accuracy, and yields a high-quality draft model as a free byproduct suitable for efficient deployment. Code is released at https://github.com/mit-han-lab/fastrl.
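To make the core mechanism concrete, below is a minimal sketch of greedy speculative decoding, the propose-and-verify idea that TLT's Adaptive Drafter builds on. This is not the paper's implementation: the toy `target_next` / `draft_next` functions and all names are illustrative stand-ins for real language models.

```python
def speculative_decode(target_next, draft_next, prompt, k=4, max_new=16):
    """Greedy speculative decoding sketch.

    target_next / draft_next: functions mapping a token sequence to the
    greedily-decoded next token. `k` is the speculation length: how many
    tokens the cheap draft model proposes per verification round.
    """
    seq = list(prompt)
    while len(seq) - len(prompt) < max_new:
        # 1) The cheap draft model proposes k tokens autoregressively.
        draft, ctx = [], list(seq)
        for _ in range(k):
            t = draft_next(ctx)
            draft.append(t)
            ctx.append(t)
        # 2) The target model verifies the proposals: keep the longest
        #    prefix on which both models agree.
        accepted = 0
        for i, t in enumerate(draft):
            if target_next(seq + draft[:i]) == t:
                accepted += 1
            else:
                break
        seq.extend(draft[:accepted])
        # 3) On the first mismatch, emit the target model's own token,
        #    so output is identical to pure target decoding (lossless).
        if accepted < k:
            seq.append(target_next(seq))
    return seq[len(prompt):]
```

When the draft model agrees with the target on most tokens, each expensive target-model verification yields up to `k + 1` tokens instead of one, which is why keeping the drafter aligned with the evolving target model (the Adaptive Drafter's job) matters for the speedup.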
