Researchers Found a Better Way to Teach Large Language Models New Skills


2025-07-07 North Carolina State University (NCState)

Image credit: Growtika.

Researchers at North Carolina State University have developed WeGeFT, a technique that lets large language models (LLMs) learn new skills efficiently. Building on LoRA, it distinguishes parameters tied to what the model already knows from new ones, and concentrates fine-tuning on the latter, achieving high performance at low cost. WeGeFT outperformed LoRA across multiple tasks, including commonsense and arithmetic reasoning, code generation, and visual recognition. Applications to AI safety and suppressing harmful outputs are also anticipated. Details will be presented at ICML 2025.

<Related Information>

WeGeFT: Weight-Generative Fine-Tuning for Multi-Faceted Efficient Adaptation of Large Models

Chinmay Savadikar · Xi Song · Tianfu Wu
International Conference on Machine Learning (ICML 2025), Presented: July 17, 2025

Abstract

Fine-tuning large pretrained Transformer models can focus on either introducing a small number of new learnable parameters (parameter efficiency) or editing representations of a small number of tokens using lightweight modules (representation efficiency). While the pioneering method LoRA (Low-Rank Adaptation) inherently balances parameter, compute, and memory efficiency, many subsequent variants trade off compute and memory efficiency and/or performance to further reduce fine-tuning parameters. To address this limitation and unify parameter-efficient and representation-efficient fine-tuning, we propose Weight-Generative Fine-Tuning (WeGeFT, pronounced wee-gift), a novel approach that learns to generate fine-tuning weights directly from the pretrained weights. WeGeFT employs a simple low-rank formulation consisting of two linear layers, either shared across multiple layers of the pretrained model or individually learned for different layers. This design achieves multi-faceted efficiency in parameters, representations, compute, and memory, while maintaining or exceeding the performance of LoRA and its variants. Extensive experiments on commonsense reasoning, arithmetic reasoning, instruction following, code generation, and visual recognition verify the effectiveness of our proposed WeGeFT.
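The abstract's central idea, generating the fine-tuning update directly from the pretrained weights through "a simple low-rank formulation consisting of two linear layers", can be sketched as follows. This is a minimal illustration based only on the abstract's description, not the authors' implementation; the shapes, initialization, and variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: one Transformer weight matrix and a small rank r.
d_out, d_in, r = 64, 64, 4

# Frozen pretrained weight (stands in for one layer of the large model).
W0 = rng.standard_normal((d_out, d_in))

# WeGeFT's two learnable linear maps (small initialization is an assumption).
A = rng.standard_normal((d_in, r)) * 0.01
B = rng.standard_normal((r, d_in)) * 0.01

# The fine-tuning weights are *generated from* the pretrained weights:
# passing W0 through the two linear layers yields a rank <= r update.
delta_W = W0 @ A @ B
W_adapted = W0 + delta_W

# Contrast with LoRA, where the update A' @ B' is learned independently of W0.
```

Because `delta_W` factors through the rank-`r` bottleneck, only `A` and `B` (about `2 * d_in * r` values here) are trained, while the generated update still depends on the pretrained weights themselves, which is what distinguishes this scheme from a plain LoRA update.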

Lay Summary

Modern AI language models are extremely capable, but adapting them to new tasks can be resource-heavy, requiring lots of memory, computing power, and changes to many internal parameters. To make this easier, researchers have developed techniques that update only a small number of these parameters, making the fine-tuning process more efficient.

One popular method, called LoRA (Low-Rank Adaptation), strikes a strong balance: it keeps the number of new parameters low and remains efficient in terms of memory, speed, and performance. However, many newer methods reduce the number of added parameters even further, but at the cost of using more memory, more computation, or losing accuracy.

We created WeGeFT (short for Weight-Generative Fine-Tuning, pronounced wee-gift), a new approach that keeps LoRA's broad efficiency benefits while reducing the number of added parameters even more. It learns to generate the necessary updates directly from the original model's knowledge, using a simple and compact design. Despite being lightweight, WeGeFT matches or outperforms LoRA on a wide range of tasks, from arithmetic and commonsense reasoning and instruction following to coding and image recognition, making it a powerful and efficient tool for tuning AI models.

1602 Software Engineering