Robots learn to work together like a well-choreographed dance

2025-09-04 University College London (UCL)

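A research team from UCL, Google DeepMind, and Intrinsic has developed RoboBallet, an AI algorithm that lets multiple robot arms coordinate efficiently. Until now, experts have designed collision-avoiding motion plans by hand; this method instead combines graph neural networks with reinforcement learning so that the robots themselves learn optimal coordinated motions. After a few days of training, eight robots could handle 40 different tasks within seconds, substantially outperforming conventional methods. Because the system learns the principles of coordination rather than individual motions, it can replan immediately for previously unseen layouts or when equipment fails. Published in Science Robotics, the result is a major step toward manufacturing-line automation and more flexible deployment of industrial robots.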

Robotic arms used in the study. Credit: Google DeepMind/UCL.

<Related information>

RoboBallet: Planning for multirobot reaching with graph neural networks and reinforcement learning

Matthew Lai, Keegan Go, Zhibin Li, Torsten Kröger, […] , and Jonathan Scholz
Science Robotics, Published: 3 Sep 2025
DOI: https://doi.org/10.1126/scirobotics.ads1204

Abstract

Modern robotic manufacturing requires collision-free coordination of multiple robots to complete numerous tasks in shared, obstacle-rich workspaces. Although individual tasks may be simple in isolation, automated joint task allocation, scheduling, and motion planning under spatiotemporal constraints remain computationally intractable for classical methods at real-world scales. Existing multiarm systems deployed in industry rely on human intuition and experience to design feasible trajectories manually in a labor-intensive process. To address this challenge, we propose a reinforcement learning (RL) framework to achieve automated task and motion planning, tested in an obstacle-rich environment with eight robots performing 40 reaching tasks in a shared workspace, where any robot can perform any task in any order. Our approach builds on a graph neural network (GNN) policy trained via RL on procedurally generated environments with diverse obstacle layouts, robot configurations, and task distributions. It uses a graph representation of scenes and a graph policy neural network trained through RL to generate trajectories of multiple robots, jointly solving the subproblems of task allocation, scheduling, and motion planning. Trained on large randomly generated task sets in simulation, our policy generalizes zero-shot to unseen settings with varying robot placements, obstacle geometries, and task poses. We further demonstrate that the high-speed capability of our solution enables its use in workcell layout optimization, improving solution times. The speed and scalability of our planner also open the door to capabilities such as fault-tolerant planning and online perception-based replanning, where rapid adaptation to dynamic task sets is required.
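
To make the "graph representation of scenes" in the abstract more concrete, here is a minimal Python sketch of how robots, tasks, and obstacles might be encoded as typed nodes with relative-position edges, followed by one hand-written aggregation round. It is an illustration only: the abstract does not specify the paper's node and edge features or its network architecture, so every name here (`Node`, `build_scene_graph`, `message_passing_step`), the 2D coordinates, and the mean-distance aggregation are assumptions rather than the authors' implementation.

```python
from dataclasses import dataclass
from itertools import permutations
import math


@dataclass(frozen=True)
class Node:
    kind: str        # "robot", "task", or "obstacle" (hypothetical node types)
    position: tuple  # (x, y) in workcell coordinates; 2D only for brevity


def build_scene_graph(robots, tasks, obstacles):
    """Return (nodes, edges). Edges are directed (src, dst, offset) triples
    pointing into robot nodes, with the relative offset as the edge feature."""
    nodes = ([Node("robot", p) for p in robots]
             + [Node("task", p) for p in tasks]
             + [Node("obstacle", p) for p in obstacles])
    edges = []
    for src, dst in permutations(range(len(nodes)), 2):
        if nodes[dst].kind != "robot":
            continue  # in this sketch only robots receive messages
        (sx, sy), (dx, dy) = nodes[src].position, nodes[dst].position
        edges.append((src, dst, (sx - dx, sy - dy)))
    return nodes, edges


def message_passing_step(nodes, edges):
    """One untrained aggregation round: each robot receives the mean distance
    to its incoming neighbours, grouped by neighbour kind. A learned GNN would
    replace this with trained message and update functions."""
    summary = {}
    for src, dst, (ox, oy) in edges:
        per_kind = summary.setdefault(dst, {})
        per_kind.setdefault(nodes[src].kind, []).append(math.hypot(ox, oy))
    return {dst: {kind: sum(d) / len(d) for kind, d in per_kind.items()}
            for dst, per_kind in summary.items()}


# Toy scene: two robots, one reaching task, one obstacle.
nodes, edges = build_scene_graph(
    robots=[(0.0, 0.0), (4.0, 0.0)],
    tasks=[(0.0, 3.0)],
    obstacles=[(4.0, 3.0)],
)
features = message_passing_step(nodes, edges)
```

In this toy scene each robot ends up with a small per-kind summary of its surroundings. In an RL setting, a policy would map such per-robot features to motion commands. The appeal of a graph encoding is that the same functions apply unchanged when robots, tasks, or obstacles are added or removed, which fits the abstract's claim of zero-shot generalization to varying layouts.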
