2025-06-16 University of California, Riverside (UCR)
<Related links>
- https://news.ucr.edu/articles/2025/06/16/wafer-scale-accelerators-could-redefine-ai
- https://www.cell.com/device/fulltext/S2666-9986(25)00147-4
Performance, efficiency, and cost analysis of wafer-scale AI accelerators vs. single-chip GPUs
Mihrimah Ozkan ∙ Lily Pompa ∙ Md Shaihan Bin Iqbal ∙ … ∙ Handing Wang ∙ Lusha Gao ∙ Sandra Hernandez Gonzalez
Device. Published: June 16, 2025
DOI: https://doi.org/10.1016/j.device.2025.100834
Graphical abstract (figure not reproduced)
The bigger picture
The rapid advancement of artificial intelligence (AI) has revolutionized industries, driving breakthroughs in protein-structure prediction, weather forecasting, and optimization algorithms. However, the exponential growth of AI models, now scaling to trillions of parameters, reveals limitations of traditional hardware like single-chip GPUs in scalability, energy efficiency, and throughput. Wafer-scale computing emerges as a groundbreaking alternative, integrating multiple chiplets into a monolithic wafer to push performance and efficiency boundaries. Platforms like the Cerebras Wafer-Scale Engine (WSE-3), with trillions of transistors and hundreds of thousands of cores, and Tesla’s Dojo D1 tileset, with over a trillion transistors and thousands of cores per training tile, exemplify wafer-scale accelerators’ transformative potential. Meanwhile, single-chip GPUs remain critical due to their cost effectiveness, versatility, and mature software ecosystems. This review compares wafer-scale AI accelerators and single-chip GPUs, examining performance, energy efficiency, and cost in high-performance AI applications. It highlights enabling technologies like TSMC’s chip-on-wafer-on-substrate (CoWoS), which enhances computational density by up to 40×. Addressing challenges such as fault tolerance, software optimization, and cost efficiency, this review offers insights into trade-offs and synergies between the two architectures, inspiring innovation in scalable, energy-efficient hardware for next-generation AI.
Summary
The exponential growth of artificial intelligence (AI) models, now reaching trillions of parameters, has revealed significant limitations in traditional single-chip graphics processing unit (GPU) architectures, particularly in scalability, energy efficiency, and computational throughput. Wafer-scale computing has emerged as a transformative paradigm, integrating multiple chiplets into a single monolithic wafer to deliver unprecedented performance and efficiency. Platforms such as the Cerebras Wafer-Scale Engine (WSE-3), with 4 trillion transistors and 900,000 cores, and Tesla’s Dojo, featuring 1.25 trillion transistors and 8,850 cores per training tile, exemplify the potential of wafer-scale AI accelerators to address the demands of large-scale AI workloads. This review provides a comprehensive comparative analysis of wafer-scale AI accelerators and single-chip GPUs, focusing on their relative performance, energy efficiency, and cost effectiveness in high-performance AI applications. Emerging technologies, such as TSMC’s chip-on-wafer-on-substrate (CoWoS), which promise to enhance computational density by up to 40 times, are also examined. In addition, this work discusses critical challenges, including fault tolerance, software optimization, and economic feasibility, offering insights into the trade-offs and synergies between these two hardware paradigms. Furthermore, emerging AI hardware trends, including three-dimensional (3D) integration, photonic chips, and advanced semiconductor materials, are discussed. This review aims to inform the development of scalable and energy-efficient AI computing by evaluating the strengths and limitations of both hardware paradigms. A future outlook outlines key advancements expected over the next 5 to 10 years, shaping the next generation of AI hardware.
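The scale gap the summary describes can be made concrete with a quick back-of-the-envelope comparison. The WSE-3 and Dojo figures below are the ones quoted in the summary; the single-chip GPU numbers (NVIDIA H100: roughly 80 billion transistors and 16,896 CUDA cores) are commonly cited specifications assumed here for illustration and are not taken from the paper:

```python
# Rough scale comparison using the figures quoted in the summary above.
# The H100 numbers are assumed (commonly cited specs), not from the paper.

wse3 = {"transistors": 4.0e12, "cores": 900_000}       # Cerebras WSE-3
dojo_tile = {"transistors": 1.25e12, "cores": 8_850}   # Tesla Dojo training tile
h100 = {"transistors": 80e9, "cores": 16_896}          # assumed single-chip GPU

def ratio(accel, chip, key):
    """How many times larger the accelerator is than the chip on one metric."""
    return accel[key] / chip[key]

print(f"WSE-3 vs. H100, transistors: {ratio(wse3, h100, 'transistors'):.0f}x")   # 50x
print(f"WSE-3 vs. H100, cores:       {ratio(wse3, h100, 'cores'):.0f}x")         # 53x
print(f"Dojo tile vs. H100, transistors: {ratio(dojo_tile, h100, 'transistors'):.1f}x")  # 15.6x
```

Raw transistor and core counts are, of course, only a first-order proxy: the review's actual comparison also weighs energy efficiency, cost, fault tolerance, and software maturity, where single-chip GPUs retain advantages.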


