Enhancing AI robustness for more secure and reliable systems

2023-09-30

2023-09-28 Swiss Federal Institute of Technology Lausanne (EPFL)

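◆Researchers at EPFL's School of Engineering have developed a new training approach that hardens artificial intelligence (AI) systems against attacks. It replaces the conventional zero-sum-game formulation with one that uses a continuously adapting attack strategy, improving the reliability of machine learning models, including deep neural networks.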
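◆The work applies to a wide range of AI-dependent activities, such as video streaming, self-driving cars, and surveillance, and overcomes limitations of conventional adversarial training. The new approach strengthens AI defenses and received a paper award at the 2023 International Conference on Machine Learning (ICML).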

＜Related information＞

Adversarial Training Should Be Cast as a Non-Zero-Sum Game

Alexander Robey, Fabian Latorre, George J. Pappas, Hamed Hassani, Volkan Cevher
arXiv, submitted on 19 Jun 2023
DOI: https://doi.org/10.48550/arXiv.2306.11035

One prominent approach toward resolving the adversarial vulnerability of deep neural networks is the two-player zero-sum paradigm of adversarial training, in which predictors are trained against adversarially-chosen perturbations of data. Despite the promise of this approach, algorithms based on this paradigm have not engendered sufficient levels of robustness, and suffer from pathological behavior like robust overfitting. To understand this shortcoming, we first show that the commonly used surrogate-based relaxation used in adversarial training algorithms voids all guarantees on the robustness of trained classifiers. The identification of this pitfall informs a novel non-zero-sum bilevel formulation of adversarial training, wherein each player optimizes a different objective function. Our formulation naturally yields a simple algorithmic framework that matches and in some cases outperforms state-of-the-art attacks, attains comparable levels of robustness to standard adversarial training algorithms, and does not suffer from robust overfitting.
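
The contrast the abstract draws between the zero-sum and non-zero-sum formulations can be sketched in symbols. The abstract itself gives no formulas, so all notation below is an assumption made for illustration: f_θ is a classifier returning one score per class, ℓ is a surrogate loss such as cross-entropy, η is a perturbation bounded in norm by ε, and M_θ is a negative-margin function. This is one plausible reading, not necessarily the paper's exact statement.

```latex
% Zero-sum adversarial training with a surrogate loss (assumed notation):
% the defender and the attacker both act on the same objective \ell.
\[
  \min_{\theta} \; \mathbb{E}_{(x,y)}
  \Big[ \max_{\|\eta\| \le \epsilon} \ell\big(f_\theta(x+\eta),\, y\big) \Big]
\]

% Non-zero-sum bilevel sketch: the defender still minimizes the surrogate
% loss, but the attacker maximizes a different objective, the negative margin.
\[
  \min_{\theta} \; \mathbb{E}_{(x,y)}
  \Big[ \ell\big(f_\theta(x+\eta^\star),\, y\big) \Big]
  \quad \text{subject to} \quad
  \eta^\star \in \operatorname*{arg\,max}_{\|\eta\| \le \epsilon} \;
  \max_{j \ne y} \; M_\theta(x+\eta,\, y)_j
\]

% Negative margin of class j relative to the true class y (assumed definition):
% a positive value for some j \ne y means class j outscores y, i.e. the input
% is misclassified.
\[
  M_\theta(x, y)_j = f_\theta(x)_j - f_\theta(x)_y
\]
```

Under this reading, the attacker's objective is tied directly to misclassification, while the defender keeps a differentiable surrogate for training, which is what "each player optimizes a different objective function" would amount to.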