Technique developed to identify the origins of black-box AI (Researchers Crack AI Blackbox)

2026-02-25 Georgia Institute of Technology

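A research team at the Georgia Institute of Technology in the United States has developed ZEN, a new analysis framework that addresses the "black box" nature of AI models. The internal structure and training process of advanced AI systems are opaque, so defects, security problems, and license violations cannot be verified from the outside. ZEN extracts a unique "fingerprint" from the memory image of a running AI model and generates a representation that integrates the model's mathematical structure with the characteristics of its program code. This makes it possible to check whether an unknown model is based on an existing open-source model, identify what was changed, and then produce a reconstructable copy. In tests, ZEN achieved 100% attribution accuracy on 21 state-of-the-art models, and applications in security analysis and intellectual property protection are expected. The work is scheduled to be presented at the 2026 NDSS Symposium.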

＜Related information＞

Achieving Zen: Combining Mathematical and Programmatic Deep Learning Model Representations for Attribution and Reuse

David Oygenblik, Dinko Dermendzhiev, Filippos Sofias, Mingxuan Yao, Haichuan Xu, Runze Zhang, Jeman Park, Amit Kumar Sikder, Brendan Saltaformaggio
Network and Distributed System Security (NDSS) Symposium

Prior work has developed techniques capable of extracting deep learning (DL) models in universal formats from system memory or program binaries for security analysis. Unfortunately, such techniques ignore the recovery of the DL model’s programmatic representation required for model reuse and any white-box analysis techniques. Addressing this, we propose a novel recovery methodology, and prototype ZEN, that automatically recovers the DL model programmatic representation complementing the recovery of the mathematical representation by prior work. ZEN identifies novel code in an unknown DL system relative to a base model and generates patches such that the recovered DL model can be reused. We evaluated ZEN on 21 SOTA DL models, including models across the language and vision domains, such as Llama 3 and YoloV10. ZEN successfully attributed custom models to their base models with 100% accuracy, enabling model reuse.
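
The two ideas in the abstract, attributing an unknown model to a base model and isolating the code that is new relative to that base, can be illustrated with a toy sketch. This is not ZEN's algorithm, which the abstract does not detail. It assumes, purely for illustration, that each model is summarized by a set of operator names and a small source listing. The function names `attribute` and `novel_lines`, the Jaccard similarity measure, and all sample data are hypothetical.

```python
import difflib


def attribute(unknown_ops, base_models):
    """Return (name, score) of the base model whose operator set is most
    similar to unknown_ops. Jaccard similarity is an illustrative stand-in,
    not the fingerprint comparison used by ZEN."""
    def jaccard(a, b):
        a, b = set(a), set(b)
        union = a | b
        return len(a & b) / len(union) if union else 1.0

    scored = ((name, jaccard(unknown_ops, ops)) for name, ops in base_models.items())
    return max(scored, key=lambda pair: pair[1])


def novel_lines(base_src, unknown_src):
    """Return the lines that appear in unknown_src but not in base_src,
    using a line-level diff from the standard library."""
    diff = difflib.ndiff(base_src.splitlines(), unknown_src.splitlines())
    return [line[2:] for line in diff if line.startswith("+ ")]


if __name__ == "__main__":
    # Hypothetical operator fingerprints for two base models.
    bases = {
        "llama_like": ["embed", "attn", "mlp", "rmsnorm"],
        "yolo_like": ["conv", "bn", "silu", "detect"],
    }
    unknown = ["embed", "attn", "mlp", "rmsnorm", "custom_gate"]
    print(attribute(unknown, bases))

    base_code = "x = attn(x)\nx = mlp(x)"
    custom_code = "x = attn(x)\nx = custom_gate(x)\nx = mlp(x)"
    print(novel_lines(base_code, custom_code))
```

In this toy setting the lines returned by `novel_lines` play the role of a patch against the base model's code. A real system would have to recover both the operator structure and the code from memory or binaries first.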