Using AI to develop enhanced cybersecurity measures

2024-02-16 Los Alamos National Laboratory (LANL)

A new, innovative method using AI is paving the way for enhanced cybersecurity measures. Credit: Maksim Eren, image created in DALL-E.

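◆ A research team at Los Alamos National Laboratory is using artificial intelligence to address several critical shortcomings in large-scale malware analysis. The team has made significant advances in the classification of Microsoft Windows malware, paving the way for more advanced cybersecurity measures. Using its approach, the team set a new world record in malware family classification. The research was recently published in a journal of the Association for Computing Machinery.
◆ The study introduces an innovative AI-based method that marks a significant breakthrough in Windows malware classification. The method combines semi-supervised tensor decomposition with selective classification, specifically a reject option, to classify malware families under realistic conditions. It can also abstain from a prediction when it is not confident. This gives security analysts the confidence to apply such techniques in practical, high-stakes settings such as cyber defense. The method maintains its performance even when only limited data is used for training.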
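
The reject option mentioned above can be illustrated with a minimal sketch. This is not the team's method: it assumes a generic classifier that outputs one probability per malware family, and the function name `predict_with_reject`, the 0.7 threshold, and the family names are all hypothetical.

```python
from typing import Optional, Sequence


def predict_with_reject(
    probs: Sequence[float],
    labels: Sequence[str],
    threshold: float = 0.7,  # hypothetical confidence cutoff, not from the paper
) -> Optional[str]:
    """Return the most probable label, or None (abstain) if confidence is too low."""
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")
    # Index of the highest-scoring class.
    best = max(range(len(probs)), key=lambda i: probs[i])
    # Reject option: only commit to a prediction when the model is confident enough.
    return labels[best] if probs[best] >= threshold else None


# Hypothetical family names and classifier scores, for illustration only.
families = ["family_a", "family_b", "family_c"]
confident = predict_with_reject([0.90, 0.06, 0.04], families)
uncertain = predict_with_reject([0.40, 0.35, 0.25], families)
print(confident)  # family_a
print(uncertain)  # None
```

A sample that is rejected (`None`) would typically be routed to a human analyst rather than given a possibly wrong family label, which is what makes this kind of classifier usable in high-stakes settings.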

<Related information>

Semi-Supervised Classification of Malware Families Under Extreme Class Imbalance via Hierarchical Non-Negative Matrix Factorization with Automatic Model Selection

Maksim E. Eren, Manish Bhattarai, Robert J. Joyce, Edward Raff, + 2
ACM Transactions on Privacy and Security, Published: 13 November 2023
DOI: https://doi.org/10.1145/3624567

Abstract

Identification of the family to which a malware specimen belongs is essential in understanding the behavior of the malware and developing mitigation strategies. Solutions proposed by prior work, however, are often not practicable due to the lack of realistic evaluation factors. These factors include learning under class imbalance, the ability to identify new malware, and the cost of production-quality labeled data. In practice, deployed models face prominent, rare, and new malware families. At the same time, obtaining a large quantity of up-to-date labeled malware for training a model can be expensive. In this article, we address these problems and propose a novel hierarchical semi-supervised algorithm, which we call the HNMFk Classifier, that can be used in the early stages of the malware family labeling process. Our method is based on non-negative matrix factorization with automatic model selection, that is, with an estimation of the number of clusters. With HNMFk Classifier, we exploit the hierarchical structure of the malware data together with a semi-supervised setup, which enables us to classify malware families under conditions of extreme class imbalance. Our solution can perform abstaining predictions, or rejection option, which yields promising results in the identification of novel malware families and helps with maintaining the performance of the model when a low quantity of labeled data is used. We perform bulk classification of nearly 2,900 both rare and prominent malware families, through static analysis, using nearly 388,000 samples from the EMBER-2018 corpus. In our experiments, we surpass both supervised and semi-supervised baseline models with an F1 score of 0.80.
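
The semi-supervised, abstaining setup described in the abstract can be sketched in a much simplified form. The code below is not the HNMFk Classifier: it assumes that cluster assignments have already been produced by some factorization or clustering step (not shown), and it only illustrates how a few known labels can be spread within clusters while abstaining elsewhere. The function name `label_by_cluster`, the `min_purity` parameter, and the toy data are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Optional


def label_by_cluster(
    clusters: List[int],
    known: Dict[int, str],
    min_purity: float = 0.8,  # hypothetical cutoff, not from the paper
) -> List[Optional[str]]:
    """Cluster-then-label with abstention.

    clusters[i] is the cluster id of sample i (assumed to come from an
    earlier clustering step); known maps sample index -> family label.
    """
    # Count the known family labels inside each cluster.
    votes: Dict[int, Counter] = defaultdict(Counter)
    for idx, family in known.items():
        votes[clusters[idx]][family] += 1

    predictions: List[Optional[str]] = []
    for idx, cluster in enumerate(clusters):
        if idx in known:
            predictions.append(known[idx])  # already labeled
            continue
        counts = votes.get(cluster)
        if not counts:
            predictions.append(None)  # no labeled sample here: abstain
            continue
        family, top = counts.most_common(1)[0]
        purity = top / sum(counts.values())
        # Abstain when the labeled samples in the cluster disagree too much.
        predictions.append(family if purity >= min_purity else None)
    return predictions


# Toy data: 8 samples in 3 clusters, 4 of them labeled.
clusters = [0, 0, 0, 1, 1, 1, 2, 2]
known = {0: "family_a", 1: "family_a", 3: "family_a", 4: "family_b"}
predictions = label_by_cluster(clusters, known)
print(predictions)
```

In this toy run, the unlabeled sample in cluster 0 inherits `family_a`, the one in the mixed cluster 1 is rejected, and both samples in the unlabeled cluster 2 are rejected. A cluster with no known labels is one plausible signal of a novel family.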
