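Solving "The Finer the Tracking, the More False Positives": A Fine-Grained Technique for Identifying Bug-Inducing Commits That Suppresses False Positives by Majority Voting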

2026-05-08 Kyushu University

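A research group including Kyushu University and Osaka University has developed "MV-SZZ," a new technique that identifies, with high accuracy, the "bug-inducing commit" that caused a software defect. Conventional SZZ-family methods suffer from a trade-off: the more finely source code changes are tracked, the more false positives they produce. MV-SZZ analyzes source code at the token level, tracks the change history in detail, and then uses a majority voting algorithm to select the most plausible bug-inducing commit from multiple candidates. Compared with conventional methods, this substantially reduces both missed detections and false positives, enabling high-accuracy identification of the cause. The results are expected to contribute to improved software quality and lower development costs, as well as to the preparation of training datasets for bug-prediction and automated-repair AI. The technique is drawing attention as part of the software engineering foundation for the AI era.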
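The majority-voting step described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `select_inducing_commits`, the input format (one candidate commit hash per changed token, such as token-level blame might produce), and the tie handling (all tied commits are returned) are assumptions made for this example.

```python
from collections import Counter


def select_inducing_commits(token_origins):
    """Pick the commit(s) that introduced the most changed tokens.

    token_origins: list of commit hashes, one per token changed by the
    defect-fixing commit (hypothetical input format).
    Returns a sorted list; more than one entry means a tie (assumed rule).
    """
    counts = Counter(token_origins)
    if not counts:
        return []
    top = max(counts.values())
    return sorted(commit for commit, n in counts.items() if n == top)


# Five changed tokens trace back to three candidate commits.
origins = ["a1f3", "a1f3", "9c2e", "a1f3", "77b0"]
print(select_inducing_commits(origins))
```

In this example, three of the five tokens trace back to `a1f3`, so the two commits that each contributed a single token are filtered out as likely unrelated.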

Figure 1: Overview of the new technique "MV-SZZ"

＜Related Information＞

MV-SZZ: An Empirical Study of a Majority Voting-Based SZZ Method

Inase Kondo, Masanari Kondo, Daniel M. German, Yasutaka Kamei, Yoshiki Higo
IEEE Transactions on Software Engineering
DOI:https://doi.ieeecomputersociety.org/10.1109/TSE.2026.3688089

Abstract

The SZZ method identifies defect-inducing commits by tracing lines modified in defect-fixing commits back to the commits that introduced them. While this method is widely used, it may fail to identify defect-inducing commits that are untraceable at the line-level. To address this limitation, a previous study proposed a Token-SZZ method that tracks changes at the token-level. This approach converts a line-level Git history into a token-level history by decomposing each line into individual tokens. While this method is able to identify defect-inducing commits that the previous SZZ method misses, it also incorrectly identifies many commits as defect-inducing (false positives), resulting in decreased performance. To mitigate this issue, we propose Majority Voting SZZ (MV-SZZ), which consists of two key features: an N-token representation of the Git history and a majority voting mechanism. The N-token representation expands on the token-level concept (where N = 1) by instead using N (N > 1) consecutive tokens as a single line. This allows the method to capture more context for defect identification. The majority voting mechanism identifies the most frequent candidate commit associated with the changed tokens, and selects it as the defect-inducing commit. This approach effectively filters out unrelated tokens and reduces false positives. We compared MV-SZZ with six SZZ methods on both the Developer-IO and Defects4J datasets. MV-SZZ achieved the highest F1 and F0.5 scores on both datasets (F1: 0.587 and 0.580; F0.5: 0.591 and 0.578). Interestingly, we found that the issue of increased false positives due to token-level tracking is not unique to token-level methods; rather, it is a general problem that arises in SZZ methods when more accurate tracking is applied. Moreover, we demonstrated that our majority voting mechanism serves as an effective filtering strategy for SZZ variants in general.
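As a rough illustration of the N-token representation mentioned in the abstract, the sketch below rewrites one source line as a sequence of "lines" that each hold N consecutive tokens. The regex tokenizer, the function names, and the use of overlapping sliding windows are assumptions for this example; the abstract does not specify how tokens are produced or how the N-token groups are formed.

```python
import re


def tokenize(line):
    # Naive tokenizer (assumption): runs of word characters, or any
    # single non-space symbol. A real tool would use a language lexer.
    return re.findall(r"\w+|[^\w\s]", line)


def n_token_lines(tokens, n):
    """Group tokens into windows of n consecutive tokens, one per line.

    n = 1 gives the plain token-level view; n > 1 keeps neighbouring
    tokens together so each unit carries more context.
    Overlapping windows are an assumption of this sketch.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if len(tokens) <= n:
        return [" ".join(tokens)] if tokens else []
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = tokenize("x = foo(a, 1);")
for unit in n_token_lines(tokens, 3):
    print(unit)
```

Storing each window as its own line would let an ordinary line-based history tool, such as `git blame`, attribute each N-token unit to the commit that last changed it.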
