Is this build failure related to my patch? An empirical study of unrelated build failures in continuous integration

Yonghui Andie Huang; Daniel Alencar da Costa; Grant Dick; Mariam El Mezouar; Liwen Xiao

doi:10.1007/s10664-026-10874-8

Journal article; Open access; Peer reviewed

Empirical software engineering, Vol.31(6), 148

21/05/2026

DOI: https://doi.org/10.1007/s10664-026-10874-8

Handle: https://hdl.handle.net/10523/51051

Keywords: Non-code-related failures; Continuous Integration (CI); Empirical study; Issue resolving

Abstract

In a hectic Continuous Integration (CI) environment, where several builds are triggered concurrently, legitimate build failures (e.g., those not caused by flaky tests) may not always be related to the current push. These unrelated build failures can burden developers, who devote hours to determining whether errors are truly associated with their current changes. In this paper, we extract 77,354 CI build failures from seven open source projects to understand and identify unrelated build failures. We aim to give developers an indication of whether a build failure is likely to be related to the current push. Our results indicate that developers likely invest a median of 4 hours to determine whether a build failure is (un)related to their pushes. We perform a document analysis on a sample of 371 unrelated build failures (drawn from 10,316 potentially unrelated failures at a 95% confidence level with a 5% confidence interval) to understand why developers deem build failures unrelated. The themes generated from our document analysis reveal that unrelated test failures account for 20% of the cases in which developers deem build failures unrelated. To predict whether a build failure is unrelated to the current push, we extract 33 features from issue reports, issue comments, and the commits pertaining to the triggering push. We build semi-supervised PU-learning models over seven Apache projects and achieve precision ranging from 0.70 ± 0.01 to 0.88 ± 0.02, recall ranging from 0.30 ± 0.03 to 1.00 ± 0.00, and F1-scores ranging from 0.44 ± 0.03 to 0.91 ± 0.00, while the area under the ROC curve (AUC) spans 0.63 ± 0.02 to 0.97 ± 0.03.
Our analysis of feature importance reveals that (i) the time taken from a submitted patch to the build-triggering push (CI latency), (ii) build failures sharing similar error messages with recent failures, and (iii) the number of comments preceding the build failure are all effective indicators for identifying potentially unrelated build failures. The semi-supervised approach proposed in this work can help developers identify build failures that are unrelated to their current push, providing actionable guidance such as re-running builds, inspecting infrastructure logs, or prioritizing code-level debugging based on prediction outcomes.
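
The abstract does not say how the sample size of 371 was computed. As a rough check, the sketch below assumes the common Cochran formula with a finite population correction, a z-score of 1.96 for the 95% confidence level, a 5% margin of error (what the abstract calls the confidence interval), and the most conservative proportion p = 0.5. The function name and its parameters are illustrative and do not come from the paper.

```python
import math


def cochran_sample_size(population: int, z: float = 1.96,
                        margin: float = 0.05, p: float = 0.5) -> int:
    """Sample size for estimating a proportion in a finite population.

    Assumed method (not stated in the abstract): Cochran's formula
    followed by the finite population correction.
    """
    # Sample size for an effectively infinite population.
    n0 = (z ** 2) * p * (1 - p) / (margin ** 2)
    # Shrink the sample to account for the finite population.
    n = n0 / (1 + (n0 - 1) / population)
    # Round up so the requested margin of error is not exceeded.
    return math.ceil(n)


# 10,316 potentially unrelated build failures, as reported in the abstract.
print(cochran_sample_size(10_316))
```

Under these assumptions the result for a population of 10,316 is 371, which matches the sample size reported in the abstract.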

Files and links (2)

pdf: s10664-026-10874-8 (4.71 MB), Published (Version of record), Open Access, CC BY 4.0

url: https://doi.org/10.1007/s10664-026-10874-8, Published (Version of record), Open Access, CC BY 4.0

Details

Record Identifier: 9926867881201891
Title: Is this build failure related to my patch? An empirical study of unrelated build failures in continuous integration
Creators: Yonghui Andie Huang; Daniel Alencar da Costa; Grant Dick; Mariam El Mezouar; Liwen Xiao
Academic Unit: School of Computing
Publication Details: Empirical software engineering, Vol.31(6), 148
Publisher: Springer Nature
Date published; e-published: 21/05/2026
Copyright: Copyright © The Author(s) 2026. This work was first published in Empirical Software Engineering (Springer Nature). This is an open access work distributed under the terms of the Creative Commons Attribution 4.0 International License (https://www.creativecommons.org/licenses/by/4.0/), which permits unrestricted use, sharing, adaptation, distribution and reproduction in any medium or format, provided that the original work is properly attributed to the creator(s) and the source, a link to the Creative Commons license is provided, and any changes made are indicated.
Language: English
Resource Type; Subtype: Journal article