Using machine learning approaches to classify false positives generated by static analysis tools - An empirical study using small datasets - Replication Package

Lakmal Deshapriya; Sherlock Licorish; Brendon Woodford

doi:10.5281/zenodo.17861560

Code

Open access

Zenodo

09/12/2025

DOI: https://doi.org/10.5281/zenodo.17861560

Handle: https://hdl.handle.net/10523/49209

Abstract

replication package

Static code analysis (SCA) tools are vital in software development because they reduce the cost and time required for manual code reviews. Our previous study established that the best tools reported in the software engineering community return high false positive and false negative rates when used for code analysis. Existing studies have taken several approaches to identifying false positive alarms generated by SCA tools, with the aim of enhancing the utility of these tools. However, these studies have several limitations, chief among which is a tendency to focus on a limited number of alarm types or a failure to evaluate a broad range of machine-learning (ML) techniques comprehensively. This study addresses that gap. We identified the best Code Representation Learning (CRL) techniques and ML algorithms for identifying false positives; the results show that performance varies with the characteristics of the training datasets and the goal of the model building. Given the evidence established in this study, we recommend further investigations targeting specific configurations of advanced CRL techniques and ML algorithms, including simple ML models.

Files and links (1)

URL: https://doi.org/10.5281/zenodo.17861560

Code/Script; CC BY V4.0, Open

Details

Record Identifier: 9926819783501891
Title: Using machine learning approaches to classify false positives generated by static analysis tools - An empirical study using small datasets - Replication Package
Creators: Lakmal Deshapriya; Sherlock Licorish; Brendon Woodford
Academic Unit: School of Computing
Publisher: Zenodo
Date published ; e-published: 09/12/2025
Language: English
Resource Type ; Subtype: Code