Comprehensive Predictive Analytics for Collaborators' Answers, Code Quality, and Dropout: Stack Overflow Case Study – Replication Package

Elijah Zolduoarrati; Sherlock Licorish; Nigel Stanger

doi:10.5281/zenodo.15330992

Dataset

Open access

Zenodo

03/05/2025

DOI: https://doi.org/10.5281/zenodo.15330992

Handle: https://hdl.handle.net/10523/46296

Abstract

Keywords: replication package; Stack Overflow

Previous studies that used data from Stack Overflow to develop predictive models often employed limited benchmarks of 3-5 models or adopted arbitrary selection methods. Although insightful, such approaches may not provide optimal results given their limited scope, suggesting the need to benchmark more models to avoid overlooking untested algorithms. Our study evaluates 21 algorithms across three tasks: predicting the number of questions a user is likely to answer, their code quality violations, and their dropout status. We employed normalisation, standardisation, and logarithmic and power transformations, paired with Bayesian hyperparameter optimisation and genetic algorithms. CodeBERT, a pre-trained language model for both natural and programming languages, was fine-tuned to classify user dropout given their posts (questions and answers) and code snippets. This replication package is provided for those interested in further examining our research methodology.
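
For readers unfamiliar with the four feature transformations named in the abstract, the following is a minimal sketch of what each one does to a skewed count variable. It is not taken from the replication package: the function names, the sample values, and the choice of a square root as the power transformation are all assumptions made for illustration, and only the Python standard library is used.

```python
import math
import statistics


def min_max_normalise(values):
    """Rescale values linearly to the [0, 1] range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardise(values):
    """Centre on the mean and divide by the population standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mean) / sd for v in values]


def log_transform(values):
    """Apply log(1 + x), which is defined for counts that include zero."""
    return [math.log1p(v) for v in values]


def power_transform(values, exponent=0.5):
    """Raise non-negative values to a fixed power (square root by default)."""
    return [v ** exponent for v in values]


def skewness(values):
    """Population skewness: the mean of the cubed standardised values."""
    return statistics.fmean([z ** 3 for z in standardise(values)])


# Hypothetical right-skewed counts (e.g. answers posted per user); not real data.
answer_counts = [0, 1, 1, 2, 3, 5, 8, 40, 250]

transformed = {
    "raw": answer_counts,
    "min-max": min_max_normalise(answer_counts),
    "z-score": standardise(answer_counts),
    "log1p": log_transform(answer_counts),
    "sqrt": power_transform(answer_counts),
}

for name, vals in transformed.items():
    print(f"{name:8s} skewness = {skewness(vals):.3f}")
```

Normalisation and standardisation are linear rescalings, so they change the range of a variable but leave its skewness unchanged, whereas the logarithmic and power transformations compress the long right tail. This difference is one plausible reason to benchmark models under several transformations rather than a single one.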

Files and links (1)

url

https://doi.org/10.5281/zenodo.15330992

CC BY V4.0, Open

Metrics

16 Record Views

Details

Record Identifier: 9926743737901891
Title: Comprehensive Predictive Analytics for Collaborators' Answers, Code Quality, and Dropout: Stack Overflow Case Study – Replication Package
Creators: Elijah Zolduoarrati; Sherlock Licorish; Nigel Stanger
Academic Unit: Information Science; School of Computing
Publisher: Zenodo
Date published ; e-published: 03/05/2025
Language: English
Resource Type ; Subtype: Dataset
