Abstract
Open Source Software Development (OSSD) communities are often able to produce high-quality software comparable to proprietary software. The success of an OSSD community is often attributed to its underlying governance model, and a key component of these models is the decision-making stage. Most studies of OSSD communities, particularly those that focus on the people and processes involved in decision-making, have adopted a qualitative lens rather than undertaking a data-driven, empirical analysis. This work aims to bridge this gap through a large-scale quantitative study based on data available in Python development email archives.
There are two parts to this study: people and processes. The first part of this thesis investigates members’ involvement during different stages of decision-making and the participation patterns for different types of proposals. It also uses Social Network Analysis (SNA) to identify the core decision-makers in the Python community. Among these core contributors, it identifies those decision-makers who contribute in multiple ways (i.e. perform multiple roles) and thus act as boundary spanners in OSSD. Based on this, the thesis proposes an approach to identify a replacement administrator who can take charge of the community if the current leader is no longer able to contribute.
The second part of this thesis addresses the need to make hidden decision-making processes more explicit. Even when such processes are publicly documented, the rationale for the various steps in a decision process often remains hidden. To address these gaps, this thesis presents the Decision-Making Process (DeMaP) miner framework, which employs Natural Language Processing (NLP) techniques and a data-driven, bottom-up approach to extract the decision-making process from email discussions. It also presents a mechanism that can be used to infer the rationale behind how OSS enhancement decisions were reached (e.g. based on developer consensus or administrator pronouncement).
The thesis makes two main contributions. First, it makes a knowledge contribution by a) highlighting the involvement patterns of members during decision-making across different types of proposals and different states within these proposals, b) identifying key decision-makers and boundary spanners in the Python OSSD community, c) extracting the Python decision-making process, and d) determining the rationale behind specific decisions. Second, it makes a methodological contribution in the form of a framework (DeMaP miner) that can be used to mine decision-making processes, together with a mechanism (Rationale miner) for inferring the rationale behind decisions, both in the Python community and in other similar OSSD communities.