SHAP-Boruta

This is the repository

This work provides a model-agnostic analysis that explains the decisions made by the complex technical debt (TD) classification framework proposed in our previous related study, with the goal of making the entire decision-making process transparent. To this end, we construct accurate project-specific classifiers for 22 software projects and apply the SHAP model explainability method to extract feature importance ranks and to interpret the effect that various software metrics have on labeling a software class as high-TD. Then, given a ranked list of important features (i.e., metrics) for each project, we investigate whether the top-ranked metrics, as extracted by the SHAP analysis, agree among projects with similar characteristics. Finally, using the global interpretation of features that SHAP analysis inherently supports, we extract metric thresholds (heuristic values) that may serve as practical TD prevention guidelines (or rules of thumb) for developers.
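The three analysis steps described above (ranking, cross-project agreement, threshold extraction) can be illustrated with a minimal sketch. It is not the code used in the study. It assumes per-sample SHAP values have already been computed elsewhere (for example with the `shap` package) and are available as plain lists. The function names, the metric names `metric_a` to `metric_c`, and all numbers are hypothetical placeholders, not results. Ranking by mean absolute SHAP value is the usual global importance measure. The Jaccard overlap and the sign-change threshold are assumed heuristics chosen for illustration, since the description above does not specify how agreement or thresholds are computed.

```python
from statistics import mean


def rank_features(shap_rows, names):
    """Rank features by mean absolute SHAP value, most important first."""
    importance = {
        name: mean(abs(row[i]) for row in shap_rows)
        for i, name in enumerate(names)
    }
    return sorted(names, key=lambda n: importance[n], reverse=True)


def top_k_agreement(rank_a, rank_b, k):
    """Jaccard overlap between the top-k features of two rankings (assumed measure)."""
    a, b = set(rank_a[:k]), set(rank_b[:k])
    return len(a & b) / len(a | b)


def sign_change_threshold(metric_values, shap_values):
    """Midpoint between adjacent metric values where SHAP goes from <= 0 to > 0.

    Returns None if the SHAP values never cross zero in that direction.
    This is one possible heuristic, not necessarily the study's method.
    """
    pairs = sorted(zip(metric_values, shap_values))
    for (v0, s0), (v1, s1) in zip(pairs, pairs[1:]):
        if s0 <= 0 < s1:
            return (v0 + v1) / 2
    return None


# Hypothetical metric names and synthetic SHAP values (placeholders only).
names = ["metric_a", "metric_b", "metric_c"]
project_x = [[0.30, -0.05, 0.10], [-0.20, 0.02, -0.15], [0.40, -0.01, 0.05]]
project_y = [[0.10, 0.25, -0.02], [-0.05, -0.30, 0.01]]

rank_x = rank_features(project_x, names)
rank_y = rank_features(project_y, names)
agreement = top_k_agreement(rank_x, rank_y, k=2)

# Synthetic values of one metric and its SHAP contribution per sample.
threshold = sign_change_threshold([5, 10, 20, 40], [-0.2, -0.1, 0.15, 0.3])
```

In this sketch a positive SHAP value is read as pushing a class toward the high-TD label, so the threshold marks the metric value above which the metric starts to raise that risk.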