Probabilistic Efficiency Analysis Using Explainable Artificial Intelligence
Create New SMOTE Units to Balance Data combinations of m + s
Search Range for Directional Efficiency Parameter ()
Data preprocessing and efficiency labeling with Additive DEA
Training Classification Models to Estimate Efficiency
Global feature importance for efficiency classifiers
Identify Benchmark Peers Based on Estimated Efficiency Probabilities
Generate Efficiency Rankings Based on Probabilistic Classification
Projection-Based Efficiency Targets
Prepare Data and Handle Errors
Training a Classification Machine Learning Model
Prepare Training and Target Datasets from a caret Model
Provides a probabilistic framework that integrates Data Envelopment Analysis (DEA) (Banker et al., 1984) <doi:10.1287/mnsc.30.9.1078> with machine learning classifiers (Kuhn, 2008) <doi:10.18637/jss.v028.i05> to estimate both the (in)efficiency status and the probability of efficiency for decision-making units. The approach trains predictive models on DEA-derived efficiency labels (Charnes et al., 1985) <doi:10.1016/0304-4076(85)90133-2>, enabling explainable artificial intelligence (XAI) workflows with global and local interpretability tools, including permutation importance (Molnar et al., 2018) <doi:10.21105/joss.00786>, Shapley value explanations (Strumbelj & Kononenko, 2014) <doi:10.1007/s10115-013-0679-x>, and sensitivity analysis (Cortez, 2011) <https://CRAN.R-project.org/package=rminer>. The framework also supports probability-threshold peer selection and counterfactual improvement recommendations for benchmarking and policy evaluation. The probabilistic efficiency framework is detailed in González-Moyano et al. (2025) "Probability-based Technical Efficiency Analysis through Machine Learning", in review for publication.
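The ranking and probability-threshold peer selection described above can be illustrated with a minimal, language-neutral sketch. It does not use the package's own functions: the names `rank_units` and `select_peers`, the example probabilities, the input/output profiles, and the 0.75 threshold are all hypothetical. Ordering peers by Euclidean distance is likewise an assumption made for illustration, not necessarily the rule the package applies.

```python
from math import dist


def rank_units(probs):
    """Sort unit names by estimated efficiency probability, highest first.

    Ties are broken alphabetically so the result is deterministic.
    """
    return sorted(probs, key=lambda u: (-probs[u], u))


def select_peers(unit, probs, profiles, threshold=0.75):
    """Return candidate peers for `unit`.

    A peer is any other unit whose estimated probability of being
    efficient is at least `threshold`. Peers are ordered by Euclidean
    distance to `unit` in input/output space (an assumption of this sketch).
    """
    candidates = [u for u in probs if u != unit and probs[u] >= threshold]
    return sorted(candidates, key=lambda u: (dist(profiles[unit], profiles[u]), u))


# Hypothetical efficiency probabilities, as a fitted classifier might return.
probs = {"A": 0.92, "B": 0.40, "C": 0.81, "D": 0.15}

# Hypothetical (input, output) profile for each decision-making unit.
profiles = {"A": (2.0, 5.0), "B": (4.0, 4.0), "C": (3.5, 6.0), "D": (6.0, 3.0)}

ranking = rank_units(probs)
peers_of_b = select_peers("B", probs, profiles, threshold=0.75)

print(ranking)      # ['A', 'C', 'B', 'D']
print(peers_of_b)   # ['C', 'A']
```

Raising the threshold narrows the peer set to units the classifier is more confident about. With `threshold=0.9`, only unit A remains a peer for B.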
Useful links