Causality and Explanation in ML: a Lead from the GDPR and an Application to Personal Injury Damages
Giovanni Comandé; Denise Amram
2019-01-01
Abstract
Although the distinction between prediction and explanation is well established in the philosophy of science, statistical modeling techniques too often overlook the practical implications of this theoretical divergence. Can predictive and explanatory models be recognized as complements rather than substitutes? We argue that predictive and explanatory modeling need not be seen as in conflict: these two so-far parallel approaches would benefit greatly from each other, and their cross-fertilization may become one of the central topics in statistical modeling in the years to come. Most importantly, we show that the need for this convergence is made apparent by the requirements imposed by the EU General Data Protection Regulation (GDPR), and that it is of paramount importance when dealing with legal data. We also show how the demand to meaningfully clarify the logic behind solely automated decision-making processes creates a unique incentive to reconcile two seemingly contradictory scientific paradigms. In addition, looking at 2585 Italian cases on personal injury compensation, we develop a simple application to map the space of judges' decisions and, using state-of-the-art multi-label algorithms, we classify those decisions according to the relevant heads of damages. Drawing causal evidence from such an analysis, however, can be dangerous: if we want machines to improve human decisions, we need more robust, generalizable, and explainable models.
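The abstract mentions classifying court decisions with multi-label algorithms, since a single judgment can award several heads of damages at once. The paper does not specify its pipeline; the following is only a minimal illustrative sketch of the general technique (TF-IDF features with a one-vs-rest logistic regression), using toy documents and hypothetical label names, not the paper's data or method.

```python
# Minimal multi-label text classification sketch.
# Toy texts and label names ("biological", "moral", "patrimonial") are
# hypothetical illustrations, not taken from the paper's dataset.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer

docs = [
    "permanent biological damage and moral suffering",
    "loss of earning capacity after the accident",
    "moral suffering of close relatives",
    "biological damage with reduced earning capacity",
]
# Each decision may carry several heads of damages at once.
labels = [
    {"biological", "moral"},
    {"patrimonial"},
    {"moral"},
    {"biological", "patrimonial"},
]

# Encode the label sets as a binary indicator matrix (one column per head).
mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(labels)

# One independent binary classifier per head of damages.
clf = make_pipeline(
    TfidfVectorizer(),
    OneVsRestClassifier(LogisticRegression()),
)
clf.fit(docs, Y)

# Predictions come back as a binary matrix; map columns back to label names.
pred = clf.predict(["moral suffering and biological damage"])
predicted_heads = [mlb.classes_[i] for i, v in enumerate(pred[0]) if v]
```

The one-vs-rest decomposition is only the simplest multi-label strategy; methods that model label correlations (e.g. classifier chains) are often preferred when heads of damages tend to co-occur.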
File | Type | License | Size | Format | Access
---|---|---|---|---|---
2019_DataScienceForLaw.pdf | Post-print/Accepted manuscript | Public with Copyright | 1.96 MB | Adobe PDF | Authorized users only
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.