Fairness in machine learning with tractable models

Authors:

Highlights:

Abstract

Machine learning techniques have become pervasive across a range of applications, and are now widely used in areas as disparate as recidivism prediction, consumer credit-risk analysis and insurance pricing. The prevalence of machine learning techniques has raised concerns about the potential for learned algorithms to become biased against certain groups. Many definitions of fairness have been proposed in the literature, but the fundamental task of reasoning about probabilistic events remains challenging, owing to the intractability of inference. The focus of this paper is taking steps towards the application of tractable probabilistic models to fairness in machine learning. Tractable probabilistic models have recently emerged that guarantee that conditional marginals can be computed in time linear in the size of the model. In particular, we show that sum product networks (SPNs) enable an effective technique for determining the statistical relationships between protected attributes and other training variables. We also motivate the concept of "fairness through percentile equivalence", a new definition predicated on the notion that individuals at the same percentile of their respective distributions should be treated equivalently; this prevents the unfair penalisation of individuals who lie at the extremities of their respective distributions. We compare the efficacy of this pre-processing technique with an alternative approach that assumes an additive contribution. When the two approaches were compared on a data set containing the results of law school applicants, the percentile equivalence method reduced the average underestimation of the exam scores of black applicants at the bottom end of their conditional distribution by about a fifth. We conclude by outlining potential improvements to our existing methodology and suggest opportunities for further work in this field.
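The percentile-equivalence idea described in the abstract can be illustrated as a quantile-mapping pre-processing step: each individual's score is converted to its percentile within the conditional distribution of their own group, and then mapped to the value at the same percentile of a reference distribution, so that individuals at the same percentile are treated equivalently. The sketch below is a minimal illustration under our own assumptions (the group labels, data and choice of reference group are hypothetical and not taken from the paper):

```python
import numpy as np

def percentile_equivalence(scores, groups, reference_group):
    """Map each score to the value at the same percentile of the
    reference group's empirical distribution (quantile mapping)."""
    scores = np.asarray(scores, dtype=float)
    groups = np.asarray(groups)
    ref_sorted = np.sort(scores[groups == reference_group])
    adjusted = np.empty_like(scores)
    for g in np.unique(groups):
        mask = groups == g
        group_scores = scores[mask]
        # Percentile of each individual within their own group's distribution.
        ranks = group_scores.argsort().argsort()
        pct = (ranks + 0.5) / len(group_scores)
        # Value at the same percentile of the reference distribution.
        adjusted[mask] = np.quantile(ref_sorted, pct)
    return adjusted

# Hypothetical toy example: exam scores for two groups.
scores = [55, 60, 72, 80, 50, 58, 65, 75]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
print(percentile_equivalence(scores, groups, reference_group="A"))
```

In this sketch the mapping preserves each individual's rank within their own group while equalising the group-conditional distributions; the paper's actual construction additionally relies on SPNs to identify which variables are statistically related to the protected attribute.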

Keywords:

Review history: Received 7 October 2019, Revised 18 May 2020, Accepted 20 December 2020, Available online 9 January 2021, Version of Record 20 January 2021.

DOI: https://doi.org/10.1016/j.knosys.2020.106715