Research Spotlight: Teaching machine learning algorithms to be fair
Machine learning algorithms are increasingly being used to automate decision making in several domains such as hiring, lending and crime-risk prediction. These algorithms have shown significant promise in leveraging large or “big” training datasets to achieve high prediction accuracy, sometimes surpassing even human accuracy.
Unfortunately, some recent investigations have shown that machine learning algorithms can also lead to unfair outcomes. For example, a recent ProPublica study found that COMPAS, a tool used in US courtrooms for assisting judges with crime-risk prediction, was unfair towards black defendants. In fact, several studies from governments, regulatory authorities, researchers and civil rights groups have raised concerns that machine learning could act as a tool for perpetuating existing unfair practices in society and, worse, introduce new kinds of unfairness in prediction tasks. As a consequence, a flurry of recent research has focused on defining and implementing appropriate computational notions of fairness for machine learning algorithms.
Parity-based fairness
Existing computational notions of fairness in the machine learning literature are largely inspired by the concept of discrimination in social sciences and law. These notions require the decision outcomes to ensure parity (i.e., equality) in treatment and in impact.
Notions based on parity in treatment require that the decision algorithm should not take into account the sensitive feature information (e.g., gender, race) of a user. Notions based on parity in impact require that the decision algorithm should give beneficial decision outcomes (e.g., granting a loan) to similar percentages of people from all sensitive feature groups (e.g., men, women).
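To make parity in impact concrete, the following is a minimal Python sketch that computes the rate of beneficial outcomes per sensitive feature group and checks whether the rates are similar. The function names, the toy data and the 0.05 tolerance are illustrative assumptions, not part of the work described here.

```python
from collections import defaultdict


def benefit_rates(decisions, groups):
    """Return the fraction of beneficial decisions (1) for each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for decision, group in zip(decisions, groups):
        totals[group] += 1
        positives[group] += decision
    return {group: positives[group] / totals[group] for group in totals}


def satisfies_parity_impact(decisions, groups, tolerance=0.05):
    """Check that the gap between group benefit rates is within `tolerance`.

    The tolerance is a hypothetical choice standing in for "similar percentages".
    """
    rates = benefit_rates(decisions, groups)
    return max(rates.values()) - min(rates.values()) <= tolerance


# Toy data: 1 = beneficial outcome (e.g., loan granted), 0 = not granted.
decisions = [1, 0, 1, 1, 0, 1, 0, 0]
groups = ["men", "men", "men", "men", "women", "women", "women", "women"]

rates = benefit_rates(decisions, groups)
print(rates)
print(satisfies_parity_impact(decisions, groups))
```

In this toy data, three of four men but only one of four women receive the beneficial outcome, so the check fails. Parity in treatment, by contrast, is a constraint on the inputs: the decision function simply never receives the sensitive feature.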
However, in many cases, these existing notions are too stringent and can lead to unexpected side effects. For example, ensuring parity has been shown to lead to significant reductions in prediction accuracy. Parity may also lead to scenarios where none of the groups involved in decision making (e.g., neither men nor women) get beneficial outcomes. In other words, these scenarios might be preferred neither by the decision maker using the algorithm (due to diminished accuracy), nor by the groups involved (due to very few benefits).
User preferences and fairness
In recent work, to appear at NIPS 2017, researchers at MPI-SWS have introduced two new computational notions of algorithmic fairness: preferred treatment and preferred impact. These notions are inspired by ideas related to envy-freeness and the bargaining problem in economics and game theory. Preferred treatment and preferred impact leverage these ideas to build more accurate solutions that are preferable for both the decision maker and the user groups.
The new notion of preferred treatment allows basing the decisions on sensitive feature information (thereby relaxing the parity treatment criterion) as long as the decision outcomes do not lead to envy. That is, each group of users prefers their own group membership over other groups and does not feel that presenting itself to the algorithm as another group would have led to better outcomes for the group.
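The envy-freeness condition can be sketched as a simple check. The Python example below assumes one decision rule per group and measures a group's benefit as the fraction of its members receiving the beneficial outcome; the helper names, the two-feature users and the threshold rules are hypothetical and serve only to illustrate the idea, not the researchers' training technique.

```python
def benefit(classifier, users):
    """Fraction of `users` that receive the beneficial outcome from `classifier`."""
    return sum(1 for user in users if classifier(user)) / len(users)


def satisfies_preferred_treatment(classifiers, users_by_group):
    """Return True if no group would benefit more under another group's rule."""
    for group, users in users_by_group.items():
        own_benefit = benefit(classifiers[group], users)
        for other_group, other_classifier in classifiers.items():
            if other_group != group and benefit(other_classifier, users) > own_benefit:
                return False  # `group` envies the rule applied to `other_group`
    return True


# Hypothetical users described by two non-sensitive features (a, b).
users_by_group = {
    "men": [(0.9, 0.1), (0.7, 0.2), (0.2, 0.8)],
    "women": [(0.1, 0.9), (0.3, 0.6), (0.8, 0.2)],
}

# Group-specific decision rules (illustrative thresholds).
classifiers = {
    "men": lambda user: user[0] >= 0.5,
    "women": lambda user: user[1] >= 0.5,
}

print(satisfies_preferred_treatment(classifiers, users_by_group))
```

Here each group receives the beneficial outcome for two of its three members under its own rule and for only one of three under the other group's rule, so neither group envies the other. Swapping the two rules would make both groups worse off than under the alternative, and the check would fail.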
The new notion of preferred impact allows differences in beneficial outcome rates for different groups (thereby relaxing the parity impact criterion) as long as all the groups get more beneficial outcomes than what they would have received under the parity impact criterion.
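As a rough illustration, preferred impact can be checked by comparing each group's beneficial outcome rate under a candidate classifier with its rate under a classifier that satisfies parity in impact. The Python sketch below assumes the rates are already computed; the function name and the numbers are made up for illustration, and accepting ties is an assumption of this sketch.

```python
def satisfies_preferred_impact(candidate_rates, parity_rates):
    """Return True if every group does at least as well as under parity in impact.

    Both arguments map a group name to its beneficial outcome rate.
    Ties are accepted here, which is an assumption of this sketch.
    """
    return all(candidate_rates[group] >= parity_rates[group] for group in parity_rates)


# Hypothetical rates under a classifier satisfying parity in impact.
parity_rates = {"men": 0.40, "women": 0.40}

# Unequal rates, but both groups improve: preferred impact holds.
candidate_a = {"men": 0.55, "women": 0.45}

# One group falls below its parity baseline: preferred impact is violated.
candidate_b = {"men": 0.60, "women": 0.35}

print(satisfies_preferred_impact(candidate_a, parity_rates))
print(satisfies_preferred_impact(candidate_b, parity_rates))
```

The first candidate gives the two groups different rates, which parity in impact would forbid, yet both groups are better off than under the parity baseline.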
In their work, MPI-SWS researchers have developed a technique to ensure that machine learning algorithms satisfy preferred treatment and/or preferred impact. They also tested their technique by designing crime-predicting machine learning algorithms that satisfy these notions. Their experiments show that preference-based fairness notions can provide significant gains in overall decision-making accuracy compared to parity-based fairness, while simultaneously increasing the beneficial outcomes for the groups involved.
This work is one of the most recent additions to an expanding set of techniques developed by MPI-SWS researchers to enable fairness, accountability and interpretability of machine learning algorithms.
References
Bilal Zafar, Isabel Valera, Manuel Gomez Rodriguez, Krishna Gummadi and Adrian Weller. From Parity to Preference: Learning with Cost-effective Notions of Fairness. Neural Information Processing Systems (NIPS), Long Beach (CA, USA), December 2017.