Fairness in Machine Learning
Delip Rao

Transcript of Fairness in Machine Learning

Page 1: Fairness in Machine Learning

Fairness in Machine Learning
Delip Rao

Page 3: Fairness in Machine Learning

Metrics

Page 4: Fairness in Machine Learning

Every ML practitioner's dream scenario:

- A well-defined evaluation objective

- Lots of clean data

- Rich data with lots of attributes

Page 5: Fairness in Machine Learning

Incorporating Ethnicity improves Engagement Metrics

But should you do it?

Page 12: Fairness in Machine Learning

Goldstein, the “computer expert”

Page 13: Fairness in Machine Learning

Dramatic Changes in Machine Learning Landscape

Page 14: Fairness in Machine Learning

Rise of fast/cheap data collection, processing

Page 15: Fairness in Machine Learning

Rise of popular, easy-to-use tools

Page 16: Fairness in Machine Learning

Rise of Data Scientist Factories

Page 18: Fairness in Machine Learning

Two Questions

Should everything that can be predicted, be predicted?

If you really have to predict, what should you be aware of?

Page 19: Fairness in Machine Learning

“Catalog of Evils”

Dwork et al 2011

Page 20: Fairness in Machine Learning

[Figure: the population, with the protected class S and its complement Sc.]

Page 21: Fairness in Machine Learning

Blatant Explicit Discrimination

Feature4231: Race='Black'

Page 22: Fairness in Machine Learning

Discrimination Based on Redundant Encoding

Drop Feature4231: Race='Black' and keep only Features = {'loc', 'income', ...}. A polynomial kernel with degree 2 implicitly builds conjunctions such as:

Feature6578: Loc='EastOakland' ^ Income='<10k'

which redundantly encodes the dropped attribute.
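This redundant-encoding risk can be sketched in code. All records below are synthetic and invented for illustration (including the "Piedmont" location); the point is only that a degree-2 conjunction of innocuous features can fire almost exclusively for one group, acting as a proxy for an attribute the model never sees:

```python
# Synthetic illustration (invented data): a degree-2 feature conjunction,
# of the kind a polynomial kernel builds implicitly, can act as a proxy
# for a protected attribute that was never given to the model.

records = [
    # (loc, income, race) -- race is NOT a model feature
    ("EastOakland", "<10k", "Black"),
    ("EastOakland", "<10k", "Black"),
    ("EastOakland", ">10k", "Black"),
    ("Piedmont", ">10k", "White"),
    ("Piedmont", ">10k", "White"),
    ("Piedmont", "<10k", "White"),
]

def conjunction(loc, income):
    # the cross-feature Loc='EastOakland' ^ Income='<10k'
    return loc == "EastOakland" and income == "<10k"

def firing_rate(race):
    # how often the conjunction fires within one group
    group = [r for r in records if r[2] == race]
    return sum(conjunction(l, i) for l, i, _ in group) / len(group)

print(firing_rate("Black"), firing_rate("White"))
```

In this toy data the conjunction fires only for one group, so any model weighting it has rediscovered the dropped attribute.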

Page 24: Fairness in Machine Learning

Big Data

There is no data like more data.

Page 25: Fairness in Machine Learning

Big Data

[Figure: learning curve plotting classifier error rate against the number of training examples in your data.]

Page 26: Fairness in Machine Learning

Most ML objective functions create models accurate for the majority class at the expense of the protected class.
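A minimal numeric sketch of how this hides in aggregate metrics, using invented (true, predicted) label pairs:

```python
# Invented labels: 90 majority-class examples, all classified correctly,
# and 10 protected-class examples, half of them misclassified.
majority  = [(1, 1)] * 90
protected = [(1, 1)] * 5 + [(1, 0)] * 5

def accuracy(pairs):
    # fraction of (true, predicted) pairs that agree
    return sum(t == p for t, p in pairs) / len(pairs)

print(accuracy(majority + protected))  # 0.95 overall: looks fine
print(accuracy(protected))             # 0.5 on the protected class
```

An objective that only sees the aggregate happily trades the small group's errors for overall accuracy.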

Page 27: Fairness in Machine Learning

Cultural differences can throw a wrench in your models

Page 28: Fairness in Machine Learning

Look at Error Cases vs. Error Rates

Macro metrics (accuracy, RMSE, F1, etc.)

vs.

Individual error cases

Page 29: Fairness in Machine Learning

Becoming Responsible Gatekeepers

Page 30: Fairness in Machine Learning

We are pretty good at learning function approximations today

Page 31: Fairness in Machine Learning

Image Credit: Jason Eisner, "The Three Cultures of ML", 2016

NNs & Decision Trees

Page 32: Fairness in Machine Learning

Need:

- Ways to characterize fairness

- Learning methods that introduce fairness

Page 33: Fairness in Machine Learning

How can we characterize fairness?

What does fairness even mean?

Group Fairness vs. Individual Fairness

Page 34: Fairness in Machine Learning

How can we characterize fairness?

One way to characterize group fairness is to ensure that both the majority and the protected population have similar outcomes, or:

P(FavorableOutcome | S) : P(FavorableOutcome | Sc) = 1 : 1

Page 35: Fairness in Machine Learning

In practice, this 1 : 1 ratio is often hard to achieve.

For example, for jobs, the EEOC specifies this ratio should be no less than 0.8 : 1 (aka the 80% rule).
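The 80% rule check reduces to a ratio of favorable-outcome rates. A sketch with invented outcome lists (1 = favorable):

```python
# Disparate-impact ratio behind the EEOC 80% rule, on invented outcomes.

def favorable_rate(outcomes):
    # fraction of individuals receiving the favorable outcome
    return sum(outcomes) / len(outcomes)

s_outcomes  = [1, 0, 0, 1, 0]   # members of the protected class S
sc_outcomes = [1, 1, 0, 1, 1]   # members of the complement Sc

ratio = favorable_rate(s_outcomes) / favorable_rate(sc_outcomes)
print(ratio)            # 0.5: below 0.8, so this fails the 80% rule
```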

Page 36: Fairness in Machine Learning

Characterizing Fairness of a Black-Box Classifier

One way: Is the classifier outcome correlated with membership in S?

Page 37: Fairness in Machine Learning

Fairness as a constraint

Is the classifier outcome correlated with membership in S?

Let z denote the sensitive attributes and d_theta(x) the decision function of the classifier. We want the decision values to be uncorrelated with z: Cov(z, d_theta(x)) ≈ 0.

Page 38: Fairness in Machine Learning

Fairness as a constraint

Constraint to be added: |Cov(z, d_theta(x))| <= c

Page 39: Fairness in Machine Learning

Supervised Learning with a Fairness Constraint

minimize L(theta)

such that |Cov(z, d_theta(x))| <= c

Zafar et al., 2015
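A minimal sketch of this constrained objective, assuming a plain logistic-regression loss and relaxing the hard covariance constraint to a squared penalty (this is not Zafar et al.'s exact formulation, which optimizes under the constraint directly). All data, step sizes, and penalty weights below are invented for illustration:

```python
import math
import random

# Sketch: logistic regression plus a squared-covariance fairness penalty.
# The hard constraint |Cov(z, theta.x)| <= c is relaxed to a penalty term;
# data and hyperparameters are synthetic, invented for this sketch.

random.seed(0)
data = []  # (features x, label y, sensitive attribute z)
for _ in range(200):
    z = 1.0 if random.random() < 0.5 else 0.0
    x = [1.0, random.gauss(0, 1), random.gauss(1 if z else -1, 1)]
    y = 1 if x[2] > 0 else 0  # label leaks through the z-correlated feature
    data.append((x, y, z))

n = len(data)
zbar = sum(z for _, _, z in data) / n

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def covariance(theta):
    # Cov(z, theta.x) over the training set
    return sum((z - zbar) * dot(theta, x) for x, _, z in data) / n

def train(lam, steps=300, lr=0.1):
    theta = [0.0, 0.0, 0.0]
    for _ in range(steps):
        cov = covariance(theta)
        grad = [0.0, 0.0, 0.0]
        for x, y, z in data:
            err = sigmoid(dot(theta, x)) - y    # logistic-loss gradient term
            pen = 2.0 * lam * cov * (z - zbar)  # penalty gradient term
            for j in range(3):
                grad[j] += (err + pen) * x[j] / n
        theta = [t - lr * g for t, g in zip(theta, grad)]
    return theta

unconstrained = train(lam=0.0)
constrained = train(lam=10.0)
print(abs(covariance(unconstrained)), abs(covariance(constrained)))
```

With the penalty switched on, the learned decision values decorrelate from z, at some cost in accuracy on the z-leaking feature.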

Page 40: Fairness in Machine Learning

“If we allowed a model to be used for college admissions in 1870, we’d still have 0.7% of women going to college.”

Recommended Reading

Page 41: Fairness in Machine Learning

Reading List

There is much material on fairness in data-driven decision and policy making in the literature of:

- law

- sociology

- political science

- computer science/machine learning

- economics

(The machine learning literature is nascent, dating only from around 2009 onwards.)

Page 42: Fairness in Machine Learning

Reading List (Fairness in ML)

Pedreschi, Dino, Salvatore Ruggieri, and Franco Turini. "Measuring Discrimination in Socially-Sensitive Decision Records." SDM 2009.

Kamiran, Faisal, and Toon Calders. "Classifying without Discriminating." 2nd International Conference on Computer, Control and Communication (IC4), IEEE, 2009.

Dwork, Cynthia, et al. "Fairness through Awareness." Proceedings of the 3rd Innovations in Theoretical Computer Science Conference, ACM, 2012.

Romei, Andrea, and Salvatore Ruggieri. "A Multidisciplinary Survey on Discrimination Analysis." The Knowledge Engineering Review 29.05 (2014).

Page 43: Fairness in Machine Learning

Reading List (Fairness in ML)

Friedler, Sorelle, Carlos Scheidegger, and Suresh Venkatasubramanian. "Certifying and Removing Disparate Impact." CoRR (2014).

Barocas, Solon, and Andrew D. Selbst. "Big Data's Disparate Impact." California Law Review, Vol. 104 (2016).

Zafar, Muhammad Bilal, et al. "Fairness Constraints: A Mechanism for Fair Classification." arXiv preprint arXiv:1507.05259 (2015).

Zliobaite, Indre. "On the Relation between Accuracy and Fairness in Binary Classification." arXiv preprint arXiv:1505.05723 (2015).

Page 44: Fairness in Machine Learning

Other Resources

NSF's "Big Data Innovation Hubs" were created in part to address these challenges: http://www.nsf.gov/news/news_summ.jsp?cntn_id=136784

The Stanford Law Review touches upon this topic regularly: http://www.stanfordlawreview.org/online/privacy-and-big-data

Fairness blog: http://fairness.haverford.edu

Academic: FATML workshops (NIPS 2014, ICML 2015): www.fatml.org

Page 45: Fairness in Machine Learning

Lessons

Discrimination is an emergent property of any learning algorithm

Watch out for discrimination (implicitly) encoded in features

Big Data can cause Big Problems

Watch out for the proportion of the protected classes

Always do error analysis with protected classes in mind

Notions of fairness are nascent at best. Involve as many people as possible to improve understanding.

There is no one best notion of fairness

Page 46: Fairness in Machine Learning

Questions?

@deliprao / [email protected]