awesome-fairness-papers

Papers about fairness in NLP

Jieyu Zhao, Emily Sheng, Sunipa Dev, Yu (Hope) Hou, Nanyun (Violet) Peng, and Kai-Wei Chang

Background

Fairness, accountability, transparency, and ethics are increasingly important in Natural Language Processing (NLP). We provide a list of papers that serve as references for researchers interested in these topics. This repo focuses mainly on papers published in NLP venues, but we also point to some other resources at the end.

For relevant courses and other resources, please refer to the ACL Wiki.

Disclaimer: This list may be missing some relevant papers. If you have suggestions or would like to add papers, please submit a pull request or email us. Your contributions are much appreciated!

Language (Technology) is Power: A Critical Survey of "Bias" in NLP, Blodgett, Su Lin and Barocas, Solon and Daumé III, Hal and Wallach, Hanna, 2020
Predictive Biases in Natural Language Processing Models: A Conceptual Framework and Overview, Shah, Deven Santosh and Schwartz, H. Andrew and Hovy, Dirk, 2020
Mitigating Gender Bias in Natural Language Processing: Literature Review, Sun, Tony and Gaut, Andrew and Tang, Shirlyn and Huang, Yuxin and ElSherief, Mai and Zhao, Jieyu and Mirza, Diba and Belding, Elizabeth and Chang, Kai-Wei and Wang, William Yang, 2019
A survey on bias and fairness in machine learning, Mehrabi, Ninareh and Morstatter, Fred and Saxena, Nripsuta and Lerman, Kristina and Galstyan, Aram, 2019
50 years of test (Un)fairness: Lessons for machine learning, Hutchinson, Ben and Mitchell, Margaret, 2019
Societal Biases in Language Generation: Progress and Challenges, Sheng, Emily and Chang, Kai-Wei and Natarajan, Prem and Peng, Nanyun, 2021
Gender Bias in Machine Translation, Savoldi, Beatrice and Gaido, Marco and Bentivogli, Luisa and Negri, Matteo and Turchi, Marco, 2021
Quantifying Social Biases in NLP: A Generalization and Empirical Comparison of Extrinsic Fairness Metrics, Czarnowska, Paula and Vyas, Yogarshi and Shah, Kashif, 2021
Confronting Abusive Language Online: A Survey from the Ethical and Human Rights Perspective, Kiritchenko, Svetlana and Nejadgholi, Isar and Fraser, Kathleen C, 2020
On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜, Bender, Emily M. and Gebru, Timnit and McMillan-Major, Angelina and Shmitchell, Shmargaret, 2021

Social Impact of Biases

The Social Impact of Natural Language Processing, Hovy, Dirk and Spruit, Shannon L., 2016
Give Me Convenience and Give Her Death: Who Should Decide What Uses of NLP are Appropriate, and on What Basis?, Leins, Kobi and Lau, Jey Han and Baldwin, Timothy, 2020
Situated Data, Situated Systems: A Methodology to Engage with Power Relations in Natural Language Processing Research, Havens, Lucy and Terras, Melissa and Bach, Benjamin and Alex, Beatrice, 2020
Re-imagining Algorithmic Fairness in India and Beyond, Sambasivan, Nithya and Arnesen, Erin and Hutchinson, Ben and Doshi, Tulsee and Prabhakaran, Vinodkumar, 2021
Improving fairness in machine learning systems: What do industry practitioners need?, Holstein, Kenneth and Wortman Vaughan, Jennifer and Daumé III, Hal and Dudik, Miro and Wallach, Hanna, 2019
The problem with bias: Allocative versus representational harms in machine learning, Barocas, Solon and Crawford, Kate and Shapiro, Aaron and Wallach, Hanna, 2017
The many dimensions of algorithmic fairness in educational applications, Loukina, Anastassia and Madnani, Nitin and Zechner, Klaus, 2019

Data, Models, & Metrics

Data Statements for Natural Language Processing: Toward Mitigating System Bias and Enabling Better Science, Bender, Emily M. and Friedman, Batya, 2018
Data and its (dis)contents: A survey of dataset development and use in machine learning research, Paullada, Amandalynne and Raji, Inioluwa Deborah and Bender, Emily M and Denton, Emily and Hanna, Alex, 2020
Datasheets for datasets, Gebru, Timnit and Morgenstern, Jamie and Vecchione, Briana and Vaughan, Jennifer Wortman and Wallach, Hanna and Daumé III, Hal and Crawford, Kate, 2018
Discovering and categorising language biases in reddit, Ferrer, Xavier and van Nuenen, Tom and Such, Jose M. and Criado, Natalia, 2021
Model cards for model reporting, Mitchell, Margaret and Wu, Simone and Zaldivar, Andrew and Barnes, Parker and Vasserman, Lucy and Hutchinson, Ben and Spitzer, Elena and Raji, Inioluwa Deborah and Gebru, Timnit, 2019
Counterfactual fairness, Kusner, Matt J and Loftus, Joshua and Russell, Chris and Silva, Ricardo, 2017
Fairness through awareness, Dwork, Cynthia and Hardt, Moritz and Pitassi, Toniann and Reingold, Omer and Zemel, Richard, 2012
Equality of opportunity in supervised learning, Hardt, Moritz and Price, Eric and Srebro, Nati, 2016
The price of debiasing automatic metrics in natural language evaluation, Chaganty, Arun and Mussmann, Stephen and Liang, Percy, 2018
Are We Modeling the Task or the Annotator? An Investigation of Annotator Bias in Natural Language Understanding Datasets, Geva, Mor and Goldberg, Yoav and Berant, Jonathan, 2019
Proposed Taxonomy for Gender Bias in Text; A Filtering Methodology for the Gender Generalization Subtype, Hitti, Yasmeen and Jang, Eunbee and Moreno, Ines and Pelletier, Carolyne, 2019
These are not the Stereotypes You are Looking For: Bias and Fairness in Authorial Gender Attribution, Koolen, Corina and van Cranenburgh, Andreas, 2017
Discovering Biased News Articles Leveraging Multiple Human Annotations, Lazaridou, Konstantina and Löser, Alexander and Mestre, Maria and Naumann, Felix, 2020
Annotating and Analyzing Biased Sentences in News Articles using Crowdsourcing, Lim, Sora and Jatowt, Adam and Färber, Michael and Yoshikawa, Masatoshi, 2020
Differentially Private Representation for NLP: Formal Guarantee and An Empirical Study on Privacy and Fairness, Lyu, Lingjuan and He, Xuanli and Li, Yitong, 2020
Building Better Open-Source Tools to Support Fairness in Automated Scoring, Madnani, Nitin and Loukina, Anastassia and von Davier, Alina and Burstein, Jill and Cahill, Aoife, 2017
StereoSet: Measuring stereotypical bias in pretrained language models, Nadeem, Moin and Bethke, Anna and Reddy, Siva, 2020
Investigating Sports Commentator Bias within a Large Corpus of American Football Broadcasts, Merullo, Jack and Yeh, Luke and Handler, Abram and Grissom II, Alvin and O'Connor, Brendan and Iyyer, Mohit, 2019
Artie Bias Corpus: An Open Dataset for Detecting Demographic Bias in Speech Applications, Meyer, Josh and Rauchenstein, Lindy and Eisenberg, Joshua D. and Howell, Nicholas, 2020
RtGender: A Corpus for Studying Differential Responses to Gender, Voigt, Rob and Jurgens, David and Prabhakaran, Vinodkumar and Jurafsky, Dan and Tsvetkov, Yulia, 2018
Multi-Dimensional Gender Bias Classification, Dinan, Emily and Fan, Angela and Wu, Ledell and Weston, Jason and Kiela, Douwe and Williams, Adina, 2020
UNQOVERing Stereotyping Biases via Underspecified Questions, Li, Tao and Khashabi, Daniel and Khot, Tushar and Sabharwal, Ashish and Srikumar, Vivek, 2020
CrowS-Pairs: A Challenge Dataset for Measuring Social Biases in Masked Language Models, Nangia, Nikita and Vania, Clara and Bhalerao, Rasika and Bowman, Samuel R., 2020
Gender Bias in Coreference Resolution, Rudinger, Rachel and Naradowsky, Jason and Leonard, Brian and Van Durme, Benjamin, 2018
Gender Bias in Coreference Resolution: Evaluation and Debiasing Methods, Zhao, Jieyu and Wang, Tianlu and Yatskar, Mark and Ordonez, Vicente and Chang, Kai-Wei, 2018
Unmasking the Mask -- Evaluating Social Biases in Masked Language Models, Kaneko, Masahiro and Bollegala, Danushka, 2021
WIKIBIAS: Detecting Multi-Span Subjective Biases in Language, Zhong, Yang and Yang, Jingfeng and Xu, Wei and Yang, Diyi, EMNLP, 2021
Benchmarking Bias Mitigation Algorithms in Representation Learning through Fairness Metrics, Charan Reddy, Deepak Sharma, Soroush Mehri, Adriana Romero, Samira Shabanian, Sina Honari. NeurIPS, 2021.

Word/Sentence Representations

Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings, Bolukbasi, Tolga and Chang, Kai-Wei and Zou, James and Saligrama, Venkatesh and Kalai, Adam, 2016 [github]
Semantics derived automatically from language corpora contain human-like biases, Caliskan, Aylin and Bryson, Joanna J. and Narayanan, Arvind, 2017
Attenuating Biases in Word Vectors, Dev, Sunipa and Phillips, Jeff M, 2019
Gender Bias in Contextualized Word Embeddings, Zhao, Jieyu and Wang, Tianlu and Yatskar, Mark and Cotterell, Ryan and Ordonez, Vicente and Chang, Kai-Wei, 2019
Black is to Criminal as Caucasian is to Police: Detecting and Removing Multiclass Bias in Word Embeddings, Manzini, Thomas and Yao Chong, Lim and Black, Alan W and Tsvetkov, Yulia, 2019
Towards Understanding Linear Word Analogies, Ethayarajh, Kawin and Duvenaud, David and Hirst, Graeme, 2019
Understanding Undesirable Word Embedding Associations, Ethayarajh, Kawin and Duvenaud, David and Hirst, Graeme, 2019
Gender Bias in Multilingual Embeddings and Cross-Lingual Transfer, Zhao, Jieyu and Mukherjee, Subhabrata and Hosseini, Saghar and Chang, Kai-Wei and Hassan Awadallah, Ahmed, 2020
Nurse is Closer to Woman than Surgeon? Mitigating Gender-Biased Proximities in Word Embeddings, Kumar, Vaibhav and Bhotia, Tenzin Singhay and Kumar, Vaibhav and Chakraborty, Tanmoy, 2020
Measuring Bias in Contextualized Word Representations, Kurita, Keita and Vyas, Nidhi and Pareek, Ayush and Black, Alan W and Tsvetkov, Yulia, 2019
Unmasking Contextual Stereotypes: Measuring and Mitigating BERT's Gender Bias, Bartl, Marion and Nissim, Malvina and Gatt, Albert, 2020
Evaluating the Underlying Gender Bias in Contextualized Word Embeddings, Basta, Christine and Costa-jussà, Marta R. and Casas, Noe, 2019
Evaluating Bias In Dutch Word Embeddings, Chávez Mulsa, Rodrigo Alejandro and Spanakis, Gerasimos, 2020
Learning Gender-Neutral Word Embeddings, Zhao, Jieyu and Zhou, Yichao and Li, Zeyu and Wang, Wei and Chang, Kai-Wei, 2018
Lipstick on a Pig: Debiasing Methods Cover up Systematic Gender Biases in Word Embeddings But do not Remove Them, Gonen, Hila and Goldberg, Yoav, 2019
It's All in the Name: Mitigating Gender Bias with Name-Based Counterfactual Data Substitution, Hall Maudslay, Rowan and Gonen, Hila and Cotterell, Ryan and Teufel, Simone, 2019
Gender-preserving Debiasing for Pre-trained Word Embeddings, Kaneko, Masahiro and Bollegala, Danushka, 2019
Debiasing Pre-trained Contextualised Embeddings, Kaneko, Masahiro and Bollegala, Danushka, 2021
Dictionary-based Debiasing of Pre-trained Word Embeddings, Kaneko, Masahiro and Bollegala, Danushka, 2021
Conceptor Debiasing of Word Representations Evaluated on WEAT, Karve, Saket and Ungar, Lyle and Sedoc, João, 2019
Are We Consistently Biased? Multidimensional Analysis of Biases in Distributional Word Vectors, Lauscher, Anne and Glavaš, Goran, 2019
AraWEAT: Multidimensional Analysis of Biases in Arabic Word Embeddings, Lauscher, Anne and Takieddin, Rafik and Ponzetto, Simone Paolo and Glavaš, Goran, 2020
Unequal Representations: Analyzing Intersectional Biases in Word Embeddings Using Representational Similarity Analysis, Lepori, Michael, 2020
Monolingual and Multilingual Reduction of Gender Bias in Contextualized Representations, Liang, Sheng and Dufter, Philipp and Schütze, Hinrich, 2020
Towards Debiasing Sentence Representations, Liang, Paul Pu and Li, Irene Mengze and Zheng, Emily and Lim, Yao Chong and Salakhutdinov, Ruslan and Morency, Louis-Philippe, 2020
On Measuring Social Biases in Sentence Encoders, May, Chandler and Wang, Alex and Bordia, Shikha and Bowman, Samuel R. and Rudinger, Rachel, 2019
Fair Is Better than Sensational: Man Is to Doctor as Woman Is to Doctor, Nissim, Malvina and van Noord, Rik and van der Goot, Rob, 2020
Gender Bias in Pretrained Swedish Embeddings, Sahlgren, Magnus and Olsson, Fredrik, 2019
Is Wikipedia succeeding in reducing gender bias? Assessing changes in gender bias in Wikipedia using word embeddings, Schmahl, Katja Geertruida and Viering, Tom Julian and Makrodimitris, Stavros and Naseri Jahfari, Arman and Tax, David and Loog, Marco, 2020
The Role of Protected Class Word Lists in Bias Identification of Contextualized Word Representations, Sedoc, João and Ungar, Lyle, 2019
Neutralizing Gender Bias in Word Embeddings with Latent Disentanglement and Counterfactual Generation, Shin, Seungjae and Song, Kyungwoo and Jang, JoonHo and Kim, Hyemi and Joo, Weonyoung and Moon, Il-Chul, 2020
A Transparent Framework for Evaluating Unintended Demographic Bias in Word Embeddings, Sweeney, Chris and Najafian, Maryam, 2019
Can Existing Methods Debias Languages Other than English? First Attempt to Analyze and Mitigate Japanese Word Embeddings, Takeshita, Masashi and Katsumata, Yuki and Rzepka, Rafal and Araki, Kenji, 2020
Exploring the Linear Subspace Hypothesis in Gender Bias Mitigation, Vargas, Francisco and Cotterell, Ryan, 2020
Double-Hard Debias: Tailoring Word Embeddings for Gender Bias Mitigation, Wang, Tianlu and Lin, Xi Victoria and Rajani, Nazneen Fatema and McCann, Bryan and Ordonez, Vicente and Xiong, Caiming, 2020
Robustness and Reliability of Gender Bias Assessment in Word Embeddings: The Role of Base Pairs, Zhang, Haiyang and Sneyd, Alison and Stevenson, Mark, 2020
Diachronic Word Embeddings Reveal Statistical Laws of Semantic Change, Hamilton, William L. and Leskovec, Jure and Jurafsky, Dan, 2016 [github]
Measuring Gender Bias in Word Embeddings across Domains and Discovering New Gender Bias Word Categories, Chaloner, Kaytlin and Maldonado, Alfredo, 2019
Relating Word Embedding Gender Biases to Gender Gaps: A Cross-Cultural Analysis, Friedman, Scott and Schmer-Galunder, Sonja and Chen, Anthony and Rye, Jeffrey, 2019
Modeling Personal Biases in Language Use by Inducing Personalized Word Embeddings, Oba, Daisuke and Yoshinaga, Naoki and Sato, Shoetsu and Akasaki, Satoshi and Toyoda, Masashi, 2019
Quantifying 60 Years of Gender Bias in Biomedical Research with Word Embeddings, Rios, Anthony and Joshi, Reenam and Shin, Hejin, 2020
Debiasing knowledge graph embeddings, Fisher, Joseph and Mittal, Arpit and Palfrey, Dave and Christodoulopoulos, Christos, 2020
Assessing the Reliability of Word Embedding Gender Bias Measures, Du, Yupei and Fang, Qixiang and Nguyen, Dong, EMNLP, 2021
Harms of Gender Exclusivity and Challenges in Non-Binary Representation in Language Technologies, Dev, Sunipa and Monajatipoor, Masoud and Ovalle, Anaelia and Subramonian, Arjun and Phillips, Jeff M and Chang, Kai-Wei, EMNLP, 2021

Natural Language Understanding

Bias Amplification Issue

Men Also Like Shopping: Reducing Gender Bias Amplification using Corpus-level Constraints, Zhao, Jieyu and Wang, Tianlu and Yatskar, Mark and Ordonez, Vicente and Chang, Kai-Wei, 2017
Feature-Wise Bias Amplification, Klas Leino, Emily Black, Matt Fredrikson, Shayak Sen, Anupam Datta. ICLR, 2019.
Mitigating Gender Bias Amplification in Distribution by Posterior Regularization, Jia, Shengyu and Meng, Tao and Zhao, Jieyu and Chang, Kai-Wei, 2020
Fairness Without Demographics in Repeated Loss Minimization, Tatsunori B. Hashimoto, Megha Srivastava, Hongseok Namkoong, Percy Liang, ICML, 2018

Bias Detection

LOGAN: Local Group Bias Detection by Clustering, Zhao, Jieyu and Chang, Kai-Wei, 2020
Examining Gender Bias in Languages with Grammatical Gender, Zhou, Pei and Shi, Weijia and Zhao, Jieyu and Huang, Kuan-Hao and Chen, Muhao and Cotterell, Ryan and Chang, Kai-Wei, 2019
Racial Bias in Hate Speech and Abusive Language Detection Datasets, Davidson, Thomas and Bhattacharya, Debasmita and Weber, Ingmar, 2019
Social Biases in NLP Models as Barriers for Persons with Disabilities, Hutchinson, Ben and Prabhakaran, Vinodkumar and Denton, Emily and Webster, Kellie and Zhong, Yu and Denuyl, Stephen, 2020
Perturbation Sensitivity Analysis to Detect Unintended Model Biases, Prabhakaran, Vinodkumar and Hutchinson, Ben and Mitchell, Margaret, 2019
OSCaR: Orthogonal Subspace Correction and Rectification of Biases in Word Embeddings, Dev, Sunipa and Li, Tao and Phillips, Jeff M and Srikumar, Vivek, 2020
Women's Syntactic Resilience and Men's Grammatical Luck: Gender-Bias in Part-of-Speech Tagging and Dependency Parsing, Garimella, Aparna and Banea, Carmen and Hovy, Dirk and Mihalcea, Rada, 2019
Towards Understanding Gender Bias in Relation Extraction, Gaut, Andrew and Sun, Tony and Tang, Shirlyn and Huang, Yuxin and Qian, Jing and ElSherief, Mai and Zhao, Jieyu and Mirza, Diba and Belding, Elizabeth and Chang, Kai-Wei and Wang, William Yang, 2020
Type B Reflexivization as an Unambiguous Testbed for Multilingual Multi-Task Gender Bias, González, Ana Valeria and Barrett, Maria and Hvingelby, Rasmus and Webster, Kellie and Søgaard, Anders, 2020
Inflating Topic Relevance with Ideology: A Case Study of Political Ideology Bias in Social Topic Detection Models, Guo, Meiqi and Hwa, Rebecca and Lin, Yu-Ru and Chung, Wen-Ting, 2020
Media Bias, the Social Sciences, and NLP: Automating Frame Analyses to Identify Bias by Word Choice and Labeling, Hamborg, Felix, 2020
An Annotation Scheme for Automated Bias Detection in Wikipedia, Herzig, Livnat and Nunes, Alex and Snir, Batia, 2011
Multilingual Twitter Corpus and Baselines for Evaluating Demographic Bias in Hate Speech Recognition, Huang, Xiaolei and Xing, Linzi and Dernoncourt, Franck and Paul, Michael J., 2020
Enhancing Bias Detection in Political News Using Pragmatic Presupposition, Kameswari, Lalitha and Sravani, Dama and Mamidi, Radhika, 2020
Examining Gender and Race Bias in Two Hundred Sentiment Analysis Systems, Kiritchenko, Svetlana and Mohammad, Saif, 2018
Social Bias in Elicited Natural Language Inferences, Rudinger, Rachel and May, Chandler and Van Durme, Benjamin, 2017
The Risk of Racial Bias in Hate Speech Detection, Sap, Maarten and Card, Dallas and Gabriel, Saadia and Choi, Yejin and Smith, Noah A., 2019
Social Bias Frames: Reasoning about Social and Power Implications of Language, Sap, Maarten and Gabriel, Saadia and Qin, Lianhui and Jurafsky, Dan and Smith, Noah A. and Choi, Yejin, 2020
Do Neural Language Models Overcome Reporting Bias?, Shwartz, Vered and Choi, Yejin, 2020
Context in Informational Bias Detection, van den Berg, Esther and Markert, Katja, 2020
Comparative Evaluation of Label-Agnostic Selection Bias in Multilingual Hate Speech Datasets, Ousidhoum, Nedjma and Song, Yangqiu and Yeung, Dit-Yan, 2020
Detecting and Reducing Bias in a High Stakes Domain, Zhong, Ruiqi and Chen, Yanda and Patton, Desmond and Selous, Charlotte and McKeown, Kathy, 2019
Measuring the Effects of Bias in Training Data for Literary Classification, Bagga, Sunyam and Piper, Andrew, 2020
Unsupervised Discovery of Implicit Gender Bias, Field, Anjalie and Tsvetkov, Yulia, 2020
Evaluating Debiasing Techniques for Intersectional Biases, Subramanian, Shivashankar and Han, Xudong and Baldwin, Timothy and Cohn, Trevor and Frermann, Lea, EMNLP, 2021
Towards Automatic Bias Detection in Knowledge Graphs, Keidar, Daphna and Zhong, Mian and Zhang, Ce and Shrestha, Yash Raj and Paudel, Bibek, EMNLP, 2021
Uncovering Implicit Gender Bias in Narratives through Commonsense Inference, Huang, Tenghao and Brahman, Faeze and Shwartz, Vered and Chaturvedi, Snigdha, EMNLP, 2021
Is Your Classifier Actually Biased? Measuring Fairness under Uncertainty with Bernstein Bounds, Ethayarajh, Kawin, 2020

Bias Mitigation

Reducing Gender Bias in Abusive Language Detection, Park, Ji Ho and Shin, Jamin and Fung, Pascale, 2018
On Measuring and Mitigating Biased Inferences of Word Embeddings, Dev, Sunipa and Li, Tao and Phillips, Jeff M and Srikumar, Vivek, 2019
Debiasing Embeddings for Reduced Gender Bias in Text Classification, Prost, Flavien and Thain, Nithum and Bolukbasi, Tolga, 2019
Reducing Gender Bias in Word-Level Language Models with a Gender-Equalizing Loss Function, Qian, Yusu and Muaz, Urwa and Zhang, Ben and Hyun, Jae Won, 2019
Linguistic Models for Analyzing and Detecting Biased Language, Recasens, Marta and Danescu-Niculescu-Mizil, Cristian and Jurafsky, Dan, 2013
What's in a Name? Reducing Bias in Bios without Access to Protected Attributes, Romanov, Alexey and De-Arteaga, Maria and Wallach, Hanna and Chayes, Jennifer and Borgs, Christian and Chouldechova, Alexandra and Geyik, Sahin and Kenthapadi, Krishnaram and Rumshisky, Anna and Kalai, Adam, 2019
Demoting Racial Bias in Hate Speech Detection, Xia, Mengzhou and Field, Anjalie and Tsvetkov, Yulia, 2020
Balanced Datasets Are Not Enough: Estimating and Mitigating Gender Bias in Deep Image Representations, Wang, Tianlu and Zhao, Jieyu and Yatskar, Mark and Chang, Kai-Wei and Ordonez, Vicente, 2019
Fairness without Demographics through Adversarially Reweighted Learning, Preethi Lahoti, Alex Beutel, Jilin Chen, Kang Lee, Flavien Prost, Nithum Thain, Xuezhi Wang, Ed H. Chi, 2020
On Dyadic Fairness: Exploring and Mitigating Bias in Graph Connections, Peizhao Li, Yifei Wang, Han Zhao, Pengyu Hong, Hongfu Liu, 2021
Challenges in Automated Debiasing for Toxic Language Detection, Zhou, Xuhui and Sap, Maarten and Swayamdipta, Swabha and Choi, Yejin and Smith, Noah A, 2021
Mitigating Language-Dependent Ethnic Bias in BERT, Ahn, Jaimeen and Oh, Alice, EMNLP, 2021
Sustainable Modular Debiasing of Language Models, Lauscher, Anne and Lüken, Tobias and Glavaš, Goran, EMNLP, 2021

Natural Language Generation

Machine Translation

Towards Mitigating Gender Bias in a decoder-based Neural Machine Translation model by Adding Contextual Information, Basta, Christine and Costa-jussà, Marta R. and Fonollosa, José A. R., 2020
On Measuring Gender Bias in Translation of Gender-neutral Pronouns, Cho, Won Ik and Kim, Ji Won and Kim, Seok Min and Kim, Nam Soo, 2019
Fine-tuning Neural Machine Translation on Gender-Balanced Datasets, Costa-jussà, Marta R. and de Jorge, Adrià, 2020
Equalizing Gender Bias in Neural Machine Translation with Word Embeddings Techniques, Escudé Font, Joel and Costa-jussà, Marta R., 2019
Automatically Identifying Gender Issues in Machine Translation using Perturbations, Gonen, Hila and Webster, Kellie, 2020
Gender Coreference and Bias Evaluation at WMT 2020, Kocmi, Tom and Limisiewicz, Tomasz and Stanovsky, Gabriel, 2020
Filling Gender & Number Gaps in Neural Machine Translation with Black-box Context Injection, Moryossef, Amit and Aharoni, Roee and Goldberg, Yoav, 2019
Reducing Gender Bias in Neural Machine Translation as a Domain Adaptation Problem, Saunders, Danielle and Byrne, Bill, 2020
Neural Machine Translation Doesn't Translate Gender Coreference Right Unless You Make It, Saunders, Danielle and Sallis, Rosie and Byrne, Bill, 2020
Mitigating Gender Bias in Machine Translation with Target Gender Annotations, Stafanovičs, Artūrs and Pinnis, Mārcis and Bergmanis, Toms, 2020
Evaluating Gender Bias in Machine Translation, Stanovsky, Gabriel and Smith, Noah A. and Zettlemoyer, Luke, 2019
Getting Gender Right in Neural Machine Translation, Vanmassenhove, Eva and Hardmeier, Christian and Way, Andy, 2018
"You Sound Just Like Your Father" Commercial Machine Translation Systems Include Stylistic Biases, Hovy, Dirk and Bianchi, Federico and Fornaciari, Tommaso, 2020
Assessing gender bias in machine translation: a case study with google translate, Prates, Marcelo O. R. and Avelar, Pedro H. C. and Lamb, Luis, 2019
Gender Bias in Multilingual Neural Machine Translation: The Architecture Matters, Costa-jussà, Marta R. and Escolano, Carlos and Basta, Christine and Ferrando, Javier and Batlle, Roser and Kharitonova, Ksenia, 2020
How to Measure Gender Bias in Machine Translation: Optimal Translators, Multiple Reference Points, Farkas, Anna and Németh, Renáta, 2020
Gender aware spoken language translation applied to English-Arabic, Elaraby, Mostafa and Tawfik, Ahmed Y and Khaled, Mahmoud and Hassan, Hany and Osama, Aly, 2018
Type B Reflexivization as an Unambiguous Testbed for Multilingual Multi-Task Gender Bias, González, Ana Valeria and Barrett, Maria and Hvingelby, Rasmus and Webster, Kellie and Søgaard, Anders, 2020

Dialogue Generation

Conversational Assistants and Gender Stereotypes: Public Perceptions and Desiderata for Voice Personas, Cercas Curry, Amanda and Robertson, Judy and Rieser, Verena, 2020
Does Gender Matter? Towards Fairness in Dialogue Systems, Liu, Haochen and Dacon, Jamell and Fan, Wenqi and Liu, Hui and Liu, Zitao and Tang, Jiliang, 2020
Mitigating Gender Bias for Neural Dialogue Generation with Adversarial Learning, Liu, Haochen and Wang, Wentao and Wang, Yiqi and Liu, Hui and Liu, Zitao and Tang, Jiliang, 2020
Queens are Powerful too: Mitigating Gender Bias in Dialogue Generation, Dinan, Emily and Fan, Angela and Williams, Adina and Urbanek, Jack and Kiela, Douwe and Weston, Jason, 2020
Ethical challenges in data-driven dialogue systems, Henderson, Peter and Sinha, Koustuv and Angelard-Gontier, Nicolas and Ke, Nan Rosemary and Fried, Genevieve and Lowe, Ryan and Pineau, Joelle, 2018
"Nice Try, Kiddo": Ad Hominems in Dialogue Systems, Sheng, Emily and Chang, Kai-Wei and Natarajan, Premkumar and Peng, Nanyun, 2020

Other Generation

Gender-Aware Reinflection using Linguistically Enhanced Neural Models, Alhafni, Bashar and Habash, Nizar and Bouamor, Houda, 2020
Identifying and Reducing Gender Bias in Word-Level Language Models, Bordia, Shikha and Bowman, Samuel R., 2019
Investigating African-American Vernacular English in Transformer-Based Text Generation, Groenwold, Sophie and Ou, Lily and Parekh, Aesha and Honnavalli, Samhita and Levy, Sharon and Mirza, Diba and Wang, William Yang, 2020
Automatic Gender Identification and Reinflection in Arabic, Habash, Nizar and Bouamor, Houda and Chung, Christine, 2019
Reducing Sentiment Bias in Language Models via Counterfactual Evaluation, Huang, Po-Sen and Zhang, Huan and Jiang, Ray and Stanforth, Robert and Welbl, Johannes and Rae, Jack and Maini, Vishal and Yogatama, Dani and Kohli, Pushmeet, 2020
PowerTransformer: Unsupervised Controllable Revision for Biased Language Correction, Ma, Xinyao and Sap, Maarten and Rashkin, Hannah and Choi, Yejin, 2020
Reducing Non-Normative Text Generation from Language Models, Peng, Xiangyu and Li, Siyan and Frazier, Spencer and Riedl, Mark, 2020
The Woman Worked as a Babysitter: On Biases in Language Generation, Sheng, Emily and Chang, Kai-Wei and Natarajan, Premkumar and Peng, Nanyun, 2019
Towards Controllable Biases in Language Generation, Sheng, Emily and Chang, Kai-Wei and Natarajan, Premkumar and Peng, Nanyun, 2020
RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language Models, Gehman, Sam and Gururangan, Suchin and Sap, Maarten and Choi, Yejin and Smith, Noah A, 2020
"You are grounded!": Latent Name Artifacts in Pre-trained Language Models, Shwartz, Vered and Rudinger, Rachel and Tafjord, Oyvind, 2020
Defining and Evaluating Fair Natural Language Generation, Yeo, Catherine and Chen, Alyssa, 2020
Counterfactual Data Augmentation for Mitigating Gender Stereotypes in Languages with Rich Morphology, Zmigrod, Ran and Mielke, Sabrina J. and Wallach, Hanna and Cotterell, Ryan, 2019
Investigating Gender Bias in Language Models Using Causal Mediation Analysis, Vig, Jesse and Gehrmann, Sebastian and Belinkov, Yonatan and Qian, Sharon and Nevo, Daniel and Singer, Yaron and Shieber, Stuart, 2020
Release strategies and the social impacts of language models, Solaiman, Irene and Brundage, Miles and Clark, Jack and Askell, Amanda and Herbert-Voss, Ariel and Wu, Jeff and Radford, Alec and Krueger, Gretchen and Kim, Jong Wook and Kreps, Sarah and others, 2019
Automatically neutralizing subjective bias in text, Pryzant, Reid and Martinez, Richard Diehl and Dass, Nathan and Kurohashi, Sadao and Jurafsky, Dan and Yang, Diyi, 2020
Language models are few-shot learners, Brown, Tom B and Mann, Benjamin and Ryder, Nick and Subbiah, Melanie and Kaplan, Jared and Dhariwal, Prafulla and Neelakantan, Arvind and Shyam, Pranav and Sastry, Girish and Askell, Amanda and others, 2020
BOLD: Dataset and Metrics for Measuring Biases in Open-Ended Language Generation, Dhamala, Jwala and Sun, Tony and Kumar, Varun and Krishna, Satyapriya and Pruksachatkun, Yada and Chang, Kai-Wei and Gupta, Rahul, 2021
Viable Threat on News Reading: Generating Biased News Using Natural Language Models, Gupta, Saurabh and Nguyen, Hong Huy and Yamagishi, Junichi and Echizen, Isao, 2020
Investigating Societal Biases in a Poetry Composition System, Sheng, Emily and Uthus, David, 2020
De-Biased Court's View Generation with Causality, Wu, Yiquan and Kuang, Kun and Zhang, Yating and Liu, Xiaozhong and Sun, Changlong and Xiao, Jun and Zhuang, Yueting and Si, Luo and Wu, Fei, 2020
Detoxifying Language Models Risks Marginalizing Minority Voices, Xu, Albert and Pathak, Eshaan and Wallace, Eric and Gururangan, Suchin and Sap, Maarten and Klein, Dan, 2021
Detect and Perturb: Neutral Rewriting of Biased and Sensitive Text via Gradient-based Decoding, He, Zexue and Majumder, Bodhisattwa Prasad and McAuley, Julian, EMNLP, 2021

Bias Visualization

FairVis: Visual Analytics for Discovering Intersectional Bias in Machine Learning, Cabrera, Ángel Alexander and Epperson, Will and Hohman, Fred and Kahng, Minsuk and Morgenstern, Jamie and Chau, Duen Horng, 2021
VERB: Visualizing and Interpreting Bias Mitigation Techniques for Word Representations, Rathore, Archit and Dev, Sunipa and Phillips, Jeff M. and Srikumar, Vivek and Zheng, Yan and Yeh, Chin-Chia Michael and Wang, Junpeng and Zhang, Wei and Wang, Bei, 2021
DebIE: A Platform for Implicit and Explicit Debiasing of Word Embedding Spaces, Friedrich, Niklas and Lauscher, Anne and Ponzetto, Simone Paolo and Glavaš, Goran, 2021

Others

Gender bias in neural natural language processing, Lu, Kaiji and Mardziel, Piotr and Wu, Fangjing and Amancharla, Preetam and Datta, Anupam, 2020
Equity Beyond Bias in Language Technologies for Education, Mayfield, Elijah and Madaio, Michael and Prabhumoye, Shrimai and Gerritsen, David and McLaughlin, Brittany and Dixon-Román, Ezekiel and Black, Alan W, 2019
Shedding (a Thousand Points of) Light on Biased Language, Yano, Tae and Resnik, Philip and Smith, Noah A., 2010
Dialect Diversity in Text Summarization on Twitter, Celis, L Elisa and Keswani, Vijay, 2020
Identifying and Measuring Annotator Bias Based on Annotators' Demographic Characteristics, Al Kuwatly, Hala and Wich, Maximilian and Groh, Georg, 2020
Multilingual sentence-level bias detection in Wikipedia, Aleksandrova, Desislava and Lareau, François and Ménard, Pierre André, 2019
Automated Essay Scoring in the Presence of Biased Ratings, Amorim, Evelin and Cançado, Marcia and Veloso, Adriano, 2018
Predicting Factuality of Reporting and Bias of News Media Sources, Baly, Ramy and Karadzhov, Georgi and Alexandrov, Dimitar and Glass, James and Nakov, Preslav, 2018
We Can Detect Your Bias: Predicting the Political Ideology of News Articles, Baly, Ramy and Da San Martino, Giovanni and Glass, James and Nakov, Preslav, 2020
The Multilingual Affective Soccer Corpus (MASC): Compiling a biased parallel corpus on soccer reportage in English, German and Dutch, Braun, Nadine and Goudbeek, Martijn and Krahmer, Emiel, 2016
Word-order Biases in Deep-agent Emergent Communication, Chaabouni, Rahma and Kharitonov, Eugene and Lazaric, Alessandro and Dupoux, Emmanuel and Baroni, Marco, 2019
Importance sampling for unbiased on-demand evaluation of knowledge base population, Chaganty, Arun and Paranjape, Ashwin and Liang, Percy and Manning, Christopher D., 2017
Bias and Fairness in Natural Language Processing, Chang, Kai-Wei and Prabhakaran, Vinod and Ordonez, Vicente, 2019
Learning to Flip the Bias of News Headlines, Chen, Wei-Fan and Wachsmuth, Henning and Al-Khatib, Khalid and Stein, Benno, 2018
Analyzing Political Bias and Unfairness in News Articles at Different Levels of Granularity, Chen, Wei-Fan and Al Khatib, Khalid and Wachsmuth, Henning and Stein, Benno, 2020
Detecting Media Bias in News Articles using Gaussian Bias Distributions, Chen, Wei-Fan and Al Khatib, Khalid and Stein, Benno and Wachsmuth, Henning, 2020
Modelling Annotator Bias with Multi-task Gaussian Processes: An Application to Machine Translation Quality Estimation, Cohn, Trevor and Specia, Lucia, 2013
Masking Actor Information Leads to Fairer Political Claims Detection, Dayanik, Erenay and Padó, Sebastian, 2020
CLARIN: Towards FAIR and Responsible Data Science Using Language Resources, de Jong, Franciska and Maegaard, Bente and De Smedt, Koenraad and Fišer, Darja and Van Uytvanck, Dieter, 2018
Semi-Supervised Topic Modeling for Gender Bias Discovery in English and Swedish, Devinney, Hannah and Björklund, Jenny and Björklund, Henrik, 2020
Is Your Classifier Actually Biased? Measuring Fairness under Uncertainty with Bernstein Bounds, Ethayarajh, Kawin, 2020
Team Peter Brinkmann at SemEval-2019 Task 4: Detecting Biased News Articles Using Convolutional Neural Networks, Färber, Michael and Qurdina, Agon and Ahmedi, Lule, 2019
Biases in Predicting the Human Language Model, Fine, Alex B. and Frank, Austin F. and Jaeger, T. Florian and Van Durme, Benjamin, 2014
Analyzing Biases in Human Perception of User Age and Gender from Text, Flekova, Lucie and Carpenter, Jordan and Giorgi, Salvatore and Ungar, Lyle and Preoţiuc-Pietro, Daniel, 2016
Reference Bias in Monolingual Machine Translation Evaluation, Fomicheva, Marina and Specia, Lucia, 2016
Analyzing Gender Bias within Narrative Tropes, Gala, Dhruvil and Khursheed, Mohammad Omar and Lerner, Hannah and O'Connor, Brendan and Iyyer, Mohit, 2020
Detecting Political Bias in News Articles Using Headline Attention, Gangula, Rama Rohit Reddy and Duggenpudi, Suma Reddy and Mamidi, Radhika, 2019
Detecting Independent Pronoun Bias with Partially-Synthetic Data Generation, Munro, Robert and Morrison, Alex (Carmen), 2020
Analyzing Gender Bias in Student Evaluations, Terkik, Andamlak and Prud'hommeaux, Emily and Ovesdotter Alm, Cecilia and Homan, Christopher and Franklin, Scott, 2016

Tutorial List

Fairness in Machine Learning, NeurIPS 2017
The Trouble with Bias, NeurIPS 2017
Socially Responsible NLP, NAACL 2018
Tutorial: Bias and Fairness in Natural Language Processing, EMNLP 2019
Fairness-Aware Machine Learning: Practical Challenges and Lessons Learned, KDD 2019
Dealing with Bias and Fairness in Building Data Science/ML/AI Systems, KDD 2020
A Visual Tour of Bias Mitigation Techniques for Word Representations, AAAI 2021

Jupyter/Colab Tutorial

Conference/Workshop List

Ethics in NLP, ACL Wiki
ACM FAccT conference
Gendered Ambiguous Pronoun (GAP) Shared Task at the Gender Bias in NLP Workshop 2019, Webster, Kellie and Costa-jussà, Marta R. and Hardmeier, Christian and Radford, Will, 2019
Proceedings of the First Workshop on Gender Bias in Natural Language Processing, Costa-jussà, Marta R. and Hardmeier, Christian and Radford, Will and Webster, Kellie, 2019
Proceedings of the Second Workshop on Gender Bias in Natural Language Processing, Costa-jussà, Marta R. and Hardmeier, Christian and Radford, Will and Webster, Kellie, 2020
Team Kermit-the-frog at SemEval-2019 Task 4: Bias Detection Through Sentiment Analysis and Simple Linguistic Features, Anthonio, Talita and Kloppenburg, Lennart, 2019
Fairness and machine learning
