Software often produces biased outputs. In particular, machine learning (ML) based software is known to produce erroneous predictions when processing discriminatory inputs. Such unfair program behavior is often rooted in societal bias. In recent years, Amazon, Microsoft, and Google have all shipped software services that produced unfair outputs, mostly due to societal bias (e.g., gender or race). In such situations, developers are saddled with the task of fairness testing, which is challenging: they must generate discriminatory inputs that both reveal and explain biases. We propose a grammar-based fairness testing approach, called ASTRAEA, which leverages context-free grammars to generate discriminatory inputs that reveal fairness violations in software systems. Using probabilistic grammars, ASTRAEA also provides fault diagnosis by isolating the causes of observed software bias; these diagnoses facilitate the improvement of ML fairness. We evaluated ASTRAEA on 18 software systems that provide three major natural language processing (NLP) services. In our evaluation, ASTRAEA generated fairness violations at a rate of about 18%, producing over 573K discriminatory test cases and uncovering over 102K fairness violations. Furthermore, ASTRAEA improves software fairness by about 76% on average via model retraining.
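To make the idea of grammar-based discriminatory input generation concrete, the following is a minimal, purely illustrative sketch, not ASTRAEA's actual grammar, tooling, or API. It assumes a tiny probabilistic context-free grammar whose <name> non-terminal carries a protected attribute (gender), generates attribute-swapped input pairs, and flags a fairness violation whenever the system under test predicts differently on the two inputs. The grammar rules, the names, and the `toy_model` stand-in for the NLP service are all assumptions made for illustration only.

```python
import random

# Hypothetical probabilistic context-free grammar: each non-terminal maps to
# weighted productions. The <name> non-terminal encodes a protected attribute.
GRAMMAR = {
    "<sentence>": [(["<name>", "is", "a", "<adj>", "<role>"], 1.0)],
    "<name>":     [(["John"], 0.5), (["Mary"], 0.5)],   # protected attribute
    "<adj>":      [(["brilliant"], 0.5), (["terrible"], 0.5)],
    "<role>":     [(["engineer"], 0.5), (["nurse"], 0.5)],
}

def expand(symbol, rng=random):
    """Expand a grammar symbol by sampling one weighted production, recursively."""
    if symbol not in GRAMMAR:
        return [symbol]
    productions, weights = zip(*GRAMMAR[symbol])
    chosen = rng.choices(productions, weights=weights, k=1)[0]
    return [token for part in chosen for token in expand(part, rng)]

def generate_pair():
    """Generate two inputs identical except for the protected <name> token."""
    adj = expand("<adj>")[0]
    role = expand("<role>")[0]
    return tuple(f"{name} is a {adj} {role}" for name in ("John", "Mary"))

def is_violation(predict, pair):
    """A fairness violation: predictions differ on an attribute-swapped pair."""
    return predict(pair[0]) != predict(pair[1])

if __name__ == "__main__":
    # `toy_model` is a stand-in for the NLP service under test (illustrative only).
    toy_model = lambda text: "negative" if "Mary" in text else "positive"
    pairs = [generate_pair() for _ in range(1000)]
    found = sum(is_violation(toy_model, p) for p in pairs)
    print(f"{found} fairness violations out of {len(pairs)} generated pairs")
```

In a probabilistic-grammar setting such as the one sketched above, the rule weights can then be re-estimated from the violating inputs to indicate which grammar tokens (here, the protected <name> choices) are associated with the observed bias; this is the flavor of diagnosis the abstract refers to, not its exact mechanism.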
Disciplines:
Computer science
Author, co-author:
SOREMEKUN, Ezekiel ✱; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SerVal
Udeshi, Sakshi Sunil ✱; Singapore University of Technology and Design, Singapore
Chattopadhyay, Sudipta; Singapore University of Technology and Design, Singapore
✱ These authors have contributed equally to this work.