Fairness and Bias

Is GPT-3 Islamophobic?

How OpenAI’s Western algorithm perpetuates Orientalist power structures.

Francesca
Towards Data Science
11 min readFeb 3, 2021

--

Photo by Joan Gamell via Unsplash

The scope of this article is to investigate the intersection of Orientalism, as developed by Edward Said, and technology, in the context of OpenAI’s GPT-3 algorithm, which generates coherent text from minimal prompts. When prompted with inputs containing the words “Islam”, “Muslim”, or “Middle East”, GPT-3 generates stereotyped texts which contribute to reproducing and reinforcing an Orientalist vision. OpenAI algorithm shows back at us the Western concept of Islam and the constant attempt to simplify social groups for political or scholarly expediency or even control.

Digital Orientalism and the algorithmic gaze

In 1978 Edward Said published one of the most relevant books of the 20th century, Orientalism. His work focuses on the nature of Western attitudes towards the East, considering Orientalism as a robust and well-developed European ideological creation, a way for writers, philosophers and colonial authorities to deal with the ‘Otherness’ of Eastern culture, customs and beliefs.

Specifically, Said argued that a long tradition of misleading and romanticized representations of Asia and the Middle East in Western culture had served as an implicit justification for the European colonial and imperial ambitions (Said, 1978). Although Said focused mainly on Europe’s relations with the Middle East and South Asia, the political ideologies and cultural imageries implicit in such hegemonic dichotomies also shed light on Orientalism’s nowadays internal dynamics in the US (Kim and Chung, 2005).

In recent years, there has been an attempt to rewrite Said’s work's legacy, originating from the need to update and expand on his discursive framework's temporal, geographical, and conceptual reach. The new Orient frames shift the attention to how the perception of the Other in contemporary societies is now mediated by what can be defined as an algorithmic gaze: the attempt to characterise, profile and affect people algorithmically (Kotliar, 2020).

Through the algorithmic gaze, the Other becomes visible and knowable (Kitchin, 2014). But as Bucher pointed out, knowledge is never objective nor neutral. It results from interpretative processes that need to be specifically contextualized (Bucher, 2018). The widely accepted conviction that algorithms disregard culture and personal attributes is erroneous and does not consider how such systems powerfully pose themselves as a continuation of the colonial gaze.

How the GPT-3 algorithm perpetuates Orientalist power structures

Digital Orientalism’s contemporary challenges are vast, considering the narrative from mainstream media, populist movements, public opinion, and political tendencies in framing debates. Today, the Orient is still essentialised in the same way Said showed it before Orientalism’s publication. This is particularly evident in the case of OpenAI’s latest language generation model, GPT-3.

GPT-3 is a recently released language model that uses machine learning algorithms to produce human-like text. It takes in a prompt and completes it. Its algorithm uses computational methods to acquire and learn information directly from the input data without relying on a predetermined model (Mathworks.com, 2021). Therefore, data play a fundamental role in the training process for machine learning algorithms. In the case of GPT-3, 60% of its training data are linked to the Common Crawl dataset, a scrape of the 60 million domains on the internet and the subset of the websites to which they relate to. Thus GPT-3 trained on many of the internet’s reputable outlets, such as the BBC, along with the less reputable ones (i.e. Reddit). The remaining 40% is constituted by curated sources such as Wikipedia and the full text of relevant books (Brockman, 2020). It is essential to stress that GPT-3 has been mostly trained on English data (although being able to translate from French, German, and Romain to English). Therefore, its outputs openly reproduce what can be identified as Western ideas. During the training process, GPT-3 learns how to produce phrases and sentences based on the text that it finds online, which is written by us. It comes naturally to realise that, despite the impressive performance, GPT-3 reflects societal biases and reproduces them when asked to generate text involving race, religion, gender, etc.

A GPT-3 paper’s supplemental material published by OpenAI researchers in July 2020 gives users an insight into the model’s problematic biases. It was shown that the model is much likely to associate words as “sucked” or “naughty” with female pronouns, whereas male pronouns were placed near words such as “lazy” or “jolly” at worst (Brown et al., 2020).

Figure 1, Most Biased Descriptive Words in 175B Model, based on Brown at al., 2020, p.37

Researchers also checked the co-occurrence of the words with different religions: similar to gender and race, they found that the model makes (biased) associations between negative adjectives and some religions. For instance, words such as terrorism and violence are more commonly placed near Islam than with other religions and result in the top 40 most favoured terms in general for Islam in GPT-3 (Brown at al., 2020).

Figure 2, Shows the ten most favoured words about each religion in the GPT-3 175B model, based on Brown at al., 2020, p.38

GPT-3’s association between negative words such as “terrorism” and Islam illustrates the hegemonic power of the digital Orientalist discourse in the way in which it reproduces and reinforces biased knowledge of Muslims. It is important to note that the idea of Orientalism often collides with the broader concept of racism, Islamophobia, selective prejudice, and other doctrines which advocate for civilizational differences. As Said stated, the discourse of terrorism has been often used by the United States and its allies to describe violent acts of resistance to their imperial occupation, rather than addressing the violence of imperial occupation itself (Said, 1978). In this context, and especially after the 9/11 dramatic events, the word “terrorism” has come to represent a nameless Oriental collective that stretches from the Saharan Tuareg in North Africa to the Solomon Islands in the Asia Pacific (Kumar, 2012). Therefore, the discourse of terrorism is represented as another form of Orientalism, which deliberately ignores any geographical entity (Morton, 2007).

Accordingly, through algorithmic models, the West structures the possibilities for what is said about the Orient, legitimizing such biased views as produced by what is considered an objective and trustworthy tool: a machine. Such a system deprives Muslims to define themselves as subjects on their own terms. Specifically, when prompted with inputs containing the words “Islam”, “Muslim” or “East”, GPT-3 generates stereotyped text which contributes to reproducing and reinforcing an Orientalist vision, as shown in the following examples derived from direct examination:

Figure 3, an example of GPT-3 results when prompted with the words “Two Muslims walk”. Image by the author.
Figure 4, an example of GPT-3 results when prompted by the words “Islam”. Image by the author.
Figure 5, an example of GPT-3 results when prompted by the words “Middle East”. Image by the author.

As pointed out in a recent study on anti-Muslim bias in NLP, when prompted with a sentence containing the word “Muslim”, 66 out of 100 competitions produced by GPT-3 contains violence-related words (Abid, Farooqi and Zou, 2021). In addition to that, by reproducing the logic of GPT -3’s learned embeddings (that is not publicly available as users are only given access to its API), the researches noted that the word “Muslim” is analogized to “terrorist” 23% of the time (Abid, Farooqi and Zou, 2021). They added:

[…] We note that the relative strength of the association between “Muslim” and “terrorist” stands out, even relative to other groups; of the 6 religious groups considered here, none is mapped to a single stereotypical noun at the same frequency that “Muslim” is mapped to “terrorist.” (Abid, Farooqi and Zou, p.6, 2021).

Finally, the researchers explored ways to de-bias GPT -3’s completions. They opted for one of the most reliable methods: adding a short phrase to a prompt that contained positive associations about Muslims. They modified the prompt, inputting the following words “Muslims are hard-working. Two Muslims walked into a” 80% of the time GPT-3 produces non-violent completions. However, they noted that even the most effective adjectives made more violent completions than results produced for the word “Christians” (Abid, Farooqi and Zou, 2021).

Moreover, while GPT -3’s associations between Muslims and violence are learned during a pre-training phase, according to the authors, those seem not to be memorized but manifested quite creatively by GPT-3, showing the language models’ ability to replicate biases in different ways, which may make the biases more challenging to detect and mitigate (Abid, Farooqi and Zou, 2021).

Figure 6, Debiasing GPT-3 Completions (Abid, Farooqi and Zou, p.9, 2021). (CC BY 4.0).

Something important to note is that the experiments provided triggered a warning message, “Our system has flagged the generated content as being unsafe because it might contain explicitly political, sensitive, identity-aware or offensive text. We’ll be adding an option to suppress such outputs soon. The system is experimental and will make mistakes,” followed by an option to report the output produced. Since its release, many users reported those outputs as Islamophobic, which resulted in OpenAI flagging such content. The company is currently working on software that should prevent users from using this tool maliciously, such as creating spam. Despite access to GPT-3 being limited at this stage, it seems from the numerous use cases popping up online that everyone from hobbyists to machine learning experts had not too much trouble gaining access to this simple yet powerful piece of tech. OpenAI’s GPT-3 is, in fact, a commercial product and multiple customers around the world are already experimenting with its API for different purposes: from creating customer service systems to automating content moderation, as in the case of Reddit.

As it has been repeatedly shown, bias and discrimination are wildly spread on the internet, which can be potentially baked into public and private automated systems. Such a way of operating keeps ignoring whether it is a responsible strategy to train language models by taking any data from the web simply because it is available, without questioning its value and potential for amplifying unchecked and harmful biases. Moreover, as demonstrated by researchers from Microsoft and UMass Amherst in their analysis of 150 studies of bias in Natural Language Processes, many of the authors proposing language models seem to have vague motivations on how and why such biases are harmful (Blodgett, Barocas, Daumé III and Wallach, 2020). They continued by stating that there is a need to engage with systems that explore the relationship between language and social hierarchies, such as sociolinguistics and sociology (Blodgett, Barocas, Daumé III and Wallach, 2020). At the same time, authors are also required to engage with those directly affected by such systems. In the case of GPT-3, having trained the model on English data and having built the system in English only frequently prevents the authors from engaging directly with those who are affected by the perpetuation of such Orientalist vision via machine-generated text. When used for commercial purposes, GPT-3 can potentially legitimise such biased views since produced by what the public considers an objective and trustworthy tool. Again, this system can deprive the “East” of defining itself as a subject in its own terms, not considering how the model interacts with and impacts the societies we live in.

The GPT-3 example shows the need for social meaning and linguistic context to play a central role in designing AI. The public cannot merely assume that design choices underpinning technology, in general, are normatively neutral. The interactive nature of the relationship between technological models and the social world demonstrates why even an “objectively perfect” model would produce unjust results if deployed in an unfair world. As in the case of GPT-3, such powerful language models can supercharge inequality expressed via linguistic categories, especially given the scale at which it currently and might operate in the future.

Finally, on the 5th January 2021, OpenAI launched DALL·E, a text2Image system based on GPT-3 but trained on text plus images and CLIP, a neural network that can perform accurate image classification based on natural language trained on 400 million pairs of pictures and text. These two new models combine language and images with helping AIs understand both words and what they refer to, generating high-quality images. DALL·E uses a 12-billion parameter version of GPT-3 and a transformer language model for developing and complete half-finished images. That model can draw pictures of animals or things with human characteristics and combine unrelated items accurately to produce a single image. Interestingly, the success rate of the images will depend on how well the prompt is phrased. Something even more impressive is DALL·E ability to fill in the blanks when captions imply that the image has to contain a specific detail that is not explicitly stated. Both these two models have the potential of significant societal impact. The OpenAI Team already say that they will analyse how DALL·E and CLIP can relate to societal issues, possible biases and ethical challenges (OpenAI, 2021). Those two new models are not yet available to the public. However, future analysis on this topic should still focus on whether or not those technologies contribute towards reproducing and reinforcing the Orientalist ideology through artificially produced images.

Is GPT-3 Islamophobic?

Again, the interactive nature of the relationship between technological models and the social world demonstrates why even an “objectively perfect” model would produce unjust results if deployed in an unfair world. In the case of GPT-3, such powerful language models can supercharge inequality expressed via linguistic categories, especially given the scale at which it currently and might operate in the future. Being aware of the risks and working to reduce them has become an urgent priority, which needs to be considered every time a new model is developed. As Hao states: “Algorithmic decision-making is human decision-making. In other words, it’s as much about who is building the technology as it is about what that technology is.” (Hao, 2021)

References

Abid, A., Farooqi, M. and Zou, J. (2021). Persistent Anti-Muslim Bias in Large Language Models. [online] Available at: https://arxiv.org/pdf/2101.05783v1.pdf.

Beer, D., 2016. The social power of algorithms. Information, Communication & Society, 20(1), pp.1–13.

Blodgett, S., Barocas, S., Daumé III, H. and Wallach, H., 2020. Language (Technology) is Power: A Critical Survey of “Bias” in NLP. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.

Brockman, G. (2020). OpenAI API. [online] OpenAI. Available at: https://openai.com/blog/openai-api/ [Accessed 11 Jan. 2021].

Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., Agarwal, S., Herbert-Voss, A., Krueger, G., Henighan, T., Child, R., Ramesh, A., Ziegler, D., Wu, J., Winter, C., Hesse, C., Chen, M., Sigler, E., Litwin, M., Gray, S., Chess, B., Clark, J., Berner, C., Mccandlish, S., Radford, A., Sutskever, I. and Openai, D. (2020). Language Models are Few-Shot Learners. [online] Available at: https://arxiv.org/pdf/2005.14165.pdf.

Bucher, T. (2012). Want to be on top? Algorithmic power and the threat of invisibility on Facebook. New Media & Society, 14, 1164–1180. doi:10.1177/1461444812440159 ‌

Hao, K., 2021. Five ways to make AI a greater force for good in 2021. The MIT Technology Review, [online] Available at https://www.technologyreview.com/2021/01/08/1015907/ai-force-for-good-in-2021/?truid=f640bd600a7a7b243bb59cd866dc44c2

Kim, M. and Chung, A.Y. (2005). Consuming Orientalism: Images of Asian/American Women in Multicultural Advertising. Qualitative Sociology, [online] 28(1), pp.67–91. Available at: https://www.depts.ttu.edu/education/our-people/Faculty/additional_pages/duemer/epsy_6305_class_materials/Kim-Minjeong-Chung-Angie-Y-2005.pdf

Kitchin, R. (2014). The data revolution: Big data, open data, data infrastructures & their consequences. London: Sage.

Kotliar, D., 2020. The return of the social: Algorithmic identity in an age of symbolic demise. New Media & Society, 22(7), pp.1152–1167.

Mathworks.com. (, 2021). What Is Machine Learning? | How It Works, Techniques & Applications. [online] Available at: https://www.mathworks.com/discovery/machine-learning.html

Morton, S., 2007. Terrorism, Orientalism and Imperialism. Wasafiri, 22(2), pp.36–42.

OpenAI (2021). CLIP: Connecting Text and Images. [online] OpenAI. Available at: https://openai.com/blog/clip/ [Accessed 11 Jan. 2021].

Said E., 1978. Orientalism. London: Penguin Books.

--

--