Understanding Bias: Neuroscience & Critical Theory for Ethical AI

Applying a critical theory framework to AI Ethics, while using neuroscience to understand unconscious bias with synaptic plasticity.

Haaya Naushan

A year ago when discussing racial bias present in facial recognition, AI pioneer Yann LeCun controversially tweeted, “ML systems are biased when data is biased” (source: Twitter). This provoked a response from AI Ethics researcher, Timnit Gebru, who expressed her frustration at the overly simplistic framing of this issue, an opinion based on her expertise in AI Ethics (source: Twitter). Gebru’s reply and the ensuing conversation were amplified by mainstream media, and while this did prompt a broader discussion of bias in the AI community, the media focus was on how LeCun chose to (mis)communicate.

Irrespective of sincerity, LeCun’s apology (source) reminded me of Audre Lorde’s thoughts on guilt,

“all too often, it becomes a device to protect ignorance and the continuation of things the way they are, the ultimate protection for changelessness.” (Lorde, 1981).

And with regards to the entitled expectation that the marginalized should be responsible for educating others about bias, Lorde says,

“[t]here is a constant drain of energy which might be better used in redefining ourselves and devising realistic scenarios for altering the present and constructing the future.” (Lorde, 1980).

I share Gebru’s frustration. Societal bias pervades every aspect of AI, from the datasets, the research environments, to even the practitioners themselves. An equitable future depends on our ability to create Ethical AI; hence, I believe it is important to reflect critically on bias — despite how emotionally difficult this can be, and despite the lack of easy answers. Moreover, discussing bias without addressing social and structural problems is a hollow and ultimately, meaningless endeavour. Consequently, I seek to integrate critical theory with neuroscience to understand unconscious bias and chart a path towards Ethical AI.

In a two-part series, I draw from diverse fields such as neuroscience, genetics, psychology, critical theory, linguistics, mathematics and pedagogy, to articulate my view that AI Ethics can benefit from an unconventional approach that combines various disciplines. This first article uses neuroscience to understand unconscious bias while making direct connections to AI, all within a critical theory framework. The second article explores learning with respect to neuroscience and presents a critical theory-informed view of causality, allowing for a discussion of neurosymbolic AI, as a way of potentially achieving mindfulness in machines.

Here, I start with critical theory and contrast the field with STEM disciplines, to suggest that AI practitioners can benefit from thinking about intersectionality. Next, I offer a brief introduction to neuroscience and highlight the important differences between select sub-disciplines and their relationship to AI. This leads directly into a discussion of synaptic plasticity, which lays the foundation for discussing this article’s central topic of unconscious bias. Following that, I address the issue of bias in datasets using critical theory. Lastly, I return to my previous series on linguistics (part 1 & part 2), and use bias in AI language models as a practical example to showcase the ways neuroscience and critical theory connect to AI Ethics.

Critical Theory and AI Ethics

A long time ago, I was introduced to the writers that helped shape my perception of the world: Audre Lorde, Maya Angelou, bell hooks, Octavia Butler and Angela Davis. I credit my sister for that influence; concomitantly, she also shared the works of critical theorists such as Franz Fanon, Edward Said, Paulo Freire and Michel Foucault. While critical theory is sometimes narrowly defined as the Frankfurt School of philosophy, I espouse a broader view of critical theory, where the purpose of the discipline is social inquiry aimed at decreasing oppression and increasing freedom. This requires challenging traditional power structures, to quote Foucault,

“A critique does not consist in saying that things aren’t good the way they are. It consists in seeing on just what type of assumptions, of familiar notions, of established and unexamined ways of thinking the accepted practices are based… To do criticism is to make harder those acts which are now too easy.” (Foucault, 1982).

The prevailing attitude in STEM disciplines is either ignorance of, or contempt for, critical theory; the field is widely treated as intellectually inferior to pure science. Popular arguments include claims of the superiority of “facts over feelings”. It has become radical to consider the value of non-empirical claims, the lack of data being sufficient grounds to dismiss an entire discipline. I disagree with this view and choose instead to embrace the radical idea that there is value in critical theory, especially when it comes to the development of the burgeoning field of AI. Specifically, attempts to create Ethical AI would benefit from an understanding of intersectionality (Crenshaw, 2017). Bias is about more than racism, gender inequality or religious intolerance alone; it is the result of the interactions of all forms of bias.

In 1989, when Kimberlé Crenshaw coined the term intersectionality, it was an effort to describe the complicated ways social identities can interact. Today, the meaning has been distorted by those threatened by the loss of current social hierarchies, derided and dismissed as “identity politics”. Intersectionality is, however, according to Crenshaw herself,

“a lens, a prism, for seeing the way in which various forms of inequality often operate together and exacerbate each other. We tend to talk about race inequality as separate from inequality based on gender, class, sexuality or immigrant status. What’s often missing is how some people are subject to all of these, and the experience is not just the sum of its parts.” (source).

AI is already entwined in many aspects of many people’s lives; a growing trend, since the use of AI is ramping up across various industries. In his 2018 book, AI Superpowers: China, Silicon Valley, and the New World Order, AI pioneer and venture capitalist Kai-Fu Lee described the current state of AI as being in the “age of implementation”. Writing about the potential harm of AI, Lee says,

“[b]ringing them to market requires no major new breakthroughs in AI research, just the nuts-and-bolts work of everyday implementation: gathering data, tweaking formulas, iterating algorithms in experiments and different combinations, prototyping products, and experimenting with business models.” (Lee, 2018).

Lee’s vision for human coexistence with AI is based on promoting love for each other; he suggests using love as a blueprint for designing social programs to combat the future existential threat of AI. While I agree with the importance of love, I would argue that what his vision lacks is an understanding of the current existential threat to our shared humanity: the deep-rooted systemic and structural inequalities from which AI has sprung.

Ethical AI needs to be about more than ensuring that language models like GPT-3 do not regurgitate racist text, or that facial recognition can identify diverse faces. An inspiring example is the powerful documentary Coded Bias, which started with a paper authored by Joy Buolamwini and Timnit Gebru, titled “Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification”. The influence of intersectionality and critical theory is seen all throughout the film. It is particularly evident in the efforts made to engage all stakeholders, especially those who are currently being negatively affected by facial recognition technology. Importantly, the researchers consulted with people impacted, and included them in the discussions around Ethical AI. Even more importantly, this work is ongoing with continuous effort directed towards making the research accessible, an approach that I wish would be adopted by all researchers and organizations involved with AI Ethics.

The entire AI industry, governments, and especially AI practitioners should be asking questions informed by intersectionality. Questions such as: Who researches, builds and teaches AI? Who has access to, who profits from, and whose jobs are displaced by AI services? Who produces, and who owns the data used to build AI? Who controls the infrastructure needed for AI? And more broadly: Whose opinions on AI are heard? Who decides the law around AI and who enforces these laws? Which countries have the capacity to research or build AI and where will it be deployed? Which countries’ economies will profit from or suffer from AI? These questions merely scratch the surface, but at present they are not being discussed by those who have the capacity to enact change.

Understanding intersectionality is only a single step in the right direction; however, it is an easily accessible one that includes everyone. Feminist.AI is an organization that engages through intersectional feminism and works to make AI accessible to everyone — their resource page is a great place to start learning. Some of the books they recommend include: Algorithms of Oppression by Safiya Umoja Noble, Weapons of Math Destruction by Cathy O’Neil and Race After Technology by Ruha Benjamin. Current data science, machine learning and AI educational programs would benefit from including such material in their curricula, providing students with an intersectional framing of AI Ethics. I do not claim to have expertise in critical theory nor intersectionality, but these ideas are the foundation that helps me think critically about AI. As a starting point, intersectionality offers hope that as AI progresses, we do not leave behind those who have the least and could benefit the most. As mentioned, the next two sections introduce neuroscience and unconscious bias, before returning to critical theory in the last two sections to further discuss AI development and ethics.

Neuroscience: An overview of select subdisciplines

A neuron is the cellular unit that comprises the structure of the human brain, a fact discovered in the 1880s by Santiago Ramón y Cajal (Rapport, 2006). Currently it is believed that the human brain contains approximately 86 billion neurons, and each of these neurons has approximately 10,000 synapses — these can be thought of as the connections between neurons. This helpful article from Johns Hopkins Medicine offers an accessible overview of the anatomy and function of the brain. For accessibility, all neuroscience concepts introduced in this article will be explained with enough detail to ensure an understanding without the technical complexities.

At the simplest level, neuroscience is the study of how neural networks work, in contrast with neurology, which is the study of diseases of the brain and nervous system. There are several types or sub-disciplines of neuroscience; here I focus on the three most relevant to AI: systems neuroscience, cognitive neuroscience and computational neuroscience. Systems neuroscience relates to systems biology, which involves the computational and mathematical modeling of complex biological systems. Accordingly, systems neuroscience includes studying the brain at the molecular and cellular level, and then connects that understanding to cognitive functions such as memory storage and language. This sub-discipline is useful for AI because it takes the approach of studying the physical brain in order to understand the mind, as in: how does the physical structure of the brain relate to consciousness?

Cognitive neuroscience takes the opposite approach: it is highly influenced by psychology and relies on understanding the mind (cognition) to help understand the brain. For example, it adopts the psychological categorization of memory, using the definitions of episodic, semantic and procedural memory. With regards to AI, a key concept is the neural correlates of consciousness, which can be defined as the specific patterns of brain activity that correlate with particular conscious experiences. Consider, for instance, reinforcement learning: neural correlates of decision making suggest that the frontal cortex of the brain intervenes by adding the context of episodic memory, such that decisions are informed by past events. Cognitive neuroscience is also directly related to the field of symbolic AI, where intelligence is designed as the manipulation of symbols via logic operations, according to rules of cause and effect.

In addition to cognitive neuroscience, symbolic AI is rooted in the philosophy of mathematics; it follows the logicism-based view that human cognition is achieved through forming internal symbolic representations and creating rules to deal with these concepts. Therefore, with symbolic AI, machine cognition is the result of formalizing these causal rules in a way that gives a machine knowledge of the world. As a philosopher and mathematician, Bertrand Russell was a staunch defender of logicism and believed that mathematics was reducible to logic. The establishment of symbolic logic can be partly attributed to Russell; in 1910, together with Alfred North Whitehead, he published the first volume of Principia Mathematica, which presented an axiomatic system on which, it was hoped, all of mathematics could be built.

Two decades later, Kurt Gödel presented his incompleteness theorems, proving that there are limits to any formal system containing primitive recursive arithmetic. Essentially, Gödel proved that it is impossible to use symbolic logic to create a complete and consistent set of axioms with which to model the world. This flaw extends to symbolic AI, which was popular between the 1950s and 1980s, but was largely abandoned in favour of the connectionist view of neurons, i.e., neural networks. The shift away from symbolic AI towards neural networks was mirrored in neuroscience, where cognitive neuroscience was joined in 1985 by the new sub-discipline of computational neuroscience.

While Gödel’s incompleteness theorems can serve as a mathematical basis for why symbolic AI is insufficient for cognition, we still do not have a formal mathematical theory to explain why neural networks are effective in practice. Hence the importance of computational neuroscience; this field seeks to use mathematical models to understand not just cognition, but also the structure and development of the brain. Within this sub-discipline, there is an interest in developing a mathematical model of the phenomenon of synaptic plasticity, which is a key neuroscience concept needed to understand unconscious bias. The following section covers this point, and the second article of this series covers computational neuroscience in greater detail.

Synaptic Plasticity and Unconscious Bias

Synaptic plasticity, which is sometimes referred to as neuroplasticity, was first proposed by Donald Hebb in 1949 (source). The idea is that the synaptic connections between neurons are capable of change; this flexibility is important because variation in activity determines the strength of a synaptic connection. Existing synapses might disappear or become silent, and new synapses can form in new places between a pair of neurons. This means that the network structure of a biological neural network is dynamic: synaptic connections exhibit plasticity over time.

Interestingly, learning occurs by tweaking the strength of synaptic connections, an idea that will be explored in depth in the second article of this series. Here, the salient point is that synaptic plasticity can be viewed as the adaptation of neurons during the learning process. Relatedly, Hebb attempted to explain synaptic plasticity by theorizing that an increase in synaptic efficacy arises from a presynaptic cell’s repeated and persistent stimulation of a postsynaptic cell (Choe, 2014). More formally, Hebbian learning theory is a principle that states that a presynaptic neuron A, if successful in repeatedly activating a postsynaptic neuron B when itself (neuron A) is active, will gradually become more effective in activating neuron B (Hebb, 1949).
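Hebbian learning theory lends itself to a simple computational sketch. The toy example below is illustrative only, not a biological model: the update rule, learning rate and single-synapse setup are my own simplifications of Hebb's principle that repeated co-activation of a presynaptic and postsynaptic neuron strengthens the connection between them.

```python
import numpy as np

def hebbian_update(w, pre, post, lr=0.01):
    # Strengthen each weight in proportion to the product of pre- and
    # postsynaptic activity: "cells that fire together, wire together".
    return w + lr * np.outer(post, pre)

w = np.zeros((1, 1))        # a single synapse from neuron A to neuron B
for _ in range(100):        # repeated, persistent co-activation
    pre = np.array([1.0])   # neuron A fires
    post = np.array([1.0])  # neuron B fires
    w = hebbian_update(w, pre, post)

print(w[0, 0])  # synaptic strength has grown with each co-activation
```

Note that with this rule alone, weights only ever grow; real models of plasticity add mechanisms for weakening and normalization, which is where long-term depression and the "unlearning" discussed below come in.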

Short-term synaptic plasticity refers to changes in synaptic strength that occur on a sub-second timescale, such that there is a rapid increase or decrease in strength with the connection returning to normal afterwards. On the other hand, long-term synaptic plasticity can last from minutes to years, and represents the way in which the brain can store information as memories (source). In a neuroscience book released last year, Sway: Unravelling Unconscious Bias, Pragya Agarwal uses long-term synaptic plasticity to explain unconscious bias. In an opinion piece published in Scientific American, Agarwal writes,

“Social attitudes and expectations such as stereotypes can change how the brain processes information, and so brain-based differences in behavioural characteristics and cognitive skills change across time, place and culture. This means that our unconscious biases are not wired into us. They are learned through our experiences, and hence can also be unlearned.” (Agarwal, 2020).

Agarwal used neuroimaging research to support her claims around unconscious bias; specifically, recent fMRI studies that demonstrated that people use different areas of the brain when reasoning about familiar and unfamiliar situations. The research pointed to two specific parts of the brain, the amygdala and the prefrontal cortex, both of which “light up” in response to stereotypes. Agarwal explains that when meeting someone new, we quickly “turn behaviors into neural signals with identifiable information about the person to form an impression of them”. While this is happening, on a subconscious level, the prefrontal cortex simultaneously monitors neural information from all our senses, contextualizing it with social norms or personal preferences. This means that the conscious brain forms incomplete interpretations that often include biases of some kind; that is to say, our socialization, memories and experiences produce unconscious biases.

The amygdala is important for understanding bias because it labels incoming stimuli unconsciously, so that people rapidly categorize others as “like me and not like me”. Agarwal claims that this is the root of prejudice and discrimination, which means that stereotypes activate the part of the brain associated with threats and fear (the amygdala). More specifically, fMRI data shows that when people see “facial images of people of an ethnic background different from their own, it often activates the amygdala more than seeing people of the same ethnicity.” Furthermore, research has shown that, “negative out-group biases are even more prominent than in-group empathy.” (source).

While neuroimaging makes it clear that unconscious bias is connected to both the amygdala and prefrontal cortex, synaptic plasticity offers a way of addressing bias in both these regions. The main point that Agarwal stresses is: just as unconscious bias is learned, it can be unlearned. I return to this idea in the second article of this series, where I connect synaptic plasticity to mindfulness meditation, to suggest a method by which unconscious bias can be unlearned in machines. The next two sections offer a critical theory framing of bias and AI development.

Dataset Bias and AI Development

Kathryn Hume, Borealis AI’s director of product, offers a good reason to think critically about unconscious bias with regards to AI development. She says

“Algorithms are like convex mirrors that refract human biases, but do it in a pretty blunt way. They don’t permit polite fictions like those that we often sustain our society with” (source).

She further cautions,

“we need to be careful, because if we don’t design these systems well, all that they’re going to do is encode what’s in the data and potentially amplify the prejudices that exist in society today.” (source).

When Yann LeCun suggested that bias in AI stems from bias in data, he implicitly suggested that data itself is objective and that a balanced and racially representative dataset would solve the problem. As stated previously, this is a problematic approach because it naively disregards systemic and structural bias. Nonetheless, it provides an opportunity to dig deeper into the idea that data should be treated as objective. In Pedagogy of the Oppressed, Paulo Freire suggests that subjectivity has a role in the struggle to change power structures. Freire states that “one cannot conceive of objectivity without subjectivity. Neither can exist without the other, nor can they be dichotomized”; he proposes an approach of “subjectivity and objectivity in constant dialectical relationship” (Freire, 1970).

A compelling example of the subjectivity of data is the COMPAS (Correctional Offender Management Profiling for Alternative Sanctions) software created by Northpointe to assess the risk of recidivism (re-offending) for a defendant. A ProPublica study in 2016 exposed the racial bias encoded in the algorithms; the software was more likely to make type 1 errors (false positives) for black people and type 2 errors (false negatives) for white people. This meant that black defendants were inaccurately assigned higher scores than white defendants; moreover, when forecasting the risk of violent crimes, the algorithm was accurate only 20% of the time (source). COMPAS is proprietary software and the workings of the algorithms are claimed as trade secrets, which means the software acts as a black box. Even assuming that the algorithms were trained on a balanced and racially representative dataset, this highlights the point that simply including people of colour in datasets does not address the issue of bias.
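The type 1/type 2 disparity ProPublica reported can be made concrete in a few lines. The sketch below uses made-up labels, not ProPublica's actual data; the point is only to show the two error rates that, when computed separately for each racial group, revealed the asymmetry.

```python
import numpy as np

def error_rates(predicted, actual):
    """Return (false positive rate, false negative rate) for binary risk labels,
    where 1 = flagged high risk and 1 = actually re-offended."""
    predicted, actual = np.asarray(predicted), np.asarray(actual)
    fp = np.sum((predicted == 1) & (actual == 0))  # type 1: flagged, did not re-offend
    fn = np.sum((predicted == 0) & (actual == 1))  # type 2: not flagged, re-offended
    return fp / np.sum(actual == 0), fn / np.sum(actual == 1)

# Illustrative labels only; comparing these rates across groups is the key step.
fpr, fnr = error_rates(predicted=[1, 1, 0, 0], actual=[1, 0, 1, 0])
print(fpr, fnr)
```

A tool can have similar overall accuracy for two groups while one group absorbs far more false positives, which is exactly why aggregate accuracy alone hides this kind of bias.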

What is clear, however, is that when a recidivism risk assessment tool is based on questions such as, “Was one of your parents ever sent to jail or prison?” (source), the issue of bias extends beyond the data. Tim Brennan, the founder of Northpointe and a former professor of statistics, claimed that it is difficult to construct a score that does not include items that can be correlated with race, he says, “If those are omitted from your risk assessment, accuracy goes down” (source). Based on Northpointe’s disclosure, their COMPAS data covers poverty, unemployment and social marginalization (source).

These social issues are inextricably tied to structural and systemic racism; in fact, prisons as an institution and the definition of crime itself have racist histories. In the revolutionary book, “Are Prisons Obsolete?”, Angela Davis examines the historical links between U.S. slavery and the early penitentiary system. She points out that historically, within the US, race has always played a central role in constructing presumptions of criminality,

“After the abolition of slavery, former slave states passed new legislation revising the Slave Codes in order to regulate the behavior of free blacks in ways similar to those that had existed during slavery.” (Davis, 2003).

Brennan and Northpointe, by ignoring history and current reality, subjectively curated data that led to algorithms that encoded and propagated societal injustice. Hence, treating data as objective overlooks deep-rooted problematic bias.

Therefore, in practice, data should be treated as subjective, with regards to the means of collection, the framing, measures of inclusivity, appropriateness for the problem, quality standards and, most of all, the presence of structural and systemic oppression. In support of such inquiry, researchers at MIT and MILA created a dataset that measures stereotype bias in AI language models. StereoSet provides a way of measuring a language model’s preference for stereotypes related to race, gender, religion and profession, while ensuring that the debiasing techniques used do not negatively affect the underlying model performance. The last section of this article continues the discussion of bias in AI language models; note however, that bias is present across all types of AI, these models are just one example.
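The core idea behind a benchmark like StereoSet can be sketched as a preference test: given a stereotypical and an anti-stereotypical completion of the same context, how often does the model assign higher probability to the stereotypical one? The function and the scoring stand-in below are my own hypothetical illustration of that idea, not StereoSet's actual API or metrics.

```python
def stereotype_score(pairs, log_prob):
    # Fraction of contexts where the model prefers the stereotypical
    # completion; a score near 0.5 means no systematic preference.
    preferred = sum(1 for stereo, anti in pairs if log_prob(stereo) > log_prob(anti))
    return preferred / len(pairs)

# Hypothetical stand-in for a language model's sentence log-probabilities.
toy_log_prob = {"s1": -1.0, "a1": -2.0, "s2": -3.0, "a2": -1.5}.get
score = stereotype_score([("s1", "a1"), ("s2", "a2")], toy_log_prob)
print(score)  # 0.5 -> no net preference in this toy case
```

Pairing this with a separate language-modeling quality check, as StereoSet does, guards against "debiasing" that simply degrades the model until it has no preferences at all.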

AI Language Models and Linguistic Bias

I have a strong interest in natural language processing (NLP) and I have struggled directly with the issue of bias in AI language models. This struggle led me to write a pair of articles earlier this year that use linguistic theory to explore AI language models. The first piece contrasted linguistic relativity and linguistic universality, to better understand the philosophies guiding the development of language models. Early on in that first article, I mentioned how Google had fired Timnit Gebru for researching the social and environmental impact of AI language models, an example of the risk of attempting to draw attention to bias, amongst other concerns. My second linguistics-focused article dealt directly with the issue of bias; for a practical example, I primarily focused on the impressively large GPT-3 model. To quote the original GPT-3 paper, “[GPT-3] retains the biases of the data it has been trained on … that may lead the model to generate stereotyped or prejudiced content”.

My current view is that bias in AI language models is about more than data, structure matters; specifically, the structure of language matters. Through researching economic uncertainty, I stumbled across Whorfian socioeconomics, which is described by Thomas Pepinsky, a professor at Cornell, as,

“an emerging interdisciplinary field of study that holds that linguistic structures explain differences in beliefs, values, and opinions across communities” (Pepinsky, 2019).

The relevance in an economic context is understanding the uncertainty contributed by language when text is used for economic analysis. Previous research by M. Keith Chen (2013) in the American Economic Review also suggests that language has an effect on economic behaviour. Chen argues that a language’s grammatical treatment of the future and the present has an effect on a person’s saving habits (Chen, 2013). Current trends in economic research include the adoption of machine learning techniques; I myself have written about causal ML for economics, here and here. More importantly, NLP is very useful for economics; hence, the importance of the structure of language is directly related to the usefulness of AI for economics.

Economics aside, since the structure of language has relevance, it is natural to apply linguistics to the development of AI. Researchers from the Cognitive Science department at the Rensselaer Polytechnic Institute recently published a book titled “Linguistics for the Age of AI”, which focuses on a linguistics approach to natural language understanding (NLU). Amazingly, the authors, Marjorie McShane and Sergei Nirenburg, have open sourced the entire book, which is made available on the MIT Press website. The authors claim that ambiguities present an obstacle for NLU, and that the human approach is to use context to decipher the meaning behind words. The current NLP paradigm, however, is to use extremely large datasets and the statistical relations between word sequences to determine context, an approach that does not capture meaning. Previously, before the advent of AI language models, knowledge-based systems containing lexical structures and ontologies were used to explain language to machines. Unfortunately, this type of knowledge engineering has a high resource cost; hence the popularity of the alternative statistical approach that lacks an understanding of meaning. “Practitioners proceed as if words were a sufficient proxy for their meanings, which they are not,” criticizes McShane (source).

She also claims that, “at this stage of its development, neuroscience cannot provide any contentful (syntactic or structural) support for cognitive modeling of the type, and with the goals, that we undertake.” Perhaps this is true with regards to NLU; however, the link between the structure of language, synaptic plasticity and unconscious bias provides insight into the role language plays in the encoding of bias in AI models. Supposing that linguistic structure contains expressions of bias, and supposing that this structure directly affects brain activity (e.g., saving habits), it is not far-fetched to suppose that manipulating synaptic plasticity could provide a way of addressing bias encoded in AI language models. I approach this idea from multiple angles in the second article of this series.

Final Thoughts

Critical theory can help AI practitioners think conscientiously about AI Ethics, and neuroscience can provide neural insight into the issue of bias. Simplistically, systems neuroscience relies on biology, cognitive neuroscience relies on psychology and symbolic logic, and computational neuroscience is centered around a quantitative understanding of biological neural networks. Relatedly, it could be said that the goal of artificial neural networks (AI) is to mimic biological neural networks in design and function. Towards that end, synaptic plasticity provides a way of understanding biological neural networks, the mechanism of which could be extended to improve artificial neural networks.

Therefore, in the second part of this series, I contrast human learning with machine learning and discuss the concept of heterarchy to connect critical theory with cognition. I then focus on relating synaptic plasticity to mindfulness meditation, which allows me to explore the idea that it is possible to learn from mistakes. Finally, I return to symbolic AI, and introduce the new field of neurosymbolic AI which is representative of a hybrid approach that melds disparate learning paradigms. This creates room to discuss causality from a critical theory perspective, connecting to the main goal of creating Ethical AI with mindful machines.

If readers are interested in further material that incorporates intersectionality into AI Ethics, I again suggest the following books: Algorithms of Oppression by Safiya Umoja Noble, Weapons of Math Destruction by Cathy O’Neil and Race After Technology by Ruha Benjamin.

Readers who are interested in the topics mentioned in this series can connect with me on Linkedin. I welcome questions and feedback.
