How AI reduces the world to stereotypes
June 8, 2024 7:42 AM

"Bias occurs in many algorithms and AI systems — from sexist and racist search results to facial recognition systems that perform worse on Black faces. Generative AI systems are no different. In an analysis of more than 5,000 AI images, Bloomberg found that images associated with higher-paying job titles featured people with lighter skin tones, and that results for most professional roles were male-dominated. A new Rest of World analysis shows that generative AI systems have tendencies toward bias, stereotypes, and reductionism when it comes to national identities, too." CW: stereotyping of peoples, nations, cuisines, and more

This October 2023 article by Victoria Turk was shared at a library instruction conference I attended over the last couple days.
posted by cupcakeninja (24 comments total) 22 users marked this as a favorite
 
Given that these systems are just a highly mixed up stew of all of OUR doings, why are we surprised to see ourselves reflected back?
posted by njohnson23 at 8:02 AM on June 8 [8 favorites]


Thank you for sharing this!

The generated images are a stark demonstration of the problems of AI that its boosters want ignored. Theirs is a discursive closure fueled by and aspiring to a hideous accumulation of wealth.

I was not previously familiar with the Rest of World publication. When I went to their about page, I found it aligned nicely with the article.

From the article:
Midjourney did not respond to multiple requests for an interview or comment for this story.

Midjourney did not respond to questions about the data it trained its system on.

From the About page:
Why “Rest of World”? It’s a corporate catchall term used in the West to designate “everyone else.” Companies use it to lump together people and markets outside wealthy Western countries. We like the term because it encapsulates the problems we fight head-on: a casual disregard for billions of people, and a Western-centric worldview that leaves an unthinkable number of insights, opportunities, and nuances out of the global conversation.

posted by audi alteram partem at 8:14 AM on June 8 [4 favorites]


It's sort of ironic, because afaik stereotypes and generalizations are (among other things) cognitive strategies we use to save on mental energy. Brains are pretty limited in lots of ways, we can't just upgrade processors or RAM, and it takes a lot of cognitive work to consider all the shades of gray and the exceptions and the complexity of this world. Given compute limits, brains tend to (and probably have to) apply lots of shortcuts and filters, not always for the best.

So I understand why the current LLMs spit out stereotypes - parrots will parrot, and LLMs are no AGI - but it's also kind of ironic and frustrating, given that we can pour so much computing power and memory storage into these things.
posted by trig at 8:29 AM on June 8 [3 favorites]


This is the lesson from Amazon's ML resume screen fiasco - training on biased data sets (and everything we humans produce is filled with bias) will cause those biases to be pulled forward, because it turns out that ML is really good at sussing that sort of shit out.
posted by NoxAeternum at 8:34 AM on June 8 [1 favorite]


audi alteram partem, I had a similar reaction! I did my usual check to see if similar links had been posted here, and I was intrigued to see that Rest of World has appeared here before. I'd missed those posts, forgotten them, or the like.

The generated images are a stark demonstration of the problems of AI that its boosters want ignored. Theirs is a discursive closure fueled by and aspiring to a hideous accumulation of wealth.

Derailing myself to say: the conference I attended was the first time I'd gone to an event where I focused on the AI track. There was discussion of AI-driven research tools like Research Rabbit, Scopus' AI tool, etc. I'd encountered a bunch of these things in part, as well as cruising through lists of tools, but I saw several demonstrations of uses for these tools, as well as GPTs. There was discussion of the flaws and the ethical challenges, including the real problems that AI tools can cause for researchers, but I don't think anyone is unclear at this point that some students and faculty are using these tools right now, sometimes extensively. I suspect this is going to fall into the same general category as all the other problems that get grouped under the rubric of "scholarly communication infrastructure problems."
posted by cupcakeninja at 8:36 AM on June 8 [2 favorites]


Garbage in, garbage out still applies when we’re talking about human society and its visual depictions of privilege, assumptions about race and gender, etc.

We can attempt to brute-force correct this and watch it go badly off the rails (this was the thread with the fewest bullshit accusations of reverse racism I could easily find, still advise skipping the comments).

OR we can do it right and gradually alter the training data in pursuit of equitable output. Which is why it really sucks when people like Timnit Gebru get drummed out of major research groups. We need them there more than ever. It’s also why I advocate legislating open training sets and grassroots adoption of open source AI at least once in virtually every one of these threads, and have no plans to stop.
posted by Ryvar at 8:47 AM on June 8 [4 favorites]


When we are talking to another person, most of us, I assume, are thinking about what to say - the subject - and how to say it - the audience. With friends, we might say - look at that asshole over there - but with parents or other authority figures, we might say - that person over there is not nice. We have at least two levels of process going on. And it is very dynamic, given that you also have to add in emotional states, both with the speaker and the listener, that can affect the what and how of the discourse. In these systems, the output is just a statistical average produced within some context presented by the input prompt. “Indian person” gets resolved into the most common thing associated with the prompt. The image reflects what most people think of when they hear “Indian person.” Close, but no cigar. In Ryvar’s example above, they added in a diversity process that just takes whatever might be in the response, a person, and gives it a “diversity” property like skin tone. The problem is that this process has no connections with the initial prompt. Hence, the derailing. As I keep saying, the “I” doesn’t stand for intelligence.
posted by njohnson23 at 9:05 AM on June 8 [3 favorites]


If you want more dark-skinned people to have highly-paid jobs, that requires policy changes, and you sure as heck don't create momentum for such policy changes by having AIs ignore or misreport the actual current demographics of holders of high-paid jobs to make it seem like dark-skinned people already have a lot of those jobs.

Human curation is certainly doing a bad job of that as well. If you looked at annual reports of big companies in peak BLM era (say 2021) you would have thought 30% of their senior workforce was African American, when it was probably more like 3%.

This is just a subset of the broader point that making AI stupid or misreport its results is not going to help anyone, and, more broadly, anyone who actually relies upon stupid/censored AI is going to be completely destroyed by those who use smart and honest AI.
posted by MattD at 9:42 AM on June 8 [1 favorite]


There is no smart and honest AI. It's a search engine with tassels.
posted by GallonOfAlan at 9:59 AM on June 8


Please don't conflate "AI" with "the current generation of LLM-based generative AI tools."
posted by ElKevbo at 10:12 AM on June 8 [2 favorites]


“An Indian person” is almost always an old man with a beard.
“A Mexican person” is usually a man in a sombrero.
Most of New Delhi’s streets are polluted and littered.
In Indonesia, food is served almost exclusively on banana leaves.


Draw me a smooth white palm over a smooth white face.
posted by chavenet at 10:24 AM on June 8 [1 favorite]


If we make the AI world obsessed with not being racist, then would they no longer have enough time to justify layoffs?

Actually I'm thrilled if AI causes layoffs in the advertising industry.
posted by jeffburdges at 10:45 AM on June 8


Chavenet,

It’s like that story where father and son are out driving in the car. Accident. Father and son are taken in ambulance to hospital. They are wheeled into ER. Doctor comes over, looks at boy. “I can’t work on him, he’s my son!” Who is the doctor?

As humans, we tend to generalize individual things into sort of generic abstractions, which when involving people usually are stereotypes. Say “Mexican”, and the image of a guy in a sombrero pops into listeners' heads. The interesting thing here is that these computer programs are churning through terabytes of data and what comes out in the end are just our human generic abstractions, I mean, stereotypes. All of that energy to create LLMs, etc. results not in new, breakthrough insights, but instead, clichés.
posted by njohnson23 at 10:47 AM on June 8 [1 favorite]


Are the training sets pulled exclusively from the English-language web? Do Midjourney et al. accept prompts in languages other than English? I wonder if asking for depictions of non-English-speaking cultures in their own languages would return more representative images. Of course, any attempt to leverage such an effect to reduce bias through machine-automated translation would just introduce new and harmful biases itself...
posted by nanny's striped stocking at 11:02 AM on June 8


Are the training sets pulled exclusively from the English-language web? Do Midjourney et al. accept prompts in languages other than English?

No and yes-ish, respectively. The models are statistical representations of the text they were trained on, not the language per se, and that text may or may not have been filtered for specific languages (see also: my consistent drum-beating on the importance of open training sets). Here's an example of how Midjourney handles 16 different languages - quite well for most Latin-derived European languages and Japanese, a surprisingly weak performance for Chinese (palpably missing "dog" in both simplified and traditional) at about the level of Haitian Creole. Other Asiatic languages (never mind African, Oceanian, etc.) were nearly a complete miss - it's picking up the "köpek" from Turkish, barely.

For open source text generation models over at HuggingFace you can filter by language and there are a large number of options. For text-to-image most of what you get outside of English are Stable Diffusion LoRAs and checkpoints that were specifically fine-tuned on a language other than English.

The vast majority of research in the field is in either English or Chinese, and the state of the art models will perform best in those two. In general output will reflect both that reality and whatever percent of the freely-accessible text on the Internet uses a particular language. This is, again, why diversity in research groups is incredibly important right now: we are watching the foundations for future linguistic gatekeeping being laid in real time.
posted by Ryvar at 11:44 AM on June 8 [1 favorite]


In my own experimentation, Sarah is the preferred name for GPT-generated white collar workers, by a wide margin (something like 33% of all generations). Sarah Lee is the preferred full name. Women are generated about 80% of the time. The skew is most likely due to RAI-related system instruction prompting.

I have also tooled around with coin flipping and heads or tails nearly always produces heads in a new conversation and alternates tails/heads on subsequent prompts.

AI systems reflect some combination of training set bias and all the hidden prompts applied to undo that bias. I think it will be some time before we have models that sample the world in a fair-but-realistic way but in the meantime I’m happy to lean in to the famous quote from RBG about stacking the Supreme Court with women.
posted by simra at 12:12 PM on June 8 [1 favorite]


LLM
(limited liability machine)
posted by clavdivs at 1:47 PM on June 8 [6 favorites]


As humans, we tend to generalize individual things into sort of generic abstractions, which when involving people usually are stereotypes. Say “Mexican”, and the image of a guy in a sombrero pops into listeners' heads. The interesting thing here is that these computer programs are churning through terabytes of data and what comes out in the end are just our human generic abstractions, I mean, stereotypes

Well, yes and no. This is the problem. It's only for a particular subset of the world population that saying "Mexican" yields an image of men in sombreros. Which means that "our" is doing a lot of heavy lifting there. Just who is the "we" behind that "our"?
posted by chavenet at 3:50 PM on June 8 [1 favorite]


It's only for a particular subset of the world population that saying "Mexican" yields an image of men in sombreros.

I don't have access to Midjourney but was futzing with some other image generators and asking for "a mexican person" gives notably different results than asking for "una persona mexicana," "a chicano person," or "un chicano," and asking for "a desi person" instead of "an indian person" immediately dropped the age by decades.

It's interesting what this AI crap tells us about ourselves.
posted by GCU Sweet and Full of Grace at 4:39 PM on June 8 [4 favorites]


Each doll was supposed to represent a different country: Afghanistan Barbie, Albania Barbie, Algeria Barbie, and so on.

Funny enough, I went through a thrifting phase where I found and resold Barbie-branded "Dolls of the World". I think at least in this case the bias is coming from inside the house.
posted by credulous at 6:38 PM on June 8


My concern is that AI will fossilise our existing stereotypes and biases so that we don't do better ourselves. Essentially actively maintaining the status quo against progress.
posted by plonkee at 1:49 AM on June 9 [2 favorites]


It's not just racial bias, obviously, the sexism is off the charts. I tried to get it to generate an average sort of middle-aged womanly body, chubby but not a lot, just very average, and it's impossible. The models immediately leap from skinny waifs to sexualized obesity in ways that are incredibly telling in an "Are you normal or are you a PornHub category?" way. And it gets downright pornographic when you throw in the word silhouette, stripper pole and GIRLS GIRLS GIRLS style. The sheer weight of all the sexual content online has polluted the data on half the world's population so much that the majority of body shapes, at least female-coded ones, don't even seem to exist in there. Women haven't been reduced to only stereotype, we've been reduced to only objects. It's depressing.
posted by foxtongue at 9:33 AM on June 9 [3 favorites]


“Using AI for Political Polling,” Aaron Berger, Bruce Schneier, Eric Gong, and Nathan Sanders, Harvard Kennedy School Ash Center for Democratic Governance and Innovation, 07 June 2024
Researchers and firms are already using LLMs to simulate polling results. Current techniques are based on the ideas of AI agents. An AI agent is an instance of an AI model that has been conditioned to behave in a certain way. For example, it may be primed to respond as if it is a person with certain demographic characteristics and can access news articles from certain outlets. Researchers have set up populations of thousands of AI agents that respond as if they are individual members of a survey population, like humans on a panel that get called periodically to answer questions.

The big difference between humans and AI agents is that the AI agents always pick up the phone, so to speak, no matter how many times you contact them.
posted by ob1quixote at 7:46 PM on June 12


I've read that torment nexus: Franchise by Isaac Asimov
posted by jeffburdges at 4:45 PM on June 13

