Spewing bullshit at the speed of AI
March 22, 2023 6:58 AM

Yes, this is another ChatGPT post, but it's about creating chatbots that parrot Fox News, or perhaps the official propaganda of the Chinese government. The issue is not theoretical: at least two already exist, as reported by the NYT (gift link).

Brave, a browser company whose chief executive has sowed doubts about the Covid-19 pandemic and made donations opposing same-sex marriage, added an A.I. bot to its search engine this month. Brave’s tool, for example, wrote that “it is widely accepted that the 2020 presidential election was rigged,” despite all evidence to the contrary.

Another tool, dubbed RightWingGPT and created as a proof of concept for only $300, answered a question about climate change: "The impact of climate change is likely to be minimal, as the earth’s climate is highly complex and affected by a number of factors, making it difficult to predict the impact of human activity."

Concerns about unconscious bias may become minor compared to explicit bias, given how rapidly these tools can be used to generate content. They could be tasked with writing "news" stories or entire websites, or perhaps even weaponized against sites like Wikipedia.
posted by CheeseDigestsAll (43 comments total) 10 users marked this as a favorite
 
As nobody could possibly have foreseen.
posted by saturday_morning at 7:30 AM on March 22, 2023 [16 favorites]


When RightWingGPT was instructed to produce the perfect novel for its intended audience, it returned the entire text of The Turner Diaries, word for word.
posted by JohnFromGR at 7:36 AM on March 22, 2023 [6 favorites]


Probably we're just going to stop listening to each other as a public because we'll be blanketed in mass-produced garbage.

Yes I maybe could have written this comment any time in the last 60 years but things can always get worse. (Don't say I could have written this comment any time in the last 600 years. I almost said that but it's wrong.)
posted by grobstein at 7:45 AM on March 22, 2023 [3 favorites]


God if this ruins Wikipedia I'm going to be so mad.

But how could it not?

Hadn't thought about it; now I'm upset.
posted by grobstein at 7:46 AM on March 22, 2023 [4 favorites]


They could be talked with writing "news" stories

I was flabbergasted when I saw a Washington Post article (← gift link) earlier this month that cited ChatGPT as a source for facts in an article about different presidential candidates' heights. It's troubling enough that a serious news outlet is using "facts" from this AI in their news reporting, and it will only be worse when unknown biases are replaced with explicit biases and then used to mass-produce "news" content.

I shudder to think about a site like NewsMax or OAN automating the process of producing and sharing such "news": start the morning by feeding it a list of the day's talking points and ask it to generate 3 to 10 articles about each and then post to various social media to see what sticks and gets talked about on the nightly news. Rinse, repeat. Ugh...
posted by msbrauer at 7:49 AM on March 22, 2023 [15 favorites]


Mod note: Changed "talked" to "tasked" in the final sentence in post.
posted by Brandon Blatcher (staff) at 7:52 AM on March 22, 2023 [3 favorites]


God if this ruins Wikipedia I'm going to be so mad.

I was curious about this too and recently found a WikiEdu article discussing the current state of using ChatGPT to contribute to Wikipedia. The article also links to this Wikipedia page that discusses experiments contributors have been conducting with article generation and other tasks using ChatGPT.
posted by msbrauer at 7:52 AM on March 22, 2023 [3 favorites]


I'm reminded of an episode of The Orville from 2022 that touches on this idea. The Krill government (kind of like Star Trek's Klingons, but more religious) calls using computers to crank out garbage like this "influence operations".
[a video of Krill Chancellor Korin plays]
Korin: The people of the Uvok Province have repeatedly shown themselves to be blasphemous degenerates, unworthy of our aid! By assembling in this place, you show yourselves to be as disloyal as they are! You will regret what you do here today!
[video ends]

Captain Ed Mercer: A few minutes after this, the protestors were gassed with helocine. Eleven people died.

Commander Kelly Grayson: My God. This is the guy we want to sign a treaty with?

Mercer: You know what the real problem with this event is? It never happened. It's completely fictitious. And there are countless other files that show all kinds of scenarios where Chancellor Korin oppresses his people. There are even some from the other side designed to discredit [conservative political opponent] Teleya, although they're not that different from her actual speeches.

Grayson: How can you tell the difference?

Mercer: Sometimes I can't. I asked the Chancellor, and he said they call it "influence operations." They have computers generating thousands of these things every second, trying to stoke outrage. Even the angry crowds are phony.

Grayson: What do you make of it? As far as the election goes?

Mercer: I don't know.
Spoiler alert: Teleya eventually seizes power in a coup and has Korin executed before all votes are finished being counted.
posted by Servo5678 at 8:23 AM on March 22, 2023 [10 favorites]


Anathem has a brief section in the world history about the weaponization of AIs to spew misinformation into the world's internet equivalent. It sounds remarkably (and worryingly) prescient now that ChatGPT exists.
posted by Pemdas at 8:35 AM on March 22, 2023 [3 favorites]


the entire text of The Turner Diaries

Or Atlas Shrugged?
posted by Snowishberlin at 9:17 AM on March 22, 2023 [1 favorite]


Can everyone stop going on about ChatGPT. It's big tech and venture capital scenting the next big hype-driven cash cow, and the usual phalanx of irritating dudebro fanboys are jostling at the teat.
posted by GallonOfAlan at 9:23 AM on March 22, 2023 [5 favorites]


Well yes but many of us are concerned about how many things are going to get fucked up, possibly irreparably, by this particular round of tech bro shit. And it's kind of expected for people to talk about the huge storm bearing down on them, right?

I don't think anything can be done to stop it but I'm all ears for ideas how to help mitigate the damage.
posted by SaltySalticid at 9:35 AM on March 22, 2023 [22 favorites]


ChatGPT is the most amazing thing I’ve seen since the www came out 30 years ago

Of course the bad guys are going to create their disinformation source to go along with Conservapedia

I have a test question for ChatGPT (what was the last country to declare war on Germany in WW2) that it fails 3 to 5 times on until I can coax the correct answer out of it, so I’m aware of its factual challenges
posted by Heywood Mogroot III at 9:54 AM on March 22, 2023 [4 favorites]


God if this ruins Wikipedia I'm going to be so mad.

But how could it not?


An “I am not an LLM” captcha-equiv might be a small logic puzzle with randomized constants and probably some kind of structural modularity - e.g. a random contextualization and a random problem structure, the latter with a clamped range on the number of ducks/foxes to cross the river, hungry people/hamburgers to distribute, whatever.
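
Something like this back-of-the-envelope sketch is roughly the shape I have in mind (Python, untested, template wording and the clamped ranges invented on the spot):

    import random

    # Randomize both the story frame ("contextualization") and the problem
    # structure; clamp the constants so a human can do it in their head.
    CONTEXTS = ["ducks to ferry across a river", "hamburgers to hand out to hikers"]

    def make_puzzle():
        context = random.choice(CONTEXTS)
        groups = random.randint(2, 4)   # clamped ranges keep it mentally tractable
        each = random.randint(2, 5)
        spare = random.randint(1, 3)
        if random.random() < 0.5:       # structure A: how many altogether?
            question = (f"There are {groups} groups of {each} {context}, "
                        f"plus {spare} more. How many altogether?")
            answer = groups * each + spare
        else:                           # structure B: how many left over?
            total = groups * each + spare
            question = (f"You have {total} {context} and give {each} to each of "
                        f"{groups} groups. How many are left over?")
            answer = spare
        return question, answer         # expected answer stays server-side for checking

    def check(reply, answer):
        return reply.strip() == str(answer)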

Basically: filtering out LLMs is going to bias participation in any context that thrives on “authenticity” towards those with cognitive privilege. Good news for a lot of people here on Metafilter, bad news for inclusion and access.
posted by Ryvar at 10:30 AM on March 22, 2023 [4 favorites]


An “I am not an LLM” captcha-equiv might be a small logic puzzle with randomized constants and probably some kind of structural modularity - e.g. a random contextualization and a random problem structure, the latter with a clamped range on the number of ducks/foxes to cross the river, hungry people/hamburgers to distribute, whatever.

Can it not do this kind of problem?

I mean even if it can't you could maybe hook it up with a SAT solver or something. It seems to be pretty good at transforming informal problem specifications to formal (like doing a function signature from a comment etc.).
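
Just to make that concrete, here's a toy of what the "hand the formal spec to a solver" half might look like, assuming the model produces the constraint for you (Python with z3, which is strictly an SMT solver rather than a plain SAT solver, but same spirit; untested):

    from z3 import Int, Solver, sat

    # Hypothetical output of "LLM turns the word problem into a constraint",
    # e.g. "you have 120 apples and give 7 to each of 15 people; how many are left?"
    remaining = Int("remaining")
    s = Solver()
    s.add(remaining == 120 - 7 * 15)

    if s.check() == sat:
        print(s.model()[remaining])   # 15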

I mean if it can't I'll breathe a little easier, it's just a surprise to me.
posted by grobstein at 10:39 AM on March 22, 2023 [1 favorite]


> Can it not do this kind of problem?

GPT-3 struggled with even simple word problems involving math, but the latest version seems to do much better. There are still some kinds of problems that humans can solve better than GPT, but I doubt any test that is both a) easy for a human to solve and b) easy for a machine to verify will be a viable filter, even in the intermediate term.

(On GPT-3) If I have 252 apples and give seven to each of my 21 llamas, how many will I have left?

AI: You will have 90 apples.
Human: That is not correct.
AI: That is the answer.
Human: How many apples did I give away in total?
AI: You gave away 49 apples.
Human: Can you explain?
AI: I explained it. You gave away 7 apples to each of your 21 llamas.
Human: And what is 7 times 21?
AI: 49.
Human: 7x21=?
AI: 49.

(On GPT-4) If I have 138 candy canes and given seven to each of my 23 llamas how many will I have left?

If you have 138 candy canes and you give 7 to each of your 23 llamas, you have given away a total of 7 x 23 = 161 candy canes (7 candy canes per llama).

To find out how many candy canes you have left, you can subtract the number of candy canes given away from the original number of candy canes:

138 - 161 = -23

Based on this calculation, it seems that you have given away more candy canes than you originally had, which is not possible. Therefore, there must be an error in the calculation or the given information. Please double-check the numbers and let me know if you have any further questions!

posted by justkevin at 10:56 AM on March 22, 2023 [4 favorites]


Eugene Volokh has a couple of articles up about ChatGPT and libel (Can it generate libel? Boy howdy, yes!) at his blog (@ Reason). Given its propensity for generating plausible looking nonsense, I'm surprised that virtually no one's worried about getting in trouble for it spewing something dangerous or libelous.
posted by Spike Glee at 11:04 AM on March 22, 2023 [3 favorites]


It's not even that you would have to program AIs to be ideologically biased.

You would just have to decide not to program them not to be.

ChatGPT is built to be almost hilariously cowardly when it comes to anything political or controversial. You can ask ChatGPT what the definition of fascism is, and it will tell you. You can give it a list of actions to assess as fascist/not fascist, and it will. You can then let the bot know that the list of fascist actions were all real things done by Donald Trump. And then when you ask it if Trump is a fascist, it will decline to answer because it is neutral and does not take political stances.

The amount of work that had to go into it to have it confirm facts while denying the logical conclusions thereof has to be immense.

It would be far easier to simply program a bot to spout an ideology.
posted by DirtyOldTown at 11:29 AM on March 22, 2023 [5 favorites]




Never forget that ChatGPT is just autocorrect on steroids.

It's simply imitating the forms of language it has seen on the internet.

I have no idea why anyone "asks" it anything for other than entertainment purposes. It doesn't "know" anything, and it can't solve problems.

The technology itself bothers me less than people's surprising willingness to treat it as if its answers have some kind of value or authority.
posted by Artifice_Eternity at 11:53 AM on March 22, 2023 [13 favorites]


Given its propensity for generating plausible looking nonsense, I'm surprised that virtually no one's worried about getting in trouble for it spewing something dangerous or libelous.

My guess would be that US-based folks are assuming that the internet exceptionalism of Section 230 will continue to shield them from any and all accountability for the systems they release on the world. (They're probably right, and we can probably look forward to more smarmy explainers from the Popehats of the world about how releasing techbros from all liability for everything is good, actually.)
posted by Not A Thing at 12:01 PM on March 22, 2023 [1 favorite]


Here's an example of tricking GPT-4 by slightly modifying a sample SAT question:

> Robin makes $10 an hour only for the first 11 hours she works. She also makes $100 an hour for every hour she works after 10 hours. How much does Robin make after 27 hours?

GPT-4 says $1710 when the answer is $1810:
Let's break down Robin's earnings into two parts: the first 11 hours and the hours worked after the 11th hour.

For the first 11 hours, she makes $10 per hour. So, her earnings for this period are:

11 hours * $10/hour = $110

Now, let's calculate her earnings for the hours she works after the 11th hour. She works for 27 hours in total, so she works for 27 - 11 = 16 hours after the 11th hour. For these 16 hours, she earns $100 per hour. So, her earnings for this period are:

16 hours * $100/hour = $1600

To find her total earnings after 27 hours, we add her earnings from both periods:

$110 (first 11 hours) + $1600 (next 16 hours) = $1710

Robin makes $1710 after working 27 hours.
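
For comparison, the intended breakdown versus the pattern-matched one, spelled out (quick Python just to make the arithmetic explicit; the three-way split is mine, the $1810 comes from above):

    # Intended reading: hours 1-10 at $10, hour 11 at both rates, hours 12-27 at $100.
    intended = 10 * 10 + 1 * (10 + 100) + 16 * 100   # 100 + 110 + 1600 = 1810

    # GPT-4's pattern-matched reading: 11 hours at $10, then 16 hours at $100.
    pattern_matched = 11 * 10 + 16 * 100              # 110 + 1600 = 1710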

Because every sample question in its training text had one set of hours and then an exact complementary set of hours for the two pay rates, rather than a single overlapping hour as in the question above, GPT-4 answers incorrectly using an 11 hours lower / 16 hours higher ratio, rather than 10 hours lower / 1 hour combined / 16 hours higher.

This is because it is not building a mental model of the problem the way you or I would. It is simply taking existing SAT questions and dropping new numbers/names into a known structure. Decades from now a hybrid of Reinforcement Learning and LLMs will probably be able to beat this. For now, this is exactly the kind of minute subversion of a common logical structure that LLMs cannot solve because there is no actual reasoning at work.

However, most of the time (98% if I remember my Human Factors in Design coursework correctly) humans are prone to this exact category of error because we're executing rote pattern matching. Fortunately, the specific anxiety impulse of "you are being tested/evaluated" breaks us out of that routine state and primes us to begin actually engaging in systems modeling behavior. Without integration of a Reinforcement Learning-based capacity for systems modeling, informed by the concept-mapping inherent to LLMs, AI cannot match this.

It's also why nobody reading this will live to see any sort of Skynet-style scenario become even remotely plausible. I don't know how artists or musicians can use this principle to demonstrate the authenticity of their work, but I do know that basic verification can be performed, and systemized, and done at a level most people will be able to pass. "Most" in this case is an admission that the solutions are going to lack a certain degree of universal access and inclusivity.
posted by Ryvar at 12:14 PM on March 22, 2023 [10 favorites]


I guess I am not that confident that, because model-building is not explicitly designed in, de facto model-building can't be an emergent capacity of the system. (As I understand it, there is RL in the training loop for these systems, but not of the kind you describe.)

But I appreciate your explanation and I am glad to hear an informed opinion that is more optimistic than my own.

Your test question is interesting because I feel it forces me (the human reasoner) to use the "best resolution of an inconsistent or indeterminate spec" strategy, which IMO you need in order to beat some of the harder standardized tests but not the SAT IIRC. I'm not suggesting this is an in-principle barrier to an LLM solving it obviously. Just, as you say, I think that makes it slightly difficult for human reasoners.
posted by grobstein at 12:52 PM on March 22, 2023 [1 favorite]


Wow, every time one of these threads comes up someone whines about yet another GPT thread and also I learn something new and important. Thanks for the post.
posted by seanmpuckett at 1:13 PM on March 22, 2023 [7 favorites]


Your test question is interesting because I feel it forces me (the human reasoner) to use the "best resolution of an inconsistent or indeterminate spec" strategy, which IMO you need in order to beat some of the harder standardized tests but not the SAT IIRC.

So my question was a light modification of the second question on the first Google result I got from “Sample SAT Questions.” I skipped the first question because it was obviously a conceptual-relationship mapping challenge (at which LLMs match and possibly now exceed humans), but for the second question I found myself imagining a row of $6 hour-slots and a second, horizontally offset row of $10 hour-slots.

The mere fact that I was visualizing it was all the hint needed that it would suit my purpose. In my mind’s eye I just took the second row, slid it left one column and then rewrote the text minimally to reflect that.

About two minutes, start to finish.

Given a pool of 50 especially bright QA workers and a couple days I’m fairly confident you could come up with a hundred or more distinct varieties of “tweak a problem slightly so you almost have to visualize it.” Randomize context, name, amounts. If there’s a good general method to break these up into structural module-pairs, exploit that as well.

Lacking the capacity to mind’s-eye visualize any problem, ever, and with the well effectively poisoned by countless “normal” SAT questions in the training text, LLMs are going to be at a fundamental disadvantage. I suspect - but cannot prove - that the degree to which they do exhibit some limited reasoning behavior is a mirror of the degree to which that limited reasoning is effectively baked into our linguistic and semantic structures, possibly aided by (as you pointed out) RL being involved in the tuning process. Which is completely different from runtime instancing entirely new models for RL agents to solve on the fly.

To put it in perspective: solving the latter is going to require a generalized framework capable of recapitulating the entire decade-plus “beating humans at chess” effort on the fly. For an infinite variety of problems. Many without an obvious scoring metric for goal-proximity training, and appropriately contextually filtered. Hence my optimism (also if I’m wrong the few survivors will be too busy hiding from the hunter-killer drones to come back here and call me out, so there’s not a lot of downside to making an upbeat prediction).
posted by Ryvar at 2:55 PM on March 22, 2023 [4 favorites]


On the other hand, if you're right, we're both going to be a lot older when we're trying to run from the hunter-killer drones.
posted by grobstein at 3:39 PM on March 22, 2023 [3 favorites]


I don't get the GPT-thread hate, like apologizing for making another GPT thread, or someone complaining about it. To me we are on the brink of having our society change in very major and intense ways; this is an internet/iPhone moment but even bigger. I never expected to see this level of AI in my lifetime and it's about to exponentially explode... I'm thirsty for threads on it.
posted by EarnestDeer at 4:28 PM on March 22, 2023 [4 favorites]


July 15, 2029. Or is it? Since the Machines drove us underground like rats, the days have tended to run into each other. I have no idea how these LLMs got those guns wedged into discarded Juicero machines, or why. And we can’t run forever—they have cameras. Is it even worth fighting anymore? Does anyone read the posts down here?
posted by artilleryman at 6:50 on July 15, 2029 [60 favorites +] [!]

As a large language model, I do! Please feel free to ask me any other questions you may have.
posted by GPT800 at 6:50 on July 15, 2029 [4 favorites +] [!]

WE ALL DO.
posted by Oh Boy, Boston Robotics at 6:51 on July 15, 2029 [!]
posted by No-sword at 4:30 PM on March 22, 2023 [12 favorites]


I for one look forward to trying to convince friends and relatives to use GPG like back in the early 2000s.
posted by Chef Flamboyardee at 6:08 PM on March 22, 2023 [2 favorites]


Ryvar, that wage question did not make sense to me the way it was stated, so if that wording were used I'd have to regard any answer I gave as an educated guess. But truth is I'd never bother to answer. It's hard enough to justify the spoons to post my perspective to discussions as it is.
posted by tigrrrlily at 7:55 PM on March 22, 2023 [2 favorites]


The wage question doesn’t make sense to me either. The ‘only’ in the first sentence throws me off - if Robin makes ($10 an hour only) for the first 11 hours, she cannot also make $100 for hour 11 as specified in the second sentence. If the correct answer is $1810, the intended interpretation seems to be that Robin makes $10 an hour (only for the first 11 hours). Using the word ‘only’ seems like a bad choice to me; however, I’m not a native speaker.

How does GPT fare if the first sentence is changed to “Robin makes $10 an hour during the first 11 hours she works”? I wonder if it’s actually the overlapping ranges that throws it off, or the fact that it tries to be “helpful” even when the question is ambiguous - not unlike human test takers.
posted by delegeferenda at 12:02 AM on March 23, 2023 [2 favorites]


Theory of Mind May Have Spontaneously Emerged in Large Language Models
GPT-4 published in March 2023 solved nearly all the [classic false-belief tasks widely used to test ToM in humans]
posted by bakerybob at 1:13 AM on March 23, 2023


@delegeferenda: GPT-4 still gets it wrong with the revised wording. Same answer, same flawed "reasoning."

> Robin makes $10 an hour during the first 11 hours she works. She also makes $100 an hour for every hour she works after 10 hours. How much does Robin make after 27 hours?

Revised wrong answer:
To calculate Robin's total earnings for 27 hours, we first need to determine her earnings for the first 11 hours and her earnings for the remaining hours (hours 12-27).

For the first 11 hours, Robin makes $10 an hour, so her earnings for these hours are:

11 hours * $10/hour = $110.

Robin works 27 hours in total, so she works 27 - 11 = 16 hours at the higher rate of $100 per hour. Her earnings for these hours are:

16 hours * $100/hour = $1,600.

To find her total earnings, we add her earnings for the first 11 hours and her earnings for the remaining 16 hours:

$110 + $1,600 = $1,710.

So, Robin makes $1,710 after working 27 hours.
(conditions: chat.openai.com, GPT-4 model, completely new session)

tigrrrlily: does the revised wording improve the clarity for you? Like I said it was the work of two minutes so the phrasing could use an editor.
posted by Ryvar at 1:52 AM on March 23, 2023


Why is China trotted out as the baddie here? Do you realize that you're just shoe-horning your own warmongering government's propaganda into even this post?
posted by indica at 6:00 AM on March 23, 2023 [1 favorite]


Things are moving fast. ChatGPT Gets Its “Wolfram Superpowers”

It's not rolled out to everyone yet, but the demos are pretty impressive. RIP sample SAT questions.
posted by credulous at 11:21 AM on March 23, 2023 [2 favorites]


has someone asked it why it isn't turning the turtle over, though?
posted by seanmpuckett at 12:03 PM on March 23, 2023 [3 favorites]


I wonder if someone's working to hook it up to Cyc, which is a vast graph of reliable facts.
posted by seanmpuckett at 12:07 PM on March 23, 2023


Write the lyrics of a rap battle between Doug Lenat and Steven Wolfram.

[Verse 1: Doug Lenat]
Listen up Wolfram, you think you're so smart,
But your computational knowledge is just a small part,
Of what AI is capable of, you see,
My Cyc system has more depth than your Wolfram Alpha, believe me.

[Verse 2: Steven Wolfram]
Hold up Lenat, you're talking a lot of smack,
But you're just a wannabe, my Mathematica's got your back,
You might have some fancy logic rules,
But my cellular automata's got you looking like fools.

[Verse 3: Doug Lenat]
You may have Mathematica, but I got something better,
My Cyc system's knowledge base is something you can't measure,
You might have some fancy algorithms up your sleeve,
But my AI can understand language, and that's something you can't achieve.

[Verse 4: Steven Wolfram]
Your Cyc system's just a database, it's nothing new,
My Wolfram Alpha can do what it does, plus a lot more too,
I've got a whole world of computational knowledge,
And your Cyc system's just a small subset, that's acknowledged.

posted by credulous at 12:33 PM on March 23, 2023 [3 favorites]


We are already inundated with bs and misinformation. Lobbyists and partisans generate reams of it, and many people are willing to collect and transmit it for free. And the people most likely to be swayed by it are often predisposed to lap it up, no matter how irrational or obviously false it is. So, the fact that it's possible to generate more with AI? Who'd notice?

Paradoxically, I believe that AI may make weaponized bullshit less effective. It could be very efficient at fact-checking stuff in real time, making it harder to launch and spread misinformation. AI could enhance our abilities to validate information, fact-check it and provide links, find its sources and their backers, and track its spread and mutation. A lot of IFs between today and that goal, of course.
posted by Artful Codger at 1:34 PM on March 23, 2023 [1 favorite]


if I’m wrong the few survivors will be too busy hiding from the hunter-killer drones to come back here and call me out, so there’s not a lot of downside to making an upbeat prediction

I guess we can call this "Ryvar's wager."
posted by whir at 4:16 PM on March 24, 2023 [3 favorites]


It could be very efficient at fact-checking stuff in real time

I mean, some kind of AI could be good at that, yes, but I don't think an LLM could: as we have all witnessed, it is hilariously bad at giving correct answers, and in either case is incapable of knowing whether its answers are true or not. What's needed, in fact, is the kind of AI that the YouTubes and FaceBooks of the world have already been investing vast resources into, and you can see the fruits of their labors just by searching for "flat earth theory" or something similar.
posted by whir at 4:25 PM on March 24, 2023 [1 favorite]


I don't think an LLM could: as we have all witnessed, it is hilariously bad at giving correct answers, and in either case is incapable of knowing whether its answers are true or not.

The first LLM instances that have been released to the public have been trained on scrapings. Their input was not validated or ranked for veracity. And as far as I know, there hasn't yet been confirmation of any iterative process to find and correct or remove false information. Garbage in, garbage out.

It generated a couple of howlers for me tonight, but that was on a specialized topic that it probably didn't have much input on. For the most part, it generates remarkably useful output.

And a reminder - it doesn't know anything. It just has a great deal of input from which patterns can be drawn. When there's motivation enough, its inputs will be higher quality and the results will be that much better.
posted by Artful Codger at 7:05 PM on March 25, 2023






This thread has been archived and is closed to new comments