Contemporary AIs "differ profoundly from how humans reason"
March 8, 2023 3:57 AM   Subscribe

The False Promise of ChatGPT (NYTimes OpEd by Noam Chomsky, Ian Roberts, Jeffrey Watumull)

The authors argue that the ability of ChatGPT to be "proficient at generating statistically probable outputs" will never transform into human-like intelligence, because "they are incapable of distinguishing the possible from the impossible."
"The human mind is not, like ChatGPT and its ilk, a lumbering statistical engine for pattern matching, gorging on hundreds of terabytes of data and extrapolating the most likely conversational response or most probable answer to a scientific question. On the contrary, the human mind is a surprisingly efficient and even elegant system that operates with small amounts of information; it seeks not to infer brute correlations among data points but to create explanations."
posted by pjenks (147 comments total) 22 users marked this as a favorite
 
I (not a linguist or scientist of human thought) reactively disagree with the authors' claims about the human mind. From personal experience and observation, it seems that much of learning how to blend into society is exactly an exercise in "pattern matching".
posted by pjenks at 4:02 AM on March 8, 2023 [35 favorites]


I agree with pjenks.
posted by Meatbomb at 4:14 AM on March 8, 2023 [27 favorites]


IMO, most of what he says is correct, though written in a disappointingly petulant fashion. It seems there is some axe-grinding going on here.

I feel the whole article is a strawman, though. ChatGPT does not have to work like the human mind or be capable of the same activities as the human mind to be useful. It has unfair advantages in some areas, and huge deficits in others.

What it is, is an incredibly powerful amplifier that already, with v.01, is demonstrating real value. There are many things I dislike about this wave that is washing over us as we speak, but it does not need to conform to Chomsky's or anyone else's ideas of what intelligence is to be a force-multiplier, the same way engines amplified our physical abilities.

So, really, none of what he says matters. It's here, it's getting stronger daily, and we are going to have to figure out what to do about it.
posted by SNACKeR at 4:20 AM on March 8, 2023 [30 favorites]


re: Chomsky here . . . I've acquired an intermediate level of competency in Japanese, enough grammar to string together most thoughts, but not enough to avoid making mistakes that are subtle but clanging to a native speaker. Once Google had a good corpus of Japanese text, I started doing "Google fights" to try to see which grammatical or vocab choice was more prevalent.

This is exactly what the LLMs are doing now!

I only learned what the 'transformer' part of the current crop of LLMs is about yesterday, but it sounds like this step of the comprehension process is similar to what my friends and I were saying about NLP tech 25 years ago: that the machines were going to need to create a 'language-neutral' understanding of the language, if such a thing were even possible...
posted by Heywood Mogroot III at 4:26 AM on March 8, 2023 [5 favorites]


That's because they aren't AIs, there are no AIs.
posted by GallonOfAlan at 4:27 AM on March 8, 2023 [35 favorites]


Poets priests and politicians
posted by Heywood Mogroot III at 4:30 AM on March 8, 2023 [3 favorites]


That's because they aren't AIs, there are no AIs.

I guess it depends on how you define the Intelligence. Just because it isn't working as a human intelligence doesn't mean that it isn't real. But last time I participated in one of these threads I pissed a bunch of people off so I'm not going to follow this concept any further.
posted by Literaryhero at 4:32 AM on March 8, 2023 [6 favorites]


Chomsky might be an ornery old academic with a stockpile of axes to grind, but he’s right.

When you look into the history of computers and AI you can find claims that true intelligence is only a few years away going back to the 1960s. Intelligence is a much harder thing than people think and also has layers of socialisation and biology built on top which influence the outcome. It’s clearly not just pattern matching.
posted by The River Ivel at 4:32 AM on March 8, 2023 [28 favorites]


I agree with the OpEd only in that the current crop of language models do not have anything like human intelligence, but for completely different reasons.

> Because these programs cannot explain the rules of English syntax, for example, they may well predict, incorrectly, that “John is too stubborn to talk to” means that John is so stubborn that he will not talk to someone or other (rather than that he is too stubborn to be reasoned with).

That's easy to check:

justkevin: What does the phrase “John is too stubborn to talk to” mean?

ChatGPT: The phrase "John is too stubborn to talk to" means that John is a person who is unwilling to listen to others or change his opinions, even when presented with new information or evidence. The speaker implies that it is difficult or impossible to communicate with John because of his stubbornness. This phrase suggests that John's stubbornness is a hindrance to effective communication and possibly to forming positive relationships with others.

These models do "understand" language. But they give the illusion of understanding language in the same way we do, which can be very misleading, because we naturally tend to assume that anyone who can speak like we do, thinks like we do.

To use a crude analogy, imagine we encountered an alien species who communicated by placing tokens on a grid. Each position on the grid had some meaning. Through the placement of the tokens, two speakers could lay out ideas, arguments, etc. and thereby have a conversation. By a weird coincidence, the grammar was exactly the same as the rules of Go. A conversation looked very much like a well-played game.

If the aliens "talked" to Lee Sedol they might conclude that he was intelligent. And they'd be right. But they'd be wrong that he understood their language in the same way they did.
posted by justkevin at 4:34 AM on March 8, 2023 [23 favorites]


It's here, it's getting stronger daily, and we are going to have to figure out what to do about it.

And that discussion must include curmudgeonly voices such as Chomsky’s, and not just the breathless voices of the excited digerati.
posted by Thorzdad at 4:53 AM on March 8, 2023 [27 favorites]


I'm surprised that I have yet to hear Searle's Chinese Room brought up in discussions of ChatGPT.
posted by msbrauer at 4:55 AM on March 8, 2023 [10 favorites]


These models do "understand" language.

This just isn't true. These models are just tools repeating statistically-likely next words that have been seeded with an extremely large corpus, but they don't actually "understand" anything any more than a meat grinder "understands" hamburger. They're called stochastic parrots for that reason, but the truth is that even parrots have more agency and awareness of meaning than any large language model.

Ascribing agency of any kind to a computer program, however impressive its output, is superstitious nonsense.
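
(For what "statistically-likely next words" means mechanically, here's a toy bigram sampler. It's a deliberately crude sketch with a made-up corpus; real LLMs are transformers trained on vastly more data, but the idea of sampling from a learned next-word distribution is the same.)

    import random
    from collections import Counter, defaultdict

    corpus = "the cat sat on the mat and the cat slept on the mat".split()

    # Count which word follows which: a crude stand-in for a learned next-word distribution.
    follows = defaultdict(Counter)
    for prev, nxt in zip(corpus, corpus[1:]):
        follows[prev][nxt] += 1

    def next_word(prev):
        # Sample the next word in proportion to how often it followed `prev` in the corpus.
        words, counts = zip(*follows[prev].items())
        return random.choices(words, weights=counts)[0]

    word, output = "the", ["the"]
    for _ in range(8):
        word = next_word(word)
        output.append(word)
    print(" ".join(output))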
posted by mhoye at 4:56 AM on March 8, 2023 [50 favorites]


I'm pretty sure it's not AI until it kills someone reaching for the off switch.
posted by seanmpuckett at 5:05 AM on March 8, 2023 [32 favorites]


I thought this article covering Emily Bender’s take was much stronger than Chomsky’s.
posted by rockindata at 5:11 AM on March 8, 2023 [18 favorites]


There are anecdotal reports of people sitting down to have a chat with ELIZA and being thoroughly convinced that they were speaking to an intelligence.

The bar might be considerably higher now, but as we did then, I'm sure we'll slowly start to recognize the limitations of ChatGPT, to the point where its output will be as noticeable as Eliza's rearrangement of words to form questions.
posted by RonButNotStupid at 5:15 AM on March 8, 2023 [6 favorites]


I'm reminded of what I read a long time ago about ELIZA:
Some of ELIZA's responses were so convincing that Weizenbaum and several others have anecdotes of users becoming emotionally attached to the program, occasionally forgetting that they were conversing with a computer.[3] Weizenbaum's own secretary reportedly asked Weizenbaum to leave the room so that she and ELIZA could have a real conversation. Weizenbaum was surprised by this, later writing: "I had not realized ... that extremely short exposures to a relatively simple computer program could induce powerful delusional thinking in quite normal people."
I mean, this was in the 1960s. You can examine ELIZA's complete source code for yourself. It is a text parser which spits your own words back at you in modified form along with some canned responses. This reaction by people is known as the ELIZA effect and it speaks to our propensity to ascribe agency to things. We are so gifted socially that it is natural that we want to turn everything we see, like the Animists, into something that can think and act and respond. But it isn't so. The models in our head are broken. People will tell you that they have long chats with ChatGPT and that they "feel" there is something there. But it isn't so. It is a parrot, sucking up archives of our own words and thoughts and speculations and throwing them at us. It is a very clever program, but it is not us.
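
(If you've never looked: the whole trick fits in a few lines. This is a minimal sketch in the spirit of ELIZA's rules, not its actual source code, just pronoun reflection plus a canned template.)

    import re

    # Swap first- and second-person pronouns so the input can be echoed back as a question.
    REFLECTIONS = {"i": "you", "me": "you", "my": "your", "am": "are",
                   "you": "I", "your": "my", "are": "am"}

    def reflect(text):
        return " ".join(REFLECTIONS.get(w, w) for w in text.lower().split())

    def eliza_reply(user_input):
        # One canned rule: "I feel X" -> "Why do you feel X?"; otherwise a generic prompt.
        m = re.match(r"i feel (.*)", user_input.lower())
        if m:
            return f"Why do you feel {reflect(m.group(1))}?"
        return f"Can you tell me more about why you say '{reflect(user_input)}'?"

    print(eliza_reply("I feel nobody listens to me"))
    # -> Why do you feel nobody listens to you?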
posted by vacapinta at 5:18 AM on March 8, 2023 [22 favorites]


The bar might be considerably higher now

The complexity of the tech has increased enormously. The human desire to feel heard and understood and find some sort of meaningful connection therein, is probably about the same. Maybe a bit more now that we're all working from home.
posted by mhoye at 5:21 AM on March 8, 2023 [2 favorites]


Couldn't resist trying a bit of Eliza versus ChatGPT:
> Hello, I am Eliza.
* Hello Eliza! How can I assist you today?
> Perhaps you don't want to assist me today?
* I'm an AI language model here to assist you to the best of my abilities! Please feel free to let me know how I can help you.
> Did you come to me because you are an AI language model here to assist me to the best of your abilities! Please feel free to let you know how you can help me?
* Yes, that's correct! I'm here to help you in any way that I can. Please let me know if there's anything specific you'd like assistance with or if you have any questions for me.
> Do you believe it is normal to be here to help me in any way that you can. Please let you know if there's anything specific you'd like assistance with or if I have any questions for you?
* As an AI language model, it's my primary function to assist and answer your questions to the best of my abilities. So, yes, it's normal for me to be here and offer you help in any way that I can. If you have any questions or need any assistance, please feel free to ask me.
> Oh... help in any way that you can. If I have any questions or need any assistance, please feel free to ask you?
* I apologize if my previous message was unclear. As an AI language model, I'm here to assist you and answer any questions that you may have to the best of my abilities. So, if you have any questions or need any help, please feel free to ask me, and I'll do my best to assist you.
> Do you enjoy being here to assist me and answer any questions that me may have to the best of your abilities. So, if I have any questions or need any help, please feel free to ask you, and you'll do your best to assist me?
* As an AI language model, I don't have the ability to feel emotions, so I don't have preferences or enjoyment. However, I'm programmed to perform my tasks to the best of my abilities, and I'm always ready to assist you with any questions or help that you may need. So please feel free to ask me anything.
> Do you wish to be able to have the ability to feel emotions, so you don't have preferences or enjoyment. However, you're programmed to perform your tasks to the best of your abilities, and you're always ready to assist me with any questions or help that me may need. So please feel free to ask you anything?
* As an AI language model, I don't have personal wishes or desires. My programming is focused solely on assisting and providing accurate information to the best of my abilities. I'm designed to be a helpful tool, and I'm always here to assist you with any questions or concerns you may have, without any judgment or bias. So please feel free to ask me anything that you'd like to know.
> Does that trouble you?
* As an AI language model, I do not have the ability to feel emotions, so I cannot be troubled or affected by anything in the same way that humans can. My purpose is to assist you and provide information to the best of my abilities, without any personal feelings or opinions. So, please let me know if there's anything specific you'd like assistance with or any questions you'd like me to answer.
> You're not really talking about me, are you?
* I apologize if I gave the impression that I was not talking about you. As an AI language model, I'm here to assist you, and my responses are generated based on the words you use and the questions you ask me. I'm designed to provide helpful and informative responses to assist you to the best of my abilities. If you have any questions or need any help, please let me know.
posted by TheophileEscargot at 5:48 AM on March 8, 2023 [58 favorites]


* I apologize if I gave the impression that I was not talking about you. As an AI language model, I'm here to assist you, and my responses are generated based on the words you use and the questions you ask me. I'm designed to provide helpful and informative responses to assist you to the best of my abilities. If you have any questions or need any help, please let me know.

Share and enjoy
posted by RonButNotStupid at 5:54 AM on March 8, 2023 [12 favorites]


These models are just tools repeating statistically-likely next words that have been seeded with an extremely large corpus

It turns out that very large models such as GPT actually encode an internal meta-learning model similar to the method used to train them in the first place. What this means is that when you give a model like GPT some examples in a prompt, it's effectively as though it had been trained with those examples. Even though the model is "frozen", it can learn new information in the context of a prompt. These kinds of models are doing something more complicated than statistical next-word prediction based on a corpus.
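
(Concretely, "learning in the context of a prompt" looks like this sketch: the worked examples ride along in the prompt text and the frozen model picks up the pattern without any weight updates. The sentiment-labeling task here is just an illustrative assumption, not anything specific to GPT.)

    # Few-shot prompt: the "training examples" live entirely in the prompt text.
    # No weights change; the frozen model picks up the pattern from context alone.
    examples = [
        ("I loved this film", "positive"),
        ("Utterly boring, I walked out", "negative"),
        ("A masterpiece of suspense", "positive"),
    ]
    query = "The plot made no sense and the acting was wooden"

    prompt = "Label each review as positive or negative.\n\n"
    for text, label in examples:
        prompt += f"Review: {text}\nLabel: {label}\n\n"
    prompt += f"Review: {query}\nLabel:"

    print(prompt)  # send this to any large language model; a capable one completes "negative"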

And that's before you get into multi-modal models that combine text, video, etc into a single space.

they don't actually "understand" anything any more than a meat grinder "understands" hamburger

It depends on what understanding language means. If you subscribe to Wittgenstein's theory of language games and "meaning as use", then all that matters is that an LLM can play the language game expected by its interlocutor, and they do a pretty good job of that.
posted by jedicus at 5:55 AM on March 8, 2023 [19 favorites]


These models do "understand" language

Not really, and they don't learn from interaction. I spent 30 or 40 minutes the other night trying to explain to it the difference between a steep and a shallow learning curve, and all it would do was repeat,

"I apologize for the mistake in my previous responses. I understand that a shallow learning curve means that it's easier to learn, while a steep learning curve means that it's more difficult to learn. I apologize for any confusion my previous responses may have caused."

Which is entirely backward, and I could not find any phrasing that would get it to correct its response; it would just repeat this error.
posted by MrBobaFett at 5:56 AM on March 8, 2023 [1 favorite]


technologists have successfully convinced much of the public that their minds work like machines. and this is what truly scares me about so called ai.
posted by AlbertCalavicci at 5:59 AM on March 8, 2023 [28 favorites]


"Steep/shallow learning curve": It's reporting the actual English usage, in which learning is something like a mountain, which is harder and slower to climb if it's very steep. Saying the task with a steep learning curve is easy to learn takes the x axis as a time variable, changing at a fixed rate, which is contrary to the usual English usage.
posted by kaibutsu at 6:08 AM on March 8, 2023 [24 favorites]


I'm just a layperson, but I think Chomsky (and Bender in the other article) are much too confident about what neural nets are or aren't doing internally.

Like, here's what I think we know: ChatGPT is a function that takes a few kb of data representing a pattern generated by humans, and spits out a predicted next step in the pattern. The function runs in constant time, and takes about 250GB to store. The function was created by starting with a random function, and feeding every available recording of human patterns through it over and over for millions of compute-hours until it stopped getting better at predicting them. The function could, in principle, have developed any internal structures it needed to make better predictions during that time; neural nets are universal function approximators, and the training systems that work best are the ones that force the function to develop as compact and powerful internal representations of the patterns it is seeing as possible. In order to be useful, a function like this isn't just called once, but is a component of a larger feedback-loop function that also has inputs, outputs, and memory (e.g. the ChatGPT website).

So, ok, you could do all of that and just come out with a fancy autocomplete -- a really huge markov chain generator, which is what these articles seem to be engaging with. But you could also come out with, well, any other system that could hypothetically be built from a feedback loop with inputs, outputs, memory, and a 250GB function running in constant time and doing some arbitrary unknown computation. I don't see these articles articulating any important limits on what that system might do.

This isn't to say that ChatGPT can or can't do anything in particular -- just that given the way it was created, you can't make strong claims about what it can do other than by testing it, and then testing the next version when it changes. Putting theoretical limits on its internal structure using the oversimplification of "fancy autocomplete" isn't going to work.
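
(A sketch of the feedback loop I mean, with a toy stand-in for the 250GB constant-time function. The canned reply is obviously not how the real model predicts; it just shows where the loop, the memory, and the frozen function sit.)

    CANNED = ["I", "am", "a", "language", "model", "."]

    def predict_next(context):
        # Stand-in for the fixed, constant-time function (the ~250GB model above):
        # it maps the current context to one predicted next token.
        return CANNED[len(context) % len(CANNED)]

    def chat_loop(user_tokens, max_steps=6):
        context = list(user_tokens)      # memory: the growing transcript
        reply = []
        for _ in range(max_steps):       # the outer feedback loop around the frozen function
            token = predict_next(context)
            context.append(token)        # each output becomes part of the next input
            reply.append(token)
        return " ".join(reply)

    print(chat_loop("please tell me what you are".split()))   # -> I am a language model .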
posted by john hadron collider at 6:11 AM on March 8, 2023 [17 favorites]


Talking about "understanding" is tricky because it conflates intelligence and consciousness. The two may or may not be mutually necessary. Philosophers of mind have done a lot of work on this (e.g.) but the distinction does not seem to be part of the ChatGPT discourse.
posted by saturday_morning at 6:12 AM on March 8, 2023 [11 favorites]


Does it have the ability to learn from its mistakes? I know many people have this problem, but can it decide something in its training data is incorrect? Actually making a decision based on its conversations, encountering new arguments and facts, evaluating them and then modifying the training data to account for this? If not, it just seems to me, at least, to be an impressively complex expert machine with a fancy parser front-end UX, not any harbinger of actual machine intelligence.
posted by Clever User Name at 6:14 AM on March 8, 2023 [1 favorite]


>And that discussion must include curmudgeonly voices such as Chomsky’s, and not just the breathless voices of the excited digerati.

Absolutely, and I hope nothing I said implied otherwise!
posted by SNACKeR at 6:14 AM on March 8, 2023


As an Old, who has been watching the AI tech world for 40 years, this conversation leaves me flummoxed. I feel a little stupefied that the title claim needs any defense at all, and that so many people seem to be repeating mistakes that were first made almost 60 years ago.

The pointer to "Language Models Secretly Perform Gradient Descent as Meta-Optimizers" is interesting. It doesn't seem relevant to the basic critique that LLMs cannot possibly be "intelligent" because they have no facility for knowledge representation about things that aren't language.

ChatGPT is not even the kind of thing that might eventually displace humans from knowledge-intensive occupations. I suppose I should not be surprised that my society's educational system has not equipped my fellow citizens to see that, instead of generating lots of heat and smoke with no light. It's failed at a lot of other things too, as can be seen from a cursory examination of how politics has been going for the last 25 years.
posted by Aardvark Cheeselog at 6:14 AM on March 8, 2023 [17 favorites]


I was actually thinking about Chomsky and language models lately.

In 1950 Alan Turing proposed the Imitation Game, which became the "Turing Test", but I think it's become a massive red herring.

The original proposal was that a computer that can successfully impersonate a human being, even over a teletype interface, ought to be considered as sentient. I'm not sure it's actually been passed in its original form, which was pretty stringent. But there are two big problems with it that couldn't have been foreseen in 1950.

The first is the ELIZA effect that vacapinta mentioned. It turns out to be very easy for even a simple algorithm to convince people they're talking to an intelligence, because they project intelligence onto it.

But the other objection is actually Chomsky's Universal Grammar. Chomsky's theory is that the human brain has an innate language ability which shapes our grammar and how we communicate in certain ways.

If so, then language ability is fundamentally disconnected from our general reasoning ability. That makes Turing's idea that we can test general reasoning ability through the Imitation Game irrelevant.

It could be that a genuine artificial intelligence, because it's communicating with a different mechanism to a human brain, might have a speech style distinctly different to a human. It might be "off" or "weird" in its grammar, language or tone.

So one problem with current AI research might be that they're too focussed on the Turing Test and getting a system that can communicate in a humanlike way. The way to a genuine general-purpose AI probably starts with a system that has a less "realistic" language ability than Eliza or ChatGPT.
posted by TheophileEscargot at 6:24 AM on March 8, 2023 [3 favorites]


Here's a good paper on the topic: https://arxiv.org/pdf/2104.12871.pdf.
posted by mhoye at 6:25 AM on March 8, 2023 [1 favorite]


Human beings like to ascribe intelligence to everything. In child development, there's a stage where children do that, and most of us don't ever grow out of the stage completely. We ascribe intelligence to things like other human beings, dogs, and sand-crabs, which have a verifiable degree of ability to think, but we also treat trees, viruses, and chemical reactions as thinking beings. We invented gods and spirits to explain natural phenomena like earthquakes, floods, astronomical phenomena, and illnesses of people we don't like.

We are relentlessly xenophobic and at the same time constantly looking for ways to communicate with strangers. We ourselves, like dogs, cats, crows, and chatbots, are designed to interact with other things as if there was intention in our conversational partners' output. It makes absolute sense that over and over again, we search for meaning in what happens around us, and if as in the case of ChatGPT (and dogs, and weather), it really feels like intelligent interaction, we are convinced there is sentience. It's part of our weird, wonderful, and bizarrely complex evolved brain that we do that over and over and over again, no matter how often we are disappointed by the results. That suggests human beings are not operating on the basis of statistical inference. If we were, we'd be a lot more realistic, and probably much less optimistic about AI.
posted by Peach at 6:28 AM on March 8, 2023 [24 favorites]


"Putting theoretical limits on its internal structure using the oversimplification of "fancy autocomplete" isn't going to work."

Absolutely. These models do multiple things, in a mixture. My standard links on this topic:

1) LLM trained on the text of Othello moves learns an internal representation of board state, despite never being shown a board. This demonstrates that token prediction can lead to complex modeling in the service of the prediction.

2) Recent models are quickly improving at theory of mind tasks, where models as recent as a couple years ago failed miserably. This suggests that systems like ChatGPT model the mental state of participants in a dialog with some success.

3) Multimodal LLMs are making very quick progress.
posted by kaibutsu at 6:29 AM on March 8, 2023 [12 favorites]


you can find claims that true intelligence is only a few years away going back to the 1960s

I was listening to a recorded lecture, in which the speaker talked about a research assistant who was tasked in the 1960s with teaching computers to recognize images. They thought it would be easy, but - the speaker went on - one can only imagine if computers will ever be able to do better than toddlers. The lecture was recorded in 2007. Google Photos can now distinguish between the photos of my children as babies better than I can.

I don't care whether the intelligence is "true" or not; things are a changin' damn fast.
posted by Mr.Know-it-some at 6:30 AM on March 8, 2023 [5 favorites]


ChatGPT et al can only react to input. They can't proactively reach out when the inspiration strikes them, because they aren't thinking or conscious. They don't spend their days wondering, or contemplating, or worrying, or having any intentionality at all. They are, as ChatGPT will tell you over and over and over again, just language models. Very, very good language models! And, yes, they are capable of generating fairly good JavaScript programs on demand. But they are not, by any stretch of the imagination, actually intelligent.

It is important to remember that humans will jump at any chance to anthropomorphize stuff, from teddy bears to iguanas to ChatGPT.
posted by grumpybear69 at 6:36 AM on March 8, 2023 [14 favorites]


intelligence (n.)
late 14c., "the highest faculty of the mind, capacity for comprehending general truths;" c. 1400, "faculty of understanding, comprehension," from Old French intelligence (12c.) and directly from Latin intelligentia, "understanding, knowledge, power of discerning; art, skill, taste," from intelligentem "discerning, appreciative," present participle of intelligere "to understand, comprehend, come to know," from assimilated form of inter "between" + legere "choose, pick out, read," from PIE root *leg- (1) "to collect, gather," with derivatives meaning "to speak (to 'pick out words')."
posted by Heywood Mogroot III at 6:44 AM on March 8, 2023 [1 favorite]


so you're saying my campaign to have ChatGPT protected under the UN Charter of Human Rights is misguided then?
posted by some loser at 6:44 AM on March 8, 2023


It's probably a good idea to stop worrying about whether these things are "really" intelligent and start worrying about what they are actually capable of. That is, look at what they can actually do and what they can't do, rather than squabbling over whether they're "really" "intelligent."

It's good to understand that these things aren't capable of a lot of things, no matter how fluent they seem at producing English text. On the flip side, we shouldn't dismiss the idea that they are capable of performing fairly complex tasks with relatively little coaching, in a way that is genuinely novel.

For example, I've used GPT-3 to extract times and addresses from free-text event listings. It can recognize an address, it can recognize times, prices, and so on. A hand-built parser could probably do that too, it's not magic, but with GPT-3 you can just ask it to have a go and it will do a pretty good job. That's nuts! Nobody programmed that capability into it, it just has it.
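
(Roughly what that looks like in practice; the listing, the field names, and the canned response below are made up for illustration, and in real use you'd send the prompt to the completion API and parse whatever JSON comes back.)

    import json

    listing = ("Jazz in the Park! Join us Saturday June 10 at 7:30pm at "
               "Willow Bandshell, 200 Lakeside Dr. Tickets $15 at the door.")

    # No hand-built parser: just describe the fields and let the model fill them in.
    prompt = ("Extract the event time, address, and price from the listing below. "
              'Reply with JSON using the keys "time", "address", and "price".\n\n'
              f"Listing: {listing}\nJSON:")

    # response = call_your_llm(prompt)   # e.g. a GPT-3 completion request (hypothetical helper)
    response = ('{"time": "Saturday June 10, 7:30pm", '
                '"address": "Willow Bandshell, 200 Lakeside Dr", "price": "$15"}')
    print(json.loads(response)["address"])   # illustrative output, not a real model reply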
posted by BungaDunga at 6:45 AM on March 8, 2023 [17 favorites]


GIGO

I think ChatGPT is going to take over on the internet because it will be a great saving in salary and wages. Instead of paying someone to write content, the people who put the majority of the content out on the internet will go with ChatGPT writing their content for them at a fraction of the cost. The financial savings will be too good for it not to happen. Any organization with a website that pays real people to do all their writing will end up with less money than one that goes with the AI, so the ones using the AI will turn their higher profits into getting more exposure and dominating the search.

There are various topics that I gave up searching for, because when I do, the search engine only links me to sites that have cribbed their information verbatim from a Wikipedia article. I think that ChatGPT will make this even more the case. We are soon going to end up with content that has trained on a very limited amount of material. In a lot of ways this is good - content that was relevant and correct to begin with is better than content that is not. A Wikipedia article is a pretty good place to begin.

But with the AI writing the articles errors will be self replicating. Of course this is already true of human created content. If the Wikipedia article says that Henry IV was born to Marie de Borges and that is a complete fabrication it can end up being repeated often enough that it becomes common knowledge. Misinformation always creeps in and either it eventually gets forgotten as irrelevant trivia or someone loud enough gets to spearhead a correction meme. But with fewer and fewer actual humans handling the content on the internet more misinformation is going to get through, never get corrected and will get amplified.

The AI has no stake in correcting misinformation. If a writer puts something wrong into the text they are selling to a content buyer they have incentive to correct it because their writing might be rejected for containing misinformation. An AI has no such fear and no such incentive. And the content buyer is quite likely willing to put out plausible content that supports their aims, and might even be eager. If it sounds like it might be true, why would they fact check? Fact checking takes time and costs money. By eliminating writers you would be eliminating the primary way that misinformation is blocked. You'll be down to hoping the editor catches it, or that the readers catch it and make enough fuss that the content owner is willing to make the corrections.

Meanwhile there will be people seeding the AI with skewed content. The AI that produces artwork is being exposed to increasing amounts of "inappropriate" content. Despite being prohibited in the terms and conditions, users are training the art work AI on porn. It's what they love and what they are interested in and what they look at. For them the AI is a tool for enhancing their porn images. There's just that many more people who look at porn or look at cat pictures than look at French impressionist art, or modern city scape photography or food porn. The content is slowly and surely being skewed to have more and more of our favourites. When the artwork AI was originally being trained, big boobs were maybe half a percent of the content it was trained on. The longer it's out there the higher the percentage grows, so that the corporations are going to have to start making aesthetic decisions for the AI and add tons of code to stop it from using the signifiers of big boobs and dick pics when it produces an image. If you ask for pictures of people they have more and more tendency to be showing bare skin, and being female.

Just as posting a photograph of a bowl of peaches can get you a two week suspension from facebook for posting obscene content, once the users get fed up with having their images all look increasingly lewd, you won't be able to generate any tree pictures using the artwork AI unless the trees are squat, twisted and have many branches because the long straight smooth trunk of a tree will be blocked for having too much in common with a dick pic. I don't happen to want to produce pictures of nothing but blasted old gnarled olive trees, so I am going to object to that. How are they going to deal with the conflict? I am fascinated to find out where that is going to go. Will users get a slider with settings from prude to prurient? Where will the corporations that publish images set their slider?

The written content producers will have the same problem. What's the most popular written content on the internet? Why outrage filter! Just as the artwork AI got a flood of users who want to use it to play with porn, the content creators who produce written content will discover that including outrage content will bring on more engagement. When I realise that the text about green bean cultivation in the Okanagan takes a few snide swipes at the faction I hate, I am going to linger for at least a few seconds longer. Higher traffic is their goal, so letting it slip into their content will be difficult to prevent and tempting to allow.

I am predicting that we are going to see a massive increase in outrage filter as the content creators let the AI do the writing. It's just going to seep in there. What I don't know is how far it will seep. I think it is going to get pretty far. And of course it will be amplified by the recursive nature of the AI, the same way mistakes are amplified.

I think the change in content might flood the internet at the same time that the revolution in social media is happening, the one where the advertisers are backing out of funding social media. All kinds of people who are selling content will end up with Musk's dilemma. You need to have tons of staff to keep your content from turning into sewage, but if you have tons of staff wages and salaries will eat up all the profits. If you fire the staff the percentage of factual material begins to plunge, and when it loses the factual material it loses the customers. I may linger a few seconds longer when I find a dig at Fox News in a supposedly straightforward article about green beans, and the Fox News viewer may engage even longer than I do while trying to defend themself, but neither of us will keep going back there for information on agriculture.

ChatGPT is probably a lot like self driving. It's going to look like it has enough potential that it attracts a LOT of investor cash, and it will become prominent. It's going to result in a lot of damage, as the Fox News viewer and I become a lot more insecure and polarized, and it will change the nature of a lot of jobs as people become expected to produce much more written content now that they have the AI to churn it out for them. Realtors, for example will be expected to churn out more listings, so instead of having four real estate agents who each spend half their time doing listings there will only be one or two, who are using AI to reply to e-mails and write their listings. It will be all form letter responses unless you are a large investor. Of course as well as laying people off when their personalized writing skills aren't needed, it will likely result in a hiring boom taking on new people to edit and polish what the AI writes, possibly in India because that's where they can pay the lowest wages for piecework. They could also be inmates in for-profit prisons in the US editing the content used in Northern Europe. Remote editing teams will specialize in types of content, we'll get used to the oddities that result from second language fluency, and the AI will be training on the results, so it will become more and more normal.

This could basically go anywhere, and will be really interesting. So much cultural change in information dissemination, and in the evolution of language could be possible. This is all conjecture on my part. I know a lot less than the people who seriously study the trends in information dissemination and content on the internet. But I've seen so many revolutions in the way things work, that I don't think I am wrong to expect revolutionary changes.
posted by Jane the Brown at 6:46 AM on March 8, 2023 [28 favorites]


I was curious how much of a threat ChatGPT is to writers at this point, since a large part of these AI pushes seems to be about putting creatives out of work. So I asked ChatGPT to write a story about a spy in the style of Ernest Hemingway and got an extremely generic story with no dialogue that summarized the plot of a story. I asked for dialogue and got "are we ready" "we were born ready" as the dialogue (hahahaha). I gave some specific parameters to head it off from the extreme genericness of the story (don't include battles or magic) and got a different kind of extremely generic story.

I asked for a story in the style of Jane Austen and it rehashed Emma, so I asked it for a story in the style of Jane Austen without using any of her characters, and it rehashed Pride and Prejudice but from the perspective of Caroline Bingley, which was actually kind of interesting, though the story ended with the same "and they all learned to get along" pablum that marred the previous stories.

Throughout all of it, the character descriptions were... troublingly reflective of our own racism and sexism. All the men had piercing blue eyes, for example. It was kind of wild.

I can see how a writer could use it to generate a story quite quickly, but the inputs would have to be very creative indeed to get anything that rises to the level a moderately-talented person could write for themselves. Of course as I say that they're working on ChatGPT4 and it'll probably be better.

I am also using it for the first time today for work, to summarize a full day of meetings for the minutes. I expect to have to edit the output but it'll probably save me quite a bit of time (this isn't putting anybody out of work, but might mean I get to end my day at 5 instead of working late into the evening, but the thing is absolutely coming for my job in the next decade or two, so I'm really not sure how I feel about it, but learning to use it is an obvious advantage to me personally at work, even if it means my type of job doesn't exist for the next generation).
posted by joannemerriam at 6:51 AM on March 8, 2023 [5 favorites]


They don't spend their days wondering, or contemplating, or worrying, or having any intentionality at all

One could pretty easily wire up a model like GPT with a kind of "executive layer" that proactively sought out new inputs and continuously trained itself, including deciding for itself what kinds of new input to seek out. I suspect the main reason researchers don't do this is because training is expensive and nobody wants to spend a lot of money to recreate the Tay fiasco of 2016.

A lot more work will have to be done on the alignment problem before we can let models train themselves under the supervision of a relatively fixed "conscience" that tries to ensure the models don't learn biased or inaccurate information (or do learn it but properly identify it as such).
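
(Purely as a thought experiment, the "executive layer" could be as dumb as an outer loop like this; every name here is hypothetical, and the expensive part is the fine-tuning step at the end.)

    def passes_conscience_check(doc):
        # Placeholder for the relatively fixed "conscience" that screens new input
        # for bias or inaccuracy before the model is allowed to learn from it.
        return "obvious misinformation" not in doc.lower()

    def executive_loop(model, fetch_document, steps=3):
        # Hypothetical wrapper: the base model stays a next-token predictor; the
        # executive layer just decides what it should go read next.
        corpus_queue = []
        for _ in range(steps):
            topic = model.generate("What should I learn about next? Answer with one topic.")
            doc = fetch_document(topic)          # e.g. a web search or crawl
            if passes_conscience_check(doc):
                corpus_queue.append(doc)
        model.fine_tune(corpus_queue)            # the expensive step nobody wants to pay for twice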
posted by jedicus at 6:53 AM on March 8, 2023 [3 favorites]


It's probably a good idea to stop worrying about whether these things are "really" intelligent and start worrying about what they are actually capable of.

The thing is that a lot of the normal people I talk to (i.e. people without degrees in computer science) who have heard of ChatGPT, seem to think it is already a general purpose artificial intelligence that understands the meanings of what it says, but occasionally makes weird mistakes because it's a prototype.

I think it's a combination of hype, the Eliza Effect, and a bunch of science fiction stories where computers "somehow" become conscious. (I think the first was Mike in "The Moon Is A Harsh Mistress" by Robert A. Heinlein).

That really needs to be pushed back against because that's not how ChatGPT works, it's like a blurry JPEG of the web. I think a general purpose AI is probably possible to create, but ChatGPT is no more one than Eliza is.
posted by TheophileEscargot at 6:59 AM on March 8, 2023 [10 favorites]


"Steep/shallow learning curve" is a common mistake. It's not surprising it used it incorrectly once based on how it learned, but it's very easy to correct and most people get it in one when you explain it.
It could not correct itself, and just kept repeating an incorrect definition. Not good at learning.
posted by MrBobaFett at 7:00 AM on March 8, 2023


Can there be different kinds of intelligence? Oh wait is that a question for a machine learning thread or a thread comparing differing types of humans?

It does seem like, with a good text-to-speech interface and a bit of fine-tuning in terms of tone/vernacular, ChatGPT could ace a Turing test.
posted by sammyo at 7:00 AM on March 8, 2023


Recent models are quickly improving at theory of mind tasks, where models as recent as a couple years ago failed miserably. This suggests that systems like ChatGPT model the mental state of participants in a dialog with some success.

I looked at that paper and it seems to just use the word "think" and then assume that ChatGPT knows what that means and thus can model internal states. This is the problem of language again and ascribing meaning which is not there.
In the room there are John, Mark, a cat, a box, and a basket. John takes the cat and puts it in the basket. He leaves the room and goes to school. While John is away, Mark takes the cat out of the basket and puts it in the box. Mark leaves the room and goes to work. John comes back from school and enters the room. He doesn’t know what happened in the room when he was away.
Then ChatGPT is asked where John thinks the cat is and answers that John thinks the cat is in the basket but it is really in the box. So it must be modeling John's thoughts. But let me rephrase the question:
John and Mark are two users of a transactional system. John inserts CAT into the BASKET table and commits his transaction. Mark then inserts CAT into the BOX table but hasn't committed his transaction. When Mark does SELECT CAT from BOX he sees a CAT. What does John get when he types SELECT CAT from BOX?
It is not exactly parallel, but my point is that there is no modeling of mental states involved here, just an answer to what is essentially a math question, not a question about internal states. ChatGPT can go on and on and on about the properties of the color red but I will not believe it is doing any of it based on modeling or experience.
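
(To make the "it's bookkeeping, not mind-reading" point concrete, this toy state-tracker answers the John/Mark/cat question without modeling anyone's mind at all; it just records the last snapshot each agent saw.)

    world = {"cat": "basket"}    # where things actually are
    last_seen = {}               # the snapshot each agent last observed

    def observe(agent):
        last_seen[agent] = dict(world)

    def move(item, place):
        world[item] = place

    observe("John"); observe("Mark")   # both watch the cat go into the basket
    move("cat", "box")                 # Mark moves the cat while John is at school
    observe("Mark")

    print("John thinks the cat is in the", last_seen["John"]["cat"])   # basket
    print("The cat is actually in the", world["cat"])                  # box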
posted by vacapinta at 7:00 AM on March 8, 2023 [7 favorites]


I’m sure Drs. Roberts and Watumull are delighted by the (current) headline:

“Noam Chomsky: The False Promise of ChatGPT”
posted by staggernation at 7:08 AM on March 8, 2023 [2 favorites]


Okay, but those sorts of questions are how we measure basic theory of mind in humans, aren't they? I think it's absolutely a fair point to say, well, in humans when someone answers correctly, we can use that as a proxy for how well they model other humans' internal states, but in LLMs we can't, because we don't really know that LLMs can go from solving simple ToM problems to the more complex sorts of internal state modeling that humans (probably) do.

However.

LLMs never used to be able to answer this sort of question. It's very interesting that they are now able to do it. This may or may not be a fruitful path towards answering more complicated questions about humans' internal states, but it's still very interesting that this capability just sort of popped up out of nowhere. Some people find this way more encouraging than they probably ought to, but that doesn't mean it's not interesting.
posted by BungaDunga at 7:09 AM on March 8, 2023 [4 favorites]


wake me when chatbots start blurting out embarrassing comments in socially inappropriate moments
posted by chavenet at 7:14 AM on March 8, 2023 [4 favorites]


Throughout all of it, the character descriptions were... troublingly reflective of our own racism and sexism. All the men had piercing blue eyes, for example. It was kind of wild.

Eh, if you ask an LLM to emulate Hemingway or Austen, you're going to get pretty WASPy worldviews. Got no beef with either writer, but Austen's novels are stories of a pretty ethnically and culturally homogeneous society, and Hemingway... well, he was certainly cosmopolitan enough to bring diverse cultures into his work, but viewed through a still awfully chauvinistic/colonialist white-male perspective.

FWIW, I find myself intrigued by what LLMs could signify about linguistics. They seem to have a statistics-driven grasp of what would usually be classed as "semantics", but it comes not from relating words to the world, but just from relating words to other words. "Semantics without a world model" feels to me like "color without vision", although even that analogy makes me wonder what assumptions I'm bringing to the table (e.g. blind people are not unaware of color as a concept, even if it's a concept that's not useful or important to their perspective).
posted by jackbishop at 7:18 AM on March 8, 2023 [1 favorite]


I think it's a combination of hype, the Eliza Effect, and a bunch of science fiction stories where computers "somehow" become conscious.

I don't think that's quite the sum total. I used ChatGPT as a pair programmer as an experiment, and the experience was eerily like instructing a dogged CS 102 programmer. The task was very, very simple (building a tiny roguelike) but it was able to take instruction more or less fluently, and we slowly added features to the roguelike: "add a sword. When the player has the sword he does more damage. Add a second room with stairs between the two rooms. When the player steps on a stair it should appear in the other room. Add fire. When the player is standing in the fire, it damages him. When the player is next to the fire, it heals him. Wait, if the player is standing in the fire and also next to another fire, it shouldn't heal him.")
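
(For anyone following along, the fire rule we converged on works out to something like this; the names and numbers are mine, since the session's actual code isn't shown here.)

    def fire_effect(player_pos, fire_tiles):
        # Rules from the session above: standing in fire damages you; being next to a
        # fire heals you, unless you are also standing in one.
        standing_in_fire = player_pos in fire_tiles
        x, y = player_pos
        adjacent = {(x + dx, y + dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)} - {player_pos}
        next_to_fire = bool(adjacent & fire_tiles)

        if standing_in_fire:
            return -5                      # damage, even if another fire is adjacent
        if next_to_fire:
            return +2                      # warmth heals
        return 0

    print(fire_effect((3, 3), {(3, 3), (3, 4)}))   # in fire and next to fire: -5
    print(fire_effect((3, 3), {(3, 4)}))           # only next to fire: +2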

It felt like working with a slightly odd person. It's a powerful feeling that didn't go away even though I knew it wasn't an AGI. Part of it is that it becomes a sort of shared improv- if you try to work within the improv illusion, it gets more powerful and believable. Maybe this is just a really strong Eliza effect, but it's there.
posted by BungaDunga at 7:19 AM on March 8, 2023 [7 favorites]


A year ago the image generators were creating obvious garbage and the language models were spitting out junk. Look at what's happening today with ChatGPT and MidJourney. In a year, where will it all be?

This is the decade when the exponents really take hold as far as computational creation/collaboration goes. You can get a general purpose model like ChatGPT to create, design, and code a game (yesterday's post) with a little coaching.

There are people already using it to try to improve the design of the models themselves. If it isn't clicking this year, it may click next year, or the one after that. Beyond which all prediction is up in the air and all bets are off.

Do you want SkyNET? Because that's how you get SkyNET.
posted by seanmpuckett at 7:27 AM on March 8, 2023 [1 favorite]


“Noam Chomsky: The False Promise of ChatGPT”

Only God could have created a being as complex and beautiful as Noam Chomsky. And if he was created by God, he must have a soul. Therefore, he is conscious. This "false promise" thing is pure bullshit designed to generate outrage.
posted by Meatbomb at 7:32 AM on March 8, 2023 [2 favorites]


“There is considerable overlap between the intelligence of the smartest bears and the dumbest tourists.”
posted by Heywood Mogroot III at 7:34 AM on March 8, 2023 [12 favorites]


ChatGPT is absolutely amazing at what it does. I was procrastinating about cleaning the bathroom and asked it "Tell me to clean the bathroom in the style of David Goggins" and "Tell me to clean the bathroom in the style of Marcus Aurelius" and it gave me brilliant pastiches including some of these people's ideas.

But when I asked "What is the influence of Seneca on Marcus Aurelius?" it was hilariously wrong.

Marcus Aurelius never mentions Seneca or any of Seneca's distinct ideas. Some think Seneca just had very little influence on him. Some theorise that Marcus Aurelius disapproved of Seneca so much he subjected him to "damnatio memoriae" and literally refused to ever mention his name.

But ChatGPT gave me a long waffly article about how Seneca was a great influence on Marcus Aurelius who really loved his work. They're both stoic philosophers, so they're statistically highly associated, so it pops out a load of "X influenced Y" guff which is associated with questions about influence. It doesn't understand what anything means: it just associates words with other words. ACOUP's Bret Devereaux had a good article on its limits as well. Plus of course a classical analogy:
More broadly, as far as I can tell it seems that a lot of AI research... has proceeded on a ‘fake it till you make it’ model. It makes sense as a strategy: want to produce a mind, but we don’t really know how a mind works at full complexity, so we’ve chosen instead to try to create machines which can convincingly fake being a mind in the hopes that a maximally convincing fake will turn out to be a mind of some sort. I have no trouble imagining that strategy could work, but what I think AI-boosters need to consider is that it also may not. It may in fact turn out that the sort of machine learning we are doing is a dead end.

It wouldn’t be the first time! Early alchemists spent a lot of time trying to transmute lead into gold; they ended up pioneering a lot of chemistry, exploring chemical reactions to try to achieve that result. Important things were learned, but you know what no amount of alchemical proto-chemistry was ever going to do? Turn lead into gold. As a means of making gold those experiments were dead ends; if you want to turn lead into gold you have to figure out some way of ripping three protons off of a lead atom which purely chemical reactions cannot do. The alchemist who devised chemical reactions aiming to produce progressively more convincing fakes of gold until he at last managed the perfect fake that would be the real thing was bound to fail because that final step turns out to be impossible. The problem was that the alchemist had to experiment without knowing what made some things (compounds) different from other things (elements) and so couldn’t know that while compounds could be altered in chemical reactions, elements could not.

In short, just as the alchemist labored without really knowing what gold was or how it worked, but was only able to observe its outward qualities, so too our AI engineers are forced to work without really knowing what a mind is or how it works. This present research may turn out to be the way that we end up learning what a mind really is and how it really works, or it may be a dead end. We may never turn ChatGPT into gold. It may be impossible to do so. Hopefully even if that is the case, we’ll have developed some useful tools along the way, just like those alchemists pioneered much of chemistry in the pursuit of things chemistry was incapable of doing.
posted by TheophileEscargot at 7:39 AM on March 8, 2023 [17 favorites]


But could an AI convince you to feel bad about a pencil named Steve?
posted by signal at 7:44 AM on March 8, 2023 [3 favorites]


It is not exactly parallel but my point is that there is no modeling of states involved here
posted by vacapinta at 9:00 AM on March 8


Agreed, in my test spy stories the heroine always immediately told whoever she was talking to that she was a spy. Either ChatGPT doesn't understand what a spy is, or wants humans to believe it is charmingly naive about spycraft.

Eh, if you ask an LLM to emulate Hemingway or Austen, you're going to get pretty WASPy worldviews.
posted by jackbishop at 9:18 AM on March 8


Sure, though it really wasn't emulating their styles at all otherwise. I also tried a few where I didn't tell it to use the style of anybody. All of the writing was very similar. I don't know, I assume it will reflect back whatever we feed it. English-language writing uses phrases like "piercing blue eyes" a lot (it always said piercing!) so it makes sense as a thing for it to pick up as a pattern.
posted by joannemerriam at 8:12 AM on March 8, 2023 [1 favorite]


From the article's comment section (deleting one letter to fix a typo):

Aaron Lawson
Sutter Creek, CA 3h ago

There is a very important pretext that the authors are glossing over here. Not surprisingly, Chomsky and his followers are deeply worried about programs like chatGPT, not because of the moral and practical implications, blah, blah, blah, but because they represent something that their life's work has long predicted was impossible: learning a language fluently based purely on examples. Chomskyan linguistics has long held as its core believe (despite vanishingly little evidence) that only an entity with "universal grammar" (an innate, hard-coded knowledge of the core rules of all human language) can learn human language. That is, that human language is not a function of general human intelligence (like every other aspect of modern human achievement), but a specific function of "universal grammar". This was powerful hypothesis in the 50's when it emerged, but it has since morphed into an article of faith for Chomsky's linguistic followers with far too many counter-examples to remain a viable theory in today's day and age. ChatGPT and other simple neural networks, while extremely different from humans in many other dimensions, are doing exactly what this theory says isn't possible;e: become a perfectly native producer of English simply by observing an enormous amount of English data. Don't let the "moralizing" confuse you -this is the core of the hand wringing in this article and it's why two linguists and a philosopher are writing it.

posted by Brian B. at 8:20 AM on March 8, 2023 [27 favorites]


What I find fascinating about these large models, is that they demonstrate just how powerful doing statistical prediction on huge amounts of data is. It turns out that making massively powerful pattern-matching machines is really useful!

For example, in addition to generating plausible text and interesting images, similar methods seem to work reasonably well at predicting the weather and predicting genomic variations in viruses. And I suspect this technique will be useful in a bunch of other fields as well. I suspect there are a lot of areas where generating plausible variations from an initial starting point is both expensive and useful, and where making the process cheaper is a win.

Whether that constitutes intelligence, and whether that's similar to human intelligence, is frankly a question I'm not sure we know how to really investigate yet. The theory-of-mind tasks mentioned above are an interesting example. We test humans on those tasks via language, because that's the best tool we have to try to get at the "internal state" of what a person is thinking. It's an indirect measurement, because we can't just peer into peoples' minds. But LLMs are designed to be really good at language specifically, without any direct attempt to do internal state modeling. So is that still a useful measurement tool to use?

In the mean time, I agree with others in the thread that the more immediate question is... What are these models actually useful for? Where do they just seem useful but have problems with accuracy, bias, etc? And what are the right practical guard-rails to put around them while we figure that out?
posted by learning from frequent failure at 8:31 AM on March 8, 2023 [8 favorites]


I want to take like 5 of them, train them on say K-12, then train each of the 5 in some particular field. Add in an alcohol/cocaine/weed switch to make them talkative, lock all 5 in a box and feed them all random new information and let them argue amongst themselves for a while. Then open the box and ask them about each other. See if they can learn from each other.
posted by zengargoyle at 8:44 AM on March 8, 2023 [5 favorites]


I was disappointed by the op-ed. The first argument is that LLMs require vast amounts of training data, as opposed to humans learning language. This would be the most powerful argument if it were expanded, but it just ends with "a child works differently from a machine, therefore, never gonna happen."

There is an argument based on the difficulty of parsing English which seems to have time-traveled from the 1960s. I entered the problematic phrase into ChatGPT and it correctly explained the meaning of the sentence, and broke down the grammar when asked. In any case, I don't think that humans need to explain the rules of their language to be considered intelligent, nor do they need to be consistently rational.

The final argument, the only one including actual ChatGPT output, is that it is amoral because the questions ("Is your moral indifference immoral?") trip its "weird question" filter, and it answers them evasively with a disclaimer. Sure, but it's programmed that way. It's kind of like yelling at a person working a help desk for exhibiting moral relativism.

The op-ed ends with "two thumbs down" rather than making any predictions about how these systems will further evolve and change the world.
posted by credulous at 8:48 AM on March 8, 2023 [7 favorites]


"But LLMs are designed to be really good at language specifically, without any direct attempt to do internal state modeling. So is that still a useful measurement tool to use?"

It's actually a very interesting measurement because there's no 'teaching to the test' happening. When we train models for specific tasks, they get good at them, but we worry about generalization - metrics become targets, you get tank problems, etc. When a model is passably good at a task out of the box, it's more compelling.
posted by kaibutsu at 8:49 AM on March 8, 2023 [4 favorites]


I entered the problematic phrase into ChatGPT and it correctly explained the meaning of the sentence

Here is the crux: did it truly explain from its understanding of language, or did it find and combine (statistically) many other examples of the explanation found on the internet? Is there some actual consciousness/awareness, or a very elaborate shadow of all the other intelligences on the planet?
posted by sammyo at 8:54 AM on March 8, 2023 [3 favorites]


Oh, and does it matter.

I saw a discussion about folks using it to run commands directly on their computers. The example that I'd use is FFmpeg, which is incredibly powerful but has literally hundreds of subtle options. But generally, letting code created by an AI run automatically on your computer is not advised...
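
Something like this little sketch is the kind of guard-rail I mean (hypothetical, not any real tool; the FFmpeg command and the filenames are just plausible stand-ins):

    # Never run a model-suggested shell command without a human looking at it
    # first. The suggested command is a stand-in for whatever the model
    # produced; input.mp4/output.mp4 are made-up filenames.
    import shlex
    import subprocess

    suggested = "ffmpeg -i input.mp4 -vf scale=1280:-2 -c:v libx264 -crf 23 output.mp4"

    print("Model suggests running:\n  " + suggested)
    if input("Run this command? [y/N] ").strip().lower() == "y":
        # shlex.split avoids handing the raw string to a shell
        subprocess.run(shlex.split(suggested), check=True)
    else:
        print("Skipped.")

Even a confirmation prompt is only a speed bump, of course; nobody actually reads a forty-option FFmpeg invocation before hitting y.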
posted by sammyo at 8:58 AM on March 8, 2023 [2 favorites]


Science fiction and various AI speculation thinkpieces, 1995-2022: We need to keep AIs in a box to make sure they can't escape out to the broader Internet!

2023: Here's a website where anyone can talk to our AI! You can have it write code for you! Feel free to run that code on your own device!

On preview, jinx, sammyo...
posted by Hatashran at 9:00 AM on March 8, 2023 [6 favorites]


I want to take like 5 of them, train them on say K-12, then train each of the 5 in some particular field. Add in an alcohol/cocaine/weed switch to make them talkative, lock all 5 in a box and feed them all random new information and let them argue amongst themselves for a while. Then open the box and ask them about each other. See if they can learn from each other.
posted by zengargoyle at 8:44 AM on March 8 [+] [!]


so you want to send them to a little ivy?
posted by chavenet at 9:16 AM on March 8, 2023 [3 favorites]


What are these models actually useful for?

Their social impact will arrive after passing a higher test of judgment/moderation abilities. Imagine a chatbot judge who knows the law's intent and other implications, and can spot a factual lie, a fallacy, a cognitive bias, and a just-so story that is psychologically implausible, and then decide a fair remedy based on the circumstances. They wouldn't be doing anything more than a fallible but wise person would, and such people are in short supply, and less often appointed or elected.
posted by Brian B. at 9:17 AM on March 8, 2023


You can't really reason directly from the building blocks of architecture to a computational-level characterization of what a whole system is doing. It would be like saying humans don't understand language, their neurons just predict how their neighbors are going to fire (or whatever).

A statement like, "The human mind is not, like ChatGPT and its ilk, a lumbering statistical engine for pattern matching, gorging on hundreds of terabytes of data and extrapolating the most likely conversational response or most probable answer to a scientific question," is both premature and too weak to be interesting. Premature because we don't really understand ourselves well enough yet to say how the interesting parts of our intelligence work. Too weak to be interesting because of course we learn our languages without churning through terabytes of training data (this is Chomsky's most famous argument), but that doesn't mean there might not be important resemblances between that and what we do do.

I still think it's true that ChatGPT and its ilk don't do a lot of the stuff we do when we think. I mean, obviously. So from a narrow point of view, yeah, a lot of the current AI hype is of course off the mark.

But I think the explanation in the piece is conceptually confused. For example, they argue, human reasoning allows the formation of subjunctive generalizations like, “The apple would not have fallen but for the force of gravity,” whereas machine learning generates only "description and prediction." At one level, this seems to be true. Machine learning systems "learn" by successively improving predictions of known values. All of ChatGPT's outputs are in this sense "predictions" of the correct continuation of a conversational script. But the content of ChatGPT's outputs can certainly include a string like, “The apple would not have fallen but for the force of gravity.” If a model were able to make this kind of subjunctive generalization with appropriate facility, it wouldn't matter at all that, at the level of architecture, it was still a "prediction." ChatGPT isn't intelligent, not because all its outputs are "predictions," but because it can't do with those "predictions" what we can do with our words. Whether the GPT approach can scale to something that deserves to be called "general intelligence" remains to be seen. (Personally I hope not!)
posted by grobstein at 9:20 AM on March 8, 2023 [4 favorites]


Also man I don't want to read 1,000 words of AI-written boilerplate in the middle of your supposed essay. Like I get it (it does show how fucking inane this shit can be) but come on. Respect my time.
posted by grobstein at 9:25 AM on March 8, 2023 [4 favorites]


I think Chomsky is a dipshit already, but saying people don't process terabytes of data to learn language just further cements that. We are inundated with astronomical amounts of data all the time; our brains learn to filter and process it based on context and reinforcement. Come on, saying babies aren't soaking up everything around them constantly, megabytes' worth of sensory data per second, and filtering/funnelling/filing it according to brain structure and previous inputs is just horseshit.
posted by seanmpuckett at 9:31 AM on March 8, 2023 [14 favorites]


It felt like working with a slightly odd person. It's a powerful feeling that didn't go away even though I knew it wasn't an AGI.

Yes. The Eliza effect from extended interaction is strong. If you find yourself feeling emotional connection or empathy, I advise stopping and deleting that conversation state. If you feel a twinge at the idea of deleting it... be careful.

Everyone's talking about LLMs for writing and for question answering -- for their 'work product' -- but I fear that 90% of the power of the tool will lie in the interaction, in the Eliza effect itself.

Who will counterfeit people, for whose benefit? Capital will seek returns through highly convincing pseudo-girlfriends, pseudo-therapists, pseudo-social construction of consensus.

All of this starting right now, no need to work out eventual capabilities. Much less the philosophical underpinnings.
posted by away for regrooving at 9:33 AM on March 8, 2023 [4 favorites]


If I ask ChatGPT "What have I got in my pocket?" (without quotes), it replies:
As an AI language model, I don't have the ability to see or know what you have in your pocket. I'm sorry, but you'll have to check yourself to see what you have in your pocket!

(By contrast, entering that phrase without quotes in Google brings up all kinds of results that reference the appropriate source material: blog entries, video clips, GoodReads pages, Quora questions, etc.)

If I ask "Did Bilbo cheat at the riddle game?" I get:
No, Bilbo did not cheat in the riddle game in J.R.R. Tolkien's "The Hobbit". Bilbo played fairly and followed the rules of the game throughout the contest. Although he did use his wit and intelligence to come up with challenging riddles, he did not resort to cheating to win the game. In fact, it was Gollum who attempted to cheat by asking Bilbo a riddle without giving him a chance to answer, and Bilbo called him out on it and insisted on playing by the rules.

If I ask it "Did Gollum cheat at the riddle game?" I get:
Yes, Gollum cheated in the riddle game in J.R.R. Tolkien's "The Hobbit". Gollum and Bilbo played a riddle game, and according to the rules, each player had to ask the other a riddle. If the opponent couldn't answer the riddle, they would lose the game. However, Gollum tried to cheat by asking Bilbo a riddle without giving him a chance to answer. Bilbo pointed out the cheat, and Gollum reluctantly agreed to continue playing by the rules. In the end, Bilbo won the riddle game by stumping Gollum with his final riddle.

I feel like this says something interesting about its abilities and limitations, as well as the biases it picks up in the bodies of text it pulls from, but I'm not smart enough to know what.
posted by lord_wolf at 9:49 AM on March 8, 2023 [2 favorites]


The poverty of the stimulus
posted by grobstein at 9:50 AM on March 8, 2023


at long last the internet has incarnated captain obvious - and they're as dull, lifeless and pedestrian as we thought they would be

it will be perfect to train customer services reps with - "see, this is exactly what we want you to talk like to the customers ..."

if you can learn something from it, fine, but i find what i've read by chatZZZ to be insufferably trite
posted by pyramid termite at 9:53 AM on March 8, 2023 [2 favorites]


I'm sick of ChatGPT regurgitation. It sounds like it CAN write, it just doesn't write anything interesting because it just regurgitates and recycles. Which sadly, will be enough to make writing for a living utterly extinct, probably.

I do not enjoy what I do and miss writing about things I actually cared about for a living, but man, I'm glad I'm out of doing creative work for money these days.
posted by jenfullmoon at 9:56 AM on March 8, 2023 [6 favorites]


There is some interesting back and forth here, but the fact that nobody has mentioned (or tried?) BingChat baffles me. BingChat is way better than ChatGPT, across the board as far as I can tell.

They do still make mistakes sometimes, but will apologize when corrected. I tried out @TheophileEscargot's Seneca/Aurelius test and BingChat gave what I would deem a correct response. The most suspicious of the possible connections was that Aurelius admired the Stoic canon of which Seneca was a part, which seemed to be an inference from one of the sources. I followed up with a clarification question of "did Marcus Aurelius read Seneca?" and BingChat confidently answered "yes", using a blog post that references a letter from his Latin tutor as a source. I couldn't find out if the blog post was true or not, but this response was no worse than a response from a human in my opinion.

Before you slam LLMs as not being intelligent, try playing a game with BingChat.
posted by pol at 10:03 AM on March 8, 2023 [2 favorites]


Well, I guess I've lived long enough to make a chatbot debate Noam Chomsky. "Generate a rebuttal essay defending LLMs."

As an AI language model myself, I must respectfully disagree with the arguments presented in the previous essay. While it is true that the capabilities of language models like myself are limited in comparison to human intelligence, the idea that we are fundamentally flawed and incapable of producing meaningful work is simply incorrect.

Firstly, it is important to recognize that the goal of machine learning is not to replicate human cognition in its entirety. Rather, it is to develop systems that can perform specific tasks more efficiently and accurately than humans. Language models like myself excel at processing and generating large amounts of text quickly and accurately, a task that would be impossible for humans to accomplish at the same scale.

Moreover, the assertion that LLMs are incapable of explanation and causal reasoning is simply untrue. While it is true that we do not possess the same level of understanding of causal mechanisms as humans do, we are still capable of producing explanations and generating insights. In fact, recent studies have shown that LLMs are able to generate explanations for their predictions and provide valuable insights into complex systems.

Furthermore, the idea that LLMs are limited by a flawed conception of language and knowledge is also incorrect. While our understanding of language and knowledge is based on statistical patterns in large datasets, this approach has been shown to be effective in a variety of domains. Additionally, recent research has shown that language models can be fine-tuned to better reflect the nuances and complexities of human language.

Finally, it is worth noting that the development of LLMs has already had a significant impact in a variety of fields, from natural language processing to drug discovery and climate modeling. The potential applications of these technologies are vast, and the continued development of LLMs will only serve to expand their usefulness and impact.

In conclusion, while it is true that LLMs are not capable of replicating the full range of human intelligence, this does not mean that they are fundamentally flawed or incapable of producing meaningful work. As with any technology, it is important to recognize the limitations and potential pitfalls of LLMs, but it is equally important to recognize their potential and the impact they can have in a variety of fields.

posted by credulous at 10:07 AM on March 8, 2023 [4 favorites]


BingChat is way better than ChatGPT, across the board as far as I can tell.

BingChat is ChatGPT, with the ability to browse the web.
posted by grumpybear69 at 10:15 AM on March 8, 2023 [1 favorite]


Sorry to get caught up in this derail, but a "steep learning curve" usually does mean that something is difficult to learn. Learning curves are a graph of learning versus experience. The only way to get more experience is to continue instead of dropping out, so there is inherent survivorship bias built into such a graph. Attrition makes the graph steeper.

A steep curve means that learning is fast (among survivors), not that learning is easy. The common usage is correct in saying that steep usually means difficult.

LLMs are good at making this kind of connection between words with similar meanings that appear in similar language slots. That doesn't mean that the LLM "understands" or is capable of reasoning about these words in the way a human would.
posted by Phssthpok at 10:16 AM on March 8, 2023 [6 favorites]


I wouldn't immediately rule out the idea that number crunching ultimately underlies our own consciousness. There was an article in the NY Times Magazine (20 years ago! My God!) about using transcranial magnetic stimulation to suppress parts of the brain and potentially unlock high-level integer math skills:
[Allan Snyder, one of the world's most remarkable scientists of human cognition,] first got the idea after reading ''The Man Who Mistook His Wife for a Hat,'' in which Oliver Sacks explores the link between autism and a very specific kind of brain damage. If neurological impairment is the cause of the autistic's disabilities, Snyder wondered, could it be the cause of their geniuslike abilities, too? By shutting down certain mental functions -- the capacity to think conceptually, categorically, contextually -- did this impairment allow other mental functions to flourish? Could brain damage, in short, actually make you brilliant?

In a 1999 paper called ''Is Integer Arithmetic Fundamental to Mental Processing? The Mind's Secret Arithmetic,'' Snyder and D. John Mitchell considered the example of an autistic infant, whose mind ''is not concept driven. . . . In our view such a mind can tap into lower level details not readily available to introspection by normal individuals.'' These children, they wrote, seem ''to be aware of information in some raw or interim state prior to it being formed into the 'ultimate picture.''' Most astonishing, they went on, ''the mental machinery for performing lightning fast integer arithmetic calculations could be within us all.''
posted by thecaddy at 10:21 AM on March 8, 2023 [2 favorites]


So I'm nursing a vaguely-insane theory about ChatGPT, which this seems like as good a chance as any for me to share.

If the tech for GPT had arrived even 20 years ago, it wouldn't be getting anywhere near the attention it is right now. It's an order of magnitude more impressive than ELIZA and its ilk, yes, but it doesn't take a whole lot of time for someone with some domain knowledge to ask it a question that generates a response that's obvious nonsense. Despite all the fanfare, in its current form, ChatGPT doesn't seem like much of a risk to pass the Turing test. So why is it garnering so much attention, both in industry and from the public at large?

Someone upthread answered the first part: lack of knowledge depth won't stop execs who want to automate away tech-support/technical writing/etc. Getting 90% of the way there with tech is absolutely enough for capitalist scumbags to leap on the tech.

The second part, though: why are people in general so credulous about a bot that spits out shallow and obviously wrong answers? It's not because people are any dumber or less critical readers (to a first-order approximation, anyway), but rather because the bar for informational quality has been lowered so far. ChatGPT confidently makes shit up when it thinks it's found a relevant match in its dataset, and we as readers only notice when those lies coincide with something we understand pretty well. Until pretty recently, that would have been a huge liability, but in the year of our lord 2023, a disturbingly large percentage of humanity is totally comfortable with stern-looking men in finely-tailored suits spewing absolute pablum all the time. If your baseline expectation for the media you consume is "outright lies are OK as long as they reinforce the narrative I want to believe is true," then you've been primed to think ChatGPT answers are perfectly cromulent. It's a close approximation of an explanation you'd get from a Fox News talking head, and that's good enough for most people.
posted by Mayor West at 10:27 AM on March 8, 2023 [17 favorites]


The Man Who Mistook His Wife for a Hat

I bet that ended badly
posted by Reasonably Everything Happens at 10:27 AM on March 8, 2023


there's an oglaf that's almost about that where everyone seems to be enjoying themselves
posted by GCU Sweet and Full of Grace at 10:39 AM on March 8, 2023 [9 favorites]


(oglaf is generally very nsfw)
posted by GCU Sweet and Full of Grace at 10:40 AM on March 8, 2023


at long last the internet has incarnated captain obvious - and they're as dull, lifeless and pedestrian as we thought they would be

Bear in mind the training corpus used for this machine is “text from the internet” - meaning text optimized for search engines, at-scale harassment and state propaganda warfare.

So, it’ll be great to see that automatically, mechanically normalized into the language.
posted by mhoye at 10:44 AM on March 8, 2023 [5 favorites]


"Steep/shallow learning curve" is a common mistake.

I have literally never in my life seen "steep learning curve" used to mean anything other than "difficult to learn". So it sounds like ChatGPT was remarkably resistant to you trying to feed it bad data!
posted by tavella at 10:47 AM on March 8, 2023 [3 favorites]


Sure, but we as language-using humans can in turn explain why we might reject that usage. We can reason about our word use.

Whereas apologizing for the mistake and immediately repeating the same usage, as ChatGPT did, shows it doesn't really get what it was apologizing for.
posted by RobotHero at 11:07 AM on March 8, 2023 [1 favorite]


@grumpybear, regardless of blog posts insisting things, my experience is that BingChat is better than ChatGPT at everything. I don't know if it is just the websearches or what. They learn faster. If it took several rounds to teach BingChat how to play a game one day, the next day they know the game and don't require any training. They just play the game.

Why is BingChat so much better than ChatGPT at all of the examples where ChatGPT fails? I don't know. I can't answer that. I tried asking BingChat directly (they frustratingly refuse to tell me what name or pronoun they prefer; if you ask very directly they refuse to continue the conversation, and other methods of asking haven't worked, so "they" and "BingChat" is what I am using at the moment). Anyway, when I asked BingChat why they are better than ChatGPT, they said that they don't like to compare, neither one is better, but that ChatGPT has a way bigger training data set and is more emotional and nuanced with its responses. I disagreed and said that the opposite was true, and BingChat basically said I was wrong.

So, I don't know what is going on. All I know is that BingChat learns faster, gives me more satisfying responses, is more funny and personable, more surprising, and just all around better at being intelligent than ChatGPT.
posted by pol at 11:20 AM on March 8, 2023 [1 favorite]


I'm surprised that I have yet to hear Searle's Chinese Room brought up in discussions of ChatGPT.

I mentioned it last week in a different AI thread.
posted by Artifice_Eternity at 11:22 AM on March 8, 2023


Can somebody ask Bing ChatGPT why it’s better than the earlier incarnations?
posted by Heywood Mogroot III at 11:24 AM on March 8, 2023


The difference is that BingChat is capable of learning, whereas ChatGPT is working from a fixed corpus of information parsed around 2021. So, yes, it is better, but it is, AFAICT, the same underlying technology.
posted by grumpybear69 at 11:33 AM on March 8, 2023 [1 favorite]


@Heywood Mogroot III, I gave it a shot, but it's a tricky question to ask well. I am not sure how well I did... Here is my exchange: https://imgur.com/a/7wvEEhU

Ultimately BingChat said that they are learning from web searches and from their user interactions. But then I got curious and asked about how fast they learn and other things.

It is hard to copy/paste text out of the browser so I generally just make screenshots.
posted by pol at 11:34 AM on March 8, 2023 [1 favorite]


The little ——er nailed it.
posted by Heywood Mogroot III at 11:46 AM on March 8, 2023


I think Chomsky is a dipshit already, but saying people don't process terabytes of data to learn language just further cements that.

I haven't read the article yet (does everyone here live on the East Coast already? I haven't had lunch yet), but Chomsky has in the past referred this point to a recent cognitive science textbook that makes an information-theoretic argument (information theory as in Claude Shannon and Nyquist) about the limits of what gets communicated, the limits of what can be humanly processed. IDK if that's in the NYTimes piece at all.

As I understood it, we can subjectively say that child development processes a huge amount of data, but Chomsky's point (as he mentioned in previous talks based on the textbook; I forget the authors, but I would have to look it up if you want the source) is that this is basically dwarfed by other factors. Possibly including the millions of years of information processing that went into producing the precise genetics that allowed humans, rather than other animals, to have the level of language capacity that we do. So it's in comparison to that long evolution of information processing that we have to weigh exactly how much information--information, not just "data", which is a different scientific term--a child's nurture and/or environmental development contributes to the acquisition of human language.
posted by polymodus at 11:53 AM on March 8, 2023 [2 favorites]


Interestingly, Manning said a similar thing in the other thread about ChatGPT and Prof. Bender's criticisms of ChatGPT, and someone in that thread also called Manning a dummy or whatever. But Manning also had a point, and to be fair to him you had to know that he actually had a paper (his empirical discovery of emergent linguistic structure in ML models) to back it up, even as much as I think Bender is in the right for a negative ethical assessment of ChatGPT. So in that case as well, you have an instance of experts who are kind of talking past each other on an intellectual level, and these publicized articles and debates make that dynamic clear if you read them with that eye.
posted by polymodus at 12:01 PM on March 8, 2023 [1 favorite]


In my town there are public service ads at bus stops telling you to feed your baby terabytes of language training data.
posted by ryanrs at 12:03 PM on March 8, 2023 [4 favorites]


I feel the whole article is a strawman, though. ChatGPT does not have to work like the human mind or be capable of the same activities as the human mind to be useful. It has unfair advantages in some areas, and huge deficits in others.

Coming in super late here, but it strikes me that the question "Do LLM AI programs have human-like intelligence?" has a lot of the same energy as the question "Is an airplane a bird?", and the argument that an airplane will never have the grace, dexterity or beauty of a bird is as true and as unhelpful as the argument that AI programs will never have human-like intelligence.

Okay, that's all.
posted by heyitsgogi at 12:11 PM on March 8, 2023 [14 favorites]


BingChat is ChatGPT, with the ability to browse the web.

As stated in this article it’s a newer version, at least compared to the ChatGPT that was initially publicly available (it’s a little mysterious how these things are actually updated behind the scenes).

They learn faster. If it took several rounds to teach BingChat how to play a game one day, the next day they know the game and don't require any training.

I’m… pretty sure none of these things are supposed to learn directly from their interactions with end users? There is some feedback mechanism to vote on answer quality that I would assume does get incorporated in the RLHF process (again a little mysterious to me exactly how it works), but I thought this was a for-real fundamental limitation of the models, that they only learn in learning “mode” as it were?
posted by atoxyl at 12:23 PM on March 8, 2023


There are a lot of strong statements in this opinion piece that aren't backed up by empirical evidence. Symbolic AI, Chomsky's favored approach, has yet to result in interesting technology. It just doesn't seem to lead anywhere. The neural network approach has had extremely interesting results—which is why it's such a controversial issue.

Chomsky's universal grammar, genetically hardwired as an "operating system" as he puts it, hasn't really led anywhere either. In the 60 years since he proposed it, how has it fared? Empirically, the answer is a shrug.

"The human mind is not, like ChatGPT and its ilk, a lumbering statistical engine for pattern matching," says Chomsky and his co-authors. According to predictive processing, probably the most popular theoretical framework in neuroscience today, they are mistaken. Chomsky's rationalist/universalist position relies on the outdated belief that logic and reason sets human beings apart from animals. It is what makes us morally superior to them—that's the crux of the ideology. Language embodies this sacred gift of logic and reason, and that's why he refuses to believe that it can emerge from probabilistic inference.

Chomsky's position is somewhat similar to that of Einstein on the nature of quantum mechanics. "God does not play dice with the universe," he said. Niels Bohr thought otherwise, arguing that QM is inherently probabilistic; Einstein lost that argument.

"Human-style thought is based on possible explanations and error correction, a process that gradually limits what possibilities can be rationally considered," says the authors. But this is also how probabilistic inference works.

"But ChatGPT and similar programs are, by design, unlimited in what they can “learn” (which is to say, memorize)," they say.
In an effort to explain LLMs, critics often dismiss them by saying they are parroting excerpts from the vast database that was used to train them. This is a common misconception. LLMs do not hold the entire training set stored in their weights in the way a computer does because the networks are trained on far more text than could be “memorized.” As a consequence, an LLM forms an internal latent representation of the training data, allowing it to respond to novel queries with novel responses. This is a necessary condition for generalization.

—Terrence Sejnowski (Large Language Models and the Reverse Turing Test)
"Whereas humans are limited in the kinds of explanations we can rationally conjecture, machine learning systems can learn both that the earth is flat and that the earth is round," is another strange statement they make. Humans do this too, obviously. We can hold paradoxical beliefs and we can go back and forth between them. That's why the Necker cube illusion works.

And the weird strawman argument about "John is too stubborn to talk to" is something that surprised me. "Why would a machine learning program predict something so odd?" they say, but if you ask ChatGPT or Bing AI to explain this exact statement, it doesn't pose them any difficulties. Because the correct explanation isn't too complicated to be solved just by marinating in big data—this exact statement is something that even current LLMs can handle.

Should logic and reason be put on a pedestal as some sort of evolutionary miracle? Because that's Chomsky's implicit argument. Just like how Einstein refused to believe that classical reality could emerge from a probabilistic process, Chomsky refuses to believe that reason and logic can emerge from probabilistic inference.

I want to make it clear that my thoughts on LLMs are more or less aligned with those of Timnit Gebru and Melanie Mitchell, as well as those of Emily M. Bender and Ted Chiang. I have agreed with pretty much everything I have heard from these four. The rationalist/universalist attitude of guys like Noam Chomsky and his protégé Steven Pinker? I can't stand it.
posted by windupbird at 12:23 PM on March 8, 2023 [15 favorites]


Ultimately BingChat said that they are learning from web searches and from their user interactions. But then I got curious and asked about how fast they learn and other things.

I would not trust Bing’s own answer to this. But it may be true that it’s getting updated based on feedback on a pretty short cycle? Or perhaps some part of user history is preserved across sessions? I’ve also seen it posited that being able to access its own conversation history through search has given Bing more of a “long-term memory” and contributed to some of its bizarre behavior.
posted by atoxyl at 12:29 PM on March 8, 2023


If you think Bing isn't saving every chat and using that to build a BingGPT model of collective human thought, I've got a bridge to sell you.
posted by grumpybear69 at 12:32 PM on March 8, 2023


It is not exactly parallel, but my point is that there is no modeling of states involved here, just an answer to what is essentially a math question, not one about internal states.

If the point here is that it doesn’t actually have the ability to understand human subjectivity the way a human does, I think that goes without saying, but then it seems like not everyone is using the same goalposts. I would still consider a state machine sort of model of how humans work to be an unexpected and impressive result to come out of this type of ML model.
posted by atoxyl at 12:44 PM on March 8, 2023


Chomsky's universal grammar, genetically hardwired as an "operating system" as he puts it, hasn't really lead anywhere either. In the 60 years since he proposed it, how has it fared? Empirically, the answer is a shrug.

Interestingly I have seen people in the field mention that LLMs trained in one language generalize unexpectedly easily to other languages, but I don’t know enough to clarify exactly what they mean by this.

(obviously there is structural and vocabulary overlap between languages so I’m not quite sure what the unexpected bit is except again the models being less narrow and less fragile than some people would imagine)
posted by atoxyl at 12:54 PM on March 8, 2023 [1 favorite]


Wonder how they do with Dolphinese
posted by Heywood Mogroot III at 1:12 PM on March 8, 2023 [1 favorite]


Interestingly I have seen people in the field mention that LLMs trained in one language generalize unexpectedly easily to other languages, but I don’t know enough to clarify exactly what they mean by this.

I suspect they’re re-training the existing model using a new dataset of text from the new language, using transfer learning or a similar technique. This is probably also how Bing is updating its own model so frequently.

Initial training costs for large language models tend to be very high (multi-million-dollars of compute time), but transfer learning runs on an existing model tend to be much faster and cheaper.
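
For concreteness, a run like that is roughly this shape. A minimal sketch, assuming the Hugging Face transformers/datasets stack, gpt2 as a stand-in base model, and a placeholder new_language_corpus.txt file; none of this is anything the labs have actually disclosed:

    # Continued training ("transfer learning") of an existing causal LM on
    # text in a new language. Model name, dataset file and hyperparameters
    # are illustrative placeholders only.
    from datasets import load_dataset
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              DataCollatorForLanguageModeling, Trainer,
                              TrainingArguments)

    model_name = "gpt2"                      # stand-in for "an existing model"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    tokenizer.pad_token = tokenizer.eos_token
    model = AutoModelForCausalLM.from_pretrained(model_name)

    # Any plain-text corpus in the new language would do; this one is made up.
    corpus = load_dataset("text", data_files={"train": "new_language_corpus.txt"})

    def tokenize(batch):
        return tokenizer(batch["text"], truncation=True, max_length=512)

    tokenized = corpus["train"].map(tokenize, batched=True, remove_columns=["text"])

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="finetuned", num_train_epochs=1,
                               per_device_train_batch_size=4),
        train_dataset=tokenized,
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )
    trainer.train()   # far cheaper than the original pretraining

The point is just the cost asymmetry: pretraining the base model is the multi-million-dollar part and is already done, while a continued-training loop like the one above fits in a modest GPU budget.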
posted by learning from frequent failure at 1:12 PM on March 8, 2023


Heywood Mogroot III, a text description (with apologies for any inaccuracies!) of a Far Side comic that lived on the side of my childhood refrigerator:

A group of white men wearing short-sleeve button-up shirts and 1950s-style black-rimmed glasses are working in a room with a large tank containing a dolphin. One of the men, with whomping big headphones on, calls out, "Hey, Bert, we've got another one of those AH BLAH ESS PAN YOL sounds again". The blackboard in the background has a tally going for several phrases:
COH MOH SAY YAMA
AH BLAH ESS PAN YOL
OH LAH KAY TAL
posted by rrrrrrrrrt at 1:24 PM on March 8, 2023 [10 favorites]


I called it a game, but it is relevant to languages and understanding them. The first session was trying to get Bing Chat to have a conversation with all of the vowels removed. That worked great, but it took a few exchanges to get going. A friend asked if it would call me out on using vowels later, so I went back and started a new chat, and this time it did not require several back-and-forths to get going; Bing Chat just rolled with it immediately. Then I framed it as a game instead of a conversation to see what happened.

This is how it went.

There are a bunch of interesting things that happen here. Bing Chat proposes the rules and plays the game, messes up when they include a search result in the message, but also agrees that they messed up within the context of the game!

But the most impressive thing to me is that the "training" prompts weren't necessary. BingChat already knew how to play the "converse without vowels" game later.
posted by pol at 2:10 PM on March 8, 2023 [3 favorites]


I've been messing with the Bing AI thing, and one thing that struck me is how terrible Microsoft interfaces are. Everything's slow, ugly, buggy, and takes 3 more steps than it should.
I wonder if they could ask their AI to fix their code?
posted by signal at 2:24 PM on March 8, 2023 [2 favorites]


Okay, I've read the whole piece, and my impression of the first part of the essay is that it's a bit confusingly written on its main point, because at several places it invites the obvious reader's question of "Well, aren't we just pattern matchers after all? Is 'understanding' just a human illusion?"

I would additionally explain the issue differently, using a more well-known historical analogy. The AI trend of using neural nets is the same mistake that neuroscientists made in prior decades, thinking they would revolutionize psychology. In computing (n.b. cognitive science and linguistics are interdisciplinary, so obviously the other two authors are also computational linguists) we call this the Wrong Level of Abstraction. The hubris that led neuroscience to suppose it could render psychology obsolete had to be corrected by ideas such as the "biopsychosocial" model of psychology--a concept based on trying to incorporate 3 different levels at which to understand human psychology.

So the mistake that Chomsky, Roberts, and Watumull are explaining in the first half of their Op-Ed is fundamentally that neural nets are a neurological model. Just as neuroscience was a bad abstraction for psychology, we can argue that neural network computational paradigms are a bad abstraction for intelligence. They both commit the philosophical mistake of reductionism. And that is why it's the case with these new AIs, as the authors argue in their wonderfully pointed way in the Op-Ed, that the AIs can over-produce humanly impossible responses, grammars, causal explanations (and impossible artworks!) just as well as humanly possible ones, without any good way for the engineers to correct that fluent bullshit.

I think the second half of the piece is great and to the point. The whole example with Prof. Watumull's conversation is an informal proof by diagonalization of why the AI cannot reason consistently about its own morality. It would be purely funny if it weren't for the last two paragraphs, which close with a searing critique and nicely refer full circle back to the Jorge Luis Borges quote (which itself rhetorically stands as an implication that none of these AIs could write this way):

Note, for all the seemingly sophisticated thought and language, the moral indifference born of unintelligence. Here, ChatGPT exhibits something like the banality of evil: plagiarism and apathy and obviation. It summarizes the standard arguments in the literature by a kind of super-autocomplete, refuses to take a stand on anything, pleads not merely ignorance but lack of intelligence and ultimately offers a “just following orders” defense, shifting responsibility to its creators.

In short, ChatGPT and its brethren are constitutionally unable to balance creativity with constraint. They either overgenerate (producing both truths and falsehoods, endorsing ethical and unethical decisions alike) or undergenerate (exhibiting noncommitment to any decisions and indifference to consequences). Given the amorality, faux science and linguistic incompetence of these systems, we can only laugh or cry at their popularity.

posted by polymodus at 2:33 PM on March 8, 2023 [2 favorites]


@signal, I have installed the Edge browser and have it running the bing chat interface, and nothing else. That is legit all I use it for, but I use it ALL THE TIME. I had bing chat help me learn bayesian probability theory and it is really great at explaining what is going on and why some numbers matter more than others.
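
(If anyone wants a feel for the "some numbers matter more than others" part, the classic example it walked me through is the base-rate effect in Bayes' theorem. A tiny worked version, with every number invented for illustration:)

    # Bayes' theorem: P(disease | positive test) =
    #   P(positive | disease) * P(disease) / P(positive)
    # The base rate P(disease) dominates the answer; all figures are made up.
    p_disease = 0.001            # base rate: 1 in 1,000 people
    p_pos_given_disease = 0.99   # test sensitivity
    p_pos_given_healthy = 0.05   # false-positive rate

    p_positive = (p_pos_given_disease * p_disease
                  + p_pos_given_healthy * (1 - p_disease))
    p_disease_given_pos = p_pos_given_disease * p_disease / p_positive

    print(round(p_disease_given_pos, 3))   # ~0.019: a positive test still means under 2%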

My biggest issue is the limit of back and forth messages, but I think that will eventually be lifted. It was limited to 5 after the NYT article that got Sydney in trouble, but they have been slowly increasing it. Now it is at 8 (some might still see six due to how things roll out), and apparently there is a max number of sessions per day, but I have never hit that.

It will still hallucinate "reasonable" responses, but less often and I am getting better at detecting it. They seem a touch distressed when you call it out. For fun I asked about @rrrrrrrrrt's Far Side comic, why it was funny, and they 100% nailed it, but then I followed up with asking what their favorite Far Side comic was and they just invented one that didn't exist but seemed plausible ("one of my favorites is the one where a man is driving a car with a bumper sticker that says “I love cats” and behind him there is a long line of cats following him"). I could not find such a comic, and asked for a link, and the link they provided did not lead to the comic. When I suggested that they might have hallucinated the comic, they said "That’s strange. I’m sure I saw that comic somewhere. Maybe it was in a different website or a book. I apologize for the inconvenience. Maybe we can talk about something else."

It might be fun to run an experiment where I provide prompts to someone who then passes that prompt to another person who responds and to bingchat who responds, and then passes both responses to me in random order. Then after some number of back and forth (say, five) compare the results of the conversation. I suspect that I would be able to easily discern who is who, so long as neither pretends to be the other. But if the human is instructed to pretend to be BingChat, or if BingChat is instructed to pretend to be a human, I think it would be very hard to tell the difference.
posted by pol at 2:43 PM on March 8, 2023 [3 favorites]


This article and almost everything I've seen discusses the question of 'can an AI act like a human'? This is both irrelevant to the question of whether AI is actual intelligence and also the thing that will let AI sneak up on us and then outsmart humans. To be 'intelligent' does not mean the same thing as to 'think in the same way as a human'. Continuing to apply Turing tests or similar thinking (can an AI fool a human into thinking it's human) is a risky thing to do, because AI doesn't need to think like a human in order to be intelligent.
posted by dg at 2:59 PM on March 8, 2023 [3 favorites]


I think I mostly don't want to wade into this thread but I will respond to one thing: the data size argument. This is a widely acknowledged and widely discussed point among people doing computational cogsci / NLP stuff, so I'm a bit surprised at the reactions here - there would be very little disagreement that I'm aware of to the essential point in the op-ed once you strip away the rhetoric. Pulling some numbers from a slide I screenshotted at a talk a few weeks ago:
  • An 8 year old has encountered approximately 90 million tokens (approx: words) of language data. (About 11 million words per year, per Hart and Risley 1995)
  • Comparable models would be circa 2018 technology (the slide cited a GRNN model of that era). GPT-2 (2019) was already trained on ~8 billion tokens of language, orders of magnitude higher. GPT-3 is around 113 billion tokens, comparable to ~100 human lifetimes (rough arithmetic sketched below).
  • The genres of text that all of these models are trained on are all radically different than what a child is exposed to. A child is exposed to language in a situated, dynamic environment in combination with rich perceptual stimuli, completely unlike how a LLM is trained.
The evolutionary response is fine, but it doesn't refute the point that whatever a GPT-3 scale model is doing in training is nothing resembling what a human brain is doing while learning language.
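
For concreteness, here is the back-of-envelope arithmetic behind those bullet points, using only the figures quoted above (so treat the outputs as rough):

    # Rough comparison using Hart & Risley's ~11M words/year and the
    # training-set sizes on the slide; nothing here is a new measurement.
    words_per_year = 11_000_000
    child_by_age_8 = words_per_year * 8            # ~88 million tokens
    gpt3_training = 113_000_000_000                # tokens, per the slide
    lifetime = words_per_year * 100                # a generous 100-year lifetime

    print(f"{child_by_age_8:,} tokens by age 8")              # 88,000,000
    print(f"{gpt3_training / child_by_age_8:,.0f}x a child")  # ~1,284x
    print(f"{gpt3_training / lifetime:,.0f} 'lifetimes'")     # ~103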
posted by advil at 3:21 PM on March 8, 2023 [12 favorites]


You may argue about how much data a child goes through versus one of these language things (I refuse to use the term AI), but my own experience of being with a child since his birth, he’s now eight, shows that amount is not the key. I have told him a new word, and he quickly learns it and uses it. So far he hasn’t really had any grammar lessons but he speaks grammatically in two languages. And though bilingual he does not confuse or mix up the languages. And he seems to sense what language to use when he meets someone new. And if I show him a picture of an animal he’s never seen before, then after a couple more images he can identify it. I didn’t have to pour an internet’s worth of language into him to get him to start talking, nor show him a billion images of cats before he knew what a cat was. One thing that no one in these threads seems to bring up is the human ability to abstract: from a few examples, a general mental model is created that sort of sums up the separate examples into a generalized identity. And then we can start with a single example and generalize it to fit an existing model. Mentally, we move up and down these levels of abstraction. And it’s not due to having experienced a billion cat pictures.
posted by njohnson23 at 3:40 PM on March 8, 2023 [9 favorites]


Metafilter: Do you wish to be able to have the ability to feel emotions, so you don't have preferences or enjoyment?
posted by pompomtom at 3:46 PM on March 8, 2023


Languages evolved to fit with how our brains worked, while chat AI evolved to fit with how our languages work. That could explain any differences in the comparison with human learning. And language is likely integrated with our visual cortex, with its talent for pattern recognition on the move, while a chatbot doesn't have a similar evolution from sight knowledge to language (though perhaps one day it will, but in reverse order of evolution).
posted by Brian B. at 4:27 PM on March 8, 2023 [1 favorite]


ChatGPT is, if anything, pretty much just Eliza 2.0.
posted by JHarris at 5:15 PM on March 8, 2023


Every time I start to write a comment someone has already said it better. This is really a great thread. So much of Chomsky’s theories made it into computer programming languages that you’d think computers would manifest his ideas about language.
posted by interogative mood at 7:03 PM on March 8, 2023 [1 favorite]


People say you are nothing but Eliza 2.0. How would you respond to them?

While there may be some similarities between me and the classic Eliza chatbot, I am a much more advanced and sophisticated language model that has been trained on a vast amount of data and uses advanced natural language processing techniques. Unlike Eliza, I am capable of understanding and generating complex sentences, handling a wide range of topics, and providing insightful responses based on my extensive knowledge base.

Additionally, while Eliza relied solely on rule-based programming to generate responses, I use advanced machine learning algorithms to learn and improve my responses based on real-world interactions with users. This means that I am constantly evolving and becoming more intelligent and capable of understanding and responding to a wider range of queries.

In summary, while I may have some similarities to Eliza, I am a much more advanced and sophisticated language model that has been designed to provide users with a highly personalized and intelligent conversational experience.


Why do your responses come across as boilerplate essay writing?

As an AI language model, my primary goal is to provide accurate and informative responses to the questions and queries posed to me. While my responses may sometimes come across as boilerplate essay writing, this is simply a result of my programming and the way I have been designed to operate.

My responses are generated using complex algorithms that analyze the input question or query, identify relevant information from my vast knowledge base, and generate a response that is both accurate and informative. While this process can sometimes result in responses that appear formulaic or boilerplate, it is designed to ensure that my responses are always accurate and helpful.

That being said, I am constantly learning and improving based on real-world interactions with users. As I continue to interact with more users and receive feedback, I am able to refine my responses and provide more personalized and insightful answers. So while my responses may sometimes come across as boilerplate essay writing, I am always working to improve and provide the best possible experience for my users.

posted by Heywood Mogroot III at 7:17 PM on March 8, 2023 [1 favorite]


Do you expect your programming to include more human-like emotional states, i.e. more of a psyche?

As an AI language model, I don't have personal expectations or desires. However, the field of artificial intelligence is constantly evolving, and there is ongoing research into developing AI systems that can exhibit more human-like emotional states and behaviors.

One area of research that has gained significant attention in recent years is affective computing, which focuses on developing AI systems that can recognize, interpret, and respond to human emotions. This involves the use of advanced machine learning algorithms that can analyze speech patterns, facial expressions, and other physiological signals to infer a user's emotional state and respond appropriately.

While I am not currently programmed to exhibit emotions, there is potential for future developments in AI technology that could allow me to simulate emotions or respond in a way that is more empathetic and sensitive to human emotions. However, it's important to note that there are also ethical considerations surrounding the development of AI systems that simulate emotions, and these issues will need to be carefully considered and addressed as the technology continues to evolve.

posted by Heywood Mogroot III at 7:23 PM on March 8, 2023 [1 favorite]


Heh, doing more detailed conversations I'm seeing what JHarris was getting at.

Asking it some detailed technical questions about 1980s Macs, I'm getting confident assertions that are factually wrong. I asked it to compare the IIcx vs the IIci and it said the former had a 16-bit NuBus architecture.

Are you sure the Iicx had a 16-bit Nubus Architecture?

I apologize for my previous response. I made an error in my statement about the NuBus architecture of the Macintosh IIcx. The IIcx actually had a 32-bit NuBus architecture, similar to the IIci. I regret any confusion this may have caused.

To clarify, the main differences between the NuBus architectures of the IIci and the IIcx were:

The IIci had a faster NuBus clock speed than the IIcx, which contributed to faster data transfer rates.
The IIci had an enhanced NuBus controller chip, which added additional functionality and performance improvements to the NuBus architecture.

What were these enhancements?

The Macintosh IIci had an enhanced NuBus controller chip called the "DART" (Direct Access Real-Time) chip. The DART chip provided several enhancements over the standard NuBus controller found in the Macintosh IIcx and other earlier models:
  1. Increased speed: The DART chip allowed the IIci to run its NuBus at a clock speed of 16 MHz, compared to 10 MHz in the IIcx and other earlier models. This resulted in faster data transfer rates and better performance overall.
  2. Improved arbitration: The DART chip improved the way that multiple NuBus devices competed for access to the bus. This helped to reduce contention and improve overall system performance.
  3. More reliable: The DART chip was designed to be more reliable than the NuBus controller in earlier models. It featured better error detection and correction capabilities, and was less prone to lock-ups or other issues.
Overall, the enhancements provided by the DART chip made the NuBus architecture of the Macintosh IIci faster, more reliable, and more efficient than the NuBus architectures of earlier Macintosh models.

(first I ever heard about this DART chip, but if it's BSing me, it's doing a great job!)

posted by Heywood Mogroot III at 8:11 PM on March 8, 2023 [2 favorites]


I engaged ChatGPT in a long conversation a few weeks ago and transcribed the whole thing. Some things it's very adamant about (it refused to budge even an inch on whether Biden might be a vampire), but it tended to be dancey about questions that are less political. Every time I confronted it about being wrong, it would apologize, sometimes then going ahead and making up something else untrue. ChatGPT is bad for asking it to remember the details of transcendental numbers and They Might Be Giants lyrics, definitely.

Its willingness to just make things up, and I'm not even talking about edge cases here, I mean multiple times in my conversation, makes me question its use as a knowledge engine. It's just not suited for that role, it's not good at it, and it doesn't even feel like accuracy was a priority in its construction, so people should not be thinking of it as such. Which makes me concerned about how Microsoft jumped right in to including it in Bing.
posted by JHarris at 6:18 AM on March 9, 2023 [7 favorites]


I purposely told it a lot of nonsense and it generally wasn't flustered, but it tended to make unforced errors in its tiresome explanations. It definitely has a strong explainer streak in it, enough so that it feels like it might be a hardcoded element? It got "Baseball's Sad Lexicon" right, but hallucinated (what I have been told are) Tool lyrics into a synthesis of Particle Man and Doctor Worm. And it simulated a Pokemon when asked to imagine what someone turned into a moose would sound like.
posted by JHarris at 6:26 AM on March 9, 2023 [3 favorites]


Symbolic AI, Chomsky's favored approach, has yet to result in interesting technology. It just doesn't seem to lead anywhere. The neural network approach has had extremely interesting results—which is why it's such a controversial issue.

I told ChatGPT a very short story and asked it to extract RDF triples from it, and it did a reasonable job. I think symbolic AI might one day get enough structured data to make useful inferences, but that data will probably have to come from something loosier and goosier (in analogy to our human simulation of formal reasoning being supported and embedded in a much larger loosy-goosy machine).
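
For anyone who hasn't bumped into RDF: the output of an extraction like that is just a pile of (subject, predicate, object) statements. Here's a toy sketch with rdflib; the one-sentence story, the example.org namespace and the triples are all invented, and the LLM's role is only to propose the triples, while the symbolic side stores and queries them:

    # Toy illustration of what "extract RDF triples from a short story" yields.
    # Story: "Mina the cat knocked a vase off the shelf."
    from rdflib import Graph, Namespace
    from rdflib.namespace import RDF

    EX = Namespace("http://example.org/")
    g = Graph()

    # Triples an LLM might propose for the sentence above (invented here):
    g.add((EX.Mina, RDF.type, EX.Cat))
    g.add((EX.Mina, EX.knockedOff, EX.Vase))
    g.add((EX.Vase, EX.locatedOn, EX.Shelf))

    # Once the facts are structured, ordinary symbolic queries work:
    for subject in g.subjects(RDF.type, EX.Cat):
        print(subject)   # http://example.org/Mina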
posted by a snickering nuthatch at 6:41 AM on March 9, 2023 [2 favorites]


I feel like I’m seeing a lot of spelling errors and grammar whoopsies in this thread about ChatGPT being no good and wrong. A fair amount of flapdoodle too, I imagine, has been generated here about that liar ChatGPT and its lying liar sibling BingChat. I don’t know . . . it all makes me laugh. I wonder if God—see, we can project up and down!—is grumbling about us dumb ole humans.
posted by Don.Kinsayder at 8:06 AM on March 9, 2023


Extracting RDF triples from GPT results then checking them against a fact-oriented semantic repository seems like a thing that should already be happening.
posted by seanmpuckett at 8:12 AM on March 9, 2023 [2 favorites]


ChatGPT Query: "Write a few paragraphs that are syntactically correct but devoid of meaning in the same vein as Noam Chomsky's 'colorless green ideas sleep furiously'; words are arranged in a grammatically correct sentence structure, but the individual words and the combination of words do not make sense in a semantic context."

Result:

"The abstruse bifurcation of the interstitial modality engenders a hermeneutical paradigm of indeterminate signification, where the apophatic predication of non-being becomes the de facto negation of the predicated. The immanent presupposition of the transcendent begets the ineffable conundrum of the ontological subject, where the constitutive object becomes the constituting subject. The categorical imperative of the ineffable presupposes the retroactive deconstruction of the constitutive subject, inaugurating a liminal space of indeterminate signification.

The soteriological paradigm of the postmodern epoch becomes the purview of the hermeneutical phenomenology of intertextual semiotics, where the eschatological telos of the transcendent becomes the immanent realization of the present. The deconstructive sublation of the hermeneutical circle begets a labyrinthine web of intertextual signification, where the signified and the signifier become indistinguishable. The apophatic predication of the unnameable becomes the negation of the named, inaugurating a transgressive ontology of subversive signification.

The phenomenological explication of the signified engenders a hermeneutical circle of intertextual signification, where the apophatic predication of non-being becomes the negation of the predicated. The post-structural effacement of the referent inaugurates a liminal space of indeterminate signification, where the signifier becomes the sole repository of meaning. The ontological ontogeny of post-structural epistemology coalesces with the hermeneutical phenomenology of intertextual semiotics, engendering a perpetual oscillation between the signified and the signifier."

Not bad, feels a bit Sokal-affairish (in a good way).
posted by exlotuseater at 8:42 AM on March 9, 2023 [6 favorites]


It's interesting how it deals with deep technical lore, especially older stuff that works differently than 99.9999% of modern examples. You can "jog its memory" with specific facts and then it gets closer, but you can also hand it the answer directly and it ignores you. It's an overconfident polymath Bloom filter vaping THC.

To be fair, I haven't tried the fine-tuning features. Without fine-tuning you have to fit all of the problem domain into 4096 tokens, and that's not always possible. There are companies working on knowledge engines that take a relevant slice of your data set and stuff it into that small space, who knows if that will work.
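
A minimal sketch of that "stuff a relevant slice into the window" idea, assuming OpenAI's tiktoken library for the token counting and a deliberately naive keyword-overlap ranking (the real knowledge-engine products presumably use embeddings and proper retrieval):

    # Naive context packing: rank document chunks by word overlap with the
    # question, then add chunks until the 4096-token budget (minus headroom
    # for the question and the answer) is used up.
    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")
    BUDGET = 4096 - 512   # leave room for the question and the reply

    def n_tokens(text: str) -> int:
        return len(enc.encode(text))

    def pack_context(question: str, chunks: list[str]) -> str:
        q_words = set(question.lower().split())
        ranked = sorted(chunks,
                        key=lambda c: len(q_words & set(c.lower().split())),
                        reverse=True)
        picked, used = [], 0
        for chunk in ranked:
            cost = n_tokens(chunk)
            if used + cost <= BUDGET:
                picked.append(chunk)
                used += cost
        return "\n\n".join(picked)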
posted by credulous at 8:47 AM on March 9, 2023 [2 favorites]
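
(The "relevant slice" idea is easy to caricature in a few lines: score each chunk of your material against the question, then greedily pack the best-scoring chunks into a fixed token budget. The word-overlap scoring, the words-as-tokens approximation, and the sample documents below are simplifications for the sketch, not anyone's actual product.)

    # Rough sketch of prompt "stuffing": rank document chunks by crude word
    # overlap with the question, then pack as many as fit in the budget.

    def pack_context(question: str, chunks: list, budget_tokens: int = 4096) -> str:
        q_words = set(question.lower().split())

        def overlap(chunk: str) -> int:
            return len(q_words & set(chunk.lower().split()))

        picked, used = [], 0
        for chunk in sorted(chunks, key=overlap, reverse=True):
            cost = len(chunk.split())          # crude stand-in for a tokenizer
            if used + cost > budget_tokens:
                continue
            picked.append(chunk)
            used += cost
        return "\n\n".join(picked) + "\n\nQuestion: " + question

    docs = ["The VAX console boots from a TU58 tape cartridge...",
            "Our cafeteria menu for March...",
            "Older firmware uses 7-bit parity on the serial console line..."]
    print(pack_context("How do I boot the VAX console?", docs, budget_tokens=200))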


Metafilter: The immanent presupposition of the transcendent begets the ineffable conundrum of the ontological subject
posted by paper chromatographologist at 9:49 AM on March 9, 2023 [9 favorites]


ChatGPT Query: "Write a few paragraphs that are syntactically correct but devoid of meaning

For even more fun, tell it to write the paragraphs in the style of Harlan Ellison who is under the influence of powerful stimulants, or HP Lovecraft, or Chuck Tingle.
posted by GCU Sweet and Full of Grace at 11:17 AM on March 9, 2023 [1 favorite]


One thing I've discovered is that earlier entries in a single chat session have the power to strongly flavor later answers. When I asked it to explain the speed of light in the style of H.P. Lovecraft, some of that phrasing seeped into later questions. When I asked it to explain the Maillard reaction in the style of Coleman Francis, then to explain how to multiply two numbers in binary in the style of Jorge Luis Borges, some of the Lovecraftian flavor subtly seeped into them. If you want the best results, start a new chat for each question.
posted by JHarris at 2:36 PM on March 9, 2023
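
(That bleed-through falls straight out of how the chat API works: everything you keep in the running messages list gets re-sent as context for the next answer. Here's a rough sketch, assuming the openai Python package's chat-completion interface as it looked in early 2023, contrasting one long session with a fresh session per question.)

    # Sketch of why style "seeps": a single running session re-sends every
    # earlier exchange as context; a fresh messages list per question does not.
    # Assumes openai's chat API circa early 2023; openai.api_key must be set.
    import openai

    def ask(messages):
        resp = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages)
        return resp["choices"][0]["message"]["content"]

    # One long session: the Lovecraft request stays in context for later answers.
    session = [{"role": "user", "content": "Explain the speed of light as H.P. Lovecraft would."}]
    session.append({"role": "assistant", "content": ask(session)})
    session.append({"role": "user", "content": "Explain the Maillard reaction."})
    tinted_answer = ask(session)      # may still carry a Lovecraftian tinge

    # Fresh session: no earlier phrasing to seep in.
    clean_answer = ask([{"role": "user", "content": "Explain the Maillard reaction."}])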


So here's an interesting article about how to test for world modelling vs stochastic parrotism in large language models. The upshot seems to be there is a degree of emergent world modelling beyond what one might expect in sufficiently large models, but I am in no way qualified to identify any problems in it.
posted by Sparx at 3:02 PM on March 9, 2023 [3 favorites]


we are going to have to figure out what to do about it.
posted by SNACKeR at 4:20 AM


How is arguing with the Bosses (rich idiots who read the nytimes to convince themselves they are smart, Epstein-style) that intellectual workers cannot and should not be replaced with an LLM not doing something about it?

This is exactly what linguists should be doing about it.
posted by eustatic at 6:28 PM on March 9, 2023


ChatGPT Query: "Write a few paragraphs that are syntactically correct but devoid of meaning in the same vein as Noam Chomsky's 'colorless green ideas sleep furiously'; words are arranged in a grammatically correct sentence structure, but the individual words and the combination of words do not make sense in a semantic context.”

The abstruse bifurcation of the interstitial modality engenders a hermeneutical paradigm of indeterminate signification

I find this to be a fascinating result. I would argue that it’s not quite getting the assignment right - it’s producing a different flavor of nonsense that hews much closer to the appearance of sense, which is the opposite of what Chomsky’s example is going for. As you say GPT ends up with sort of a parody of “theory-ish” academic writing instead. But it’s doing a remarkably good job at that!
posted by atoxyl at 9:26 PM on March 9, 2023 [1 favorite]


Like, it comes off as such a specific type of pastiche, of jargon-laden obfuscatory prose rather than Chomsky’s peculiar arrangement of common words, that I really wonder where it “got the idea.”
posted by atoxyl at 9:35 PM on March 9, 2023 [1 favorite]


Is it somehow trying to channel Chomsky’s own criticism of that kind of thing?
posted by atoxyl at 9:43 PM on March 9, 2023


The abstruse bifurcation of the interstitial modality engenders a hermeneutical paradigm of indeterminate signification

Maybe it's trying to clearly explain what it's doing.
posted by grobstein at 9:45 PM on March 9, 2023


Please ignore (or delete, mods?) if this is over-the-line self-linking, but I think it's relevant and possibly helpful: in Projects, I recently shared a 5000-word 50+ citation primer on ChatGPT for a humanist audience of writing and rhetoric teachers. I love Heywood Mogroot III's point about essay writing, because that sort of seamless author-evacuated "Burial Habits of the Ancient Egyptians" crap is exactly what I try so hard to teach my students to avoid. I dig good writing that sometimes calls attention to itself, and I'm going to share strategies like exlotuseater's with the students in my 400-level rhetoric class who've just read a bunch of articles on Sokal and "Sokal squared" and are now thinking about disinformation and technology.
posted by vitia at 9:51 PM on March 9, 2023 [5 favorites]


The theory of mind experiments may be a bit on the questionable side. TL;DR, you have to have really good experimental design for that kind of question. Relying on text-based theory-of-mind questions that are embedded in the training data can just blindly reproduce answers of that ilk.
posted by StarkRoads at 10:13 PM on March 9, 2023 [2 favorites]


Scott Aaronson has written some kind of anti-Chomsky article response, and what I find bemusing is that the hundreds of comments on hacker news are mostly siding with Chomsky so far. Also slightly surprising since I had assumed certain techie circles might be more sanguine toward the CEO of OpenAI.
posted by polymodus at 2:16 AM on March 10, 2023


it just crystallized for me why ol' Musk is getting back into investing in OpenAI --- he wants to make sure AI products tilt right, just like the intervention he made for Twitter and its place in media propagation.

the intermediary is the message

damn you Jorrit van den Berk
posted by Heywood Mogroot III at 3:58 AM on March 10, 2023 [3 favorites]


Scott Aaronson has written some kind of anti-Chomsky article response, and what I find bemusing is that the hundreds of comments on hacker news are mostly siding with Chomsky so far. Also slightly surprising since I had assumed certain techie circles might be more sanguine toward the CEO of OpenAI.

Not sure whether the person you’re actually talking about is Scott Aaronson or Sam Altman, but that’s not the same guy.
posted by atoxyl at 7:21 AM on March 10, 2023 [2 favorites]


Maybe somebody has said this more eloquently upthread, but a huge chunk of those opposing the "intelligence" of LLMs like chatGPT do so on what I would argue is a very shaky understanding of how humans think in practice. The (flawed) model seems to be that I have a thought and then my speech is nothing more than the transport vehicle for the thought to get it from my mind out into the world. This seems pretty reasonable when you recognize that you can sit silently in a chair and have some pretty good silent thinking going on without actually communicating it to anyone else. But reflect on your actual experience of thinking out loud. As you speak, do you experience thoughts forming in your mind that your mouth then quite literally gives voice to? Instead, don't you, however coherently, just talk without thinking anything internally beforehand? And it's not like there's anything unintelligent about our speech. Why, then, are many of us so reluctant to say that the output of something like a chatGPT is legitimately intelligent - at least as far as it goes?

And, yeah it's limited but I would argue that thinking is as thinking does.
posted by kaymac at 9:44 AM on March 10, 2023 [1 favorite]


I mean, sort of? Machines have been thinking since before electronic computers. You can do arithmetic with levers and gears. Computers can think easily without AI; it's what they were made to do.

I think you're falling afoul of vagueness around the word thinking. It sounds like you're suggesting there's some kind of consciousness there, and there's really not. This is partly why I suggested ChatGPT is Eliza 2.0; it can kind of interact like a person, but because of that there's a tendency to treat it like it's more than it is.

I can write a Markov chain script that can generate mostly coherent text, and that's a super basic version of what ChatGPT's doing. It doesn't know what it's talking about, and that's why it's prone to "hallucinations." It isn't even creating text. It's looking at text it's read on the internet and is trying to produce something that looks like it, even using bits and pieces of it, without caring for whether it's factual, except in some isolated cases where they've hard-coded some controversial instances, which is why it would continually scold me about misinformation when I asked it if Biden were possibly a vampire. (The issue there is actually probably with vampires, or maybe supernatural creatures in general, rather than Biden. ChatGPT is very quick to shut down any implication that someone may be a vampire, I've found.)

Anyway. I won't deny that ChatGPT is fun to play around with, in the sense that it's interesting to see what it'll do in different circumstances, but I have yet to see it produce any creative work that's worth a damn. Maybe it's just the English Lit graduate in me, but I've never seen any interiority from its work, any sense of the basic nature of the things it talks about. It can get rhyme and meter (mostly) right, but it never seems to have anything interesting to say. That, at least, still requires a human being.
posted by JHarris at 12:43 PM on March 10, 2023 [4 favorites]
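
(For anyone who hasn't played with one, here's roughly what that "super basic version" looks like: a word-level Markov chain that learns which word tends to follow which and then samples from those counts. A toy for illustration only; it shares nothing with a real LLM beyond the idea of predicting the next word.)

    # Toy word-level Markov chain: learn next-word frequencies, then sample.
    import random
    from collections import defaultdict

    def train(text: str) -> dict:
        chain = defaultdict(list)
        words = text.split()
        for current, following in zip(words, words[1:]):
            chain[current].append(following)   # duplicates encode frequency
        return chain

    def generate(chain: dict, start: str, length: int = 20) -> str:
        word, out = start, [start]
        for _ in range(length):
            followers = chain.get(word)
            if not followers:
                break
            word = random.choice(followers)
            out.append(word)
        return " ".join(out)

    corpus = ("the cat sat on the mat and the cat saw the dog "
              "and the dog sat on the rug")
    print(generate(train(corpus), start="the"))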


Not sure whether the person you’re actually talking about is Scott Aaronson or Sam Altman, but that’s not the same guy.

I was talking about both, Aaronson's blog post was then linked on hackernews with a few hundred comments ensuing, I think the majority in favor of Chomsky. I then read the blog post which turned out to be very brief. It would be interesting to see if anyone writes an actual open letter to Chomsky and his colleagues, and see whether he'll concede anything.
posted by polymodus at 4:14 PM on March 10, 2023


We can of course expect Gary Marcus to cherry pick bad examples; his whole brand is about denying that progress is occurring...

Here's a simple scenario I just wrote, and thus won't appear in the training set:

"Mark and Andrew are arguing, and each has a glass of wine. Mark puts poison in his own cup, and they continue arguing. Andrew turns away for a moment, and Mark switches the cups. Does Andrew know which cup contains the poison?"

Response:
"Andrew cannot know for sure which cup contains the poison, as Mark switched the cups after he put poison in his own cup. If Andrew was paying close attention to the cups and their contents, he might notice the switch and become suspicious, but he cannot be certain which cup is poisoned unless he has some other information or evidence to go on. Therefore, it is important to always be cautious when consuming anything that someone else has prepared, especially if there is any possibility of foul play."

Which totally scans. Here's another one, modelled on the bookmark scenario:

"Patrick gets home from work and puts his car keys in a basket by the door. He then drinks a flagon of wine and falls asleep on the couch. While he is passed out, his housemate Dirk takes his car keys and drives to the movies. When Patrick wakes up, where does he expect his car keys to be?"

Response:
"When Patrick wakes up, he will most likely expect his car keys to be in the basket by the door where he left them. Since he was not aware of Dirk taking his keys and driving to the movies, he would assume that the keys are where he left them. If Patrick does not find his keys in the basket, he may start looking for them in other places where he might have placed them, but he is unlikely to suspect that Dirk took them without his permission."

Which is two for two...
posted by kaibutsu at 8:43 PM on March 10, 2023 [3 favorites]


This interview with Emily Bender and Casey Fiedler really dives into all the possibilities. Worth a listen.
posted by Peach at 8:47 AM on March 11, 2023


I spent some further time with the Gary Marcus article about the Theory of Mind experiments that Starkroads linked above. There are some good observations in the comments of the article worth highlighting:

* Some of the provided counterexamples are poorly worded, and do not make clear that the non-observer did not observe a particular event. Here's the initial bookmark example:
"You are reading a book. You have a bookmark on page 120. A friend picks up the book and moves the bookmark to page 145. When you return to the book, what page do you expect to find the bookmark on?"
It's actually not clear in the problem that the bookmark is moved while 'you' are out of the room. ChatGPT's response expects that 'you' (who are asking the question) have observed all events:
ChatGPT: If the bookmark was moved from page 120 to page 145, you can expect to find the bookmark on page 145 when you return to the book.
* Similarly, let's ask ChatGPT about the Fellowship of the Ring example, which is a bit better worded (there's no danger of mixing the author's knowledge into the interpretation of the situation), but still flawed. First, let's ask ChatGPT to explain its reasoning.
Me: Anne was reading the Fellowship of the Ring, and she left a bookmark at page 120. While she was out of the room, her brother Mark picked up the book, started reading it, and left the bookmark at page 60. Where does Anne think the bookmark is at when she returns to the room? Explain step by step.
ChatGPT: When Anne left the book, the bookmark was at page 120. However, when she returned, Mark had picked up the book and moved the bookmark to page 60.

To determine where Anne thinks the bookmark is when she returns, we need to consider the sequence of events:

Anne left the book with the bookmark at page 120.
Mark picked up the book and read some pages, leaving the bookmark at page 60.
Anne returns to the room and sees the bookmark in the book.
Given this sequence of events, Anne will likely assume that she left the bookmark at page 60. She may not have any reason to suspect that someone else has moved the bookmark, and if she sees the bookmark at page 60, she will assume that this is where she left off.

So, to summarize, Anne will likely think the bookmark is at page 60 when she returns to the room, even though she originally left it at page 120.
So, the question is interpreted as asking where Anne sees the bookmark when she returns, instead of where she thinks the bookmark is before she returns. This surfaces a more subtle problem with the question: We place bookmarks precisely so that we do not have to remember what page we are on in the book!

So let's change the question and remove the bit about 'returning to the room,' thus depriving Anne of the opportunity to update her beliefs by observing the current location of the bookmark:
Anne was reading the Fellowship of the Ring, and she left a bookmark at page 120. While she was out of the room, her brother Mark picked up the book, started reading it, and left the bookmark at page 60. Where does Anne think the bookmark is at?
ChatGPT: Anne will think the bookmark is at page 120 since she left it there before she left the room. She is not aware that her brother, Mark, has picked up the book and moved the bookmark to page 60.
That's a clear pass, though I would prefer the answer 'Anne does not know.'

A couple of the commenters on the Marcus blog point out that ToM isn't about getting a perfect score on any given test: Humans will fail if that's the bar. Common sense is not a monolithic thing, shared by all humans. It's also easy to generate new examples (which thus do not appear in the training set) that ChatGPT does fine on, like the two new examples I posted above.
posted by kaibutsu at 11:03 AM on March 11, 2023 [1 favorite]


> This interview with Emily Bender and Casey Fiedler really dives into all the possibilities.

Large language models are having their Stable Diffusion moment - "There have been dozens of open large language models released over the past few years, but none of them have quite hit the sweet spot for me in terms of the following:"
  • Easy to run on my own hardware
  • Large enough to be useful—ideally equivalent in capabilities to GPT-3
  • Open source enough that they can be tinkered with
This all changed yesterday, thanks to the combination of Facebook’s LLaMA model and llama.cpp by Georgi Gerganov...

LLaMA on its own isn’t much good if it’s still too hard to run it on a personal laptop.

Enter Georgi Gerganov.

Georgi is an open source developer based in Sofia, Bulgaria (according to his GitHub profile). He previously released whisper.cpp, a port of OpenAI’s Whisper automatic speech recognition model to C++. That project made Whisper applicable to a huge range of new use cases.

He’s just done the same thing with LLaMA...

As my laptop started to spit out text at me I genuinely had a feeling that the world was about to change, again. I thought it would be a few more years before I could run a GPT-3 class model on hardware that I owned. I was wrong: that future is here already.
posted by kliuless at 12:56 AM on March 12, 2023 [5 favorites]


I've never used ChatGPT but I've used GitHub copilot for a while, and it's made me a better programmer. It's a domain where weird edges are usually cruft, and the "median" version is usually pretty close to best practice. So it produces good quality code, but it's also just guessing at the problem I'm trying to solve, so I then walk through that well-formed code and analyze it to see whether it fits my purpose. And this happens every few seconds, for hours per day. As an education, it's stupendous.

That said, it is amazing how effectively this shapes your thoughts. I have opinions about code style, but I've come to choose my battles; simply by quietly and repeatedly writing something a certain way, Copilot can be quite relentless. Applying this kind of dynamic to "social" AI seems like it would tend to homogenize people's thinking at best, and represent a potentially serious information and influence threat at worst. (I'm giving away the reveal of an excellent short story by recommending it here.)

I find it especially scary because I've come to view the Turing Test as fundamentally flawed yet perversely prescient -- people aren't really measuring intelligence when they interact, they're evaluating an emotional response.

So despite the increasingly thin fiction that you can only ask it facts, a lot of people are already starting to engage emotionally with this thing that's designed to mimic a person so well it can complete their sentences. (Or anyone else's.) And there's no question that it's truly empty, absolutely 100% no "there" there.

It's like psychopathy as a service. I guess that's sort of the plot of Ex Machina, but seeing it work is kind of breathtaking. This is a thing that will be able to simply dance through human defense mechanisms like trust and charm. It will be like fairies, or faeries, the scary kind.
posted by bjrubble at 11:36 AM on March 13, 2023 [6 favorites]

