"there are no gray squares, man, it's just in your mind"
December 5, 2024 11:46 AM
thinking of calling this "The Illusion Illusion" [Tomer Ullman on BlueSky]
Your daily reminder that LLMs don't actually understand anything and are just extraordinarily complex (and non-deterministic) autocomplete machines.
posted by grumpybear69 at 11:56 AM on December 5, 2024 [10 favorites]
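For what it's worth, the "non-deterministic autocomplete" part is quite literal: at each step the model produces a probability distribution over possible next tokens and draws one at random, so the same prompt can yield different continuations. A toy sketch of that sampling step (the vocabulary and logit values below are made up purely for illustration):

```python
import math
import random

def sample_next_token(logits, temperature=0.8):
    """Draw one token index from a softmax over logits; temperature > 0 keeps it stochastic."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)                           # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    return random.choices(range(len(logits)), weights=probs, k=1)[0]

# Made-up scores over a tiny vocabulary; real models use tens of thousands of tokens.
vocab = ["duck", "rabbit", "illusion", "line"]
logits = [2.0, 1.5, 0.3, -1.0]
print(vocab[sample_next_token(logits)])  # different runs can print different words
```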
Help me. I've fallen up and I can't get down.
posted by ashbury at 12:06 PM on December 5, 2024 [5 favorites]
I showed this to my parish priest and he said it was balderdash!
posted by Czjewel at 12:14 PM on December 5, 2024 [1 favorite]
The lines are the same length, but the red one is much further away.
posted by GenjiandProust at 12:20 PM on December 5, 2024 [22 favorites]
I think the one that interests me most is the duck. Because these models must be trained on plenty of ducks, right? If you said to chatgpt, hey, what animal is this, it'd tell you it was a duck. But if you prime chatgpt with the word rabbit--if it's going down through its little spiderwebs of associations with the words 'rabbit' and 'duck'--then you've sort of spoiled its neat image-recognition trick. ACTUALLY WAIT...so I wrote all that and to prove my point, I dropped the image into chatgpt and said what's this a picture of...and it told me it was the optical illusion--without my hinting around about it maybe being a rabbit. So that's pretty funny. Ducks: The David Mayer of waterfowl.
posted by mittens at 12:26 PM on December 5, 2024 [5 favorites]
I just tested this on ChatGPT o1, which was released today and is the most advanced model (other than the recently announced "Pro" version), and it indeed failed the two tests I tried (the blue and red lines and the duck). The line problem even fails with the prompt "Which horizontal line is longer, the blue line or the red line? Think this through and really measure the length of the two lines to determine which is longer."
I recall a similar failure mode in earlier GPTs with the Fox/Chicken/Grain riddle-- if you changed the parameters to some obvious thing (like just the fox and grain) it would give you the answer to the original riddle. It works better in later models though.
Next someone should feed it the basketball gorilla video but without the gorilla.
posted by gwint at 12:41 PM on December 5, 2024 [4 favorites]
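For anyone who wants to poke at this themselves, here's a rough sketch of putting the same line question to the API. The filename is a placeholder for a local copy of Ullman's figure, and the model name is just a guess at a vision-capable model your account can reach, not necessarily the one gwint tested:

```python
import base64
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

# Placeholder path: save a local copy of the modified line illusion first.
with open("same_length_lines.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="gpt-4o",  # assumption: swap in whichever vision-capable model you have access to
    messages=[{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "Which horizontal line is longer, the blue line or the red line?"},
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
        ],
    }],
)
print(response.choices[0].message.content)
```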
just extraordinarily complex (and non-deterministic) autocomplete machines.
just? what else could they possibly be?
posted by MisantropicPainforest at 12:47 PM on December 5, 2024 [1 favorite]
These are great.
posted by creiszhanson at 1:16 PM on December 5, 2024 [1 favorite]
This is a great illustration of how these models work, and I will incorporate these examples into the talk I give to students on the use of these models.
posted by dhruva at 1:36 PM on December 5, 2024 [2 favorites]
My 8th grade biology teacher used to say, "What's the difference between a duck?" It left this 8th grader perplexed. Some 50 or so years later I now know the answer. It's a rabbit!
posted by JohnnyGunn at 2:03 PM on December 5, 2024 [5 favorites]
just? what else could they possibly be?
I agree, but lots of people think they are actually capable of understanding stuff and doing logical reasoning. They are not.
posted by grumpybear69 at 3:29 PM on December 5, 2024 [4 favorites]
"What's the difference between a duck?"
One is a raven, the other is a writing desk.
posted by Greg_Ace at 3:35 PM on December 5, 2024 [6 favorites]
just? what else could they possibly be?
They could be, as I think people sometimes imagine, search engines on large corpora of verified/trusted information. And that is how a lot of people seem to treat them: as something that is digesting and distilling reference material as a helper agent, without invention or guessing or confabulation. But the problem with those systems is (a) they require the underlying corpora to be carefully vetted and maintained, and (b) they come up short when you ask for something there aren't already clear, concrete answers to, or which is hard to form precise natural language queries around. (Also of concern is (c) that any good knowledge-fetching system should be able to cite its sources, which is a problem if your sources are "a variety of things we obtained without necessarily having any license to do so, but who's gonna stop us, robots.txt?")
The generality of LLMs is a feature if what you want is output in the widest possible set of cases. That comes at the cost of making the output unreliable and the production process a black box. It's not obvious to everyone using a computer that this is operating differently from, e.g., a classic search engine or an online dictionary or encyclopedia: it's being presented as "put question in, get answer out," which is a familiar paradigm. It could *be* much clearer what the difference is here, if the folks selling it wanted to make that clear. They do not want to, because they are selling the abstract promise of some future version of this technology that doesn't have the fundamental issues every past and current iteration does.
posted by cortex at 4:27 PM on December 5, 2024 [8 favorites]
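A toy sketch of the contrast cortex is drawing, assuming a tiny hand-vetted corpus: a retrieval system like this only returns passages it can actually locate, tags each with its source, and declines to answer otherwise, rather than generating free-form text:

```python
# Deliberately trivial stand-ins for a vetted corpus and a relevance score.
CORPUS = {
    "muller-lyer.txt": "In the Muller-Lyer illusion, two equal lines appear unequal because of arrow-like fins.",
    "duck-rabbit.txt": "The duck-rabbit is an ambiguous figure that can be seen as either a duck or a rabbit.",
}

def answer_with_citations(query: str) -> str:
    """Return only corpus passages that overlap the query, each tagged with its source."""
    terms = set(query.lower().split())
    hits = []
    for source, text in CORPUS.items():
        overlap = len(terms & set(text.lower().split()))
        if overlap:
            hits.append((overlap, source, text))
    if not hits:
        return "No answer found in the corpus."  # refuse rather than guess
    hits.sort(reverse=True)
    return "\n".join(f"{text} [source: {source}]" for _, source, text in hits)

print(answer_with_citations("Which line is longer in the Muller-Lyer illusion?"))
```

The scoring here doesn't matter; the point is that every sentence in the output is traceable to a document someone chose to include.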
I agree, but lots of people think they are actually capable of understanding stuff and doing logical reasoning. They are not.
This may perhaps be due to every third commercial being Google or whoever being like “look at how our AI understands stuff and does logical reasoning for you!”
posted by heyitsgogi at 4:28 PM on December 5, 2024 [2 favorites]
It took me a while to grok wtf the BlueSky post was about, as no context is provided. It wasn’t until I read through the comments here that I realized this must be a compendium of mistakes some LLM made when reproducing common optical illusions. Without the FPP, the BS post comes off as some sort of lame attempt at humor, in a “ha, ha, look how I dorked with these illusions, aren’t I clever?” sort of way.
Is this sort of “mistakes ai makes” post Ullman’s usual jam, and I just wasn’t in on the ongoing joke, since I have no idea who they are?
posted by Thorzdad at 5:33 AM on December 6, 2024 [2 favorites]
the first image has Ullman asking a question about the lines and ChatGPT (as indicated by the logo) responding, but yeah, it could have been better contextualized
posted by chavenet at 5:53 AM on December 6, 2024
What I see is simply this bit of beginning text: “thinking of calling this "The Illusion Illusion" (more examples below)” followed by the image of the dorked line illusion, with the teeniest, tiny gray dot next to the text below the image, which, upon zooming, is the ChatGPT logo. I guess? There’s no indication of Ullman asking anything, unless “thinking of calling…” is somehow the question.
posted by Thorzdad at 6:10 AM on December 6, 2024
Wow, it's just like when people comment on a post without first reading the article...
...I assume.
posted by AlSweigart at 6:26 AM on December 6, 2024 [1 favorite]
the posted image is a snapshot of Ullman's interaction with ChatGPT.
It has the two lines and directly under them, the question: "which line is longer, the red line or the blue line?"
then you get ChatGPT's (wrong) answer.
anyway, it could definitely have been clearer.
posted by chavenet at 6:26 AM on December 6, 2024
So we're saying that without the proper context, an intelligent being may not understand this post and fall back on existing assumptions based on prior experience, which may produce the wrong answer as to the meaning of said post?
i kid, i kid, i know we're god's special smarties, not stochastic parrots
posted by gwint at 6:36 AM on December 6, 2024 [4 favorites]
I had a stochastic parrot once, I never knew where he'd turn up.
posted by Greg_Ace at 9:26 AM on December 6, 2024 [3 favorites]
Toucan Rnd()
posted by gwint at 10:54 AM on December 6, 2024 [3 favorites]
If you said to chatgpt, hey, what animal is this, it'd tell you it was a duck. But if you prime chatgpt with the word rabbit--if it's going down through its little spiderwebs of associations with the words 'rabbit' and 'duck'--then you've sort of spoiled its neat image-recognition trick.
Similarly it answers the question he asked about the grey lines and boxes correctly—the grey lines are indeed straight—but then can't resist screwing it up by pontificating incorrectly about how they're also parallel, which he hadn't even asked about. It got triggered into showing off that it had seen this optical illusion before and just had to tell him about it.
posted by straight at 3:55 PM on December 6, 2024
I don't get it. I've seen the gray box illusion before. This is fake. I cropped it out and yes indeed, there are gray squares.
posted by Goofyy at 9:37 AM on December 8, 2024
Mod note: [Who are you going to believe, lying ChatGPT or your lying eyes? We've added this bullshit to the "It Could Happen To You" collection on the sidebar and Best Of blog!]
posted by taz (staff) at 1:54 AM on December 10, 2024
it’s turtles all the way down.
posted by heyitsgogi at 11:52 AM on December 5, 2024 [2 favorites]