I think I'm starting to peak now, Al
September 28, 2024 12:57 PM   Subscribe

I know this upsets people and there is a “gut reaction” to push back on this view and point out that despite the performance of OpenAI’s new o1 model on logic and reasoning tests, it still makes dumb mistakes at times. That’s true, but so do most humans. Probably all humans. I also hear people say that LLMs use “pattern matching” and “memorization” to solve many problems. Again, very true, but so do humans. Does this mean we will reach Peak Human in 2024? Yes and no. from Have we reached Peak Human? by Louis Rosenberg
posted by chavenet (37 comments total) 6 users marked this as a favorite
 
Let me tell you something, I haven't even begun to peak. And when I do peak, you'll know. Because I'm gonna peak so hard that everybody in Philadelphia's gonna feel it.
posted by MengerSponge at 1:16 PM on September 28 [8 favorites]


I think I'm starting to peak

chavenet, you're only getting started 😉
posted by HearHere at 1:25 PM on September 28 [3 favorites]


Humans peaked long ago. It's all been downhill since we invented agriculture.
posted by Faint of Butt at 2:00 PM on September 28 [3 favorites]


Have we? No.

Will we? I hope so.

Parents want their children to do better than they did, to be better people than they were. Is it any surprise that we hope humanity's metaphorical children will be better people than we are? Able to do things we can't?

We saw that sentiment expressed pretty well way back in 1967 by Richard Brautigan:
I like to think (and
the sooner the better!)
of a cybernetic meadow
where mammals and computers
live together in mutually
programming harmony
like pure water
touching clear sky.

I like to think
(right now, please!)
of a cybernetic forest
filled with pines and electronics
where deer stroll peacefully
past computers
as if they were flowers
with spinning blossoms.

I like to think
(it has to be!)
of a cybernetic ecology
where we are free of our labors
and joined back to nature,
returned to our mammal
brothers and sisters,
and all watched over
by machines of loving grace.
Right now? We don't have that. Current AI is inadequate to the task.

But one day? I'd like to live long enough to see humanity pass the torch of most advanced general purpose intellect in the known universe to a successor species. I'm doubtful that I will; the requirements for even human-equivalent general purpose intellect are beyond our current tech. But one day.

And if I do live to see a time when AI can outthink me, should I stop thinking? Of course not. As Iain M. Banks once said: a fish can swim better than I can, but I still swim.

There is definitely, in the shorter term, a risk of for-profit cancer capitalists using AI to manipulate us. In theory that could be beneficial. If your search terms started showing you were suicidal, it could subtly push you towards help. Or push a person drifting towards Nazism away from it. But we all know that what will really happen is they will push you towards whatever product paid the most.

The idea of an AI muse/assistant/aide/helper/JARVIS/whatever has potential, but I think I'll wait until the open source version is available so I don't have Google/Apple/Microsoft trying to shape my desires towards whoever paid the most.
posted by sotonohito at 2:01 PM on September 28 [9 favorites]


I think IQ is a bad way to sort humans, especially given where those tests were developed and the folks who eagerly use them as 'objective' measures of group differences. But I really knew that the author and I don't see eye to eye when he used the fact that AI churns out more than 15 billion images per year as evidence that they are more 'creative' than humans. Being prolific is one thing, but creativity can't be measured in sheer quantity.

Maybe there's more there if I dug into his collective intelligence startups, but this whole thing leaves a bad taste in my mouth.
posted by crossswords at 2:26 PM on September 28 [12 favorites]


I wrote this poem about it, 25 years ago.


the digital storm

I’m standing in a bookstore
lost among the shelves.

I'm in a straw shack
Awaiting the digital storm


This is the last great age of man.
Every book a headstone
Chiselled with the epitaphs of the lucky few
Who were in the right place and time to speak
And be heard
Before the digital storm.

Before the wind blows through and scatters
the sacred texts (as
all texts are holy.
As has been said,
As it is written:
History, recipe, self-help, all of these are invocations,)
And when they no longer can incant
the virtual dead will return
to consume the living.
posted by dances_with_sneetches at 2:37 PM on September 28 [12 favorites]


I think IQ is a bad way to sort humans

Even for what it does tell you about humans, it’s not the actual thing - it’s an attempt to gauge the actual thing(s) by proxy. So, no, I don’t think one can just assume it translates as a test of AI capabilities.
posted by atoxyl at 2:43 PM on September 28 [3 favorites]


No, we haven't reached Peak Human yet, because AI has yet to demonstrate Human qualities such as malevolence, envy, and hatred.
posted by storybored at 2:45 PM on September 28 [2 favorites]


I mean, I know that Kenya starts with a K, so I feel like I've still got the edge here.
posted by kittens for breakfast at 2:47 PM on September 28 [5 favorites]


Parents want their children to do better than they did, to be better people than they were. Is it any surprise that we hope humanity's metaphorical children will be better people than we are? Able to do things we can't?

We saw that sentiment expressed pretty well way back in 1967 by Richard Brautigan:


That's a good thought, and a good poem, but if we're going to view AI as the offspring and heir of humanity (and I think we should!) then I'll counter with Philip Larkin, 1971:
They fuck you up, your mum and dad.
They may not mean to, but they do.
They fill you with the faults they had
And add some extra, just for you.

But they were fucked up in their turn
By fools in old-style hats and coats,
Who half the time were soppy-stern
And half at one another’s throats.

Man hands on misery to man.
It deepens like a coastal shelf.
Get out as early as you can,
And don’t have any kids yourself.
posted by Faint of Butt at 2:52 PM on September 28 [18 favorites]


IQ is a deeply flawed way of measuring intelligence among humans, and is utterly meaningless to apply to "AI" systems. I admit I stopped reading once the article seemed to be taking IQ test results of AI systems seriously, so my apologies if I misunderstood the thrust of the argument there.
posted by biogeo at 2:58 PM on September 28 [10 favorites]


I am a little annoyed that I spent half an hour parsing through the copious links to get to the actual paper wherein a chatroom does slightly better at the "guess the candy jar" fairground game than a survey.
posted by lucidium at 2:59 PM on September 28 [11 favorites]


Breathlessness from AI true believers has a lot in common with breathlessness from UFO true believers or bigfoot true believers. People who want to see the thing will see the thing.

Software that guesses likely-sounding responses to test questions based on known prior good responses is doing something very similar to what I did to get through high school, and I can tell you, it's unrelated to thinking or understanding.

One of the depressing things about the whole fiasco is how eager people are to stop making any distinction between what the software does, and what we do. Are we really unable to recognize thinking and understanding? 'Cause this ain't it. It's a counterfeit, a cool toy in some ways, but one that becomes more dangerous the more we forget what it is.
posted by Sing Or Swim at 3:15 PM on September 28 [19 favorites]


Breathlessness from AI true believers has a lot in common with breathlessness from UFO true believers or bigfoot true believers. People who want to see the thing will see the thing.

The big difference is that the stock market isn't going to collapse if bigfoot and aliens aren't widely believed to be totally real and about to unveil themselves.
posted by Vulgar Euphemism at 3:36 PM on September 28 [8 favorites]


Hi, I am actually trained to administer IQ tests.

a standardized Mensa IQ test. Last week, for the first time

Not a validated measure of IQ.

Lott had a custom IQ test created that does not appear anywhere online and therefore is not in the training data.

Not how designing a new IQ measure works. Also, just give it the K-BIT 2, which I doubt is online anywhere because it's not one of the ones used in the kind of testing rich parents looking to skew results are looking for.

many people point out that human intelligence is far more than just the reasoning measured by IQ tests.

None of these tests are even administering what is measured by IQ tests. For example: one of the most common actually validated IQ tests (in America, at least) is the WAIS-IV. The global IQ is a composite of the four index scales: verbal comprehension, perceptual reasoning, working memory, and processing speed. An IQ test that does not measure multiple components of intelligence with identifiable subscores for each index is not an IQ test, it's a fun quiz. I can't get the exact info because the author is (for the training data reason) not putting the full thing online anywhere, but I suspect what is actually being administered is a number of verbal and perceptual reasoning tests; you COULD test processing speed (though it wouldn't be remotely valid) but not the way this author describes setting up his investigation. Working memory by definition is just sort of out, because it's not possible to give an AI a stimulus and then take it away. Also, the way they do matrix reasoning type shit is to describe the entire image verbally--that's not how perceptual reasoning works!

My verbal reasoning score is 130. My global IQ is much lower, because IQ is a composite score. Testing AI on the tasks it's good at and ignoring the rest (when human IQ norms are based on the full breadth of these tasks) because (this) AI is incapable of doing them and then saying "it surpasses humans on IQ" is laughable.

Louis Rosenberg, PhD is a computer scientist and engineer.

Ah.
posted by brook horse at 3:40 PM on September 28 [49 favorites]


Back in the day, I suffered a brain injury and they put me through, what I was told later, a battery of tests that resulted in an IQ score. For one test, I was blindfolded, and then told that there in front of me was a board with various shaped holes and on the table in front of me were blocks of various shapes that only fit through specific holes. When told to start, I was to put the right blocks through the right holes, using only one hand, as fast as I can. I was told to start and just feeling holes and feeling blocks, I did the test. When taking off my blindfold I finally saw what was in front of me. The psych guy had this weird look on his face, and he told me my score was way off the chart, really really fast. I asked him if there was a job opportunity for having this skill. He kept the weird look. A few other tests had this more physical activity thing. How would computers do with this test?
posted by njohnson23 at 4:43 PM on September 28 [7 favorites]


(Taps the sign which contains the text of Betteridge's Law Of Headlines)
posted by grumpybear69 at 5:11 PM on September 28 [7 favorites]


Scientology auditing claims it can train you to have an IQ of 200, shame that the distribution only goes up to 165.
posted by Narrative_Historian at 5:12 PM on September 28 [4 favorites]


The big difference is that the stock market isn't going to collapse if bigfoot and aliens aren't widely believed to be totally real and about to unveil themselves.

Give it like 5 years.
posted by Reyturner at 5:20 PM on September 28 [5 favorites]


A few other tests had this more physical activity thing. How would computers do with this test?

One of the WAIS perceptual reasoning tests involves taking blocks which have different colors/patterns on different sides and arranging them to match a particular shape in the book. Manipulating objects in 3D space is an aspect of intelligence that the AIs tested here simply cannot do--they're language models. They don't have anything to do with this. Now, some AI has the ability to detect depth and objects in space to some degree (this is what is used in Augmented Reality games), but I don't think any of it is at a point where a computer could manipulate an object in real space in this way, either through robotics or by giving a human instructions (e.g. "turn that block over"). We might get there at some point, but we are far from there right now.
posted by brook horse at 5:28 PM on September 28 [8 favorites]


>The big difference is that the stock market isn't going to collapse if bigfoot and aliens aren't widely believed to be totally real

Seems to me if it's going to do that, it's going to do that anyway. The people trying to get rich off of the AI Gold Rush aren't trying to get anybody else rich off of it, and they are just as likely to cause some widespread disaster by getting what they want as by not getting it.
posted by Sing Or Swim at 5:30 PM on September 28 [4 favorites]


I always figured Brautigan was being more than a little sarcastic with that poem. The interjections after each "I like to think" kind of imply that the speaker isn't thinking at all and the utopian fantasy described is complete bullshit.

Anyway. Once we have created a machine that can calculate faster than a human, then I'll believe the human race is doomed. But that will never happen.
posted by surlyben at 5:49 PM on September 28 [2 favorites]


Alright, the second sentence is gibberish: an AI beating most humans at some tricky tasks does not mean the AI "outthinks" those humans. An AI "outthinking" an average human on some tricky tasks does not mean humans "peaked as an intellectual force", because exceptional humans have far more impact; heck, even the lucky ones do. Does the article get better later?

As I've said before, AIs' power consumption suggests they're doing something seriously wrong, if intelligence per se were the goal; but I think intelligence differs from their real goals. See the Ed Zitron "AI as stock fraud" thread.

Also AI can't cross this line and we don't know why.
posted by jeffburdges at 5:53 PM on September 28 [7 favorites]


Oh, I was thinking "peak" like an acid trip, and I confess I thought briefly that it would be amusing to ask both ChatGPT and whatever the "AI" image software is called to depict my acid trip, and then I remembered that these queries would put a small town's daily CO2 output into the atmosphere.
posted by outgrown_hobnail at 7:18 PM on September 28 [3 favorites]


AI is no more our child than the guillotine or the A-bomb. Personifying (and making sympathetic) the tools that are and will be used to deceive us, exploit us, replace us and make war on us is not analogous to familial love; it is a war against humans and the natural world. We have enough psychopathic, amoral quasi-extended organisms making war on humans and the natural world as it is (see corporations, organized religions, governments). We are already ruled by a-holes acting like paperclip maximizers. Inventing a silicon golem to outcompete us is just malice.

Now, how is that project going? The software fails to perform but you still have to use it and we already started the layoffs.

All your IP was stolen and you have to rent the product we made from it.

All your privacy was stolen and we are getting better at manipulating you for profit and for politics.

I actually agree, this probably is peak human, not because AI has gotten so good, but because our death drive is so strong.

Line must go up.
posted by No Climate - No Food, No Food - No Future. at 7:32 PM on September 28 [13 favorites]


Also, the way they do matrix reasoning type shit is to describe the entire image verbally--that's not how perceptual reasoning works!

The state of the art models are multimodal now so one could actually give them matrix questions straight up? I’m not sure if they did for the particular experiments we’re talking here, though.
posted by atoxyl at 8:19 PM on September 28 [1 favorite]


There’s something called the ARC Challenge that’s a set of grid-based reasoning puzzles designed specifically to test AI systems against supposed human norms. I don’t know how rigorous that approach to comparison is but it’s not the silliest I’ve seen. So far I don’t think any model does very well on its own (they can be made to do better with some brute force search in the loop).
posted by atoxyl at 8:24 PM on September 28 [1 favorite]


most advanced general purpose intellect in the known universe to a successor species

I don’t know if they would or not, but would that successor species not having any subjective experience or consciousness change how pleased you are about it?
posted by Jon Mitchell at 8:27 PM on September 28 [4 favorites]


+++ for the title.
posted by symbioid at 8:29 PM on September 28 [2 favorites]


Testing AI on the tasks it's good at and ignoring the rest (when human IQ norms are based on the full breadth of these tasks) because (this) AI is incapable of doing them and then saying "it surpasses humans on IQ" is laughable.

Also the extent of the gap between what it’s good at and what it’s not makes the value of a composite score seem pretty dubious even if you can get one. I think it’s pretty safe to say it does not, fundamentally, work the same way we do.
posted by atoxyl at 8:52 PM on September 28 [5 favorites]


"When told to start, I was to put the right blocks through the right holes, using only one hand, as fast as I can. I was told to start and just feeling holes and feeling blocks, I did the test. When taking off my blindfold I finally saw what was in front of me. The psych guy had this weird look on his face, and he told me my score was way off the chart, really really fast."

Did they all go in the square hole?
posted by Jacqueline at 1:15 AM on September 29 [8 favorites]


Saying it surpasses human IQ *is* laughable. LLMs and the current generation of related systems - o1 included - do not build mental models of complex arbitrary systems on the fly. They do not seed these mental models with conditions based on hypotheticals or with set victory conditions, they do not filter these mental models only on criteria related to the challenge at hand.

This is going to require a complex dance of large language models and reinforcement neural networks and probably a third thing to recognize the need for the above (which we used to label “metacognition” when humans applied it).

This collection of systems will need to be continuously trained and runtime adaptive, which is fundamentally not what current LLMs do. It may need to be embodied. To hold an apple in its manipulator claw and slowly rotate it in front of its camera.

None of this changes the fact that o1 is a major but highly inconsistent step forward. It may, with several revisions and a fuller implementation of Q*, achieve one of the major goals of current corporate “AI” which is a minimally passable digital assistant to smooth and reduce your time spent interfacing with human bureaucracies.

Frankly I have no confidence of that much. Apple’s solution will have the massive benefit of being located on-device with all your personal data. It will have context. I would be freaking the fuck out if their approach to privacy was anything less than groundbreaking.

Put future refinements of o1 in the context of Apple’s on-device inference / private temporary cloud compute, and you might have something genuinely interesting. But it still wouldn’t reason in the way we do, and OpenAI can’t stop chasing the next trillion-scale killer app long enough to cooperate with Apple beyond a default failover service endpoint, regardless.
posted by Ryvar at 1:35 AM on September 29 [5 favorites]


Agreed, Reyturner. We're definitely headed deeper into fantasy land, so after the AI x-risker pants fall off, aliens could absolutely become the next big fantasy, assuming they design some appropriate pyramid scheme or whatever.
posted by jeffburdges at 5:29 AM on September 29 [2 favorites]


The key to understanding the article's arguments is the fact that it's an ad for the author's company.
posted by signal at 8:01 AM on September 29 [2 favorites]


I don’t know if they would or not, but would that successor species not having any subjective experience or consciousness change how pleased you are about it?

I'm not prone to jealousy. I'd celebrate the knowledge that my successor would be free of the tortures that plague me.
posted by Faint of Butt at 11:37 AM on September 29 [2 favorites]


The state of the art models are multimodal now so one could actually give them matrix questions straight up? I’m not sure if they did for the particular experiments we’re talking here, though.

On the site for the project, if you hover over something like “how was this prompt phrased,” they show you. For the matrix questions they described each individual square in the image and then asked a question about it, so not giving it the actual image.
posted by brook horse at 11:55 AM on September 29 [2 favorites]


shame that the distribution only goes up to 165.

Now, now, everyone knows that 1st edition D&D allows Intelligence scores up to 18, and IQ is simply Int x 10.
posted by Flight Hardware, do not touch at 10:49 AM on September 30

