"Well, you seem like a person, but you're just a voice in a computer"
May 13, 2024 12:14 PM   Subscribe

OpenAI unveils GPT-4o, a new flagship "omnimodel" capable of processing text, audio, and video. While it delivers big improvements in speed, cost, and reasoning ability, perhaps most impressive is its new voice mode -- while the old version was a clunky speech --> text --> speech approach with tons of latency, the new model takes in audio directly and responds in kind, enabling real-time conversations with an eerily realistic voice, one that can recognize multiple speakers and even respond with sarcasm, laughter, and other emotional content of speech. Rumor has it Apple has neared a deal with the company to revamp an aging Siri, while the advance has clear implications for customer service, translation, education, and even virtual companions (or perhaps "lovers", as the allusions to Spike Jonze's Her, the Samantha-esque demo voice, and opening the door to mature content imply). Meanwhile, the offloading of most premium ChatGPT features to the free tier suggests something bigger coming down the pike.
posted by Rhaomi (150 comments total) 34 users marked this as a favorite
 
.
posted by lalochezia at 12:16 PM on May 13 [21 favorites]


Oh good, fewer jobs for humans, more jobs for corporate-sponsored chatbots. I note there's no commentary on the implications for misinformation and the spreading of hate material. And I'm sure all the robo-scammers out there are very excited for a new way to call up elderly and vulnerable people and trick them into handing over their money and personal details. Now the scammers can even sound like your poor daughter in trouble who just needs $800 to get home!
posted by fight or flight at 12:19 PM on May 13 [29 favorites]


Now the scammers can even sound like your poor daughter in trouble who just needs $800 to get home!

Scammers already do this with specialized tools. This and its ilk just lower the bar.
posted by lalochezia at 12:20 PM on May 13


Thank you for this detailed post, Rhaomi. I'm traveling today and can't do the kind of research you did!
posted by doctornemo at 12:22 PM on May 13 [3 favorites]


Apart from all the other issues: the way they kept cutting "her" off in the "recognize multiple speakers" video, and making "her" perform for them, didn't feel good to listen to.
posted by trig at 12:33 PM on May 13 [10 favorites]


still fails my test question on Germany and WW2, sigh
posted by torokunai at 12:47 PM on May 13 [2 favorites]


If it can't tell me why the porridge bird lays its eggs in the air, I don't want nothin' to do with it.
posted by grubi at 12:50 PM on May 13 [10 favorites]


Oh good, fewer jobs for humans, more jobs for corporate-sponsored chatbots. I note there's no commentary on the implications for misinformation and the spreading of hate material.

Yes! Scribes are being put out of business! Any person will be able to publish whatever they want! Stop Gutenberg's madness before civilization ends!


Sorry, wrong moral panic.
posted by Tell Me No Lies at 12:54 PM on May 13 [14 favorites]


Maybe this is something for a different thread, but I'm interested in how apps like Duolingo have incorporated AI in ways that are (according to native speakers of the languages being taught) not quite accurate or tone-correct. This would seem to make Duolingo unusable, right?

It'll be interesting to see whether most people continue using a flawed language learning tool, learning/speaking/writing a language incorrectly (and then the language evolves as all languages do), or whether more traditional language learning companies like Rosetta see more growth from this. Or, now that the gamified language learning tool is no longer any good, people (Americans in particular) stop learning a bit of Portuguese (etc.) on the side and retreat into more comfortable English-only content, thus shrinking our worlds more and more.
posted by knotty knots at 12:55 PM on May 13 [5 favorites]


Can we please move past this energy wasting techbro hype and nonsense.
posted by GallonOfAlan at 12:55 PM on May 13 [31 favorites]


Sorry, wrong moral panic

Of course automation and globalization did in fact totally shatter economies and towns as recently as the eighties and nineties, and many people whose careers had been built in those towns simply never got a good job again, especially middle aged people - it's grim up North, etc. Nobody wants a "retrained" fifty year old in a competitive environment.
posted by Frowner at 12:57 PM on May 13 [74 favorites]


the offloading of most premium ChatGPT features to the free tier

Feeeed me humans

"GPT four-oh" is going to be less clever than hoped if everyone starts pronouncing it with tired resignation.

"GPT 4.....Oh."
posted by snuffleupagus at 12:57 PM on May 13 [4 favorites]


This sounds amazi-- wait, nevermind.

For a moment I thought GPT-4o would address the ways AI produces many harms that do not have adequate anthropomorphic correlates, like its various complex modes of exacerbating economic inequality, the use of automated decision-making within systems of oppression (often understood as 'bias'), carbon and other environmental impacts of training and deploying AI, technological unemployment and harmful transformations of work, erosion of privacy and personal autonomy through increased surveillance and data exploitation, deskilling and loss of institutional knowledge due to AI outsourcing, challenges around opacity, interpretability, and accountability, further erosion of the public sphere through AI-generated disinformation, and the implications of autonomous AI systems in warfare, healthcare, transport, and cybersecurity, among others.

My bad, must have misread. So nothing I would consider an improvement, I guess. Oh shucks.

(With apologies to Jo Walton, from whom I have copy-pasted this list almost verbatim. Are you interested in NOT giving creatives and authors proper credit? GPT-4o might be the thing for you!)
posted by bigendian at 12:57 PM on May 13 [24 favorites]


still fails my test question on Germany and WW2, sigh
posted by torokunai


No current LLM is ever going to pass this, and you should probably let it go. You’re chatting with distilled word slurry that maintains the shape of answers, but unless a particular fact is repeated millions of times with 99% accuracy (eg what is the boiling point of water in Fahrenheit) in the training set, *shrug*, concrete facts just aren’t what LLMs do. You’re looking for an expert system that an LLM draws from, and recognizing when it needs to do so is a Hard Problem. Q*, which in theory might see a partial release with the model that finished training a month ago, may make some initial headway there. Maybe.
posted by Ryvar at 12:59 PM on May 13 [21 favorites]


Apart from all the other issues: the way they kept cutting "her" off in the "recognize multiple speakers" video, and making "her" perform for them, didn't feel good to listen to.

100% this, especially when coupled with "her" relentless, bubbly positivity. It had an extremely creepy vibe made somehow worse by the clean-cut, generic young techlings doing the presentation.
posted by The Bellman at 1:03 PM on May 13 [12 favorites]


I wonder what ChatGPT40's syllables per gallon rate is.
posted by grumpybear69 at 1:08 PM on May 13 [11 favorites]


20 s/gal same as in town.
posted by emelenjr at 1:12 PM on May 13 [25 favorites]


Apart from all the other issues: the way they kept cutting "her" off in the "recognize multiple speakers" video, and making "her" perform for them, didn't feel good to listen to.

It's really gross. Multiple studies and reports over the years have found that voice assistants reinforce harmful gender biases when they're given a female-presenting voice, enabling sexist abuse in users by responding to it with passiveness or even flirtation. A friend of mine did a PhD in interactive technologies (I forget what it was exactly) and as part of her studies she found a worrying number of people will casually throw out gendered insults and abuse at their Alexa/Siri.

Not that the tech bros in charge care about any of this. They're too busy making sure they can make deepfake AI sex workers who can't say "no".
posted by fight or flight at 1:14 PM on May 13 [33 favorites]


Siri is terrible and anything that replaces it will be an improvement.
posted by They sucked his brains out! at 1:15 PM on May 13 [1 favorite]


natural gas, kerosene, diesel, gasoline, flex fuel, fryer oil, batteries and I think, water.

I think that covers the checker cab.
posted by clavdivs at 1:15 PM on May 13


Sorry, wrong moral panic.

Lots of stupid moral police going on right now; concern at capitalism using this shit to mulch everything down to nothing sure isn’t one of them.
posted by Artw at 1:18 PM on May 13 [49 favorites]


If anybody else has gotten a "sign up for voice-only authentication!" email from their financial institution, DO NOT OPT IN to it. Do not bet your savings that AI's vocal deepfakes can't fool the bank's authenticators.

This has been your AI-thread PSA.
posted by humbug at 1:21 PM on May 13 [54 favorites]


Apart from all the other issues: the way they kept cutting "her" off in the "recognize multiple speakers" video, and making "her" perform for them, didn't feel good to listen to.

Holy shit this was bad. Who the hell greenlit this? It’s OpenAI, though, so… fuck ’em.

apologies to Jo Walton, from whom I have copy-pasted this list almost verbatim

I actually really like that list. As always: when and where possible, especially if you have any influence in organizational tech policy, please consider using open source / open weights AI models like Mixtral over ChatGPT. Same blanket copyright theft / unsanctioned use of training data as the major players? Oh yeah, totally - no getting around that currently. Keeps the means of production in the hands of workers and keeps your usage data out of the megacorps’ hands? Also yes.

Again, to be clear: I’m not saying it’s good, I’m saying it’s better, and while starving OpenAI may be impossible every penny that goes to small teams with open methods instead of them is better spent.
posted by Ryvar at 1:24 PM on May 13 [9 favorites]


No current LLM is ever going to pass this, and you should probably let it go.

I dunno, I asked it "What country was the last one to declare war on Germany during World War II?" using the API with the default System message of "You are a helpful assistant." It responded, essentially, "Argentina was the last one, declaring war on Germany on March 27, 1945", which is what Wikipedia lists as the correct answer, as far as I can tell.

(Because this was asked via the API it won't be used to train future models, unlike ChatGPT interactions.)
posted by jedicus at 1:26 PM on May 13 [2 favorites]


Not that the tech bros in charge care about any of this.
The assistant’s voice response bore a striking resemblance to the character Scarlett Johansson plays in the movie Her, where a man forms a relationship with a sophisticated AI assistant. After the event, OpenAI CEO Sam Altman cryptically posted just one word on X: "her." He has also expressed that Her is his favorite movie.
(The Verge: ChatGPT will be able to talk to you like Scarlett Johansson in Her)

I honestly wonder (as I so often do with these guys) what he thinks that movie was about. How obvious do we have to be? What other faves does he have? Ex Machina? Terminator?

Is the next headline: "Altman Announces Skynet?"
posted by The Bellman at 1:27 PM on May 13 [17 favorites]


Sorry, wrong moral panic

Using the example, yes the printing press put many scribes out of business. A printing press could do the job of, say, 20 scribes per day? I’m sure with enough information one could figure this out.

How many writers, artists, assistants, telecom workers, etc. can AI do the work of? If you give me the number of workers per industry, I can make the calculation in about 1/2 a second.
posted by blairsyprofane at 1:29 PM on May 13 [1 favorite]


My working assumption is that most tech bosses only watch the first half of the films they reference. It makes most of their decisions make a lot more sense.
posted by nangua at 1:29 PM on May 13 [15 favorites]


I'm watching the live demo video where they demonstrate "talk mode".

I wonder how much of the "decreased latency" claim is being helped by inserting delays padded with human-esque responses to buy time for the back-end to finish processing ("Ahhhhh..." "I think I can see now...." "Let me see....").
posted by neuracnu at 1:35 PM on May 13 [1 favorite]


The printing press also contributed significantly to a couple of hundred years of vicious civil and religious wars all over Europe, FWIW.

(Did it also have anything to do with the simultaneous degradation of living conditions for the average European over the next couple of hundred years to possibly their lowest level ever? Probably not, except perhaps indirectly via the aforementioned wars.)
posted by clawsoon at 1:40 PM on May 13 [9 favorites]


It is good to have machines do work that people could do, because it frees up people to do other things, both other work and leisure. (It's better to minimize rather than maximize work. See the "give them spoons" joke.)

In the short term, the transition can be incredibly destructive to the scribe or buggy-whip makers or lawyers.

In the long term, this can shift income and power from labor to capital.

Government will have little control over whether AI is developed. It may be able to control, to some extent, how it is regulated, and it can definitely affect how future income is distributed between different groups.
posted by Mr.Know-it-some at 1:42 PM on May 13 [7 favorites]


Apparently a big bottleneck for programs that call themselves AI/LLM on iOS devices is a low level of RAM. Apple have been parsimonious with RAM in their devices forever, but the locked down nature of iPhones means that you haven’t noticed you’re running on a minuscule amount of RAM. I heard maybe the iPhone 15 pro is good to go for an LLM system, other phones… not so much.
posted by The River Ivel at 1:42 PM on May 13


I wonder how much of the "decreased latency" claim is being helped by inserting delays padded with human-esque responses to buy time for the back-end to finish processing ("Ahhhhh..." "I think I can see now...." "Let me see....").

I too am guilty of human-esque responses, if we are going by that measure. I celebrate the human-esque padding, it's all that gets me through some days
posted by elkevelvet at 1:45 PM on May 13 [9 favorites]


>Government will have little control over whether AI is developed.

I haven't kept up on my cyberpunk reading since the early 90s, but sounds like a good plotpoint, capitalists trying to sneak in bootleg generative AI from Russia & Communist China to do jobs here in the US.
posted by torokunai at 1:48 PM on May 13 [3 favorites]


Sorry, wrong moral panic.

Citing Gutenberg as a purely benign form of technological change is also a bit questionable. Did printing advance science? Sure. Did it make modernity possible? Yeah, probably. Did it also spread misinformation, fuel pamphlet wars, and play a contributing role in the violence which tore Europe apart over the next couple centuries? Yes to this too. People get very excited by the idea of "creative destruction", but again, I think that's only people who pay attention to the first half of the text.
posted by nangua at 1:51 PM on May 13 [24 favorites]


Ah, beaten to it by clawsoon...
posted by nangua at 1:53 PM on May 13 [1 favorite]


Mr.Know-it-some: It is good to have machines do work that people could do, because it frees up people to do other things, both other work and leisure. (It's better to minimize rather than maximize work. See the "give them spoons" joke.)

In theory that's true, although in practice over the past couple of hundred years there's always a group of people whose pace of work gets set by the most inefficient part of the production chain of the most productive machines.

A cotton gin was very productive, so slavery was needed to feed it. Canning, freezing and packing operations are very productive, so migrant farm labour is needed to feed it. Chip fabs are very productive, so vast armies of assembly plant workers are needed to digest what the fabs vomit out.

Instead of the machine feeding and freeing the people, it's as if people are fed to the machine.

They keep promising that one of these centuries the pattern will be reversed, but so far there are always those two (or a hundred) spots in the production chain that have to be filled with a mass of horribly paid, overworked, maltreated workers.
posted by clawsoon at 1:54 PM on May 13 [44 favorites]




elkevelvet: I too am guilty of human-esque responses, if we are going by that measure. I celebrate the human-esque padding, it's all that gets me through some days

From "I'm Glad My Mom Died": "His gestures are as exact as his phrasing—no uhhs or umms, in speech or in mannerisms. This is an umless man. I respect him. It takes a lot to be an umless man."
posted by clawsoon at 1:56 PM on May 13 [1 favorite]


You can’t spell “human” without um!

What other faves does he have? Ex Machina? Terminator?

Valve Software: “In our game Portal we invented GLaDOS as a cautionary tale”…
posted by mbrubeck at 2:07 PM on May 13 [5 favorites]


Citing Gutenberg as purely benign form of technological change is also a bit questionable. Did printing advance science? Sure. Did it make modernity possible? Yeah, probably. Did it also spread misinformation, fuel pamphlets wars, and play a contributing role in the violence which tore Europe apart over the next couple centuries? Yes to this too.

Yes, that was the point of referencing Gutenberg.

It would be possible to go on and on about all of the horrible things that the printing press made possible. You could live your life in fear of it, reacting to each improvement with vitriolic screeds (probably printed) about how it would oppress the weak, ruin morals, and result in the death of millions.

But, if you look at all of the good it has done it becomes a much much more complicated issue. Taking a staunch position for the Printing Press or against the Printing Press and shouting out loudly whenever it is mentioned is self-indulgent silliness.
posted by Tell Me No Lies at 2:21 PM on May 13 [7 favorites]


Yes! Scribes are being put out of business!

You joke, but I work in health care. Where I work we are testing "AI Supported" documentation. That is, the provider places their phone between them and the patient and presses record. When they are done with the visit, they press stop. In less than two minutes, the documentation is in the patient's chart. Who did this before? People whose job title is literally "Medical Scribe".

So, yeah. People are going to lose jobs over this.
posted by a non mouse, a cow herd at 2:25 PM on May 13 [23 favorites]


anything that replaces it will be an improvement.

When will people learn the rules of the Monkey's Paw Timeline we almost certainly inhabit
posted by Jon Mitchell at 2:36 PM on May 13 [15 favorites]


Oooh I love this Gutenberg de-rail! It sure beats another hand-wringing "oh no AI is going to ruin everything" discussion going over the same tired old ground. (And too bad; GPT-4o looks like a nice enhancement and this post is comprehensive. If you read the links you might learn something!)

To that end, David d'Angers' monument to Gutenberg at Place Gutenberg in Strasbourg. It includes a plaque of jaw-dropping iconography (CW: racism, colonialism) depicting elegantly dressed Europeans standing over a throng of naked, fawningly grateful Africans. A printing press is in the middle, the source of Europe's gifts.

The Asia and America plaques are also like this; the general theme is that the printing press allowed Europe to bring "civilization" to the rest of the world. The sheer irony of claiming that cultural primacy over Asia, the actual place of invention of movable type, is not reflected in the 19th c. European artwork.

More in this journal article, also on sci-hub. Much like AI, the printing press is a technological invention delivering both blessings and curses. It can print declarations of freedom for all people. It also prints contracts of enslavement and exploitation.

May there be mercy on man and machine for their sins.
posted by Nelson at 2:37 PM on May 13 [9 favorites]


still fails my test question on Germany and WW2, sigh

Fuckin’ wild that we treat this like the Omni future.
posted by Going To Maine at 2:41 PM on May 13



You joke, but I work in health care. Where I work we are testing "AI Supported" documentation. That is, the provider places their phone between then and the patient and press record. When they are done with the visit, they press stop. In less than two minutes, the documentation is in the patient's chart. Who did this before? People who's job title is literally "Medical Scribe".


Interesting example given that a local hospital was the victim of a cyber attack and now they can't get to their digital files and can only work with patients whose files they have secured locally.
posted by tofu_crouton at 2:53 PM on May 13 [9 favorites]


And there is lots of other scribing that is being lost. I worked for about a decade on a medical device, a vital signs monitor. The device had no readable "face". I pumped all the signs to my PC program, and displayed it. Then the nurse would click a button, get authorized, and submit it directly into the hospital's medical record system. We kept those nurses from having to transcribe and often make mistakes. But did we help them? Now that this part of their job is pretty seamless, maybe the hospital starts reducing the nursing staff? Had never thought that our helpful device might have been harmful. And boy howdy do we need more nurses. Waking up at midnight with the damned antibiotic infuser blocked, and alarming, and hitting the "Nurse" button and explaining the situation. And then it's 45 minutes before they come in and silence it and get it unblocked and running again. And then it goes empty and starts alarming again. Rinse and repeat

Spending way too much time with doctors in the last year and a half, at least here in Seattle, EPIC has won. Everywhere uses it. I think we only interfaced with EPIC once, and it was pretty gruesome to deal with. They did a good job of walling off their garden.

If all the AI could give us the leisure that was promised about all these techs improving our lives, instead of just being the wealth extraction we all know it will turn out to be...
posted by Windopaene at 2:55 PM on May 13 [11 favorites]


Valve Software: “In our game Portal we invented GLaDOS as a cautionary tale”…

Altman: "I am proud to introduce OpenAI's new CEO, Cave Johnson . . . "
posted by The Bellman at 3:09 PM on May 13 [7 favorites]


ChatGPT will be able to talk to you like Scarlett Johansson in Her

Meh. Make it talk like Ambassador Kosh.
posted by snuffleupagus at 3:14 PM on May 13 [6 favorites]


I work in health care. Where I work we are testing "AI Supported" documentation.

*inadvertently creates redheaded whiteguyblink.gif* …I’m sorry, but in what fucking universe is that HIPAA compliant? Because I would bet $100 they aren’t running a speech-to-text model inhouse on local hardware.

Apparently a big bottleneck for programs that call themselves AI/LLM on iOS devices is a low level of RAM.

Correct. When you see AI model names like llama2-70b or Mixtral 8x7b or llama3-8b, that’s “b” as in “billions” of parameters. Gross simplification: parameters are the weights (multiply to adjust strength) and biases (add/subtract to offset weighted inputs) of the individual neural connections between layers.

Single-precision floating point values like 0.1387123 are 32-bit, or four bytes (=roughly 7 useful decimal digits in base 10), so you generally expect an 8b parameter model to require 32GB of VRAM on the GPU. Models can be quantized, which means reduced from 32-bit single precision to 16-bit half-floats or even 8-bits to fit on consumer GPUs. So the recent llama3-8b model at half precision takes up more or less 16GB of VRAM, which after factoring in some overhead plus the memory being used by your actual OS desktop fits comfortably within the 24GB of an RTX 4090, the current top of the line gaming GPU.

There’s a deep irony here with Apple because the Mac Pro M2 Ultras and Macbook Pro M3 Max CPUs also contain integrated GPUs (roughly equiv to a 4070 Ti), but with the system RAM as a shared VRAM memory pool (~70% of which is accessible to the iGPU). For very large models like llama2-70b in 8-bit quantization a 128GB Macbook Pro will *smoke* a godtier gaming rig’s RTX 4090, which has to drop back to CPU and system RAM.

Basically Apple desktops occupy a weird best/worst hardware spot for certain specific very large models. With iOS I believe they’ll be pushing heavily on mixture of experts models with many small subnetworks + aggressive quantization, but even then I wouldn’t expect magic. It’ll be rough going on mobile for a while yet.
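The back-of-the-envelope arithmetic above is easy to sketch. The model sizes and bit widths are from the comment; the helper function and its name are just illustrative, and it counts weights only (no KV cache or runtime overhead):

```python
def weights_vram_gb(params_billions: float, bits_per_param: int) -> float:
    """GB needed just to hold the weights at a given precision.

    Ignores KV cache, activations, and OS/desktop overhead, which is why
    a 16GB model wants a 24GB card in practice.
    """
    bytes_per_param = bits_per_param / 8
    return params_billions * 1e9 * bytes_per_param / 1e9


# llama3-8b: full precision vs. half precision
print(weights_vram_gb(8, 32))  # 32.0 GB -- over the 24GB of an RTX 4090
print(weights_vram_gb(8, 16))  # 16.0 GB -- fits with room for overhead

# llama2-70b at 8-bit quantization
print(weights_vram_gb(70, 8))  # 70.0 GB -- hence the 128GB unified-memory Macs
```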
posted by Ryvar at 3:16 PM on May 13 [7 favorites]


Lots of stupid moral police going on right now; concern at capitalism using this shit to mulch everything down to nothing sure isn’t one of them.

I think it’s also really important to consider the difference between jobs being lost because something better comes along versus because something tolerable is much cheaper. Chatbots are a great example: the experience is universally worse, but because they allow companies not to hire so many people, you are increasingly unlikely to be able to use anything else. Printing presses were bad for high-art manuscripts but great from the perspective of everyone who wanted to read or sell more books. Some AI uses are really useful – transcription and translation come to mind - but a lot of the other stuff is simply a race to the bottom once the quality level is better than unacceptable.

All of this calls for some social policy to do things like taxing the owners of the machines at rates at least as high as their former employees paid but that seems politically unlikely given the current environment. I’m slightly more optimistic that there’d be political support for requiring companies to identify AI interfaces and require some kind of alternative since few voters like infuriating customer service experiences but I’m expecting some big PR campaigns arguing that providing support like they did in 2024 would bankrupt utilities, airlines, etc. and we just have to accept worse service for the sake of our 401ks.
posted by adamsc at 3:18 PM on May 13 [14 favorites]


> It also prints contracts of enslavement and exploitation.

Hearst made lampshades out of these (no joke . . . I spotted that on my visit to Hearst Castle)
posted by torokunai at 3:19 PM on May 13 [1 favorite]


in what fucking universe is that HIPAA compliant

this one.

What is the Difference Between a Data Use Agreement and a Business Associate Agreement?
(one compliance company's explainer)


Are your data processing agreements HIPAA-compliant?

(another's)
posted by snuffleupagus at 3:29 PM on May 13 [10 favorites]


Now I am Informed But Unsatisfied.
posted by Ryvar at 3:35 PM on May 13 [13 favorites]


Rather than the Gutenberg Press, a more relevant historical analogy might be horses and motor vehicles. Once there was a replacement for horses that was faster/stronger/cheaper/less finicky, and the horse's labor was no longer needed, the excess horses were turned into glue.

Good news is that after a few decades the horse population made a partial recovery as show animals for the idle rich:

https://www.researchgate.net/figure/Evolution-of-the-horse-population-in-France-from-1800-to-2010-translated-from-French_fig1_338480301

Hopefully us proles will be similarly successful in the long run.
posted by Balna Watya at 3:47 PM on May 13 [3 favorites]


So, yeah. People are going to lose jobs over this.

And imagine the forests we will lose over this. And the land lost to sea rise? Maybe there should be an energy tax on the transactions. Like, if each AI interaction cost the company a couple of bucks for the carbon emissions, would they be so hell bent on deploying it?
posted by GenjiandProust at 3:50 PM on May 13 [5 favorites]


roughly equiv to a 4070 Ti

I’m pretty sure even the M3 Max GPU benchmarks closer to 3080 level? Also while they have great memory bandwidth by system RAM standards, it’s not that high by GPU standards? So there’s a tradeoff of some performance versus the possibilities enabled by the large amount of memory available and the lack of physical separation between VRAM and system RAM.
posted by atoxyl at 3:51 PM on May 13 [1 favorite]


Metafilter: Informed but Unsatisfied
posted by sixswitch at 3:56 PM on May 13 [20 favorites]


My reaction to this news: scott_moir_booooo.gif
posted by ob1quixote at 4:11 PM on May 13 [2 favorites]


I like to think (and
the sooner the better!)
of a cybernetic meadow
where mammals and computers
live together in mutually
programming harmony
like pure water
touching clear sky.

I like to think
(right now, please!)
of a cybernetic forest
filled with pines and electronics
where deer stroll peacefully
past computers
as if they were flowers
with spinning blossoms.

I like to think
(it has to be!)
of a cybernetic ecology
where we are free of our labors
and joined back to nature,
returned to our mammal
brothers and sisters,
and all watched over
by machines of loving grace.

--R. Brautigan
posted by chavenet at 4:17 PM on May 13 [9 favorites]


IMO the scribes vs Gutenberg : humans vs AI is a false analogy

scribes can check the content they are creating and a human wrote whatever got printed

now Chat GPT pulls bullshit out of thin air, wearing a bikini and waggling its ass, but it's still bullshit and hallucinations

this might work for a pretty picture or two but unless those AI chat bots clean up their act, the corporations aren't going to continue buying them

or as someone said, I wonder why the user numbers are still stuck at 100K
posted by infini at 4:20 PM on May 13 [8 favorites]


Like, if each AI interaction cost the company a couple of bucks for the carbon emissions, would they be so hell bent on deploying it?

GPT-4o API usage is priced at $15/million output tokens, half of the figure for the previous GPT-4 version. That only provides a sliver of insight into what it costs them to run it because I suspect they are still okay with losing a fair amount of money, but

a.) it does suggest that they are engineering for efficiency at least, it’s not like Bitcoin where it is expressly designed to waste energy

b.) I think a literal couple of bucks per request would imply a tax on electricity that dwarfs the current market price, which would have bigger-than-AI implications to say the least.
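At the quoted rate, a literal couple of bucks per interaction would be off by orders of magnitude. A quick sketch (the $15/million output-token price is from the comment above; the 500-token response length is an illustrative assumption):

```python
PRICE_PER_MILLION_OUTPUT_TOKENS = 15.00  # GPT-4o output pricing quoted above


def response_cost_usd(output_tokens: int) -> float:
    """API cost of a single response, counting output tokens only."""
    return output_tokens * PRICE_PER_MILLION_OUTPUT_TOKENS / 1_000_000


# A longish ~500-token reply costs well under a cent, not "a couple of bucks"
print(response_cost_usd(500))
```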
posted by atoxyl at 4:27 PM on May 13 [1 favorite]


Soooo... I have been solidly in the "this is crap" camp, in between two super excited coworkers, and... the discourse around LLMs and GANs is so polarized that I'm not able to find discussion that explores the bits of it that might be interesting or useful.

I thus far haven't been too worried about LLMs impacting my livelihood (I work in software) just because programmers who use CoPilot seem to turn out a lot of crap, and as much as "ship it fast" is totally a thing, that crap is gonna catch up with development processes that use it, and, as some wag observed, "all we have to do is to get our managers to concisely and unambiguously describe the requirements to the LLM".

I'm seeing a bunch of people use LLMs instead of search results, and I suspect that'll continue until a few people get even more screwed by confidently asserted bullshit. I've seen a gazillion demos of "virtual assistants", usually in conjunction with some overpriced piece of crappy hardware that gets a bunch of views, but as soon as it gets in people's hands it's plain that the assistant feature sucks, and the crappy hardware was misdirection to get people to look past that the hard problem hasn't been solved.

Thus far the only real use I've found for these things seems to be as a teddy bear, as in "before you bother me for help, you have to explain your problem to the teddy bear," which solves 99% of interruptions.

What I'd love to find is discourse with people who aren't philosophically opposed to the concepts of automation (yes, I know, there are political problems to solve, there are always political problems) but who don't credulously look at these obviously staged up demos and say "OMG this is amaaaazing!!1!".

That those folks seem to be limited to... I dunno, maybe Simon Willison? ... makes me think that there's more bullshit here than value. But I keep trying to find it, because if it really is as shitty as my experience of it, there are a whole lot of people out there way more gullible than I'd hoped the masses of humanity are. Even with the available evidence.
posted by straw at 4:32 PM on May 13 [16 favorites]


straw, keep an eye on CK's blog
posted by infini at 4:44 PM on May 13 [2 favorites]


straw: because if it really is as shitty as my experience of it, there are a whole lot of people out there way more gullible than I'd hoped the masses of humanity are.

I've still got my money on AI founding a religion.
posted by clawsoon at 4:52 PM on May 13 [8 favorites]


I read Power and Progress a few months ago, and I highly recommend it to anyone who's interested in making sure that this technology is deployed for the benefit of all. Bemoaning it won't stop it; go read Homo Deus if you need another history lesson.

I'd also rec this interview with Salim Ismail: now is the time to build alternatives that can keep up with this tech, because our antiquated governance systems aren't going to cut it.

As someone who is actively building AI tools (right now I'm working on deploying LLMs to labor unions for help with the grievance process), I'm generally excited about this and am already planning how my next sprint is going to incorporate these new capabilities. Case in point: the Dept of Veterans Affairs is actually wrapping up their first Challenge.org tech sprint on how to 'reduce provider burnout' using AI. I'm giving labor these tools and levelling the playing field. That's what we need.

My ex-boss sent me some post about how much energy LLMs use; my response is the same as with bitcoin: every technological advancement in human history generally harnesses exponentially more power, and I've stopped thinking this is a bad thing in general, since it eventually pushes us toward more renewable energy.

We are seriously on the cusp of the most significant technological revolution in human history. I am optimistic (I'm using Iain M. Banks's Culture series as a desired outcome here), and yes, I recognize that there's going to be disruption that can make Globalization look passé, so my MTB right now is making sure these dark ages pass as quickly as possible.

Call it my Foundation thesis.
posted by daHIFI at 5:07 PM on May 13 [12 favorites]


Thus far the only real use I've found for these things seems to be as a teddy bear, as in "before you bother me for help, you have to explain your problem to the teddy bear," which solves 99% of interruptions.

I get some use out of them at this weird intersection of teddy bear/rubber duck, search, and Stack Overflow. You certainly can't take them as authoritative - GPT-4o got a small but significant detail wrong on the very first thing I threw at it, a technical question I'd recently researched in some depth - but you can get pointed in the general direction of an answer a lot faster than present-day Google*, you can get working solutions to many small, well-defined problems readymade, and since it's a back-and-forth interaction it can help you talk yourself through figuring things out.

* of course this one might represent the decline of search as much as it represents innovation in language modeling
posted by atoxyl at 5:09 PM on May 13 [1 favorite]


There’s a deep irony here with Apple because the Mac Pro M2 Ultra and MacBook Pro M3 Max CPUs also contain integrated GPUs (roughly equiv to a 4070 Ti),

Apple Silicon M3 Pro competes with Nvidia RTX 4090 GPU in AI benchmark [u]

Tl;dr: basically, Apple has code optimization work to do to get to Nvidia-level performance, but its chips are already fast and use an order of magnitude less electricity.
posted by They sucked his brains out! at 5:22 PM on May 13


They sucked his brains out!: Tl;dr: basically, Apple has code optimization work to do to get to Nvidia-level performance, but its chips are already fast and use an order of magnitude less electricity.

Until you scroll down and read the update to the post...
posted by clawsoon at 5:32 PM on May 13 [1 favorite]


Apple's spending a shitload of engineering time on making AI models live largely in flash memory rather than RAM, is the thing. This way they get to use massive models without loading devices up with more RAM than they need for regular operation. If they nail this, and I'm sure they will, next year's phones will have on-board AI that is shockingly good, for what the hardware is. The Apple developer expo thingy this year is going to be an AI circus, that's for sure.
posted by seanmpuckett at 6:15 PM on May 13 [1 favorite]


On fewer jobs for humans: we had no idea the world needed I Glued My Balls To My Butthole Again or The Secrets Your A**hole Keeps until AI song creation came along, so maybe our entertainment shifts with AI replacing humans.
posted by jeffburdges at 6:40 PM on May 13 [5 favorites]


I've still got my money on AI founding a religion.

I do too, but... it kinda has already, right? Certainly people have all kinds of faith in this stuff that is absolutely not the domain of evidence-based reasoning. And the word slurry it spits out certainly helps them along these lines.

The bar for new religions is pretty low too, we had a new one pop up just this spring, enshrining the Chicago Rat Hole.
posted by SaltySalticid at 6:43 PM on May 13 [2 favorites]


I couldn't get through the intro video, I was cringing too hard. I don't understand why someone would want to interact with such a thing. It's like talking to an overly fake, insincere person, but even more fake because there's no person.
posted by zsazsa at 6:45 PM on May 13 [1 favorite]


daHIFI,

As cstross says, AIs should be viewed like corporations within our political system. Yes, you could do helpful things with AIs, but mostly they'll just exploit people.

There are zero good reasons for bitcoin to waste so much energy, but bitcoiners believe many stupid things, so they never fixed bitcoin's waste, or numerous other problems. AIs might reduce energy consumption massively, but this likely requires real work, not just trivialities.

We'd hit really nasty energy problems if we could keep growing our energy production, but thankfully our energy production should plateau or really crash long before then. We'll instead hit planetary boundaries like climate precisely because all these sociopaths think more more more is a good thing.
posted by jeffburdges at 7:04 PM on May 13 [12 favorites]


Consumer GPU TFlops (FP32):
Apple M3 Max 40-Core: 14.2 TFlops, 400 GB/s memory
Apple M2 Ultra 76-Core: 27.2 TFlops, 400 GB/s memory
NVidia RTX 4090: 82.6 TFlops, 1008 GB/s memory
NVidia RTX 4070 Ti: 40.1 TFlops, 504 GB/s memory
NVidia RTX 3080: 29.8 TFlops, 760 GB/s memory
NVidia RTX 3070 Ti: 21.7 TFlops, 608 GB/s memory

Bonus round:
NVidia A100 80GB: 19.5 FP32 TFlops | 156 TFlops using TensorFlow (compute library for ML), 2000 GB/s memory
Apple iPhone 15 Pro: 2.147 TFlops (GPU only), 51.2 GB/s memory

Atoxyl is definitely correct here - and as tempted as I am to say that I misremembered and it was the 3070 Ti: nah, I was just plain wrong and should've checked my sources.
posted by Ryvar at 7:17 PM on May 13 [3 favorites]


For another point in the "actual use" camp:

One of my kids is a big user of ChatGPT for test studying. She likes to really know the material, and never has enough practice problems. So she asks it to (for example) "Generate a 20 question multiple choice practice test focused on factoring polynomials based on the New York State common core Algebra 2 curriculum. Put the answers at the bottom."

In general that works like a charm for most subjects. Good practice problems, generally all correct and appropriate for the course, with the occasional bad one - a good lesson in not trusting the output blindly. If there's one problem that she wants to work on, she can ask it to generate more based on that.
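The prompt is basically a fill-in-the-blanks template. A quick sketch of the idea (the function and parameter names here are mine, purely illustrative; no API involved, you'd just paste the result into ChatGPT):

```python
def practice_test_prompt(n_questions: int, topic: str, curriculum: str) -> str:
    """Build a practice-test prompt like the one described above.
    All names here are illustrative, not any official interface."""
    return (
        f"Generate a {n_questions} question multiple choice practice test "
        f"focused on {topic} based on the {curriculum} curriculum. "
        f"Put the answers at the bottom."
    )

prompt = practice_test_prompt(
    20, "factoring polynomials", "New York State common core Algebra 2")
print(prompt)
```

Swapping in a different topic or question count to drill a weak spot is then a one-word change.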
posted by true at 7:36 PM on May 13 [12 favorites]


Until you scroll down and read the update to the post...

Which is that Apple has code optimization work to do, etc etc.
posted by They sucked his brains out! at 9:35 PM on May 13 [1 favorite]


I don't understand why someone would want to interact with such a thing. It's like talking to an overly fake, insincere person, but even more fake because there's no person

What part of "I DON'T WANT THE COMPUTER TO TALK TO ME" do they not understand?!?!
posted by jenfullmoon at 10:24 PM on May 13 [10 favorites]


First thing I'd do when I get a chance to interact with this thing is tell it to stop pretending to have emotions. I find the fake cheerfulness and enthusiasm to be *deeply* unsettling in a way I haven't been bothered by AI oddness before.
posted by NMcCoy at 11:23 PM on May 13 [6 favorites]


I continue to be astounded by the tech and almost certain it's not going to be the economic transformation people expect. We just don't need what it actually produces, and we need a lot less of some of the things it enables.

It's too much like WYSIWYG - really impressive for trivial tasks, immediately widespread, pain in the ass to use over time, undercuts the development of the very skills you need to Do The Thing effectively, then 20 years have passed and I'm using Markdown.

Or even plain HTML, on some godforsaken sites.
posted by McBearclaw at 11:29 PM on May 13 [16 favorites]


This new one still tells the same lies as previous versions about musician Robyn Hitchcock being an outspoken fan of the Boston Red Sox.
posted by johngoren at 11:55 PM on May 13 [4 favorites]


Mark my words— just as the gig worker economy (Uber, Doordash, etc.) allowed companies to circumvent workers' rights regulations by hiring 'independent' contractors instead of full-time employees, the real innovation of AI will be to justify depressing wages by, e.g., firing writers and rehiring them for less money as mere 'editors' of AI slop.

Mystery AI Hype Theater 3000, Episode 25 - An LLM Says LLMs Can Do Your Job features two researchers doing a deep dive into some of these studies.
posted by i like crows very much at 5:08 AM on May 14 [7 favorites]


gpt-4o says I died in 2023.
posted by the antecedent of that pronoun at 5:43 AM on May 14 [2 favorites]


NMcCoy: I find the fake cheerfulness and enthusiasm to be *deeply* unsettling in a way I haven't been bothered by AI oddness before.

Well, then, have I got the robot assistant for you!
“Sorry, did I say something wrong?" said Marvin, dragging himself on regardless. "Pardon me for breathing, which I never do anyway so I don't know why I bother to say it, oh God I'm so depressed. Here's another one of those self-satisfied doors. Life! Don't talk to me about life.”

― Douglas Adams, The Hitchhiker’s Guide to the Galaxy
posted by wenestvedt at 5:47 AM on May 14 [7 favorites]


You joke, but I work in health care. Where I work we are testing "AI Supported" documentation. That is, the provider places their phone between them and the patient and presses record. When they are done with the visit, they press stop. In less than two minutes, the documentation is in the patient's chart. Who did this before? People whose job title is literally "Medical Scribe".

So, yeah. People are going to lose jobs over this.


Well, that is a bit of an interesting case. I work in non-US healthcare, and we don’t have scribes. Every word that ends up in a patient’s note was written by their physician. This is not super ideal because physicians spend way too much of their time doing paperwork instead of seeing patients. But you’re not going to hire a full-time person to do your documentation unless the additional patients you’ll be able to see by virtue of having less documentation responsibility are going to actually pay that full-time salary. And that’s really only possible in extremely high paying specialties where I work, or in a festering capitalist hell system like the American one.

In my practice, a reasonably reliable AI transcriptionist generating first-draft notes and imaging reports would probably allow me to increase my clinic volumes by ~20% and wouldn’t threaten anyone’s job.
posted by saturday_morning at 6:44 AM on May 14 [5 favorites]


@jeffburdges thx for the links, will take a while to look them over.
posted by daHIFI at 6:53 AM on May 14


a reasonably reliable AI transcriptionist

This, I think, will continue to be the problem in healthcare, the 'reasonably reliable' part. AI scribes are just as prone to hallucination as the rest of LLMs, and so really require a human being to clean up the mess, in exactly the same way that earlier voice recognition does. Which is to say, there will be a divide: either doctors use these tools as labor-saving devices that create dangerous inaccuracies in the record, or someone does the work of making sure the record is accurate - except that because that someone will be using the technology, their work will be undervalued thanks to the presumed increase in productivity.
posted by mittens at 6:59 AM on May 14 [8 favorites]


Productivity is such a load of crap. Belief in it is religion, not science.
posted by grubi at 7:11 AM on May 14 [7 favorites]


You joke, but I work in health care. Where I work we are testing "AI Supported" documentation. That is, the provider places their phone between them and the patient and presses record. When they are done with the visit, they press stop. In less than two minutes, the documentation is in the patient's chart. Who did this before? People whose job title is literally "Medical Scribe".

I’m curious what part of the medical industry you work in. In my entire journey through the American health services I have never once had a conversation recorded. Are there places where this is common practice?
posted by Tell Me No Lies at 7:16 AM on May 14 [3 favorites]


AI scribes are just as prone to hallucination as the rest of LLMs, and so really require a human being to clean up the mess

Of course. I’m already used to “reasonably reliable” medical students and residents who rotate through my clinic writing first drafts that I need to clean up. Editing a note is usually not the time-consuming part. And yes, I’m sure LLMs will make different errors from the errors humans make. Users would need to be familiar with the technology and use it judiciously.

Look, I’m really not an LLM booster — this is the only application I’ve seen so far that I’m at all interested in — but yeah, I’m open to the possibility that these tools could help me provide better care to my patient population.
posted by saturday_morning at 7:23 AM on May 14 [6 favorites]


scribes can check the content they are creating and a human wrote whatever got printed

now ChatGPT pulls bullshit out of thin air, wearing a bikini and waggling its ass, but it's still bullshit and hallucinations


The question is going to come down to the rate that errors are made at. Humans are error prone, very much so when you overwork them. An AI has much more stamina, but is prone to flights of fancy.

The human side of that equation isn’t going to change. It’s not clear how much the AI situation can be cleaned up.

EDIT: or what saturday_morning just said
posted by Tell Me No Lies at 7:29 AM on May 14 [1 favorite]


To that end, David d'Angers' monument to Gutenberg at the Place Gutenberg in Strasbourg. It includes a plaque of jaw-dropping iconography (CW: racism, colonialism) depicting elegantly dressed Europeans standing over a throng of naked, fawningly grateful Africans. A printing press is in the middle, the source of Europe's gifts.

That is amazing. Thank you for sharing it.
posted by Tell Me No Lies at 7:32 AM on May 14 [1 favorite]


As an aside, Peter Watts' novels Blindsight & Echopraxia explore intelligences being radically different in their subjective experience, so much so that conversations become almost impossible, and "consciousness" disappears entirely among more intelligent AIs and hive minds.
posted by jeffburdges at 7:49 AM on May 14 [1 favorite]


Apart from all the other issues: the way they kept cutting "her" off in the "recognize multiple speakers" video, and making "her" perform for them, didn't feel good to listen to.

It’s an interesting dichotomy. On one hand it’s a computer interface, something which one would feel no compunction about interrupting and redirecting. On the other hand, it manages to emulate just enough sentience to make it uncomfortable.

Despite the use of a cheerful and compliant female voice throughout the demos, I wonder if people are going to end up choosing a less human voice on purpose.

Personally, I’ll be using HAL.
posted by Tell Me No Lies at 7:49 AM on May 14


FWIW conventional LLM hallucination and speech-to-text transcription inaccuracy are two very different things. LLM hallucination can be statistically minimized with more data but that is subject to diminishing returns because it has no access to the reality all its human-authored writing is describing. There is no mental model of our reality behind it, just the associations between the words we use and the concepts they describe.

Transcription can be improved via RLHF until it slightly exceeds the average doctor’s notes, though probably not much beyond that because how do you score higher than your best evaluator? The rate at which this improvement happens is limited by the usual training data volume and subject-specific model funding bottlenecks.

Diagnostics basically lives on the fault line between those two where in some basic respects it is possible for LLMs to outperform actual MDs, but for anything requiring a deep understanding of how a body or pathology actually works, an LLM is extremely likely to wander off into nearest approximation/hallucination.
posted by Ryvar at 7:52 AM on May 14 [2 favorites]


I’m curious what part of the medical industry you work in. In my entire journey through the American health services I have never once had a conversation recorded. Are there places where this is common practice?

I work in a combination of hospitals and clinics (I'm IT). I am unaware of recording conversations to be a common practice. Like I said, we are just testing it now. And, before the providers start recording they need to get consent from the patient.
posted by a non mouse, a cow herd at 7:56 AM on May 14 [2 favorites]


And, before the providers start recording they need to get consent from the patient.

Interesting. I wonder what it is about getting cheap transcriptions that makes the medical industry want to cross this line now.
posted by Tell Me No Lies at 8:08 AM on May 14


With regards to voice translation I currently use Siri for speech to text, Google Translate for text translation, and Siri to speak the translated text.

It’s not as cumbersome as it sounds (Google Translate has a nice facility for conversations) but the ability to recognize the speaker and know what language they’ll be using, as well as to recognize the end of someone speaking, will remove all the button pushing completely. I will use the shit out of that.

Of course at the moment it’s probably just a rigged demo, but it’ll be cool when it works.
posted by Tell Me No Lies at 8:26 AM on May 14


I am not...entirely sure that I want my doctor having a literal word for word transcription of everything I said to them, something that will no doubt be fed to the insurance company and used to deny coverage or otherwise harm me. The doctor's notes of course involve the doctor's assumptions, which can be bad, but a word for word every-pause-or-clarification transcription subject to manipulation by the insurance company will be a lot worse.
posted by Frowner at 9:28 AM on May 14 [12 favorites]


That's the "AI" part of it. It's not literal word for word. If you come in and say, "I've got a real bad headache, Doc." The AI would have a note something along the lines of "Patient is a 37 year old male presenting with a headache. Patient states headaches are chronic and are more acute with bright light."

The provider has to approve/edit the note before it is part of your chart.

Please note, I am in no way endorsing this software.
posted by a non mouse, a cow herd at 9:50 AM on May 14 [2 favorites]


am not...entirely sure that I want my doctor having a literal word for word transcription of everything I said to them, something that will no doubt be fed to the insurance company and used to deny coverage or otherwise harm me.

And if it supports your claim it can always turn out to have "not been turned on" via the same system that manages police body cameras.
posted by Tell Me No Lies at 10:15 AM on May 14 [4 favorites]


Machinery comes into the world not as a servant of ‘humanity,’ but as the instrument of those to whom the accumulation of capital gives the ownership of the machines.

…machinery also has in the capitalist system the function of divesting the mass of workers of their control over their own labor.
Harry Braverman, Labor and Monopoly Capital
posted by audi alteram partem at 10:38 AM on May 14 [2 favorites]


Decent-quality transcription has been around for quite a while. Google's Live Transcribe launched in 2019, well before the big wave of LLMs. Because it is an easily-supervised task, with lots of available data, it works well, and tends to make mistakes at the word level (mishearing) rather than invent new sections of text (hallucination). Transcription will improve with the introduction of LLMs (better ability to clean up obvious mis-hearings through use of longer context, for example), and I expect that, because there's still plenty of supervised data around, the failure mode isn't going to be hallucination. I would expect this kind of application to get to somewhere north of the 90th percentile of human performance, without the endurance limits and with a lot more convenience.

This paper from 2017 finds similar quality between human and machine transcription, and also shows that people generally can't tell the difference between human and machine transcripts in review (correctly identifying the source only 53% of the time).

This recent paper creates a robustness benchmark for speech transcription. They report a 2016 system (deepspeech) with a Word Error Rate (WER) of 17.7%, with more recent methods varying between 1.8% and 6.4% (for WhisperTiny, which is open source and can run locally on a wide range of devices).
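For anyone unfamiliar with the metric: WER is just word-level edit distance (substitutions + insertions + deletions) divided by the number of words in the reference transcript. A minimal sketch of my own, not code from either paper:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level Levenshtein distance between the
    reference and hypothesis, normalized by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # Standard edit-distance dynamic program over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution or match
    return d[len(ref)][len(hyp)] / len(ref)

print(round(wer("the patient presents with a headache",
                "the patient presents with headaches"), 3))
# 0.333 -- one deletion plus one substitution over six reference words
```

So a 1.8% WER means roughly one word-level error every fifty-odd words, which is in the neighborhood of careful human transcription.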

Overall takeaway: Transcription is more like laying railroad than painting pictures. Current algorithms have good quality, are efficient, and don't mind running for days without rest.
posted by kaibutsu at 11:03 AM on May 14 [9 favorites]


“What OpenAI did,” Ethan Mollick, One Useful Thing, 14 May 2024
posted by ob1quixote at 11:36 AM on May 14 [1 favorite]


After going back and forth for five minutes (in text chat) to get an actually relevant documentation link I realize that the future is yelling at your computer.
posted by atoxyl at 11:43 AM on May 14 [7 favorites]


I also work in healthcare, and people in my field are sincerely and genuinely proposing LLM-based chatbots and the like for use in end-of-life discussions, comforting people with dementia, and keeping older folks company. The justification is that we have a huge labor shortage.

It really gets me -- instead of spending millions improving working conditions and hiring and paying better salaries, the corporations want to spend millions to take humans out of healthcare entirely?
Humans are terrible at end-of-life conversations, so let's bring in the chatbots?

Hell no. Wrong way, go back.
posted by acridrabbit at 12:19 PM on May 14 [19 favorites]


comforting people with dementia,

Japan is already doing this with very promising results.

Frankly, dealing with people with Alzheimer’s or dementia can require inhuman patience. Humans may not necessarily be replaceable by AI in this situation, but it can take a tremendous load off of them.
posted by Tell Me No Lies at 12:39 PM on May 14 [5 favorites]


I will absolutely grant the "chat with people with dementia and remind them of things" case, because that's a realm where infinite patience and stamina are a true superpower, and it sure beats all-day TV. But as with the Vision Pro and its potential as an accessibility aid, it'll never have the ROI they need to appease the overlords. (Although this would be a hell of a use case for the open-source models Ryvar has been writing about).
posted by McBearclaw at 1:21 PM on May 14 [2 favorites]


I am positive that AIs are right now being trained for the optimal way to convert medical transcripts into maximally profitable healthcare billing codes that minimize rejections, i.e. Revenue Cycle Management. On the insurer side, the optimal strategy to deny procedures without getting in trouble.
posted by credulous at 1:30 PM on May 14 [2 favorites]


the “chat with people with dementia and remind them of things" case [will] never have the ROI they need to appease the overlords

Probably not the overlords, but I think the under-overlords may take it up. After all the investment just consists of training up a bot and covering computer time. I think there are quite a few people who would pay a moderate fee to ease the load of dealing with an ailing parent.
posted by Tell Me No Lies at 1:43 PM on May 14 [1 favorite]


I work in health care. Where I work we are testing "AI Supported" documentation.

*inadvertently creates redheaded whiteguyblink.gif* …I’m sorry, but in what fucking universe is that HIPAA compliant? Because I would bet $100 they aren’t running a speech-to-text model in-house on local hardware.


Judging from the stream of arxiv posts, something like half of all the research being done on federated ML models is being done by or with medical organizations trying to pool compute in ways that keep confidential information within their walls.
posted by a faded photo of their beloved at 1:52 PM on May 14 [1 favorite]


One of the tracks for the Veterans Affairs sprint I linked above is an ambient transcription application. They want a recording device in the room with patient/provider interactions. The idea is that you get a diarized transcript of the interaction, along with notes that the LLM generates from the transcript.

The VA can't retain providers because they don't pay in line with private hospitals, and the bean counting is much worse. They just don't have the staff to see the number of patients that they need to, so the goal of the exercise is to 'reduce burnout'. The other track involves taking external provider data (patient procedures performed outside the VA) and mapping it to the existing VA ERP.

And yes, there is some sort of standard metric used to evaluate provider notes, it's used as a ranking criteria for the project results.
posted by daHIFI at 4:25 PM on May 14


Just chiming in to say one thing... People are talking about GPT's refusal to give straightforward answers about things like World War 2 as if it's an inherent issue with AI, which it isn't. It's actually incredibly easy to get AIs to be hyper opinionated, to a fault. That's why they put in a ton of safeguards to get it to hedge and be wishy-washy about that stuff. There's an extremely long, hidden mandatory system prompt that basically tells it in a million ways to take a polite tone, not take sides on controversial issues, etc. Neutrality isn't the default, it's something they have to go out of their way to achieve. It's not a fault with LLMs or AI, it's a fault of risk-averse corporations.

If it says that Germany started World War 2 99% of the time but the other 1% it says Hitler was right, that's going to be worse for OpenAI's PR than if it just hedges every time. On a similar note, if having fewer safeguards means it can be more spontaneous and interesting, but also means it can be super racist when you tell it to be, that's some really bad PR for OpenAI. They're already capable of being more creative, unpredictable, and human, it's just not worth the risks for a company like OpenAI to let them work that way. Not until they iron out all the kinks, at least.
posted by Green Winnebago at 4:27 PM on May 14 [3 favorites]




News: Ilya Sutskever, the chief scientist deeply involved with last year's board drama, announced that he is leaving OpenAI
posted by Rhaomi at 5:25 PM on May 14 [1 favorite]


Humans may not necessarily be replaceable by AI in this situation, but it can take a tremendous load off of them.

Ok, well, before we get to emotional labor around end-of-life care, let's make a voice assistant that can change the channel on the Roku despite impaired speech.
posted by snuffleupagus at 5:31 PM on May 14 [4 favorites]


Ilya Sutskever, the chief scientist deeply involved with last year's board drama, announced that he is leaving OpenAI

Ditto Jan Leike, who AIUI was the next most senior member of the “superalignment” team, aka the safety faction within OpenAI. Sutskever is one of the weird ones but he was also an important counterweight to Altman’s well-spoken techbro persona and the general “move fast and break things” Valley ethos. It was really only a question of time after the failed ouster, but having this happen while the first model potentially capable of any sort of limited reasoning is currently undergoing its safety review is… not great.

I’ve been thinking a lot lately about just how many “may you live in interesting times” things appear to be coming to a head in November, and… I wish this had waited until after that.
posted by Ryvar at 6:24 PM on May 14 [6 favorites]


Green Winnebago: People are talking about GPT's refusal to give straightforward answers about things like World War 2 as if it's an inherent issue with AI, which it isn't. It's actually incredibly easy to get AIs to be hyper opinionated, to a fault. That's why they put in a ton of safeguards to get it to hedge and be wishy-washy about that stuff. There's an extremely long, hidden mandatory system prompt that basically tells it in a million ways to take a polite tone, not take sides on controversial issues, etc. Neutrality isn't the default, it's something they have to go out of their way to achieve. It's not a fault with LLMs or AI, it's a fault of risk-averse corporations.

There's another option: that there's no coherent worldview in the training data, and the model can't pick an answer that matches the user's worldview either, so it has to summarise a mess of input and then take guidance to steer it toward some middle ground, for whatever 'middle' looks like to the user.
posted by k3ninho at 12:08 AM on May 15 [3 favorites]


The problem with the WWII answers that people have posted isn't that they're wishy-washy, it's that they're incorrect.
The last country to declare war on Germany during World War II was Brazil. Brazil declared war on Germany on August 22, 1942.
And then after another prompt to try to get it right:
The correct answer to your question is that the last country to declare war on Germany during World War II was the United States, on June 5, 1945.
I don't know myself what the answer to that question is. I guess the skill that I have that the LLM doesn't is to say, "I don't know, let me look it up," then to call on my knowledge of/ability to suss out clues about trustworthy vs. untrustworthy sources, then to find a couple of sources, understand what they're saying, and see if there's a consensus between them.
posted by clawsoon at 4:28 AM on May 15 [2 favorites]


am not...entirely sure that I want my doctor having a literal word for word transcription of everything I said to them, something that will no doubt be fed to the insurance company and used to deny coverage or otherwise harm me.

This take didn't even remotely occur to me because I live in a country that (in theory) doesn't operate healthcare for profit. However, it strongly underlines a lot of the points already made (shovels for spoons being a great example): if you're concerned that these tools will affect jobs, then your complaint is with capitalism, not with AI.
posted by Just this guy, y'know at 4:51 AM on May 15


It’s not as cumbersome as it sounds (Google Translate has a nice facility for conversations) but the ability to recognize the speaker and know what language they’ll be using, as well as to recognize the end of someone speaking, will remove all the button pushing completely. I will use the shit out of that.

Funny enough, my earbuds seem to do a great job of detecting when I'm speaking so they can switch from noise cancelling to transparent mode. Sadly, they are unable to tell the difference between speech and singing along with the song they're playing in my ear, but they seem not to trigger on anything that isn't word-like.

Also, my phone will transparently translate text conversations in several messaging apps to/from other languages and make it look to me like everything is in English, while the person on the other end only sees their native language.

Point being that we're not too far away from that. Somebody just needs to get off their ass and combine a few things that already exist. Google has been super close for a few years now, but their ADHD seems to be getting in the way of crossing the finish line on seamless universal translator style translation. (With the caveat that even if they had stuck it out it wouldn't be perfect by any means, just mostly usable for casual conversation in some common languages, as with any transformer)
posted by wierdo at 5:08 AM on May 15


@snuffleupagus I think your Neuralink will take care of that.

@clawsoon That's why Retrieval Augmented Generation is so popular. Vector databases and the embedding process are fascinating. I've been incorporating these into Obsidian notebooks for some very interesting 'chat with my notes' use cases. I think the general model being put forth by Altman and others is that you'll have these very large, general models, which can then be fine-tuned for specific verticals or domains. I think the answer to questions like yours can be solved pretty easily if the model has the ability to look up information from a copy of Wikipedia directly and return a reference with its response.
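The retrieval loop behind "chat with my notes" can be sketched in a few lines. To be clear, this is a toy illustration, not any real product's plumbing: the notes, the bag-of-words "embedding", and all the function names below are invented for the example; a real setup would call an actual embedding model and a vector database instead.

```python
import math
import re
from collections import Counter

# Hypothetical mini-corpus standing in for a notes vault (made up for the example).
NOTES = [
    "The sun is about 93 million miles from Earth.",
    "Brazil declared war on Germany in August 1942.",
    "Obsidian stores notes as plain Markdown files.",
]

def embed(text):
    """Toy bag-of-words vector; a real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, k=1):
    """Rank notes by similarity to the query and keep the top k."""
    q = embed(query)
    return sorted(NOTES, key=lambda n: cosine(q, embed(n)), reverse=True)[:k]

def build_prompt(query):
    """Stuff the retrieved notes into the prompt so the model can ground its answer."""
    context = "\n".join(retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

The point of the pattern is the last function: the model never has to "remember" the fact, it only has to restate text that was looked up deterministically and handed to it.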
posted by daHIFI at 7:10 AM on May 15


Point being that we're not too far away from that.

I have to admit that I am a greedy brat about all this. In my lifetime I’ve been:
  1. Astonished and amazed I could fit an entire translation dictionary into a Palm Pilot.
  2. Deeply impressed when machine text translation became useful.
  3. Once again, astonished when machine dictation became useful.
  4. Pleasantly surprised by Google Translate’s mechanism for tying this all into a relatively smooth conversation.
And now all I want is perfection.

———

seamless universal translator style translation

Alas, unless we can teach machines telepathy we will never reach the simultaneous speech and translation ideal. The last word in a sentence can change its entire meaning, and the language that is being translated to may not support a similar construction.
posted by Tell Me No Lies at 7:43 AM on May 15 [2 favorites]


Ars Technica: Disarmingly lifelike: ChatGPT-4o will laugh at your jokes and your dumb hat
At this point, anyone with even a passing interest in AI is very familiar with the process of typing out messages to a chatbot and getting back long streams of text in response. Today's announcement of ChatGPT-4o—which lets users converse with a chatbot using real-time audio and video—might seem like a mere lateral evolution of that basic interaction model.

After looking through over a dozen video demos OpenAI posted alongside today's announcement, though, I think we're on the verge of something more like a sea change in how we think of and work with large language models. While we don't yet have access to ChatGPT-4o's audio-visual features ourselves, the important non-verbal cues on display here—both from GPT-4o and from the users—make the chatbot instantly feel much more human. And I'm not sure the average user is fully ready for how they might feel about that.
posted by Rhaomi at 2:35 PM on May 16


I think your Neuralink will take care of that.

The singularity is the new cold fusion.
posted by snuffleupagus at 4:50 PM on May 16 [2 favorites]


What's going to do us in - and by "us" I mean regular non-billionaires - is that we've never had an effective understanding of the world as a thing in itself beyond its use value. So at this point, people aren't really going to be super useful - the chatbots can chat with each other and move money around, the whole thing can go on autopilot except for nurses, maybe some researchers, a small number of people needed to push the buttons and unsnarl the things that the robots can't do for themselves, professions that are about groveling, like being a waiter, and so on. And those will get more and more poorly paid. Everyone with money will just sit at home slack-jawed watching custom porn while the chatbots do their investing. The rest of us will be unemployed and then homeless and then dead.

Just like whales and redwoods and obscure insects and so on, we ought to deserve to make lives for ourselves even if we don't make anyone any money. Even if a car is faster than a horse, horses still ought to exist. Even if an apartment building is more useful to humans than a redwood, we shouldn't chop the redwood down. But that line of thinking has never had much power, and now humans are going the same way. The remnant will just be weird, stunted rich people who only know how to talk to chatbots and their inbred human chattel, but the stock market will still go up.
posted by Frowner at 5:42 PM on May 16 [11 favorites]


The rest of us will be unemployed and then homeless and then dead.

Unlikely. You could paint the same picture of the concentration of wealth that went into the French monarchy, and look what came out of that?

You do not want a bunch of bored, starving humans wandering around. It is very bad for the status quo.
posted by Tell Me No Lies at 6:56 PM on May 16 [3 favorites]


[lots of creepy cheering for machine translation]

It would be nice, once we're all sitting around in the smoldering ruins of society, if people could at least admit that (a) the thirst to reduce translation to unpaid, zero-value labor was really shitty and has had entirely predictably disastrous results, and (b) since the work of translating a text from L1 to L2 is at least as complex as the production of the L1 text in the first place, the days of every word-centric job were numbered as soon as MT became good enough for capitalist purposes.

The urge to devalue others' labor is all too human, but it should be pretty bleedingly obvious at this point that the folks who cheered the destruction of the translation profession have been cheerfully digging their own professions' graves.
posted by Not A Thing at 7:13 PM on May 16 [7 favorites]


This is all truly terrible. The last time we had a major shift from paid workers to machinery we were forced to cope with our work hours being cut to 40 hours a week and children being forced to go to school instead of a job.

In the face of increased mechanization, Europe is even starting to play with four day work weeks. What are we going to do?
posted by Tell Me No Lies at 7:22 PM on May 16 [3 favorites]


Tell Me No Lies: This is all truly terrible. The last time we had a major shift from paid workers to machinery we were forced to cope with our work hours being cut to 40 hours a week and children being forced to go to school instead of a job.

That's true if your field of view is limited. The other thing that our mastery of machinery allowed us to do was export the shitty jobs that couldn't be automated and machine-gun down the people who objected.

Mechanization does have major benefits for the quality of life of the people who benefit from it, I won't argue with that. Perhaps one of these centuries its benefits, instead of its harms, will be extended to those who have been deemed raw material suppliers.
posted by clawsoon at 5:48 AM on May 17 [7 favorites]


Industrialization led to more child labor, not less. It was the labor movement that reduced child labor and work hours.
posted by tofu_crouton at 9:38 AM on May 17 [5 favorites]


Industrialization led to more child labor, not less. It was the labor movement that reduced child labor and work hours.

And Industrialization fueled the labor movement.
posted by Tell Me No Lies at 10:35 AM on May 17


That's true if your field of view is limited. The other thing that our mastery of machinery allowed us to do was export the shitty jobs that couldn't be automated and machine-gun down the people who objected.

The cost was high and it's certainly not an unalloyed good. But roughly 1 billion people are working reasonable hours and the children aren't working at all. I think that counts for something.
posted by Tell Me No Lies at 10:42 AM on May 17


and the children aren't working at all

I'm sure the children working in mines producing precious metals (some of which go into making computer chips and components, under ever-increasing demand) would love to hear this.

Here's a handy list of all the goods produced by child labor in 2022.

But maybe you're just talking about the United States? If you are, I have some bad news for you.
posted by fight or flight at 11:14 AM on May 17 [2 favorites]




The other thing that our mastery of machinery allowed us to do was export the shitty jobs that couldn't be automated

Those shitty jobs were way better than toiling on subsistence farms, which is why the share of people in extreme poverty has dropped by about 2/3 in the last 3 decades.

Should there be more labor rights worldwide? Absolutely. Does the economic development in China justify the political oppression there? Absolutely not. But at its core, economic development is about technological diffusion, which is often driven by increased trade - aka exporting shitty jobs.
posted by Mr.Know-it-some at 1:47 PM on May 17 [1 favorite]


> Those shitty jobs were way better than toiling on subsistence farms

Aside from the many other advantages to modern life, like less child labor, it's only mostly true that subsistence farming wound up shitty in the modern world; if you look further back, the claim gets quite complicated.

An idealistic future would be: AIs suck up all the bullshit jobs like law & advertising & administration, so then collapse can wipe them out and nobody cares. That's extremely unrealistic because...

Joseph Tainter's collapse model says "complexity" results from using energy & resources to solve "problems", so then "collapse" aka "simplification" means not solving those social problems anymore. All those bullshit jobs actually do solve some problem, maybe problems which never required solving, but there's likely some human-level complexity there.
posted by jeffburdges at 2:05 PM on May 17


It would be nice, once we're all sitting around in the smoldering ruins of society, if people could at least admit that (a) the thirst to reduce translation to unpaid, zero-value labor was really shitty and has had entirely predictably disastrous results

Whoops!
posted by Tell Me No Lies at 9:03 AM on May 19 [1 favorite]


Bwahaha, looks like Altman fucked up by indulging in the "Her" hype, and they're now having to backpedal, saying that their default voice is not based on Scarlett Johansen (sure) and that they'll be "pausing" use of "Sky", the ChatGPT voice that sounds like her.

(Luckily I chose "Ember", the voice that sounds like a narrator from 90s edugames.)
posted by Rhaomi at 12:15 PM on May 20


Amusingly, Sam Altman is one of three co-founders of Worldcoin.

At least some of us figured out ChatGPT was basically some form of investor fraud, but if you've not yet figured this out, then at least you can rest easy knowing that Sam Altman founds frauds.
posted by jeffburdges at 4:10 PM on May 20


> At least some of us figured out ChatGPT was a basically some form of investor fraud

@jeffburdges, I have not yet figured this out, could you explain a bit more or give a source? Google just gives me scammers who've used ChatGPT.
posted by kwartel at 3:42 AM on May 21


Kwartel, I can't find the specific article I read about it now, but Ed Zitron has been reporting how the promises are deliberately unachievable in order to generate obscene investments. I think some of his work on that was only on social media though.
posted by tofu_crouton at 7:15 AM on May 21 [1 favorite]


They're marketing large language models as search engines and factual content generators. They are neither, and it's not a matter of improvement. They're simply the wrong tools for the job. They will never "improve" into being those things because that's not what they're for. But by the time the investor class figures that out, the marketers will have taken the money and be long gone.
posted by hydropsyche at 8:29 AM on May 21 [6 favorites]


Large language models could be great for turning a database of facts into an output appropriate for your chatbot of choice to read to you, or summarizing a longer text to make it more casually digestible. They could also be great for better gleaning intention from instructions given to said chatbot.

The problem is that some people are trying to throw away the other stuff and just let the LLM do the whole thing, which they are very much not suited for. I could imagine a similarly structured system that could make a pretty decent search engine that one could pair with an LLM if you wanted conversational output, but that's not how these things are being built. It's cheaper to half ass it, given the vast amounts of storage and compute needed to train a model.
posted by wierdo at 10:51 AM on May 21 [1 favorite]


It depends upon the output type and required precision, wierdo. As a rule, LLM output needs review & tweaking by humans, usually human experts.

It's cheap to review a meme image, a poem, or a 3 min song. Yes, LLMs definitely open up shorter forms of artistic expression for people without those talents. In particular, LLMs should rock for making advertisements, maybe eliminating many jobs there.

I'd expect video game NPCs will become much less predictable too, not because they won't make mistakes, but because nobody cares if they do.

If otoh you want a novel, then an LLM descends into hallucinations somewhere along the way, but not in the meaningful way in which good authors do. Although not cheap, editing a novel remains cheaper than writing one, but editing still requires talent, and editing will become much more expensive for LLM output.

I've mostly observed LLMs produce garbage when people asked for short summaries, or even translations, but even once they improve you still cannot trust the output. You want oncologists, civil engineers, lawyers, pension fund managers, etc. reading LLM summaries of new methods, climate shifts, regulations, etc., respectively?

Investor marketing for ChatGPT presents it as a complete revolution. It therefore focuses upon exactly those serious applications, not less predictable NPCs in video games or untalented 14-year-old boys making songs.

Aside from tech limits, there exist economic limitations too, in that elites can only squeeze the middle & lower classes so much more, which then limits the potential for economic revolution of the sort investors crave. America has something like 200k workers in advertising, but if they can cut this by 50%, then how much do their stock prices or dividends really increase?
posted by jeffburdges at 4:18 PM on May 21


I have found that they rarely fuck up when you tell them something like "say the sun is 93 million miles away, but more conversationally". They don't then say "alpha centauri is 20 miles away" or whatever other dumb shit they sometimes end up spewing when you ask them to generate supposed facts on their own.

Or to use a less contrived example that would make the marketing people wet themselves, you ask your phone or "smart" speaker something like "where can I buy terro liquid ant bait near me?" Google already knows this and can do it fairly reliably thanks to their advertising integrations, but responses sound canned as shit. Training an LLM to take a list of stores and their corresponding distance and pricing would allow a natural-sounding response like "the closest place is megalomart where you can buy a six pack for eight dollars. lowes depot is half a mile farther and sells the same six pack for four twenty six. you can also buy online starting at three ninety eight with free two day shipping."

The idea being to use LLMs for the part of the problem they're actually good at and use existing, deterministic, technology for the parts where it already works well.
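That split can be sketched roughly like this. Everything here is invented for illustration: the store names, prices, and function names are stand-ins, and in practice the facts would come from the existing, deterministic ads/inventory backend rather than a hard-coded list.

```python
# Hypothetical store data; in a real system this comes from the deterministic
# product/ads backend, never from the model's own memory.
STORES = [
    {"name": "Megalomart", "miles": 1.2, "price": 8.00},
    {"name": "Lowes Depot", "miles": 1.7, "price": 4.26},
]

def lookup(stores):
    """Deterministic step: select and order the verified facts."""
    return sorted(stores, key=lambda s: s["miles"])

def grounding_prompt(product, stores):
    """LLM step: hand over the facts and ask only for conversational phrasing."""
    facts = "; ".join(
        f"{s['name']}, {s['miles']} miles away, ${s['price']:.2f}"
        for s in lookup(stores)
    )
    return (
        "Rephrase conversationally, using only these facts about "
        f"where to buy {product}: {facts}"
    )
```

The model's job shrinks to wording; whether the answer is true depends only on the lookup, which is the part we already know how to make reliable.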

Unfortunately, the AI boosters are in full on magic beans mode and insist on throwing everything including the kitchen sink at the language model. We'd be way better off training models to take on subtasks that are most in their wheelhouse. That's not to say that research shouldn't continue on the larger, more general purpose use cases, but we shouldn't be rushing to foist them on people by embedding them in every damn piece of software.

Point being that there are limited domains for which LLMs are already good enough to be useful and the stakes are low enough that errors aren't going to cause anything more than the mild inconvenience we already experience at times with almost all technology in our lives. Using LLMs for that kind of thing is fine. Problem is that it doesn't support tens of billions worth of valuation for companies like OpenAI, so nobody working on AI commercially is interested in rolling things out in such a limited way. It would give the VCs a sad if they had to wait for their lottery win.
posted by wierdo at 8:01 PM on May 21 [2 favorites]


I feel like this is what 'RAG' or....um....the other buzzword I can't remember right now is supposed to do.

But what none of the boosters are saying is 'expert systems,' which is what they need to couple LLMs to.

That is, traditional 'expert systems' from the previous AI generations, coupled to expert LLMs trained on a domain specific corpus with embedded systematic taxonomies as used in the industry or profession, that the LLM can map. Which is what I've concluded AI proponents mean when they talk about a LLM generating an internal 'world model.' A medical LLM should be trained on medical information organized according to medical principles.

And sanity checks on the output. Like, in the legal world, verification that the case relied on exists and that the text at the pin-cite actually contains the point of law it's offered for.
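A minimal version of that legal sanity check might look like the following. The case name, the lookup table, and the function are all invented stand-ins: a real implementation would query a citator service or case-law database rather than a hard-coded dict.

```python
# Invented stand-in for a real case-law database; the actual check would
# query a citator service, not a hard-coded dict.
KNOWN_CASES = {
    "Smith v. Jones, 123 F.3d 456": "contract formation requires mutual assent",
}

def verify_citation(citation, claimed_point):
    """Reject output whose cited case doesn't exist, or whose cited text
    doesn't actually contain the point of law it's offered for."""
    text = KNOWN_CASES.get(citation)
    if text is None:
        return False, "case not found"
    if claimed_point.lower() not in text.lower():
        return False, "cited text does not support the point"
    return True, "ok"
```

Crude as it is, even a check this simple would have caught the now-famous hallucinated-citation filings, because the fabricated cases simply don't exist in any database.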

The thing is, all of that has to be trained on and interface with existing proprietary sources controlled by famously exploitative entities like TR West.
posted by snuffleupagus at 9:34 AM on May 22


You type in, "say the sun is 93 million miles away, but more conversationally."

A tree burns in the Amazon.

You get back, "Yo, the sun is 93 million miles away."
posted by tofu_crouton at 12:49 PM on May 22 [5 favorites]


Inferencing doesn't have to gulp power. One of the few good things about this magic beans shit is that companies are building compute units well tailored to do such work in a very small power budget.

Hell, even the training is improving drastically. Nvidia has been focusing heavily on performance per watt over the past several years. They still mostly make a bunch of what amounts to toaster ovens in a PCIe slot, but that's because their customers prefer to spend the same amount of power to get more work done rather than do the same work for less power.

My hope is that as it becomes increasingly clear that we have reached a plateau of effectiveness in scaling ML models to larger and larger parameter sizes the magic beans people will see some sense and slow their roll.
posted by wierdo at 11:22 AM on May 23


“An Age of Hyperabundance,” Laura Preston, n+1, 10 April 2024
posted by ob1quixote at 7:08 PM on June 3






This thread has been archived and is closed to new comments