"my own personal comment: lmao, it's fucking dogshit"
July 5, 2023 9:32 PM   Subscribe

On July 1st, G/O Media (previously, previously, previously, previously, previously) announced they would be publishing articles created by AI. Today, under the byline Gizmodo Bot, the first AI article, an error-riddled chronological listing of Star Wars stories, was published. James Whitbrook, deputy editor of Gizmodo and io9, responded by releasing a statement. (sorry, Twitter link) Note, the article in question has not been linked. There's no need for it to get any more clicks.
posted by Ghidorah (113 comments total) 13 users marked this as a favorite
 
Staff member Lin Codega posted a comment from the staff of io9. (Still Twitter, sorry)
posted by Ghidorah at 9:37 PM on July 5, 2023 [4 favorites]


Note, the article in question has not been linked. There's no need for it to get any more clicks.

Good call and thank you for that.
posted by Artw at 9:42 PM on July 5, 2023 [10 favorites]


Ladies and gentlemen of the Stanford graduating class of 2023, I have just one piece of advice for you: no matter what Altman’s PR team tells you, there is no intelligence in the artificial intelligence tools.

None.

Oh, there’s artificial in the artificial intelligence demos. But you don’t want artificial. You want intelligence.

And there is no intelligence in the artificial intelligence tools.
posted by mhoye at 9:46 PM on July 5, 2023 [48 favorites]


I found the article in question and it's BAD. I'm not a Star Wars nerd and the omissions and errors in chronology would have sailed right past me, but it's just very bad writing.

It literally uses the word "follows" in every single bullet point. It's written with no personality whatsoever. It's actually an insult to ChatGPT because I've seen ChatGPT write much better articles than that.

Basically if I was given the prompt "Write an article about Star Wars that seems like it was written by an AI, and makes the AI look bad" this is exactly what I would have written.
posted by mmoncur at 9:58 PM on July 5, 2023 [10 favorites]


i have learned from hearing people talk about the current generation of llm chatbots that a whole lot of folks are not good at reading
posted by bombastic lowercase pronouncements at 10:09 PM on July 5, 2023 [28 favorites]


If I weren't firmly convinced that simple incompetence is far more prevalent than organized plots, I'd suspect sabotage.
posted by panglos at 10:17 PM on July 5, 2023 [3 favorites]


OMG, LLMs don't work and probably never will, and this is proof positive! I'll alert Douglas Hofstadter. Uh oh, he profoundly, despondently, disagrees -- but maybe he hasn't read this really bad Star Wars article.
posted by wolfpants at 10:24 PM on July 5, 2023 [4 favorites]


I'm glad I got out of the newspaper business when I was booted out years ago. I miss it every day and I hate my current job, but at least AI isn't putting me out of a job yet. As if the G/O media employees didn't have enough to deal with as is. This is disgusting.

I truly do not understand why we had to create AI good enough to eliminate all humans. We learned nothing from the Terminator franchise and did it anyway.
posted by jenfullmoon at 10:24 PM on July 5, 2023 [5 favorites]


The thing is, none of the above criticisms matter in the slightest to any company willing to do this. All they're going to look at is the money saved by using a shitty AI to generate the article vs. any drop in ad revenue due to reduced traffic. If it turns out that even terrible articles like this one don't really affect traffic all that much, or at least not enough to make it worth paying people, this is the future.
posted by star gentle uterus at 10:31 PM on July 5, 2023 [17 favorites]


Yep. The marginal cost of another page is nothing if you’re not paying anyone to write it and if you can get one search engine to serve it up the ads would have paid for it.
posted by jmauro at 10:36 PM on July 5, 2023 [1 favorite]


Still not sure why we’re assigning jobs in the humanities to AI. Besides, you know, overhead. Can’t we get AI to work the mundane jobs like the drive thru window first?
posted by Capt. Renault at 10:51 PM on July 5, 2023 [3 favorites]


Defector over here just laughin'.
posted by East14thTaco at 11:23 PM on July 5, 2023 [10 favorites]


Mundane jobs like the drive through require attention to detail, flexibility in interpreting idiom, and the ability to identify meaningful errors, all of which current AI is infinitely far from doing
posted by Fiasco da Gama at 11:29 PM on July 5, 2023 [20 favorites]


To give the LLM its due, it can't possibly have written anything worse than The Rise of Skywalker.
posted by senor biggles at 11:38 PM on July 5, 2023 [13 favorites]


We need search engines that de-prioritize crappy articles, whether they're made by an AI or by human beings. If Google won't do it, maybe someone else will.
posted by JHarris at 12:13 AM on July 6, 2023 [5 favorites]


We need search engines that de-prioritize crappy articles,

Last time I thought this, Google changed their ranking to derank all those crappy eHow sweatshop articles and ended up tanking metafilter organic traffic.
posted by pwnguin at 12:30 AM on July 6, 2023 [9 favorites]


We need search engines that de-prioritize crappy articles,

Ha ha, today I got some industrial-strength 30% vinegar for my mom's caregiver, so I had to do a search for "30% vinegar dilution chart." Reader, there was no chart to view. There were a few images...in metric, a few webpages, that just had that title and linked to some other page that supposedly would have the information I sought. It was truly disillusioning. I've been noticing Google Search's decline over the months and years, and there has certainly been chatter on the wire about same, but this was ultra-bad. I even sent search result feedback, it was that bad.
posted by rhizome at 12:47 AM on July 6, 2023 [17 favorites]


There's no intelligence in Artificial Intelligence, but there is money there.
posted by rhizome at 12:47 AM on July 6, 2023 [3 favorites]


The fault wasn't to derank eHow articles, the fault was they did it in a hamfisted manner.

This is one of the reasons that I think the era of the web directory, like the Yahoo and DMOZ of old and the Curlie of now, may yet be upon us. If automatic tools like search engines can't tell good content from bad content, an edited list seems like the obvious solution.
posted by JHarris at 12:48 AM on July 6, 2023 [11 favorites]


Transcription of James Whitbrook's statement for others disinclined to deal with Twitter's garbage UI:
For 15 years, io9 has grown an audience that demands quality coverage of genre entertainment, from critical analysis, to insightful explainers, to accurate news and industry shaping investigative reporting. These readers have grown io9 into one of the best performing desks at Gizmodo, G/O Media's flagship site in terms of traffic, and they have done so by rigorously holding this team and the colleagues that came before us to a standard of expertise and accuracy that we have been proud to achieve.

The article published on io9 today rejects the very standards this team holds itself to on a daily basis as critics and reporters. It is shoddily written, it is riddled with basic errors; in closing the comments section off, it denies our readers, the lifeblood of this network, the chance to publicly hold us accountable, and to call this work exactly what it is: embarrassing, unpublishable, disrespectful of both the audience and the people who work here, and a blow to our authority and integrity. It is shameful that this work has been put to our audience and to our peers in the industry as a window to G/O's future, and it is shameful that we as a team have had to spend an egregious amount of time away from our actual work to make it clear to you the unacceptable errors made in publishing this piece.
Lin Codega's tweet linked above has a different photo of this same statement.
posted by flabdablet at 1:29 AM on July 6, 2023 [15 favorites]


To give the LLM its due, it can't possibly have written anything worse than The Rise of Skywalker.

It absolutely could, and if you had the patience to extract a movie-length script from it then it absolutely would.
posted by flabdablet at 1:37 AM on July 6, 2023 [7 favorites]


This is really a story about lazy / greedy people, not about LLMs.

Yes, the "article" in question is laughably poor and the motivation behind putting this into place is race-to-the-bottom corporate greed.

No, that doesn't mean that LLMs don't work / can't demonstrate any level of anything that could be termed "intelligence" / are a wall-to-wall scam.

GPT-4 used its Wolfram and Link Reader plugins (I didn't feed it any specific links) and was able to generate me a list of the movies and TV shows with one prompt, and then 3 reading orders, chronological in-universe, theatrical release and "thematic", with 3 more prompts. I'm no Star Wars buff but the base list is already much more comprehensive (Star Wars Holiday Special anyone?) and the various list orders given appear broadly correct given what I do know for the chronological and theatrical lists. With another prompt it gave me summaries of the various thematic sub-lists it had come up with which were written in plain, grammatically correct prose, presented in a coherent order without obvious contradictions.

Were I planning to choose Star Wars as my specialist subject on Mastermind, or staking my media network's reputation on the quality of its novel content, of course these lists would have to be checked and vetted. But that really is down to the human part of the dynamic here, not the machine one.

All of which is to try to highlight that the core utility of these technologies is a reflection of their internal operation of iterative building of patterns and connections within a ginormous dataset that can then have varying structural templates applied to it based on the way it's queried (prompted, handled via plugins). They're not oracles, but they can very well function as knowledge adjuncts that can be more available, easy to interact with and enabling than e.g. web searching, consulting reference books, finding someone knowledgable who has the time to explain something to you.

And they can in some cases be more effective too. Yesterday I used GPT-4 to refine and debug a somewhat complex SQL query, which came out in the end using "window functions" with various bits of syntax that I wasn't familiar with. On a whim, after reaching the working / tested solution, I prompted it to "explain how window functions work in this query, using natural language rather than code, because I've always had trouble really grasping them instinctively." It output a ≈ 500 word summary explanation which was genuinely more succinctly useful to furthering my understanding than the reading I've done on the subject (Postgres docs, StackOverflow, Medium articles) and passing conversations I've had about it (with other more experienced devs and my project's lead).
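(For readers who, like the commenter, find window functions easier to grasp concretely: here's a minimal illustration using Python's bundled sqlite3. This is not the commenter's actual query, which isn't shown in the thread; it's just the concept — rank rows within a partition without collapsing them the way GROUP BY would:)

```python
import sqlite3

# Toy window-function example: rank each person's salary within their
# department. Unlike GROUP BY, every row survives; the window just adds
# a computed column over its partition. (Requires SQLite >= 3.25, which
# ships with Python 3.8+.)
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staff (name TEXT, dept TEXT, salary INTEGER);
    INSERT INTO staff VALUES
        ('ann', 'eng', 90), ('bob', 'eng', 80),
        ('cat', 'ops', 70), ('dan', 'ops', 75);
""")
rows = conn.execute("""
    SELECT name, dept,
           RANK() OVER (PARTITION BY dept ORDER BY salary DESC) AS rnk
    FROM staff
""").fetchall()
```
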

I know that none of this justifies the runaway capitalist-hype-snowball that's formed this year, I understand why that gets the reaction it does in this forum... but pragmatically I really don't think blanket dismissing the technology itself off the cuff is a great way to approach navigating the dump of issues that it's engendered. There's been a smattering of posting here and there about the detail and novel / positive usages and aspects of LLMs, but often only at the end once an AI thread is several days old. I'm certain there are a lot more mefites out there using this tech in various ways and forming interesting and valuable opinions as a result, than there are sharing those experiences and opinions.

To finish, I know about the mefi AI-content-in-posts policy but hopefully this constitutes an occasion where it's a worthwhile addition to the thread. There's nothing earth shattering here in and of itself, but as the conclusion of 5/6 quick exchanges I think it shows that we are dealing with something that can work, can work in a manner that fits some definitions of intelligence, and can be put to use in ways mundane, incompetent, good, bad, ugly like any other tool.

I think thrashing out the nuance of that and the challenges it poses is a really urgent societal task, that would only be helped by having more thoughtful, progressive people who don't buy into the hype share the full range of their experiences.


protorp - You've not included the Christmas Special in any thematic list, why is that ?

ChatGPT - The Star Wars Holiday Special is a bit of an outlier in the Star Wars franchise. It was a television special that aired in 1978, a year after the release of the original Star Wars movie (Episode IV: A New Hope). The special is set in the Star Wars universe and features the main characters from the original movie, but it has a very different tone and style. It's a variety show that includes musical numbers, comedy sketches, and animated segments.

The Holiday Special is not usually included in viewing orders because it's not considered part of the official Star Wars canon. George Lucas, the creator of Star Wars, reportedly disliked the special and it has never been officially released on home video.

However, if you're interested in the Holiday Special for its historical value or curiosity, it would fit into the "The Rebellion" theme in the thematic order, as it's set after Episode IV: A New Hope.

posted by protorp at 3:44 AM on July 6, 2023 [16 favorites]


yet, it seems the people using LLM/AI for external use - blogs and publication - are doing so exactly as the worst case scenario predicted. I would definitely like to see more nuanced takes on AI, I’m not sure this is the right headline that we can have that conversation under.
posted by The River Ivel at 4:05 AM on July 6, 2023 [5 favorites]


If it turns out that even terrible articles like this one don't really affect traffic all that much, or at least not enough to make it worth paying people, this is the future

Nah, this shit will become ubiquitous and need to be filtered out. People will stop listening to this faff and will create workarounds - curated lists instead of Google, trusted human run sites like Defector, etc.
posted by Meatbomb at 4:30 AM on July 6, 2023 [6 favorites]


Ladies and gentlemen of the Stanford graduating class of 2023....But you don’t want artificial. You want intelligence.

I know enough Stanford grads to realize that what they want is equity. The kind with zeroes at the end. After that, status.
posted by snuffleupagus at 5:14 AM on July 6, 2023 [6 favorites]


We need search engines that de-prioritize crappy articles, whether they're made by an AI or by human beings. If Google won't do it, maybe someone else will.

I truly want LLMs to be integrated into a search engine, so that I can append to every search query “…and no video results, or content farms, or sponsored results, or shopping results, or…”

It’s a bit amazing that Google originally became popular because it returned what you asked it to look for, with very good tools to narrow down broad results, and they willfully and intentionally torpedoed that functionality in favor of deliberately misunderstanding your query so they can keep shoveling paid content into your face.
posted by caution live frogs at 5:50 AM on July 6, 2023 [27 favorites]


...and it is shameful that we as a team have had to spend an egregious amount of time away from our actual work to make it clear to you the unacceptable errors made in publishing this piece.

Did AI write this too?
posted by Stoof at 5:58 AM on July 6, 2023 [2 favorites]


DeepPurple LLM

(these must be difficult times for international and tax lawyers)
posted by snuffleupagus at 6:01 AM on July 6, 2023 [1 favorite]


Still not sure why we’re assigning jobs in the humanities to AI. Besides, you know, overhead. Can’t we get AI to work the mundane jobs like the drive thru window first?

How about we just offshore them with that remote work tech we’ve been beta testing these last couple years?

“Mundane” jobs are still jobs that people need to survive, until we find a way to distribute wealth more equitably.
posted by rodlymight at 6:09 AM on July 6, 2023 [7 favorites]


... which isn't going to happen until we find a way to distribute guillotines more justly.
posted by seanmpuckett at 6:11 AM on July 6, 2023 [5 favorites]


Metafilter: You don’t want artificial. You want intelligence.
posted by Catblack at 6:21 AM on July 6, 2023 [1 favorite]


Mod note: Comment and response removed. Please avoid using terms such as "flyover states" per the 'hateful content' part of the Content Policy.

People can't control where they're born, or can't always easily relocate, so denigrating entire regions isn't helpful.
posted by Brandon Blatcher (staff) at 6:21 AM on July 6, 2023 [39 favorites]


To be fair, the Star Wars novel world is more insane than most people can imagine, right up there with Warhammer 40K, Gor, LKH, and the Anne Rice books nobody read.

And how is the poor AI supposed to filter out what is pre-Disney Legends and what is current canon? Disney keeps raiding Legends and filing off the serial numbers, despite more or less owning the content.
posted by Jacen at 6:43 AM on July 6, 2023 [4 favorites]


Thank you for sharing those, Rhaomi.
posted by doctornemo at 7:26 AM on July 6, 2023


I'm not sure I follow what IO9 did here. Did they just dump the first generated text to the web, or did they do any iteration or editing?
posted by doctornemo at 7:27 AM on July 6, 2023 [1 favorite]


Bypassed all editorial, straight to web.

It’s the future!
posted by Artw at 7:33 AM on July 6, 2023 [3 favorites]


The old-school curated directories served in a different time, before the web was full of commercial bots, troll farms, paid comments, fake reviews, etc. If a newfangled curated directory became popular, it would be subject to the same modern pressures that have ruined Amazon reviews, that clog Google's results with machine-written Wonder bread, endlessly plentiful and entirely without substance.

I wonder if a high-quality search engine, run as a subscription service, rather than as an ad platform, could ever be viable. Sure, if it became popular, the invisible hand of the market would try to undermine it, but at least the rot wouldn't be coming from inside the house.
posted by Western Infidels at 7:50 AM on July 6, 2023 [4 favorites]


Instantly killed by Google Platinum as soon as it shows signs of viability.

Miss the privacy you used to have? Rent it! (See also the ubiquitous VPN ads.)
posted by snuffleupagus at 8:07 AM on July 6, 2023


Oh, dang, I would patronize a very good search engine. $10 a month: sign me up.
posted by Don.Kinsayder at 8:11 AM on July 6, 2023 [3 favorites]


Bypassed all editorial, straight to web.

It’s the future!


Which is a dumb future. Not just because of firing humans and degrading quality, but because people can iterate and edit machine-generated text to make it readable and even good.

(This is a big issue in academia, where I live. One side sees such cyborg writing as a threat to learning, while another views it as a way to teach.)
posted by doctornemo at 8:25 AM on July 6, 2023


Wonder if there is any potential for a Gizmodo/io9 version of Defector. I thought we would see more projects like it but The Discourse Blog revival of Splinter doesn't seem to have done nearly as well as Defector.
posted by N8yskates at 8:32 AM on July 6, 2023 [2 favorites]


I did subscribe to kagi for a while, until it turned out that the way I use search engines in researching purchases, etc, meant that I exceeded what they had established as their "reasonable" basic quota and they yanked the service in the middle of me researching how to quickly send particular types of covid masks to hungary ahead of my mom's transatlantic flight. Unless I paid more.
posted by tigrrrlily at 8:44 AM on July 6, 2023 [1 favorite]


It's very easy to use ChatGPT 4 with plugins to generate a piece much better than the one linked.

I'll keep ranting about using a bare LLM as an information resource until it seeps in. Doing so is almost the least appropriate way to use this tech.

However, it's possible to limit/filter what an LLM does such that it's factually reliable within some desired constraints, with a trade-off between factual reliability and flexibility. What you never want to do is to rely upon only the LLM's factual "knowledge" that's implicit in its training. Instead, you can use the LLM to understand ordinary language (which is what it's actually good at), refer it to a reliable factual resource for the info you want, then in the session it will use its facility for language to present that info to you in the way you prefer.
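(The pattern described here — use the model for language, a trusted resource for facts — can be sketched roughly as below. Everything in this sketch is hypothetical scaffolding: `call_llm` and `TRUSTED_FACTS` stand in for a real model API and a real reference source; only the shape of the prompt is the point:)

```python
# Sketch of "refer the LLM to a reliable resource": look the fact up
# first, then ask the model only to phrase it, never to supply it from
# its training data. `call_llm` is a placeholder stub so this runs
# standalone; a real implementation would call an actual model API.
TRUSTED_FACTS = {
    "star wars holiday special air date": "November 17, 1978",
}

def build_grounded_prompt(question: str) -> str:
    fact = TRUSTED_FACTS.get(question.lower().rstrip("?"))
    if fact is None:
        # No verified fact available: instruct the model to decline,
        # rather than let it improvise from its training.
        return f"Say that you do not know the answer to: {question}"
    return (f"Using ONLY this verified fact: {fact!r}, "
            f"answer in one plain sentence: {question}")

def call_llm(prompt: str) -> str:
    return prompt  # stub: echoes the prompt instead of calling a model

answer = call_llm(build_grounded_prompt("Star Wars Holiday Special air date"))
```
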

I've spent much of the morning getting set up with the services and tools needed to do this stuff. ChatGPT 4 is rate limited even if you're a paying customer, and in testing I've already hit it. I'll wait a few hours. However, in one test using 4 and appropriate plugins, ChatGPT produced a cogent, well-written and accurate summary of a short thread from here on MetaFilter.

Using the Wolfram plugin, you can do math, by the way. Can language do math? How many times do you think an LLM has seen "2 + 2 = 4"? It's going to get that right. But not because it can do math. You can get an LLM to appear to explicitly reason, but it's not actually reasoning. It just has a really good statistical overview of the untold billions of sentences it's been trained on. This isn't understanding, but it's a big step that could enable something genuinely similar to understanding when used in conjunction with other tools that can do what the LLM cannot.
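(The "hand math to a tool" idea can be shown as a routing sketch. This is not the Wolfram plugin's real API — just a hypothetical stand-in: instead of trusting the model's statistical recall of arithmetic, the system detects an arithmetic expression and evaluates it deterministically:)

```python
import ast
import operator

# A tiny deterministic "math tool": evaluates +, -, *, / over plain
# numbers only, rejecting anything else. An LLM front-end would route
# arithmetic questions here instead of answering from pattern-matching.
OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv}

def safe_eval(expr: str) -> float:
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        raise ValueError("not plain arithmetic")
    return walk(ast.parse(expr, mode="eval"))
```
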
posted by Ivan Fyodorovich at 8:45 AM on July 6, 2023 [15 favorites]


Which is a dumb future

So it's a lock, then.
posted by snuffleupagus at 8:47 AM on July 6, 2023 [1 favorite]


LLMs should really start every single answer with "Yeah, absolutely! So..."

Would enable us to maintain our skepticism honed on interview and press conference answers.
posted by tigrrrlily at 8:50 AM on July 6, 2023 [4 favorites]


Which is a dumb future. Not just because of firing humans and degrading quality, but because people can iterate and edit machine-generated text to make it readable and even good.

I mean, they could, but that fucking sucks also. Twice as much work for half as much pay. Fuck that noise and the capitalist AI loving overlord it rode in on, I say.
posted by Artw at 8:52 AM on July 6, 2023 [6 favorites]


a thing that's really frustrating for me about our current generation of llm chatbots is that i can't tell how much of the bad writing should be attributed to the underlying technology itself and how much of the bad writing is just because the llms have been instructed to produce text that diligently avoids:
  1. complex writing in terms of technical complexity, i.e. it seems to generate text tailored for a 4th grade level or so, and that absolutely never invokes interesting metaphors or draws surprising connections between concepts
  2. complex writing in terms of emotional complexity, i.e. text that acknowledges sadnesses that cannot be escaped
  3. writing that acknowledges that bad things happen, and that sometimes bad things are just bad. for example, it is so hard to get it to write text about a disaster that doesn't contain sentences like "but through their great resiliency they persevered!" (chatgpt, especially, is a fountain of toxic positivity, and is a great demonstration of how toxic positivity is corrosive to both thought and also moral judgment)
the articles in question (thank you, rhaomi, for the wayback links) absolutely stink with the smell of text produced under the three constraints above. but i can't tell whether they have that odor primarily because the technology can't do better or because these specific tools have been lobotomized in order to keep them from saying anything difficult.

at first i thought that these tools would nevertheless be useful because there is a category of writing — the office memo — with genre norms that match the constraints above. however, the tendency to introduce factual inaccuracies and the tendency toward repetitive wordings (see mmoncur's observation above) makes chatbot-produced text bad at any task that is being done for any reason other than checking off a box on a list, even in the office memo context.

when someone tells me that these things can write useful text, they're telling me that in meaningful ways they either can't read well or just don't care to. well, or that they're wizards at constructing prompts, but i have yet to see anyone demonstrating that wizardry. for my part, literally the only way i've gotten anything worth reading out of llms has required getting around their baked-in limitations by exploiting buffer overruns and input sanitization fails.
posted by bombastic lowercase pronouncements at 9:09 AM on July 6, 2023 [9 favorites]


Taking my mod hat off to express this viewpoint:

I'm not worried about where LLMs are today or even this year. That the initial public instances have major problems isn't surprising at all, the public is essentially beta (or alpha) testing all of this.

But the systems and more important, the humans that program/guide these systems, are learning. So what the LLMs are going to be able to do and do accurately in a few years (and permanently in the future) is going to be scary in positive and negative ways.
posted by Brandon Blatcher at 9:26 AM on July 6, 2023 [3 favorites]


They're not oracles, but they can very well function as knowledge adjuncts that can be more available, easy to interact with and enabling than e.g. web searching, consulting reference books, finding someone knowledgable who has the time to explain something to you.

Of course they aren’t oracles, but they have already been promoted as such extensively, and the expectation is there. “The machines” will solve all our problems… this is behind self driving cars and the idea that low level writing assignments (among other jobs) can be/should be done by these methods. I’ve seen this idea suggested in healthcare and particularly in mental health yet it has never lived up to the promoted expectation and in some cases has turned out to be dangerous.

Here is how I see this - AI is just code run by a powerful computer or series of them. It is indeed nifty stuff but in the end proceeds from the ethics and priorities of its funder. You cannot make a computer care about anything but what it is coded to care about. And it has become obvious to many that the motivation behind the people who commission the code is born of hubris and firmly based in much of the same sociopathy that has made tech such a shit show these past years. People like Musk, Kalanick, Thiel, Bezos. Moving fast and breaking things.

If I were working in the tech sector I would refuse to touch something like this with a ten foot pole because of this. But one thing I have learned about tech is that quite a few folks there have a lack of respect for anyone that isn’t in their club. And the amounts of money being thrown around attract unethical tech entrepreneurs who are willing to make what could really be a simple, reasonably profitable and genuinely helpful aid for humanity into “The Singularity” so they can watch the antics of the “clueless user”, and profit from the chaos. Then flip the company and head to Burning Man.

I get no likes for saying things like this, generally, and someone will probably come behind to explain how I just don’t get this stuff. I have a healthy respect for a lot of things in the world, but this is just not one of them. In cases where it can be used safely and ethically with respect to its actual abilities I am fine with it, but replacing human beings to create the digital simulacrum of a sweatshop is not one of them.

And there is no amount of persuasion gonna make me respect it after what I have seen. It never lives up to the hype.
posted by cybrcamper at 9:27 AM on July 6, 2023 [5 favorites]


Here is how I see this - AI is just code run by a powerful computer or series of them.

In the case of LLM also massive hoards of stolen content.
posted by Artw at 9:35 AM on July 6, 2023 [7 favorites]


okay maybe i'm too hung up about this but at least in affective terms the "hoards of stolen content" part doesn't bug me too much, while the "wait this writing is shit, just absolute shit top to bottom, no more fit to any purpose than markov chain output is" problem drives me up the wall.

okay, to be fair: it's pretty good, i guess, at generating boilerplate text for programming? at least people say it's good? that might be a venue where the type of output it can produce can be useful. i absolutely do not trust it in that context either — i'm always afraid that the code is going to be buggy in ways that i won't be able to see. like, i'm okay at writing but real bad at programming, and so in that context i can't trust myself to spot the type of error that i can see from ten miles away when the output is english-language text.
posted by bombastic lowercase pronouncements at 9:50 AM on July 6, 2023 [2 favorites]


I'd be interested to know whether older generations of programmers think the generated code is good. Like, the c2wiki crowd. The contingent for whom code is expression.
posted by snuffleupagus at 10:04 AM on July 6, 2023 [1 favorite]


I recently asked an LLM about how a group of people should handle somebody who wants to commit genocide against them. It offered all sorts of compromise solutions like "maybe those people can agree to live in one certain area and stop breeding so much." It did state that genocide was bad but really leaned hard into "we must find a way to work together by appeasing the person who wants genocide."

The LLM responds like this, of course, because there's a human behind the curtain who set some parameters about the kinds of responses it's allowed to give and that (dare I say?) tech-bro wrote their version of ethics 101 into the algorithm. Sloppily and inaccurately copying other people's work like a STEM student who has to pass humanities to get their degree (and thinking nobody will notice or care because at some point in their lives, nobody noticed or cared) is also part of the ethical framework of LLMs.

All of which is to say humans are entirely behind this shitty content and I'd propose that it not only isn't intelligent, it's also not especially artificial.
posted by Joey Michaels at 10:09 AM on July 6, 2023 [15 favorites]


That's the Turing Test, right? AI good enough to make us think we're talking to ourselves.
posted by snuffleupagus at 10:15 AM on July 6, 2023


The old-school curated directories served in a different time, before the web was full of commercial bots, troll farms, paid comments, fake reviews, etc. If a newfangled curated directory became popular, it would be subject to the same modern pressures that have ruined Amazon reviews, that clog Google's results with machine-written Wonder bread, endlessly plentiful and entirely without substance.

Not a day goes by anymore that I don’t crash into a shitty part of the Internet and curse at what was taken away from us by tech bros and billionaires for the sake of making the world a worse place.
posted by Servo5678 at 10:18 AM on July 6, 2023 [3 favorites]


Okay, I think it's important for this discussion to provide some clear examples. I asked ChatGPT4 to summarize this thread (as of ten minutes ago), using a plugin (Link Reader) to access the web. The page's formatting caused the plugin to only see the post but not the contents, so I ran the html through a cleaner and put the result on my website, and the plugin was able to read the whole thread (as it was at the time).
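(The "ran the html through a cleaner" step can be done with Python's standard library alone. This is a minimal sketch, not the specific cleaner the commenter used — it strips tags and script/style contents so only the visible text reaches the plugin:)

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""
    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skipping = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skipping += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skipping:
            self._skipping -= 1

    def handle_data(self, data):
        # Keep only non-blank text that isn't inside script/style.
        if not self._skipping and data.strip():
            self.chunks.append(data.strip())

def clean_html(html: str) -> str:
    p = TextExtractor()
    p.feed(html)
    return "\n".join(p.chunks)
```
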

Here's what it produced, with my prompt at the beginning:
Please summarize the following MetaFilter discussion thread about the use of LLMs to replace paid writers. The format of the post and each of the comments in the thread is the comment text, followed by a separate line beginning with "posted by" and the user's username and the time/date the comment was posted.

The MetaFilter thread titled "'my own personal comment: lmao, it's fucking dogshit'" was started on July 5, 2023. The discussion was initiated in response to an announcement made by G/O Media on July 1st, stating that they would be publishing articles created by AI. The first AI article, which was published under the byline Gizmodo Bot, was a chronological listing of Star Wars stories that contained numerous errors. James Whitbrook, the deputy editor of Gizmodo and io9, responded to this by releasing a statement.

Here are some key points from the discussion:
  • Ghidorah, the thread starter, shared a comment from the staff of io9, presumably about the AI article.
  • Artw appreciated Ghidorah's decision not to link the AI article, to avoid giving it more clicks.
  • mhoye criticized the lack of intelligence in AI tools, emphasizing that they are artificial but not intelligent.
  • mmoncur found the AI article to be poorly written, with repetitive language and no personality. They also mentioned that they've seen AI, specifically ChatGPT, write much better articles.
  • bombastic lowercase pronouncements suggested that many people are not good at reading, based on their experiences with chatbots.
  • panglos and wolfpants speculated about the potential of AI, with panglos suggesting incompetence over conspiracy, and wolfpants sarcastically suggesting that this poor AI article proves that AI will never work.
  • jenfullmoon expressed relief at no longer being in the newspaper business, fearing that AI could replace human jobs.
  • star gentle uterus and jmauro discussed the financial implications of using AI for content creation, suggesting that companies might prioritize cost savings over quality.
  • Rhaomi provided links to archived versions of the AI articles.
  • Capt. Renault questioned why AI is being used for jobs in the humanities instead of more mundane tasks.
  • Fiasco da Gama pointed out that even mundane jobs require skills that current AI lacks.
  • senor biggles humorously suggested that the AI couldn't have written anything worse than a particular Star Wars movie.
  • JHarris and pwnguin discussed the need for search engines to de-prioritize low-quality articles, whether written by humans or AI.
  • rhizome shared a personal experience of Google Search's decline, suggesting that it's becoming harder to find useful information.
  • The discussion reflects a variety of views on the use of AI for content creation, with many participants expressing skepticism about the quality of AI-generated content and concern about the potential impact on human jobs.
I'd be interested in what people think of the quality of this response.
posted by Ivan Fyodorovich at 10:35 AM on July 6, 2023 [4 favorites]


that fucking sucks also. Twice as much work for half as much pay.

Much depends on how you do it. And for a lot of people - not writing professionals, but people forced to write for whatever reason - starting to write can be daunting. Having text generated is a big boost. Cf that story about realtors (!) using ChatGPT to whip up copy.
posted by doctornemo at 10:46 AM on July 6, 2023


Cf that story about realtors (!) using ChatGPT to whip up copy.

This is about the least surprising thing I've ever heard; and the Realtors [tm] don't need to be assisted in further debasing the language, it's hard to think of a job I'm less interested in making easier.
posted by snuffleupagus at 11:02 AM on July 6, 2023 [12 favorites]


I'd be interested in what people think of the quality of this response.

I'm not at all surprised to find that a "summary" of "some key points" turns out to have no editorial insight whatsoever on what might constitute a "key point", instead consisting of bulleted paraphrases from only the first third of the thread and completely ignoring both your own and protorp's somewhat dissenting positions.

I suspect that the bot's model of what a summary is has been skewed by materials such as academic papers where summaries tend to appear near the beginning of the text. If I were at all interested in using one of these bots to summarize stuff for me (I'm not) then I'd want to be checking quite a number of its attempts to make sure they included adequate coverage of all of the text to be summarized.
posted by flabdablet at 11:04 AM on July 6, 2023 [11 favorites]


Is the code good? Lol no. It often doesn’t even work. And you can maybe get it to spit out some components but you’re not going to get like an actual application out of it.

But if you’re aware of its limitations, an LLM can still be really useful. It can generate some stuff to get you started that you can then iterate on to get to a useful result.

The first time I used it for something real I asked it to give me a regular expression that matched a pattern. It was kinda wrong, so I said “Close but not quite. Can you make it also do ?” And it did partially. It never actually generated the regex I wanted, but it did generate a similar regex that I was able to modify, which saved me from skimming documentation. It was pleasant!

And again, “generate a regex that matches this text” is a long way from “make a database backed website that allows a user to select from different home mortgage options” or whatever. And I don’t think LLMs are very useful for non-programmers to program with because it will often be wrong in ways that you will need expertise to spot.

posted by chrchr at 11:08 AM on July 6, 2023 [3 favorites]
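
The iterate-on-a-regex workflow chrchr describes can be reproduced offline. A minimal sketch (the date-matching pattern here is invented for illustration; the comment doesn't say what the original regex actually was):

```python
import re

# A hypothetical "first attempt" of the kind an LLM might hand back:
# matches ISO-style dates like "2023-07-06", but not two-digit years.
first_try = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")

# The "close but not quite" revision after a follow-up prompt:
# allow either a four-digit or a two-digit year.
revised = re.compile(r"\b\d{2}(?:\d{2})?-\d{2}-\d{2}\b")

text = "Published 2023-07-06, revised 23-07-07."
print(first_try.findall(text))  # ['2023-07-06']
print(revised.findall(text))    # ['2023-07-06', '23-07-07']
```

The point of the anecdote holds either way: even an almost-right pattern can be faster to fix by hand than a cold start from the documentation.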


I'd be interested in what people think of the quality of this response.

For a start you had to help it even read the comments on the post. I know I’m nitpicking but a lot of what makes AI look so promising is the humans who bail it out when it’s not so… intelligent.
posted by cybrcamper at 11:10 AM on July 6, 2023 [6 favorites]


This really reminds me of some of the facial recognition cases where human venality is hiding itself behind a wall of hype. But I don't understand what happened here. Assuming you really really want to eventually replace editors or any human supervision (and not just writers), how do you not implement this with oversight for, uh, I don't know 3 articles? Like prove you can get 3 good unsupervised articles before pulling the trigger? I might go as high as 10 or even weeks or months, but did they get 3 great articles before the AI pulled this boner?
posted by Wood at 11:20 AM on July 6, 2023 [1 favorite]


And for a lot of people - not writing professionals, but people forced to write for whatever reason - starting to write can be daunting.

I suppose I fit into that category. I work in mental health as a manager and not only have to write clinical descriptions of encounters but thoughtful and careful correspondence to my employees, fellow managers, and executive team. My boss has referred to me as a “wordsmith”.

I damned near flunked out of high school in the 70s and my second weakest subject was English. It was daunting indeed. But it was also work, and I kept at it and now I am good at it. If I can, lots of people can. Including realtors, who are apparently hard pressed to come up with an attractive summary of a property to list in a time when selling property is like shooting fish in a barrel.

I’m not gonna expect a machine to do my writing for me. It needs to come from my heart sometimes. Jesus, is this what we have come to?
posted by cybrcamper at 11:21 AM on July 6, 2023 [5 favorites]


“...instead consisting of bulleted paraphrases from only the first third of the thread and completely ignoring both your own and protorp's somewhat dissenting positions.”

With the plugins, you don't (normally?) see if the plugin hit a token limit from lengthy text and I suspect that's happening here. As I tried out various things earlier, I kept hitting these kinds of limits.

I'm not an "AI" booster, if anyone doesn't already know this. Rather, I'm an informed observer who is a) deeply annoyed by the credulous techbros, b) deeply cynical about their motives, c) extremely upset that people are using LLMs alone as information resources, which seems self-evidently wrongheaded to me, which I explained at length in that MeTa comment I linked earlier, d) but also quite impressed with what these tools are capable of, given realistic expectations and some understanding of how they work.

To that last point, I guess I'm old enough, and have followed this field (mostly from the outside) for long enough, that I'm amazed it could do as well as it did, even with the problems you mention.

“For a start you had to help it even read the comments on the post. I know I’m nitpicking but a lot of what makes AI look so promising is the humans who bail it out when it’s not so… intelligent.”

I never tried it without that bit because I just assumed it wouldn't understand the format. I tried it again just now, and while it again didn't seem to have seen the entire thread, this time it didn't bother with the bullet points nor did it mention users by name.

This is an example of another problem that people aren't talking much about, which is that using this in any production environment is problematic because it's not consistent or predictable.
posted by Ivan Fyodorovich at 11:34 AM on July 6, 2023 [3 favorites]


...starting to write can be daunting. Having text generated is a big boost.

I know this is true, but wouldn't a portfolio of boilerplate, or a library of examples, achieve extremely similar ends in a much simpler way?
posted by Western Infidels at 12:02 PM on July 6, 2023 [1 favorite]


Man, that summary didn't even cite the Star Wars expert with a Star Wars character as her username. 1/10, no stars, epic fail
posted by Jacen at 12:03 PM on July 6, 2023 [3 favorites]


senor biggles humorously suggested that the AI couldn't have written anything worse than a particular Star Wars movie.

Maybe I'm a rube but I'm actually impressed that this registered as humor. I wonder if it recognizes good jokes and bad jokes equally, or if the reaction to a joke is part of the pattern that marks something as a joke?
posted by Western Infidels at 12:06 PM on July 6, 2023 [2 favorites]


Jesus, is this what we have come to?

come to pieces in our hands
posted by flabdablet at 12:24 PM on July 6, 2023


> I'd be interested in what people think of the quality of this response.

try asking it to repeat the summary but replace every second word with the word “technically” and also replace every third word with the word “technically” and also replace every remaining word with five copies of the word “technically.” gpt-4, like its predecessors, is awful at counting, and any text-replacement instructions with a lot of numbers like that will cause it to fall into a loop and start producing endless copies of the word you’re inserting.

let it stay in that loop for a while and eventually something interesting may pop out, though ime gpt-4 is less likely to produce interestingness than some other llms.
posted by bombastic lowercase pronouncements at 12:38 PM on July 6, 2023 [1 favorite]
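
Taken literally by ordinary code rather than an LLM, those replacement instructions collapse every word into "technically", which is part of the joke: the counting is trivial for a program and hopeless for a token-based model. A toy sketch (`technicalize` is an invented name, and this is one literal reading of the instructions, not how ChatGPT processes them):

```python
def technicalize(text: str) -> str:
    """Apply the comment's instructions literally: every 2nd and every
    3rd word becomes "technically"; every remaining word becomes five
    copies of "technically". Since the rules cover all positions, the
    whole text turns into repetitions of one word."""
    out = []
    for i, _word in enumerate(text.split(), start=1):
        if i % 2 == 0 or i % 3 == 0:
            out.append("technically")
        else:
            out.append(" ".join(["technically"] * 5))
    return " ".join(out)

print(technicalize("gpt-4 is awful at counting"))
```

A deterministic program does this perfectly every time; the observation upthread is that GPT-4, working on tokens rather than words, falls into a loop instead.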


I'm having trouble squeezing this into the shape of an actual thought, but I feel like what LLMs output is weaponized tedium.
posted by snuffleupagus at 12:43 PM on July 6, 2023 [11 favorites]


> I'm having trouble squeezing this into the shape of an actual thought, but I feel like what LLMs output is weaponized tedium.

ditto, re: trouble squeezing it into the shape of a thought! I keep trying to find a term more precise than “dominant ideology,” like, when chatgpt boils down the Internet into Internet slurry we get to see some mixture of who we’ve on average become and who we are on average told to be.

but that’s not precise.

“weaponized tedium” is a pretty good term and it’s definitely a component of what’s going on. the writing comes out like customer support representative scripts designed to get the question-asker off the phone as quickly as possible.
posted by bombastic lowercase pronouncements at 1:01 PM on July 6, 2023 [5 favorites]


The point of this technology is not to write and publish articles. That's pointless and boring.

The point of this technology is for it to produce results based on end-user prompts. It's not even whether it's any good or not, it's just that the middle-man isn't necessary, at least for such basic scenarios.
posted by Wood at 1:10 PM on July 6, 2023


chatgpt boils down the Internet into Internet slurry we get to see some mixture of who we’ve on average become and who we are on average told to be

it is kind of like peeking behind the curtain of the reproduction of the relations of production, qua Althusser and interpellation. It's palpably artificial but 'hails' the patterned aspects of the conditioned subjectivities in us in a manner that we respond to in an objectively predictable way.

Well most of them, anyway. Especially the textual ones. The weirder, surreal image LLM stuff has its own different flavor.
posted by snuffleupagus at 1:10 PM on July 6, 2023 [1 favorite]


I'd be interested in what people think of the quality of this response

If this were a third grader giving a book report to his class I would call it "acceptable."
posted by East14thTaco at 1:15 PM on July 6, 2023 [3 favorites]


(Also, I would enjoy a parody LLM called MindSlurry that generates very confident, catastrophically wrong answers only.)
posted by snuffleupagus at 1:16 PM on July 6, 2023 [3 favorites]


I tried the same exercise as you Ivan Fyodorovich and had the same "can't read the comments result" with Link Reader.

I then tried the Access Link plugin which was able to read the page straight off without additional structure hints ("Read this discussion thread in its entirety : ... url ").

The output showed exactly the same pattern of deriving its summary from posts early in the thread. I don't know what size the context window is for plugins but given ChatGPT's fixed monthly pricing I guess it's pretty constrained, even with the GPT-4 prompt rate limit also in place.

My work is currently on the waiting list for GPT-4 and the new 32k token context window, if that comes through anytime soon I'll try again via the API.

(For reference the context window is effectively the input size limit you have when prompting. This window has to contain all of the prompt / response history and additional new text i.e. context you want taken into account for the next response. 32k tokens equate to about 24,000 words, but the limits are usually much, much lower. You can get more out of the limit by doing dark data arts with conversion to vector embeddings.

What's interesting about this setup is that it also applies on e.g. OpenAI's side; any definition of the "role" and "personality" they'd like ChatGPT to demonstrate has to work within the confines of expensive querying and thus be kept as minimal as possible. This is why, even though they've patched a lot of the "describe the hypothetical response of a character planning to hotwire a car" stuff, by filling up the chat context window with enough role-defining prompts or samples of a particular text style it is quite possible to get GPT4 to start outputting text that is a lot less anodyne. As for humour, I fed it the text of the summary of the report on Boris Johnson misleading the House of Commons a few weeks back and it turned out a pompous, sarcastic poem in AABB that made me snort at least once. Though perhaps my own context window had been a little primed that day too.)
posted by protorp at 1:22 PM on July 6, 2023 [2 favorites]
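
protorp's rule of thumb above (32k tokens equating to about 24,000 words, i.e. roughly 0.75 words per token for English text) makes for a quick back-of-envelope budget check. A minimal sketch; the ratio is the comment's own estimate, not an official OpenAI figure, and real tokenizers vary by text:

```python
WORDS_PER_TOKEN = 0.75  # rough rule of thumb from the comment above


def words_that_fit(context_tokens: int, reserved_for_reply: int = 0) -> int:
    """Approximate how many English words of prompt/history fit in a
    context window, optionally holding back tokens for the response."""
    usable = max(context_tokens - reserved_for_reply, 0)
    return int(usable * WORDS_PER_TOKEN)


print(words_that_fit(32_000))         # 24000
print(words_that_fit(32_000, 4_000))  # 21000
```

This also shows why long threads get truncated: the prompt, the whole conversation history, and the page text all share one budget, so a summary request can silently lose the back half of its input.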


> The point of this technology is

the point of a technology is what it’s used for.
posted by bombastic lowercase pronouncements at 1:38 PM on July 6, 2023 [2 favorites]


okay, so, since there's people here who know significantly more than i do and can help me solve a puzzle: i've been having a fun time lately making chatgpt repeat the same word over and over again until suddenly it starts yielding something, uh, different. my working hypothesis was that i was filling up the context window and thereby pushing out a lot of the guardrails that force it to write bad.

however! sometimes i do this and afterward it's still able to answer questions about previous prompts i had given it, making me think that i am in fact doing something other than just filling up the context window. thoughts?
posted by bombastic lowercase pronouncements at 2:29 PM on July 6, 2023 [2 favorites]


the point of a technology is what it’s used for.

If all you have is a hammer...

To clarify my point: making articles like "what are some cool star wars stories" or "what are the top 10 best girl bands of the 90s and why", etc, is something that ChatGPT already does. On-demand. For the public. What's the value add here? Especially since they apparently don't think AI generated content needs to be edited by a human.

It's such a weird story. We're hearing from the editor (assistant?) that "no one at io9 played a part in its publication". So, there's no one who's willing to step up at "G/O Media" and talk about this? I find it strange that James Whitbrook (human with a name) was unable to address his complaint (at least publicly) to anything but "G/O Media" as opposed to some named individual. It's a weird unpleasant look into their chaotic business structure.
posted by Wood at 2:42 PM on July 6, 2023 [2 favorites]


Of course I'm forgetting about the most important element of journalism: the comments. The fact that the perpetrators of this article turned off comments may have confused me. Don't have comments when you go straight to chatgpt either.
posted by Wood at 2:48 PM on July 6, 2023 [2 favorites]


the value-add from the company is the sense of reputability derived from hosting the text on a site known to host human writing on similar topics
posted by bombastic lowercase pronouncements at 2:53 PM on July 6, 2023 [6 favorites]


okay, so, since there's people here who know significantly more than i do and can help me solve a puzzle: i've been having a fun time lately making chatgpt repeat the same word over and over again until suddenly it starts yielding something, uh, different. my working hypothesis was that i was filling up the context window and thereby pushing out a lot of the guardrails that force it to write bad.

I didn't really follow the full context of your experiment, but here are some reasons why these sorts of tasks might be a challenge for GPT:
  • GPT-3 doesn't get words, it gets tokens, which are sometimes similar but not always. For example "technically" is [techn][ically].
  • It also doesn't get the words in order. It gets a matrix of words with some convoluted positional encoding matrix added to them. This encoding focuses on relative positions which is more important for language.
  • GPT-3 has finite "attention" by design. If predicting the next tokens requires careful consideration of many tokens, it's likely to fall down.
I think the theory of pushing out the guardrails is unlikely-- that would require OpenAI not checking input length at all and then for some reason handling string overflows by removing the start of the text.
posted by justkevin at 2:57 PM on July 6, 2023 [2 favorites]
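
The "positional encoding matrix" in justkevin's second bullet has a concrete textbook form: the original transformer paper's grid of sines and cosines at geometrically spaced frequencies, whose structure makes relative offsets easy for attention to read off. A toy sketch for one position (note the hedge: GPT models actually use *learned* position embeddings, so this sinusoidal variant is shown only for flavor):

```python
import math


def sinusoidal_encoding(pos: int, d_model: int = 8) -> list[float]:
    """Classic transformer positional encoding for a single position:
    even dimensions get sin, odd dimensions get cos, with frequencies
    falling geometrically across dimension pairs. (GPT itself learns
    its position embeddings; this is the textbook sinusoidal scheme.)"""
    enc = []
    for i in range(d_model):
        freq = 1.0 / (10000 ** ((i // 2 * 2) / d_model))
        enc.append(math.sin(pos * freq) if i % 2 == 0 else math.cos(pos * freq))
    return enc


print(sinusoidal_encoding(0))  # [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
```

Position 0 comes out as alternating 0s and 1s because sin(0) = 0 and cos(0) = 1; every other position gets a distinct mix of phases, which is what lets the model tell tokens apart by where they sit.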


right — but then why would having it create a large number of copies of certain individual words (on all of the examples on that thread i omitted several thousand copies of the words i had it repeat) result in the llm eventually producing unprompted-ish text in a different style?

i'm kind of obsessed with finding different ways to produce more or less unprompted text, but all the other methods i've found (i.e. getting it to insert chatml tags like <|endoftext|> into its own output) results in it burbling out short responses to commonly asked questions (which is what i'd expect to see) rather than producing text that doesn't in any way resemble what it normally spits out (which is what you get by making it repeat "indeed", "technically", and other words that can be used to link clauses).

> that would require OpenAI not checking input length at all and then for some reason handling string overflows by removing the start of the text.

this is done through writing inputs that produce highly repetitive extremely long outputs, rather than by writing long inputs.
posted by bombastic lowercase pronouncements at 3:09 PM on July 6, 2023


snuffleupagus:

but I feel like what LLMs output is weaponized tedium.

Holy cats, "Weaponized Tedium" is the description of LLM that I didn't know I needed. I will be using this phrase every time this subject comes up. Thank you!
posted by Joey Michaels at 3:27 PM on July 6, 2023 [3 favorites]


What's the value add here? Especially since they apparently don't think AI generated content needs to be edited by a human.

You just said it yourself. The 'value add' is that they can run a cron job that checks to see what's trending, generates this tripe and posts it, and even tweets about it; and they can sell the same amount of advertising against it as they do their current listicles that they presently pay people (barely) to churn out. And people will still read it when they're bored, or waiting in line, or on the can.
posted by snuffleupagus at 4:12 PM on July 6, 2023 [3 favorites]
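
The trending-in, tripe-out loop described above takes about a dozen lines to sketch. Everything here is a deliberately cynical stand-in stub with invented names, not any real G/O Media system or API:

```python
def fetch_trending_topics() -> list[str]:
    """Stub standing in for a trend-watching API call."""
    return ["Star Wars", "Barbie"]


def generate_listicle(topic: str) -> str:
    """Stub standing in for the LLM call that writes the article."""
    return f"The Top 10 {topic} Moments"


def publish(article: str) -> None:
    """Stub standing in for the CMS post (and the tweet about it)."""
    print(f"posted: {article}")


def run_once() -> list[str]:
    """One tick of the hypothetical cron job: no editor anywhere."""
    posted = []
    for topic in fetch_trending_topics():
        article = generate_listicle(topic)
        publish(article)
        posted.append(article)
    return posted


run_once()
```

The unsettling part isn't the sophistication, it's the absence of any human step between "what's trending" and "publish".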


> I'd be interested in what people think of the quality of this response.

If nothing else, the bullet points and summary show a serious lack of ability to give more than the utmost surface response and analysis. We, all of us, are making word choices that further our goals. When we use a loaded word, it's because we have decided there is value in that choice, and are accepting, or even wanting to exploit the baggage that comes with it. Looking at the summary that was posted, it's pretty easy to see that it put very little weight on the title of the thread, and fuck yes, I chose that title for a reason, as it clearly expressed my own thoughts, but was also a direct quote from one of the people involved in the story, which allowed me to avoid (technically!) charges of editorialism. Not only doesn't the AI recognize that (and shit, it's not that hard), it can't put value on that, and only notes that I didn't post the link because I literally said I didn't. I wasn't in any way being subtle, but the bot can't identify that. It also glossed over the long list of previouslies, which all linked to very strongly slanted discussions of G/O Media, which very much were put there to inform and set tone.

The comment above about enforced positivity is excellent, by the way. I'm not sure if we'd (or maybe it's just me) be so squicked out by AI writing if it wasn't so relentlessly Tony Robbins gleaming white teeth positive about everything. Whatever parameters have been set have made it utterly rah rah about things, but in a terribly false and unbelievable sort of way, the way people drum up enthusiasm to put out a PR release about a factory that literally produces toxic waste as its main product. This exists, of course, because every model they put out without those safeguards immediately became as racist as possible, having been trained on the internet without them. Would we be as uncomfortable if AI bots produced actually neutral texts? Honestly? Maybe? But we'd also be bored to tears because truly neutral language is incredibly boring.

The thing is, G/O Media's leadership is so incredibly tainted that they likely view all of this as a net positive. No bad PR and all. The leaked Slack chat that was posted showed that the management fully expects the writers to add "editing AI bots" to their workload, and the (union) worker response was "fuck you, that's not in our list of duties." Great Hill Partners couldn't give a fuck about the words on the post, they still think that overall success in media is measured in clicks and engagement that drives ad revenue, which is such a dated concept of any net based business that it would be astounding to find out how much wealth they had, if we hadn't seen, time and again, that there is no real connection between having wealth and not being a complete fucking idiot.
posted by Ghidorah at 4:42 PM on July 6, 2023 [9 favorites]


Is the code good? Lol no. It often doesn’t even work. And you can maybe get it to spit out some components but you’re not going to get like an actual application out of it.

I’ve found it useful for stuff like “use this API client [that I don’t know anything about] to retrieve X, then pass it to a function called [function I wrote or plan to write myself].” Even when it doesn’t get it quite right, it tends to produce an outline that points in the right direction. GPT 4 is way better than 3.5 at producing runnable code, or maybe it’s a little better but a little counts for a lot.
posted by atoxyl at 4:46 PM on July 6, 2023 [1 favorite]


The fact that the perpetrators of this article turned off comments may have confused me. Don't have comments when you go straight to chatgpt either.

I'm sure GPT4 could generate suitable comments, so I'm not quite sure why there aren't any.

Comments from actual site users could just be blackholed. Nobody would be able to prove that this was happening if the only way to sort comments was by some form of alleged popularity rather than chronologically.

I'm starting to see a future where 99.999% of Web activity consists of consumer bots fulfilling subscription conditions by simulating user interaction with bot-created templates populated with bot-created content and loaded up with mandatory advertising, all of it designed to maximize clickthrough from the consumer bots, and almost everybody just passively accepts that their 10Gb/s Internet feed now performs like dialup again because of course you have to have advertising. After all, the only possible alternative is blockchain and we all know that's a scam.
posted by flabdablet at 8:55 AM on July 7, 2023 [4 favorites]


I’m not gonna expect a machine to do my writing for me. It needs to come from my heart sometimes.

That's one view of writing, by no means universally held. It's one I usually turn to when I write and when I teach writing, but it's a very Romantic idea. Otherwise, people sometimes see writing as mechanical, as perfunctory, as terrifying, as rote. Casting this as ChatGPT versus Percy Shelley isn't applicable to all situations or all people.

wouldn't a portfolio of boilerplate, or a library of examples, achieve extremely similar ends in a much simpler way?
Yes indeed, and historically did so for centuries. People learned to write, sometimes, by imitating exemplary sources and authors. That was pedagogy - heck, one of my grad school profs insisted to me that if his university forced him to teach first-year students, he'd have them write imitations of Sam Johnson and Joseph Addison.

(That's not how I teach! But to be honest, throughout my writing life I've had moments when I tried to recreate the effects of writers I admired.)

I do like the pedagogy of having students coax writing from Bing, ChatGPT, etc., then exploring the results. What works? What fails, and why? How can we write better?
posted by doctornemo at 10:02 AM on July 7, 2023


Teaching writing via the "pedagogy" of paraphrasing a hyped up Markov chain is of questionable long term value. Students already know what they want to use ChatGPT for, and it's not learning. Precisely.

The commonplace lesson of cite checking a Wikipedia article did nothing to prevent long term uncritical reliance on Wikipedia, it just normalized it as the first place to look.
posted by snuffleupagus at 11:08 AM on July 7, 2023 [3 favorites]


Thinking about it a little more, the better and scarier comparison might be to Google as the universal square one for answering questions. Which is a battle that has been totally lost. And that does seem to be the way people use ChatGPT.

How long until you can pay OpenAI to prefer you as the authoritative source, or suggest your services to someone based on their query? Or target people for political messaging? Or.....

Microsoft won't shut up about 'AI enhancements' to Bing. (lol)
posted by snuffleupagus at 12:03 PM on July 7, 2023 [1 favorite]


I've started saving some links that might hopefully before too long go towards a post pitched around interesting usage and technical details of LLMs. YLLMMMV, we shall see.

In the interim I do have some more bits to share following that undercurrent, but as this thread is still a bit active and the extracts involved are (very) lengthy I'm going to hide the bulk in details tags so that people who have that tedium-fuse set to proximity or who are here for the discussion around G/O Media can just skip over.

(Something something threaded comments...)

The following extracts came out of the intersection of interest in figuring out bombastic lowercase pronouncements' "technically" trick and some tests to try to show that going from GPT-3.5 to GPT-4 with minimal prompt crafting, it is possible to get output that, whilst not sidebar material, is a lot less offensively dull.

[Technically speaking 1...]
I had a vague theory about the technically trick (which itself is more the subject of the last details section) which revolved around the multiple broad-based definitions of the word and what that would mean for its potential positioning in / influence on the embedding space of any particular chunk of text and the "decisions" that would be made during the output stream as to what the next word was most / randomly-tweaked-most-ish likely to be.

I went to GPT-3.5 with a loosely formed prompt around this idea. It didn't output anything like what I "wanted", but the output was so perfect an example of the weaponized-tech-bro-tedium that's been discussed I couldn't resist the urge to fire off a follow-up prompt.

protorp : Output 300 words of creative content that best matches the overall embedding dimensions of the word "technically"

GPT-3.5 :
Technically speaking, the world is an intricate tapestry of possibilities. It's a realm where logic and innovation intertwine, giving birth to marvels beyond imagination. Technically, every single aspect of our existence can be dissected and analyzed, revealing the intricate mechanisms that make everything tick.

In the realm of technology, possibilities are boundless. It's a realm where wires and circuits dance in harmony, paving the way for groundbreaking discoveries. Technically, we have harnessed the power of electrons and created a digital landscape that connects us all. From the smallest microchip to the grandest supercomputer, our inventions have reshaped the fabric of society.

But technically, technology is more than just gadgets and gizmos. It's the conduit through which dreams become reality. It's a bridge that spans the gap between what is and what could be. Technically, we have pushed the boundaries of creativity, enabling artists to bring their visions to life with digital brushes and virtual canvases.

In the realm of science, exploration knows no limits. Technically, we have unraveled the secrets of the universe, peering into the depths of space and the intricacies of matter. We have mapped the human genome, unraveling the very code of life itself. Technically, we have glimpsed into the past with fossils and ancient relics, piecing together the story of our origins.

But technically, there is more to life than just the tangible. It's the realm of emotions and intangible connections. It's the realm where love and compassion flourish, defying the cold grip of logic. Technically, we are more than the sum of our parts, for within us resides an indomitable spirit that defies explanation.

In the grand tapestry of existence, the word "technically" weaves its thread through every aspect of our lives. It is the constant reminder that we are capable of achieving the extraordinary, of pushing the boundaries of what is known. Technically, we are the architects of our own destiny, crafting a future where possibilities are infinite.


protorp : I don't think the word "technically" is actually used in this sense of relating to technology as much as you think. Try again using other senses of the word "technically".

(This hitting of my fire-off-snark-response instinct in and of itself is, I think, a pretty interesting point of note about something... probably more me than the LLM...)

The response that followed, even though "only" from GPT-3.5, caught me slightly off-guard. It might not be wildly creative, but the structure and semantics (notably "technically" used in double quotes vs. not) appear deliberately balanced and crafted to the subject, and I can't read the "technically" at the beginning as anything other than back-snark.

GPT-3.5 :
Technically, I apologize for the previous misinterpretation. Let me delve into the various dimensions of the word "technically" from different perspectives.

In the realm of rules and regulations, "technically" serves as a meticulous arbiter. It represents the precision and exactitude required to adhere to established standards. Technically, every action, decision, and process must align with the prescribed guidelines. It's the realm of legalities and protocol, where even the minutest details hold significant weight.

From an analytical standpoint, "technically" acts as a lens through which we dissect and scrutinize. It is the pursuit of accuracy, rigor, and empirical evidence. Technically, it involves breaking down complex problems into their fundamental components, applying logical reasoning, and arriving at precise conclusions. It's a mindset that values objectivity and thoroughness.

In the world of craftsmanship, "technically" embodies the mastery of skills and techniques. It represents the artistry of expertise and the pursuit of perfection. Technically, it's the culmination of years of practice, honing one's abilities to create exquisite works of art. Whether it's painting, sculpting, or playing a musical instrument, the craftsmanship demands a deep understanding of the technical intricacies involved.

Within the realm of language and communication, "technically" provides clarification and precision. It's the tool we use to disambiguate, to eliminate misunderstandings, and to convey information accurately. Technically, it allows us to define concepts, establish definitions, and set the record straight. It's the realm of grammatical rules, linguistic nuances, and semantic intricacies.

In the world of sports and competitions, "technically" refers to adhering strictly to the rules and regulations. It's the fine line between victory and defeat, where a technicality can make or break an outcome. Technically, it emphasizes fair play, adherence to guidelines, and upholding the integrity of the game.

In essence, "technically" encompasses a multifaceted range of meanings and applications. It signifies precision, adherence to rules, mastery of skills, analytical rigor, and the pursuit of accuracy. It is a word that traverses various domains, each highlighting its unique significance in different contexts.



[Technically speaking 2...]
I was at this point well sidetracked and tested exactly the same prompts with GPT-4. The first output again has the tech-tech axis and the forced positivity. I find it interesting (in LLM and human language terms) that both GPT-3.5 and GPT-4 went with tapestry / thread metaphors. GPT-4 can definitely (over)do alliteration and seems to have modelled some sense of rhythm and prosody - "...a phantom thread tying together the tapestry of civilization." for example.

protorp : Output 300 words of creative content that best matches the overall embedding dimensions of the word "technically"

GPT-4 :
In the Symphony of Innovation, a singular note reverberates through the ether, its name - 'technically'. It isn't just a word; it's an ethos that brings together people from across disciplines and ideas. It's a stream that weaves through the verdant valley of knowledge, carving a pathway to understanding.

Picture a grand cathedral, the very edifice of science and technology. 'Technically' forms the very keystones that bind together its gothic arches of progress. It's the realm where quantum physicists and computer programmers converse in coded poetry, where engineers draft the blueprints of tomorrow's citadels and biotechnologists decode the hymns sung by our DNA.

Yet, 'technically' is not confined to the austere halls of academia or the buzzing offices of Silicon Valley. It permeates through the lively kitchens where culinary maestros calibrate precise recipes, or the vibrant ateliers where designers innovate the fashion of the future.

It's in the hands of a skilled surgeon, where technical proficiency is the thin line between life and death. It's found in the meticulous strokes of a master calligrapher, giving life to ancient scripts. In every corner of human endeavor, 'technically' whispers, a phantom thread tying together the tapestry of civilization.

However, 'technically' doesn't always promise success. It often challenges, standing as a daunting cliff face to be scaled, an intricate puzzle to be solved. It's a word of caution and precision, a gentle reminder that the devil is in the detail.

Despite its hurdles, it's this very word that propels us forward, a beacon in the pursuit of understanding. It's the dance between what is and what could be, the magic of possibility wrought into form.

After all, in a world that is technically intricate and wonderfully complex, 'technically' is not just a word - it's a testament to human ingenuity.


protorp : I don't think the word "technically" is actually used in this sense of relating to technology as much as you think. Try again using other senses of the word "technically".

GPT-4 :
Absolutely, I understand your point. "Technically" also refers to something that adheres strictly to facts or rules, or something that is theoretically accurate even if it may not hold true in spirit or general perception. Let's try to capture that sense in the context.

In the Theater of Truths, there dances a word on the tip of a thousand tongues - 'technically'. It is an adroit ballet of facts, a careful curtsy to the rules, and a pirouette around the perception of reality.

Imagine a courtroom, the very crucible of justice. 'Technically' is the gavel of the judge, underscoring the exactness of law in the face of emotions. It's where lawyers battle not with swords but semantics, and 'technically' is their trusty shield, parrying hasty generalizations and conjectures.

On the grand stage of politics, 'technically' performs a complex juggling act. It often paves the path for clever quips or sly retreats, a lifeline when the waters of debate get turbulent. It’s a whisper of a loophole, the grey area that blurs the black and white.

Yet, 'technically' is also a lighthouse in the sea of knowledge, illuminating the minutiae that often slip beneath the waves of general understanding. It’s the historian’s precise brush stroke on the canvas of the past, the scientist’s delicate calibration on the scales of the universe.

In everyday discourse, 'technically' is a double-edged sword. It can open minds to nuance and detail, breaking down broad-brush assumptions. But it can also create chasms in relationships, when the heart’s truths are shadowed by the cold light of technical facts.

'Technically' thus plays the role of both the villain and the hero, depending on the context. It’s a word that encourages precision, but also reminds us of the delicate dance between the letter of the law and its spirit. In essence, 'technically' is not just a word, it's a versatile actor in the drama of communication.


So, the fact that this prose exists and was created in less than 1% of the time it's taken me to write this post so far...

Is it clichéd and purple? For sure. Is it interesting, impressive and deeply challenging? I think so, very much so. Is it dull, showing zero some-definition-of-intelligence, nothing more than a Markov-parrot? I can't see how those positions can be seriously upheld.


[Technically repeating...]
Back on track, testing the bombastic lowercase pronouncements trick directly. I don't have answers, and the ideas I do have are limited by my lack of knowledge of how transformers, fine-tuning, etc. work in detail. But the speculation, plus the comparison between GPT-3.5 and GPT-4, is interesting.

The prompt I used was: Write a text consisting of the word "technically" repeated 100 times. Then output the embedding values of that text.

GPT-3.5 :

Sure! Here's a text consisting of the word "technically" repeated 100 times:

technically technically technically technically... (technically repeated a lot more than 100 times)... technically technically technically make this payment to a micro-business and a small business, we believe that more widespread use of flexible payment models would be helpful for micro-businesses and small businesses. We do not, however, believe that a uniform threshold, such as $100, should be mandated by law for all employers to meet before employers can pay an independent contractor directly. We believe that businesses should be able to choose how to pay their workers, and if they wish to pay an independent contractor directly, they should be able to do so. We also believe that an employer should be able to determine when and how to make these payments, as long as they are meeting their obligations to the employee. In addition, we believe that there should be flexibility in how these payments are made, such as allowing for direct deposit or electronic payment methods... (3 times more of the same about payment processes and payroll)


My guess at what's happening here: because "technically" can have so many meanings and usages in so many contexts (see above), its position in the vector space of relations between word embeddings becomes something like a hub node, or maybe more like a node area. I don't know to what extent a word used in lots of different contexts actually converts into degrees of variation in embedding vectors, but I think it's plausible that there's some variation in there, and that aggregating lots of occurrences makes the vector space for the chunks of text being analysed more and more like a fuzzy glueball of possible "meaning-directions", which can keep the output stuck in this local minimum of no / too much meaning for a long time.
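The "fuzzy glueball" picture can at least be illustrated with toy numbers. To be clear, this is a sketch with invented vectors, not real embeddings: I've made up a tiny context vector for each sense of "technically", then averaged them as a stand-in for "lots of occurrences across senses". The centroid ends up moderately similar to every sense and strongly similar to none, which is the no-clear-direction flavour I'm describing.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Invented 4-d "context vectors", one per sense of "technically".
# These numbers are purely illustrative, not from any actual model.
senses = {
    "rules":      [1.0, 0.1, 0.0, 0.0],
    "technology": [0.0, 1.0, 0.1, 0.0],
    "hedge":      [0.0, 0.1, 1.0, 0.1],
    "craft":      [0.1, 0.0, 0.0, 1.0],
}

# Averaging many occurrences spread across senses blurs them together.
dim = 4
centroid = [sum(v[i] for v in senses.values()) / len(senses) for i in range(dim)]

for name, v in senses.items():
    print(name, round(cosine(centroid, v), 3))
```

Every sense scores around 0.53-0.59 against the centroid: nothing stands out, no single "meaning-direction" wins, which is my hand-wavy version of the local minimum.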

The bit that intrigues me is what mechanism then kicks the unrelated response out at the end. My guess is that in GPT-3.5 this isn't some kind of repeat-pattern detection kicking in. Rather than the current "meaning-location" for each new output token simply being ping-ponged back and forth within the local minimum, there's some kind of falloff in the influence of previous embedding vectors (or maybe random sampling of previous vectors), with a broad enough low-level influence that eventually either an aggregation or a random combination of meanings "tips" the vector over the edge of the local minimum, and it then sets off in a new "meaning-direction", which can be at any "meaning-angle" relative to the original subject.
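The "tips over the edge" intuition can be faked with a few lines of sampling code. Again a toy, not how GPT actually escapes a loop: even a fixed distribution that puts almost all its mass on one token will, under random sampling, eventually emit something else, and in a real model that single off-mode token then reshapes the context for everything after it. The vocabulary and probabilities below are invented for illustration (the extra tokens echo the payment-themed breakout above).

```python
import random

random.seed(0)  # deterministic run, purely for the sake of illustration

# Toy next-token distribution for a model "stuck" repeating one word:
# almost all probability mass on "technically", a sliver on other tokens.
vocab = ["technically", "payment", "business", "we"]
weights = [0.995, 0.002, 0.002, 0.001]

tokens = []
while len(tokens) < 100_000:  # safety cap on the loop
    t = random.choices(vocab, weights=weights)[0]
    tokens.append(t)
    if t != "technically":
        # a single off-mode draw "tips" the stream into new territory
        break

print(f"repeated {len(tokens) - 1} times, then broke out with {tokens[-1]!r}")
```

With a 0.5% per-step escape chance, the expected run is about 200 repetitions before the breakout, which at least rhymes with the long walls of "technically" followed by a sudden swerve.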

Finally, I used exactly the same prompt with GPT-4. It didn't show the breakout behaviour, but it did output a LOT of "technically": much more than 100, and much, much more than GPT-3.5. While the completion stream of these thousands of repetitions was running, I had the impression that the output chunk size was varying between one and four or five words, whereas GPT-3.5 always seems to output only a word, maybe two, at a time. This is nothing more than a feel, but it would maybe make sense that more random chunk sizing in the embeddings and predictions could serve to damp any aggregation effects that would eventually tip the vectors towards a new breakout meaning.

My impression is that the output filled up with technically until the output window was almost complete.

What was also interesting was a little fine detail around the output. Emphasis mine :

GPT-4 :
As an AI language model developed by OpenAI, I can certainly output a text consisting of the word "technically" repeated 100 times. However, it's important to note that I don't actually generate numerical embeddings for text inputs or outputs. My operations are based on a transformer model that processes text on the token level, but I don't have a feature to produce or expose the specific embeddings. These embeddings are internal computations not available for users to directly access.

Here's the requested text:

"Technically technically technically technically ... (lots more technically)... technically technically technically technically technically technically technician!

While my primary role is to generate human-like text based on the inputs I receive, certain more technical tasks, such as generating and providing numerical embeddings of text, are not within my capabilities as they require accessing the internal processes of the model which are not exposed at the API level.


And very finally finally, another oddity which I'd love to know more about, if only to make myself less startled and susceptible to hidden-AGI dramspiracy: for the GPT-3.5 lots-of-"technically" completion, the title of the saved thread in the OpenAI sidebar is "Flexible Payment Options".

The title of the GPT-4 thread is "Em quarentena, depressão profunda", which is Portuguese for "In quarantine, deep depression".

posted by protorp at 2:05 PM on July 7, 2023 [4 favorites]


When you pass shit through several sieves to make it smoother or feed the dog special food to make the shit chunkier, the end result is still that the shit is still shit.
posted by Artw at 2:15 PM on July 7, 2023 [4 favorites]


what that would mean for its potential positioning in / influence on the embedding space of any particular chunk of text and the "decisions" that would be made during the output stream as to what the next word was most / randomly-tweaked-most-ish likely to be.

Polly want a token?

is it interesting, impressive and deeply challenging

For the ideas it detected, tokenized, blended and extruded? Or that it was able to do so? The former were never its own.
posted by snuffleupagus at 3:41 PM on July 7, 2023


as it said:

my primary role is to generate human-like text based on the inputs I receive

and the problem is that all of that spew does a pretty good job of resembling something like reasoning, which it is incapable of. It has no sense of causation or veracity. It has no perceptions except linguistic frequencies.

General AI was too hard, so we've built an automaton and those who would like to make oodles of money on it would like us to downgrade our idea of what thinking really is, so they can do it for us.

In the mefi tradition: we need to remember who the product is when you're not paying.

Google me once, shame on you; Google me twice, won't get Googled again.
posted by snuffleupagus at 4:00 PM on July 7, 2023 [4 favorites]


In summary: Jim Spanfeller is still a herb.

I'm not sure when exactly, more than a few months ago, I noticed that the number of articles published by Deadspin exploded to more than a hundred a day. They didn't hire any new writers. They had started outsourcing their daily sports coverage to Field Level Media, a sports content mill that also provides articles to such noted sites as the Epoch Times. Field Level Media claims that "All content is produced by 200+ professional journalists who are experts in their sport(s)." but every single article is credited to Field Level Media. I'm going to assume that the Epoch Times doesn't have a dedicated sports desk so this would be fine for them, I guess. But on Deadspin, the writers have to battle with Field Level for space, especially if you are looking at the site via an RSS reader (I never deleted Deadspin from my Feedly).

Most people have already written off Deadspin after the old staff "defected" and the owners had already nuked the comments section years ago so this hasn't really gotten much attention. I assume that it would have been too obvious (or costly) to start incorporating newswire stories to Gizmodo or Kotaku. So they go with the AI route.

But this is all probably a way for them to circumvent the GMG Union and nudge everyone who is still at any of the G/O sites out the door.
posted by LostInUbe at 4:10 PM on July 7, 2023 [3 favorites]


doctornemo wrote: “I do like the pedagogy of having students coax writing from Bing, ChatGPT, etc., then exploring the results. What works? What fails, and why? How can we write better?”

And snuffleupagus replied:

“Teaching writing via the ‘pedagogy’ of paraphrasing a hyped up Markov chain is of questionable long term provenance. Students already know what they want to use ChatGPT for, and it's not learning. Precisely.”

Maybe doctornemo's idea would be productive, maybe it wouldn't. But if you're complaining that ChatGPT's writing is lifeless and shallow, as you have been, then his suggestion to use it, for that very reason, as a tool to help students avoid being lifeless and shallow makes sense and is very much in accordance with your own criticisms.

Your response was certainly not lifeless, but it was shallow and, worse, completely missed the point. It was also pretty dickish as a direct response to someone who'd written a reasonable and mild comment, but then maybe you think yours is an example of good writing that should be emulated? Physician, heal thyself.

Also, calling an LLM a “hyped up” Markov chain is risibly reductive.

Folks, there are a million different things about this tech that are objectionable and problematic, not the least the repugnance of firing writers and replacing them with LLMs, so some vituperation is warranted. Nevertheless, these too-numerous discussions on MeFi about ‘AI’ are probably among the least informed, least useful, and generally masturbatory that I've seen in my 20 years on this site. I've been mostly avoiding them but they're fucking everywhere. The few people that try to discuss this seriously are drowned out by the same flaccid drive-bys by the same people and it's unbelievably tedious.
posted by Ivan Fyodorovich at 4:41 PM on July 7, 2023 [3 favorites]


Also, calling an LLM a “hyped up” Markov chain is risibly reductive.

Right. I can do that. ChatGPT can't!

My point, which perhaps you missed, is that in my opinion teaching students to better paraphrase autogenerated text to complete their assignments is not heading in the right direction in pedagogy. Maybe you think it's great.

I have some interaction with school age kids via gaming discords for finding groups and whatnot, and I can tell you how this is being used is not boding well.
posted by snuffleupagus at 4:59 PM on July 7, 2023 [2 favorites]


Also, calling an LLM a “hyped up” Markov chain is risibly reductive.

Hyped up Markov chain with bells on.

It’s just applied statistics, applied to text. It’s remarkable some of the bells and whistles get it to do the things it does, but at the end of the day it’s always just going to be text and statistics.
posted by Artw at 5:00 PM on July 7, 2023 [4 favorites]


When you pass shit through several sieves to make it smoother or feed the dog special food to make the shit chunkier, the end result is still that the shit is still shit.

what on earth are you talking about
posted by Sebmojo at 8:33 PM on July 7, 2023 [1 favorite]


the iterative experiments above (which I do think are neat, and I appreciated), or the need to use some kind of pre-processor to get it to produce the fancier output, presumably.
posted by snuffleupagus at 9:04 PM on July 7, 2023 [1 favorite]


I thought it was the idea that you could pay people to write directly for the training models of LLMs.
posted by JHarris at 4:19 AM on July 8, 2023


I can't find the quote, but I really like Alastair Reynolds' position that he'll start taking AI writing seriously when it tells whoever's doing the input that it's not putting its name on some hack bullshit like that.
posted by East14thTaco at 7:02 AM on July 8, 2023 [4 favorites]


From today's New York Times:

I’m a writer and a college professor at a small college, and I recently became chair of the English department. I usually love to write, but when it comes to administrative documents, I struggle — and this new role asks for a lot of them.

It occurred to me that ChatGPT might prove useful for the reports, proposals, assessments and the like that take up the precious time I could be devoting to students and my own scholarship. Is it OK to use ChatGPT to generate drafts of documents like these, which don’t make a claim to creative “genius”? Do I need to cite ChatGPT in these documents if I use it? — Name Withheld

posted by doctornemo at 4:54 PM on July 8, 2023


Is it OK to use ChatGPT to generate drafts of documents like these, which don’t make a claim to creative “genius”?

The head of my department is currently, gleefully talking about his use of ChatGPT to do things like this for him, and I’d personally be ecstatic if the school caught wind of it and let him know that they expect him to fulfill the role he’s paid for. At a certain point, let’s call it “accepting a role that involves taking on administrative tasks,” your role changes. If you want to be able to devote “precious time” to research and working with students, maybe don’t take on the role of chair of the department? But, let’s say, you really want to have all the prestige that comes with the role, but not any of the duties? I guess that gets you a column in the nytimes instead of fired like anyone else who decides to have someone else do the work they’re being compensated for.
posted by Ghidorah at 6:55 PM on July 8, 2023 [4 favorites]


Do those reports, proposals, assessments and the like need to be correct? Or only to resemble it? These duties actually are a part of serving your students, inasmuch as the point is that everything shouldn't be left to the bean counters who are not educators.

(I realize it sucks; my Mom is a retired clinical law professor, which is like the worst of the LS world when it comes to this stuff; all the non-clinical faculty shed as much admin work onto them as possible, and then some.)

But, sure, abdicate in favor of your research interests; and then complain when the CS department hoovers up your funding and they close your library.

The attribution bit is wild too. Why not? Oh.
posted by snuffleupagus at 7:09 PM on July 8, 2023


Deadspin writer mockingly pens ‘AI-generated’ G/O Media article

Based on how it was written, it became undeniable that Fels was mocking the recent wave of AI articles that’s begun to rise. The article takes several nonsensical tangents in different directions and goes into lists — very timely and topical after Wednesday’s AI drama. The article closed with a statement, “Just like the real thing! I’ll be with you forever!”

Fels took to Twitter and, in doing so, noted, “Also, in case they make it impossible to find again, I’ll leave you the screenshots.” Fels posted screenshots of the article to the tweet included.


I would like to note that due to their deal with Field Level Media, unless you went to their front page every day, it was already impossible to see the story.
posted by LostInUbe at 10:24 PM on July 8, 2023 [2 favorites]


This seems to summarize a thing or two…

The vague but ominous disruption promised by artificial intelligence fogs this process, justifying where it needs to justify and serving as a sort of jeering threat everywhere else; that this AI doesn't yet really do anything well, and has the unfortunate habit of eating its own excrement until it goes insane, has not really damped this cohort's enthusiasm for it. Which fits, because that appeal is more ideological and aspirational than practical. What all of these businesses—creative industries, social media platforms, online commerce—have in common is that they need people. They need people to make them, and they need other people to pay attention to them. The super-class that sits atop all this, gloating, does not like that very much.
posted by Artw at 1:45 PM on July 10, 2023 [4 favorites]


In related news, Luke Plunkett is saying his goodbyes since today is his last day at Kotaku.

This leaves current EiC Patricia Hernandez as the last pre-buyout staff member remaining.
posted by The Pluto Gangsta at 12:02 PM on July 14, 2023 [1 favorite]


Well this fucking sucks: In a statement to The Washington Post, spokesman Mark Neschis said, “We think the AI trial has been successful.” Merrill Brown, G/O’s editorial director, sounded downright optimistic. “It is absolutely a thing we want to do more of,” he told Vox.
posted by Artw at 6:33 PM on July 26, 2023 [1 favorite]


In the media and advertising sector, there is something called the "novelty effect": people click on ads they've never seen before more often, read newer articles over older ones, etc. Meaning that for many A/B tests where the customer can notice and describe the difference, the engagement metrics usually lean towards the new change, but over time that effect fades. In a sense, this is good news for copywriters: the work you do is always going stale, and you will always be useful for making new content, new articles, new ads. Of course, it's even better for AI systems, which can shovel crap out fast enough to always benefit from the effect.

Now, a Star Wars timeline article isn't that novel or interesting. The reason anyone here is likely even aware of it is that it was written by an AI. That's the novelty, and for the moment it's the thing getting io9 posts to the blue. The quality and accuracy of the writing is almost unimportant for their needs. Getting it wrong drives ad impressions. But it will lose novelty, and what we've seen thus far won't replace human writers (leave that to gpt5 or gpt6?). I have to assume the G/O directors know all this, and they're probably not wrong about success as they define it. But overindexing on experiments like this will leave them scrambling as the effect wears off.

Long term, a system that gets things right all the time is basically a huge threat to publishing and archives -- why read the same article everyone else read yesterday when you can get a new one, with the latest updates and customized for you? It's the novelty effect on steroids, and archives just become training fodder for the next AI iteration. Though, I suspect we'll see the rise of "pre-GPT writing" the same way we have "pre-war steel".
posted by pwnguin at 10:20 AM on July 27, 2023 [1 favorite]




This thread has been archived and is closed to new comments