The core query softness continues without mitigation
April 23, 2024 11:40 AM

Edward Zitron has been reading all of Google's internal emails that have been released as evidence in the DOJ's antitrust case against Google.

Zitron concludes that Google Search died on February 5th, 2019.

On that date, at Google's HQ evil lair, an emergency meeting, aka a “code yellow”, was called by Prabhakar Raghavan, then Google’s Head of Ads. This was followed by a core update to search in March 2019, which resulted in search traffic getting directed back to web sites that had previously been suppressed by Google Search’s “Penguin” update (mefi fallout) from way back in 2012.

Just over a year later, in June 2020, Google's head of search Ben Gomes (previously), known as a key engineer with over 20 years of search development, was replaced by Raghavan. Gomes is on record arguing that search was "getting too involved with ads" (pdf of email), so Zitron tries to answer where Raghavan stands. Prior to working at Google, Raghavan led Yahoo's search division, though not during Yahoo's ascent. During Raghavan's tenure, Yahoo went from around 30% share down to less than half of that, which led to Yahoo switching to using Bing for search.

Zitron concludes it's all in the service of short-term corporate earnings goals, regardless of the impact on users.

Ed Zitron previously on the death of the internet, Elon's many failures, a social media hot take, annotations on crypto and all the way back to a sharp game review of Darkfall.
posted by zenon (141 comments total) 76 users marked this as a favorite
 
It was always conventional wisdom that Google could fuck up as many products and launches as they wanted and they'd still always be fine as long as they didn't fuck up search.

It's been fascinating to watch them fuck up search. Both the speed and breadth of the fuckup are astounding. Search went from "eh, this seems less good than it once was" to "barely useful" in the course of a few years. My primary use case for Google search now is indulging my muscle memory.

So many big players are shooting themselves in the foot now that we may be witnessing the de-ossification of the web. It's been a couple of decades since we had a huge amount of tech churn; it's going to be interesting to see what's next.
posted by phooky at 11:58 AM on April 23 [53 favorites]


The vacuum it leaves is going to be hard to fill.
There’s Bing, but it’s a) Bing and b) likely to disappear into AI and other bullshit while chasing money even worse than Google will.

There’s DuckDuckGo, but that’s just dressed up Bing.

So the golden period of the entirety of the web being easily searchable is just going to disappear and who knows what comes after.
posted by Artw at 12:04 PM on April 23 [26 favorites]


As all the search engines fail into enshittification and needless AI wackiness, Yahoo! sees their opportunity and blows the dust off their old web directory that's been sitting in a closet forgotten.
posted by Clever User Name at 12:09 PM on April 23 [70 favorites]


To be clear: by "growth" they mean "profit growth, or a proxy for profit, growth".

Engagement growth? Good, because the idea is that each engagement can be turned into a certain amount of $.

Having numbers you can sell as a growth story to Wall Street means your stock ticks up. Executive compensation is highly leveraged options, so upticks in stock price determine how the executives are paid.

So if you have faction A that is offering a way to generate an uptick in something you can sell to Wall Street, and faction B that is offering a way to generate longer term value but won't sell well to Wall Street, guess who you listen to?

As a hint, the other one is head of education at Google.
posted by NotAYakk at 12:10 PM on April 23 [6 favorites]


For non-Google/Bing/Yandex'd search results, search.marginalia.nu is pretty good
posted by slater at 12:10 PM on April 23 [13 favorites]


Wow. I thought perhaps it was the structure of the internet somehow that made search so much worse. oh google we hardly knew you.
posted by bluesky43 at 12:18 PM on April 23 [1 favorite]


If you prefer to hear rants, with guests, then Zitron's podcast Better Offline is worth checking out. It's an iHeart podcast, so expect terrible ads (and many of them), but it's also Cool Zone Media, so the guests are usually good.
posted by GenjiandProust at 12:23 PM on April 23 [9 favorites]


<Moe Szyslak voice> You know what I blame this on the shutdown of? Reader. </Moe Szyslak voice>
posted by Kattullus at 12:27 PM on April 23 [52 favorites]


“Computer Scientist Class Traitor” is a dumb descriptor that assumes a solidarity that doesn’t exist. I am always excited to read Ed Zitron be angry about things, but he does let his rhetoric get away from him. Then again, that’s part of the fun.
posted by Going To Maine at 12:36 PM on April 23 [11 favorites]


This is interesting but I have a problem with the framing of heroic software engineers against evil management/consultants. Not that management isn't evil but there's nothing inherently ethically or morally pure about software development. It's some old school Silicon valley/Microserfs philosophy.
posted by muddgirl at 12:36 PM on April 23 [47 favorites]


I feel that, between clickbait factories and AI, search as we know it is already pretty useless. Yeah, let Yahoo bring back their web directory and then people can search that instead of all the deceptive crap that's out there.

What's to stop Google or someone else from hooking up search directly to some LLM so that everything you click after a search is auto generated content made to look authentic and serve you ads?
posted by any portmanteau in a storm at 12:38 PM on April 23 [11 favorites]


I remember altavista
infoseek
askjeeves
hotbot
lycos
yahoo
and even archie back in the day.....
search engines are always falling by the wayside as something better comes along. Let's just hope that google hasn't managed to entrench itself to such a degree that it's made it impossible to be replaced. It seems to me that google has become, not useless, but such a pain in the arse to use, that it's exceedingly ripe for being shunted aside in favour of a newer, leaner, hungrier competitor. Something that returns accurate results in an easy to read format without page after page of ads and sponsored links and ten other kinds of cruft.
Surely there must be bucketloads of VC money that think this as well.
posted by conifer at 12:38 PM on April 23 [19 favorites]


What's to stop Google or someone else from hooking up search directly to some LLM so that everything you click after a search is auto generated content made to look authentic and serve you ads?

Microsoft Bing basically does this now, at least on my work computer which has a corporate Microsoft account.
posted by muddgirl at 12:42 PM on April 23 [31 favorites]


What muddgirl just said. MS Bing at work is completely useless and even actively harmful. If I believed some of the cruft it recently returned on a simple query about Oregon statutes, I'd be out of a job by now and deservedly so.
posted by mygothlaundry at 12:45 PM on April 23 [36 favorites]


The vacuum it leaves is going to be hard to fill.

Time until someone mentions Kagi: three... two... one... wait, damn, I was the one who mentioned it!

I think what might possibly help is self-hosted search? People have been doing web search for decades now, and helpful packages like BeautifulSoup exist. Imagine having a spare desktop machine spidering the web based off of a list of links you provided while you sleep, so when you want to find information you'd have your own customized database to search through? Start with the list of all Metafilter posts, maybe.
posted by JHarris at 12:52 PM on April 23 [19 favorites]
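
[Editor's sketch] For anyone curious what the self-hosted idea in the comment above might look like at its very smallest: fetch a seed list of links, strip each page down to its visible text, and build a toy inverted index to query later. This is only an illustration under stated assumptions. The names `TextExtractor`, `build_index`, and `search` are invented here, the `fetch` callable is injected so you can plug in whatever downloader you like, and it uses Python's standard-library `html.parser` instead of BeautifulSoup to stay dependency-free. It does not follow links, rank results, or respect robots.txt, all of which a real spider would need.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the visible text of one HTML page, skipping script/style."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0  # depth inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(seed_urls, fetch):
    """Map each word to the set of URLs whose visible text contains it.

    `fetch` is a hypothetical callable taking a URL and returning HTML.
    """
    index = defaultdict(set)
    for url in seed_urls:
        parser = TextExtractor()
        parser.feed(fetch(url))
        for word in tokenize(" ".join(parser.chunks)):
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results
```

With a real downloader passed in as `fetch` and a list of saved links as `seed_urls`, `search(index, "web directories")` returns only the pages containing both words. The hard part the thread keeps circling, filtering out the crap, is exactly what this leaves out.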


For non-Google/Bing/Yandex'd search results, search.marginalia.nu is pretty good

In the top 10 results for me was a page from white supremacist and Holocaust denier Ron Unz's blog - so their "[attempt] to show you sites you perhaps weren't aware of in favor of the sort of sites you probably already knew existed" may have some ulterior motives attached.
posted by ryanshepard at 1:00 PM on April 23 [31 favorites]


Not that management isn't evil but there's nothing inherently ethically or morally pure about software development. It's some old school Silicon valley/Microserfs philosophy.

On the other hand, most programmers want to, at the least, produce something that works, rather than something that will work at least until they jump ship to their next golden perch... So maybe not heroes, but not the worst thing in the room.

Imagine having a spare desktop machine spidering the web based off of a list of links you provided while you sleep

Given my usual dreams, this is a terrifying suggestion.
posted by GenjiandProust at 1:06 PM on April 23 [20 favorites]


I think I just discovered the plot of Hypnospace Outlaw 2!
posted by JHarris at 1:07 PM on April 23 [7 favorites]


Time until someone mentions Kagi: three... two... one... wait, damn, I was the one who mentioned it!


Yeah, Kagi! I heard about it through Cory Doctorow and decided to give it a try, and am now using it exclusively. 10 bucks a month but I get just what I need now instead of a bunch of crap. If it too deteriorates, I guess I’ll find something else then but for now this is very good.
posted by cybrcamper at 1:17 PM on April 23 [2 favorites]


Imagine having a spare desktop machine spidering the web based off of a list of links you provided while you sleep

I’m pretty sure that’s what Neo’s computer was doing in The Matrix…
posted by cybrcamper at 1:19 PM on April 23 [4 favorites]


Everyone having an index of the entire web sounds a bit impractical, TBH.
posted by Artw at 1:22 PM on April 23 [5 favorites]


Hilarious how many people running the biggest things in the world are bad at it, don't belong there, and seemingly cannot be removed or replaced, they become load-bearing failure people, or rather, have a fucked up economic system of chains of antihuman but correct capitalistic behaviours that by design enable such failsons to thrive as another part of efficiently turning resources into useless wealth accumulations.
posted by GoblinHoney at 1:39 PM on April 23 [23 favorites]


What's to stop Google or someone else from hooking up search directly to some LLM so that everything you click after a search is auto generated content made to look authentic and serve you ads?

If enough of the actual content out there is LLM generated, why would we even notice?
posted by It's Never Lurgi at 1:44 PM on April 23 [4 favorites]


In the top 10 results for me was a page from white supremacist and Holocaust denier

well, ... fuck. :(
posted by slater at 1:47 PM on April 23 [3 favorites]


How much does Google earn per person from showing ads? Could they sell an unfucked ad-free version of search results?
posted by pracowity at 1:56 PM on April 23 [2 favorites]


Re: Kagi, this Mastodon post raised questions for me about their CEO.
posted by audi alteram partem at 2:03 PM on April 23 [22 favorites]


Long before @lori@hackers.town so eloquently described some of the issues that I'd run into with Kagi (that audi alteram partem linked to), I bounced off it because it was the same language model semantic vector searching that was killing Google long before 2019. Things where I'd search for very MacOS specific terms, and get the similar concepts from other widget sets (Windows, Qt, etc) because the concept matching was too smart for its own good.

I keep trying Stract, the up-thread mentioned Marginalia, and, heck, Lycos is still around.

But I fear that the real issue is that LLM spew is gonna overwhelm the web with so much trash that we won't be able to find each other anymore.

I keep threatening to hack up a little thing that grabs feeds from various blogs and tosses them into something simple to search, maybe heading out a link or two, to try to get back to a human vetted web, but I haven't gotten the tuits for that. On a modern gigabit home connection and a couple of terabytes of local storage, many of the aspects of search that were amazing two and a half decades ago are pretty much solved problems for home users; the big issue is just filtering the crap.
posted by straw at 2:16 PM on April 23 [10 favorites]


Just idly thinking aloud but I wonder if there’s room for a sort of federated search? I don’t know anything about Mastodon beyond the technical description (“Ryvar when has not knowing anything stopped anyone, including you?”), but it seems like a collaborative graph structure of social nodes containing lots of links could be a useful resource for gauging interest and utility?

Beyond that, the best tool for dealing with LLMs might just be another neural network (“when all you have is a ham-” “Shut up, okay? I get it.”). Not for the purpose of detecting LLMs - that is flatly, fundamentally impossible - but in flagging infodump/tabulated compendiums of expert knowledge on specific topics. Basically a zero-shot classifier for “this is that one spreadsheet on reddit used by everyone who plays the game even halfway seriously.”

The whole reason everyone appends “reddit” to google searches these days is because they’re just looking for that one spreadsheet, or its contextual equivalent. And because LLMs have a tendency to parrot humans put on the spot and our smokescreens of utter bullshit, even if you can’t distinguish human- vs LLM-authored listicles and “content,” you might be able to positively identify “trove of useful information.”

“You talk about spidering a federated social network but you said federated sear-” “I also said shut up.”

Because I missed 4/20, the following is only the purest of bonghits: what if, in order to maximize coverage, many people were running their own classifier-based crawlers for topics of interest to them? What if you could subscribe to collections of crawlers, very similar to how adblockers work? Neat idea, but ripe for exploitation when someone who runs a mainstay repository needs to make rent or discovers heroin. So what if there was a feedback channel for satisfaction with the results, and various crawlers rose and fell within the topic vectorspace of the larger meta-federated search? That feedback gives us the HF for an RLHF supervisory neural network with a crowdsourced scoring function. Only we’ll need to weight scores based on the user’s position within the topic vectorspace: ie, how seriously your feedback is taken depends on how close you are to the sort of search queries typical of a domain expert.

It’s similar to ad profiling, only now we’re customizing results based on your historical position within the collective vectorspace of our federation of search neural networks, and simultaneously using it to further train the assembly of networks, relative to your position within them.

{satisfied bubbling noises} Anyways, those are my bonghits.
posted by Ryvar at 2:35 PM on April 23 [28 favorites]
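
[Editor's sketch] The weighting step in the comment above (how seriously your feedback is taken depends on how close you sit to the topic) can be written down in a few lines. This is a toy under loud assumptions: users and topics are plain lists of floats in some shared embedding space, ratings run from 0 to 1, and the names `cosine_similarity` and `weighted_crawler_score` are invented for illustration. It is a similarity-weighted average, not RLHF and not anything the commenter specified.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def weighted_crawler_score(feedback, topic_vector):
    """Average the ratings for one crawler, weighting each user by topic closeness.

    `feedback` is a list of (user_vector, rating) pairs with rating in [0, 1].
    Users pointing away from the topic (negative similarity) get zero weight.
    Returns None when no feedback carries any weight.
    """
    total_weight = 0.0
    weighted_sum = 0.0
    for user_vector, rating in feedback:
        weight = max(0.0, cosine_similarity(user_vector, topic_vector))
        total_weight += weight
        weighted_sum += weight * rating
    if total_weight == 0:
        return None
    return weighted_sum / total_weight
```

A user whose search history lines up with the topic moves the score; a drive-by rating from someone far away in the vectorspace barely registers. Where the user vectors come from, and how to stop people gaming their position, is the unsolved part.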


I feel like with so much content being rehashed the vital bit of data that I need-- that's being lost as the moments tick by-- is that of priority. If there are fifteen LLM rehashes of the same topic, they're probably all cribbing from the one that appeared first, and that's almost always the result I need.

You can kinda, sorta get away with using the "before:yyyy/mm/dd" to hack this on Google, but I don't know for how long.
posted by phooky at 2:36 PM on April 23 [11 favorites]
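
[Editor's sketch] The "priority" idea in the comment above, keeping the version that appeared first and dropping the rehashes, is roughly near-duplicate detection plus a sort by date. A minimal illustration, assuming you somehow have a trustworthy first-seen date for each page (in practice that is the hard part). The function names, the three-word shingles, and the 0.5 threshold are all arbitrary choices made for this example.

```python
import re
from datetime import date


def shingles(text, size=3):
    """Set of overlapping word n-grams used as a crude fingerprint of the text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Overlap between two shingle sets, from 0.0 (disjoint) to 1.0 (identical)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def earliest_originals(docs, threshold=0.5):
    """Keep only documents that are not near-duplicates of an earlier one.

    `docs` is a list of (url, first_seen_date, text) tuples. Documents are
    visited oldest first, so a later rehash is dropped in favor of whichever
    similar page appeared before it.
    """
    originals = []  # (url, shingle_set) for every document kept so far
    for url, _, text in sorted(docs, key=lambda d: d[1]):
        fingerprint = shingles(text)
        if not any(jaccard(fingerprint, seen) >= threshold for _, seen in originals):
            originals.append((url, fingerprint))
    return [url for url, _ in originals]
```

Comparing every page against every kept page is quadratic, so this only works at hobby scale, and a rehash that paraphrases heavily will slip under the threshold.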


... search engines are always falling by the wayside as something better comes along.
True (and I remember all those too), but the problem we have right now is that there's nothing better coming along.

Google has grown so big and is now so entrenched in our world that it's hard to see any investor/s (let's be real - there's no hope of starting the next Google in your garage) being willing to dump billions into a product aimed at unseating something so deeply embedded in the way we use the Internet that we no longer search for something, we 'Google it'.

Hilarious how many people running the biggest things in the world are bad at it, don't belong there, and seemingly cannot be removed or replaced...
Yeah, 'hilarious'. It's not just big things that have this issue, though. Everywhere I look, particularly governments, I see people that are content-free 'managing' organisations into the ground and nobody gives a fuck.
posted by dg at 2:36 PM on April 23 [7 favorites]


Online advertising (web pages, search results, in-app ads) is still an order of magnitude crappier/worse-targeted/offensive than elsewhere (radio/TV/print/streaming/etc). So there's still TONS of room for improvement with online advertising. People might engage more with better ads for appropriate categories? Google could maybe fix that?

Like pracowity I've been wondering whether they could sell an unfucked ad-free version of search results? Cos if the search results were ad-free, not commercially skewed, and resistant to the more bogus SEO tactics... I would consider paying too. I've thought about Kagi, but reports from users have been mixed.

Every user doing their own web-spidering seems impractical, but interest groups setting up spidering for their particular topic would make sense. Searching a topic on reddit is sort of a half-step towards this. Is any company offering user-directed custom spidering and serving up the resulting index?

(on review: seems I might have had a hit from Ryvar's bong)
posted by Artful Codger at 2:37 PM on April 23 [4 favorites]


I bounced off kagi as well, not different enough, too into AI and still serving up too much seo'd bullshit.

Marginalia is the most interesting engine I'm aware of. It's regularly useless, but when it is that's immediately clear and it doesn't send you off on a LLM generated goose chase. When it's good it's great.
posted by deadwax at 2:38 PM on April 23 [4 favorites]


I know Ben Gomes from my time at Google. The events Zitron relates are many years after I left. But I can say that Ben Gomes seemed like a good engineer and leader, just the kind of guy you want in charge of search quality. Honestly surprised he still worked there, it certainly wasn't for the paycheck.

Back in the early 2000s we regularly talked about how our ideal experience was to get users the information they were looking for as quickly as possible, to get them on and off google dot com itself as fast as possible. We all understood that'd be good for the business long term. I suspect now that Google has monopoly market share in search and an enormous advantage in ads, that ethos has evaporated.

I've switched to Kagi since Google search has gotten so bad. Kagi works well for me. I also am cautiously optimistic about Bing; this is certainly their time to shine in the US market. I like all the AI experimentation they're doing and think they may see themselves through to something interesting and new.
posted by Nelson at 2:39 PM on April 23 [19 favorites]


Artful Codger: great bonghits also think alike, apparently.
posted by Ryvar at 2:39 PM on April 23 [1 favorite]


... and

metafilter: {satisfied bubbling noises}
posted by Artful Codger at 2:43 PM on April 23 [15 favorites]


based off of a list of links you provided while you sleep

Any links I provide while asleep are guaranteed to be very odd.
posted by Greg_Ace at 2:53 PM on April 23 [2 favorites]


Could they sell an unfucked ad-free version of search results?
Why would they, though? Streaming services have already figured out that most people will just put up with ads instead of paying even more for an ad-free (or, ad-light...er) experience. That ship has sailed. Even though for a lot of people, the whole point of moving to streaming was to escape ads.
posted by xedrik at 2:59 PM on April 23 [2 favorites]


For me, for the most part, searches fall into one of two categories: "I can't remember the specific URL for this entity that I know has a web presence" (example: the Colorado Public Utilities Commission, because I want to listen to a hearing that I know will be streamed online) or "I need a very specific piece of information related to (or prompted by) a conversation I'm in the middle of" (example: Who was the lead male actor in the 1990 film Flatliners?). For both of those use cases, which are probably 95% of my searches, search is not "broken" - I'm able to find what I need pretty quickly, and I can't remember the last time I needed to go past the first page of results.

That said, I also miss the days when it was possible to actually find a Googlewhack. And I also make a point of not clicking on the ad for the thing I searched for, even when it's exactly what I want, because why would you reward that kind of behavior. I scroll down an entry or two to the thing that's actually a result.
posted by nickmark at 2:59 PM on April 23 [18 favorites]


so their "[attempt] to show you sites you perhaps weren't aware of in favor of the sort of sites you probably already knew existed" may have some ulterior motives attached

Not a tremendous surprise that a (solo-developed?) site designed to boost sites that mainstream search engines would downrank would end up revealing that some things are downranked for a reason, is it? I don’t think an ulterior motive is required to explain this.

When I search for the Holocaust the second result I get is not a denier site but rather a 9/11 conspiracy site, lamenting the association of the 9/11 truth movement with Holocaust deniers. I think that’s a kind of thing you’re gonna get with a naive implementation of:

The search engine uses various heuristics to prefer text-heavy websites, and puts an upper limit on heavy modern web design inventions.
posted by atoxyl at 3:21 PM on April 23 [5 favorites]


For non-Google/Bing/Yandex'd search results, search.marginalia.nu is pretty good

In the top 10 results for me was a page from white supremacist and Holocaust denier Ron Unz's blog - so their "[attempt] to show you sites you perhaps weren't aware of in favor of the sort of sites you probably already knew existed" may have some ulterior motives attached.


same here, depressingly. Searched for a particular university, got a far-right conspiracy wiki about something unrelated in the top 5 results, yikes.
posted by bigendian at 3:29 PM on April 23 [4 favorites]


For me, for the most part, searches fall into one of two categories: "I can't remember the specific URL for this entity that I know has a web presence"

I know so many people who have Google set as their home page, and use the search field as the browser url field. That is to say, if they want to go to Amazon, they type amazon.com into Google’s search field, and then click on the Amazon link Google returns. And these are 30-40-somethings. Y’know, “digital natives.”

My mind sobs each time I witness this.
posted by Thorzdad at 3:35 PM on April 23 [17 favorites]


My mind sobs each time I witness this.

Once upon a time, Facebook OAuth logins were new tech, and the poor journalist blogging about them got 1. amazing amounts of traffic considering, and 2. massive amounts of angry comments demanding to know why they changed the Facebook UI so radically. The conclusion was that a huge number of people search for "Facebook login" then just click, instead of like, bookmarking it.
posted by pwnguin at 3:46 PM on April 23 [16 favorites]


> Wow. I thought perhaps it was the structure of the internet somehow that made search so much worse.

it's the other way around. search makes the structure of the Internet worse.
posted by bombastic lowercase pronouncements at 4:20 PM on April 23 [15 favorites]


Reading this gives me a strong feeling that I should divest as much of my personal data as possible from Google. It's something I haven't thought about since I experienced my first Google Funeral which was Buzz, where the group of friends I made in the comments sections of non-literature posts on Sparknotes moved to for very off-topic chats and threads.

If it can happen to Search...
posted by shenkerism at 4:31 PM on April 23 [5 favorites]


For both of those use cases, which are probably 95% of my searches

That kind of search is well under half of my use of search, likely under 25%. Yep, that kind of trivial thing is still well served, no matter what engine you use. The kind of thing I'm looking for when I'm designing is really, really broken and increasingly so. That tends to be material information, how other people have tackled similar problems, interpretation of technical information and so on. Technical info filtered through humans, basically. Increasingly the human bit of that is missing in search results.
posted by deadwax at 4:43 PM on April 23 [8 favorites]


Yeah, marginalia.nu is turning up weird right-wing conspiracy shit for me as well, for just plain searches for things that are a bit prone to that. Searching for "jews" turns up a link to Khazar conspiracy theory. Another search for something fairly innocuous turned up links to Lyndon LaRouche stuff. Granted, I'm trying to search for stuff where I think conspiracy theories and far-right shit might turn up, but so far, I'm succeeding in making it serve that to me.
posted by Joakim Ziegler at 4:49 PM on April 23 [4 favorites]


Wow, what starts as a post-mortem transitions to full-blown character assassination, based on a handful of factoids. Props to the author for their commitment?

I have professional reasons to use Bing and in the early years I would switch to Google when Bing’s results were lacking. I don’t remember the last time I had to do this but I’ll venture it was a few years pre-pandemic. The only Google feature I still use religiously is Scholar.
posted by simra at 4:53 PM on April 23 [2 favorites]


Imagine having a spare desktop machine spidering the web based off of a list of links you provided while you sleep.

There was a piece of software that did this maybe a decade ago. Can't recall the name. I appreciated it, but it was a little too esoteric for the average user, and it eventually disappeared. I've been thinking about it recently though, because it's not just search that's devolved, content has too. A search that would once get you an authoritative source now gets you a ton of junk pages that match your query because they're SEOed up the wazoo.
posted by CheeseDigestsAll at 4:55 PM on April 23 [1 favorite]


i had a few beers with ed zitron (unless there are two?) once, because he was friends with someone in the student residence i lived in, and though this would be nearly 20 years ago and he surely would not remember me, he seemed like a cool guy and i was pleased to note on the hell site that he is some sort of amusing pundit of the intertubes (unless there are two eds lemon). good luck verifying any of that story with google tho.
posted by busted_crayons at 4:56 PM on April 23 [2 favorites]


The whole reason everyone appends “reddit” to google searches these days is because they’re just looking for that one spreadsheet, or its contextual equivalent. And because LLMs have a tendency to parrot humans put on the spot and our smokescreens of utter bullshit, even if you can’t distinguish human- vs LLM-authored listicles and “content,” you might be able to positively identify “trove of useful information.”

This is still a race humans can't win. AI and bots can poison the web for search faster than anybody can clean it up. Laws and regulations are the only answers.
posted by mightygodking at 4:59 PM on April 23 [12 favorites]


I think this was it. I must have just stopped updating versions or something.
https://stclairsoft.com/HistoryHound/index.html
posted by CheeseDigestsAll at 5:00 PM on April 23 [4 favorites]


I know so many people who have Google set as their home page, and use the search field as the browser url field. That is to say, if they want to go to Amazon, they type amazon.com into Google’s search field, and then click on the Amazon link Google returns. And these are 30-40-somethings. Y’know, “digital natives.”

Unless things have changed in the last 5 years, replace Google with "yahoo" (yes even now), and that is literally the entire nation of Japan, right down to it being taught culture. That's how a lot of people "surf" to anywhere.
posted by cendawanita at 5:10 PM on April 23 [6 favorites]


(I can confirm that I've seen metro ads that'll basically include the Yahoo search bar graphic with the name of the business they'll want you to get to, that gets a larger real estate than the actual url)
posted by cendawanita at 5:12 PM on April 23 [4 favorites]


I've been using Kagi for about a year and find it quite good for the types of things I — as a software developer — usually search for. I don't use the AI features. One particularly useful tool to me is the ability to pre-filter out certain domain names from my results. No Pinterest, etc.

It should annoy me, probably, but I'm frequently delighted when it doesn't find any results due to a misspelling. Just like the good old days!

Whenever I am actively looking for something commercial — a product or a local business — Google is generally better, but it's just a !g away.

I will say that Orion, Kagi's iOS browser app, is terrible. Very buggy. Unfortunately you can't select Kagi as a phone-wide search engine.

I am aware of the controversy around Kagi and its CEO, but honestly, there's no way to win with search. My options are to remain the product, use Bing-by-proxy, or pay a pittance to a slightly (by today's standard) problematic business that improves my search experience, doesn't show me ads, and allows me a semblance of control over what I see.
posted by flippant at 5:31 PM on April 23 [7 favorites]


>Unless things have changed in the last 5 years, replace Google with "yahoo" (yes even now), and that is literally the entire nation of Japan, right down to it being taught culture. That's how a lot of people "surf" to anywhere.

when the Internet becomes a series of sites connected to one central site to which you return each time you want to go from one site to another this is so bad everyone. when browsing behavior is all routed through a centralized chokepoint whoever owns that chokepoint owns the Internet and can do whatever stupid thing they want to it.

n.b. the argument doesn't change if there's five or six chokepoints. particularly given that the owners of all of the extant chokepoints barely even bother to hide that they're a cartel
posted by bombastic lowercase pronouncements at 5:34 PM on April 23 [11 favorites]


they type amazon.com into Google’s search field, and then click on the Amazon link Google returns

There's absolutely nothing wrong with people doing this. It's very very common, it's called "navigational queries" because the intent is simple navigation, not research. There's a small shortcut those folks could learn; just typing "amazon" will probably work, or maybe even "amaz". Google works very well at navigational queries and has highly optimized the path so that search results page loads very quickly and easily sends you to where you're going. The "I'm feeling lucky" button is even still there just for this kind of thing.

It's a perfectly reasonable strategy now that every browser uses the same input box for URLs and search terms. (Remember how those used to be separate?!) It's how I navigate primarily, although these days Chrome's browser history will complete the URL for me before it ever gets to the search engine website.
posted by Nelson at 5:49 PM on April 23 [15 favorites]


I've been continuing to use the metasearch engine SearXNG with good results.
posted by ursus_comiter at 5:57 PM on April 23


Nelson: There is something wrong with navigational queries, though! The problem is that Google will happily take malware companies' money to appear at the top of search results! Not for Amazon, sure—they know they don't want that hassle—but for smaller sites? They sure as hell will.
posted by adrienneleigh at 5:59 PM on April 23 [15 favorites]


This is interesting but I have a problem with the framing of heroic software engineers against evil management/consultants. Not that management isn't evil but there's nothing inherently ethically or morally pure about software development. It's some old school Silicon valley/Microserfs philosophy.
That's true, but I think engineers typically have a dispositional, if not professional, inclination to take pride in their work and the things they build. Nobody (well, almost nobody) wants their job to be building bad and useless things. Management's job, though, is explicitly business priorities.

The ad-fueled internet model worked for a time but I think it's dying now. Too much noise generated and too many fake views have turned it into a Red Queen's Race.
posted by ndr at 6:26 PM on April 23 [7 favorites]


Granted, I'm trying to search for stuff where I think conspiracy theories and far-right shit might turn up, but so far, I'm succeeding in making it serve that to me.

Turns out search actually keeps the internet from being a bunch of handmade conspiracy sites!
posted by atoxyl at 6:34 PM on April 23 [1 favorite]


It seems happy enough to give me niche left-wing sites, too, if I ask for them, and in fact it loves RationalWiki for some reason. It’s not positioned as a boutique modern search engine like Kagi so much as an anti-search engine so I’m not really sure what people expected it to be turning up.
posted by atoxyl at 6:43 PM on April 23 [2 favorites]


It's the age-old struggle of Evil versus Neutrality.
posted by biogeo at 6:45 PM on April 23 [9 favorites]


Kagi looks pretty ok, but, paying for a search engine seems weird, but fine, 5 bucks a month only gets me 300 searches, though? I use that in like, two days. 10 bucks a month for unlimited searches seems excessive. I'd maybe pay 5. And they're all in on AI shit, it seems like.
posted by Joakim Ziegler at 6:57 PM on April 23 [7 favorites]


it's the other way around. search makes the structure of the Internet worse.

The internet’s post-mortem will note cause-of-death as SEO.
posted by Thorzdad at 7:07 PM on April 23 [4 favorites]


The author is angry and has a simplified story to tell, but, still, it's good to have at least a few details of the how and who of search sucking, instead of just vague hand-wavy stuff about market forces.

This was the funniest part of the story for me, as I appreciate irony:
It’s very, very difficult to find much on Raghavan’s history — it took me hours of digging through Google results to find the three or four articles that went into any depth about him...
posted by clawsoon at 7:14 PM on April 23 [9 favorites]


> The internet’s post-mortem will note cause-of-death as SEO.

yes but the search engine concept is itself Internet-killingly bad even without the optimization part
posted by bombastic lowercase pronouncements at 7:23 PM on April 23 [2 favorites]


This was the funniest part of the story for me, as I appreciate irony

Not really, merely the kind of thing that’s going to get less and less possible then go away.
posted by Artw at 7:30 PM on April 23


From the article:
Despite his history as a true computer scientist with actual academic credentials, Raghavan chose to bulldoze actual workers and replace them with toadies that would make Google more profitable and less useful to the world at large.
It wasn't until this sentence that it clicked for me that this Raghavan is that Raghavan. The one from randomized rounding, the one from Motwani and Raghavan.

For non-Computer Science people, Raghavan was (is?) a fairly significant figure in the field, particularly in randomized algorithms where he is responsible for some seminal results as well as a co-author of one of the standard textbooks on the subject. Finding out that this is what he's been up to has been a tad disappointing.
posted by mhum at 7:32 PM on April 23 [19 favorites]


I think what might possibly help is self-hosted search? People have been doing web search for decades now, and helpful packages like BeautifulSoup exist. Imagine having a spare desktop machine spidering the web based off of a list of links you provided while you sleep, so when you want to find information you'd have your own customized database to search through? Start with the list of all Metafilter posts, maybe.
So, uhhhhh.

I sort of did (and do) this due to a combination of personal obsessiveness and work-related projects. Services like Omnivore, that serve as combination "read-this-page-later/archive-it-in-case-it-goes-away" tools, are great for a lot of use cases. Go the next step and start spidering/exploring every site that you happen to find an interesting page on, and you're quickly talking about hundreds of gigs of data without even trying. Extend that a few steps out to the things they link to, and the storage capacity of most people's desktop computers starts topping out.*

Taking a close, careful look at some focused subset of the internet can definitely be done with the kinds of resources you describe, and it is super fun, but discovering the stuff you didn't know was out there is... really, really difficult.

*(FWIW: I have a couple machines in my basement mostly-dedicated to it now, with (collectively) a hundred gigs of RAM and change, 20TB of "slow" storage with three rolling backups, and 4TB of SSDs for actual indexes and search and ad hoc data experiments. For some stuff that's overkill and I could probably get away with just slapping a raspberry pi in their place; for other stuff, particularly the indexing and relevance calculation stuff that makes "google-ish" queries effective, my setup is pitifully underpowered.)
posted by verb at 8:54 PM on April 23 [12 favorites]


Artw: Everyone having an index of the entire web sounds a bit impractical, TBH.

Assuming this is replying to my suggestion about self-hosted search, that is generally true, but it needn't be the whole web. If you're seeding it from a list of pages of interest, you might be able to find a useful subset of the web, especially if some basic heuristic is applied to remove junksites. Just because Google feels like it needs to index ten thousand SEO sites doesn't mean we have to.
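A minimal sketch of that kind of seeded spider, assuming Python's standard library only. The `crawl` function, the `fetch` callable, and the `blocked_hosts` junk list are hypothetical names for illustration; the "basic heuristic" here is just a hand-maintained host blocklist, which is the simplest possible stand-in.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkParser(HTMLParser):
    """Collects href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seeds, fetch, blocked_hosts=frozenset(), max_pages=100):
    """Breadth-first crawl outward from the seed list; returns {url: html}.

    fetch(url) is supplied by the caller (hypothetical) and returns the page
    HTML as a string, or None on failure. blocked_hosts is the junk-site list.
    """
    queue = deque(seeds)
    seen = set(seeds)
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        if urlparse(url).netloc in blocked_hosts:
            continue  # never fetch or store junk hosts
        html = fetch(url)
        if html is None:
            continue
        pages[url] = html
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            link = urljoin(url, href)  # resolve relative links
            if link.startswith("http") and link not in seen:
                seen.add(link)
                queue.append(link)
    return pages
```

Passing `fetch` in keeps the sketch testable without touching the network. A real spider would also need to honor robots.txt and rate-limit itself, which this leaves out.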
posted by JHarris at 9:21 PM on April 23


Every user doing their own web-spidering seems impractical, but interest groups setting up spidering for their particular topic would make sense.

Back in my day we called that a web ring.
posted by cosmologinaut at 9:41 PM on April 23 [21 favorites]


Everyone having an index of the entire web sounds a bit impractical, TBH.

You don't need everyone having an index of the entire web.

You need everyone having an index of what they are each an expert in their individual problem domain.

Then you need a piece of software running on a cluster that is able to evaluate those particular indices and connect you, the person submitting a query, to the index or indices that best match your interest.

Distributed search is the killer problem for AI to solve, I think, which will create the next generation of search engine that is useful and successful. Google engineers are too stupid to be experts in everything. So the researchers that solve that problem will get a Turing Award, or at least billions or trillions in VC funding.

Prove me wrong in five to ten years. I'm serious.
posted by They sucked his brains out! at 10:11 PM on April 23 [5 favorites]


Could they sell an unfucked ad-free version of search results?
Why would they, though? Streaming services have already figured out that most people will just put up with ads instead of paying even more for an ad-free (or, ad-light...er) experience.

I'm wondering whether companies would pay for it. For example, programmers famously use Google all the time. If I had a software company and Google was selling a developer package that included ad-free search, with no distortion of the results based on who paid for placement in results, I might buy it for my company. And if that got people spoiled for better search results, they might buy the home edition (for their personal Google account). Then Google could even make regular free search worse and see if it pushed more people to the subscription model.
posted by pracowity at 11:26 PM on April 23


Because Google is now useless I started springing a few bucks a month for Kagi. Breath of fresh air.
posted by GallonOfAlan at 11:35 PM on April 23 [2 favorites]


You can kinda, sorta get away with using the "before:yyyy/mm/dd" to hack this on Google, but I don't know for how long.

It's like low-background lead.
posted by Hermione Dies at 11:40 PM on April 23 [6 favorites]


My experiment with using Wikipedia as my search engine default is continuing. It's a little frustrating when I don't need something from Wikipedia and I need to remember to throw the request at a 'real' search engine, but it also throws up content & articles which are often tangentially relevant to my query so it's still a net positive.
posted by phigmov at 12:26 AM on April 24 [4 favorites]


Back in the late 90s I had an idea for a fiction story about software engineers looking for The Next Big Thing in tech: a search engine that instead of giving you the results for the thing you wanted, gave you the results for the thing you didn't even know you wanted yet.

In a way, all this messing around with tracking personal data and AI is trying to do that, but only in the service of showing you ads you might possibly click on.
posted by rikschell at 5:33 AM on April 24 [1 favorite]


A bit late to the party but for anyone still reading and interested, there IS a working modern web directory service: Risen from the ashes of DMOZ, Curlie.org! I mentioned it in this Ask thread (which btw took me several G searches to find despite knowing several strings on the page and the domain).

Curlie is fun and interesting. I've found a bunch of cool stuff through them, almost like the early days of the web. It's better at finding good sites than a specific snippet of info. You can also volunteer to add to the directory, which is also neat. When google pisses you off, give Curlie a try.
posted by SaltySalticid at 6:26 AM on April 24 [3 favorites]


That curlie link isn't opening for me. "504 Gateway Time-out"
posted by pracowity at 7:01 AM on April 24 [1 favorite]


I ditched Google for DuckDuckGo years ago so I've mostly avoided the rot but do try searches on it now and then. And it's not good. It went from essential to avoid-at-all-costs, something I wouldn't have believed possible just a few years ago. Oh, well.
posted by tommasz at 7:05 AM on April 24


Oh my god my head. Typing this with one eye open because I'm not ready for two: the problem with my comment above is that the proposed bidirectional RLHF scoring/ranking is open to gaming by SEO scum, as are most crowdsource feedback channels: it just shifts the vulnerability down a layer. I'd like to think I'd have seen that immediately under other circumstances.

Assuming this is replying to my suggestion about self-hosted search, that is generally true, but it needn't be the whole web. If you're seeding it from a list of pages of interest, you might be able to find a useful subset of the web, especially if some basic heuristic is applied to remove junksites. Just because Google feels like it needs to index ten thousand SEO sites doesn't mean we have to.

It's been observed here many times that LLMs are effectively a lossy text compression scheme and that is not without merit, but it is somewhat missing the point: it is a lossy text compression scheme that conforms to the grammar of the training language and the assumptions about semantic/conceptual relationships shared by its speakers. See also: removing racial bias in output is incredibly difficult when the training data is all sourced from our racist society.

The reason ML might be useful in search is that it can mold its output to our shared concepts - or better yet the shared concepts of our best selves, per that last sentence. I'm just not certain an imposed conceptual bias of eradicating all marketing is possible without invoking Turing-or-better AGI (computer science unobtanium). Maybe a zero-shot classifier trained on the negative image? Spider everything uBlock is filtering out to create the training set, and rank results based on distance from that: the further the better. It's really just taking the GMail spam filter - spammers are effectively crowdsourced to train the expert system on ignoring them - and applying it to search.
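A rough sketch of the "rank by distance from the junk" idea, with loud caveats: this uses plain bag-of-words cosine similarity as a stand-in, not a zero-shot classifier, and the function names and the idea of feeding it text from filtered pages are assumptions for illustration only.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def centroid(docs):
    """Summed term counts over all junk documents.

    Only the direction of this vector matters for cosine similarity,
    so there is no need to divide by the number of documents.
    """
    total = Counter()
    for doc in docs:
        total.update(tokenize(doc))
    return total


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_distance_from_junk(results, junk_docs):
    """Order results so the least junk-like text comes first."""
    junk = centroid(junk_docs)
    return sorted(results, key=lambda text: cosine(Counter(tokenize(text)), junk))
```

Word counts are trivially easy for spammers to dodge, so treat this as an illustration of the ranking direction (further from the negative set is better) rather than a workable filter.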

That also answers "okay but why does this need to be federated, specifically?": because if it's successful there's absolutely no money in it for the Valley VC crowd to exploit and enshittify. That probably needs to be true for any sustainable search replacement, ML-based or not.
posted by Ryvar at 7:28 AM on April 24 [2 favorites]


That curlie link isn't opening for me. "504 Gateway Time-out"

Well dang. I used Curlie fairly recently, and archive.org has a snapshot from April 7 2024. I assume it's just a temporary outage and hope they come back soon! In the meantime, there's also dmoztools.net, which is a snapshot of dmoz when it closed down. So it won't have sites created after 2017, but still useful for some cases.
posted by SaltySalticid at 7:40 AM on April 24 [1 favorite]


Curlie was down for maintenance for a few days at the end of March, but it came back up. I don’t see anything about an outage on the Mastodon page, so hopefully it’s just temporary.

In tracking down the Curlie account I came across the Mojeek Mastodon account, and was surprised to find that the Mojeek search engine is still around. It seems to work pretty well, and it’s still apparently using its own index, and hasn’t signed up with Bing or Yandex.
posted by Kattullus at 7:53 AM on April 24 [3 favorites]


Also, I got curious about Qwant, the French search engine, which I had a vague idea was using Bing. Apparently it’s a hybrid, with a 20 billion page index of its own which it supplements with Bing.
posted by Kattullus at 8:20 AM on April 24


You don't need everyone having an index of the entire web.

You need everyone having an index of what they are each an expert in their individual problem domain.

Then you need a piece of software running on a cluster that is able to evaluate those particular indices and connect you, the person submitting a query, to the index or indices that best match your interest.


I guess that piece of software would need some way to index those distributed nodes in order to dispatch users to the appropriate ones, so I suppose it could have some sort of automatic crawling process to...hey wait a minute.
posted by figurant at 8:43 AM on April 24 [3 favorites]


Let’s go back to the drawing board webring.
posted by snofoam at 8:48 AM on April 24 [2 favorites]


they type amazon.com into Google’s search field, and then click on the Amazon link Google returns

People do this because in the past if you got 1 single character wrong in the search field, it goes to (for example) googl.co which is a spam site rather than what you are actually looking for. The old web was notorious for stuff like that. And bookmarks are 4-6 clicks (depending on the browser) plus having to remember which random shape at the top even gets you to bookmarks. It's literally easier to just search rather than maintain bookmarks or trust the address bar.
posted by The_Vegetables at 9:27 AM on April 24 [3 favorites]


I moved to DuckDuckGo as my default browser, and it's good enough for some things, and flat-out worthless for others to the point where I have to use Google anyway and then get the results I'm looking for. And I'm talking pretty basic stuff, like sourcing a quote I read in a New Yorker article.

Sometimes, I just use Google in VPN + incognito mode so that it doesn't connect me with the search and feed me things I don't want as a result. It's a clunky workaround, but Google is, generally, better at searches than DDG in my own experience. Which is a pain, and I hope someone comes up with something better.
posted by the sobsister at 9:35 AM on April 24 [2 favorites]


Did anyone else ever contribute to NewHoo? It looks like the founders quickly sold out to Netscape. No idea what happened after that. I guess it just disappeared with Netscape. Or is that what this Curlie is now? I had forgotten all about it, but I recently found an old NewHoo T-shirt in a cupboard.
posted by pracowity at 9:43 AM on April 24 [2 favorites]


In a way, all this messing around with tracking personal data and AI is trying to [give you the results for the thing you didn't even know you wanted yet], but only in the service of showing you ads you might possibly click on.

The sad thing about internet advertising is that it's currently the inverse of that. Aside from giving me a day's worth of powertool ads after I searched once for "drill", my experience of Internet ads is that, even with all the data they allegedly have... they still serve me crap ads for stuff I'm not remotely interested in! Even if I repeatedly flag an ad as "not interesting".

/derail
posted by Artful Codger at 10:17 AM on April 24 [4 favorites]


I certainly remember the point in late 2020 when I realized it was no longer possible to simply search for reviews. You could get review-shaped pages but none performed the crucial task of describing the options in a way that would allow me to choose the ones I preferred. Y'know, reviewing.

Before that I remember hearing various rumors and then confirmed stories that Google was paying billions of dollars per year for default search engine deals, which honestly puzzled me. Why would Firefox risk putting some garbage search engine on the home page, knowing the users would probably never figure out how to change it? Why was it worth billions to Google? Then it became clear, surprisingly quickly, that the reason was Google was making a lot of money on the searches themselves, not just the ads. And therefore Google was no longer in a position of returning search results that were good, it wanted to return results that were profitable for Google. Obviously I knew that Google was making a huge amount of money from search, but it appeared from the outside to be doing so by, at worst, targeting ads using huge amounts of personal information gathered from the Google ecosystem.
posted by wnissen at 10:32 AM on April 24 [5 favorites]


Just because Google feels like it needs to index ten thousand SEO sites doesn't mean we have to.

It kind of does. SEO sites are intentionally as difficult as possible for computers to distinguish from real sites.
posted by shponglespore at 11:09 AM on April 24 [2 favorites]


the sobsister: I moved to DuckDuckGo as my default browser

Sorry, did you mean browser or search engine? I'm asking because DDG offers both.
posted by Too-Ticky at 11:19 AM on April 24


“The specific process by which Google enshittified its search,” Cory Doctorow, pluralistic.net, 24 April 2024
posted by ob1quixote at 11:41 AM on April 24 [5 favorites]


Thorzdad: The internet’s post-mortem will note cause-of-death as SEO.

bombastic lowercase pronouncements: yes but the search engine concept is itself Internet-killingly bad even without the optimization part

I get that it's a bombastic lowercase pronouncement, but can you include more detail, especially about centralisation of the index used to search vs distributed networks with varied interests, accuracy and reputation?

(Imma claim the reputation boost correcting this conflation of The Internet with The World-Wide Web. DARPA wanted this network to survive nuclear attacks, only DDoS traffic kills parts of it and we "route around the brokenness." SSH, Samba and iSCSI-over-RDMA -- even async JSON in your webapp and microservices -- give no sh_ts about this search engine malarkey that port-80 or port-443 www is drowning under.)
posted by k3ninho at 12:40 PM on April 24 [3 favorites]


90% of my searches are just "fudgesicle recipe" and I still feel like google is broken.

The internet is broken by this shit now I guess, I can't even tell you how often I'm reading a recipe now and I realize no one has ever cooked this, this is probably just generated, ugh.

I miss the young internet. I wish we hadn't let corporations take it from us.

But we did. Let's all meet up at the library.
posted by euphoria066 at 1:48 PM on April 24 [8 favorites]


It kind of does. SEO sites are intentionally as difficult as possible for computers to distinguish from real sites.
The original intent of the Backrub algorithm (later renamed PageRank for obvious reasons) was to use a few high-quality sites to bootstrap the rest of the index, because "good" sites wouldn't link to garbage. In that model, it wouldn't matter if someone makes an SEO site, no matter how good it is, because the humans at the New York Times (or Slashdot, or whatever) aren't going to link to the fake site. What I worry about is that there may not be any "good" sites anymore, since so many are adopting the AI and SEO content. The SEO sites are always going to outnumber the "real" ones, because it's so much easier to produce knockoff or even entirely fake content. So for me the ultimate question is whether SEO is going to become so much larger that it's simply impossible to pick out the actual content. We'll see, but it's not looking good. Even the "Reddit trick" would only shift the SEO activity from hosted websites to subreddits and their posters. It's probably even easier to make a plausible Reddit comment than a full SEO website, unfortunately.
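To make the bootstrapping idea concrete, here is a small sketch of a seeded (personalized) variant of PageRank. It is not the original Backrub code or anything Google runs; the function name, the toy graph shape, and the default damping value are all assumptions for illustration. Score only enters the graph through the trusted seed set, so pages that no trusted page links to, directly or indirectly, end up with nothing.

```python
def seeded_pagerank(links, seeds, damping=0.85, iterations=50):
    """Power iteration where the random jump always lands on a seed page.

    links: {page: [pages it links to]}
    seeds: set of trusted pages that receive all teleport probability
    """
    nodes = set(links) | {t for targets in links.values() for t in targets} | set(seeds)
    seed_weight = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(seed_weight)
    for _ in range(iterations):
        # Teleport share: only seeds get it.
        new_rank = {n: (1.0 - damping) * seed_weight[n] for n in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling page with no outlinks: hand its score back to the seeds.
                for seed in seeds:
                    new_rank[seed] += damping * rank[node] / len(seeds)
        rank = new_rank
    return rank
```

In a toy graph where a link farm only links to itself, the farm scores zero however many pages it has. The worry in the comment above is what happens once the seeds themselves start linking to, or becoming, the junk.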
posted by wnissen at 2:43 PM on April 24 [9 favorites]


It kind of does. SEO sites are intentionally as difficult as possible for computers to distinguish from real sites.

This is a response to my comment above about self-hosted search, so I'll reply to it.

At the relatively small scale of search that I'm talking about, which is more about finding independent sites and blogs with interesting writing and voices than answering general knowledge questions or searching for info that's probably on Wikipedia, this isn't such a problem. If you're doing search on your own, you could in fact manually strike sites from your search corpus that you find are spammy or useless, or that become so. Also, SEO works largely because of search engine monoculture: you only have to satisfy one set of rules to get to the top of Google. If there were 1,000 different tiny search engines, you might be able to game some of them, but probably not all of them.
posted by JHarris at 2:51 PM on April 24 [2 favorites]


JHarris: searching for info that's probably on Wikipedia, this isn't such a problem

This is perpendicular to your argument, but I had such a frustrating experience with Wikipedia a couple of days ago. There was a glaring error in an article about an old jazz standard, saying that the first line of the second verse was the opening line of the song. In the old days of not so long ago, I could’ve just corrected that in a matter of seconds, but the barriers to editing that have been put up make anything frustratingly slow. I started the process but halfway through I lost my work through a stray misclick on my part, and I just didn’t have the time anymore.

Even Wikipedia is getting enshittified, though the reasons for it are different from what’s messing up Google. The great thing about Wikipedia was that even if it was full of errors, they were easy to correct. It’s still full of errors, but it’s always getting harder and harder to fix them.
posted by Kattullus at 3:11 PM on April 24 [4 favorites]


What I worry about is that there may not be any "good" sites anymore, since so many are adopting the AI and SEO content. The SEO sites are always going to outnumber the "real" ones, because it's so much easier to produce knockoff or even entirely fake content.

Here's where AI could improve search: detect (and exclude) crap AI and SEO-purpose pages.

And is it conceivable that independent crowd-sourced "page rank" or site-rank could be used to filter the results from the big commercial search engines? Eg implemented as a browser plug-in?

( No bong hits. Craft beer)
posted by Artful Codger at 3:17 PM on April 24


Most searches don't need Google scale. Many searches are cache hits. Your browser can (and most do) maintain a history, which is the beginning of a search index that will satisfy 80% of your searches, unless you are someone who always remembers where everything is.
Searches for novel information can largely be satisfied through one's social graph. If your browser history is available to you, then you can make an expurgated / annotated version of it available to your people as a supplementary index. All of you, jointly, could share these index snippets in a distributed hash table, serving as a meta-index to the smaller indices.
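A toy sketch of that layering, under heavy assumptions: each person's history is a tiny inverted index over page titles, and a plain in-memory dict stands in for the distributed hash table. `HistoryIndex`, `build_meta_index`, and `federated_search` are hypothetical names, not part of any browser or of YaCy.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Set of lowercase word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class HistoryIndex:
    """Inverted index over one person's (expurgated) browser history."""

    def __init__(self, owner):
        self.owner = owner
        self.postings = defaultdict(set)  # term -> set of URLs

    def add(self, url, title):
        for term in tokenize(title):
            self.postings[term].add(url)

    def search(self, query):
        """URLs whose titles contain every query term."""
        terms = tokenize(query)
        if not terms:
            return set()
        hits = [self.postings.get(term, set()) for term in terms]
        return set.intersection(*hits)


def build_meta_index(indices):
    """term -> owners whose index knows that term (dict standing in for a DHT)."""
    meta = defaultdict(set)
    for index in indices:
        for term in index.postings:
            meta[term].add(index.owner)
    return meta


def federated_search(query, own, friends, meta):
    """Try your own history first, then only the friends the meta-index points to."""
    results = set(own.search(query))
    if results:
        return results  # the "cache hit" case
    terms = tokenize(query)
    if not terms:
        return results
    owners = set.intersection(*(meta.get(term, set()) for term in terms))
    by_owner = {friend.owner: friend for friend in friends}
    for owner in owners:
        if owner in by_owner:
            results |= by_owner[owner].search(query)
    return results
```

The meta-index only records who knows a term, not the pages themselves, so what gets shared stays small. Ranking, privacy, and an actual DHT are all left out.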

Right now, there is a "distributed" self-hosted search engine, YaCy. (yacy.net)
When I tried it many years ago, the computer I had at that time was too slow and small to use it while doing my normal computing. The computer I'm typing on now is new, but also too slow and small (netbook). My larger laptop, with a Core i3, should be plenty capable but its storage is all full, so I won't be experimenting again with YaCy soon.
YaCy is developed in Java, which is a language I don't enjoy and don't have time currently to study, so I will not be hacking on YaCy soon.

For anyone interested in helping a nascent project to make something better, Debian Developer Thomas Koch is making a beginning: https://blog.koch.ro/posts/2024-01-20-rebuild-search-with-trust.html

HTH
posted by Rev. Irreverent Revenant at 3:25 PM on April 24 [6 favorites]


> It’s essentially the equivalent of DEFCON 1 and activates, as Levy explained, a war room-like situation where workers are pulled from their desks and into a conference room where they tackle the problem as a top priority: ... search query growth was “significantly behind forecast”

Sooo . . . this is perfectly insane for a few reasons.

But one simple reason is that when you are just starting out and have a 0.0001% market share, exponential growth for a while is a reasonable target.

Even when you are 1% or 2% or 5% or 15% of the market, I suppose. Or let's just say, the entire market is growing at an exponential rate - which it can do for a certain period of time, but not forever.

But Google has been the leading search engine since 2002, held more than 50% of the entire market since 2007, and held more than 75% of the entire search engine market for well over a decade. Exponential growth just simply isn't even possible given that starting point.
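The arithmetic behind that can be sketched in a few lines. The growth rates below are made-up illustrative inputs, not figures from the article or the trial exhibits; the only point is how quickly a player at 75% runs out of market if it keeps growing faster than the market does.

```python
import math


def years_until_saturation(share, own_growth, market_growth):
    """Years until share would hit 100% if both growth rates held constant.

    share: current market share as a fraction (e.g. 0.75)
    own_growth, market_growth: annual growth rates as fractions (e.g. 0.10)
    Share after t years is share * ((1 + own) / (1 + market)) ** t.
    """
    if own_growth <= market_growth:
        return math.inf  # share never rises, so it never saturates
    ratio = (1 + own_growth) / (1 + market_growth)
    return math.log(1 / share) / math.log(ratio)


# Hypothetical inputs: 75% share, growing 10% a year in a flat market.
print(round(years_until_saturation(0.75, 0.10, 0.0), 1))
```

With those hypothetical numbers the ceiling arrives in about three years. After that, growth can only match the market, or come from squeezing more queries out of the same users.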

And the only way to fake it for a little while is doing idiotic tricks of the sort they finally did resort to as outlined in this article - things like basically tricking people into clicking a lot more times to get the same result they previously got with one click.

The ultimate cause here, though, is simple irrational and self-defeating expectations.
posted by flug at 4:24 PM on April 24 [2 favorites]


Did anyone else ever contribute to NewHoo?

Gnuhoo begat Newhoo begat DMOZ begat Curlie, as I understand it. I logged around 25,000 edits on DMOZ back in the early aughts, but ended up bouncing pretty hard off the hierarchical structure. Then I went to Wikipedia, which of course has ended up embracing similar forms of hierarchy for even less reason. At least DMOZ had structural reasons for its structural problems. No idea how things are going with Curlie though.
posted by Not A Thing at 4:25 PM on April 24 [4 favorites]


Ryvar: "Just idly thinking aloud but I wonder if there’s room for a sort of federated search?"

I have wondered the same thing. Many of you probably remember the SETI@Home and Folding@Home projects. It seems like it should be possible to have a Crawling@Home project to divide the web into chunks for participants to index. Those of us who have logins to paywalled sites could be automatically prioritized to crawl those sites. Storing a distributed index would be…a trick, and performing a "simple" web search wouldn't be simple at all at the back end. Getting something like that off the ground would be difficult, but it seems like it should be possible.
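One small piece of this is easy to sketch: deciding who crawls what without a central coordinator. The example below uses rendezvous (highest-random-weight) hashing from Python's standard library; the participant names and the choice to shard by hostname are assumptions for illustration, and it says nothing about the hard parts (storing the index, answering queries, paywalled sites).

```python
import hashlib
from urllib.parse import urlparse


def _score(participant, host):
    """Deterministic pseudo-random score for a (participant, host) pair."""
    digest = hashlib.sha256(f"{participant}|{host}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def assign(url, participants):
    """Return the participant responsible for crawling this URL's host.

    Every node can compute this locally from the same participant list,
    and when someone leaves only the hosts they owned get reassigned.
    """
    host = urlparse(url).netloc.lower()
    return max(participants, key=lambda p: _score(p, host))


participants = ["alice", "bob", "carol"]  # hypothetical volunteers
print(assign("https://www.metafilter.com/", participants))
```

Sharding by host rather than by full URL keeps each site with a single crawler, which makes it simpler to be polite about request rates.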
posted by adamrice at 4:51 PM on April 24 [4 favorites]


It seems like it should be possible to have a Crawling@Home project to divide the web into chunks for participants to index
I want to use something like this but the problem is scaling: once you’re beyond people you personally know, spam management becomes a big problem and moderation is also a challenge (e.g. imagine trying to craft a policy for what Israeli or Palestinian news or commentary gets indexed at what weights). SETI@Home worked because they weren’t a target for that and there wasn’t competition – if a small service is too slow to index things, people are going to go back to Google.
posted by adamsc at 5:07 PM on April 24 [2 favorites]


Most searches don't need Google scale. Many searches are cache hits. Your browser can (and most do) maintain a history, which is the beginning of a search index that will satisfy 80% of your searches, unless you are someone who always remembers where everything is.
Searches for novel information can largely be satisfied through one's social graph. If your browser history is available to you, then you can make an expurgated / annotated version of it available to your people as a supplementary index. All of you, jointly, could share these index snippets in a distributed hash table, serving as a meta-index to the smaller indices.

posted by Rev. Irreverent Revenant at 3:25 PM on April 24

Maybe this is true for you, but it's not true for me, not in the least. Don't assume that your Internet/search experience is anywhere equivalent to my Internet search experience.

I'm SO DAMNED TIRED of reading all of the same arguments whenever the topic of search comes up on the blue. We don't need search. Search is the signifier of the death of the Internet. A webring will solve all your search needs....

This is all bunk--at least as a universal truth.

I need search. I need search to work. I need search to work quickly and effectively. Sadly, it's not working well because of Google's actions, and despite trying every new-to-me search engine that gets mentioned here, I still wind up back at Google, because the type of searching I need to do and the content I need to find is so vast and varied that specialized search engines don't cope well with it. (FYI, I just opened a tab with Mojeek, so that will be the next one I take out for a spin.)

And no, my "social graph" can't hold a candle to my search needs. I'd be surprised if my "social graph" accounted for eight per cent (not 80) of my search needs. If only my searching requirements were that simple.

Look, I get that I may be an outlier in terms of my search needs, but I'm not a unicorn. There are plenty of people out there who rely on extensive and elaborate searching for both professional and personal reasons, so until all of you search naysayers come to realize that search is a vital necessity, all of your arguments are pointless. You might as well be adults trying to explain things in a Charlie Brown world, because all I hear is "wah wah wah wah wah wah" which is a shame, because I want to hear solutions, I want to hear how things can be improved, but if you're not recognizing that there is a need, then you're not talking to me, you're only preaching to your own choir, which, in some ways, is just what Google is doing. It is only paying attention to its own monetary interests (although I personally think it is shooting itself in its own foot).
posted by sardonyx at 5:33 PM on April 24 [18 favorites]


Search is important, and I was just kidding about webrings, but I think that there is an obvious parallel between page rank revolutionizing search in a keyword spamming universe and some hypothetical authority-based search that could revolutionize things in an AI spam universe. I am not the one to create it, but it definitely feels like we are back to where we were when Google first cleaned shit up. Maybe someone can do the same now?
posted by snofoam at 5:40 PM on April 24 [4 favorites]


I've been using duckduckgo exclusively for a while now and I haven't regretted it.

I'm slowly trying to disconnect from the google ecosystem. I host my own mail now, next I will host my own photos, notes, calendar, passwords... it'll take time and I may have to code up some of my own services but I'll get there...
posted by signsofrain at 6:57 PM on April 24 [3 favorites]


Re: perpetual growth without restraint, in physiology we call that "cancer."
posted by biogeo at 8:06 PM on April 24 [5 favorites]


It's not just big things that have this issue, though. Everywhere I look, particularly governments, I see people that are content-free 'managing' organisations into the ground and nobody gives a fuck.

As one inclined to blame Big Consulting for almost all of this, I laughed out loud to see Zitron write
McKinsey is to the middle class what flesh-eating bacteria is to healthy tissue.
Preach it, brother.

Also, SearXNG has been working well for me too. I've been using the instance hosted at perennialte.ch for a while now, along with its Invidious instance.
posted by flabdablet at 8:17 PM on April 24 [8 favorites]


How much does Google earn per person from showing ads? Could they sell an unfucked ad-free version of search results?
I’m sure they could. The question is, after this fuckery, would *I* ever trust them again? Spoiler: “no.”
posted by Gilgamesh's Chauffeur at 8:32 PM on April 24 [3 favorites]


the sobsister: I moved to DuckDuckGo as my default browser

Sorry, did you mean browser or search engine? I'm asking because DDG offers both


Sorry for the confusion. Both, actually. I sometimes use DDG as a browser with its own search engine as the default and, similarly, Firefox + DDG.
posted by the sobsister at 10:01 PM on April 24 [1 favorite]


Mod note: This post has been added to the sidebar and Best Of blog!
posted by Brandon Blatcher (staff) at 5:28 AM on April 25 [5 favorites]


Maybe this is true for you, but it's not true for me, not in the least. Don't assume that your Internet/search experience is anywhere equivalent to my Internet search experience.

I'm SO DAMNED TIRED of reading all of the same arguments whenever the topic of search comes up on the blue. We don't need search. Search is the signifier of the death of the Internet. A webring will solve all your search needs....


If you are tired you can stop reading at any time. It's too bad others' proposed solutions aren't to your liking, but speaking only for myself I will start caring about your "internet search experience" around the time you start paying me to do so. Why am I, why is anyone else responsible for your "search experience"? If you like Google, stay with Google. No one's trying to stop you.
posted by Rev. Irreverent Revenant at 1:06 PM on April 25 [1 favorite]


> The original intent of the Backrub algorithm (later renamed PageRank for obvious reasons) was to use a few high-quality sites to bootstrap the rest of the index, because "good" sites wouldn't link to garbage.

My phrasing of this point differs, but not in disagreement or criticism, just as another way to view it through the lens of wnissen's really excellent phrasing above:

"The original intent of Backrub was to profit off of the work of Internet curators without paying them for their time creating high-quality sites."

Which is the essence of the modern AI content theft problem as well, only starting twenty-five years ago. Once Google's full-text search devalued the effort of curators, most of the unpaid ones stopped doing it at all. Later on, the paid ones got replaced by algorithms. Which means that today, Google Search no longer has the data necessary to function as designed.

(I don't believe that Google has declared this as a material risk to their business on their public filings with the SEC, but I don't expect they would unless pressured to.)
posted by Callisto Prime at 2:23 PM on April 25 [7 favorites]


"The original intent of Backrub was to profit off of the work of Internet curators without paying them for their time creating high-quality sites."

That's really interesting. My browsing habits are definitely "lower depth" than they were 25 years ago. Then I'd be more likely to read a page, follow a link from it, follow a link from that, etc. But now I search for something, I click on it, I read it... depth = 1, and often that's pretty much it.

When I want something outside of what I'm searching for, I rely on a handful of link aggregation sites (like Metafilter). But again I rarely go further than depth 1.

The one major website that's still good for a multi-click journey is Wikipedia. I wonder if the fact that it didn't go commercial is the reason for that.
posted by clawsoon at 4:20 PM on April 25 [6 favorites]


If you are tired you can stop reading at any time. It's too bad others' proposed solutions aren't to your liking, but speaking only for myself I will start caring about your "internet search experience" around the time you start paying me to do so. Why am I, why is anyone else responsible for your "search experience"? If you like Google, stay with Google. No one's trying to stop you.
posted by Rev. Irreverent Revenant at 1:06 PM on April 25 [1 favorite +]


That's because people like you jump into posts about how search is broken and what Google is doing with all sorts of posts about "search isn't important and nobody needs search."

If that's your point of view, great. More power to you. You don't need wide-ranging search to work. Fabulous. You've got ideas for how your ideal search would work in your own little world. Wonderful. Make a post about that. I'd probably even pop in and read it, interested in what you have to say. I'm always open to hearing about different perspectives.

The exception to that is when those perspectives flood over existing topics. I'm always reluctant to say "MeFi doesn't do X topic well," but these days, there is a tiny hint of truth in that. Posts about broken search always turn into "but webrings" and "but nobody REALLY needs search." Similarly posts about AI always turn into "but it's not REALLY AI." Yes, we all know that it's not really AI, but that's the language that has entered the common lexicon so that's what people are talking about. Similarly, sure there are likely other approaches for the ways people find information, but search--whether that's via Google or Dogpile or Veronica or some yet to be named new thing--needs to be functional, and it's frustrating and tiring to keep reading that "no it's not."
posted by sardonyx at 4:49 PM on April 25 [6 favorites]



yes but the search engine concept is itself Internet-killingly bad even without the optimization part

I only look at image results now. I type in my query and look at DuckDuckGo’s image tab (really Bing image search). Images give a sense for the source’s earnestness that is completely lacking in a text stub.

Say I want to understand a craft—I go down until I see amateurish photos lovingly taken. Then I click on that web page.

Or I want to figure out a physics problem— I search for the hand-drawn diagram without the jpg artifacting of some endlessly reposted and reprocessed reference image.

These results would be in the second or third page of text results, completely surrounded by garbage.
posted by Headfullofair at 12:29 PM on April 26 [8 favorites]


There's some great tension in me being released as everyone kind of collectively accepts that google is dead, and google search sucks now. I felt like a conspiracy theorist crank for years, and would regularly lose bar-bet type arguments/conversations looking like a kooky old bird bc i couldn't pull up anything to prove my point about random subjects with the results just being gone.

Hell, i even posted on ask about some of the meta-conversations i was having related to this.

It's still really sad watching this form of embrace-extend-extinguish play out here though. It's created a great emptiness in me the same way, like, the death of myspace, hypemachine, and music blogs did to the 2000s/early 2010s music scene. An entire era of cultural output is either gone or locked away in digital (or literal) basements and closets now.

It's really making the case for people who have called this, and the 2000s, a cultural dark age that will be really really hard to look back on from an anthropological standpoint in like 100 years. There's so much cultural output, and it's all being "recorded", but those recordings might as well be going directly into shredders.

Don't agree? Look at what happened with sites like photobucket, and look at any old forum thread that linked to them.

The older i get the more i feel like i have in common with old queer people i've talked to about their experiences in the community in the 80s and 90s, older people in dance music in those same eras, older artists, etc. "God i wish the ways we recorded anything were more accessible" is heard so much, just after "god i wish we recorded things". The latter isn't even a problem anymore really, it's just the former. And it's a complete, unforced error on the part of fucking humanity.
posted by emptythought at 7:03 PM on April 26 [5 favorites]


Early on, google was pretty great, and ads were flagged as ads. It was what the web needed and wanted. Now it's shitty, with gobs of ads, intrusive ads, at least still labeled. Short term gain, long term? Will anybody realize that good search with some ads is profitable when done well, if not as obscenely profitable as GOOG's board wants?
posted by theora55 at 4:01 PM on April 29


Now it's shitty, with gobs of ads, intrusive ads, at least still labeled.

This is the part of the thread where I once again express my unending bafflement at the demonstrable willingness of so many people to keep on using the Web via browsers that don't or can't have uBlock Origin installed.

As the world's most competent ad blocker it's also the most popular (it's the most-installed Firefox add-on by a very large margin) but even so its overall install counts are chump change compared to the total number of Web users.

Every time I see people bemoaning the rising shittiness of online advertising and/or offering tips and tricks for how to sidestep its worst effects in ways that don't involve simply installing uBlock Origin and leaving it turned on by default, I have to overcome a burning urge to grab them by the lapels and shake some sense into them. It's frankly exhausting and leaves me feeling dispirited.

Granted, my own visceral loathing for the advertising industry and all its works is disproportionately greater than most people's, but I used to fix computers for a living and the looks of sheer relief visible on every customer's face on being told that blocking 99% of Web advertising is trivially easy is something I won't soon forget. So I can only conclude that failure to do it is mostly driven by ignorance, probably made worse by the widespread and rather bovine acceptance of advertising as in some way necessary and IT in general as dark magicks too dangerous to try to control.

For which I guess I should be grateful. If ad blockers were cutting seriously into the online advertising industry's revenues it would be reorganizing itself to make them less effective, which it shows no signs of doing. Unless and until it does, uBlock Origin will continue to defeat very nearly all of it and my own web browsing experience will remain well insulated from it. As could yours be, if you care to make it so.
posted by flabdablet at 7:23 PM on April 29 [7 favorites]


emptythought, the one bright point about all of this is that, for a long while, Google's benevolent stewardship made it seem like it might be possible to have a corporation that did mostly good things. Google made a strong case that techno-utopianism was not only possible, but viable.

It's been at least 14 years in coming (since the demise of Google Reader [puts a dollar in the Reader Lament Jar]), but it's now evident it's all fallen apart. And if freaking Google couldn't make it work long term, what hope does literally any other profit-seeking entity have?

So, if a corporation can't be trusted to make good decisions by internet users, who can? I can't help but think this is one of the things fueling the adoption of the Fediverse by multiple parties....
posted by JHarris at 8:51 PM on April 29 [2 favorites]


... the looks of sheer relief visible on every customer's face on being told that blocking 99% of Web advertising is trivially easy is something I won't soon forget
Same. I shouldn't be, but I continue to be staggered by the number of people that don't even know there is such a thing as ad-blocking, much less that they can install it and that's all they need to do. I assume it's because, for most people, computers have become an appliance no different from a toaster - you take it out of the box and use it how it comes. I'm reminded of this on every (rare) occasion I try to use the Internet on a device without ad-blocking - it's literally unusable and I don't know how people do it. Even YouTube alone - there is no video that good that I'll watch it with those constant unskippable ads.
posted by dg at 9:32 PM on April 29 [4 favorites]


If ad blockers were cutting seriously into the online advertising industry's revenues it would be reorganizing itself to make them less effective, which it shows no signs of doing

Oh I wish that were true. I see so many "you're using an ad blocker click here to disable!" popups on websites these days. Fewer when using uBlock Origin, since it has its own anti-blocker blockers. They show up more when I use DNS-based blocking on my mobile devices.

These "we detected a blocker please let our malware in" things are clumsy. But a lot of sites just work badly with ad blockers. Facebook is super laggy and slow with uBlock Origin running. YouTube has taken a couple of successful swipes at stopping ad blocking too.

Long run, the web servers will win any war with ad blocking software. (Doubly so for Google, which owns the browser the ad blockers are suffered to operate inside.) I fear for the day that switch is thrown and using uBlock Origin is no longer viable.
posted by Nelson at 9:32 PM on April 29 [1 favorite]


They show up more when I use DNS-based blocking on my mobile devices

Guard your lapels.

Long run, the web servers will win any war with ad blocking software.

uBlock Origin has won the current round against YouTube, if that gives you any comfort.

Google, which owns the browser the ad blockers are suffered to operate inside

#NotAllBrowsers
posted by flabdablet at 9:42 PM on April 29 [2 favorites]


I assume it's because, for most people, computers have become an appliance no different from a toaster - you take it out of the box and use it how it comes.

So you're saying an ad-free web is unauthorized bread?
posted by flabdablet at 9:48 PM on April 29 [3 favorites]


Yeah, I suspect the only reason Google continues to allow the use of adblockers (apart from it harming their competitors just as much as them, maybe) is that the people using adblockers are the same people that wield a lot of influence over which browser people use. Pissing them off too much could see their market share dropping in the same way that making early adopters feel special is partly how Google got so much traction with products like Chrome and Gmail.
posted by dg at 9:48 PM on April 29 [1 favorite]


Another reason Google still allows ad blockers is their own engineers and other employees are using them. The web is just as unusable for them as it is for anyone else.

Also Google uniquely benefits from ad blockers in that their ads are often not blocked by many of the ad blockers. They pay to be part of the "acceptable ads programs" with crappy software like AdBlock Plus.

YouTube will absolutely win any serious battle with ad blockers. The next step will be to embed the ads in the video stream itself. (This already happens but it's the video creator that's putting the ads in. And there is a sponsor block addon to deal with it.) If ad blockers evolve to handle that, then YouTube will use DRM to prevent us from controlling what we see.
posted by Nelson at 10:06 PM on April 29 [1 favorite]


Facebook is super laggy and slow with uBlock Origin running.

I have no Facebook account as a matter of principle, so my experience with it is mostly confined to watching it in use on other people's devices, most of which do not have uBlock Origin installed, and it's always seemed laggy and slow to me. Not convinced this has much to do with uBO.

making early adopters feel special is partly how Google got so much traction with products like Chrome and Gmail

Gmail got early traction because it shat upon all its contemporary freebie webmail services from such an enormous height. Unprecedented amounts of storage, tags rather than folders, conversation threading that just worked, and best-in-class spam filtering were more than enough to make up for the data scraping they did from day 1 to target their genuinely unobtrusive (and, astonishingly, occasionally useful) text ads at the time.

I was an early Gmail adopter - got my invite from a MeFite - and for quite a few years there it was just unequivocally excellent. Until it wasn't.

Chrome got early traction because IE was such a piece of shit and Google pushed Chrome super hard via their main search page and made it super easy to install. I never liked it; early versions used to fuck up font rendering in ways I found irritating and I just never found any compelling reason to switch from Firefox. I currently have Chromium installed to deal with the occasional WebGL thing that FF won't render correctly for reasons I can't be arsed to track down.

YouTube will absolutely win any serious battle with ad blockers.

Widevine has been a thing for a quarter of a century at this point and is already in all the browsers. If there were not a solid business reason for YouTube to avoid it, they would already be using it. They're not. Which says to me that their huff and puff against ad blockers is all bark and no bite.

Frankly, I'm astonished that YouTube continues to exist. It costs more bandwidth and storage than anything else Google does; I can't see how it could possibly be making them anywhere near as much money as flogging ads on other people's sites. I can only assume that Google maintains it for reasons more political than economic.
posted by flabdablet at 10:39 PM on April 29 [2 favorites]


YouTube Ads were $29B of Google's $282B in gross revenue in 2022 (page 28). That's not counting YouTube Premium subscription fees (80M users, including free trials) or other sources of revenue. Unfortunately there's no public numbers I know of on YouTube profitability. Bandwidth and storage are indeed enormous costs but then again Google is probably the very best company in the world at cost efficient storage and bandwidth.

And numbers aside, there's also the value to Google in owning a monopoly on an Internet medium. You want to buy ads on videos shared on the Internet, you buy those ads from YouTube. Although now that TikTok is so popular they've lost exclusivity to a competitor. (Gosh, Google would be a natural to buy US TikTok now that sale is being forced.)

I do think it's interesting YouTube hasn't taken more aggressive steps against ad blocking. My guess is they mostly didn't care, just like all the other ad-supported content sites that haven't been doing anything about ad blocking until recently. The existence of YouTube Premium shifts that internal discussion, now they have a direct "no ads product" to funnel people into. I pay for that myself because ad blocking has never worked nearly as well on my Roku or my mobile devices.
posted by Nelson at 6:53 AM on April 30


I have some adblocking installed but I don't use it all that often, because:
  • I don't spend that much time on ad-heavy sites. Not a member of Facebook, don't often watch YouTube, etc
  • As corny as it sounds, I believe that content-generators and presenters should be compensated, and advertising is one avenue. In some cases, subscriptions/memberships work, and even tip jars, patreon, etc, but these currently don't scale well.
  • In a few cases...advertising is desired. Pre-internet, if you bought a specialist magazine focused on a hobby or trade (eg you're a machinist, or a guitar player)... you expected and wanted relevant advertising. They showed you what was available in that area of interest and where to get that. Sometimes they educate (specifications, new models, etc.). Even in general-interest magazines, the ads were done to a high creative standard, and were not usually wildly irrelevant, offensive or disgusting.
So, in my view, a big problem with internet advertising is... the preponderance of crap inappropriate ads, and too many of them. If the ads were better-targeted and better produced, they would be better received. Like you'd expect from a good magazine. It floors me how the most interactive, responsive ad medium ever conceived is currently also the shittiest and most despised.

I do think the following are irredeemably evil: ads dressed as content. And I have nothing but hate for shitwrappers like Taboola and Outbrain. They degrade any site they're on, turning it into a supermarket-checkout tabloid. And I also hate ads posing as legit results of a search, which hopefully brings me back on topic.
posted by Artful Codger at 7:59 AM on April 30


Another great aspect of ads is how, as sites optimize, ads don’t, so they end up being the bulk of the data transferred.
posted by Artw at 8:03 AM on April 30


I do think it's interesting YouTube hasn't taken more aggressive steps against ad blocking.

Are you aware what YT has been up to over the past six months?
posted by pwnguin at 10:56 AM on April 30 [2 favorites]


I fear for the day that switch is thrown and using uBlock Origin is no longer viable.

There's still the Chrome extension manifest v3 transition which was initially pushed back, but is coming back around starting in June for beta/canary versions of Chrome currently. There's a 'Lite' version of uBlock Origin that's built to work under those new rules, but from what I've seen there's no way to get around it being more limited.
(no dynamic updating of filter lists or importing external filter lists, more limited filter syntax, etc)
posted by CrystalDave at 11:03 AM on April 30 [2 favorites]


Are you aware what YT has been up to over the past six months?

That's the skirmish that they now appear to have given up on, which I count as a resounding victory for the uBlock Origin crew.
posted by flabdablet at 11:04 AM on April 30 [2 favorites]


tip jars, patreon, etc, but these currently don't scale well

Neither does YouTube advertising except from the other direction. For every Mr Beast making Lamborghini money off the thing there are countless content contributors who will never see dollar 1.
posted by flabdablet at 11:33 AM on April 30 [2 favorites]


Are you aware what YT has been up to over the past six months?

Yes. Why, we were talking about it right here over the past sixteen comments!
posted by Nelson at 12:29 PM on April 30 [1 favorite]


Neither does YouTube advertising except from the other direction.

Newsflash: monolithic video platform with shitty, intrusive ads favours creators of mindless populist fluff whose non-paying viewers tolerate shitty, intrusive ads.
posted by Artful Codger at 1:15 PM on April 30 [1 favorite]


I was an early Gmail adopter - got my invite from a MeFite
Heh, I didn't get mine from a MeFite exactly, but from some now-forgotten Web site I found via MeFi Chat where people were handing them out to anyone prepared to trade something of value. I offered nothing and got a response saying 'here you go - something for nothing' and there I was.
posted by dg at 7:56 PM on April 30 [2 favorites]


the Chrome extension manifest v3 transition which was initially pushed back, but is coming back around starting in June for beta/canary versions of Chrome currently. There's a 'Lite' version of uBlock Origin that's built to work under those new rules, but from what I've seen there's no way to get around it being more limited.

If avoiding advertising is important to you, don't rely on a browser maintained by the advertising industrial complex. Use Firefox.
posted by flabdablet at 9:49 PM on April 30 [4 favorites]


So you're saying an ad-free web is unauthorized bread?
Pretty much, I guess. That's a wonderful story, despite being a little too close to what the future looks like for comfort.
posted by dg at 2:19 PM on May 1

