MSN inflates search results
April 26, 2005 6:27 AM

Looks like MSN upped their search result counts for queries that return more than about 2 million hits; a lot of entries have grown more than 500% in the last few days. Unfortunately this site (which graphs the number of results in the engines at any time for different search phrases) was down for the last couple of days. Sounds like MSN are inflating their results; why this happened is not yet clear.
posted by leighm (11 comments total)
 
Can I venture a guess?
posted by troutfishing at 6:27 AM on April 26, 2005


Heh, yeah, but why would they add them only for results that are over a couple of million? I guess so they can say "well, we have 200 million hits for that one!" But then again, who ever goes past the first page or three?
posted by leighm at 6:33 AM on April 26, 2005


Pros:
* Interesting phenomenon with possibly far-reaching implications for the trustworthiness of web search; forces readers to think (again) about how much we trust web search in the first place.

Cons:
* Links to admittedly poor information.
* No analysis, and no links to analysis.
* Atrocious tagging.
* Bad grammar and missing punctuation.

Verdict: Hung jury.
Sentence: Snarky comments, interspersed with sarcasm and a tiny bit of actual speculation.
posted by Plutor at 7:16 AM on April 26, 2005


Dunno, maybe we will have to keep on watching. It's a real shame that site stopped ticking over for a couple of days last week, which seems to be when the spike hit.

Check out some of the other graphs; you can see how much the result counts for different terms fluctuate (some by a million results at a time!).
posted by leighm at 7:53 AM on April 26, 2005


Any relation to this? [/.]
posted by fleacircus at 8:19 AM on April 26, 2005


Why don't we let Slashdot do the FUD and wild speculation and come back here to discuss next week when something substantial is reported?

My guess is their SQL Server 98 server is just doing a backup.
posted by RockCorpse at 8:24 AM on April 26, 2005


...not that I don't love a good lynch mob now 'n then.
posted by RockCorpse at 8:26 AM on April 26, 2005


Back in 1997 or so, I did a series of experiments to figure out just how search results worked. Back then, just about all the engines could handle some kind of boolean syntax, but what they did with them was another matter. For example, "melvin" should return one set, "howard" another set, and "melvin and howard" should return the intersection of the two result sets.

In fact, for all but one engine, the reverse was true: Instead of the intersection, "melvin and howard" would return the union of the two sets.

For a real world example: I might try to narrow my search results by adding terms, and instead I'd see the result-set increase in size. Since I didn't trust algorithms to tell me the value of a page, I'd end up paging deep into the result set. My distrust was often borne out, BTW: I'd frequently find something I wanted 10 or even 15 pages deep. I still experience that with Google, which is the main reason I always set it to return 100 hits per page.

That additive behavior used to piss me off enough that I switched to doing free-text searches exclusively on HotBot, which was the only provider that fully supported the way I wanted to use Boolean queries.

Google, of course, doesn't work that way: Google gives you whatever the hell it thinks you want, based on its holy algorithm. In practice, it's an improvement over the aggregative result sets of c. 1998 AltaVista, but it's still maddeningly opaque.

Anyway, the point of this little discourse is that there's a really simple way to really drastically increase the size of your result-set: Change the way you parse "and" so that it maps to a Boolean "or". That'll blow up your result-set REAL good...
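
A rough sketch of that difference, in Python. The tiny index and the `search` helper are made up for illustration and are not how MSN, AltaVista, HotBot or any other real engine was implemented; they only show how reading "and" as a union instead of an intersection inflates the result set.

```python
# Hypothetical toy index: term -> set of page ids containing that term.
INDEX = {
    "melvin": {1, 2, 3, 4},
    "howard": {3, 4, 5, 6, 7},
}

def search(query, and_means_or=False):
    """Evaluate a query of the form 'a and b and c' against INDEX.

    With and_means_or=False, "and" is a true Boolean AND (intersection).
    With and_means_or=True, "and" is treated as OR (union).
    """
    terms = [t.strip().lower() for t in query.split(" and ")]
    sets = [INDEX.get(t, set()) for t in terms]
    if and_means_or:
        return set().union(*sets)
    return set.intersection(*sets)

strict = search("melvin and howard")                     # intersection
loose = search("melvin and howard", and_means_or=True)   # union
print(len(strict), len(loose))  # prints "2 7"
```

Same query, same index: the strict reading returns only the pages that mention both names, while the loose reading returns every page that mentions either one.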
posted by lodurr at 8:29 AM on April 26, 2005


I don't think that's what's going on here, but then again, I'm really unsure what is going on. Try this experiment:

1) Search for something innocuous, like pancake.
2) Note that there are 234,894,870 results. That should be about 23 million pages' worth, at ten per page.
3) Click next. Click again. And again. Etc. Keep going.
4) Wait, it stopped? What, around page 76? That's not even 1000 hits!
5) Go back a few pages. Wait, now it says only 43 million hits?
6) Let's try this from the beginning again. Let's search for pancake again. It still says 43 million. What the frost?
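
The arithmetic behind the steps above, as a quick Python check. The figures are the ones from the experiment as described; the variable names are just for illustration, and "43 million" is treated as a round 43,000,000, which is an assumption.

```python
PER_PAGE = 10
claimed_hits = 234_894_870   # count shown on the first results page
last_page = 76               # roughly where paging stopped
revised_hits = 43_000_000    # assumed round figure for "43 million"

# Pages needed to show every claimed hit (integer ceiling division).
claimed_pages = (claimed_hits + PER_PAGE - 1) // PER_PAGE

# Hits you can actually reach by clicking "next" until it stops.
reachable_hits = last_page * PER_PAGE

# How many times larger the first count is than the revised one.
shrink_factor = claimed_hits / revised_hits

print(claimed_pages)    # 23489487
print(reachable_hits)   # 760
print(reachable_hits / claimed_hits)  # fraction of claimed hits reachable
print(shrink_factor)
```

So roughly 23.5 million pages are implied, about 760 hits are reachable, and the headline count is more than five times the revised one.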

Try it with anything else. Pony, tomato, basketweaving, etc. It'll give you a huge number of hits, which you can page through for a while, until you get to an arbitrary "end", which is almost certainly not the total number of results for your search. Then if you page backwards, they give you a different number of total results, which is much much lower, but still seems like an unrealistic number.

I officially do not trust MSN Search. Or rather, I now officially have an actual reason to question MSN Search.
posted by Plutor at 10:45 AM on April 26, 2005


Well, in my experience MSN search works better than Google. Of course, that's probably because I only use MSN search when I can't find what I want with Google.
posted by delmoi at 2:11 PM on April 26, 2005


I can't remember exactly which magazine it was (maybe someone else knows), but I was paging through an article in its most recent issue about Bill Gates and Google as an adversary to Microsoft. Maybe it's some kind of a ploy?
posted by x_3mta3 at 5:05 PM on April 26, 2005



