

Apartheid for Blogs!
May 9, 2003 1:34 AM

Google, everyone's favourite search-engine, is planning a separate category for Blogs, to help searchers "filter out blog noise" from primary search results.
posted by Blue Stone (45 comments total)

 
That blogs will be removed from the 'main' search is speculation by the register, and their reasoning that it'll be removed from the main search like usenet was is rather flawed as usenet isn't part of the web, while blogs are.
posted by fvw at 1:56 AM on May 9, 2003


"It isn't clear if weblogs will be removed from the main search results, but precedent suggests they will be. "

Speculation? I thought that was qualified prediction of a possible outcome. Seemed like a responsible piece to me (maybe we 'quibble over terms'?)
posted by dash_slot- at 2:18 AM on May 9, 2003


considering the possibility however, i think it's a bad move. any asshole can make a non-blog website that contains just as much crap as one that uses blogging software. plus this means that if you're a legitimate source of information on a particular subject you can't use blogging software to make updating easier for fear of being removed from the main index. if one were an art history expert one doesn't (and shouldn't) have to be a computer expert to have an "authoritative" website about art history. anyway, hopefully one or two google corporate representatives check out metafilter regularly enough to stumble upon this comment at least. hopefully others agree with me.

by the way; how would google know which sites are blogs? not all of us are content with blogger. does google know that i use movable type to update? just curious.
posted by magikeye at 2:20 AM on May 9, 2003


Cool.
posted by ed\26h at 2:26 AM on May 9, 2003


Even I can tell when a page has been thrown together with Movable Type. Given Google's expertise in page scanning technology, I'm sure their bot can too.

Besides, they could filter out 90% of blogs just by knocking out a few domains - blogspot.com, livejournal.com, etc.
posted by Mwongozi at 2:48 AM on May 9, 2003
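
A minimal sketch of the domain-based filtering suggested in the comment above, assuming a search result is just a URL string. The names `BLOG_HOSTS`, `is_blog_host` and `filter_results` are hypothetical and chosen for illustration; nothing here reflects how Google actually classifies pages.

```python
from urllib.parse import urlparse

# Hypothetical blocklist: the two hosts named in the comment above.
BLOG_HOSTS = {"blogspot.com", "livejournal.com"}

def is_blog_host(url: str) -> bool:
    """Return True if the URL's host is, or is a subdomain of, a listed blog host."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in BLOG_HOSTS)

def filter_results(urls):
    """Drop results served from the listed blog hosts; keep everything else."""
    return [u for u in urls if not is_blog_host(u)]

results = [
    "http://example.blogspot.com/2003/05/post.html",
    "http://www.livejournal.com/users/someone/",
    "http://example.org/art-history/",
]
print(filter_results(results))
```

A host list like this cannot see a weblog published on its own domain with Movable Type or similar software, so it only addresses the hosted-blog case.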


seconding what magikeye said. Also, what if you use slashcode, phpnuke or postnuke or one of those content managers? Is that by definition a blog then?
posted by dabitch at 3:49 AM on May 9, 2003


HOLY SHIT.
mwongozi of the wacky photomanipulations. And also wacky poser pictures. Holy god damn. I was making fun of those with some friends earlier today.

It's a small intarweb.

Cheers, sir. You're awesome.
posted by kavasa at 4:14 AM on May 9, 2003


So... weblogs aren't part of the web? What part of weblogs did I miss? Strange.

It's a good point, how would Google determine a weblog of "my day blah blah blah teenager bitching" from "this is an inciteful web site updated in weblog format" or many other sites that people will miss because they are in weblog format.

Can you tell it's a MovableType blog even if it doesn't use a MT template? In theory, I could use MovableType to set up a site that looks very similar to this but it would still be using a weblog software. I don't see how they could easily filter the results.
posted by benjh at 4:45 AM on May 9, 2003


blogs contain more than their fair share of incite.
posted by quonsar at 5:19 AM on May 9, 2003


Does this apply to the image search as well? I hope not.
posted by hama7 at 5:31 AM on May 9, 2003


What's a "blog"?
posted by Postroad at 5:36 AM on May 9, 2003


Ceci n'est pas une weblog.
posted by stavrosthewonderchicken at 6:00 AM on May 9, 2003


I'd like to just filter out Andrew Sullivan and Dave Winer, forever.
posted by The Jesse Helms at 6:04 AM on May 9, 2003


Also: the more weblogs that google cuts out of the search rankings, the higher up pages like "FREE PORN HERE!", "SIGN UP TO MY SPAM PAGE!" and "WELCOME TO MY 1997 HOMEPAGE PLEASE USE NAVIGATOR 3.0!" are going to register.

Thus making google less useful. Do they really want to blot their copybook in such a way?
posted by tapeguy at 6:09 AM on May 9, 2003


one it again. This contretemps redefines "nonissue," and in reading the piece it's apparent that the man seemingly can't keep his jealousy and his seething hatred for a few well-known blog-actives from affecting his coverage. (It's between the lines here, but it's been explicit in the past.)

If the Register has any desire to maintain whatever credibility they currently enjoy, they'll pull him from this beat. He's like a slow leak.
posted by adamgreenfield at 6:09 AM on May 9, 2003


Also, quoting a poster on Slashdot is balls.
posted by The Jesse Helms at 6:28 AM on May 9, 2003


Ahem. Perhaps categorizing blogs will be done to enable blog-only searches. That makes a lot more sense than leaving blogs out, especially as more and more commercial sites use MovableType and similar engines for content management.
posted by Songdog at 6:47 AM on May 9, 2003


their reasoning that it'll be removed from the main search like usenet was is rather flawed as usenet isn't part of the web, while blogs are.
posted by fvw at 1:56 AM PST on May 9


And further, aside from the usenet question, as far as I can tell web content that is searchable under Google's other tabs has not been specifically removed from the main index. This is also true of froogle.google.com searches -- search on a title and author and it looks like you can get hits on the same Amazon web pages in the main index as you can in the froogle search, for example.
posted by markavatar at 7:12 AM on May 9, 2003


I'm certain Google didn't buy Blogger in order to "remove blog noise from searches." What their business model/strategy is here, well, they haven't said. But adding a blog category and/or being able to identify blogs in their search algorithm (setting aside the issue of defining what qualifies as 'blog') could allow them to do a variety of things that might drive revenue--and this of course is important for their eventual IPO. For example they could track memes and quantify viral distribution, which could open up revenue opportunities and by allowing easier/better searching of blogs (and don't forget Google Directory) they can create more traffic to Google text ads which also drives revenue.
posted by donovan at 7:22 AM on May 9, 2003


I thought that was qualified prediction of a possible outcome.

That's fantastic. I would love to try that. I predict (though I'd like to qualify this by acknowledging that it's not yet entirely clear) that future updates or releases of both Blogger software and Google features will occur at some point and that the most likely possible outcome of this will be an article written by Mr. Orlowski that addresses the news head on (but is ultimately disapproving of the Blogger-Google relationship) and that seems independent of actual technical verification.

Orlowski has an interesting take on Google and weblogs. He has said that weblogs "might ruin [the] creative process" for writers. [1] That author William Gibson "having 'forsaken' his blog" will be able to "preserve his mind." [2] That weblogs "discredit" Google's search service [3] and that the "world of bloggers" is a "diminishing" group of "'WA's (Weblog-Lobbyists)" with a widespread "supplicant mentality." [4] Also - that Google's PageRank mechanism engages in "semantic ethnic-cleansing." [5]

To repeat, though differently: Google searches can be compared to ethnic cleansing. As writing: that's impressive.
posted by massless at 7:36 AM on May 9, 2003


Orlowski is clearly unhinged by not being at the center of this story, and he keeps trying to inject himself into the narrative - with success, so far. (Mea culpa.)

I find it hard to understand either his hostility to weblogging as a practice or his ad hominem attacks on folks with whom he's never even corresponded, let alone (in either case) their degree of venom.
posted by adamgreenfield at 7:47 AM on May 9, 2003


I would love to have the option to filter blogs out of my google searches. Many is the time that I've tried searching for some particular bit of information, only to find it buried under hundreds of blog pages containing commentary about that information, none of which bother to link to the primary source.

(The irony, of course, is that generally in these cases the reason I'm searching for the primary source in the first place is that I've stumbled across something about it in a blog, and wanted more information.)

Blogs, by their very cross-linked nature, tend to float to the top of pagerank, so for certain topics which are or have been heavily blogged, can tend to shut out the rest of the web. I don't agree with Chris Roddy that blogs are all "idle chatter" (nor do I understand why they're quoting "a politics and linguistics undergraduate" as some sort of expert on blogs) -- I see blogs more like the Op-Ed section of the web; often useful, often entertaining, but not always what I want to read.
posted by ook at 7:48 AM on May 9, 2003
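
To illustrate the claim above that cross-linked blogs float to the top, here is a sketch of the simplified textbook PageRank iteration on a made-up five-page graph. The page names, the damping factor of 0.85 and the iteration count are assumptions for illustration; this is not Google's production algorithm.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    links maps each page to the list of pages it links to.
    Pages with no outgoing links spread their rank evenly over all pages.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            targets = links.get(p, [])
            if targets:
                share = damping * rank[p] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                for t in pages:
                    new[t] += damping * rank[p] / n
        rank = new
    return rank

# Hypothetical graph: three blogs link only to each other,
# while the primary source has a single inbound link.
graph = {
    "blog_a": ["blog_b", "blog_c"],
    "blog_b": ["blog_a", "blog_c"],
    "blog_c": ["blog_a", "blog_b"],
    "news_site": ["source"],
    "source": [],
}
ranks = pagerank(graph)
for page, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(f"{page}: {score:.3f}")
```

In this toy graph each of the three mutually linking blogs ends up ranked above the primary source, because rank that enters the blog cluster keeps circulating inside it.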


Oh, and that Pew Research report he keeps quoting approvingly - the one which describes the number of Internet users who read weblogs as "statistically insignificant"? I have to question their terminology and/or methodology, since a simple Fermi approximation suggests otherwise.

I've seen low-end estimates of active blogs at 200,000, with the high end reaching ten times that number. Assume that lowest number - 200,000 - and assume that, on average, each blog has been read by five people in its entire history. (This low number balances the 10,000 daily readers some blogs get against the one- or two-person readership LJ pages, and the inevitable overlap in audiences.)

This gives us the nice round figure of 1,000,000 people who have read a blog or similar. Not even statistically insignificant against the total population of the US, let alone the cohort of active Internet users.
posted by adamgreenfield at 8:04 AM on May 9, 2003
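
The Fermi estimate in the comment above is a single multiplication. A short sketch using only the figures quoted there (the variable names are illustrative):

```python
# Figures taken from the comment above.
active_blogs_low = 200_000                    # low-end estimate of active blogs
active_blogs_high = active_blogs_low * 10     # "ten times that number"
readers_per_blog = 5                          # assumed lifetime average per blog

low_estimate = active_blogs_low * readers_per_blog
high_estimate = active_blogs_high * readers_per_blog

print(f"low-end estimate:  {low_estimate:,}")   # 1,000,000
print(f"high-end estimate: {high_estimate:,}")  # 10,000,000
```

The product equals the number of distinct readers only if audiences do not overlap, which is why the per-blog average is deliberately set low.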


ook: I wonder how well adding "-blog" to the search would work? Surely most blogs use the word pretty frequently, and probably the more official sites you're looking for wouldn't use it at all. Worth a try.
posted by languagehat at 8:22 AM on May 9, 2003


This gives us the nice round figure of 1,000,000 people who have read a blog or similar.

Well, yeah, but most of those are people that stumbled across a blog on accident when they were looking for something useful on google.
posted by toothless joe at 8:30 AM on May 9, 2003


I emailed the Blogger folk suggesting this option about a month ago, so I'll go ahead and take credit for it. Thanks.

I'd like to see a "blogs only search," mainly because there are some things that blogs do much better. If I'm trying to find "cool wallpapers" or a good referrers script, I'm more likely to trust a blogger than any of the dozen results I'd get from Google currently, filled with popups, popunders, and ads all over the place.

I do agree that in some cases, blogs have too high pagerankings--I'm pretty surprised when I get referrals from Google and I'm somehow the top site--but there's gotta be a better way than a -blogs flag. There are plenty of very informative sites that use weblogs (or CMS systems in general) to publish their material (or even just their "news" or "updates"), and I don't think those should be immediately filtered out for simply using a certain format. I'd imagine that Google has already thought of this, however.
posted by gramcracker at 8:42 AM on May 9, 2003


What an irresponsible piece.
posted by davidfg at 9:13 AM on May 9, 2003


This gives us the nice round figure of 1,000,000 people who have read a blog or similar.

Um, does it? Couldn't it be the same five people reading each and every blog?

But anyway, 1e06 is less than half of one percent of internet users. I think the real number is a lot higher.
posted by sfenders at 9:18 AM on May 9, 2003


This is a terrible idea. Not only does this set a bad precedent, implementing a "separate but equal" approach to content categorization that implies that weblogs are, for some unknown purpose, lesser than even a fanboy's headache-inducing Geocities page, but it also sets a premium upon what tools must be used to create a personal site and thus get the appropriate listing.

Of course, this couldn't possibly have anything to do with the Pyra-Google deal.

If someone uses Blogger, does that mean that a site will have a priority listing? If someone uses Movable Type, then does that mean there won't even be a listing at all?

The comparison between a weblog and a Usenet posting is moot, because the latter involves merely ASCII and, for the most part, crudely formed rants. The former involves layout, typesetting, pullout quotes to distinguish a source, and a conscious design. It is for my money just as substantial, if not more so, than some teenager's Buffy the Vampire fan page. Of course, like anything, you're going to have to wade through the crap to get there.

But this week's New Yorker features an E.L. Doctorow story in which weblogs are mentioned. For fuck's sake, weblogs are omnipresent. They have hit the mainstream. They are known by the commonweal. If an amateur researcher does not understand the difference between a weblog and a "valuable" site (those dated entries might give them a clue), or the dim bulbs cited in the Register piece, then they probably shouldn't be given the keys to the Corvette to begin with.
posted by ed at 11:27 AM on May 9, 2003


Somebody who ought to know (Ev Williams) is dismissing Orlowski, too. But, what can you say about a journalist whose personal domain is BadPress.net and who has praised Robot Wisdom for 'good writing'?
posted by wendell at 12:08 PM on May 9, 2003


"only to find it buried under hundreds of blog pages"

I'm confused about this. Care to give an example? I use Google throughout the day and I don't think I've ever had that problem. Ever.
posted by y6y6y6 at 12:10 PM on May 9, 2003


Nua Internet Surveys estimate 605,300,000 net users online. So one million blog readers is significantly less than half of one percent - it's 0.165%. So it's pretty hard to believe that only 1 million users have encountered blogs.

While I think it's great that Google might offer the ability to filter out blogs, I think what this really points to is the weakness of PageRank. While it was an incredible algorithm when first deployed, it may not be appropriate in an age where small groups of people link circularly. Would Google offering a blog-free zone implicitly admit the weakness of that algorithm in a blogged age?
posted by obruni at 3:19 PM on May 9, 2003
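
A quick check of the percentage quoted in the comment above, using the Nua figure and the one-million-reader estimate exactly as stated there:

```python
net_users = 605_300_000    # Nua Internet Surveys estimate quoted above
blog_readers = 1_000_000   # estimate under discussion

share = blog_readers / net_users
print(f"{share:.3%}")  # 0.165%
```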


Ceci n'est pas une weblog.

Completely off-topic, but the misuse of this tag has been startling the heck out of me. The pipe's not a pipe because it's a representation of a pipe. But now anyone adapts this phrase, understood or not, to be cool. If you don't find this alarming, maybe you haven't seen the English grad student who has tattooed the idiotic "Ceci n'est pas de la chair" on her arm, without pausing to recognize that she has completely missed the point of something she has gone on to make part of her body. It's like people who decide they're going to put pop song lyrics on their body in Hindi or Latin or Greek and can't be bothered to have it written correctly...argh.

[/completely off topic]
posted by Zurishaddai at 3:55 PM on May 9, 2003


A wonderful unintended irony from Orlowski in the second-to-last sentence:
It's a bit like challenging a monarch with the viability of the hereditary principle: you can guess what they'll say.
Yeah, Andrew. Or maybe it's like challenging an editor-appointed columnist with a webful of meritocratically-selected content.
posted by George_Spiggott at 6:36 PM on May 9, 2003


What was that about the number of blog readers (vs. writers) being "statistically insignificant" according to Pew research?

The Pew Institute has published research stating that approximately 4% of online Americans read blogs and approximately 1% of online Americans create blogs. (The study at that last link considers content such as online diaries separately from blogs, so they're using a reasonably narrow definition.)

Since he's got quotes around "statistically insignificant" in his article, one assumes that he's actually quoting from a Pew report, but it's a bit disingenuous, considering that one of the Pew's most recent reports has a section titled "Blogs gain a small foothold."
posted by blissbat at 7:42 PM on May 9, 2003


Yes, but is Metafilter a weblog, is the question, you see, and...

Oh never mind. You just keep on with the condescension, if that's working for you.
posted by stavrosthewonderchicken at 10:53 PM on May 9, 2003


Zurishaddai, I mean.

I need coffee.
posted by stavrosthewonderchicken at 10:53 PM on May 9, 2003


I suspect that Google has already started in this direction. During the past year my blog had been getting spidered regularly by Google at least once a week. But last night it was finally indexed again after a three week gap. Dunno, could be just coincidence.

I'm not using blogging software, but "blog" is in its title: "GammaBlaBlog." About a third of my referer hits come from Google. And I'm sure I often lead those searchers where they wanted to go, as I try to link to primary sources. It'll be interesting to see what happens if blogs get their own tab.
posted by gametone at 3:08 AM on May 10, 2003


Apologies for the self-linkage in this post, but I've written too much about Andrew already...

Right. Where to start - firstly Andrew Orlowski is NOT a trustworthy reporter when it comes to weblogs. He routinely writes pieces about weblogs which purport to be journalism but patently are not because they don't talk to authoritative sources, they print speculation as fact, they extrapolate wildly and they are uniformly written around the assumption that weblogs are stupid and evil. Now - don't get me wrong - some of them have a certain amount of truth to them, but it's more by coincidence than design. And it means, in essence, that anything he writes about weblogs has to be opened up, examined and double or triple-checked. Here's a few things I wrote on some of his recent posts:

Andrew Orlowski is a weblogger
Register Refutations
Oh Self-Correcting Blogosphere

So first and foremost, I'd basically take anything he said with a pinch of salt. Now - that's not the same as saying that weblog posts don't pollute the search results (although I'm not convinced they do). So then we go to the next bit - who has Orlowski got to comment on his theory - "Chris Roddy, a politics and linguistics undergraduate at the University of Emory". That's not particularly conclusive - it's not like the man is an expert - say like someone from Google or a rival search-engine company. And when he talks about the number of weblog readers being "statistically insignificant", he's talking about the survey a while ago where the researchers - when they asked members of the american public how they were tracking the war in Iraq came back with the figure that 4% of the survey said they were using weblogs. It was considered statistically insignificant because they didn't interview enough people to be able to tell whether or not that figure was accurate (say - for example - that they might have had a 2-3% margin of error). Nonetheless, that figure if it was accurate would be a VAST number of interest in weblogs and weblogging.

It's also important to look at the kind of language and rhetoric that he uses: "It's a bit like challenging a monarch with the viability of the hereditary principle: you can guess what they'll say. Just as one-man one-vote democracy terrifies the bejesus out of some people, so surely will a fairer Google."

It's not a particularly sophisticated piece of rhetoric - in fact it's just designed to shut off any debate by suggesting that anyone who disagrees with him is clearly biased. But in fact, I wonder if he could guess what we would say to him? Because from my position, I'm seeing a lot of posts that are saying, "Well Ok - maybe this is true - maybe it's even a good thing if there is a legitimate problem here, but at the same time, Andrew has talked a load of unsubstantiated balls more than once in the recent past, and he does seem to put rhetoric above fact most of the time, so it's hard to believe what he's saying really bears that much relationship to the truth"...

Amid all of this stuff, it's worth noting both that Google has specifically moved to spidering weblogs more regularly because they got them to human-selected content very quickly, and that they've employed weblogs within their own company (I heard the man speak at ETCon - very much "eating their own dogfood"). So while there may be an issue with Blog results polluting the search logs, it should in no way be thought that Google is as 'anti-weblog' as Andrew makes out... Which actually brings us - finally - to the very crux of the whole matter, which is the core of the story. The title of the piece states categorically that Google is moving to FIX THE BLOG NOISE PROBLEM. But if you read the piece itself, the only person who says that is Andrew himself. In fact the only statement from someone at Google is the one that says that Google is now going to provide an index for weblogs. That's it! Everything else - the tab, the removal from the rest of the index - ALL of that stuff is completely unsubstantiated.

So to recap - the news here isn't news at all - the only person who has said that google are fixing the noise problem is Andrew. To support his case he talks to random people on the street who happen to agree with him rather than any kind of experts. He is known for writing anti-weblog posts as a matter of course, and casting every news story about weblogs in a negative light. He also puts a lot of effort into casting anyone who disagrees with him as a lunatic or biased. And he misquotes figures to fit in with his arguments.

Quite why the Register keeps publishing his stuff is a total mystery to me...
posted by barbelith at 4:23 AM on May 10, 2003


stavrosthewonderchicken: Please accept my apologies for what was, no doubt, a case of no coffee on my end. I tried to start with the innocuous "startled," but obviously I was just itching for a tangent to rail against the "idiotic" tattooee, and I don't approve of my seeming to have lumped you in under that label.
posted by Zurishaddai at 9:17 AM on May 10, 2003


Doesn't look like anyone's examining the other side of all this, how it could benefit the blogging community. I for one would probably read more blogs if I could find blog entries more centralized on topics of interest to me. At present the little time I spend reading other people's blogs is based on my interest in that individual's writing style or what have you, but if Google works a search tab geared specifically towards blogs, it'll make it easier to find stuff based on subject instead of personality.

It also means the crap will be easier to see, and some bloggers may find it's not enough to just waft blandly about their day, but focus their ramblings a bit more in order to attract more readers, but of course we know that's not really why anyone does it ..he says eyes rolling.
posted by ZachsMind at 9:41 AM on May 10, 2003


Google already has separate indexes for Linux-specific, Macintosh-specific and Windows-specific search. But none of the sites included in these are excluded from the main index, as far as I know - they are just subsets. It seems more likely to me that Google would treat a weblog index more like these categories of info on the web than like Usenet.
posted by maciej at 2:35 AM on May 11, 2003


there are plenty of blog-only engines, like blizg which focuses on metadata (including geo location), blogdex, daypop, popdex and technorati (there are more.. ). Seems more likely that google will build one of those kinds of searches.. no?
posted by dabitch at 4:36 AM on May 11, 2003


Zurishaddai : no sweat.
posted by stavrosthewonderchicken at 5:00 AM on May 11, 2003


There definitely is a bias towards blogs: In a google search for "meg" blogeuse Meg Hourihan of megnut.com ranks higher than actress Meg Ryan of Hollywood fame.
posted by meikel at 4:00 AM on May 27, 2003



