Robots.cnn.com
May 10, 2001 1:32 PM

Robots.cnn.com is a mirror of CNN, without the ads. (It looks like they use it for web crawlers.) This reminds me of channel.nytimes.com, the backdoor for the New York Times that allows you to skip registration for almost every story. Anyone know of any other major media backdoors?
posted by waxpancake (27 comments total)
 
Well, not surprisingly, robots.cnnfn.com works, too.
posted by beagle at 1:40 PM on May 10, 2001


The obvious question is, how much would you pay for unlimited robots.cnn.com access? $5 a month? $5 a year? Anything?
posted by mathowie at 2:06 PM on May 10, 2001


CNN, like most every major news site, is way too heavy. Too many graphics and candy boxes and scripts. To this day I still use vanilla Nando, which gives me all the major wire services in one quickie page load.
posted by Erendadus at 2:07 PM on May 10, 2001


Is Robots.cnn.... put up by CNN or someone else who strips the ads? I'm tempted to delete this question since it strikes me as stupid, but perhaps it's not?
posted by ParisParamus at 2:12 PM on May 10, 2001


By CNN.
posted by hobbes at 2:16 PM on May 10, 2001


Corrected link for robots.cnnfn.com.
posted by anildash at 2:26 PM on May 10, 2001


Thank you, waxpancake! I found the robots.cnn.com page so much cleaner and easier on my brain without the ads.

I don't understand two things: 1) why would CNN put this robots page up, themselves? Don't they need money from those advertisers? 2) Why would anyone go to the regular, ad-full page?
posted by rio at 2:27 PM on May 10, 2001


It's definitely put up by CNN. On another note, I think it's interesting that channel.nytimes.com allows directory listings. Considering how easy this is to change, it seems like a pretty glaring oversight.
posted by waxpancake at 2:27 PM on May 10, 2001


Is there a backdoor for Salon Premium?

Or maybe Salon offering its pay version is WHY we don't get as many [via Salon] posts anymore.

dP
posted by darkpony at 2:28 PM on May 10, 2001


Well, if you block ads, the two look almost exactly the same, anyway... except for the fact that cnn.com seems to think it's part of Netscape Netcenter and is welcoming me with my username. Ugh.

I'm off to delete some cookies...
posted by whatnotever at 2:33 PM on May 10, 2001


Someone will probably beat me to this but:

"why would CNN put this robots page"

Sites would create a special version of their site for robots so that said robots wouldn't screw with their click through count and impressions count. When I was playing with an ad banner system I wrote, I found that 15% of my ad views were from robots (but my site's not very busy).

Robots are just the programs search engines and others use to grab content, index text, search, etc. Completely automated. Very annoying when they mess up your log files, web stats etc.
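For anyone curious what keeping robots out of the ad counts might look like, here is a minimal sketch. It is not what CNN or any real banner system runs: the (path, user_agent) log shape, the ROBOT_HINTS list, and both function names are assumptions made up for the example.

```python
# Hypothetical substrings that often show up in crawler User-Agent strings.
ROBOT_HINTS = ("bot", "crawler", "spider", "slurp")


def is_robot(user_agent: str) -> bool:
    """Guess whether a request came from a robot, using only its User-Agent."""
    ua = user_agent.lower()
    return any(hint in ua for hint in ROBOT_HINTS)


def count_impressions(requests):
    """Split ad views into (human, robot) counts.

    `requests` is an iterable of (path, user_agent) pairs. That shape is an
    assumption for this example, not a real server log format.
    """
    human = robot = 0
    for _path, user_agent in requests:
        if is_robot(user_agent):
            robot += 1
        else:
            human += 1
    return human, robot


# Made-up sample entries: two browsers and two crawlers.
sample_log = [
    ("/ads/banner1.gif", "Mozilla/4.0 (compatible; MSIE 5.5; Mac_PowerPC)"),
    ("/ads/banner1.gif", "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"),
    ("/ads/banner2.gif", "Mozilla/4.76 [en] (Windows NT 5.0; U)"),
    ("/ads/banner2.gif", "ExampleSpider/1.0"),
]
human_views, robot_views = count_impressions(sample_log)
print(human_views, robot_views)
```

Matching on User-Agent substrings is crude, since a robot can claim to be a browser, but it is probably enough to see how much of the traffic is automated.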

"Why would anyone go to the regular, ad-full page?"

Even if lots of people find out about this, the vast majority of visitors will use the frontdoor. And CNN could just change it to something more complicated.

"how much would you pay for unlimited robots.cnn.com access?"

$0.00 - It's too easy for me to just mentally filter the ads out. And I don't think there is enough demand for paid ad-less content when the same thing is free with ads. Internet folks just like free stuff too much.

Now content is another matter. I'd be willing to pay for certain content that I couldn't get free. For example, I'd be happy to pay $5/month for Metafilter.
posted by y6y6y6 at 2:58 PM on May 10, 2001


Hmm.. I don't know about you guys, but when I hit up robots.cnn.com I still get the cnn.com main page, ads and all.. even that unbelievably annoying "Select your version" thing.
posted by zempf at 3:27 PM on May 10, 2001


So... why would CNN put up this robots page? I still don't get it. Robots will still index the normal www.cnn.com site. I suppose they then just use some form of cloaking to serve the ad-free pages to the robots, but then why have it all on a separate machine, if the robots will be hitting www.cnn.com anyway?

I'm trying to imagine how they would do this, and the only ways I can think of that would be invisible to the robots involve having the www.cnn.com machines serve the ad-free content themselves, so there doesn't seem to be much need for the separate machine (or at least the distinct ip/hostname).
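To make the single-host idea concrete, here is a rough sketch of User-Agent cloaking. It is a guess at the general technique, not a description of CNN's setup: the KNOWN_ROBOTS list, the function names, and the toy markup are all hypothetical.

```python
# Hypothetical crawler names; a real site would maintain its own list.
KNOWN_ROBOTS = ("googlebot", "scooter", "slurp")


def choose_variant(user_agent: str) -> str:
    """Pick which page variant to serve for this User-Agent."""
    ua = user_agent.lower()
    if any(name in ua for name in KNOWN_ROBOTS):
        return "ad-free"
    return "with-ads"


def render_front_page(user_agent: str) -> str:
    """Build a toy front page; the markup is illustrative only."""
    body = "<h1>Top stories</h1>"
    if choose_variant(user_agent) == "with-ads":
        # Only non-robot visitors get the banner prepended.
        body = "<div class='banner'>banner goes here</div>" + body
    return body


print(render_front_page("Googlebot/2.1"))
print(render_front_page("Mozilla/4.0 (compatible; MSIE 5.5; Mac_PowerPC)"))
```

Since the decision happens per request on the same hostname, nothing in a sketch like this needs a separate machine, which is what makes the dedicated robots host puzzling.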

Interestingly enough, Google indexes robots.cnn.com separately.

And, it seems that www.cnn.com serves Google the robots version, but www2, www3, etc. do not. Google's cache of each of these has the ads and the headlines bar on the right.
posted by whatnotever at 3:37 PM on May 10, 2001


I've written Salon and Slate filters that strip out the tables and graphics, and a Dilbert filter that puts the last few weeks' on one page.
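A filter along those lines could be sketched as follows. This is not the actual Salon or Slate filter code, which isn't shown here. It's an illustration using Python's standard html.parser module, and the STRIPPED_TAGS set, the LayoutStripper class, and strip_layout are names invented for the example.

```python
from html.parser import HTMLParser

# Tags dropped from the output; any text inside them is kept.
STRIPPED_TAGS = {"table", "tbody", "tr", "td", "th", "img"}


class LayoutStripper(HTMLParser):
    """Re-emit HTML without table markup or images."""

    def __init__(self):
        # Leave entities undecoded so they can be passed through untouched.
        super().__init__(convert_charrefs=False)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag not in STRIPPED_TAGS:
            self.parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied through as written.
        if tag not in STRIPPED_TAGS:
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag not in STRIPPED_TAGS:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        self.parts.append(data)

    def handle_entityref(self, name):
        self.parts.append(f"&{name};")

    def handle_charref(self, name):
        self.parts.append(f"&#{name};")


def strip_layout(html: str) -> str:
    """Return `html` with layout tables and images removed."""
    parser = LayoutStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


page = '<table><tr><td><img src="ad.gif"><p>Story text</p></td></tr></table>'
print(strip_layout(page))
```

Dropping the table tags but keeping their contents leaves the article text in reading order, which is usually all a text-only view needs.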

I'm not sure why the Times hasn't blocked the backdoor - I guess they have too much legacy code at their partners in the "channel" for them to change the way it works again. They did move it once, to "partners.nytimes.com", but as long as their name servers allow "ls -t any nytimes.com" it won't be hard to find the backdoor if they move it again! (Dopes.)

robots.cnn.com is probably not for indexing spiders — you're right, whatnotever, how would they find it? — it's probably for content partners' automatic content-gathering robots. Whatever! I'm glad to know of it — it loads in about half the time cnn.com does (on my Mac with IE5.5, anyway...)

Of course we may be giving away the farm here if any of these sites' techs are reading MeFi...
posted by nicwolff at 3:43 PM on May 10, 2001


Alright, nevermind, I'm dumb. I guess I was expecting some text-only version of CNN & just assumed that since graphics were still there, ads were too. Oops.
posted by zempf at 3:43 PM on May 10, 2001


Interesting stuff in those directory listings at channel.nytimes.com. Is this really an article from 1918, for example?
posted by Spanktacular at 3:56 PM on May 10, 2001


It looks like they have a demo article or two in a bunch of years. Here's Godfather winning oscars six months before my birth, and here's the Chernobyl disaster.

The mind reels at the possibilities of seeing these old articles show up on metafilter for comments ("Has anyone seen this new film? While not as violent as Pearl Harbor or Saving Private Ryan, I thought it was a good movie...")
posted by mathowie at 4:18 PM on May 10, 2001


The Times had an online feature back in March with the reviews of every Best Picture winning film; the films you mention both won the Academy Award.

They've also done this with, for instance, Stephen Sondheim. I believe every review of his Broadway and off-Broadway musicals and revivals has been placed online, along with other related articles.

I for one would love to see an online database of EVERY movie and theater review the Times has ever printed. How valuable a resource that would be!
posted by bjennings at 4:37 PM on May 10, 2001


"why would CNN put up this robots page?"

Yeah, I don't know either. After I posted my comment I started thinking about how they might do something like that. (note to self - Think first, then post.)

I just sorta assumed there would be something in the robots.txt or the meta tags, but no.

"it's probably for content partners' automatic content-gathering robots."

Ya, that makes more sense. That might also be why the directory listing is open. And it would also mean that they'll keep it around long term.
posted by y6y6y6 at 5:18 PM on May 10, 2001


So I was perusing the robots.txt file on CNN... what possible reason does CNN have to block www.cnn.com/TRANSCRIPTS from search engines? That's one of the more useful parts of the site.
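For reference, this is roughly how a well-behaved robot would read such a rule, sketched with Python's standard urllib.robotparser. The ROBOTS_TXT text below is a hypothetical stand-in modeled on the block described above, not a copy of CNN's actual file, and the "ExampleBot" agent name and example URLs are made up.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents; not CNN's real file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /TRANSCRIPTS
"""

parser = RobotFileParser()
# parse() takes the file's lines directly, so no network fetch is needed.
parser.parse(ROBOTS_TXT.splitlines())


def allowed(url: str, agent: str = "ExampleBot") -> bool:
    """Report whether `agent` may fetch `url` under the rules above."""
    return parser.can_fetch(agent, url)


print(allowed("http://www.cnn.com/TRANSCRIPTS/example.html"))
print(allowed("http://www.cnn.com/WORLD/"))
```

The rule only binds robots that choose to honor it. Nothing here stops a person, or a rude crawler, from requesting the transcripts directly.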
posted by smackfu at 7:48 PM on May 10, 2001




Because CNN has a contract with Federal Document Clearing House to create and sell the transcripts. If they were popping up all over Google, nobody would ever buy them.
posted by aaron at 9:38 PM on May 10, 2001


Thank you, thank you, thank you!
Let me explain: I live in Beijing, where www.cnn.com is blocked. The regime obviously hasn't noticed robots.cnn.com. Now I have unfettered access.

Unfortunately, they do seem to have noticed channel.nytimes.com.
posted by chinstrap at 11:11 PM on May 10, 2001


> Now I have unfettered access.

You did. But now we know. I'll get the fetters ready.

- Chairman Maosover
posted by pracowity at 11:16 PM on May 10, 2001


[y6y6y6] wrote: It's too easy for me to just mentally filter the ads out.

I don't think a lot of advertising agencies or psychologists would agree with you. The subconscious effect is enough for most advertisers (you do remember their name or recognize it). I don't think humans are capable to filter the subconscious effects out.
posted by nonharmful at 2:20 AM on May 11, 2001


Nonharmful: Ad people in every -other- medium are happy with the subconscious effect. Online guys want their clicks, thinking this is the one true response. This is why banner ads supposedly don't work, yet easily-zapped radio and TV ads supposedly do.
posted by Erendadus at 10:16 AM on May 11, 2001


"I don't think humans are capable to filter......."

My brain is far advanced from that of normal humans. Which comes in very handy.
posted by y6y6y6 at 3:14 PM on May 11, 2001

