

Hey, some of us are reading here.
March 3, 2009 1:19 PM

Readability is a wonderful bookmarklet that strips away all the surrounding cruft on a page so you can focus on the content.
posted by jragon (35 comments total) 41 users marked this as a favorite

 
Sweet Merciful Jebus! This is the greatest thing I have ever seen! I just hooked my CPU up to my 32 inch TV and THIS is exactly what I needed.
posted by Brodiggitty at 1:26 PM on March 3, 2009


Cool even though it does not play well with metafilter, cnn, or the first part of wikipedia. It did make a short story page (linked from AskMe) look awesome and a lot more readable than the original, and it works out OK with the relatively simple wikisource.
posted by Science! at 1:31 PM on March 3, 2009


It does not seem to know what to do with the CNN homepage, but when you click through to individual stories...works great!
posted by tdstone at 1:37 PM on March 3, 2009


It's perfect for crap like this.
posted by mattbucher at 1:40 PM on March 3, 2009


I love what this does to news sites.

But:

This would be better as something you could toggle on and off - seems that the only way to undo what it does is to reload.

Also, I do not like that it loads yet another script from arc90.com. Perhaps in a month or two, arc90 could be selling clickstream data to someone else? Perhaps in a couple of months, you'll click this widget and get a fullscreen window telling you not to be such a leech and read some damned ads for a change? The possibilities are endless...

If you run something initiated from your browser, in what security context does it run? I smell XSS exploit in the making.
posted by i_am_joe's_spleen at 1:44 PM on March 3, 2009


Interesting. It almost seems like it's reformatting it for a mobile browser (phone, etc) view. Unfortunately, this seems like one of those really clever ideas that I'll never actually find a useful implementation for.

Still, if I ever get my TV pc set up, I could see something like this being handy for reading news sites.
posted by quin at 1:55 PM on March 3, 2009


If you try it on news.google.com you get an ugly page that looks like it couldn't load the style sheet, and it only shows the most recent comment on Metafilter.

I want to love it. I want to love it so hard. Alas, I can only mock.

meh.
posted by blue_beetle at 1:57 PM on March 3, 2009



Perhaps in a month or two, arc90 could be selling clickstream data to someone else? Perhaps in a couple of months, you'll click this widget and get a fullscreen window telling you not to be such a leech and read some damned ads for a change?

LUKE: Han! Han! Look! A Rebel moonbase! Head for the docking bay--we need to refuel!

HAN: Way ahead of you.

LUKE: Oh shit! It's a trap! That's actually a ship-eating alien creature disguised as a moonbase!

HAN: I can't turn around! We're being sucked into its belly!

LUKE: OH SHIT! WE'RE ALL GONNA DIE!

CHEWY: RRAWWRRR!

HAN: ...

HAN: Oh wait, I forgot I could just stop using the bookmarkle... I mean put the ship into reverse. Nevermind. We're outta here.
posted by nasreddin at 2:00 PM on March 3, 2009


This is amazing - it may have just dramatically improved my life. Thanks!
posted by l33tpolicywonk at 2:05 PM on March 3, 2009


nasreddin: don't forget to clear all the cookies it set that label you as an ad-refuser.

I don't really think that's going to happen, but I haven't seen bookmarklets that load up things from outside before, and I wonder what non-obvious consequences there might be from that.
posted by i_am_joe's_spleen at 2:06 PM on March 3, 2009


It rendered the Salt Lake Tribune main page as nothing but a blank white space.

So in other words, it works as advertised.
posted by mr_crash_davis mark II: Jazz Odyssey at 2:10 PM on March 3, 2009


On blogs it either picks one random comment or post (on this FPP it just shows one comment.) For the front page of MetaFilter, it shows a blank page.
posted by ALongDecember at 2:17 PM on March 3, 2009


We clearly all enjoy what the script's author determines is cruft. Different strokes for different folks.
posted by Science! at 2:20 PM on March 3, 2009


I do not like that it loads yet another script from arc90.com

So grab the script and host your own copy instead.

If you run something initiated from your browser, in what security context does it run?

The context of the web page in which it is invoked, just like any other bookmarklet.

I smell XSS exploit in the making.

Almost by definition, this or any other bookmarklet or greasemonkey script is an "XSS exploit" (of the non-persistent variety) -- you're effectively injecting javascript into someone else's website to modify it for your own purposes.

But you can read the source code to verify that it isn't doing anything malicious, and if you don't trust future updates you can, again, host your own copy and use it instead.
posted by ook at 2:21 PM on March 3, 2009 [2 favorites]


Oh my, could be very useful indeed for some of us. I wonder if they wouldn't mind allowing us to choose a back colour other than white? The idea of browsing the internet in perfect legibility is too much...
posted by Sova at 2:27 PM on March 3, 2009


Wow, it makes reading online "magazines" so much easier! Just open one tab on the homepage, open stories in new tabs, and bookmarklet 'em as you read. Thanks, jragon.

But this is kind of a sad thing, too, because it's reminded me of all the fascinating articles that are hidden inside truly horrid page designs. So many eager writers doing their best to create wonderful, insighful content on deadline, only to get their work stuck between blinking page headers and slow-loading advertisements. Alas, progress...
posted by Kevin Street at 2:29 PM on March 3, 2009


Argh! "Insightful" content, darnit.
posted by Kevin Street at 2:30 PM on March 3, 2009


I wish it would show MORE and without the stupid ads at the bottom. I'm considering ripping this off and rehosting it without the spam:
http://timedoctor.org/fun/mirrors/readability-0.1.js
posted by TimeDoctor at 2:33 PM on March 3, 2009


I wish it would show MORE and without the stupid ads at the bottom.

I don't know that I would classify that as "spam". Guy makes a script that you find useful, you really going to begrudge him a small, tasteful logo at the bottom of the screen?
posted by ook at 2:42 PM on March 3, 2009 [1 favorite]


Now you can have a professional white background everywhere!
posted by grouse at 2:44 PM on March 3, 2009


This is interesting: I work in assistive technology, and one of the big usability problems for blind web surfers is identifying the content of interest - knowing where to start reading to get the content, not the navigation bars.

This appears - correct me if I'm wrong - to know how to present some sites (NY Times) but not others (BBC) so it's fragile in that it assumes the script provider will keep up with changes to the site layouts. But it's nice.
posted by alasdair at 3:00 PM on March 3, 2009


From looking at the code, this wasn't built specifically toward various news sites' layouts -- which would be a losing game.

Instead it's taking a very simple mechanical approach: it finds the block of the page which contains the most paragraphs, assumes that's the important content, and throws out everything else. That's why it doesn't deal well with homepages or the like, but seems to handle a surprisingly wide variety of pages which contain single long articles.

The algorithm isn't perfect: on the BBC, ironically enough, their block of accessibility links for blind users contains more paragraph tags than the real article most of the time. And here on MeFi it drops everything but the longest comment because each one is in a separate <div> tag. Still, for such a simple technique I'm surprised by how well it does work in general.
posted by ook at 3:23 PM on March 3, 2009
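
A minimal sketch of the "most paragraphs wins" approach described in the comment above. This is not the Readability source. The function names are made up for illustration, and it assumes "contains the most paragraphs" means the most <p> tags as direct children (counting all descendants would always pick the page body). It relies only on `tagName` and `children`, so it should work on DOM elements or on plain objects shaped like them.

```javascript
// Count the <p> elements that are direct children of `node`.
function countDirectParagraphs(node) {
  return Array.from(node.children || []).filter(
    (child) => String(child.tagName).toUpperCase() === 'P'
  ).length;
}

// Walk the tree in document order and return the element with the most
// direct <p> children, plus that count. Ties go to the element seen first.
function findContentBlock(root) {
  let best = { node: null, count: 0 };
  const stack = [root];
  while (stack.length > 0) {
    const node = stack.pop();
    const count = countDirectParagraphs(node);
    if (count > best.count) {
      best = { node, count };
    }
    // Push children in reverse so they are popped in document order.
    const kids = Array.from(node.children || []);
    for (let i = kids.length - 1; i >= 0; i--) {
      stack.push(kids[i]);
    }
  }
  return best;
}
```

In a browser, `findContentBlock(document.body).node` would be the candidate to keep. On a page where every comment sits in its own <div>, each <div> is scored separately, which matches the single-comment result people report in this thread.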


VERY cool.
posted by wherever, whatever at 3:37 PM on March 3, 2009


Thanks ook. I built a prototype a couple of years ago that used a more sophisticated algorithm based on area and position and text size and length when laid out (so more text, smaller font, central location, middle of the window = probably the content) but the problem I had is exactly what people have observed above: it works great on site A but not on site B, which means that users won't be able to rely on it, which means that it doesn't really work as a feature.

Some day I'll write the AI to do it properly. (or hey, if you're reading this and want to do some coding for open-source software for blind people, do get in touch!)
posted by alasdair at 4:00 PM on March 3, 2009
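
For comparison, a rough sketch of a layout-based score along the lines alasdair describes (more text, smaller font, central position, larger area). It is not his prototype. The field names (`rect`, `textLength`, `fontSizePx`) and all of the weights are assumptions chosen only to make the idea concrete.

```javascript
// Score one laid-out block. Higher means "more likely to be the content".
// block:    { rect: { x, y, width, height }, textLength, fontSizePx }
// viewport: { width, height }
function scoreBlock(block, viewport) {
  const { x, y, width, height } = block.rect;

  // Share of the viewport the block covers, capped at 1.
  const areaShare = Math.min(
    1,
    (width * height) / (viewport.width * viewport.height)
  );

  // Distance of the block's centre from the viewport centre, mapped so that
  // 1 means dead centre and 0 means a corner or beyond.
  const dx = (x + width / 2 - viewport.width / 2) / (viewport.width / 2);
  const dy = (y + height / 2 - viewport.height / 2) / (viewport.height / 2);
  const centrality = 1 - Math.min(1, Math.hypot(dx, dy) / Math.SQRT2);

  // More text scores higher, on a log scale so one huge block cannot swamp
  // everything else.
  const textAmount = Math.log10(1 + block.textLength);

  // Smaller fonts score higher: body text tends to be smaller than headings.
  const smallFont = 1 / Math.max(1, block.fontSizePx);

  // Arbitrary illustrative weights.
  return 2 * textAmount + centrality + areaShare + 8 * smallFont;
}

// Return the highest-scoring block, or null for an empty list.
function pickLikelyContent(blocks, viewport) {
  let best = null;
  let bestScore = -Infinity;
  for (const block of blocks) {
    const score = scoreBlock(block, viewport);
    if (score > bestScore) {
      best = block;
      bestScore = score;
    }
  }
  return best;
}
```

The fragility mentioned above shows up in the weights: any fixed set that suits site A can be beaten by the layout of site B.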


Don't like theirs? Make your own! It's much simpler, as it will simply clobber the CSS styles of everything on the page, but it works well for me.

You have to have some idea of what CSS rules to set/clobber, though. Here's what I use for my own reformat/readability bookmarklet:

/* light text on a dark background; sans-serif is the standard generic family name */
* {
  font-family: sans-serif !important;
  font-size: 14pt !important;
  max-width: 720px !important;
  line-height: 1.4 !important;
  background: #333333 !important;
  color: #ffffdd !important;
}
body {
  margin-left: 2em !important;
}


Tweak it as you see fit (I like light, sans-serif text on a dark background - you might not), plug it into the bookmarklet generator, and you're good to go.
posted by whatnotever at 4:34 PM on March 3, 2009 [6 favorites]
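
If you would rather not depend on a generator site, here is a hedged sketch of what such a generator might do. It is not the linked generator's code. The function name `buildStyleBookmarklet` is invented, and it assumes the bookmarklet should append a <style> element holding your rules to whatever page it runs on.

```javascript
// Turn a block of CSS into a `javascript:` URL that can be saved as a bookmark.
// When clicked, the bookmarklet appends a <style> element with the rules.
function buildStyleBookmarklet(css) {
  // Collapse newlines and runs of spaces so the rules fit on one line.
  const oneLine = css.replace(/\s+/g, ' ').trim();

  // JSON.stringify produces a safely quoted string literal for the CSS text.
  const body =
    '(function(){' +
    "var s=document.createElement('style');" +
    's.appendChild(document.createTextNode(' + JSON.stringify(oneLine) + '));' +
    '(document.head||document.documentElement).appendChild(s);' +
    '})();';

  // Percent-encode the script so it survives being stored as a URL.
  return 'javascript:' + encodeURIComponent(body);
}
```

Call it with your CSS as a string and paste the result into a bookmark's location field. Because the rules are only appended, reloading the page undoes them.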


this or any other bookmarklet or greasemonkey script is an "XSS exploit" (of the non-persistent variety)

Sure. All I'm saying is I hadn't previously seen such a script that pulled in most of its code from an external site. That's the new part (to me, maybe that's old hat) and it struck me that this opens up another vector for the enterprising bad person.
posted by i_am_joe's_spleen at 4:35 PM on March 3, 2009


In Firefox, go to the View menu, through Page Style, and select No Style. You'll be glad you did.
posted by LogicalDash at 5:12 PM on March 3, 2009


Use it when you're actually reading an article. It fails miserably at aggregated content.
posted by empath at 5:23 PM on March 3, 2009


In Firefox, go to the View menu, through Page Style, and select No Style. You'll be glad you did.

Ugh. That put everything in one column.

The Readability bookmarklet is pretty darn cool. I'm a big fan of the zap bookmarklets from Jesse, but this one is something else.

Here's the Javascript I've got:

javascript:(function(){readStyle='style-newspaper';readSize='size-medium';readMargin='margin-narrow';_readability_script=document.createElement('SCRIPT');_readability_script.type='text/javascript';_readability_script.src='http://lab.arc90.com/experiments/readability/js/readability-0.1.js?x='+(Math.random());document.getElementsByTagName('head')[0].appendChild(_readability_script);_readability_css=document.createElement('LINK');_readability_css.rel='stylesheet';_readability_css.href='http://lab.arc90.com/experiments/readability/css/readability.css';_readability_css.type='text/css';document.getElementsByTagName('head')[0].appendChild(_readability_css);_readability_print_css=document.createElement('LINK');_readability_print_css.rel='stylesheet';_readability_print_css.href='http://lab.arc90.com/experiments/readability/css/readability-print.css';_readability_print_css.media='print';_readability_print_css.type='text/css';document.getElementsByTagName('head')[0].appendChild(_readability_print_css);})();

And it works pretty darn well. SFGate.com fails completely (no surprise there), but NY Times, Washington Post, L.A. Times, the New Yorker... all look pretty good.

Interestingly, if you want to print out an article on paper, the Readability bookmarklet might actually use more pages of paper than the standard printable version. The New Yorker link above seems to be 14 pages before running the bookmarklet and 21 pages afterward.
posted by mrgrimm at 5:35 PM on March 3, 2009
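
For anyone who wants to read what that one-liner does, here it is unrolled into a function. The behaviour is meant to match the bookmarklet quoted above. The `doc`, `win`, and `base` parameters are additions for illustration, with `base` there so you can point it at a self-hosted copy as ook suggests. The note that the random query string is for cache-busting is a guess at its purpose.

```javascript
// Unrolled version of the bookmarklet: set three option globals, then add
// one script and two stylesheets to the page's <head>.
function loadReadability(doc, win, base) {
  const root = base || 'http://lab.arc90.com/experiments/readability';
  const head = doc.getElementsByTagName('head')[0];

  // The bookmarklet sets these as globals, presumably as display options
  // for the remote script.
  win.readStyle = 'style-newspaper';
  win.readSize = 'size-medium';
  win.readMargin = 'margin-narrow';

  // Load the remote script. The random query string presumably keeps the
  // browser from reusing a cached copy.
  const script = doc.createElement('script');
  script.type = 'text/javascript';
  script.src = root + '/js/readability-0.1.js?x=' + Math.random();
  head.appendChild(script);

  // Helper for the two stylesheets (screen and print).
  const addStylesheet = (href, media) => {
    const link = doc.createElement('link');
    link.rel = 'stylesheet';
    link.type = 'text/css';
    link.href = href;
    if (media) {
      link.media = media;
    }
    head.appendChild(link);
  };
  addStylesheet(root + '/css/readability.css');
  addStylesheet(root + '/css/readability-print.css', 'print');
}
```

In a browser you would call `loadReadability(document, window)`. Passing your own URL as the third argument only helps if that location serves the same three files under the same paths.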


Heh, I tested it out on the first bookmark of unread stuff I had available in my browser bar, which was David Foster Wallace's novel excerpt in the New Yorker this week. At first, as I scanned down the opening paragraph, I thought, wow, this thing is great. Then the opening paragraph just seemed to keep going and going, and I thought, uh oh, it doesn't seem to handle paragraph breaks very well. Then I thought, damn, finally there is a paragraph break, but now it's omitting quotation marks on the dialogue! Shoot, I thought; such a good idea but it's not quite reliable, and thus I can't trust it for fiction, which is what I'd like to use it for most.
Then I reloaded the original and saw that, in fact, the opening paragraph of the story does go on nearly forever, and that there are no quotations around the dialogue either.
So it was a great moment for Readability the program. Readability the literary concept, though, not so much.
posted by roombythelake at 5:47 PM on March 3, 2009 [1 favorite]


Anybody know of a way to make it easier to pick up where you left off when scrolling down using spacebar or page down in Firefox (or other browsers)?

Seems like it would be easy to develop an add-in that leaves a faint line or something where the bottom of frame was...
posted by micropublishery at 7:42 PM on March 3, 2009
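
The arithmetic behind that "faint line" idea is small enough to sketch. This is only the position calculation, not an add-in, and the function name is invented. It assumes you can record the scroll offset before and after each page-down.

```javascript
// Given the vertical scroll offsets before and after a page-down, return where
// the old bottom edge of the window now sits, in pixels from the top of the
// new view. Returns null if that edge is no longer on screen.
function previousBottomMarker(prevScrollY, newScrollY, viewportHeight) {
  const oldBottom = prevScrollY + viewportHeight; // in page coordinates
  const offset = oldBottom - newScrollY; // in window coordinates
  return offset >= 0 && offset <= viewportHeight ? offset : null;
}
```

An add-in could feed this from a scroll listener and draw a thin fixed-position line at the returned offset.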


If you use NoScript, I found I had to enable the top-level domain before the bookmarklet would execute. No need to reload the page after enabling it though (unless you have it set to automatically do that).
posted by cj_ at 10:18 PM on March 3, 2009


Darn. Doesn't seem to work on very long texts (e.g., The Adventures of Huckleberry Finn) that you might want to read in your browser after converting the .TXT to .HTM.
posted by micropublishery at 6:11 AM on March 4, 2009


Alas, it was too much to hope that it would work on a Chinese web fiction site.
posted by of strange foe at 9:06 AM on March 4, 2009


The Zap bookmarklet is better.
posted by icheyne at 3:13 PM on March 4, 2009



