Google now lets you filter via basic, intermediate and advanced reading levels...
December 21, 2010 9:15 PM

Getting to advanced reading level content. As pioneered by Adrian Chen of Gawker, by far the most interesting application of the tool is its ability to rate the overall level of material on any given site, simply by dropping site: [domain.com] into the search box.
posted by Muirwylde (51 comments total) 8 users marked this as a favorite
 
Metafilter is listed as 69% basic. And the advanced reading level content has a thread titled "Boogers" on the first page of results.
posted by infinitewindow at 9:20 PM on December 21, 2010 [5 favorites]


The second "Advanced" result for a Metafilter URL search? Couldn't be more appropriate. Well played, Google.
posted by Rhaomi at 9:24 PM on December 21, 2010 [5 favorites]


Advanced seems to mean 'jargon-y'
posted by empath at 9:28 PM on December 21, 2010 [1 favorite]


Youtube: 76% basic. I totally called that.
posted by dunkadunc at 9:29 PM on December 21, 2010


Hustler.com gets 1% advanced, and the "advanced" pages seem to focus on Asians.

(NSFW, obvs)
posted by scratch at 9:40 PM on December 21, 2010


Advanced.com is 62% Advanced and 0% Basic, as you would hope.
posted by twoleftfeet at 9:47 PM on December 21, 2010 [1 favorite]


along similar lines, annotations indicating frequency of misspelled words and percent ALL CAPS would be very helpful.
posted by mexican at 9:54 PM on December 21, 2010


I'm not sure why Fark is 80% Basic, although this is the first result.
posted by twoleftfeet at 9:54 PM on December 21, 2010


Apparently, this only works for English search. My Google account defaults to a Spanish-language interface, and I don't have the reading level option under advanced search unless I open a privacy-mode window.
posted by d. z. wang at 9:54 PM on December 21, 2010


What google says about google: science; politics; sex; art; bing; yahoo; google; sycophants.
posted by vapidave at 10:00 PM on December 21, 2010


Oh, if you want to bookmark it, the magic GET parameters appear to be tbs=rl:1 to show the reading level ratings, and tbs=rl:1,rls:N to display results only from the Nth level (e.g., rls:0 for basic, or rls:2 for advanced). If your default Google search is in a language where this feature is not yet available, you also need hl=en to force English language search.
posted by d. z. wang at 10:02 PM on December 21, 2010 [3 favorites]
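
For bookmarking, the parameters d. z. wang describes could be assembled into a URL along these lines. This is only a sketch: the https://www.google.com/search base and the q parameter are assumptions not stated in the comment, reading_level_url is a hypothetical helper name, and the tbs values are taken as reported above rather than from any Google documentation.

```python
from urllib.parse import urlencode

# Level numbers as reported in the comment above: rls:0 is basic, rls:2 is
# advanced. Treating rls:1 as intermediate is an inference, not a quote.
LEVELS = {"basic": 0, "intermediate": 1, "advanced": 2}


def reading_level_url(query, level=None, force_english=True):
    """Build a search URL that asks for reading-level annotations.

    The base URL and the "q" parameter are assumptions; only tbs and hl
    come from the thread.
    """
    tbs = "rl:1"  # show the reading level ratings
    if level is not None:
        if level not in LEVELS:
            raise ValueError(f"unknown reading level: {level!r}")
        tbs += f",rls:{LEVELS[level]}"  # restrict results to one level
    params = {"q": query, "tbs": tbs}
    if force_english:
        params["hl"] = "en"  # force English-language search
    # Keep ':' and ',' unescaped so the tbs value reads as in the comment.
    return "https://www.google.com/search?" + urlencode(params, safe=":,")


print(reading_level_url("site:metafilter.com", level="advanced"))
```

Passing level=None gives the annotated-but-unfiltered view (tbs=rl:1 alone).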


The Green is more Basic than the Blue, but Grey is more Basic than both.
posted by twoleftfeet at 10:08 PM on December 21, 2010 [1 favorite]


ooops
posted by twoleftfeet at 10:09 PM on December 21, 2010


This is a terrible way to identify reading levels.
posted by TwelveTwo at 10:10 PM on December 21, 2010 [4 favorites]


Interesting thing I noticed when playing with this the other day:

metafilter.com : 69% Basic / 29% Intermediate / <1% Advanced
www.metafilter.com: 58% / 40% / 1%
ask.metafilter.com: 75% / 24% / <1%
metatalk.metafilter.com: 72% / 27% / <1%

So it's not just AskMe's that pull the site average down…

(For comparison, Yahoo Answers ranks as 60% / 37% / 1%, and my gf's P-12 school ranks 6% / 93% / 0%)
posted by Pinback at 10:16 PM on December 21, 2010


Strangely, www.wikipedia.org is 100% intermediate and 0% basic and advanced. Maybe Google are somehow using Wikipedia to calibrate what an "intermediate" reading level is.
posted by L.P. Hatecraft at 10:22 PM on December 21, 2010 [1 favorite]


www.wikipedia.org is 100% intermediate

But english wikipedia is more advanced than french wikipedia, and german wikipedia is more advanced than both.
posted by twoleftfeet at 10:38 PM on December 21, 2010


the magic GET parameters appear to be tbs=rl

For super advanced it's tbs=tl;dr
posted by twoleftfeet at 10:44 PM on December 21, 2010 [2 favorites]


Yeah, someone that works for google wants to buy their kid the GI Joe with the Kung-Fu grip. He or she knows that if they don't buy their kid the GI Joe with the Kung-Fu grip that the morning of 12/25 will be hell and they are having trouble meeting their mortgage in Santa Clara anyways. So they adjust the algorithm such that it makes their employers look good, they work for another week, and the kid gets the GI Joe with the Kung-Fu grip.
posted by vapidave at 10:52 PM on December 21, 2010 [1 favorite]


sodium hydroxide: 1% Basic

:/
posted by Avelwood at 11:18 PM on December 21, 2010 [15 favorites]


My own blog is 25% basic and 75% intermediate. 0% advanced. Knew I shouldn't have dropped out of college.
posted by maxwelton at 12:10 AM on December 22, 2010


What? What does this even mean?
"The feature is based primarily on statistical models we built with the help of teachers. We paid teachers to classify pages for different reading levels, and then took their classifications to build a statistical model. With this model, we can compare the words on any webpage with the words in the model to classify reading levels. We also use data from Google Scholar, since most of the articles in Scholar are advanced." Filter your results by reading level
I've read a lot of scientific papers (i.e. Scholar content) and they're mostly turgid, wordy and rubbish. So is "advanced" just a synonym for "lots of long words and bad syntax"? So we should avoid "advanced"?

Or in other words does:
"Rain is water falling from clouds in the sky"
score as basic while
"Water descending vertically in a downwards direction, produced by clouds located in an upwards direction in the area commonly known as the 'sky', is termed in standard discourse as 'rain' under most common situations."
is advanced? That's not helpful.
posted by alasdair at 12:38 AM on December 22, 2010 [2 favorites]


Alert All Teachers: From Now On, grades for student essays shall be assigned using the following formula: ( advanced score)/(basic score) * (intermediate score)

You're welcome.
posted by honest knave at 12:41 AM on December 22, 2010


Hey, this askme post of mine is #4 in "advanced" results. Heh.

I think they ought to split out different types of "advanced" stuff, it seems as though scientific terms really boost a content's 'advancedness' (or is that advanceditude?)
posted by delmoi at 12:51 AM on December 22, 2010


Yes, rating human output with computer algorithms always works out so very well.

I've seen plenty of 'advanced' writing that was intended to confuse, to be complex for the sake of complexity. I've seen it in academic papers especially; it's showing off in writing, not communicating. Now, there are certainly jargony, dense papers that need to be jargony and dense. Medical articles come to mind; you often need a lot of background to have any idea what a particular paper is about. Explaining everything would take multiple books, and they'd never get their paper written if they tried. But an awful lot of academia goes for turgid prose merely because that's the accepted style, not out of necessity. Dressing up simple ideas in complex words doesn't improve the ideas. (If you can't tell, needless complexity irritates the hell out of me.)

Can Google tell the difference between information density and intentional obfuscation?

As Einstein said, if you really understand your field, you should be able to explain it to a janitor. If you can only explain it to colleagues, you're just not that good.
posted by Malor at 1:37 AM on December 22, 2010 [5 favorites]


"Advanced" does not mean "well-written." Advanced means "you have to be a fairly bright spark to get what the hell he's on about." I'm sure if a good editor were to get a hold of a lot of these advanced texts, turn them upside down and wring out the semi-colons, you'd find the actual ideas in them pretty basic.

However, what this could be quite helpful for is killing those content spam sites that fill the first page of results every time you search for general, domestic how-to type stuff. Most of these are at about a second grade reading level, and only about one in 20 usually contains sufficient information to be any use.
posted by Diablevert at 1:54 AM on December 22, 2010 [2 favorites]


It seems like quite a good way of finding material which treats your subject in an academic way - but the article seems to imply that "advanced" is synonymous with "high quality" simply because it filters out "spammers and mouthbreathers". I think that depends on what you are looking for. For example, in vapidave's link to "sex" above, the "advanced" articles sound like the least fun.

I suspect Google may be using something a bit like Flesch-Kincaid or Gunning-Fog as an algorithm. What I would really like to see is some kind of algorithm which could tease out sites with text which was interesting and written well for online reading. For that this site is probably a better bet.
posted by rongorongo at 2:14 AM on December 22, 2010


I suspect Google may be using something a bit like Flesch-Kincaid or Gunning-Fog as an algorithm

I think they're doing something else. Re: how we classify pages:
The feature is based primarily on statistical models we built with the help of teachers. We paid teachers to classify pages for different reading levels, and then took their classifications to build a statistical model. With this model, we can compare the words on any webpage with the words in the model to classify reading levels. We also use data from Google Scholar, since most of the articles in Scholar are advanced.
posted by twoleftfeet at 2:48 AM on December 22, 2010
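
The quoted description (teacher-labeled pages, then comparing a page's words against the model) can be illustrated with a toy word-frequency classifier. To be clear about what is assumed: the quote does not say which statistical model Google uses, so the naive Bayes approach below is a stand-in chosen for illustration, and the training sentences, TOY_PAGES, train and classify are all made up for this sketch.

```python
import math
import re
from collections import Counter

# Hypothetical stand-in for "pages classified by teachers".
TOY_PAGES = {
    "basic": [
        "rain is water falling from the sky",
        "the cat sat on the mat",
    ],
    "advanced": [
        "precipitation results from atmospheric condensation processes",
        "statistical models estimate latent parameters",
    ],
}


def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())


def train(labeled_pages):
    """Count words per level; returns (word_counts, doc_counts, vocab)."""
    word_counts = {level: Counter() for level in labeled_pages}
    doc_counts = {level: len(texts) for level, texts in labeled_pages.items()}
    for level, texts in labeled_pages.items():
        for text in texts:
            word_counts[level].update(tokenize(text))
    vocab = set().union(*word_counts.values())
    return word_counts, doc_counts, vocab


def classify(text, model):
    """Pick the level whose word statistics best match the text."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    words = tokenize(text)
    best_level, best_score = None, -math.inf
    for level, counts in word_counts.items():
        total_words = sum(counts.values())
        # Log prior plus log likelihood with add-one (Laplace) smoothing.
        score = math.log(doc_counts[level] / total_docs)
        for word in words:
            score += math.log((counts[word] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_level, best_score = level, score
    return best_level


MODEL = train(TOY_PAGES)
print(classify("water falling from the sky", MODEL))
print(classify("atmospheric condensation processes", MODEL))
```

A model like this rewards vocabulary rather than clarity, which is consistent with the "advanced seems to mean jargon-y" complaints upthread.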


We paid teachers to classify pages for different reading levels, and then took their classifications to build a statistical model.

Interesting. I wonder what would happen if they paid editors to classify material according to how well it communicated its message? Equally I wonder if metafilter FP recommendations could be aggregated in such a way as to produce a predictive model of what new material was worth reading? - That would save us all a LOT of time!
posted by rongorongo at 3:24 AM on December 22, 2010


As Einstein said, if you really understand your field, you should be able to explain it to a janitor. If you can only explain it to colleagues, you're just not that good.

--Ooh, ooh, big ideas made easy!

Tyranny for the Commons Man Basic 0% Intermediate 100% Advanced 0%

-- Hey, that's not bad.

Results by reading level for http://en.wikipedia.org/wiki/Excluded_volume:
Basic 1% Intermediate 6% Advanced 92%

-- Okay, but the next one's not so numbery.

Another Face of Entropy: Particles self-organize to make room for randomness

Results by reading level for http://www.sciencenews.org/pages/pdfs/data/1998/154-07/15407-15.pdf:
Basic 0% Intermediate 0% Advanced 0%

-- Won't work for pdf's :(
posted by dragonsi55 at 3:43 AM on December 22, 2010


This kind of thing is one of Google's minor stocks-in-trade by now, isn't it? Big splashy announcement of a (kinda ridiculous) new service no one asked for to begin with, lots of clicks in the first days and then a long slow decline into nothing. Keeps their name in the financial press.
posted by mediareport at 3:47 AM on December 22, 2010 [1 favorite]


The Flesch-Kincaid and Gunning-Fog tests both rely on a measure of intrinsic complexity of sentences. Flesch-Kincaid uses a silly formula which weights short sentences using words of few syllables as more "basic". Gunning-Fog uses a similar but different silly formula.

Neither of these approaches can possibly take into account the way language is actually used, but maybe Google's approach can. By comparing usage across documents, you could have a "readability rank" instead of a "page rank".
posted by twoleftfeet at 3:56 AM on December 22, 2010 [1 favorite]
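
For reference, the two formulas mentioned here can be sketched as follows, applied to alasdair's two "rain" sentences from earlier in the thread. The coefficients are the commonly published ones for the Flesch-Kincaid grade level and the Gunning fog index; the syllable counter is a crude vowel-run heuristic assumed for illustration (real implementations use dictionaries and extra rules), so the scores are approximate.

```python
import re


def count_syllables(word):
    # Crude assumption: one syllable per run of vowels, at least one per word.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def text_stats(text):
    """Return (sentence count, word count, per-word syllable counts)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not words:
        raise ValueError("text contains no words")
    syllables = [count_syllables(w) for w in words]
    return max(1, len(sentences)), len(words), syllables


def flesch_kincaid_grade(text):
    n_sentences, n_words, syllables = text_stats(text)
    return 0.39 * (n_words / n_sentences) + 11.8 * (sum(syllables) / n_words) - 15.59


def gunning_fog(text):
    n_sentences, n_words, syllables = text_stats(text)
    # "Complex" words are approximated as those with three or more syllables.
    complex_words = sum(1 for s in syllables if s >= 3)
    return 0.4 * ((n_words / n_sentences) + 100 * (complex_words / n_words))


SIMPLE = "Rain is water falling from clouds in the sky."
WORDY = (
    "Water descending vertically in a downwards direction, produced by clouds "
    "located in an upwards direction in the area commonly known as the 'sky', "
    "is termed in standard discourse as 'rain' under most common situations."
)

for name, text in (("simple", SIMPLE), ("wordy", WORDY)):
    print(name, round(flesch_kincaid_grade(text), 1), round(gunning_fog(text), 1))
```

Both formulas look only at sentence length and word length, so the wordy version scores as harder even though it says the same thing.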


I think Google is just messing with me. Basic English is a deliberately simplified version of English, but the books at ogden.basic-english.org are 100% advanced.
posted by twoleftfeet at 4:16 AM on December 22, 2010


Strangely, www.wikipedia.org is 100% intermediate and 0% basic and advanced. Maybe Google are somehow using Wikipedia to calibrate what an "intermediate" reading level is.
On a similar note...
posted by dougrayrankin at 5:06 AM on December 22, 2010 [2 favorites]


We also use data from Google Scholar, since most of the articles in Scholar are advanced

Many excellent scientific papers are written in a close approximation of Simple English so that everyone gets it, including the huge international audience. No points are awarded for prose styling (unless you are Brad Efron or Andrew Gelman).
posted by a robot made out of meat at 5:17 AM on December 22, 2010


December 10th was National Plain English day.
posted by various at 6:36 AM on December 22, 2010


[verbose]= advanced
posted by JJ86 at 7:45 AM on December 22, 2010


Doesn't take into account that Mefites excel at reading between the lines.
posted by Kabanos at 7:57 AM on December 22, 2010 [1 favorite]


I think that this could be really useful to tell you when your website/information is unhelpful. I work in education so that's where my mind immediately goes, and in DC (and tons of other areas) there are lots of families with kids in the school system who are not entirely comfortable with the written word for any number of reasons including language and educational background. Having something that will tell you whether or not you are communicating information in a generally basic and understandable way could be really helpful for teachers and administrators who need to communicate with the families of kids in their care. It's certainly not perfect and it won't tell you whether or not you are actually making any sense, but it could be a useful check to make sure that you are imparting information in an easier to understand way.
posted by Mrs. Pterodactyl at 8:22 AM on December 22, 2010 [1 favorite]


Malor: "Yes, rating human output with computer algorithms always works out so very well.

I've seen plenty of 'advanced' writing that was intended to confuse, to be complex for the sake of complexity. I've seen it in academic papers especially; it's showing off in writing, not communicating.
...

Can Google tell the difference between information density and intentional obfuscation?

As Einstein said, if you really understand your field, you should be able to explain it to a janitor. If you can only explain it to colleagues, you're just not that good
"

Man, can I try this on "A Thousand Plateaus"? Would that be like... hyperadvanced?
posted by symbioid at 9:03 AM on December 22, 2010


As Einstein said, if you really understand your field, you should be able to explain it to a janitor. If you can only explain it to colleagues, you're just not that good.

I'm automatically skeptical of any quote attributed to any of the Founding Fathers, Einstein, Winston Churchill or Yogi Berra.

Do you have a source for this?
posted by empath at 9:07 AM on December 22, 2010


At advanced reading level, put site:amazon.com in your search. Getting some interesting books that way that I wouldn't have come across otherwise. Oh, that probably works for books.google.com too.
posted by zeek321 at 9:13 AM on December 22, 2010


It's easy to rip on this kind of technology, but as someone who works in natural language processing, I think this is potentially quite an achievement. It remains to be seen whether or not filtering pages based on reading level actually helps people find the information they want, but if it does, it's a big deal. I wish I'd thought of it first. It's been "obvious" for at least a decade that you can use NLP to help search for things on the Internet, but so far nobody has been able to use NLP to build a search service that people want to use on a daily basis.

The criticisms people are making here are valid, but they kind of miss the point. I'm sure Google would love to rank pages by writing quality or by how intelligent the content is, but those aren't things they can measure, so they've taken the nearest thing they can measure and offered it as a search option. They call it "reading level" because that's a fairly intuitive description of what they're computing, but you shouldn't read too much into the name. The important thing is that it's a measurable statistical property that can serve as a proxy for how intelligent or well-written a page is. It's far from perfect, but that doesn't mean it can't be extremely useful.
posted by shponglespore at 10:03 AM on December 22, 2010 [2 favorites]


empath: Do you have a source for this?

Well, in a quick search, I didn't find the exact quote I was looking for -- it was one my father repeated to me many times, and it may be in one of his books and not on the web. I did, however, find this one, top of the page:
If you can't explain it simply, you don't understand it well enough.
Near the bottom of that same page:
It should be possible to explain the laws of physics to a barmaid.
On the next page:
Most of the fundamental ideas of science are essentially simple, and may, as a rule, be expressed in a language comprehensible to everyone.
Given how many times my father said it, and the fact that Einstein was approaching that same idea from several directions, I'm pretty sure there must be a version involving a janitor somewhere, but I haven't spotted it yet.
posted by Malor at 10:56 AM on December 22, 2010


Meanwhile, excluding sites aimed at children, here's the dumbest

I think the author is conflating verbiage with message.

It seems possible to offer intelligent viewpoints with simple language.
posted by mrgrimm at 12:08 PM on December 22, 2010


Strangely, www.wikipedia.org is 100% intermediate and 0% basic and advanced. Maybe Google are somehow using Wikipedia to calibrate what an "intermediate" reading level is.

I think Google is just messing with me. Basic English is a deliberately simplified version of English, but the books at ogden.basic-english.org are 100% advanced.

Remember to check your results. L.P. Hatecraft, there is only one page that returns as being from "www.wikipedia.org", a talk page about the Provinces of Iran (wonder how that got in google's results). twoleftfeet, your link only includes one page result as well.

The completely even 33% split for google.com still seems suspect, though.
posted by meandthebean at 1:33 PM on December 22, 2010


That's probably what they're calibrating against, meandthebean... they have some kind of hard scoring algorithm, and then split the entire net into three equal pieces, based on where their scores lie.
posted by Malor at 1:45 PM on December 22, 2010
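
Malor's guess (score every page, then cut the whole set into three equal pieces) would look something like the sketch below. Nothing in the thread confirms Google does this; split_into_thirds, the page names and the scores are all hypothetical.

```python
from collections import Counter


def split_into_thirds(scores):
    """Label each page by which third of the score distribution it falls in.

    scores: dict mapping a page name to a numeric difficulty score.
    """
    if len(scores) < 3:
        raise ValueError("need at least three scored pages")
    ranked = sorted(scores.values())
    low_cut = ranked[len(ranked) // 3]       # first score of the middle third
    high_cut = ranked[2 * len(ranked) // 3]  # first score of the top third
    levels = {}
    for page, score in scores.items():
        if score < low_cut:
            levels[page] = "basic"
        elif score < high_cut:
            levels[page] = "intermediate"
        else:
            levels[page] = "advanced"
    return levels


# Made-up scores for nine imaginary pages.
EXAMPLE_SCORES = {f"page-{i}": float(i) for i in range(1, 10)}
LEVELS = split_into_thirds(EXAMPLE_SCORES)
print(Counter(LEVELS.values()))
```

Under this scheme the whole corpus is an even three-way split by construction, which would explain a 33/33/33 result for a very broad query but says nothing about what the underlying score measures.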


Oh, thank Go ... Google.

Of course reading level alone (at least, the old standards like Fry) tends to prefer extended Veblenesque elocutionary perambulations which are seldom mellifluous guarantors of cognitive gratification.

They don't usually measure the quality of ideas, elegance of style, literacy or timelessness of prose. People do that, not rules.
posted by Twang at 1:49 PM on December 22, 2010


Given how many times my father said it, and the fact that Einstein was approaching that same idea from several directions, I'm pretty sure there must be a version involving a janitor somewhere, but I haven't spotted it yet

yeah, but those are unsourced quotes, and most of the unsourced quotes I've seen for Einstein are misattributed, when I've researched them.
posted by empath at 1:54 PM on December 22, 2010


Of which that one seems to be, based on Wikiquote. The barmaid quote is more commonly linked with Ernest Rutherford, but even that seems to be dubiously sourced.
posted by empath at 1:58 PM on December 22, 2010


twoleftfeet: "We paid teachers to classify pages for different reading levels, and then took their classifications to build a statistical model."

Wonder how many just relied on Flesch-Kincaid and called it a night.
posted by pwnguin at 4:32 PM on December 22, 2010



