The Cuilest Encyclopedia
April 10, 2010 3:48 PM

Previously discussed search engine Cuil has a reputation for absurd search results. Now the Cuil team is building on their success with cpedia, the new automated encyclopedia.

From their entry for Metafilter:
MetaFilter was created by Matt Haughey (mathowie). He [Matt Haughey] was the sole admin of the site until 2005 when Jessamyn West (jessamyn) started helping keep Ask MetaFilter on track.

A double post is when someone posts something to the front page of MetaFilter (or sometimes MetaTalk) that has been there before.

Go to his userpage for more information on making good posts to MetaFilter [now with nthdegx's advice on multi-link posts] After you've found a site you want to feature on the front page, first go to the POST A LINK page on Metafilter, and enter in the url of the site.

This came up in a MeFi thread (a site called "Life Support" that supposedly answered any question), as well as in MetaTalk when Ask MetaFilter became reality, and now I'm looking for something similar.
It only gets weirder further down. Heading number one on their outline for Metafilter: Pawn Stars.
posted by JDHarper (61 comments total) 13 users marked this as a favorite
posted by New England Cultist at 3:59 PM on April 10, 2010 [1 favorite]

Best Web Hosting Offer Cuil launches with an index of 120 billion web pages making it a most comprehensive magic! Affiliate Secrets search engine on the web and also a potential try to get the Google competitor. Clickbooth Cuil but not avail due to flooding traffics and making their servers 'too hot' to handle. After googling Cuil is down at the moment to 'cool' down.

Hell no..!! Cuil is a search engine that has launched on this (28th 2008). "But according to reviews on CNN, cuil fails to deliver a good service better than Google.
A most comprehensive magic.
posted by Lemurrhea at 4:00 PM on April 10, 2010 [4 favorites]

It's like some form of internet performance art. My guess is they're trying to show off their natural language processing and generation technology. It's the kind of thing that can produce interesting results but not very useful results now.

These things get better over time. Just look at how the biggest use of Babelfish was to translate pages back and forth for the humor value, whereas Google Translate actually gives you a pretty good idea of what pages in other languages are saying.

Sadly, the only real "purpose" I see for stuff like this is to take SEO spamming to the next level. Rather than hire hundreds of people to write crap for Demand Media, you're just going to see auto-generated articles crop up on the fly. They'll be even more useless, and finding real results on the 'un-moderated' part of the web is just going to become impossible. You'll get all your results from Wikipedia or something.

And of course you could even use this tech to post lots of spam to wikipedia that regular users wouldn't be able to really detect either. Hmm.

(Another use could be more highly customized articles. So for example instead of getting an article on "Nintendo" you could have a custom generated article like "Nintendo for developers" or "Nintendo for gamers" or "Nintendo's relationship with Sega in the 1990s" or whatever -- and get a detailed article with info from lots of sources)


Also, MetaFilter: sounds like it could work as part of a Lady GaGa backing track.
posted by delmoi at 4:08 PM on April 10, 2010

One line of the article, no changes:
MetaFilter: Roman Polanski Daycare Centre.
posted by delmoi at 4:10 PM on April 10, 2010 [22 favorites]

Looks to me like they are just scraping content from other websites without attribution.
posted by empath at 4:12 PM on April 10, 2010

Looks to me like they are just scraping content from other websites without attribution.

I searched "Wittgenstein." The first two sentences are from a SparkNotes message board. The fourth is from Wikipedia.
posted by carter at 4:17 PM on April 10, 2010 [1 favorite]

The cpedia entries are like reading a spammer's e-mail.
posted by Nattie at 4:18 PM on April 10, 2010 [3 favorites]

They're putting the AI in FAIL ...
posted by carter at 4:18 PM on April 10, 2010 [34 favorites]

I have six entries devoted to me, which means Cuil recognizes my inherent self-value. Rory Marinich approves of this!
posted by Rory Marinich at 4:23 PM on April 10, 2010

It is pretty much useless crap.... really not worth paying attention to..
posted by HuronBob at 4:23 PM on April 10, 2010

I dunno, it seems pretty accurate to me. First line from George W. Bush: A large part of the hope accompanying the election of Barack Obama as president of the United States has been relief at the departure of George W. Bush.

What more do you really need to know?
posted by shakespeherian at 4:25 PM on April 10, 2010

Very beginning of Lady Gaga:
Apparently the two fastest growing HIV rates are among women in the age groups 17-24 and 39-60, which Lady GaGa and Cyndi Lauper represent.

Kanye West and current radio darling Lady GaGa are heading out on a North American tour together. The Bad Romance singer recently had a sexual health screening in New York because she 'forgot' to look after her body properly when she was performing live shows for The Monster Ball Tour concerts.
posted by mccarty.tim at 4:27 PM on April 10, 2010

cortex is Jeff Goldblum. That explains a lot.
posted by lukemeister at 4:29 PM on April 10, 2010 [1 favorite]

I don't know if this makes me happier or sadder that I don't have a wikipedia entry.

On preview, I didn't realize there were multiple articles. Man. Which one is the best/worst me?
posted by cortex at 4:33 PM on April 10, 2010

jeff goldblum was cute in that private eye tv show and i sat through independence day twice for his geeky look. i don't want to hear anything la la la against him

i think i saw lady gaga walking down my street last week before easter (with a security guard type escort) - i was sorta stumbling along behind her when she turned and gave me 'the eye' over her shoulder

could I write spam results for this thing then do you think?
posted by infini at 4:35 PM on April 10, 2010

Presenting penis
posted by The Devil Tesla at 4:40 PM on April 10, 2010 [1 favorite]

I found out that:

"So he [{my real name}] got an easy ride to Hollywood and helped make an awful movie based on a great book."

I also seem to have a kid I didn't know about "But his [Harvey Weinstein's] parents {my real name}..."

And this was by searching for my kid's name...
posted by HuronBob at 4:43 PM on April 10, 2010

garkov has finally reached its full potential.
posted by lukemeister at 4:49 PM on April 10, 2010

Metafilter: I think stressing about it now causes my penis and testicles to be in an almost permanent shrunken state.
posted by Horace Rumpole at 4:49 PM on April 10, 2010 [1 favorite]

I love these people and I hope for their next project they make an office suite. cProcessor, cSheet, and so on.
posted by Wolfdog at 4:53 PM on April 10, 2010 [4 favorites]

That S.O.B. Alvy Moore and his stinking grave pushed me off of my tab.
posted by Alvy Ampersand at 4:55 PM on April 10, 2010 [1 favorite]

This is... a joke, right? They weren't seriously expecting people to like this non-ironically, were they?
posted by mccarty.tim at 4:57 PM on April 10, 2010


cSheet sounds about right.
posted by lukemeister at 4:58 PM on April 10, 2010

Amazing, how did they capture what goes through my mind when I've spent too long reading wikipedia then try to go to sleep.
posted by 7-7 at 4:59 PM on April 10, 2010 [2 favorites]

It reads like a cross of Markov chains & the autosummarize function. And I wonder what those would do to Cuil entries.
posted by Pronoiac at 5:14 PM on April 10, 2010
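
For anyone who hasn't played with one, the Markov-chain half of that comparison can be sketched in a few lines of Python. This is a toy word-level, order-1 model, shown only to illustrate why such text reads as locally plausible but globally meaningless. Nothing in the thread says how Cpedia actually builds its pages, and the names `build_chain` and `generate` are made up for this sketch.

```python
import random
from collections import defaultdict


def build_chain(text):
    """Map each word to the list of words that follow it in the text."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain


def generate(chain, start, max_words=20, seed=None):
    """Walk the chain from `start`, picking a random successor each step."""
    rng = random.Random(seed)
    word = start
    output = [word]
    for _ in range(max_words - 1):
        successors = chain.get(word)
        if not successors:  # dead end: this word never had a successor
            break
        word = rng.choice(successors)
        output.append(word)
    return " ".join(output)


if __name__ == "__main__":
    # Invented sample text, not taken from any Cpedia page.
    sample = (
        "the search engine is a search result and "
        "the encyclopedia is a search engine for the web"
    )
    chain = build_chain(sample)
    print(generate(chain, "the", max_words=12, seed=1))
```

Every adjacent pair of words in the output appeared somewhere in the input, which is roughly the "almost connected to something correct" feel people are describing here.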

There are 7 different pages about Jesus Fucking Christ in Cpedia. View other pages about Jesus Fucking Christ.
posted by Wolfdog at 5:15 PM on April 10, 2010

Wow, what a trainwreck. And usually I try to respect folks making efforts in information retrieval. But this is.. just.. awful. It's like someone published their poetry journal full of automatic writing.

The only reason I can see to launch this is hoping you've stuffed enough keywords that you can get some ad revenue from searchers who accidentally land here. Or else it's a tech demo produced under extreme duress, desperate for a buyer. A technoabsurdist buyer.
posted by Nelson at 5:27 PM on April 10, 2010

I wonder how long it'll be before someone sues them for defamation?

If my own entry (shared with every other Daniel Rutter in the world, of course!) is anything to go by, cpedia seems to have a habit of taking things written by a person, or things written in comments on someone's blog, and turning them into statements about that person.

"My" cpedia article says I have a fake Ph.D, that I'm a crazy incompetent "expert witness", that I'm a nutjob who may try to rob someone, that I download "90-odd gig a month" of torrents, etc etc. Oh, and that I was once trapped in a vat of custard. All because these statements have been made somewhere within X semantic links of my name.

(And this is before we even consider the lulz 4chan will derive from the name "cpedia" alone...)
posted by dansdata at 5:29 PM on April 10, 2010 [3 favorites]

Content from 81 webpages curated by Cpedia, the automated encyclopedia.
There are 17 different pages about poo in Cpedia. View other pages about poo.

And speaking of poo, I never wet or pooed inside the van once all weekend. [1]

1. Anna Torv
2. Rabelaisian or Swiftian
3. Hysterical
4. Stephen Pinker
posted by Wolfdog at 5:32 PM on April 10, 2010


You're with friends. You can admit your mega-torrenting, custard-loving, sword-carrying ways to us.
posted by lukemeister at 5:34 PM on April 10, 2010 [1 favorite]

These folks can't even figure out how not to hammer my websites with stupid levels of crawling — I don't see them as having much of a shot at developing AquaNet, much less SkyNet, with their resource-squandering ways.
posted by adipocere at 5:38 PM on April 10, 2010 [1 favorite]

Sadly, the only real "purpose" I see for stuff like this is to take SEO spamming to the next level. Rather than hire hundreds of people to write crap for Demand Media, you're just going to see auto-generated articles crop up on the fly.

I dunno about that. I've done some SEO-optimized marketing copywriting, and a basic level of uniqueness, credibility and quality is required before you can upload any article into a database. I don't think machine-generated language will be attractive to many businesses trying to market themselves on the web.
posted by KokuRyu at 5:48 PM on April 10, 2010

As a software engineer, this depresses me and freaks me out a little. What do they think they're doing?! Dozens of man-years of work have probably gone into this... why!?
posted by lupus_yonderboy at 5:49 PM on April 10, 2010

Is it possible that they could add on some kind of learning system ("Was this information helpful to you? Coherent in any way? Y/N") and let that teach it to have correct information? Not sure if I'm describing that quite right, but I'm thinking about that game that guesses what famous person you're thinking of.
posted by TochterAusElysium at 6:22 PM on April 10, 2010

Well, here's what I got when I plugged a common hobby into both Wiki (NSFW, but very informative) and Cpedia (nonsensical). I think I know which one I'll keep using.
posted by FatherDagon at 6:34 PM on April 10, 2010

The biggest problem seems to be that it's so indiscriminate in picking its sources that it bases whole pages on blog comments and myspace profiles. Pages based mostly on news articles are a bit more coherent, although the page structure is still worthless and it just barfs three unrelated sentences together to form most of the paragraphs. If it picks a mixture of reporting, opinion, and personal (blog-ish) sources, then it ends up with a jarring mixture of tones and voices, and all the information is given roughly equal weight.

Kind of looks like they've solved a lot of the easier problems, but there's a long way to go.
posted by plant at 6:38 PM on April 10, 2010

The wrinkled avenger also blew Thomas testicles to kingdom come, but doctors managed to save his mangled penis, police said. A man may increase both the length and girth of his penis by performing this exercise.

This thing is a goldmine.
posted by kaspen at 6:53 PM on April 10, 2010 [2 favorites]

From World Wide Web:

"Speaking in Washington last night (14 September), Sir Tim Berners-Lee unveiled an exciting new vision for the next phase of development of the World Wide Web

Then, Tim Berners-Lee said, "Let there be a World Wide Web!" and it was created, and he saw that it was good. From the very beginning of the Web at CERN, the specifications that described how the Web works were drafted by Tim Berners-Lee and made available online."
posted by oulipian at 6:55 PM on April 10, 2010 [1 favorite]

"My" cpedia article says I have a fake Ph.D, that I'm a crazy incompetent "expert witness", that I'm a nutjob who may try to rob someone, that I download "90-odd gig a month" of torrents, etc etc. Oh, and that I was once trapped in a vat of custard.

Well if, as they say, a good laugh will help your overall levels of health, cpedia has ensured that I will now live until 104. You couldn't make this up. Have you read the cortex one? Utter surreal hilarity.
Tenure, schmenure Dan Rather will leave CBS News later this year. One imagines an uptick in playtime for a certain REM single over the next few months. Puppy love subverted woman assaults breeder with dog corpse. No word yet on followup legislation regarding the bi-curious. There is, I reckon, something in the air.
posted by jokeefe at 6:59 PM on April 10, 2010

Wow. Is Cuil really run by The Yes Men as some type of situationist prank?
posted by benzenedream at 7:13 PM on April 10, 2010

Don't they also have a reputation for having a crawler that was exceptionally aggressive and crashed sites all over the place?
(Granted, it would honour robots.txt files if they had rate-limiting directives in them, but until then, nobody had ever needed to use such features.)
posted by TravellingDen at 8:01 PM on April 10, 2010
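
The rate limiting mentioned above is usually expressed with the non-standard `Crawl-delay` directive in robots.txt, which some crawlers honour and others ignore. A minimal sketch of how a well-behaved crawler might read it, using Python's standard `urllib.robotparser`. The robots.txt content, the crawler name `ExampleBot`, and the helper `crawl_policy` are all invented for illustration; whether Cuil's crawler actually respected this directive is only the claim made in the comment.

```python
import urllib.robotparser

# Hypothetical robots.txt; "ExampleBot" is an invented crawler name.
ROBOTS_TXT = """\
User-agent: ExampleBot
Crawl-delay: 10
Disallow: /private/

User-agent: *
Disallow:
"""


def crawl_policy(robots_txt, user_agent, url):
    """Return (allowed, delay_seconds) for a crawler under the given rules."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    allowed = parser.can_fetch(user_agent, url)
    delay = parser.crawl_delay(user_agent)  # None if no Crawl-delay applies
    return allowed, delay


if __name__ == "__main__":
    print(crawl_policy(ROBOTS_TXT, "ExampleBot", "https://example.com/index.html"))
    print(crawl_policy(ROBOTS_TXT, "OtherBot", "https://example.com/private/page"))
```

A polite crawler would sleep for the returned delay between requests to the same host. The directive is only a request, so a site owner still has no guarantee that any given crawler obeys it.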

I am beside myself with joy - this so very much reminds me of my Markov chain fun a few years back, just with way more processor time.

This is the way the Singularity begins - not with a bang, but with an uncontrollable snicker.
posted by Michael Roberts at 8:44 PM on April 10, 2010 [2 favorites]

Here's a blog post on it, mostly in Japanese, in keeping with the surreality (but it contains an English quote giving some motivation for doing this), an article on TheNextWeb, a reddit thread, and another.

One of the funniest aspects is the inexplicable table-of-contents hierarchies. Here's one on Obama:
4 Democratic Party
    1. Republican and Democratic leaders
    2. Kathleen Sebelius
    3. Jake Tapper
    4. House and Senate Democrats in December
5 Ben Bernanke
    1. Robert Gibbs
    2. Emanuel
    3. John Edwards
    4. Timothy Geithner
6 Dalai Lama
posted by delmoi at 8:44 PM on April 10, 2010 [2 favorites]

Pope Guilty


1. 2009
2. Isaiah Kalebu
3. Marxist
posted by Pope Guilty at 9:50 PM on April 10, 2010

Next up, a round of venture capital financing in order to invest in an infinite amount of monkeys and typewriters.
posted by snofoam at 11:22 PM on April 10, 2010

From their About Us page: "We do the heavy lifting of removing all the repetition, so that unique and novel content surfaces."
I certainly can't argue with that.

Also, the image on the page about their web crawling tech is cracking me up. It's just like the search results - almost totally wrong but sort of kind of connected to something correct. There's something sublime about the whole endeavor.
posted by chaff at 12:31 AM on April 11, 2010

So this is what Wikipedia would look like with no human intervention. It's like looking directly into the internet's subconscious and realising that the internet is basically a huge, insane baby with no concept of meaning.
posted by him at 5:55 AM on April 11, 2010 [1 favorite]

1. Time Cube "> of Cubicism Time
    a. Hunter S Thompson
    b. TimeCube RPG Rules
    c. 1day
    d. Rubik
    e. Flying Spaghetti Monster
2. Cold Fusion

I leave it for you to deduce the article.
posted by Wolfdog at 6:24 AM on April 11, 2010

The idea of Cortex as Markee Dragon's lead moderator warped the fabric of the universe for a moment there. How weird.
posted by restless_nomad at 7:36 AM on April 11, 2010


1. belonging to the emperor
2. embalmed
3. tame
4. sucking pigs
5. sirens
6. fabulous
7. stray dogs
8. included in the present classification
9. frenzied
10. innumerable
11. drawn with a very fine camelhair brush
12. et cetera
13. having just broken the water pitcher
14. that from a long way off look like flies
posted by Horace Rumpole at 9:00 AM on April 11, 2010 [7 favorites]

From the World Wide Web article:

World Wide Web ("WWW", or simply "Web") is an information space in which the items of interest, referred to as resources, are identified by global identifiers called louisiana blues Uniform Resource Identifiers (URI).

Well that's all I needed to know.
posted by Tomorrowful at 10:17 AM on April 11, 2010 [1 favorite]

Wow, the antithesis of Wikipedia is much more entertaining than I'd expected.
posted by furtive at 12:06 PM on April 11, 2010

Try ' keller'
15 results, no helen.

Cuils theory still applies
posted by CitoyenK at 1:14 PM on April 11, 2010

Cpedia: almost as reliable as Conservapedia
posted by CitoyenK at 1:21 PM on April 11, 2010 [1 favorite]

(sorry for the spam, cpedia cracks me up! It's 22 minutes past twenty past four...)

Now that the cpedia and cuil projects are gold, they should go for :
an automated aggregator of best of the web with autogenerated comments.
BoingCuil, same as above with ads and Ego-trips.
and why not: ... But I can barely imagine how that would be...
posted by CitoyenK at 1:43 PM on April 11, 2010 [1 favorite]

I'm just cycling back into this topic to note that - perversely - this is the kind of thing that keeps me smiling for days. That the world could have something so insanely wonderful in it!
posted by Michael Roberts at 10:42 PM on April 11, 2010

I was wondering what would take the place of MarkovFilter for almost-human sounding hilarity...
posted by anthom at 8:41 AM on April 12, 2010

The community raised what is smurfily in the onsmurf environment a serious ethical breach, yet GiveWell responded in essence by affirming the offender's value and that's not even taking smurf account of any questions raised his [Karnofsky's] co-founder. Most people have never considered the ethical implications of 'astroturfing'.

Karnofsky has admitted to ' astroturfing,' which in this case means he posted questions and comments not as himself but in a disguised itentity of an independent observer.

In a section discussing co-founder Holden Karnofsky's 'brash' style, the Chronicle writes: [2]

#Members of Metafilter
On Metafilter, an online message board, Mr. Karnofsky promoted GiveWell without identifying himself. A Metafilter member uncovered the self-promotion, which violated the Web site's rules, and announced the discovery on the message board.

He [Karnofsky] also offered to make a contribution to Metafilter to compensate for his mistake an offer that was derided by Metafilter contributors as a bribe. Metafilter members found other examples of Mr. Karnofsky's praising GiveWell as an anonymous source, including instances where he criticized other nonprofit groups.

But it seems his [Karnofsky's] energy has gone terribly awry, and it's a real shame. I hope some good comes of this for the [very worthy] work Holden wanted to do and for anyone observing the situation.

On December 31st, 2007, Miko creates thread in calling out the previous Ask.Metafilter question as an attempt by Holden Karnofsky to use sock puppetry to anonymously plug, a violation of Metafilter rules.

Members of Metafilter then proceeded to find numerous other cases of Holden promoting GiveWill on other websites without disclosing his involvement and invariably criticising other sites at the same time. The board of directors of GiveWell initially didn't believe what was going on, and then admitted that Holden's actions were wrong.
posted by delmoi at 5:45 PM on April 12, 2010 [1 favorite]

Wow, that's fascinating. I like the part where they explain the pages about Django Reinhardt and Django (Python) get mixed up because "the distinction is not as sharp as it might be". Um, it's pretty sharp from where I'm standing.

The telling quote is where he categorizes Cpedia as "a different UI on a search engine." Cast that way, the launch makes sense. Cpedia's page for Metafilter, then, is like Google's. Only instead of being for navigation, Cpedia displays longer snippets. I still don't get why they call it an encyclopedia. But there's "wiki" in the URL: maybe they expect humans will start editing these pages soon. That might be interesting.
posted by Nelson at 8:14 AM on April 14, 2010
