Scrambled Text
September 14, 2003 10:54 PM

Scrambled Text. Tihs jrivascapt let's you puodcre scmbleard txet jsut lkie a ctraien prgpaarah taht kepes ppoipng up all oevr the pclae. "Aoccdrnig to a rscheearch at Cmabrigde Uinervtisy, it deosn't mttaer in waht oredr the ltteers in a wrod are, the olny iprmoetnt tihng is taht the frist and lsat ltteer be at the rghit pclae. The rset can be a total mses and you can sitll raed it wouthit porbelm. Tihs is bcuseae the huamn mnid deos not raed ervey lteter by istlef, but the wrod as a wlohe."
posted by bobo123 (58 comments total) 3 users marked this as a favorite
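
The rule quoted in the post (first and last letter stay put, everything in between is shuffled) can be sketched in a few lines. This is a minimal Python sketch, not the linked script itself; the names scramble_word and scramble_text are made up for illustration, and treating each run of ASCII letters as a "word" is an assumption.

```python
import random
import re

def scramble_word(word, rng):
    """Shuffle the interior letters of one word, keeping first and last in place."""
    if len(word) <= 3:
        return word  # fewer than two interior letters: nothing to shuffle
    middle = list(word[1:-1])
    rng.shuffle(middle)
    return word[0] + "".join(middle) + word[-1]

def scramble_text(text, seed=None):
    """Scramble every run of letters; punctuation and spacing are left alone."""
    rng = random.Random(seed)  # a seed makes the output repeatable
    return re.sub(r"[A-Za-z]+", lambda m: scramble_word(m.group(0), rng), text)

print(scramble_text("According to a researcher at Cambridge University", seed=1))
```

Because the split is on runs of letters, a word like "doesn't" is handled as "doesn" plus "t", which matches the "deosn't" seen in the quote.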
 
Wheatver.
posted by Samsonov14 at 10:57 PM on September 14, 2003


Certainly true - reading the quote itself proves it!
posted by gregb1007 at 10:59 PM on September 14, 2003


I'm azamed at how esay taht was for me to raed. Vrey cool -- good psot.
posted by Tubes at 11:01 PM on September 14, 2003


Opos, I was wonrg aobut jrviacspat, it's dnoe wtih PHP.
posted by bobo123 at 11:07 PM on September 14, 2003


OUCH! This makes my haed HRUT!!!!
posted by kain at 11:24 PM on September 14, 2003


It's a well-documented phenomenon in cognitive psychology. People recognize words by their shape. Ever notice how much easier it is to read (recognize) lower case text? SEE HOW MUCH HARDER IT IS TO READ TEXT IN ALL CAPS? IT'S BECAUSE WORDS LOOK LIKE FEATURELESS BLOCKS instead of sporting nice ascenders and descenders to give you a clue to what letterforms words contain.

This is also why the really important bits of all legal documents are in small, capitalized text: it's purposely done to reduce reading comprehension so you won't read the contract.
posted by mathowie at 11:24 PM on September 14, 2003


Tihs mghit be uesd to get aurnod sapm fletirs.
posted by teg at 11:53 PM on September 14, 2003


Somebody should tell the people at French Connection UK. People may think that their t-shirts actually read "fuck".
posted by seanyboy at 12:09 AM on September 15, 2003


An interesting phenomenon indeed; however, the example given in the FPP is a lot more readable than a random piece of shuffled text imho. Whether this is because of the choice of words or because the shuffle in the FPP isn't a "full" shuffle, I don't know.

Actually, looking at the FPP more carefully, the example is mostly just single transpositions, nothing like the random ordering of interior letters in the CGI.
posted by fvw at 12:15 AM on September 15, 2003
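
The difference between a single transposition and a full interior shuffle can be made concrete. The sketch below is illustrative only: transpose_once, full_shuffle and displaced are hypothetical helper names, and counting mismatched positions is just one rough way to measure how far a scramble has moved from the original.

```python
import random

def transpose_once(word, rng):
    """Swap one randomly chosen pair of adjacent interior letters."""
    if len(word) <= 3:
        return word
    i = rng.randrange(1, len(word) - 2)  # i and i+1 are both interior positions
    letters = list(word)
    letters[i], letters[i + 1] = letters[i + 1], letters[i]
    return "".join(letters)

def full_shuffle(word, rng):
    """Randomly reorder all interior letters."""
    if len(word) <= 3:
        return word
    middle = list(word[1:-1])
    rng.shuffle(middle)
    return word[0] + "".join(middle) + word[-1]

def displaced(original, scrambled):
    """Count positions where the scrambled word differs from the original."""
    return sum(a != b for a, b in zip(original, scrambled))

rng = random.Random(0)
for word in ["thing", "according", "university"]:
    t, f = transpose_once(word, rng), full_shuffle(word, rng)
    print(word, t, displaced(word, t), f, displaced(word, f))
```

A single transposition never moves more than two letters, while a full shuffle of a long word can move nearly all of the interior.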


Sweetness.
posted by woil at 12:19 AM on September 15, 2003


This seems ok if you read the words in order. However, sometimes, I like to just glance at one part of one paragraph to get a general sense of what it is about. With the scrambled text it makes jumping around much harder.
posted by gyc at 12:31 AM on September 15, 2003


Oooookay...

I spent half the evening trying to trace this meme back to its source, to no avail. I was hoping nobody would try to FPP until we discovered whether the "university research" really existed.
One blogger linked it to an article by a couple California academics in Nature magazine, but it's really not the same thing. I've found no evidence at the sites for Cambridge University or Cambridge University Press.

But now that someone has made the "scramble script" that had already shown up in Boing Boing into something any old web user can play with, it's a whole 'nother thing. Still, I seriously doubt the premise behind the word-scrambling meme, and suspect that the scrambling in the original message was definitely NOT random, in order to foster a belief in the concept (agreeing with fvw).

Tihs nxet papgrraah was psrecoesd tuohrgh the Slbrmecar Srcpit.
I snlreiecy digarese wtih the perisems put frtoh aobut scbrialnmg wrods, so I'm itionltlnaney enrovdaenig to ulizite leetnghir cpocmtaeild wodrs, not nclesiesray uonommcn wrdos, taht can not be dceerihped as ieuntlivity as tohse in the oirginal prgpraaah. The frist of my dsiceorives is taht wrods endnig in sufefxis or bnegining in pierxfes bmecoe daggesiend form the frist/lsat ltteer rothlpisneias taht spupedsloy are the baiss of the pmseires, and bemcoe mcuh mroe clinaelnhgg, amsolt ieclenaipbhrde. See?

All this proves is that some scrambled words are a lot harder to decipher than others, whether or not you keep the first and last letters in place, and a lot of people on the web are too gullible when somebody's quoting an Elingsh uinervtisy (335 results), or more specifically Cmabrigde Uinervtisy (30 results). In spite of this change in sources, both versions have the same mistake: if "rscheearch" is supposed to be "research", it has an extra "ch"; if it's supposed to be "researcher", it needs to end in an "r"...

Until somebody finds me the original research cited in the original quote, I'd call it a hoax.

Besides, while I was researching the research, I found a REAL study that shows that order does matter on things like ballots...

this post has not been spell-checked, because it'd take an hour to do so with all the scrambled words!
mathowie, I know what you did, and you WILL get away with it

posted by wendell at 12:49 AM on September 15, 2003 [1 favorite]


I've always held this type of thing up to the "Hookt ahn Fonix" crowd to explain why their approach was a bit on the simplistic side (to be charitable).

Wonder if languagehat will check in ...
posted by RavinDave at 1:05 AM on September 15, 2003


bizarre
posted by delmoi at 1:34 AM on September 15, 2003


languagehat's take on it
posted by wendell at 1:44 AM on September 15, 2003


I can read it fine, but I read it substantially slower than I would an unmuddled paragraph.
posted by rudyfink at 2:23 AM on September 15, 2003


After a couple of beers, I read the scrambled text faster! Can I get a grant now? Please?

This sort of thing fascinates me. And not just because of the beer.
posted by stavrosthewonderchicken at 2:34 AM on September 15, 2003


Also, I'm put in mind of the laughably-dumbass-as-hip-and-edgy FCUK brand.

Huh-huh-huh geddit? [/beavis]
posted by stavrosthewonderchicken at 3:01 AM on September 15, 2003


I so don't have any competence in cognitive psychology, but I just spent some time searching on the Web of Science and didn't come up with anything. It looks like much of the research that's been done in this area is in things like text-recall and priming.

So, yeah -- it looks like the process of word recognition is nowhere nearly as simple as this would suggest. If it were, I'm sure there wouldn't be 1739 entries for 'lexical decision' on Psychinfo.

Then again, I could have just demonstrated my incompetence in database-searching...
posted by Sonny Jim at 3:11 AM on September 15, 2003


I've always found it interesting that I can intuitively recognize seemingly endless variations of a single letter (say: "R") in isolation, whereas a machine can only make close approximations (when it isn't explicitly taught) -- and even then will screw up a significant percentage of the time. I don't know if the pattern matching involved in comprehending individual characters is the same animal as that which allows for the phenomenon in question here, but I suspect they're related at some level. Regardless, it offers great prospects for those psychology/linguistic/education/compsci students looking for a promising dissertation topic.
posted by RavinDave at 3:30 AM on September 15, 2003


This appears to be happening wherever this topic shows up on the web. Even though the original premise is a hoax/put-on based on a flawed, if not invalid premise, it's generating scholarly yet interesting discussion of linguistic/psychological/etc. issues well beyond the original premise...
*shrugs, goes to bed*
posted by wendell at 4:00 AM on September 15, 2003


So here I am to rock you like a hurricane. Today my teacher caught be dreaming off about snow again, so I get detention on lunch. I hate her. Maybe I shouldnt dream at skool but then I would get so bored I would fall asleep anyway.
posted by Keyser Soze at 4:35 AM on September 15, 2003


so hree I am to rcok you lkie a harrnicue. Tdoay my teheacr chgaut be drmeaing off aubot sonw aaign, so I get dtnetoien on lncuh. I htae her. Myabe I suoldnht draem at sokol but tehn I wluod get so breod I wolud flal aleesp aanywy.
posted by Keyser Soze at 4:35 AM on September 15, 2003


Seems to me that the premise may hold true for simple letter transpositions (the knid of erorrs taht are msot liekly to appaer in tpyed txet).

Wendell pretty well demolished it beyond simple transpositions, I think.
posted by Irontom at 5:05 AM on September 15, 2003


Most. Annoying. Meme. Ever.
posted by IshmaelGraves at 5:14 AM on September 15, 2003


English is a non-inflected language, dependent mostly on word order to carry certain information. An inflected language (like Russian) has less of a restriction in that regard, but the payoff is that they have to tack extra morphemes onto their words to carry that same info. In short, I'm guessing it's probably easier for English readers to gloss over letters -- the mere position of the word allows us to make rapid educated guesses. This would probably not be the case in Russian, which lacks that visual cue.

As mentioned. Interesting fodder for research, regardless of its genesis.
posted by RavinDave at 5:15 AM on September 15, 2003


Wendell, I actually found your intentionally obfuscated paragraph reasonably easy to read. While the words were longer, I could also draw on context. This context is nested and supplied at different scales, firstly by the subject of this thread in general, and then by your sceptical response in your post. So I kind of know the subject matter. Then there's the general rules of syntax/grammar which dictate what sorts of words we expect next in the sentence. I think it would be much harder to unscramble words that were chosen randomly with regard to subject, syntax, grammar, etc., because you would not know what to 'expect' next.

Regarding your point that it doesn't work as well with longer words, I think this high level of redundancy need not have to work with longer words. As most people generally use short words most of the time, for it to be a useful feature of language, working with just the shorter words is just fine.

Anyway nice find bobo123, and nice contributions/arguments from others!
posted by carter at 6:33 AM on September 15, 2003


Wow, carter's like, all smart and shit.
posted by Samsonov14 at 7:09 AM on September 15, 2003


Ever notice how much easier it is to read (recognize) lower case text?
I think that is why they print words in the vocabulary section of graduate school tests (like the GRE) in all upper case. STYLISTIC. DIVERSIFY. And without a context the vocabulary words that you know look like gibberish.

Great discussion. Someone could edit these posts and mail them to Cambridge as a rebuttal!
posted by philfromhavelock at 7:11 AM on September 15, 2003


Wendell, thanks for the link. But:

Even though the original premise is a hoax/put-on based on a flawed, if not invalid premise

I can't figure out why you're so hostile to the premise. Even if the attribution to an unnamed "Elingsh uinervtisy" (as in the version I used) or to Cambridge (as here) is bullshit, so what? The premise justifies itself (as I said on my blog: "Via Avva, who doesn't provide a source, but the point is made without one"). Like carter, I read your artificially difficult paragraph almost as easily as the original one; the premise is clearly correct, even if it doesn't have official backing. It's also true, as RavinDave says (and as the discussion on Avva, a Russian-language site, verifies), that this only works well for largely uninflected languages like English; when almost all words have inflectional endings, it eliminates half the anchor (the last letter, which doesn't carry word meaning).

IshmaelGraves, you're entitled to your opinion, but this is the only non-annoying meme I've seen in many a moon. It actually shows you something interesting about the world.
posted by languagehat at 7:43 AM on September 15, 2003 [1 favorite]


Most. Annoying. Meme. Ever.

I can think of one more annoying.
posted by angry modem at 8:19 AM on September 15, 2003 [1 favorite]


I also had very little problem reading wendell's longer-word scrambles (after tripping on the first "intentionally," I seemed to get into the rhythm of it). But that said, of course it's going to be somewhat harder to read longer scrambled words than shorter ones. If the word "that" has its middle scrambled, there's still half the word holding up the key first-and-last points, and the scrambling can't put the middle that far out of order. If it's "pclae," you still have 40% of the word in its correct spot. On the other hand, if it's "itionltlnaney," your key points are reduced to about 15% of the total word, with much more variability within the scrambled middle. (And then there's "alibiandtitsitsmeinsashenram.")

So a more accurate algorithm for this thesis would increase the number of letters kept in order at the beginning and end as the word length increases, e.g. "intentionally" would become "intonnaeitlly." Anyway, hoax or no, this is a fascinating phnoneemon.
posted by soyjoy at 9:28 AM on September 15, 2003
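
The "anchored share" arithmetic and the length-scaled variant proposed above can be sketched as follows. The scaling rule in keep_for_length is purely a guess, chosen so that a 13-letter word keeps three letters at each end as in the "intonnaeitlly" example; the comment does not specify a formula, and all function names here are hypothetical.

```python
import random

def anchored_fraction(word, keep=1):
    """Share of letters held fixed when `keep` letters stay at each end."""
    if not word:
        return 0.0
    return min(1.0, 2 * keep / len(word))

def keep_for_length(length):
    """Assumed rule: one anchored letter per end, plus one more per six letters."""
    return 1 + max(length - 1, 0) // 6

def scramble_scaled(word, rng):
    """Shuffle the interior, anchoring more letters at each end for longer words."""
    keep = keep_for_length(len(word))
    middle = list(word[keep:len(word) - keep])
    if len(middle) < 2:
        return word
    rng.shuffle(middle)
    return word[:keep] + "".join(middle) + word[len(word) - keep:]

for w in ["that", "place", "intentionally", "antidisestablishmentarianism"]:
    print(w, len(w), round(anchored_fraction(w), 3), scramble_scaled(w, random.Random(0)))
```

With one letter anchored at each end, the fixed share is simply 2 divided by the word length, so it falls quickly as words get longer.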


No edntieno que snfgiciia la plraaaba "ieetlcnfd", preo qierua aireuagvr si ctesnalalo teina la mmisa podapried. Me praece que si, se pdeue llreeo.

actually, that's pretty much the same as my normal spanish anyway. i'm not sure (as i say) what "inflected" means, but if it's to do with adding silly endings to words then spanish has it too - and i think that's still legible (it may contain errors in the original, too, of course).

the worst bit, imho, is double-ls getting separated and created. maybe that shouldn't really happen anyway (they're a separate letter, in some sense).
posted by andrew cooke at 9:44 AM on September 15, 2003


Soyjoy: Alibi and tits? It's me in sash! ...en...RAM!
posted by leotrotsky at 11:23 AM on September 15, 2003


The Javascript doesn't do anything for Japanese, so I tried it by hand, with one of the news items from today's Asahi. In English, the original is roughly "On the 15th in Riyadh, the capital of Saudi Arabia, a fire broke out in a prison, killing 67 prisoners."

This technique doesn't make Japanese illegible, either, though it does make a pretty good hash of "Saudi Arabia" there at the beginning - "SABIJIURAAA". The rest looks really wrong, but once you realize that everything's been mixed up, you can easily read it.

It doesn't work because there's too many blocks of three symbols or less, I think. Reverse them, and you can still make out the meaning. You could probably play hell with more complex sentences, though, if you were also allowed to mix up all the particles... or even worse, if you could randomly mix up all the characters, rather than just the ones that go together into one word.
posted by vorfeed at 11:32 AM on September 15, 2003


She shloud hvae deid heefreatr;
Trehe wolud hvae been a tmie for scuh a wrod.
To-mroorw, and to-moorrw, and to-mrroow,
Ceerps in tihs ptety pcae form day to day
To the lsat sablylle of rcoreedd tmie,
And all our ydraeteyss hvae liegthd folos
The way to dstuy detah. Out, out, beirf cldnae!
Lfie's but a waiklng sahdow, a poor pelayr
Taht srttus and ftres his huor uopn the sgtae
And tehn is haerd no mroe: it is a tlae
Tlod by an iiodt, flul of sunod and fruy,
Siginifnyg nhtnoig.
posted by DevilsAdvocate at 12:26 PM on September 15, 2003


----------------------
hewoevr, tihs deos not wrok for pohne nmubres
posted by Peter H at 12:41 PM on September 15, 2003


Metafilter: Sabiji-URAAAAA!

Also, following up on my own post from a while back, Leonard Richardson's implemented this in his Eater of Meaning. It's called "Eat chewy caramel center", and now you can browse Metafilter (or CNN or whatever) through the lens of smarclebd txet.
posted by wanderingmind at 12:54 PM on September 15, 2003


When I read this post, my first thought was, "He misspelled iprmoatnt." Does that make me a freak?
posted by boaz at 1:08 PM on September 15, 2003


Alibi and tits? It's me in sash! ...en...RAM!

Well, yeah, I did that by hand, and my first run-through had too many obvious clues - "ment" was still all together, and "anti" had re-formed somewhere else in the word. So I did help generate "alibi and tits" by moving a couple of letters. Where "it's me in sash" came from though, I don't know - apparently some subconscious urge.
posted by soyjoy at 1:22 PM on September 15, 2003


Also, I've noticed that most people can almost instantly unscramble even totally-jumbled words of 5 letters or less, but that at 6+ letters they start having a difficult time of it. Could this just be an instance of that effect with the easy range moved out to 7 letters by leaving the 1st and last letters unmoved (and having the context provided by the other words to help with the few 8+ letter words)?

For example: Idneingt seelnkrros sflillukly mailspce pelreudnd taeurrse.
posted by boaz at 1:30 PM on September 15, 2003


As for me, I welmoce our new dyslexic ovelrords. [*dukcs]
posted by palancik at 2:29 PM on September 15, 2003


boaz - I dunno if this is the answer, but the difference between 5- and 6-letter words is the diff. between 3! and 4! - i.e. adding one letter gives you four times as many possible combinations inside the word. Six combinations is probably something our minds can scroll through pretty easily, but 24 is another order entirely.

Another order! I kill me!
posted by soyjoy at 2:41 PM on September 15, 2003
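
The 3! versus 4! count above is easy to check, and it also shows that repeated interior letters reduce the number of distinct scrambles. A small sketch; the function names are invented for illustration.

```python
from collections import Counter
from math import factorial

def interior_orderings(word):
    """Orderings of the interior letters, treating every letter as distinct."""
    return factorial(max(len(word) - 2, 0))

def distinct_interiors(word):
    """Distinct interior strings, dividing out permutations of repeated letters."""
    counts = Counter(word[1:-1])
    total = factorial(sum(counts.values()))
    for c in counts.values():
        total //= factorial(c)
    return total

for w in ["place", "places", "letter", "intentionally"]:
    print(w, interior_orderings(w), distinct_interiors(w))
```

For "place" the interior has 3 letters and for "places" it has 4, giving the 6 and 24 mentioned in the comment; "letter" has 24 orderings but fewer distinct ones because of its doubled letters.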


I worte some qciuk and dtriy prel to tset the theroy that its the shape of the words that cunot. Pretty raadeble. On the ohter hand, not vrey mnay lettres get seihctwd anroud, so of cuorse its gniog to be pertty raedable. Maybe if I don't limit each output word to the origianl chraacters:

I woete ssve qwksk aad dkziy pvsl to tmzt the tlrrzy tiat its tte sdmge of tie wozds tixt cnoxt. Pzvlfy rznhmtfe. On tte ofhzr hrxd, nwt vnvy mrry laffmrs gut smhbskrd aczvvd, so of cewrwe ihs gefng to be pmxfty rsrfetbe. Mnjie if I dun't ldatt ermh ovkqvt wxnd to tke ocdpixml cderxebxxs.

That's a lot harder to read, even tho the words are still the same shape. (It's the same text as the first paragraph.)

Hmm, on preview, maybe lower case "i"s aren't tall?
posted by duckstab at 2:56 PM on September 15, 2003 [1 favorite]


(btw, the letter sets i used for that: t, i, d, f, h, k, l, and b were considered equal; as were q, y, p, g and j; and w, e, r, u, o, a, s, z, x, c, v, n, and m)
posted by duckstab at 3:00 PM on September 15, 2003
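
A shape-preserving substitution along these lines can be sketched with the three letter sets listed above. This is a guess at the approach, not the original Perl: it assumes the first and last letter of each word are kept (as they appear to be in the sample output) and that each interior letter is replaced by a random letter from the same set. All names are hypothetical.

```python
import random
import re

# Letter sets as given in the comment: tall letters, descenders, and the rest.
SHAPE_CLASSES = ["tidfhklb", "qypgj", "weruoaszxcvnm"]
CLASS_OF = {ch: cls for cls in SHAPE_CLASSES for ch in cls}

def shape_of(word):
    """Outline of a word: a class index per lowercase letter, else the character itself."""
    return [SHAPE_CLASSES.index(CLASS_OF[ch]) if ch in CLASS_OF else ch for ch in word]

def reshape_word(word, rng):
    """Replace each interior lowercase letter with a random letter of the same class."""
    if len(word) <= 2:
        return word
    interior = "".join(
        rng.choice(CLASS_OF[ch]) if ch in CLASS_OF else ch for ch in word[1:-1]
    )
    return word[0] + interior + word[-1]

def reshape_text(text, seed=None):
    """Apply reshape_word to every run of letters in the text."""
    rng = random.Random(seed)
    return re.sub(r"[A-Za-z]+", lambda m: reshape_word(m.group(0), rng), text)

print(reshape_text("I wrote some quick and dirty perl to test the theory", seed=2))
```

Since "i" sits in the tall set here, it can be swapped for letters like "k" or "b", which is relevant to the question of whether a lower case "i" really counts as tall.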


duckstab, I could still read your second paragraph pretty easily. I had an initial moment of hesitation that was missing when reading the first, though, almost as if I had to switch my brain over to whole-language mode.
posted by vorfeed at 4:02 PM on September 15, 2003


vorfeed, it's paefky exuy to mmie ort tle svcvad ecoeyke wfen yeu've grt tle fdzut as rstsanoee. Bot it's atst mnwe ewzvpanntlwg wten ywu dan't atvusby heoe tle plwlaimut vuxobxn of the escrlcd tzzt. I'd wrqxr tfvt its sdlil pwvsldke, bat nnt at "ruotkxg spocd", lbke it is whrn uahzg tie szae ckvacxkuos as in tke oaipbxzl weeks.
posted by duckstab at 4:31 PM on September 15, 2003


I think the most interesting aspect of this phenomenon is that it allows one to create a simple letter-replacement form of encryption (applying it to the scrambled text) that would be much harder to decipher, since the typical patterns one uses to break a letter-replacement system would not apply.
posted by troybob at 5:25 PM on September 15, 2003


troybob, frequency analysis would still work (for the original method, not for my keep-the-shape-but-lose-the-original-characters thing.. well, not as easily), and frequency analysis is a pretty easy attack. It'd probably take a little more time, because you wouldn't be as likely to spot words before you had all their letters.
posted by duckstab at 5:37 PM on September 15, 2003
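
The reason frequency analysis survives the scrambling is that shuffling letters inside words does not change how often each letter occurs. The sketch below checks that on a toy letter-replacement cipher; the key is an arbitrary example permutation and the helper names are made up.

```python
import random
import re
import string
from collections import Counter

def scramble_text(text, seed=0):
    """Shuffle interior letters of each word (first and last stay in place)."""
    rng = random.Random(seed)
    def scramble(match):
        word = match.group(0)
        if len(word) <= 3:
            return word
        middle = list(word[1:-1])
        rng.shuffle(middle)
        return word[0] + "".join(middle) + word[-1]
    return re.sub(r"[A-Za-z]+", scramble, text)

def substitute(text, key):
    """Simple letter-replacement cipher: key is a permutation of a-z."""
    return text.lower().translate(str.maketrans(string.ascii_lowercase, key))

def letter_counts(text):
    """Count occurrences of each letter, ignoring case and non-letters."""
    return Counter(ch for ch in text.lower() if ch.isalpha())

KEY = "qwertyuiopasdfghjklzxcvbnm"  # arbitrary example key
plain = "the only important thing is that the first and last letter be at the right place"
print(letter_counts(substitute(plain, KEY)) ==
      letter_counts(substitute(scramble_text(plain), KEY)))
```

The counts an attacker would tabulate are identical either way; what changes is that partially recovered words are harder to recognise by eye.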


That is amazing!!
posted by cockeyed at 5:41 PM on September 15, 2003


duckstab, I'm amazed to find that I could read your second example ("it's paefky exuy...") quite easily (though, as you suggest, not quite as quickly as the first). The amount of redundancy in human language is astonishing (and a good thing, contrary to what the proponents of various "logical languages" seem to think).
posted by languagehat at 6:41 PM on September 15, 2003


(mostly off topic) Just for kicks, I compressed the text from this page, and a file of the same size using the same characters, but derived randomly. The results are as expected. Uncompressed, both are 19987 bytes. The text from this page compresses to 8897 bytes; the random data compresses to 15946 bytes. Obviously, the more something compresses, the more patterns (redundancy) it contains.

I wonder if wordy languages are more compressible vs. terse languages. I'd certainly expect so.

Here's a section of a paper about entropy & redundancy in the English language.
posted by duckstab at 7:38 PM on September 15, 2003
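
A comparison like this one can be repeated with Python's zlib module. The compressor actually used is not stated, so zlib is an assumption, and "the same characters, but derived randomly" is interpreted here as shuffling the characters of the original text; other readings are possible. The sample sentence is only a stand-in for the page text.

```python
import random
import zlib

def compressed_size(text):
    """Size in bytes of the zlib-compressed UTF-8 encoding of text."""
    return len(zlib.compress(text.encode("utf-8"), 9))

def shuffled_characters(text, seed=0):
    """Same characters with the same counts as text, in random order."""
    chars = list(text)
    random.Random(seed).shuffle(chars)
    return "".join(chars)

sample = ("It doesn't matter in what order the letters in a word are, "
          "the only important thing is that the first and last letter "
          "be at the right place. ") * 10
noise = shuffled_characters(sample)
print(len(sample), compressed_size(sample), compressed_size(noise))
```

The shuffled version keeps the letter frequencies but loses the repeated words and phrases, so it has less structure for the compressor to exploit.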


duckstab - yes, freq analysis would work, but you wouldn't know (automatically) that you had the right translation.

take a plaintext, scramble it in this way (so it's mixed but legible) and then encrypt it using, say DES. now how is someone going to crack that? they're going to get their beowulf cluster to run through all possible keys looking for the correct decrypt. now a *lot* of those decrypts are going to have about the right letter frequency, so you need something more to flag the correct detection - i guess they check for common letter groups, common words etc. those tests will fail in this case. so you'd need a human to check through all candidate decrypts by eye...

of course, they could still use word length distributions and anything that uses first/last letters, two and three letter words. an alternative would be to use a caesar cipher before encryption (i wonder whether the nsa has code for things like this?)
posted by andrew cooke at 5:07 AM on September 16, 2003
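
The point about automatic detection can be illustrated with a toy key search. The sketch below uses a Caesar cipher as a stand-in for a real cipher such as DES, and a tiny hand-picked word list as the "is this plaintext?" test; both are simplifications for illustration, and every name in it is hypothetical.

```python
import string

# Tiny illustrative word list; a real detector would use a full dictionary.
COMMON = {"the", "and", "that", "only", "thing", "first", "last", "letter",
          "right", "place", "important", "is", "be", "at"}

def caesar(text, shift):
    """Shift lowercase letters by `shift` positions; other characters pass through."""
    lower = string.ascii_lowercase
    k = shift % 26
    return text.translate(str.maketrans(lower, lower[k:] + lower[:k]))

def word_hits(text):
    """Number of words in text that appear in the common-word list."""
    return sum(word in COMMON for word in text.split())

def best_shift(ciphertext):
    """Try every key and return the shift whose decrypt scores most word hits."""
    return max(range(26), key=lambda s: word_hits(caesar(ciphertext, -s)))

plain = "the only important thing is that the first and last letter be at the right place"
scrambled = "the olny iprmoetnt tihng is taht the frist and lsat ltteer be at the rghit pclae"
print(word_hits(plain), word_hits(scrambled))
print(best_shift(caesar(plain, 3)))
```

The scrambled text scores far fewer dictionary hits, though not zero: words of three letters or fewer are untouched by the scrambling, which is the weakness noted in the comment.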


May the circle be unbroken...
posted by Pollomacho at 9:39 AM on September 17, 2003


David Harris did an interesting experiment with the meme, quoting the paragraph in slightly different form each day on his blog to see how it propagated. (His comments have translations into French and Portuguese.)

I must say, it's a strange experience being Snopesed...
posted by languagehat at 12:31 PM on September 17, 2003


Wow, so you're the typhoid Mary of this meme, languagehat? Congrats... I guess?
posted by soyjoy at 2:27 PM on September 17, 2003


Reminds me of this short short short story by David Garnett (page all the way down to the bottom):

He hid to retch the slop before cloning time.
Talking hurriedly alone, he chucked his witch. Only ode minute new, even loss...

posted by straight at 6:40 AM on September 18, 2003


Duckstab--there was an interesting study a while ago that was sort of related to your point:
posted by adamrice at 6:52 AM on September 24, 2003

