

Of possible punctuation past.
May 18, 2012 4:49 PM

Unicode’s sad lack of intellectual smileys | And, really, wouldn't this be a better world if we had a rhetorical question mark?
posted by titus-g (27 comments total) 5 users marked this as a favorite

 
Here's the thing: ambiguity is more expressive and useful than certainty.
posted by wobh at 5:16 PM on May 18, 2012 [2 favorites]


ambiguity is more expressive and useful than certainty.

XD
posted by longsleeves at 5:22 PM on May 18, 2012 [2 favorites]


I will miss thee, Fake Unicode Consortium.
posted by mkb at 5:23 PM on May 18, 2012 [2 favorites]


Here's the thing: ambiguity is more expressive and useful than certainty.

Go on...
posted by alex_skazat at 5:27 PM on May 18, 2012 [1 favorite]


Sentences have words. If you pay attention to those, you'll pick up on the meaning of the sentence pretty quickly.

That's how I do it, at least. Pro-tip.
posted by chasing at 5:35 PM on May 18, 2012 [1 favorite]


Unicode is obviously a plot by the patriarchy.
posted by Chocolate Pickle at 5:37 PM on May 18, 2012



Sentences have words. If you pay attention to those, you'll pick up on the meaning of the sentence pretty quickly.


If you say so.
posted by Pogo_Fuzzybutt at 5:41 PM on May 18, 2012 [6 favorites]


Sentences have words. If you pay attention to those, you'll pick up on the meaning of the sentence pretty quickly.

Well, bless your heart.

Is it...
Sarcasm?
Irony?
Both?

*shakes head*
Not reading you.
posted by BlueHorse at 5:47 PM on May 18, 2012 [3 favorites]


Lojban would probably never have been invented (a ghastly artificial language with constructs to signify, among other things, sarcasm and irony)

The "constructs" he's talking about are interjections.

Also, Lojban's claim to unambiguity is based on its resemblance to first-order predicate logic, which only requires parentheses. Way too many parentheses, you know, like LISP.
posted by LogicalDash at 5:53 PM on May 18, 2012


I'm pretty sure this post was tongue-in-cheek, but in fact Unicode has a meticulous, academic process for deciding whether to include characters or not. The case has to be made, and mostly the case is historic use; some pre-existing code sheet, or writing system, or the like. His citations of those uses are useful for making the case, actually, although I think the Unicode has a fairly high threshold for notability.
posted by Nelson at 5:57 PM on May 18, 2012


Ah, looks like a good place to pitch my equivocation mark.
posted by BlackLeotardFront at 5:58 PM on May 18, 2012 [6 favorites]


I'm getting a distinctly U+061F vibe from this.
posted by mazola at 6:01 PM on May 18, 2012 [1 favorite]


... I think the Unicode has a fairly high threshold for notability.
Right.
posted by dopeydad at 6:02 PM on May 18, 2012 [3 favorites]


The Standard Pile of Poo is notably easy for people to recognize across many cultures. Banal things can be notable.
posted by LogicalDash at 7:13 PM on May 18, 2012


Pile of poo was included when all of the emoji "character set" was brought into Unicode 6.0 wholesale. Emoji is cemented in Japanese phone messaging, and was requested in Unicode by Google and Apple. So, yes, fairly high threshold, that being "all of Japan, and the top two mobile phone operating system companies" in this instance.
posted by mendel at 7:43 PM on May 18, 2012 [1 favorite]


Sorry; just trying to be funny.
posted by dopeydad at 8:00 PM on May 18, 2012


You know what I want? Misquotation marks. For when you're sarcastically putting words into someone's mouth for rhetorical effect.
posted by nebulawindphone at 8:01 PM on May 18, 2012 [2 favorites]


Instead of words being mere strings, I think in the future, the language I want to write in should have everything be an object.
posted by alex_skazat at 8:15 PM on May 18, 2012 [2 favorites]


        .
       /'
      //
  .  //
  |\ /7
 /' " \       
.   . .       I dunno, I've found the ASCII range of characters to be pretty expressive
| (    \    / 
|  '._  '    
/   \ '-'

posted by hellojed at 10:13 PM on May 18, 2012 [3 favorites]


No punctuation is better than this punctuation
posted by iotic at 11:15 PM on May 18, 2012


What‽‽‽
The case has to be made, and mostly the case is historic use; some pre-existing code sheet
Yeah, the key word here is code sheet. Characters aren't usually added to Unicode one at a time; rather, what they try to do (I assume) is set things up so that existing documents using existing code sets can be converted into Unicode without losing anything. That's why all those DOS box-drawing characters are there: there were already files in existence that used them, and the goal was to be able to convert them over.
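
A minimal sketch of that round-trip idea, assuming the "DOS box drawing" set means IBM code page 437 (the comment doesn't name a code page) and using three made-up sample bytes. It relies only on Python's built-in `cp437` codec:

```python
# Hypothetical sample: three CP437 bytes that draw the top edge of a DOS box.
dos_bytes = bytes([0xC9, 0xCD, 0xBB])

# Decode the legacy code page into Unicode text.
text = dos_bytes.decode("cp437")

# List the Unicode code points the legacy bytes were mapped to.
code_points = [f"U+{ord(ch):04X}" for ch in text]

# Encode back to the legacy code page to check that nothing was lost.
round_trip = text.encode("cp437")

print(code_points)
print(round_trip == dos_bytes)
```

If the round trip gives back the original bytes, the conversion was lossless for that file, which is the property being described.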

So the emoji make a lot of sense. Those characters all had standard encodings in Japanese text messages, so if you wanted to preserve those while moving to Unicode, they had to be included.

These random punctuation ideas might have been interesting, but it's not like there was a lot of text written using them, and even less likely that there are digital files out there that have them included.

Also I'm guessing that emoji would have been included much earlier, but I bet the people responsible were really resistant.
posted by delmoi at 4:43 AM on May 19, 2012


Instead of words being mere strings, I think in the future, the language I want to write in should have everything be an object.

他们有那个语∶是汉语 (They have that language: it's Chinese.)

(actually, modern Chinese usually uses words formed from (usually) two characters, rather than using a single 汉字 for everything - but each character does have an independent meaning, and some words, such as the word for 'word', 词, are only one character)
posted by delmoi at 4:52 AM on May 19, 2012


Seriously, how and who decides these things? Black Snowman?!? ⛇ Firework Sparkler? Woman With Bunny Ears?!?!?!?
posted by pashdown at 8:51 AM on May 19, 2012 [2 favorites]


how and who decides these things?

The Unicode consortium decides these things. Did you read the comments? We've been discussing that. It's anything but arbitrary and if you're genuinely curious, you might learn something.

Firework Sparkler and Women With Bunny Ears are both part of the emoji set that delmoi noted. What's new in Unicode 6.0? is useful for its summarization of the Emoji debate. The proposal is quite detailed, as is the symbol list which was finally adopted.

Black Snowman is listed as a "Weather symbol from ARIB STD B24". I didn't Google up all the details, but I'm guessing a Japanese broadcaster encoded text messages with some weather symbols often enough that Unicode adopted a mapping of those characters.

There's a wonderful database of Unicode characters. It'd be great if it included links to the history of that code point's inclusion and uses in previous places. Does such a thing exist? It'd be an interesting bit of technology history to assemble.
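
For what it's worth, a small slice of that database ships with Python as the standard `unicodedata` module. The sketch below looks up a few characters mentioned in this thread; the `describe` helper is just an illustrative name, and the module gives names and categories only, not the history of a code point's inclusion:

```python
import unicodedata


def describe(ch: str) -> str:
    """Return 'U+XXXX NAME (category)' for a single character."""
    name = unicodedata.name(ch, "<unnamed>")
    return f"U+{ord(ch):04X} {name} ({unicodedata.category(ch)})"


# Characters mentioned in this thread, written as escapes:
# U+061F, the interrobang, Black Snowman, and Pile of Poo.
for ch in ["\u061F", "\u203D", "\u26C7", "\U0001F4A9"]:
    print(describe(ch))
```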
posted by Nelson at 11:41 AM on May 19, 2012


There's a wonderful database of Unicode characters. It'd be great if it included links to the history of that code point's inclusion and uses in previous places. Does such a thing exist? It'd be an interesting bit of technology history to assemble.
Yeah, it would be awesome if there was a DB that went over the history of each character.
posted by delmoi at 1:01 AM on May 21, 2012


Firework Sparkler and Women With Bunny Ears are both part of the emoji set that delmoi noted.

OK, smart guy, how and who decided they needed to be Emoji? Who insisted on the dancing twins character?
posted by pashdown at 10:15 PM on May 25, 2012


Well, originally that encoding system was only meant to be used for Japanese text messages. They needed a larger code space than usual to get all the standard kanji, of which there are around two thousand.

Without knowing the particulars of the encoding used, I can say that the characters were assigned index numbers encoded in binary. An 11-bit index gets you 2048 characters, but that's not really enough because you also need hiragana and katakana, the phonetic characters. Each of those has around 60 members. Best to include all the ASCII codes, too, since people are used to having those.

So 2048 characters isn't quite enough. Add one more bit, and you have 4096 characters to play with.
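
The arithmetic can be sketched like this, using the rough counts from this comment (about 2000 kanji, about 60 each of hiragana and katakana) plus the 128 codes of 7-bit ASCII. These are estimates for illustration, not figures from any actual carrier encoding:

```python
# Rough counts taken from the comment; estimates, not real encoding data.
kanji = 2000
hiragana = 60
katakana = 60
ascii_codes = 128  # 7-bit ASCII

total = kanji + hiragana + katakana + ascii_codes

# Smallest number of bits whose index range (0 .. 2**bits - 1) covers `total` characters.
bits_needed = (total - 1).bit_length()

capacity = 2 ** bits_needed
spare = capacity - total

print(total, bits_needed, capacity, spare)
```

With these estimates the total comes to 2248, which overflows an 11-bit index (2048) and fits in 12 bits (4096) with well over a thousand slots to spare.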

Even if they included a bunch of nonstandard kanji, that's more than they need. Including some goofy emoticons doesn't cost them anything now, and if the stupid Verizon ads of some years ago are any indication, emoticons make good ad copy. So they told their art department to fill it up.

Maybe the Unicode people could have negotiated to leave those out, if they really wanted to, but if there's one problem Unicode doesn't have, it's lack of space.
posted by LogicalDash at 10:27 AM on May 28, 2012




This thread has been archived and is closed to new comments