libgrapheme - A suckless Unicode string library
December 23, 2021 5:54 PM

libgrapheme is an extremely simple C99 library providing utilities for properly handling Unicode strings made up of user-perceived characters ('grapheme clusters') according to the Unicode standard. It provides convenience functions for UTF-8-encoded strings, but you can also use it with any other encoding. [via lobste.rs]
posted by cgc373 (6 comments total)

This post was deleted for the following reason: Poster's request -- goodnewsfortheinsane



 
There are a handful of problem domains in computing where you have only two choices in how to approach them: you either dedicate your life to that problem, or you exclusively use tools and libraries built by the people who have dedicated their lives to that problem. Every other choice is a long road with madness and agony at the end of it.

Cryptography and time zones are two such domains. Unicode string handling is another. The only responsible thing you can do as a programmer is treat Unicode strings like an opaque binary blob and handle them with tongs. Every other choice eventually leads to disaster.

I hope this stands up to both inspection and life in production.
posted by mhoye at 7:22 PM on December 23, 2021 [13 favorites]


time zones

I don't know what it's like these days, but once upon a time #ntpd on freenode was a place you could go to hang out with people who were basically time wizards, and if you were lucky and patient they might even answer your novitiate questions.
posted by snuffleupagus at 7:51 PM on December 23, 2021 [1 favorite]


Freenode no longer exists, but the zoneinfo mailing lists still do, and I suspect the ntpd people found their way to Libera.
posted by mhoye at 8:30 PM on December 23, 2021


The most interesting thing about this library to me is the suckless philosophy: suckless is a group writing new Unix tools in a very spare and stripped-down way. It's not for me and I'm not sure it makes sense in the contemporary open source ecosystem. But they do good work and I appreciate their æsthetic.

Shout out to emoji, which is the Trojan horse for forcing Unicode implementations to be correct. Even the stripped-down ones. As recently as 10 years ago folks were still shipping software that pretended Unicode was a 16-bit character set. MySQL, most notoriously. Emoji doesn't work if your software is that dumb, and since everyone thinks software that can't handle emoji is 💩, it's forced all the Unicode implementations to get their act together. I do wonder what libgrapheme gave up, though; their talk about "satisfying in 99% of the cases" makes me a bit nervous. Particularly if none of the team works fluently in languages that are more complex to handle correctly.
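To make the 16-bit point concrete, here is a minimal Python sketch (not libgrapheme code, and the variable names are only for illustration). It shows that 💩 is U+1F4A9, which sits above U+FFFF, so it cannot fit in a single 16-bit unit and needs four bytes in UTF-8. A UTF-8 implementation capped at three bytes per character can only reach U+FFFF.

```python
# U+1F4A9 PILE OF POO lies outside the Basic Multilingual Plane (above U+FFFF).
emoji = "\U0001F4A9"

code_point = ord(emoji)

# Each UTF-16 code unit is 2 bytes; "utf-16-le" avoids adding a byte order mark.
utf16_units = len(emoji.encode("utf-16-le")) // 2

# Number of bytes the same character takes in UTF-8.
utf8_bytes = len(emoji.encode("utf-8"))

print(hex(code_point))      # 0x1f4a9
print(code_point > 0xFFFF)  # True: does not fit in 16 bits
print(utf16_units)          # 2: a surrogate pair
print(utf8_bytes)           # 4
```

Software that assumes one 16-bit unit per character sees this emoji as two "characters", which is how it ends up truncated or mangled.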

The NTP world still exists but has fractured in an interesting way. chrony is the best implementation now and the old ntpd has gotten a full overhaul with NTPsec. Then there's whatever bullshit systemd has built in which serves the function of being a noobie trap.

The NTP community I pay attention to is the NTP Pool. But that's not a place to ask about timezones. The gold standard for time zone nerdery is the TZ Database (aka the Olson Database) and that community is still going strong. There was even some drama recently.
posted by Nelson at 8:44 PM on December 23, 2021 [6 favorites]


Just a word of warning that many of the suckless people seem to enjoy using Nazi iconography (they had a tiki torch march at one of their conferences, naming one of their computers "wolfsschanze", etc). From what I've seen, most of the people who end up contributing to suckless projects know about those politics, and are either fine with hanging out and working with people like that or share the interest in it (whether that's legitimate beliefs or just wanting to be "edgy" I don't know and don't care).

I'd rather this thread not get bogged down in talking about this stuff, I just think it's important for people to be aware of what they're getting into when they interact with projects like this.
posted by wesleyac at 9:46 PM on December 23, 2021 [5 favorites]


naming one of their computers "wolfsschanze", etc

To be clear, that's a reference to Hitler's eastern front headquarters. Usually translated as "The Wolf's Lair."
posted by ChurchHatesTucker at 9:57 PM on December 23, 2021 [1 favorite]



