Making time safe for historians
February 6, 2013 7:49 AM

Who needs machine readable dates? As far as I can see there are two target audiences for this operation. The first is obviously social applications that have to work with dates, and where it can be useful to compare dates of two different events. An app must be able to see if two events fall on the same day and warn you if they do. However, as a target audience social applications are immediately followed by historians (or historical, chronological applications). After all, historians are (dare I say it?) historically the most prolific users of dates, until they were upstaged by social applications.

Quirksmode's ppk (Peter-Paul Koch) discusses the history of dates and calendars, through the lens of the HTML5 specification's <time> element, which was itself removed before being reinstated.
posted by smcg (39 comments total) 14 users marked this as a favorite
 
No, banking, accounting and financial companies were (and are) the most prolific users of dates, and representing pre-epochal time is a solved problem.
posted by Slap*Happy at 8:01 AM on February 6, 2013 [4 favorites]


Yes it's true, there are more than just social applications. But of course every half-wit writer who is trying to make link bait these days just has to mention a social network or Apple computers.
posted by Napierzaza at 8:03 AM on February 6, 2013 [2 favorites]


You know who else would like to have reliably dated internet documents? Patent offices and people seeking to invalidate patents. Finding extremely relevant prior art is useless if it's impossible to prove that this prior art was, indeed, prior.
posted by Skeptic at 8:03 AM on February 6, 2013


Int64 of seconds since the big bang. Want more precision? Add another Int64. Problem solved.

posted by blue_beetle on 4.7304e+17 [+] [!]
posted by blue_beetle at 8:04 AM on February 6, 2013 [5 favorites]


Slap*Happy, if you read the article, you will see that ISO 8601 suffers from the same problem as the proposed HTML5 element: the use of the proleptic Gregorian calendar for historical dates for which it is not appropriate.

Nice article. It is also very much worth reading Erik Naggum's somewhat-famous The Long, Painful History of Time.
posted by enn at 8:06 AM on February 6, 2013 [1 favorite]


Sometimes, in my more whimsical moments, I wonder what Metafilter would be like if people read the articles that were posted.
posted by smcg at 8:06 AM on February 6, 2013 [15 favorites]


I did not know Russia stayed Julian until Lenin's time. Wasn't there a similar deal in the early modern period - like they skipped a whole month to correct the calendar? Answering a question like "how many days ago was July 12, 1485?" is not as easy as you might think - really, if you think anything about dates and times and calendars is easy, it indicates you don't know much about them, or have never tried to implement computer representations of them.
posted by thelonius at 8:14 AM on February 6, 2013 [1 favorite]
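
For a rough sense of what the "how many days ago" question involves, here is a minimal Python sketch. It assumes July 12, 1485 is a Julian calendar date, takes this thread's date (February 6, 2013, Gregorian) as "today", and uses the standard integer formulas for Julian Day Numbers. The function names are made up for illustration.

```python
def julian_calendar_to_jdn(year: int, month: int, day: int) -> int:
    """Julian Day Number for a date given in the Julian calendar."""
    a = (14 - month) // 12          # 1 for January/February, else 0
    y = year + 4800 - a
    m = month + 12 * a - 3          # March = 0 ... February = 11
    return day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083


def gregorian_calendar_to_jdn(year: int, month: int, day: int) -> int:
    """Julian Day Number for a date given in the (proleptic) Gregorian calendar."""
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    return (day + (153 * m + 2) // 5 + 365 * y
            + y // 4 - y // 100 + y // 400 - 32045)


# Assumption: the 1485 date is Julian, "today" is the Gregorian date of the thread.
then = julian_calendar_to_jdn(1485, 7, 12)
today = gregorian_calendar_to_jdn(2013, 2, 6)
days_ago = today - then
print(days_ago)
```

Both dates are mapped onto the same day count before subtracting, which is the step a naive "subtract the years" approach skips. The sketch says nothing about which calendar a given source actually used; that still has to be known in advance.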


I've often thought that there needs to be a web-archive-of-record authority that would produce digital signatures consisting of a content hash and a timestamp. So, say, a deviantART artist or a blogger could register an image or a blog post, receiving an unforgeable certificate that could be used to defend against later copyright infringement. Like an internet content notary.
posted by qxntpqbbbqxl at 8:14 AM on February 6, 2013 [1 favorite]


Int64 of seconds since the big bang. Want more precision? Add another Int64. Problem solved.

Problem very far from solved. You need more than formats for time. You need a system to handle astronomical things like leap seconds and whatnot. This is very very very important in orbital and other space-related calculations. Very.
posted by DU at 8:18 AM on February 6, 2013


NNng. Date data structures are a great weed-out question for interviews. If you think your home-rolled date-time functions are complete for any reasonable definition, you're probably too dang cocky or Bjarne Stroustrup, and that's before you get to dates like January 0, or February 30th. Those examples aren't to say that those dates are in common use, but if you're working in historical contexts like Koch addresses, you'd be better off incorporating his ideas about <time> attributes for those rare but obvious cases.
posted by boo_radley at 8:18 AM on February 6, 2013 [1 favorite]


Oh! There's also this corker of an answer from Jon Skeet over at Stack Overflow! Measuring time is one part rational frequency/tick counting, and ten parts cultural/convention accounting -- it's not just a science.
posted by boo_radley at 8:22 AM on February 6, 2013


I've often thought that there needs to be a web-archive-of-record authority that would produce digital signatures consisting of a content hash and a timestamp.

Here you go. It's been running since 1995, I've used it occasionally. The best way to use it is to send them a detached PGP signature of the content in question, so you don't have to send them a giant blob of arbitrary data -- the chain to the content is just as strong. The service "timestamps" and signs the signature and then sends it back to you.

It's fairly clever. There are commercial versions of the same thing but they offer no advantage as far as I can tell.
posted by Kadin2048 at 8:22 AM on February 6, 2013 [6 favorites]
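
The "content hash plus timestamp" idea discussed above can be sketched in a few lines of Python. This is only an illustration of the shape of such a record, not how the PGP timestamping service works: it uses an HMAC with a shared secret as a stand-in for a real public-key signature, and the function names and record fields are hypothetical.

```python
import hashlib
import hmac
import json
from datetime import datetime, timezone


def _message(sha256_hex: str, timestamp: str) -> bytes:
    # Canonical byte string covering both the hash and the time.
    return json.dumps({"sha256": sha256_hex, "timestamp": timestamp},
                      sort_keys=True).encode("utf-8")


def make_timestamp_record(content: bytes, key: bytes, now: datetime) -> dict:
    """Bind a hash of the content to a time. Only the hash is 'sent', not the content."""
    digest = hashlib.sha256(content).hexdigest()
    stamp = now.astimezone(timezone.utc).isoformat()
    # Stand-in for a signature: a real notary would sign with a private key.
    mac = hmac.new(key, _message(digest, stamp), hashlib.sha256).hexdigest()
    return {"sha256": digest, "timestamp": stamp, "mac": mac}


def verify_timestamp_record(content: bytes, record: dict, key: bytes) -> bool:
    """Check that the record is intact and that it matches this content."""
    expected = hmac.new(key, _message(record["sha256"], record["timestamp"]),
                        hashlib.sha256).hexdigest()
    return (hmac.compare_digest(expected, record["mac"])
            and hashlib.sha256(content).hexdigest() == record["sha256"])


key = b"demo-key-not-for-real-use"
post = b"my blog post"
record = make_timestamp_record(post, key, datetime(2013, 2, 6, 16, 22, tzinfo=timezone.utc))
print(verify_timestamp_record(post, record, key))
```

Because the record covers only the hash, the notary never needs to see the content itself, which is the point made above about sending a detached signature rather than a giant blob of data.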


Also of interest: A literary appreciation of the Olson/Zoneinfo/tz database.
posted by enn at 8:28 AM on February 6, 2013 [2 favorites]


Proper time naming is terribly difficult and way out of the scope of the HTML5 process. So I'll take the easy way out he offers, and have everyone understand the <time> element is not some fancy historically accurate label for all things. Honestly once you say "microformats" I start thinking about RDF and semantic web and before too long I'll be caught in an infinite loop producing specifications with no actual value for the last ten years.
posted by Nelson at 8:29 AM on February 6, 2013 [1 favorite]


Here you go.

I love the internet. It's like the very act of imagining an idea causes it to come into existence. (The alternate view is that everything has already been invented so why bother, but that is depressing so I'm not going to dwell on it)

There are commercial versions of the same thing but they offer no advantage as far as I can tell.

I think the advantage is durability and continuity. The PGP Digital Timestamping appears to be run by one guy. What happens when he loses interest or (heaven forbid) dies? The existence of an incoming revenue stream is insurance against the service just disappearing---important because, if the service is compromised, its keys could fall into the wrong hands and potentially be used to create back-dated certificates.
posted by qxntpqbbbqxl at 8:30 AM on February 6, 2013 [1 favorite]


A favorite quote on the subject.
posted by brennen at 8:36 AM on February 6, 2013 [1 favorite]


I use Julian dates in spreadsheets (with weird double years, too, because the year changed in March), and I find it easy: I just turn OFF the stupid spreadsheet date-format and use a text string in ISO. So January 30, 1630/31 (Old Style) is written 1630/31-01-30. And then the alphabetical sorting puts it after Dec 31, 1630, as it should.

I don't have to mix Gregorian and Julian dates - most historians don't, because we tend to look at one country/period at a time. The only people I can think of who really need to do that are international historians who are worried about something like exactly what day something happened in a Julian country when compared to a Gregorian calendar country. As for comparing things over 100s of years - then the couple of weeks' difference between the two calendars isn't very important.

But yeah, everyone should be using ISO if they are using numerical dates. It's the only standard that makes sense, and which can then be sorted in chronological order with a simple alphabetical sort.
posted by jb at 8:42 AM on February 6, 2013
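
The sorting claim above is easy to check. A small Python sketch, assuming four-digit years and the double-year notation exactly as written in the comment (the sample dates other than 1630/31-01-30 are made up):

```python
# Old Style dates as plain text; the year rolls over on March 25.
dates = [
    "1631-03-25",
    "1630/31-01-30",
    "1630-12-31",
    "1630-04-01",
    "1630/31-03-24",
]

# Plain string sort, no date parsing at all.
ordered = sorted(dates)
for d in ordered:
    print(d)
```

This works because "/" happens to sort after "-" in ASCII, so every 1630/31 entry lands after every plain 1630 entry and before 1631. It would stop working with years of differing lengths or a different separator.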


This seems like major beanplating. The new time element doesn't accommodate historical European time standards; it doesn't accommodate historical Chinese, Hebrew, or Indian time standards either, and those are still in widespread use. If historians want to use the new time element, they have to manually convert times to a modern calendar. Lots of software is available to help with this.
posted by miyabo at 8:46 AM on February 6, 2013


I use Julian dates in spreadsheets

I was really confused by your comment at first until I realized that "Julian dates" in this context referred to the Julian calendar. In supply chain informatics, "Julian date" usually refers to the ordinal day of the year.
posted by Slothrup at 8:56 AM on February 6, 2013 [1 favorite]


Int64 of seconds since the big bang.

That doesn't solve the problem of approximate dates, for starters.

Date and time handling are one of those things, like text encoding, that seem like a straightforward problem on their face but are really, really hard to get right once you start dealing with edge cases. Anybody can count ticks, whether seconds or milliseconds or whatever, from an arbitrarily-chosen epoch, just like anyone can assign numbers to the keys on a typewriter keyboard and claim that text encoding is solved. Naive designs like that work fine until suddenly they don't, and then they fail messily.

IMO, although it's not a perfect solution, the way I'd improve the <time> element would be to specify a mandatory format attribute, rather than trying to come up with One Time Format To Rule Them All. There are just too many uses of date/time stamps to have one specification that will cover all cases.

E.g., <time format="unix">1360168346</time> would satisfy modern never-before-1970 timestamp needs, for stuff like creation or modification dates of files. But for something like dates of birth, you'd need to use a more complex specification, like RFC3339 or maybe an extended version of RFC3339 that specifies the calendar system in use.

It would be up to the user agent or application to render the time into the end user's desired format, or leave it alone (if that's considered preferable). I don't much like to see Unix timestamps, so I'd want them converted; someone dealing with historical documents might not want Julian dates converted to Gregorian ones, though.

The key part is just making sure that the type of date is specified alongside the date information itself, in the same way that modern systems are (or should be) careful to transmit the text encoding in use alongside the text data itself.

I think the advantage is durability and continuity. The PGP Digital Timestamping appears to be run by one guy. What happens when he loses interest or (heaven forbid) dies?

Then the service shuts down (or it might not, if he plans very well), but the timestamps are still good; they're all publicly available in files like this. They're archived, among other places, in the Internet Archive. So I think they are unlikely to become suddenly unavailable, absent a concerted effort on the part of someone to scrub them off the 'net.

The threat of having the keys fall into the wrong hands is real and a good point, but can be pretty easily mitigated by having others periodically download and sign the signature files, or even just sign a list of hashes of all the files. I'm not sure if anyone is doing this right now, but if I get bored later today I might do it just for giggles; shouldn't take more than a minute or two. The more people who have signed hashes hanging around, the harder it would be for someone to retroactively forge a signature.

Also, at least based on recent history it's not as though the commercial PKI CAs have a great track record; furthermore, since they depend on a perception of trustworthiness to stay in business, it's reasonable to expect that in the event of a catastrophic key compromise, they'd do everything possible to cover it up for as long as possible. (Since when word gets out, you're done.) Someone running a signing service as a hobby wouldn't have the same perverse incentives.

Although the jury is still out, I think it's possible that systems run by passionate people basically as hobbies have some of the best/longest longevity on the Internet, because they are not constantly required to justify their own existence via an income stream.
posted by Kadin2048 at 8:57 AM on February 6, 2013 [3 favorites]
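
The format-attribute proposal above amounts to dispatching on a declared format instead of guessing. A minimal Python sketch, with the caveat that the `format` attribute is the commenter's suggestion and not part of HTML, and that the format names and function names here are hypothetical:

```python
from datetime import datetime, timezone


def parse_unix(text: str) -> datetime:
    # Seconds since 1970-01-01T00:00:00Z, interpreted as UTC.
    return datetime.fromtimestamp(int(text), tz=timezone.utc)


def parse_rfc3339(text: str) -> datetime:
    # datetime.fromisoformat covers common RFC 3339 strings; map a
    # trailing "Z" to an explicit offset for older Python versions.
    if text.endswith(("Z", "z")):
        text = text[:-1] + "+00:00"
    return datetime.fromisoformat(text)


# One parser per declared format; unknown formats are rejected, not guessed.
PARSERS = {"unix": parse_unix, "rfc3339": parse_rfc3339}


def parse_time_element(fmt: str, text: str) -> datetime:
    try:
        parser = PARSERS[fmt]
    except KeyError:
        raise ValueError(f"unsupported time format: {fmt!r}")
    return parser(text.strip())


print(parse_time_element("unix", "1360168346"))
print(parse_time_element("rfc3339", "2013-02-06T16:32:26Z"))
```

A Julian-calendar or other historical format would be one more entry in the table, and a user agent that did not recognise it could leave the text alone rather than mangle it.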


That was absolutely fascinating. I knew about some of these temporal complexities, but I had no idea just how wobbly it all gets as you push back past 300BC.
posted by yoink at 8:59 AM on February 6, 2013


And then we find out that in 17th century Europe there was a trend of worrying about privacy of time similar to current worry about our privacy of other personal information and people willingly used false dates in their correspondence and records to master their own time and to make future historians cry. Not really. But could have been.
posted by Free word order! at 9:01 AM on February 6, 2013 [2 favorites]


Other problems might include other calendars (e.g., I don't know whether current scholarly writing about the French Revolution uses the French Republican Calendar) and the distant future, should we abolish leap seconds (since the "day" would then become unlinked from solar time).

And for all those who think that counting seconds or fractions-of-seconds is a good way of talking about past dates (I've certainly been guilty of thinking this!), consider the question of how many seconds have passed since 1972-01-01 00:00:00 to the present.

Present-day timekeeping is immensely geeky, and so is old timekeeping.
posted by jepler at 9:04 AM on February 6, 2013 [2 favorites]
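
To make the seconds-since-1972 question concrete, here is a hedged Python sketch. It compares the naive calendar arithmetic (every day is 86400 seconds) with a count that adds the leap seconds inserted so far. The table below lists the UTC days that ended with an inserted leap second, to the best of this sketch's knowledge; a real program should load the current list from an authoritative source rather than hard-code it.

```python
from datetime import date, datetime, timezone

# UTC days that ended with an inserted leap second (23:59:60).
LEAP_SECOND_DAYS = [
    date(1972, 6, 30), date(1972, 12, 31), date(1973, 12, 31),
    date(1974, 12, 31), date(1975, 12, 31), date(1976, 12, 31),
    date(1977, 12, 31), date(1978, 12, 31), date(1979, 12, 31),
    date(1981, 6, 30), date(1982, 6, 30), date(1983, 6, 30),
    date(1985, 6, 30), date(1987, 12, 31), date(1989, 12, 31),
    date(1990, 12, 31), date(1992, 6, 30), date(1993, 6, 30),
    date(1994, 6, 30), date(1995, 12, 31), date(1997, 6, 30),
    date(1998, 12, 31), date(2005, 12, 31), date(2008, 12, 31),
    date(2012, 6, 30), date(2015, 6, 30), date(2016, 12, 31),
]

START = datetime(1972, 1, 1, tzinfo=timezone.utc)


def seconds_since_1972(moment: datetime):
    """Return (naive_seconds, elapsed_seconds) for a timezone-aware moment."""
    utc_moment = moment.astimezone(timezone.utc)
    naive = int((utc_moment - START).total_seconds())
    # A leap second counts once the moment falls on a later UTC day.
    leaps = sum(1 for d in LEAP_SECOND_DAYS if d < utc_moment.date())
    return naive, naive + leaps


naive, elapsed = seconds_since_1972(datetime(2013, 2, 6, tzinfo=timezone.utc))
print(naive, elapsed, elapsed - naive)
```

The two numbers differ by the number of leap seconds inserted in between, and the answer for any future moment cannot be known until the corresponding leap second announcements have been made.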


Typically in science "Julian date" means number of days since 1/1/4713 BC, with decimal points (today is 2456323.6292708). This is pretty widely used to refer to astronomical events.

Why 4713 BC? Basically it made mystical/pseudomathematical sense to the guy who invented the system 500 years ago. So there's some funny history there too.

A lot of businesses just use Julian date as a fancy way to say "days since the beginning of the year" though.
posted by miyabo at 9:07 AM on February 6, 2013 [1 favorite]
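
The two meanings of "Julian date" mentioned above can be put side by side in Python. This is a sketch under the usual simplification that Unix time ignores leap seconds; the constant 2440587.5 is the astronomical Julian Date of the Unix epoch, and the function names are illustrative.

```python
from datetime import datetime, timezone

# Julian Date of 1970-01-01T00:00:00Z (Julian Dates tick over at noon).
UNIX_EPOCH_JD = 2440587.5


def astronomical_julian_date(moment: datetime) -> float:
    """Days, with fraction, since noon on 1 January 4713 BC (Julian calendar)."""
    return moment.timestamp() / 86400.0 + UNIX_EPOCH_JD


def ordinal_day_of_year(moment: datetime) -> int:
    """The business sense of 'Julian date': 1 for 1 January, 32 for 1 February, ..."""
    return moment.timetuple().tm_yday


j2000 = datetime(2000, 1, 1, 12, 0, tzinfo=timezone.utc)
print(astronomical_julian_date(j2000))
print(ordinal_day_of_year(datetime(2013, 2, 6, tzinfo=timezone.utc)))
```

The first function returns one continuous count that never resets; the second resets every 1 January, which is why the shared name causes confusion.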


Typically in science "Julian date" means number of days since 1/1/4713 BC, with decimal points (today is 2456323.6292708).

...

A lot of businesses just use Julian date as a fancy way to say "days since the beginning of the year" though.


Businesses that do astronomical data processing do both and it gets confusing. Plus time zones. Plus leap seconds. Plus epochs. And that's just to figure out what time it is--you haven't even started on the 6 other coordinates (3 spatial, 3 first derivatives of same).
posted by DU at 9:18 AM on February 6, 2013


I was really confused by your comment at first until I realized that "Julian dates" in this context referred to the Julian calendar. In supply chain informatics, "Julian date" usually refers to the ordinal day of the year.

Sorry - yes, I meant Julian Calendar - or rather, the English Old Style Calendar, which is its own bit of specialness. (Julian calendar, but counting the beginning of the year on March 25).

That doesn't solve the problem of approximate dates, for starters.

An ISO-based text string is still quite good, even for approximate dating. If you know the month, you might write 1645-04 (meaning April 1645, but no day specified), or just the year as 1645. If you do an alphabetical sort, you can still order your items by date (and it will throw a month-only date to the top of the month). If I want to indicate c, I tend to put that on the end (eg 1645c) so that I can still sort by date. Missing dates can be 1645? or 164? (I've worked with damaged documents where you can only see part of the date).
posted by jb at 9:39 AM on February 6, 2013
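
A quick Python check of where the partial and approximate strings described above end up under a plain sort. The sample values are made up, apart from the notations taken from the comment (1645-04, 1645, 1645c, 1645?, 164?):

```python
items = [
    "1645-04-12",
    "1645c",
    "1646-01-01",
    "164?",
    "1645",
    "1645?",
    "1645-04",
    "1644-12-01",
]

# Plain string sort by character code, no parsing.
ordered = sorted(items)
for item in ordered:
    print(item)
```

A shorter string sorts before any longer string it is a prefix of, so a bare year comes before its months and a bare month before its days. The "?" and "c" markers sort after digits and hyphens, so marked entries land at the end of their year or decade rather than at the start.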


In retrospect, it's kind of amusing that most of the ancient/medieval histories I've read had at least a page explaining the system for the proper nouns the book will use and how it has or hasn't been adapted to modern English tastes but not a single word about how the chronology is connected to anything. That said, it does make sense that most historians are naturally more like philologists at heart than number crunchers.
posted by Copronymus at 9:42 AM on February 6, 2013 [2 favorites]


thelonius: "Wasn't there a similar deal in the early modern period - like they skipped a whole month to correct the calendar?"

Check out the Wikipedia article on Gregorian calendar reform. Ten to thirteen days were skipped, depending when the country adopted the new calendar.
posted by Chrysostom at 9:59 AM on February 6, 2013
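
The "ten to thirteen days" figure follows from the different leap-year rules of the two calendars. A small Python sketch of the gap, simplified so that it is exact only from 1 March of the given year onward (in January and February of a century year that is not divisible by 400, the gap is still one day smaller); the function name is illustrative:

```python
def julian_gregorian_gap(year: int) -> int:
    """Days by which the Julian calendar lags the Gregorian calendar.

    Simplification: valid for dates from 1 March of `year` onward.
    """
    century = year // 100
    # The Julian calendar keeps every century leap day; the Gregorian
    # calendar keeps only those in years divisible by 400.
    return century - century // 4 - 2


for year in (1582, 1752, 1918):
    print(year, julian_gregorian_gap(year))
```

So a country switching in 1582 dropped 10 days, one switching in 1752 dropped 11, and one switching in 1918 dropped 13.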


None of these account for relativistic shifts, either. Time is local, not universal.
posted by jenkinsEar at 10:07 AM on February 6, 2013 [4 favorites]


Leap seconds in the context of tick counting are a distraction. They should only be applied when converting the count to a time. And the relativistic point is a good one, because we reach levels of precision where they come into play pretty frequently these days. (By "we" I mean humans.) I think the best you can do is a counter in an agreed-upon inertial reference frame, and an exhaustive database of transformation rules. But getting from a time to a number on that counter would not be trivial. You'd have to know the jurisdiction, the calendar, and timezone in use at the time in question (which causes unresolvable ambiguity in some cases since you don't actually know exactly how that time fits on the timeline yet), the number of leap seconds recognized in that jurisdiction before the time in question, and other local traditions with regard to time.
posted by Nothing at 10:23 AM on February 6, 2013 [1 favorite]


Not to mention: information could be lost in the transformation.
posted by Nothing at 10:25 AM on February 6, 2013


It sounds like this.person needs an XML schema.
posted by double block and bleed at 10:49 AM on February 6, 2013


Also, what about cultures where calendar days start at a time other than midnight, something that continues to this day (culturally if not legally)?
posted by jepler at 11:07 AM on February 6, 2013


In retrospect, it's kind of amusing that most of the ancient/medieval histories I've read had at least a page explaining the system for the proper nouns the book will use and how it has or hasn't been adapted to modern English tastes but not a single word about how the chronology is connected to anything.

Most books on British history will specify whether spelling has been modernised in quotes and which dating system has been used (modern or original). One habit is to leave the month & day as in the Old Style, but with year changing at January 1 (as it did in some records but not others). This just confuses me (am I looking at Jan 1630 before April 1630, or the Jan 1630 AFTER April 1630 - I was misdating a certain event for ages because of this), so I always use the double year when relevant (January to March).

As for historians being more interested in philology than numbers -- that's a function of academic fashion. There were lots of number crunchers in the 60s, and there still are in certain places and fields (obviously in economic history, there is lots of math). But the way that the Historical academy is currently set up, there isn't that much to attract or support historians interested in working quantitatively, let alone doing complex chronological analysis. I know one person who has done in-depth chronological analysis, and he invented his own system. He uses ISO dates.
posted by jb at 11:53 AM on February 6, 2013


It sounds like this.person needs an XML schema.

Now you have two problems.
posted by Joe in Australia at 11:57 AM on February 6, 2013 [5 favorites]


modern never-before-1970 timestamp

32-bit unix timestamps can count back as far as December 1901 (-(2**31)). 1970 is just the 0 point.
posted by Pruitt-Igoe at 12:01 PM on February 6, 2013 [2 favorites]
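
The range of a signed 32-bit Unix timestamp is easy to confirm in Python. The sketch uses plain `timedelta` arithmetic from the epoch rather than `datetime.fromtimestamp`, since negative timestamps are not handled on every platform; the names are illustrative.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

INT32_MIN = -(2 ** 31)
INT32_MAX = 2 ** 31 - 1


def from_unix(seconds: int) -> datetime:
    """Convert a Unix timestamp (possibly negative) to a UTC datetime."""
    return EPOCH + timedelta(seconds=seconds)


earliest = from_unix(INT32_MIN)
latest = from_unix(INT32_MAX)
print(earliest.isoformat())
print(latest.isoformat())
```

The earliest representable moment falls in December 1901 and the latest in January 2038, with 1970 sitting at zero in the middle.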


This just confuses me (am I looking at Jan 1630 before April 1630, or the Jan 1630 AFTER April 1630 - I was misdating a certain event for ages because of this), so I always use the double year when relevant (January to March).

Actually, thinking about it, some books on Islamic history must have gone into dating systems because of some similar calendar conversion issues there. It's not something you tend to see in books about, say, Alexander the Great, though, even when they start throwing around dates authoritatively.

As for historians being more interested in philology than numbers -- that's a function of academic fashion.

Undoubtedly. It's certainly not inherent to the discipline, which is somewhere in between a social science and a humanity and should be able to use a wide variety of interpretive methods.
posted by Copronymus at 2:47 PM on February 6, 2013


Date and time handling are one of those things, like text encoding, that seem like a straightforward problem on their face but are really, really hard to get right once you start dealing with edge cases

Anyone who's assigned (or takes upon herself) a programming task which includes a calendar should assign themselves the punishment of writing that sentence on the blackboard 500 times.

Once tried it's easy to understand why astronomers and Unix programmers threw up their hands and made up their own. Conversions from everyday usage to a float can be a PITA but there lies shelter from the storm.
posted by Twang at 2:58 PM on February 6, 2013


> The only people I can think of who really need to do that are international historians who are worried about something like exactly what day something happened in a Julian country when compared to a Georgian calendar country.

Or historians of Russia who have to deal with the 1918 changeover, not to mention the one from 7208 to 1700.
posted by languagehat at 10:36 AM on February 7, 2013 [2 favorites]



