Time-filling links
April 24, 2012 7:08 AM

Aldo Cortesi blogs about the interface between computer science and visualisation. He has found some interesting applications for space-filling curves: making colour maps of images and of executable binary files. A bit more work and one can visualise entropy in binary files. The cryptographic material sticks out like a sore thumb.

All his code for playing with space-filling curves is up on github.

Related cool fact: space-filling curves are ideal for maintaining memory locality in cache-oblivious algorithms.
posted by Talkie Toaster (19 comments total) 22 users marked this as a favorite
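
For anyone wondering what the curve mapping looks like in code, here is a minimal sketch of the textbook index-to-coordinate conversion for a Hilbert curve. It is not taken from Cortesi's repository; the function name `hilbert_d2xy` and the `order` parameter are names chosen here for illustration. The property that matters for locality is that consecutive indices always land on neighbouring cells.

```python
def hilbert_d2xy(order, d):
    """Map index d along a Hilbert curve to (x, y) on a 2**order by 2**order grid."""
    n = 1 << order          # side length of the grid
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)   # which half of the current quadrant pair, horizontally
        ry = 1 & (t ^ rx)   # which half, vertically
        if ry == 0:
            if rx == 1:
                # flip the sub-square so the curve stays continuous
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x     # transpose the sub-square
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


# The four cells of the smallest (order 1) curve, in visiting order.
print([hilbert_d2xy(1, d) for d in range(4)])  # [(0, 0), (0, 1), (1, 1), (1, 0)]
```

To lay out a byte stream this way, byte number `d` would be drawn at `hilbert_d2xy(order, d)`, so bytes that are close together in the file stay close together in the picture.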

This is really interesting. He also made these nice sorting algorithm visualizations.
posted by scose at 7:25 AM on April 24, 2012

*The cryptographic material sticks out like a sore thumb.*

It would be nice if this would get crypto fetishists to shut up, but on past form they'll probably just start insisting everyone fill up the empty space on their drives with output from /dev/random. I have nothing against crypto as such, but the 'encrypt everything' crowd remind me of the concealed carry advocates. As a matter of fact, I have very little data that I consider worth encrypting and think burning CPU cycles to encrypt everything as a matter of course is a dreadful waste of time and energy.
posted by anigbrowl at 7:26 AM on April 24, 2012

I wish I were smart.
posted by clvrmnky at 7:31 AM on April 24, 2012 [2 favorites]

anigbrowl: if you ever go to a web page with a picture of a naked person, a copy of that image is stored on disk. It is rarely accompanied by proof of the model's age.
posted by idiopath at 7:32 AM on April 24, 2012

It does seem like he drinks some of his own kool-aid about this Hilbert cube ordering. He uses it here for sorting algorithm visualizations where, in my opinion, a standard smooth colormap would be more effective.
posted by scose at 7:34 AM on April 24, 2012

I'm not totally convinced by the cryptographic stuff. He finds "cryptographic material" but not something that's actually encoded. This is like the difference between finding a codebook on a spy and finding a regular book where you don't know if there's a message in it.

Which isn't to say you couldn't find encoded material this way, because yes, the entropy should be off. But the solution is pretty obvious and easy: encode your steganographic material more evenly throughout the file. Use every other byte or every 10th byte or *every* byte so he never finds a peak.
posted by DU at 7:41 AM on April 24, 2012
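
A rough sketch of the "no peak" idea above, under stated assumptions: entropy is measured per 256-byte window from the byte histogram (Cortesi's exact measure may differ), the "payload" is pseudo-random bytes standing in for encrypted data, and the carrier is plain repeated text. The names `byte_entropy` and `window_entropies` are made up for this example.

```python
import math
import random
from collections import Counter


def byte_entropy(data):
    """Shannon entropy of the byte histogram, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


def window_entropies(data, window=256):
    """Entropy of each non-overlapping window of `window` bytes."""
    return [byte_entropy(data[i:i + window])
            for i in range(0, len(data) - window + 1, window)]


rng = random.Random(1)  # fixed seed so the sketch is repeatable
carrier = b"all work and no play makes jack a dull boy " * 2000
payload = bytes(rng.randrange(256) for _ in range(4096))  # stand-in for ciphertext

# Variant 1: the payload sits in one contiguous run.
clumped = bytearray(carrier)
clumped[8192:8192 + len(payload)] = payload

# Variant 2: one payload byte at every 16th position.
spread = bytearray(carrier)
spread[0:len(payload) * 16:16] = payload

print("highest window entropy, clumped:", round(max(window_entropies(clumped)), 2))
print("highest window entropy, spread: ", round(max(window_entropies(spread)), 2))
```

The clumped variant produces windows that are entirely payload, so they stand out; the spread variant only nudges every window up a little. Spreading hides the local peak, not the fact that the file as a whole is slightly noisier than the untouched carrier.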

*anigbrowl: if you ever go to a web page with a picture of a naked person, a copy of that image is stored on disk. It is rarely accompanied by proof of the model's age.*

Do you mean on my disk? And if so, why should I care? I'm not in the publishing business, nor in the hobby of collecting porn, and when I did collect porn, it wasn't biased towards 'barely legal' or 'hot teen' models that could possibly be mistaken for underage ones.

I do read a lot of criminal cases (because I'm a law student) and every so often that includes cases about people being prosecuted for possession or distribution of child pornography. I rather dislike reading those, because the details of what some people consider erotic can be absolutely nauseating. I'll take my chances, because the vast majority of enforcement efforts against child pornography goes into prosecuting the egregious stuff.
posted by anigbrowl at 7:46 AM on April 24, 2012

While I don't encrypt my entire hard drive, I don't think "I'm not doing anything illegal anyway!" is a very good reason for that. You aren't doing anything that is *now* considered illegal (maybe). But in 10 years for instance, who knows what books, websites or acquaintances may be grounds for re-education.
posted by DU at 7:49 AM on April 24, 2012 [1 favorite]

Considering the historically expansionist trend of first amendment litigation, I'll take my chances with that too.
posted by anigbrowl at 7:54 AM on April 24, 2012

anigbrowl: you don't need to be "collecting" anything for the images to be stored. If the image shows up in your browser, a copy is stored on your hard drive (theoretically it is only stored short term, but I have seen exceptions; also mind you the cached image files usually have meaningless file names with no three letter extension, but they are still easy to find and view). I am just saying that many people who don't think they have anything to hide still end up with material that could get them in trouble stored on their computers.
posted by idiopath at 7:58 AM on April 24, 2012

Currently, thieves take cellphones to resell them. Cellphones are easy to steal, and easy to sell, and the stolen phones easy to keep using. This has now changed - stolen cellphones are going to be locked out of the network. Thieves can't sell them as phones, or to those who are going to use them as phones.

Now we have a conundrum, as smartphones continue to be easy to steal. What will the thieves do with them now?

Sell them to criminals who will retrieve personal information - such as bank account info, credit card numbers and expiration dates, passwords for webmail, web retailers and any number of other, potentially lucrative accounts.

There are facilities in Asia gearing up to receive stolen smartphones, and organized crime is setting up fencing operations here in the 'states to feed them, as they will be astonishingly cheap now that there's no other market for them, and there will always be thieves trying to sell them.

If it starts working out for them, a good business needs to expand, so notebooks and business' desktops and servers are next... if a junkie knows a pawn-shop or a flea-market stall that buys old HDDs for $10 a pop, no case or computer needed, they will disappear along with the usual haul of jewelry and high-end electronics during burglaries.

In addition to the mass-scale strip-mining operations gearing up, there are already local organizations operating on a small scale, specializing in stolen laptops and desktops. They buy on the cheap, no questions asked, raid the HDDs for personal info they can turn into easy money, and resell after wiping and reinitializing.

Encrypt your drives. Use a robust password on your user account.
posted by Slap*Happy at 8:20 AM on April 24, 2012

*anigbrowl: you don't need to be "collecting" anything for the images to be stored.*

What images? While I understand caching and the internals of browsers and operating systems perfectly well, thank you, all of this rests on the unfounded assumptions that a) I visit sites featuring porn on a regular basis and b) there's a reasonable probability of encountering legally dubious porn, such as that involving underage subjects. In 25 years of maintaining my own computer, I have yet to come across images that I wasn't aware I had downloaded. When I say I don't see any great point in encrypting my hard drives, I assure you that it's not due to a lack of awareness about the contents thereof, legal or illegal.
posted by anigbrowl at 8:32 AM on April 24, 2012

I often end up on websites with naked pictures without explicitly seeking them out. You don't need pictures of children to get in trouble; lack of proof that they aren't children can be enough.

This is a derail anyway. And regarding the larger point, being able to find out that something is encrypted due to entropy measurements is hardly an argument against encryption. But if you never see a website with a naked picture on it, and you don't store any passwords or personally identifying information, maybe you have no need for encryption? Who knows.
posted by idiopath at 8:48 AM on April 24, 2012

No cryptography enthusiast that I can imagine would use this as any evidence of anything, especially because the simplistic algorithm used here causes false positives on the ASCII<->EBCDIC translation tables in the ksh binary, which are anything but cryptographic. And even if they were, as with the code signatures, who cares? The fact that some binaries are signed on modern operating systems is anything but a secret. It's just an interesting visual, let's not read too much into it.
posted by Rhomboid at 9:14 AM on April 24, 2012

scose: "This is really interesting. He also made these nice sorting algorithm visualizations."

I'd love to see the visualization for Quantum Bogosort.
posted by mullingitover at 11:34 AM on April 24, 2012

There are two kinds of paranoia: total, and insufficient. I am both, because if you think you are sufficiently paranoid, you’re not.
posted by bukvich at 2:14 PM on April 24, 2012

I think he's detecting compression, not cryptography. If he ran this over a JPEG he should see high entropy through the entire file.
posted by nixt at 2:20 PM on April 24, 2012

Thanks, Talkie Toaster. That was an interesting set of articles.
posted by benito.strauss at 2:46 PM on April 24, 2012

nixt: "I think he's detecting compression, not cryptography."

Strictly speaking, he is detecting exactly what he says he is detecting: a simplistic measure of entropy. This can be a sign of good compression algorithms, cryptography, random data, corrupted data, or data that is ordered in a "complex" way (the character conversion table was predictable and ordered, but not in a way his entropy detector was designed to measure). I think it is interesting that a trivial measure of entropy, even one that does not deal with simple patterns like an orderly character table, can be a rule of thumb for finding things like cryptographic data or translation tables. This is an interesting look into how one would start to do data forensics (I am sure real data recovery / forensics tools do much more than just look at entropy, of course, but it seems like that may be an indicator that would help find boundaries between files or parts of a binary, for example).
posted by idiopath at 3:57 PM on April 24, 2012
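
To illustrate the point about a simplistic entropy measure, here is a small sketch that assumes the measure is plain Shannon entropy over the byte histogram (an assumption; the articles may compute it differently). The helper name `byte_entropy` and the sample data are invented for the example. A histogram-based measure cannot tell compressed data from random data, and it rates a perfectly ordered 256-entry table as maximally "random" because every byte value appears exactly once.

```python
import math
import random
import zlib
from collections import Counter


def byte_entropy(data):
    """Shannon entropy of the byte histogram, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


rng = random.Random(0)  # fixed seed so the sketch is repeatable
words = ["space", "filling", "curve", "entropy", "binary", "hilbert", "cache", "thumb"]
text = " ".join(rng.choice(words) for _ in range(20000)).encode("ascii")

samples = {
    "plain text": text,
    "zlib-compressed text": zlib.compress(text, 9),
    "pseudo-random bytes": bytes(rng.randrange(256) for _ in range(len(text))),
    "ordered 256-byte table": bytes(range(256)),  # like an identity translation table
}

for label, data in samples.items():
    print(f"{label:24s} {byte_entropy(data):.2f} bits/byte")
```

The ordered table scores exactly 8 bits per byte, the same ceiling that random data approaches, which is consistent with the false positive on the ksh translation tables mentioned upthread.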


This thread has been archived and is closed to new comments