Skip
# it's a Dell dude!

Quantum cryptography is still susceptible to the man-in-the-middle attack.

posted by vacapinta at 6:39 PM on December 2, 2003

Any text (eg this one) can be considered as a very long number. Just take a range of ASCII hexadecimals and "push them together". Converting a,b,c,d,e to abcde is a fast algorithm: it's e*x^0+d*x^1+c*x^2 ... where x is the base. Computationally trivial, probably actually un-necessary if you use the right method of operation on the numbers in computer memory.

Anyhow, once you have a DAMN_BIG_NUMBER, you can compress it, by expressing it as a string of primes: p(5325)^78*p(5172)^56 .... etc. This is going to be a long formula, but the longer the text to be compressed, the comparitively shorter the formula will be. It's computationally trivial to reverse these formulas and generate the original text.

If you can factorize quickly you can also search a few million numbers either side to find one that factorizes comparitively "neatly", ie has a notably shorter formula, and you can use its formula + or - the difference.

I guess, on no good evidence, that predicability of the length of factorization formulas is a problem of the same class as fast factorization. If these problems are solveable, so is "ultimate compression". Google's cache of the internet could be expressed by a formula a few megabytes, maybe even a few kilobytes, long. You could put the entire internet on a CD. Of course you'd have to uncompress it entirely into a space the size of Google's cache before you could use it, but multi-terabyte HDDs are coming too.

posted by aeschenkarnos at 6:51 PM on December 2, 2003

Um, largest

Reminds me of when I once heard on NPR that someone had found a new prime number that was 4 times larger than the previously largest-known prime.

"

posted by straight at 9:07 PM on December 2, 2003

It it qualifies by representing the same information in fewer bits.

posted by kindall at 9:07 PM on December 2, 2003

Now there's some

posted by LowDog at 10:06 AM on December 3, 2003

Post

# it's a Dell dude!

December 2, 2003 5:07 PM Subscribe

A Dell with a 2GHz Pentium processor owned by a Michigan State University student has found the world's largest prime number -- containing more than 6.3 million digits. The student was loaning his extra computer cycles to the GIMPS project [sort of like SETI and other monster farms]. Here's a web page he created about palindromic prime numbers before he became Mr. Biggest Prime Number Guy.

If anyone's interested in joining a distributed project I'd recommend Folding@home, a protein-folding analyzation effort out of Stanford. [previous mefi thread]

Searching for mersenne primes is a neat way of flexing computational muscles, but not terribly productive in the long term (in my opinion)

posted by spatula at 5:36 PM on December 2, 2003

Searching for mersenne primes is a neat way of flexing computational muscles, but not terribly productive in the long term (in my opinion)

posted by spatula at 5:36 PM on December 2, 2003

To give you an idea of how huge (and useless?) this number is - the total number of elementary particles in the known universe is a number about 80 digits long.

posted by vacapinta at 5:39 PM on December 2, 2003

posted by vacapinta at 5:39 PM on December 2, 2003

As far as using primes for encryption, their days are numbered. Pun intended.

posted by anathema at 6:14 PM on December 2, 2003

posted by anathema at 6:14 PM on December 2, 2003

Time to update the prime number shitting bear and goatse (second NSFW, obviously.)

posted by homunculus at 6:16 PM on December 2, 2003

posted by homunculus at 6:16 PM on December 2, 2003

*As far as using primes for encryption, their days are numbered. Pun intended.*

Quantum cryptography is still susceptible to the man-in-the-middle attack.

posted by vacapinta at 6:39 PM on December 2, 2003

**anathema**There's another use for primes that awaits the discovery of a fast factorizing algorithm, even an approximate one: Godel's compression method.

Any text (eg this one) can be considered as a very long number. Just take a range of ASCII hexadecimals and "push them together". Converting a,b,c,d,e to abcde is a fast algorithm: it's e*x^0+d*x^1+c*x^2 ... where x is the base. Computationally trivial, probably actually un-necessary if you use the right method of operation on the numbers in computer memory.

Anyhow, once you have a DAMN_BIG_NUMBER, you can compress it, by expressing it as a string of primes: p(5325)^78*p(5172)^56 .... etc. This is going to be a long formula, but the longer the text to be compressed, the comparitively shorter the formula will be. It's computationally trivial to reverse these formulas and generate the original text.

If you can factorize quickly you can also search a few million numbers either side to find one that factorizes comparitively "neatly", ie has a notably shorter formula, and you can use its formula + or - the difference.

I guess, on no good evidence, that predicability of the length of factorization formulas is a problem of the same class as fast factorization. If these problems are solveable, so is "ultimate compression". Google's cache of the internet could be expressed by a formula a few megabytes, maybe even a few kilobytes, long. You could put the entire internet on a CD. Of course you'd have to uncompress it entirely into a space the size of Google's cache before you could use it, but multi-terabyte HDDs are coming too.

posted by aeschenkarnos at 6:51 PM on December 2, 2003

Whoa. I've never heard of Godel compression.

More to the point, I don't buy it. The amount of data needed to encode factors is about the same as that needed to encode the original number.

In general, if you multiply an N digit number by an M digit number, you'll get a number with about M+N digits (i.e. 10^100=10^72*10^28).

Now, I am aware that you are encoding using prime cardinality rather than the primes themselves but you can consider this as just another "base". That is all you are doing is converting a number from base 10 to "base prime". And I fail to see how this qualifies as data compression...

posted by vacapinta at 7:50 PM on December 2, 2003

More to the point, I don't buy it. The amount of data needed to encode factors is about the same as that needed to encode the original number.

In general, if you multiply an N digit number by an M digit number, you'll get a number with about M+N digits (i.e. 10^100=10^72*10^28).

Now, I am aware that you are encoding using prime cardinality rather than the primes themselves but you can consider this as just another "base". That is all you are doing is converting a number from base 10 to "base prime". And I fail to see how this qualifies as data compression...

posted by vacapinta at 7:50 PM on December 2, 2003

Forget Godel compression, this is going to get him sooo laid!

(Er, maybe not.)

posted by Fofer at 8:44 PM on December 2, 2003

(Er, maybe not.)

posted by Fofer at 8:44 PM on December 2, 2003

*the world's largest prime number*

Um, largest

*known*prime number, surely?

Reminds me of when I once heard on NPR that someone had found a new prime number that was 4 times larger than the previously largest-known prime.

"

*Exactly*four times larger?" I wondered.

posted by straight at 9:07 PM on December 2, 2003

*I fail to see how this qualifies as data compression*

It it qualifies by representing the same information in fewer bits.

posted by kindall at 9:07 PM on December 2, 2003

I think the piece of info missing is that Marseinne (sp?) primes are prime numbers of the form (2^n)-1 (n is prime). N has quite a few fewer digits than the original prime. So, that's where the compression comes from.

For example, 253921 is 8191 ((2^13)-1) times 31 ((2^5)-1). If you represent it as 13, 5, you've compressed six digits (2,5,3,9,2,1) into three (1,3,5). As you scale to larger numbers, you get even more compression. If you use only Mersenne primes, you can call out, 3, 5 (the third and fifth Mersenne primes and compress even further.

posted by notsnot at 9:45 PM on December 2, 2003

For example, 253921 is 8191 ((2^13)-1) times 31 ((2^5)-1). If you represent it as 13, 5, you've compressed six digits (2,5,3,9,2,1) into three (1,3,5). As you scale to larger numbers, you get even more compression. If you use only Mersenne primes, you can call out, 3, 5 (the third and fifth Mersenne primes and compress even further.

posted by notsnot at 9:45 PM on December 2, 2003

There's more info missing than that. On average, you simply can't compress random text. Lots of info on the compression faq. It doesn't make sense to talk about compression without talking about the probability distribution of the source (or, if you prefer, the "patterns" in the text).

Also, compression of large arbitrary (nonrandom) strings has already been solved optimally by a number of methods. Methods which, I might add, are quite computationally tractable.

posted by Galvatron at 11:19 PM on December 2, 2003

Also, compression of large arbitrary (nonrandom) strings has already been solved optimally by a number of methods. Methods which, I might add, are quite computationally tractable.

posted by Galvatron at 11:19 PM on December 2, 2003

As a non-mathematician, can I ask: what's the damn point?

posted by Pericles at 1:55 AM on December 3, 2003

posted by Pericles at 1:55 AM on December 3, 2003

As a non-mathematician, can I reply: it's the dot that seperates the fraction from the whole number? (",)

posted by dash_slot- at 2:15 AM on December 3, 2003

posted by dash_slot- at 2:15 AM on December 3, 2003

Galvatron is right, the entropy of the source needs to be considered. But, just look at a back-of-the-napkin estimate:

notsnot: it can't just be Mersenne primes, because most numbers can't be factored only with Mersenne primes.

(I couldn't find a reference to Godel compression either via Google or Amazon, does anyone have a cite?)

posted by brool at 3:50 AM on December 3, 2003

- take a message with 100 bytes, convert to a number ~ 2^800
- average number of prime factors ~ ln ln x = 6.31
- number of primes = pi(x) ~ x/ln x = 1.2e238
- compute the entropy and you get an
*expansion*

notsnot: it can't just be Mersenne primes, because most numbers can't be factored only with Mersenne primes.

(I couldn't find a reference to Godel compression either via Google or Amazon, does anyone have a cite?)

posted by brool at 3:50 AM on December 3, 2003

this is all like that bugs bunny cartoon where he tunnels up from under the sand, opens his suitcase and an entire set of furniture pops out, including a table under a large beach umbrella holding a fruity drink in a tall glass with a tiny umbrella?

posted by quonsar at 4:02 AM on December 3, 2003

posted by quonsar at 4:02 AM on December 3, 2003

"

Hehe... maths is funny.

posted by Blue Stone at 4:32 AM on December 3, 2003

*Reminds me of when I once heard on NPR that someone had found a new prime number that was 4 times larger than the previously largest-known prime.*

"Exactly four times larger?" I wondered." - straight."Exactly four times larger?" I wondered.

Hehe... maths is funny.

posted by Blue Stone at 4:32 AM on December 3, 2003

Damn! You folks need to get a life! Go out and pound down a few beers and get laid.... or something!

posted by LowDog at 6:40 AM on December 3, 2003

posted by LowDog at 6:40 AM on December 3, 2003

I don't get the monster farms. By the time you spend all the time and labor and money to build one, plus all the electricity and space it will suck up, not to mention cooling requirements in the summer, you might as well buy a blade server. A 20 CPU (PIII 800mz) 3U rack unit with 1 power cord and 1 ethernet cord can be had for about $10k on ebay. Granted not everyone has $10k up front but thats probably what a monster costs in the big picture.

posted by stbalbach at 7:21 AM on December 3, 2003

posted by stbalbach at 7:21 AM on December 3, 2003

This simplest explanation of the compression problem I ever heard is this: imagine trying to compress a 4-digit number into a 3-digit number. There are 9999 possible 4-digit numbers and only 999 possible 3-digit numbers. There's no way you can represent every possible 4 digit number with one of the 3 digit numbers.

No possible compression scheme can compress

posted by straight at 8:32 AM on December 3, 2003

No possible compression scheme can compress

*every*x-byte message into an (x-1)-byte message, much less achieve compression savings of more than one byte.posted by straight at 8:32 AM on December 3, 2003

brool, sorry; yeah you're right. I *barely* recall writing that list night. Affluence of incohol, or something.

posted by notsnot at 8:50 AM on December 3, 2003

posted by notsnot at 8:50 AM on December 3, 2003

It makes sense that you can create any number from the product of several primes. If you use the indices of the primes for encoding, you would save digits when encoding larger numbers (eg. lots of text). The problem as pointed out generally by those who mention the compression problem is specifically this:

The number to encode must require more storage than the indices of all of its prime factors plus a number indicating the number of factors. So the number 10000 (only 5 digits/2 bytes in most reconing) would compress down to a not very small 9 bytes. (Assuming a modest 1 byte per prime factor and 1 byte for the factor count.) The idea is that any large number that is

You would have a better chance at compressing much, much larger texts, but in terms of time computationally (unless the mentioned problems of prime factorization are solved) you might as well print it out and UPS it.

Rather than wasting time telling people to get laid or drunk instead of doing math, why not get laid or drunk?

posted by ringmaster at 9:17 AM on December 3, 2003

The number to encode must require more storage than the indices of all of its prime factors plus a number indicating the number of factors. So the number 10000 (only 5 digits/2 bytes in most reconing) would compress down to a not very small 9 bytes. (Assuming a modest 1 byte per prime factor and 1 byte for the factor count.) The idea is that any large number that is

**not**prime - even moreso with one that is a product of many factors - could easily require more storage for its factors than the required storage for the raw number.You would have a better chance at compressing much, much larger texts, but in terms of time computationally (unless the mentioned problems of prime factorization are solved) you might as well print it out and UPS it.

Rather than wasting time telling people to get laid or drunk instead of doing math, why not get laid or drunk?

posted by ringmaster at 9:17 AM on December 3, 2003

*Rather than wasting time telling people to get laid or drunk instead of doing math, why not get laid or drunk?*

Now there's some

**useful**logic!

posted by LowDog at 10:06 AM on December 3, 2003

My source was "Godel, Escher, Bach" by Douglas Hofstadter. Brilliant, brilliant book.

Anyhow, the key to the 'compression thing' is that you don't have to use the actual exact factorization of the huge number in question. You use the smallest factorization within a few million (hell, a few hundred billion if you like) and add an offset.

Sort of like Merseine primes, but more generalized. The ideal is:

HUGE_NUMBER = X^Y + Z

or alternatively,

= p(X)^Y + Z

where p(n) = the n'th prime, because the primes get scarcer as n increases, hence a significant compression with numbers about a billion digits long.

And if you get another huge number in the course of doing this, which you probably will, then you can "compress" it by the same method. So,

HUGE_NUMBER = compress(x,y,compress(a,compress(i,j,k),c))

So long as there is a power of something (come to think of it it doesn't actually have to be a prime, which could speed the search up) within a few billion of the HUGE_NUMBER, we achieve a compression by converting HUGE_NUMBER to a formula.

Now, I make no claims to be a mathematician. I am nothing but a internet crank with a lot of time and interested in a wide range of things. (And I am a modest and rational crank, who's quite willing to change his mind if proven wrong.) So I could be completely off track here, and there could be enormous ranges of numbers which lack any "small formulae" at all. I dunno. So if there's anything to this--and it is, as I said, based on a comment by Douglas Hofstadter about Kurt Godel's work--then it is for mathematicians and programmers to find. Not cranks, not playas, and not drunkards.

posted by aeschenkarnos at 5:20 PM on December 3, 2003

Anyhow, the key to the 'compression thing' is that you don't have to use the actual exact factorization of the huge number in question. You use the smallest factorization within a few million (hell, a few hundred billion if you like) and add an offset.

Sort of like Merseine primes, but more generalized. The ideal is:

HUGE_NUMBER = X^Y + Z

or alternatively,

= p(X)^Y + Z

where p(n) = the n'th prime, because the primes get scarcer as n increases, hence a significant compression with numbers about a billion digits long.

And if you get another huge number in the course of doing this, which you probably will, then you can "compress" it by the same method. So,

HUGE_NUMBER = compress(x,y,compress(a,compress(i,j,k),c))

So long as there is a power of something (come to think of it it doesn't actually have to be a prime, which could speed the search up) within a few billion of the HUGE_NUMBER, we achieve a compression by converting HUGE_NUMBER to a formula.

Now, I make no claims to be a mathematician. I am nothing but a internet crank with a lot of time and interested in a wide range of things. (And I am a modest and rational crank, who's quite willing to change his mind if proven wrong.) So I could be completely off track here, and there could be enormous ranges of numbers which lack any "small formulae" at all. I dunno. So if there's anything to this--and it is, as I said, based on a comment by Douglas Hofstadter about Kurt Godel's work--then it is for mathematicians and programmers to find. Not cranks, not playas, and not drunkards.

posted by aeschenkarnos at 5:20 PM on December 3, 2003

« Older Above the Law? Maybe Not. | The Case Against Miloševic Newer »

This thread has been archived and is closed to new comments

posted by tiamat at 5:29 PM on December 2, 2003