December 31, 2009 5:39 AM

XKCD random number generator explained on StackOverflow

posted by srboisvert (89 comments total) 7 users marked this as a favorite

(also the fact that the Dilbert cartoon is the top Stack Overflow answer indicates that people do not understand the XKCD comic)

posted by null terminated at 5:49 AM on December 31, 2009

Hmm, will choose my reading of the cartoon at random

posted by fallingbadgers at 5:55 AM on December 31, 2009

In fact, it's worse than (2). There is no way to distinguish with certainty any function from a random number generator using a finite number of tests. Obviously you can look at the probability distribution and say 'I'm 99.999999% sure this isn't random', but you can't be CERTAIN.

Is it ROSENCRANTZ AND GILDENSTERN ARE DEAD where one of the characters flips a coin continually and always gets heads? I've always wanted to expand that concept into the basis for a movie exploring the concepts of chance and luck.

posted by unSane at 5:59 AM on December 31, 2009 [7 favorites]

When I first saw the comic, I thought the joke was a statistical fallacy - namely, programmer needed a random number generator (rather than just a number) and modeled it after the roll of a die, but did not roll sufficiently many numbers to generate an algorithm. Sort of a "one point determines a line" kind of thing.

posted by backseatpilot at 6:00 AM on December 31, 2009

What's to get? The function states it returns a random number. The "4" was chosen randomly and is therefore a random number. Presto, mission accomplished.

posted by Bovine Love at 6:05 AM on December 31, 2009

Oh boy! Sleep! That's where I'm a random number generator!

posted by billysumday at 6:16 AM on December 31, 2009 [15 favorites]

4? That's where I'm a Viking!

posted by Mayor West at 6:16 AM on December 31, 2009 [3 favorites]

damnit billysumday

posted by Mayor West at 6:16 AM on December 31, 2009 [3 favorites]

"*Is it ROSENCRANTZ AND GILDENSTERN ARE DEAD where one of the characters flips a coin continually and always gets heads?*"

Yep, that's the one.

posted by tdismukes at 6:17 AM on December 31, 2009

Darn.

posted by goodnewsfortheinsane at 6:20 AM on December 31, 2009 [1 favorite]

If you only use the function once (without knowing the internals), it is as good as any random number generator out there.

posted by vacapinta at 6:22 AM on December 31, 2009 [1 favorite]

This is very exciting for me, especially because at some point in college I decided that I had this dream of being a random number generator. I would make up numbers and divide them by seven and keep track of the results and if approximately one out of seven was evenly divisible I counted it as a success (please note: I was an English major. This is probably clear from the lack of mathematical rigor in my plan.). My husband (then boyfriend) has mocked my ambitions of being a random number generator, but I am used to people laughing at my dreams. I feel that this comic vindicates me because it shows that no one can prove that I'm NOT a random number generator. I win! Take THAT, Mr. Pterodactyl!

posted by Mrs. Pterodactyl at 6:26 AM on December 31, 2009 [15 favorites]

Oh, boy, Random Number Generators! That's where I'm metaphorically a Viking!

posted by Plutor at 6:40 AM on December 31, 2009

I always took it to be a programmer who was so smart-dumb that he thought he could write a method that was random by simply getting a number he knew to be random. Kind of like the Yahoo Answers question on how to make your desktop wallpaper into a mirror, where the first obvious solution the asker saw was to scan a mirror.

For a logical machine like a computer, randomness is impossible. Hence, we need pseudo-random algorithms to find numbers that are random, so far as the end user is concerned, using a seed like milliseconds from midnight or noise from a webcam or microphone.

Granted, I have heard that some computers can actually return a random number via a decaying transistor, which is governed by quantum physics, and thus really and truly random.

PS: CPU cycles be damned, I actually kind of would like a mirror desktop. With a webcam and Windows 7 or OSX Snow Leopard, can I do that?

posted by mccarty.tim at 6:43 AM on December 31, 2009 [2 favorites]
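To make the seed point above concrete, here is a minimal sketch, assuming Python's standard `random` and `os` modules. The seed value and the helper names `seeded_rolls` and `os_entropy_byte` are made up for illustration. A seeded pseudo-random generator replays exactly the same rolls, while `os.urandom` asks the operating system for bytes from whatever entropy source the platform provides (which may or may not involve dedicated hardware).

```python
import os
import random


def seeded_rolls(seed, count=5):
    """Return `count` die rolls from a PRNG started at `seed` (hypothetical helper)."""
    rng = random.Random(seed)
    return [rng.randint(1, 6) for _ in range(count)]


# The same seed always reproduces the same "random" rolls.
first = seeded_rolls(seed=20091231)
second = seeded_rolls(seed=20091231)
print(first == second)  # True


def os_entropy_byte():
    """Return one byte (0-255) from the operating system's entropy source."""
    return os.urandom(1)[0]


# Not reproducible: the value depends on the OS entropy pool.
print(os_entropy_byte())
```

The reproducibility is a feature for simulations and tests, and the reason a guessable seed is a problem anywhere unpredictability matters.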

And if someone would just make a very large library of such one-use random number generating functions, bingo! You could just call them in any old order you like.

posted by Wolfdog at 6:45 AM on December 31, 2009 [2 favorites]

You have to *look inside the bean* to determine its nature, as well as its configuration with respect to other beans, and to the plate.

On the basis of its post-plated *output,* a given bean may *appear* to have been a wayward piece of broccoli, but detailed review of its internal structure is necessary to *conclusively determine* the truth of this assessment.

posted by killdevil at 6:45 AM on December 31, 2009 [3 favorites]

Also, how far off are we, in terms of metamaterials, from a screen that can dynamically change its surface to be near 100% reflective?

posted by mccarty.tim at 6:46 AM on December 31, 2009

It's also a joke about how developers can implement code that follows the letter of the requirements but totally misses the point of them. Product management asked for a function that returned a random number without specifying that it had to be a different random number each time. Development gave them what they asked for.

I've been in meetings where marketing/product management says, "this is useless, no customer would want this," and development says, "but that's what it said in the PRD."

posted by octothorpe at 6:54 AM on December 31, 2009

The Babbage quote in the second-highest Stack Overflow answer is the only suitable commentary on this entire sad mess.

posted by Skorgu at 6:55 AM on December 31, 2009 [2 favorites]

One subtlety of this that passed me by until I read the comments a bit further down the SO page is that the implied question (is 4 a "random number"?) does not make sense. Numbers are just numbers, and randomness is not one of their attributes. Randomness is a property of sequences of numbers, which is why to get a "random number" in Java you would call a function like .get*Next*Int().

A sequence of all 4s fails any sensible randomness test.

Thanks srboisvert, this was an interesting plate of beans for a lazy day off.

posted by mjg123 at 6:56 AM on December 31, 2009
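One way to see the "sequence of all 4s fails any sensible randomness test" claim is a chi-square frequency test. This is a rough sketch, not anything from the Stack Overflow page: `get_random_number` stands in for the comic's function, `chi_square_die` is a hypothetical helper, and 11.07 is the usual 5% critical value for five degrees of freedom.

```python
import random
from collections import Counter


def chi_square_die(rolls):
    """Chi-square statistic for six-sided die rolls against a uniform distribution."""
    expected = len(rolls) / 6
    counts = Counter(rolls)
    return sum(
        (counts.get(face, 0) - expected) ** 2 / expected for face in range(1, 7)
    )


def get_random_number():
    """Stand-in for the comic's function: always returns the same value."""
    return 4


all_fours = [get_random_number() for _ in range(600)]

rng = random.Random(0)  # seed chosen arbitrarily for the comparison
fair = [rng.randint(1, 6) for _ in range(600)]

CRITICAL_5_DOF = 11.07  # 5% critical value, chi-square with 5 degrees of freedom

print(chi_square_die(all_fours))  # 3000.0
print(chi_square_die(all_fours) > CRITICAL_5_DOF)  # True
print(chi_square_die(fair))
```

A statistic of 3000 against a threshold of about 11 is the "99.999999% sure" kind of evidence mentioned upthread: overwhelming, but still a probability statement rather than a proof.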

Yeah, half the people on SO are wrong, including the most popular answer. Heh. (obviously the top answer was picked because it was another comic, and kind of funny)

*Is it ROSENCRANTZ AND GILDENSTERN ARE DEAD where one of the characters flips a coin continually and always gets heads? I've always wanted to expand that concept into the basis for a movie exploring the concepts of chance and luck.*

There was a TV show on Fox way back when called Strange Luck that was kind of like what you're talking about.

posted by delmoi at 6:57 AM on December 31, 2009

Radiolab had a good show on stochasticity a few months ago. Definitely worth the listen for the tale of the two Laura Buxtons.

posted by longdaysjourney at 7:02 AM on December 31, 2009 [2 favorites]

mjg - and yet it could be the legitimate output of a RNG.

If you flip a coin in front of most Americans, and it comes up heads 10 times in a row, and you ask them "is it more or less likely to come up heads again", they'll tell you "more likely to be tails". This is false. The nature of a stream of random numbers is specifically that the number at position *n* cannot be determined by simply looking at *n-1*, etc. Also, the number at position *n* tells you nothing about *n+1*.

So much of where we fail as a country is because a vast majority of the populace hasn't even a rudimentary understanding of math and statistics.

posted by petrilli at 7:18 AM on December 31, 2009 [1 favorite]
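The independence claim above can be checked empirically. The following is a small simulation sketch, assuming an idealized fair coin modeled with Python's `random` module; the function name, the seed, and the flip count are arbitrary choices for illustration. It looks at every flip that follows ten heads in a row and counts how often that next flip is heads.

```python
import random


def heads_after_streak(num_flips, streak=10, seed=0):
    """Count flips that follow `streak` consecutive heads, and how many were heads."""
    rng = random.Random(seed)
    run = 0  # length of the current run of heads
    followups = 0  # flips that came right after a full streak
    followup_heads = 0  # how many of those flips were heads
    for _ in range(num_flips):
        heads = rng.random() < 0.5
        if run >= streak:
            followups += 1
            followup_heads += heads
        run = run + 1 if heads else 0
    return followup_heads, followups


heads, total = heads_after_streak(1_000_000)
print(total, heads / total)  # the fraction should hover around 0.5
```

This only says something about the idealized model, of course. It has nothing to say about whether a physical coin in someone's hand is fair.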

Just as an aside, I was doing some work several years ago with a true RNG (based on the concepts of avalanche breakdown), and we happened to see a sequence of numbers monotonically increasing. 28 of them in a row. And yet, it was a provably random number source.

As someone once said (who?), you can find anything you want in an irrational number, like pi.

posted by petrilli at 7:22 AM on December 31, 2009

Random numbers are fun because on the one hand we encounter randomness constantly, and most people feel like they understand it, and on the other hand, it's very difficult to understand rigorously, and specific examples can confound your expectations. Early in my career, I wrote a program that simulated a cloud of two kinds of dots on a 2D lattice: they both drifted randomly, but one was biased to mostly drift down, and the other up. We were interested in what kinds of biases and densities produced "traffic jams". Early on I found out that good random number generators were slow, but fast ones produced weird results. A bad, fast random number generator would cause the dots to jam up into each other at lower densities, when a good one wouldn't, and this was completely repeatable. Scrambling the order of the list of dots in memory caused the bad number generator to produce the same results as the good one. This left a big impression on me, and at least once I've been able to completely blow a colleague's mind by listening to his tale of woe about not getting a Monte Carlo simulation to run right, and promptly telling him to get a better random number generator. Sadly, good random number generators are the default in most software now, so I will not be able to walk the Earth helping misguided scientists by fixing their random numbers.

posted by Humanzee at 7:23 AM on December 31, 2009 [3 favorites]
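For a feel of how a fast generator can go wrong, here is an illustrative sketch of a linear congruential generator. It is not the generator from the anecdote above, which is not named. The constants are ones commonly quoted for classic C library `rand()` implementations and are used here only as an assumption. With a power-of-two modulus, an odd multiplier, and an odd increment, the lowest bit of the output simply alternates.

```python
def lcg(seed, count, a=1103515245, c=12345, m=2**31):
    """Generate `count` values from a linear congruential generator.

    The default constants are illustrative; many real libraries differ.
    """
    x = seed
    out = []
    for _ in range(count):
        x = (a * x + c) % m
        out.append(x)
    return out


values = lcg(seed=1, count=8)
low_bits = [v & 1 for v in values]
print(low_bits)  # [0, 1, 0, 1, 0, 1, 0, 1]
```

A simulation that uses that low bit to pick "up" or "down" would get a perfectly regular pattern, which is the kind of hidden structure that can make a Monte Carlo run behave strangely.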

on preview, petrilli, I remember an episode of Mythbusters when Adam rejected a bread-flipping machine as not random because the same side came up 7 times out of 10, instead of 50/50.

posted by Humanzee at 7:25 AM on December 31, 2009

You know, it's really quite easy to differentiate between heads and tails on a typical American quarter by touch alone. If you hold it between your thumb and first two fingers, with a light rub you can determine whether your thumb is touching Washington's smooth, smooth face, or something else. Then, you can smack the quarter down on the back of your hand, the back of someone else's hand or on the table and declare with 100% certainty whether it is heads or tails. It's important to transition from the "catch" to the "rub washington's face" to the "smack down" in one fluid motion and this requires some practice. When they ask you, "how do you know, every time?" it's important to deflect. I usually kind of sigh and say, "well to be honest I'm just not really sure. I think my brain can like, subconsciously see how high the quarter goes or something and how many rotations it makes. Here, *you* flip the quarter to me, I'll catch it and put it on *your* hand and we'll see if it still works."

I have been using this trick to win bar bets for years. Now, metafilter, it is yours.

posted by Baby_Balrog at 7:26 AM on December 31, 2009 [10 favorites]

petrilli: *If you flip a coin in front of most Americans, and it comes up heads 10 times in a row, and you ask them "is it more or less likely to come up heads again", they'll tell you "more likely to be tails". This is false. The nature of a stream of random numbers is specifically that the number at position n can not be determined by simply looking at n-1, etc. Also, the number at position n tells you nothing about n+1.*

Well, actually they're even more wrong than you think. The right answer would be that it's more likely to come up heads:

Preliminary analysis of the video-taped tosses suggests that a coin will land the same way it started about 51 percent of the time. "It's a gem-like example of what we know that isn't so," Diaconis says. Though a skeptic since childhood, he believed that "if you flipped a coin vigorously, it was going to be fair." [source]

"But it's not so bad," he says. "One in a hundred is pretty close, actually. It gives me faith that probability assumptions can be validated and useful, but you have to look at them case by case."

posted by Kattullus at 7:34 AM on December 31, 2009 [1 favorite]

That's assuming of course that you don't turn the coin around after you've flipped it.

posted by Kattullus at 7:37 AM on December 31, 2009

I see no mention in the function of sequence. Clearly some of you lack the preciseness to implement such a function correctly.

posted by Bovine Love at 7:47 AM on December 31, 2009

I think the majority of careful people (ie not me in my comment above, sigh) would say "infinite sequence" (eg), to make it a useful statement.

posted by mjg123 at 7:53 AM on December 31, 2009

If 4 is indistinguishable from a random number I don't understand why they don't just use 4 instead of building these random number generators.

posted by weapons-grade pandemonium at 7:55 AM on December 31, 2009

mccarty.tim : *CPU cycles be damned, I actually kind of would like a mirror desktop. With a webcam and Windows 7 or OSX Snow Leopard, can I do that?*

1.) Set background to a neutral color. Black maybe.

2.) Order a sheet of mirror silver window tint.

3.) Cut to fit, mount onto screen.

Working(ish) monitor, working mirror, all in one.

posted by quin at 7:59 AM on December 31, 2009

1) A programmer, needing to write a function that returned a random number, rolled a die once and got the number four, which he hardcoded into the function.

2) A function that returns "4" repeatedly cannot easily be distinguished from a random number generator without repeated tests (as illustrated by the Dilbert comic).

Both the text accompanying the XKCD comic as well as the fact that interpretation (2) would make sense only if the source code of the function were hidden indicate that (1) was what Randall Munroe was going for, but (2) seems to be how many people interpret the comic, including many of the Stack Overflow respondents.

You're being too kind. (2) is for idiots. We are looking at the source-code. An arbitrary string of 4s could be the output of a RNG, but in context we know we are not looking at an RNG because we have access to the underlying process. Come on! "Random" means that future output of the function is unpredictable! This is not unpredictable!

And of course mjg is right that statistical tests for randomness will reject a long sequence of 4s as very unlikely (although whether you can conclude a given sequence is the product of a random process will depend not only on some frequentist statistical test but on the prior distribution of possible random vs. non-random processes).

posted by grobstein at 8:13 AM on December 31, 2009 [1 favorite]

You could, however, use this function to generate a random sequence of 4s.

posted by grobstein at 8:21 AM on December 31, 2009 [1 favorite]

A random sequence of 4s would probably be faster than Mersenne twister.

That'd be a good name for a drink, now that I think about it.

posted by Plutor at 8:25 AM on December 31, 2009

There is only one way to interpret that comic and it is the first one. The second interpretation is excluded by the comment.

posted by DU at 8:32 AM on December 31, 2009

Is that sort of how most people think they're smarter than average? :) Anyway, is it even so bad? Yes, har har, the coin doesn't know it's landed heads up ten times already. But I don't think that this trick question—and it is, really—proves much.

So you meet some American and flip a coin ten times in front of him. Of the 1,024 possible outcomes, only two (all heads, all tails) are really interesting enough to set him up for the fallacy. Thus your new friend is reasonably stunned to have borne witness to the kind of event that has only a 1 in 512 chance of happening. Seizing the opportunity, you ask him, "Is it more or less likely to come up heads again?" He thinks for a bit and says, "Less—" but you cut him off: "Fool! Every flip is equally likely to come up heads!" You dash off in search of a new victim, confident in your superiority, while he finds someone more pleasant to be friends with.

But how about this: in a sequence of 10 coin flips, the odds of coming up all heads is 1 in 1024. In a sequence of 11 coin flips, the odds of coming up all heads is 1 in 2048. Right? So your subject correctly intuited that the second scenario is less likely than the first, and told you as much. Sure, they weren't thinking it through all the way. But if you conducted a survey, I doubt

(Hopefully, I managed to pull off that simple math up there correctly. If not, I guess you're right, and I'll be on my way to join my friends at the low end of the bell curve :)

posted by Garak at 8:41 AM on December 31, 2009 [2 favorites]
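For anyone who wants to check the arithmetic in the comment above, a short sketch with exact fractions (assuming a fair coin; `p_all_heads` is just a helper name for this example):

```python
from fractions import Fraction


def p_all_heads(n):
    """Probability that n fair coin flips all come up heads."""
    return Fraction(1, 2) ** n


p10 = p_all_heads(10)
p11 = p_all_heads(11)

print(p10)        # 1/1024
print(p11)        # 1/2048
print(2 * p10)    # 1/512, all heads or all tails in 10 flips
print(p11 / p10)  # 1/2, chance of an 11th head given 10 heads already
```

Both intuitions show up here: eleven heads in a row is less likely than ten, and yet once ten have already happened the next flip is still an even bet.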

This may well be the most plate-of-beans thread ever, a perfect way to end the year.

posted by davejay at 8:42 AM on December 31, 2009 [2 favorites]

I always felt that the interpretation of the comic was that generating good pseudorandom numbers is hard, there are a lot of pitfalls to watch for, and people tend to want to make the numbers feel "randomer"* even though that generally makes them less random. And that leads to mind-bogglingly stupid results.

* For example, one might make the numbers "feel" more random by ensuring that if someone generates seven numbers between 0 and 7, that each value comes up exactly once. Or one might make sure the same number is generated twice in a row. Or one might not like the "pseudo" part of pseudorandom and decide that a fair dice roll is far superior to any algorithm.

posted by Blue Jello Elf at 8:54 AM on December 31, 2009

Metafilter: random, but predictable.

posted by blue_beetle at 8:55 AM on December 31, 2009

From Stack Overflow:

*In an infinite sequence of random numbers, you will see infinite sequences of the same number. - Lasse V. Karlsen*

Is there mathematical proof for this? – StackedCrooked

@StackedCrooked: It isn't true. But you would see arbitrarily long sequences of the same number. – Jason Orendorff

Methinks Mr. Orendorff does not understand the peculiar qualities of infinity.

posted by scrowdid at 8:56 AM on December 31, 2009 [1 favorite]

What a perfectly tasty plate of beans!

posted by The Esteemed Doctor Bunsen Honeydew at 8:57 AM on December 31, 2009

Well, one of you two doesn't.

posted by Wolfdog at 9:04 AM on December 31, 2009 [1 favorite]

This is wrong, and the reason why is actually quite subtle. (this is all shamelessly ripped off from NNT, who is usually anything but subtle.)

You talk about the nature of a stream of random numbers, and the sequence of physical - real world - coin flips as if they were interchangeable. Reasonable, because most people were taught about randomness using dice and coins as examples.

The thing is, you not only have no guarantee that a "fair" coin is heads 50% of the time (and some posters above indicate otherwise) you also have no indication that this coin really is fair.

If someone flipped a coin for me and it came up heads 10 times in a row, I would think "this is a rigged coin", especially if there was money riding on the outcome.

posted by atrazine at 9:13 AM on December 31, 2009

Outside of the humorously pedantic interpretations of the interpretation of `getRandomNumber()` (and I insist since no mention of sequences, or "new" numbers is made, the function meets it perfectly), consider this possible scenario:

1) Spec says module must be identified by a GUID.

2) Programmer looks up GUID and discovers GUIDs often have a random bit. For the sake of ease, we will say it is all random (but GU, at least to a reasonable assurance).

3) Programmer implements getGUID(), which calls `getRandomNumber()`, which he also implemented in the interests of "modularity" (but, of course, is only ever called by that one function).

4) Testing quickly points out that the GUID should always be the same for the same module (where same is the same class-level, not instance). Sure it uses a random number, but a given module should always use the *same* random number.

5) Maintenance programmer modifies getRandomNumber() to return a fixed, but randomly selected, number.

A completely plausible* reason to have `getRandomNumber()` returning a fixed, but randomly selected number. Of course, a couple of years down the road, some other maintenance programmer will stumble upon the `getRandomNumber()`, go WTF???? and "fix" it.

(*) If you don't think this is plausible, you have not read enough code.

posted by Bovine Love at 9:19 AM on December 31, 2009 [6 favorites]
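The scenario above might end up looking something like this. It is a purely hypothetical sketch in Python: the snake_case names mirror the comment's `getRandomNumber()` and `getGUID()`, and packing the number into a `uuid.UUID` is an assumption made only to have something runnable.

```python
import uuid


def get_random_number():
    # Step 5: a fixed, but (once upon a time) randomly selected, number.
    return 4


def get_guid():
    # Step 3: the module's GUID is built from the "random" number.
    # Hypothetical packing: use the number as the 128-bit integer value of a UUID.
    return uuid.UUID(int=get_random_number())


# Step 4's requirement is met: the same module always gets the same GUID.
print(get_guid())                # 00000000-0000-0000-0000-000000000004
print(get_guid() == get_guid())  # True
```

The code does what testing asked for. The only thing wrong with it is the function name, which is exactly what the next maintenance programmer will trip over.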

The generation of random numbers is too important to be left up to chance.

posted by Vindaloo at 9:44 AM on December 31, 2009 [3 favorites]

Man you people really like overthinking mediocre webcomics.

posted by graventy at 10:02 AM on December 31, 2009 [2 favorites]

scrowdid:

Methinks Mr. Orendorff does not understand the peculiar qualities of infinity.

wolfdog:

Well, one of you two doesn't.

Here's the thing. It depends on the domain from which you're choosing the numbers. If you've got a finite set of elements---say, the digits from 0 to 9---that you're putting in an infinite sequence, then you have to repeat (at least one of) the elements infinitely many times. To see this, suppose not. Then each element must appear only a finite number of times in your sequence, and a finite number of elements repeated a finite number of times is finite.

On the other hand, if you've got an infinite sequence of numbers from which you're randomly choosing numbers to fill a second sequence, there's certainly no guarantee that you're going to get any one of those elements repeated infinitely many times---or even that you will get any given element repeated more than once.

So, Mr. Orendorff is confused, either way.

posted by leahwrenn at 10:02 AM on December 31, 2009

That'd be a good name for a drink, now that I think about it.

"Bartender, gimme a few Random Sequence of 4s, please!"

"How many?"

"I dunno; surprise me."

posted by Greg_Ace at 10:09 AM on December 31, 2009

"As someone once said (who?), you can find anything you want in an irrational number, like pi."

No you can't, at least not for sure. It's not a property of irrational numbers, and it's not even a property of transcendental numbers. In fact the very first transcendental number discovered, Liouville's number, only has 0s and 1s in it, so it obviously doesn't contain every sequence.

The property described is only true for normal numbers. Almost every number is a normal number, although they are not often come across. It is not known if pi is normal or not - although it would surprise few people if it were.

posted by edd at 10:13 AM on December 31, 2009 [1 favorite]
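The Liouville example above is easy to make concrete. Liouville's constant has a 1 at decimal place n! for n = 1, 2, 3, ... and a 0 everywhere else. A short sketch (the helper name `liouville_digits` is made up for this example) that writes out its first decimal places:

```python
from math import factorial


def liouville_digits(num_digits):
    """Return the first `num_digits` decimal places of Liouville's constant as a string."""
    ones = set()  # decimal places (1-based) that hold a 1
    n = 1
    while factorial(n) <= num_digits:
        ones.add(factorial(n))
        n += 1
    return "".join("1" if pos in ones else "0" for pos in range(1, num_digits + 1))


digits = liouville_digits(30)
print(digits)         # 110001000000000000000001000000
print("2" in digits)  # False
```

However far the expansion is carried, a digit like 2 never appears, so a transcendental number can still fail to contain even a one-digit sequence.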

Also, why did the poster go to StackOverflow to ask that question? Every xkcd strip has an associated interblag post where comments and questions can go.

posted by DU at 10:13 AM on December 31, 2009

There is a claim (recently deleted from Wikipedia, so it **must** be true), that *The Hitchhiker's Guide to the Galaxy's* never-revealed Ultimate Question of Life, the Universe, and Everything (for which the answer is 42) is actually **"Think of a number, any number"**, the closest (but obviously flawed) human-based equivalent of a Random Number Generator, which suggests that any infinitely-repeating-same-number-generator masquerading at randomness should be using **not** 4 or 9, but 42.

The logic being: Marvin the Paranoid Android states that he has a brain the size of a planet, and since the Planet Earth was built to be a large enough computer to determine The Question, he could compute the question if he so chose. In "Life, the Universe, and Everything" Marvin is conversing with a sentient mattress, and to prove how much smarter he is, he tells the mattress "Think of a number, any number" to which the mattress replies "five". Marvin immediately responds "wrong", because the correct answer is 42 (which remains unstated).

But it is claimed that this fact was confirmed by Douglas Adams in a personal conversation before his death. Interestingly, Stephen Fry says that Adams told him "exactly why 42", but has vowed not to tell anyone the secret, saying it must go with him to the grave, giving disgruntled geeks another reason to kill that smug Englishman.

Which really is totally irrelevant to whatever topic is actually being discussed here, but since it adds *Hitchhiker's* to a thread that already contains XKCD and Dilbert, achieves a **Nerd Trifecta** (w/extra points for referencing Fry), which may or may not cause **The Singularity** to occur, causing the Known Universe to disappear and be replaced by something even more bizarre and inexplicable. On the eve of Y2K+X, it was worth a try.

posted by oneswellfoop at 10:31 AM on December 31, 2009 [18 favorites]

Huh? I'm not really sure why you think this is the case. It seems like there would be finite runs, but how can you have an infinite run? The probability of a run of length N approaches zero as N approaches infinity, so no infinite runs.

posted by delmoi at 10:36 AM on December 31, 2009

(unless you're not talking about runs, but rather talking about simple repetition, obviously each of the elements would be 'repeated' infinitely.

So for example in the string 4334201, 4 is repeated twice in the sense of appearing twice, while three is repeated twice in the sense of having a run of length 2)

posted by delmoi at 10:38 AM on December 31, 2009

I thought The Question, as determined by playing scrabble with cavemen, was "What's six multiplied by nine?".

Although bad understanding of randomness is often described in terms of "a coin comes up heads 10 times, what are the odds of it coming up heads the next time?", this sort of misunderstanding of randomness is endemic in less obvious examples. Sheldrake, the idiot behind morphic fields, ran some experiments testing if people could tell if they were being stared at. He wanted them to be random, so he tossed a coin to determine yes/no. But looking at some of the sequences, he could tell they weren't random, so he changed them to make them more random. This is an instance of one problem people have with randomness: they underestimate streakiness. In this case, creating random number sequences that "feel" more random is going to help people guess correctly, undermining the whole experiment. And like I said, Adam from Mythbusters (who I don't think is an idiot) thought that the bread toss (essentially a coin toss) coming up one way 7 out of 10 times was evidence of non-randomness.

Runs of numbers: yes, you pick a string of numbers and a length, and you can find a string that repeats that many times. But "infinity" is not a length, and no string will repeat infinity times.

posted by Humanzee at 10:44 AM on December 31, 2009 [1 favorite]
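
To put a number on the "7 out of 10" example, here is a small Python sketch. It assumes a fair coin and asks how often ten tosses come out at least that lopsided in either direction; the function name is invented for illustration.

```python
from math import comb

def prob_at_least_k_same(n: int, k: int) -> float:
    """Probability that n fair tosses land on the same side at least k times.

    Requires k > n / 2 so the heads-heavy and tails-heavy cases cannot overlap
    and can simply be added together.
    """
    if not (n / 2 < k <= n):
        raise ValueError("need n/2 < k <= n")
    one_side = sum(comb(n, j) for j in range(k, n + 1))
    return 2 * one_side / 2 ** n

print(prob_at_least_k_same(10, 7))  # 0.34375
```

Under those assumptions a split of 7-3 or worse turns up in roughly one experiment out of three, so on its own it says very little about whether the toss is fair.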

It's trivial to find a string that repeats consecutively infinite times in the digits of a rational number. For example, the string "666" repeats a lot in 2 / 3. But if you're looking in Pi, you may have to look for a while. Here are some tables to get you started.

and see above on "normal numbers"

posted by grobstein at 10:59 AM on December 31, 2009

Is this a game of questions?

posted by inigo2 at 11:08 AM on December 31, 2009 [3 favorites]

Leo Marks describes the problems of generating one time pads in *Between Silk and Cyanide*. He had rooms of women drawing letters from bingo hoppers and writing the letters on the one time pads for field agents, but statistical analysis of the values showed some biases. When he observed the women, the values would be random, but when they were not observed they would trend away from uniform distributions.

As you might guess, the problem was that they would occasionally discard a drawn letter because it didn't seem "random enough".

Regarding coin flips, my initial experiments with the Thomas Jefferson US$1 coins produced nearly 60% heads in a trial of n=100. I am seeking funding to expand this study -- please send me coins.

posted by autopilot at 11:14 AM on December 31, 2009 [1 favorite]

Oh, but it does. And it also remembers that time back in 1972 when it landed tails up 19 times in a row. That's where people make their mistake. If you want to predict the next flip, you have to remember flips all the way back to when it was minted, and that's pretty much impossible unless you've owned the coin the whole time and done all the flipping yourself. I have a dime like that, and it's made me a lot of money.

posted by weapons-grade pandemonium at 11:15 AM on December 31, 2009 [7 favorites]

Man, some of you people really like underthinking excellent webcomics.

posted by Aquaman at 11:47 AM on December 31, 2009 [1 favorite]

The function is called getRandomNumber() and it does exactly that. *It's a naming problem.* What you typically want is getNextNumberFromRandomSequence(void), but that has no state retention implied, so you might want something like getNextNumberFromRandomSequenceFedByEntropyPool(Sequence, Pool).

posted by morganw at 12:05 PM on December 31, 2009
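
The naming point can be sketched in a few lines of Python. This is only an illustration: `get_random_number` mirrors the comic, while `RandomSequence` and `next_number` are hypothetical names standing in for the longer function names suggested above.

```python
import random

def get_random_number() -> int:
    """The comic's function: one random choice, made once, by die roll."""
    return 4  # chosen by fair dice roll; guaranteed to be random

class RandomSequence:
    """Hypothetical generator that keeps state between calls.

    The state lives in the object, so each call advances the sequence
    instead of reporting the same single draw again.
    """

    def __init__(self, seed=None):
        # With seed=None, Python seeds from an OS entropy source when available.
        self._rng = random.Random(seed)

    def next_number(self) -> int:
        return self._rng.randint(1, 6)

print([get_random_number() for _ in range(5)])  # [4, 4, 4, 4, 4]
seq = RandomSequence()
print([seq.next_number() for _ in range(5)])  # varies from run to run
```

The first function honours its name exactly once; the second is what callers usually mean when they ask for "a random number".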

Well, the original quote was about "infinite sequences of the same number". I do not interpret this to mean that they need to be consecutive. Clearly, you can't have an infinite-length 'run' (defined as consecutive elements of the sequence, I guess). Which was the point. I didn't say it was a deep observation.

(more precisely, given a sequence of numbers x_i, where each of the elements in the sequence is chosen from the same finite set, there is a subsequence x_i_j in which all of the entries are the same.)

And you have to be careful. It is in fact false that *each* of the elements would be " 'repeated' infinitely". All you can guarantee is that at least one of the elements is repeated infinitely. (For example, in the binary sequence 1,0,1,1,1,1,1,1,..., (that is, x_2 = 0 and x_i = 1 for i != 2) the element 0 appears only once.)

posted by leahwrenn at 12:15 PM on December 31, 2009

grobstein, yes I'm aware of the existence of infinite repeats in fractions. I was referring to a comment about infinite sequences of random digits. I should have been more specific.

There was a misconception. Arbitrarily-long finite number of repetitions is not the same as infinitely many repetitions.

posted by Humanzee at 12:33 PM on December 31, 2009

I think most people would interpret them as being consecutive, especially in the context of this post (about the function that always returns 4, and the 'nine nine nine nine' thing in the answer)

posted by delmoi at 1:01 PM on December 31, 2009

One of my colleagues is responsible for the process that draws electors from the electoral roll for jury duty. O the fun we have discussing the nuances of "random" as we figure out how to prevent people getting called too many times in a short period, which happens pretty damned often, even with thousands and thousands of candidates.

posted by i_am_joe's_spleen at 1:18 PM on December 31, 2009

The radiolab episode longdaysjourney linked to had a very enlightening segment about coin flips. A professor of mathematics (I think she was) asked her class to divide themselves into two groups. One would flip a coin some large number of times and record the results; the other group was to make up, off the top of their heads, a list of coin flips of the same length that felt random. She then left the room to let them do their thing.

Upon coming back, she was easily able to identify which group was which just by looking at the sequences (and she'd done this same experiment many times before, with similar results). The group who actually flipped the coin inevitably ended up with long strings of one result in a row, while the group that tried to make a random-seeming artificial list never had such long strings because those results didn't "feel random".

posted by jiawen at 1:35 PM on December 31, 2009
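
The classroom experiment is easy to imitate. The Python sketch below is only a simulation under assumed parameters (100 flips per sequence, 2000 sequences, a fixed seed so it is repeatable); none of those numbers come from the radio segment.

```python
import random

def longest_run(flips) -> int:
    """Length of the longest block of identical consecutive values."""
    best = current = 0
    previous = object()  # sentinel that compares unequal to every flip
    for flip in flips:
        current = current + 1 if flip == previous else 1
        best = max(best, current)
        previous = flip
    return best

rng = random.Random(2009)  # fixed seed so the sketch is repeatable
trials = [longest_run(rng.choice("HT") for _ in range(100)) for _ in range(2000)]
share = sum(run >= 5 for run in trials) / len(trials)
print(f"share of 100-flip sequences with a run of 5 or more: {share:.2f}")
```

In simulated fair flips, a run of five or more in a row is the norm rather than the exception, which is exactly the feature a hand-made "random" list tends to lack.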

This is why it pays to pick lottery numbers in a "non-random looking" sequence, like 4,5,6,7,8,9. You have the same chance of winning, but a better chance of not having to share the prize money if you do win, since most people think those strings have a lower probability. Of course you will have to throw me a few bucks because I told you this.

posted by weapons-grade pandemonium at 1:59 PM on December 31, 2009

No, you actually have a much higher chance of sharing the prize money, because even if a small portion of the public likes 'non-random' numbers, they are all much more likely to pick the

posted by delmoi at 2:03 PM on December 31, 2009

Oh, this is like that geometry exam pic you see sometimes where there's a triangle and the problem reads "Find *x*" so the student helpfully drew an arrow to the side labelled *x* and wrote "There it is" to show their work.

More or less.

posted by Spatch at 2:21 PM on December 31, 2009

Random Metafilter posts:

http://www.metafilter.com/4/

http://www.metafilter.com/44/

http://www.metafilter.com/444/

http://www.metafilter.com/4444/

http://www.metafilter.com/44444/

http://www.metafilter.com/444444/

posted by obiwanwasabi at 2:45 PM on December 31, 2009 [2 favorites]

In the title for the comic (hover over it to see it), the author says that "RFC 1149.5 specifies 4 as the standard IEEE-vetted random number." I am surprised that no one has pointed out that while there is no RFC 1149.5, RFC 1149 is "A Standard for the Transmission of IP Datagrams on Avian Carriers" -- that is, TCP/IP via carrier pigeon. In other words, you could do this, but it would be silly.

posted by ubiquity at 2:46 PM on December 31, 2009 [1 favorite]

Totally wrong. Because the odds of witnessing ten heads in a row because someone is showing you a rigged coin are much higher than the odds of witnessing ten heads in a row because 10 fair tosses landed all heads.

So it is much more likely to come up heads the 11th time.

posted by straight at 4:15 PM on December 31, 2009
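
The rigged-coin argument is a Bayes' rule calculation, and it can be sketched in Python. The prior used here (1 coin in 1000 is two-headed) is purely an assumption for illustration, as is the simplification that a coin is either fair or two-headed.

```python
from fractions import Fraction

def posterior_rigged(prior_rigged: Fraction, heads_seen: int) -> Fraction:
    """P(coin is two-headed | heads_seen heads in a row).

    Assumes the coin is either two-headed (always heads) or perfectly fair.
    """
    likelihood_rigged = Fraction(1)
    likelihood_fair = Fraction(1, 2) ** heads_seen
    numerator = prior_rigged * likelihood_rigged
    return numerator / (numerator + (1 - prior_rigged) * likelihood_fair)

prior = Fraction(1, 1000)  # assumed prior: 1 coin in 1000 is rigged
post = posterior_rigged(prior, 10)
p_next_heads = post * 1 + (1 - post) * Fraction(1, 2)
print(post, float(post))                  # 1024/2023, a little over one half
print(p_next_heads, float(p_next_heads))  # 3047/4046, roughly 0.75
```

Even with a prior that skeptical, ten heads in a row make a rigged coin slightly more likely than not, so the eleventh toss is about three to one in favour of heads.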

I think the point of interpretation #2 is that the

Or maybe it's code for a slot machine.

posted by straight at 4:22 PM on December 31, 2009

In case you are looking for a source for allegedly random numbers, there's this.

"Allegedly" for hopefully obvious reasons

posted by birdsquared at 5:45 PM on December 31, 2009

Needs "beanplate" tag.

posted by the painkiller at 8:36 PM on December 31, 2009

I know this is a lighthearted thread but there is a very profound issue here. Incorrectly identifying uncorrelated phenomena as correlated (eg, seeing 444444 as a non-random sequence when you expected a random one) is the basis of paranoia. To take a more mundane example, if you go through N red lights, and all of them are against you, at what value of N do you start to think that it's a set up? This example comes from a psychiatrist friend of mine, who finally understood what his paranoid patients were suffering when he was trying to get to a speaking engagement, running late, and hit a bafflingly long series of red lights. There came a point, he told me, where he became convinced that the lights were a conspiracy.

Human beings are programmed to associate correlation with causation. Correlation absent causation throws a wrench in the works.

posted by unSane at 10:24 PM on December 31, 2009 [3 favorites]

Well, let's be clear about this. If you don't know if the source of a set of numbers is random or not, the longer the run, the more sure you can be that it's not random and that it will in fact always return that number. Especially if it starts off with that number. There's no way to be

posted by delmoi at 7:18 AM on January 1, 2010

petrilli: *If you flip a coin in front of most Americans, and it comes up heads 10 times in a row, and you ask them "is it more or less likely to come up heads again", they'll tell you "more likely to be tails". This is false.*

If I call the random number function in question and get 4 back 10 times in a row, I'd expect it to be highly likely I would get back a 4 on the 11th trial. The observer doesn't know the method by which you are discovering this data. If someone flipped a coin and got heads 10 times in a row, I'd highly suspect they are using a double-headed coin, and would probably say more likely to be heads.

posted by newper at 8:43 AM on January 1, 2010

For the record, I have found MS Word's VBA random() function to be completely insufficiently randomlike.

Details: I sign my emails with "randomly" chosen quotes, from a file with ~3,000 quotes in it. Word VBA is the particular critter I use to pick these quotes. In regular usage, grabbing one random number at a time, quotes are fairly likely to repeat within 3-4 emails. This plagued me for years (reasonably large sample size), until I switched to a file containing 50,000 9-digit numbers based on radioactive decay times.

And, yes, I am an obsessive geek.

posted by IAmBroom at 5:25 PM on January 1, 2010

IAmBroom: I doubt it's a problem with the random() function, it's more likely you're suffering from simply too small a set of quotes and your probability of getting a repeated quote is too high - it's like the birthday paradox.

That or you weren't seeding it well enough.

posted by edd at 4:32 AM on January 2, 2010

One of my dad's colleagues is a maths teacher, which includes teaching randomness and probability. So one day, he produced a coin and asked the class what the chances of getting a head are, prompting a chorus of "1 in 2". Some smart alec said "But sir, it might land on its side", which the teacher sensibly dismissed.

With a "let's see if it does" attitude, the teacher flips the coin. And it lands bang on its side, rolls, and stops on its side. The class asks the teacher "how did you do that?" and "go on, do it again", not realising this will 'probably' be the only time they ever see it happen.

So it's not quite 1 in 2. I'll leave someone else to work out exactly what p(lands on side) equals.

posted by 92_elements at 6:50 AM on January 2, 2010

That or you weren't seeding it well enough.

1) The odds of getting a repeat out of *3000* possible selections in the next very few choices are fairly low; this happened several times a week (as a guess, I'd say 10-20% of the choices repeated). The birthday paradox doesn't predict that 10-20% of the people you meet will share a birthday with you...

2) I followed the guidelines in VBA Help for seeding based on the computer clock ticks since midnight; ergo, if my seeding wasn't correct, the fault lay with the implementation anyway.

In my experience, seeding based on computer ticks is a terrible methodology, regardless of whatever pseudo-random algorithm follows.

And, to quote ?xkcd?, "Anyone who claims to code a random number generator is committing an act of blasphemy."

posted by IAmBroom at 10:52 PM on January 2, 2010
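
Why clock-based seeding can bite is easy to sketch, with heavy caveats. The Python below uses Python's own `random` module, not VBA's `Rnd`, and it assumes (hypothetically) that the seed is the whole number of seconds since midnight and that the program always starts within the same 60-second window.

```python
import random

QUOTES = 3000  # size of the quote file mentioned above

def first_pick(seed: int) -> int:
    """Index of the first quote chosen by a generator freshly seeded with `seed`."""
    return random.Random(seed).randrange(QUOTES)

# Assumption: the seed is the number of whole seconds since midnight, and the
# program is always launched between 09:00:00 and 09:00:59.
seeds = range(9 * 3600, 9 * 3600 + 60)
picks = {first_pick(s) for s in seeds}
print(len(picks), "distinct first quotes are reachable from", len(seeds), "seeds")
```

A freshly seeded generator can never offer more distinct first picks than it has distinct seeds, so under these assumptions at most 60 of the 3000 quotes could ever come up first. Whether anything like this explains the behaviour described above is not established here.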

Weird. Googling around shows that the VBA rnd function is pretty awful, but not awful enough to explain 10-20% repetitions.

It turns out there are problems in the seeding method that severely limit the number of seeds you can get, but the numbers for that are also still way too low to explain 10-20% repetitions. It's a pretty terrible PRNG but it should have been good enough, as far as I can see. (One article claims that you have to use 'rnd -1' before seeding, which might easily be missed and be the source of the problem though.)

And you're right that I overestimated the effect - the probability is a couple of orders of magnitude too small for what you say you saw. Although "The birthday paradox doesn't predict that 10-20% of the people you meet will share a birthday with you..." isn't a good argument - naturally the numbers in that example don't match the numbers in yours, so you wouldn't expect the percentages to match either - I was merely making the point that people frequently underestimate the probabilities of coincidence.

posted by edd at 7:14 AM on January 4, 2010
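
The size of the gap can be checked with the standard birthday calculation. This Python sketch assumes every quote is equally likely on every draw and that draws are independent; the function name is invented for illustration.

```python
def prob_repeat_in_window(pool_size: int, window: int) -> float:
    """Probability that `window` independent uniform draws from `pool_size`
    items contain at least one repeated item (the birthday calculation)."""
    p_all_distinct = 1.0
    for i in range(window):
        p_all_distinct *= (pool_size - i) / pool_size
    return 1.0 - p_all_distinct

for window in (2, 4, 10, 65):
    print(window, round(prob_repeat_in_window(3000, window), 4))
```

For a pool of 3000 and a window of four emails the chance of any repeat is roughly 0.2 percent, which is about two orders of magnitude below the 10-20% reported. It takes a window of around 65 draws before a repeat becomes as likely as not.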

Thanks for the interesting research work on the unreliability of the function. I simply knew that what I was seeing was unbelievable, as a random function.

posted by IAmBroom at 2:31 AM on January 8, 2010

This thread has been archived and is closed to new comments

1) A programmer, needing to write a function that returned a random number, rolled a die once and got the number four, which he hardcoded into the function.

2) A function that returns "4" repeatedly cannot easily be distinguished from a random number generator without repeated tests (as illustrated by the Dilbert comic).

Both the text accompanying the XKCD comic as well as the fact that interpretation (2) would make sense only if the source code of the function were hidden indicate that (1) was what Randall Munroe was going for, but (2) seems to be how many people interpret the comic, including many of the Stack Overflow respondents.

posted by null terminated at 5:48 AM on December 31, 2009 [7 favorites]