Being the nerd that I am, I was itching to crunch some badass numbers
June 7, 2013 4:51 PM

Debarghya Das, an Indian student at Cornell, wanted to impress his friends by obtaining their examination marks for the Indian Certificate of Secondary Education and the Indian School Certificate and, thanks to some poorly written JavaScript, discovered the entire database containing the grades for 200,000 Indian students, as well as potential evidence of widespread tampering.
posted by elgilito (36 comments total) 16 users marked this as a favorite
 
He shows that the numbers couldn't statistically have the distribution they do and it's impossible to get certain scores but it's unclear that tampering was done to promote any certain score intentionally. FYI
posted by ishrinkmajeans at 5:07 PM on June 7, 2013 [1 favorite]


Oh, wow. Makes me wonder how much sensitive information is one macro away from being public knowledge. Heads are going to roll for this - I hope one of them isn't his.
posted by Mooski at 5:08 PM on June 7, 2013 [1 favorite]


Dear lord, it just gets more and more horrifying as you go.

The [personal id number].[file extension] format was suggested by a former boss of mine when the subject arose of making certain highly sensitive payroll documents available to employees via the company intranet. Because I am not an idiot, I staunchly refused to even consider this as a valid approach. I had to argue with him about it on a number of occasions, and I don't think he ever actually understood the eminently avoidable security issues such a "plan" would cause.

He's no longer my boss, thank all that's good.
posted by trunk muffins at 5:11 PM on June 7, 2013 [1 favorite]


Here in the US, he'd already have his ticket punched for a good long stay in prison for posting that. I hope for his sake that the powers that be in India have more common sense.
posted by deadmessenger at 5:14 PM on June 7, 2013 [2 favorites]


And the issue of the jagged graphs/missing marks is an interesting pickle; I'd be inclined to suspect a problem with whatever process is used to extract the data, rather than systematic tampering, with such a large number of tests to score.
posted by trunk muffins at 5:15 PM on June 7, 2013 [2 favorites]


A mango pickle, I presume.
posted by Nomyte at 5:31 PM on June 7, 2013 [4 favorites]


Here in the US, he'd already have his ticket punched for a good long stay in prison for posting that. I hope for his sake that the powers that be in India have more common sense.

He's in the US. I wonder how US hacking laws will apply to this. Guess we'll find out.
posted by qxntpqbbbqxl at 5:44 PM on June 7, 2013


Terroristic web browsing!
posted by XMLicious at 5:49 PM on June 7, 2013 [1 favorite]


I don't think he understands the distribution of scores at all. I am a professional exam scorer and I see graphs like this all the time. These scores aren't normalized, they're not all supposed to be bell curves that peak at 50. It is not unusual at all to see a jaggy distribution curve, with a peak shifted dramatically. For example, I would expect the curve to peak in the 90s on the Hindi test, but a lower, more broad curve peaking around 70 for the English test (presumably this is their second language). Comp Sci scores peak high, and this is a CS test after all. But Science scores are lower and broader.

Also, the Math distribution has lots of scores in the 50-60 range, another group in the 70s, and a broad group around the 90s. I have seen this in Math tests before. I asked a math instructor about this and he said he sees this in his classes all the time, there are three groups: the kids that don't get it but try (50s), the kids who get it but don't study (70s) and the kids who get it and study (90s).

Surely there are problems with the test. But the fact that the graphs are jagged and some score points were never attained, is expected. It is likely that there is no combination of correct answers that adds up to that score point. If I gave you a 20 question test and each answer is worth 5 points, nobody is ever going to score a 53.
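
The 20-question example can be checked directly. This is a minimal sketch, assuming every question is marked all-or-nothing; the function name `attainable_totals` and the mark scheme are illustrative and not taken from any real ICSE paper.

```python
def attainable_totals(marks):
    """Return every total reachable when each question is all-or-nothing."""
    totals = {0}
    for m in marks:
        # Either the question is missed (total unchanged) or earns m marks.
        totals |= {t + m for t in totals}
    return totals

# Hypothetical paper: 20 questions worth 5 marks each (100 marks in total).
marks = [5] * 20
totals = attainable_totals(marks)

print(53 in totals)  # False: no combination of 5-mark answers adds up to 53
print(55 in totals)  # True
```

Under these assumptions only multiples of 5 can occur, so a gap at 53 would say nothing about tampering.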

The exam design may be sloppy, but not nearly as sloppy as this analysis.
posted by charlie don't surf at 5:50 PM on June 7, 2013 [31 favorites]


His 'evidence' for 'widespread tampering' seems to basically be "wow - the graphs don't look anything like what my Statistics 101 class said they ought to".

Welcome to the real world of real data, mate.
posted by Pinback at 5:51 PM on June 7, 2013 [25 favorites]


Hmm, it seems that "widespread tampering" is a tad harsh. Maybe "human tendency to round numbers off" is more likely the answer?

I feel like it's more human nature than anything else to give someone that is one point shy of passing that passing grade by rounding up instead of having to fight them over it. A lot of the missing numbers seem to be odd numbers in between standard scores, it could just be teachers don't get so granular?
posted by mathowie at 6:19 PM on June 7, 2013


Another option: he may only think he has the final grades, but actually has "randomized" data they were using to test the system.
posted by jason_steakums at 6:21 PM on June 7, 2013


So, for anyone who didn't get to the bottom of the article, he hams up the bit about the graphs looking weird but the thrust of his argument for the results being massaged is that for half of the possible passing scores (35-100) no one in the entire country achieved those scores. But for the highest possible scores (94-100) every one was hit by at least some of the students nationwide, so he reasons that there ought to be possible combinations of answers that should add up to the missing scores.
posted by XMLicious at 6:31 PM on June 7, 2013


My analysis says that humanities should be a pre-requisite to all of us young hotshots with cool data analysis software libraries and abilities to write rules that transport blobs of text from one location to the other attempting to analyze anything that has to do with human beings.
posted by instinkt at 6:33 PM on June 7, 2013 [1 favorite]


I understand the skepticism but we're talking about large, subcontinent-level amounts of data here. I don't mind the lack of normalization, but the distributions should be smoother. For comparison, see these nice distributions and the numbers are 20 times smaller. There's no way such peaks exist by chance. So either he actually found something (which may have a simpler explanation than "tampering"), or the CSV export/import does weird things to the data (which would be the first thing to investigate).
posted by elgilito at 6:40 PM on June 7, 2013 [1 favorite]


JETSON! YOU'RE FIRED!
posted by 4ster at 7:42 PM on June 7, 2013 [2 favorites]


I am a professional exam scorer

It is my vague impression, by now, that charlie don't surf holds or has held every conceivable job.
posted by kenko at 7:53 PM on June 7, 2013 [17 favorites]


I've done a lot of different crap during my life, kenko. My scoring job is part time. I'm doing a new round of math scoring starting next week.

elgilito: For comparison, see these nice distributions and the numbers are 20 times smaller. There's no way such peaks exist by chance.

There are plenty of other factors that can rough up the results. As I mentioned, some score points may not be achievable just because no combination of scores add up to that number. But there might also be some score points that are more likely because there are more combinations of answers that add up to that score, and they're easier to achieve in multiple ways, with different combinations of correct answers. It takes careful planning in test design to prevent these problems. And sometimes the problems slip through despite careful design.
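
The point about some totals having more combinations than others can be illustrated by counting answer patterns. This is only a sketch: the paper layout below (ten 1-mark, ten 3-mark and twelve 5-mark questions) is invented, marking is assumed all-or-nothing, and every answer pattern is weighted equally, which real students certainly do not do.

```python
from collections import Counter

def ways_per_total(marks):
    """Count how many all-or-nothing answer patterns produce each total."""
    ways = Counter({0: 1})
    for m in marks:
        updated = Counter(ways)          # patterns where this question is missed
        for total, count in ways.items():
            updated[total + m] += count  # patterns where it is answered correctly
        ways = updated
    return ways

# Hypothetical 100-mark paper, not a real ICSE layout.
marks = [1] * 10 + [3] * 10 + [5] * 12
ways = ways_per_total(marks)

# Show how unevenly the patterns are spread over a few neighbouring totals.
for total in range(48, 53):
    print(total, ways[total])
```

Totals with many patterns behind them are easier to land on, so some unevenness between neighbouring marks can come from the mark scheme alone.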

There are plenty of other possibilities. Many exams have validation questions that aren't scored, but are being tested to see how the questions work and to normalize the results. Those questions may or may not be used in future tests. But they generally are tested in live exams. This can cause results to bunch up, because people wasted their exam time on questions that didn't add to their score. That may affect different test takers in different ways, but there may be many people that get sidetracked in a similar way. Unfortunately, this is a price the exam takers have to pay, so that next year's exam takers have fresh questions and old irrelevant material can be retired.
posted by charlie don't surf at 8:06 PM on June 7, 2013 [1 favorite]


Some of those "tampering" spikes might be explained by bad integer math in the post-processing.

If you have n < 100, and you compute normalized_score = int(score/n*100), there are naturally going to be a bunch of missing numbers at regular intervals, with corresponding peaks before or afterwards, depending on how you round to integers.
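
A quick sketch of that effect, using the formula exactly as written above. The value n = 33 is an arbitrary choice for illustration, not a known property of any exam.

```python
def scaled_scores(n):
    """Marks reachable after scaling a raw 0..n score onto 0..100 with int()."""
    return {int(score / n * 100) for score in range(n + 1)}

n = 33  # hypothetical number of one-mark questions
reachable = scaled_scores(n)
missing = sorted(set(range(101)) - reachable)

print(len(reachable), len(missing))
print(missing[:10])
```

With only n + 1 possible raw scores there can never be more than n + 1 distinct scaled marks, so most of the 0 to 100 range is necessarily empty.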

Now suppose somebody, for some inane reason, scaled the scores to different maximums twice in series, casting to integers each time. This compounds the error and introduces spikes of irregular heights. For example, suppose there are subsections of the test, and each of those is worth 40 points but contains 33 questions. So our test-administrator genius (the same guy who came up with the bomb-proof security scheme of hiding test results in public, guessable URLs) first scales the subsection scores to an integer between 0-40, then adds those scores and scales the 0-80 sum onto an integer in the range 0-100. Mayhem.
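
The two-stage scenario can be sketched the same way. The numbers (two subsections of 33 questions, each scaled to 40, then the 0-80 sum scaled to 100) come from the paragraph above and are hypothetical; integer floor division stands in for the int() cast so that floating-point noise does not affect the result.

```python
from collections import Counter

QUESTIONS = 33    # questions per subsection (hypothetical)
SECTION_MAX = 40  # marks per subsection after the first scaling

def section_mark(correct):
    """First scaling: raw correct answers onto 0..40, truncated."""
    return correct * SECTION_MAX // QUESTIONS

def final_mark(correct_a, correct_b):
    """Second scaling: the 0..80 sum onto 0..100, truncated again."""
    total = section_mark(correct_a) + section_mark(correct_b)
    return total * 100 // (2 * SECTION_MAX)

# Count how many (a, b) raw-score pairs land on each final mark.
counts = Counter(
    final_mark(a, b)
    for a in range(QUESTIONS + 1)
    for b in range(QUESTIONS + 1)
)
missing = [m for m in range(101) if m not in counts]

print(missing)
print(counts.most_common(5))
```

The first list shows marks nobody can receive under this scheme; the second shows that the number of raw-score pairs behind each surviving mark is uneven, which is one way spikes could arise without anyone editing results.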
posted by qxntpqbbbqxl at 8:20 PM on June 7, 2013 [5 favorites]


As I mentioned, some score points may not be achievable just because no combination of scores add up to that number. But there might also be some score points that are more likely because there are more combinations of answers that add up to that score, and they're easier to achieve in multiple ways, with different combinations of correct answers. It takes careful planning in test design to prevent these problems. And sometimes the problems slip through despite careful design.

And they managed to create exams for every subject that made these particular scores impossible to achieve? What a coincidence.

quote:
There were specific numbers, in no real pattern, that were missing for the distribution of the entire distribution of all subjects achieved by all students. And these missing numbers were regularly interspersed on the number line. For example, 81, 82, 84, 85, 87, 89, 91 and 93 were visibly missing. I repeat, no one in India had achieved these marks in the ICSE.
posted by jacalata at 8:26 PM on June 7, 2013


> It is likely that there is no combination of correct answers that adds up to that score point.

The article goes into this issue in some detail. I'm not sure his reasoning completely convinces me - I'd have to read it again - but to argue against the article by claiming that he didn't take this case into account is just not correct.
posted by lupus_yonderboy at 8:59 PM on June 7, 2013


32, 33 and 34 were visibly absent. This chain of 3 consecutive numbers is the longest chain of absent numbers. Coincidentally, 35 happens to be the pass mark.

This bit is interesting to me because marks at our university would show a similar pattern. This is because there is a rule that a student who completes all coursework for the year, and gets a final mark between 48 and 50 (with 50 a pass), is allowed to request a supplementary exam, which, if they pass, then raises their final mark to a pass.

Of course, creating a supplementary exam for a couple of students, grading that, and then doing the paperwork to change a final grade is a pain in the ass at a time of year when the professors are itching to be done with teaching for the term. So instead we are all very very careful to never assign marks between 48 and 50. Some people write exams so that combinations of marks are unlikely to add up to those scores. Others have complicated "if x then y" marking schemes that mean low-to-average scoring students will get extra marks for some things that higher-scoring ones won't, and others are just nicer in interpreting ambiguous or incomplete answers for borderline students.
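
For what it's worth, the gap just below a pass mark is easy to reproduce. This is a sketch under stated assumptions: the pass mark of 35 is the figure quoted above, while the three-mark grace window and the function `moderate` are made up for illustration and are not a documented ICSE rule.

```python
from collections import Counter

PASS_MARK = 35  # pass mark quoted from the article
GRACE = 3       # assumed width of the "bump up" window

def moderate(mark):
    """Lift any mark within GRACE of the pass mark up to the pass mark."""
    if PASS_MARK - GRACE <= mark < PASS_MARK:
        return PASS_MARK
    return mark

# Made-up input: exactly one candidate at each raw mark from 25 to 45.
raw_marks = list(range(25, 46))
moderated = Counter(moderate(m) for m in raw_marks)

print([moderated[m] for m in range(30, 38)])
```

In this toy data the marks 32, 33 and 34 vanish and 35 absorbs them, which matches the shape of the longest run of missing marks but says nothing about the other gaps.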

I wonder if some similar regulation exists in India for these exams. That doesn't explain the other missing scores, though.
posted by lollusc at 10:08 PM on June 7, 2013 [2 favorites]


It is likely that there is no combination of correct answers that adds up to that score point.

The article goes into this issue in some detail. I'm not sure his reasoning completely convinces me - I'd have to read it again - but to argue against the article by claiming that he didn't take this case into account is just not correct.

Not terribly convincingly at all though. To me, with the continuous runs at low and high scores and the nodes/anti-nodes in between, overall it looks overwhelmingly like a combination of 1st, 2nd, 3rd, and possibly higher odd-order effects. Which is almost exactly the kind of result you wouldn't be surprised to see from a test with gimme questions worth 1, middlingly-hard questions worth 2 or 3, hard questions worth 5 or 10, and no part marks given.

Now, the really interesting bits are the nulls at 56-7, 68, 71, etc. That smacks to me of marks being fiddled to match pass/credit/distinction criteria. I wonder if these exams are marked or moderated/adjudicated by the student's own teachers?
posted by Pinback at 10:28 PM on June 7, 2013 [1 favorite]


Yikes. Clever enough to get at the data, probably a little naive/overenthusiastic in his interpretation, and foolish enough to admit to hacking computer systems (while resident in the US!). I really hope Bad Stuff doesn't happen to him as a result.

Claiming no hacking was involved but also posting "I recently also cracked the CBSE class XII security" on the same page doesn't seem too smart. The safest way to do this whole thing would probably have been to run the scripts on a public computer in an internet cafe or library in a different town, then anonymously pass the data and methodology to a journalistic organisation of some sort. No glory that way, of course.
posted by dickasso at 11:18 PM on June 7, 2013 [1 favorite]


normalized_score = int(score/n*100)

Yeah, that was my guess, too. The smooth-ish bits around the top end of the scale are kinda funny under this interpretation, but I think it's fairly likely. It's hard to say without knowing the composition of the tests and how they're doing the marking. It could be they sample k problems for most students to get a score and then do a full marking of n problems if you're at the top?

I got the chance to work with the full data from the KCSE (Kenyan secondary exit test) last year. Kenya is a country of 40 million people, and 413,000 people took the test in 2011. (It was pretty fascinating to analyze those scores, too, btw.) The size of the population taking the ICSE seems to be pretty low; 165,000 from a country of over a billion people? They say it's the second most popular testing board, but they must be a pretty distant second....
posted by kaibutsu at 12:30 AM on June 8, 2013


Okay, the privacy concerns are overblown. This guy has zero context of how things are done in India; when they release results for 10th and 12th grades, newspapers have special editions where they publish reams and reams of the hall ticket numbers and the grades. That is not a bug, it is a feature; you want to get your results out as quickly as possible to as many people as possible.

I'm certain about two things here: not only would they have done this to handle the hits they'd have gotten, but they'd also have given out a CD with the results if you showed a media ID and asked for the file.

Now whether there *should* have been some password protection is a different question altogether. But the reality is that there is no expectation of privacy in these matters at all.

As for why the results are quantized, did the guy look at the exam paper? If you have 20 questions of five marks each for a total of 100 marks, and you don't get half marks for any answer, you are likely to get results in jumps of five. Now the real ICSE paper would have a different pattern - perhaps a few questions of six marks each, some with three, the rest with two - but it is not at all shocking that marks aren't in a general curve. Certainly, not without knowing how the papers were set.
posted by the cydonian at 7:37 AM on June 8, 2013 [1 favorite]


I wonder if these exams are marked or moderated/adjudicated by the student's own teachers?

External examiners sent to a different region of the country, and picked from universities etc. Except in, perhaps, languages, it is impossible to trace a paper back to a specific region or state or school. (It *may* be possible with languages, because the ICSE is extremely diverse in its offerings; they offer papers in all of India's 22 constitutionally recognised languages, and in such international languages as Swahili, Dzongkha, Tibetan, French, Bahasa and many more. So if your school is one of the 10 ICSE schools offering, say, Bishnupriya Manipuri, then it is likely that at least some would be marked in your school.)

As for the apparent "rounding off" at grade-boundaries, as an ex-ICSE student, that actually makes me happy. Shows that the markers aren't being a**holes. You have to understand: things get super competitive to get into universities in India. Know people who lost university seats over 2-3 percentile points. These things matter a lot in India.
posted by the cydonian at 7:53 AM on June 8, 2013


Here in the US, he'd already have his ticket punched for a good long stay in prison for posting that. I hope for his sake that the powers that be in India have more common sense

Interesting factoid: India has over a billion people, but only a few hundred thousand people in prison. Their incarceration rate is about 3 in 10,000. The US on the other hand has almost three million people in prison, or about 1 in 100; literally 1% of our entire population is behind bars. So you're far more likely to be in prison in the US than India, about 33 times more likely in general.

It is my vague impression, by now, that charlie don't surf holds or has held every conceivable job.

Or that he has an active imagination.
posted by delmoi at 7:59 AM on June 8, 2013 [1 favorite]


The size of the population taking the ICSE seems to be pretty low; 165,000 from a country of over a billion people? They say it's the second most popular testing board, but they must be a pretty distant second....

So every state in India has its own education system. That's where most schools are affiliated with. In addition, there are two "national" systems, in the sense that their offices are in Delhi. One of them, the CBSE, is directly connected with the central government, in that its officers are appointed by the central government (but are supposed to be autonomous of governmental control; in practice too, for the most part, but controversies abound every now and then)

The other one is the ICSE; it gets its authority from a federal law, but sets its syllabi and is run by an association of public schools in India. (Public schools in India have the same meaning as those in Britain; they are elite schools, often with individual histories as chequered as those in Britain; many have their own museums, at least one saw action in the First War of Indian Independence in 1857, and many have been featured on Indian stamps. )

I believe there are schools in Nairobi as well that are affiliated with the ICSE (which is why Swahili is offered as a subject in the exams); there certainly are schools in the Gulf and in Jakarta that are.
posted by the cydonian at 8:05 AM on June 8, 2013


the cydonian: This guy has zero context of how things are done in India; when they release results for 10th and 12th grades newspapers have special editions where they publish reams and reams of the hall ticket numbers and the grades.

You presumably missed the fact that he graduated from high school in India in 2011 and was looking up results for his friends in India in high school this year, which I think counts for more than zero context. And his actual results were not just a ticket id and grade but name, birthdate and grade. Are these ticket numbers all public knowledge, so you can look up any individual's grade in the paper just by knowing his name?

As for why the results are quantized, did the guy look at the exam paper? If you have 20 questions of five marks each for a total of 100 marks, and you don't get half marks for any answer, you are likely to get results in jumps of five. Now the real ICSE paper would have a different pattern - perhaps a few questions of six marks each, some with three, the rest with two - but it is not at all shocking that marks aren't in a general curve. Certainly, not without knowing how the papers were set.

From the article:
One of the most common critiques of my theory was this - maybe there were questions with only 3 or 4 mark intervals in all subjects making certain marks mathematically unattainable. My counterargument? All numbers from 94 to 100 are attainable and have been attained. What does this mean? It means that increments of 1 to 6 are attainable. By extension, all numbers from 0 to 100 are achievable.

Do you have a mathematical argument against this?
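
One way to test the quoted inference is to look for a mark scheme where 94 to 100 are all attainable but some lower total is not. The sketch below assumes all-or-nothing marking; the paper `[1, 2, 3, 94]` is deliberately contrived and bears no resemblance to a real exam.

```python
def attainable_totals(marks):
    """Return every total reachable when each question is all-or-nothing."""
    totals = {0}
    for m in marks:
        totals |= {t + m for t in totals}
    return totals

# Contrived 100-mark paper: three small questions and one huge one.
marks = [1, 2, 3, 94]
totals = attainable_totals(marks)

print(all(t in totals for t in range(94, 101)))  # True: 94..100 all attainable
print(all(t in totals for t in range(101)))      # False: e.g. 50 is not
```

So the step "94 to 100 are attainable, therefore 0 to 100 are" is not a strict implication on its own. It becomes plausible only with the extra assumption that the paper consists of many small questions, which is likely true of real papers but is not something the quoted argument states.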
posted by jacalata at 12:28 PM on June 8, 2013


Driving around LA you see quite a few big "Doc 420" billboards for a medical marijuana doctor. Their website used to have a verification form, presumably for use by dispensaries, where you could enter a patient ID number and if they had a legit prescription from this doctor it would return their name & address.

Several years ago I noticed that patient IDs were nothing more than consecutive integers, and you could spam the URL and find the names and addresses of all the patients with medical marijuana cards. I emailed them about it and never heard back, but I see now that they've removed that form.
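
The underlying weakness is that one valid ID reveals its neighbours. A minimal sketch of the contrast, using Python's standard `secrets` module; the helper names are hypothetical, and random identifiers are only a partial mitigation that does not replace proper authentication.

```python
import secrets

def sequential_ids(start, count):
    """IDs handed out as consecutive integers: one known ID predicts the rest."""
    return [str(start + i) for i in range(count)]

def random_tokens(count, nbytes=16):
    """IDs drawn from a cryptographically secure source: neighbours are unrelated."""
    return [secrets.token_urlsafe(nbytes) for _ in range(count)]

print(sequential_ids(1000, 3))  # ['1000', '1001', '1002']
print(random_tokens(3))         # different unpredictable strings on every run
```

With 16 random bytes per token, guessing a valid identifier by enumeration is impractical, whereas a counter can be walked from 1 upward in a simple loop.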
posted by jjwiseman at 2:13 PM on June 8, 2013


Markos "DailyKos" Moulitsas fired his pollster over results like this last year, and rightly so.
posted by localroger at 2:41 PM on June 8, 2013


kenko: "It is my vague impression, by now, that charlie don't surf holds or has held every conceivable job."

Nope, that would be me. How many peoples' comprehensive resumes include brushfire fighter, dietary cook, archaeological lap tech, sound technician, and polysomnographic technologist? And we will not even mention IT support for a pair of highly questionable Latvian commodity broker brothers.
posted by Samizdata at 4:13 PM on June 8, 2013


Samizdata: "kenko: "It is my vague impression, by now, that charlie don't surf holds or has held every conceivable job."

Nope, that would be me. How many peoples' comprehensive resumes include brushfire fighter, dietary cook, archaeological lap tech, sound technician, and polysomnographic technologist? And we will not even mention IT support for a pair of highly questionable Latvian commodity broker brothers.
"

laB tech even. slaps hands. Mumbles "Stupid rented hands."
posted by Samizdata at 4:27 PM on June 8, 2013 [1 favorite]


Markos "DailyKos" Moulitsas fired his pollster over results like this last year, and rightly so.

Only after months of analysis by professional statisticians. And the specific issues were not like this OP.
posted by charlie don't surf at 10:47 AM on June 9, 2013


What went down at DailyKos

Although Grebner et al. analyzed months of data (well over a year's worth actually), it doesn't appear that they did "months of analysis." What seems to have happened is that they noticed suspicious patterns over a period of months, and then cranked the data in a relatively short amount of work time.

The anomalies they found involved missing values of the sort that might not "look random enough," with round numbers under-represented, and missing values of change from week to week, specifically "no change" being under-represented. While this is a bit more subtle than what Das found it is essentially the same kind of failure to make the results look like random results.

(Also it was "last year" for values of "last" equal to "three years ago." Time it flies.)
posted by localroger at 1:12 PM on June 9, 2013



