How the Circle Line rogue train was caught with data
December 4, 2016 11:29 AM

The MRT Circle Line was hit by a spate of mysterious disruptions in recent months, causing much confusion and distress to thousands of commuters. Like most of my colleagues, I take a train on the Circle Line to my office at one-north every morning. So on November 5, when my team was given the chance to investigate the cause, I volunteered without hesitation.
posted by infini (27 comments total) 36 users marked this as a favorite
 
This is very cool but what was the evil train actually DOING to the other trains?
posted by showbiz_liz at 11:44 AM on December 4, 2016 [13 favorites]


So, does the rogue train just have bad sensor reporting hardware/software sending shit up the line or is the rogue train a train that is being run for nefarious purposes, or what exactly is going on?

It's a fascinating bit of data mining and sifting to find out what the actual culprit is, but I'm unclear on the actual outcome. The article says that the Circle Line can be once again taken with confidence of its efficiency and timing or something to that effect, but did the offending train get taken offline or fixed? I read twice looking for this information but didn't find it.
posted by hippybear at 11:45 AM on December 4, 2016 [2 favorites]




Can't they see where the trains are? Couldn't they just notice that this one paralysed every train it passed on the other line? Or have I misunderstood?
posted by Segundus at 12:52 PM on December 4, 2016


I don't believe it was every single train it passed.
posted by showbiz_liz at 12:56 PM on December 4, 2016 [2 favorites]


#notalltrains
posted by Joe in Australia at 1:37 PM on December 4, 2016 [16 favorites]


Can't they see where the trains are? Couldn't they just notice that this one paralysed every train it passed on the other line?
That’s exactly what the article is about.
posted by migurski at 1:57 PM on December 4, 2016 [7 favorites]


That moment of disappointment when you realize that the story isn't going to end with the one surviving member of the team hunting the rogue train through the subway tunnels in the dead of night with an EMP cannon.
posted by Halloween Jack at 3:00 PM on December 4, 2016 [47 favorites]


This also means that train signals are now a potential vector for a mildly disruptive almost undetectable attack.
posted by srboisvert at 3:23 PM on December 4, 2016 [5 favorites]


This also means that train signals are now a potential vector for a mildly disruptive almost undetectable attack.

Ah, so they're written in JavaScript.
posted by maxwelton at 4:10 PM on December 4, 2016 [9 favorites]


What a great story! A Sherlock Holmes tale de nos jours.

Be sure to read the press release linked at the end, not just because it has more background but for the many unstated reveals. The people running the investigation - which included the transport authority, the rail operator, and the signalling equipment provider - got as far as working out that an interfering radio signal was jamming the rail signalling comms on random trains at random times. Their first conjecture was that this was caused by 'telecommunications systems' - i.e., mobile networks. So they turned off potential interfering signals - they disabled mobile across the line for the first part of a day, then a whole day.

Imagine what that would take - if it were even possible - anywhere in Europe or the USA. There are ways in Singapore to debug the city as you would a flakey motherboard. Is this useful? Yes. Is it good? Not so sure.

Then there was the conclusion, which was carefully worded (unlike much of what is an exceedingly unevenly edited document) to say that special thanks is due to the data scientists, without whom we would not have been able to formulate the rogue train idea. Sorry, didn't catch that, who formulated the idea again? Not only did Team Datanerd formulate the idea, they tested it and then went out and pinpointed the exact train. But hierarchies gonna hierarch.

And now I've got to go off and find out what exactly went wrong with the radio signalling. It's not obvious at all without knowing some of the underlying technicalities, and while it's good that the problem triggered fail-safe, it's still a problem in a critical system that at first blush one thinks should have been better handled in the stack somewhere.

(I'd love to see a dramatised documentary of this in full-on overblown Discovery Channel/Hollywood High Camp mode. Provided it worked all the tech and engineering in with sufficient bravado. Would watch. But what to call it? The Raking of Pelham 123? Python The Rogue Catcher? Braking Bad? )
posted by Devonian at 5:51 PM on December 4, 2016 [7 favorites]


So cool. Thanks for posting this, infini.

Can't they see where the trains are? Couldn't they just notice that this one paralysed every train it passed on the other line? Or have I misunderstood?

Not sure if this is clear from the article, but for anyone who may not be fully aware of the context, the Circle Line is part of a fully driverless subway system. That may explain why they could not identify the rogue train at first, or indeed even suspect a rogue train. There are so many possibilities to explain how a signalling system which is fully computerised, and which probably depends heavily on wifi/radio signals, might be disrupted, and the data seemed completely random at first glance.

Which brings into sharp relief how many ways a fully automated, driverless system could go wrong. This is with trains that are literally railroaded on tracks. Imagine some unforeseen bug or hardware fault with driverless cars. This is why I'm very skeptical about fully automated, driverless vehicles on the road. We already do driverless trains daily on a citywide scale; it's not perfect, but at least it's not life-threatening. Yet.
posted by satoshi at 8:20 PM on December 4, 2016 [5 favorites]


MetaFilter: it's not perfect, but at least it's not life-threatening. Yet.
posted by hippybear at 9:46 PM on December 4, 2016 [4 favorites]


My favourite part of the tale was that they suspected a rogue train from the data, but needed to visualise it to confirm that it was non-random. Also, that they named the rogue train Gyarados.

Imagine what that would take - if it were even possible - anywhere in Europe or the USA. There are ways in Singapore to debug the city as you would a flakey motherboard. Is this useful? Yes. Is it good? Not so sure.

My info is dated, but the Tube doesn't have mobile services in most of the network, does it? Also, I thought the actual disruption was quite minimal, in that you still had wifi and other services in the stations. Given that the average distance between stations on the Circle Line is about 800m, I didn't think it was too onerous.

Finally, you have to weigh this against, well, the trains not running in the first place. My sense is that they were pretty desperate by then, and were looking for some sign somehow.
posted by the cydonian at 10:58 PM on December 4, 2016 [1 favorite]


Especially since there'd been those issues with the MRT a couple of years ago - the infrastructure is wearing out in the older tracks
posted by infini at 1:43 AM on December 5, 2016


And nobody has posted this yet?? Shock, horror!
posted by adamgreenfield at 2:34 AM on December 5, 2016 [1 favorite]


Imagine what that would take - if it were even possible - anywhere in Europe or the USA.

President Trump visiting London will probably suffice to shut down mobile service within a 10-mile radius.
posted by acb at 3:42 AM on December 5, 2016


President Trump visiting London will probably suffice to shut down mobile service within a 10-mile radius.

To prevent him from tweeting? Cannot you just block twitter.com?
posted by effbot at 3:52 AM on December 5, 2016 [3 favorites]


Security arrangements/ego stroking. I don't think that, in his position as alpha-male-of-alpha-males, he'd be satisfied with security arrangements that didn't impose severely on the multitudes beneath him.

During a previous Presidential visit (Bush II, I think), they shut down parts of the Tube.
posted by acb at 3:59 AM on December 5, 2016


Stuff gets shut down in London all the time for security reasons, just ask a cabbie. It's more the ability to do it to debug trains which is peculiarly puissant.
posted by Devonian at 4:52 AM on December 5, 2016


Devonian: I'd love to see a dramatised documentary of this in full-on overblown Discovery Channel/Hollywood High Camp mode

Yes Please!

Well written article. I really enjoyed the 4 hours+ I've spent learning all about Alstom and their many products. They also have a couple of really neat videos buried in their literature.
posted by james33 at 5:36 AM on December 5, 2016


Ok... so I've been following this for a while, and have heard some additional information from people in the know.

First, about the fault (some of this is mentioned in various articles quoted above, I'll summarize): what's happening here is that the Circle Line uses a wireless signalling mechanism (Alstom Urbalis 300) for communications between the trains and station control. One critical piece of information the train provides via this comms channel is its current location, which is used for collision detection and making sure trains are separated by a safe braking distance. In the event that the channel is disrupted, the trains will immediately fire their emergency brakes.
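To make the fail-safe behaviour above concrete, here is a toy sketch of the two rules as I understand them: brake if the train goes silent, otherwise enforce separation from the reported positions. Everything in it is an assumption for illustration (the `Train` class, `check_train`, the timeout and gap values); it is not how Urbalis is actually implemented.

```python
from dataclasses import dataclass

# Made-up illustrative values, NOT real signalling parameters.
COMMS_TIMEOUT_S = 2.0   # assumed: longest tolerated silence from a train
SAFE_GAP_M = 300.0      # assumed: minimum separation between trains


@dataclass
class Train:
    name: str
    position_m: float       # last position the train reported
    last_report_s: float    # time of the last successful report
    emergency_brake: bool = False


def check_train(train: Train, train_ahead_position_m: float, now_s: float) -> str:
    """Return the action a hypothetical fail-safe controller would take."""
    # Rule 1: no recent report means the position is unknown, so fail safe.
    if now_s - train.last_report_s > COMMS_TIMEOUT_S:
        train.emergency_brake = True
        return "EMERGENCY_BRAKE"
    # Rule 2: comms are healthy, so enforce separation from reported positions.
    if train_ahead_position_m - train.position_m < SAFE_GAP_M:
        return "SLOW_DOWN"
    return "PROCEED"
```

The point of the sketch is that a lost channel and an unsafe gap are handled differently: silence is treated as the worst case, which is why interference shows up to commuters as trains stopping dead.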

It turns out that the wireless communications system used by the trains runs on top of 802.11, i.e. WiFi. As an infosec guy I was pretty gobsmacked to read about this. Before the rogue train was discovered, SMRT suspected that the disruption might have been caused by too many people using WiFi hotspots on their phones; that's why they got the telcos to disable mobile data in the trains for a few days to test out this hypothesis. (The newspapers made it sound like the interference was due to cellular, but that's impossible as cellular operates on a totally different frequency than WiFi.)

Based on the above, we can guess that the reason for the disrupted communications was a loss of WiFi connectivity due to signal interference. WiFi, being a broadcast medium, relies on a media access control (MAC) mechanism, whereby all parties must be cooperative and practice collision avoidance. Too many clients sharing the same space can lead to disruption; a rogue client jamming the shared frequency can do so as well. Now that we know it's a rogue train, what probably happened was that the WiFi client running on the train overwhelmed the channel, resulting in other clients being unable to communicate with the access points.
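A rough way to see why one misbehaving transmitter hurts everyone on a shared channel: the simulation below is a deliberately crude slotted model (closer to slotted ALOHA than to real 802.11 CSMA/CA), and the function name, client count and probabilities are all my own assumptions. A slot only counts as delivered if exactly one well-behaved client transmits while the rogue is silent.

```python
import random


def delivered_fraction(n_clients, p_send, rogue_duty, slots=10_000, seed=0):
    """Fraction of time slots in which a well-behaved client gets a frame through.

    n_clients  -- number of cooperative clients sharing the channel (assumed)
    p_send     -- chance each cooperative client transmits in a given slot
    rogue_duty -- chance the rogue transmitter is on the air in a given slot
    """
    rng = random.Random(seed)
    delivered = 0
    for _ in range(slots):
        senders = sum(rng.random() < p_send for _ in range(n_clients))
        rogue_on = rng.random() < rogue_duty
        # Two cooperative senders collide; the rogue collides with anyone.
        if senders == 1 and not rogue_on:
            delivered += 1
    return delivered / slots


healthy = delivered_fraction(n_clients=5, p_send=0.1, rogue_duty=0.0)
jammed = delivered_fraction(n_clients=5, p_send=0.1, rogue_duty=0.9)
print(f"no rogue: {healthy:.3f}  rogue on 90% of slots: {jammed:.3f}")
```

In this model the delivered fraction shrinks roughly in proportion to the time the rogue leaves the channel free, and reaches zero if it never stops transmitting.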

Now that we've found the culprit and removed it from service, everybody's all smiles and all is well, right? Right. But, as srboisvert mentioned, the fact that a single "rogue" train can take down the entire network (and that interference from WiFi hotspots inside the trains might be a plausible cause) suggests something is pretty wrong here. Bringing down the train system is a huge deal in ultra-dense Singapore. A DoS on the train system shouldn't happen so easily. Communications between trains and control centres should not be over a channel that can be jammed using commodity hardware.

Even worse, doesn't the above also mean that the signalling traffic is exposed and thus eavesdroppable and potentially modifiable? Unless they're using properly designed encryption (and SCADA systems rarely are), this is a really big risk. Could somebody board a train with a largish antenna and a laptop and start messing with things? I'm really worried about this.
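For what "properly designed" might mean at a minimum, here is a small sketch of authenticating a position report with an HMAC so that a modified message is rejected. I have no idea what the real signalling protocol does; the message format, the `sign_report`/`verify_report` helpers and the shared key are hypothetical. Note this only addresses tampering: it does nothing against jamming, and a real design would also need replay protection and key management.

```python
import hashlib
import hmac
import json

TAG_LEN = 32  # length of a SHA-256 digest in bytes


def sign_report(key: bytes, report: dict) -> bytes:
    """Serialise a report and prepend an HMAC-SHA256 tag (hypothetical format)."""
    payload = json.dumps(report, sort_keys=True).encode()
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return tag + payload


def verify_report(key: bytes, message: bytes):
    """Return the decoded report if the tag matches, otherwise None."""
    tag, payload = message[:TAG_LEN], message[TAG_LEN:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    # compare_digest avoids leaking how many bytes matched via timing.
    if not hmac.compare_digest(tag, expected):
        return None
    return json.loads(payload)
```

With this in place, somebody with a laptop and an antenna could still read the traffic and still drown it out, but could not silently change a reported position without knowing the key.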

Then there was the conclusion, which was carefully worded (unlike much of what is an exceedingly unevenly edited document) to say that special thanks is due to the data scientists, without whom we would not have been able to formulate the rogue train idea. Sorry, didn't catch that, who formulated the idea again? Not only did Team Datanerd formulate the idea, they tested it and then went out and pinpointed the exact train. But hierarchies gonna hierarch.

From what I've heard, there has been a bit of jostling over who deserves credit for cracking the case. There were a few different government agencies involved; a newspaper report released a day or so before the data.gov.sg blogpost gave most of the credit to a different agency. So this blog post seems to be designed as a response to that, and was probably written at the behest of the higher ups. While they managed to determine the rogue train through their analysis, other people might have reached the same conclusion through different means before them. tl;dr: it's political.
posted by destrius at 7:24 AM on December 5, 2016 [3 favorites]


Everyone knows this was Mas Selamat, who has been living in a special MRT floor cubby all this time. He got a new phone and forgot to turn on auto-updating over WIFI in Google Play. What an amateur.
posted by grumpybear69 at 7:58 AM on December 5, 2016 [2 favorites]


This is really neat. Thanks!

But, coming at this as an RF geek rather than a big-data geek. . . the punchline seems to be missing entirely. A Sherlock Holmes story that fingers the culprit but doesn't explain how he carried out the deed isn't all that satisfying. Knowing which train did it is useful, but understanding what the train did is a lot more interesting.

Thanks, destrius, for the more detailed info above.

I'm glad I have too many other things to do to consider taking up the hacking train anti-collision signaling hobby. 'cause it sounds like it could be a lot of fun.
posted by eotvos at 8:23 AM on December 5, 2016


Over smart city lah!
posted by infini at 8:24 AM on December 5, 2016 [5 favorites]


Further tests on PV46 by engineers from LTA, DSTA and Rohde & Schwarz showed that faulty train signalling hardware on PV46 was emitting erroneous signals in addition to the ones it was supposed to emit

The question that immediately brings to mind is: What do those signals look like? How secure is that protocol, and how easy would it be to hide a tiny battery powered microcontroller somewhere that could play havoc by sending the signals at random times, or cripple the entire line on command?
posted by CaseyB at 3:07 PM on December 5, 2016


All I know is that after reading this thread, I'll be only taking buses
posted by infini at 11:57 PM on December 5, 2016



