Computer says No
June 23, 2012 10:48 AM

For the past 4 days, up to 12 million NatWest / Royal Bank of Scotland customers have been unable to pay bills, move money or get paid due to a technical problem. Customers have been unable to complete on house purchases and some are stuck because they can't pay hotel bills abroad. The new mobile banking service has also been affected. The bank has called in 7,000 staff to open all weekend as problems persist.
Just three months ago, the State-controlled bank outsourced nearly 300 back-office roles to Hyderabad in India.
posted by Lanark (71 comments total) 8 users marked this as a favorite

 
Is there a causal link between the outage and the outsourcing? Because none of the articles appear to support that.
posted by vacapinta at 10:54 AM on June 23, 2012 [10 favorites]


David Silverstone, delivery and solutions director at banking technology specialists NMQA, said the glitch was almost inevitably caused by the introduction of a change.

Goals for 2012 IT Infrastructure Team:
- Introduce Change
posted by benzenedream at 11:05 AM on June 23, 2012 [20 favorites]


Interestingly, most bank errors favor the bank. Funny, that.
posted by SPrintF at 11:10 AM on June 23, 2012 [1 favorite]


How does this favor the bank?

Maybe they get marginally more money on the float for money that wouldn't otherwise be in their bank for the past few days, but that seems like it's probably trivial when compared to the amount of money they get on the float in normal circumstances, not to mention when compared to the badwill generated by being unable to get money to their customers.
posted by Flunkie at 11:22 AM on June 23, 2012


It may or may not have been the culprit here, but outsourcing really has the potential to make any problem worse. Not because there's anything wrong with the staff at the outsourcer -- often they're more skilled than your own staff because they're specialists and in a competitive environment. No, the problem is that they're thousands of miles away, speak a different language, and, most importantly, report to a different organization which will have every bit the same incentive to skimp on serving you that you have to skimp on serving your own clients. When you do a software rollout yourself it's all hands on deck and brace for a storm. When an outsourcer does it for you, it'll never get the same focus.
posted by tyllwin at 11:31 AM on June 23, 2012 [22 favorites]


Just fyi, Bank Transfer Day is still chugging along (previously).
Building Societies are the U.K.'s version of Credit Unions, btw.
posted by jeffburdges at 11:31 AM on June 23, 2012 [3 favorites]


Is there a causal link between the outage and the outsourcing? Because none of the articles appear to support that.

I followed this thread on Slashdot where the general opinion seems to be that intimate knowledge of backend systems may not have been adequately documented and/or communicated to the outsourced developers.

In my own development experience, when an original development team is replaced on the cheap, problems begin creeping in, which leads management to hire local consultants, hire the former developers as consultants, or build up a new team of on-site developers.
posted by mistersquid at 11:33 AM on June 23, 2012 [7 favorites]


intimate knowledge of backend systems may not have been adequately documented and/or communicated to the outsourced developers

Because people who are losing their jobs are not often incented to make it easy. They're incented to find other employment and bail out ASAP.
posted by tyllwin at 11:46 AM on June 23, 2012 [16 favorites]


Interestingly, most bank errors favor the bank. Funny, that.

Hahahaha, no. Nobody chooses this degree of brand damage. Multiple days of prominent blanket media coverage informing everyone of how you're a crap bank doesn't favour the bank.
posted by oliverburkeman at 11:46 AM on June 23, 2012


How does this favor the bank?

I don't see anything that implies that it does favor the bank.
posted by Tomorrowful at 11:46 AM on June 23, 2012


This is Hyper-bad.
posted by Monkey0nCrack at 11:48 AM on June 23, 2012 [2 favorites]


As much as outsourcing always creates problems and is probably partially to be blamed for this, I suspect that it is caused by the UK's bad attitude towards software engineering:

- that they use IT to mean both system/computer/software engineering
- that engineering is not treated as a respectable, important profession (see the cultural guffawing and references to boffins and masculinity in the UK media)
- given the above two, I expect clueless banking executives with no technical background argued "software is a commodity" while having no idea what testing was needed before rolling something into production

In my view we are entering The Age of the Algorithm and businesses that don't rise to the needs of this time (technically aware leadership, a valuing of software as important) will suffer and die.
posted by niccolo at 11:56 AM on June 23, 2012 [11 favorites]


"Hyderabad in India," you say? For a second I thought it was Hyderabad-on-Wye or Hyderabad-under-Dinmore.
posted by Nomyte at 12:01 PM on June 23, 2012 [24 favorites]


How does this favor the bank?
I don't see anything that implies that it does favor the bank.
Certainly this...
"Interestingly, most bank errors favor the bank. Funny, that."
... does not, strictly speaking, necessarily imply that this incident of a bank error that we're talking about favors the bank. But if it wasn't meant that way, then it seems to me like it's just a non sequitur in this context.
posted by Flunkie at 12:07 PM on June 23, 2012


Computer says no.
posted by empatterson at 12:09 PM on June 23, 2012


Flunkie, that was probably a joke referencing the Monopoly "chance" card that reads, "Bank error in your favor, take $50" (or however much). It's a pretty famous phrase.
posted by gilrain at 12:13 PM on June 23, 2012 [1 favorite]


Here it is.
posted by gilrain at 12:14 PM on June 23, 2012


Worst toilet bank in Scotland.
posted by Blazecock Pileon at 12:19 PM on June 23, 2012 [6 favorites]


This is also affecting customers of Ulster Bank in Ireland.
posted by nfg at 12:24 PM on June 23, 2012 [1 favorite]


I moved to Wales (from the US) for a year when I was in my early twenties. Starting a local bank account was a little daunting because of my ignorance. I was bartending at a tourist hotel, and on my first night I decided to question a nice elderly man who was sitting at the bar.

"I don't know anything about your banks. Do you have a recommendation as to where I should open an account?"

"We're at Barclay's. They're mostly the same, I reckon." He paused and then dropped to a conspiratorial whisper. "Whatever you do, don't use Nat West," the old man said as he glanced to see who was in earshot. "Those people are bastards."

I didn't even check to see if Nat West had a local branch.
posted by Mayor Curley at 12:26 PM on June 23, 2012 [6 favorites]


niccolo -

"that engineering is not treated as a respectable important profession see cultural guffawing and references to boffins, masculinity in the UK media"

Is there a negative connotation to "boffin"? I came across it in an article recently and looked it up; the definitions I found are "a person engaged in scientific or technical research; a person with knowledge or a skill considered to be complex, arcane, and difficult."

Does "boffin" suggest wimpy or something?

(Sorry for the derail.)
posted by kristi at 12:48 PM on June 23, 2012


"Hyderabad in India," you say? For a second I thought it was Hyderabad-on-Wye or Hyderabad-under-Dinmore.

Or, you know, Hyderabad in Pakistan, population 1.5 million.
posted by lefty lucky cat at 12:51 PM on June 23, 2012 [15 favorites]


I suspect that it is caused by the UK's bad attitude towards software engineering:

- that they use IT to mean both system/computer/software engineering


That's not just the UK. Pretty sure it happens everywhere you have multiple technological fields that appear to be reasonably similar to a layman.
posted by mikurski at 12:57 PM on June 23, 2012


Is there a negative connotation to "boffin"?

Depending on context, it may be simply jocular, as in "Boffins to talk up moon colony", but more often leans negative, as in "Cancer killer doesn't work, admit boffins". I don't think it's wimpiness so much as a generalized convergent fear of elites and technology.

But frankly, I'm not sure I would disagree with niccolo's comments if they were about American industry. I mean, we've got Dilbert.
posted by dhartung at 1:01 PM on June 23, 2012


According to a comment in the aforementioned Slashdot thread, the problem was with an update to the scheduler for the batch jobs that run overnight. People's accounts haven't been updated because the jobs that normally do that haven't been running. That's a nasty problem--schedulers are tricky things to deal with.

Still, the fact that this bug started hitting all the customers at the same time indicates that either they don't use a gradual rollout process or if they do, it's not working. Either way, I expect that the technical people in charge are being asked tough questions right now, and I hope the bank has learned to be more careful about technical stuff.
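
A gradual rollout of the kind mentioned above can be sketched roughly as follows. This is only an illustration, not a description of how the bank deploys changes: the function name `in_rollout`, the `cust-N` ID format and the percentage stages are all assumptions made up for the example.

```python
import hashlib


def in_rollout(customer_id: str, percent: int) -> bool:
    """Return True if this customer is in the first `percent` of 100 buckets."""
    # Hash the ID so the bucket is stable across runs and machines.
    digest = hashlib.sha256(customer_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < percent


# Hypothetical customer IDs; the percentage is widened only after each
# stage has been observed to behave.
customers = [f"cust-{n}" for n in range(1000)]
for stage in (1, 10, 50, 100):
    exposed = sum(in_rollout(c, stage) for c in customers)
    print(f"{stage:>3}% stage: {exposed} of {len(customers)} customers on the new code")
```

Because a customer's bucket never changes, anyone exposed at the 10% stage stays exposed at 50%, so a fault should show up in a small group before it reaches everyone.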
posted by A dead Quaker at 1:06 PM on June 23, 2012 [2 favorites]


often they're more skilled than your own staff because they're specialists and in a competitive environment. No, the problem is that they're thousands of miles away, speak a different language
Most Indian professionals speak English. I wonder how well someone from Scotland and someone from India could understand each other's accents, though. But written communication should be fine, I think.
According to a comment in the aforementioned Slashdot thread, the problem was with an update to the scheduler for the batch jobs that run overnight. People's accounts haven't been updated because the jobs that normally do that haven't been running. That's a nasty problem--schedulers are tricky things to deal with.
Isn't that usually the kind of thing you "fix" by having some guy manually run the process every day for a while? Or has it actually been that one of the scheduled tasks fails?

I'm betting they're still running some goofball mainframe system that no one knows how to use except the people who are retired. No one who's actually passionate about computers or technology wants to get into mainframe COBOL programming.
posted by delmoi at 1:12 PM on June 23, 2012


Software upgrades can be very risky, which is why lots of core systems are running on decades-old software. When I worked as a developer at a major bank, they spent 100 million on a project to replace an old system, just to scrap that project when they realized it wouldn't work.
posted by Bort at 1:14 PM on June 23, 2012


Most Indian professionals speak English. I wonder how well someone from Scotland and someone from India could understand each other's accents, though. But written communication should be fine, I think.

In my experience, nearly every one of the Indian outsourced developers I've worked with has been extremely nice, and I enjoyed working with them on a personal level. However, both verbal and written communication ranges from somewhat difficult to almost impossible.

As an aside, the level of detail needed in the specs for the code, combined with the rework due to bugs or misunderstandings, causes anything short of wholesale outsourcing to be less than cost effective, IMHO. Although I've seen level 1 support outsourcing work well - just not development.
posted by Bort at 1:22 PM on June 23, 2012 [1 favorite]


I'd agree with niccolo that blame probably rests with the U.K.'s overall incompetent attitude towards engineering, possibly paired with nepotism.

Anyone remember insurers withdrawing lost luggage coverage for Heathrow terminal five because they felt the dangers could no longer be considered unforeseen circumstances? Yeah, that happened because they found some clueless idiot with high level connections to handle running it initially. She found qualified staff from various other airports and terminals, but she never arranged that they actually see the building before the opening.
posted by jeffburdges at 1:24 PM on June 23, 2012 [1 favorite]


delmoi: Isn't that usually the kind of thing you "fix" by having some guy manually run the process every day for a while?
Not to lay too much snark on it, but in a professionally managed environment that's the kind of thing you fix by running it on the redundant, parallel test system first to make goddamned sure it's going to work before you roll it out to the production system serving 12 million customers.
posted by ob1quixote at 1:34 PM on June 23, 2012 [19 favorites]


Oh damn, they took out the scheduler? Those things are frightening to deal with.
posted by roboton666 at 1:57 PM on June 23, 2012


Not to lay too much snark on it, but in a professionally managed environment that's the kind of thing you fix by running it on the redundant, parallel test system first to make goddamned sure it's going to work before you roll it out to the production system serving 12 million customers.
Yeah, this is not about outsourcing or complexity, it's about being cheap and stupid.
posted by fullerine at 1:58 PM on June 23, 2012


Not to lay too much snark on it, but in a professionally managed environment that's the kind of thing you fix by running it on the redundant, parallel test system first to make goddamned sure it's going to work before you roll it out to the production system serving 12 million customers.

Yeah, this is not about outsourcing or complexity, it's about being cheap and stupid.


There are situations where differences between QA and production are just enough to cause problems, and other times the redundant system upgrades are such that they cannot run in parallel on differing code bases, thus requiring a "fail forward" approach.

Sometimes you just have to put your ass in a sling and hope for the best...
posted by roboton666 at 2:03 PM on June 23, 2012 [1 favorite]


Isn't that usually the kind of thing you "fix" by having some guy manually run the process every day for a while? Or has it actually been that one of the scheduled tasks fails?

It's not "a" scheduled task. Something as simple as reconciling daily transactions for a single account may have hundreds, or most likely thousands, of tasks that have to be run in a specific order. Scheduling software is mind-bogglingly complicated: you run one task manually out of order and then BAM! you just completely fucked up 500 other processes, which will take a team of 10 engineers a week to fix, working near 24 hours a day.

Right now they are most likely trying to wrap their heads around exactly what has been impacted, and what the first step is. Running headlong into this situation could make it a hundred times worse.
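
To make "what has been impacted" concrete, here is a minimal sketch that walks a dependency map and lists every job downstream of a failed one. The job names and the `dependents` map are invented for illustration; a real batch schedule would be vastly larger and would not necessarily be stored this way.

```python
from collections import deque


def impacted_jobs(dependents: dict[str, list[str]], failed: str) -> set[str]:
    """Return every job that directly or indirectly depends on `failed`."""
    seen: set[str] = set()
    queue = deque([failed])
    while queue:
        job = queue.popleft()
        for child in dependents.get(job, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


# Hypothetical job names: each key lists the jobs that consume its output.
dependents = {
    "load_transactions": ["post_to_ledger"],
    "post_to_ledger": ["update_balances", "calc_interest"],
    "update_balances": ["send_statements"],
    "calc_interest": ["send_statements"],
    "fx_rates": ["calc_interest"],
}

print(sorted(impacted_jobs(dependents, "post_to_ledger")))
# ['calc_interest', 'send_statements', 'update_balances']
```

Even in this toy graph one failure in the middle touches three other jobs, which hints at why rerunning a single step by hand is risky.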
posted by roboton666 at 2:13 PM on June 23, 2012 [1 favorite]


I was supposed to get paid on Friday; it would be really nice if that happened sometime very soon.

(I am in the fortunate position that I'm not completely reliant on my Ulster Bank account. Right now I'm more concerned about the potential impact on my credit score if any of the automatic payments which are timed to come out of my account right after I get paid fail to get made.)
posted by meronym at 2:32 PM on June 23, 2012 [1 favorite]


Not to lay too much snark on it, but in a professionally managed environment that's the kind of thing you fix by running it on the redundant, parallel test system first to make goddamned sure it's going to work before you roll it out to the production system serving 12 million customers.

Testing software is difficult and complicated. One of the core ideas in QA is, in fact, that testing can never prove that a piece of software works, only give a level of confidence that it will. Regardless of how parallel you try to make your test environments, they're never exactly the same. For example, you probably don't want live customer data in your test environment for data protection reasons. And it quickly diverges from live data.

Additionally, in banking you're typically dealing with ancient systems that aren't as amenable to testing as modern systems. You can only really test what you understand and as time goes by, knowledge of these systems decays. Maybe the system can't easily be sped up, so you can't verify the changes will work months in the future. In the end, all testing is a trade-off between what can be done and what should be done. Maybe (probably?) there wasn't enough testing done, but I'd be astonished if the kind of pre-production testing you think should have happened didn't happen in some form.
posted by xchmp at 2:36 PM on June 23, 2012 [1 favorite]


You can only really test what you understand and as time goes by, knowledge of these systems decays.

c.f. Vernor Vinge's software archaeologists.
posted by Justinian at 2:57 PM on June 23, 2012 [2 favorites]


There are situations where differences between QA and production are just enough to cause problems, and other times the redundant system upgrades are such that they cannot run in parallel on differing code bases, thus requiring a "fail forward" approach.

Testing software is difficult and complicated. One of the core ideas in QA is, in fact, that testing can never prove that a piece of software works, only give a level of confidence that it will. Regardless of how parallel you try to make your test environments, they're never exactly the same. For example, you probably don't want live customer data in your test environment for data protection reasons. And it quickly diverges from live data.
They just laid off over a thousand of their support staff, do you honestly think this is just a case of bad luck? Some support staff are expensive for a reason and cutting corners gets you on the news.

It's happening everywhere and happening with systems which should never be run cheaply. You may think there's a line where safety or security critical systems would not be outsourced or done on the cheap, but they used to say that about banking systems.

Unite should take out full page ads in all the Sunday papers.

WE. TOLD. YOU. SO.
posted by fullerine at 2:59 PM on June 23, 2012 [14 favorites]


Not to lay too much snark on it, but in a professionally managed environment that's the kind of thing you fix by running it on the redundant, parallel test system first to make goddamned sure it's going to work before you roll it out to the production system serving 12 million customers.
Maybe today on commodity hardware, but how many mainframe shops run completely independent exact replicas of their main systems on different hardware complete with simulated load? If I had to guess, they probably use virtualization to create a test environment on the same hardware, and maybe copy over the databases regularly and test their batch processes on it before deploying. But remember a lot of these processes and systems were set up when computer hardware at that level was very expensive and you didn't want to waste it running a completely redundant test.

Of course who really knows anyway, other than the people there? I don't know that much about mainframe programming.
Scheduling software is mind-bogglingly complicated: you run one task manually out of order and then BAM! you just completely fucked up 500 other processes, which will take a team of 10 engineers a week to fix, working near 24 hours a day.
There are lots of things you could do today with modern systems that could avoid most of these potential problems. If you have a bunch of tasks that need to be run in a specific order, it would seem like that's something you would just use a simple shell script to do. If you want to run things in parallel if they can be, you can use a queue system where process A puts its results in the input queue for process B, C, and D which can run when they get all the input they need and so on. (Maybe that's what people mean when they say "scheduler"? From my perspective a "scheduler" is just something that runs a certain script at a certain time, or else the CPU scheduler used for multithreading)
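
The "B, C and D run once A has produced their input" idea can be sketched with Python's standard `graphlib` module (3.9+). The job names A to E and the `waits_for` map are made up for the example, and this says nothing about what the bank's scheduler actually does.

```python
from graphlib import TopologicalSorter

# Hypothetical jobs: each key maps to the set of jobs it must wait for.
waits_for = {
    "B": {"A"},
    "C": {"A"},
    "D": {"A"},
    "E": {"B", "C", "D"},
}


def run_in_waves(graph):
    """Return a list of waves; every job in a wave has all its inputs ready."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the dependencies loop
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)   # a real system would hand these to workers
        sorter.done(*ready)   # marking jobs done releases their dependents
    return waves


print(run_in_waves(waits_for))
# [['A'], ['B', 'C', 'D'], ['E']]
```

Jobs in the same wave could run in parallel. The sketch deliberately leaves out failures, retries and restarts, which is where the real difficulty tends to be.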

And definitely you would want to have some mechanism to roll back or simply not commit the results if things didn't work (i.e. the process would work on a copy of the data, then when finished the system would point to the new DB, rather than messing with a single copy of the data)
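
A toy version of that "work on a copy, then point to the new data" approach is sketched below. The `AccountStore` class, the `no_overdrafts` check and the account names are assumptions invented for the example; a real system would do this at the database level, not with an in-memory dict.

```python
class AccountStore:
    """Holds the live balances and replaces them only after a batch succeeds."""

    def __init__(self, balances):
        self.live = dict(balances)

    def run_batch(self, transactions, is_valid):
        staged = dict(self.live)      # work on a copy, never the live data
        for account, amount in transactions:
            staged[account] = staged.get(account, 0) + amount
        if not is_valid(staged):
            return False              # discard the copy; live data is untouched
        self.live = staged            # a single reference swap acts as the commit
        return True


def no_overdrafts(balances):
    """Hypothetical sanity check: reject any batch that leaves a negative balance."""
    return all(value >= 0 for value in balances.values())


store = AccountStore({"alice": 100, "bob": 50})
print(store.run_batch([("alice", -30), ("bob", 30)], no_overdrafts), store.live)
# True {'alice': 70, 'bob': 80}
print(store.run_batch([("bob", -500)], no_overdrafts), store.live)
# False {'alice': 70, 'bob': 80}
```

The second batch fails validation, so the live balances are exactly what they were before it ran.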
They just laid off over a thousand of their support staff, do you honestly think this is just a case of bad luck? Some support staff are expensive for a reason and cutting corners gets you on the news.
Yeah. They probably laid off the guy who knew how to run that process manually :P

But more seriously, even if they weren't trying to save money, the people who are running these systems are retiring on their own anyway.
posted by delmoi at 3:16 PM on June 23, 2012 [1 favorite]


They just laid off over a thousand of their support staff, do you honestly think this is just a case of bad luck? Some support staff are expensive for a reason and cutting corners gets you on the news.

Well, they appear to be advertising for test analysts (also business analysts), so it's not clear to me that change roles have been outsourced.

My guess would be a combination of bad luck and underinvestment. Things go wrong when you make software changes, but not having an effective rollback plan is pretty inexcusable.
posted by xchmp at 3:19 PM on June 23, 2012 [1 favorite]


xchmp: Regardless of how parallel you try to make your test environments, they're never exactly the same. For example, you probably don't want live customer data in your test environment for data protection reasons. And it quickly diverges from live data. Additionally, in banking you're typically dealing with ancient systems that aren't as amenable to testing as modern systems.

All of which is to say, they were too cheap to do it right.

If their parallel system isn't identical to production, it's because they didn't want to spend the money for a fully redundant system. If they don't use production data on their parallel system, it's because they didn't want to spend the money on the necessary hardening and data protection. If their hardware and software are so ancient that they can't easily duplicate them so that they have a parallel system, it's because they didn't want to spend the money to upgrade to modern equipment. Hell, they apparently don't even have a backup of the software they can roll back to.

Due to the amount of money and customers involved, this is a Class V system. It should be run to the same standard as a system for which failure entails a risk of the loss of human life. There is absolutely no excuse for any of this in the 21st century.

I really wish business school types would understand that, here in the future, money spent on IT is just as important as money spent improving the physical plant. Hardware and software capital expenditures have all been depreciable or amortizable since the '90s.

Not that I'm bitter…


P.S. On preview, exactly, xchmp. It's shameful, really.
posted by ob1quixote at 3:35 PM on June 23, 2012 [6 favorites]


In my own development experience, when an original development team is replaced on the cheap, problems begin creeping in, which leads management to hire local consultants, hire the former developers as consultants, or build up a new team of on-site developers.


And long may this practice continue
posted by the noob at 4:02 PM on June 23, 2012 [1 favorite]


Ironically, in the modern era, banks are usually just collections of software; a (vast) database, an ATM integration system, systems to integrate with other banks, credit and debit card networks, reconciliation and reporting jobs, online banking, and teller systems to record in-person transactions—but most are still run by "business" types with no discernible skills who list their IT infrastructure as "expense" and resent that the nerds in the back room have more knowledge of their institution than they do.
posted by sonic meat machine at 5:00 PM on June 23, 2012 [7 favorites]


The only thing I can add to this thread is some Manic Street Preachers. Nat West-Barclays-Midlands-Lloyds, so you can listen as you read (or sob, if you're an RBS customer).
posted by Mezentian at 7:37 PM on June 23, 2012


If their parallel system isn't identical to production, it's because they didn't want to spend the money for a fully redundant system.
How many mainframe systems are set up that way? How do you emulate, for example, thousands of bank employees entering data on TN3270 terminals?
Due to the amount of money and customers involved, this is a Class V system. It should be run to the same standard as a system for which failure entails a risk of the loss of human life. There is absolutely no excuse for any of this in the 21st century.
Well, it's hard to put an exact dollar (or Pound) figure on this but the thing is most of the costs won't be paid by them. Customer A has his bank set up to pay his other bank's credit card: he misses his payment and the bank charges him a late fee. Maybe he doesn't notice: in that case the money isn't lost, it's just transferred from one person who is Not Them to another group of people who are also Not Them.

If they do notice the late fee, they've got to call up and try to get it reversed. That, as well as lots of other messed up transactions, can be fixed, and the cost is the cost of the time it takes people to fix it. That's lost productivity for companies and a cost to consumers in their time (which you can assign a value to). But they don't show up as direct costs on a balance sheet. And in any event, most people involved are Not Them.

Looking at the costs to the bank itself, on its balance sheet, they have to look at what the costs are to them vs. the cost of an upgrade. Now, I think it could be done cheap but "enterprise" people like "enterprise" software which tends to be needlessly complex and expensive.

(And let's be honest, software developers and IT people who go work for a bank aren't doing it because they love banking software, they're trying to make money and so they don't have a huge motivation to do things in a cost effective way. And decision makers base those decisions on minimizing blame if things go wrong, so they buy IBM and if something gets messed up they can't be blamed for trying some risky startup or new technology. "No one got fired for buying IBM" and so on. So the costs might be higher than you would expect, based on the total amount of software written.)

Aaaaanyway the basic point is that while this is hugely embarrassing, UK Pound for UK Pound (can't just say 'pound for pound' here, I guess) the ultimate cost for the bank itself might be less than the money saved by cutting corners for years and years. It's also possible that it's less than what was saved by outsourcing. However, if stuff like this keeps happening due to communications problems and so on... then they're going to continue to have issues.

You could argue that might do some damage to the brand, but I'm not sure if banks really suffer too greatly for that. Look at Bank of America, for example, which has gotten in trouble for illegally foreclosing on people, in some cases sending people around to 'trash out' their houses by mistake. I don't know how people feel about this bank, but in the US there is a lot of discontent overall; since the effort required to switch seems like a lot (although it isn't actually that difficult), people don't do it.
posted by delmoi at 7:45 PM on June 23, 2012 [1 favorite]


0 1 * * * /usr/local/bin/update-accounts
posted by nicwolff at 9:36 PM on June 23, 2012 [1 favorite]


It's not always the outsourcers... the corporate culture that makes outsourcing seem like a good idea can have internal impacts as well. I worked for an outsourcer, where the project and testing teams were internal to the client. They asked me to implement a change on a system. I emailed them to test in the Dec environment and they all immediately replied with positive test results.

Thinking it wasn't anywhere near enough time I verified with them and the PM it was all good. They assured me it was so I implemented in production.

Turns out to save money on the project under very tight financial restrictions they had one person, a grad no less, do some basic testing and then tell everyone to fill in positive test results. Rollback plan was tested in the same manner.

Yeah that was a fun weekend.
posted by Admira at 10:24 PM on June 23, 2012 [1 favorite]


Dec = Dev ... tablet autocorrect...
posted by Admira at 10:25 PM on June 23, 2012


delmoi: How many mainframe systems are set up that way? How do you emulate, for example, thousands of bank employees entering data on TN3270 terminals?
I don't actually disagree with you, delmoi, especially about the "A times B times C equals X" nature of the decision to just ignore systems that sometimes fail as long as it doesn't affect the bottom line. So I'm not really arguing with you. I guess I'm arguing with the kind of management who allow something like this to happen.


I think we're well past the point where it's irresponsible to still use badly outdated mainframe hardware and serial terminals in 2012. The software isn't magical. One presumes it must run to completion with a deterministic output. Otherwise what good would it be to the bank? It could be rewritten into a modern system in a finite amount of time for a finite amount of money.

That is, and this goes to your point about "nobody ever got fired for buying IBM", as long as you didn't bring in the kind of big outfit that's going to send one senior guy and a bunch of kids right out of college to work on your project. They'll all look good in suits though.

They're harder to find, but you can hire people who get things done rather than leave you with nothing but "white papers" to show for your multi-million dollar consulting fee invoice. Hell it took a dozen of us only a few months to rewrite the ACH interface for a major bank in Gupta SQL Fucking Windows and that was back when suck.com was being updated daily.


I understand what you're saying though. Management does some kind of cost/benefit analysis and concludes that it's cheaper and easier to just keep going with what they have rather than upgrade, despite dire warnings from the professionals they hire to run their IT departments. Then something like this happens and they don't want to hear about years of neglected infrastructure, outdated hardware and software, and their cheaping out on everything from desktops to post-it notes.[1] The nerds screwed up, it couldn't be management's fault.

The point being, I'm really not sure what goes through the heads of senior management anymore when it comes to IT. You'd think that a business that can't really afford to have the computers go down ever would take it seriously, but I know from experience they don't. IT spending is for whatever reason looked upon as wasteful, even in organizations who live and die by their Internet sales.

When I started in the business, I figured all the it's-just-a-computer—push-a-few-buttons-and-make-it-work—you're-not-even-really-working—what-do-I-even-pay-you-for types who could never understand computers would all be retired by now. I now fear that we'll never actually be rid of them.


[1] I did a job one time for a civil engineering firm that, in addition to the kind of small potatoes city parks & rec gig they had me in to help them with, designed projects with seven-figure fees like bridges and expressway interchanges. The boss made the PEs working for him cut the post-it notes in half so they could get twice as many uses out of one 50¢ pad. True story.
posted by ob1quixote at 10:41 PM on June 23, 2012 [1 favorite]


Ah, Hyderabad. Separated from Asgard to the west by the River of Death Ice, locked in perpetual winter. Good times.
posted by obiwanwasabi at 10:54 PM on June 23, 2012


No-one seemed to notice their cash machine network was down a couple of Saturdays ago. It was impossible to withdraw cash from them (or any other cash machine) for a couple of hours. All the evidence I have is a few irate tweets. This seems a much more pervasive issue than is being acknowledged.

Ah, the banking sector. Extracting a small fortune (~9% of the economy) to fuck things up on a global, continental, national and individual basis.
posted by davemee at 11:56 PM on June 23, 2012


the ultimate cost for the bank itself might be less than the money saved by cutting corners for years and years. It's also possible that it's less than what was saved by outsourcing

But I can't be the only super-lazy Nat West customer who has finally decided to move to the Co-op bank, which I have been kind of meaning to get around to for literally years. It's just a little push.
posted by communicator at 1:24 AM on June 24, 2012


I have a savings account at the RBS and was wondering why I couldn't get to it. (it's working again now though)

I always thought the online banking interface looked like it used web technology from 1998, and this fuckup does not inspire me with great confidence about the underlying tech.

I can't be the only one to think that it would probably be best to move my money somewhere else, where there might be 0.1% less interest but a bit more professionalism at work.
posted by ts;dr at 1:52 AM on June 24, 2012


A lot of people here are deliberately missing the point. The issue isn't outsourcing; that was just the straw that broke the camel's back. The problem is years of underinvestment in IT: the real IT gurus were let go years ago, and the present IT systems are the product of the kids and consultants brought in to replace them. It is these "johnny-come-latelys" who botched things; their botched and inadequately documented systems were then outsourced and the whole shoddy mess exposed.

Also anyone who says "schedulers are complicated" should be escorted from the building and never allowed back in. Systems are complicated, which is why they have to be structured correctly. Schedulers are simple. Ordering mutually dependent events was a problem solved back in the 60s and many times since. The problem is the IT staff seem to have had no clue what they were doing so they botched it.
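
A minimal sketch of what "ordering mutually dependent events" comes down to, using a topological sort from Python's standard library (3.9+). The job names are hypothetical and have nothing to do with whatever RBS actually runs; the point is only that the ordering step itself is a solved problem.

```python
from graphlib import TopologicalSorter

# Hypothetical nightly jobs: each key may only run after every job in its set.
deps = {
    "load_transactions": set(),
    "post_payments": {"load_transactions"},
    "update_balances": {"post_payments"},
    "send_statements": {"update_balances"},
}

def run_order(dependencies):
    """Return one valid execution order for the given dependency map.

    Raises graphlib.CycleError if the jobs depend on each other in a loop.
    """
    return list(TopologicalSorter(dependencies).static_order())

order = run_order(deps)
print(order)
```

Because these four jobs form a single chain, there is only one valid order. With a circular dependency the sorter refuses to produce an order at all, which is the behaviour you want from a scheduler.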
posted by epo at 2:02 AM on June 24, 2012 [5 favorites]


Ah, Hyderabad. Separated from Asgard to the west by the River of Death Ice, locked in perpetual winter. Good times.

What makes this even funnier is that it's been one of the hottest summers in the region in recent times, Hyderabad itself reaching 44 or 45 C, and some suburbs and neighboring towns reaching 49C.

Perpetual winter? That sounds more like London's weather! :)

On a serious note, I'm surprised by this:

The FSA fell short of saying that RBS would face any regulatory action as a result of its IT issues.

Somehow, I thought there'd be fines involved, but apparently, that's how it is in Asia as well.
posted by the cydonian at 2:07 AM on June 24, 2012


I really wish business school types would understand, here in the future, money spent on IT is just as important as money spent improving the physical plant.

Do you sincerely believe that business school types care a lot about improving the physical plant?
posted by Skeptic at 2:24 AM on June 24, 2012


Only when it impacts them directly. Like too few stalls in the executive washroom, or too few chairs and tables in the executive dining room.
posted by mephron at 2:39 AM on June 24, 2012 [1 favorite]


epo: Ordering mutually dependent events was a problem solved back in the 60s and many times since.
Precisely. As in, it's covered in articles 82 and 88 of the CRC Computer Science Handbook.
Skeptic: Do you sincerely believe that business school types care a lot about improving the physical plant?
They used to…
posted by ob1quixote at 5:40 AM on June 24, 2012 [1 favorite]


My bank is RBS and although my pay, which came in on Friday, appears in my account it is not available to me. I am going to have to waste time going into my branch tomorrow, dammit. And I think they could have emailed customers to let them know - they have my email address.

Part of RBS is in the process of being taken over by Santander. I had a letter recently with a colour-coded diagram of where my account is at the moment. Apparently I'm at stage two, "Being kept informed". I think it was yellow.
posted by paduasoy at 2:46 PM on June 24, 2012


The problems mentioned here are not confined to RBS, but their problems seem worse than most.

In 2009, I was playing Tribal Wars (it's interesting, but I don't recommend it as a game), and was trying to make a payment for premium features using a monstrosity called RBS Worldpay. Somehow, there was a problem in the payment process that led to a dead end. When I tried to explain the problem to RBS tech support, I got asked all the usual dumb questions (are you sure cookies are allowed?, etc.), but basically the problem was a logical flaw in the next page display process (I think).

I still maintain that adequate and exhaustive quality assurance will help you avoid glaring errors and unhappy customers. RBS didn't seem interested, but Tribal Wars (InnoGames GmbH) was--after a few months, they dropped RBS. My point here is people will vote with their feet--if you don't give them what they want or need, most likely someone else will.

--pjm
posted by pjmoy at 5:10 PM on June 24, 2012


If this goes on much longer, RBS is going to suffer a mass exodus.
posted by pharm at 12:26 AM on June 25, 2012


Oh please oh please oh please
posted by jeffburdges at 2:44 AM on June 25, 2012


Heh. It's funny that 'scheduling is simple' and 'scheduling software is nightmarish' can both be simultaneously true. The thing about a lot of back-office software for places like this is that it's all sourced and evaluated by non-technical folks. The guy making the decision on what consultant to hire is not doing so based on a sound evaluation of his experience and talents but on who takes him out to the best steaks and strippers. The manager evaluating a piece of software has no idea that a scheduling system is a trivial task, or that a horrifically complicated infrastructure dedicated to running it is going to add complexity and failure modes and attack surface to every transaction it touches; he's just awed by the shiny GUI drag-and-drop interface. So since that's what makes the money, that's what gets the development time. Making the thing work reliably isn't the problem of the sales team or the procurement manager; that'll just be Someone Else's Problem, and usually Someone Else who can be easily left holding the bag.
posted by Skorgu at 6:20 AM on June 25, 2012 [1 favorite]


Not to lay too much snark on it, but in a professionally managed environment that's the kind of thing you fix by running it on the redundant, parallel test system first to make goddamned sure it's going to work before you roll it out to the production system serving 12 million customers.

As an experienced, certified (you have to be certified to do this job) tester I approve of this message, but it won't be the first time that something has been tested and validated in acceptance that failed on implementation in production. Any new install is a risk and I still have to find the test or acceptance environment that is 100% production like.

I could tell stories, but then I would get sued.
posted by MartinWisse at 6:48 AM on June 25, 2012


In fact, optimal scheduling problems are not merely hard but often NP-hard, although usually good approximation algorithms exist (pdf).
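
To make "good approximation algorithms exist" concrete, here is a small sketch of one classic example, the greedy longest-processing-time-first (LPT) rule for spreading jobs over identical machines so the last one finishes as early as possible. The function name and the sample durations are made up for illustration. Finding the true optimum is NP-hard, while this greedy rule is guaranteed to stay within a factor of 4/3 - 1/(3m) of it on m machines.

```python
import heapq

def lpt_schedule(durations, machines):
    """Greedy LPT: take jobs longest first, give each to the least-loaded machine.

    Returns (makespan, assignment), where assignment[m] lists the job indexes
    placed on machine m.
    """
    # Min-heap of (current load, machine index); all machines start empty.
    heap = [(0, m) for m in range(machines)]
    heapq.heapify(heap)
    assignment = [[] for _ in range(machines)]

    # Visit jobs in order of decreasing duration.
    for job, duration in sorted(enumerate(durations), key=lambda jd: -jd[1]):
        load, m = heapq.heappop(heap)
        assignment[m].append(job)
        heapq.heappush(heap, (load + duration, m))

    makespan = max(load for load, _ in heap)
    return makespan, assignment

makespan, assignment = lpt_schedule([3, 3, 2, 2, 2], machines=2)
print(makespan, assignment)
```

For the durations [3, 3, 2, 2, 2] on two machines the greedy rule finishes at time 7, while the best possible split (3+3 on one machine, 2+2+2 on the other) finishes at 6: close, but not optimal, which is exactly the trade-off being described.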
posted by jeffburdges at 6:54 AM on June 25, 2012


There are lots of things you could do today with modern systems that could avoid most of these potential problems. If you have a bunch of tasks that need to be run in a specific order, it would seem like that's something you would just use a simple shell script to do.

Ha ha ha. No.

That's good enough for toy systems, but you really don't want to run even moderately complex banking or social security system schedules in code; rather you use something like Tivoli Workload Scheduler, which lives on its own system and which can actually handle multi-machine, multi-OS, multi-environment schedules.

You could do all that in code, but it would get ugly very quickly and, as importantly, would only really be understood by the programmers, rather than the people who actually have to deal with the scheduling day in, day out: the functional admins and sysadmins and such. Not that TWS is that easy to understand, but you can get a basic grasp with a five day course and some experience.

Also, I would question the idea that mainframe = old and crappy.
posted by MartinWisse at 7:03 AM on June 25, 2012 [1 favorite]


Update from the Guardian here with far more technical detail than I ever expected to see in a mainstream newspaper.
posted by xchmp at 3:37 PM on June 25, 2012 [1 favorite]


jeffburdges: In fact, optimal scheduling problems are not merely hard but often NP-hard, although usually good approximation algorithms exist (pdf).
I don't think we're talking about an optimization or preemptive scheduling problem though. I think this is something more like making sure the same defined set of jobs, working on the same defined database, occur in the same defined order every night. The article xchmp linked indicates they're using CA-7 Workload Automation for this.
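
As a rough sketch of that simpler problem (not of how CA-7 or any real workload scheduler works), a fixed nightly sequence can be modelled as a list of jobs that runs in the defined order and stops at the first failure, so nothing downstream runs against half-processed data. The job names and the `run_batch` helper are hypothetical.

```python
def run_batch(jobs):
    """Run (name, callable) pairs in the given order, stopping at the first failure.

    Returns (completed_names, failed_name), where failed_name is None if
    every job succeeded.
    """
    completed = []
    for name, job in jobs:
        try:
            job()
        except Exception as exc:
            # Do not run later jobs: they depend on this one having finished.
            print(f"{name} failed ({exc}); remaining jobs were not run")
            return completed, name
        completed.append(name)
    return completed, None

def broken_job():
    raise RuntimeError("input file missing")

# Hypothetical nightly sequence in which the second job fails.
nightly = [
    ("load_transactions", lambda: None),
    ("post_payments", broken_job),
    ("update_balances", lambda: None),
]
completed, failed = run_batch(nightly)
print(completed, failed)
```

The hard part in practice is not this loop but everything around it: knowing which job failed, what state it left behind, and how to restart safely from that point.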


MartinWisse: I would question the idea that mainframe = old and crappy.
Certainly not, and that's not what I said. Badly outdated mainframes and serial terminals are old and crappy. I'm a huge fan of IBM System z and Power hardware, especially for a high-availability, fail-safe application like handling bank accounts.

However, if an organization is still on a fully depreciated S/390, AS/400, or RS/6000 in 2012, especially if it's because they have software they think is "irreplaceable", I'm calling the people in charge of that decision irresponsible.
posted by ob1quixote at 4:07 PM on June 25, 2012


Financial Ombudsman Service: NatWest crisis could see knock-on effects drag on for weeks
posted by Lanark at 10:12 AM on June 26, 2012


...And I still haven't been paid.
posted by meronym at 3:48 PM on June 26, 2012


And britishairways.com looks down too. I suppose U.K. banks and airline websites work better than their train sites though, which still beat renfe.com. lol
posted by jeffburdges at 5:17 PM on June 26, 2012



