Fully and robustly tested and meets the banking industry standards
December 13, 2020 8:12 AM

For years the UK Post Office denied there were any fundamental problems with the Horizon software, provided by IT specialist Fujitsu. Instead, it blamed mistakes on dishonesty from sub-postmasters, who run most of its 11,500 branches. The system was introduced in 1995 as a PFI deal costing £1 billion, and problems with it were first reported in early 2000. Hundreds of postmasters and other Post Office employees have since been jailed and financially ruined.
In 2019, class action civil litigation brought by 550 sub-postmasters was settled by the Post Office.
In December 2020 the first sub-postmasters wrongly accused of theft and fraud finally had their convictions quashed.
posted by Lanark (60 comments total) 41 users marked this as a favorite

Never Question The Algorithm. It is Always Right. Any Incorrect Outcomes are only attributable to Human Error. All Human Error will be prosecuted to the Fullest Extent of the ~~Law~~ Algorithm.

"You best start believin' in Cyberpunk Dystopias. You're in one."
posted by deadaluspark at 8:23 AM on December 13, 2020 [16 favorites]

Thanks for this post, Lanark. I’ve only followed this at a distance, but the way those sub-postmasters were treated is despicable. It’s not an exaggeration to say that hundreds of lives were ruined by this.

I realise I already know the answer to this question, but: will the people responsible face consequences for this mistake?
posted by adrianhon at 8:24 AM on December 13, 2020 [8 favorites]

In reading the 2019 judgement, it is clear that without the testimony from Richard Roll, a former Fujitsu employee, they never would have won this case.

The entire Fujitsu support team had access to a super-user account with the ability to alter live data: “When we go off piste we use APPSUP”. That is a powerful user privilege which allows users to do virtually anything. It was intended “for unenvisaged ad-hoc live amendment” of data. It had been used on average about once a day, and was assigned on a permanent basis to the IDs of all the IT support staff looking after Horizon.
posted by Lanark at 8:39 AM on December 13, 2020 [14 favorites]

will the people responsible face consequences for this mistake?

This field would be extremely different if programmers faced consequences for their mistakes.
posted by mhoye at 8:45 AM on December 13, 2020 [28 favorites]

Adrianhon: "there is no suggestion at present that any Post Office manager will be prosecuted for their part in the Horizon scandal." from The Register link.

As depressing as it is entirely expected. This whole thing feels like an awful real life version of the Milgram experiment.
posted by slimepuppy at 8:46 AM on December 13, 2020 [2 favorites]

As much as we want to blame the postmasters, let's be clear, Lanark is 100% correct, if it wasn't for the testimony of Richard Roll, this would have gone nowhere. Especially as the higher-ups at Fujitsu kept claiming that they had no such access to the Horizon system.

Fujitsu management should shoulder part of the blame, too. They were more than happy to ruin lives to be able to claim their software worked great and without issue. Honestly, I'm a little shocked and disappointed that more wasn't made out of the fact that the management of the company and the workers are saying different things and nobody is going after the management.

(I mean, Fujitsu is based in Japan, which makes prosecuting them for this even more complicated...)

But let's be real. Do we know for sure things would have gone down exactly the same if Fujitsu management hadn't kept claiming that their engineers had no such access when that was a lie? Was part of the reason these people were prosecuted because the government trusted the word of a private corporation over the word of their employees?

It's quite possible they would have still blamed employees, but I'd like to think if the corporation wasn't trying to cover it up, that perhaps these people wouldn't have their lives ruined.
posted by deadaluspark at 8:52 AM on December 13, 2020 [10 favorites]

I can't find the link now, but I'm sure I read somewhere that the problems seemed to mostly occur in small remote post offices which had a slow internet connection, in many cases dial-up. Quite likely the system was set up to terminate transactions after 30 seconds and boom your money is gone. This scandal will have accelerated the decline in the number of local post offices. 48% have closed since 1982.
posted by Lanark at 9:06 AM on December 13, 2020 [9 favorites]

As a software industry retiree, it frankly frightens me how willing non-programmers apparently remain to believe that commercial-grade IT systems are anywhere even close to defect-free until proven otherwise.
posted by flabdablet at 9:07 AM on December 13, 2020 [52 favorites]

As a software industry retiree, it frankly frightens me how willing non-programmers apparently remain to believe that commercial-grade IT systems are anywhere even close to defect-free until proven otherwise.

"If you can't see the bug, it doesn't exist."
posted by deadaluspark at 9:10 AM on December 13, 2020 [4 favorites]

Also, the way that postmasters were required to make up for software errors out of their own pockets is eerily similar to the effects of the recent RoboDebt clusterfuck in Australia.
posted by flabdablet at 9:11 AM on December 13, 2020 [6 favorites]

As much as we want to blame the postmasters

Just to be clear on the terminology, while from context it looks like you're saying head office people at Royal Mail should share the blame with Fujitsu, the people who were wrongfully convicted (and in some cases imprisoned) all have the job title of sub-postmaster.
posted by ambrosen at 9:13 AM on December 13, 2020 [3 favorites]

@mhoye: Programmers are not in charge of software development shops. There's an army of useless PMs and BAs that act as little more than salesmen and control the client interaction. And they've been entering software development without any education or training a software developer would receive.
posted by DetriusXii at 9:24 AM on December 13, 2020 [7 favorites]

Private Eye has covered this story extensively for years. Their special report is here. They make it clear this is a failure of management, not of programmers or sub-postmasters. The PDF you can download from that link has a handy "Who To Blame" section.
posted by chavenet at 9:24 AM on December 13, 2020 [25 favorites]

There's an army of useless PMs and BAs that act as little more than salesmen and control the client interaction.

I have yet to meet a PM-with-MBA who was competent. Literally not one.

Competent PGMs, sure, no problem. They're everywhere.

Most recent encounters, at a vastly profitable firm you've heard of:
PM with MBA from Harvard -- moron, conceptualized every project decision as though it were an M&A
PM with MBA from Stanford -- barely able to string words together, totally unable to conceptualize software
PM with that 1+1 MBA thing from Oxford -- all decisions hinged on whether it would raise his profile

...most of the "why did they do that?" decisions that emanate from Big Tech can be laid directly at the feet of some idiot PM who was angling for a promotion by launching a New Thing (and, necessarily, deprecating the middling-successful Old Thing).

The more PMs (and managers who act like PMs) you have in your reporting chain above you, the more it's gonna suck. From what little I recall of Fujitsu back in my S/390 days, I'm not surprised they fucked this up repeatedly. Fujitsu + UK Post = recipe for disaster.
posted by aramaic at 9:49 AM on December 13, 2020 [12 favorites]

"If you can't see the bug, it doesn't exist."

If you can't see the bug you have remained within the narrow range of use-cases where everything works as intended.
posted by justsomebodythatyouusedtoknow at 10:06 AM on December 13, 2020 [6 favorites]

As a software industry retiree, it frankly frightens me how willing non-programmers apparently remain to believe that commercial-grade IT systems are anywhere even close to defect-free until proven otherwise.

I'm more disgusted at a failure to properly triage issues that RESULTED IN THE LOSS OF MONEY. Those tickets should be showstoppers.
posted by mikelieman at 10:10 AM on December 13, 2020 [7 favorites]

This is horrifying, predictable, and I am so upset that even after being vindicated in court and winning some money the sub-postmasters are still going to have a huge net financial loss.
posted by jeather at 11:03 AM on December 13, 2020 [3 favorites]

Lanark’s link to an IT auditor’s damning summary is great - I recommend starting with the first part of three.

“What caught my eye when I first heard about this case were the arguments about whether the problems were caused by fraud, system error, or user error. As an auditor who worked on the technical side of many fraud cases the idea that there could be any confusion between fraud and system error makes me very uncomfortable. The system design should incorporate whatever controls are necessary to ensure such confusion can’t arise.

“When we audited live systems we established what must happen and what must not happen, what the system must do and what it must never do. We would ask how managers could know that the system would do the right things, and never do the wrong things. We then tested the system looking for evidence that these controls were present and effective. We would try to break the system, evading the controls we knew should be there, and trying to exploit missing or ineffective controls. If we succeeded we’d expect, at the least, the system to hold unambiguous evidence about what we had done.”

None of which, as he details, was present or even looked for in either the software system or the internal Post Office enforcement authority that made the terrible convictions.
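One way to get the "unambiguous evidence" the auditor asks for is an append-only log in which every entry carries a hash of the entry before it, so that a later edit to any record breaks the chain. The sketch below is only an illustration of that general control; the class, field names and example actions are hypothetical, and nothing here describes how Horizon actually stored its data.

```python
import hashlib
import json


def _digest(prev_hash, actor, action):
    # Hash the previous entry's hash together with this entry's content.
    payload = json.dumps(
        {"prev": prev_hash, "actor": actor, "action": action}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class AuditLog:
    """Hypothetical append-only log; each entry is chained to the previous one."""

    GENESIS = "0" * 64

    def __init__(self):
        self.entries = []

    def append(self, actor, action):
        prev_hash = self.entries[-1]["hash"] if self.entries else self.GENESIS
        entry = {
            "actor": actor,
            "action": action,
            "prev": prev_hash,
            "hash": _digest(prev_hash, actor, action),
        }
        self.entries.append(entry)
        return entry

    def verify(self):
        # Recompute every hash; any edited or reordered entry fails the check.
        prev_hash = self.GENESIS
        for entry in self.entries:
            if entry["prev"] != prev_hash:
                return False
            if entry["hash"] != _digest(prev_hash, entry["actor"], entry["action"]):
                return False
            prev_hash = entry["hash"]
        return True


log = AuditLog()
log.append("support_user", "insert correction record")  # made-up actions
log.append("support_user", "adjust stock unit")
print(log.verify())  # True: chain is intact
log.entries[0]["action"] = "nothing to see here"
print(log.verify())  # False: the edited entry no longer matches its hash
```

A chain like this does not stop a privileged user from changing data, but it does mean the change cannot be made silently, which is the property the audit was looking for.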
posted by clew at 11:12 AM on December 13, 2020 [12 favorites]

This field would be extremely different if programmers faced consequences for their mistakes.

GOOD! ABOUT DAMN TIME! FUCK YES!
posted by sexyrobot at 11:22 AM on December 13, 2020 [1 favorite]

I'm more disgusted at a failure to properly triage issues that RESULTED IN THE LOSS OF MONEY. Those tickets should be showstoppers.

You would think so, but... well, the loss of Someone Else's Money isn't really prioritized.

Without going into too much detail, I am aware of a bug (now long fixed) at a payment processing company that, for at least two years, just... ate the occasional disbursement, because the system that controlled the customer's account would 1) send a disbursement request to the disbursements system, 2) debit the customer's account, 3) never verify that the request actually made it. There were manual adjustments (in some cases) to fix individual accounts, including some that were over $100k. The bug was only found because someone on a totally different team investigated it on a lark.
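To make the ordering problem concrete, here is a minimal Python sketch of that "send, debit, never verify" sequence next to a version that checks for receipt before debiting. Everything in it (the `FlakyDisbursementSystem` class, the function names, amounts in integer cents) is invented for illustration and is not taken from the actual system.

```python
class DisbursementError(Exception):
    pass


class FlakyDisbursementSystem:
    """Hypothetical downstream system that silently drops some requests."""

    def __init__(self, drop_ids):
        self.drop_ids = set(drop_ids)
        self.received = {}

    def submit(self, request_id, amount_cents):
        if request_id not in self.drop_ids:
            self.received[request_id] = amount_cents

    def has_request(self, request_id):
        return request_id in self.received


def disburse_unverified(balances, account, amount_cents, request_id, downstream):
    # The pattern described above: send, debit, never check.
    downstream.submit(request_id, amount_cents)
    balances[account] -= amount_cents


def disburse_verified(balances, account, amount_cents, request_id, downstream):
    # Debit only once the downstream system confirms it holds the request.
    downstream.submit(request_id, amount_cents)
    if not downstream.has_request(request_id):
        raise DisbursementError(f"request {request_id} was not received")
    balances[account] -= amount_cents


downstream = FlakyDisbursementSystem(drop_ids={"req-1"})
balances = {"customer": 10_000}
disburse_unverified(balances, "customer", 2_500, "req-1", downstream)
print(balances["customer"], downstream.has_request("req-1"))  # 7500 False
```

In the unverified version the customer is debited even though the request never arrived, which is exactly the "eaten" disbursement. A real fix would also need idempotent retries and periodic reconciliation; the sketch only shows the missing check.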

This is a company that is generally known for getting their tech right, too.
posted by reventlov at 11:23 AM on December 13, 2020 [5 favorites]

I am deeply saddened by the mental distress and trauma suffered by the subpostmasters as the powers that be blamed them for their own fuckups.

.
posted by infini at 11:59 AM on December 13, 2020 [6 favorites]

I would imagine Private Eye's reporting on this might be interesting. They've been (rightfully) stirring the pot on this and related issues for years and years.
posted by aesop at 12:26 PM on December 13, 2020 [1 favorite]

PostOfficeTrial.com is a crowdfunded website run by the journalist Nick Wallis, who did a lot of the investigation in this case; it has a lot of additional details.
Interesting that the Post Office used NDAs to restrict who could give evidence.

The episode has also triggered a wider review into corporate private prosecutions brought by companies who are "victim, investigator and prosecutor"
Such a review may disrupt BBC TV Licensing and the train companies, to name but two, if private prosecution powers were taken away from them.
posted by Lanark at 12:29 PM on December 13, 2020 [4 favorites]

The episode has also triggered a wider review into corporate private prosecutions brought by companies who are "victim, investigator and prosecutor"

I had heard of this scandal a couple of times in passing, and it had always seemed a bit strange to me, because I didn't understand how it had gotten so far. It's not that I think UK prosecutors are impeccable, but the victims didn't seem like the usual groups that get fucked over by them -- not Irish or Muslim people accused of terrorism, or black people accused of random crimes, etc. Mostly small town upstanding pillars of the community, the kind that usually get the benefit of the doubt. So I wondered how hundreds of times no prosecutor had said "you got a new accounting system, and now all these people are reporting problems, and you are trying to charge them with fraud?" and asked questions. This post is the first time I found out it was actually the *post office itself* running the prosecutions and now it makes so much more sense.

It's still pretty horrible that judges apparently let these cooked prosecutions go through hundreds of times without asking anything (since I assume the post office doesn't have its own judges) but less surprising since it would make more work for them, as opposed to independent state prosecutors bouncing cases which would make less work for them.
posted by tavella at 1:04 PM on December 13, 2020 [4 favorites]

@tavella just looking at the six former subpostmasters whose convictions were quashed at Southwark Crown Court. The majority are from ethnic minorities.
posted by Lanark at 1:14 PM on December 13, 2020 [5 favorites]

This field would be extremely different if programmers faced consequences for their mistakes.

GOOD! ABOUT DAMN TIME! FUCK YES!

Bear in mind if we adopted the liability system practiced by engineers, virtually all programmers would be protected by the "industry exemption": when you do engineering while employed by a sufficiently large company, the company accepts any liability incurred for your work.

And while I'm sure everyone here is excited to think about how much higher quality the software we write would be, preventing the NOT SUITABLE FOR ANY PARTICULAR PURPOSE AND YOU TOTALLY CANT SUE US boilerplate clause would also lead to less software and fewer engineers in general, when customers balk at the price we need paid to properly offset liability. Many if not most of us would not be basking in proofs of correctness and TLA+, but talking with the unemployment counselor about career change. After all, there's really no reason a gigantic corporation couldn't just negotiate that clause away, and yet in virtually all cases, they don't.

There's also a question about whether this would actually lead to better results at all. The Blameless Postmortem process posits that punitive post-incident reviews prevent people from sharing. If Bob is fired for missing a transaction bug in QA test passes, you damn well know Alice won't be mentioning she didn't look for it in canary, or that Dana was struggling to get their own code to pass tests because of the same bug, but couldn't figure out why.
posted by pwnguin at 1:14 PM on December 13, 2020 [6 favorites]

Interesting that the liability persisted after the Post Office privatization that started in 2011. A village postmaster: they're basically the banker, bringer of news and trusted administrator in tiny villages. To be railroaded into pleading guilty because "computer says no" (with Fujitsu knowing all along that their system was shit) is the worst.

For the postmasters who took their own lives after this scandal destroyed theirs:
.
posted by scruss at 1:24 PM on December 13, 2020 [2 favorites]

Aww fuck, and I didn't realise that by Fujitsu, they really meant the remains of ICL. I have family ties to ICL, and this just pisses me off.
posted by scruss at 1:28 PM on December 13, 2020 [2 favorites]

"As a software industry retiree, it frankly frightens me how willing non-programmers apparently remain to believe that commercial-grade IT systems are anywhere even close to defect-free until proven otherwise."

Amen to that. As someone who had a bit-part in several UK Government IT projects as a consultant since the mid 90s, this doesn't surprise me at all.

A lot of these companies aren't so much "IT specialists" as "landing big Government contracts and still walking off with bags of cash even when it crashes specialists"

Possibly my favo(u)rite, with apologies for the non-technical folks, was the one where, halfway through the contract, the Govt side said they wanted to change from Windows/ASP.NET/SQL Server to Linux/J2EE/Oracle, they'd been told that should be OK and could we still deliver on time? Hilarity, as they say, ensued.

And the Post Office... yes, there was one for the PO as well. I still shudder when I think of it, and am very glad that it never actually got into production.
posted by 43rdAnd9th at 1:55 PM on December 13, 2020 [8 favorites]

I don't want to say this but isn't the UK govmint the same whose contact tracing app ran on Excel? and cost a billion pounds?
posted by infini at 1:58 PM on December 13, 2020 [2 favorites]

halfway through the contract, the Govt side said they wanted to change from Windows/ASP.NET/SQL Server to Linux/J2EE/Oracle

I just broke out in a cold sweat. In my soul.
posted by Mr. Bad Example at 2:16 PM on December 13, 2020 [9 favorites]

This field would be extremely different if programmers faced consequences for their ~~mistakes~~ hubris.

FTFY.

There is nothing wrong with making mistakes if you're there to correct them. There's plenty wrong with claiming you can't make mistakes and someone else must be acting maliciously.

similar to the effects of the recent RoboDebt clusterfuck in Australia.

It's not that similar. I got a robodebt letter, and after one phone call I realised it had taken data from one department that doesn't care when during the year you'd earnt your income, and imported it into a database that wanted to know exactly when you'd earnt your money. Instead of flagging it as unknown, it just averaged it out over the whole year, EVEN IF YOU'D ALREADY PROVIDED THEM WITH A TERMINATION LETTER THAT GAVE THEM THE EXACT DATE RANGE TO AVERAGE OVER. Robodebt is not the act of programmer hubris, it's the act of a sociopathic government that thinks it can make jobs appear by punishing the unemployed. A government rarely questioned when it uses the words unemployed and bludger interchangeably. A government that thinks "mutual obligation" is slave labour, when the mutual obligation is that you pay them when you're working and they pay you when you're not.

Sorry, I just really hate the government I've been stuck with for, what is it now, 1000 years? 10000? Some of their ideas are certainly from the stone age...
posted by krisjohn at 2:18 PM on December 13, 2020 [12 favorites]

Without going into too much detail, I am aware of a bug (now long fixed) at a payment processing company that, for at least two years, just... ate the occasional disbursement

I spent a decade doing property and casualty IT for an insurance broker. The owner really gave a shit about cash handling because (1) he trained as an attorney and (2) he was well aware that without solid IT and auditing, the people in the retail locations would steal him blind, and leave him on the hook for things like "Insurance we promised to write, but didn't because a branch employee stole the premium down payment and never bound the policy."

It was nice.
posted by mikelieman at 2:31 PM on December 13, 2020 [2 favorites]

Paid back double any fine. If they went to jail, £1000 per day for jail minimum for these victims of the state.

Every fucking shareholder of royal mail loses minimum 1% of their shares to make the victims of THEIR company whole. Hey, it's capitalism, risk-takers and job makers, right?
posted by lalochezia at 2:39 PM on December 13, 2020 [3 favorites]

I also don't understand why there's not even any mention of prosecution. They lied to the courts, resulting in unjust convictions, imprisonment, and even suicides. Where are the criminal charges?
posted by tavella at 3:04 PM on December 13, 2020 [6 favorites]

This field would be extremely different if programmers faced consequences for their mistakes.

I've very rarely found that mistakes in software come down to a single programmer[0]. A line-level programmer is, at most, the R on a RACI chart for an individual feature. There is typically an entire infrastructure around and above the programmers that should have processes in place to catch errors like this. Even then, it's still really hard to do. But sure, if we want to hold Software Engineers to the same standards as Mechanical Engineers, let's put the system in place and everyone can pay 10x for the development and wait 10x the time.

[0] - I was hired into a small-ish company as the VPE because they had just made a $1mm mistake in a platform they launched. My first task was to find out why it happened. The "what" was already known: a mistake in a single database query. Tech had focused on what happened, management was focused on who caused the problem. The who was easy. We had commit logs. But I asked to review the project schedule (too short), the change requests (too many), the actual hours the development team was working (way too many, the last 1.5 months averaged 80-100 hours/week for developers). Tempers were high, money had been lost, heads had to roll.

The reality was the failure went all the way up the chain to the CEO. And I told him that. They overpromised, they didn't control the change requests, and they overworked the development team. The particular bug that caused the issue was from an untested SQL query that was a client change request a few days before launch. Tech pushed back, product mediated, sales said it was a mandatory change. Sales won. The developer who made the change had worked over 100 hours that week (and the ones previous), literally sleeping under his desk. So who's liable in that situation?
posted by ryoshu at 3:28 PM on December 13, 2020 [26 favorites]

Robodebt is not the act of programmer hubris, it's the act of a sociopathic government that thinks it can make jobs appear by punishing the unemployed. A government rarely questioned when it uses the words unemployed and bludger interchangeably. A government that thinks "mutual obligation" is slave labour, when the mutual obligation is that you pay them when you're working and they pay you when you're not.

Quite so. And again, as an ex-programmer I can personally attest that I have worked with very few working programmers who exhibit anything even vaguely resembling hubris, but a hell of a lot of empty suits in positions of power for whom it is their defining personality characteristic.

The parallels I see between the RoboDebt and Post Office clusterfucks are the apparently unexamined assumptions made by people in charge of policy decisions that (a) a software system both could and should replace considered human judgement and (b) this particular software system, one presumably obtained from the lowest bidder, would of course do all the things they would wish it to do and none of the things they would wish it not to, regardless of how fuzzy, vague and frequently self-contradictory those wishes would inevitably turn out to be when examined in enough detail to translate them into software.

In my experience, non-programmers tend to treat software as if it were literally a kind of magic, rather than a product of human effort whose quality depends sensitively on the soundness with which that effort is organized.

Calls to make programmers "answerable" for the quality of their work by imposing harsh penalties on those whose defective work turns out to be consequential are, I think, misguided. Such calls tend to come from people who haven't had much exposure to the processes by which commercial software is actually created, and who don't fully understand just how many defects are due to competent people having been forced into the position of making a good-faith effort to interpret vague and/or contradictory requirements according to the principle of least surprise.

I'm not attempting to say that many defects don't arise from rank programmer incompetence, because of course many do; but the way to deal with that is to build enough review into the development process so that this kind of defect can be caught and fixed before it makes it all the way to release, and the programmers responsible either trained up or let go. But that kind of quality assurance costs money and time and is often the first thing to be slashed off a project plan in order to make a tender look competitive.

Plus everything ryoshu said.
posted by flabdablet at 7:24 PM on December 13, 2020 [11 favorites]

On the Robodebt fiasco - it was even worse than you actually realise.

There were records of weekly/fortnightly payments historically reported to and held by the social welfare authority. Rather than aggregating the weekly/fortnightly amounts to see if those matched the annual total reported to the taxation authority, ie. the LOGICAL way to audit - the annual total was DISAGGREGATED and averaged against the weekly reported amounts. Unsurprisingly this generated huge numbers of discrepancies.
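The difference between the two audit methods is easy to see with made-up numbers. The Python sketch below assumes 26 fortnights in a year and a hypothetical person who earned $1,000 a fortnight for half the year and nothing for the rest; the function names and figures are illustrative only and are not drawn from the real system.

```python
def flagged_by_averaging(annual_total, reported_fortnights):
    """Spread the annual total evenly and flag every fortnight that differs.

    Comparing r * periods with the annual total is the same test as
    comparing r with annual_total / periods, but avoids float division.
    """
    periods = len(reported_fortnights)
    return [
        index
        for index, reported in enumerate(reported_fortnights)
        if reported * periods != annual_total
    ]


def flagged_by_aggregation(annual_total, reported_fortnights):
    """Sum the fortnightly reports and compare once against the annual total."""
    return sum(reported_fortnights) != annual_total


# Hypothetical figures: 13 working fortnights at $1,000, then 13 at $0.
reported = [1000] * 13 + [0] * 13
annual_total = 13_000

print(len(flagged_by_averaging(annual_total, reported)))  # 26
print(flagged_by_aggregation(annual_total, reported))     # False
```

With these numbers the averaged figure is $500 a fortnight, so every one of the 26 correctly reported fortnights looks wrong, while summing the reports shows the records agree exactly with the annual total.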

The previous weekly/fortnightly records were still there - so when my children provided the fortnightly breakdowns, they were advised that they had historically reported different numbers. When at my prompting, they pointed out that the correspondence said there were NO records to compare against, the various officials conceded that the information did in fact exist.

So after four half hour conversations, three or four emails and a couple of bureaucrats spending multiple hours on the phone, one of my four children owed about $100.
posted by Barbara Spitzer at 7:53 PM on December 13, 2020 [1 favorite]

>This field would be extremely different if programmers faced consequences for their mistakes.
Technically true, and a satisfying fantasy, but that would not have saved the subpostmasters.
Programmers are too far removed from the end user for them to have mattered in this case. If management hands out master keys to their vault to every employee and customer, you can't hold the vault maker responsible for the inevitable robbery. If a shady sandwich shop lies to you and gives you a peanut butter sandwich while claiming it's almond butter, you can't hold the peanut farmer or the peanut butter maker responsible.
posted by mrgoldenbrown at 8:15 PM on December 13, 2020 [1 favorite]

With apologies to actual bridge building engineers, I believe the likely problems are pretty well known by now, so the biggest issue is people screwing up the design or the construction (lower spec materials etc). So it makes sense to make sure only highly qualified people do it, and have hefty liability penalties. Though sometimes it's down to unexpected issues, à la the Millennium Bridge, these are rare.

If you were to build a bridge like software:
we'll tell you it's granite on both banks when it's actually clay on one and mud on the other.
we expect the bridge to be moveable because we haven't decided where the ends go yet. We'll tell you just before we open.
so we won't tell you how wide the river actually is until then.
we'll tell you we want a footbridge, but actually expect it to handle motorway traffic.
on opening day it will get 10,000 trucks expecting to cross, then never again.
we won't tell you it needs to allow boats to pass, we expect you to discover that on your own. What height boats? All of them.
we're only going to give you half the time to build it that it needs, because that other guy says he can do it even faster.
the material qualities used for construction will be unknown.
so we expect you to use the cheapest ones.
and the cheapest staff to build it.
and they don't get overtime pay.
and you won't have time or money to test the construction materials.
we won't tell you how we want it to look, but we'll just tell you we want it looking 'better'. Repeatedly.
we're going to ignore every warning you make.
we reserve the right to mandate a change in the materials you use halfway through, but expect the same price and delivery date.
we're not going to do any maintenance, but it's your fault if it fails at any point in the future.

Is it any surprise that 'not warranted fit for any purpose' gets added to the contract?

Making software professionals qualified engineers held liable for their work, à la mech and electrical engineers, makes more sense when the far bigger issues with project specification and project management are dealt with, because every example I've ever encountered in my life has been the wild west.
posted by Absolutely No You-Know-What at 10:07 PM on December 13, 2020 [13 favorites]

The clarotesting series Lanark links to, and the postofficetrial.com site, are amazing and horrifying reading. It sure seems some elements of Fujitsu were irresponsible -- and two are under criminal investigation -- but the multiple Post Office failures and malfeasances here look completely reprehensible. They knew that the software had bugs (software has bugs) and they did not put process mitigations in place around them, no, they left subpostmasters holding the bag, extracted money from them, and prosecuted them criminally. I can't say I expect justice, but c'mon, 2021 owes us something after 2020, right?

clarotesting post 2

It’s not just a question of users holding a superuser privilege all the time, bad though that is. It reveals a lot about the organisation and its systems if staff have to jump in and change live data routinely. An IT shop that can’t control superusers effectively probably doesn’t control much. It’s basic.
[...]
However, the Post Office’s internal auditors don’t have the excuse of incompetence. The problem was flagged up by the external auditors Ernst & Young in 2011.

postofficetrial.com at "The morning after the night before":

For some reason, during the forex transaction, the computer chose not to write a line of code into the Horizon message store creating a surplus of $1000 (around £484 at the time).

A fix was proposed - go into the Subpostmaster's computer and manually enter the line of code [they mean a data record] which was missing. This was done over a period of 10 minutes at 5pm on the 5 Dec 2007, and as Fujitsu engineer Andy Keil adds to the PEAK: "Worth noting that the branch did not have any issues with the mismatched transactions because this was fixed before they did the roll. The branch is not aware of this and it's best that the branch is not advised."
[...]
The fix goes in. But, instead of writing the proposed:

"[Quantity]:-1, SaleValue:-484, PQty:-1,000 with, other attributes (including exchange rate) as before."

... Mr Kiel manages to update the POLFS feed for the branch with a sale value of 1,014.73 and PQty of 2,080. This has the effect of just over $2,000 being inserted in the Post Office system, generating a $1000 loss at the branch.

An update to the PEAK by Fujitsu's Anne Chambers is recorded on 14 Dec 2007:
"The counter problem which caused the first issue has been corrected by inserting a message into the messagestore, for equal but opposite values/quantities, as agreed with POL... Once the problem was corrected, there should have been no impact on the branch. However, it has been noted that the stock unit BDC had a loss of $1,000, which was generated after the correction was made."
Don't worry, says Ms Chambers: "We have already notified Gary Blackburn at POL ... this appears to be a genuine loss at the branch, not a consequence of the problem or correction."

Fujitsu failed to spot their own engineer's mistake.

tl;dr they were fudging an accounting log remotely, in agreement with the Post Office (POL), without telling the subpostmaster who was financially liable.
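To make the slip concrete, here is a minimal sketch of the kind of check that compares the correction as agreed with the correction as entered. The numbers come from the quoted PEAK text; the field names, the function and the sign of the applied values are my assumptions, and none of this is Horizon code.

```javascript
// Hypothetical sketch: field names mirror the quoted PEAK text, nothing here is Horizon code.
// The fix as proposed in the PEAK: reverse the $1,000 surplus.
const proposed = { saleValue: -484, pQty: -1000 };

// The values the article says actually reached the POLFS feed.
// The article gives no signs for them; positive values are an assumption.
const applied = { saleValue: 1014.73, pQty: 2080 };

// Return the names of the fields where the applied record departs from the proposal.
function mismatchedFields(proposedRecord, appliedRecord) {
  return Object.keys(proposedRecord).filter(
    (field) => proposedRecord[field] !== appliedRecord[field]
  );
}

const problems = mismatchedFields(proposed, applied);
if (problems.length > 0) {
  console.log(`Correction differs from the agreed fix in: ${problems.join(", ")}`);
}
```

A comparison this simple only helps if someone other than the person typing the fix gets to see the result, which is the organisational point.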

And Anne Chambers is now under criminal investigation, but I bet that if you look around her, you're not likely to find she's a rogue actor within an organization devoted to integrity over profit. I bet you'll find that they promoted her and others for just this kind of dodgy behavior -- showing results -- as long as they got away with it.
posted by away for regrooving at 11:07 PM on December 13, 2020 [8 favorites]

In that story ^ does it make sense to pin blame on the programmer who introduced the code triggering the "For some reason [...] the computer chose not to" behavior? (Or since most bugs occur at boundaries, the other programmer who wrote the function they called in a way it wasn't expecting? Or the tech writer (lol nobody pays for tech writers) who didn't fully document the API's odd expectations?) Or the test writer who didn't catch it? Or the test plan designer who didn't cover the integration that would have been better suited to catch it? Maybe the director who didn't fund a stronger test plan or who set the schedule?

It's always an organizational failure. The corporation is to blame, the corporation should be liable to levels where they feel pain, and if they don't like that, they can create a safe environment to run blameless postmortems so it doesn't keep happening.

When you see a corporation blaming a failure on "human error" or "bad actors", that's a corporation's way of dodging responsibility.
posted by away for regrooving at 11:20 PM on December 13, 2020 [7 favorites]

does it make sense to pin blame on the programmer who introduced the code triggering the "For some reason [...] the computer chose not to" behavior?

With the complexity of today's underlying software stacks: no, absolutely not. In 2020 it is literally impossible for most programmers to know how most of the libraries they're relying on actually work in sufficient detail to anticipate all of the possible abstraction leakages.

Abstraction is an absolutely fundamental tool for reasoning about how software works, but in 2020 the typical large commercial project requires so many layers of it as to make reasoning about the various ways in which software can fail require orders of magnitude more time than is typically allocated. Designing and implementing software can be done using nothing more than strong familiarity with the specs and interfaces of the applicable libraries; debugging it requires coming to grips with what the failing software actually did, which frequently requires huge amounts of expensive spelunking through library source code.

To extend the (faulty, inadequate) bridge engineering metaphor a little: software designers frequently find themselves in the position of an engineer who makes the reasonable assumption that five inch stainless steel cable will not turn into cottage cheese, only to find out that in fact it does do exactly that on the third Wednesday of every July following a leap year with a blue moon in November.

Administrators who, given the choice between suspecting human malfeasance and software failure, are inclined to suspect the people first - especially when the software concerned is a large bespoke system - are totally doing it wrong.
posted by flabdablet at 12:35 AM on December 14, 2020 [5 favorites]

Flabdablet: the website you linked provides a good example of how it might be impossible to assign blame. See the page on comments, where he describes a hack he came across that makes a "deep copy" of a variable, rather than risk modifying it when passing it by reference:

const data = JSON.decode(JSON.encode(response))

Now, this hack works because the compiler is dumb and doesn't realise that the function is just translating the variable back and forth. So let's say the compiler is improved so it knows that the encode-decode operation is superfluous, and it optimises it out of existence. Suddenly our deep copy of a variable becomes a pass-by-reference and any changes to it affect the variable in all contexts. All sorts of subtle and contingent bugs may be introduced. Who's responsible? The person who improved the compiler? The person who authorised the use of the updated compiler? The author of the hack, the person who relied on a variable being a deep copy, or the person who didn't consider whether a variable's state might be changed out of context? I don't know whether it's practical to blame anybody.
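For anyone who hasn't been bitten by this, here is a minimal sketch of the difference the hack is papering over. The names are made up for illustration, and it uses the standard JSON.stringify / JSON.parse spelling rather than the encode/decode names in the quoted line.

```javascript
// Hypothetical helper that mutates whatever it is handed.
function markProcessed(record) {
  record.status = "processed";
  return record;
}

const response = { status: "new", items: [{ id: 1 }] };

// Objects are passed by reference: the caller's object changes too.
const alias = response;
markProcessed(alias);
console.log(response.status); // "processed"

// Round-tripping through JSON text builds a brand new object graph,
// so later mutations no longer reach the original.
const original = { status: "new", items: [{ id: 1 }] };
const copy = JSON.parse(JSON.stringify(original));
markProcessed(copy);
copy.items[0].id = 99;
console.log(original.status, original.items[0].id); // prints: new 1
```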
posted by Joe in Australia at 2:24 AM on December 14, 2020 [1 favorite]

Who's responsible? The person who improved the compiler?

If the compiler was "improved" in such a way as to optimize a JSON.decode(JSON.encode()) construction away entirely, as opposed to merely having it skip the complementary stringify and parse steps so that all it does is make the implied deep copy, then the person who drove that "improvement" would indeed benefit from a short but vigorous application of the clue-by-four.

This is, of course, assuming that the libraries chosen for the project don't actually provide any tidily designed way to perform a deep copy explicitly. If they do, the encode/decode hack should be replaced on sight, mainly because it will probably fail in at least some of the edge cases that a more carefully considered deep copy library function will have been built to handle properly.
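A rough sketch of what I mean by edge cases, using the standard JSON.stringify / JSON.parse names. The deepCopy function is a hypothetical stand-in for a proper library routine and only handles plain objects, arrays and Dates.

```javascript
// A sample value with members that do not survive a trip through JSON text.
const source = { when: new Date(0), missing: undefined, nested: { n: 1 } };

const viaJson = JSON.parse(JSON.stringify(source));
console.log(viaJson.when instanceof Date); // false: the Date came back as a string
console.log("missing" in viaJson);         // false: undefined properties are dropped

// Hypothetical explicit deep copy covering plain objects, arrays and Dates only.
function deepCopy(value) {
  if (value instanceof Date) return new Date(value.getTime());
  if (Array.isArray(value)) return value.map((item) => deepCopy(item));
  if (value !== null && typeof value === "object") {
    const out = {};
    for (const key of Object.keys(value)) out[key] = deepCopy(value[key]);
    return out;
  }
  return value; // primitives are copied by value already
}

const explicit = deepCopy(source);
console.log(explicit.when instanceof Date);     // true
console.log(explicit.nested !== source.nested); // true: a separate object
```

Recent browsers and Node versions also ship a built-in structuredClone() that covers more types than this sketch does, so in practice that would be the first thing to reach for.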
posted by flabdablet at 3:39 AM on December 14, 2020 [1 favorite]

I wrote the Clarotesting series that's been linked to above. This case hit some of my buttons, and if I'd been at the Post Office, operating in hard-nosed, corporate suit, real bastard mode, like I was as an IT auditor in a top class audit team, some deserving people would have had a very unhappy time. We were tasked by the group chief auditor to go after bullshitters, chancers and the dishonest. One of the joys of working in that environment is that you are paid to confront even senior managers who are out of line. Our organisational independence meant we were bullet-proof. If any manager wanted to go over our heads they had to go up to board level, to the sort of people who couldn't care less about a manager whingeing that he (invariably a he) had been disrespected by some bastard from audit.
This all meant we would sometimes provide protection to the poor troops on the front line who were getting it in the neck when we could see they'd made a heroic effort to get a system in despite a lack of support from users and senior management.
I once received a mild rebuke from the group chief auditor after I exposed a lying manager at a meeting. I produced the evidence to show he was lying, and left him dangling in the wind. I was told I'd let him off too easily - I should have totally humiliated the guy. Lying to the auditors is a huge no-no! But not nearly as big as lying in court, and it seems pretty clear that's what PO & Fujitsu executives did. I hope they get what they deserve.
posted by Claro at 8:38 AM on December 14, 2020 [17 favorites]

I wrote the Clarotesting series that's been linked to above.

Thank you! And welcome to MeFi.

I particularly enjoyed your account of the way your GCA dealt with the puffer-fish GM:

“Abdication of management responsibilities” was the nuclear phrase in our audit department. It was only to be used by the Group Chief Auditor. He put it in the management summary of one of my reports, referring to the UK General Manager. The explosion was impressive. It was the best example of audit independence I’ve seen. The General Manager stormed into the audit department and started aggressively haranguing the Chief Auditor, who listened calmly then asked. “Have you finished? Ok. The report will not be changed. Goodbye”. I was in awe. You can’t intimidate good auditors. They tend to be strong willed.

That's gold.
posted by flabdablet at 9:42 AM on December 14, 2020 [12 favorites]

Hi Claro!

I’ve been thinking about the similarity between organizational structure and software structure. At first it seemed so hopelessly thick and unfixable, but - if the workers figure out who needs to know what to work reliably, does that change the actual shape of the organization?

I suppose I’m also thinking of political life, really.
posted by clew at 10:43 AM on December 14, 2020 [1 favorite]

"I’ve been thinking about the similarity between organizational structure and software structure. At first it seemed so hopelessly thick and unfixable, but - if the workers figure out who needs to know what to work reliably, does that change the actual shape of the organization?"

Wow, there's a PhD in that one. Quick answer: yes, the organisation's structure influences the software structure. That's Conway's Law. It's a biggie in my experience as an auditor and tester. The screw-ups tend to be concentrated where different parts of the organisation bump into each other. I wrote about that here, in "An abdication of managerial responsibility". I see I referred to the same incident in both this blog and the 3rd in the Post Office series, referred to above. It made a huge impact on me, sitting outside the office listening to that exchange.

Yes, I think the software can and should change the organisation - when it is feasible. I suspect it's probably more likely to happen in a start-up. Reshaping a big corporation to reflect its systems is a massive challenge. I have a strong dislike of corporate bureaucracy, but I have to admit it is a necessary evil. Even more than the bureaucracy I hate half-arsed change programmes that stir teams around, baffling and demotivating people, and distracting everyone from the real work. It's extremely difficult to do it well, and I'm not convinced that all the consultancies offering the service are up to the job.
posted by Claro at 11:59 AM on December 14, 2020 [5 favorites]

@DetriusXii "Programmers are not in charge of software development shops. There's an army of useless PMs and BAs that act as little more than salesmen and control the client interaction. And they've been entering software development without any education or training a software developer would receive."

I agree up to a point, but having worked as both a PM & BA, albeit with a good tech grounding first, I'd pin most of the blame on the real sales team, the glib patter merchants. It would really, really piss me off when I took over a project and found the sales team had closed the deal by overpromising or undercosting.

The most ludicrous example is easily comprehensible to lay people because it wasn't a tech issue. The client was based in Nottingham. The sales team chose to assume that one highly specialised sub-project, ie mine, requiring two people for 15 months, would be staffed by local experts based in the English Midlands, so the costing required no travel expenses. A quick phone call would have established that the only people who would be available in the timescales for which the sales team had committed the company were based in Scotland. So when I took over I was told "sorry - there's no budget for travel or hotel accommodation". After much moaning I was given a laughable £1,200 to cover two people for the project. We did almost all of the work from home despite the client's complaints.
posted by Claro at 12:12 PM on December 14, 2020 [2 favorites]

If anyone is interested in reading more about the legal issues in England surrounding computer evidence here is a very serious, long read, "The Post Office Horizon IT scandal and the presumption of the dependability of computer evidence", that I wrote.

I was asked to write the article for the Digital Evidence & Electronic Signature Law Review after I wrote my blog series on the Horizon case. The first half of the article is based on the first two blog posts. The second half is a discussion, or rather an attack on, the presumption in England that computer evidence should be relied on, unless the other party in the dispute can point to some reason to doubt it. In practice it is usually impossible for that party to gather enough information to mount a challenge, so seriously flawed evidence from dodgy systems is treated as the gospel truth in court.

This is a very controversial problem and I have been involved in some further work trying to reform the English law. It is an uphill battle because, at present, the convenience of the courts has a higher priority than justice. Too many people want to believe what is unjustifiable, that computers can always be relied on. It's only possible to hold onto that belief if one maintains a state of complete ignorance of what is going on in complex systems.
posted by Claro at 12:25 PM on December 14, 2020 [12 favorites]

Reshaping a big corporation to reflect its systems is a massive challenge. I have a strong dislike of corporate bureaucracy, but I have to admit it is a necessary evil.

It's actually not that hard to reorg -- I've worked for very big companies that do so seemingly any time a manager resigns (so about once a quarter). What happens is that quickly the software no longer reflects the org chart and bugs in component A have no clear owner, since the team that wrote it has been disbanded, half the team no longer works there and other half is busy delivering on unrelated component B.

Dealing with that bureaucracy is what wins you the Archibald Tuttle awards.
posted by pwnguin at 2:36 PM on December 14, 2020 [1 favorite]

I'm not convinced that all the consultancies offering the service are up to the job

Wait, you've encountered one that is?
posted by flabdablet at 8:58 PM on December 14, 2020

I'd pin most of the blame on the real sales team, the glib patter merchants. It would really, really piss me off when I took over a project and found the sales team had closed the deal by overpromising or undercosting.

Are you me?

Fucking marketroids who think that selling software is the same as selling washing powder, and proceed to do so without a clue about the difference between an easy tweak and something damn near impossible without total re-architecting, are the bane of the developer's existence.
posted by flabdablet at 9:02 PM on December 14, 2020 [1 favorite]

Too many people want to believe what is unjustifiable, that computers can always be relied on. It's only possible to hold onto that belief if one maintains a state of complete ignorance of what is going on in complex systems.

O preach it, brother, preach it.
posted by flabdablet at 9:05 PM on December 14, 2020 [3 favorites]

@ryoshu "I've very rarely found that mistakes in software come down to a single programmer"

Yes. Outsiders assume the code is basically the whole thing. It's obviously massively important and absolutely vital, but there's much more to it than that. There's all the rest of the infrastructure required to make code useful. Perfectly written code can contribute to disaster if something unexpected changes. The code was perfect only according to the expectations of the designers and programmers for a specific context at a specific time.

And then there's abstraction...

@flabdablet "Abstraction is an absolutely fundamental tool for reasoning about how software works, but in 2020 the typical large commercial project requires so many layers of it as to make reasoning about the various ways in which software can fail require orders of magnitude more time than is typically allocated. Designing and implementing software can be done using nothing more than strong familiarity with the specs and interfaces of the applicable libraries; debugging it requires coming to grips with what the failing software actually did, which frequently requires huge amounts of expensive spelunking through library source code."

Without layers of abstraction you'll never build anything commercially useful. With these layers you'll never build anything that is comprehensible. The world sucks, but that's just the way it is. It only really frightens me when people try to pretend it ain't so and then build systems on that deluded basis, rather than acknowledging the fearsome complexity of modern software. You need resilience, and that means redundancy, deliberate inefficiencies, fallbacks, margins for error, all the boring stuff that costs money, but means that fewer people get killed.

"Too many people want to believe what is unjustifiable, that computers can always be relied on. It's only possible to hold onto that belief if one maintains a state of complete ignorance of what is going on in complex systems."

I phrased this badly above. You can never know what is really going on in complex systems, but what you can know is that this intractable incomprehensibility is real. We can only offer meaningful opinions about software if we can own up to the vast extent of our ignorance. This is where the Dunning-Kruger effect bites hard. The field is wide open for glib chancers pretending they do understand what is going on, while the real experts are much more cautious about what they know. Who is most likely to get the ear of clueless senior executives? That's why you need people like good IT auditors who take a delight in chopping the chancers off at the knees.

@pwnguin "It's actually not that hard to reorg -- I've worked for very big companies that do so seemingly any time a manager resigns (so about once a quarter). What happens is that quickly the software no longer reflects the org chart and bugs in component A have no clear owner, since the team that wrote it has been disbanded, half the team no longer works there and other half is busy delivering on unrelated component B.

Dealing with that bureaucracy is what wins you the Archibald Tuttle awards."

It's amazing how many senior managers think the hard part of a reorganisation is drawing up the org chart, and all the rest is just detail for the little people to worry about and work out.

@flabdablet "Wait, you've encountered one that is?"

Well, no, I've never seen one that was more or less competent! But logically that doesn't mean such consultancies can't exist. It's just my experience. I was covering myself. ;-)
posted by Claro at 4:46 AM on December 15, 2020 [5 favorites]

@Claro I was quite struck, inspired to be honest, by your description of the organizational position of internal audit. Reporting past the management chain directly to the board, because they might be expected to be concerned about normalized uncontrolled risk? There's a particular type of organizational maturity here that's utterly alien to me, looking at Big Tech companies.

Tech companies have security, privacy, data protection, also compliance of all kinds. Often top-notch. But not organizationally empowered like that. Everything can be overridden as a "business decision", where that means one VP wants to deliver and make SVP, and gets to externalize harm onto users or risk onto the whole of the company.
posted by away for regrooving at 11:18 PM on December 15, 2020

Sales overselling is also an organizational failure, I would say, though a tough one to crack. The failure is that the salesperson who cut the overselling deal gets paid, eng is left on the hook to back them up, and the customer gets more or less burned too -- depending on how much of sales' extraction of value is leeched from eng versus customer. This failure is allowed in part because upper management would look like chumps to hand back money on a just-signed deal, also because reality is slow while the salesperson flits on.

I worked adjacent to a reputedly good sales guy. He hated this "bonus on closing" pay system, because it devalued his particular skills as well as harming everybody but the bonus extractors. He was good at understanding the customer's real needs behind what they said, hearing what engineering said even though he wasn't technical, and occasionally helping customers "manage up" as to why what they were buying was ~effectively~ the checkbox feature the CTO wanted that didn't make sense. (Not a move to count on though.)

He thought it was obvious that sales should get paid for a sale after the customer has taken acceptance of what sales sold. He also stuck around a long time himself, which is not the sales way. He may have been weird.
posted by away for regrooving at 11:40 PM on December 15, 2020 [3 favorites]

(You know, it just struck me, as I was wondering if it was significant that that sales guy was An Old -- I was young at the time but maybe he was in his fifties -- it struck me that I literally cannot think of another salesperson I know who is over, I don't know, approximately 34 years old.

what happens to old salespeople?
do they evolve into some new form?
or are they unaging entities?

)
posted by away for regrooving at 12:49 AM on December 16, 2020 [1 favorite]

@away for regrooving - I don't want to oversell the whole of internal audit. Good internal audit teams are very good. The bad ones are dreadful. I was extremely fortunate to be trained in an excellent team that was doing cutting edge, risk based auditing before it was common. They only recruited people with significant tech experience for IT audit. The theory was that tech people with the right personal qualities could pick up audit skills fairly quickly. It meant they only went for high calibre people, so it was a badge of honour to get in. Crucially, internal audit was seen as a stepping stone to more senior roles. So you got a virtuous circle. Audit could attract the best people, which meant it could do high quality, high profile work, with auditors then being poached for good roles elsewhere in the company. So that made it easy to attract the next cohort of wannabe auditors. I also worked for a big tech corp that had the opposite approach, and was stuck in a vicious circle; low quality auditors, poor work, lack of respect, bad reputation, with the inevitable result that they couldn't get good people, so they produced lousy work. I made myself very unpopular by accusing Internal Audit of being dysfunctional and unprofessional, based on my own professional experience. That was the polite version I put in print. In conversation I just called them incompetent fucking wasters who were screwing the company up.

At first, when I joined audit, I feared I'd make so many enemies it would screw up my career. Then I realised that doing the job fearlessly would only rule me out of working for the sort of people I wouldn't dream of working for. It was a great way of seeing the bigger picture. Most people, certainly in IT, are so busy with the difficult detail of their jobs that they find it hard to see that bigger picture. You need people who understand how things fit together, how simple changes and trivial damage in one place might have serious consequences elsewhere or further down the line, how damage can cascade through systems. That was our job in audit. In the end I left audit only because I was approached to take on an IT development role that was too good to turn down. I got that job because it required someone with a grasp of how the company was using IT strategically and excellent contacts in the business, both of which I'd picked up in audit.

As for sales, I never met a sales person. Therein lies the problem. They were always long gone by the time I arrived and if they ever appeared on site while I was there they kept their distance. The idea that sales people could pick up bonuses based on closing sales was always crazy, and we certainly argued the point. This was at the same corporation as the incompetent fucking wasters of an audit department. Sales bonuses were exactly the sort of problem that internal audit should have looked at, instead of working their way down compliance checklists that required little thought or understanding.
posted by Claro at 4:43 AM on December 16, 2020 [2 favorites]
