maybe we should throw an exception here??
October 27, 2017 5:31 AM   Subscribe

ProPublica sought and got access to the source code for the Forensic Statistical Tool, or FST, that NYC's Office of the Chief Medical Examiner created and used till early 2017 to analyze complex DNA evidence in "about 1,350 cases over about 5 1/2 years." Now that a judge has unsealed the codebase, ProPublica's put it on GitHub, and "two newly unredacted defense expert affidavits are also available." From Exhibit C, October 2016: "I have seen no code indicating that any test code has been written for, or automated software testing has been performed on, FST."

Also from Exhibit C:
Numerous references to the popular coding website Stack Overflow are observed throughout the code, including the sole comment describing the routine “ByteArrayHelper” in the file “IndividualReportPrinter.cs”. Without any elaboration as to the underlying function of the routine, the comment simply states, “/// This helper class is used to find patterns in byte arrays. Found on stack overflow. Thank you, Internet!”

....

More subjective code smells are prevalent in FST, such as ... long variable names such as the 131-character-long “compareMethodIDSerializationBackupDoNotMessWithThisVariableBecauseItsAnUglyHackAroundACrappyLimitiationOfTheFCLsCrappyXMLSerializer” seen in the file “ComparisonData.cs”.

Title drawn from line 157 of Comparison.cs.
posted by brainwane (89 comments total) 26 users marked this as a favorite
 
FST.Web/Reports/FSTResultReport.rdlc
FST.Web/Reports/FSTResultReport - Backup.rdlc
FST.Web/Reports/Copy of FSTResultReport.rdlc


Guys. There's this thing I saw one time. I think it rhymed with “shmersion shmontrol”.
posted by letourneau at 5:42 AM on October 27, 2017 [20 favorites]


I am not a C# programmer but the code looks like a lot of code I have seen over the years in Java. I have probably even written similar code.

And that is a big problem!

I write enterprise monitoring software, not human rated software. Nobody goes to jail if I my code makes a mistake.

This industry average code could get someone thrown in jail. If the code is not up to medical or aircraft software standards, it should not be used for legal processes.
posted by KaizenSoze at 5:44 AM on October 27, 2017 [55 favorites]


code smells

stop saying this, people
posted by thelonius at 5:44 AM on October 27, 2017 [19 favorites]


It's almost as if software being used to provide data that is presented as evidence in judicial proceedings should be required to be open source and highly vetted for accuracy.
posted by tocts at 5:45 AM on October 27, 2017 [32 favorites]


The phrase "code smells" should never have caught on. That said, that variable name is hilarious.
posted by demiurge at 5:47 AM on October 27, 2017 [10 favorites]


I'm a software engineer, but haven't written C Sharp.

The UglyHack is definitely an UglyHack, but it's very well-documented. Lots of comments throughout the code. The question about "maybe we should throw an exception here?" is just wondering whether to actually throw an exception, or return a null value, which is what the programmer does instead. Both patterns are fine, really.

I don't see anything particularly alarming here, but yes, some unit tests would be reassuring.

Open sourcing any code used in court sounds like an absolute no-brainer to me.
posted by phenylphenol at 5:50 AM on October 27, 2017 [15 favorites]


Assuming the code from SO is fit for purpose, I don't consider it a bad thing that someone chose to both use it and sort-of credit the author. If this software weren't being used for such a critical task, I wouldn't even be too concerned about a lack of unit testing.

Unfortunately, it is critical software, so a lack of an automated test suite and version control (that's inexcusable for any non-toy project) is pretty damning. The test suite you could get around with excellent specifications, change management, and functional testing ala Space Shuttle, but when you don't even have version control or a real source repository, the chance of any other controls being in place is basically zero.

How much you want to bet that Breathalyzers, ALPRs, and even Stingrays have equally shitty development practices?
posted by wierdo at 5:50 AM on October 27, 2017 [13 favorites]


"I have seen no code indicating that any test code has been written for, or automated software testing has been performed on, FST."

And that sucks. But the mere presence of automated testing wouldn't mean all is well, since it's really easy to write basically trivial and meaningless test-for-success suites. tocts has the remedy - the codebase needs to be vetted by multiple people.
posted by thelonius at 5:52 AM on October 27, 2017 [16 favorites]


multiple competent people who care
posted by thelonius at 5:53 AM on October 27, 2017 [11 favorites]


The "FST.Web/Reports/Copy of FSTResultReport.rdlc" doesn't seem like a big deal or an indication that version control wasn't used when developing the code. These reports are XML artifacts generated by code that can be manually edited, not code itself.
posted by phenylphenol at 5:54 AM on October 27, 2017 [1 favorite]


I just want to keep favouriting KaizenSoze's comment forever. This is perfectly adequate business software but that's not good enough when human lives depend on it.
posted by suetanvil at 5:55 AM on October 27, 2017 [3 favorites]


But the mere presence of automated testing wouldn't mean all is well, since it's really easy to write basically trivial and meaningless test-for-success suites.

This. Double this. All "passing" automated tests means is that my assumptions about how things should work properly are borne out by the tests that I built based entirely on these assumptions.

Automated testing can only point out obvious errors in the functioning of your code. It can never make up for faulty assumptions.
posted by sutt at 5:57 AM on October 27, 2017 [12 favorites]


I agree that "automated tests" are not nearly enough to say "okay all is well" -- Section 5.2.2 "Extent of testing" and Section 5.3 "Validation and verification (v&v)" of Exhibit C go into that way further.

I also find Section 5.4.3.3 "Comments" interesting in its discussion of excessive commenting in some areas, lacks of comments in others, and commented-out code.
posted by brainwane at 6:00 AM on October 27, 2017 [1 favorite]


bool longVariableNamesAreFineAsLongAsTheyAreMeaningful = true;
posted by mystyk at 6:14 AM on October 27, 2017 [7 favorites]


bool thePostedExampleVariableIsOneOfThose = false;
return codeSmellIsPresent; // const bool - true
posted by mystyk at 6:18 AM on October 27, 2017 [2 favorites]


return codeSmellIsPresent; // const bool - true

Given that you're camel-casing a constant, yep, always true.
posted by tocts at 6:24 AM on October 27, 2017 [5 favorites]


How much you want to bet that Breathalyzers, ALPRs, and even Stingrays have equally shitty development practices?

That's my bet as well.

From the affidavit:
In January 2016, I was retained in a criminal case and inspected the source code of the commercially-available probabilistic genotyping program STRmix™. Due to a non-disclosure agreement that I signed, I am not allowed to discuss the findings of my review of STRmix™ outside of that particular case.
and
It is my experience (to the extent that I am free to comment on it) that the development of probabilistic genotyping software is substantially less mature than the field of software engineering itself, especially in the domain of software testing, validation, and verification.
(I believe NYC's OCME switched away from FST to TrueAllele, which came up in last month's post about The People of the State of California, Plaintiff and Respondent, v. Billy Ray Johnson, Jr., Defendant; Cybergenetics doesn't want to let defense attorneys or their experts see the source code for TrueAllele, even under a protective order, because they say it's a trade secret. I see that Friday a week from now, November 3rd, there'll be a livestream of the one-day conference run by the Justice Through Science nonprofit that TrueAllele's creator also created. I am trying to persuade myself that I have client work to do and should not spend all day heckling the screen.)
posted by brainwane at 6:30 AM on October 27, 2017 [4 favorites]


I'm going to hold back on any criticism until I know what editor the programmer used.
posted by srboisvert at 6:31 AM on October 27, 2017 [43 favorites]


The standards of medical software is not necessarily something to aspire to, either.

In general, the use of mediocre software - specifically, the use of mediocre software components like third-party libraries - has probably killed people and is going to kill more people. It's something like 30 years worth of accumulated technical debt building up like a massive wave.
posted by rmd1023 at 6:39 AM on October 27, 2017 [3 favorites]


This makes me feel better about my C# code, ugly hacks and all.

Unit tests, written before the code or to confirm simple assumptions: mostly useless.

Automated regression tests that run before checkins are allowed (or if especially slow and involved, at least weekly): fantastic.
posted by Foosnark at 6:43 AM on October 27, 2017 [6 favorites]


It is my experience (to the extent that I am free to comment on it) that the development of probabilistic genotyping software is substantially less mature than the field of software engineering itself, especially in the domain of software testing, validation, and verification.

That is some serious drag going on there. Like dissing-two-nerds-with-one-stone dragging.

Nice.
posted by jonnay at 6:44 AM on October 27, 2017 [4 favorites]


I'm like suuuuuuuuuuuuuper duper proud of my friend the author who had done a bunch of extremely important, high-impact, justice-oriented reporting with PP.
posted by entropone at 6:49 AM on October 27, 2017 [7 favorites]


If the code is not up to medical or aircraft software standards, it should not be used for legal processes.

you're saying that as if the criminal justice system was fair and impartial and not an exercise in blaming individuals for systemic failures so that it can sustain a narrative of moral objectivity. the layperson's trust in Science! (TM) and Technology! (TM) plays a large role in excusing their own complicity within that system, screening out the need to actually deal with considering the lifecourse of individuals and their interactions with institutions of power or with considering the victims, how their pain can be alleviated by pumping money into things like trauma relief programs or therapy sessions or so on

you know, because SCIENCE! WOW BILL NYE ATHEISM Grisham said it was CSI'd so OF COURSE we will need to punish this single individual in a dramatic, symbolic, outsized fashion and once that is complete, yet another tear in our social fabric will be mended and everything will be all right instead of horrifyingly still broken
posted by runt at 6:54 AM on October 27, 2017 [12 favorites]


Excessive use of #region makes my eyes bleed. But then, really, any use of #region makes my eyes bleed.

DataTable might be convenient, but it's not really an appropriate DTO. The code would be much easier to understand with strongly-typed DTOs.

FST.Common/Comparison.cs seems to be where the interesting stuff happens.
posted by Slothrup at 6:59 AM on October 27, 2017


Given that you're camel-casing a constant, yep, always true.

I added that to the comment at the last second and didn't notice, but otherwise yes, I'd agree.
posted by mystyk at 7:00 AM on October 27, 2017


Code without tests is very common. What we call "software engineering" is often more "I tacked and glued some shit together and it seems to work". There's nothing at all surprising to see code like this without tests. No surprise about snippets copy-and-pasted from StackOverflow, either. At least they owned up to it (although as the Exhibit C notes, "No URL’s to the original Stack Overflow web page are provided for reference by future developers." So you'd have to dig a bit to figure out where it came from.)

I'm not saying any of this is good software practice. I'm saying it's common, from podunk two man shops to big temples of software engineering. (Buy me a beer and I have some good Google stories for you, although it's better there now.) The real lesson here is that software can't be trusted. Even relatively simple software like this standalone 23k cloc .NET thing. Much less some complex machine learning monstrosity.

Computers suck.
posted by Nelson at 7:02 AM on October 27, 2017 [15 favorites]


If the code is not up to medical or aircraft software standards, it should not be used for legal processes.

I mean, given the crap that passes for forensic "science" that'd be a good first step. Of course, the difference is that people are willing to PAY for medical and aircraft software. Folks won't even pony up for constitutionally mandated public defenders.

One of the reasons we have the broken plea-bargain based system is that it would be impossible to bring all those cases to trial with the system as it is, there's just not the money or the people. It would grind everything to a halt. A fair and functional justice system is just too expensive, and folks aren't willing to pay.
posted by leotrotsky at 7:03 AM on October 27, 2017 [3 favorites]


I saw a presentation by this company's founder circa 2006. Everything was VBA and Access then. So ... it's improved? Even back then I was way skeeved out and not impressed by their people at all.
posted by miyabo at 7:07 AM on October 27, 2017 [3 favorites]


I was interviewing a lawyer about a landmark data security court case, and we were discussing some of the evidence presented by the state from, let us say, a governmental system. I asked whether that evidence had been challenged by the defence as meeting the normal standards for admissible evidence, as to me it looked as if it could have just been made up. (I don't doubt it wasn't, but there was no way to check.)

The lawyer said that the defence had been told not to make that challenge, as if it succeeded it would open the floodgates to a huge number of previous cases being challenged and verdicts vacated, well beyond this class of case, to the extent that he doubted the system could have coped. He didn't say who had told the defence this, but he did say the defence agreed.

That's my direct experience of something I have heard (hearsay only!) elsewhere, and repeatedly, that there are understandings - kludges, one might say - within the UK justice system that there are doors not to be opened, because if you actually break things that badly the results will be chaotic and widespread. These are, of course, terribly difficult kludges to re-engineer, and people do try, but I fear that as the case in the FPP demonstrates, it is going to be increasingly difficult to keep such doors shut. And when one opens...

We can all, I'm sure, come up with a model to prevent bad software generating bad evidence - open, audited code, avionics/pharma levels of regulation and approval, patent models rather than trade secrets - and such things would be a theoretically good fit elsewhere in places like electronic voting. But if you look at other aspects of such regulated systems, which include huge expense and a glacial path from innovation to deployment, then how that fits with an already vastly underfunded and overloaded judicial system is far less clear.

(on preview, what leotrotsky said)
posted by Devonian at 7:26 AM on October 27, 2017 [6 favorites]


What constitutes best practice in Software Development is always up for debate and fashions change over time. There is maybe some agreement on what is 'good enough' practice but 'best' practice frequently leads to long arguments.

The use of the term *code smells* has always irked me, when you see the term 'code smells' you know that coming up next are probably finicky criticisms that ignore the real-world reality of how software gets made and patched over time. My theory is that its a term used from an 'academic' viewpoint - "My guru said things should be done this way", and (I suspect) is less often used by people who've actually had to churn out working code for several years. You could argue, in fact, that use of the term *code smells* is a 'smell' for detecting coding pedants.

So like the other coders in this thread, I'm not overly troubled by the stuff highlighted by Exhibit C. Taking stuff from StackOverflow is actually a very good sign; it shows that the programmer is at least vaguely up to date with how things are done in the last decade or so. Exhibit C goes on about long methods or long classes being a bad sign, but really, breaking complexity up between 20 different methods or classes instead of one big one really is a very debateable practice. The complexity is still there, you've just spread it around. Is that more maintainable? And so on.

The proof of the pudding, so to speak, would be to compare the DNA analysis made by the software to some other independent source. And do a lot of that comparing.

I suspect if you said 'all software used in courts must adhere to standard X' (lets say 'formally verified' or '100% test coverage' or 'provided by a government approved company' or whatever) you would end up with big companies like EDS doing everything and frankly, you'd probably have even worse software at the end of it. I suppose the military have those sorts of constraints on their software, and I suspect it makes everything very expensive and very slow to develop and probably not much better.

Insisting that everything was open source would however be a very good idea.
posted by memebake at 7:50 AM on October 27, 2017 [10 favorites]


(I'll note here that I am a coder in this thread and I am indeed troubled, although I think not overly so, by the findings in Exhibit C.)
posted by brainwane at 8:15 AM on October 27, 2017 [1 favorite]


Insisting that everything was open source would however be a very good idea.


To an extent, it is. If I have a factory producing magic widgets or a delicious cola that doubles as a wood varnish, then it's quite possible to keep the production method as a trade secret. Re-engineering can be very hard indeed. Code, however, is code: you might not have the source, but you can create _a_ source through decompilation that will work and can be studied and tested. You can't go ahead and ship a product based on that source, because other intellectual property protections exist.

This breaks down with remote execution a la cloud (and to some extent 'trusted' encrypted silicon-level systems) where you never see the code running. This is definitely something that should be recognised and guarded against in situations where forensic analysis of software may prove essential in the future. The 'we can't show the court the code because it's a trade secret' is bollocks otherwise, and the software industry hasn't suffered elsewhere without this stricture; filing code to a judicial escrow service would be a good thing all round.
posted by Devonian at 8:17 AM on October 27, 2017 [3 favorites]


Open sourcing any code used in court sounds like an absolute no-brainer to me.

This would amuse me no-end, because every layer is defined by software, including the bare metal.

If you truly want to apply "all bugs are shallow", you're going to be left with very few platforms you can run on. Maybe RMS can point us in the right direction.
posted by Leon at 8:18 AM on October 27, 2017 [2 favorites]


Here's a good way to tell if code is decent: can you read and understand it quickly? If so, you should be able to judge each piece of code on its merits (or lack there of.) If you can't read and understand it quickly, you have to assume it is bad until proven otherwise. Tests can prove that code to be decent, but only if those tests are decent (and the same rules of clarity and understanding apply, because tests are code.)

What you can't tell in the code itself is whether the fundamental assumptions are decent, and whether the data being fed in are decent. That requires expertise in the domain, comprehensive and accurate documentation, and rigorous review.

That is, of course, expensive (time- and money-wise), and those costs tend to get paid only when mandatory (via regulation, transparency, accountability.) So bring on the open sourcing of code like this*, and regulate the requirements, so that everyone involved in writing, testing and documenting can defend against shortsighted timelines and over-prioritized profit motives.

Only then can you confirm that code (and everything related to it) goes above decent into the realm of good.

*where critical things are on the line
posted by davejay at 8:24 AM on October 27, 2017 [4 favorites]


i find code smells a useful concept. call me a code pedant if you like. bullshit is fractal. if it exists at the scale of one line, it exists everywhere, in *really* important places.
posted by j_curiouser at 8:30 AM on October 27, 2017 [18 favorites]


I have nothing substantive to add here, but as a newbie coder, it's really fascinating to see people talk about how this all works in the real world.

I suspect that the question of whether it should be open source reflects just a fundamental cultural difference between software developers and people in law enforcement. The former think that being open source would make the code safer, but the latter might think that it would make it more vulnerable. They would think of it as allowing potential bad actors to have access to and potentially influence over their tools.
posted by ArbitraryAndCapricious at 9:15 AM on October 27, 2017


Yeah, I don't love the phrase "code smell", but I do think it's useful to have a short phrase that amounts to, "I can't point to a for-sure bug on this line of code, but the way it is written makes me concerned about more fundamental problems in this codebase". It gets misused (like basically every other term) and it's a bit too cute by half (in software? never!) but it has valid uses.
posted by tocts at 9:18 AM on October 27, 2017 [16 favorites]


code smells

stop saying this, people


I'm not a programmer, and the straightforward explanations of "code smell" I found in a very quick Google search didn't indicate disdain for the phrase. Granted I may be missing some undercurrent, but why are you so against it, thelonius and demiurge?
posted by Greg_Ace at 9:20 AM on October 27, 2017 [3 favorites]


Sounds like "code smell" is a terminology smell.
posted by clawsoon at 9:25 AM on October 27, 2017 [1 favorite]


I fear in our rush to accept this as "normal coding" the more important point is lost: normal software is often broken. Anyone who uses computers knows this, intuitively, the bullshit and brokenness is literally all around us. But somehow in a legal context sometimes it gets forgotten. "The computer said there was a DNA match: must be guilty!". That is entirely inappropriate. Software engineering is not capable of enough certainty (in normal practice) to be used as the sole decision maker. I believe courts mostly know this already but I think the point is lost on ordinary people.
posted by Nelson at 9:25 AM on October 27, 2017 [14 favorites]


One of the reasons we have the broken plea-bargain based system is that it would be impossible to bring all those cases to trial with the system as it is, there's just not the money or the people. It would grind everything to a halt. A fair and functional justice system is just too expensive, and folks aren't willing to pay.

I don't often interact with the justice system, but I realized how broken things were when I was in for jury duty a couple of weeks ago.

The trial I was picked for was a criminal case over about $500 worth of possibly stolen goods, more than a year ago. Which means the meager pittance that this state pays its jurors alone was more than the value of the goods, nevermind the judge and all other associated court staff, and whatever the county spent prosecuting this. And geez, was this defendant waiting in jail for 13 months over this? I'm not saying don't prosecute relatively minor crimes, and I'm not saying don't have fair trials, but... gah. Some things are messier than software.

As for "code smells" -- it's generally associated with people simply not liking the coding style or having a vague hunch that something might be wrong. It's a bit like pronouncing judgement on the probable quality of a Chinese restaurant's food because it's tucked away in a dingy little shopping center and something is misspelled on the menu.
posted by Foosnark at 9:30 AM on October 27, 2017 [6 favorites]


Thanks, Foosnark.
posted by Greg_Ace at 9:33 AM on October 27, 2017


jeff's code smell post. it goes beyond style.
posted by j_curiouser at 9:38 AM on October 27, 2017 [5 favorites]


The concept of code smells- and lists like Jeff Atwood's - are also a good way to help rookie programmers develop heuristics for evaluating code. (Or, at least, I found them very helpful when I was starting out, and I know a number of other people who felt the same.) I would also be curious to hear why people here hate it so much.

It's a bit like pronouncing judgement on the probable quality of a Chinese restaurant's food because it's tucked away in a dingy little shopping center and something is misspelled on the menu.

I would say it's more like noticing that a restaurant has crumbs on the carpet and spills on the counters that nobody is cleaning up, and questioning whether they extend the same lack of care to the rest of the place. Which...maybe they don't! Maybe the kitchen is sparkling clean and the FOH is just kinda lazy and management doesn't mind too much. There's code that has code smells that works well. But there's a lot more code with code smells that doesn't work well, and that's incredibly difficult to reason about and maintain even just weeks or months later. And a lot of the code that's smelly-but-functional is also really hard to come back to and maintain or improve, because of those issues.

Regarding the actual links, it sounds like the software was written at about the same quality as the crappy CRUD app I used to work on, and that's inexcusable for something that can send people to prison.
posted by protocoach at 9:40 AM on October 27, 2017 [16 favorites]


having a vague hunch that something might be wrong

To be fair, a lot of being an experienced engineer (or an experienced attorney) is having such vague hunches -- they can suggest efficient places to start looking for problems. If you have to review every line of code, then you get the economic infeasibility everybody's talking about. But the solution is not to throw up your hands -- it's to have experienced people with an understanding of the problem space and the critical relevant issues poke around and see what their hunches say.

For example, there might be a buffer overflow in the code that prints the title of a report. So what? Theoretically, an attacker might use that as the entry to force the software to produce a "guilty" match, but if that's your defense, you need more than "it's technically possible that this happened!"

The places to look, it seems to me (as a coder in a former life, and an IP lawyer now) are nearer the handling of the DNA data (in this case). Where does it come from? What do you do with it? What's the theory behind doing that? What things could cause a false positive? How are you accounting for that? If that part of the code smells, then you've got a problem.

The Daubert standard is supposed to deal with that, but IME it mostly looks like a comparison of degrees (PhD from here, MD from there) and then just believing (or not) whatever the expert says.
posted by spacewrench at 9:47 AM on October 27, 2017 [6 favorites]


For researchers developing new toxicology tools (e.g. in vitro screening arrays to replace antiquated and near useless whole animal tests), regulators like the FDA almost uniformly disallow proprietary black box components, including on the computational side. Even if certain pieces are kept proprietary, they can't be kept secret if they're intended for use to support something like a new drug application. They have to be validated, and that validation has to include a public report. It's a bit stunning that the criminal justice agencies aren't similarly controlled. A moral of this story is that more scientists, engineers, mathematicians, etc. need to work in public policy and bureaucratic positions.
posted by late afternoon dreaming hotel at 9:50 AM on October 27, 2017 [5 favorites]


But the mere presence of automated testing wouldn't mean all is well, since it's really easy to write basically trivial and meaningless test-for-success suites.

As someone whose job involves being an advocate for automated testing, I've heard this argument a lot and I really hate it. What having automated tests does (for me) is show that the functionality and limitations of your code are well-understood, by you, and that you're able to reason about that in a systematic and comprehensive way. It's easy to write meaningless tests, but it should be just as easy for a reviewer to see that the tests are useless.

I love writing automated tests, because I was a philosophy minor in college and always had a soft spot for those medieval scholars who presented elegant A -> B -> C -> God proofs that held together logically but had involved priors that we just don't accept anymore. Your unit tests should be treated like a scholastic proof, as an internal validation based on priors that, at some level, have to be taken on faith.
posted by OverlappingElvis at 9:51 AM on October 27, 2017 [7 favorites]


They would think of it as allowing potential bad actors to have access to and potentially influence over their tools.

These are cops. They want to avoid oversight, not counterintelligence.

(although I guess most cops would consider the ACLU bad actors so in that sense you're not wrong)
posted by PMdixon at 10:01 AM on October 27, 2017 [5 favorites]


Granted I may be missing some undercurrent, but why are you so against it, thelonius and demiurge?

1. People can (and do) call things "code smells" without saying whether it's a potential security problem, an actual bug, or just a coding style preference. It's a vague term.
2. I find it a gross and off-putting term. Would you rather people say there are potential problems with this code or the code has a lot of "code smells"?

What's wrong with just saying code doesn't follow best practices?
posted by demiurge at 10:31 AM on October 27, 2017 [2 favorites]


Isn't saying it "doesn't follow best practices" equally vague, though? There are best practices for indentation and best practices for storing passwords, but one of those is a bit more important than the other. And I can't argue with #2, since that's a preference, but I think that's kind of the point - it's a little more evocative than just saying something isn't following best practices.
posted by protocoach at 10:37 AM on October 27, 2017 [1 favorite]


A code smell is just a red flag, right? It's an issue with the code that doesn't cause a problem but that indicates sloppiness or otherwise suggests that problems may be present. Is the issue just that "smell" is a gross metaphor?

I definitely found the list of code smells to be helpful, fwiw.
posted by ArbitraryAndCapricious at 10:41 AM on October 27, 2017 [1 favorite]


Saying "it doesn't follow best practices" is a rather binary way to term it, like it's being graded pass/fail. No code ever fully follows all relevant best practices, and what even counts as a best practice is often arguable in context. I feel like phrasing it this way gives a false sense of precision.

I think "code smell", for all that it's a kind of gross term, at least accurately communicates that the problem or concern is vague and arbitrary and maybe just a matter of taste or just something that would be a bad smell under other circumstances.
posted by fatbird at 10:41 AM on October 27, 2017 [3 favorites]


Granted I may be missing some undercurrent, but why are you so against it, thelonius and demiurge?

Ghe undercurrent is that programmers can be extraordinarily rude and dismissive when talking bout other people's work and it's juvenile and exclusionary.
posted by Space Coyote at 10:43 AM on October 27, 2017


On code smells...
> 1. []It's a vague term
I think it's meant to be. It's just a general cluse that there could be things wrong. If my food smells funny I'm going to investigate a bit before I eat it, even though I don't know what specifically, if anything, is wrong with it.
> 2. I find it a gross and off-putting term
I think it's meant to be.

That said, I never use this term (on preview, with good reason). But I don't object to it being used by others. It serves its purpose well enough.

A lot of commenters here and elsewhere seem to be thinking the most important thing about code is that it should work. I think that's wrong. The most important thing about code is that it should be maintainable. Code that works but is unmaintainable will quickly become code that doesn't work and is unmaintainable. Code that doesn't work but is maintainable will quickly become code that does work and is maintainable.

Unit tests are important because they confer confidence in refactoring, to keep your code maintainable. They are also your documentation.

Comments are a sign of weak code. If you feel you need a comment, refactor until you don't. If you have unit tests then you can do that reliably.
posted by merlynkline at 10:45 AM on October 27, 2017 [3 favorites]


> Comments are a sign of weak code.

Indeed, Donald Knuth is widely renowned for the weakness of his code.
posted by ardgedee at 10:52 AM on October 27, 2017 [8 favorites]


> > Comments are a sign of weak code.

> Indeed, Donald Knuth is widely renowned for the weakness of his code.


Not all computer languages have the benefit of adequate expressiveness for reasonable comment avoidance. I still comment SQL :(

C# has no such excuse.

That said, your linked example contains three comments in 416 lines of code, two of which are attribution lines...
posted by merlynkline at 11:02 AM on October 27, 2017 [2 favorites]


compareMethodIDSerializationBackupDoNotMessWithThisVariableBecauseItsAnUglyHackAroundACrappyLimitiationOfTheFCLsCrappyXMLSerializer

Now I'm worried that I wrote this software and somehow forgot about it, because that is totally the sort of variable name that I would have.
posted by It's Never Lurgi at 11:05 AM on October 27, 2017 [3 favorites]


Granted I may be missing some undercurrent, but why are you so against it, thelonius and demiurge?

Well, I'm not running a crusade against it. It annoys me, and I think that's because it's often used by people who are copping a kind of unearned attitude of intellectual superiority. They've learned a few principles of object oriented programming and think that they have a complete understanding of "best practices", as they like to say. Often these practices are kind of superficial, or, in some cases, represent like Kool-aid drinking (as the idiom goes), like people, say, who've been totally indotrinated into inversion of control patterns, and think that writing a class with a constructor that takes paramaters is a "code smell". Sometimes the things they thing are important are actually pointless or trivial optimizations.

The concept itself is not really defective, I admit - there are indeed things (using global variables, etc) that, if you see them in a codebase, are probably good tells that there are lots of problems.

Then there's the purely subjective aspect that I'm kind of a curmudgeon about overuse of cute slogans and catch phrases. (For example, when people use the term "CRUD" every chance they get instead of just saying, data operations or something. I acknowledge that that doesn't have any kind of validity objectively, as a criticism of programming, it's just a personality issue on my end..
posted by thelonius at 11:07 AM on October 27, 2017 [5 favorites]


sutt: "Automated testing can only point out obvious errors in the functioning of your code. It can never make up for faulty assumptions."

Dear friends who feel that tests cannot find errors that we ourselves would find, I'm sure you understand the value of functional and integration tests but let me introduce you to fuzzing.
posted by bdc34 at 11:07 AM on October 27, 2017 [2 favorites]


Oh for goodness' sake. Comment where you need to comment. Have you had to think about how to implement something, and found some good but not obvious ways to do it? Then you're probably a good coder, and the good coders to come after you will thank you to explain briefly how your non-obvious but good code works. Have you had to do a work-around for an API or function funny? Say what you did and why. Are you adding two integer variables with clear names and using a clear name for the result? Don't bother.

The poor grunt who gets to dive into your code next year will thank you - and that poor grunt may be you.

Comments are useful and abusable. They serve a purpose.
posted by Devonian at 11:09 AM on October 27, 2017 [15 favorites]


Comments are a sign of weak code. If you feel you need a comment, refactor until you don't. If you have unit tests then you can do that reliably.

Oh, come on. That's garbage and you know it. At least, I hope you know it. Comments that explain what you are doing are bad and should, in all likelihood, be removed. Comments that explain why you are doing something are very useful.

Comments can also provide a higher level overview of the code and guidance on when and how to use parts of the code correctly.

The fact that comments can (and frequently are) misused is not an argument against comments.
posted by It's Never Lurgi at 11:11 AM on October 27, 2017 [16 favorites]


I write lots of tests, and have often been told I'm not dumb. Generally, if my tests pass the first time I run them, it's an excellent indication that I've done something wrong.

Deploying code without tests is sheer hubris. (So you're doing one-off checks that things run correctly? Why not package those into... Actual automated test cases?)
posted by kaibutsu at 11:14 AM on October 27, 2017 [3 favorites]


Writing good comments is pretty hard. There is a good chapter about comments in the book "The Practice Of Programming" by Kerhnigan and Pike.

Sometimes people were made to comment almost everything in CS classes, and they produce pointless comments like:

//reinitialize the counter
i = 0;

That should certainly be omitted.

Kerhnigan and Pike discuss a serious issue with comments: they tend to not be maintained, and the code diverges from them. Is the comment wrong? The code? Or both? They have a great real example, from some telecom software. There was a comment to the effect of like "handle Italy and Jordan" and then code that checks to see if the country code is for Italy or a different country - Italy and Kuwait, say. That's the divergent situation, but, as they say, the more serious deficiency is, why do two countries need special handling? What's going on here?The comment doesn't say, and a maintainer must try to figure it out for themself.
posted by thelonius at 11:17 AM on October 27, 2017 [5 favorites]


(and please note, I wasn't trying to say tests are dumb or bad - I was trying to say that writing good tests isn't trivial and takes care).
posted by thelonius at 11:18 AM on October 27, 2017


I acknowledge that that doesn't have any kind of validity objectively, as a criticism of programming, it's just a personality issue on my end..

Sort of a code discussion smell.
posted by tobascodagama at 11:25 AM on October 27, 2017 [2 favorites]


Comments that explain why you are doing something are very useful.

EG, is the max orders per second constant mandated by the exchange and should NOT be changed or is it something set firm-side for $REASONS?
posted by PMdixon at 11:30 AM on October 27, 2017 [2 favorites]


I write code that is well used and often updated. I am grateful for unit tests so that I can quickly see if my own code has unintended consequences re behavior that other coders have chosen as important enough to have explicit tests for.

The presence of unit tests doesn't guarantee good code, but the absence, especially in a project that mutiple programers at different times have modified, is a warning flag that you're going to find code where no one had the time or space to give a fuck.

It doesn't necessarily mean the code is bad, or the coders are hacks, but it is a warning sign that the project wasn't set up professionally (eg managers said "don't waste time in tests, git er done so we can complete this contract")
posted by zippy at 11:50 AM on October 27, 2017 [7 favorites]


'Code smell' is an interesting term because it makes more sense if you take it somewhat literally-- in the real world, everything smells like something. You just get used to the smells of people and places you encounter regularly, and take notice of unfamiliar smells. When someone talks about a code smell, often they are simply saying the code is unfamiliar to them, in style or structure.
posted by Pyry at 12:11 PM on October 27, 2017 [5 favorites]


If code were buildings, most small-company software would be ramshackle lean-tos. Right now, I'm in the middle of patching some holes in the tarpaulin roof and adding a couple of cross braces so that the software our company uses to make itself more efficient doesn't let in quite so much rain or blow over in the wind quite so often.

But there's decent money in fixing up lean-tos, so I'm happy to have the opportunity.

Back to code smells: Some of them are like fine cheese. You have to develop your palate before you can appreciate the sense behind the stink. The use of goto in the Linux kernel and the Samba project is one that comes to mind.
posted by clawsoon at 12:49 PM on October 27, 2017 [4 favorites]


'Code smell' is an interesting term because it makes more sense if you take it somewhat literally

This is how I like to take the term. Some people just have weird little idioms in their programming that may be totally valid and may even fit into the project's style guide, but it still smells like them. It might even be a nice smell, and you might adopt that smell yourself like a perfume. Um. Metaphors are weird.
posted by OverlappingElvis at 12:55 PM on October 27, 2017 [4 favorites]


> Comments are a sign of weak code

>> Oh for goodness' sake. Comment where you need to comment.

>> Oh, come on. That's garbage and you know it.

Apologies for something of a derail in this thread. I will just say that (a) that was obviously a bit of a dramatised absolute but (b) I still think it's essentially true and (c) although a derail here, this (codecraft in general) is a subject dear to me and I would be delighted to discuss further in PM or whatever, and (d) here's a hurried anonymisation of an article on this particular detail that I wrote for work a while ago: https://docs.google.com/document/d/17GBptcrUp3b0vQynPPyKI3R3zj76VgjGv4Nu3_9ROzk/edit?usp=sharing
posted by merlynkline at 1:59 PM on October 27, 2017 [2 favorites]


Comments are a sign of weak code

Oh Christ, one of those.

Please know that everyone you work with hates you and you are about a thousand percent less awesome than you think you are.
posted by Artw at 2:16 PM on October 27, 2017 [12 favorites]


Thanks. I refer you to my last comment. Delighted to continue out of band.
posted by merlynkline at 2:25 PM on October 27, 2017


https://github.com/propublica/nyc-dna-software/blob/master/FST.Web/FST_Production_Service/Connection.txt

Cool, they checked in the production server’s database credentials. Heckuva job, boys.
posted by jenkinsEar at 2:25 PM on October 27, 2017 [5 favorites]


What constitutes best practice in Software Development is always up for debate and fashions change over time.

As I said. Even something apparently uncontroversial like commenting code can still turn out to be controversial.
posted by memebake at 2:44 PM on October 27, 2017


I know C#, I write C#, and you sir are no... umm, nevermind.
posted by blue_beetle at 2:52 PM on October 27, 2017


> What constitutes best practice in Software Development is always up for debate and fashions change over time.

Computers change over time, very rapidly. Where we once had to write code that would fit into a few K of memory and be reasonably executed by an 8 bit processor at a few KHz, we now have the luxury of almost boundless memory and processing power, and have gone through many steps between. If our software development practices aren't changing to keep up with this then we're doing something wrong. (That might include changing our ideas about the utility of comments :P)
posted by merlynkline at 3:14 PM on October 27, 2017 [1 favorite]


No, pretty sure that is always going to be the mark of people who are terrible to work with.
posted by Artw at 3:31 PM on October 27, 2017 [3 favorites]


Mod note: please stop the commenting derail? thank you.
posted by jessamyn (staff) at 3:39 PM on October 27, 2017


If the code is not up to medical or aircraft software standards, it should not be used for legal processes.

Aircraft software does tend to be pretty good, from what I hear. Medical software, not so much. There's a thousand Therac 25s for every one we hear about.

Here's just one great design I encountered during my tenure doing medical human factors: a large-volume infusion pump where pushing the "stop infusing" button once stopped infusing, but pushing it twice ran the 'eject' program which meant it would infuse the entire contents of the bag into the patient at as high a speed as the pump could run (meaning, typically, 1000x overdose in the first second or two).

And if you're a HCP and you just realized you're infusing the wrong drug, you might go "oh shit shit shit" and jam the stop button a bunch of times...

(This was the Baxter Colleague infusion pump, which was notably recalled because of this and its many, many other issues.)
posted by dmd at 3:41 PM on October 27, 2017 [13 favorites]


This is a thread about judging the quality of code, right? Its an interesting subject I think.

I only just looked at the usernames of the people I'm talking to and (this is 100% true) merlynkline and I have crossed paths before, long ago, and he was indeed great to work with. So no more arguments from me and I will PM!
posted by memebake at 3:54 PM on October 27, 2017 [1 favorite]


people, say, who've been totally indotrinated into inversion of control patterns, and think that writing a class with a constructor that takes paramaters is a "code smell".

totally agree. it's either parameterized constructors or ioc w public (ugh) setters. you *could* call code smell either way. at least if we talk about it, we can get a quick sanity check.

(ioc is so fucking overkill for almost everything, imho)
posted by j_curiouser at 7:01 PM on October 27, 2017 [1 favorite]


The use of the term *code smells* has always irked me, when you see the term 'code smells' you know that coming up next are probably finicky criticisms that ignore the real-world reality of how software gets made and patched over time.

I understand this attitude and the sort of personalities that have probably provided fodder for it, but as a counterpoint I’ve also seen this sort of practicality rhetoric deployed repeatedly on a local basis to justify what I identified as, yes, code smells, until repeated local instances of bad style metastasized into a global problem once a dimension of the problem space that was assumed to have a fixed value suddenly became mutable, leading to a tedious and error-prone process of making the same logic change in a million different places. And even when the evidence was right in everyone’s faces about just how difficult we’d made life for ourselves, and which axes of the problem space were likely to undergo change at a pace that ought to be explicitly designed for, my suggestions that we make the type system do the work were superficially acknowledged but still tacitly dismissed as academic fluffery. This was in Java, and really all I was arguing for was providing environment-specific behaviors via composition for each provider and a factory that sniffed the platform’s capabilities and then provided the right implementation for that provider’s interface. Not exactly NASA shit, you know? I think the argument that we should get our heads out of the clouds and just make something that works has the inherent advantage of seeming the most practical, but it ignores externalities in a way that quickly becomes poisonous if taken too far. Being practicality-oriented but with an eye towards ease of maintenance seems to make you a coder without a country, in my experience, unless you’re at a rare shop.
posted by invitapriore at 7:11 PM on October 27, 2017 [4 favorites]


ioc is so fucking overkill for almost everything, imho

The thing that caused me a lot of trouble was, everything is configured in XML files. Wait, or using annotations, not in XML files. Wait, or by convention-over-configuration and it's not configured at all but there are magic names and locations to put things in. All that's not so bad once you get used to it, but it was a rough road for me. Perhaps I'm too literal-minded.

Exapanding on my gripe (and, for here and now, abandoning my above-mentioned aesthetic objections to vocabulary), the kind of folks I have in mind seem to think that OOP is all and only writing Javabeans style classes (zero-argument constructors, public getters and setters) for everything, and then feeding them to a framework that uses reflection to do who knows what with them at runtime......isn't that just more or less abandoning the idea of encapsulation?

But a lot of the things in the code odors (I just can't say it, sorry) post linked above do indeed sound very bad.
posted by thelonius at 8:16 PM on October 27, 2017 [2 favorites]


isn't that just more or less abandoning the idea of encapsulation?

dude. we should party.
posted by j_curiouser at 9:05 PM on October 27, 2017 [4 favorites]


"Dear friends who feel that tests cannot find errors that we ourselves would find..."

Your example literally does nothing to invalidate my point. You are still testing the things that you decide to treat, using a metric that you have decided upon based upon assumptions that you have made. That the input is pseudo-randomized (based again upon a method that you have decided upon) is utterly inconsequential.
posted by sutt at 6:56 AM on October 28, 2017


If you're reading that Wikipedia article about fuzzing and saying to yourself that things like buffer overflows, race conditions, and memory leaks are the "obvious errors" you mention above which no one would need to do automated testing to find, I think maybe you are making faulty assumptions about the irrelevance of testing your own assumptions.
posted by XMLicious at 8:37 AM on October 28, 2017


I think maybe you are making faulty assumptions about the irrelevance of testing your own assumptions.

I think maybe you're making faulty assumptions about my assumptions of your assumptions.
posted by sutt at 8:56 AM on October 28, 2017


I think I’m glad I exited the Java developer career path when I left excite.com in May 2001.
posted by Annika Cicada at 10:34 AM on October 28, 2017 [4 favorites]


« Older Jackie Chan’s Plan to Keep Kicking Forever   |   “You can sew almost anything into the canvas of a... Newer »


This thread has been archived and is closed to new comments