Reckless Software Development Must End

On the 6th of November, 2019, I made a comment on Twitter:

Okay, so today’s news isn’t as dramatic as Uber killing a homeless woman by not programming in the fact that pedestrians might not use crosswalks, but it is based in the same mode of thought.

Today’s news is that the US state of Iowa has had issues with their election processes (processes that are a bit too complex for me to provide you an overview in this blog). The problem boils down to reckless abandon of software engineering principles.

As reported in the New York Times and The Verge, in addition to many other outlets, there were a number of failings in the development and deployment of this software package that would have been trivial to prevent.

My personal belief is that the following issues significantly contributed to the failure we have seen.

No test plan

There was no well-defined plan of testing.

The test plan should have covered testing of the back-end (server) portion of the software, including synthetic load testing. My test plan would have included a swarm of all 1600+ precincts reporting all possible data at the same time, using a pool of a few inexpensive systems running multi-connection clients.

The test plan should have also included testing of the deployment of the front-end (user facing) portion of the software. They should have asked at least a few of the precinct staffers to attempt to complete installation of the software.

Ideally, a member of the development team would be present for this, to note where users encounter hesitation or issues. However, we are far from an ideal world. My test plan would have included a simple Skype or FaceTime session with the poll workers, if face-to-face communication would have been prohibitive.

These sessions with real-world users can be used to further refine the installation process, and can inform what should be written in documentation to simplify and streamline the experience for the general user population. Then, users should be allowed to input mock test data into the software. This will allow the development team to see any issues with the input routines, and function as an additional real-world test for the back-end portion.

By “installation”, I mean the set up required after the software is installed. For instance, logging in with the unique PIN that reportedly controlled authentication. I am not including the installation of the app software onto the device, which should not have been an issue at all — and which is covered in the following section.

Lack of release engineering

Software must be released to be used.

It appears that the developers of this software either did not have the software finished before the Iowa caucus began (requiring them to on-board every user as a beta tester), or they did not intend to have a proper ‘release’ of the software at any time (meaning every user was intended to be a beta tester). I could write a full article on the sad state of software release engineering, but I digress.

The software was distributed to users via a testing system, used for providing pre-release or “beta” versions to testers. This is an essential system to use when you have a test plan like what I described above. This is, however, a bad idea to use for releasing software for production.

On Apple’s platform, distributing final releases via TestFlight or TestFairy can result in your organisation being permanently banned from accessing any Apple developer material. Not counting the legal (contract law) issues surrounding such a release, on Android this requires your users to enable what is called “side-loading”, or installing software from untrusted third-party repositories.

All of the Iowa caucus precinct workers using the Android OS now have mobile devices configured in a severely vulnerable way, and they have had sideloading normalised as something that could be legitimate. The importance of this cannot be understated. This is a large security risk, and I am already wondering in the back of my mind how this will affect these same workers if they are involved with the general election in November. The company responsible for telling them to configure their mobile devices in this manner may, and in my opinion should, be liable for any data loss or exploitation that happens to these people.

My release plan document would have involved clearly defined milestones, with allowances for what features would be okay to postpone for later releases. This could include post-Iowa caucus releases, if necessary — the Nevada Democratic Party intended to use this software for their 22nd February caucus. Release planning should include both planned dates and required dates. For example:

  • Alpha release for internal testing. Plan: 6 December. Must: 13 December.
  • Beta release, sent for wider external testing. Plan: 3 January. Must: 10 January.
  • Final release, sent to Apple and Google app stores. Plan: 13 January. Must: 20 January.
  • Iowa Caucus: 3 February (hard).

Such a release plan would have given the respective app stores at least two weeks to approve the app for distribution.

Alternatively, if the goal was to avoid deployment to the general app stores of the mobile platforms, they could have used “business-internal” deployment solutions. Apple offers the Apple Business Manager; Google offers Managed Google Play. Both of these services are included with their respective developer subscriptions, so there is no additional cost for the development organisation.

Lack of security processes

Authentication control is important in all software, but especially so in election software. This team demonstrated to me a lack of understanding of proper security processes by providing the PIN on the same sheet of paper that would be used on the night of the election for vote tallying.

I would have had the PIN sent to the precinct workers via either email, or using a separate sheet which they could have in their wallet. Ideally, initial log in and authentication would have taken place on the device before the release, with the credentials stored in the secure portion of device storage (Secure Enclave on iPhone, TrustZone on Android). However, even if this is not possible, it was still possible to provide the PIN to users in a more secure manner.

Apparent lack of clearly defined specification

I have a sneaking suspicion that the combination of these failings mirror the many other development organisations who refuse to apply the discipline of engineering to their software projects. They are encouraged by bad stewards of engineering to “Move Fast and Break Things”. They are encouraged by snake-oil peddlers of “process improvement” that formal specification and testing are unnecessary burdens. And this must change.

I’m not alone in this call. Even the Venture Capitalist section of Harvard Business Review admits that this development culture is irresponsible and outdated. Software developers and project managers must be willing to #Disrupt the current industry norm and be willing to Move Moderately and Fix Things.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s