Coding to the Requirements

Do you know what I love about software developers? They’ll just code to exactly what the requirements or user stories say. I have a little joke to myself that, with the dramatic industry shift to everything-as-code (e.g. infrastructure-as-code, software-defined networking, etc.), pretty much all jobs will one day be “software developer”. Take network design, for example. It used to be a networking specialist hand-crafting an artisanal configuration for your datacenter, laying all the cable, and configuring all the gear. Today you just hand the task to a software developer along with the requirements for how the systems must communicate with each other, and they go off and write the code and the unit tests to verify the code meets those requirements.

(Now this is obviously just a joke, I’m well aware there’s way more to it than that.)

So what’s interesting about this is the idea that a programmer will code to exactly whatever the requirements are, and with CI/CD pipelines and unit tests the code will be verified to meet those requirements. That’s great, but there’s an important piece to that puzzle: the requirements must be correct and complete!

There’s plenty of focus in software engineering placed on the validity of the requirements. That is, software engineering is generally focused on working with users to ensure not only that we are “building the thing right”, but that we are “building the right thing”. These functional requirements can only be developed and verified by the end users themselves, in conjunction with people who have enough contextual knowledge of the domain to bring that flair for innovation. But there’s another category of requirements as well: “quality” requirements.

What do I mean by that?

Essentially, almost every software project comes with a pre-defined, exists-for-every-project set of requirements. For example:

  • Ensure that the system can handle unexpected input types
  • Ensure memory is freed up when no longer in use (I don’t just mean free(), I also mean scope your variables properly)
  • Ensure that the system can be modified in the future, to add, remove, or change capabilities (affects both data structure and function design)
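As a minimal sketch of what the first two bullets look like in practice (hypothetical function names, in Python), input validation means refusing to assume a clean value, and “freeing memory” often just means scoping resources tightly:

```python
def parse_port(raw):
    """Validate unexpected input types and values rather than assuming a clean int."""
    try:
        port = int(raw)
    except (TypeError, ValueError):
        raise ValueError(f"port must be an integer, got {raw!r}")
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port


def count_lines(path):
    """Scope resources tightly: the file handle is released when the block exits."""
    with open(path) as f:  # closed automatically, even if an error is raised
        return sum(1 for _ in f)
```

Neither of these would ever appear in a user story, yet a reviewer would rightly flag their absence.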

Where do these requirements come from? For some of them, the expectation is that a skilled and experienced software developer just builds this quality in. For example, when an experienced car mechanic replaces an oil filter, they know just the right tightness for the new one; this is assumed to be inherent knowledge that any car mechanic should have. The same applies to software engineering. It is assumed that any software developer will do input validation or bounds checking as needed; there’s no written requirement to do that for every single function or method you write. But if you’re working with inexperienced developers, then introducing some of these requirements holistically across your project may have value.

Luckily, for many programming languages you can simply add quality analysis tools to your pipeline, such as pylint for Python or Brakeman for Ruby on Rails, and these will help catch quality issues. But with so many *-as-code constructs these days, many with their own domain-specific languages (such as CloudFormation or Terraform), there may not always be a pre-existing automated way to check for these types of quality issues in the code pipeline.
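There’s no magic in wiring such a tool into a pipeline, though: it usually boils down to running the linter, parsing its report, and failing the build past some threshold. A minimal sketch of that gate logic (the gate function itself is hypothetical; it assumes pylint’s JSON report format, where each finding carries a `type` such as “convention”, “warning”, “error”, or “fatal”):

```python
import json


def gate(report_json, blocking_types=("error", "fatal")):
    """Fail the pipeline stage if the lint report contains blocking findings.

    `report_json` is the kind of output `pylint --output-format=json` emits:
    a JSON array of findings, each a dict with a "type" and a "message".
    Returns (passed, blockers).
    """
    findings = json.loads(report_json)
    blockers = [f for f in findings if f["type"] in blocking_types]
    return len(blockers) == 0, blockers


# Sample report data, as pylint might emit it (not a live run):
sample = ('[{"type": "warning", "message": "unused variable"},'
          ' {"type": "error", "message": "undefined name"}]')
passed, blockers = gate(sample)  # passed is False: one "error" finding blocks
```

The same pattern works for any tool that can emit a machine-readable report, which is exactly what’s often missing for younger *-as-code DSLs.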

When I was leading teams of developers, each team building sets of infrastructure-as-code to deploy the applications they were responsible for into a cloud-based virtual datacenter, I created a set of generic “architectural requirements” that every project was required to meet, on top of whatever functional requirements were defined by either the users or the application itself. I’m working from memory here, but if I recall correctly, the requirements were as follows:

  • All user-facing web services must utilize a load balancer
  • A server in the load-balanced cluster can be forcibly terminated with no impact or loss of capability to the users
  • A server in the load-balanced cluster can be forcibly terminated and must be able to be brought back to a known-good working state (Threshold: 4 hours, Objective: 15 minutes and completely automated)
  • The complete application stack must be able to be deployed to a greenfield environment in an automated fashion, with only an initial human interaction at the beginning

Note that each of these requirements could be tested, either by inspection (Is there a load balancer?) or by actual system test (OK, let’s forcibly terminate an instance in the autoscaling group and see what happens).
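An inspection-style check of the first requirement can even be sketched as ordinary code (the stack description here is a hypothetical dict; a real check would inspect the template or query the cloud provider’s API instead):

```python
def check_load_balancer(stack):
    """Architectural requirement: every user-facing web service sits behind a
    load balancer. `stack` is a hypothetical dict describing deployed
    resources, e.g. parsed from an infrastructure-as-code template.
    Returns the names of services that violate the requirement."""
    failures = []
    for svc in stack.get("services", []):
        if svc.get("user_facing") and not svc.get("load_balancer"):
            failures.append(svc["name"])
    return failures  # an empty list means the requirement is met


stack = {
    "services": [
        {"name": "web", "user_facing": True, "load_balancer": "web-alb"},
        {"name": "admin", "user_facing": True, "load_balancer": None},
        {"name": "worker", "user_facing": False, "load_balancer": None},
    ]
}
```

Because the requirement is written in a testable form, the check is mechanical; the system-test variants (terminate an instance, time the rebuild) follow the same pass/fail spirit.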

Defining these quality requirements upfront with the developers, including the defined test cases, actually helped with effort estimation, because the team knew exactly what was expected of their product and how it would be tested. And no surprises were waiting for us down the line, such as suddenly realizing that the product crashed frequently and we needed to take turns with pager duty on weekends.

The lesson here is to take some time and think about what well-written non-functional requirements you can add to your backlog to drive the product’s quality in the right direction. This ensures that, in addition to delivering functional capabilities to the users, you are delivering a high-quality product as well.
