Like any other practice, a type of testing justifies its existence by the time it saves you. That is, presuming that software quality is otherwise ensured by manual testing, the criterion for deciding whether adding a test is appropriate is whether the time taken to write the test is less than the time the test is expected to save.
I want to touch on the notion of testing everything. Aside from being inefficient, I'm confident it isn't realistically possible. That is: I'd wager that for any real, used codebase with a set of tests, one can write an alteration that breaks the codebase while keeping all the tests passing.
If you accept the conclusion that testing everything isn't plausible (and no, 100% code coverage certainly isn't testing everything), then the question becomes: when and where do we test? Obviously, in the cases where the time saved exceeds the time spent implementing the tests. But let's break this formula down a little (a sketch of the arithmetic follows the list below). What factors influence the time saved, and so inform the decision of which things to test?
- The probability that code will break (and the number of times)
- The difficulty of manual testing
- The probability that the test will catch the code breaking
This is a start; let's break it down further:
- The probability that code will break (and the number of times)
  - The amount of change to be made to the code
  - The difficulty of the code
  - Team size (i.e., the number of people working on code they don't understand)
  - The intensity of external factors/dependencies (environments)
- The difficulty of manual testing
  - The variety of different cases in the piece of code ("branchiness")
  - The visibility of a problem when present
- The probability that the test will catch the code breaking
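To make that concrete, here is a minimal sketch of the cost/benefit arithmetic in Python, under the loud assumption that these factors can be reduced to rough numbers. Every name and figure in it is invented for illustration:

```python
# A minimal sketch of the cost/benefit estimate described above.
# All names and numbers are hypothetical illustrations, not measurements.

def is_test_worth_writing(
    hours_to_write_test: float,
    expected_breaks: float,          # probability of breaking x number of times
    prob_test_catches_break: float,  # chance the test actually notices a break
    hours_of_manual_testing: float,  # cost of verifying one change by hand
) -> bool:
    """Return True when the expected time saved exceeds the time spent."""
    expected_hours_saved = (
        expected_breaks * prob_test_catches_break * hours_of_manual_testing
    )
    return expected_hours_saved > hours_to_write_test

# The complicated-algorithm case below: breaks often, painful to verify.
print(is_test_worth_writing(4.0, 10.0, 0.8, 2.0))   # True  (16 hours saved vs. 4 spent)
# The background-color one-liner below: nearly free to check by eye.
print(is_test_worth_writing(0.5, 0.05, 0.9, 0.01))  # False (~0.0005 hours saved)
```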
Based on this list, a hypothetical case where implementing a unit test might be the most valuable: you have a complicated algorithm, built on advanced mathematics you don't entirely understand, which you will need to optimize several times for speed. Moreover, other team members will be working on it too. It relies on special GPU processing with varying logic for different hardware. It also has branching logic that requires at least 8 test cases to ensure accuracy. And because the math is so complex, determining whether the function has answered correctly requires checking every digit of a 15-digit number.
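As a hedged illustration of what such a test might look like, here is a sketch using pytest's parametrization. The function, its signature, and all the values are invented stand-ins, not code from any real project:

```python
import pytest

# Stand-in for the complicated algorithm described above. The real thing
# would involve the advanced math; this placeholder only makes the sketch
# runnable. All names and values are hypothetical.
def compute_flux(pressure: float, temperature: float) -> int:
    return int(pressure * 1e14 + temperature * 1e7)

# One case per branch of the algorithm (the scenario above calls for at
# least 8; two are shown). Exact equality checks every digit of the
# 15-digit result, which is precisely the comparison that is painful by hand.
@pytest.mark.parametrize(
    "pressure, temperature, expected",
    [
        (1.0, 2.5, 100000025000000),
        (3.2, 0.1, 320000001000000),
    ],
)
def test_compute_flux(pressure, temperature, expected):
    assert compute_flux(pressure, temperature) == expected
```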
A hypothetical case where implementing a unit test might be the least valuable: you are adding one line of code to your app's startup to set the background color to steel grey. It's a basic one-liner. Nobody else will touch your code, and you know for a fact from your waterfall documentation that this requirement will never change. Your app runs only on one very specific device on one OS. There is no special logic involved. And every time you start up the app (which takes 0.01 seconds), you'll immediately notice whether or not the background color is right.
I believe an expert engineer is always doing this math, with a margin of error in mind, for every piece of code they write. Any engineer who opts for a simpler equation is being simplistic, and is in that respect the less advanced engineer.