Code coverage is a meaningless metric
Oftentimes people ask: How much code coverage is sufficient? While I believe this is a legitimate question, especially when just getting started with testing, there is a better counter-question: Do you want to do Test-Driven Development or not?
This question has three possible answers:
- Yes
- No
- What is Test-Driven Development?
Test-Driven Development (TDD) is a practice that suggests following three simple rules, illustrated with a short code sketch further below:
- You are not allowed to write any production code unless it is to make a failing unit test pass.
- You are not allowed to write any more of a unit test than is sufficient to fail; and compilation failures are failures.
- You are not allowed to write any more production code than is sufficient to pass the one failing unit test.
Robert C. Martin, The Three Rules of TDD
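To make these rules concrete, here is a minimal sketch of one red-green cycle using PHPUnit; the Calculator class and its test are hypothetical and not taken from this article:

```php
<?php

use PHPUnit\Framework\TestCase;

// Rule 2: write no more of the unit test than is sufficient to fail.
// This test fails first, because Calculator does not exist yet
// (and a compilation or autoloading failure counts as a failure).
final class CalculatorTest extends TestCase
{
    public function testAddReturnsSumOfTwoIntegers(): void
    {
        $calculator = new Calculator();

        self::assertSame(5, $calculator->add(2, 3));
    }
}

// Rules 1 and 3: only now write production code, and no more of it
// than is needed to make the failing test pass.
final class Calculator
{
    public function add(int $augend, int $addend): int
    {
        return $augend + $addend;
    }
}
```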
If developers follow TDD, not a single line of production code is written without a failing test first. Consequently, the resulting code coverage will always be 100%, and code coverage becomes a meaningless metric.
If developers do not follow TDD, they likely write production code first and add tests afterwards only occasionally, or never at all. Consequently, code coverage will vary between 0% and 100% (a short calculation after the following lists illustrates how). Code coverage will decrease when
- tests are deleted
- untested code is added
- tested code is reformatted or refactored (the same problem is solved with fewer lines of code)
and it will increase when
- tests are added
- untested code is removed
- tested code is reformatted or refactored (the same problem is solved with more lines of code)
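Since line coverage is nothing more than the share of executable lines that tests execute, a quick calculation with made-up numbers shows why even pure refactoring moves the number:

```php
<?php

// Illustrative numbers only: line coverage is covered lines divided by executable lines.
$coverage = static fn (int $covered, int $uncovered): float => 100 * $covered / ($covered + $uncovered);

echo $coverage(80, 20), PHP_EOL;  // 80.0:  80 tested lines, 20 untested lines
echo $coverage(80, 40), PHP_EOL;  // ~66.7: 20 untested lines added, coverage drops
echo $coverage(60, 20), PHP_EOL;  // 75.0:  tested code refactored down to 60 lines, coverage drops
echo $coverage(100, 20), PHP_EOL; // ~83.3: tested code refactored up to 100 lines, coverage rises
```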
While you can achieve 100% code coverage when writing tests after the fact, how confident will you be that the tests are meaningful? How confident will you be that the code works as expected in production?
For example, I recently added a `--dry-run` option to `ergebnis/composer-normalize`. Since a collaborator from a third-party package is marked as `final` and does not implement an interface, I had no way to create a test double for it, as sketched below. If you look closely, the code I added to output a diff in `--dry-run` mode is largely untested. Still, I have 100% code coverage.
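The constraint looks roughly like this; the Differ and DryRunTest names are hypothetical stand-ins, not the actual code from ergebnis/composer-normalize. PHPUnit builds test doubles by generating a subclass of the doubled class, so a final class that implements no interface leaves nothing to double:

```php
<?php

use PHPUnit\Framework\TestCase;

// Hypothetical stand-in for the third-party collaborator: final, no interface.
final class Differ
{
    public function diff(string $before, string $after): string
    {
        return ''; // the real implementation lives in the third-party package
    }
}

final class DryRunTest extends TestCase
{
    public function testDryRunOutputsDiff(): void
    {
        // Fails: PHPUnit would have to extend Differ to build the double,
        // which `final` forbids, and there is no interface to double instead.
        $differ = $this->createMock(Differ::class);

        // ...
    }
}
```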
While the changes in code coverage might tell you something (what exactly, you have to find out for yourself), do you still think code coverage itself is a meaningful metric?
Personally, I have decided to follow the three rules of TDD. Following these rules simplifies a lot of things for me:
- There's no need to discuss what to test - every component is equally important and will be thoroughly tested.
- There's no need to discuss when to test - tests are always written first.
- There's no need to discuss how much code coverage is sufficient - it will always be 100%.
You might have a different point of view, and if that works for you and your clients, fine.
However, if you are still asking how much coverage you should aim for, here is a tweet by Marco Pivetta for you:
Broke production because I didn't bother covering >90%, and affected code path was in the 10%. To those that say testing everything isn't worth doing: STFU.
— Marco Pivetta (@Ocramius), January 31, 2018
If you are still hesitant, remember that the hardest part of testing is getting started.
Good luck!