Deleting tests is a best practice

As mentioned in a previous blog post, when I started programming, testing software was a rare practice. And like a lot of folks (even if they might not admit it), when I started reading about automated testing… it felt like a waste of time to me.

But as time passed and I worked on different kinds of projects, I learned how important testing is for ensuring the quality of software over time.

During this journey, my conception of automated testing evolved a lot. When I was younger, I tended to read tech articles like gospel. If a famous developer suggested that unit tests were a panacea, I assumed it must hold true for any project and any code, and that if I didn’t reach 100% coverage, I was doing it wrong.

As you might expect, my opinion changed a lot over time, and I have fewer certainties these days. What I really learned is that any policy, for any given project, depends on the priorities and the context of that project, and that there’s no such thing as a best practice or an anti-pattern that holds true across all contexts. This applies to tests as much as to any other development-related practice.

So when it comes to testing, I’ve developed some intuitions that I want to share, along with some of the reasoning behind them.

Don’t take my word for it, though: build your own intuitions based on your specific context.

  • In general, when starting a new project that still needs time to prove its value, with no guarantee that it will last, I rarely write tests. My main priority is to make sure the experiment is valid and the proof of concept is worth it before investing in tests.
  • When projects mature, involve several developers, and have seen contributors come and go, it becomes very important to invest in a strong testing policy. What I consider a good testing policy these days is a mix of unit tests and end-to-end tests for the critical paths of your project.
  • I consider testing small components, functions, and straightforward code as mostly useless. The tests are often a duplication of the code logic and need to be updated every time you make a change to the production code. That said, it’s not always easy to identify this kind of test up front, but we shouldn’t be afraid of removing tests if they prove to be useless and problematic to maintain. Removing tests shouldn’t be taboo.
  • I don’t overuse unit tests. They shine for complex code with a well-defined API: functions that have clear inputs and outputs, where the path from inputs to outputs is not straightforward and requires advanced logic.
  • I prefer end-to-end tests in most cases, as they test the behavior of the software. End-to-end tests mean different things for different kinds of projects, though. For a website, a mobile application, or any user-facing software, they are tests that simulate user interactions in headless browsers or device simulators. For packages and libraries, they are usually referred to as integration tests. They often resemble unit tests, except that they essentially exercise the external APIs.
  • End-to-end tests for user-facing applications are very important for avoiding regressions. The tooling has made substantial progress in the last couple of years, with things like Docker, Puppeteer, and headless browsers, but these tests remain fragile and generally take a long time to run. So it’s important to be smart about what you’re testing: focus on the critical paths, without forgetting the maintenance cost of these tests.
  • We can sometimes be tempted to rely on tests based on generated fixtures and snapshots to quickly increase coverage. Fixture-based tests perform a complex operation multiple times, slightly changing the inputs each time and saving a snapshot of the output. I’ve seen them used for navigating to pages and capturing the HTML of specific areas of the page, or for parsing hundreds of documents and saving the results. I would personally avoid this type of test as much as possible. While they do increase coverage very quickly, they fail to reach the main goal of a testing policy: ensuring software stability. The main reason is the human aspect: the expected results of this kind of test are often unclear. When a failure happens, developers get confused about whether the changes to the fixtures are expected given the code change they made, or whether it’s a real failure. Over time, they develop the habit of regenerating the fixtures when the tests fail without giving it much thought. I don’t blame the developers for that; I see the test without clear expectations as the main issue here. Again, in these cases, removing tests shouldn’t be seen as a bad practice.
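
To make the contrast between a test that duplicates the code logic and a unit test on complex code with clear inputs and outputs more concrete, here is a minimal sketch in Python. Every name in it (`full_name`, `parse_duration`, and the two test functions) is hypothetical and chosen only for illustration; none of it comes from a real project.

```python
import re


def full_name(first: str, last: str) -> str:
    """Hypothetical straightforward helper: nothing here can really go wrong."""
    return f"{first} {last}"


def parse_duration(text: str) -> int:
    """Hypothetical helper: convert strings such as '1h30m' into seconds.

    Units must appear in the order hours, minutes, seconds.
    """
    match = re.fullmatch(r"(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?", text)
    # The pattern also matches the empty string, so reject it explicitly.
    if not text or match is None:
        raise ValueError(f"invalid duration: {text!r}")
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds


def test_full_name_mirrors_implementation():
    # Low value: the expectation restates the implementation line for line,
    # so it has to be edited whenever the production code is edited.
    first, last = "Ada", "Lovelace"
    assert full_name(first, last) == f"{first} {last}"


def test_parse_duration_behavior():
    # Higher value: clear inputs and outputs, with non-trivial logic in between.
    assert parse_duration("1h30m") == 5400
    assert parse_duration("90m") == 5400
    assert parse_duration("45s") == 45
    for invalid in ("", "30m1h", "1.5h"):
        try:
            parse_duration(invalid)
        except ValueError:
            continue
        raise AssertionError(f"{invalid!r} should have been rejected")


if __name__ == "__main__":
    test_full_name_mirrors_implementation()
    test_parse_duration_behavior()
```

The first test is the kind I would not hesitate to delete: it changes whenever `full_name` changes and is unlikely to ever catch a bug. The second one pins down behavior that is easy to break during a refactor, which is where unit tests tend to pay for themselves.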

These are some practices and intuitions I’ve developed over time, and I’m certain that I’ll continue to reconsider some of them and build new ones. If I have a single piece of advice to give, it would be to always consider the project’s priorities and context when defining policies, best practices, and anti-patterns. These change from one project to another and evolve over time within the same project.

4 responses to “Deleting tests is a best practice”

  1. The tests are often a duplication of the code logic and need to be updated every time you make a change to the production code. That said, initially, it’s not always easy to identify this kind of tests but we shouldn’t be afraid of removing tests if they prove to be useless and problematic to maintain. Removing tests shouldn’t be taboo.

    If this is happening, you’re doing_it_wrong. Your architecture should not be duplicating logic. If it is, including in tests themselves, then you need to break some classes or methods up to not do this.

    For instance, if you’re building a feature that has an admin UX component to trigger an event and a CLI component that does the same thing, the two codeflows should be using the same code and not duplicating it. Similarly, if your test has to duplicate code in production just to prove a unit test, you’re also doing it wrong. It too should be using reusable code.

    Never remove tests. That’s how you get unexpected regressions. If the tests don’t pass, fix your code.

    • Thanks for your insights. These practices took a lot of time to mature.
      I appreciate that you think differently, and that’s fine. If a different approach is working for you, stick with it; I’m just sharing what I’ve learned over the years.

  2. In general, when starting a new project that still needs time to prove its value with no guarantee to last in time, I rarely write tests. My main priority is to make sure the experiment is valid and the proof of concept is worth it before investing in tests.

    I really resonate with this. Especially in the context of historical WordPress where there isn’t a rigid framework in place, when trying to take the project in a new direction it hasn’t gone before, the road to implementation can be very windy. It’s impossible to do test-driven-development when needing to find creative solutions to work around many many years of legacy code.

  3. Well, I just spent time writing tests for a codebase which I’d deprecated months ago. I don’t agree, and I think tests with quality coverage at as high a percentage as possible are the desired state, but I am very grateful for your work on WordPress. The codebase I have just been tinkering with is a sort of Kata exercise. Practice so that when I need {X} I’ll have it available. It uses some of your Gutenberg work from production and proof-of-concept.

    It is nearly impossible to TDD with 100% coverage and maintain velocity and not be sad if a heap of work is thrown away; but I do think every line of code that hits production should have some test suite for every feature, at least in all intended green paths. I too had given up around 2018 on ever seeing 100% in a production project. A fantastic lead developer and open source maintainer, Morgan Roderick, showed me not only that it was possible, but maintainable.

    I still may not always attain 100% code coverage, and I also know that having it isn’t always a mark of high software quality; but I do try to get the numbers up over time and invest in things I know will be of value. Without tests there, it’s a lot of human work to back-fill that assurance.
