RailsConf Dispatch - Test Always?: How not thinking carefully about your test suite can hold you back

There were two conveniently sequential presentations today at RailsConf that reminded me of some thoughts I'd had about testing: Michael Feathers' talk on legacy code and Glenn Vanderburg's talk on real software engineering. It seems to me that both talks had a theme in common: what is the function of tests? Why do we want them, what role do they play in the larger engineering process, and what precisely are they meant to indicate to us?

Michael at one point talked about the expense of 100% test coverage, recommending instead that we test the parts of the code that change the least and matter the most. Ugly code in legacy projects has utility, he explained, and untested code is a rational response to churn. Afterwards, Glenn discussed software development in the context of engineering principles from older, more established disciplines like structural engineering, finding areas of similarity, analogy, and outright difference. On testing, though, he compared experiments in code to experiments in the physical engineering fields, remarking on how relatively cheap tests are for us. I suppose the common thread I found was the emphasis on cost: doing our job well means doing it effectively, not subordinating our conscience and creativity to a mechanical process.

For some background, I've been practicing behavior-driven development for a year or two. I love the confidence that testing gives me, independent of its value to the client. Verifying that my code works is fine and all, but what lets me sleep at night is the assurance that comes from approaching a problem in a rational, systematic manner. By moving in small chunks and expressing problems in terms I understand well enough to programmatically recreate, I ground myself in a real comprehension of the system I'm building at the most relevant level and stage. I avoid the confusion of jumping ahead, thinking at too large or too small a scale, or making unwarranted assumptions that come back to bite.
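
To make that concrete, here's a minimal sketch of the red-green loop I mean, in RSpec against a hypothetical Cart class (the example is mine, not from either talk): state a small expectation, watch it fail, write just enough code to pass.

    # cart_spec.rb - run with the `rspec` command.
    # Cart is deliberately tiny; each piece of behavior was written
    # only after one of the specs below demanded it.
    class Cart
      def initialize
        @prices = []
      end

      def add(price)
        @prices << price
      end

      def total
        @prices.sum
      end
    end

    RSpec.describe Cart do
      it "starts with a total of zero" do
        expect(Cart.new.total).to eq(0)
      end

      it "totals the prices of added items" do
        cart = Cart.new
        cart.add(300)
        cart.add(200)
        expect(cart.total).to eq(500)
      end
    end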

But I've found there are definitely times when testing first is the wrong approach. Remember: testing is supposed to reinforce your understanding of the problem. But what happens when you fundamentally don't understand it? When you first encounter a project, you don't necessarily have expectations or any way of identifying what a successful outcome looks like. Test-first is supposed to get you thinking about these things, but there's no substitute for writing and running code.
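
When I'm in that state, a throwaway spike teaches me more than a premature spec. Something like the following, run from irb or a scratch script (the API and its fields are hypothetical), just to see what I'm actually dealing with:

    require "net/http"
    require "json"

    # No assertions here: the goal is to discover the shape of the
    # problem, not to verify anything yet.
    body = Net::HTTP.get(URI("https://api.example.com/widgets"))
    data = JSON.parse(body)

    puts data.class                           # Array? Hash? No idea yet.
    puts data.first.keys if data.is_a?(Array) # which fields actually matter?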

Experimentation, trial and error, and playing around are important discovery mechanisms that give us the understanding we then apply to more rigorous processes. Spending time writing tests that do nothing but reinforce the fact that you don't know what you're doing is stupid. I've found you have to think carefully about what you actually expect your tests to accomplish for you, because you can send yourself down rabbit holes needlessly by stressing form over function.

Similarly, as Feathers pointed out, even well-understood requirements and algorithms often cost more to test than the tests are worth. Code that changes often causes test churn that burns up effort needlessly. And often we write large swaths of code that, while useful, just aren't that critical to the success or failure of the project. Given these tensions, 100% coverage may be not just an unreasonable goal - it may be a positively wasteful one.

Again, the problem runs deeper, as Glenn pointed out in his talk. Historically (and surprisingly, in spite of salient warnings as early as the '60s), "Software Engineering" with capital letters has been biased toward a philosophy of formal, defined process models that demand predictability and reproducibility. Although behavior-driven development may seem to fit that philosophy, remember that when we're dealing with behavior we're not necessarily achieving mathematical precision at the level many engineers would expect. Accuracy is important in some engineering contexts, but flexibility can sometimes trump it.

A final thought that I consider my unique contribution to this discussion: if you do test first, consider that your tests may be disposable. It's great that you're using tests to drive the development, and building a test suite imparts tremendous satisfaction as you pile on more and more features. But just because a given test is useful for writing the code doesn't mean it's useful for verifying the project's success. It may in fact be, but that should be a demonstrable standard - don't let your personal satisfaction distract you from the big-picture goal of the project.
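
For example (hypothetical names, again): a spec like the one below might have been invaluable for driving out a slug helper, yet it pins an implementation detail the client never sees. Once a higher-level spec covers the visible behavior, deleting it may serve the project better than maintaining it.

    # A scaffolding spec: great for coaxing the code into existence,
    # questionable as a permanent fixture of the suite.
    def slugify(title)
      title.downcase.gsub(/[^a-z0-9]+/, "-").gsub(/\A-+|-+\z/, "")
    end

    RSpec.describe "slugify" do
      it "lowercases and hyphenates a title" do
        expect(slugify("Hello, World!")).to eq("hello-world")
      end
    end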

100% code coverage is increasingly seen as superfluous because pockets of code churn are simply a fact of software development. You may find the changes are so targeted and frequent that exploration and manual play pay off more than working out programmatic unit tests. And why wouldn't your test suite accumulate cruft just like your source code? Perhaps your legacy tests reflect an emphasis on certain requirements that have proven too rigid. In that sense, paring down the tests-as-specs is entirely consistent with the project's interests, since it keeps them reflecting the actual state of the project. Tests out of sync with current requirements are often worse than no tests at all.

Conventions and best practices are no excuse to forget our responsibilities to the client's budget. The point is not to adopt a rule that tests are no longer important to maintain; rather, avoid blindly following any rule for its own sake, especially when you're being paid to use your brain. Your job is not to write code; it's to deliver client value, and that requires careful thought. Depending on the importance of a given section of code, rewriting that unit test may be less useful than scrapping it for the moment - or perhaps focusing on integration tests instead. Always be willing to step outside your comfort zone to better understand the project and realize its criteria for success - even if you lose a bit of sleep over it.
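
As a sketch of that last trade-off (the routes and markup below are hypothetical), a Rails integration test can pin down the path the client actually cares about while the unit-level details underneath stay free to churn:

    require "test_helper"

    # Exercises the whole checkout path end to end; refactoring the
    # controllers and models underneath won't break it as long as the
    # user-visible behavior holds.
    class CheckoutFlowTest < ActionDispatch::IntegrationTest
      test "a visitor can order a product" do
        get "/products/1"
        assert_response :success

        post "/orders", params: { order: { product_id: 1, quantity: 2 } }
        follow_redirect!
        assert_select "h1", "Thanks for your order!"
      end
    end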

Written on Tuesday, June 08, 2010 | Tags: ruby, rails, bdd, testing, development, railsconf