Friday, 24 June 2011

Test resistant code and the battle for TDD

I have been watching a number of videos from the NDC 2011 conference recently, and noticed a number of speakers expressing the sentiment that TDD has won.

By “won”, they don’t mean that everyone is using it, because true TDD practitioners are still very much in the minority, Even those who claim to be doing TDD are in reality only doing it sometimes. In fact, I would suggest that even the practice of writing unit tests for everything, let alone writing them first, is far from being the norm in the industry.

Of course, what they mean is that the argument has been won. No prominent thought-leaders are speaking out against TDD; its benefits are clear and obvious. The theory is that, it is only a matter of time before we know no other way of working.

Or is it?

There is a problem with doing TDD in languages like C# that its most vocal proponents are not talking enough about. And that is that a lot of the code we write is test resistant. By test resistant, I don’t mean “impossible to test”. I just mean that the effort required to shape it into a form that can be tested is so great that even if we are really sold on the idea of TDD, we give up in frustration once we actually try it.

I’ve got a whole list of different types of code that I consider to be “test-resistant”, but I’ll just focus in on one for the purposes of this post. And that is code that has lots of external dependencies. TDD is quite straightforward if you happen to be writing a program to calculate the prime factors of a number; you don’t have any significant external dependencies to worry about. The same is largely true if you happen to be writing a unit testing framework or an IoC container. But for many, and quite probably the majority of us, the code we write interacts with all kinds of nasty hard to test external stuff. And that stuff is what gets in our way.

External dependencies

External dependencies come in many flavours. Does your class talk to the file-system or to a database? Does it inherit from a UserControl base class? Does it create threads? Does it talk across the network? Does it use a third party library of any sort? If the answer to any of these questions is yes, the chances are your class is test-resistant.

Such classes can of course be constructed and have their methods called by an automated testing framework, but those tests will be “integration” tests. Their success depends on the environment in which they are run. Integration tests have their place, but if large parts of our codebase can only be covered by integration tests, then we have lost the benefits of TDD. We can’t quickly run the tests and prove that our system is still working.

Suppose we draw a dependency diagram of all the classes in our application arranged inside a circle. If a class has any external dependencies we’ll draw it on the edge of the circle. If a class only depends on other classes we wrote, we’ll draw it in the middle. We might end up with something looking like this:

Dependencies Diagram

In my diagram, the green squares in the middle are the classes that we will probably be able to unit test without too much pain. They depend only on other code we have written (or on trivial to test framework classes like String, TimeSpan, Point etc).

The orange squares represent classes that we might be able to ‘unit’ test, since file systems and databases are reasonably predictable. We could create a test database to run against, or use temporary files that are deleted after the test finishes.

The red squares represent classes that will be a real pain to test. If we must talk to a web-service, what happens when it is not available? If we are writing a GUI component, we probably have to use some kind of unwieldy automation tool to create it and simulate mouse-clicks and keyboard presses, and use bitmap comparisons to see if the right thing happened.

DIP to the rescue?

But hang on a minute, don’t we have a solution to this problem already? It’s called the “Dependency Inversion Principle”. Each and every external dependency should be hidden behind an interface. The concrete implementers of those interfaces should write the absolute minimal code to fulfil those interfaces, with no logic whatsoever.

Now suddenly all our business logic has moved inside the circle. The remaining concrete classes on the edge still need to be covered by integration tests, but we can verify all the decisions, algorithms and rules that make up our application using fast, repeatable, in-memory unit tests.

All’s well then. External dependencies are not a barrier to TDD after all. Or are they?

How many interfaces would you actually need to create to get to this utopian state where all your logic is testable? If the applications you write are anything like the ones I work on, the answer is, a lot.

IFileSystem

Let’s work through the file system as an example. We could create IFileSystem and decree that every class in our application that needs to access the disk goes via IFileSystem. What methods would it need to have? A quick scan through the application I am currently working on reveals the following dependencies on static methods in the System.IO namespace:

  • File.Exists
  • File.Delete
  • File.Copy
  • File.OpenWrite
  • File.Open
  • File.Move
  • File.Create
  • File.GetAttributes
  • File.SetAttributes
  • File.WriteAllBytes
  • File.ReadLines
  • File.ReadAllLines
  • Directory,Exists
  • Directory.CreateDirectory
  • Directory.GetFiles
  • Directory.GetDirectories
  • Directory.Delete
  • DriveInfo.GetDrives

There’s probably some others that I missed, but that doesn’t seem too bad. Sure it would take a long time to go through the entire app and make everyone use IFileSystem, but if we had used TDD, then IFileSystem could have been in there from the beginning. We could start off with just a few methods, and then add new ones in on an as-needed basis.

The trouble is, our abstraction layer needs to go deeper than just wrappers for all the static methods on System.IO. For example, Directory.GetFiles returns a FileInfo object. But the FileInfo class has all kinds of methods on it that allow external dependencies to sneak back into our classes via the back door. What’s more, it’s a sealed class, so we can’t fake it for our unit tests anyway. So now we need to create IFileInfo for our IFileSystem to return when you ask for the files in a folder. If we keep going in this direction it won’t be long before we end up writing an abstraction layer for the entire .NET framework class library.

Interface Explosion

This is a problem. You could blame Microsoft. Maybe the BCL/FCL should have come with interfaces for everything that was more than just a trivial data transfer object. That would certainly have made life easier for unit testing. But this would also add literally thousands of interfaces. And if we wanted to apply the “interface segregation principle” as well, we’d end up needing to make even more interfaces, because the ones we had were a bad fit for the classes that need to consume them.

So for us to do TDD properly in C#, we need to get used to the idea of making lots of interfaces. It will be like the good old days of C/C++ programming all over again. For every class, you also need to make a header / interface file. 

Mocks to the Rescue?

Is there another way? Well I suppose we could buy those commercial tools that have the power to replace any concrete dependency with a mock object. Or maybe Microsoft’s Moles can dig us out of this hole. But are these frameworks solutions to problems we shouldn’t be dealing with in the first place?

There is of course a whole class of languages that doesn’t suffer from this problem at all. And that is dynamic languages. Mocking a dependency is trivial with a dynamically typed language. The concept of interfaces is not needed at all.

This makes me think that if TDD really is going to win, dynamically typed languages need to win. The limited class of problems that a statically typed language can protect us from can quite easily be detected with a good suite of unit tests. It leaves us in a kind of catch 22 situation. Our statically typed languages are hindering us from embracing TDD, but until we are really doing TDD, we’re not ready to let go of the statically typed safety net.

Maybe language innovations like C# 4’s dynamic or the coming compiler as a service in the next .NET will afford enough flexibility to make TDD flow more naturally. But I get the feeling that TDD still has a few battles to win before the war can truly be declared as over.

read the follow-on articles here:

No comments: