Thursday, 30 June 2011

Test-Resistant Code #4–Third Party Frameworks

This one might seem like a repeat of my first post in this series, where I talked about dependencies being the biggest barrier to unit testing, and the challenge of creating abstraction layers for everything just to make things testable. However, even if we manage to isolate our third party framework entirely behind an abstraction layer, a few conundrums remain when it comes to actually testing what we have done. I’ll just focus on two example scenarios:

Magic Invocations

Writing code that consumes a third party API can be one of the most difficult developer experiences. You write the code you think ought to work, based on the somewhat unreliable documentation, and it fails with some bizarre error. You then spend the next few days trying every possible combination of parameters, ordering of method calls, and scouring the web for any help you can get. Eventually you stumble across the “magic invocation” – the perfect combination of method calls and parameters that makes it work. You might have no clue why, but you are just relieved to be finally done with this part of the development experience, and ready to move onto something else.

As an example, consider this code I wrote once that attempts to set the recording level for a given sound card input. It’s pretty hacky stuff, with a different codepath for different versions of Windows. It “works on my machine”, but I’m sure there are some systems on which it does the wrong thing.

private void TryGetVolumeControl()
{
    int waveInDeviceNumber = waveIn.DeviceNumber;
    if (Environment.OSVersion.Version.Major >= 6) // Vista and over
    {
        var mixerLine = waveIn.GetMixerLine();
        //new MixerLine((IntPtr)waveInDeviceNumber, 0, MixerFlags.WaveIn);
        foreach (var control in mixerLine.Controls)
        {
            if (control.ControlType == MixerControlType.Volume)
            {
                this.volumeControl = control as UnsignedMixerControl;
                MicrophoneLevel = desiredVolume;
                break;
            }
        }
    }
    else
    {
        var mixer = new Mixer(waveInDeviceNumber);
        foreach (var destination in mixer.Destinations)
        {
            if (destination.ComponentType == MixerLineComponentType.DestinationWaveIn)
            {
                foreach (var source in destination.Sources)
                {
                    if (source.ComponentType == MixerLineComponentType.SourceMicrophone)
                    {
                        foreach (var control in source.Controls)
                        {
                            if (control.ControlType == MixerControlType.Volume)
                            {
                                volumeControl = control as UnsignedMixerControl;
                                MicrophoneLevel = desiredVolume;
                                break;
                            }
                        }
                    }
                }
            }
        }
    }
}

Having finally made this wretched thing work, does it matter whether I have any “unit tests” for it or not? I could write an “integration test” for this code which actually opens an audio device for recording and attempts to set the microphone level. But whether it worked would not be automatically verifiable: you would have to manually examine the recorded audio to see if the volume had been adjusted.

What about real “unit tests”? Is it possible to unit test this? Well in one sense, yes. I could wrap an abstraction layer around all the third party framework calls and the operating system version checking. Then I could test that my foreach loops are indeed picking out the “mixer control” I think they should in both cases, and that the desired level was set on that control.

But supposing I did write that unit test. What would it prove? It would prove little more than that my code does what it does. It doesn’t prove that it does the right thing. The best it can do is demonstrate that the behaviour that I know “works on my machine” is still working. In other words, I could create mocks that return data mimicking my own soundcard’s controls and ensure that the algorithm always picks the same mixer control for that particular test case.
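
To make that concrete, here is a rough sketch of the abstraction-plus-mock approach. IMixerApi, MixerControlInfo and MicrophoneLevelSetter are invented names, not NAudio types, and the test uses NUnit and Moq. Note how little it really pins down: just the behaviour I already believe works on my machine.

using System.Collections.Generic;
using System.Linq;
using Moq;
using NUnit.Framework;

public interface IMixerApi
{
    IEnumerable<MixerControlInfo> GetWaveInVolumeControls(int deviceNumber);
}

public class MixerControlInfo
{
    public double Level { get; set; }
}

public class MicrophoneLevelSetter
{
    private readonly IMixerApi mixerApi;

    public MicrophoneLevelSetter(IMixerApi mixerApi)
    {
        this.mixerApi = mixerApi;
    }

    public void SetLevel(int deviceNumber, double desiredLevel)
    {
        // mirror the "magic invocation": take the first volume control we find
        var control = mixerApi.GetWaveInVolumeControls(deviceNumber).FirstOrDefault();
        if (control != null)
        {
            control.Level = desiredLevel;
        }
    }
}

[TestFixture]
public class MicrophoneLevelSetterTests
{
    [Test]
    public void SetsDesiredLevelOnFirstVolumeControl()
    {
        var control = new MixerControlInfo();
        var mixerApi = new Mock<IMixerApi>();
        mixerApi.Setup(m => m.GetWaveInVolumeControls(0)).Returns(new[] { control });

        new MicrophoneLevelSetter(mixerApi.Object).SetLevel(0, 0.75);

        Assert.AreEqual(0.75, control.Level);
    }
}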

However, I think this sort of testing fails the cost-benefit analysis. It requires a lot of abstraction layers to be created just for the purpose of creating a test that has dubious value. Tests should verify that requirements are being met, not check up on how they are implemented.

Writing Plugins

Another common scenario is when you are writing code that will be called by a third party framework – as a plugin of sorts. In this case, you are often inheriting from a base class. The public interface has already been decided for you, for better or worse. Also, depending on the framework, you may not be entirely sure what the possible inputs to your overridden functions might be, or in what order they might be called.

The approach I tend to take here is to put as little code in the derived class as possible. Instead, it calls out into my own, more testable classes. This works quite well, and leaves you to choose how much effort to put into testing the derived plugin class itself, which may be best left to a proper integration test with the real system.
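
In sketch form, the plugin becomes a thin shim over an ordinary class (the framework base class here is entirely made up for illustration):

// stand-in for the third party framework's base class
public abstract class ThirdPartyPluginBase
{
    public abstract string OnMessageReceived(string message);
}

// the actual plugin: as thin as possible, no logic of its own
public class MyPlugin : ThirdPartyPluginBase
{
    private readonly MessageProcessor processor = new MessageProcessor();

    public override string OnMessageReceived(string message)
    {
        return processor.Process(message);
    }
}

// all the interesting behaviour lives here, and can be unit tested directly
public class MessageProcessor
{
    public string Process(string message)
    {
        return (message ?? string.Empty).Trim().ToUpperInvariant();
    }
}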

One place this approach doesn’t work well for me is with my own framework, NAudio. In NAudio, you often create small classes that implement IWaveProvider or inherit from WaveStream. These return audio from a Read method in a byte array, which is great for the soundcard, but not at all friendly for writing unit tests against. I could again move my logic out into a separate, testable class, but this doubles my number of classes and reduces performance (a very important consideration in streaming audio). Add to that the fact that audio effects and processors tend to be hard to verify automatically, and I wonder whether this is another case where the effort to write everything in a strictly TDD manner doesn’t really pay off.
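
To make the trade-off concrete, here is roughly what that split looks like for a simple gain control. These are illustrative names rather than real NAudio classes, and the sketch assumes 16-bit PCM audio: the GainAdjuster is trivial to unit test, but the IWaveProvider wrapper adds an extra class and extra buffer copying on the audio path.

using System;
using NAudio.Wave;

// the testable core: operates on samples, no byte arrays in sight
public class GainAdjuster
{
    public float Gain { get; set; }

    public GainAdjuster()
    {
        Gain = 1.0f;
    }

    public void Process(short[] samples)
    {
        for (int n = 0; n < samples.Length; n++)
        {
            samples[n] = (short)Math.Max(short.MinValue, Math.Min(short.MaxValue, samples[n] * Gain));
        }
    }
}

// the thin IWaveProvider wrapper that the soundcard actually consumes
public class GainWaveProvider : IWaveProvider
{
    private readonly IWaveProvider source;
    public GainAdjuster Adjuster { get; private set; }

    public GainWaveProvider(IWaveProvider source)
    {
        this.source = source;
        Adjuster = new GainAdjuster();
    }

    public WaveFormat WaveFormat
    {
        get { return source.WaveFormat; }
    }

    public int Read(byte[] buffer, int offset, int count)
    {
        int bytesRead = source.Read(buffer, offset, count);
        // the byte <-> short shuffling is pure plumbing, and costs two extra copies per buffer
        // (assumes 16-bit PCM, so bytesRead is always an even number)
        var samples = new short[bytesRead / 2];
        Buffer.BlockCopy(buffer, offset, samples, 0, bytesRead);
        Adjuster.Process(samples);
        Buffer.BlockCopy(samples, 0, buffer, offset, bytesRead);
        return bytesRead;
    }
}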

Anyway, I’m nearly at the end of this series now. Just one more form of test-resistant code left to discuss, before I move on to share some ideas I’ve had on how to better automate the testing of some of this code.

Wednesday, 29 June 2011

Test-Resistant Code #3–Algorithms

This is the third part in a mini-series on code that is hard to test (and therefore hard to develop using TDD). Part 1: Test-Resistant Code and the battle for TDD, Part 2: Test-Resistant Code – Markup is Code Too

You may have seen Uncle Bob solve the prime factors kata using TDD. It is a great demonstration of the power of TDD to allow you to think through an algorithm in small steps. In a recent talk at NDC 2011 entitled “The Transformation Priority Premise”, he gave some guidelines to help prevent creating bad algorithms. In other words, following the right rules, you can use TDD to arrive at quicksort instead of bubble sort (I tried this briefly and failed miserably). It seems likely that all sorts of algorithms can be teased out with TDD (although I did read of someone trying to create a Sudoku solver this way and finding it got them nowhere).

Whether or not TDD is suited to tackling those three problems (prime factors, list sorting, sudoku), they share one common attribute: they are easily verifiable algorithms. I know the expected outputs for all my test cases.

Not all algorithms are like that – we don’t always know what the correct output is. And as clever as the transformation priority premise might be for iterating towards optimal algorithms, I am not convinced that it is an appropriate attack vector for many of the algorithms we typically require in real-world projects. Let me explain with a few simple examples of algorithms I have needed for various projects.

Example 1 – Shuffle

My first example is a really basic one. I needed to shuffle a list. Can this be done with TDD? The first two cases are easy enough – we know what the result should be if we shuffle an empty list, or shuffle a list containing one item. But when I shuffle a list with two items, what should happen? Maybe I could have a unit test that kept going until it got both possibilities. I could even shuffle my list of two items 100 times and check that the number of times I got each combination was within a couple of standard deviations of the expected answer of 50. Then when I pass my list of three items in, I can do the same again, checking that all 6 possible results can occur, followed by more tests checking that the distribution of shuffles is evenly spread.

So it is TDDable then. But really, is that a sensible way to approach a task like this? The statistical analysis needed to prove the algorithm requires more development effort than writing the algorithm in the first place.

But there is a deeper problem. Most sensible programmers know that the best approach to this type of problem is to use someone else’s solution. There’s a great answer on stackoverflow that I used. But that violates the principles of TDD: instead of writing small bits of code to pass simple tests, I jumped straight to the solution. Now we have a chunk of code that doesn’t have unit tests. Should we retro-fit them? We’ll come back to that.
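
For the record, the borrowed solution is tiny. The kind of answer you find there is a Fisher–Yates shuffle, something along these lines:

using System;
using System.Collections.Generic;

public static class ShuffleExtensions
{
    private static readonly Random rng = new Random();

    // Fisher-Yates: walk backwards through the list, swapping each element
    // with a randomly chosen element at or before its position
    public static void Shuffle<T>(this IList<T> list)
    {
        for (int n = list.Count - 1; n > 0; n--)
        {
            int k = rng.Next(n + 1);
            T temp = list[n];
            list[n] = list[k];
            list[k] = temp;
        }
    }
}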

Example 2 – Low Pass Filter

Here’s another real-world example. I needed a low pass filter that could cut all audio frequencies above 4kHz out of an audio file sampled at 16kHz. The ideal low pass filter has unity gain for all frequencies below the cut-off frequency, and completely removes all frequencies above. The only trouble is, such a filter is impossible to create – you have to compromise.

A bit of research revealed that I probably wanted a Chebyshev filter with a low pass-band ripple and a fairly steep roll-off of 18dB per octave or more. Now, can TDD help me out making this thing? Not really. First of all, it’s even harder than the shuffle algorithm to verify in code. I’d need to make various signal generators such as sine wave and white noise generators, and then create Fast Fourier Transforms to measure the frequencies of audio that had been passed through the filter. Again, far more work than actually creating the filter.
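
To give a flavour of what that verification machinery looks like, here is a sketch of just one such test, using a sine generator and a crude RMS comparison instead of a full FFT analysis. ChebyshevLowPassFilter and its Process method are placeholders for whatever filter implementation you end up with.

using System;
using NUnit.Framework;

[TestFixture]
public class LowPassFilterTests
{
    [Test]
    public void StronglyAttenuatesToneAboveCutoff()
    {
        const int sampleRate = 16000;
        var filter = new ChebyshevLowPassFilter(sampleRate, 4000); // placeholder type

        float[] input = GenerateSine(6000, sampleRate, sampleRate); // one second of a 6kHz tone
        float[] output = filter.Process(input);

        // a stop-band tone should come out much quieter than it went in
        Assert.Less(Rms(output), Rms(input) * 0.1);
    }

    private static float[] GenerateSine(double frequencyHz, int sampleRate, int sampleCount)
    {
        var samples = new float[sampleCount];
        for (int n = 0; n < sampleCount; n++)
        {
            samples[n] = (float)Math.Sin(2 * Math.PI * frequencyHz * n / sampleRate);
        }
        return samples;
    }

    private static double Rms(float[] samples)
    {
        double sumOfSquares = 0;
        foreach (float sample in samples)
        {
            sumOfSquares += sample * sample;
        }
        return Math.Sqrt(sumOfSquares / samples.Length);
    }
}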

But again, what is the sane approach to a problem like this? It is to port someone else’s solution (in my case I found some C code I could use) and then perform manual testing to prove the correctness of the code.

Example 3 – Fixture Scheduler

My third example is of an algorithm I have attempted a few times. It is to generate fixture lists. In many sports leagues, each team must play each other team twice a season – once home, once away. Ideally each team should normally alternate between playing at home and then away. Two home or away games in a row is OK from time to time, three is bad, and four in a row is not acceptable. In addition it is best if the two games between the same teams are separated by at least a month or two. Finally, there may be some additional constraints. Arsenal and Tottenham cannot both play at home in the same week. Same for Liverpool and Everton, and for Man Utd and Man City.

What you end up with is a problem that has multiple valid solutions, but not all solutions are equally good. For example, it would not go down too well if one team’s fixtures went HAHAHAHA while another’s went HHAAHHAAHHAA.

So we could (and should) certainly write unit tests that check that the generated fixtures meet the requirements without breaking any of the constraints. But fine tuning this algorithm is likely to at least be a partially manual process, as we tweak various aspects of it and then compare results to see if we have improved things or not. In fact, I can imagine the final solution to this containing some randomness, and you might run the fixture generator a few times, assigning a “score” to each solution and pick the best.
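
The hard constraints at least are straightforward to assert against whatever the generator produces. Something along these lines, where FixtureGenerator, Fixture and their members are invented for the sketch:

using System.Linq;
using NUnit.Framework;

[TestFixture]
public class FixtureGeneratorTests
{
    private readonly string[] teams = { "Arsenal", "Tottenham", "Liverpool", "Everton" };

    [Test]
    public void NoTeamEverPlaysFourHomeGamesInARow()
    {
        var fixtures = new FixtureGenerator(teams).Generate(); // invented API

        foreach (var team in teams)
        {
            int consecutiveHomeGames = 0;
            foreach (var fixture in fixtures.Where(f => f.Involves(team)).OrderBy(f => f.Week))
            {
                consecutiveHomeGames = (fixture.HomeTeam == team) ? consecutiveHomeGames + 1 : 0;
                Assert.Less(consecutiveHomeGames, 4, "Too many consecutive home games for " + team);
            }
        }
    }
}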

Should you use TDD for algorithms?

When you can borrow a ready-made and reliable algorithm, then retro-fitting a comprehensive unit test suite to it may actually be a waste of time better spent elsewhere.

Having said that, if you can write unit tests for an algorithm you need to develop yourself, then do so. One of TDD’s key benefits is simply that of helping you come up with a good public API, so even if you can’t work out how to thoroughly validate the output, it can still help you design the interface better. Of course, a test without an assert isn’t usually much of a test at all; it’s more like a code sample. But at the very least running it assures that it doesn’t crash, and there is usually something you can check about its output. For example, the filtered audio should be the same duration as the input, and not be entirely silent.

So we could end up with tests for our algorithm that don’t fully prove that our code is correct. It means that the refactoring safety net that a thorough unit test suite offers isn’t there for us. But certain classes of algorithms (such as my shuffle and low pass filter examples) are inherently unlikely to change. Once they work, we are done. Business rules and customer specific requirements on the other hand, can change repeatedly through the life of a system. Focus on testing those thoroughly and don’t worry too much if some of your algorithms cannot be verified automatically.

Tuesday, 28 June 2011

How to Unit Test Silverlight Apps

I recently tried writing some unit tests for a Silverlight 4 project. The first barrier I ran into is that the typical way of creating unit tests – create a DLL, reference NUnit and your assembly under test – doesn’t work, since you can’t mix Silverlight assemblies with standard .NET assemblies.

The next hurdle is the fact that the Silverlight developer tools do not include a unit testing framework, and searching for Silverlight unit testing tools on the web can lead you down a few dead ends. It turns out that the thing you need is the Silverlight Toolkit. If you are just doing Silverlight 4, you only need the April 2010 toolkit.

Once this is installed, you get a template that allows you to add a “Silverlight Unit Test Application” to your project. This is a separate Silverlight application that contains a custom UI just to run unit tests.

Now you can get down to business writing your unit tests. You have to use the Microsoft unit testing attributes, i.e. [TestClass], [TestMethod], [TestInitialize]. You can also use the [Tag] attribute, which is the equivalent of NUnit’s [Category] to group your unit tests, allowing you to run a subset of tests easily.
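
A test class ends up looking much like a regular MSTest one. Here’s a trivial example (OrderCalculator is just an illustrative class under test):

using Microsoft.Silverlight.Testing;                 // provides [Tag]
using Microsoft.VisualStudio.TestTools.UnitTesting;  // provides [TestClass], [TestMethod], [TestInitialize]

[TestClass]
public class OrderCalculatorTests
{
    private OrderCalculator calculator;

    [TestInitialize]
    public void SetUp()
    {
        calculator = new OrderCalculator();
    }

    [TestMethod]
    [Tag("Pricing")]
    public void NoDiscountAppliedToSmallOrders()
    {
        Assert.AreEqual(100m, calculator.CalculateTotal(100m));
    }
}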

It has a fairly nice GUI (though not without a few quirks) that displays the outcome of your tests for you, and allows you to copy the test results to your clipboard:

[screenshot: the Silverlight unit test results window]

Unfortunately it doesn’t support Assert.Inconclusive (it gets counted as a failure), but apart from that it works as expected.

Testing GUI Components

Another interesting capability that the Silverlight Unit Testing framework offers is that it can actually host your user controls and run tests against them.

Unfortunately some of the documentation that is out there is a little out of date, so I’ll run through the basics here.

You start off by creating a test class that inherits from the SilverlightTest base class. Then in a method decorated with the TestInitialize attribute, create an instance of your user control and add it to the TestPanel (a property on the base class, note that it has changed name since a lot of older web tutorials were written).

private MyUserControl control;

[TestInitialize]
public void SetUp()
{
    control = new MyUserControl();
    this.TestPanel.Children.Add(control);
}

Note that if you don’t do this in the TestInitialize method and try to do it in your test method itself, the control won’t have time to actually load.

There is a nasty gotcha. If your user control uses something from a resource dictionary, then the creation of your control will fail in the TestInitialize method, but for some reason the test framework ploughs on and calls the TestMethod anyway, resulting in a confusing failure message. You need to get the contents of your app.xaml into the unit test application.

I already had my resource dictionaries split out into multiple files, which helped a lot. I added the resource xaml files to the test application as linked files, and then just referenced them as Merged dictionaries in my app.xaml:

<Application.Resources>
    <ResourceDictionary>
        <ResourceDictionary.MergedDictionaries>
            <ResourceDictionary Source="MarkRoundButton.xaml"/>
        </ResourceDictionary.MergedDictionaries>
    </ResourceDictionary>
</Application.Resources>

In your actual test method, you can now perform tests on your user control, knowing that it has been properly created and sized.

[TestMethod]
public void DisplayDefaultSize()
{
    Assert.IsTrue(control.ActualWidth > 0);
}

There are also some powerful capabilities in there to allow you to run “asynchronous” tests, but I will save that for a future blog post as I have a few interesting ideas for how they could be used.

Monday, 27 June 2011

Test Resistant Code #2–Markup is Code Too

After blogging about one of the biggest obstacles to doing TDD, I thought I’d follow up with another form of test resistant code, and that is markup.

Now calling markup ‘code’ might seem odd, because we have been led to believe that these are two completely different things. After all, applications = code + markup, right? It’s the code that has the logic and all the important stuff to test. Markup is just structured data.

But of course, we all know that you can have bugs in markup. 100% test coverage of your code alone does not guarantee that your customer will have a happy experience. In fact, I would go so far as to say, my experience with Silverlight and WPF programming is that the vast majority of the bugs are to do with me getting the markup wrong in some way. 100% unit coverage of my ViewModels is nice, but doesn’t really help me solve the most challenging problems I face. If I declare an animation in XAML, I have to watch it to know if I got it right. If I get it wrong, I see nothing, and I have no idea why.

Exhibit A – XAML

Here’s a bunch of XAML I wrote for a Silverlight star rating control I made recently:

<UserControl x:Class="MarkHeath.StarRating.Star"
    xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
    xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
    xmlns:d="http://schemas.microsoft.com/expression/blend/2008"
    xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
    mc:Ignorable="d"
    d:DesignHeight="34" d:DesignWidth="34" Foreground="#C0C000">
    <Canvas x:Name="LayoutRoot">
        <Canvas.RenderTransform>
            <ScaleTransform x:Name="scaleTransform" />
        </Canvas.RenderTransform>
        <Path x:Name="Fill" Fill="{Binding Parent.StarFillBrush, ElementName=LayoutRoot}" 
                Data="M 2,12 l 10,0 l 5,-10 l 5,10 l 10,0 l -7,10 l 2,10 l -10,-5 l -10,5 l 2,-10 Z" />

        <Path x:Name="HalfFill" Fill="{Binding Parent.HalfFillBrush, ElementName=LayoutRoot}"                
                Data="M 2,12 l 10,0 l 5,-10 v 25 l -10,5 l 2,-10 Z M 34,34" />

        <Path x:Name="Outline" Stroke="{Binding Parent.Foreground, ElementName=LayoutRoot}"     
                StrokeThickness="{Binding Parent.StrokeThickness, ElementName=LayoutRoot}" 
                StrokeLineJoin="{Binding Parent.StrokeLineJoin, ElementName=LayoutRoot}"
                Data="M 2,12 l 10,0 l 5,-10 l 5,10 l 10,0 l -7,10 l 2,10 l -10,-5 l -10,5 l 2,-10 Z" />
        <!-- width and height of this star is 34-->
    </Canvas>
</UserControl>

Does this XAML have any ‘bugs’ in it? Well, not that I know of, but in the process of developing it, I ran into a bunch of problems. For a long time the top level element was a Grid rather than a Canvas, but strange things were happening when I resized it. I also resisted putting the ScaleTransform in there for a while, convinced I could do something with the Stretch properties on the Path objects to get the resizing to happen automatically (it was the half star that spoiled it).

Was this ‘real’ development? I think it was. XAML is after all just a way of creating objects and setting properties on them with a syntax that is more convenient for designer tools to work with. I could create exactly the same effect by creating a UserControl class, and setting its Content property to an instance of Canvas and so on.
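
For example, a stripped-down code-behind version of the same structure might look something like this. I’ve left out building the Path geometries, which in code means assembling PathFigure and LineSegment objects by hand, and the colours are placeholders:

using System.Windows.Controls;
using System.Windows.Media;
using System.Windows.Shapes;

public class CodeOnlyStar : UserControl
{
    public CodeOnlyStar()
    {
        var layoutRoot = new Canvas();
        layoutRoot.RenderTransform = new ScaleTransform();

        var fill = new Path { Fill = new SolidColorBrush(Colors.Yellow) };
        var outline = new Path { Stroke = new SolidColorBrush(Colors.Black), StrokeThickness = 2.0 };
        // fill.Data and outline.Data would be built up here from PathFigure/LineSegment objects -
        // exactly the tedium that the XAML path mini-language ("M 2,12 l 10,0 ...") hides

        layoutRoot.Children.Add(fill);
        layoutRoot.Children.Add(outline);

        Content = layoutRoot;
    }
}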

Could I have developed this using TDD? Hmmm. Well you can always write a test for a piece of code. The trouble comes when you try to decide whether that test has passed or not. For this code, the only sane way of validating it was for me to visually inspect it at various sizes and see if it looked right.

Exhibit B – Build Scripts

I recently had the misfortune of having to do some work on MSBuild scripts. Don’t get me wrong, MSBuild is a powerful and feature rich build framework. It’s just that I had to resort to books and StackOverflow every time I attempted the simplest of tasks.

It is not long before it becomes clear that a build script is a program, and not simply structured data. In MSBuild, you can declare variables, call subroutines (‘Targets’), and write if statements (‘Conditions’). A build script is a program with a very important task – it creates the application you ship to customers. Can there be bugs in a build script? Yes. Can those bugs affect the customer? Yes. Would it be useful to be able to test the build script too? Yes.

Which is why I felt a bit jealous when I watched a recent presentation from Shay Friedman on IronRuby, and discovered that with rake you can write your build scripts in Ruby. I’ve not learned the Ruby language myself yet, but I would love my build processes to be defined using IronPython instead of some XML monstrosity. Not only would it be far more productive, it would be testable too.

Exhibit C – P/Invoke

My final example might take you by surprise. As part of NAudio, I write a lot of P/Invoke wrappers. It is a laborious, error-prone process. A bug in one of those wrappers can cause bad things to happen, like crash your application, or even reboot your PC if your soundcard drivers are not well behaved.

Now although these wrappers are written in C#, they are not ‘code’ in the classic sense. They are just a bunch of class and method definitions. There is no logic and there are no expressions in there at all. In fact, it is the attributes that they are decorated with that do a lot of the work. Here’s a relatively straightforward example that someone else contributed recently:

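// Interop mirror of the Win32 MMTIME structure: all the FieldOffset(4) members overlap,
// emulating the union in the C definition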
[StructLayout(LayoutKind.Explicit)]
struct MmTime
{
    public const int TIME_MS = 0x0001;
    public const int TIME_SAMPLES = 0x0002;
    public const int TIME_BYTES = 0x0004;

    [FieldOffset(0)]
    public UInt32 wType;
    [FieldOffset(4)]
    public UInt32 ms;
    [FieldOffset(4)]
    public UInt32 sample;
    [FieldOffset(4)]
    public UInt32 cb;
    [FieldOffset(4)]
    public UInt32 ticks;
    [FieldOffset(4)]
    public Byte smpteHour;
    [FieldOffset(5)]
    public Byte smpteMin;
    [FieldOffset(6)]
    public Byte smpteSec;
    [FieldOffset(7)]
    public Byte smpteFrame;
    [FieldOffset(8)]
    public Byte smpteFps;
    [FieldOffset(9)]
    public Byte smpteDummy;
    [FieldOffset(10)]
    public Byte smptePad0;
    [FieldOffset(11)]
    public Byte smptePad1;
    [FieldOffset(4)]
    public UInt32 midiSongPtrPos;
}

Can classes like this be written with TDD? Well sort of, but the tests wouldn’t really test anything, except that I wrote the code I thought I should write. How would I verify it? I can only prove these wrappers really work by actually calling the real Windows APIs – i.e. by running integration tests.

Now as it happens I can (and do) have a bunch of automated integration tests for NAudio. But it gets worse. Some of these wrappers I can only properly test by running them on Windows XP and Windows 7, in a 32 bit process and a 64 bit process, and with a bunch of different soundcards configured for every possible bit depth and byte ordering. Maybe in the future, virtualization platforms and unit test frameworks will become so integrated that I can just use attributes to tell NUnit to run the test on a whole host of emulated operating systems, processor architectures, and soundcards. But even that wouldn’t get me away from the fact that I actually have to hear the sound coming out of the speakers to know for sure that my test worked.

Has TDD over-promised?

So are these examples good counter-points to TDD? Or are they things that TDD never promised to solve in the first place? Uncle Bob may be able to run his test suite and ship FitNesse if it goes green without checking anything else at all. But I can’t see how I could ever get to that point with NAudio. Maybe the majority of the code we write in markup can’t be tested without manual tests. And if it can’t be tested without manual tests, does TDD even make sense as a practice for developing it in the first place?

For TDD to truly “win the war”, it needs to come clean about what problems it can solve for us, and what problems are left unsolved. Some test-resistant code (such as tightly coupled code) can be made testable. Other test-resistant code will always need an amount of manual testing and verification. The idea that we could get to the point of “not needing a QA department” does not seem realistic to me. TDD may mean a smaller QA department, but we cannot get away from the need for real people to verify things with their eyes and ears, and to verify them on a bunch of different hardware configurations that cannot be trivially emulated.

Friday, 24 June 2011

Test resistant code and the battle for TDD

I have been watching a number of videos from the NDC 2011 conference recently, and noticed a number of speakers expressing the sentiment that TDD has won.

By “won”, they don’t mean that everyone is using it, because true TDD practitioners are still very much in the minority. Even those who claim to be doing TDD are in reality only doing it sometimes. In fact, I would suggest that even the practice of writing unit tests for everything, let alone writing them first, is far from being the norm in the industry.

Of course, what they mean is that the argument has been won. No prominent thought-leaders are speaking out against TDD; its benefits are clear and obvious. The theory is that it is only a matter of time before we know no other way of working.

Or is it?

There is a problem with doing TDD in languages like C# that its most vocal proponents are not talking enough about. And that is that a lot of the code we write is test resistant. By test resistant, I don’t mean “impossible to test”. I just mean that the effort required to shape it into a form that can be tested is so great that even if we are really sold on the idea of TDD, we give up in frustration once we actually try it.

I’ve got a whole list of different types of code that I consider to be “test-resistant”, but I’ll just focus in on one for the purposes of this post. And that is code that has lots of external dependencies. TDD is quite straightforward if you happen to be writing a program to calculate the prime factors of a number; you don’t have any significant external dependencies to worry about. The same is largely true if you happen to be writing a unit testing framework or an IoC container. But for many, and quite probably the majority of us, the code we write interacts with all kinds of nasty hard to test external stuff. And that stuff is what gets in our way.

External dependencies

External dependencies come in many flavours. Does your class talk to the file-system or to a database? Does it inherit from a UserControl base class? Does it create threads? Does it talk across the network? Does it use a third party library of any sort? If the answer to any of these questions is yes, the chances are your class is test-resistant.

Such classes can of course be constructed and have their methods called by an automated testing framework, but those tests will be “integration” tests. Their success depends on the environment in which they are run. Integration tests have their place, but if large parts of our codebase can only be covered by integration tests, then we have lost the benefits of TDD. We can’t quickly run the tests and prove that our system is still working.

Suppose we draw a dependency diagram of all the classes in our application arranged inside a circle. If a class has any external dependencies we’ll draw it on the edge of the circle. If a class only depends on other classes we wrote, we’ll draw it in the middle. We might end up with something looking like this:

Dependencies Diagram

In my diagram, the green squares in the middle are the classes that we will probably be able to unit test without too much pain. They depend only on other code we have written (or on trivial to test framework classes like String, TimeSpan, Point etc).

The orange squares represent classes that we might be able to ‘unit’ test, since file systems and databases are reasonably predictable. We could create a test database to run against, or use temporary files that are deleted after the test finishes.

The red squares represent classes that will be a real pain to test. If we must talk to a web-service, what happens when it is not available? If we are writing a GUI component, we probably have to use some kind of unwieldy automation tool to create it and simulate mouse-clicks and keyboard presses, and use bitmap comparisons to see if the right thing happened.

DIP to the rescue?

But hang on a minute, don’t we have a solution to this problem already? It’s called the “Dependency Inversion Principle”. Each and every external dependency should be hidden behind an interface. The concrete implementers of those interfaces should write the absolute minimal code to fulfil those interfaces, with no logic whatsoever.

Now suddenly all our business logic has moved inside the circle. The remaining concrete classes on the edge still need to be covered by integration tests, but we can verify all the decisions, algorithms and rules that make up our application using fast, repeatable, in-memory unit tests.

All’s well then. External dependencies are not a barrier to TDD after all. Or are they?

How many interfaces would you actually need to create to get to this utopian state where all your logic is testable? If the applications you write are anything like the ones I work on, the answer is: a lot.

IFileSystem

Let’s work through the file system as an example. We could create IFileSystem and decree that every class in our application that needs to access the disk goes via IFileSystem. What methods would it need to have? A quick scan through the application I am currently working on reveals the following dependencies on static methods in the System.IO namespace:

  • File.Exists
  • File.Delete
  • File.Copy
  • File.OpenWrite
  • File.Open
  • File.Move
  • File.Create
  • File.GetAttributes
  • File.SetAttributes
  • File.WriteAllBytes
  • File.ReadLines
  • File.ReadAllLines
  • Directory.Exists
  • Directory.CreateDirectory
  • Directory.GetFiles
  • Directory.GetDirectories
  • Directory.Delete
  • DriveInfo.GetDrives

There’s probably some others that I missed, but that doesn’t seem too bad. Sure it would take a long time to go through the entire app and make everyone use IFileSystem, but if we had used TDD, then IFileSystem could have been in there from the beginning. We could start off with just a few methods, and then add new ones on an as-needed basis.
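
A first cut of that interface, covering only the calls my code actually makes, might look something like this (a sketch, not a complete abstraction):

using System.Collections.Generic;
using System.IO;

public interface IFileSystem
{
    bool FileExists(string path);
    void DeleteFile(string path);
    void CopyFile(string sourcePath, string destinationPath);
    Stream OpenWrite(string path);
    IEnumerable<string> ReadLines(string path);
    bool DirectoryExists(string path);
    void CreateDirectory(string path);
    IEnumerable<string> GetFiles(string path, string searchPattern);
}

// the "real" implementation is pure pass-through, with no logic worth unit testing
public class PhysicalFileSystem : IFileSystem
{
    public bool FileExists(string path) { return File.Exists(path); }
    public void DeleteFile(string path) { File.Delete(path); }
    public void CopyFile(string sourcePath, string destinationPath) { File.Copy(sourcePath, destinationPath); }
    public Stream OpenWrite(string path) { return File.OpenWrite(path); }
    public IEnumerable<string> ReadLines(string path) { return File.ReadLines(path); }
    public bool DirectoryExists(string path) { return Directory.Exists(path); }
    public void CreateDirectory(string path) { Directory.CreateDirectory(path); }
    public IEnumerable<string> GetFiles(string path, string searchPattern) { return Directory.GetFiles(path, searchPattern); }
}

Note that even here I have dodged part of the problem by having GetFiles return plain path strings; as soon as callers need file sizes or timestamps, we run straight into the issue below.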

The trouble is, our abstraction layer needs to go deeper than just wrappers for all the static methods on System.IO. For example, DirectoryInfo.GetFiles returns FileInfo objects. But the FileInfo class has all kinds of methods on it that allow external dependencies to sneak back into our classes via the back door. What’s more, it’s a sealed class, so we can’t fake it for our unit tests anyway. So now we need to create IFileInfo for our IFileSystem to return when you ask for the files in a folder. If we keep going in this direction it won’t be long before we end up writing an abstraction layer for the entire .NET Framework class library.

Interface Explosion

This is a problem. You could blame Microsoft. Maybe the BCL/FCL should have come with interfaces for everything that was more than just a trivial data transfer object. That would certainly have made life easier for unit testing. But this would also add literally thousands of interfaces. And if we wanted to apply the “interface segregation principle” as well, we’d end up needing to make even more interfaces, because the ones we had were a bad fit for the classes that need to consume them.

So for us to do TDD properly in C#, we need to get used to the idea of making lots of interfaces. It will be like the good old days of C/C++ programming all over again. For every class, you also need to make a header / interface file. 

Mocks to the Rescue?

Is there another way? Well I suppose we could buy those commercial tools that have the power to replace any concrete dependency with a mock object. Or maybe Microsoft’s Moles can dig us out of this hole. But are these frameworks solutions to problems we shouldn’t be dealing with in the first place?

There is of course a whole class of languages that doesn’t suffer from this problem at all: dynamically typed languages. Mocking a dependency is trivial with a dynamically typed language. The concept of interfaces is not needed at all.

This makes me think that if TDD really is going to win, dynamically typed languages need to win. The limited class of problems that a statically typed language can protect us from can quite easily be detected with a good suite of unit tests. It leaves us in a kind of catch 22 situation. Our statically typed languages are hindering us from embracing TDD, but until we are really doing TDD, we’re not ready to let go of the statically typed safety net.

Maybe language innovations like C# 4’s dynamic or the coming compiler as a service in the next .NET will afford enough flexibility to make TDD flow more naturally. But I get the feeling that TDD still has a few battles to win before the war can truly be declared as over.


Saturday, 18 June 2011

SOLID Code is Mergeable Code

Not every developer I speak to is convinced of the importance of adhering to the “SOLID” principles. They can seem a bit too abstract and “computer sciency” and disconnected from the immediate needs of delivering working software on time. “SOLID makes writing unit tests easier? So what, the customer didn’t ask for unit tests, they asked for an order billing system”.

But one of the very concrete benefits of adhering to SOLID is that it eases merging for projects that require parallel development on multiple branches. Many of the commercial products I have worked on have required old versions to be supported with bug fixes and new features for many years. It is not uncommon to have more than 10 active branches of code. As nice as it would be to simply require every customer to upgrade to the latest version, that is not possible in all commercial environments.

One customer took well over a year to roll out our product to all their numerous sites. During that time we released two new major versions of our software. Another customer requires a rigorous and lengthy approval testing phase before they will accept anything new at all. The result is that features and bug fixes often have to be merged through multiple branches, some of which have diverged significantly over the years.

So how do the SOLID principles aid merging?

The Single Responsibility Principle states that each class should only have a single reason to change. If you have a class that constantly requires changing on every branch, creating regular merge conflicts, the chances are it violates SRP. If each class has only a single responsibility there is only a merge conflict if both branches genuinely need to modify that single responsibility.

Classes that adhere to the Open Closed Principle don’t need to be changed at all (except to fix bugs). Instead, they can be extended from the outside. Two branches can therefore extend the same class in completely different ways without any merge conflict at all – each branch simply adds a new class to the project.

Designs that violate the Liskov Substitution Principle result in lots of switch and if statements littered throughout the code, that must be modified every time we introduce a new subclass into the system. In the canonical ‘shapes’ example, when the Triangle class is introduced on one branch, and the Hexagon class is created on another, the merge is trivial if the code abides by LSP. If the code doesn’t, you can expect a lot of merge conflicts as every place in the code that uses the Shape base class is likely to have been modified in both branches.
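
To spell that out with the shapes example (purely illustrative code): in the first style, both branches have to edit the same switch; in the second, each branch just adds a new class.

using System;

// switch-on-type style: every new shape forces an edit to this method,
// so a branch adding Triangle and a branch adding Hexagon collide right here
public static class AreaCalculator
{
    public static double AreaOf(Shape shape)
    {
        if (shape is Square) return ((Square)shape).Side * ((Square)shape).Side;
        if (shape is Circle) return Math.PI * ((Circle)shape).Radius * ((Circle)shape).Radius;
        throw new ArgumentException("Unknown shape type");
    }
}

// LSP-friendly style: each subclass is substitutable wherever Shape is used,
// and adding Triangle or Hexagon means adding a file, not editing shared code
public abstract class Shape
{
    public abstract double Area();
}

public class Square : Shape
{
    public double Side { get; set; }
    public override double Area() { return Side * Side; }
}

public class Circle : Shape
{
    public double Radius { get; set; }
    public override double Area() { return Math.PI * Radius * Radius; }
}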

The Interface Segregation Principle is in some ways the SRP for interfaces. If you adhere to ISP, you have many small, focused interfaces instead of few large, bloated interfaces. And the more your codebase is broken down into smaller parts, the lower the chances of two branches both needing to change the same files.

Finally, the Dependency Inversion Principle might not at first glance seem relevant to merges. But every application of DIP is in fact also a separation of concerns. The concern of creating your dependencies is separated away from actually using them. So applying DIP is always a step in the direction of SRP, which means you have smaller classes with more focused responsibilities. But DIP has another power that turns out to be very relevant for merging.

Code that adheres to the Dependency Inversion Principle is much easier to unit test. And a good suite of unit tests is an invaluable tool when merging lots of changes between parallel branches. This is because it is all too easy for merged code to compile and yet still be broken. When merges between branches are taking place on a weekly basis, there is simply not enough time to run a full manual regression test each time. Unit tests can step into the breach and give a high level of confidence that a merge has not broken existing features. This benefit alone repays the time invested in creating unit tests.

In summary, SOLID code is mergeable code. SOLID code is also unit testable code, and merges verified by unit tests are the safest kind of merges. If your development process involves working on parallel branches, you will save yourself a lot of pain by mastering the SOLID principles, and protecting legacy features with good unit test coverage.

Thursday, 16 June 2011

Essential Developer Principles #2–Don’t Reinvent the Wheel

For some reason, developers have a tendency towards a “not invented here” mentality. Sometimes it is simply because we are unaware of any pre-existing solutions that address our problem. But very often, we suspect that it will be far too complicated to use a third-party framework, and much quicker and simpler to write our own. After all, if we write our own we can make it work exactly how we want it to.

Reinventing the Framework 

The arguments for reinventing the wheel initially seem compelling, mainly because most programming problems seem a lot simpler at first than they really are.

So we head off to create our own chart library, our own deployment mechanism, our own XML parser, our own 3D rendering engine, our own unit test framework, our own database, and so on.

Months later we realise that our own custom framework is incomplete, undocumented and bug ridden. Instead of working on features that our customers have actually asked for, all our effort and attention has been diverted into maintaining these supporting libraries.

It is almost always the right choice to take advantage of a mature, well documented, existing framework, instead of deciding to create a bespoke implementation.

Repeatedly Reinventing

Another variation on the reinventing the wheel problem is when a single (often enterprise scale) application contains multiple solutions to the same problem. For example, there might be two separate bits of code devoted to reading and writing CSV files. Or several different systems for messaging between threads. This usually happens because the logic is so tightly coupled to its context that we can’t reuse it even if we want to.

... except when ...

As with most developer principles, this one has some exceptions. I think there are three main cases in which it is OK to reinvent the wheel. These are…

1. Reinvent to learn

We don’t need another blog engine, but we do need more developers who know how to build one. Likewise, we probably don’t need yet another ORM, or yet another IoC container, or yet another MVVM or MVC framework. But the process of building one, even if it is never completed, is an invaluable learning exercise. So go ahead, invent your own version control system, encryption algorithm or operating system. Just don’t use it in your next commercial product unless you have a lot of spare time on your hands.

2. Reinvent a simpler wheel

There are times when the existing offerings are so powerful and feature-rich that it seems overkill to drag a huge framework into your application just to use a fraction of its powers. In these cases it might make sense to make your own lean, stripped down, focused component that does only what you need.

However, beware of feature creep, and beware of problems that seem simple until you begin to tackle them. A good example of creating simpler alternatives is Micro-ORMs, which are just a few hundred lines of code, stripped down to the bare essentials of what is actually needed for the task at hand.

3. Reinvent a better wheel

There are those rare people who can look at the existing frameworks, see a limitation with them, imagine a better alternative, and build it. It takes a lot of time to pull this off, so it doesn’t make sense if you only need it for one project. If however, you are building a framework to base all the applications you ever write on, there is a chance that it will pay for itself over time.

An example of this in action is FubuMVC, an alternative to ASP.NET MVC that was designed to work exactly the way its creators wanted. The key is to realise that they have gone on to build a lot of commercial applications on that framework.

In summary, don’t reinvent the wheel unless you have a really good reason to. You’ll end up wasting a lot of time and resources on development that is only tangentially related to your business goals.

Wednesday, 15 June 2011

Essential Developer Principles #1–Divide and Conquer

With this post I am starting a new series in which I intend to cover a miscellany of developer ‘principles’. They are in no particular order, and I’m not promising any great regularity but I am intending to use some of the material I present here as part of the developer training where I work, so please feel free to suggest improvements or offer counterpoints.

The essence of software development is divide and conquer. You take a large problem, and break it into smaller pieces. Each of those smaller problems is broken into further pieces until you finally get down to something you can actually solve with lines of code.

If you can’t break problems down into constituent parts, you don’t have what it takes to be a programmer. Apparently some who purport to be developers can’t even solve trivial problems like FizzBuzz. They can’t see that to solve FizzBuzz (the “big” problem), all they need to do is solve a few much simpler problems (count from 1 to 100, see if a number is divisible by three, see if a number is divisible by 5, print a message). Once you have done that, the pieces aren’t too hard to put together into a solution to the big problem.
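
For completeness, here is one such decomposition of FizzBuzz, with each sub-problem becoming a method so small it is obviously correct:

using System;

public static class FizzBuzz
{
    public static void Run()
    {
        for (int n = 1; n <= 100; n++)   // count from 1 to 100
        {
            Console.WriteLine(MessageFor(n));
        }
    }

    // decide what to print for a single number
    private static string MessageFor(int n)
    {
        if (IsDivisibleBy(n, 3) && IsDivisibleBy(n, 5)) return "FizzBuzz";
        if (IsDivisibleBy(n, 3)) return "Fizz";
        if (IsDivisibleBy(n, 5)) return "Buzz";
        return n.ToString();
    }

    private static bool IsDivisibleBy(int number, int divisor)
    {
        return number % divisor == 0;
    }
}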

Cutting the Cake

But being a good developer is not just about being able to decompose problems. If you need to cut a square birthday cake into four equal pieces there are several ways of doing it. You could cut it into four squares, four triangles, or four rectangles, to name just a few of the options. Which is best? Well that depends on whether there are chocolate buttons on top of the cake. You’d better make sure everyone gets an equal share of those too or there will be trouble.

A good developer doesn’t just see that a problem can be broken down; they see several ways to break it down and select the most appropriate.

For example, you can slice your ASP.NET application up by every web page having all its logic and database access in the code behind – vertical slices if you like. Or you can slice it up in a different way and have your presentation, your business logic, and your database access as horizontal slices. Both are ways of taking a big problem and cutting it into smaller pieces. Hopefully I don’t need to tell you which of those is a better choice.

Keep Cutting

Alongside the mistake of cutting a problem up in the wrong way lies the equally dangerous mistake of failing to cut the problem up small enough.

In the world of .NET you might decide that for a particular problem you need two applications – a client and a server. Then you decide that the client needs two assemblies, or DLLs. And inside each of those modules you create a couple of classes.

[diagram 1: a client and a server, each split into a couple of assemblies, each containing a couple of classes]

All this is well and good, but it means that your tree of hierarchy has only got three levels. As your implementation progresses, something has to receive all the new code you will write. Unless you are willing to break the problem up further, what will happen is that those classes on the third level will grow to contain hundreds of long and complicated methods. In other words, our classes bloat:

[diagram 2: the same three-level hierarchy, with the classes at the bottom grown huge]

It doesn’t need to be this way. We can almost always divide the problem that a class solves into two or more smaller, more focussed problems. Understanding the divide and conquer principle means we keep breaking the problem down until we have classes that adhere to the “single responsibility principle” – they do just one thing. These classes will be made up of a small number of short methods. Essentially, we add extra levels to our hierarchy, each one breaking the problem down smaller until we reach an appropriately simple, understandable, and testable chunk:

[diagram 3: the hierarchy extended downwards into small, single-responsibility classes]

I’ve not shown it on my diagram, but an important side benefit is that a lot of the small components we create when we break our problems up like this turn out to be reusable. We’ve actually made some of the other problems in our application easier to solve, by virtue of breaking things down to an appropriate level of granularity. There are more classes, but less code.

And that’s it. Divide and conquer is all it takes to be a programmer. But to be a great programmer, you need to know where to cut, and to keep on cutting until you’ve got everything into bite-sized chunks.

Monday, 13 June 2011

Silverlight Star Rating Control

I found myself needing a star rating control for a Silverlight application recently, and although I found a few open source ones, none of them worked quite how I wanted them to. Besides, I designed my own star shape in XAML a while ago and wanted to use that as the basis.

[screenshot: the star rating control]

In particular I wanted the ability to have half-star ratings. I thought at first it would require me to use some kind of complicated clipping construct, with a rectangle hidden behind my star, but I realised I could cheat by creating a half-star shape. Each star is a UserControl made up of three shapes – a star fill, a half star on top and then an outline on top.

I also wanted the colours to change as you hovered the mouse over it, indicating what rating would be given were you to click at any moment. This required the background of the control to be painted with a transparent brush, otherwise mouse events are only received while you are over stars and not in between them.

The biggest difficulty was making resize work. Path objects can be a pain to resize. I ended up putting them on a Canvas and using a ScaleTransform to get them the right size. All the brushes are customisable (six of them), plus you can change the number of stars, the line joins (pointy or round edged stars) and the line thicknesses. You can also turn off editing to simply display a star rating.

The Silverlight Star Rating Control is open source and available on CodePlex. The easiest way to install it is using NuGet.

To have a play with the star rating control, you can try it in my (rather chaotic looking) test harness.

Saturday, 11 June 2011

MSBuild task to update AssemblyVersion and AssemblyFileVersion in C# files

A while ago I decided I needed an MSBuild task that could update the version numbers in my AssemblyInfo.cs file. I had a search around the web, but all the custom tasks I found seemed to work in a different way to what I wanted. So I created my own.

What I wanted was a simple way to say “increment the revision number but leave other numbers the same”, or “set the version to exactly this”. So I came up with a simple syntax to let me do just that. For example:

<SetVersion FileName="AssemblyInfo.cs" AssemblyVersion="1.2.+.=" />

This means, set the major version to 1, set the minor version to 2, increment the revision number and leave the build number at whatever it was before. You can also specify a rule for AssemblyFileVersion if you want to keep them both in sync.
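
Under the hood, applying a rule like that is just a per-component decision. Here is a sketch of the idea (not the actual SetVersionTask source):

// "1.2.+.=" applied to "1.2.3.4" gives "1.2.4.4":
// a literal number sets that component, "+" increments it, "=" leaves it alone
public static string ApplyVersionRule(string rule, string currentVersion)
{
    string[] ruleParts = rule.Split('.');
    string[] versionParts = currentVersion.Split('.');
    var result = new string[ruleParts.Length];

    for (int i = 0; i < ruleParts.Length; i++)
    {
        switch (ruleParts[i])
        {
            case "+": result[i] = (int.Parse(versionParts[i]) + 1).ToString(); break;
            case "=": result[i] = versionParts[i]; break;
            default:  result[i] = ruleParts[i]; break;
        }
    }
    return string.Join(".", result);
}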

SetVersionTask is open source, and hosted on BitBucket. You can download the build of the latest version there, and read more detailed instructions on how to set it up in your MSBuild script.

I had originally planned to make it work for nuspec files too, but I don’t have a pressing need for that just yet. Shouldn’t be too hard to add though. Feel free to make use of it if it does what you need.

Monday, 6 June 2011

So … is Silverlight dead then?

Microsoft’s Windows 8 announcement has generated a surprising amount of controversy given that what was shown (a fancy new start menu that can be navigated using touch alone) was exactly what everyone was expecting. The only thing that could conceivably be described as a surprise was that Microsoft’s offering for the slate will be the “full” Windows OS, rather than the Windows Phone 7 OS adapted for a larger screen.

Apparently you will be able to make stuff for this new touch interface with HTML5 and JavaScript. Quite how that makes Silverlight dead I am not sure. I certainly don’t see any reason to panic. Given that it is full Windows, you will be able to write apps for it in VB6, Delphi or QBasic if you really want to. If you can’t write Windows 8 apps in Silverlight I’ll eat my hat. And if there is, as expected, a Windows 8 app store, you can be sure you’ll be able to sell Silverlight apps on it (in fact, if they base it on the Windows Phone app store, then Silverlight might even be the preferred/required dev platform).

I suppose it is possible that you won’t be able to make “tiles” using Silverlight (though they haven’t announced that you can’t), but I can’t get too worked up about that. I don’t want my applications consuming loads of memory and CPU cycles before I load them. Tiles should only be doing minimal processing.

Which brings us to the question everyone is asking. Is Silverlight dead?

When people call a technology “dead”, what they often really mean is “stable” and “mature”. I still develop WinForms apps on a daily basis. The API is effectively complete. It hasn’t had any meaningful new features for years. But it is a mature framework. It does what it does well enough, and if Microsoft were still actively trying to shoehorn new features into it that would be a bad thing. After a while, what is needed is not new features, but a new development paradigm.

WPF was, at first, that new paradigm, making it easy to do what was a nightmare in WinForms. And it followed a similar path to maturity. It’s got all the main capabilities you need to build impressive looking LOB apps. Microsoft got some things right with WPF, but like WinForms, it too has its limitations. I am not concerned that there isn’t a vast number of new features being announced for it every six months.

And guess what, Silverlight is on the same trajectory. Early releases picked up new capabilities at rapid pace. The fact it was chosen as the Windows Phone 7 dev platform has given it a new lease of life. But already it is reaching the stage where new versions are more incremental than revolutionary. That’s a good thing – the technology is approaching maturity.

To summarise: no need to panic. Keep building apps with the most appropriate technology that is available now. Those apps will continue to work just fine for years to come. Don’t worry that new technologies come along every few years that seem to “kill” older ones. I for one am glad that I am not currently developing using Visual Basic 16 or MFC version 23. And in 5 years’ time, I fully expect that there will be something even better than Silverlight to write Windows apps in. If there isn’t, it will be Microsoft, not Silverlight, that is dead.