Friday, 17 May 2013

Code Reviews–What are we looking for?

A few years back I wrote about the issue of when to perform code reviews. Code reviews which happen late in the development cycle can be a waste of time, since it may be too late to act on the findings. However, there is another, much bigger, problem with code reviews, and that is the issue of what are we supposed to be looking for in the first place?

At first this might seem to be a non-issue. After all, the reviewer is simply looking for problems of any sort. But the trouble is that there are so many different types of problem a piece of code might contain that it would take a very disciplined reviewer to consciously check for each one. It would also take a very long time. The reality is that most code reviewers are only checking for a small subset of the potential issues.

So what sorts of things ought we to be looking for?

 

Code Standards

I put this one first as it is the one thing that almost all code reviewers seem to be good at. If you break the whitespace rules, variable naming conventions, or miss out a function or class comment, you can be sure the code reviewer will point this out.

However, these are things that a good tool such as ReSharper ought to be finding for you. Human code reviewers ought to be focusing their attention on more significant problems.

 

Language Paradigms

Other things an experienced developer will be quick to pick up on are when you are using the language in a sub-optimal way. In C#, this includes things like correct use of using statements and the dispose pattern, using LINQ where appropriate, knowing when to use IEnumerable and when to use an array, and making types immutable where appropriate.

Observations like this are particularly useful to help train junior developers to improve the way. But again, many of these issues could be discovered by a good code analysis tool.

 

Maintainability

In a large system, keeping code maintainable is as critical as keeping it bug free. So a code review should look for problems such as high code complexity. I almost always point out a few places in which the code I review could be made “cleaner” by refactoring into shorter methods with descriptive names.

Uncle Bob Martin’s “SOLID” acronym is also a helpful guide. Are there bloated classes with too many responsibilities? Are we constantly making changes to the same class because it isn’t “open to extensibility”? Are we violating the “Liskov Substitution Principle” by checking if an object is an instance of a specific derived type?

Noticing maintainability problems at code review time, such as tight or hidden coupling through the use of singletons, may require some substantial re-work of the feature, but it will be much less painful to do it while it is still fresh in the developer’s mind rather than having to sort the mess out later on in the project lifetime.

 

Project Conventions

Every software project will have some kind of organization and structure, and in an ideal world, all developers are aware of this and make their changes in the “right” place and the right way. However, with large projects, it can be possible for developers to be unaware of the infrastructure that is in place. For example, do they ignore the error reporting mechanism in favour of re-inventing their own? Do they know about the IoC container and how to use it? Did they keep the business logic out of the view code?

For issues like this to be picked up, the code reviewer needs to be someone familiar with the original design of the code being modified. It may be necessary on some occasions for the code reviewer to ask the developer not to copy an existing convention because it has been discovered to be problematic.

 

Finding Bugs

Most discussions of code reviews assume that finding bugs is the main purpose, but from the list of items I’ve already mentioned it is not hard to see how we could actually forget to look for them.

This involves visually inspecting the code looking for things like potential null reference exceptions, or off by one errors. A lot of it is about checking whether the right thing happens when errors are encountered.

The trouble is that if the code is highly complex or badly structured, then it may be close to impossible to find bugs with a visual inspection. This is why the “maintainability” part of the code review is so important. Giray Özil sums this problem up brilliantly:

“Ask a programmer to review 10 lines of code, he'll find 10 issues. Ask him to do 500 lines and he'll say it looks good.”

 

Meeting Requirements

If the code reviewer doesn’t have a firm grasp of the actual requirements that the coder was trying to implement, it is quite possible that code that simply doesn’t meet the requirements gets through the code review. The more understanding a code reviewer has of the actual business problem this code is trying to solve, the more likely they are to spot flaws (or gaping holes) in its implementation.

 

User Interface

Now we move onto some areas for review that relate to specific types of code. The first is user interface code. This adds a whole host of things for a reviewer to check such as correct layout of controls, correct use of colour, all localisable text read from resource files, etc. There will likely be some established conventions in place for user interface, and for a professional looking project, you need to ensure they are adhered to.

There is the also need to review user interfaces for usability. Developers are notoriously bad at designing good UIs, so a separate review with a UX expert might be appropriate.

 

Threading

Multi-threaded code is notoriously hard to get right. Are there potential race conditions, dead-locks, or resources that should be protected with a lock? It is important that a code reviewer is aware of which parts of the code might be used in a multi-threaded scenario, and give special attention to those parts. Again, the code needs to be as simple as possible. A 4000 line class that could be called on multiple different threads from dozens of different places will be close to impossible to verify for thread safety.

 

Testing

A good code reviewer should be asking what has been done to test the code under review, including both manual and automated tests. If there are unit tests, they too ought to be code reviewed for correctness and maintainability. Sometimes the best thing a code reviewer can do is suggest additional ways in which the code ought to be tested (or made more testable – the preference should always be for automated tests where possible).

 

Security

One of the most commonly overlooked concerns in code reviews is security. There are the obvious things like checking the code isn’t vulnerable to a SQL injection attack, or that the passwords aren’t stored in plaintext, or that the admin-level APIs can only actually be run by someone with the correct privileges. But there are countless subtle tricks that hackers have up their sleeves, and really your code will need regular security reviews by experts.

 

Performance

This is another commonly overlooked concern. Of course, much of our code doesn’t need to be aggressively performance tuned, but that doesn’t mean we can get away without thinking about performance. This requires knowledge of what sort of load the system will be under in “real-world” customer environments.

 

Compatibility

Another big issue particularly with large enterprise systems that the code may need to run on all kinds of different operating systems, or against different databases. It might need to be able to load serialized items from previous versions of the software, or cope with upgrading from the previous version of the database. If these problems are not spotted in code review, they can cause significant disruption in the field as customers find that the new version of the software doesn’t work in their environment.

Where possible, automated tests should be guarding against compatibility breakages, but it is always worth considering these issues in a code review.

 

Merging

If you are in the situation where you need to actively develop and maintain multiple versions of your product, then a code reviewer needs to consider the impact of these changes on future merges. As nice as it might be to completely reformat all the code, rename all the variables and reorganize all the code into new locations, you may also be making it completely impossible for any future merges to succeed.

 

Summary

Looking at the list above you might start to think that a code reviewer has no hope of properly covering all the bases. For this reason I think it is important that we clarify what a particular code review is intended to find. There may need to be special security, performance or UI focused reviews. I also think we should be using automated tools wherever possible to find as many of these issues for us.

I’d love to hear your thoughts on this topic. Which of these areas ought a code review to focus on? Have I missed any areas?

Wednesday, 15 May 2013

Knockout - What MVVM should have been like in XAML

I've not got a lot of JavaScript programming experience, but I've been learning a bit recently, and decided to try knockout.js. I was pleasantly surprised at how quickly I was able to pick it up. In part this is because it follows a MVVM approach very similar to what I am used to with WPF. I've blogged a three-part MVVM tutorial here before as well as occasionally expressing my frustrations with MVVM. So I was interested to see what the MVVM story is like for Javascript developers. What I wasn't expecting, was to find that MVVM in JavaScript using knockout.js is a much nicer experience than I was used to with XAML + C#. In this post I'll run through a few reasons why I was so impressed with knockout.

Clean Data-Binding Syntax


The first impressive thing is how straightforward the binding syntax is. Any HTML element can have a data-bind property attached to it, and that can hold a series of binding expressions. Here's a simple binding expression in knockout that puts text from your viewmodel into a div:

<div data-bind="text: message" ></div>

To bind multiple properties, you can just add extra binding statements into the single data-bind attribute as follows:

<button data-bind="click: prev, enable: enablePrev">

In XAML the syntax for basic bindings isn't too cumbersome, but a bit more repetitive nonetheless:

<TextBox Text="{Binding Message}" 
Enabled="{Binding EnableEditMessage}" />

Event Binding

One frustration with data binding in XAML is that you can't bind a function directly to an event. So for example when an animation finishes, it would be great to be able to call into a method on our ViewModel with something like this:

<Storyboard Completed="{Binding OnFinished}" />

Sadly, that is invalid XAML, but with knockout.js, binding to any event is trivial:

<div data-bind="click: handleItem"/>

Arbitrary expressions in binding syntax


XAML data binding does allow you to drill down into members of properties on your ViewModel. For example, you can do this:

<TextBlock Text="{Binding CurrentUser.LastName}" />

And it also does give you the ability to do a bit of string formatting, albeit with a ridiculously hard to remember syntax:

<TextBlock Text="{Binding Path=OrderDate, StringFormat='{}{0:dd.MM.yyyy}'}" />

But you can't call into a method on your ViewModel, or write expressions like this

<Button IsEnabled="{Binding Items.Count > 0}" />

However, in knockout.js, we have the freedom to write arbitrary bits of JavaScript right there in our binding syntax. It's simple to understand, and it just works:

<button data-bind="click: next, enable: index() < questions.length -1">Next</button>

This is brilliant, and it keeps the ViewModel from having to do unnecessary work just to manipulate data into the exact type and format needed for the data binding expression. (Anyone who's done MVVM with XAML will be all too familiar with the task of of turning booleans into Visibility values using converters)

Property Changed Notifications


Obviously, knockout needs some way for the ViewModel to report that a property has changed, so that the view can update itself. In the world of XAML this is done via the INotifyPropertyChanged interface, which is cumbersome to implement, even if you have a ViewModelBase class that you can ask to do the raising of the event for you.

private bool selected;
public bool Selected 
{
   get { return selected; }
   set   
   {
        if (selected != value)
        {
            selected = value;
            OnPropertyChanged("Selected");
        }
   }
}

Contrast this with the gloriously straightforward knockout approach which uses a ko.observable:

this.selected = ko.observable(false);

Now selected is a function that you call with no arguments to get the current value, and with a boolean argument to set its value. It's delightfully simple. I can't help but wonder if a similar idea could be somehow shoehorned into C#.

To be fair to the XAML MVVM world, you can alleviate some of the pain with Simon Cropp's superb Fody extension, which allows you to add in a compile time code weaver to automatically raise PropertyChanged events on your ViewModels. I use this on just about all my WPF projects now. It is a great timesaver and leaves your code looking a lot cleaner to boot. However, in my opinion, if you have to use code weaving its usually a sign that your language is lacking expressivity. I'd rather directly express myself in code.

Computed properties


Computed properties can be a pain in C# as well, as you have to remember to explicitly raise the PropertyChanged notification. (Although Fody is very powerful in this regard and can spot that a property getter on your ViewModel depends on other property getters.) Here's an example of a calculated FullName property in a C# ViewModel:

private string firstName;
public string FirstName 
{
   get { return firstName; }
   set   
   {
        if (firstName!= value)
        {
            firstName= value;
            OnPropertyChanged("FirstName");
            OnPropertyChanged("FullName");
        }
   }
}

public string FullName 
{
    get { return FirstName + " " + Surname; }
}

Knockout's solution to this is once again elegant and simple. You simply declare a ko.computed type on your ViewModel:
this.fullName = ko.computed(function() {
     return this.firstName() + " " + this.lastName();
}, this);

Elegant handling of binding to parent and root data contexts


Another area that causes regular pain for me with XAML databinding is when you need to bind to your parent's context or the root context. I think I've just about got the syntax memorized now, but I must have searched for it on StackOverflow hundreds of times before it finally stuck. You end up writing monstrosities like this:

...Binding="{Binding RelativeSource={RelativeSource FindAncestor, 
AncestorType={x:Type Window}}, Path=DataContext.AllowItemCommand}" ...

In knockout, once again, the solution is simple and elegant, allowing you to use $parent to access your parent data context (and grandparents with $parent[1] etc), or $root. Read more about knockout's binding context here.

<div data-bind="foreach: currentQuestion().answers">
    <div data-bind="html: answer, click: $parent.currentQuestion().select"></div>
</div>

Custom binding extensions!


Finally, the killer feature. If only we could add custom binding expressions to XAML, then maybe we could work around some of its limitations and upgrade it with powerful capabilities. Whilst I have a feeling that this is in fact technically possible, I don't know of anyone who has actually done it. Once again knockout completely knocks XAML out of the water on this front, with a very straightforward extensibility model for you to create your own powerful extensions. If you run through the knockout tutorial, you'll implement one yourself and be amazed at how easy it is.


Conclusion

I've only used knockout.js for a few hours (here's what I made) and all I can say is I am very jealous of its powers. This is what data binding in XAML ought to have been like. XAML has been around for some time now, but there have been very few innovations in the data-binding space (we have seen the arrival of behaviours, visual state managers, merged dictionaries, format strings, but all of them suffer from the same clunky, hard to remember syntax). And now we have another subtly different flavour of XAML to learn for Windows Store 8 apps, it seems that XAML is here to stay. Maybe it is time for us to put some pressure onto Microsoft to give XAML data binding power to rival what the JavaScript community seem to be able to take for granted.

Thursday, 28 March 2013

Thoughts on the demise of Google Reader (and Blogging)

I started blogging in 2004. The ability to add new articles to my website without the laborious task of modifying HTML files and FTPing them up to my webspace was nothing short of magical. Even better was the community that suddenly sprung up around shared interests. If you read something interesting on someone else’s blog, you could comment, but even better, you could write your own post in response, linking to theirs, and a “trackback” would notify them, and a link to your response would appear at the bottom of their post. It was a great way of finding like-minded people.

But spammers quickly put an end to all that, and it wasn’t long before trackbacks were turned off for most blogs. Your only interaction came in the comments, and even that was less than ideal as so few blogs supported notification for comments. Blog posts were no longer conversation starters, they were more like magazine articles.

The next major setback for blogging was twitter. With an even quicker way to get your thoughts out to the world, many bloggers (myself included to be honest) started to neglect their blogs. In one sense, this is no big deal. I’d rather follow many blogs that have infrequent but interesting posts, rather than a few that have loads of posts of low quality. Which is why I love RSS and Google Reader. Some of the people I follow only blog a few times a year. But when they do write something, I know immediately, and can interact with them in the comments.

Now Google Reader is going away and this could be the killer blow for many small, rarely updated blogs. Of my 394 subscribers (according to feedburner), 334 are using Google Reader. I wonder how many I’ll have left after July 1st? Sure, there are a good number of us who are researching alternative products, but there are also many non-technical users of Google Reader. Take my wife for example. She likes and uses Google Reader, but doesn’t really want the hassle of switching and could easily miss the deadline (unless I transfer her subscriptions for her). Her response when I told her Google Reader was being shut down: “Why don’t they shut down Google Plus instead? No one uses that.” My thoughts exactly.

Now it is true that I get a lot more traffic from search engines than I do from subscribers. Google regularly sends people here to look at posts I made five years ago about styling a listbox in Silverlight 2 beta 2. But for me, part of the joy of blogging is interacting with other bloggers and readers about topics you are currently interested in. Without subscribers, you need to not only blog, but announce your every post on all the social networking sites you can access, quite possibly putting off potential readers with your constant self-promoting antics. If you choose not to do this, then you could easily find that no one at all is reading your thoughts. And then you start to question whether there is any point at all in blogging.

Google Reader Alternatives

So what are the options now that Google Reader is going? It does seem that there are a few viable replacement products – feedly and newsblur are rising to the challenge, offering quick ways to import your feeds. Apparently Digg are going to try to build an alternative before the cut-off date. But one thing all these options have in common is that they are scrambling to scale and add features fast enough to meet a very challenging deadline. There is no telling which of them will succeed, or come up with a viable long-term business model (how exactly do the free options plan to finance their service?), or offer the options we need to migrate again if we decide we have made the wrong choice. I could easily still be searching for an alternative long after Google Reader is gone. And then there is the integration with mobile readers. I use Wonder Reader on the Windows Phone. I have no idea whether it will continue to be usable in conjunction any of these new services.

Or I could think outside the box. Could I write my own RSS reader and back-end service in the cloud just for me? Possibly, and I can’t say I haven’t been tempted to try, but I have better things to do with my time. Or how about (as some have apparently already done), giving up altogether on RSS, and just get links from Twitter, or Digg, or from those helpful people who write daily link digests (the Morning Brew is a favourite of mine)? I could, and perhaps I would find some new and cool stuff I wouldn’t have seen in Google Reader. But there is nothing as customised to me as my own hand-selected list of blog subscriptions. There aren’t many people who share my exact mix of interests (programming, theology, football, home studio recording), and it would be a great shame to lose touch with the rarely updated blogs of my friends. And that’s to say nothing of the other uses of RSS such as being notified of new issues raised on my open source projects at CodePlex, or following podcasts. In short, I’m not ready to walk away from RSS subscriptions yet. At least, not until there is something actually better.

What’s Next To Go?

The imminent closure of Google Reader leaves me concerned about two other key components of my blogging experience which are also owned by Google – Feedburner and Blogger (Blogspot). I chose blogger to host this blog as I felt sure that Google would invest heavily in making it the best blogging platform on the internet. They haven’t. I have another blog on WordPress and it is far superior. I’ve been researching various exit strategies for some time (including static blogging options like octopress) but as with RSS feed readers, migrating to an alternative blog provider is not a choice to be taken lightly. Even more concerning is that feedburner was part of my exit strategy – I can use it to make the feed point to any other site easily. If Google ditch that, I’ll lose all my subscribers regardless of what reader they are using. It is rather concerning that Google have the power to deal a near-fatal blow to the entire blogging ecosystem should they wish.

What I’d Like To See

Congratulations if you managed to get this far, and apologies for the gloomy nature of this post so far. So why don’t I end it with a blogging challenge? What I’d like to see is posts from gurus in all things cloud, database, nosql, html5, nodejs, javascript, etc on how they would go about architecting a Google Reader replacement. What existing open source components are ready made building blocks? Would you build it on Azure table storage, or perhaps with RavenDB? How would you efficiently track items read, starred and tagged? What technologies would you use to make a reader interface that works for desktop and mobile? I’m not much of a web developer myself, so I’d love to see some cool open source efforts in this area, even if they are of the self-host variety.

Wednesday, 27 March 2013

How to convert byte[] to short[] or float[] arrays in C#

One of the challenges that frequently arises when writing audio code in C# is that you get a byte array containing raw audio that would be better presented as a short (Int16) array, or a float (Single) array. (There are other formats too – some audio is 32 bit int, some is 64 bit floating point, and then there is the ever-annoying 24 bit audio). In C/C++ the solution is simple, cast the address of the byte array to a short * or a float * and access each sample directly.

Unfortunately, in .NET casting byte arrays into another type is not allowed:

byte[] buffer = new byte[1000];
short[] samples = (short[])buffer; // compile error!

This means that, for example, in NAudio, when the WaveIn class returns a byte[] in its DataAvailable event, you usually need to convert it manually into 16 bit samples (assuming you are recording 16 bit audio). There are several ways of doing this. I’ll run through five approaches, and finish up with some performance measurements.

BitConverter.ToInt16

Perhaps the simplest conceptually is to use the System.BitConverter class. This allows you to convert a pair of bytes at any position in a byte array into an Int16. To do this you call BitConverter.ToInt16. Here’s how you read through each sample in a 16 buffer:

byte[] buffer = ...;
for(int n = 0; n < buffer.Length; n+=2)
{
   short sample = BitConverter.ToInt16(buffer, n);
}

For byte arrays containing IEEE float audio, the principle is similar, except you call BitConverter.ToSingle. 24 bit audio can be dealt with by copying three bytes into a temporary four byte array and using ToInt32.

BitConverter also includes a GetBytes method to do the reverse conversion, but you must then manually copy the return of that method into your buffer.

Bit Manipulation

Those who are more comfortable with bit manipulation may prefer to use bit shift and or to convert each pair of bytes into a sample:

byte[] buffer = ...;
for (int n = 0; n < buffer.Length; n+=2)
{
   short sample = (short)(buffer[n] | buffer[n+1] << 8);
}

This technique can be extended for 32 bit integers, and is very useful for 24 bit, where none of the other techniques work very well. However, for IEEE float, it is a bit more tricky, and one of the other techniques should be preferred.

For the reverse conversion, you need to write more bit manipulation code.

Buffer.BlockCopy

Another option is to copy the whole buffer into an array of the correct type. Buffer.BlockCopy can be used for this purpose:

byte[] buffer = ...;
short[] samples = new short[buffer.Length];
Buffer.BlockCopy(buffer,0,samples,0,buffer.Length);

Now the samples array contains the samples in easy to access form. If you are using this technique, try to reuse the samples buffer to avoid making extra work for the garbage collector.

For the reverse conversion, you can do another Buffer.BlockCopy.

WaveBuffer

One cool trick NAudio has up its sleeve (thanks to Alexandre Mutel) is the “WaveBuffer” class. This uses the StructLayout=LayoutKind.Explicit attribute to effectively create a union of a byte[], a short[], an int[] and a float[]. This allows you to “trick” C# into letting you access a byte array as though it was a short array. You can read more about how this works here. If you’re worried about its stability, NAudio has been successfully using it with no issues for many years. (The only gotcha is that you probably shouldn’t pass it into anything that uses reflection, as underneath, .NET knows that it is still a byte[], even if it has been passed as a float[]. So for example don’t use it with Array.Copy or Array.Clear). WaveBuffer can allocate its own backing memory, or bind to an existing byte array, as shown here:

byte[] buffer = ...;
var waveBuffer = new WaveBuffer(buffer);
// now you can access the samples using waveBuffer.ShortBuffer, e.g.:
var sample = waveBuffer.ShortBuffer[sampleIndex];

This technique works just fine with IEEE float, accessed through the FloatBuffer property. It doesn’t help with 24 bit audio though.

One big advantage is that no reverse conversion is needed. Just write into the ShortBuffer, and the modified samples are already in the byte[].

Unsafe Code

Finally, there is a way in C# that you can work with pointers as though you were using C++. This requires that you set your project assembly to allow “unsafe” code. "Unsafe” means that you could corrupt memory if you are not careful, but so long as you stay in bounds, there is nothing unsafe at all about this technique. Unsafe code must be in an unsafe context – so you can use an unsafe block, or mark your method as unsafe.

byte[] buffer = ...;
unsafe 
{
    fixed (byte* pBuffer = buffer)
    {
        short* pSample = (short*)buffer;
        // now we can access samples via pSample e.g.:
        var sample = pSample[sampleIndex];
    }
}

This technique can easily be used for IEEE float as well. It also can be used with 24 bit if you use int pointers and then bit manipulation to blank out the fourth byte.

As with WaveBuffer, there is no need for reverse conversion. You can use the pointer to write sample values directly into the memory for your byte array.

Performance

So which of these methods performs the best? I had my suspicions, but as always, the best way to optimize code is to measure it. I set up a simple test application which went through a four minute MP3 file, converting it to WAV and finding the peak sample values over periods of a few hundred milliseconds at a time. This is the type of code you would use for waveform drawing. I measured how long each one took to go through a whole file (I excluded the time taken to read and decode MP3). I was careful to write code that avoided creating work for the garbage collector.

Each technique was quite consistent in its timings:

  Debug Build Release Build
BitConverter 263,265,264 166,167,167
Bit Manipulation 254,243,250 104,104,103
Buffer.BlockCopy 205,206,204 104,103,103
WaveBuffer 239.264.263 97,97,97
Unsafe 173.172.162 98,98,98

As can be seen, BitConverter is the slowest approach, and should probably be avoided. Buffer.BlockCopy was the biggest surprise for me  - the additional copy was so quick that it paid for iteself very quickly. WaveBuffer was surprisingly slow in debug build – but very good in Release build. It is especially impressive given that it doesn’t need to pin its buffers like the unsafe code does, so it may well be the quickest possible technique in the long-run as it doesn’t hinder the garbage collector from compacting memory. As expected the unsafe code gave very fast performance. The other takeaway is that you really should be using Release build if you are doing audio processing.

Anyone know an even faster way? Let me know in the comments.

Monday, 18 March 2013

Why Static Variables Are Dangerous

In an application I work on, we need to parse some custom data files (let’s call them XYZ files). There are two versions of the XYZ file format, which have slightly different layouts of the data. You need to know what version you are dealing with to know what sizes the various data structures will be.

We inherited some code which could read XYZ files, and it contained the following snippet. While it was reading the XYZ file header it stored the file version into a static variable, so that later on in the parsing process it could use that to make decisions.

public static XyzVersion XyzVersion { get; set; }

public static int MaxSizeToUse
{
    get
    {
        switch (XyzVersion)
        {
            case XyzVersion.First:
                return 8;
            case XyzVersion.Second:
                return 16;
        }

        throw new InvalidOperationException("Unknown XyzVersion");
    }
}

public static int DataSizeToSkip
{
    get
    {
        switch (XyzVersion)
        {
            case XyzVersion.First:
                return 8;
            case XyzVersion.Second:
                return 0;
        }

        throw new InvalidOperationException("Unknown XyzVersion");
    }
}

Can you guess what went wrong? For years this code worked perfectly on hundreds of customer sites worldwide. All XYZ files, of both versions were being parsed correctly. But then, we suddenly started getting customers reporting strange problems to do with their XYZ files. When we investigated it, we discovered that we now had customers whose setup meant they could be dealing with two different versions of the XYZ file. That on its own wasn’t necessarily a problem. The bug occurred when our software, on two different threads simultaneously, was trying to parse XYZ files of a different version.

So one thread started to parse a version 1 XYZ file, and set the static variable to 1. Then the other thread started to parse a version 2 XYZ file and set the static variable to 2. Now, when the first thread carried on, it now incorrectly thought it was dealing with a version 2 XYZ file, and data corruption ensued.

What is the moral of this story? Don’t use a static variable to hold state information that isn’t guaranteed to be absolutely global. This is also a reason why the singleton pattern is so dangerous. The assumption that “there can only ever be one of these” is very often proved wrong further down the road. Here the assumption was that we would only ever see one version of the XYZ files on a customer site. That was true for several years  … until it wasn’t anymore.

In this case, the right approach was for each XYZ file reader class to keep track of what version it was dealing with, and pass that through to the bits of code that needed to know it (it wasn’t even a difficult change to make). Static variables get used far too often simply because they are convenient and “save time”. But any time saved coding will be lost further down the road when your “there can only be one” assumption proves false.

Friday, 15 March 2013

NAudio on .NET Rocks

I was recently interviewed by Carl Franklin and Richard Campbell for their .NET Rocks podcast and the episode was published yesterday. You can have a listen here. I was invited onto the show after helping Carl out with an interesting ASIO related problem. I essentially built a mixer for his 48 in and 48 out MOTU soundcard. It is by far the most data that anyone has ever tried to push through NAudio (to my knowledge) and it did struggle a bit – he had to reduce the channel count to avoid corruption, but it was still impressive what was achieved at a low latency. However, I’m hoping to do some performance optimisations, and it would be very interesting to see if we can get 48 in and 48 (at 44.1kHz working smoothly in a managed environment). I’ll hopefully blog about it once I’ve got something working.

Friday, 8 March 2013

Essential Developer Principles #4 – Open Closed Principle

The “Open Closed Principle” is usually summarised as code should be “open to extension” but “closed to modification”. The way I often express it is that when I am adding a new feature to an application, I want to as much as possible be writing new code, rather than changing existing code.

However, I noticed there has been some pushback on this concept from none other than the legendary Jon Skeet. His objection seems to be based on the understanding that OCP dictates that you should never change existing code. And I agree; that would be ridiculous. It would encourage an approach to writing code where you added extensibility points at every conceivable juncture – all methods virtual, events firing before and after everything, XML configuration allowing any class to be swapped out, etc, etc. Clearly this would lead to code so flexible that no one could work out what it was supposed to do. It would also violate another well-established principle – YAGNI (You ain’t gonna need it). I (usually) don’t know in advance in what way I’ll need to extend my system, so why complicate matters by adding numerous extensibility points that will never be used (or more likely, still need to be modified before they can be used)?

So in a nutshell, here’s my take on OCP. When I’m writing the initial version of my code, I simply focus on writing clean maintainable code, and don’t add extensibility points unless I know for sure they are needed for an upcoming feature. (so yes, I write code that doesn’t yet adhere to OCP).

But when I add a new feature that requires a modification to that original code, instead of just sticking all the new code in there alongside the old, I refactor the original class to add the extensibility points I need. Then the new feature can be added in an isolated way, without adding additional responsibilities to the original class. The benefit of this approach is that you get extensibility points that are actually useful (because they are being used), and they are more likely to enable further new features in the future.

OCP encourages you to make your classes extensible, but doesn’t stipulate how you do so. Here’s some of the most common techniques:

  • Pass dependencies as interfaces into your class allowing callers to provide their own implementations
  • Add events to your class to allow people to hook in and insert steps into the process
  • Make your class suitable as a base class with appropriate virtual methods and protected fields
  • Create a “plug-in” architecture which can discover plugins using reflection or configuration files

It is clear that OCP is in fact very closely related to SRP (Single Responsibility Principle). Violations of OCP result in violations of SRP. If you can’t extend the class from outside, you will end up sticking more and more code inside the class, resulting in an ever-growing list of responsibilities.

In summary, for me OCP shouldn’t mean you’re not allowed to change any code after writing it. Rather, it’s about how you change it when a new feature comes along. First, refactor to make it extensible, then extend it. Or to put it another way that I’ve said before on this blog, “the only real reasons to change the existing code are to fix bugs, and to make it more extensible”.