Sound Code: March 2013

Thursday, 28 March 2013

Thoughts on the demise of Google Reader (and Blogging)

I started blogging in 2004. The ability to add new articles to my website without the laborious task of modifying HTML files and FTPing them up to my webspace was nothing short of magical. Even better was the community that suddenly sprung up around shared interests. If you read something interesting on someone else’s blog, you could comment, but even better, you could write your own post in response, linking to theirs, and a “trackback” would notify them, and a link to your response would appear at the bottom of their post. It was a great way of finding like-minded people.

But spammers quickly put an end to all that, and it wasn’t long before trackbacks were turned off for most blogs. Your only interaction came in the comments, and even that was less than ideal as so few blogs supported notification for comments. Blog posts were no longer conversation starters, they were more like magazine articles.

The next major setback for blogging was twitter. With an even quicker way to get your thoughts out to the world, many bloggers (myself included to be honest) started to neglect their blogs. In one sense, this is no big deal. I’d rather follow many blogs that have infrequent but interesting posts than a few that have loads of posts of low quality. Which is why I love RSS and Google Reader. Some of the people I follow only blog a few times a year. But when they do write something, I know immediately, and can interact with them in the comments.

Now Google Reader is going away and this could be the killer blow for many small, rarely updated blogs. Of my 394 subscribers (according to feedburner), 334 are using Google Reader. I wonder how many I’ll have left after July 1st? Sure, there are a good number of us who are researching alternative products, but there are also many non-technical users of Google Reader. Take my wife for example. She likes and uses Google Reader, but doesn’t really want the hassle of switching and could easily miss the deadline (unless I transfer her subscriptions for her). Her response when I told her Google Reader was being shut down: “Why don’t they shut down Google Plus instead? No one uses that.” My thoughts exactly.

Now it is true that I get a lot more traffic from search engines than I do from subscribers. Google regularly sends people here to look at posts I made five years ago about styling a listbox in Silverlight 2 beta 2. But for me, part of the joy of blogging is interacting with other bloggers and readers about topics you are currently interested in. Without subscribers, you need to not only blog, but announce your every post on all the social networking sites you can access, quite possibly putting off potential readers with your constant self-promoting antics. If you choose not to do this, then you could easily find that no one at all is reading your thoughts. And then you start to question whether there is any point at all in blogging.

Google Reader Alternatives

So what are the options now that Google Reader is going? It does seem that there are a few viable replacement products – feedly and newsblur are rising to the challenge, offering quick ways to import your feeds. Apparently Digg are going to try to build an alternative before the cut-off date. But one thing all these options have in common is that they are scrambling to scale and add features fast enough to meet a very challenging deadline. There is no telling which of them will succeed, or come up with a viable long-term business model (how exactly do the free options plan to finance their service?), or offer the options we need to migrate again if we decide we have made the wrong choice. I could easily still be searching for an alternative long after Google Reader is gone. And then there is the integration with mobile readers. I use Wonder Reader on the Windows Phone. I have no idea whether it will continue to be usable in conjunction with any of these new services.

Or I could think outside the box. Could I write my own RSS reader and back-end service in the cloud just for me? Possibly, and I can’t say I haven’t been tempted to try, but I have better things to do with my time. Or how about (as some have apparently already done) giving up altogether on RSS, and just getting links from Twitter, or Digg, or from those helpful people who write daily link digests (the Morning Brew is a favourite of mine)? I could, and perhaps I would find some new and cool stuff I wouldn’t have seen in Google Reader. But there is nothing as customised to me as my own hand-selected list of blog subscriptions. There aren’t many people who share my exact mix of interests (programming, theology, football, home studio recording), and it would be a great shame to lose touch with the rarely updated blogs of my friends. And that’s to say nothing of the other uses of RSS such as being notified of new issues raised on my open source projects at CodePlex, or following podcasts. In short, I’m not ready to walk away from RSS subscriptions yet. At least, not until there is something actually better.

What’s Next To Go?

The imminent closure of Google Reader leaves me concerned about two other key components of my blogging experience which are also owned by Google – Feedburner and Blogger (Blogspot). I chose blogger to host this blog as I felt sure that Google would invest heavily in making it the best blogging platform on the internet. They haven’t. I have another blog on WordPress and it is far superior. I’ve been researching various exit strategies for some time (including static blogging options like octopress) but as with RSS feed readers, migrating to an alternative blog provider is not a choice to be taken lightly. Even more concerning is that feedburner was part of my exit strategy – I can use it to make the feed point to any other site easily. If Google ditch that, I’ll lose all my subscribers regardless of what reader they are using. It is rather concerning that Google have the power to deal a near-fatal blow to the entire blogging ecosystem should they wish.

What I’d Like To See

Congratulations if you managed to get this far, and apologies for the gloomy nature of this post so far. So why don’t I end it with a blogging challenge? What I’d like to see is posts from gurus in all things cloud, database, nosql, html5, nodejs, javascript, etc on how they would go about architecting a Google Reader replacement. What existing open source components are ready made building blocks? Would you build it on Azure table storage, or perhaps with RavenDB? How would you efficiently track items read, starred and tagged? What technologies would you use to make a reader interface that works for desktop and mobile? I’m not much of a web developer myself, so I’d love to see some cool open source efforts in this area, even if they are of the self-host variety.

Wednesday, 27 March 2013

How to convert byte[] to short[] or float[] arrays in C#

One of the challenges that frequently arises when writing audio code in C# is that you get a byte array containing raw audio that would be better presented as a short (Int16) array, or a float (Single) array. (There are other formats too – some audio is 32 bit int, some is 64 bit floating point, and then there is the ever-annoying 24 bit audio). In C/C++ the solution is simple, cast the address of the byte array to a short * or a float * and access each sample directly.

Unfortunately, in .NET casting byte arrays into another type is not allowed:

byte[] buffer = new byte[1000];
short[] samples = (short[])buffer; // compile error!

This means that, for example, in NAudio, when the WaveIn class returns a byte[] in its DataAvailable event, you usually need to convert it manually into 16 bit samples (assuming you are recording 16 bit audio). There are several ways of doing this. I’ll run through five approaches, and finish up with some performance measurements.

BitConverter.ToInt16

Perhaps the simplest conceptually is to use the System.BitConverter class. This allows you to convert a pair of bytes at any position in a byte array into an Int16. To do this you call BitConverter.ToInt16. Here’s how you read through each sample in a 16 bit buffer:

byte[] buffer = ...;
for(int n = 0; n < buffer.Length; n+=2)
{
   short sample = BitConverter.ToInt16(buffer, n);
}

For byte arrays containing IEEE float audio, the principle is similar, except you call BitConverter.ToSingle. 24 bit audio can be dealt with by copying three bytes into a temporary four byte array and using ToInt32.

BitConverter also includes a GetBytes method to do the reverse conversion, but you must then manually copy the return of that method into your buffer.
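As a rough sketch of the float case and the reverse conversion just described (the class and method names here are made up for illustration and are not part of NAudio), it could look something like this:

```csharp
using System;

static class BitConverterSketch
{
    // Reads one 32 bit IEEE float sample starting at the given byte offset.
    public static float ReadFloat(byte[] buffer, int offset)
    {
        return BitConverter.ToSingle(buffer, offset);
    }

    // Reverse conversion: GetBytes allocates a new two byte array,
    // which then has to be copied into the destination buffer by hand.
    public static void WriteInt16(short sample, byte[] buffer, int offset)
    {
        byte[] bytes = BitConverter.GetBytes(sample);
        Array.Copy(bytes, 0, buffer, offset, bytes.Length);
    }

    static void Main()
    {
        var buffer = new byte[4];
        WriteInt16(-2, buffer, 0);
        WriteInt16(1000, buffer, 2);
        // Read the two samples back out again
        Console.WriteLine(BitConverter.ToInt16(buffer, 0));
        Console.WriteLine(BitConverter.ToInt16(buffer, 2));

        byte[] floatBytes = BitConverter.GetBytes(0.25f);
        Console.WriteLine(ReadFloat(floatBytes, 0));
    }
}
```

Note that GetBytes allocates a small array on every call, which is worth bearing in mind if you are converting every sample in a large buffer this way.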

Bit Manipulation

Those who are more comfortable with bit manipulation may prefer to use bit shift and OR to convert each pair of bytes into a sample:

byte[] buffer = ...;
for (int n = 0; n < buffer.Length; n+=2)
{
   short sample = (short)(buffer[n] | buffer[n+1] << 8);
}

This technique can be extended for 32 bit integers, and is very useful for 24 bit, where none of the other techniques work very well. However, for IEEE float, it is a bit more tricky, and one of the other techniques should be preferred.

For the reverse conversion, you need to write more bit manipulation code.
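Here is a minimal sketch of what the 24 bit read and the 16 bit reverse conversion might look like with bit manipulation. It assumes little-endian sample data (as found in WAV files), and the helper names are hypothetical:

```csharp
using System;

static class BitManipulationSketch
{
    // Reads a 24 bit little-endian sample. The three bytes are placed in the
    // top 24 bits of an int, then shifted back down so the sign is extended.
    public static int ReadInt24(byte[] buffer, int offset)
    {
        return (buffer[offset] << 8 | buffer[offset + 1] << 16 | buffer[offset + 2] << 24) >> 8;
    }

    // Reverse conversion for 16 bit: low byte first, then high byte.
    public static void WriteInt16(short sample, byte[] buffer, int offset)
    {
        buffer[offset] = (byte)(sample & 0xFF);
        buffer[offset + 1] = (byte)((sample >> 8) & 0xFF);
    }

    static void Main()
    {
        var sample24 = new byte[] { 0xFF, 0xFF, 0xFF }; // -1 as a 24 bit sample
        Console.WriteLine(ReadInt24(sample24, 0));

        var buffer = new byte[2];
        WriteInt16(-2, buffer, 0);
        Console.WriteLine((short)(buffer[0] | buffer[1] << 8));
    }
}
```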

Buffer.BlockCopy

Another option is to copy the whole buffer into an array of the correct type. Buffer.BlockCopy can be used for this purpose:

byte[] buffer = ...;
short[] samples = new short[buffer.Length / 2]; // two bytes per 16 bit sample
Buffer.BlockCopy(buffer, 0, samples, 0, samples.Length * 2); // count is in bytes

Now the samples array contains the samples in easy to access form. If you are using this technique, try to reuse the samples buffer to avoid making extra work for the garbage collector.

For the reverse conversion, you can do another Buffer.BlockCopy.
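A sketch of how the buffer reuse and the reverse copy might be wrapped up is shown below. The class, method and parameter names are only for illustration; the one real API involved is Buffer.BlockCopy, whose offsets and count are always measured in bytes:

```csharp
using System;

static class BlockCopySketch
{
    // Copies raw 16 bit audio into a short[]; reuses the array passed in
    // when it is big enough so the garbage collector has less to do.
    public static short[] ToSamples(byte[] buffer, int bytesRecorded, short[] samples)
    {
        int sampleCount = bytesRecorded / 2;
        if (samples == null || samples.Length < sampleCount)
        {
            samples = new short[sampleCount];
        }
        Buffer.BlockCopy(buffer, 0, samples, 0, sampleCount * 2);
        return samples;
    }

    // Reverse conversion: copy the samples back into a byte array.
    public static void ToBytes(short[] samples, int sampleCount, byte[] buffer)
    {
        Buffer.BlockCopy(samples, 0, buffer, 0, sampleCount * 2);
    }

    static void Main()
    {
        var buffer = new byte[] { 0x01, 0x00, 0xFF, 0xFF };
        short[] samples = ToSamples(buffer, buffer.Length, null);
        samples[0] = 2; // modify a sample, then write it back
        ToBytes(samples, 2, buffer);
        Console.WriteLine(buffer[0]);
    }
}
```

The bytesRecorded parameter is there because a recording callback will not necessarily fill the whole buffer it hands you.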

WaveBuffer

One cool trick NAudio has up its sleeve (thanks to Alexandre Mutel) is the “WaveBuffer” class. This uses the StructLayout(LayoutKind.Explicit) attribute to effectively create a union of a byte[], a short[], an int[] and a float[]. This allows you to “trick” C# into letting you access a byte array as though it were a short array. You can read more about how this works here. If you’re worried about its stability, NAudio has been successfully using it with no issues for many years. (The only gotcha is that you probably shouldn’t pass it into anything that uses reflection, as underneath, .NET knows that it is still a byte[], even if it has been passed as a float[]. So for example don’t use it with Array.Copy or Array.Clear). WaveBuffer can allocate its own backing memory, or bind to an existing byte array, as shown here:

byte[] buffer = ...;
var waveBuffer = new WaveBuffer(buffer);
// now you can access the samples using waveBuffer.ShortBuffer, e.g.:
var sample = waveBuffer.ShortBuffer[sampleIndex];

This technique works just fine with IEEE float, accessed through the FloatBuffer property. It doesn’t help with 24 bit audio though.

One big advantage is that no reverse conversion is needed. Just write into the ShortBuffer, and the modified samples are already in the byte[].
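To give a feel for how the union trick works, here is a stripped-down sketch. This is not NAudio’s actual WaveBuffer, just a hypothetical struct using the same idea of overlapping array references at the same field offset:

```csharp
using System;
using System.Runtime.InteropServices;

// All three fields share the same memory location, so they all refer
// to whichever array was assigned last (here, always the byte[]).
[StructLayout(LayoutKind.Explicit)]
struct SampleUnion
{
    [FieldOffset(0)] public byte[] Bytes;
    [FieldOffset(0)] public short[] Shorts;
    [FieldOffset(0)] public float[] Floats;
}

static class UnionSketch
{
    static void Main()
    {
        var union = new SampleUnion();
        union.Bytes = new byte[] { 0x01, 0x00, 0xFF, 0xFF };

        // Read the same memory as 16 bit samples
        Console.WriteLine(union.Shorts[0]);
        Console.WriteLine(union.Shorts[1]);

        // Writes through Shorts land directly in the byte array
        union.Shorts[0] = 0x0203;
        Console.WriteLine(union.Bytes[0]);
    }
}
```

One thing to watch out for with a bare union like this is that Shorts.Length still reports the length in bytes, because the underlying object really is a byte[]. You need to work out the sample count yourself (bytes divided by two for 16 bit).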

Unsafe Code

Finally, there is a way in C# that you can work with pointers as though you were using C++. This requires that you set your project assembly to allow “unsafe” code. “Unsafe” means that you could corrupt memory if you are not careful, but so long as you stay in bounds, there is nothing unsafe at all about this technique. Unsafe code must be in an unsafe context – so you can use an unsafe block, or mark your method as unsafe.

byte[] buffer = ...;
unsafe 
{
    fixed (byte* pBuffer = buffer)
    {
        short* pSample = (short*)pBuffer;
        // now we can access samples via pSample e.g.:
        var sample = pSample[sampleIndex];
    }
}

This technique can easily be used for IEEE float as well. It also can be used with 24 bit if you use int pointers and then bit manipulation to blank out the fourth byte.

As with WaveBuffer, there is no need for reverse conversion. You can use the pointer to write sample values directly into the memory for your byte array.
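As an illustration of writing through the pointer, here is a small sketch that applies a gain to IEEE float samples in place. The method name and parameters are made up for the example, and it needs to be compiled with unsafe code allowed:

```csharp
using System;

static class UnsafeSketch
{
    // Multiplies every 32 bit float sample in the byte array by gain.
    // No reverse conversion is needed as we write straight into the buffer.
    public static unsafe void ApplyGain(byte[] buffer, int bytesRecorded, float gain)
    {
        int sampleCount = bytesRecorded / sizeof(float);
        fixed (byte* pBuffer = buffer)
        {
            float* pSample = (float*)pBuffer;
            for (int n = 0; n < sampleCount; n++)
            {
                pSample[n] *= gain; // stay within sampleCount to stay in bounds
            }
        }
    }

    static void Main()
    {
        byte[] buffer = BitConverter.GetBytes(0.5f);
        ApplyGain(buffer, buffer.Length, 2.0f);
        Console.WriteLine(BitConverter.ToSingle(buffer, 0));
    }
}
```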

Performance

So which of these methods performs the best? I had my suspicions, but as always, the best way to optimize code is to measure it. I set up a simple test application which went through a four minute MP3 file, converting it to WAV and finding the peak sample values over periods of a few hundred milliseconds at a time. This is the type of code you would use for waveform drawing. I measured how long each one took to go through a whole file (I excluded the time taken to read and decode MP3). I was careful to write code that avoided creating work for the garbage collector.
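The sort of measurement described above can be sketched with a Stopwatch around a peak-finding loop. This is not the original test application; the class name, the synthetic buffer and the block size are all assumptions made for the sake of a self-contained example:

```csharp
using System;
using System.Diagnostics;

static class PeakTimingSketch
{
    // Finds the largest absolute 16 bit sample value in a block of bytes.
    public static int FindPeak16(byte[] buffer, int offset, int count)
    {
        int peak = 0;
        for (int n = offset; n + 1 < offset + count; n += 2)
        {
            short sample = (short)(buffer[n] | buffer[n + 1] << 8);
            int magnitude = Math.Abs((int)sample);
            if (magnitude > peak) peak = magnitude;
        }
        return peak;
    }

    static void Main()
    {
        // Synthetic stand-in for decoded audio: random bytes
        var buffer = new byte[10 * 1024 * 1024];
        new Random(1).NextBytes(buffer);
        const int blockSize = 32768;

        var stopwatch = Stopwatch.StartNew();
        int overallPeak = 0;
        for (int offset = 0; offset + blockSize <= buffer.Length; offset += blockSize)
        {
            overallPeak = Math.Max(overallPeak, FindPeak16(buffer, offset, blockSize));
        }
        stopwatch.Stop();
        Console.WriteLine("Peak {0} found in {1}ms", overallPeak, stopwatch.ElapsedMilliseconds);
    }
}
```

To compare techniques you would swap out the body of the peak-finding method for each approach and run each several times, in both Debug and Release builds.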

Each technique was quite consistent in its timings:

	Debug Build	Release Build
BitConverter	263,265,264	166,167,167
Bit Manipulation	254,243,250	104,104,103
Buffer.BlockCopy	205,206,204	104,103,103
WaveBuffer	239,264,263	97,97,97
Unsafe	173,172,162	98,98,98

As can be seen, BitConverter is the slowest approach, and should probably be avoided. Buffer.BlockCopy was the biggest surprise for me - the additional copy was so quick that it paid for itself very quickly. WaveBuffer was surprisingly slow in debug build – but very good in Release build. It is especially impressive given that it doesn’t need to pin its buffers like the unsafe code does, so it may well be the quickest possible technique in the long run as it doesn’t hinder the garbage collector from compacting memory. As expected the unsafe code gave very fast performance. The other takeaway is that you really should be using Release build if you are doing audio processing.

Anyone know an even faster way? Let me know in the comments.

Monday, 18 March 2013

Why Static Variables Are Dangerous

In an application I work on, we need to parse some custom data files (let’s call them XYZ files). There are two versions of the XYZ file format, which have slightly different layouts of the data. You need to know what version you are dealing with to know what sizes the various data structures will be.

We inherited some code which could read XYZ files, and it contained the following snippet. While it was reading the XYZ file header it stored the file version into a static variable, so that later on in the parsing process it could use that to make decisions.

public static XyzVersion XyzVersion { get; set; }

public static int MaxSizeToUse
{
    get
    {
        switch (XyzVersion)
        {
            case XyzVersion.First:
                return 8;
            case XyzVersion.Second:
                return 16;
        }

        throw new InvalidOperationException("Unknown XyzVersion");
    }
}

public static int DataSizeToSkip
{
    get
    {
        switch (XyzVersion)
        {
            case XyzVersion.First:
                return 8;
            case XyzVersion.Second:
                return 0;
        }

        throw new InvalidOperationException("Unknown XyzVersion");
    }
}

Can you guess what went wrong? For years this code worked perfectly on hundreds of customer sites worldwide. All XYZ files, of both versions, were being parsed correctly. But then, we suddenly started getting customers reporting strange problems to do with their XYZ files. When we investigated it, we discovered that we now had customers whose setup meant they could be dealing with two different versions of the XYZ file. That on its own wasn’t necessarily a problem. The bug occurred when our software, on two different threads simultaneously, was trying to parse XYZ files of different versions.

So one thread started to parse a version 1 XYZ file, and set the static variable to 1. Then the other thread started to parse a version 2 XYZ file and set the static variable to 2. When the first thread carried on, it now incorrectly thought it was dealing with a version 2 XYZ file, and data corruption ensued.

What is the moral of this story? Don’t use a static variable to hold state information that isn’t guaranteed to be absolutely global. This is also a reason why the singleton pattern is so dangerous. The assumption that “there can only ever be one of these” is very often proved wrong further down the road. Here the assumption was that we would only ever see one version of the XYZ files on a customer site. That was true for several years … until it wasn’t anymore.

In this case, the right approach was for each XYZ file reader class to keep track of what version it was dealing with, and pass that through to the bits of code that needed to know it (it wasn’t even a difficult change to make). Static variables get used far too often simply because they are convenient and “save time”. But any time saved coding will be lost further down the road when your “there can only be one” assumption proves false.
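A sketch of the instance-based alternative might look like the following. The XyzFileReader class and its constructor are hypothetical (the real reader obviously did a lot more), but the idea is simply that each reader owns its own copy of the version:

```csharp
using System;

public enum XyzVersion { First = 1, Second = 2 }

// Hypothetical reader: the version is per-instance state, so two readers
// on two threads can parse different file versions without interfering.
public class XyzFileReader
{
    private readonly XyzVersion version;

    public XyzFileReader(XyzVersion versionFromHeader)
    {
        version = versionFromHeader;
    }

    public XyzVersion Version { get { return version; } }

    public int MaxSizeToUse
    {
        get
        {
            switch (version)
            {
                case XyzVersion.First: return 8;
                case XyzVersion.Second: return 16;
            }
            throw new InvalidOperationException("Unknown XyzVersion");
        }
    }

    public int DataSizeToSkip
    {
        get
        {
            switch (version)
            {
                case XyzVersion.First: return 8;
                case XyzVersion.Second: return 0;
            }
            throw new InvalidOperationException("Unknown XyzVersion");
        }
    }
}

static class XyzDemo
{
    static void Main()
    {
        var first = new XyzFileReader(XyzVersion.First);
        var second = new XyzFileReader(XyzVersion.Second);
        // Creating the second reader has no effect on the first
        Console.WriteLine(first.MaxSizeToUse);
        Console.WriteLine(second.MaxSizeToUse);
    }
}
```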

Friday, 15 March 2013

NAudio on .NET Rocks

I was recently interviewed by Carl Franklin and Richard Campbell for their .NET Rocks podcast and the episode was published yesterday. You can have a listen here. I was invited onto the show after helping Carl out with an interesting ASIO related problem. I essentially built a mixer for his 48 in and 48 out MOTU soundcard. It is by far the most data that anyone has ever tried to push through NAudio (to my knowledge) and it did struggle a bit – he had to reduce the channel count to avoid corruption, but it was still impressive what was achieved at a low latency. However, I’m hoping to do some performance optimisations, and it would be very interesting to see if we can get 48 in and 48 out (at 44.1kHz) working smoothly in a managed environment. I’ll hopefully blog about it once I’ve got something working.

Friday, 8 March 2013

Essential Developer Principles #4 – Open Closed Principle

The “Open Closed Principle” is usually summarised as code should be “open to extension” but “closed to modification”. The way I often express it is that when I am adding a new feature to an application, I want, as much as possible, to be writing new code rather than changing existing code.

However, I noticed there has been some pushback on this concept from none other than the legendary Jon Skeet. His objection seems to be based on the understanding that OCP dictates that you should never change existing code. And I agree; that would be ridiculous. It would encourage an approach to writing code where you added extensibility points at every conceivable juncture – all methods virtual, events firing before and after everything, XML configuration allowing any class to be swapped out, etc, etc. Clearly this would lead to code so flexible that no one could work out what it was supposed to do. It would also violate another well-established principle – YAGNI (You ain’t gonna need it). I (usually) don’t know in advance in what way I’ll need to extend my system, so why complicate matters by adding numerous extensibility points that will never be used (or more likely, still need to be modified before they can be used)?

So in a nutshell, here’s my take on OCP. When I’m writing the initial version of my code, I simply focus on writing clean maintainable code, and don’t add extensibility points unless I know for sure they are needed for an upcoming feature. (so yes, I write code that doesn’t yet adhere to OCP).

But when I add a new feature that requires a modification to that original code, instead of just sticking all the new code in there alongside the old, I refactor the original class to add the extensibility points I need. Then the new feature can be added in an isolated way, without adding additional responsibilities to the original class. The benefit of this approach is that you get extensibility points that are actually useful (because they are being used), and they are more likely to enable further new features in the future.

OCP encourages you to make your classes extensible, but doesn’t stipulate how you do so. Here are some of the most common techniques:

Pass dependencies as interfaces into your class allowing callers to provide their own implementations
Add events to your class to allow people to hook in and insert steps into the process
Make your class suitable as a base class with appropriate virtual methods and protected fields
Create a “plug-in” architecture which can discover plugins using reflection or configuration files
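To illustrate the first technique in the list above, here is a small made-up example (none of these types come from a real project). The calculator is given its rules as interfaces, so a new rule can be added by writing a new class rather than by editing the calculator:

```csharp
using System;
using System.Collections.Generic;

// The extensibility point: callers provide their own implementations
public interface IPriceRule
{
    decimal Apply(decimal price);
}

public class PriceCalculator
{
    private readonly List<IPriceRule> rules = new List<IPriceRule>();

    public PriceCalculator(IEnumerable<IPriceRule> rules)
    {
        this.rules.AddRange(rules);
    }

    // Runs each rule in turn; knows nothing about what the rules do
    public decimal Calculate(decimal basePrice)
    {
        decimal price = basePrice;
        foreach (var rule in rules)
        {
            price = rule.Apply(price);
        }
        return price;
    }
}

// A new feature arrives as a new class, leaving PriceCalculator untouched
public class TenPercentOff : IPriceRule
{
    public decimal Apply(decimal price) { return price * 0.9m; }
}

static class OcpDemo
{
    static void Main()
    {
        var calculator = new PriceCalculator(new IPriceRule[] { new TenPercentOff() });
        Console.WriteLine(calculator.Calculate(100m));
    }
}
```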

It is clear that OCP is in fact very closely related to SRP (Single Responsibility Principle). Violations of OCP result in violations of SRP. If you can’t extend the class from outside, you will end up sticking more and more code inside the class, resulting in an ever-growing list of responsibilities.

In summary, for me OCP shouldn’t mean you’re not allowed to change any code after writing it. Rather, it’s about how you change it when a new feature comes along. First, refactor to make it extensible, then extend it. Or to put it another way that I’ve said before on this blog, “the only real reasons to change the existing code are to fix bugs, and to make it more extensible”.