Sound Code: January 2012

Friday, 27 January 2012

Handling multi-channel audio in NAudio

One of the recurring questions on the NAudio support forums is to do with how you can route different sounds to different outputs in a multi-channel soundcard setup. For example, can you play one MP3 file out of one speaker and a different one out of the other? If you have four outputs, can you route a different signal to each one?

The first issue to deal with is that just because your soundcard has multiple outputs, doesn’t mean you can necessarily open WaveOut with multiple outs. That depends on how the writers of the device driver have chosen to present the card’s capabilities to Windows. For example a four output card may appear as though it were two separate stereo soundcards. The good news is that if you have an ASIO driver, you ought to be able to open it and address all the outputs.

Having got that out of the way, in NAudio it is possible for audio streams to have any number of channels. The WaveFormat class has a channel count, and though this is normally set at 1 or 2, there is no reason why you can’t set it to 8.

What would be useful is an implementation of IWaveProvider that allows us to connect different inputs to particular outputs, kind of like a virtual patch bay. For example, if you had two Mp3FileReaders, and wanted to connect the left channel of the first to output 1 and the left channel of the second to output 2, this class would let you do that.

So I’ve created something I’ve called the MultiplexingWaveProvider (if you can think of a better name, let me know in the comments). In the constructor, you simply provide all the inputs you wish to use, and specify the number of output channels you would like. By default the inputs will be mapped directly onto the outputs (and wrap round if there are fewer input channels than outputs – so a single mono input would be automatically copied to every output), but these mappings can be changed.

Creating and Configuring MultiplexingWaveProvider

In the following example, we create a new four-channel WaveProvider, so the first two outputs will play left and right channel from input1 and the second two outputs will have the left and right channels from input2. Note that input1 and input2 must be at the same sample rate and bit depth.

var input1 = new Mp3FileReader("test1.mp3");
var input2 = new Mp3FileReader("test2.mp3");
var waveProvider = new MultiplexingWaveProvider(new IWaveProvider[] { input1, input2 }, 4);

Then you can configure the outputs, which is done using ConnectInputToOutput:

waveProvider.ConnectInputToOutput(2,0);
waveProvider.ConnectInputToOutput(3,1);
waveProvider.ConnectInputToOutput(1,2);
waveProvider.ConnectInputToOutput(1,3);

The numbers used are zero-based, so connecting inputs 2 and 3 to outputs 0 and 1 means that test2.mp3 will now play out of the first two outputs instead of the second two. In this example I have connected input 1 (i.e. the right channel of test1.mp3) to both outputs 2 and 3. So you can copy the same input to multiple output channels, and not all input channels need a mapping.
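To make the effect of those four calls concrete, here is a small stand-in for the routing table. PatchBay is a hypothetical class written purely for this illustration (it is not the NAudio implementation), and it assumes the same argument order as above: input channel first, then output channel.

```csharp
using System;

// Hypothetical stand-in for the routing table inside a multiplexer.
public class PatchBay
{
    private readonly int inputChannelCount;
    private readonly int[] mappings; // mappings[output] = input channel feeding it

    public PatchBay(int inputChannelCount, int outputChannelCount)
    {
        if (inputChannelCount < 1) throw new ArgumentOutOfRangeException(nameof(inputChannelCount));
        if (outputChannelCount < 1) throw new ArgumentOutOfRangeException(nameof(outputChannelCount));

        this.inputChannelCount = inputChannelCount;
        mappings = new int[outputChannelCount];
        for (int n = 0; n < outputChannelCount; n++)
        {
            // Default: map directly, wrapping round when inputs run out.
            mappings[n] = n % inputChannelCount;
        }
    }

    public void ConnectInputToOutput(int inputChannel, int outputChannel)
    {
        if (inputChannel < 0 || inputChannel >= inputChannelCount)
            throw new ArgumentException("Invalid input channel");
        if (outputChannel < 0 || outputChannel >= mappings.Length)
            throw new ArgumentException("Invalid output channel");
        mappings[outputChannel] = inputChannel;
    }

    public int GetInputForOutput(int outputChannel)
    {
        return mappings[outputChannel];
    }
}
```

With two stereo inputs (four input channels in total) and four outputs, the calls shown above leave outputs 0 to 3 fed by input channels 2, 3, 1 and 1 respectively, and input channel 0 ends up unused.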

Implementation of MultiplexingWaveProvider

The bulk of the work to achieve this is performed in the Read method of MultiplexingWaveProvider. The first task is to work out how many “sample frames” are required. A sample frame is a single sample in a mono signal, a left and right pair in a stereo signal, and so on. Once we have worked out how many sample frames we need, we then attempt to read that many sample frames from every one of the input WaveProviders (irrespective of whether they are connected to an output – we want to keep them in sync). Then, using our mappings, we work out whether any of the channels from this input WaveProvider are needed in the output. Since samples are interleaved in both input and output WaveProviders, we can’t do just one Array.Copy – we must copy each sample across individually and put it into the right place.
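The sample frame arithmetic is easy to get wrong, so here it is in isolation. FrameMath and its method names are hypothetical, introduced only for this illustration.

```csharp
using System;

// Hypothetical helpers showing sample-frame arithmetic for interleaved PCM.
public static class FrameMath
{
    // Number of whole sample frames that fit in a given number of bytes.
    public static int FramesInBytes(int byteCount, int bytesPerSample, int channels)
    {
        return byteCount / (bytesPerSample * channels);
    }

    // Number of bytes needed to hold a given number of sample frames.
    public static int BytesForFrames(int frames, int bytesPerSample, int channels)
    {
        return frames * bytesPerSample * channels;
    }

    // Byte offset of one channel's sample within a given frame.
    public static int SampleOffset(int frame, int channel, int bytesPerSample, int channels)
    {
        return (frame * channels + channel) * bytesPerSample;
    }
}
```

For 16 bit audio with four output channels, a request for 4096 bytes is 512 sample frames. Reading those same 512 frames from a stereo input needs only 2048 bytes, and the sample for frame 3, channel 1 of that stereo input starts at byte 14.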

A well behaved Read method will always return count unless it has reached the end of its available data (and then it should always return 0 in every subsequent call). The way we do this is work out the maximum number of sample frames read out of any of the inputs, and use that to report back the count that is read. This means that we will keep going until we have reached the end of all of our inputs. Because buffers might be reused, it is important that we zero out the output buffer if there was no available input data.

Here’s the implementation as it currently stands:

public int Read(byte[] buffer, int offset, int count)
{
    int sampleFramesRequested = count / (bytesPerSample * outputChannelCount);
    int inputOffset = 0;
    int sampleFramesRead = 0;
    // now we must read from all inputs, even if we don't need their data, so they stay in sync
    foreach (var input in inputs)
    {
        int bytesRequired = sampleFramesRequested * bytesPerSample * input.WaveFormat.Channels;
        byte[] inputBuffer = new byte[bytesRequired];
        int bytesRead = input.Read(inputBuffer, 0, bytesRequired);
        sampleFramesRead = Math.Max(sampleFramesRead, bytesRead / (bytesPerSample * input.WaveFormat.Channels));

        for (int n = 0; n < input.WaveFormat.Channels; n++)
        {
            int inputIndex = inputOffset + n;
            for (int outputIndex = 0; outputIndex < outputChannelCount; outputIndex++)
            {
                if (mappings[outputIndex] == inputIndex)
                {
                    int inputBufferOffset = n * bytesPerSample;
                    int outputBufferOffset = offset + outputIndex * bytesPerSample;
                    int sample = 0;
                    while (sample < sampleFramesRequested && inputBufferOffset < bytesRead)
                    {
                        Array.Copy(inputBuffer, inputBufferOffset, buffer, outputBufferOffset, bytesPerSample);
                        outputBufferOffset += bytesPerSample * outputChannelCount;
                        inputBufferOffset += bytesPerSample * input.WaveFormat.Channels;
                        sample++;
                    }
                    // clear the end
                    while (sample < sampleFramesRequested)
                    {
                        Array.Clear(buffer, outputBufferOffset, bytesPerSample);
                        outputBufferOffset += bytesPerSample * outputChannelCount;
                        sample++;
                    }
                }
            }
        }
        inputOffset += input.WaveFormat.Channels;
    }

    return sampleFramesRead * bytesPerSample * outputChannelCount;
}

Performance

Looking at the code above, you will probably notice that this could be made more efficient if we knew in advance whether we were dealing with 16, 24 or 32 bit input audio (it currently has lots of calls to Array.Copy to copy just 2, 3 or 4 bytes). I might make three versions of this class at some point, to ensure that this performs a bit better. Another weakness in the current design is the creation of a new buffer on every call to Read, which is something that I generally avoid since it gives work to the garbage collector (update – this is fixed in the latest code).
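As a rough sketch of both ideas, the following hypothetical helper reuses a buffer between calls and copies 16 bit samples with direct byte assignments rather than Array.Copy. The names are made up for illustration, and this is not the code that went into NAudio.

```csharp
using System;

// Hypothetical helpers illustrating two possible optimisations.
public static class MultiplexerTuning
{
    // Returns the existing buffer if it is big enough, otherwise a new one.
    // Storing the result in a field avoids an allocation on every Read.
    public static byte[] EnsureBuffer(byte[] buffer, int bytesRequired)
    {
        if (buffer == null || buffer.Length < bytesRequired)
        {
            buffer = new byte[bytesRequired];
        }
        return buffer;
    }

    // Copies one channel of interleaved 16 bit audio into one output channel.
    public static void CopyChannel16(byte[] input, int inputChannel, int inputChannels,
        byte[] output, int outputChannel, int outputChannels, int frames)
    {
        int inPos = inputChannel * 2;
        int outPos = outputChannel * 2;
        for (int frame = 0; frame < frames; frame++)
        {
            output[outPos] = input[inPos];         // first byte of the sample
            output[outPos + 1] = input[inPos + 1]; // second byte of the sample
            inPos += inputChannels * 2;            // step to the next input frame
            outPos += outputChannels * 2;          // step to the next output frame
        }
    }
}
```

Whether this is actually faster than Array.Copy for such small copies would need measuring, so treat it as a starting point for profiling rather than a guaranteed win.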

I have written a full suite of unit tests for this class, so if it does need some performance tuning, there is a safety net to ensure nothing gets broken along the way.

MultiplexingSampleProvider

NAudio 1.5 also has an ISampleProvider interface, which is a much more programmer friendly way of dealing with 32 bit floating point audio. I have also made MultiplexingSampleProvider for the next version of NAudio. One interesting possibility would be then to build on that to create a kind of bus matrix, where every input can be mixed by different amounts into each of the output channels.
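To give an idea of what such a bus matrix might look like, here is a minimal sketch working directly on interleaved float buffers. BusMatrix is a hypothetical class, not part of NAudio, and it leaves out the ISampleProvider plumbing entirely.

```csharp
using System;

// Hypothetical bus matrix: every input channel can be mixed into every
// output channel with its own gain.
public class BusMatrix
{
    private readonly float[,] gains; // gains[output, input], all zero initially
    private readonly int inputChannels;
    private readonly int outputChannels;

    public BusMatrix(int inputChannels, int outputChannels)
    {
        this.inputChannels = inputChannels;
        this.outputChannels = outputChannels;
        gains = new float[outputChannels, inputChannels];
    }

    public void SetGain(int inputChannel, int outputChannel, float gain)
    {
        gains[outputChannel, inputChannel] = gain;
    }

    // Mixes 'frames' sample frames of interleaved input into interleaved output.
    public void Process(float[] input, float[] output, int frames)
    {
        for (int frame = 0; frame < frames; frame++)
        {
            for (int o = 0; o < outputChannels; o++)
            {
                float sum = 0f;
                for (int i = 0; i < inputChannels; i++)
                {
                    sum += gains[o, i] * input[frame * inputChannels + i];
                }
                output[frame * outputChannels + o] = sum;
            }
        }
    }
}
```

Setting a single gain of 1.0 for each output reproduces the plain multiplexer, while setting two gains of 0.5 on one output gives a mono mixdown of a stereo input.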

Uses

This class actually has uses beyond supporting soundcards with more than 2 outputs. You could use it to swap left and right channels in a stereo signal, or provide a simple switch that selects between several mono inputs.

You also don’t need to output to the soundcard. The WaveFileWriter will happily write multi-channel WAV files. However, there are no guarantees about how other programs will deal with WAVs that have more than two channels in them.

Availability

I’ve already checked in the initial version to the latest codebase, so expect this to be part of NAudio 1.6. The only caution is that I might change the class name if I come up with a better idea.

Tuesday, 10 January 2012

10 Commandments of Inversion of Control Containers

Inversion of Control containers can be a very powerful tool for decoupling spaghetti code in large software systems. However, with any power tool, you can hurt yourself badly if you don’t use it correctly. In this post, I present “10 commandments” to help you avoid causing problems with IoC, with a particular focus on very large software systems with many developers and hundreds of interfaces. I am currently using Unity, but these tips apply to pretty much any IoC container.

1. Configure everything before your first resolve

It is possible to ask the IoC container to resolve an interface before you have finished configuring the container. So long as it knows how to make the interface you asked for and its dependencies, it will have no problem fulfilling your request. But if your application hasn’t yet finished configuring the container, you run the risk of getting the wrong thing. Consider the following simple example:

IUnityContainer container = new UnityContainer();
container.RegisterType<IFoo, Foo>(new ContainerControlledLifetimeManager());
var f1 = container.Resolve<IFoo>();

Fairly straightforward, we ask for IFoo and get an instance of Foo. But what if we hadn’t quite finished configuring the container, and some other part in your app attempts to override the registration for IFoo:

container.RegisterType<IFoo, Foo2>(new ContainerControlledLifetimeManager());

Now, whenever we attempt to resolve IFoo, we get Foo2. But the initial component that did an early resolve is using the wrong implementation. This can make for horrible debugging sessions. Configure your container completely, before you start to resolve things from it. Which leads us to our second commandment…
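The problem can be reproduced without Unity at all. The sketch below uses a deliberately tiny, hypothetical container (TinyContainer is not a real library, and it only models one shared instance per registration) to show the early consumer and the late consumer ending up with different implementations.

```csharp
using System;
using System.Collections.Generic;

public interface IFoo { }
public class Foo : IFoo { }
public class Foo2 : IFoo { }

// Hypothetical miniature container: one shared instance per registration,
// and a later registration silently replaces an earlier one.
public class TinyContainer
{
    private readonly Dictionary<Type, Lazy<object>> registrations =
        new Dictionary<Type, Lazy<object>>();

    public void RegisterType<TInterface, TImplementation>()
        where TImplementation : TInterface, new()
    {
        registrations[typeof(TInterface)] = new Lazy<object>(() => new TImplementation());
    }

    public TInterface Resolve<TInterface>()
    {
        return (TInterface)registrations[typeof(TInterface)].Value;
    }
}

public static class EarlyResolveDemo
{
    // Returns the concrete type names seen by an early and a late consumer.
    public static string[] Run()
    {
        var container = new TinyContainer();
        container.RegisterType<IFoo, Foo>();
        IFoo early = container.Resolve<IFoo>();  // resolved before configuration is complete

        container.RegisterType<IFoo, Foo2>();    // another module overrides the registration
        IFoo late = container.Resolve<IFoo>();

        return new[] { early.GetType().Name, late.GetType().Name };
    }
}
```

Run returns "Foo" for the early consumer and "Foo2" for the late one, which is exactly the split-brain situation described above.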

2. Don’t pass the container around

What I mean here is: don’t pass around the top-level interface that allows further configuration of the container. In Unity, this is the IUnityContainer interface. It might feel very powerful to send it around, since it allows other parts of your application to register new rules, but it opens the door for the kind of bugs we just discussed.

But what about just a Service Locator interface? Can I pass one of those around? Onto commandment 3…

3. Avoid passing an IServiceLocator around

Passing a service locator in as a dependency gives your class great power. It can ask for anything it wants, which is super convenient. But it also introduces some problems.

First, it means your class no longer advertises what it needs in the constructor. Without examining the code you can’t be sure what needs to be in the container for the class to work correctly. It means that unit tests, or callers of your class that aren’t using a container, will have to mock a container just to instantiate your class.

This has been called the “Hollywood principle” – Don’t call the DI Container, it’ll call you. Just put the interfaces you really need in your constructor.
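A small sketch of what that looks like in practice. OrderProcessor, ListLog and the Write method on ILog are hypothetical names chosen for the example. The point is only that the constructor states the dependency, so a test can pass in a hand-written fake without building a container.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical logging interface for the example.
public interface ILog
{
    void Write(string message);
}

// The dependency is visible in the constructor signature.
public class OrderProcessor
{
    private readonly ILog log;

    public OrderProcessor(ILog log)
    {
        if (log == null) throw new ArgumentNullException(nameof(log));
        this.log = log;
    }

    public void Process(int orderId)
    {
        log.Write("Processed order " + orderId);
    }
}

// A hand-written fake is enough for a unit test: no container, no mocking
// of a service locator.
public class ListLog : ILog
{
    public List<string> Messages { get; } = new List<string>();

    public void Write(string message)
    {
        Messages.Add(message);
    }
}
```

Had OrderProcessor taken a service locator instead, the test would first need a locator that knows how to return an ILog, and nothing in the signature would say so.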

4. Avoid making the container a singleton

Making your container a singleton is the quickest route to providing access to all your services everywhere in your application. It can seem like a great idea because wherever you are, you can just do this to get whatever interface you like:

var foo = MyContainer.Instance.Resolve<IFoo>();

But there are two big problems. First, this introduces hidden dependencies into your class. Instead of your constructor advertising its dependencies, we again must examine the code to know what needs to be set up in the container.

Second, it assumes that all parts of your application will work with the same container, and the same implementations of each interface. This may be true for small applications, but in large enterprise systems, there may well be the need for multiple IoC containers. In our systems, we make use of Unity child containers, allowing different sections of the application to get access to their own implementations of interfaces. With a singleton container, this is impossible.

5. Avoid constructor bloat

One of the great things about IoC containers is that you don’t have to call constructors yourself – the container does the hard work for you. This means you can have a dozen constructor parameters, each representing a different dependency, and never have to write the code that calls it.

public MyClass(IFoo foo, IFoo2 foo2, ILog log, IExporter exporter, IEmailer emailer, ISettings settings, IAudit audit)
{
}

That is, until you want to test it. Then you have to mock up all of those interfaces. And probably you will find that your class only needs to call one or two methods on each. This is the time to apply the Interface Segregation Principle and replace them with one or two more focussed interfaces that represent the real dependencies of your class under test.

6. Avoid property injection

Most IoC containers offer a way to let you put attributes on properties in order to tell the container that it needs to set that property after constructing the object. In Unity you use the Dependency attribute:

class Bar
{
    [Dependency]
    public IFoo Foo { get; set; }

    public Bar()
    {
    }
}

Although this seems like a great feature, it has the effect of hiding this dependency from anyone who is constructing your object without an IoC container. At the very least, your class should report a good error message if someone forgets to set up a property dependency.
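One possible shape for that error message is sketched below. The DoWork method and the Describe member on IFoo are hypothetical, and the [Dependency] attribute is left off so that the example compiles without the Unity assembly.

```csharp
using System;

public interface IFoo
{
    string Describe(); // hypothetical member, for illustration only
}

public class Bar
{
    // In Unity this property would carry the [Dependency] attribute.
    public IFoo Foo { get; set; }

    public string DoWork()
    {
        if (Foo == null)
        {
            // Fail with an explanation rather than a bare NullReferenceException.
            throw new InvalidOperationException(
                "Bar.Foo has not been set. Resolve Bar from the container, " +
                "or assign the Foo property before calling DoWork.");
        }
        return Foo.Describe();
    }
}
```

Someone who constructs Bar by hand and forgets the property now gets told which property is missing and how to fix it.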

7. Document your interfaces

If you are using an IoC container, chances are you are working on a large system, and other developers are resolving interfaces that you put in the container.

Often developers will add good comments to the concrete implementation of a class, but spend very little time commenting the interface (who likes to write the same comments twice?). So in the concrete class we might have a comment like this:

/// <summary>
/// Call this to process all the files in the InputFiles collection, using the rules from the Rules collection
/// </summary>
/// <param name="mode">Processing mode, 0 = replace, 1 = update, 2 = overwrite, 3 = test only</param>
public void Process(int mode)

but in the interface, we couldn’t be bothered to type that all again (and we hate cutting and pasting anyway) so we just have this:

/// <summary>
/// Process
/// </summary>
void Process(int mode);

But it is the interface that is the public API for your service. The comments on the interface will be used to display Intellisense to the user. The caller may not even have access to the code for the concrete implementation. If you are going to spend time writing good comments, write them on the interface. The difference shows up directly in Intellisense: a caller working against the interface sees only “Process”, whereas the comment on the concrete class would have told them what the method does and what each mode value means.
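One way to avoid writing the comment twice is to keep the full text on the interface and let the implementation inherit it. The sketch below assumes your tooling honours the inheritdoc tag (recent versions of Visual Studio and several documentation generators do, but check yours). IFileProcessor and FileProcessor are hypothetical names.

```csharp
using System;

/// <summary>
/// Processes a batch of input files.
/// </summary>
public interface IFileProcessor
{
    /// <summary>
    /// Call this to process all the files in the InputFiles collection,
    /// using the rules from the Rules collection.
    /// </summary>
    /// <param name="mode">Processing mode, 0 = replace, 1 = update, 2 = overwrite, 3 = test only</param>
    void Process(int mode);
}

public class FileProcessor : IFileProcessor
{
    /// <inheritdoc />
    public void Process(int mode)
    {
        // Implementation omitted in this sketch.
    }
}
```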

8. Don’t depend on Dispose in a specific order

When you Dispose your IoC container, it will go through all of the IDisposable instances it knows about and call Dispose on them. But this can introduce a problem, because we cannot guarantee the order the services are disposed in. If you make a call into another service in your Dispose method, how do you know that service hasn’t already been Disposed?

Instead, design your services in such a way that they can be disposed in any order, and use events or some other form of messaging to report to your system that a shutdown is about to happen, allowing any last minute logging, saving etc to be done beforehand, while all the services are still up and running.
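A minimal sketch of that two-phase shutdown follows. ShutdownCoordinator, Logger and Cache are all hypothetical classes, and the coordinator stands in for whatever messaging mechanism your system already has.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical coordinator: announces the shutdown first, disposes second.
public class ShutdownCoordinator
{
    public event EventHandler ShuttingDown;

    public void Shutdown(IEnumerable<IDisposable> services)
    {
        // Phase 1: announce. Handlers may still call into other services.
        ShuttingDown?.Invoke(this, EventArgs.Empty);

        // Phase 2: dispose. Order no longer matters, because each Dispose
        // only releases that object's own resources.
        foreach (var service in services)
        {
            service.Dispose();
        }
    }
}

public class Logger : IDisposable
{
    private bool disposed;
    public List<string> Lines { get; } = new List<string>();

    public void Log(string line)
    {
        if (disposed) throw new ObjectDisposedException(nameof(Logger));
        Lines.Add(line);
    }

    public void Dispose() { disposed = true; }
}

public class Cache : IDisposable
{
    public Cache(ShutdownCoordinator coordinator, Logger logger)
    {
        // Last-minute logging happens in the event handler, not in Dispose.
        coordinator.ShuttingDown += (sender, e) => logger.Log("Cache flushed");
    }

    public void Dispose()
    {
        // Release only this object's own resources; no calls into other services.
    }
}
```

Even if the logger happens to be disposed before the cache, the final log line has already been written during the announcement phase.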

9. Make sure Lifetime Management is communicated

If you call Resolve<IFoo> twice on your IoC container, you might get two new instances of the Foo class, or you might get the same one twice. Without looking at how your container is configured, you have no way of knowing. But this can be a real headache if IFoo implements IDisposable. How do you know whether you ought to call Dispose on it or not?

I don’t know of any slick solution to this, but I would typically avoid situations where a Resolve gives you something you need to Dispose yourself. Instead I would return a factory object that makes it very clear you are building a new instance whose lifetime you control yourself. Whatever approach you use, make sure your whole development team understands it. You don’t want someone Disposing a service too early, resulting in the next person to use it getting a nasty exception.
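As an illustration of the factory approach, the sketch below registers a factory rather than the disposable type itself. IFooFactory, FooFactory, Consumer and the IsDisposed property are hypothetical and exist only for this example.

```csharp
using System;

public interface IFoo : IDisposable
{
    bool IsDisposed { get; } // hypothetical, so the example can observe disposal
}

public class Foo : IFoo
{
    public bool IsDisposed { get; private set; }

    public void Dispose() { IsDisposed = true; }
}

// Registered in the container in place of IFoo. The name Create signals
// that the caller owns the result and must dispose it.
public interface IFooFactory
{
    IFoo Create();
}

public class FooFactory : IFooFactory
{
    public IFoo Create() { return new Foo(); }
}

public class Consumer
{
    private readonly IFooFactory fooFactory;

    public Consumer(IFooFactory fooFactory)
    {
        this.fooFactory = fooFactory;
    }

    // Creates a Foo, uses it, disposes it, and reports whether it was disposed.
    public bool UseFooOnce()
    {
        IFoo foo = fooFactory.Create();
        using (foo)
        {
            // Work with foo here.
        }
        return foo.IsDisposed;
    }
}
```

The factory itself can safely be shared by everyone, since nobody is tempted to dispose it, and each caller knows the instance it created is its own to clean up.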

10. Document your public API

In a large system, a container can easily fill up with a lot of interfaces. The trouble is, not all of these are at the same level. Some are high-level interfaces, allowed to be called from anywhere, whilst other things are only in the container so they can fulfil the dependencies of those high-level interfaces. It means that you run the risk of developers guessing incorrectly about which interface they are supposed to call to achieve a particular task. They will assume that if they can get at it from the container, then they must be allowed to call it.

Now there are ways of having interfaces defined in your container that people can’t get at from the wrong place by making good use of assemblies and the internal keyword, but really, you need to make it easy for developers to know what is in the container and how they are intended to use it. This may well involve maintaining an API document, and also means writing good comments on the interface. If you don’t do this, don’t be surprised to see code that inadvertently circumvents key functionality by calling into a component at too low a level, or the wheel being reinvented, simply because a developer didn’t know the container included a service that had the desired behaviour.

Do you have any tips for getting the best out of IoC containers? Please let me know in the comments.