Friday 14 December 2012

Media Foundation Support in NAudio 1.7

I’ve been working on adding Media Foundation support to NAudio 1.7 over the past few weeks. There are two reasons for this. The first is that Windows Store apps can use Media Foundation but cannot use ACM or DMO, the two codec APIs that NAudio already supported. This was the impetus I needed to finally get round to wrapping the Media Foundation API, having been put off by the thought of wrapping yet another large COM-based API.

The second is simply that Media Foundation is the future for audio codecs in Windows, and includes some codec support that ACM doesn’t offer, such as AAC encode and decode. Windows 8 even comes with an MP3 encoder.

There was a lot of interop code required to get Media Foundation working, but with the help of a few people (most notably ManU from the Codeplex forums), I have now done enough to enable the three main uses of Media Foundation – encoding, decoding and resampling.

NAudio 1.7 will have the following three main classes to support Media Foundation:

  • MediaFoundationReader: this implements WaveStream and allows you to play anything that Media Foundation can play. This means MP3, AAC, WMA and WAV, and includes streaming from the internet. It can even pull the audio out of video files. It may be that this class becomes the primary way of playing audio in NAudio going forwards. The output of MediaFoundationReader will always be PCM, so no second converter step is required. It also tries to hide the very awkward problem of COM apartment state from you by (optionally) recreating the Media Foundation source reader in the first call to Read (as that call might come from an MTA thread). It uses Media Foundation’s support for repositioning, which so far looks pretty good (although it might not land on exactly the position you asked for), and it can even reposition in MP3 files you are downloading from the internet.
  • MediaFoundationEncoder: I wanted to make encoding as simple as possible, and I’m quite pleased with the API I came up with (you can read a bit about it here). This class includes helper methods for encoding WMA, MP3 and AAC (assuming you have the encoders installed), and all you need to do is supply the output filename, the PCM source stream and the desired bitrate. It is also extensible enough to let you use any other encoder you have.
  • MediaFoundationResampler: the most useful of all the Media Foundation effects, and based on MediaFoundationTransform, which you can use to wrap other effects if needed. The resampler in Media Foundation is reasonably good quality and can also change the bit depth and channel count, making it a very useful general purpose class. It is also hugely beneficial for supporting playback and recording with WASAPI in Windows Store applications, since the DMO interface which the existing WASAPI support uses is not allowed there.
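
To give a feel for how the three classes fit together, here is a minimal sketch rather than a definitive recipe. It assumes the method names and signatures as they appear in released NAudio builds (MediaFoundationApi.Startup, the MediaFoundationResampler constructor taking a WaveFormat, the ResamplerQuality property, MediaFoundationEncoder.EncodeToWma), which may differ from the preview code. The file names are hypothetical, and the WMA encoder is assumed to be installed.

```csharp
using System;
using NAudio.MediaFoundation;
using NAudio.Wave;

class MediaFoundationSketch
{
    static void Main(string[] args)
    {
        // Hypothetical file names, for illustration only.
        string inputPath = args.Length > 0 ? args[0] : "input.mp3";
        string resampledPath = "resampled.wav";
        string encodedPath = "output.wma";

        // Start Media Foundation before using the encoder.
        MediaFoundationApi.Startup();
        try
        {
            // Decoding: the reader is a WaveStream that produces PCM.
            using (var reader = new MediaFoundationReader(inputPath))
            {
                Console.WriteLine("Decoded format: {0}", reader.WaveFormat);

                // Resampling: change sample rate, bit depth and channel count
                // in one step (here to 16 kHz, 16 bit, mono).
                var targetFormat = new WaveFormat(16000, 16, 1);
                using (var resampler = new MediaFoundationResampler(reader, targetFormat))
                {
                    resampler.ResamplerQuality = 60; // assumed range 1 (fastest) to 60 (best)
                    WaveFileWriter.CreateWaveFile(resampledPath, resampler);
                }

                // Encoding: rewind, then supply the PCM source, the output
                // file name and the desired bitrate in bits per second.
                reader.Position = 0;
                MediaFoundationEncoder.EncodeToWma(reader, encodedPath, 128000);
            }
        }
        finally
        {
            MediaFoundationApi.Shutdown();
        }
    }
}
```

If no installed encoder can produce the requested format, the encode call is expected to throw, so production code would want to catch that and try a different bitrate or codec.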

I’m also working on adding Windows Store support. The main difference is the way you read and write files in Windows Store apps. Currently I’ve got a derived MediaFoundationReaderRT class in the demo, which allows you to open files from an IRandomAccessStream. I’ll probably do a similar thing for the encoder class as well.

I think the code can still be optimised a bit, particularly in the way that media buffers are created during resampling, but I am very close to completion, and I think this is going to be a fantastic feature for the next version of NAudio. If you want to try it out, you can build the latest NAudio from source yourself, or look out for preview builds of NAudio 1.7 on NuGet. The NAudio WPF Demo app includes demonstrations of all three of the main NAudio Media Foundation classes, plus how to enumerate the Media Foundation codecs.