Thursday, 13 May 2010

Converting MP3 to WAV with NAudio

One of the more common support requests with NAudio is how to convert MP3 files to WAV. Here’s a simple function that will do just that:

public static void Mp3ToWav(string mp3File, string outputFile)
{
    using (Mp3FileReader reader = new Mp3FileReader(mp3File))
    {
        using (WaveStream pcmStream = WaveFormatConversionStream.CreatePcmStream(reader))
        {
            WaveFileWriter.CreateWaveFile(outputFile, pcmStream);
        }
    }
}

… and that’s all there is to it.

Notes:

  1. There is no need to wrap pcmStream with a BlockAlignReductionStream since we are not repositioning the MP3 file, just reading it from start to finish.
  2. It uses the ACM MP3 decoder that comes with Windows to decompress the MP3 data.
  3. To be able to pass the MP3 data to the ACM converter we need to do some parsing of the MP3 file itself (specifically the MP3 frames and ID3 tags). Unfortunately, there are some MP3 files that NAudio cannot parse (notably the sample ones that come with Windows 7). My attempts at tracking down the cause of this problem have so far failed. (the basic issue is that after reading the ID3v2 tag, we don’t end up in the right place in the MP3 file to read a valid MP3Frame – see MP3FileReader.cs). Please let me know if you think you have a solution to this.

Wednesday, 12 May 2010

LINQ to General Election Part 2 – LINQ to XLS

In my previous post, I demonstrated how to run LINQ queries against the UK general election data the Guardian newspaper in JSON format. They additionally made a Google spreadsheet available which not only has the winners in every constituency but the number of votes for each candidate that stood.

Now it might be possible to run LINQ queries directly against the Google spreadsheet using the Google Data APIs (perhaps an experiment for another day), but I chose instead to download the spreadsheet as a Microsoft Excel spreadsheet and run my LINQ queries against that instead.

First of all, I needed a LINQ to Excel provider and found linqtoexcel. Unfortunately, it requires you to reference several DLLs, but once you have done so, it is pretty straightforward to use.

Mapping the Spreadsheet

One nice feature of linqtoexcel is that it makes it easy to map your spreadsheet data onto a strongly typed class. By default it will simply look for properties matching the column names. Here’s my election result class:

public class Result
{
    public string Seat { get; set; }
    public string Candidate { get; set; }
    public string Party { get; set; }
    public int Vote { get; set; }
    public double VotePercent { get; set; }
    public string Winner { get; set; }
}

Now I can open the XLS file using the ExcelQueryFactory class and I also need to add a custom mapping since the title of the VotePercent column is not a valid C# identifier. Note that you don’t have to map all the fields. I have just picked out the six that I needed for my queries.

var repo = new ExcelQueryFactory();
repo.FileName = @"..\..\..\Data\General election 2010 results.xls";
repo.AddMapping<Result>(x => x.VotePercent, "%Vote");
Querying the Data

Having done this, we are now in a position to perform any queries. We can specify the worksheet name from which our data comes (in this case “FULL CONSTITUENCY DATA”). Having the use of strongly typed classes makes the LINQ much more readable than in the JSON example.

var unluckyLosers = from p in repo.Worksheet<Result>("FULL CONSTITUENCY DATA")
                       where (p.Party != p.Winner) && (p.VotePercent > 42)
                       orderby p.VotePercent ascending
                       select p;
Console.WriteLine("Unlucky Losers:");
foreach (var winner in unluckyLosers)
{
    Console.WriteLine(String.Format("{2}%: {0} ({1})", winner.Candidate, winner.Party, winner.VotePercent));
}

Unfortunately, there seems to be a bug in that the orderby clause seems to be ignored. I tried a few different sort fields but with no success. The output from the above query is as follows:

Unlucky Losers:
44.52%: King, Nick (C)
45.51%: Connor, Rodney (Ind)
43.79%: Kelley, Claire (LD)
42.11%: Dismore, Andrew (Lab)
42.01%: Harris, Evan (LD)
42.81%: Kramer, Susan (LD)
42.55%: Throup, Maggie (C)
44.51%: Rees-Mogg, Annunziata (C)
42.36%: Stroud, Philippa (C)
42.2%: Formosa, Mark (C)
42.53%: Heathcoat-Amory, David (C)
43.08%: Tod, Martin (LD)

Patently, they have not been sorted in the requested order. I have filed a bug against linqtoexcel. Hopefully it can be resolved soon. Update: A fix was released shortly after my bug report - linqtoexcel 1.3.70 works fine.

Tuesday, 11 May 2010

LINQ to General Election Part 1 – LINQ to JSON

I discovered this week that the Guardian newspaper is making the UK General Election results available in JSON format. Their politics API gives access to all kinds of results, party and candidate information.

What I wanted to do was to be able to quickly and easily perform some of my own queries on the election results. Having not worked with JSON, I needed something that could parse it and allow me to run LINQ queries on it. My search led me to Json.NET, which is a very easy to use library that allows you to parse JSON into an object that you can run LINQ queries against.

Visualising JSON

When working with JSON, you need to have some kind of tool that will let you explore the structure of the JSON data. I found a handy utility called JsonViewer that parses it into a tree format. Alternatively, you can use this online JSON formatter.

Parsing JSON

The first step is to download the JSON and parse it into a JObject:

string url = "http://www.guardian.co.uk/politics/api/general-election/2010/results/json";
WebClient wc = new WebClient();
string json = wc.DownloadString(url); 
JObject o = JObject.Parse(json);

LINQ Queries

Now it is straightfoward to write queries against the object. Notice that you can use Children() to select all the children under a specific node (useful for arrays).

var unpopularWinners = from winningMp in o["results"]["called-constituencies"].Children()["result"]["winning-mp"]
    where (decimal)winningMp["votes-as-percentage"] < (100 / 3M)
    orderby (decimal)winningMp["votes-as-percentage"] ascending
    select winningMp;

To get the values out of the instances of JToken returned, you have to remember to cast them to the appropriate type (e.g. int, string, decimal):

Console.WriteLine("Unpopular Winners:");
foreach (var winner in unpopularWinners)
{
    Console.WriteLine("{0:F1}% {1} ({2})", (decimal)winner["votes-as-percentage"], (string)winner["name"], (string)winner["party"]["name"]);
}

LINQ to JSON seems to support the full range of LINQ operations, including grouping:

var seatsPerParty = from winningMp in o["results"]["called-constituencies"].Children()["result"]["winning-mp"]
    group winningMp by (string)winningMp["party"]["name"] into g
    orderby g.Count() descending
    select new { Party = g.Key, Seats = g.Count(), TotalVotes = g.Sum(c => (int)c["votes-as-quantity"]) 
};

Unfortunately, the Guardian do not provide one JSON file that contains the votes for all candidates, although it looks like you can get that information if you follow another link for each one. However, they did make the full results available in a Google spreadsheet, so I plan to follow up shortly with another post on how to perform LINQ queries against Excel data.