There are two extensions to the standard WAV file format that you may sometimes want to use. The first is the RF64 extension, which overcomes the inherent limitation that WAV files cannot be larger than 4GB. The second is the Broadcast Wave Format (BWF), which builds on the existing WAV format and specifies various extra chunks containing metadata.
In this post, I’ll explain how you can use NAudio to make a class that creates Broadcast Wave Files, supports large file sizes through the RF64 extension, and includes the ‘bext’ chunk from the BWF specification.
File Header
First of all, a WAV file starts with the byte sequence ‘RIFF’ followed by a four byte size value, which is the number of bytes that follow it in the file (the file size minus 8). However, for large files, ‘RF64’ is used instead of ‘RIFF’, and the four byte RIFF size is then ignored (it should be set to -1, which is 0xFFFFFFFF when read as an unsigned value).
Then we have the ‘WAVE’ identifier (another 4 bytes). In a normal WAV file we would usually expect the format chunk next (with the ‘fmt ’ 4 byte identifier), but to support RF64 we first add a ‘JUNK’ chunk. Its body is 28 bytes long and is initially all set to zeroes. If the overall size of the WAV file grows to over 4GB, we turn this ‘JUNK’ chunk into a ‘ds64’ chunk. If the file does not grow beyond 4GB, the ‘JUNK’ chunk is simply left in place, and media players will just ignore it.
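To make the header layout concrete from the reading side, here is a minimal sketch of how a reader might tell the two forms apart. The class and method names (WavHeaderSketch, ReadHeader) are my own for illustration and are not part of NAudio.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper for illustration only; not part of NAudio.
static class WavHeaderSketch
{
    // Reads the 12-byte file header plus the ID of the first chunk.
    // Returns true when the file uses the RF64 form.
    public static bool ReadHeader(Stream stream, out uint riffSize32, out string firstChunkId)
    {
        // leaveOpen: true so the caller keeps ownership of the stream
        var reader = new BinaryReader(stream, Encoding.ASCII, true);
        var fileId = Encoding.ASCII.GetString(reader.ReadBytes(4));
        if (fileId != "RIFF" && fileId != "RF64")
            throw new InvalidDataException("Not a RIFF or RF64 file");
        riffSize32 = reader.ReadUInt32(); // 0xFFFFFFFF (-1) in an RF64 file
        var waveId = Encoding.ASCII.GetString(reader.ReadBytes(4));
        if (waveId != "WAVE")
            throw new InvalidDataException("Not a WAVE file");
        // Expect "JUNK" (placeholder), "ds64" (RF64) or "fmt " (plain WAV) here
        firstChunkId = Encoding.ASCII.GetString(reader.ReadBytes(4));
        return fileId == "RF64";
    }
}
```

A file produced by the approach described here would report ‘JUNK’ as the first chunk while it is small, and ‘ds64’ once it has been upgraded to RF64.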
The ds64 Chunk
A ds64 chunk consists of three 8 byte integers and a four byte integer. The 8 byte values are the RIFF size (the size of the entire file minus 8 bytes), the data size (the number of bytes of sample data in the ‘data’ chunk) and the sample count (the number of samples). The sample count is really optional, as it corresponds to the sample count found in the ‘fact’ chunk of a standard WAV file, and a ‘fact’ chunk is usually only present for non-PCM audio formats, since it is trivial to calculate the sample count for PCM from the byte count. The four byte integer is the length of a table containing the sizes of any other huge chunks. Usually this table is unused, since it is likely that only the ‘data’ chunk will grow larger than 4GB, so the final four bytes of a typical ds64 chunk are zeroes, indicating no table entries.
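As a rough sketch of that layout, the 28 byte body of a typical ds64 chunk could be parsed like this. Ds64Sketch is a hypothetical name of mine, and the sketch assumes the values are little-endian and small enough to fit in a signed 64 bit integer.

```csharp
using System;
using System.IO;

// Hypothetical holder for the ds64 fields; for illustration only.
class Ds64Sketch
{
    public long RiffSize;     // size of the entire file minus 8 bytes
    public long DataSize;     // bytes of sample data in the 'data' chunk
    public long SampleCount;  // optional, mirrors the 'fact' chunk
    public int TableLength;   // number of table entries, normally 0

    // Parses the chunk body, i.e. the bytes after the 'ds64' ID and its 4-byte size.
    public static Ds64Sketch Parse(byte[] body)
    {
        if (body == null || body.Length < 28)
            throw new ArgumentException("A ds64 body needs at least 28 bytes", "body");
        using (var reader = new BinaryReader(new MemoryStream(body)))
        {
            var info = new Ds64Sketch();
            info.RiffSize = reader.ReadInt64();
            info.DataSize = reader.ReadInt64();
            info.SampleCount = reader.ReadInt64();
            info.TableLength = reader.ReadInt32();
            return info;
        }
    }
}
```

The field order matches the order in which the writer further down fills in the placeholder chunk: RIFF size, data size, sample count, then the table length.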
The bext chunk
Following the ds64 chunk, we have the bext chunk from the BWF specification. This has space for a textual description of the file as well as timestamps, and newer versions of the bext chunk also allow you to specify various bits of loudness information. The algorithms for calculating this are hard to track down, so I tend to use version 1 of bext and ignore them.
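One of the timestamp fields, the time reference, holds the sample count since midnight of the first sample in the file. Here is a minimal sketch of how you might derive it from a recording start time. The helper name is hypothetical, and I am assuming the start time is already in whatever time zone you want to record.

```csharp
using System;

// Hypothetical helper for illustration only; not part of NAudio.
static class BextTimeSketch
{
    // Converts the time of day of the first sample into a sample count since midnight.
    // Integer tick arithmetic avoids floating point rounding.
    public static long TimeReferenceFor(DateTime start, int sampleRate)
    {
        return start.TimeOfDay.Ticks * sampleRate / TimeSpan.TicksPerSecond;
    }
}
```

For example, a recording that starts at exactly 01:00:00 at a 48kHz sample rate has a time reference of 3600 × 48000 = 172,800,000.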
The fmt chunk
Then we have the standard ‘fmt ’ chunk, which works just the same way it does in a standard WAV file, containing a WAVEFORMATEX structure with information about sample rate, bit depth, encoding and number of channels. RF64 files are almost always either PCM or IEEE floating point samples, since it is only with uncompressed audio that you typically end up creating files larger than 4GB.
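NAudio’s WaveFormat class writes this chunk for you, but to show what ends up on disk, here is a hand-rolled sketch of an 18 byte WAVEFORMATEX for PCM or IEEE float audio. The class and method names are made up for this illustration.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper for illustration only; NAudio's WaveFormat normally does this.
static class FmtChunkSketch
{
    public static void WriteFmtChunk(BinaryWriter writer, short channels, int sampleRate,
        short bitsPerSample, bool isFloat)
    {
        short blockAlign = (short)(channels * (bitsPerSample / 8)); // bytes per sample frame
        writer.Write(Encoding.ASCII.GetBytes("fmt "));
        writer.Write(18);                       // chunk size: WAVEFORMATEX with no extra bytes
        writer.Write((short)(isFloat ? 3 : 1)); // format tag: 1 = PCM, 3 = IEEE float
        writer.Write(channels);
        writer.Write(sampleRate);
        writer.Write(sampleRate * blockAlign);  // average bytes per second
        writer.Write(blockAlign);
        writer.Write(bitsPerSample);
        writer.Write((short)0);                 // cbSize: no extra format bytes follow
    }
}
```

Including the 8 byte chunk header, this writes 26 bytes in total.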
The data chunk
Finally, we have the ‘data’ chunk, containing the actual audio data. Again this is used in exactly the same way as it is in a regular WAV file, except that the chunk data length only needs to be filled in if the file is smaller than 4GB. If it is an RF64 file, the four byte length for the data chunk is ignored (set it to -1), and the size from the ds64 chunk is used instead.
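From a reader’s point of view, that rule boils down to something like the following sketch. The names are hypothetical, and the inputs are assumed to have been read from the file header, the ‘data’ chunk header and the ds64 chunk beforehand.

```csharp
using System;

// Hypothetical helper for illustration only; not part of NAudio.
static class DataSizeSketch
{
    // Picks the size to trust for the 'data' chunk.
    public static long ResolveDataSize(bool isRf64, uint dataChunkSize32, long ds64DataSize)
    {
        // In an RF64 file the 32-bit field is ignored in favour of the ds64 value
        return isRf64 ? ds64DataSize : dataChunkSize32;
    }

    // For PCM and IEEE float, the number of sample frames follows from the byte count.
    public static long SampleFrames(long dataSize, int blockAlign)
    {
        return dataSize / blockAlign;
    }
}
```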
The code
Here’s a simple implementation of a BWF writer class, that creates BWF files with a simple bext chunk and turns them into RF64 files if necessary. I plan to clean this code up a little and import it into NAudio in the future (either as its own class or upgrade WaveFileWriter to support RF64 and BWF – let me know your preference in the comments).
using System;
using System.Globalization;

namespace NAudioUtils
{
    // https://tech.ebu.ch/docs/tech/tech3285.pdf
    class BextChunkInfo
    {
        public BextChunkInfo()
        {
            //UniqueMaterialIdentifier = Guid.NewGuid().ToString();
            Reserved = new byte[190];
        }

        public string Description { get; set; } // max 256 chars
        public string Originator { get; set; } // max 32 chars
        public string OriginatorReference { get; set; } // max 32 chars
        public DateTime OriginationDateTime { get; set; }
        // invariant culture keeps the '-' and ':' separators regardless of the current locale
        public string OriginationDate { get { return OriginationDateTime.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture); } }
        public string OriginationTime { get { return OriginationDateTime.ToString("HH:mm:ss", CultureInfo.InvariantCulture); } }
        public long TimeReference { get; set; } // first sample count since midnight
        public ushort Version { get { return 1; } } // version 2 has loudness stuff which we don't know so using version 1
        public string UniqueMaterialIdentifier { get; set; } // 64 bytes http://en.wikipedia.org/wiki/UMID
        public byte[] Reserved { get; private set; } // for version 2 = 180 bytes (10 before are loudness values), using version 1 = 190 bytes
        public string CodingHistory { get; set; } // arbitrary length string at end of structure
        // http://www.ebu.ch/CMSimages/fr/tec_text_r98-1999_tcm7-4709.pdf
        //A=PCM,F=48000,W=16,M=stereo,T=original,CR/LF
    }
}
using System;
using System.Diagnostics;
using System.IO;
using System.Text;
using NAudio.Wave;

namespace NAudioUtils
{
    /// <summary>
    /// Broadcast WAVE File Writer
    /// </summary>
    class BwfWriter : IDisposable
    {
        private readonly WaveFormat format;
        private readonly BinaryWriter writer;
        private readonly long dataChunkSizePosition;
        private long dataLength;
        private bool isDisposed;

        public BwfWriter(string filename, WaveFormat format, BextChunkInfo bextChunkInfo)
        {
            this.format = format;
            // File.Create truncates an existing file; File.OpenWrite would leave stale bytes
            // at the end if the old file was longer, which would also skew the RIFF size
            writer = new BinaryWriter(File.Create(filename));
            writer.Write(Encoding.UTF8.GetBytes("RIFF")); // will be updated to RF64 if large
            writer.Write(0); // placeholder for the RIFF size
            writer.Write(Encoding.UTF8.GetBytes("WAVE"));
            writer.Write(Encoding.UTF8.GetBytes("JUNK")); // becomes ds64 if large
            writer.Write(28); // ds64 size
            writer.Write(0L); // RIFF size
            writer.Write(0L); // data size
            writer.Write(0L); // sample count
            writer.Write(0); // table length
            // the optional table of sizes for other huge chunks would appear here (unused)

            // write the broadcast audio extension
            writer.Write(Encoding.UTF8.GetBytes("bext"));
            var codingHistory = Encoding.ASCII.GetBytes(bextChunkInfo.CodingHistory ?? "");
            var bextLength = 602 + codingHistory.Length; // 602 = total size of the fixed-length fields
            if (bextLength % 2 != 0)
                bextLength++; // pad to an even length
            writer.Write(bextLength); // bext size
            var bextStart = writer.BaseStream.Position;
            writer.Write(GetAsBytes(bextChunkInfo.Description, 256));
            writer.Write(GetAsBytes(bextChunkInfo.Originator, 32));
            writer.Write(GetAsBytes(bextChunkInfo.OriginatorReference, 32));
            writer.Write(GetAsBytes(bextChunkInfo.OriginationDate, 10));
            writer.Write(GetAsBytes(bextChunkInfo.OriginationTime, 8));
            writer.Write(bextChunkInfo.TimeReference); // 8 bytes long
            writer.Write(bextChunkInfo.Version); // 2 bytes long
            writer.Write(GetAsBytes(bextChunkInfo.UniqueMaterialIdentifier, 64));
            writer.Write(bextChunkInfo.Reserved); // for version 1 this is 190 bytes
            writer.Write(codingHistory);
            if (codingHistory.Length % 2 != 0)
                writer.Write((byte) 0); // the padding byte counted in bextLength above
            Debug.Assert(writer.BaseStream.Position == bextStart + bextLength, "Invalid bext chunk size");

            // write the format chunk (Serialize writes the chunk size followed by the format)
            writer.Write(Encoding.UTF8.GetBytes("fmt "));
            format.Serialize(writer);

            // now finally the data chunk
            writer.Write(Encoding.UTF8.GetBytes("data"));
            dataChunkSizePosition = writer.BaseStream.Position;
            writer.Write(-1); // will be overwritten unless this is RF64
        }

        public void Write(byte[] buffer, int offset, int count)
        {
            if (isDisposed) throw new ObjectDisposedException("BwfWriter");
            writer.Write(buffer, offset, count);
            dataLength += count;
        }

        public void Flush()
        {
            if (isDisposed) throw new ObjectDisposedException("BwfWriter");
            writer.Flush();
            // could do FixUpChunkSizes(true) here to ensure WAV file created is always playable after Flush
        }

        private void FixUpChunkSizes(bool restorePosition)
        {
            var pos = writer.BaseStream.Position;
            // note: this switches to RF64 once the data passes Int32.MaxValue bytes (2GB),
            // which is below the 4GB limit of an unsigned 32-bit size field
            var isLarge = dataLength > Int32.MaxValue;
            var riffSize = writer.BaseStream.Length - 8;
            if (isLarge)
            {
                var bytesPerSample = (format.BitsPerSample / 8) * format.Channels;
                writer.BaseStream.Position = 0;
                writer.Write(Encoding.UTF8.GetBytes("RF64"));
                writer.Write(-1);
                writer.BaseStream.Position += 4; // skip over WAVE
                writer.Write(Encoding.UTF8.GetBytes("ds64")); // overwrites the JUNK identifier
                writer.BaseStream.Position += 4; // skip over ds64 chunk size
                writer.Write(riffSize);
                writer.Write(dataLength);
                writer.Write(dataLength / bytesPerSample);
                // data chunk size can stay as -1
            }
            else
            {
                // fix up the RIFF size
                writer.BaseStream.Position = 4;
                writer.Write((uint)riffSize);
                // fix up the data chunk size
                writer.BaseStream.Position = dataChunkSizePosition;
                writer.Write((uint)dataLength);
            }
            if (restorePosition)
            {
                writer.BaseStream.Position = pos;
            }
        }

        public void Dispose()
        {
            if (!isDisposed)
            {
                FixUpChunkSizes(false);
                writer.Dispose();
                isDisposed = true;
            }
        }

        // encodes a string as ASCII into a fixed-size, zero-padded buffer, truncating if too long
        private static byte[] GetAsBytes(string message, int byteSize)
        {
            var outputBuffer = new byte[byteSize];
            var encoded = Encoding.ASCII.GetBytes(message ?? "");
            Array.Copy(encoded, outputBuffer, Math.Min(encoded.Length, byteSize));
            return outputBuffer;
        }
    }
}