
A few posts ago I wrote about wavcuthhh, the tool I wrote to split .wav files into separate tracks. I recently added a feature that looks for periods of silence within a .wav file and cuts the file at those points. This might be useful for splitting up DJ sets that consist of one huge mp3 file, or for isolating the 'hidden track' that some bands include in the final track of an album after a minute of silence.

I use an amplitude threshold to specify what counts as 'silent'. The transition is not always clean, though, because songs can have a beat and the amplitude changes frequently. First, I applied a heavy low-pass filter, essentially averaging the amplitude over roughly 0.2-second intervals. To check the results, I could have written a GUI, or printed to a text file and plotted the data, but instead I went with a quick and dirty solution: writing the data into another .wav file and using Audacity itself to visualize it.
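As a rough illustration of the smoothing step, here is a minimal Python sketch (not the tool's C# source). It assumes the audio is already loaded as a list of signed integer samples for one channel; the names `smoothed_amplitude` and `below_threshold` are hypothetical, and a plain block average stands in for the low-pass filter.

```python
def smoothed_amplitude(samples, sample_rate, window_seconds=0.2):
    """Average the absolute amplitude over consecutive fixed-length windows."""
    # Number of samples per window; at least 1 so tiny inputs still work.
    window = max(1, int(sample_rate * window_seconds))
    averages = []
    for start in range(0, len(samples), window):
        chunk = samples[start:start + window]
        averages.append(sum(abs(s) for s in chunk) / len(chunk))
    return averages


def below_threshold(averages, threshold):
    """Flag each window as silent (True) or not silent (False)."""
    return [a < threshold for a in averages]
```

The threshold is in the same units as the samples, so for 16-bit audio it would be some small value on the 0 to 32767 scale, chosen by ear or by looking at the plot.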

The upper channel approximates the current amplitude after the low-pass filter; the lower channel indicates whether the amplitude is above or below the cutoff.
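A debug file like this can be produced with very little code. The sketch below uses Python's standard `wave` and `struct` modules rather than the tool's own C#; the function name `write_debug_wav`, the flag levels of plus or minus 16000, and the default window of 8820 frames (0.2 seconds at 44.1 kHz) are all assumptions made for illustration. It writes the smoothed amplitude to the first channel, which Audacity should draw on top, and the above/below flag to the second.

```python
import struct
import wave


def write_debug_wav(path, averages, threshold, sample_rate=44100, window=8820):
    """Write a stereo 16-bit .wav: ch. 1 = smoothed amplitude, ch. 2 = flag."""
    frames = bytearray()
    for a in averages:
        # Clamp the averaged amplitude into the positive 16-bit range.
        level = max(0, min(32767, int(a)))
        # Arbitrary high/low levels so the flag is easy to see (assumption).
        flag = 16000 if a >= threshold else -16000
        # Repeat one stereo frame for the whole window so the time axis
        # lines up with the original audio.
        frames += struct.pack("<hh", level, flag) * window
    with wave.open(path, "wb") as out:
        out.setnchannels(2)
        out.setsampwidth(2)  # 2 bytes = 16-bit samples
        out.setframerate(sample_rate)
        out.writeframes(bytes(frames))
```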

In a few cases this was not enough, and the signal still bounced across the threshold. I don't think there's a general way to solve this problem, because the silence between tracks can look the same as a silence within a track. I came up with a heuristic that has worked well so far. I assume that the first song is fading out and the second song is fading in. Either might have temporary dips where the amplitude falls below the threshold and then rises again. My algorithm goes from left to right. When it encounters a drop below the threshold, it examines the next 10 seconds of audio, looks for the longest run of consecutive samples where the amplitude is below the threshold, and places the cut at the middle of that run. This works because the pause between tracks is usually longer than the temporary dips during a fade-in or fade-out.
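The heuristic can be sketched in a few lines of Python (again, not the tool's C# source). This version works on a list of per-window silent/not-silent flags rather than raw samples, and the name `find_cut_points` and its parameters are hypothetical. What happens after a cut is chosen is not described above, so skipping past the examined region and any silence that continues beyond it is my assumption.

```python
def find_cut_points(is_silent, windows_per_second=5, lookahead_seconds=10):
    """Return window indices at which to cut, scanning left to right."""
    lookahead = int(windows_per_second * lookahead_seconds)
    cuts = []
    i = 0
    while i < len(is_silent):
        if not is_silent[i]:
            i += 1
            continue
        # Dropped below the threshold: examine the next `lookahead` windows.
        end = min(len(is_silent), i + lookahead)
        best_start, best_len = i, 0
        run_start = None
        for j in range(i, end):
            if is_silent[j]:
                if run_start is None:
                    run_start = j
                run_len = j - run_start + 1
                if run_len > best_len:
                    best_start, best_len = run_start, run_len
            else:
                run_start = None
        # Cut in the middle of the longest silent run.
        cuts.append(best_start + best_len // 2)
        # Assumption: resume after the examined region, and skip any
        # silence that continues past it so it is not cut twice.
        i = end
        while i < len(is_silent) and is_silent[i]:
            i += 1
    return cuts
```

With 0.2-second windows there are 5 flags per second, so a returned index can be converted back to a sample position by multiplying by the window length in samples.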

Here's what it looks like (the downloads are ~800 KB because they include an example .wav):
C# source
Win32 binary