Thursday, November 26, 2009

An idea for music transcription.

Many musicians spend time transcribing music, using computer software such as Sibelius or Finale to enter notes into sheet music. This process, though, could use improvement, particularly when a midi keyboard is not available. (In some programs the computer keyboard mapping is not intuitive; in Finale 2006, one enters notes by typing their letter names. The "g" key produces a g note and so on - but on a qwerty layout, this doesn't work well, because of the non-consecutive placement of a,b, and c. Also, I have yet to see a good way to select note duration.) Both Finale and Sibelius have recently released new transcription features, such as Sibelius' keyboard window introduced in May 2009. Here I will introduce one of my ideas for faster music transcription.

Instead of following an arbitrary metronome when recording, my program lets the performer tap the pulse themselves. This lets the performer play more naturally, and also to slow down for parts of the music that are more complex. I recently finished a proof-of-concept of this system that needs only a standard computer keyboard. The keys are played like a piano, and the Tab key is tapped for every quarter note. It works well; I've already used it to transcribe some of a Bach cantata.

Some screenshots of the prototype, called "Trilling":

A short fragment, saved to MusicXML and imported by Finale:

The same fragment, saved to LilyPond:

Even if you don't have Finale, Sibelius, or LilyPond installed, you can see a rough preview of the music, drawn by my little "scoreview" program:

I plan to take this project forward, adding more features, and creating a truly useful tool for music transcription.

In order to export to MusicXml and LilyPond, I use code from an open source project called Mingus. I'm grateful for that project, which turned out to be very similar to what I needed. I added the capability of tied notes and changed some of the ways it writes musicxml files.

Download source, requires Python 2.5, released under GPLv3.
At the moment, it only runs on Windows because I am using the winsound module to play .wav files in real-time. (I'm not using midi output anymore because, if one doesn't have hardware midi, there is a noticeable delay during playback.)

Sunday, November 22, 2009

Doppler Effect Simulation

Here's a hacked-together-in-one-night project just like the ones from the old days.


This is a Doppler effect simulation, in order to create the sound of particles whizzing past you, in stereo. You, the "observer", are represented by the L and R rectangles. The green numbers represent points along the path of that the particle travels. One specifies the path by moving the position of the numbers, using the mouse. If your points are spread apart, the particle travels faster, and if the points are close, the particle moves slower.

It's fun to play with. I recommend downloading and trying it. By clicking the checkbox, a second particle can be added.




The physics are only approximate, because this is an audio project and not a physics project. Each pixel corresponds to 2 meters. The distance between L and R "observers" is deliberately exaggerated to get a better stereo effect. The volume is scaled by 1/r instead of 1/(r^2), in order to hear the particle more easily when it is far away. This is something to explore further.

I was able to write the code quickly. The expression (V/(V+vS)) gives a frequency shift, so if this value is 1.5, the perceieved pitch is 1.5 times higher. It turns out that this frequency shift is very well suited to code. First, I generate many seconds of source audio, say a sine wave at a fixed frequency. Then, I walk through this source audio at the "frequency shift" rate, interpolating between values when needed. So if the "frequency shift" came to 1.3, I would take the 0th,1.3th,2.6th,3.9th, samples from the source audio as the output audio, which results in output audio with a 1.3 times higher pitch. The whole project simplified to this:

V = 340.0; //speed of sound
//for each timestep:
//move x and y
//find distance between particle and observer
distance = Math.Sqrt((x-xMe)*(x-xMe) + (y-yMe)*(y-yMe));
vS = (distance - prevdistance) / dt;
freqShift = (V / (V + vS));

intensity = scale * (1 / distance);
for (int i = 0; i < dt*sampleRate; i++)
outputAudio[index+i] = intensity * interpolatedValue(sourceAudio, fPositionInSourceAudio); 
fPositionInSourceAudio += freqShift;


Download (Windows), unzip and run Doppler.exe. No setup needed.

Source, released under GPLv3.

Monday, November 16, 2009

Computer keyboard as piano keyboard

A continuation of a program I wrote a few years back, this little app lets you use your computer keyboard as a piano keyboard:

It can be useful for recording fragments of melodies or songs. The results are saved as a standard .mid file. Also, because it supports easy multi-tracking, it can be fun to make a little song by combining a bass part, harmony, and melody with different instruments for each voice.

(Midisketch sends out real time midi events. So, if a midi loopback driver like Maple or LoopBe1 is used, it can act like a midi input device as well. In fact, it might be interesting to have Midisketch install keyboard hooks like Ctrl-alt-a, Ctrl-alt-b, and so on, that would play the corresponding notes even when the program is minimized. Then it could be used just like a midi keyboard, and recognized as such by other software).

Anyways, here is a short video showing the keyboard:

Midisketch demo from bngjbng on Vimeo.

Three or more note polyphony is supported, constrained only by limitations of most keyboards. The "C# MIDI Toolkit" by Leslie Sanford is used to send midi data to the Windows api.

Download, win32.
(Unzip the file and run Midisketch.exe. See also readme.txt. Requires .NET 2).
Released under Gplv3.

Wednesday, November 11, 2009

Audio: noise

I have been continuing some digital audio experiments.

I find red noise to sound calmer and more natural than white noise. (Red noise can be produced by integrating white noise). The way I produce red noise is to start with a single value, add a small random (positive or negative) sample to it, and then continue. In white noise, the samples are unconnected leaps, but for red noise, the samples do not change as drastically.Let's say one takes a small chunk of red noise, 220 samples long. If one plays this chunk repeatedly, a 200hz tone is heard. The tone is colored by the frequencies present in the initial chunk of noise.

Now let's create 15 different chunks of red noise, each of length 220. If we play one of the chunks ten times, then the next ten times, then the next ten times, and so on, eventually repeating,the result is a nice "warble" at 200hz. The upper frequencies present change abruptly about 20 times a second, causing a robotic sound. This is a fun sound that can be used for various purposes.


However, what if our goal is to create a more natural sound? As mentioned before, a problem is that the frequency content changes abruptly. We want to create more subtle changes, while still having the presence of buzzing red noise.

The solution I came up with was inspired by how the red noise was created in the first place. I still begin with a chunk of red noise. After playing it ten times, instead of switching to an entirely different chunk, I instead slightly alter the chunk: by adding small random (positive or negative) values to each of the 220 samples. In this way, the frequency content does not change suddenly, and yet it changes in an unpredictable and subtle way. I find the resulting tone to still be harsh and noise-like, but surprisingly natural and musical. In the past it has only been in the frequency domain that I've been able to produce natural sounds.

Two examples of this "softened noise":



The second is seeded initially with a sine wave, and higher frequencies emerge over time.

I've been using my C audio library, "bcaudio", for these experiments. Writing psuedo-object-oriented C isn't that bad. It's fun to write code that will run very quickly.


Sunday, May 10, 2009

Simple image output

Sometimes, it can be nice to have a dead-simple format for bitmap images. If I'm writing some low level code (say, in C) dealing with graphics, at times I will want a quick way to quickly output an image file. Let's say that I'm too lazy to bring in a graphics library that can write actual .pngs . The .bmp format, while simple, is a bit more complicated than a list of rgb values, because of alignment.

I wrote a little program that converts ".simple" images (which are a list of R,G,B bytes) into 24 bit .bmp files. So, your program simply writes a tiny header and three bytes per pixel, and simpletobmp.exe turns this into a .bmp.

This works well from Python, too, when the PIL isn't around.

Here is an example of how to draw a ".simple" image:
FILE * fout;
fout = fopen("test.simple","wb");
fputc('S', fout);
fputc('2', fout);
fputc('4', fout);
int x,y,width,height;
width = 512; height = 256;
fwrite(&width,sizeof(int), 1, fout); 
fwrite(&height,sizeof(int), 1, fout); 

for (y=0; y<height; y++)
  for (x=0; x<width; x++)
    fputc( y%256 , fout); //Red
    fputc( x%256 , fout); //Green
    fputc( 0 , fout);          //Blue

Or, in Python,

import array
fout = open('pyout.simple', 'wb')

chars = array.array('c') #char

ints = array.array('l') #signed long

bytes = array.array('B') #unsigned char
for y in range(HEIGHT):
 for x in range(WIDTH):

(This one draws a gradient in red and green. Change the fputc lines in the inner loop to draw the image you want. A common usage is to set up a 2d array of pixels, draw the picture into that array, and then output everything to a file).

Now, one can run
simpletobmp.exe o test.simple test.bmp
to get the image.

This is very similar to how in Linux, one can write a .ppm file, which is, literally, a brief header and list of rgb values. The .ppm format even accepts pixel information in human-readable ascii digits! Sounds ridiculous, but this type of thing can be useful.

Windows binary
Source LGPL license.

Simpletobmp uses the LGPL bmp_io library by John Burkardt.

Bit packing preprocessor

When working at lower-levels, sometimes it pays to conserve bits. Binary formats sometimes use all of the bits they are given - when I was working with MIDI, I came across this a lot. (If you have a full int, you might as well use all 32 of those guys.)

On a more practical note, 16bit color is often 5bits red, 6bits green, and 5bits blue. A color is stored as a 16 bit integer in the format rrrrrggggggbbbbb, where each letter represents a bit. It isn't difficult to write code to extract these bit fields, but to make it faster and more readable, I wrote some Python to write the C for me. The Python script takes a string like "00rrr000" and outputs C code with the appropriate shifts and masks.

Bit tools. Example usage:
> necessarybits(640)
10 bits are required to store values up to 640
2 ** 10 = 1024

> print tobinary(46)

> print frombinary('1100_1100') 

> pattern('00rrr000', True)
  r is a 3bit number
  unsigned char packed |= r<<3;
  unsigned char r = (packed & 0x3f)>>3; //packed & 0b00111111

> pattern('00000bbb')
  b is a 3bit number
  unsigned char packed |= b;
  unsigned char b = packed & 0x7; //packed & 0b00000111

I find "00rrr000" to be a lot clearer than '(packed & 0x3f)>>3' .

Python source
(34 lines of quick/unpretty code)

Tuesday, March 31, 2009

Midi (part 4)

Today I released a new project, bmidi to wave.

It can be used to:

  • Turn midi files into great-sounding wav and mp3 files.
  • Play midi files, especially if you are unsatisfied with your current midi out quality.
  • Get information about SoundFont files and preview their voices.

You can find high quality SoundFonts online, and play each instrument in the midi with the SoundFont of your choice, resulting in the best possible sound. Use the mixer to fine-tune the volume and pan of each channel. Also, you can see a score view of notes in a particular track.

It is essentially a frontend for Timidity, the program that does the actual playback. Instead of having to edit the configuration files by hand, though, there is a gui.

This is a program I've been meaning to write for a long time. I first had the idea and initial designs in 2002, when midi files were more common. It didn't take very long to write.

Process management in Python is not too bad thanks to the subprocess module, which has a good interface. I also learned about Python threads through this project, primarily because one wants a responsive GUI while the song is playing and the time slider is moving.

Surprisingly, one of the more complicated parts of the program was allowing playback to start other places in the song. Timidity doesn't do this, and so I have to create a temporary truncated midi file for it to play. I would just chop off all events before a certain time, but because events like instrument change can occur at any time, all of the instruments would be wrong, let alone tempo and pitch bend. Tempo changes can occur at any time, and so it is not simple to correlate a midi tick with clock time.

Midi is a compact, but pretty well-designed format. Not many binary formats from 1982 are around today.

MS Paint Animation

I gave MS Paint the ability to make animations and save them as .avi files.

I wanted to write a "flipbook" program, where you could draw a series of frames and create a simple animation. I think this type of program could be fun for kids.

I was about to write the program in Pygame, but found myself re-implementing many of the standard bitmap editing tools. It's easy to make a rectangle and oval tool, but I didn't really feel like making the fill tool or selection tool. So, instead, I used MSPaint as part of the interface to the program. (This is completely a hack, and the resulting program isn't robust, but it was kind of interesting to do). See a video of how to use it - you can move from frame to frame, duplicate the current frame, and play the animation.

I'm trying to make this look like just one program. A lot is going on behind the scenes. The program is a c# app that, first, launches Paint. It has a window style that causes it to be on top of other windows. The program uses many Windows API SendKey calls to send key events to Paint. When you move from one frame to the next, it does the following:
  • Tell paint to select all (Ctrl A), and cut (Ctrl X)
  • Take that image from the clipboard and save it to a .png file
  • Open the next .png file in memory and put it in clipboard
  • Tell paint to paste (Ctrl V) and deselect (Esc)
The other operations are done with a similar series of events. I had to tune the timing; the c# program sleeps while waiting for Paint. The play preview actually opens up a borderless c# window that is positioned so that it appears above the image, and cycles through the images.

It ended up working. The Win api gives you almost too much to play with. Now I know that c# apps can send simulated keystrokes to other processes, for semi-practical purposes.

Friday, March 20, 2009


I made an arcade-style game. You have to quickly out-maneuver your enemies and push blocks to squish them.

I was influenced by an old Mac shareware game, but I've added some twists. The blocks have numbers, and if you can add to 15, it will stun your enemies and give you points.

The game is written using Pygame. I've also ported it to the One Laptop Per Child XO.

I'm going to replace the graphics and sound effects; the ones there now are essentially placeholders. I'll also rewrite some of the levels.

Download. To play it, you need a recent version of Python, and for Pygame to be installed. (Sorry, if you don't already have Pygame you'll have to get it.) Works in Windows and Linux.

When the game is more complete I'll post a Windows binary.

Sunday, February 22, 2009

Midi (part 3) Tunescript

Here is another midi project done in my spare time. Tunescript is a musical toy, where you can enter a list of notes to create a song. Unlike other interfaces like this, though, tunescript supports a lot of features like multiple tracks and instruments, chords, percussion, accented notes, and even pitch bends. Also, I put much thought into the syntax.

For a lot of information, many examples, and to download, visit this page.

It is pretty fun to invent a domain specific language. Because the language doesn't (yet) have nested constructs, I don't need full parsing. I came up with a nice way to interpret the input. It works kind of like a finite state machine that is receiving a stream of instructions. For example, the character 'b' can mean either flat, as in 'Ab', or the note 'b', but there is no ambiguity, because the symbol 'A' causes a transition to a state that can accept the 'b'. Adding multiple tracks ended up being pretty simple, because I just use simple string operations to split the tracks.

In more detail, the core of my interpreter looks something like this. The use of the while loop isn't very good and should probably be made into a for loop, but for some reason I was thinking of gotos, perhaps because of the underlying finite-state-machine influence.

...main loop...
  while s!='':
    result, s = self.pullFullNote(s, track)
    if result: continue
    result, s = self.pullFullNoteSet(s, track)
    if result: continue
    result, s = self.pullFullModOctave(s, track)
    if result: continue
    #if i get here, i couldn't interpret something, throw an error.
def pullFullNote(self, s, track):
  if it is not a note,
    return False, s
  #otherwise, consume some of the characters from s
  next_s = s[2:]
  #add the note to the track
  return True, next_s

What is nice is that this pattern can be followed repeatedly on smaller levels. The pullFullNote() can call pullPitch() or pullVolumeDuration() in just the same way, and those can themselves call pull functions. If pullVolumeDuration() doesn't see a match, it simply returns False with the original string given. Essentially, the benefit is that there is no need to be explicitly asking "can the next thing be a note?", because the pullNote style functions will smoothly drop through when the next thing is not a note. I don't think I'm explaining this well, but it is all there in if you are interested.

Midi (part 2) Scoreview

For fun, I wrote a program that visualizes the contents of Midi files.
The "score view" is just a Tkinter canvas on which lines and ovals are drawn. None of the graphics are bitmaps (except the clefs), which means that there is freedom to quickly zoom in and out. The sharp signs are actually the text "#" drawn at that point. Writing a custom coordinate translation made this code so much easier. When I specify y coordinates, they are given in units where 2 units is the height of between staff lines. So, moving a note up or down just means incrementing or decrementing its position, and only the lowest level of code needs to know about the actual pixel coordinates.

This tool can be a useful way to explore the contents of a midi song. Besides showing the score for a track, it can also get channel information:

Also, one can view all of the midi events in a track, in a human-readable format:

For a lot of information, and to download, visit this page.

If you've been wondering, yes, the eventual goal is to create a midi editor. This project isn't high priority, though.

Sunday, February 15, 2009

Midi (part 1)

I've been playing around with MIDI files lately.

MIDI files are essentially a list of notes with associated timing, and in the 90s were popularly used to make and share music. They can be easily made with a synthesizer and a computer. A drawback, though, is that the songs sound different if played on different computers. For example, different sound cards have different interpretations of what the "acoustic grand piano" voice should sound like, and the midi file itself does not contain the audio sample data. However, midis can be programmatically generated, and one can change tempo and pitch without losing quality. After broadband internet and mp3s became common, most people stopped using MIDI, although it still finds a use in music notation programs.

Unfortunately, these days, new computers come with awful-sounding software MIDI instruments, creating the false impression that "MIDI" is synonymous with "cheap, bad audio quality." In fact, there is nothing inherently bad-sounding about MIDI music, and with a set of good SoundFonts, it can sound very good.

Anyways, here are some Python scripts that can be used to create midis. The interface is not the best, but it was very quick to implement. Examples:
b = BMidiBuilder()
b.note('c', 1) 
#the 1 means the note's length is 1 qtr note
b.note('d', 1)
b.note('e', 1)
b.note('f', 1)
b.note('g', 4)'out.mid')
This plays part of a scale. To play a chord, "rewind" can be used to go back and lay more notes on top of existing notes.

b = BMidiBuilder()
b.note('c#', 1)
b.note('f', 1)
b.note('g#', 1)'out.mid')
Two tracks can be joined, to make a little tune:
tr1 = BMidiBuilder()
tr1.setInstrument('acoustic bass')
tr1.note('c3', 2)
tr1.note('d3', 2)
tr1.note('e3', 2)

tr2 = BMidiBuilder()
tr2.note('e4', 2)
tr2.note('f4', 2)
tr2.note('g4', 2)

bbuilder.joinTracks( [tr1, tr2], 'out.mid')

Also, there are a set of classes in that can build a midi file and any type of event, like percussion, modulation, metadata, or pitch bends.

Download, tested in Python 2.5

Tonight I wondered what it would sound like to play the same note on the 127 general midi instruments. The result sounds interesting.
b = bbuilder.BMidiBuilder()
b.tempo = 400
for i in range(127):
  #insert a raw instrument change event
  evt = bmidilib.BMidiEvent()
  evt.type='PROGRAM_CHANGE' = i
Hear it.
(Sound depends on your midi device, I don't recommend Quicktime so you may find it better to download this file and open it elsewhere).

Sunday, February 1, 2009


Here is a long-term project of mine that still isn't finished, but is starting to become useful. I started this in a computer science class last year, so not all of it was done on my own time. At Olin, sometimes I'd like to have the software equivalent of my TI calculator. Matlab can be useful, but takes a while to load and cannot be described as light-weight. So, for quick calculations, I made Minimath, a small command line interface to evaluate math.

The main design goal was to save typing for commonly-occuring tasks. As you can see, the UI is pretty sparse now, and that is on my list of things to improve.


  • Fills-in incomplete expressions. It is perfectly ok to evaluate "sin(pi" without the closing paren, because it will understand what you meant.
  • Line history, with arrow keys. Control-Up and Control-Down search history based on prefix like Matlab.
  • Supports complex numbers, saving values in variables.
  • You can press Alt-. to create a "->" symbol to store values, like a TI 83.
  • "ans" refers to last result.
  • If the first key you press is an operator, it fills in the "ans" like a TI 83. This one saves a lot of time.
  • Any valid Python expression can also be evaluated.
  • Pressing the "\" key (while not in a string literal) creates a lambda symbol, so that expressions like "f = λx.x+2" can be used.
  • Pretty print, which behind-the-scenes is parsing the expression, creating and rendering a temporary LaTeX file.
  • Uses Numpy library which provides many math functions.
Another feature is variable substitution. Pressing alt-, makes a "<-" symbol that can be used to subsitute values into an expression. For example, the result of "2*x^2 + 3*x <- 4" is 44.

Added Syntax

  • The symbol "^" now is exponentiation.
  • The ternary expression ? : can be used as in C.
  • The shortened "for(i,5)" can be used in place of "for i in range(5)".
  • Syntax for making arrays: "arr = a[1 2 3]" and "arr = a[1 2 3;4 5 6]". (Note that arrays are different than Python lists).
  • "i[0,4,10]" is like Matlab's linspace, 10 elements equally spaced from 0 to 4.
  • "i[0..4]" is an inclusive range, the result is [0,1,2,3,4].
  • It allows more items on one line than Python, just use semicolons. In particular, something like "for val in array:t+=val;print val" will work all on one line.
In the future, I plan to add more-advanced interpretation. For example, the string "3x4" could be recognized as "3*x^4", and "4(x+3)" to "4*(x+3)" Also, I should integrate Matplotlib or something similar.

Friday, January 23, 2009


An alternative to developing new technology is to apply existing technology in a new way. I was thinking about face recognition software a few days ago, and thought of a fun application (this isn't a serious project at all, by the way). So I quickly wrote AutoShoop, that uses a face-detection algorithm to draw "shoop da woop" shapes on people's faces. In other words, it automatically turns this:
into this: This works automatically - no Photoshop needed! I could have spent some time making the results better, but that's not really the point. It's more of a proof of concept. (Windows GUI binary and sample images) (C# source)

I didn't write any of the face-detection part; I simply used fdlib. Your results may vary; I found that for some reason fdlib sometimes has a hard time finding faces in certain photos. When it works, though, it works pretty well.

Wednesday, January 14, 2009


Let's say I want to be able to parse an arbitrary expression like
f(f( a+f(a * b * f(c))))

where a, b,c are variables, and f is some function. I want to make a parse tree, and possibly to transform the tree. Last night I thought of a very unorthodox way to do this.

I've heard of yacc/lex, pyparsing, and Python's ast. However, for what I have in mind this is all I need:
- Python supports overloading operators.
- Python has an "eval" for dynamically running code.

My idea is to create Python objects for f, a, b,c, and then *eval* the expression as if it were Python code. The objects I create will have their operators overloaded so that a+b, for example, returns an expression like ['+', a, b]. Here's an example that supports addition, multiplication, and functions taking any number of arguments. (Adding the rest of the operators is simple).
class FunctionSymbol:
  def __init__(self, name): = name
  def __call__(self, *args):
    return SubExp(self, list(args))
  def __str__(self):
    return 'FN_'

class VariableSymbol:
  def __init__(self, name): = name
  def __add__(self, other):
    return SubExp('OP_ADD', [self, other])
  def __mul__(self, other): 
    return SubExp('OP_MULT', [self, other])
  def __str__(self):
    return 'VAR_'

class SubExp(VariableSymbol):
  def __init__(self, op, args):
    self.op = op; self.args= args
  def __str__(self):
    return str(self.op) + '(' + ','.join(map(str, self.args)) + ')'

def strangeparser(s):
  #parse something by evaluating it as if it were Python code
  symbols = {}
  #create objects for the symbols in the string 
  snospace = s.replace(' ','').replace('\t','') + ' '
  import re
  for match in re.finditer('[a-zA-Z_]+', snospace):
    strfound =
    if strfound not in symbols:
      #assume that if the next character is "(", then it is a function
      if snospace[match.end()]=='(':
        symbols[strfound] = FunctionSymbol(strfound)
        symbols[strfound] = VariableSymbol(strfound)
  # evaluate it
    return eval( s , globals(), symbols )
  except Exception, e: 
    print 'Could not parse. %s' % str(e)
    return None
def main():
  tree = strangeparser('a+b+c')
  print tree
  tree = strangeparser('f(f( a+f(a * b * f(c))))')
  print tree
  # FN_f(FN_f(OP_ADD(VAR_a,FN_f(OP_MULT(OP_MULT(VAR_a,VAR_b),FN_f(VAR_c))))))

Also, I've written code that turns this tree into a sequence of simple operations. Coincidentally, I think this could be used to write a compiler. I haven't read any books about that, though.
 f(f( a+f(a * b * f(c))))

 return i7;

I'm working on a way to conserve the temporary variables used here, because there is probably a way to reuse them after they aren't needed. What I do is go down to the bottom of the tree, find something that can be evaluated, and replace that deepest node with a temporary variable. I then repeat that process until the whole tree has been "flattened".

Thursday, January 1, 2009


On the internet, I now have a place that I can call my home. I've uploaded an initial version of a personal website.

It can be seen at (previously b 3 n f dot com).

(The section called "Recent Experiments" consists of links to this blog, but the other sections like Projects are new. There's a lot of content to explore.)

I've been working on this occasionally over the course of a few years, which is why the site doesn't seem very unified yet.

I intend to keep this blog active, though, so stay tuned.