Wednesday, January 20, 2016

A Simple Interface to Read/Write Audio Metadata in Python

I wrote a small wrapper for Mutagen that makes it easier to read/write audio metadata (tags for mp3, ogg, flac, m4a/mp4) in Python. Here's an example:
    o = EasyPythonMutagen('file.mp3')
    o.set('title', 'song title')
    o = EasyPythonMutagen('file.flac')
    o.set('title', 'song title')
    o = EasyPythonMutagen('file in id3_v23.mp3', use_id3_v23=True)
    o.set('title', u'title with unicode: \u0107')

A few differences from Mutagen:
  • You can use the same class and interface for different audio formats.
  • You won't need to catch exceptions in case the mp3 doesn't have an id3 tag yet.
  • You won't have to use a low-level interface to write tags in id3v2.3, for compatibility with Windows and smartphone apps.

It'd be nice to add id3v2.3 support in EasyID3 to the mutagen project at some point. In the meantime I'll use this wrapper.

See the source and download it on GitHub.

Other small features of easypythonmutagen:

  • Provides a method to get the empirical ("actual") bitrate in addition to the stated bitrate.
  • The "get" methods directly return a value, instead of a list.
  • Intentionally disallows adding unrecognized fields. A typo like o['aartist'] fails instead of succeeding silently.
  • Added a few fields, like 'Composer' and 'Website' for mp4/m4a.
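
The "fail on unrecognized fields" and "return a value, not a list" behaviors can be sketched in a few lines. This is only an illustration of the idea, not easypythonmutagen's actual code: the class name StrictTags and its field list are hypothetical.

```python
class StrictTags(object):
    """Hypothetical tag container that rejects unknown field names."""

    # assumed field list, for illustration only
    allowed_fields = ('title', 'artist', 'album', 'composer', 'website')

    def __init__(self):
        self._values = {}

    def set(self, field, value):
        if field not in self.allowed_fields:
            # a typo such as 'aartist' is reported immediately
            raise KeyError('unrecognized field: ' + field)
        self._values[field] = value

    def get(self, field):
        if field not in self.allowed_fields:
            raise KeyError('unrecognized field: ' + field)
        # return the value directly rather than a one-element list
        return self._values.get(field)


tags = StrictTags()
tags.set('title', 'song title')
print(tags.get('title'))
try:
    tags.set('aartist', 'someone')
except KeyError as err:
    print('rejected: ' + str(err))
```

Raising on an unknown name trades a little flexibility for catching typos at the point where they are made.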

Saturday, May 2, 2015

How to write a program using Skia on Windows

Skia is an open source 2D graphics library which provides common APIs that work across a variety of hardware and software platforms. It serves as the graphics engine for Google Chrome and Chrome OS, Android, Mozilla Firefox and Firefox OS, and many other products. Skia is an alternative to the Cairo library.

Posting this in case it helps anyone else.

Prerequisites:
  • Visual Studio 2013 (including the Express or Community editions, which are free)
  • An unzipping tool like 7zip or WinRAR

The command prompt lines below should be run in the same session (i.e. it won't work if you close and reopen a new command prompt).

  • Download from the Install Depot Tools page
  • Use 7zip or WinRAR to Extract All to a path like c:\path\to\depot_tools (no spaces in path). The Windows built-in unzip might skip hidden files.
  • Open a command prompt
  • Run "cd c:\path\to\depot_tools"
  • Run "echo %PATH%"
  • In the output, if you already have Python installed and see a Python directory, you might want to remove this from the path. set PATH=x can do this for just this command session.
  • In the output, if you already have Git installed and see a Git directory, you might want to remove this from the path. set PATH=x can do this for just this command session.
  • Run "set PATH=%PATH%;c:\path\to\depot_tools" to add depot tools to the path
  • Run "gclient". This will download and sync the needed tools.
  • Make a directory like c:\path\to\skia (no spaces in path)
  • In the same command prompt, run "cd c:\path\to\skia"
  • Run git config --global user.name "Your Name"
  • Run git config --global user.email followed by your email address
  • mkdir skia
  • cd skia
  • gclient config --name . --unmanaged
  • gclient sync
  • git checkout master
  • Run "set GYP_GENERATORS=msvs"
  • Run "python gyp_skia"
  • Run "ren out out86"
  • Run "python gyp_skia -D skia_arch_width=64"
  • Run "ren out out64"
  • Open .\out86\skia.sln in Visual Studio
  • For me, I only needed to build Release
  • I didn't need the following projects, and they failed to build because they couldn't find Qt. Open Configuration Manager and, for both Debug and Release in the drop down, uncheck Build for: debugger, debugger_qt_mocs, pdfviewer, pdfviewer_lib
  • Hit Build Solution, and wait several minutes
  • When the build is done, you may see some compilation warnings/errors, but if the default project HelloWorld runs correctly (Ctrl+F5), it's likely that all of the important parts work.
  • Open .\out64\skia.sln in VS
  • Repeat the above steps for x64.
Now, to create an example project that doesn't need Google's gyp system:
  • Open Visual Studio and create a new project. Other languages > Visual C++ > Win32 > Win32 Console Application
  • In the Win32 Application Wizard, click Application Settings, uncheck Precompiled Header, check Empty Project.
  • Switch from Debug to Release
  • Go into the project's options, Configuration Properties > C/C++ > General > Additional Include Directories and add: c:\path\to\skia\include\core;c:\path\to\skia\include\config
  • Go into the project's options, Configuration Properties > C/C++ > Preprocessor > Preprocessor Definitions and add:
  • Go into the project's options, Configuration Properties > Linker > Input > Additional Dependencies and add (preferably as relative paths)
Then, add a main.cpp to the project, with the following code,
#include <string>
#include <fstream>

#include "SkCanvas.h"
#include "SkData.h"
#include "SkDocument.h"
#include "SkGraphics.h"
#include "SkSurface.h"
#include "SkImage.h"
#include "SkStream.h"
#include "SkString.h"

#include "..\effects\SkGradientShader.h"

// assumption: windows.h provides PROC, HGLRC, WINAPI and LPCSTR for the stubs below
#include <windows.h>

// write the bitmap to disk as a binary (P6) ppm file
void save_ppm(SkBitmap const& bitmap, std::string const& filename)
{
  SkAutoLockPixels l(bitmap);

  std::ofstream ofile(filename.c_str(), std::ios_base::binary | std::ios_base::trunc);
  if (ofile.is_open())
  {
    ofile << "P6 " << bitmap.width() << " " << bitmap.height() << " 255 ";

    for (int i = 0; i != bitmap.height(); i++)
    {
      for (int j = 0; j != bitmap.width(); j++)
      {
        SkColor const* c = bitmap.getAddr32(j, i);
        char buf[3] = { (char)SkColorGetR(*c), (char)SkColorGetG(*c), (char)SkColorGetB(*c) };
        ofile.write(buf, 3);
      }
    }
  }
}

void TestSkia(SkCanvas& canvas)
{
  SkPaint paint;
  // assumption: the text below describes a red rectangle, so set the color explicitly
  paint.setColor(SK_ColorRED);
  SkRect rect = {
    20, 20,
    50, 50
  };
  canvas.drawRect(rect, paint);
}

int main(int argc, char * const argv[])
{
  SkAutoGraphics ag;
  SkBitmap bitmap;
  int width = 800;
  int height = 600;
  bitmap.allocPixels(SkImageInfo::MakeN32Premul(width, height));
  SkCanvas canvas(bitmap);

  // assumption: the drawing call was missing from the listing
  TestSkia(canvas);

  save_ppm(bitmap, "out.ppm");
  return 0;
}

// stub out openGl dependency, which isn't needed in this case.
extern "C"
{
#ifdef _WIN64
  PROC WINAPI __imp_wglGetProcAddress(LPCSTR)
  {
    return nullptr;
  }

  HGLRC WINAPI  __imp_wglGetCurrentContext()
  {
    return nullptr;
  }
#else
  PROC WINAPI _imp__wglGetProcAddress(LPCSTR)
  {
    return nullptr;
  }

  HGLRC WINAPI _imp__wglGetCurrentContext()
  {
    return nullptr;
  }
#endif
}

Running this little program will create a valid ppm file with a red rectangle!
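
For reference, the ppm format written by save_ppm is simple enough to produce by hand. The sketch below is not Skia code; it is a small Python illustration (the function names write_ppm and red_rect and the file name sketch.ppm are made up here) that writes the same kind of P6 file, which can be handy for checking that an image viewer opens the format.

```python
def write_ppm(filename, width, height, pixel_at):
    """Write a binary (P6) PPM; pixel_at(x, y) returns an (r, g, b) tuple."""
    with open(filename, 'wb') as out:
        # header: magic number, width, height, maximum channel value
        out.write(('P6 %d %d 255\n' % (width, height)).encode('ascii'))
        for y in range(height):
            row = bytearray()
            for x in range(width):
                row.extend(pixel_at(x, y))
            out.write(bytes(row))


def red_rect(x, y):
    # red inside the rectangle with corners (20, 20) and (50, 50), black elsewhere
    inside = 20 <= x < 50 and 20 <= y < 50
    return (255, 0, 0) if inside else (0, 0, 0)


write_ppm('sketch.ppm', 80, 60, red_rect)
```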

To build for x64, you can create a new x64 target and update the lib directories from c:\path\to\skia\out86 to c:\path\to\skia\out64.

To add codecs for saving to different image types:
  • In Linker Inputs, add a reference to skia_codecs.lib
  • Add #include "..\images\SkForceLinking.h"

To add OpenGL:
  • remove the __imp_wglGetProcAddress and __imp_wglGetCurrentContext stubs
  • In Linker Inputs, add references to the following:

Install Depot Tools
Skia Quick Start Guides Windows

Tuesday, April 7, 2015

Copying files out of a VM guest machine

A nice benefit of using a guest VM is that the host machine is protected from any malware that infects the guest (barring security vulnerabilities in the VM software itself). I've been using VMs fairly frequently over the past five years, first with VMWare, and now with VirtualBox.

If the guest machine is possibly affected by malware, how then can one transfer data from the guest to the host? Sending through e-mail divulges password information, and uploading to some type of file transfer site is slow and inconvenient. Running an FTP server or web server on the host takes time and introduces another attack surface. Using VirtualBox's shared clipboard works for text files, and I was able to use it for binary files after escaping characters, but it is also inconvenient and less sure to be safe. VirtualBox's default way of transferring files, emulating an SMB network drive, is not safe, as malware can propagate across a network drive.

I'll describe the approach I came up with. I do use bridged networking so that the guest can ping the host, but I disable all of VirtualBox's shared folders/network drives/USB connectivity. I then make sure that no shared folders on the host are publicly writable. I install Python on the guest and use scripts to transfer files over a socket by ip address. (To see the guest's ip address, run ipconfig in Windows or ifconfig in Linux.)

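First run this script on the host, which I put together from some Stack Overflow answers,
import socket

f = open('output_file', 'wb')
conn = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
portnumber = 8206
conn.bind(('', portnumber))
conn.listen(1)  # wait for a single incoming connection
channel, details = conn.accept()
print('connected')
while True:
  received_data = channel.recv(4096)
  if not received_data:
    break  # the sender closed the connection
  f.write(received_data)
f.close()
channel.close()
conn.close()  # stop listening as soon as the one file has arrived
print('transfer complete!')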

Run this script on the guest, after changing the file name and ip address; I haven't found a need for the script to support files that won't fit into memory.
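import socket

file_to_send = './filename'
ip_of_recipient = ''  # set this to the host's ip address
portnumber = 8206
f = open(file_to_send, 'rb')
all_file_contents = f.read()  # reads the whole file into memory
f.close()

conn = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
conn.connect((ip_of_recipient, portnumber))
conn.sendall(all_file_contents)
conn.close()  # closing the connection tells the receiver the transfer is done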

To ensure that the data is intact, I can run a quick checksum with SHA512 on both machines and compare the results,
import hashlib

f = open('./filename', 'rb')  # the file to check
hash = hashlib.sha512()
while True:
  # update the hash 256k at a time
  buf = f.read(1024 * 256)
  if not buf: break
  hash.update(buf)
f.close()

print(hash.hexdigest())

The chance of malware reaching the host is now much lower. Only one file can come through a port that is quickly closed. I don't use this to transfer executable files, though, as they could have been modified, but in general it seems to be a safer way to copy files from a VM.
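
To try the whole round trip without a VM, the receive loop, the send, and the checksum comparison can be exercised over the loopback interface. This is a self-contained sketch rather than the scripts above: it sends an in-memory payload instead of a file, uses a thread in place of the second machine, and lets the OS pick a free port. The helper names are made up for the example.

```python
import hashlib
import socket
import threading


def sha512_of(data):
    return hashlib.sha512(data).hexdigest()


def receive_once(server, result):
    # accept one connection and collect everything the sender transmits
    channel, _ = server.accept()
    chunks = []
    while True:
        received = channel.recv(4096)
        if not received:
            break  # the sender closed the connection
        chunks.append(received)
    channel.close()
    result.append(b''.join(chunks))


payload = b'example file contents' * 1000

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(('127.0.0.1', 0))  # port 0: let the OS pick a free port
server.listen(1)
port = server.getsockname()[1]

received = []
receiver = threading.Thread(target=receive_once, args=(server, received))
receiver.start()

sender = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sender.connect(('127.0.0.1', port))
sender.sendall(payload)
sender.close()

receiver.join()
server.close()

# the two digests match when the data arrived intact
print(sha512_of(payload) == sha512_of(received[0]))
```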

Sunday, November 3, 2013

Trace Travel Debugging

While working in a large codebase, from time to time I have encountered bugs that are difficult to reproduce. I know that something has gone wrong, but it is difficult to pinpoint the failure; by the time the failure has occurred, the interesting values/faulty state is gone. For a rarely hit bug, stepping through with a debugger won't work, and crash dumps for ship builds can be difficult to interpret.

One way to attack a problem like this is to use Time Travel Debugging, where the program state is recorded over time, so that a developer can replay the exact sequence of events. Time Travel Debugging is costly to set up, though, and the overhead it introduces alters timing, potentially altering the repro. (These issues are commonly thread or cross-process concurrency bugs.)

I recently thought of a way to assist these investigations. I wrote a tool that I call "Trace Travel Debugging". Trace Travel Debugging uses a logging approach. By comparing logs during a failure with logs during success, one can potentially see the point of divergence. Instead of manually adding log statements, though, my tool automatically traces all function calls, making it much more likely that the point of divergence will be seen.
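
Finding the point of divergence between a good trace and a bad trace can itself be automated. The following is a minimal sketch, assuming each trace for a thread has already been loaded as a list of strings; the function name first_divergence and the sample traces are made up for illustration and are not part of the tool.

```python
def first_divergence(good_trace, bad_trace):
    """Return (index, good_line, bad_line) for the first difference, or None."""
    for index, (good, bad) in enumerate(zip(good_trace, bad_trace)):
        if good != bad:
            return index, good, bad
    if len(good_trace) != len(bad_trace):
        # one trace is a strict prefix of the other
        index = min(len(good_trace), len(bad_trace))
        good = good_trace[index] if index < len(good_trace) else None
        bad = bad_trace[index] if index < len(bad_trace) else None
        return index, good, bad
    return None


good = ['Agg2D::lineWidth', 'Agg2D::lineColor', 'Agg2D::line', 'Agg2D::render']
bad = ['Agg2D::lineWidth', 'Agg2D::lineColor', 'Agg2D::render']
print(first_divergence(good, bad))
```

A line-by-line comparison like this only makes sense per thread; interleaving across threads will differ between runs even when nothing is wrong.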

An example trace for one thread of a program using the Anti-Grain graphics library:
    void Agg2D::lineCap(LineCap cap)
    void Agg2D::lineJoin(LineJoin join)
  void Agg2D::attach(Image& img)
    void Agg2D::attach(unsigned char* buf, unsigned width, unsigned height, int stride)
    height =64 
    width =256 
      void Agg2D::resetTransformations()
      void Agg2D::lineWidth(real w)
      void Agg2D::lineColor(unsigned r, unsigned g, unsigned b, unsigned a)
        void Agg2D::lineColor(Color c)
      void Agg2D::fillColor(unsigned r, unsigned g, unsigned b, unsigned a)
        void Agg2D::fillColor(Color c)
      void Agg2D::textAlignment(TextAlignment alignX, TextAlignment alignY)
      void Agg2D::clipBox(real x1, real y1, real x2, real y2)
      void Agg2D::lineCap(LineCap cap)
      void Agg2D::lineJoin(LineJoin join)
      void Agg2D::flipText(bool flip)
      void Agg2D::imageFilter(ImageFilter f)
      void Agg2D::imageResample(ImageResample f)
  void Agg2D::lineWidth(real w)
  void Agg2D::lineColor(unsigned r, unsigned g, unsigned b, unsigned a)
    void Agg2D::lineColor(Color c)
  void Agg2D::line(real x1, real y1, real x2, real y2)
    void Agg2D::addLine(real x1, real y1, real x2, real y2)
    void Agg2D::drawPath(DrawPathFlag flag)
      void Agg2D::addStrokePath()
      void Agg2D::render(bool fillColor)
      time at Agg2D::render is (2013-11-3-22-7-54-170790)
        static void render(Agg2D& gr,
          void blend_solid_hspan(int x, int y,
            void blend_pix(value_type* p,
            void blend_pix(value_type* p,
            void blend_pix(value_type* p,
            void blend_pix(value_type* p,
          void blend_solid_hspan(int x, int y,
            void blend_pix(value_type* p,
            void blend_pix(value_type* p,
            void blend_pix(value_type* p,
Seeing the flow of execution like this can also be generally useful for learning how a large program works.


1) Add the following line to the headers section of a .cpp or .h file to log: #define TTD_TRACELOG(s,n)

2) If desired, add some manual logging statements in these files. For example, TTD_TRACELOG("the height is", height); or TTD_TRACELOG("in function main, time is {TIME}", 0);

3) Add each file to a list in
srcs = [ r'~/fastdev/fastpixel/fastpixel_v3/sdl/agg24/src', 
 '~/fastdev/fastpixel/fastpixel_v3/sdl/agg24/include/agg_pixfmt_rgb.h' ]
outputdir = r'~/output'

Now, run the script. It will modify the specified sources (after backing up the originals) and add logging statements for each function/method. Rebuild and run your program. Then run the script that reads the output, and you will see traces for each thread.


Because race conditions / thread concurrency issues could be the cause, this tool needs to have minimal impact on the program. Each trace results in only one 32-bit integer written to an already-open file handle. The resulting overhead was sufficiently small in the test projects I used.

The Python script uses a heuristic to find all functions and methods. The script associates a 32-bit integer with the current function and writes this to a persisted lookup file to use later. It injects code that creates a class instance. The class's constructor and destructor both write traces, and so we can trace both entry and exit.

For each trace statement, the current thread id is hashed to a 32-bit integer, and the lower 14 bits are used as an index into an array of 16384=2^14 file handles. If the file handle isn't set, it is opened. The integer and any associated data (like the current time if specified by the string "{TIME}") are then written in binary to the file. (On Windows, _setmaxstdio will work around file limits in long-running thread-spawning scenarios, and _fwrite_nolock can be used to ensure a critical section is not taken.) Buffered output dramatically reduces the cost of these many fwrite calls. The script that displays the traces simply reads the binary output, finding each tag in the lookup file that was persisted earlier.
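
The record format described above can be sketched in Python to make the moving parts concrete. This is an illustration under stated assumptions, not the tool's code (the injected code is C++): CRC32 stands in for whatever hash the tool uses, records are assumed to be little-endian, the "{TIME}" payload is left out, and the function names are made up.

```python
import io
import struct
import zlib

NUM_SLOTS = 1 << 14  # 16384 file-handle slots, as in the description above


def slot_for_thread(thread_id):
    # hash the thread id to 32 bits (CRC32 is an assumed stand-in),
    # then keep the lower 14 bits as the array index
    packed = struct.pack('<Q', thread_id & 0xFFFFFFFFFFFFFFFF)
    hashed = zlib.crc32(packed) & 0xFFFFFFFF
    return hashed & (NUM_SLOTS - 1)


def write_trace(stream, tag):
    # each trace is a single 32-bit integer identifying a function
    stream.write(struct.pack('<I', tag))


def read_traces(stream, lookup):
    # turn the binary log back into function names using the lookup table
    names = []
    while True:
        raw = stream.read(4)
        if len(raw) < 4:
            break
        (tag,) = struct.unpack('<I', raw)
        names.append(lookup.get(tag, '<unknown tag %d>' % tag))
    return names


lookup = {1: 'void Agg2D::lineWidth(real w)', 2: 'void Agg2D::lineColor(Color c)'}
log = io.BytesIO()  # stands in for one thread's already-open log file
for tag in (1, 2, 1):
    write_trace(log, tag)
log.seek(0)
print(read_traces(log, lookup))
```

Because two thread ids can hash to the same slot, a scheme like this can occasionally merge two threads into one log file; that is the price of avoiding a lock-protected map.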

In Windows, ETW is a very-low-overhead tracing mechanism that would fit well here, but it is not cross-platform, and requires an ETW consumer module to be written to output events to disk.

I'll add this to GitHub soon.

Monday, May 13, 2013

Keyboard shortcuts for the SciTE code editor

One of the reasons I use the SciTE code editor is that keyboard shortcuts can be easily changed in a plain-text configuration file.

If you use SciTE, here is a Python script to list current keyboard shortcuts. You will need to have downloaded SciTE's source code, or at least download the file SciTERes.rc.

# list SciTE shortcuts
# Ben Fisher,
# 1) shortcuts from user.shortcuts
# 2) shortcuts from commands
# 3) shortcuts from SciTE's rc
# does not include scintilla's simple keybindings (see KeyMap.cxx KeyMap::MapDefault)

scite_src = 'path/to/scite'
scite_props = 'path/to/sciteinstall/properties'
specificfiletypes = []
# specificfiletypes = ['*.py', '$(']

import os, sys, re

class Shortcut(object):
  def __init__(self, name='', keys='', filetypes='', type='plugin'):
    self.name=name; self.keys=keys; self.filetypes=filetypes; self.type=type
  def __str__(self):
    return self.keys.ljust(20) + self.type
  def getsortkey(self):
    # create a tuple specifying sort priority. length of key-shortcut is most important.
    sortvallastpart = self.keys.split('+')[-1]
    sortlenlastpart = len(sortvallastpart)
    sorttypepriority = dict(builtin=3, usershortcuts=2, plugin=1).get(self.type, 0)
    return (sortlenlastpart, sortvallastpart,self.keys, sorttypepriority)

def go():
  scite_src_rc = os.path.join(scite_src, 'win32', 'SciTERes.rc')
  assert os.path.exists(scite_src)
  assert os.path.exists(scite_src_rc)
  assert os.path.exists(scite_props)
  if not os.path.exists(os.path.join(scite_props, '')):
    print 'warning: not found.'
  results = []
  f = open(scite_src_rc, 'r')
  bAccels = False
  for line in f:
    line = line.strip() # remove the trailing newline so the comparisons below can match
    if bAccels:
      if line=='BEGIN' or not line or line=='*/': continue
      if line.startswith('//') or line.startswith('#') or line.startswith('/*'): continue
      if line=='END': break
      items = [item.strip() for item in line.split(',')]
      keys = items[0].replace('"','').replace('VK_','')
      keys = keys[0].upper() + keys[1:].lower()
      if 'SHIFT' in items: keys = 'Shift+'+keys
      if 'ALT' in items: keys = 'Alt+'+keys
      if 'CONTROL' in items: keys = 'Ctrl+'+keys
      results.append(Shortcut(items[1], keys, '*', 'builtin'))
    if line=='ACCELS ACCELERATORS': bAccels = True
  propcmds = {}
  for filename in os.listdir(scite_props):
    if not filename.lower().endswith('.properties'): continue
    f = open(os.path.join(scite_props, filename))
    bUsershort=False # are we currently looking at user.shortcuts
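    for line in f:
      line = line.strip()
      if bUsershort:
        if '|' in line:
          shortcut, cmd, unused = line.split('|')
          results.append(Shortcut(cmd, shortcut, '*', 'usershortcuts'))
        if not line.endswith('\\'): bUsershort=False
      else:
        # NOTE: parts of this branch were lost from the published listing. The group
        # references below are a best-guess reconstruction, not the original code.
        parse = re.match(r'command\.(name|shortcut)\.(\w+)\.([^=]+)=(.*)', line)
        if parse and parse.group(3) in specificfiletypes:
          # add to a dict of in-progress commands
          obj = propcmds.setdefault(parse.group(2)+';'+parse.group(3), Shortcut())
          obj.filetypes = parse.group(3)
          # assumption: single-digit commands default to Ctrl+digit
          if len(parse.group(2))==1 and not obj.keys: obj.keys = 'Ctrl+'+parse.group(2)
          if parse.group(1)=='name': obj.name = parse.group(4)
          elif parse.group(1)=='shortcut': obj.keys = parse.group(4)

        if line.startswith('user.shortcuts='): bUsershort=True
  for propcmd in propcmds: results.append(propcmds[propcmd])
  results.sort(key = lambda item: item.getsortkey())
  for res in results:
    if not res.name.startswith('IDM_BUFFER+'):
      if not res.name.startswith('IDM_TOOLS+'):
        print(res)

if __name__=='__main__':
  go()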

Monday, March 18, 2013

Python printval

After reading about Python inspect, I realized that a lot of information is available to a function being called. One can even look downwards into the scope of the caller. It occurred to me that this could be useful for debugging; for quick and dirty debugging I sometimes use the "print values to stdout and see what's wrong" technique. I came up with Printval -- quick Python debugging that can also print expressions and locals.
>>> from printval import printval
>>> a = 4
>>> b = 5

>>> printval| a
a is 4

>>> printval| b
b is 5

>>> print('3*a*b is '+str(3*a*b))
3*a*b is 60

>>> printval| 3*a*b
3*a*b is 60

See the last example -- saves typing!

Here's all of printval:

import inspect, itertools
class PythonPrintVal(object):
    def _output(self, name, val):
        # expands a struct to one level.
        print('printval| '+name+' is '+repr(val))
        if name=='locals()':
            for key in val:
                if not key.startswith('_'):
                    self._output(key, val[key])
        elif ' object at 0x' in repr(val):
            for propname in dir(val):
                if not propname.startswith('_'):
                    sval = repr(val.__getattribute__(propname))
                    print(' \t\t.'+propname+'  =  '+sval)

    def __or__(self, val):
        # look in the source code for the text
        fback = inspect.currentframe().f_back
        try:
            with open(fback.f_code.co_filename, 'r') as srcfile:
                line = next(itertools.islice(srcfile, fback.f_lineno-1, fback.f_lineno))
                self._output(line.replace('printval|','',1).strip(), val)
        except (StopIteration, IOError):
            return self._output('?',val)

printval = PythonPrintVal()
#really, that's all the code there is!

It will also enumerate through the fields of a class. If you use printval| locals(), you get a more-nicely formatted representation of all locals in scope.

(In an earlier version, I used the syntax printval.a, so I could create chains like printval.a.b. This method doesn't require looking at source code -- the printval object knows the string 'a' from the parameter passed to __getattribute__, and retrieves its value from the scope of the caller. As a benefit this could work in compiled/frozen apps. The downside is that this doesn't allow expressions like printval|a+4)
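
The earlier attribute-based approach can be sketched as follows. This is a reconstruction of the idea from the description above, not the original code: the class name AttrPrintVal is made up, and chaining such as printval.a.b is left out to keep the example short.

```python
import inspect


class AttrPrintVal(object):
    def __getattr__(self, name):
        # only called for attributes that don't exist on the object itself
        if name.startswith('__'):
            raise AttributeError(name)
        # look up the name in the scope of whoever wrote attr_printval.<name>
        caller = inspect.currentframe().f_back
        scope = dict(caller.f_globals)
        scope.update(caller.f_locals)
        if name not in scope:
            raise AttributeError(name + ' is not defined in the caller')
        print('printval| ' + name + ' is ' + repr(scope[name]))
        return scope[name]


attr_printval = AttrPrintVal()


def demo():
    a = 4
    return attr_printval.a


demo()
```

Since the name arrives as a string, no source file needs to be read, which is why this variant could keep working in a frozen app.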

It turns out that other people have had similar ideas; there are modules for ScopeFormatter, Say, and Show.

For ultimate scratchpad-ability:

  1. The LnzScript editor allows you to run untitled scripts. Modify the file so that it runs Python instead.
  2. Add a keybinding where Ctrl+\ inserts the text "printval| " into the editor.
  3. You now have a nifty scratchpad for running math expressions and testing Python code.

Monday, February 25, 2013

LnzScript 0.5 released

Announcing a new release of LnzScript:

Some LnzScript functions used to depend on a tool called nircmd. In this release, Clipboard.saveImage, Screen.convertImage, and Screen.saveScreenshot were rewritten from scratch, and nircmd is no longer necessary.

Also, the API Reference now looks a lot better. It still uses a client-side XSL transform; it's nice to use the same xml file for rendering docs in the browser, rendering docs in a little C# tool that comes with LnzScript, and rendering tooltip documentation within the LnzScript Editor.

See screencasts and revamped documentation here.