Wednesday, July 27, 2016

Sending WM_COPYDATA in Python with ctypes

Putting together pieces from a few online sources, here is example code that:
  • Supports 32-bit and 64-bit Python
  • Uses only standard libraries (for Python 2.5+)
  • Can be used to send commands to the SciTE code editor

In particular, there is no win32gui dependency.

import sys
import ctypes
from ctypes import wintypes


WM_COPYDATA = 0x004A

# dwData is pointer-sized (ULONG_PTR), so its width depends on the Python build
context64bit = sys.maxsize > 2**32
if context64bit:
  class COPYDATASTRUCT(ctypes.Structure):
    _fields_ = [('dwData', ctypes.c_ulonglong),
      ('cbData', ctypes.wintypes.DWORD),
      ('lpData', ctypes.c_void_p)]
else:
  class COPYDATASTRUCT(ctypes.Structure):
    _fields_ = [('dwData', ctypes.wintypes.DWORD),
      ('cbData', ctypes.wintypes.DWORD),
      ('lpData', ctypes.c_void_p)]

def findWindow(windowClass):
  receiver = None
  hwnd = ctypes.windll.user32.FindWindowA(
    windowClass, receiver)
  return hwnd or None

def sendMessage(message, hwnd, dwData=0):
  assert isinstance(message, str)
  sender_hwnd = 0
  buf = ctypes.create_string_buffer(message)
  copydata = COPYDATASTRUCT()
  copydata.dwData = dwData
  copydata.cbData = buf._length_
  copydata.lpData = ctypes.cast(buf, ctypes.c_void_p)
  return ctypes.windll.user32.SendMessageA(
    hwnd, WM_COPYDATA, sender_hwnd,
    ctypes.byref(copydata))

In Python 3, the parameter would presumably be a bytes object and not a str object.
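
Here is a minimal Python 3 sketch of that idea, assuming the message is passed as bytes. The names build_copydata and send_copydata are hypothetical and not part of the code above. The sketch uses fixed-width ctypes types instead of ctypes.wintypes so that the structure-building half can be tried on any platform; only send_copydata needs Windows.

```python
import ctypes
import sys

WM_COPYDATA = 0x004A

class COPYDATASTRUCT(ctypes.Structure):
    # c_size_t is pointer-sized, matching ULONG_PTR on 32-bit and 64-bit builds
    _fields_ = [('dwData', ctypes.c_size_t),
                ('cbData', ctypes.c_uint32),
                ('lpData', ctypes.c_void_p)]

def build_copydata(message, dwData=0):
    # hypothetical helper: packs a bytes message into a COPYDATASTRUCT
    if not isinstance(message, bytes):
        raise TypeError('message must be bytes')
    buf = ctypes.create_string_buffer(message)  # appends a trailing NUL byte
    copydata = COPYDATASTRUCT()
    copydata.dwData = dwData
    copydata.cbData = ctypes.sizeof(buf)
    copydata.lpData = ctypes.cast(buf, ctypes.c_void_p)
    # return buf as well, so the caller keeps the buffer alive
    return copydata, buf

def send_copydata(hwnd, message, dwData=0):
    # hypothetical helper: Windows only
    if sys.platform != 'win32':
        raise OSError('WM_COPYDATA is only available on Windows')
    copydata, buf = build_copydata(message, dwData)
    send = ctypes.windll.user32.SendMessageA
    send.argtypes = [ctypes.c_void_p, ctypes.c_uint,
                     ctypes.c_size_t, ctypes.c_void_p]
    send.restype = ctypes.c_ssize_t
    return send(hwnd, WM_COPYDATA, 0, ctypes.addressof(copydata))
```

Declaring argtypes and restype keeps the window handle and the structure pointer from being truncated to 32 bits on 64-bit Python.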

Monday, July 11, 2016

Moving a directory in git from one repo to another

I recently moved a directory from one repo to another, and preserved both commit history and last-modified times.

Here is a summary of what I've learned from a few different sources. These commands should be run in Linux; there is likely a PowerShell equivalent, but I did not find one. (Speaking of sharing a repo across platforms, adding a .gitattributes with * text=auto to handle newline characters has worked well for me.)

Let's say we want to move "dir/directory_src" of "repo_one" to "dir/directory_dest" of "repo_two".

First let's move all files from directory_src to directory_dest and remove all files outside of directory_dest.

mkdir ~/repos
cd ~/repos
git clone https://servername/repo_one
cd repo_one
# intentionally prevent ourselves from pushing changes
git remote rm origin
# run the following and look at the resulting filenames to see if they look correct
git ls-files -s | sed "s-\tdir/directory_src/-\tdir/directory_dest/-"
# rewrite history so that "dir/directory_src" moves to "dir/directory_dest"
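git filter-branch --index-filter  'git ls-files -s | sed "s-\tdir/directory_src/-\tdir/directory_dest/-" | GIT_INDEX_FILE=$GIT_INDEX_FILE.new git update-index --index-info && mv "$GIT_INDEX_FILE.new" "$GIT_INDEX_FILE"' HEAD
# if the command above fails because the temporary index file is missing for some commits, use this variant instead
git filter-branch --index-filter  'git ls-files -s | sed "s-\tdir/directory_src/-\tdir/directory_dest/-" | GIT_INDEX_FILE=$GIT_INDEX_FILE.new git update-index --index-info && if [ -f "$GIT_INDEX_FILE.new" ]; then mv "$GIT_INDEX_FILE.new" "$GIT_INDEX_FILE"; fi' HEAD
# move or delete the directory .git/refs/original before running filter-branch again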
cd ~/repos/repo_one

# run the following and confirm that no files in dir/directory_dest are listed
git ls-files | egrep -v ^dir/directory_dest/
# rewrite history and remove all other files
git filter-branch --tree-filter 'rm -rf $(git ls-files | egrep -v ^dir/directory_dest/)' -- --all
# delete empty commits (optional)
git filter-branch --commit-filter 'git_commit_non_empty_tree "$@"' HEAD
Now let's copy the files into repo_two.
cd ~/repos
git clone https://servername/repo_two
cd repo_two
git remote add from_repo_one ~/repos/repo_one
git pull from_repo_one master
git remote rm from_repo_one
# after confirming that everything looks right,
# run git push.
"repo_two" should now contain "dir/directory_dest" and all of its contents.

Note: the tree filter removal script will not remove files whose names contain spaces or tabs, so those files will be left behind.
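
One way to handle such filenames is to have git separate paths with NUL bytes (the -z option of git ls-files) instead of relying on whitespace splitting. The following is a small sketch under that assumption; paths_outside is a hypothetical helper, and the sample bytes stand in for real `git ls-files -z` output.

```python
def paths_outside(ls_files_z_output, keep_prefix):
    """Return paths from `git ls-files -z` output that are not under keep_prefix."""
    # -z output is a sequence of paths, each terminated by a NUL byte
    paths = [p for p in ls_files_z_output.split(b'\0') if p]
    return [p for p in paths if not p.startswith(keep_prefix)]

# sample input standing in for real `git ls-files -z` output
sample = b'dir/directory_dest/a.txt\0other/my file.txt\0README\0'
to_remove = paths_outside(sample, b'dir/directory_dest/')
print(to_remove)
```

In a real run, the input could come from subprocess.check_output(['git', 'ls-files', '-z']), and each returned path could then be deleted individually.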


Sources:
  • Moving files from one Git repository to another
  • How can I move a directory in a Git repo

Monday, February 29, 2016

coordinate_music, keeping my music tidy

I wrote a set of tools to keep my local music library "coordinated", to have perfect consistency between filename, id3 tag, and Spotify's metadata.

When importing music, coordinate_music will walk through audio files and use the Spotify API to search for the associated track. This can be done either one album at a time or track by track. It will present you with a list of candidates; you can then confirm, or type "hear0" to hear the original, or type "hear1" to hear the first candidate. Here's what it looks like when searching by track:

Here's what it looks like when searching by album:

This association is saved in the website ID3 tag in the audio file (mp3, m4a, or flac). After importing music, this set of scripts can:
  • check that every directory and filename is formatted correctly.
  • check for consistency between filename, id3 tag, and Spotify's metadata, and set tags from the filename or vice versa.
  • create .url files that open directly to Spotify Desktop.
  • search Spotify interactively by artist, title, album to find a corresponding Spotify track.
  • save all metadata to a utf-8 text file, which can be useful for backup.
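
As an illustration of the .url feature, here is a minimal sketch of writing a Windows Internet Shortcut file that points at a Spotify URI. This is an assumption about the general approach, not necessarily how coordinate_music writes its files; write_url_shortcut and the track id are hypothetical.

```python
import os
import tempfile

def write_url_shortcut(path, target):
    # A .url file is a small INI-style text file with an [InternetShortcut] section.
    # newline='' stops Python from translating the explicit \r\n line endings.
    with open(path, 'w', newline='') as f:
        f.write('[InternetShortcut]\r\nURL=' + target + '\r\n')

# hypothetical track id, for illustration only
shortcut_path = os.path.join(tempfile.mkdtemp(), 'Example Song.url')
write_url_shortcut(shortcut_path, 'spotify:track:EXAMPLE_TRACK_ID')

with open(shortcut_path, newline='') as f:
    shortcut_text = f.read()
```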

Other features include, if enabled:

  • opening a .mp3 redirects to the associated track to play in Spotify desktop, which often has higher audio quality.
  • typing "BRK" into any interactive text prompt to view the current directory in UI and retry the current operation.
  • filenames in the format .sv.mp3 are synced to an external directory for backup.
  • working with Spotify playlists (viewing tracks, removing tracks, creating playlist from directory of mp3s).
  • saving a Spotify playlist to text file of song lengths and names.
  • indicating a song's subjective "rating" by its bitrate.
  • renaming files in a directory based on Spotify playlist.
  • saving disk space, by interactively walking through directories, and
    • if low bitrate and Spotify's 'popularity' data indicates high popularity,
    • replace the file with a .url linking to Spotify, after asking the user.
Tests pass on Linux (latest Linux Mint) and Windows (7 and later supported).

See the source code, and a more complete explanation, on GitHub.

Copying files in Python without race conditions

When copying files in Python, shutil.copy (and shutil.copy2) will silently overwrite the destination file if it already exists. At times this is the desired behavior, but I find that more often, I want to prevent overwriting the destination. A "naive" check would be this:
def supposedlySaferCopy(srcfile, destfile):
    if exists(destfile):
        raise IOError('destination already exists')
    shutil.copy(srcfile, destfile)
But it has a race condition: there is a small window of time between checking for existence and running the copy. Sometimes this check is only a safeguard, for example to make sure file operations in a complex script are not overwriting data unexpectedly. In general, though, this pattern can also be a security issue, e.g. a type of symlink race.

In Windows one can call the Windows API directly; both CopyFile and MoveFileEx take a parameter that controls whether an existing destination is overwritten. This can be done in pure Python because ctypes is built into Python's standard library (in 2.5 and later). On POSIX systems, I wrote a copyFilePosixWithoutOverwrite function. Opening the destination with both O_CREAT and O_EXCL makes the open call fail if the file already exists, and the existence check and the creation happen as a single atomic step. Here are my copy and move implementations:
import os
import shutil
import sys
from os.path import exists

def copy(srcfile, destfile, overwrite):
    if not exists(srcfile):
        raise IOError('source path does not exist')
    if srcfile == destfile:
        # nothing to do
        pass
    elif sys.platform == 'win32':
        from ctypes import windll, c_wchar_p, c_int
        failIfExists = c_int(0) if overwrite else c_int(1)
        res = windll.kernel32.CopyFileW(c_wchar_p(srcfile), c_wchar_p(destfile), failIfExists)
        if not res:
            raise IOError('CopyFileW failed')
    else:
        if overwrite:
            shutil.copy(srcfile, destfile)
        else:
            copyFilePosixWithoutOverwrite(srcfile, destfile)

def move(srcfile, destfile, overwrite):
    if not exists(srcfile):
        raise IOError('source path does not exist')
    if srcfile == destfile:
        # nothing to do
        pass
    elif sys.platform == 'win32':
        from ctypes import windll, c_wchar_p, c_int
        # 1 is MOVEFILE_REPLACE_EXISTING
        replaceExisting = c_int(1) if overwrite else c_int(0)
        res = windll.kernel32.MoveFileExW(c_wchar_p(srcfile), c_wchar_p(destfile), replaceExisting)
        if not res:
            raise IOError('MoveFileExW failed')
    else:
        copy(srcfile, destfile, overwrite)
        # remove the source so that this is a move rather than a copy
        os.unlink(srcfile)
def copyFilePosixWithoutOverwrite(srcfile, destfile):
    # fails if destination already exists. O_CREAT with O_EXCL makes the
    # existence check and the file creation a single atomic operation.
    # raises OSError on failure.
    flags = os.O_CREAT | os.O_EXCL | os.O_WRONLY
    file_handle = os.open(destfile, flags)
    with os.fdopen(file_handle, 'wb') as fdest:
        with open(srcfile, 'rb') as fsrc:
            while True:
                # read in chunks (64 KiB here; any reasonable size works)
                buffer = fsrc.read(64 * 1024)
                if not buffer:
                    break
                fdest.write(buffer)
Fairly comprehensive tests and more file utilities can be found on my GitHub page.
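
For comparison, Python 3.3 and later expose the same exclusive-creation behavior through the 'x' mode of the built-in open. The sketch below is an alternative under that assumption, not the implementation above; copy_without_overwrite is a hypothetical name.

```python
import os
import shutil
import tempfile

def copy_without_overwrite(srcfile, destfile):
    # Mode 'x' creates the file exclusively: open() raises FileExistsError
    # if destfile already exists, with no gap between check and creation.
    with open(srcfile, 'rb') as fsrc:
        with open(destfile, 'xb') as fdest:
            shutil.copyfileobj(fsrc, fdest)

# usage: the first copy succeeds, the second is refused
tmpdir = tempfile.mkdtemp()
src = os.path.join(tmpdir, 'src.bin')
dest = os.path.join(tmpdir, 'dest.bin')
with open(src, 'wb') as f:
    f.write(b'example data')

copy_without_overwrite(src, dest)
with open(dest, 'rb') as f:
    copied = f.read()

try:
    copy_without_overwrite(src, dest)
    second_copy_refused = False
except FileExistsError:
    second_copy_refused = True

shutil.rmtree(tmpdir)
```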

In Python 2, starting a Windows process with non-ascii characters

I recently encountered an exception in Python 2, using subprocess on Windows. If the process name or any of the arguments contain non-ascii/Unicode characters, an error like the following is raised: UnicodeEncodeError: 'ascii' codec can't encode character u'\xc5' in position 5: ordinal not in range(128).

The issue was opened several years ago, on the official bug tracker, and fixed in Python 3 but not Python 2. It looks like the ultimate source of the issue is the use internally of CreateProcessA instead of CreateProcessW. (Some of the workarounds on this page, like specifying a code page, aren't full solutions since they'll still fail for most unicode characters).

Here's my workaround. It uses the winprocess module, which is MIT Licensed and available in many places on GitHub.

def runWithoutWaitUnicode(listArgs):
    # in Windows, non-ascii characters cause subprocess.Popen to fail.
    import subprocess
    import sys
    if sys.platform != 'win32' or all(isinstance(arg, str) for arg in listArgs):
        # no unicode arguments (or not on Windows), so Popen works as usual
        p = subprocess.Popen(listArgs, shell=False)
        return p.pid
    else:
        import winprocess
        import types
        if isinstance(listArgs, types.StringTypes):
            combinedArgs = listArgs
        else:
            combinedArgs = subprocess.list2cmdline(listArgs)
        combinedArgs = unicode(combinedArgs)
        executable = None
        close_fds = False
        creationflags = 0
        env = None
        cwd = None
        startupinfo = winprocess.STARTUPINFO()
        # remaining arguments follow the Win32 CreateProcess parameter order
        handle, ht, pid, tid = winprocess.CreateProcess(executable, combinedArgs,
            None, None,
            int(not close_fds),
            creationflags,
            env,
            cwd,
            startupinfo)
        return pid
This only accounts for CreateProcess, and not ShellExecute (i.e. passing shell=True to subprocess). However, you can use the "start" command as a way to ShellExecute. For example, in Windows, to open a file with its default program, you can use runWithoutWaitUnicode([u'cmd', u'/c', u'start', filePath]). (As a side note, if a directory name is passed, the directory will be opened in Explorer UI, which can be useful).
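
To see what command line the "start" example produces, here is a short Python 3 sketch of the quoting step done by subprocess.list2cmdline, which the workaround relies on. The file path is a made-up example.

```python
import subprocess

# hypothetical path containing a space and a non-ascii character
args = [u'cmd', u'/c', u'start', u'C:\\My Files\\caf\u00e9.txt']

# list2cmdline joins the arguments, quoting any that contain whitespace
command_line = subprocess.list2cmdline(args)
print(command_line)
```

The argument containing a space is wrapped in double quotes, and the non-ascii character is passed through unchanged.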

For tests, including tests that specifically exercise the Unicode case that was previously broken, see my GitHub page.

Thursday, February 11, 2016

Adding features to Create Synchronicity

Create Synchronicity is a lightweight open source backup and synchronization program. After choosing a source directory and a destination directory, it will send updated files from the source to the destination. It supports previewing, scheduled actions, filtering by file type, and checksum verification.

Although I use dedicated backup software, I've found Create Synchronicity useful for ad-hoc synchronization like maintaining a mirror of my music library on an external hard drive. I recently modified Create Synchronicity's source code to add some new features to make it even more useful.

Adding a Context Menu

After selecting item(s) in the Preview list, right-click to show my new context menu.
  • Show Differences...
    • Highlights differences between the files, using winmerge.exe or other diff/merge software.
  • Copy Source to Destination...
    • Selectively sync only the files that are highlighted, after showing a preview.
  • Copy Destination to Source...
    • "Reverse sync" (from destination to source) the files that are highlighted, after showing a preview.
  • Keep Source and Destination...
    • In some cases, you want to keep both the source version of the file and the destination version of the file. In order to do this, "Keep Source and Destination" appends a timestamp to the destination filename and copies the file to both locations, after showing a preview.

Additional settings

To turn on these settings, press Ctrl+Alt+E to enable "expert" features. From now on, the Settings page will show this menu in the bottom left:
  • Check for newly added contents before deleting folders
    • Time can pass between the user running Preview and Sync. New files added during this window can be potentially deleted if the parent directory is marked for deletion in the Preview. Turn on this check to eliminate the race condition.
  • Show yellow icon if destination is newer
    • When in "strict mirror" mode, show a yellow icon for files where the destination (about to be overwritten) is more recent than the source.
  • Potential speedup when MD5 and compare file size are enabled
    • Reordered code to reduce the number of checksums needed.
  • Tests
    • Low level tests cover every branch of newly added functions, every combination of file/folder, create/update/delete. Component tests write to a temp directory and verify all directories, file contents written as expected.
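
The "check for newly added contents" idea can be sketched in a few lines of Python. Create Synchronicity itself is not written in Python, so this is only an illustration of the logic under my own assumptions: list_contents and newly_added_since_preview are hypothetical names. A check like this narrows the window rather than removing it entirely, since files can still appear between the check and the deletion.

```python
import os
import tempfile

def list_contents(folder):
    # collect every file and subdirectory, as paths relative to folder
    found = set()
    for dirpath, dirnames, filenames in os.walk(folder):
        for name in dirnames + filenames:
            found.add(os.path.relpath(os.path.join(dirpath, name), folder))
    return found

def newly_added_since_preview(folder, previewed_contents):
    # anything present now that was not seen during the preview
    return list_contents(folder) - set(previewed_contents)

# usage: a file appears after the preview was taken
folder = tempfile.mkdtemp()
open(os.path.join(folder, 'old.txt'), 'w').close()
preview = list_contents(folder)
open(os.path.join(folder, 'new.txt'), 'w').close()
added = newly_added_since_preview(folder, preview)
```

If the returned set is non-empty, the folder deletion would be skipped or the user asked to run the preview again.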
Download link and source code coming soon!

Wednesday, January 20, 2016

A Simple Interface to Read/Write Audio Metadata in Python

I wrote a small wrapper for Mutagen that makes it easier to read/write audio metadata (tags for mp3, ogg, flac, m4a/mp4) in Python. Here's an example:
    o = EasyPythonMutagen('file.mp3')
    o.set('title', 'song title')
    o = EasyPythonMutagen('file.flac')
    o.set('title', 'song title')
    o = EasyPythonMutagen('file in id3_v23.mp3', use_id3_v23=True)
    o.set('title', u'title with unicode: \u0107')

A few differences from Mutagen:
  • You can use the same class and interface for different audio formats.
  • You won't need to catch exceptions in case the mp3 doesn't have an id3 tag yet.
  • You won't have to use a low-level interface to write tags in id3v2.3, for compatibility with Windows and smartphone apps.

It'd be nice to add id3v2.3 support in EasyID3 to the mutagen project at some point. In the meantime I'll use this wrapper.

See the source and download it on GitHub.

Other small features of easypythonmutagen:

  • Provides method to get the empirical ("actual") bitrate in addition to stated bitrate.
  • The "get" methods directly return a value, instead of a list.
  • Intentionally disallows adding unrecognized fields. A typo like o['aartist'] fails instead of succeeding silently.
  • Added a few fields, like 'Composer' and 'Website' for mp4/m4a.
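
As a rough illustration of the empirical bitrate idea, one plausible calculation divides the file size by the track length. This is an assumption about the general approach, not necessarily what the wrapper does (it may, for example, exclude the space taken by tags); empirical_bitrate_kbps is a hypothetical name.

```python
def empirical_bitrate_kbps(file_size_bytes, duration_seconds):
    # bits in the file divided by the track length, in kilobits per second
    if duration_seconds <= 0:
        raise ValueError('duration must be positive')
    return (file_size_bytes * 8) / (duration_seconds * 1000.0)

# a 4,800,000 byte file that lasts 240 seconds
print(empirical_bitrate_kbps(4800000, 240))  # 160.0
```

A file tagged as 320 kbps whose empirical value comes out much lower may have been transcoded or mislabeled.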