Sunday, November 3, 2013

Trace Travel Debugging

While working in a large codebase, from time to time I have encountered bugs that are difficult to reproduce. I know that something has gone wrong, but it is difficult to pinpoint the failure; by the time the failure has occurred, the interesting values/faulty state is gone. For a rarely hit bug, stepping through with a debugger won't work, and crash dumps for ship builds can be difficult to interpret.

One way to attack a problem like this is to use Time Travel Debugging, where the program state is recorded over time, and so a developer can re-play the exact sequence of events. Time Travel Debugging is costly to set up, though, and in addition the overhead it introduces alters timing and potentially altering the repro. (These issues are commonly thread or cross-process concurrency bugs).

I recently thought of a way to assist these investigations. I wrote a tool that I call "Trace Travel Debugging". Trace Travel Debugging uses a logging approach. By comparing logs during a failure with logs during success, one can potentially see the point of divergence. Instead of manually adding log statements, though, my tool automatically traces all function calls, making it much more likely that the point of divergence will be seen.

An example trace for one thread of a program using the Anti-Grain graphics library:
  Agg2D::Agg2D(
    void Agg2D::lineCap(LineCap cap)
    void Agg2D::lineJoin(LineJoin join)
  void Agg2D::attach(Image& img)
    void Agg2D::attach(unsigned char* buf, unsigned width, unsigned height, int stride)
    height =64 
    width =256 
      void Agg2D::resetTransformations()
      void Agg2D::lineWidth(real w)
      void Agg2D::lineColor(unsigned r, unsigned g, unsigned b, unsigned a)
        void Agg2D::lineColor(Color c)
      void Agg2D::fillColor(unsigned r, unsigned g, unsigned b, unsigned a)
        void Agg2D::fillColor(Color c)
      void Agg2D::textAlignment(TextAlignment alignX, TextAlignment alignY)
      void Agg2D::clipBox(real x1, real y1, real x2, real y2)
      void Agg2D::lineCap(LineCap cap)
      void Agg2D::lineJoin(LineJoin join)
      void Agg2D::flipText(bool flip)
      void Agg2D::imageFilter(ImageFilter f)
      void Agg2D::imageResample(ImageResample f)
  void Agg2D::lineWidth(real w)
  void Agg2D::lineColor(unsigned r, unsigned g, unsigned b, unsigned a)
    void Agg2D::lineColor(Color c)
  void Agg2D::line(real x1, real y1, real x2, real y2)
    void Agg2D::addLine(real x1, real y1, real x2, real y2)
    void Agg2D::drawPath(DrawPathFlag flag)
      void Agg2D::addStrokePath()
      void Agg2D::render(bool fillColor)
      time at Agg2D::render is (2013-11-3-22-7-54-170790)
        static void render(Agg2D& gr,
          void blend_solid_hspan(int x, int y,
            void blend_pix(value_type* p,
            void blend_pix(value_type* p,
            void blend_pix(value_type* p,
            void blend_pix(value_type* p,
          void blend_solid_hspan(int x, int y,
            void blend_pix(value_type* p,
            void blend_pix(value_type* p,
            void blend_pix(value_type* p,
Seeing the flow of execution like this can also be generally useful for learning how a large program works.

Usage:

1) Add the following line to the headers section of a .cpp or .h file to log: #define TTD_TRACELOG(s,n)

2) If desired, add some manual logging statements in these files. For example, TTD_TRACELOG("the height is", height); or TTD_TRACELOG("in function main, time is {TIME}", 0);

3) Add each file to a list in tracetraveldebugging.py:
srcs = [ r'~/fastdev/fastpixel/fastpixel_v3/sdl/agg24/src', 
 '~/fastdev/fastpixel/fastpixel_v3/sdl/agg24/include/agg_pattern_filters_rgba.h', 
 '~/fastdev/fastpixel/fastpixel_v3/sdl/agg24/include/agg_pixfmt_rgb.h' ]
outputdir = r'~/output'


Now, run tracetraveldebugging.py. It will modify the specified sources (after backing up the original) and add logging statements for each function/method. Rebuild and run your program. Run tracetraveldebugging_print.py, and you will see traces for each thread.

Details:

Because race conditions / thread concurrency issues could be the cause, this tool needs to have minimal impact on the program. Each trace results in only one 32bit integer written to an already-open file handle. The resulting overhead was sufficiently small in the test projects I used.

The Python script uses a heuristic to find all functions and methods. The script associates a 32 bit integer with the current function and writes this to a persisted lookup file to use later. It injects code that creates a class instance. The class's constructor and destructor both write traces, and so we can trace both entry and exit.

For each trace statement, the current thread id is hashed to a 32-bit integer, the lower 14 bits are taken and used as an index into an array of 16384=2^14 file handles. If the file handle isn't set, it is opened. The integer and any associated data (like the current time if specified by the string "{TIME}") are then written in binary to the file. (On Windows, _setmaxstdio will work around file limits in long-running thread spawning scenario, and fwrite_no_lock can be used to ensure a critsec is not taken). Buffered output dramatically reduces the cost of these many fwrite calls.

tracetraveldebugging_print.py simply reads the binary output, finding each tag in the lookup file that was persisted earlier.

In Windows, ETW is a very-low-overhead tracing mechanism that would fit well here, but it is not cross-platform, and requires an ETW consumer module to be written to output events to disk.

I'll add this to GitHub soon.

Monday, May 13, 2013

Keyboard shortcuts for the SciTE code editor


One of the reasons I use the SciTE code editor is that keyboard shortcuts can be easily changed in a plain-text configuration file.

If you use SciTE, here is a Python script to list current keyboard shortcuts. You will need to have downloaded SciTE's source code, or at least download the file SciTERes.rc.

# list SciTE shortcuts
# Ben Fisher, http://halfhourhacks.blogspot.com
# 1) shortcuts from user.shortcuts
# 2) shortcuts from commands
# 3) shortcuts from SciTE's rc
# does not include scintilla's simple keybindings (see KeyMap.cxx KeyMap::MapDefault)

scite_src = 'path/to/scite'
scite_props = 'path/to/sciteinstall/properties'
specificfiletypes = []
# specificfiletypes = ['*.py', '$(file.patterns.py)']

import os, sys, re

class Shortcut(object):
  def __init__(self, name='',keys='',filetypes='',type='plugin'):
    self.name=name; self.keys=keys; self.filetypes=filetypes; self.type=type
  def __str__(self):
    return self.keys.ljust(20) + self.name.ljust(30)+ self.type
  def getsortkey(self):
    # create a tuple specifying sort priority. length of key-shortcut is most important.
    sortvallastpart = self.keys.split('+')[-1]
    sortlenlastpart = len(sortvallastpart)
    sorttypepriority = dict(builtin=3, usershortcuts=2, plugin=1).get(self.type, 0)
    return (sortlenlastpart, sortvallastpart,self.keys, sorttypepriority)

def go():
  scite_src_rc = os.path.join(scite_src, 'win32', 'SciTERes.rc')
  assert os.path.exists(scite_src)
  assert os.path.exists(scite_src)
  assert os.path.exists(scite_props)
  if not os.path.exists(os.path.join(scite_props, 'SciTEGlobal.properties')):
    print 'warning: SciTEGlobal.properties not found.'
  results = []
  
  f = open(scite_src_rc, 'r')
  bAccels = False
  for line in f:
    line=line.strip()
    if bAccels:
      if line=='BEGIN' or not line or line=='*/': continue
      if line.startswith('//') or line.startswith('#') or line.startswith('/*'): continue
      if line=='END': break
      items = [item.strip() for item in line.split(',')]
      keys = items[0].replace('"','').replace('VK_','')
      keys = keys[0].upper() + keys[1:].lower()
      if 'SHIFT' in items: keys = 'Shift+'+keys
      if 'ALT' in items: keys = 'Alt+'+keys
      if 'CONTROL' in items: keys = 'Ctrl+'+keys
      results.append(Shortcut(items[1], keys, '*', 'builtin'))
    if line=='ACCELS ACCELERATORS': bAccels = True
  f.close()
  
  propcmds = {}
  specificfiletypes.append('*')
  specificfiletypes.append('*.*')
  for filename in os.listdir(scite_props):
    if not filename.lower().endswith('.properties'): continue
    f.close()
    f = open(os.path.join(scite_props, filename))
    bUsershort=False # are we currently looking at user.shortcuts
    for line in f:
      line=line.strip()
      if bUsershort:
        if '|' in line:
          shortcut, cmd, unused = line.split('|')
          results.append(Shortcut(cmd, shortcut, '*', 'usershortcuts'))
        if not line.endswith('\\'): bUsershort=False
      else:
        parse = re.match(r'command\.(name|shortcut)\.(\w+)\.([^=]+)=(.*)', line)
        if parse and parse.group(3) in specificfiletypes:
          # add to a dict of in-progress commands
          obj = propcmds.setdefault(parse.group(2)+';'+parse.group(3), Shortcut())
          obj.filetypes = parse.group(3)
          
          if len(parse.group(2))==1: obj.keys = 'Ctrl+'+parse.group(2)
          if parse.group(1)=='name': obj.name = parse.group(4);
          elif parse.group(1)=='shortcut': obj.keys = parse.group(4)

        if line.startswith('user.shortcuts='): bUsershort=True
    
  
  for propcmd in propcmds: results.append(propcmds[propcmd])
  results.sort(key = lambda item: item.getsortkey())
  for res in results:
    if not res.name.startswith('IDM_BUFFER+'):
      if not res.name.startswith('IDM_TOOLS+'):
        print(res)

if __name__=='__main__':
  go()

Monday, March 18, 2013

Python printval

After reading about Python inspect, I realized that a lot of information is available to a function being called. One can even look downwards into the scope of the caller. It occurred to me that this could be useful for debugging; for quick and dirty debugging I sometimes use the "print values to stdout and see what's wrong" technique. I came up with Printval -- quick Python debugging that can also print expressions and locals.
>>> from printval import printval
>>> a = 4
>>> b = 5

>>> printval| a
a is 4

>>> printval| b
b is 5

>>> print('3*a*b is '+str(3*a*b))
3*a*b is 60

>>> printval| 3*a*b
3*a*b is 60

See the last example -- saves typing!

Here's all of printval:

import inspect, itertools
class PythonPrintVal(object):
    def _output(self, name, val):
        # expands a struct to one level.
        print('printval| '+name+' is '+repr(val))
        if name=='locals()':
            for key in val:
                if not key.startswith('_'):
                    self._output(key, val[key])
        elif ' object at 0x' in repr(val):
            for propname in dir(val):
                if not propname.startswith('_'):
                    sval = repr(val.__getattribute__(propname))
                    print(' \t\t.'+propname+'  =  '+sval)

    def __or__(self, val):
        # look in the source code for the text
        fback = inspect.currentframe().f_back
        try:
            with open(fback.f_code.co_filename, 'r') as srcfile:
                line = next(itertools.islice(srcfile, fback.f_lineno-1, fback.f_lineno))
                self._output(line.replace('printval|','',1).strip(), val)
        except (StopIteration, IOError):
            return self._output('?',val)

printval = PythonPrintVal()
#really, that's all the code there is!

It will also enumerate through the fields of a class. If you use printval| locals(), you get a more-nicely formatted representation of all locals in scope.

(In an earlier version, I used the syntax printval.a, so I could create chains like printval.a.b. This method doesn't require looking at source code -- the printval object knows the string 'a' from the parameter passed to __getattribute__, and retrieves its value from the scope of the caller. As a benefit this could work in compiled/frozen apps. The downside is that this doesn't allow expressions like printval|a+4)

It turns out that other people have had similar ideas; there are modules for ScopeFormatter, Say, and Show.

For ultimate scratchpad-ability:

  1. The LnzScript editor allows you to run untitled scripts. Modify the js.properties file so that it runs Python instead.
  2. Add a keybinding where Ctrl+\ inserts the text "printval| " into the editor.
  3. You now have a nifty scratchpad for running math expressions and testing Python code.

Monday, February 25, 2013

LnzScript 0.5 released

Announcing a new release of LnzScipt:



Some LnzScipt functions used to depend on a tool called nircmd. In this release, Clipboard.saveImage, Screen.convertImage, and Screen.saveScreenshot were rewritten from scratch, and nircmd is no longer necessary.

Also, the API Reference now looks a lot better. It still uses a client-side XSL transform, it's nice to use the same xml file for rendering docs in the browser, rendering docs in a little C# tool that comes with LnzScript, and rendering tooltip documentation within the LnzScript Editor.

See screencasts and revamped documentation here.

Sunday, February 17, 2013

Exporting a remote Subversion repo

I wrote a script to create a backup of a complete remote svn repo, containing all revisions. It looks like there's a project rsvndump that does something similar, but I knew that what I wanted could be done in a few lines of Python.

Change the first few lines, and then run the script.
url='https://url.of.repo/svn/'
output=r'C:\full\path\to\desired\output'
username='user.name'

import os, subprocess, shutil
from os.path import join, exists

def runwithshell(listArgs):
  print('  '.join(listArgs))
  popen = subprocess.Popen(listArgs, shell=True, stdout=subprocess.PIPE)
  stdout = popen.communicate()[0]
  return stdout.rstrip(), popen.returncode

def run_assertsuccess(cmd):
  assert '|' in cmd
  cmd = cmd.split('|')
  stdout, ret = runwithshell(cmd)
  assert ret == 0

def go():
  assert exists(output)
  os.chdir(output)
  run_assertsuccess('svnadmin|create|lrepo')
  if os.name=='nt':
    assert ':' in output, 'want full path'
    fout=open('lrepo/hooks/pre-revprop-change.bat','w')
    fout.write('')
    fout.close()
  else:
    assert output.startswith('/'), 'want full path'
    fout=open('lrepo/hooks/pre-revprop-change','w')
    fout.write('#!/bin/sh\nexit 0;')
    fout.close()
    run_assertsuccess('chmod|+x|localrepos/hooks/pre-revprop-change')
    
  run_assertsuccess('svnsync|init|--username|'+username+'|file:///'+output+'\lrepo|'+url)
  run_assertsuccess('svnsync|sync|--username|'+username+'|file:///'+output+'\lrepo')

def createExportDump():
  assert exists(output)
  os.chdir(output)
  run_assertsuccess('svnadmin|dump|.\lrepo|--incremental|>|out.dump')



if __name__=='__main__':
  go()
  createExportDump()
  print('saved to out.dump')



Later, to restore from the dump, use:
# svnadmin create newrepo
# svnadmin load ./newrepo < out.dump
References: How do I download my Subversion history

Monday, January 28, 2013

Moving a svn with CRLF line-endings to git

In case this helps someone else out there,

Moving a svn repo with Windows line endings to git

Background: It's best to store Git files with Unix line endings (LF) and not Windows endings (CRLF). This convention makes it easier for others to use your code and is what the Git implementation was designed to work with. Note that git's core.autocrlf setting is deprecated and it's better to specify lines in a .gitattributes file.

I wanted to set .gitattributes with * text=auto, which means that Git will internally store text files with LF, but the working copy will have appropriate newlines for the platform. Git will also automatically distinguish between text and binary files.

svn maintains the CRLF endings, and doing a git svn import even on a posix system imports the files with CRLF. Adding the svn:eol-style=native property to all text files in the svn repo could work, as would importing into git and then doing one large commit to renormalize line endings, but these would show up in revision history, which I didn't want.

The following worked:
(open terminal in Linux.)
(install dos2unix if it's not there.)
git svn clone -s --no-metadata file:///path/to/svn/repo/newgit
cd newgit
# rewrite history and change newlines
# dos2unix, at least recently, skips binary files
git filter-branch -f --tree-filter 'find . -path './.git' -prune -o -type f -exec dos2unix \{} \;' HEAD
# wait as history is rewritten. loops through every revision.
# now, create a new repo that isn't attached to svn.
mkdir ../newgit2
cd ../newgit2
git clone file:///path/to/newgit
(cd to this new repo)
(add the .gitattributes file specifying * text=auto)
git add .gitattributes; git commit
# done. if this repo is cloned on a Windows system, 
# there will be CRLF in the working copy and LF internally,
# as Git expects.

I wonder if someone should go through Stack Overflow and note the outdated information about core.autocrlf.

References:
http://maururu.net/2009/svn-to-git-take-2/
Stack overflow

Tuesday, January 22, 2013

CellRename - An Interesting Example


I have a CD by Dungen, one of my favorite Swedish indie rock bands. When copying the CD to mp4s on my computer, it looks like the audio software choked on the Swedish characters, and the resulting mp4 files have truncated names. Is there a way I can avoid having to type all of the song names in manually?



First, I'll sort by file-creation time, to sort in album order:



Now, I'll look up the album, Ta Det Lugnt, on amazon.com.


I copy the text from this website, go back to CellRename, click on the first cell under "New name", and use Ctrl+V to paste. (Cool!)


There are other ways to accomplish this, but let's use regular expressions to remove the track numbers. I select "Replace in Filename" from the Edit menu, and type:
(The r: means to use a regular expression. The [0-9] means to match any number from 0 to 9. The + means that the [0-9] can occur one or more times).


Looks like this worked:


Now, I'll add back the track numbers, and leading zeros, and the .mp4 extension. After selecting Pattern... from the Edit menu, I'll type:
(For each file, the %n will be replaced by a new number, and the %f will be replaced by the previous filename.)

Now I'm done!




(Download CellRename on Github)