Monday, January 28, 2013

Moving an svn repo with CRLF line endings to git

In case this helps someone else out there, here is how I moved an svn repo with Windows line endings to git.

Background: It's best to store files in Git with Unix line endings (LF) rather than Windows line endings (CRLF). This convention makes it easier for others to use your code and is what Git was designed to work with. Note that relying on git's core.autocrlf setting is now discouraged: it lives in each user's local configuration rather than in the repository, so it's better to specify line endings in a .gitattributes file.

I wanted to set .gitattributes with * text=auto, which means that Git will internally store text files with LF, but the working copy will have appropriate newlines for the platform. Git will also automatically distinguish between text and binary files.

svn stores the files with their CRLF endings, so a git svn import brings them in with CRLF even on a POSIX system. Adding the svn:eol-style=native property to all text files in the svn repo could work, as would importing into git and then doing one large commit to renormalize line endings, but either change would show up in the revision history, which I didn't want.

The following worked:
(open terminal in Linux.)
(install dos2unix if it's not there.)
git svn clone -s --no-metadata file:///path/to/svn/repo newgit
cd newgit
# rewrite history and change newlines
# dos2unix, at least recently, skips binary files
git filter-branch -f --tree-filter 'find . -path "./.git" -prune -o -type f -exec dos2unix {} \;' HEAD
# wait as history is rewritten. loops through every revision.
# now, create a new repo that isn't attached to svn.
mkdir ../newgit2
cd ../newgit2
git clone file:///path/to/newgit
(cd to this new repo)
(add the .gitattributes file specifying * text=auto)
git add .gitattributes; git commit
# done. if this repo is cloned on a Windows system, 
# there will be CRLF in the working copy and LF internally,
# as Git expects.
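
For anyone curious what the dos2unix pass does to each revision, here is a rough Python sketch of the same idea: walk the tree, skip .git, skip anything that looks binary, and rewrite CRLF as LF. The function names and the "NUL byte in the first 8000 bytes" binary check are my own assumptions for illustration; dos2unix uses its own rules, so treat this as an approximation and not a replacement for it.

```python
import os

def looks_binary(data, probe=8000):
    """Rough heuristic (an assumption, not dos2unix's exact rule):
    a NUL byte near the start of the file suggests binary content."""
    return b"\0" in data[:probe]

def crlf_to_lf(data):
    """Replace every CRLF pair with a single LF; lone CR bytes are left alone."""
    return data.replace(b"\r\n", b"\n")

def normalize_tree(root):
    """Convert text files under root in place and return the paths that changed."""
    changed = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune .git so repository internals are never touched.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # only regular files, like find's -type f
            with open(path, "rb") as f:
                data = f.read()
            if looks_binary(data):
                continue
            converted = crlf_to_lf(data)
            if converted != data:
                with open(path, "wb") as f:
                    f.write(converted)
                changed.append(path)
    return changed
```

Calling normalize_tree(".") in a scratch checkout returns the list of files it rewrote, which is a handy way to see how many files still carry CRLF.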

I wonder if someone should go through Stack Overflow and note the outdated information about core.autocrlf.


Tuesday, January 22, 2013

CellRename - An Interesting Example

I have a CD by Dungen, one of my favorite Swedish indie rock bands. When copying the CD to mp4s on my computer, it looks like the audio software choked on the Swedish characters, and the resulting mp4 files have truncated names. Is there a way I can avoid having to type all of the song names in manually?

First, I'll sort by file-creation time, to sort in album order:

Now, I'll look up the album, Ta Det Lugnt, on

I copy the text from this website, go back to CellRename, click on the first cell under "New name", and use Ctrl+V to paste. (Cool!)

There are other ways to accomplish this, but let's use regular expressions to remove the track numbers. I select "Replace in Filename" from the Edit menu, and type:
(The r: means to use a regular expression. The [0-9] means to match any digit from 0 to 9. The + means that the [0-9] can occur one or more times.)
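
For readers who think in code, the same [0-9]+ idea looks roughly like this in Python. The track names are made-up placeholders, not the real track list, and the strip() call is my own addition to tidy up the space left behind; I'm not claiming this is exactly what CellRename does internally.

```python
import re

def strip_numbers(name):
    """Remove every run of one or more digits, then trim leftover spaces."""
    return re.sub(r"[0-9]+", "", name).strip()

# Hypothetical pasted track list, for illustration only.
pasted = ["1 First Song", "2 Second Song", "10 Tenth Song"]
print([strip_numbers(n) for n in pasted])
# prints ['First Song', 'Second Song', 'Tenth Song']
```

One caveat: [0-9]+ matches digits anywhere in the name, so a title that contains a number would lose it too.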

Looks like this worked:

Now, I'll add back the track numbers, and leading zeros, and the .mp4 extension. After selecting Pattern... from the Edit menu, I'll type:
(For each file, the %n will be replaced by a new number, and the %f will be replaced by the previous filename.)
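
Here is a small Python sketch of how a %n / %f pattern can be expanded. The apply_pattern function, its width parameter for leading zeros, and the example pattern are all assumptions made up for this illustration; CellRename's own syntax for zero padding may differ.

```python
def apply_pattern(pattern, names, width=2):
    """Expand a pattern once per file.

    %n becomes a 1-based counter padded with leading zeros to `width`
    (hypothetical parameter), and %f becomes the previous filename.
    """
    result = []
    for index, previous in enumerate(names, start=1):
        number = str(index).zfill(width)
        # Substitute %n first so a literal "%n" inside a filename is not expanded.
        result.append(pattern.replace("%n", number).replace("%f", previous))
    return result

# Hypothetical names, for illustration only.
print(apply_pattern("%n %f.mp4", ["First Song", "Second Song"]))
# prints ['01 First Song.mp4', '02 Second Song.mp4']
```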

Now I'm done!

(Download CellRename on Github)

CellRename - A Simple Example

I took some photos on my vacation in The Dalles, Oregon. After copying them onto my computer, I see that the filenames are messy. I'll open CellRename, File->Open, and open the directory.

First, I'll change .JPG to .jpg, which looks nicer to me. I can choose Replace in Filename from the Edit menu, and replace ".JPG" with ".jpg".

Now, I want to add the prefix Dalles Trip to each filename. I can select Add Prefix... from the Edit menu, enter "Dalles Trip ", and click OK.

Now, I select Perform Rename from the File menu, and the files are renamed.

If I need to undo the rename, I can, from the Edit menu. But this is a good starting point until I organize these photos more thoroughly.
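
The whole flow above (replace, add prefix, perform, undo) can be sketched in a few lines of Python. The function names and the IMG_0001.JPG style filename are hypothetical, chosen only for this example; this is not CellRename's actual code.

```python
import os

def plan_renames(names, prefix):
    """Return (old, new) pairs: replace ".JPG" with ".jpg", then add the prefix."""
    return [(old, prefix + old.replace(".JPG", ".jpg")) for old in names]

def perform(directory, plan):
    """Apply the planned renames inside directory."""
    for old, new in plan:
        os.rename(os.path.join(directory, old), os.path.join(directory, new))

def undo(directory, plan):
    """Reverse the renames, walking the plan backwards."""
    for old, new in reversed(plan):
        os.rename(os.path.join(directory, new), os.path.join(directory, old))

# Hypothetical filename, for illustration only.
print(plan_renames(["IMG_0001.JPG"], "Dalles Trip "))
# prints [('IMG_0001.JPG', 'Dalles Trip IMG_0001.jpg')]
```

Keeping the plan as a list of pairs is what makes undo cheap: the same list, read in reverse, puts every file back.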

(Download CellRename on Github)

CellRename is on Github

I polished up an old project from 2008, CellRename: a file-rename utility with a spreadsheet-like UI. Take a look at my brand new Github to download (supported on Windows and Linux).

Features include:

  • setting filename based on pattern (or append numbers like 001, 002, etc)
  • adding a prefix / suffix
  • search/replace within filenames
  • regex replace within filenames
  • if you copy a file path, cellrename will automatically start in that directory

I tested on Win XP, Win7, Win8, Ubuntu Linux, Fedora Linux, Mageia Linux, and Mint Linux. Credit to PyInstaller, a great tool for building stand-alone Python executables.

This is only the beginning; there are many more half hour hacks to be unfurled.