Saturday, November 15, 2008

ISM, an intermediate level language

For our Computer Architecture class, a partner and I created a MIPS cpu in Verilog. This was a really cool project that deserves its own description, but for now I'll focus on something I made while testing the cpu. The cpu runs MIPS assembly code. Writing assembly is not bad for simple tests, but when I started to write more complicated programs, I got tired of remembering what I was using $s0 for and so on.

So, just for fun, I quickly wrote a Python script to translate assembly-like code into assembly. The small language I designed is an intermediate between higher-level code and assembly. The syntax is more like a higher level language, assigning values with "a=b" or adding with "c = a+b". It's not a compiler, but doesn't really need to be, and with too much abstraction one would be better off writing in C anyway.

The cool part about the project is how quickly and naturally the code turned out. It only took an hour to have a fully working version that understood the opcodes, and had useful error messages for syntax errors. The code just flowed quickly as I found ways to reuse code and simplify the problem.

For example, I needed to split each line of code by tokens. I came up with a creative way to do this - Python has a built-in way to split a string based on whitespace, " 'a b c    d e'.split() " evaluates to "['a','b','c','d','e']". So in order to split on both + and - tokens and whitespace, I did "input=input.replace('+',' + ').replace('-',' - '); result=input.split()". By surrounding each occurrance of '+' with whitespace, now the string is split by both whitespace and tokens, so that a+b and a + b are both recognized as ['a','+','b']. (This could also be done by splitting on a regex, but that would be slightly more complicated).

The first, one-hour version, understands code like:
//add numbers from 1 to 25 and put result in sum.

registers i, const_one,sum,highest
seti const_one=1
seti sum=0
seti highest=25
seti i=0
 add i=i+const_one
 add sum=sum+i
 bne (i!=highest) goto LOOP

Then, later, I started to add more features. I added syntax shortcuts like a++, and simulated more types of branching even though our cpu only supported bne. I also added inference, where each line is checked to infer which operation you are trying to do. For example, if the line is "c=a+b" it is clear that you are doing an add instruction, so you don't have to type "add". Also, it now understands simple loops and if structures, translating them to assembly. It looks like Python if you squint.
// find prime numbers from 2 to 100.
// Place the largest found in result.
registers i,tmp, base, limit, result

// clear memory
for i,300,0
  dataWordIndex[i] = zero

for base,2,100
  // check if we've already seen this number come up
  tmp = dataWordIndex[base]
  (tmp != zero) goto CONTINUELOOP
  // check off multiples of this number.
    i = i + base
    dataWordIndex[i] = one
    (i < limit) goto INNERLOOP

// put largest prime found into result
for i,100,2
  tmp = dataWordIndex[i]
  if (tmp==zero)
    goto DONE


It was fun to come up with syntax for this. Branches are done with the unique, yet readable syntax "(a==b) goto Label". Array-like syntax is used for accessing data memory, like 'dataWordIndex[i] = b'. (dataWordIndex means that the index is automatically multiplied by four, so that it can be used as an array of 32 bit integers.)

When the script runs, it also prints out Verilog $display statements, which can be placed in the Verilog code to see the values of the registers.

Here is the code. is the first version. I realize that it won't be useful unless you have a mips assembler and cpu.

Sunday, November 2, 2008

Rickroll Detector

Some people evidently don't understand when a meme is dead, or deserves to die. Rickrolls have been around for far too long and have even been picked up by the mainstream media. It's just not really funny anymore.

Today I was kind of bored and made this in a few hours. It's a system for checking if a link is a rickroll. You provide a URL, then some audio analysis is done, and the result is a plot showing whether or not it was a rickroll. What's nice is that it will also detect spin-offs like Barak-roll, scary roll, and so on.

How it works:
A Launchorz script directs what happens.
First, your audio is muted automatically.
Then, behind the scenes, the url is opened in Firefox. The program Total Recorder is used to capture the audio played by the webpage.
The page is closed after 15 seconds of capturing audio.
The audio is saved as a wav file in a temporary location, and your audio is unmuted.
Then, a MATLAB script is used to analyze the audio.
The audio is normalized to approximately the same volume.
Then, the cross-correlation is taken between a reference Rickroll audio and the recorded audio.
If there are peaks in the cross-correlation above around 200, then it is likely the page contained a Rickroll.

Results: After adjusting of parameters in the MATLAB code, the results are pretty good.

"Warning: rick-roll detected"

"Safe: no rick-roll detected"

Kylie Minogue's "I Should Be So Lucky".

(The mostly-evenly spaced peaks correspond with the steady beat in the music.) The results would probably be even better if the comparison was done in frequency, instead of in time. But I'm not about to waste any more time than the few hours I put into this.

Here's the script I used but it needs Launchorz, MATLAB, and the shareware program Total Recorder.