July 17, 2008

This page hosts an excellent example of gesture/writing recognition implemented in pure Javascript. The author has a blog post here with more information about his implementation. The algorithm works fantastically on the web, and if you follow the gestures in the PDF you will find it achieves good accuracy even with clumsy, mouse-drawn lines. Even better, the source code is compact and quite clear. Inspired by the simplicity of the algorithm and the small size of the code, I ported it to pure Python. Lately I have been thinking a lot about Jython and Python on the Nintendo DS, so I've kept the module clean and hopefully very portable, with no dependencies on other libraries and only one small dependency on a Python built-in. I've also tried to remain dimensionally agnostic where possible, meaning that with a little tweaking you should be able to detect 'gestures' in data of any dimensionality (3D movements, musical pitches, whatever). See below for the full source code listing. Copy the first three bits of code and paste them into a file called OneStrokeGesture.py. The last couple of bits of code are demos, which require the pygame library.

First up is the Path class. This class holds a single drawn line and is what is created when raw point data is passed to the library. It requires the only dependency of this module: the sqrt function from the standard math library included with Python.

from math import sqrt

class Path:
    def __init__(self, points=()):
        # lens holds the cumulative stroke length at each point
        self.lens = []
        self.points = []
        self.normalized = False

        for p in points:
            self.AddPoint(p)

        if len(self.points):
            self.Normalize()

    def __len__(self):
        return len(self.points)

    def AddPoint(self, point):
        self.points.append(point)

        # track the cumulative length of the stroke so far
        if len(self.lens):
            l = self.Distance(self.points[-2], self.points[-1])
            self.lens.append(self.lens[-1] + l)
        else:
            self.lens.append(0)

    def Distance(self, a, b):
        # Euclidean distance between two points of any dimensionality
        return sqrt(sum([pow(a[i] - b[i], 2) for i in range(len(a))]))

    def DistancePath(self, path):
        # total point-by-point distance between two equal-length paths
        if len(self) == len(path):
            return sum(self.Distance(self.points[i], path.points[i]) for i in range(len(self)))

    def Interpolate(self, array, index, ratio):
        return float(array[index]) + float(array[index + 1] - array[index]) * ratio

    def Normalize(self):
        if not self.normalized:
            self.normalized = True

            # resample the path down to normLen evenly spaced points
            normLen = 10
            newPoints = []
            totalLen = self.lens[-1]

            mx_min = min([x for x, y in self.points])
            mx_max = max([x for x, y in self.points])
            my_min = min([y for x, y in self.points])
            my_max = max([y for x, y in self.points])
            mx_diff = mx_max - mx_min
            my_diff = my_max - my_min

            # avoid division by zero on perfectly flat strokes
            if my_diff == 0:
                proportion = 10000000000000.0
            else:
                proportion = abs(float(mx_diff) / float(my_diff))

            curIndex = 0
            for i in range(normLen):
                targetLen = i * totalLen / normLen

                # find the segment containing targetLen and interpolate along it
                while self.lens[curIndex + 1] < targetLen:
                    curIndex += 1

                ratio = (targetLen - self.lens[curIndex]) / (self.lens[curIndex + 1] - self.lens[curIndex])
                x = self.Interpolate([px for px, py in self.points], curIndex, ratio)
                y = self.Interpolate([py for px, py in self.points], curIndex, ratio)

                # collapse nearly one-dimensional strokes onto a single axis
                if proportion < 0.2:
                    newPoints.append((0, (y - my_min) / my_diff))
                elif proportion > 5:
                    newPoints.append(((x - mx_min) / mx_diff, 0))
                else:
                    newPoints.append(((x - mx_min) / mx_diff, (y - my_min) / my_diff))

            self.points = newPoints

    def PosValue(self, x, y):
        # which quadrant of the normalized bounding box the point falls in
        if x < 0.5 and y < 0.5:
            return locals.StartTopLeft
        if x >= 0.5 and y < 0.5:
            return locals.StartTopRight
        if x < 0.5 and y >= 0.5:
            return locals.StartBottomLeft
        if x >= 0.5 and y >= 0.5:
            return locals.StartBottomRight

    def MatchConstraint(self, c):
        # check this path's start/end quadrants against constraint bitmask c
        if not c:
            return True
        if not self.normalized:
            return False

        startValue = self.PosValue(self.points[0][0], self.points[0][1])
        # end-quadrant flags live in the next four bits of the mask
        endValue = self.PosValue(self.points[-1][0], self.points[-1][1]) << 4

        # no start/end bit may fall outside the allowed set
        return ((startValue | endValue) & ~c) == 0

As you can see, the core algorithm requires that all paths are 'normalized' so that each contains a fixed number of points (normLen) describing the generalised shape of the path. The Euclidean distance from each descriptive point to the corresponding point in each of a library of candidate lines is then computed, and the line with the closest match is likely to be the one the user was trying to draw. If you want more complex lines you could increase normLen, but ten points should be fine for most letter/gesture detection with hand-drawn lines.

The next piece of the puzzle is the following 'locals' class, which holds a bunch of static variables and a default, simplistic alphabet of gestures corresponding to letters. It's a rough approximation of the Palm 'Graffiti' alphabet:

class locals:
    StartTopLeft = 1 # starting in the top left corner is allowed
    StartTopRight = 1 << 1
    StartBottomLeft = 1 << 2
    StartBottomRight = 1 << 3
    StartAny = StartTopLeft | StartTopRight | StartBottomLeft | StartBottomRight

    EndTopLeft = 1 << 4
    EndTopRight = 1 << 5
    EndBottomLeft = 1 << 6
    EndBottomRight = 1 << 7
    EndAny = EndTopLeft | EndTopRight | EndBottomLeft | EndBottomRight
    Any = StartAny | EndAny

    capitals = [
        ["A", Path(zip([0, 5, 10], [10, 0, 10])), StartBottomLeft | EndBottomRight],
        ["B", Path(zip([0, 0, 0, 3, 3, 0], [0, 10, 7, 7, 10, 10])), StartTopLeft | EndBottomLeft],
        ["C", Path(zip([10, 0, 0, 10], [0, 0, 10, 10])), StartTopRight | EndBottomRight],
        ["D", Path(zip([0, 0, 10, 10, 0], [10, 0, 0, 10, 10])), StartBottomLeft | EndBottomLeft],
        ["E", Path(zip([10, 0, 0, 3, 0, 0, 10], [0, 0, 5, 5, 5, 10, 10])), StartTopRight | EndBottomRight],
        ["F", Path(zip([10, 0, 0], [0, 0, 10])), StartTopRight | EndBottomLeft],
        ["G", Path(zip([10, 0, 0, 10, 10, 5], [0, 0, 10, 10, 5, 5])), StartTopRight | EndAny],
        ["H", Path(zip([0, 0, 0, 3, 3], [0, 10, 7, 7, 10])), StartTopLeft | EndBottomRight],
        ["I", Path(zip([5, 5], [0, 10])), StartTopLeft | EndBottomLeft],
        ["J", Path(zip([10, 10, 0], [0, 10, 10])), StartTopRight | EndBottomLeft | EndTopLeft],
        ["K", Path(zip([10, 0, 0, 10], [0, 10, 0, 10])), StartTopRight | EndBottomRight],
        ["L", Path(zip([0, 0, 10], [0, 10, 10])), StartTopLeft | EndBottomRight],
        ["M", Path(zip([0, 0, 5, 10, 10], [10, 0, 5, 0, 10])), StartBottomLeft | EndBottomRight],
        ["N", Path(zip([0, 0, 10, 10], [10, 0, 10, 0])), StartBottomLeft | EndTopRight],
        ["O", Path(zip([5, 0, 0, 10, 10, 5], [0, 0, 10, 10, 0, 0])), StartTopLeft | StartTopRight | EndTopLeft | EndTopRight],
        ["P", Path(zip([0, 0, 0, 10, 10, 0], [0, 10, 0, 0, 5, 5])), StartBottomLeft | EndAny],
        ["P2", Path(zip([0, 0, 10, 10, 0], [10, 0, 0, 5, 5])), StartBottomLeft | EndAny],
        ["Q", Path(zip([4, 0, 0, 4, 4], [0, 0, 4, 4, 7])), None],
        ["R", Path(zip([0, 0, 0, 10, 10, 0, 10], [0, 10, 0, 0, 5, 5, 10])), StartBottomLeft | EndAny],
        ["R2", Path(zip([0, 0, 10, 10, 0, 10], [10, 0, 0, 5, 5, 10])), StartBottomLeft | EndBottomRight],
        ["S", Path(zip([10, 0, 0, 10, 10, 0], [0, 2, 4, 6, 8, 10])), StartTopRight | EndBottomLeft],
        ["T", Path(zip([0, 8, 8], [0, 0, 10])), StartTopLeft | EndBottomRight],
        ["U", Path(zip([0, 5, 10], [0, 10, 0])), StartTopLeft | EndTopRight],
        ["U2", Path(zip([0, 0, 10, 10], [0, 10, 10, 0])), StartTopLeft | EndTopRight],
        ["V", Path(zip([10, 5, 0], [0, 10, 0])), StartTopLeft | EndTopRight],
        ["V2", Path(zip([0, 3, 6, 10], [0, 10, 0, 0])), StartTopLeft | EndTopRight],
        ["W", Path(zip([0, 0, 5, 10, 10], [0, 10, 5, 10, 0])), StartTopLeft | EndTopRight],
        ["X", Path(zip([0, 10, 10, 0], [10, 0, 10, 0])), StartBottomLeft | EndTopLeft],
        ["Y", Path(zip([0, 0, 5, 5, 5, 5, 5, 10], [0, 5, 5, 0, 5, 10, 5, 5])), StartTopLeft | EndAny],
        ["Z", Path(zip([0, 10, 0, 10], [0, 0, 10, 10])), StartTopLeft | EndBottomRight],
    ]

Finally, the class below implements the stroke-by-stroke, point-by-point search in Python:

class OneStrokeGestureException(Exception):
    pass

class OneStrokeGesture:
    """
        Can detect single-stroke pen gestures.
        Pass in the strokes variable to supply your own strokes to be recognised.

        Adapted by Chris McCormick <chris@mccormick.cx> from the Javascript code at:
        http://blog.monstuff.com/archives/images/JS-graffiti.html
    """

    def __init__(self, strokes=locals.capitals):
        # copy the stroke library so Learn() can't mutate the shared default
        self.strokes = list(strokes)
        self.path = None
        self.down = False

    def PenDown(self, pos):
        # begin a new stroke
        self.path = Path()
        self.path.AddPoint(pos)
        self.down = True

    def PenTo(self, pos):
        # extend the current stroke
        self.path.AddPoint(pos)

    def PenUp(self, pos):
        # finish the stroke and run recognition on it
        self.down = False
        if len(self.path) > 1:
            self.path.AddPoint(pos)
            return self.FindBestMatch()

    def Learn(self, id, points):
        # add a user-drawn path to the library, matchable from any quadrant
        self.strokes += [[id, Path(points), locals.Any]]

    def FindGesture(self, points):
        # one-shot recognition of a complete list of points
        self.path = Path(points)
        return self.FindBestMatch()

    def FindBestMatch(self):
        if self.path.points[0] != self.path.points[1]:
            self.path.Normalize()
            # score every stroke whose start/end constraint matches,
            # sorted from closest match to furthest
            results = [(s[0], self.path.DistancePath(s[1])) for s in self.strokes if self.path.MatchConstraint(s[2])]
            results.sort(key=lambda r: r[1])
            return results
        else:
            raise OneStrokeGestureException("First two points of the path are identical")

This module returns a list of (name, distance) tuples for every stroke in the library whose constraint matched, sorted from closest to furthest; generally the first item will be the one you're looking for. You run the detection algorithm on the line the user has drawn by calling FindGesture with the list of points that make up the line. You can also call the PenDown, PenTo, and PenUp methods to have the class build the path incrementally as the user draws.

Together, the above classes make up the gesture recognition code. Paste them into a file called OneStrokeGesture.py and you are ready to begin detecting gestures. Here is a bit of test code which uses the Pygame library to let you try the recognition against the Graffiti-style alphabet of drawn symbols:

from OneStrokeGesture import OneStrokeGesture
import sys, pygame

pygame.init()

screen = pygame.display.set_mode((320, 240))
g = OneStrokeGesture()
paths = []

while True:
    for event in pygame.event.get():
        if event.type == pygame.QUIT:
            sys.exit()

        if event.type == pygame.MOUSEBUTTONDOWN:
            g.PenDown(event.pos)
            paths.append(g.path.points)
        if event.type == pygame.MOUSEMOTION:
            # only record motion while the pen/button is down
            if g.down:
                g.PenTo(event.pos)
        if event.type == pygame.MOUSEBUTTONUP:
            # prints the sorted list of (name, distance) matches
            print g.PenUp(event.pos)

    screen.fill([255, 255, 255])
    # redraw every stroke made so far
    for p in paths:
        if len(p) > 1:
            pygame.draw.aalines(screen, [0, 0, 0], False, p, 1)
    pygame.display.flip()

I added the Learn method, which allows you to take existing paths drawn by the user and add them to the library of known paths. I have found this to be really effective: the algorithm finds already-learned paths with a high degree of accuracy, much higher in some cases than with the preset Graffiti alphabet. This next bit of test code makes you teach five letters (a, b, c, d, e) three times each before starting the recognition process. You'll want to run it from the command line, since it prints out the letter it wants to learn next. Once all the letters are learned, the demo goes into recognition mode and you can freely draw strokes, which will be matched against those you input earlier.

import sys, pygame
from random import shuffle

from OneStrokeGesture import OneStrokeGesture

pygame.init()

screen = pygame.display.set_mode((320, 240))
paths = []
# each of the letters a-e, three times over
letters = [chr(c) for c in range(ord('a'), ord('f'))] * 3
g = OneStrokeGesture([])  # start with an empty stroke library

print "learning:", letters[0]

while True:
    for event in pygame.event.get():
        if event.type == pygame.QUIT:
            sys.exit()

        if event.type == pygame.MOUSEBUTTONDOWN:
            g.PenDown(event.pos)
            paths.append(g.path.points)
        if event.type == pygame.MOUSEMOTION:
            if g.down:
                g.PenTo(event.pos)
        if event.type == pygame.MOUSEBUTTONUP:
            if len(letters):
                g.down = False
                # include the final point before learning the stroke
                g.path.AddPoint(event.pos)
                g.Learn(letters[0], g.path.points)
                letters = letters[1:]
                if len(letters):
                    print "learning:", letters[0]
            else:
                print "found:", g.PenUp(event.pos)

    screen.fill([255, 255, 255])
    for p in paths:
        if len(p) > 1:
            pygame.draw.aalines(screen, [0, 0, 0], False, p, 1)
    pygame.display.flip()

Have fun!

July 5, 2008

Futuristic Building #1

I recently bought a 'digital notepad'. It's a device that looks kind of like a clipboard that you can clip regular paper notepads into. When you draw on the paper with a special pen, your pen strokes are digitally recorded. Later you can hook the thing up to a computer, copy the vectorised notes across, and get your hand-drawings in a nice, flexible vector format.

Futuristic Building #2

This one is made by a company called Laser here in Australia, but pretty much every digital notepad on the market is made by one Taiwanese company, Waltop, who sell them to other companies for rebranding. The device runs fine under Debian in both USB disk mode and tablet mode. Here is a link to a blog post with a heap of information about the digital notepad and running it under GNU/Linux (there is lots of info in the comments too):

http://eddie.niese.net/20071129/new-digital-notepad-gadget/

Futuristic Building #3

Modes

The notepad operates in two distinct modes. The first is as a regular notepad that internally digitises everything you write in ink while it's not connected to your computer. This works reasonably well, although I have found that very occasionally, maybe one to three times per page, the pen will miss a stroke. I think this is due to the angle I write at, or not pressing the pen down hard enough on a light stroke. For what I am using the notepad for it's no problem, since I import the notes as SVG files and then modify them in tablet mode later. I use a small Python script by Jorik Blaas to convert the notes from the .top file format they are stored in to SVG format. You can copy the .top files from the device to your computer when it's connected via the USB cable. Click here to download the Python script (I got it from the URL listed above).

Futuristic Building #4

Which brings me to the other mode: tablet mode. When the notepad is connected to a computer via USB it acts like a USB disk that you can copy notes off, but it simultaneously behaves like a primitive graphics tablet. I say primitive because it has no pressure sensitivity and really just emulates a mouse: as you move the pen about on the surface the mouse cursor follows, and when you press the pen down the mouse clicks. So far this has been fine for the little sketches I've been making.

Futuristic Building #5

Other stuff

The box also came with a black portfolio to carry the notepad around in, a USB cable, batteries, replacement ink cartridges, a plastic-tipped cartridge for tablet mode, and a bunch of Windows software that I did not use. The whole package cost me about $160 (AU) plus twenty-five bucks or so postage.

Futuristic Building #6

I am pretty excited about this device and it's been fun playing with it over the last week or so. One of the difficult things about being a solo indie games programmer is not having graphics for your games. I have always found the hand-drawn aesthetic quite endearing in video games, so I might have a shot at sketching some characters and backgrounds.

Futuristic Building #7

I have a bunch of other ideas for applications using tablet/touchscreen, so hopefully I'll get around to coding up some of those. Exciting times!

Futuristic Building #8

June 26, 2008

UfoLiberation is a game I wrote a couple of years ago in a one-week game-jam-like session with my friend Crispin (he made a Missile Command clone that totally ruled). After that week I spent a few months of spare time polishing it up, adding menus, a tutorial, icons, and an installer. My aim was to sell this as a casual kids' game, but I don't think I'm ever going to get around to that, so here it is. At the moment there's just a Windows installer, but I plan on adding a Mac OS X one sometime.

My aim was to make a game which combines the mechanics of two of my favorite arcade games: Choplifter and Joust. In Choplifter the core mechanic revolves around landing on the ground to rescue people whilst enemies attack you. In Joust the mechanic is to try to land on enemies from above while avoiding being landed on yourself. I felt these two mechanics would nicely balance each other: if you need to fly low to pick people up, and at the same time try to hit enemy UFOs from above, there is a risk/reward tradeoff the player must make.

I find it most fun to play with a USB gamepad, but you can also use a joystick, mouse, or keyboard.

Click here to download it.

Enjoy!

June 10, 2008

Sfxr is an inspiring little program which generates old-school sound effects for use in games or music, or whatever. It's one of those delightful little nuggets of software that does exactly what it says it will and nothing more. It's great fun, looks the part, and its source code fits in a single C++ file! Finally, it's cross-platform, running on all the major desktops. Wonderful stuff.

http://www.ludumdare.com/compo/2007/12/13/sfxr-sound-effects-for-all/

May 28, 2008

Yesterday was the project presentation for the final unit in my Computer Science degree, and the last bit of actual work. As such, I guess this would be a prudent time to post a bit of a life update.

STUDIES

Well, that appears to be it; my degree is over. Results come back in three weeks or so, but overall I'm pretty happy with how things went during the course of my part time studies. The prospect of having less stress and more time to do more interesting things is very exciting right now.

MARRIAGE

Probably everyone I know will know this by now, but it's worth documenting here anyway. In July, during the Freeplay conference in Melbourne, on a bridge across the Yarra, I asked Moose to marry me. To my lucky surprise she said yes! To celebrate we had even more to drink and I somehow chipped a tooth. We had a mostly-family wedding in March and we're both really glad that ordeal is over! The wedding I mean, not the marriage. It was nice though, and we got lots of sweet photos and caught up with lots of people.

TRAVELS

Later this year we are heading to Europe and the USA to do some campervanning and greyhound/motel hopping respectively. I'm going to play some gigs and stuff and try to drop in on a couple of interesting conferences. Next year our plan is to stop in London to work for a while. Should be fun!

MUSIC

Since Fenris moved to Melbourne I am now playing solo gigs, except when he's in Perth or I am in Melbourne. I've had a few solo gigs now, two at Shape and one at the Velvet Lounge, and they all went off without a hitch. At the first Shape one there were people dancing and everything! The music is quite different from what Fenris and I do together, since I am using all software (Pure Data) and MIDI controllers whilst chr+fen is mostly about game consoles and delay/distortion pedals. Later in the year, when Moose and I travel to Europe and the USA, I'm planning on playing some gigs.

WORK

Since the beginning of the year I have been working full time on games technology, which is the fulfilment of a sort of dream of mine.

I am working 3 days per week for Interzone, doing web integration for their MMO. That is nice because it's a regular paycheque, the work is interesting, and I get to leave it at the office when it's home time. It's kind of strange working in an office again after working for myself from home for so many years, but it hasn't been bad, and the social aspect is actually quite nice.

I am working 2 days per week on the ABC/Screenwest/GWE contract with my pal Jessee from Studio Robot. This game is lots of fun to work on and utilises some AJAX technology that I wrote a few years ago. It's basically a Python web server which maintains a game state and communicates changes in that state to multiple connected web clients - something that's a bit difficult with traditional web servers. It's really nice to work with an artist of Jessee's calibre, and it makes the programming bits way more fun when you get to see something cool as the outcome. We just sent them the 4th milestone out of 8, so the project is well on its way.

Because of the full-time work I don't get as much time to work on Pixelbox these days as I'd like to. Luckily, though, Jacob and Patrick are there to share the small workload. Occasionally, maybe once a quarter, I drive out to Wangarra to physically install a new machine or swap a hard drive. I'm due for one of those visits quite soon, actually.

There are a couple of other small bits of client work I am neglecting pretty badly at the moment. Sometimes I manage to squeeze in an hour or two somewhere, but I guess these will ramp up once I get a bit more time. One of these is a quite exciting art project which I'll post about here when it's finished.

GAMES

Since I'm working on games stuff full time I don't get a lot of time to work on my own games these days. I am looking forward to later in the year, when I will get to do my own stuff again. I finished a small web-based Breakout clone which I called Bricker, but that was mostly just to keep myself thinking about games and web tech. I have two different indie game ideas on the boil at the moment, but I won't have time to work on them for at least a month or two. I also need to get UfoLiberation up on the website, either as a free download or with some kind of payment gateway in place. Though once again, who knows when that will happen! The code is all finished, and the Windows binary is even built; I just need to do the last 1% to get it out there.

So for the most part, that's what is going on in my life.