Thursday, January 22, 2009

Safe Server-Side Unit Testing

I like build systems :) My first experience with integrating a vcs, bug tracker, and ant was a very fulfilling experience, and it only got better when we added things like EMMA to give developers a feel of how their project was progressing. So, you can understand why my ears perked up when, during a conversation about the SVN setup in Dr. Project/Basie, Greg mentioned that they had tried to incorporate a continuous integration routine into Dr. Project, but failed, citing complexities and difficulty with the administration. Now, being the cinical, cold-hearted person that I am, my first thought was, "You clearly need better administrators", but then I remembered trying to do something similar with VMWare, and how rediculously hard it was to get it working, and once it was, keeping it there was almost impossible, so I held my tongue.

The basic premise here is to have the server which runs the Dr. Project/Basie installation also manage a system of virtual machines. When code is checked into the SVN repository for a given project, a virtual machine is spawned. Inside this VM, we download a copy of latest revision from the SVN, build it, run the unit tests, generate the reports, publish them, then kill the VM. Obviously, we can't do the build and test by just forking a process, without the VM, because that would allow the project groups to run arbitrary code on the Dr. Project web server, which is just about the biggest security hole I can think of. So, the goal here is to utilize the virtual machines to completely isolate the code from the web server, so that the tests are run in a completely safe environment, and at the same time providing benefits like strictly reproducible execution environments (every unit test starts from the same vm snapshot).

To accomplish these goals, we're looking at using the SnowFlock system. All vm's start from a master image, clones are quick to create (~100msecs), we can instantiate many, many clones at the same time, and the whole thing is wrapped up in a nice little Python API.

It will be interesting to see if this works for Dr. Project/Basie's needs, and if it does, I'd like to see if it could be extended to do cluster testing for larger distributed systems projects. The ease and speed of creating a new clone vm means that for each test, a small cluster of machines could be created, the test run, and torn down. I'm not sure if a tool like this exists already, it sounds like a fairly straightforward idea, but should be fun to investigate either way.

1 comment:

Karen Reid said...

I'd like to talk with you sometime, because this is exactly what I want for OLM!

I think one of the most interesting questions is just how much you want to protect against errors or malice. We need to control how much information gets out of the server, but I think it is a an open question how much effort needs to be put into cleaning between runs.

Karen Reid