Friday, February 26, 2010

Migrating

Hi all.

I'm migrating my blog from here over to http://rorytulk.wordpress.com, soon to become www.rorytulk.com. Please adjust accordingly.

Sunday, February 7, 2010

Software Testing Techniques, an Empirical Approach

Proper software testing regimes are a cornerstone of effective software engineering. Progress has been made in teaching students sound testing techniques, but there is still room for improvement. My master's research supervisor and I conducted a study designed to empirically determine the difference in ability between student and professional software testers, and to elicit from the experts behaviours or techniques which may be used to enhance the undergraduate curriculum.

Our experimental setup consisted of in-lab observational sessions where subjects wrote thorough suites of JUnit tests for sample software we'd created. Subjects were drawn from the University of Toronto’s undergraduate computer science student body and professional developers from the Greater Toronto Area. The test code and video logs created during these sessions were examined for trends present in the student and professional groups.

Our intuition going into the study was that professionals would find more defects with their test suites, an advantage stemming from some metric such as number of tests written, lines per test, code coverage per test, etc. Analysis of these metrics did not confirm these hypotheses: students and professionals performed equally well in terms of number of bugs found. However, student code contained more defects and, more importantly, the types of bugs found differed strikingly between the two groups.

Bugs in the sample code were broken down into two categories: stateless and stateful. A stateless bug is uncovered by passing invalid values to a method, which then returns invalid results or throws an exception. A stateful bug occurs when a method call corrupts the object's state, so that subsequent calls perform incorrectly. Students found a mix of stateless and stateful bugs in the code, with a strong majority being stateless. The professionals sampled found strictly stateful defects. There are several possible explanations for this effect, although no evidence to support one over the others is immediately apparent.

The full text can be found here.

Monday, February 1, 2010

Left-Fold for Bash

I'd like to share a recent bash programming experience I've had. It began while processing the reams of data generated in my M.Sc. research study. I was producing long lists of frequency data in text files, and had the need to sum up all the lines in these tables. This is of course a trivial problem in many languages, and I had a wide array of options available to me:
  • I could write another ant task to do the summing (this required more work than I was willing to invest, as ant isn't really suited for computational tasks)
  • I could write a python script that took the contents of the specified file and returned the sum. I didn't really like this approach because it involved yet another file in my build process, invoked from the ant task. I always sort of thought that if you were forced to use , you were performing a task beyond the scope of your tool.
  • I could skip the generation of the table and go straight to the sum. A few of the tables were created with XSLT, so this was a valid option. However, my XSLT programming ability is very much trial-and-error based, so I thought this might take some time. Also, some of the other tables, created with grep, would not be affected.
  • Write it in shell. I liked this idea. I really liked the feel of being able to just pipe something to 'sum' and have the sum returned. So this is what I chose.
My first version looked like this:
#!/bin/bash
# Sum the integers read from stdin, one per line.

sum=0
while read -r line
do
    sum=$((sum + line))
done

echo "$sum"
exit 0
It did the trick quite well. I had suggestions from office mates for the following alternatives:
Using python (courtesy Aran Donohue):

python -c "import sys; print(sum(float(x) for x in sys.stdin.read().split()))"

Using tr (courtesy Zak Kincaid):

cat numbers | tr '\n' '+'|head --bytes='-1'|bc
(Note that this version doesn't quite work: bc throws a syntax error, and I'm not exactly sure why.)
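
One possible cause, offered as a guess rather than a diagnosis, is that chopping the last byte leaves the expression with no terminating newline, which some versions of bc reject. A minimal sketch of a variant that sidesteps this uses paste to join the lines with '+', since paste ends its output with a newline. The function names join_plus and sum_lines are made up for this example.

```shell
#!/bin/bash
# join_plus: join the lines on stdin into one "a+b+c" expression.
# paste -s writes all input lines as a single line, -d+ sets the separator,
# and the output ends with a newline.
join_plus() {
    paste -sd+ -
}

# sum_lines: hand the joined expression to bc for evaluation.
sum_lines() {
    join_plus | bc
}

# Example: printf '1\n2\n3\n' | sum_lines
```

Because bc does the arithmetic, this version also copes with decimal values, which shell arithmetic does not.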


Using my original design, I realized that if I abstracted out the operator, I could use this script to perform any 2-operand function I wished on the list, essentially creating a basic left fold:

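#!/bin/bash
# Left fold over stdin: combine each line with the running result,
# using the operator passed as the first argument (defaults to +).
op=""
if [ "$1" == "" ]; then
    op="+"
else
    op=$1
fi

sum=0
while read -r line
do
    sum=$(($sum$op$line))
done

echo "$sum"
exit 0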

I've done a little bit of error checking, to see if the parameter supplied is blank, and if so replace it with + by default.

I've only seen this work for '+' and '-'. If I use '*', it breaks because it replaces the wildcard/multiplication character with the listing of the current directory.
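
The '*' failure is most likely pathname expansion happening on the command line that invokes the script, before the script ever sees its argument. That reading is an assumption based on how the shell normally treats an unquoted *. If so, quoting the operator at the call site should be enough. A second wrinkle is that a product folded from a starting value of 0 is always 0, so the sketch below also takes the initial accumulator as a parameter. The function name fold_lines and its argument order are invented for this example.

```shell
#!/bin/bash
# fold_lines OP INIT: left-fold the integers on stdin with a binary
# arithmetic operator. OP defaults to +, INIT defaults to 0.
fold_lines() {
    local op="${1:-+}"     # operator to apply between accumulator and line
    local acc="${2:-0}"    # initial accumulator value
    local line
    while read -r line; do
        [ -z "$line" ] && continue     # skip blank lines
        # $op is substituted first, then the expression is evaluated;
        # no pathname expansion takes place inside $(( )).
        acc=$(( acc $op line ))
    done
    echo "$acc"
}

# Quote the operator so the calling shell passes it through untouched.
printf '2\n3\n4\n' | fold_lines '*' 1    # product of the lines
printf '1\n2\n3\n' | fold_lines          # sum of the lines
```

Since the loop uses shell arithmetic, this only handles integers.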

Tuesday, November 3, 2009

Unifying Visual Theme

A discussion this morning over coffee has prompted me to again consider creating a personal web presence, more than just a Facebook or Twitter page: a public, hosted site with data about my work and research. Inevitably, whenever I begin to consider this, I get hung up on the visual theme for said space. In a perfect world, I would create a site in which the visuals closely match my personality, in a cool-hip sort of way, are cleanly unified across all the pages, and could even integrate well into my local operating system's visuals (this doesn't serve any particular purpose, the idea of total uniformity just seems sort of cool to me). Anyway, I'm looking at hosting and gnome-look.org for inspiration right now. Probably won't lead me anywhere :P

Tuesday, October 27, 2009

Multi-User SVN on a Single Debian Account

Just finished setting up a multi-user subversion repository, based on the instructions found here. The catch here is that the SVN server is running on a single server-side user account, and the multi-user support is done by multiplexing on the incoming ssh rsa key. Every key gets its own artificial user name, so we can track who has been doing what. The process was fairly straightforward, just back up your .ssh directory in case you bork something, like I did :P
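
For readers who haven't seen this kind of setup, here is a rough sketch of how per-key user names are commonly wired up. It assumes the usual svn+ssh approach, in which each line of the shared account's ~/.ssh/authorized_keys forces svnserve to run in tunnel mode with a fixed --tunnel-user. The instructions I followed may differ in the details. The helper name svn_key_entry, the user name alice, and the key text are placeholders.

```shell
#!/bin/bash
# svn_key_entry USER PUBKEY: print one authorized_keys line that forces
# svnserve to run in tunnel mode with USER as the recorded commit author.
svn_key_entry() {
    local user="$1"      # artificial user name to attach to this key
    local pubkey="$2"    # the collaborator's public key, as a single line
    printf 'command="svnserve -t --tunnel-user=%s",no-port-forwarding,no-agent-forwarding,no-X11-forwarding,no-pty %s\n' \
        "$user" "$pubkey"
}

# Back up ~/.ssh first, then append one entry per collaborator, e.g.:
#   cp -a ~/.ssh ~/.ssh.bak
#   svn_key_entry alice "$(cat alice.pub)" >> ~/.ssh/authorized_keys
svn_key_entry alice "ssh-rsa AAAA...placeholder alice@example"
```

The restrictive options after the command keep a key from being used for anything other than talking to svnserve.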

Wednesday, September 16, 2009

Algonquin Park Map

After doing some looking around for an electronic map of the canoe routes in Algonquin Provincial Park, I was only able to come up with the following:

http://www.cs.cmu.edu/~crpalmer/algonquin/map.html

It's pretty old, and the site hosting it looks like it may be somewhat unreliable, so I duplicated the file and made it available here for posterity:

http://www.cs.toronto.edu/~rtulk/alg2-1.pdf

Friday, July 3, 2009

In Praise of the TS&CC

This past weekend, while enjoying a leisurely walk along the waterfront between High Park and Ontario Place, I came across the Toronto Sail & Canoe Club (TS&CC). I headed back there, by bike, last night, in hopes of getting some more information about the club. Within minutes of arrival I was on the crew list, and had spoken to a few skippers who were looking for crew. I ended up on a Beneteau First 235, which is a nice little racer. After 45 minutes or so of reconfiguring the race course, we were off! Just in time for a storm front to get close enough to steal away all our wind! The race was uneventful, but relaxing, and I got to meet some new people, so it was an overall positive experience. I think I'll be going back next week for more of the same.

Oh, did I also mention that a Crew membership at the TS&CC is extremely affordable? Well, it is!