Tuesday, November 3, 2009

Unifying Visual Theme

A discussion this morning over coffee has prompted me to again consider creating a personal web presence, something more than just a Facebook or Twitter page: a public, hosted site with data about my work and research. Inevitably, whenever I begin to consider this, I get hung up on the visual theme for said space. In a perfect world, I would create a site whose visuals closely match my personality (in a cool-hip sort of way), are cleanly unified across all the pages, and could even integrate well with my local operating system's visuals (this doesn't serve any particular purpose; the idea of total uniformity just seems sort of cool to me). Anyway, looking at hosting and gnome-look.org for inspiration right now. Probably won't lead me anywhere :P

Tuesday, October 27, 2009

Multi-User SVN on a Single Debian Account

Just finished setting up a multi-user Subversion repository, based on the instructions found here. The catch is that the SVN server runs under a single server-side user account, and the multi-user support is done by multiplexing on the incoming SSH RSA key. Every key gets its own artificial user name, so we can track who has been doing what. The process was fairly straightforward; just back up your .ssh directory in case you bork something, like I did :P
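
For my own future reference, the heart of the setup is the authorized_keys file on that one account. It ends up looking roughly like this (the key data, names, and repository path below are placeholders, not my actual values):

command="svnserve -t --tunnel-user=alice -r /home/svn/repos",no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty ssh-rsa AAAA...alice-key... alice@laptop
command="svnserve -t --tunnel-user=bob -r /home/svn/repos",no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty ssh-rsa AAAA...bob-key... bob@desktop

Each incoming key is forced to run svnserve in tunnel mode with its own --tunnel-user, so every commit gets attributed to the right artificial user even though everyone is logging into the same Unix account.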

Wednesday, September 16, 2009

Algonquin Park Map

After doing some looking around for an electronic map of the canoe routes in Algonquin Provincial Park, I was only able to come up with the following:

http://www.cs.cmu.edu/~crpalmer/algonquin/map.html

It's pretty old, and the site hosting it looks like it may be somewhat unreliable, so I duplicated the file and made it available here for posterity:

http://www.cs.toronto.edu/~rtulk/alg2-1.pdf

Friday, July 3, 2009

In Praise of the TS&CC

This past weekend, while enjoying a leisurely walk along the waterfront between High Park and Ontario Place, I came across the Toronto Sail & Canoe Club (TS&CC). I headed back there by bike last night, in hopes of getting some more information about the club. Within minutes of arrival I was on the crew list, and had spoken to a few skippers who were looking for crew. I ended up on a Beneteau First 235, which is a nice little racer. After 45 minutes or so of reconfiguring the race course, we were off! Just in time for the storm front to get close enough to steal away all our wind! The race was uneventful, but relaxing, and I got to meet some new people, so it was an overall positive experience. I think I'll be going back next week for more of the same.

Oh, did I also mention that a Crew membership at the TS&CC is extremely affordable? Well, it is!

Monday, June 29, 2009

Creating Defects

I'm in the process of inserting defects into three pieces of software I've written, for the purpose of creating testing samples for the study. This process is a lot more painful than I would have anticipated. I expect it is due to being trained for so long at removing defects, not deliberately creating them. Every time I break something, I realize the cool test case that will fail because of it, and then I feel bad. Also, trying to create non-obvious errors, or defects that are a little more human than those created by my Java mutator, is challenging. For example, consider a class constructor like the following:


public MyAccount(int account, int initialBalance)
{
    m_accountNum = account;
    m_balance = initialBalance;
}

The mutator would do something like this:

public MyAccount(int account, int initialBalance)
{
    m_accountNum = account;
    m_balance = ++initialBalance;
}

following its set of one-line operator rules. I think a more 'human' error would be something like this:


public MyAccount(int account, int initialBalance)
{
    m_accountNum = initialBalance;
    m_balance = account;
}

or this:

public MyAccount(int account, int initialBalance)
{
    m_accountNum = account;
}

All still valid Java programs, but definitely erroneous, and certainly slips that an overworked developer could make. What sets them apart from some of the mutation bugs is that many of the mutation rules require more effort on the part of the developer, rather than less (i.e. a ++ at the end of a variable manipulation statement is more likely to be omitted than included by accident).

Friday, June 19, 2009

Think Aloud and Coding Analysis

Chris pointed me at a paper by Mayrhauser and Vans, "Identification of Dynamic Comprehension Processes During Large Scale Maintenance", that seems fairly relevant, in that they are using methods that align with mine so far. They've used a Think Aloud process and recorded participant actions while performing a maintenance change request. The activity took approx. 2 hours per subject (11 subjects; I think I can do better). Video and audio recordings were transcribed and coded. The authors posit that a) coding should be based on categories defined a priori (before the video/audio is recorded), and that b) Think Aloud does not work out of phase with the change action (thinking aloud after doing the task). This concerns me, as a) I don't have a set of codes yet (I could certainly come up with some rather quickly, but they would be without significant justification), and b) I kind of liked the idea of the post-task interview.
These concerns aside, the data analysis in this paper is excellent. The authors code all the transcripts, and derive a set of patterns that the participants take while performing the tasks. These are formulated as finite state machines, in which each state represents a code. This, to me, validates their choice of codes. This may be a good model to follow for at least part of my analysis procedure.

Wednesday, June 17, 2009

Nielsen's Heuristics for Software Testing

I had a quick conversation today with Dustin, of DGP and MSR fame, and he asked me if there was anything similar to Nielsen's heuristics for usability that might be used when looking for errors in code. Nothing jumped to my mind, but that certainly doesn't mean that nothing in fact exists. However, the first 10 hits for "software testing heuristics", "nielsen heuristics code", and "nielsen heuristics software testing errors" didn't contain what I was looking for, either. It makes me think that the imagined output from my research study could be a valid contribution to knowledge. I think the list would probably contain things like the following (a rough sketch of a few of these as actual tests appears after the list):

  • always try negative numbered parameters
  • always try null values
  • how well does the .equals() method work?
  • add-remove-add to/from the collection, is it the same semantically?
  • always check date-based roll-overs
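
To make that concrete, here's a rough sketch of what a few of those heuristics could look like as JUnit tests. The Account and AccountRegistry classes and their methods are invented purely for illustration:

import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertFalse;
import static org.junit.Assert.assertTrue;
import org.junit.Test;

public class AccountHeuristicsTest
{
    // Heuristic: always try negative numbered parameters
    @Test(expected = IllegalArgumentException.class)
    public void negativeInitialBalanceIsRejected()
    {
        new Account(42, -100);
    }

    // Heuristic: always try null values
    @Test(expected = IllegalArgumentException.class)
    public void nullOwnerIsRejected()
    {
        new Account(42, 100).setOwner(null);
    }

    // Heuristic: how well does the .equals() method work?
    @Test
    public void equalsIsSymmetricAndNullSafe()
    {
        Account a = new Account(42, 100);
        Account b = new Account(42, 100);
        assertTrue(a.equals(b) && b.equals(a));
        assertFalse(a.equals(null));
    }

    // Heuristic: add-remove-add to/from the collection, is it the same semantically?
    @Test
    public void addRemoveAddLeavesExactlyOneCopy()
    {
        AccountRegistry registry = new AccountRegistry();
        Account a = new Account(42, 100);
        registry.add(a);
        registry.remove(a);
        registry.add(a);
        assertEquals(1, registry.size());
    }
}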

Monday, June 15, 2009

Simple Web Services are Too Hard to Find

So I'm building some software to use as a System Under Test for my thesis experiment, and found several seminal examples in Paul Jorgensen's "Software Testing". Since I had already implemented the Simple ATM problem (well, a variant of it, without any GUI) and the Triangle problem, I decided that the Currency Exchange problem might fit well in between these two in terms of complexity and size of code. Basically, the program takes in a value and source currency, and converts it to a destination currency of your choice (4 options in Jorgensen's text). This seemed a bit outdated to me, as it relies on the programmer hard-coding the exchange rates by hand. Boo-urns, I say! So, I figured I'd use some benevolent, free web service to pull down live exchange rates and use them in my program. This should be simple, I imagined, because a) this is the type of service presented in every tutorial on making web services, and b) it provides exactly the type of functionality for which web services are suited: some mysterious online entity that has a little glob of information that I want to use in my application.

Google searching for such a web service turned up a disappointing lack of results. There are several online currency converters, of which I'm sure the reader is aware, but none of those are particularly machine-friendly - I would have to hack up and throw away a bunch of HTML to get a single number out of the page. I encountered one service that was designed to be machine-readable, but it used the bloated WS-* XML web service stack, requiring me to auto-generate hundreds of lines of code, all so that I can read a single number (this is not an option for me, because I don't want my subjects attempting to write tests for a bunch of machine-created JAX-WS goop). If that weren't enough, I also had to apply for a Trial API Key, which would allow me to access the single number I needed for a period of two weeks, after which I would need to purchase a commercial API license. Grrrrrrr! A subsequent search for "REST currency exchange web service" turned up bupkis. Why is this so hard? There should be dozens of services like this. Maybe I'm just not looking in the right place.
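
For the record, here's the scale of effort I think this problem deserves on the client side. The endpoint below is imaginary - it's the service I wish existed, not one I actually found - and it's assumed to return a single plain-text number:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;

public class RateFetcher
{
    // Fetch an exchange rate from a (hypothetical) REST service where
    // GET /rates/USD/CAD returns something like "1.0843" as plain text.
    public static double fetchRate(String from, String to) throws Exception
    {
        URL url = new URL("http://example.com/rates/" + from + "/" + to);
        BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()));
        try
        {
            return Double.parseDouble(in.readLine().trim());
        }
        finally
        {
            in.close();
        }
    }
}

A dozen lines, no generated stubs, no API key paperwork. That's all I'm asking for.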

Thursday, June 4, 2009

Idea for a Meta-Study

Idea for a meta-study: lots of papers have been published in which a controlled experiment is performed to examine the potential benefits of TDD. These are mostly all of the same flavor: have a control group implement some spec using code-first-test-last, have an experimental group implement the same spec using TDD, and measure time to completion and defect counts for both groups. It seems (at least in my experience) that there are two possible outcomes from this: either the results are inconclusive, or they tend slightly towards the author's own feelings on the subject, either for or against TDD. I think an interesting, and probably easy to conduct, meta-study would be to pull down copies of all papers that perform studies like these and see what the trend is, as well as analyzing things like the geographic region of the authors or other relevant, although maybe not immediately obvious, correlations.

Wednesday, June 3, 2009

Iterative Thesis Development

Yesterday I decided to try a new way of organizing my time. In 'the real world', summer always correlated with reduced productivity for a development team, in part due to developers taking vacation time, but also due to supervisors being absent and leaving the team without direction. I have proactively given myself this direction by dividing my summer into 3 iterations, which coincide with the three remaining months of the season. Each iteration has four phases: in the first three I will examine and refine the methodology, data acquisition, and analysis of the thesis I'm pursuing, and in the 4th phase I will write up my results. I figure that by doing three iterations in this way, the pilot study this summer should give me a really good idea of how to run a successful study in the fall, as well as a head start on some of the write-up.

A User Study for Mutation-based Testing Analysis

I recently read some material by Andreas Zeller in which he discusses the merits of using mutation testing as a method of verifying the quality of a test suite for a piece of software. These methods are meant to expose tests which perform an action and assume that it was performed properly so long as no error is thrown by the system under test - they do not verify that the action resulted in the desired program state. Code mutation (either on source or directly on the bytecode), when combined with accurate code coverage data, can identify these deficient tests by mutating the code they cover and observing which tests do not fail.
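
A quick, hypothetical illustration of the kind of deficient test this flushes out, reusing the MyAccount example from the defect-seeding post above and inventing deposit() and getBalance() methods for it. The first test covers deposit() but only checks that no exception is thrown, so a mutant that guts the balance update survives it; the assertion in the second test kills that same mutant:

import static org.junit.Assert.assertEquals;
import org.junit.Test;

public class DepositTest
{
    // A weak test: it 'covers' deposit(), but passes as long as nothing throws.
    @Test
    public void depositDoesNotBlowUp()
    {
        MyAccount account = new MyAccount(42, 100);
        account.deposit(50); // a mutant that deletes the balance update still passes here
    }

    // A stronger test: the same mutant is caught by the assertion on program state.
    @Test
    public void depositIncreasesTheBalance()
    {
        MyAccount account = new MyAccount(42, 100);
        account.deposit(50);
        assertEquals(150, account.getBalance());
    }
}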

I believe that there is value in using this data, if it can be presented in the appropriate manner. If a developer has to spend an hour to generate the mutation report and cross-reference it with code coverage, then the cost likely outweighs the benefit. However, if I can arrive at my desk at 9am and have in my inbox a build report, a test report with coverage, and a mutation report for every branch, all of which are properly hyperlinked to each other and backed up on central storage, then I would certainly use it. The problem here seems to be that this degree of automation and integration is hard to set up, and often delicate once in place. It seems that some sort of standard platform for build engineers to integrate all of their reports, packaging operations, tests, etc., is called for. However, I have yet to see anything more sophisticated than Ant or Perl in widespread use by engineers. Maybe I just have an unrepresentative sample, though.

Monday, May 18, 2009

Testing Heuristics and Static Analysis at ICSE 09

I had a very critical talk with two grad students from Queen's yesterday, whom I unfortunately can only identify as Jackie and Brahm. They had a difficult time accepting the immediate usefulness/significance of the study I was proposing, and that's fine, as they came up with a few useful suggestions that I can leverage, primarily to keep a narrow focus and always be aware of the result I'm trying to generate. Two interesting opinions they shared with me are below:

Are the good testers simply those who are most familiar with the SUT (software under test)? This seems entirely plausible (in my own work experience, I was a vastly better tester of software that I had written, or at least software written by someone on my team). If so, then the correlation between programming experience and software testing would be much weaker if we are inventing the SUT for our subjects to test. However, I believe I have some empirical evidence to the contrary, in the form of a man called Scott Tindal, who is one of the most skilled QA engineers I've ever had the pleasure of working with. Scott was of the caliber that he could almost single-handedly test every product in the OpenText/Hummingbird line. An interesting configuration of our study may be to experiment with developers testing their own code versus developers testing new code that they've never seen before.

Will the heuristics extracted with this study be the same as those already in use by static analysis tools like FindBugs?
It appears that determining a correlation between the techniques that experienced testers use and the heuristics employed by static analysis tools such as FindBugs could be an interesting topic. If there is no correlation between the two, might it be possible to integrate these techniques into SA suites for a significant improvement? If there is a correlation, why haven't more testers adopted static analysis to aid in their efforts? Is it a matter of usability of SA tools? Is it that the formalism used to express the SA heuristics isn't understood by everyday testers, and so they aren't aware of what SA has to offer? I plan to talk to Nathaniel Ayewah of the University of Maryland, who works on the FindBugs team, about this further.

ICSE 09 - Day Three

Andreas Zeller, professor at Saarland University in Germany, apart from being one of the nicest people I've met so far this week, also had some very positive feedback on my research idea. He agreed with its premise, and that research in this area would be valid. One of his first questions after I finished presenting my idea was about incentive for participation. I thought he was asking about how we would convince students and pros to give up their time to participate in our study, but this wasn't the case.

He recounted results found at Saarland when teaching students to test their code. Traditional methods obviously didn't work effectively, so they switched tracks. Assignments were handed out with a spec and a suite of unit tests that the students could use to validate their work as they progressed. Once nightly, a second set of tests was run by the instructor, the source of which was kept secret, and the students were informed in the morning of the number of tests they passed and failed. To grade the assignment, a third set of (secret) tests was run. Students begin with a mark of 100%. The first failing test reduces their mark to 80%. The second, to 70%. The third, to 60%. The grading continues in a similarly harsh fashion. The first assignment in the course is almost universally failed, as students drastically underestimate how thorough they need to be with their tests. Subsequent assignments are much better: the students focus strongly on testing, and indeed collaborate by sharing test cases and testing each other's assignments to eliminate as many errors as possible before the submission date. Once the students have the proper motivation to test, they eagerly consume any instruction in effective testing techniques. Andreas found that simple instruction in JUnit and maybe TDD was all that was required, and the rest the students figured out for themselves. This type of self-directed learning is encouraging, but the whole situation makes me think that these students may be working harder, not smarter, to test their software. It may be possible that by providing instruction not only in the simple operation of an xUnit framework, but also in things like effective test case selection, they could reach similar test coverage or defect discovery rates while expending less frantic effort.
Thought: drop the first assignment from the course's final mark, as it is not used for evaluation as much as it is a learning experience (of course, don't tell the students this, otherwise the important motivational lesson will be lost).
In addition to this, Andreas, just like other folks I've spoken to so far, emphasized the need to keep my study narrowly focused on a single issue, as well as to control the situation in a way that enables precise measurements to be made both before and after implementing the curriculum changes we hope to elicit from the study, to accurately determine the improvements (if any exist). I am beginning to see a loose correlation between a researcher's viewpoint regarding empiricism in SE research (positivist on the right and constructivist on the left) and their emphasis on narrowness of focus. Oftentimes, those people who advocate exploratory qualitative studies also recommend wider bands of observation while conducting these studies. This is likely an overgeneralization, and I apologize in advance to anyone who may disagree with it.

ICSE 09 - Day Two

Day two of ICSE 09 was another great success! It began with a rousing keynote by Tom Ball of Microsoft Research fame. He described the early beginnings of version control and the emergence of mining techniques for these version systems, and continued the narrative through to the latest version of Visual Studio and the tool Microsoft calls CRANE, which suggests to developers areas of the code that should be examined given that they are changing some other area. Pretty cool. I got to talk to him one-on-one about it over lunch.

On the previously blogged march through Stanley Park, I had a talk with Tom Ostrand from AT&T Labs in New Jersey. One thing I wanted to talk about (besides pitching my research idea) was leveraging the version history and reports generated from a test suite to predict bugs in target code, using MSR techniques such as mining deltas and bug trackers to augment the existing tools. This topic came up in the morning's MSR session, but apparently Tom missed it (making me look much smarter than can be measured in reality). He seemed interested in the possibility, but of course there are barriers to getting it going. Primarily, even assuming that the dev team for the software we're analyzing is writing tests and putting them into the VCS, the error and coverage reports almost certainly aren't there, making it difficult to do versioned-history analysis over them. I thought that maybe, given the source code of the target software and the test code (both at some given version), the MSR tool could build the code and execute the tests to create the needed reports. This will likely be extremely difficult to get going in the field, however, and would make MSR mining, which is already an extremely expensive, long-running process, even slower.

Sunday, May 17, 2009

Farewell to the Borkenstocks

The time has finally come to retire the Canadian Tire brand cork-bottomed sandals, so lovingly referred to as the 'Borkenstocks'. After a season in the closet and a plane ride to Vancouver, the sandals got their first taste of summer today, as they accompanied me on a walk through Stanley Park. The results of this were as follows: blisters and discomfort.

Feedback on my Research Proposal

Over the last 24 hours I've had the chance to talk to a few individuals about my research idea. It seems to go as follows: I give my 30-second pitch, and the mark stares at me blankly. Then, I elaborate. The mark gets it. They tell me first a problem they see, and then what they'd like to see out of the results. In a couple of cases, the mark got particularly excited about the outcome. I've summarized below.

Chris Bird
Chris has a strong background in empirical software engineering, with particular emphasis on qualitative exploratory studies. As such, the quantitative analysis aspects of my pitch fell on deaf ears, but he was extremely interested in the in-lab observation sessions I was proposing. He felt that 5 or 6 actionable recommendations that might come out of observing professionals would be invaluable. He suggested that I find an older Microsoft Research study in which they trained new hires by having them monitor a screen shared by a senior developer for some amount of time (probably a few days), and then had the new hire ask the senior questions at a later date while reviewing the video logs. I like this idea because it allows us to elicit the information from the developer directly, instead of trying to infer it ourselves, and since we're not bothering them during the initial testing session, we wouldn't be affecting their performance. This isn't without problems, though. Primarily, there is the risk that the subject isn't always sure why they do the things they do, and so is likely to invent reasons, or invent foresight where none necessarily exists. Interesting idea, though. Also, should we interview students in the same way? On one hand, the students likely don't have any special insights that we can leverage (assuming they are less effective at testing than pros). On the other hand, it may illuminate areas of misconception or misunderstanding which we could address in future curriculum changes.

Jim Cordy
Jim is a professor at Queen's, and was my instructor in my 4th year compilers course. After he heard my pitch, he had a warning about an effect he had seen in his industrial work, one that comes from a generational difference in the training of developers. Developers who were trained more than 15 or 20 years ago had a delay between changing the source code and seeing the results of program execution on the order of hours; new developers are used to delays on the order of minutes. Also, the current state of the art in debugging utilizes interactive debuggers, which were either unavailable or unreliable in earlier days. This has led 'old-school' programmers to a) rely heavily on source code inspection and b) insert enormous amounts of instrumentation (debugging statements) when running the program becomes necessary. In comparison, new-generation developers often use smaller amounts of instrumentation, relying on quick turnaround times to find the cause of errors. In Jim's experience, the old-school programmers were orders of magnitude more effective (in terms of bugs found or solved per hour) than younger programmers. If this is in fact the case, it should be an effect we can see if we recruit subjects trained during that era.

Saturday, May 16, 2009

ICSE 09 - Day One

We're wrapping up the first day of talks here at ICSE 2009. I've talked my way into the Mining Software Repositories (MSR) workshop. Here's a quick breakdown of some noteworthy points:

Keynote: Dr. Michael McAllister, Director of Academic Research Centers for SAP Business Objects
An hour-and-a-half-long talk in which he sells BI to the masses. He spoke a lot about integrating data silos and providing an integrated, unified view of the data to business-level decision makers. Also some interesting anecdotes on how BI helped cure SARS. He forgot what OLAP stands for, which is kind of concerning. This talk made me (after spending 2+ years working for a BI company) want a running example of what BI is and how it is used in the context of an expanding organization - from the point before any computational logistics assistance is required, progressing forward to a Walmart-sized operation. Most examples I've seen start with a huge, complex organization, complete with established silos, and then install things like supply chain management, repository abstraction, customer relationship management, document management, etc.


Mining Git Repositories - presented the difficulties in mining data out of Git (or DVCS systems in general) as opposed to traditional centralized systems like SVN. Noteworthy items include the high degree of branching and the lack of a 'mainline' of development.

Universal VCS - by looking for identical files in the repos of different projects, a single unified version control view is established for nearly all available software. Developed by creating a spider program which crawled the repos of numerous projects, downloading metadata and inferring links where appropriate.

MapReduce - blah blah blah use idle PCs for quick, pluggable clustering to chug away on MapReduce problems. Look @ Google MapReduce and Hadoop.

Alitheia Core - A software engineering research platform. Plugin framework for performing operations on heterogeneous repositories. Can define a new metric by implementing an interface, and then evaluate the metric against all repositories in the framework. Look @ SQO-OSS.

Many of these previous talks created tools for mining repositories, but with no greater purpose than that. When asked about this ('mining for the sake of mining'), none of the authors seemed to see a problem with it. The conclusion of the discussion, though, was that this lack of purpose is a problem, and that the professional community should be surveyed to find out what needs they have for mining repos.

Research extensions/ideas:
The third MSR session today focused heavily on defect prediction. After showing off 3 or 4 methods of mining VCS systems to predict buggy code that improved prediction probability by 4% or 5%, the discussion boiled down to this: "What do managers/developers want in these reports to help them do their jobs?" Obviously, the room full of academics didn't have a definitive answer. One gentleman asked the question I had written down, which was "has anyone used the history and coverage of a software's test suite in combination with data from the VCS as a defect predictor (in theory, heavily tested areas are less likely to contain bugs)?" I found this particularly interesting. Also, I began to wonder: if we had one of these defect prediction reports, would it improve a developer's ability to find bugs, and if so, to what extent? Would it be measurable in the same way as I intend to measure testing ability with students and professionals?

Stay tuned for more info (and pictures of beautiful Vancouver)!

Monday, April 20, 2009

AeroPress and Number Theory

A warning to all who may use the coffee grinder in the SE lounge to make fine-grained coffee for use in an AeroPress: shaking the coffee grinder will blow the circuit breaker! It seems some of us don't realize that rotating the axis of a spinning body out of its plane of rotation applies a torque against that body's angular momentum. If the rotating device is powered by an electric motor, this causes the motor to draw more current to maintain its speed. In short, don't shake the coffee grinder.

Also, I picked up the Annotated Turing again over the weekend, and read the first two chapters on number theory. I found this to be absolutely fascinating! Now, if you're already well versed in number theory, then these overviews may be redundant for you, but I thoroughly enjoyed it. Looking forward to what else is in this book.

Sunday, April 19, 2009

Things on my Todo list

My current list of things I want/need to do:

Reading:
Petzold, "The Annotated Turing"
Homer, "The Odyssey"
Huth and Ryan, "Logic in Computer Science"
Tennant, "Specifying Software"
"Software Architecture, A Primer"
Gorton, "Essential Software Architecture"
Dickens, "Oliver Twist"

Writing:
2125 end of term summary paper
2130 empirical study paper
conceptual model of REST process & motivation
ERB paperwork

Coding:
Instrumented JUnit and PyUnit
Diplomacy relationship analyzer
Qualitative Coding Application
iPhone app for navigation and aviation
Custom Braid mod - or just re-implement the engine with Java 2D and OpenGL

Misc:
Taxes
Health Insurance Claim
Personal/professional website, portfolio, and cards. Would like these to have a unifying visual theme.
Bribe people on craigslist for sold-out Jonathan Coulton tickets
Plan Carmen's bachelor party, 3 camping trips, and a houseboat rental scheme.
Paperwork for windsurfing class.
Compare & choose sailing clubs for the summer
Finish Bluenose

Obviously, this list is waayyy too long to be reasonable. I think you can see where the priorities should go, however (finishing ERB paperwork > reading Oliver Twist :P ).

Tuesday, March 31, 2009

REST in Django

So I'm now officially a Google Summer of Code mentor for the Django open source group! w00t! Now all I have to do is pull off all those cool ideas I came up with earlier, as well as have a group of open source developers all agree that they are as cool as I think they are (which could be easier said than done).

While brushing my teeth this morning, I was thinking: "I wonder what sort of empirical study I could pull out of this situation? What sort of research is there to be done in the field of REST? What can I learn, for academic purposes rather than my own, from this experience?"

Ideas? I'm going to mull it over at the gym.

Tuesday, March 24, 2009

Google and Space and Time


Has Google finally solved that whole space-time-bendy problem? Observe:

Sunday, March 22, 2009

Reading Last Week

Seaman: Qualitative Methods
  • Description of ways in which qualitative methods can be used in conjunction with a positivist stance. Tools include observational studies and interviews.
  • Interesting tradeoff: amount of data collected in an interview vs. amount of interviewee's time used vs. amount of direction in interview. Often, the importance/implication of the data collected isn't known for a long time after the interview.
  • When conducting an interview, stress that it is not an evaluation. There are no 'right' or 'wrong' answers.
Cohen: Statistical Power Analysis
  • Examines the importance of analyzing and reporting the power of a statistical relation in empirical research.
  • The author proposes 0.80 as a sound target power, as it produces feasible sample sizes for given values of alpha and effect size (ES).
  • Not a very understandable piece of literature, at least from my perspective.

Rosenthal and DiMatteo: Meta-Analysis: Recent Developments in Quantitative Methods for Literature Reviews
  • A good introduction to meta-analysis (that is, analyzing the results of many existing studies/experiments to test H0, instead of directly testing subjects to prove H0).
  • Interesting points about making sure the studies in your meta-analysis are independent (if meta-analyzing multiple studies from the same research group, subjects and/or data may be reused, and so the results may overlap).
  • Also an interesting discussion of inherent bias in meta-analysis, arising from how the experimenters choose to include/exclude studies from their sample space.

Card: Ender's Game
  • Enjoyable novel about a young boy who is called upon to train as a military commander to protect the earth from the threat of alien invasion.
  • Good character development, although the author lays on the bloodlust and homo-eroticism a bit thick.
  • As a consequence, the sci-fi aspects of the story seemed secondary to the character plot, even tacked on in some places.
  • Generally, I liked it, but probably wouldn't invest the time to read the 5 or 6 sequels

Modeling Topic

After a couple months of soul searching, the students enrolled in John Mylopoulos' Conceptual Modeling course are narrowing down ideas on what to model. The requirements of the project (to 'model something') left it fairly open, but perhaps too open to be immediately tractable. However, Michalis, Alicia, and I are closing the gap. I'm leaning toward modeling certain aspects of the REST web service framework I've been working on for the last little while in CSC 2125.

Obviously, system descriptions and class diagrams are uninteresting, as John pointed out earlier in the term, but modeling the requirements of what such a system should do is not. I'm looking into Tropos, but not really sure if it applies. However, we can probably do a goal model illustrating the motivation and principles of REST, contrasted with those of WS-*. Also, a use case diagram describing the interface to a general ROA service could be created. Finally, we could model certain pieces of important logic, such as a delayed GET, using a description logic syntax.

Reading Last Week

Sim et al.: Using Benchmarking to Advance Research: A Challenge to Software Engineering
  • Argues the merits of creating benchmarks in software engineering as an exercise to strengthen the community and promote advancement, using the reverse engineering community as an example.

Lau: Towards a framework for action research in information systems studies
  • Proposes a framework with which action research efforts can be categorized and evaluated.
  • Describes Action Research as an iterative process, in which a researcher introduces a small change, observes its effect, and uses it as input to the next small change.
  • Reminded me of Agile. I wonder if there are any other lessons from Agile that we can apply to Action Research?

Taipale and Smolander: Improving Software Testing by Observing Practice
  • Case study conducted to shake out some ways of improving software testing, where it is deemed to be lacking. Methods used include subject interviews and grounded theory.
  • Authors found that testing practices were most strongly correlated to business processes.
  • Thought this could lend some insight into how to observe testers at work (i.e. as they write tests). No such luck, though. All recommendations for improving testing had to do with business process alterations/improvements, not hands-on testing stuff.

Also, while glancing at my bookshelf, I came across a couple of old undergrad texts that I would like to flip through. Judging by the spines, I don't think I've ever opened them:

Logic in Computer Science by Huth and Ryan. This was the text for my computational logic course in 3rd year. The course notes and instructor were good enough without having to read this, but my propositional logic has become so rusty, I think I need this as a refresher.

Specifying Software by Tennant. Text for a formal methods course. Turing machines, model checking & verification, etc.

Also, my Amazon shipment arrived a couple days ago, bringing with it a copy of The Annotated Turing, by Charles Petzold, and O'Reilly's Programming Erlang (this one is for Aran, but when we're both done with our purchases, we'll likely swap).

Thursday, March 19, 2009

Testing Tools

UTest
Implements a reverse test oracle (submit tests to a black-box piece of code). Unsure of its level of functionality. Also has an Eclipse plugin. Mentions sandboxing of the code being run. A candidate for the virtualization efforts I've been looking at.

WebCat
An online grading system developed at Virginia Tech, in which students submit assignments and have the instructor's test suite run against them. Use of this system was found to encourage test-first development practices among students, as well as early assignment submission (thanks to a hint system). Impossible to install, however, unless you are Stephen Edwards, and even then only on alternating weeks.

Marmoset
System for snapshot collection and automated testing. Using Marmoset, researchers can easily gather detailed information about students' development patterns, as an Eclipse plugin checks all code changes into a central version control repository, which can be mined. Also, Marmoset provides automatic test feedback to students, which they can use during development of an assignment, the goal of which is to improve their experience while learning to program. It is unclear whether or not these tests are also used for [semi]automatic grading.

JUnit
Although I haven't found any evidence of it yet, I'm pretty sure some combination of JUnit and the Java remote debugger can be used to create a quick and cheap reverse test oracle. More digging required.

Tuesday, March 17, 2009

ORM-REST Code Sprint - Day 2

5:00 pm, day two of the code sprint is almost wrapped up. Not as much amazing progress today as I would have liked (I was minus one team member for some reason). That aside, here's what we've got:

Mohammad blocked out and implemented the pseudocode for the URL reverser described in our blog here. I took a further look at it, and filled in the magic that inverts the Django URL list and gives us a URL from a view name and primary key value. Yay! Plugged it into the XML serializer, and voilà! REST-like XML representation, with hyperlinks! The only thing missing from this bit is the 'http://hostname:port' part. From past experience, I've found this to be trickier than you might think (it gets hairy if you've got one web server feeding into another, or a proxy/load balancer in the way). I think we'll try just using relative URIs for now.

After a couple more feature points are implemented, this thing needs a huge refactoring pass to clean it up and encapsulate it. Also, it uses some classes from the Django REST Interface, but that library has some pathological faults that I don't want to carry over into the tool. Yay for open licensing.

Also, discussions with Aran produced some new feature proposals. Lots of useful, tiny, easy-to-implement things that will improve the overall RESTability of the library. Further yay!

  • Automatic URL localization
  • Automatic anything localization
  • Model introspection and URL pattern creation
  • Computed resources instead of data resources - make the HTTP interface automatic

Monday, March 16, 2009

ORM-REST Code Sprint - Day 1

At quarter to 5:00, the first day of the ORM REST code sprint is winding down. Mo' and I hacked from 2:00, and this is what we've got to show:

Rory finished one direction of proper XML serialization of Django models. That is, given a [list of] model instance[s], we get either a nice XML document (unlike the object name="" pk="" garbage we had before), or a concise list of objects with names and placeholder URIs, which will be changed to live URLs when Mo' gets his piece working.

Mohammad synchronized with the SVN, set up a Django development environment, and familiarized himself with the code I had written. Following this, he began work on the reverse URL mapper. Given a model class name and a primary key value, he's pulling a live instance from the Django ORM and using it to query the Django URL dispatcher. This gives us the regular expression which will match URLs to access the specified object. Now he's got to turn the whole thing on its head! Good luck Mo! You can do it!

So, if you're keeping score, we are 1/2 + 1/2 = 1 feature point finished, out of 4. Might just make it by end of term :)

CSC2125 Live

My first live-ish blog. Hope it turns out well.

Class in the GSU pub

This class, we did a quick series of elevator pitches by each group, as a way to practice our presentation skills. The first few groups were pretty good. Subsequent groups had to take two or three tries at the pitch.

Mohammad and I both had to present. I got to stand on a chair because I'm too short and Greg likes to pick on me :)

The bar will not make Irish coffees.

The last week of classes is in three weeks. The demo day is supposed to be the Monday after that, but that is Easter. The demo day will be moved further into the future, either to some other time that week or to the following Monday.

Greg's token plot twist: do another lap of elevator pitches, but this time for thesis work instead of 2125 project. Grad students have to explain their topic, undergrads need to come up with a thesis idea on the spot. 2 minutes prep time.

My topic: studying the effects of integrating unit testing into standard CS undergrad programs. Questions: can we determine a measure of unit test effectiveness, can we track testing improvement over the course of a career, and is RTO useful?

I got cut off after my 20 seconds :( Only had like 3 words left.

Everyone's job outside of class is to come up with a thesis topic for Nick. The winner gets ice cream.

I'm willing to bet we're going to go around again to try to pare things down. I don't know how to modify my speech, though.

Yup. This time I was too short.

Now we're talking about consulting fees. Greg charges $150/hr plus expenses. Clearly, the undergrads have no idea how much to charge. They've never seen the Entrepreneurship 101 talk about this. You should discount a yearly salary for an employee doing the same job you're shipping out, and normalize it over the length of the project.

If you get some consulting work on the side, U of T Legal Services can look over any contract you may have. It's free, but not speedy (couple of weeks). See Jason Betcham.

Combined degree CS + Law = name your price. Too bad law is dreadfully dull.

Next round of interrogation: what special skills do you have that cannot be easily picked up by other CS grads? I don't think I have one. Maybe experience in ECM?
Speaking another language is a big one!
Also, having a well connected professional network is important.

Last question: if you don't have something special, what are you doing to fix that?

Go get Garth Gibson's PhD thesis: how do RAIDs work? Really good communication.

Pay attention to email for info on last class.

Friday, March 13, 2009

Zak is the worst person! The worst!

Muller and Pfahl: Simulation Methods
  • Chapter describing the way in which simulation can be used to project the outcome of a software project.
  • Most readers found this method to be too clunky, or simply inappropriate for software development estimation. The counter example of embedded or safety critical systems seemed to sway a few minds, however.
  • Interesting discussion about whether this actually qualifies as an empirical method. Also, everyone seemed to agree that what the Hadley Center is doing is valid science, even though it is simulation.
Atkins, et al: Using version control data to evaluate the impact of software tools
  • Paper evaluating possibly the worst version control system ever! At a more meta level, it was an example of how you can run an empirical study whose sole input is data mined from a past project (similar to what Samira did for her master's).
  • Nick mentioned that, despite its archaic premise, a versioned editor like this one would have been helpful at EA.
  • Discussion ensued as to whether this type of validation was actually required for this tool. It seems almost anything would be better than the existing 'version control'. In fact, there are some in the field who feel that expert intuition is ultimately more useful than empirical experimentation.

Sharp & Robinson: An Ethnographic Study of XP Practice
  • Ethnographic study of an extremely well-oiled XP team in England
  • Study found that, in this case, XP was the style best suited for maximal performance of the team
  • Threats to validity include not spending enough time (one iteration?) with the subjects
Kitchenham & Pfleeger: Personal Opinion Surveys
  • Chapter describing the process of creating and administering personal opinion surveys (questionnaires and the like)
  • Primary message is: making a questionnaire isn't easy! There are lots of confounding effects/sources of bias to worry about.
  • Interesting discussion ensued concerning the reuse of standard instruments from psychology, and whether or not SE should have similar standard instruments.
Cherubini et al: Let's go to the whiteboard: how and why software developers use drawings
  • Interesting case study conducted by Microsoft Research to see how their developers use graphical representations of code
  • Researchers were able to categorize their uses into Understanding, Design, and Communication, and the amount of investment into Transient, Reiterated, Rendered, and Archival.
  • Pretty good

Flyvbjerg: Five Misunderstandings about Case Study Research
  • This paper attempts to disprove several common misconceptions about case study research, primarily things like "case study results cannot be generalized to a larger population", "case studies cannot be used to test hypotheses".
  • A fairly good piece of advocacy. It certainly makes me feel better about considering a case study as a direction for my research.

Edwards: Using software testing to move students from trial-and-error to reflection-in-action and related papers
  • Details findings of the WebCAT system - an online assignment submission and automatic grading system created at Virginia Tech.
  • Edwards found that the system was useful and well received by both instructors and students. The primary objective, encouraging students to do test-first development, was achieved.
  • Interesting effects of introducing hints into the automatic test cases to discourage last minute submissions.
Juristo et al: Reviewing 25 Years of Testing Technique Experiments
  • A taxonomy/summary of the various means of divining test cases that have been invented over the last quarter century.
  • Focuses mainly on machine-derived cases (random input-output samples, etc), doesn't focus too much on human-created unit tests, unfortunately.

Thursday, March 12, 2009

RESTful Efforts in Django

My CSC2125 team appears to have a concrete direction for what we're going to build by end of term. We've been building prototype REST web services, relying on ORM data, for the last few weeks, most recently in Django. Since Django is so closely integrated into the community here, we're going to use it as our target platform.

It seems that both Bill Conrad and I started with the Django REST Interface, a Google SoC project from 2007. This interface takes your Django Models and throws a quick-and-dirty XML interface on top of them. Functional, yes, but not really complete (or even good REST). However, since this interface has some problems, they're just aching for me to fix them!

First, the XML/JSON/YAML returned from the interface is the standard Django serialization format, which isn't very pretty for REST purposes. It can be cleaned up.

Most importantly, I think, is that the inter-object relations are expressed as sets of primary keys, instead of URLs to related objects. This flies in the face of REST. It will require something akin to reversing the urls.py map to get a URL from a Model and a primary key. Non-trivial, interesting, and crucial to the correctness of the resulting service.
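
To illustrate (the element names here are invented for the sake of example, not what the Django REST Interface actually emits): today a related object shows up as a bare primary key, something like

<order pk="17">
<customer>4</customer>
</order>

whereas a properly RESTful representation would carry a resolvable link that a client can simply follow:

<order href="/orders/17/">
<customer href="/customers/4/"/>
</order>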

A few other nice-to-haves include implicit delayed GET, algorithmic (query) resources, and dynamic representations.

I spoke with Bill today about his work on the Basie REST API. He seemed convinced that he solved most of what I'm after already, with the notable exception of reverse URL mapping. The REST blog post on the Basie Blog mentions the following features:

Generic Models - looks like Django Models will be turned into REST resources automagically, a la the Django REST Interface. Also, the Basie team had need of a deep synchronization of objects, and so added this to their REST API. While I'm sure this fulfills their requirements, according to Robert Brewer via Greg's blog this is a RESTful no-no.

Intuitive Data Access - discusses URL structure. There's mention in this section of algorithmic (Bill calls them filtered) resources. If this is done, then that's one item off my list!

AJAX Friendly - not really interested in this, but good on ya.

The impression I get after going through this (and I'll admit, I haven't looked at the Basie codebase yet, but it's next on my list) is that there has been effort in the area I want to pursue in 2125, but not to the extent or in the specific detail I intended. Yay! The project still holds validity!

Wednesday, February 18, 2009

Code Monkeys and the Recipe for Happiness

Hanging out in my office today. Being moderately productive, but in an inexplicably good mood. The apparent recipe for happiness is as follows:

  1. Purchase a ridiculously expensive bagel
  2. Coffee
  3. Install shiny new operating system on laptop
  4. Meet with supervisor
  5. Coffee
  6. Create UI mockups for hypothetical testing tool
  7. Listen to 'Code Monkey' by Jonathan Coulton a few hundred times.
  8. Coffee

I like this song. It gives me this mental image of developers as knuckle-dragging primates, especially the way Coulton removes all articles from the lyrics. Ex. "Code monkey get up, get coffee." And after all, if you can't laugh at yourself, at whom can you laugh? (Sentences also can't end in prepositions.) A couple of excerpts:

"Rob says Code Monkey very diligent, but his output stink. Code Monkey's code not functional or elegant. What do Code Monkey think? Code Monkey think maybe manager want to write god damn login page himself? Code Monkey not say it out loud. Code Monkey not crazy, just proud!"

"Much rather wake up, eat coffee cake. Take bath. Take nap. This job fulfilling in creative ways. What a load of crap."

Monday, February 16, 2009

Saxophone

This is for the numerous saxophone players I know:

http://www.youtube.com/watch?v=RXyC7S-4LR8

Tuesday, February 10, 2009

Security

After a conversation with Aran about the U of T library site possibly being vulnerable to SQL injection attacks, we came up with the idea that a really, really cool name for a book on internet security would be
"); DROP TABLE Books;
Not only is it funny, but its very existence would serve to test the security measures of countless online bookstores and libraries across the globe. The most valuable internet security book you'll never have to read.

Thursday, February 5, 2009

A Long Train of Thought with One (1) Cool Idea and Several Tangents

So I've been complaining for months now that I'd like an operating system, or just a desktop manager, similar to the kind of thing you've seen in movies like Swordfish and Hackers. Something eclectic, that looks really frigging cool, and allows me to do the kinds of operations that I want to do, quickly and easily. (Note to you UX experts out there: I am well aware of the fact that interfaces that look cool aren't usable. I'm not going for market appeal; this is a totally custom job.) It seemed obvious that the only way to get this was to build it myself. So, I started small, thinking "What kind of operations/features would I want on a small, portable device, like a Netbook or EEE PC?" (Note: I decided I wanted an EEE PC when I saw Richard Stallman speak earlier this week. It was the only thing I really liked about his talk.) A simple window manager (with really flashy graphics), file system navigator, browser, and IDE. That's pretty much it. Oh, and the whole thing should be built on top of some flavor of Unix, so that I can still use 3rd party apps, etc. Ambitious project, I know, but I'm just fantasizing here. Also, while I was daydreaming about this, I decided to restructure my personal computing setup, but that's another story.

Anyway, I figured I'd start with the IDE, since I had always wanted to make one. This specific idea came up over the summer while I was at work. I really liked the coding features in IntelliJ, but wished it fit better with our continuous integration infrastructure. I came to the conclusion that the best IDE would have the most notable coding features from IntelliJ, but be transparent enough to allow you to plug in whatever tools you might already be using: SVN/Perforce, any JDK, any Ant, etc. Before you start saying "But Rory, Eclipse does ..." or "But Rory, NetBeans blah blah blah...", one of the strongest motivating factors for this idea was that it sounded like a lot of fun, regardless of whether the requirements have been fulfilled by something else. Also, these heavyweight IDEs are just that: too heavy. For me, their feature set can be at times too full, and having a single application that consumes > 1GB of memory seems a bit silly, when all it really has to do is edit text files and invoke some commands from the system prompt. Also, keep in mind that I enlisted (is that the right word?) in grad school to do just this: redesign developer tools.

I mentioned this to Zak, to which he replied "It sounds like you just need to learn to use Emacs properly." That also doesn't sound like fun. I mentioned this to Aran, to which he replied "Oh my god, me too!" Yay! Someone else wants to make an IDE. Except, Aran's IDE is a Javascript IDE. Written in Javascript. That runs in a browser.

My first instinct was that this idea sounds rather boring. Then I thought more carefully about it. Google has browser based versions of all the other applications you could use on a daily basis: mail, word processor, spreadsheet, image editor, etc. Why not a browser based IDE? Are the UI controls available in a browser less expressive than those on a native client? Probably not.

Now, I had originally thought of this IDE as running in the browser, operating on local files, etc, but what if it were more closely integrated with the web? What if it were a component of an online software portal? It would automatically know which source code repository you're using. It could automatically update documentation (wikis). It would have strong integration with the portal's bug tracker. Imagine it: your favorite portal would have a "Code" tab in addition to "Wiki", "Docs", "Browse", "Mail", etc, at the top of the page, and when you clicked it, everything would be configured and ready to go.

What can we get if we utilize the cloud for some of the processing, instead of relying on the browser to be the engine of this fantastic IDE? First off, my IDE no longer consumes > 1 GB of memory. What else? Rendering of the controls would still be done on the client, but can the cloud be used for more interesting problems, like static analysis? Anything that could benefit from a bit of parallelization is a good candidate for migration.

Could this include running unit tests? Should the compiled code be run on the server, or in the browser? Security concerns say that it should be run on the client, but if it were run on the server it could possibly be done faster, and run in multiple browsers and multiple environments instead of just the ones installed on the client. To protect security, we could run the build and execute the code in a virtual machine, like a SnowFlock VM for example. Now, as for unit tests, executing these is extremely parallelizable. We could fork a VM for every test and run them all at once. Huzzah!

I may have to rewrite this in a more concise form :P

Also, Aran mentions Heroku and AppJet, which are similar to this.

Wednesday, February 4, 2009

BitWhat?

I've noticed that a lot of the tech blogs I've been reading have titles that include creative uses of the word 'bit'. Ex. The Third Bit, BitWorking, Bitgistics, etc. Now, don't get me wrong, the authors and opinions presented are insightful, but how often does a Python programmer, for example, worry about bits? Shouldn't these catchy names use higher-level concepts, like "Objection", "DataDemon", or "Quine 'n Cheese"? Bleh, just my opinion.

Tuesday, February 3, 2009

Reading Week

While thinking about which papers/books I should consume over reading week, I came to the conclusion that what I would really rather do is finish up the planking on my scratch-built Bluenose. This is a project that I started a little over a year ago and boxed when I moved to Toronto. My plan was to finish it before Christmas, what with all the spare time I'd have as a lazy grad student. Turns out, things are a bit different than I originally estimated, and the Bluenose has stayed in the closet for the last 4 or 5 months.



The hull was carved from a single 6"x6"x48" basswood timber, with additional pieces carved/shaped/cut from mixes of pine, balsa, and basswood. The false underdeck is pretty much done; it just needs a bit of sanding along the rails, and then I get to start planking the deck. I estimate I'll spend most of the first day figuring out the scales & sizes I chose for everything. In hindsight, I wish I had written some of this stuff down, but I think I'll manage.

RESTful Questions

While working on my 2125 project, my partner and I created a quick little RESTful web service using CherryPy, and SQLAlchemy for persistence. SQLAlchemy worked wonderfully. CherryPy did a great job of making data-driven web pages, and the MethodDispatcher made it easy to invoke certain methods within a class when an HTTP request comes in, based on the HTTP method. This seemed almost ideal for REST, but some clunkiness in the design prevents it from being quite what we're after.

What are we after, exactly? We're trying to find ways in which we can avoid duplication of effort when using both Object Relational Mappers and RESTful Web Services. In their book "RESTful Web Services", Richardson and Ruby hit on the point that the process of translating objects into REST resources is very similar to the process of translating the same objects into tables in a relational database. So, if a web service is storing objects in a database and exposing them via a REST API, we would be doing the same sort of mapping procedure twice.
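To show what I mean by doing the mapping twice, here's a stripped-down Python sketch. The Account model and its fields are invented for illustration; the same field list gets written once for SQLAlchemy and then again, by hand, for the REST representation.

# Mapping #1: class attributes -> relational columns (the ORM definition).
import json
from sqlalchemy import Column, Integer, String
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class Account(Base):
    __tablename__ = "accounts"
    id = Column(Integer, primary_key=True)
    owner = Column(String(100))
    balance = Column(Integer)

def account_to_resource(account):
    # Mapping #2: the same attributes written out again, by hand, as the REST
    # representation. This is the duplication we'd like to avoid.
    return json.dumps({
        "uri": "/accounts/%d" % account.id,
        "owner": account.owner,
        "balance": account.balance,
    })

# e.g. account_to_resource(Account(id=1, owner="rtulk", balance=100))

A CherryPy GET handler would just return account_to_resource() for whatever object SQLAlchemy hands back, and any change to the class means touching both mappings.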

My partner (in crime?) and I had a chat with Greg about this, and came up with some questions to investigate. Below are the questions and their answers:
How do REST APIs represent foreign-key relationships (ie. object aggregation)? Specifically, are references to the other objects stored/returned, or the entire object on each request?
  • It is common REST practice to return hyperlinks to other objects/resources that are aggregated by the given resource. This would require an additional http request for each referenced resource.
Can we uniquely identify object instances (REST resources) based on some identifier?
  • Yes. The resource's URI is its identifier. Every resource has one that identifies it. However, it is possible for one resource to have many URIs that point to it (ex. /releases/2_05 and /releases/latest could be the same thing).
Can we cache REST objects on the client side, based on their identifier (whatever that may happen to be)?
  • Yes. It would be silly not to. However, the multi-identifier problem stated in the last answer might make this less efficient.
If we assume that the meat of the service is some object graph (probably a DAG), can we reconstruct the graph on the client side, out of stubs instead of actual objects, given identifiers and caching?
  • I think so; the sketch after this list is my attempt at convincing myself that identifiers plus caching make it work.
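Here's that rough sketch in Python. The resource layout (an /orders/42 document whose "customer" field is just a hyperlink) is invented; the point is only the URI-keyed cache plus stubs that resolve themselves lazily.

# Rough sketch: cache resources by URI, and represent aggregated objects as
# stubs that only fetch their hyperlinked resource on first attribute access.
import json
from urllib.request import urlopen

_cache = {}  # URI -> parsed resource

def fetch(uri):
    # Fetch a resource once, then serve it from the cache by its URI.
    if uri not in _cache:
        _cache[uri] = json.load(urlopen(uri))
    return _cache[uri]

class Stub(object):
    # Placeholder for a linked resource; resolves itself on first access.
    def __init__(self, uri):
        self.uri = uri
        self._data = None

    def __getattr__(self, name):
        if self._data is None:
            self._data = fetch(self.uri)
        return self._data[name]

# Hypothetical usage:
# order = fetch("http://example.com/orders/42")
# customer = Stub(order["customer"])  # no extra request yet
# print(customer.name)                # request happens here, then it's cached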

That's all for now. Check here or on the project wiki for more information coming soon!


What I've Read This Week

Singer, J.; Vinson, N.G., "Ethical issues in empirical studies of software engineering," Software Engineering, IEEE Transactions on , vol.28, no.12, pp. 1171-1180, Dec 2002
URL: http://www.ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=1158289&isnumber=25950

This paper presents a handful of ethical dilemmas that researchers who conduct empirical studies can get themselves into, along with advice on getting out of or avoiding the situation altogether.

  • What kinds of studies could we create which contain no human subjects, but in which individuals can be identified (ie. from their source code)?
  • When can an employee's participation in an empirical study threaten their employment?
  • Is it possible to conduct a field study in which management doesn't know which of their employees are participating?
  • Should remuneration rates be adjusted to compete with a standard software engineer's salary?
  • Are raffles or draws valid replacements for remuneration? Does the exclusivity of the compensation (ie. only one subject wins the iPod) affect the data collected by the study? Will subjects 'try harder' in the task assigned if they think they may win a prize? Can prizes affect working relationships/situations after the researcher has left?
  • Does ACM Article 1.7 eliminate deceptive studies?
  • Regarding written consent/participation forms, does having a large number of anticipated uses of the data detract from a study's credibility, and thereby make subjects less likely to participate?
John P. A. Ioannidis, "Why Most Published Research Findings Are False"
URL: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1182327

This paper describes a detailed statistical method (proof?) presenting evidence that the majority of research findings published in this day and age will go on to be refuted in the near future.
  • What is the 'power' the authors are referring to?
  • Is corollary 5 (corporations sponsoring research suppress findings that they deem unfavorable for business reasons) just plain evil or misleading?
  • Null fields sound interesting. How do I tell if I'm stuck in a null field?
  • How do we determine R for a given field? (A quick sketch of where R and 'power' fit into the paper's main formula follows this list.)
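My stab at the first and last questions, with made-up numbers: as I understand it, 'power' is 1 - beta (the chance a study detects a true relationship), and R is the pre-study odds that a relationship being probed in the field is actually true; both plug into the paper's positive predictive value formula.

# My reading of the paper's positive predictive value (PPV) formula:
# PPV = (1 - beta) * R / (R - beta * R + alpha), where R is the pre-study
# odds that a probed relationship is true, alpha is the type I error rate,
# and power = 1 - beta.
def ppv(R, alpha=0.05, beta=0.20):
    return (1 - beta) * R / (R - beta * R + alpha)

# Made-up numbers: with 80% power and alpha = 0.05, a field where only 1 in 10
# probed relationships is real gets a PPV of roughly 0.62.
print(ppv(R=0.1))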

M. Jørgensen, and D. I. K. Sjøberg (2004) "Generalization and Theory Building in Software Engineering Research"
URL: http://simula.no/research/engineering/publications/SE.5.Joergensen.2004.c
  • Null hypotheses are a telltale sign of (sometimes misused) statistical hypothesis testing. Should we as readers be concerned when we see clearly stated null hypotheses?
  • In their recommendations, the authors suggest that purely exploratory studies hold little or no value, given that vast amounts of knowledge concerning software engineering have been accumulated in other, older fields such as psychology. Although I agree that cross-disciplinary research is useful for SE, and many old ideas can be successfully applied in SE, I'm not sure I agree that there is no use in exploratory studies.
  • Proper definition of populations and subject sampling is important
  • It is difficult to transfer the results in one population to another. The most common example of this is performing a study on CS grad/undergrad students and expecting it to transfer to professionals. Is there any way we as CS grad students can perform studies that will be relevant to professionals, then?
Still working my way through RESTful Web Services. Just wrapped up the authors' definition of ROA (resource oriented architecture). Very interesting. Hopefully this answers some questions brought up by my 2125 project.

Also on the stack are this paper about the Snowflock VM System and A Software Architecture Primer.

And, if there's time, I'll try to finish Ender's Game.

Thursday, January 22, 2009

Calling all Build Engineers

During my recent CSC301 lecture, I was surprised to find out how few students were aware that they could actually get a job as a full-time build engineer. It makes me think that there's probably a study to be done in this area.
  • To what extent are build processes and project maintenance discussed in CS education?
  • What is the market value of a professional build engineer?
  • How do build engineers perform their jobs? Is there a way this can be improved?
The last point is something that causes me mild concern. It seems that, in some cases, the methods used to set up and execute project-wide builds can be semi-structured voodoo.

Safe Server-Side Unit Testing

I like build systems :) My first experience with integrating a VCS, bug tracker, and Ant was a very fulfilling experience, and it only got better when we added things like EMMA to give developers a feel for how their project was progressing. So, you can understand why my ears perked up when, during a conversation about the SVN setup in Dr. Project/Basie, Greg mentioned that they had tried to incorporate a continuous integration routine into Dr. Project, but failed, citing complexities and difficulty with the administration. Now, being the cynical, cold-hearted person that I am, my first thought was, "You clearly need better administrators", but then I remembered trying to do something similar with VMware, how ridiculously hard it was to get working, and how, once it was working, keeping it there was almost impossible, so I held my tongue.

The basic premise here is to have the server which runs the Dr. Project/Basie installation also manage a system of virtual machines. When code is checked into the SVN repository for a given project, a virtual machine is spawned. Inside this VM, we download a copy of the latest revision from the SVN repository, build it, run the unit tests, generate the reports, publish them, then kill the VM. Obviously, we can't do the build and test by just forking a process, without the VM, because that would allow the project groups to run arbitrary code on the Dr. Project web server, which is just about the biggest security hole I can think of. So, the goal here is to use the virtual machines to completely isolate the code from the web server, so that the tests run in a completely safe environment, while at the same time providing benefits like strictly reproducible execution environments (every unit test starts from the same VM snapshot).

To accomplish these goals, we're looking at using the SnowFlock system. All VMs start from a master image, clones are quick to create (~100 ms), we can instantiate many, many clones at the same time, and the whole thing is wrapped up in a nice little Python API.
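I haven't actually sat down with the SnowFlock API yet, so here's only a rough outline of the intended post-commit flow; clone_vm / run_in_vm / destroy_vm are placeholders for whatever the real API provides, and the repository URL and Ant targets are made up.

# Rough outline of the post-commit flow; the VM calls are placeholders.
import subprocess

REPO = "svn+ssh://example.org/project/trunk"  # hypothetical

def clone_vm():
    # Placeholder: ask the VM layer for a fresh clone of the master image.
    return "clone-0"

def run_in_vm(vm, command):
    # Placeholder: run the command *inside* the clone; running it locally
    # here is only to keep the sketch self-contained.
    return subprocess.call(command)

def destroy_vm(vm):
    # Placeholder: tear the clone down once the reports have been copied out.
    pass

def on_commit(revision):
    vm = clone_vm()
    try:
        run_in_vm(vm, ["svn", "checkout", "-r", str(revision), REPO, "work"])
        run_in_vm(vm, ["ant", "-f", "work/build.xml", "test", "report"])
        # ...copy the generated reports back to the Dr. Project web server...
    finally:
        destroy_vm(vm)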

It will be interesting to see if this works for Dr. Project/Basie's needs, and if it does, I'd like to see if it could be extended to do cluster testing for larger distributed systems projects. The ease and speed of creating a new clone VM means that for each test, a small cluster of machines could be created, the test run, and the cluster torn down. I'm not sure if a tool like this exists already; it sounds like a fairly straightforward idea, but it should be fun to investigate either way.

ORM Mapping for Web Service Definition

This post is an experiment with the Blogger/Google Docs interoperability functionality. I'm not terribly impressed with the quality of the translation between doc and blog post. If you're as disgusted with the layout as I am, feel free to read the Google Doc here. This is a document describing an idea Greg proposed to me and another student in his CSC2125 class. In one sentence, the problem is: can we use the object mapping definition from an Object-Relational Mapping tool to describe objects/resources in a RESTful web API, and in so doing leverage some of the benefits ORMs have lent to persistence, reduce redundancy, and generally make people happier? Unfortunately, the more I look into it, the more I think the answer is 'no'. However, we're not quite finished with the investigation yet.

ORM Mapping for Web Service Descriptors


Traditional Situation


Figure 1 illustrates the traditional deployment situation for a client/server application, in which the client communicates with the server via an exposed web service, and the server persists data into a database using an object-relational mapper. The server-side process consists of a layered architecture, in which the business logic interfaces with the database via an ORM mapping layer. The application logic stores its data in the form of objects (hence the ORM), and a mapping is defined by the application programmer between the class definitions for these objects and a relational database schema.

The client-side application code performs some useful operation with the data or service exposed by the server-side business logic. To access this information or service, the client application uses stub objects. These stubs expose the same interface as the live objects on the server (possibly a subset of methods for security/feasibility reasons), but the implementation of the object lives on the server; client-side methods all contain logic for making calls to the server and returning the response as if the method were implemented on the client. These stub objects are created automatically at build time by a tool which is able to read a description of the web service and translate it into source code which can be compiled and used by the application. In traditional web services, this description is a WSDL file. <what is this for a REST web service??>
Figure 1: Traditional Client-Server Web Service Deployment, with ORM Mapping Definition and a Web Service Descriptor
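For my own benefit, here's a toy Python version of that build-time stub generation. The descriptor format and service URL are made up; a real tool would parse the equivalent information out of the WSDL.

# Toy stub generation from a (made-up) service description.
DESCRIPTOR = {
    "service": "http://example.com/accounts",
    "methods": ["get_balance", "deposit"],
}

def make_stub_class(descriptor):
    def make_method(name):
        def method(self, **params):
            # A real stub would issue an HTTP request here and decode the
            # reply; returning the would-be request keeps the sketch simple.
            return ("POST", "%s/%s" % (descriptor["service"], name), params)
        return method
    methods = {name: make_method(name) for name in descriptor["methods"]}
    return type("AccountStub", (object,), methods)

AccountStub = make_stub_class(DESCRIPTOR)
# AccountStub().deposit(amount=50)
# -> ("POST", "http://example.com/accounts/deposit", {"amount": 50})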

Problem


The problem with this deployment is that there are redundancies in the way the shared objects and web service are defined, which could be streamlined to the benefit of both server-side programmers and clients who wish to interface with the server. In the event that the class definition for one of the shared objects changes (ex. adding a new public member), the server-side application programmer must update both the ORM Mapping Definition file/logic and the Web Service Descriptor, and the client-side programmer may be required to at least rebuild their application to update the stubs (this is addressed in Versioning).


Single Mapping Definition

It has been proposed that the ORM Mapping Definition and Web Service Descriptor can be combined into one artifact, as the two separate documents both essentially describe how to serialize instances of a given class. With hopefully only a small amount of modification, an ORM Mapping Definition could be used to serve both these purposes. Also, if the ORM side of the interface to this artifact is properly preserved, it is hoped that it can still be used for many of the other functions the ORM layer uses it for, like relational database schema migration/exporting.
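As a hand-wavy sketch of what I mean, SQLAlchemy's table metadata already knows most of what a simple descriptor would need, so one definition could drive both sides. The Release table and the descriptor format below are invented for illustration.

# Hand-wavy sketch: reuse the ORM mapping (a SQLAlchemy Table) as the single
# source for a web service descriptor.
from sqlalchemy import Column, Integer, MetaData, String, Table

metadata = MetaData()
releases = Table(
    "releases", metadata,
    Column("id", Integer, primary_key=True),
    Column("version", String(20)),
    Column("notes", String(500)),
)

def describe_resource(table):
    # Walk the mapping once; the same metadata that drives schema creation
    # also yields the resource's fields and identifier.
    return {
        "resource": table.name,
        "fields": dict((c.name, str(c.type)) for c in table.columns),
        "identifier": [c.name for c in table.primary_key.columns],
    }

print(describe_resource(releases))

Of course, this only captures an object's state, which is exactly the caveat in the note below.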

Note: it may be interesting to investigate this further, as the ORM Mapping Definition serializes an object's state, but a Web Service Descriptor would likely only describe an object's behavior/interface!

Figure 2: Client-Server Web Service Deployment, with single Shared Object Descriptor



Versioning

Following from the automatic schema migration/updating, questions are raised about how similar functionality could be used to span the client-server gap, not just the server-database gap. Obviously, client-side stub classes can't be updated completely automatically, as this would require rebuilding the application. However, the server could expose several concurrent versions of the same service, multiplexed based on a version field in incoming requests. As part of the schema update process, in addition to modifying the database, the ORM layer (or some other piece of code) could generate the infrastructure required to support backward-compatible calls to the web service API.
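A toy sketch of the version multiplexing (the handler names and the request shape are made up): the server keeps one handler per exposed version and picks based on the version field, defaulting to the oldest so existing clients keep working.

# Toy sketch: one handler per exposed version of the same resource,
# selected by a version field in the incoming request.
def get_account_v1(request):
    # Old representation: balance only.
    return {"balance": request["db_row"]["balance"]}

def get_account_v2(request):
    # Newer representation adds the owner field.
    return {"balance": request["db_row"]["balance"],
            "owner": request["db_row"]["owner"]}

HANDLERS = {"1": get_account_v1, "2": get_account_v2}

def dispatch(request):
    # Default to the oldest version so un-rebuilt clients keep working.
    handler = HANDLERS[request.get("version", "1")]
    return handler(request)

# dispatch({"version": "2", "db_row": {"balance": 100, "owner": "rtulk"}})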

My First Lecture

I've been neglecting this old blog for the last few weeks, so I figure it's high time I let my captive audience in on what I've been doing at grad school.

Once again, I'm TAing CSC301. This term, in addition to my standard duties of marking, critiquing assignment questions, and coming up with exam problems, Greg asked each of his TAs to give a lecture to the class. My topic was unit testing with JavaScript. My first reaction was "Yay, I already know about unit testing, this will be a breeze", but then I remembered that my knowledge of JavaScript extended to image rollovers, and no further. So, I spent about 10 or 15 hours over the next week or so learning proper OO JavaScript, as well as how to use (or not use) JsUnit and some of its competitors, JsCoverage, and Selenium, and trying to beat CruiseControl into a state that fits together with these tools (no such luck, unfortunately).

The lecture went pretty much as I expected. I had prepared a loose agenda of items, a timeline for discussion and demonstration, and a few canned questions designed to get the undergrads to turn their brains on - all of which I forgot as soon as Greg introduced me. Luckily, I had my laptop and a big desk to hide behind. After a few moments of awkwardness, things got back on track, however.

Lessons learned:
  • leave your pen at your desk (apparently I click-click-click it as a nervous twitch).
  • always bring a glass of water to a lecture, so that a) you can keep your throat lubricated and b) you can take short pauses to think/fabricate answers to questions without looking like you're thinking/fabricating
All in all, I think it was a pleasant experience. I was worried everything I said went past the class and out the back door of the room, but I received several questions and comments from a couple of the project teams, asking about setting up JsUnit to test their code! Hooray! I hope they can get it working.