I had a very critical talk yesterday with two grad students from Queen's, whom I unfortunately can only identify as Jackie and Brahm. They had a difficult time accepting the immediate usefulness and significance of the study I was proposing, and that's fine: they came up with a few useful suggestions that I can leverage, primarily to keep a narrow focus and to always be aware of the result I'm trying to generate. Two interesting opinions they shared with me are below:
Are the good testers simply those who are most familiar with the SUT (software under test)? This seems entirely plausible (in my own work experience, I was a vastly better tester of software that I had written, or at least software written by someone on my team). If so, then the correlation between programming experience and software testing ability would be much weaker if we are inventing the SUT for our subjects to test. However, I believe I have some empirical evidence to the contrary, in the form of a man called Scott Tindal, one of the most skilled QA engineers I've ever had the pleasure of working with. Scott was of the caliber that he could almost single-handedly test every product in the OpenText/Hummingbird line. An interesting configuration of our study may be to experiment with developers testing their own code versus developers testing new code that they've never seen before.
Will the heuristics extracted in this study be the same as those already in use by static analysis tools like FindBugs?
Determining whether there is a correlation between the techniques that experienced testers use and the heuristics employed by static analysis tools such as FindBugs could be an interesting topic in its own right. If there is no correlation, could the testers' techniques be integrated into SA suites for a significant improvement? If there is a correlation, why haven't more testers adopted static analysis to aid in their efforts? Is it a matter of the usability of SA tools? Is it that the formalism used to express the SA heuristics isn't understood by everyday testers, and so they aren't aware of what SA has to offer? I plan to talk to Nathaniel Ayewah of the University of Maryland, who is working on the FindBugs team, about this further.
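To make this concrete, the snippet below is a hypothetical example (the class and method names are my own invention) of two classic bug patterns that FindBugs reports: comparing strings with == instead of equals(), and dereferencing a value that may be null. Whether experienced testers probe for these same classes of defects, or for something else entirely, is exactly the open question.

```java
// Hypothetical code illustrating bug patterns a static analysis tool
// like FindBugs flags without ever running the program.
public class AccountLookup {

    // Pattern 1: reference comparison of strings. == compares object
    // identity, not content; equals() is almost always what was meant.
    public boolean isAdmin(String role) {
        return role == "admin";
    }

    // Pattern 2: possible null dereference. findAccount() can return
    // null, but the result is used without a check.
    public int balanceFor(String owner) {
        Account account = findAccount(owner);
        return account.getBalance(); // NullPointerException for unknown owners
    }

    private Account findAccount(String owner) {
        // Stub for illustration: unknown owners yield null.
        return owner.isEmpty() ? null : new Account(100);
    }

    private static class Account {
        private final int balance;
        Account(int balance) { this.balance = balance; }
        int getBalance() { return balance; }
    }
}
```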
Monday, May 18, 2009
ICSE 09 - Day Three
Andreas Zeller, professor at Saarland University in Germany, apart from being one of the nicest people I've met so far this week, also had some very positive feedback on my research idea. He agreed with its principle and that research in this area would be valid. One of his first questions after I finished presenting my idea was about the incentive for participation. I thought he was asking about how we would convince students and pros to give up their time to participate in our study, but this wasn't the case. He recounted results found at Saarland when teaching students to test their code. Traditional methods obviously didn't work effectively, so they switched tracks. Student assignments were handed out with a spec and a suite of unit tests that the students could use to validate their work as they progressed. Once nightly, a second set of tests was run by the instructor, the source of which was kept secret, and in the morning the students were told how many tests they had passed and failed. To grade the assignment, a third set of (secret) tests was run. Students begin with a mark of 100%. The first failing test reduces their mark to 80%, the second to 70%, the third to 60%, and the grading continues in a similarly harsh fashion.

The first assignment in the course is almost universally failed, as students drastically underestimate how thorough they need to be with their tests. Subsequent assignments are much better: the students focus strongly on testing, and indeed collaborate by sharing test cases and testing each other's assignments to eliminate as many errors as possible before the submission date. Once the students have the proper motivation to test, they eagerly consume any instruction in effective testing techniques. Andreas found that simple instruction in JUnit and maybe TDD was all that was required; the rest the students figured out for themselves. This type of self-directed learning is encouraging, but the whole situation makes me think that these students may be working harder, but not smarter, to test their software. It may be possible that by providing instruction not only in the simple operation of an xUnit framework, but also in things like effective test case selection, they could reach similar test coverage or defect discovery rates while expending less frantic effort.
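As a sketch of what that extra instruction might look like, here is a minimal, hypothetical JUnit 4 test class (MathUtil and its divide() method are invented for illustration) where the test cases are chosen around boundaries, a zero divisor and a negative operand, rather than accumulated by brute force:

```java
import static org.junit.Assert.assertEquals;
import org.junit.Test;

// Hypothetical example of a small, deliberately selected JUnit 4 test suite.
public class MathUtilTest {

    // Invented class under test, included so the example is self-contained.
    static class MathUtil {
        static int divide(int a, int b) {
            if (b == 0) {
                throw new ArithmeticException("division by zero");
            }
            return a / b; // integer division truncates toward zero
        }
    }

    @Test
    public void typicalCase() {
        assertEquals(3, MathUtil.divide(9, 3));
    }

    @Test
    public void negativeOperandTruncatesTowardZero() {
        assertEquals(-2, MathUtil.divide(-7, 3));
    }

    @Test(expected = ArithmeticException.class)
    public void zeroDivisorIsRejected() {
        MathUtil.divide(1, 0);
    }
}
```

The point is less the framework mechanics than the selection: three cases that each exercise a distinct behaviour, instead of a dozen near-duplicates of the typical case.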
Thought: drop the first assignment from the course's final mark, as it is not used for evaluation as much as it is a learning experience (of course, don't tell the students this, otherwise the important motivational lesson will be lost).
In addition to this, Andreas, like the other folks I've spoken to so far, emphasized the need to keep my study narrowly focused on a single issue, as well as controlling the situation in a way that enables precise measurements to be made both before and after implementing the curriculum changes we hope to elicit from the study, so that we can accurately determine the improvements (if any exist). I am beginning to see a loose correlation between a researcher's viewpoint regarding empiricism in SE research (positivist on the right and constructivist on the left) and the amount of emphasis they place on narrowness of focus. Oftentimes, those people who advocate exploratory qualitative studies also recommend wider bands of observation while conducting them. This is likely an overgeneralization, and I apologize in advance to anyone who may disagree with it.