Rory Tulk's Blog: September 2008

Tuesday, September 30, 2008

SenseCam: A Retrospective Memory Aid

http://www.cs.toronto.edu/~khai/classes/csc2526-fall2008/readings/04-hodges.pdf

The authors of this paper present the findings of the initial clinical trial of SenseCam, a small form factor camera which is worn around the neck and records still images both at regular intervals and when the device's sensors are stimulated. This creates a pictorial record of events that happen in the wearer's immediate proximity. The motivation for this record is to aid in rehabilitating memory loss sufferers. The trial presented in this paper has shown a marked improvement in the cognitive ability of the test subject. However, further clinical trials are required before a definitive statement about its merits can be made.

I think that one of the most significant insights the creators of SenseCam had was that the success of the device relied critically on a small, compact form factor. Earlier incarnations, which utilized mobile PCs carried in a backpack, would be too unwieldy to have a net benefit for a patient. Ease of use is of vital importance. Another important point in the discussion of the clinical results is that the authors distinguish between the patient remembering the events recorded by the SenseCam and the patient remembering seeing the pictures it recorded in previous sessions. Although the patient claims to be remembering the actual events, I believe the experimental method could be altered to support this claim more concretely.

The SenseCam represents a simple enough product, and it seems obvious that it could benefit a patient suffering from a memory dysfunction. However, before it can claim to outperform other methods, I believe further clinical study, under more controlled circumstances, needs to be carried out (it should be noted that the authors freely admit this; it is just being restated here for completeness). Primarily, more patients need to be examined; a single case study is not sufficient. In addition to increasing the number of subjects, the number of 'important events' recalled by each subject should be increased as well, preferably to some statistically significant level.

In addition to a small sample size, there is a strong possibility that the results for the single given sample, recorded by Mr. B, may have been skewed, perhaps even unintentionally, due to the nature of Mr. B's relationship with the subject. A cleaner result would be obtained by using an impartial third party to administer the tests to Mrs. B.

Designing Capture Applications to Support the Education of Children with Autism

http://www.cs.toronto.edu/~khai/classes/csc2526-fall2008/readings/04-a-hayes.pdf

The authors present three prototype devices for assisting caregivers dealing with children with autism (CWA). The first, called Walden Monitor (WM), consists of a wearable video camera and Tablet PC for observing and recording the behavior of CWA. The second, the Abaris system, replaces traditional pencil and paper based recording with a Tablet PC, and is used by caregivers to record a subject's performance in one-on-one testing and diagnosis. Finally, CareLog is a distributed, ubiquitous system for recording data about a child from any wireless accessible device, including a cell phone, PDA, or PC. A therapist can record and access data using any of these devices; the data is stored on a mobile storage unit that is co-located with the subject.

The most insightful finding of these studies is the importance of properly planning for complicated, multi-user interaction with the devices. The proposed systems seem trivial (video recording unit, software for tabulating test scores, etc) until the use cases are presented, which span multiple users at different times, with very different goals for the data, potentially at different stages in the recurring care cycle. That is, a therapist may use CareLog to record data about a CWA in an intermediate phase of the care cycle, and an analyst will use the collected data, aggregated with past results, to make a diagnosis and set goals in the early stages of the next iteration of the cycle.

One area in which the proposed devices can be improved is in CareLog's portable storage unit. Although this is a novel approach, I believe that the same functionality could be achieved at lower cost if the data were stored on a remote, web-accessible server, instead of in a device that needs to be carried around by the subject. This way the physical hardware cost is reduced, and the subject can't lose or destroy their own data. Also, using an existing commercial product to perform the data analysis required for these prototypes could reduce the upfront cost, making them easier for caregivers and therapists to adopt.

Monday, September 29, 2008

Ideas

I'm taking this opportunity to write down some ideas I had during today's talk, so that I don't forget them :)

Non-standard programming interfaces - what could we do if we wrote programs in a WYSIWYG editor? Or an AutoCAD type editor? Or any domain application? What would that be like? Also, could you bootstrap such a system (i.e. implement the non-standard language using a non-standard language)? Intuitively not, but maybe that's why we haven't figured it out yet.

Program diff - diff a program not as a text file, but as a syntactically (and semantically?) correct piece of code.

Program representation - how can we represent a program other than a bunch of lines of text? Does this apply to the previous point?
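
A minimal sketch of the "diff as code, not text" idea, using Python's standard `ast` module. This only compares syntax trees for equality (the function name `same_program` and the sample sources are made up for illustration); a real program diff would also have to report where two trees differ.

```python
import ast


def same_program(src_a: str, src_b: str) -> bool:
    """Return True when both sources parse to the same syntax tree.

    ast.dump() omits line and column information by default, so
    whitespace and comments do not affect the comparison.
    """
    return ast.dump(ast.parse(src_a)) == ast.dump(ast.parse(src_b))


# Textually different, structurally identical.
a = "def area(w, h):\n    return w * h\n"
b = "def area(w,h):   # width times height\n    return w*h\n"

# Textually similar, structurally different (+ instead of *).
c = "def area(w, h):\n    return w + h\n"

print(same_program(a, b))
print(same_program(a, c))
```

A line-based diff would flag `a` versus `b` as a change on every line, while the tree comparison treats them as the same program and only reports a difference for `c`.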

BRAINSTORM!!!
Gdankin problems!! = design patterns. According to Greg's definition (going to double check this against Wikipedia in a minute), a Gdankin problem is one that results in the same solution when solved by independent domain experts. Now, as I understand it, when the Gang of Four first wrote Design Patterns, they analyzed large amounts of code, created by programmers in different organizations, independently of each other, and noticed that certain problems resulted in similar structures in the code. Therefore, I propose that the Design Patterns are the common solutions to a set of fundamental Gdankin problems.

Tuesday, September 23, 2008

The CHAOS Boondoggle

So the Standish CHAOS report was released in 1994, estimating that the majority of software projects go over-budget by 189%. The presentation of the report was questionable at best, and the data contained in it seemed inconsistent, with little more than hand-waving to back it up.

Enter the second paper, Jorgensen and Molokken, actively calling out Standish on the numbers it presents. Good for you, Jorgensen and Molokken!

In the interview, the interviewer does a pretty weak job of grilling Standish about its numbers, and both sides leave having answered very little.

I'm pretty sure Standish blundered this number, and covered its tracks by a) not disclosing research methods and b) significantly altering the following year's CHAOS report.

But this is all just my opinion. I've been wrong in the past.

Why Line for Java Names

WTF-J = Whyline Toolkit For Java

WTMF-J = Whyline Toolkit (Multi-threaded) For Java

Monday, September 22, 2008

Automatic Bug Triage Using Execution Trace and Ownership Vector Space

Came up with this idea by smashing together two papers from Greg's reading list. Using this for my NSERC & OGS applications, so please don't steal it :)

Previously presented work on automatic bug triage (directing bug reports to the appropriate member of the development team) showed very bright prospects, but lacked enough accuracy to make it a usable product [1]. This entry point for bug information is also an excellent location to apply any number of other filtering heuristics, such as the duplicate detection algorithm proposed in [2]. The method proposed in [2] requires, in addition to a natural language description of the problem, an execution trace that can be used to measure the similarity of two or more bugs more quantitatively. I propose that the same execution trace could be used to assist in the triage functionality described in [1]. Since the vector space created by Wang et al. to measure bug similarity is based on function calls, a similar approach could be used, requiring the same input execution trace, to determine bug ownership. A master vector space could be created at build time from all possible called functions in a source code repository, with ownership of these functions assigned to developers based on either activity in a source revision system or some static assignment. This would create volumes of ownership within the function vector space. In theory, a bug report, assuming it is not a duplicate, should be assigned to the developer in whose volume the bug's vector terminates. This may also have interesting and relevant applications to visualizing ownership of code, for management purposes.

[1] D. Cubranic and G. C. Murphy, "Automatic bug triage using text categorization," in Proceedings of the Sixteenth International Conference on Software Engineering & Knowledge Engineering, F. Maurer and G. Ruhe, Eds., June 2004, pp. 92-97. [Online]. Available: http://www.cs.ubc.ca/labs/spl/papers/2004/seke04-bugzilla.pdf

[2] X. Wang, L. Zhang, T. Xie, J. Anvik, and J. Sun, “An approach to detecting duplicate bug reports using natural language and execution information,” in Proceedings of the Thirtieth International Conference on Software Engineering
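
To make the ownership idea above a little more concrete, here is a rough sketch. It is a simplification, not the method from [1] or [2]: each function name is one dimension, each developer gets a weight per function (say, a commit count), and "the volume the bug's vector terminates in" is approximated by picking the developer whose vector has the highest cosine similarity to the trace. All names and numbers are hypothetical.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(name, 0) for name, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def triage(trace, ownership):
    """Pick the developer whose ownership vector best matches the trace.

    trace     -- list of function names called while reproducing the bug
    ownership -- {developer: {function name: ownership weight}}
    """
    bug_vector = Counter(trace)  # call counts per function
    return max(ownership, key=lambda dev: cosine(bug_vector, ownership[dev]))


# Hypothetical ownership weights, e.g. commits touching each function.
ownership = {
    "alice": {"parse_header": 5, "parse_body": 3},
    "bob": {"render_page": 4, "flush_cache": 2},
}

# Hypothetical execution trace attached to a bug report.
trace = ["parse_header", "parse_body", "parse_body", "render_page"]

print(triage(trace, ownership))
```

Most of the calls in this trace land in functions weighted toward alice, so the report goes to her even though one call touches bob's code.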

Lazy Delete for Email

So I'm going through my inbox this morning (procrastinating on my NSERC application, btw), and I notice I have a lot of messages that take the form "Don't forget about <event> on <date>". After reading, I don't want to delete this message, because having it in my inbox serves as a nice reminder about whatever it is I'm not supposed to forget about. However, after a week my inbox is now clogged with messages about events that have passed. What I would like to be able to do is, when initially reading the message, click a button next to the Delete button, let's call it Delete On ..., and then specify a date, so that the message will be deleted once it is no longer relevant.

Maybe run some kind of computational linguistic method to read the message first and propose a deletion date?
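
As a very crude stand-in for the linguistic step, a regular expression can already catch the simplest case. The sketch below assumes the message contains a phrase like "on September 30" and proposes the day after the event as the deletion date; the function name, the pattern, and the one-day grace period are all assumptions for illustration.

```python
import re
from datetime import date, timedelta

MONTHS = {
    name: number
    for number, name in enumerate(
        ["january", "february", "march", "april", "may", "june", "july",
         "august", "september", "october", "november", "december"],
        start=1,
    )
}

# Matches phrases such as "on September 30".
ON_DATE = re.compile(r"\bon\s+([A-Za-z]+)\s+(\d{1,2})\b", re.IGNORECASE)


def propose_delete_date(message, received):
    """Return the day after the first date mentioned, or None if none is found."""
    for month_name, day in ON_DATE.findall(message):
        month = MONTHS.get(month_name.lower())
        if month is None:
            continue  # "on Friday 13" and similar: not a month name
        try:
            event = date(received.year, month, int(day))
            if event < received:
                # The date already passed this year, so assume next year.
                event = date(received.year + 1, month, int(day))
        except ValueError:
            continue  # e.g. "on February 31"
        return event + timedelta(days=1)
    return None


print(propose_delete_date(
    "Don't forget about the seminar on September 30",
    date(2008, 9, 22),
))
```

Anything the pattern can't read returns None, in which case the Delete On ... button would just fall back to asking for a date.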

Saturday, September 20, 2008

Context Aware Communications

This paper presents research activities into the field of Context-Aware Communication, which is considered as a subset of context-aware computing applications. The authors structure their presentation into five main categories: routing, addressing, messaging, caller awareness, and screening. Routing involves directing communication (phone calls, text messages) to physical devices in close proximity to the callee, and has been successfully implemented by combining Xerox PARC's Etherphone and Olivetti's ActiveBadge system. Addressing uses context information (“is this user in the building?”) to dynamically adjust traditional email mailing lists. Messaging is similar to context-aware call routing, but will instead route text messages to any proximal device capable of displaying text information. Caller awareness provides callers with information on their contact's context, so that they can actively choose not to call at inappropriate times. Screening is an approach that works in contrast to caller awareness; it filters out incoming calls based on the callee's context.

The authors presented several insightful technologies which were eventually adopted by modern-day ubiquitous systems. The first of note was customizable phone ringers, depending on context. In the example, these ringers were used to distinguish the callee, whereas in current applications they have been successfully applied to identifying the caller. MIT's Active Messenger bears a striking resemblance to modern cell phone SMS capabilities. Also, AwareNex's context feature has been replicated in countless instant messaging systems. A possible extension of this idea is to incorporate automatic context sensing, via ActiveBadges or some other comparable technology, into current applications which would benefit from context information, such as an instant messaging system. Then, instead of manually setting one's IM status to 'on the phone', all one would have to do is pick up the phone. Not sitting at your workstation would change your IM status to 'away'.

The primary limiting factor in the applications presented in this paper is the technology that was available at the time it was written. Although the devices developed successfully demonstrated the concepts intended, further effort needs to be made to make these devices more marketable before they will be widely adopted, and form the ubiquitous network required for the proposed applications. It is entirely possible, however, that these advances have been made between the time this paper was published and present day.

Also, the authors made mention of certain situations where the context-aware communication applications would, for example, hold or screen incoming traffic because the callee is at the movies or eating dinner. This was only speculated about, because at the time of publishing the context sensing network wasn't pervasive enough to determine a user's location outside of the office (with the exception of GPS, but that won't work indoors). I propose that this kind of screening be included in future systems which can determine a user's context outside of the workplace, to add a measure of privacy to the system.

Context Aware Computing Applications

This paper presents an interesting summary of current (1994) activities in the field of context-aware applications. This consists primarily of applications developed for the workplace which leverage the user's physical location, as well as the locations of coworkers and resources within said workplace. The authors present four areas or application features that rely on context, implemented with Xerox PARC tabs and boards: proximate selection, automatic contextual reconfiguration, contextual information and commands, and context-triggered actions. Proximate selection is a UI technique that visually makes objects closer to the user's physical location easier to select. Automatic contextual reconfiguration refers to a process through which ubiquitous devices (boards, tabs, etc) can be accessed by a user simply by being in their immediate vicinity. Contextual information and commands can be used to display appropriate information by default, or alter the standard set of commands (or parameters to these commands) based on the user's location. Lastly, context-triggered actions are actions (in this paper, Unix shell commands) that are executed by context events, that is, when a predefined context state occurs.
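
Context-triggered actions boil down to a rule table matched against incoming context events. The sketch below is only an illustration of that shape: the event fields, the `Rule` class, and the example rule are hypothetical, and the action returns a string rather than running a shell command as the paper's system did.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class ContextEvent:
    user: str
    location: str
    kind: str  # e.g. "arriving" or "departing"


@dataclass
class Rule:
    user: str
    location: str
    kind: str
    action: Callable[[ContextEvent], str]

    def matches(self, event: ContextEvent) -> bool:
        return (self.user, self.location, self.kind) == (
            event.user, event.location, event.kind)


def dispatch(event: ContextEvent, rules: List[Rule]) -> List[str]:
    """Run the action of every rule whose context state matches the event."""
    return [rule.action(event) for rule in rules if rule.matches(event)]


rules = [
    Rule("rory", "kitchen", "arriving",
         lambda e: f"remind {e.user}: coffee is ready"),
]

print(dispatch(ContextEvent("rory", "kitchen", "arriving"), rules))
print(dispatch(ContextEvent("rory", "office", "arriving"), rules))
```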

The application features that Schilit et al. present in this paper have proven to be insightful in that they have found their way into many modern ubiquitous devices. The proximate selection UI outlined bears a striking resemblance to fisheye menus, most notably found in Apple's OS X. Automatic contextual reconfiguration performs a similar function to modern Bluetooth networks, with the exception that Bluetooth doesn't rely on a centralized network to control all devices within a building.

One issue that I believe the authors may have overlooked, especially with respect to proximate selection and contextual reconfiguration, is that of permissions. In the example of a user printing a document, the closest printer may not be the optimal choice if it is a restricted resource. Aside from this, the only element of context that the authors seemed to use was the location of individuals and resources. Other environmental variables could be utilized to augment the function of a ubiquitous device to better suit the needs of its user. For example, by monitoring the ambient noise level around the user, a phone could decide to change its ringer volume to match, or simply vibrate if the room is too noisy for the device to compete.
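
The permissions point can be sketched as a small change to proximate selection: filter out restricted resources before picking the nearest one. Everything here (the `Printer` class, the coordinates, the access lists) is a made-up illustration, not something from the paper.

```python
import math
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Printer:
    name: str
    x: float
    y: float
    # None means anyone may print; otherwise only the listed users may.
    allowed_users: Optional[FrozenSet[str]] = None


def nearest_permitted(user, position, printers):
    """Return the closest printer this user may use, or None if there is none."""
    candidates = [
        p for p in printers
        if p.allowed_users is None or user in p.allowed_users
    ]
    return min(
        candidates,
        key=lambda p: math.dist(position, (p.x, p.y)),
        default=None,
    )


printers = [
    Printer("hallway", 1.0, 1.0, frozenset({"admin"})),  # close but restricted
    Printer("lab", 5.0, 5.0),                            # farther, open to all
]

print(nearest_permitted("rory", (0.0, 0.0), printers).name)
print(nearest_permitted("admin", (0.0, 0.0), printers).name)
```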

Thursday, September 18, 2008

Debugging Reinvented: Asking and Answering Why and Why Not Questions about Program Behavior

http://www.cs.toronto.edu/~gvwilson/reading/ko-debugging-reinvented.pdf

This paper is an interesting follow up to an earlier paper I read about Why Lines. If you're not familiar, a Why Line is a debugging tool that instruments a piece of code, allows you to execute the code (for a brief time, ~ a minute), then lets you use a custom UI to ask questions about the output (e.g. Why is this circle red?). Experimental results (on an admittedly small sample size) showed dramatic improvement in debugging time.

There was a paragraph in this paper that I feel the authors glossed over. It concerned translating user submitted bug reports into Why Line questions. By applying some computational linguistics techniques, I bet it would be possible to automatically generate one or many Why Line questions from a bug report. Combine that with the previously presented technique on automatic bug triage, and you have a system which will (in theory) automatically assign bugs to particular developers, and present them with a Why Line question that will allow them to quickly assess the problem.

Also, I'm pretty sure there's something that can be gained in an organization by archiving/databasing their Why Line traces, but I'm not sure what yet.

Wednesday, September 17, 2008

Fantastic Contraption

This is the sole reason it took me all afternoon to read Jorge's paper:

http://fantasticcontraption.com/

Anchoring and Adjustment in Software Estimation

http://www.cs.toronto.edu/~jaranda/pubs/AnchoringAdjustment.pdf

This paper presented Jorge's results about anchoring and adjustment when estimating software project time consumption. Looking back at my notes on this paper, everything seems to be presented fairly well, all the numerical results seem sound. Couple of quick things:

The Anchoring and Adjustment phenomenon is clearly observable when the problem can be expressed as a number in a range. Can it be applied to other classes of estimations?

Why is the description of COCOMO so verbose? I don't see how it contributes to the paper, other than to demonstrate that current software estimation techniques don't work.

What is a null hypothesis?

Now, I want to take this opportunity to discuss an estimation technique that a former colleague of mine once described to me:

Estimating the time required to complete a software project is inherently a random activity, and it is generally accepted that estimating smaller, individual tasks within a project is more accurate than trying to estimate the project as a whole.

Let's divide our hypothetical project P into n subprojects, P0 ... Pn-1. For each of these, instead of guessing a completion time, we provide a low ball and high ball estimate (e.g. P0 will definitely take more than a week, but less than 4).

We take the sum of all the low ball estimates, and the sum of all the high ball estimates, and we get two figures, one for the earliest possible completion time for P and one for the latest. While this may be sufficient for some, one further refinement is to apply a Gaussian distribution between these two figures. That way you could say, with some degree of justification, that project P has an 80% chance of finishing within X weeks.
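
Here is a small sketch of that arithmetic using Python's `statistics.NormalDist`. The description above doesn't say how the Gaussian is fitted, so this assumes the summed low and high figures sit two standard deviations either side of the mean; that choice, the function name, and the sample ranges are all assumptions.

```python
from statistics import NormalDist


def project_estimate(ranges, confidence=0.80, sigmas=2.0):
    """Combine per-task (low, high) estimates into a project estimate.

    Returns (earliest, latest, finish_by), where finish_by is the duration
    the project should come in under with the given confidence.
    Assumes earliest and latest lie `sigmas` standard deviations either
    side of the mean, and that latest is strictly greater than earliest.
    """
    earliest = sum(low for low, _ in ranges)
    latest = sum(high for _, high in ranges)
    mean = (earliest + latest) / 2
    sigma = (latest - earliest) / (2 * sigmas)
    finish_by = NormalDist(mean, sigma).inv_cdf(confidence)
    return earliest, latest, finish_by


# Hypothetical subprojects P0..P2, in weeks: (low ball, high ball).
ranges = [(1, 4), (2, 3), (1, 2)]

earliest, latest, finish_by = project_estimate(ranges)
print(f"between {earliest} and {latest} weeks, "
      f"80% chance of finishing within {finish_by:.1f} weeks")
```

Simply summing the extremes is pessimistic about the spread, since it is unlikely that every task hits its high ball at once; a tighter model would combine per-task variances instead.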

Ferenc, I apologize if I've gotten any of this wrong.

Tuesday, September 16, 2008

Google Integration and Tools

Couple of interesting Google tools that I have come across in the last couple of days, probably old hat to most of you.

Google Reader (http://www.google.com/reader). This is an online, customizable RSS/Atom aggregator. Pretty handy for keeping track of Slashdot, ZDNet, IEEE, and, for example, 15 graduate student blogs (like this one).

This last one doesn't exist yet, but I want my Facebook events to be synchronized onto my Google calendar. Sounds like a rainy Friday afternoon project.

A Field Guide to Computational Biology

http://www.cs.toronto.edu/~lilien/CSC2431F08/readings/CompBioGuide.pdf

This magazine article presents the opinion that computational biology will without a doubt be the cause of future miraculous advances in biology, disease, and gene research. It also emphasizes that traditional biologists will have to get used to higher level mathematics than they are accustomed to, and to much more interdisciplinary interaction with others.

Good, not great.

Can a Biologist Fix a Radio?

http://www.cs.toronto.edu/~lilien/CSC2431F08/readings/CanBiologistRadio.pdf

Explores the idea of having a formal language for expressing biological processes and structure, using the analogy of trying to fix a radio. An engineer would use a 'formal language' to describe the internal structure of the radio (amplifier, 10k ohm resistor, etc), and deduce the problem that way. A biologist would likely spend years of comparative research on other working radios, classifying parts based on phenotype, etc. This system quickly becomes too complicated for any one person to understand, primarily due to conflicting definitions of parts originating from different researchers. The author proposes that formal methods and language for biology will make cellular analysis much easier and vastly different, in the same way PowerPoint revolutionized slide-based presentations, and that biologists must catch up or be left by the wayside.

Monday, September 15, 2008

NaviCam

In Ubicomp today, while discussing a paper previously presented in this blog (see The Human Experience), we got to see a video of Sony's NaviCam system in action (see link http://www.sonycsl.co.jp/person/rekimoto/navi.html). Seeing this system actually working was pretty cool, and a handful of applications extending this functionality immediately jumped to mind, all of which could be potential thesis topics, or business plans.

Combining the augmented reality capabilities of the camera & display setup with a wearable, glasses-based display could allow application developers (like me :P) to create real-time navigation software, meta-information pop-ups, and all kinds of cool stuff!

One immediate thought that jumped to mind as a detractor was the image I had seen of some geek in the wearable computing field with a webserver in a backpack that he lugged around everywhere. No consumer would buy that, but then I thought that if this backpack could be shrunk down to the size of an iPhone, which it almost certainly could, then this would be a viable market opportunity.

One target audience for such applications would be the military. Personnel in the field could have information on way points, location of friendly/hostile persons, radar & network coverage, etc, overlaid on top of real world vision, eliminating the need for secondary maps & GPS devices. And, the military's level of network connectivity is legendary, so gaining access to this information is essentially a solved problem.

Iunno, just an idea. Sounds like it would be fun to tool around in one of these sets, seeing maps and stuff overlaid on regular vision.

Sunday, September 14, 2008

Gregory D. Abowd et al. The Human Experience

Gregory D. Abowd et al. present a follow-up to Weiser's The Computer for the 21st Century. In this paper, the authors closely examine some of the ideas proposed by Weiser, and explore the changes in traditional design and development patterns needed to adopt ubiquitous applications. This is loosely broken down into three categories: defining physical interaction models to and from ubiquitous computing devices, discovering ubiquitous computing application features, and evolving the methods for evaluating human experiences with ubicomp. The physical interaction problem examines new ways to gather input from a user, beyond simple keyboard/mouse combinations. This includes gesture-based and implicit input. Also, non-standard output methods are explored, different from the traditional video display (e.g. ambient output). Utilizing these non-standard IO methods to create a 'killer app' is the next challenge the authors discuss. Applications which use the user's context (location, identity, etc) as input for providing useful features, as well as relative changes in context, are discussed. The use of changes in context leads to the problem of continuous input, whereby applications must respond to constant subtle input from users over extended, possibly infinite, time frames. This is in sharp contrast to current applications, which are meant for discrete usage sessions (e.g. a word processor). Lastly, the authors propose that traditional HCI evaluation techniques will be at a disadvantage when used with ubiquitous computing applications, and so they introduce three new cognition models: Activity Theory, Situated Activity, and Distributed Cognition.

This paper provided several new examples of ubiquitous computing devices and applications, and served to 'pin down' some of the specific details that Weiser left for further research. The showcase of new devices and technologies clearly illustrates the path of development between the time the Weiser paper was published (1991) and 'present' (2002). One noteworthy insight presented by Abowd et al. is that of the physical means of interaction with ubiquitous devices, drawing particular attention to 'implicit input'. It seems apparent that the future of the embodied virtuality will not be interfaced with a keyboard and mouse, and the devices presented in The Human Experience demonstrate subtle input and output methods (e.g. a network traffic monitor) which clearly show success.

Although this paper provided excellent insight into the concepts previously proposed by Weiser, I feel that it didn't introduce as much original work as it could have. It can be argued that this was the purpose of the paper, in which case it has succeeded. However, I feel it may have contributed more value to the scientific community had it contained more unique ideas. Apart from this, the paper presented a discussion of using ubiquitous computing applications to perform one of the fundamental activities humans perform on a daily basis: capture and access of data. That is, the recording of information presented by a colleague and summarizing it for later retrieval. It is my opinion (and this clearly is not, nor should it be, shared by all) that automating a fundamental process such as this will contribute to a strong dependence on said application, reducing an individual's ability to be self reliant. In addition, it is not beyond the realm of possibility that regularly exercising the intellectual system by absorbing and recording information in this way is beneficial, and removing the need for this exercise could have negative impacts on cognitive ability. This, however, is just my opinion. The implications of ubiquitous devices represent an area of further research, which should be pursued with as much importance as the technical developments in the field of ubicomp.

Mark Weiser's The Computer for the 21st Century

The paper by Mark Weiser presents the current (as of time of publishing, 1991) efforts of Xerox PARC in the field of ubiquitous computing devices and applications, and follows this with speculation/projection of possible future scenarios. At time of publishing, these devices were divided into three classifications: tabs, pads, and boards. A tab is similar in size and function to a post-it note, except that it contains a dynamic display. A pad can be thought of as a scrap piece of paper, but again with a dynamic display and stylus-based interface. A board functions like a dynamic white board, or any large display. The intermixed use of these three classes of devices, the author argues, will form the basis of the future of ubiquitous computing in the “embodied virtuality”. Extrapolation of trends in technology evolution suggests that it will be possible to implement Weiser's embodied virtuality in the not-too-distant future.

I found the discussion of 'current' technology and research into ubiquitous computing devices quite interesting. I was previously unaware of the efforts of Xerox PARC toward these ends. The details of these devices' operation were, at the time, major innovations and greatly enhanced the field of computing. Also, the author presents a very insightful discussion on how reading became a ubiquitous technology, and how this can be used to define computing (i.e. anywhere you currently read, you could potentially compute as well). This idea, coming from the father of ubicomp, is of enormous significance to the scientific community. The paper mentions that this idea of ubiquity, in the same way as reading, means a drastic change in not only application features, but methods for measuring terse human actions which will eventually define the features of ubiquitous applications.

While the majority of Weiser's paper was interesting and beneficial to the scientific community, there were some holes which need to be filled in before ubiquitous computing can fully take hold. The issue of privacy and security is one which I feel requires further investigation. The proposal to have relatively loose security seems like a bad idea to me. For example, in a current system, it is impossible to 'break in' until someone finds a way never thought of by the developers of the system. This undermines Weiser's argument that someone can 'break in' but cannot do so unnoticed. This presents a possible area of further research: how to guarantee that an unwanted intruder will leave 'fingerprints' which can readily be discovered. Also, if embodied virtuality were to be implemented, I believe much more emphasis should be placed on an individual's privacy than is proposed in The Computer for the 21st Century. Simply having a system that knows where an individual is located at all times represents a serious invasion of privacy. I would propose a system where an individual can disconnect themselves from the ubiquitous network, should they so desire, and possibly 'black-out' zones, in which no ubiquitous devices will operate.

A Reference Architecture for Web Servers

Don't let this paper's name fool you: this one is (at least in my opinion) presenting a process for semi-automatically generating reference architectures for any established domain, not just web servers. Having said that, a lot of really interesting, useful information is presented about three popular servers: Apache, AOLserver, and Jigsaw. Not particularly deep, however. The methods for automatically generating the architecture were glossed over, as the authors were using an existing tool. Next, they simply refined the architecture until it fit the 3 example architectures. I think it could use more meat. Still, a very interesting (and dare I say, entertaining) read, especially if you're familiar with the web domain.

Oh, and there were no 'results', per se. Wonder if they should have some.

Applying Complexity Metrics to Measure Programmer Productivity in HPC

This paper is based around a fairly ambitious task: to measure programmer productivity. As far as I understand it, measuring productivity of a worker outside of an assembly line scenario has been an open topic for as long as there have been assembly lines. Despite that, the authors discuss the set of tools they have used to instrument programmers' workstations, and compare and contrast two methods of performing the same task: from a command line interface and from a GUI. Much to everyone's surprise, the GUI was quicker and easier to use!! (insert sarcasm) The two methods are measured using a metric devised by the authors, which it seems to me suffers from the same problems as the paper on Measuring Configuration Complexity I read last week; the outcome depends wholly on the heuristics used to define and quantize a single step in work (or configuration), which is still a very open research problem. Also, one of the fundamental assumptions used to build the paper's heuristics, that more steps equals more complexity, is disproved by example in the paper's final movement! Overall, a valiant attempt, but I don't think I would endorse this one.

Agile Software Testing in a Large Scale Project

A fantastic article! This paper presents the findings of a team of developers employed by the Israeli Air Force. The team began the project with a very strong focus on Agile processes, particularly TDD and short iteration lengths (at least these are the ones that stuck with me). By its own account, the team has (as of publication) one of the most complete sets of data recording an Agile project from inception to delivery. The results show a project which appears to have run rather smoothly; the up-front testing influenced the number of defects (compared to what baseline, though?).

More opinions, to be continued ...

In Praise of Tweaking

This paper presents the results of the second annual Matlab tweaking competition. In this contest, users are tasked with solving an algorithmic problem while minimizing a 'score' created by composing various metrics about the algorithm's performance, such as speed and resources consumed. All submitted algorithms are immediately available to all other contestants, so a new solution can be (and is encouraged to be) created by 'tweaking' someone else's work. In this way, programmers are encouraged to adopt a more community-centered process.

Although the idea presented was intriguing, this is a magazine article and contributes little to the scientific community.

Storytest Driven Development

Started off promising, but didn't really deliver. Who writes these story tests? I don't think a customer alone will go through the effort of formalizing the requirements using FitLibrary, but the developer shouldn't be writing them alone, either, because he shouldn't be deciding the requirements of the system. The paper became harder to follow as it progressed (lots of flipping between figures and paragraphs of text). Also, where are the results? The idea of story tests is presented well enough, but it's not novel, and there is no indication in this paper of who actually uses this approach. Maybe I'm being too picky or it is too late at night. Maybe read this one again later.

Automatic Bug Triage Using Text Categorization

An extremely fascinating topic concerned with improving the development process, not only in open-source projects, but in any project with a reasonably sized team and automatic bug tracking. A real money saver! A well presented paper (even though I got sort of lost in the details of the Bayesian algorithm). Results were presented clearly, even if they weren't as accurate as the authors would have liked. No effort is made to disguise the inaccuracy of the method. I think this idea is worth looking at; it just needs the Bayesian algorithm swapped out for one that actually works.
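For anyone else who got lost in the Bayesian details, here is a minimal sketch of the general idea of text categorization applied to triage: a tiny multinomial Naive Bayes classifier with add-one smoothing that picks the developer whose past reports look most like the new one. This is not the authors' implementation; the class name, the developer names, and the toy reports are all made up for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Split a bug report into lowercase words (deliberately naive)."""
    return text.lower().split()


class NaiveBayesTriager:
    """Toy multinomial Naive Bayes; a hypothetical helper, not from the paper."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # developer -> word -> count
        self.report_counts = Counter()           # developer -> number of reports
        self.vocabulary = set()

    def train(self, reports):
        """reports is an iterable of (report_text, developer) pairs."""
        for text, developer in reports:
            self.report_counts[developer] += 1
            for word in tokenize(text):
                self.word_counts[developer][word] += 1
                self.vocabulary.add(word)

    def scores(self, text):
        """Return a log-probability score per developer for a new report."""
        total_reports = sum(self.report_counts.values())
        vocab_size = len(self.vocabulary)
        result = {}
        for developer, n_reports in self.report_counts.items():
            # Prior: fraction of past reports handled by this developer.
            log_prob = math.log(n_reports / total_reports)
            total_words = sum(self.word_counts[developer].values())
            for word in tokenize(text):
                count = self.word_counts[developer][word]
                # Add-one (Laplace) smoothing so unseen words do not zero out the score.
                log_prob += math.log((count + 1) / (total_words + vocab_size))
            result[developer] = log_prob
        return result

    def predict(self, text):
        """Suggest the developer with the highest score."""
        developer_scores = self.scores(text)
        return max(developer_scores, key=developer_scores.get)


# Made-up training data: past reports and who fixed them.
triager = NaiveBayesTriager()
triager.train([
    ("crash when rendering html table", "alice"),
    ("layout engine crash on nested table", "alice"),
    ("network timeout during download", "bob"),
    ("proxy settings ignored for download", "bob"),
])
print(triager.predict("crash in table layout"))
```

Word counts are the only evidence a classifier like this uses, so vague or atypical reports are easy to misroute.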

Presentations by Programmers for Programmers

Sounds like a very useful tool (gives the ability to queue up a live IDE session, with the UI tuned for a presentation, alongside other presentation materials such as PowerPoint slides). I'm a bit skeptical as to whether the up-front cost of setting up this queue is worth the added visual appeal during the presentation (it would have to be more of a formal programmer-to-programmer presentation, like at a conference). The authors combined some existing products to create this. However, where are the results? There is no study or empirical evidence to support that this product is actually useful. I would classify this as a report on a new product, not necessarily a research paper.

An Approach to Benchmarking Configuration Complexity

An interesting idea: this paper attempts to quantitatively measure the effort required to properly configure a new piece of software. In their example, they use a web container. This could provide a way to measure, and presumably compare, the configuration effort required by different products. The problem I found with this method was that the fundamental step in the process, determining the atomic configuration actions and assigning a probability of failure to each of them, is an open research problem, and so the results reported in this paper are speculation based on hand-tuned data. Seems like they tried to slip this one past me.
Also, really bad graphs and mislabeled figures.
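To illustrate why the hand-tuned failure probabilities bother me, here is a rough sketch. It is not the paper's metric: I am assuming, purely for illustration, that a configuration procedure is a list of atomic actions, that each action fails independently with some probability, and that the procedure succeeds only if every action does. The function name and the numbers are mine.

```python
import math


def success_probability(failure_probs):
    """Probability that every atomic action succeeds, assuming independence."""
    return math.prod(1.0 - p for p in failure_probs)


# Ten hypothetical atomic actions, scored under two different hand-tuned guesses.
optimistic = [0.05] * 10   # each action assumed to fail 5% of the time
pessimistic = [0.10] * 10  # each action assumed to fail 10% of the time

print(round(success_probability(optimistic), 3))
print(round(success_probability(pessimistic), 3))
```

Under these assumptions, moving the per-action guess from 5% to 10% drops the chance of a clean run from roughly 60% to roughly 35%, so the final number says as much about whoever picked the probabilities as about the product being configured.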

Inaugural Post

So this is the first post on my shiny new blog. I tried blogging once before, but it was pretty much just an exercise in setting up a persistent web container on my own hardware, and seeing how long it took for me to get bored of it. Two weeks. Now, seeing as I have neither the time nor the inclination to run dedicated web hardware, I'm using Blogger.

What will follow will be a series of short posts containing my opinions of papers I'm currently reading. Enjoy :)