Showing posts with label build engineering. Show all posts

Monday, February 1, 2010

Left-Fold for Bash

I'd like to share a recent bash programming experience I've had. It began while processing the reams of data generated in my M.Sc. research study. I was producing long lists of frequency data in text files, and had the need to sum up all the lines in these tables. This is of course a trivial problem in many languages, and I had a wide array of options available to me:
  • I could write another ant task to do the summing (this required more work than I was willing to invest, as ant isn't really suited for computational tasks)
  • I could write a python script that took the contents of the specified file and returned the sum. I didn't really like this approach because it involved yet another file in my build process, invoked from the ant task. I always sort of thought that if you were forced to use , you were performing a task beyond the scope of your tool.
  • I could skip the generation of the table and go straight to the sum. A few of the tables were created with XSLT, so this was a valid option. However, my XSLT programming ability is very much trial-and-error based, so I thought this might take some time. Also, some of the other tables, created with grep, would not be affected.
  • Write it in shell. I liked this idea. I really liked the feel of being able to just pipe something to 'sum' and have the sum returned. So this is what I chose.
My first version looked like this:
#!/bin/bash
# Sum the integers read from stdin, one per line.

sum=0
while read -r line
do
    sum=$((sum + line))
done

echo "$sum"
exit 0
It did the trick quite well. I had suggestions from office mates for the following alternatives:
Using python (courtesy Aran Donohue):

python -c "import sys; print(sum(float(x) for x in sys.stdin.read().split()))"

Using tr (courtesy Zak Kincaid):

cat numbers | tr '\n' '+'|head --bytes='-1'|bc
(Note that this version doesn't quite work: bc throws a syntax error, and I'm not exactly sure why.)
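A variant of the same idea, sketched here rather than taken from Zak's suggestion, is to let paste build the expression. It joins the lines with '+' and ends the output with a single newline, so there is no trailing operator to strip. The function name join_plus is hypothetical, and whether this avoids the bc error above is an assumption I have not verified against that setup.

```shell
# join_plus (hypothetical name): join the lines on stdin with '+'.
# paste -s serialises all input lines onto one line; -d+ sets the delimiter.
join_plus() {
    paste -sd+ -
}

# Build the expression from three sample lines.
printf '1\n2\n3\n' | join_plus    # prints 1+2+3
```

The resulting expression can then be piped to bc, as in `join_plus < numbers | bc`.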


Using my original design, I realized that if I abstracted out the operator, I could use this script to perform any 2-operand function I wished on the list, essentially creating a basic left fold:

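#!/bin/bash
# Fold the integers on stdin with a binary arithmetic operator.
# The operator is the first argument; it defaults to + when blank.
op=""
if [ -z "$1" ]; then
    op="+"
else
    op=$1
fi

sum=0
while read -r line
do
    # $op is expanded first, then the whole expression is evaluated.
    sum=$((sum $op line))
done

echo "$sum"
exit 0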

I've done a little bit of error checking, to see if the parameter supplied is blank, and if so replace it with + by default.

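I've only seen this work for '+' and '-'. If I pass a bare * as the operator, it breaks: the calling shell expands the unquoted wildcard/multiplication character into the listing of the current directory before the script ever sees it. Quoting it ('*') avoids the expansion, although with sum starting at 0 the product would still always be 0.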
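A minimal sketch of how the fold could take a starting value, so that multiplication can begin from 1 rather than 0. The function name fold_left and the second argument are my own illustrative additions, not part of the script above. The operator is quoted at the call site so the shell does not expand it as a wildcard.

```shell
# fold_left OP INIT (hypothetical helper): left-fold the integers on stdin.
fold_left() {
    local op="${1:-+}"     # binary arithmetic operator, defaults to +
    local acc="${2:-0}"    # starting accumulator, defaults to 0
    local line
    while read -r line; do
        # $op is substituted first; no globbing happens inside $(( )).
        acc=$(( acc $op line ))
    done
    echo "$acc"
}

printf '2\n3\n4\n' | fold_left '*' 1    # prints 24
printf '2\n3\n4\n' | fold_left - 10     # prints 1
printf '2\n3\n4\n' | fold_left          # prints 9
```

Because the accumulator is threaded through from the left, `fold_left - 10` computes ((10 - 2) - 3) - 4, which is what distinguishes a left fold from a right fold for non-associative operators.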

Wednesday, June 3, 2009

A User Study for Mutation-based Testing Analysis

I recently read some material by Andreas Zeller in which he discusses the merits of using mutation testing as a method of verifying the quality of a test suite for a piece of software. These methods are meant to expose tests that perform an action and assume it was performed properly so long as the system under test does not throw an error; such tests never verify that the action resulted in the desired program state. Code mutation (either on source or directly on the bytecode), when combined with accurate code coverage data, can identify these deficient tests by mutating the code they cover and observing which tests do not fail.
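To make the idea concrete, here is a toy sketch in shell. Everything in it (the add function, its mutant, and the two tests) is invented for illustration and is not taken from Zeller's material or from any mutation testing tool. The weak test only checks that the call succeeds, so the mutant survives it; the strong test checks the resulting value, so the mutant is caught.

```shell
# System under test (hypothetical): add two integers.
add()        { echo $(( $1 + $2 )); }
# A mutant of add: the + operator has been replaced with -.
add_mutant() { echo $(( $1 - $2 )); }

# Weak test: only checks that the call succeeds, never inspects the result.
weak_test()   { "$1" 2 3 >/dev/null; }
# Strong test: verifies the value that was produced.
strong_test() { [ "$("$1" 2 3)" = "5" ]; }

# Run both tests against the original and the mutant.
for impl in add add_mutant; do
    weak_test "$impl"   && w=pass || w=fail
    strong_test "$impl" && s=pass || s=fail
    echo "$impl: weak=$w strong=$s"
done
# prints:
# add: weak=pass strong=pass
# add_mutant: weak=pass strong=fail
```

A test that still passes against the mutant, like weak_test here, is the kind of deficient test a mutation report would flag.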

I believe that there is value in using this data, if it can be presented in the appropriate manner. If a developer has to spend an hour to generate the mutation report and cross-reference it with code coverage, then the investment likely outweighs the benefit. However, if I can arrive at my desk at 9am and have in my inbox a build report, a test report with coverage, and a mutation report for every branch, all of which are properly hyperlinked to each other and backed up on central storage, then I would certainly use it. The problem here seems to be that this degree of automation and integration is hard to set up, and often delicate once in place. It seems that some sort of standard platform is called for, one that build engineers can use to integrate all of their reports, packaging operations, tests, etc. However, I have yet to see anything more sophisticated than Ant or Perl in widespread use by engineers. Maybe I just have an unrepresentative sample, though.

Thursday, January 22, 2009

Calling all Build Engineers

During my recent CSC301 lecture, I was surprised to find out how few students were aware that they could actually get a job as a full-time build engineer. It makes me think that there's probably a study to be done in this area.
  • To what extent are build processes and project maintenance discussed in CS education?
  • What is the market value of a professional build engineer?
  • How do build engineers perform their job? Is there a way it can be improved?
The last point is something that causes me mild concern. It seems that, in some cases, the methods used to set up and execute project-wide builds can be semi-structured voodoo.