Scientific Software Testing: Analysis with Four Dimensions
Diane Kelly and Stefan Thorsteinson, Royal Military College
Daniel Hook, Engineering Seismology Group Solutions

// An exercise to analyze scientific software testing in terms of context, goals, technique, and adequacy evolves to make better use of the scientist’s dual role of developer and user. //
Software testing is a time-consuming, often frustrating activity,
and the software engineering literature
related to it is overwhelming—especially for scientists writing computational software in scientific disciplines
outside software engineering. Judith Segal studied the general cultural differences between the software engineering and scientific communities.1 Even
though both communities emphasize
testing’s importance, the gap between
their respective understanding of testing concepts seems particularly wide.2
We conducted an exercise to test an
example of scientific software. The interesting outcome is not so much the
number of code defects the testing activity detected but the form and evolution of the activity itself. To analyze
how the activity evolved, we applied a
four-dimensional view of testing. As far
as we know, this view is novel. The four
dimensions help shift the view of testing
from a single attribute of the software
(for example, “It’s tested!”) to a more
complete picture that lets us understand
the differences in concepts and priorities between testing as it’s described in
the software engineering literature and
as it’s applied to a scientific application.
Four Dimensions of Testing
The test dimensions that guided our
analysis were context, goals, techniques, and adequacy. These four dimensions began as eight, which one of
the authors (Diane Kelly) used to teach
testing in a graduate course and at
instructional workshops.3 She found
significant overlap in the concepts included under each of the original dimensions, which allowed her to reduce
their number to four. These four dimensions represent an orthogonal minimal set, sufficient to support an interesting analysis of a testing activity.
Context
To fully understand context in terms of
what mattered to the testing exercise,
we had to cast a wide net. We included
the software’s historical and technical
background, its applications, and the
roles and knowledge of its users and developers, as well as the details of what
we needed to test for the exercise itself.
Goals
Test goals are sometimes confused
with statements such as, “I need to do
boundary value testing.” However, testing is an information-gathering activity.
The initial information-gathering goal
for our exercise was simply to better
understand the domain content of the
software and how it was expressed in
code. As our understanding increased,
we articulated more focused goals.
Techniques
Our exercise included both static and
dynamic techniques. For example,
static reviews of source code address
maintainability goals better than running the executable. On the other hand,
running the software dynamically on
target platforms works better for goals
related to accuracy.
In our context, we found it important to consider the tester as an active
part of the system under test. The tester’s knowledge and goals were key factors in the choice of technique.
Adequacy
Adequacy often subsumes the goals of
a testing exercise. Software engineering
literature often reduces adequacy to a
measure of coverage or bug counts. If
time-to-market is the project’s highest
priority, adequate testing might depend
on when time or money runs out. In
safety- or business-critical situations,
adequacy might reflect the completion
of a predetermined verification exercise
or the reduction of failures below a statistical limit.
In our case, goals determined adequacy, and the tester determined
whether the goals were satisfied. This
is a softer measure of adequacy, but it
can be perfectly valid given the scientific context and goals.
Testing an Astronomy
Software Package
Our exercise involved testing StarImg,
an astronomy software package. All
three of us have undergraduate degrees
in science or engineering disciplines
other than computing or software. Our
graduate degrees differ in that one of
us, Stefan Thorsteinson, has a graduate degree in physics, while the other
two, Daniel Hook and Diane Kelly,
have graduate degrees in computer science and software engineering. We felt
that this mixture of backgrounds could
help us bridge the cultural gap that Segal described.1
Each of us had different aims with
the project, but the exercise increased
everyone’s understanding of what it
means to test scientific software, particularly in the context of a single scientist making changes to an industrial
product. It’s common for scientists to
have dual roles of developer and user
with scientific software.
The StarImg Context
Our study’s expanded context included
not only the software’s technical aspects but also the scientific domain’s
content as it related to the software’s
functionality, historical development,
and future work. In other words, the
context we considered was broad and
proved to be an important factor in understanding effective testing.
The StarImg software package normally runs automatically (without human intervention) to detect artifacts in
astronomical imagery. Its development
was spurred by a wealth of imagery
that became available in 2006 from a
new observatory. However, as is often the case with scientific software,
StarImg adapted parts and ideas from
older software packages that involved
several developers scattered across different institutions.
Technical context. The software itself is not large—about 10,000 lines
of code (LOC) written in Matlab and
C++. However, the input images that
StarImg analyzes are each on the order
of 1 to 4 Mbytes. The output is a set
of image coordinates marking the locations of identified artifacts, along with
metrics calculated for each one. In its
normal use, the package runs nightly,
analyzing hundreds of astronomical
images, together called bulk images.
We chose the latest version of the
StarImg code base for our testing exercise. Only its C++ modules are reasonably well documented. There are technical notes documenting algorithms
copied from an earlier code package,
but they don’t necessarily correspond
exactly to algorithms presently in the
code. No corresponding documentation was ever developed for StarImg. Its
code is sparsely commented, and names
of variables and functions that made
some sense to the original author are
not obvious otherwise.
Scientific context. StarImg analyzes
an image taken in star-stare mode.
This is a wide-field image taken by a
charge-coupled device (CCD) camera
and telescope. The image might contain
faint artifacts left by objects that are
moving relative to the brighter background stars. The faint artifacts can
be difficult to detect because of background noise, of which CCD noise is
the largest contributor. CCD noise is
mostly from dark current, the thermal
energy emitted from the silicon lattice
composing the CCD. The camera records this thermal energy as a signal,
and its effects are related to the image’s
exposure time. In addition, individual
pixels can exhibit higher-than-normal
dark current. Nevertheless, because
dark current is a fixed characteristic of
the CCD, software can be written to
calculate suitable corrections ahead of
time.
Other contributors to background
noise are light pollution from natural
sources (bright stars, zodiacal light,
and clouds) and artifcial sources
(nearby electric lights and telescope
imperfections).
A major function of StarImg is to estimate the background noise and subtract it from the entire image. If the estimation is in error, the weak artifacts
StarImg is trying to locate can be lost.
After subtracting the background
from the image, StarImg creates a binary version of it. All image signals are
identified as either stars or artifacts.
For each artifact, StarImg calculates
metrics such as length, orientation,
brightness, signal-to-noise ratio, and
eccentricity. Then it uses the metrics’
values to determine whether to include
an artifact in its output.
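
To make the flow concrete, the C++ sketch below shows the general shape of such a pipeline. It is purely illustrative: the names (Image, Artifact, estimateBackground, binarize, keepArtifact) and the simplistic calculations are ours for this article, not StarImg’s.

#include <cstddef>
#include <vector>

// Illustrative sketch only: the names and structure here are ours, not StarImg's.
using Image = std::vector<std::vector<double>>;  // pixel intensities

struct Artifact {
    int row, col;         // image coordinates of the detection
    double length;        // extent of the streak, in pixels
    double brightness;    // integrated signal
    double snr;           // signal-to-noise ratio
    double eccentricity;
};

// Crude global background estimate; a real estimator models spatial
// variation from dark current, bright stars, and other light pollution.
double estimateBackground(const Image& img) {
    double sum = 0.0;
    std::size_t n = 0;
    for (const auto& row : img)
        for (double p : row) { sum += p; ++n; }
    return n ? sum / n : 0.0;
}

// Subtract the background estimate and binarize against a noise threshold.
Image binarize(const Image& img, double background, double threshold) {
    Image out = img;
    for (auto& row : out)
        for (double& p : row)
            p = (p - background > threshold) ? 1.0 : 0.0;
    return out;
}

// Decide, from its metrics, whether a candidate artifact is reported.
bool keepArtifact(const Artifact& a, double minSnr, double minLength) {
    return a.snr >= minSnr && a.length >= minLength;
}

If the background estimate is too high, weak artifacts vanish at the binarize step; if the metric thresholds are off, they are dropped by keepArtifact. Either way, they never reach the output, which is the false-negative risk discussed later.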
Historical context. StarImg’s background noise algorithms were ported
to Matlab from an earlier C++ image
analysis package. An even earlier detection package (Match) supplied some
of the algorithms but no code. Match
requires a priori information on an artifact’s size and orientation to identify
and extract its signal. StarImg doesn’t
use a priori information, which allows it to detect unexpected and multiple artifacts that Match wouldn’t see.
However, not taking advantage of prior
knowledge about artifact characteristics reduces StarImg’s ability to detect
very weak signals. Estimates of the artifacts it misses range from 10 to 25
percent. StarImg’s performance on the
large, bulk-image sets was nevertheless
sufficient to justify its use.
One astronomer developed StarImg’s
initial set of detection and sorting algorithms and tested them against sample
benchmark imagery. The algorithms
were then passed on to two other astronomers and a physicist, and the four
scientists carried out further tests, development, troubleshooting, and debugging. Their development work was
at times functionally separated and at
other times overlapping, particularly
during testing.
Current development and use.
StarImg has no formal maintenance
or development plan. Scientist users
fix problems as needed and send updates to one of the original scientist-developers, who acts as gatekeeper for
changes. The scientists perform regression tests by choosing samples from the
thousands of archived images. As is
typical with almost all scientific software, the scientist must use experience-based judgment to determine if an update or new functionality is working.
StarImg is currently deployed in an
automated image acquisition and processing environment. Scientists are continuing work on it to handle different image
types from new observatories. Because
each observatory is in a different location with its own camera and telescope,
each produces different image sizes and,
most importantly, different image backgrounds with different noise and signal
characteristics. Other new development
has added the ability to track artifacts
as well as identify them.
Initial Goals for the Testing Exercise
We began the exercise with different
testing goals. Thorsteinson had just inherited StarImg and would be adding
a significant new function. He was interested in assessing the trust he could
have in the current software package.
His trust touches his dual roles: as a
user, he must trust the software’s output; as a developer, he must trust that
he can successfully alter the software
without destroying the trust he needs as
a user.
Kelly and Hook, on the other hand,
were interested in the effectiveness of
two different quality assessment techniques as applied to scientific software.
This led them to choose techniques independent of the scientist’s goals. Both
the techniques and the goals evolved as
the exercise took place. In a research
environment, this might be acceptable,
but it’s not the most effective way to
proceed in industry. By the end of the
exercise, both the goals and the techniques had crystallized into something
far more useful for the scientist.
Initial Selection of Techniques
The first of the two major software engineering activities planned for StarImg
was to create a set of unit tests that
Hook could assess for its effectiveness
in the context of scientific software and
output accuracy. We asked Thorsteinson to generate a set of unit tests for
several StarImg functions. We wanted
to automate the test execution, including the decision process for determining whether the test was successful. The
scientist’s goal was to conduct some specific in-depth StarImg testing. The software engineer’s goal was to better understand how to test scientific software.
The ultimate goal was to provide guidance to scientists in their choice of tests.
The second activity was to carry
out a software inspection of StarImg.
Inspection has been described as the
single most effective software quality
assessment activity.4 By inspection, we
mean a formalized static assessment
of a software product—usually source
code—that includes a well-defined process and results in a record of found
code defects. Researchers at the Royal
Military College of Canada (RMC)
have developed and evaluated an inspection technique, called task-directed
inspection,5,6 that meets this purpose.
It integrates code inspection for defects
with production of a useful product
such as design documentation. Thorsteinson agreed to use this technique to
write a functional description for each
StarImg function and record what code
defects he found in the process.
Adequacy Criteria to Judge Completion
Initially, we defined adequacy, or the stopping criterion for the exercise, in relation to the techniques: the exercise
would be complete when all software
functions were inspected and unit tests
that exercised every line of code were
created.
This is a common process-based
choice for adequacy in software engineering. The problem with it is the
lack of focus on either the product or
the person involved. For scientists interested in advancing their theoretical
or engineering understanding, this approach is tedious and lacking in motivation. As our exercise progressed, the
adequacy criteria shifted to a focus on
both the product and the scientist.
How the Exercise Unfolded
For a relatively small piece of code and
two reasonably well-defined software
engineering activities, it was surprising
how soon the process activities changed.
Testing from the Scientist’s Viewpoint
The scientist started creating tests for
the simple low-level functions that
didn’t call any other function. Input was
either a single value or an array of values. The testing technique was the familiar white-box technique driven by
statement coverage. For each function,
the scientist created enough tests to ensure each line of the code was executed
at least once. For the simple functions,
he verified the coverage by hand because
he considered the effort and time needed
to learn and adapt a coverage tool to be
prohibitive. However, as the functions
increased in size, code coverage became
too difficult to track manually.
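
As a rough illustration of this style of unit testing, the sketch below exercises a hypothetical low-level helper of our own invention (clampPixel is not a StarImg routine); three inputs are enough to execute every statement at least once.

#include <cassert>

// Hypothetical low-level helper of the kind tested at this stage:
// clamp a pixel value into a valid intensity range.
double clampPixel(double value, double lo, double hi) {
    if (value < lo) return lo;
    if (value > hi) return hi;
    return value;
}

int main() {
    assert(clampPixel(-5.0, 0.0, 255.0) == 0.0);     // below range
    assert(clampPixel(300.0, 0.0, 255.0) == 255.0);  // above range
    assert(clampPixel(42.0, 0.0, 255.0) == 42.0);    // within range
    return 0;
}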
Writing unit tests for the low-level
functions provided a degree of confidence in the code, but the overall usefulness was questionable. Providing full
coverage required considerable time.
Full coverage meant including test cases
with malformed inputs to exercise error-checking code. The scientist found
problems in the low-level functions, but
their significance was low compared to
the time expended to find them. In addition, the low-level functions are self-contained and therefore unlikely to
change as StarImg is developed further.
So unit tests developed for these functions are unlikely to be used again.
At this point, the scientist’s goal began to shift toward supporting planned
software changes. The adequacy criterion was also shifting to include risk-based considerations, which changed
the focus to the product rather than the
process.
Writing unit tests for higher-level
functions was a more challenging task
but seemed more satisfying than working with the low-level functions. This
was because the science in the more
complex functions was more interesting
for the scientist to explore. The scientist wasn’t sure that the code in these
functions was working well or that
his changes wouldn’t affect the functions in an unexpected way. He found
his motivation to complete the higher-level function tests came from wanting
to fully understand how they worked.
This became the main motivating goal
for the testing exercise. This shifted the
adequacy criteria to now include the
scientist: increasing his understanding
until he reached a comfort level.
The input to each high-level function was often entire images or subimages. Achieving full code coverage
would require some painstaking work
to alter the images. Given what we
learned from creating unit tests for the
low-level functions, we didn’t think
full code coverage would necessarily
be a worthwhile pursuit. Instead, we
found a new approach to testing.
The scientist carefully considered
each function’s scientific goal and how
to test it with a reasonable range of inputs. Instead of trying to format images
to reach every line in the function, he
selected input imagery that was typical of each usual case: images containing one, multiple, or no artifacts; faint
artifacts or artifacts positioned along
the image boundary; and malformed
artifacts.
The scientist changed techniques
from white-box testing driven by statement coverage to black-box testing using scenarios. In this approach, the scientist determines the different scenarios
that the code must handle and creates
the tests for each scenario. Creating test
cases this way has the additional benefit
of identifying a representative input imagery set that could be documented and
reused for any function that required
an input image. This improves on the
current system testing process, which
involves selecting test images each time
from the thousands of archived images.
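
The sketch below suggests how such scenario-driven tests might be organized. The scenario labels follow the cases listed above, but the image file names and the detectArtifacts entry point are placeholders of ours rather than part of StarImg.

#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

// Stand-in for the detection entry point; in the real exercise this
// would call into the package under test.
std::size_t detectArtifacts(const std::string& /*imagePath*/) {
    return 0;  // placeholder result
}

struct Scenario {
    std::string name;
    std::string imagePath;      // representative archived image
    std::size_t expectedCount;  // artifacts the scientist expects
};

int main() {
    // One representative image per usual case, documented and reusable
    // for any function that needs an input image.
    const std::vector<Scenario> scenarios = {
        {"no artifacts",         "images/empty_field.fits",   0},
        {"single artifact",      "images/one_streak.fits",    1},
        {"multiple artifacts",   "images/three_streaks.fits", 3},
        {"faint artifact",       "images/faint_streak.fits",  1},
        {"artifact on boundary", "images/edge_streak.fits",   1},
    };

    for (const auto& s : scenarios) {
        const std::size_t found = detectArtifacts(s.imagePath);
        std::cout << s.name << ": expected " << s.expectedCount
                  << ", found " << found
                  << (found == s.expectedCount ? " [pass]" : " [FAIL]")
                  << '\n';
    }
    return 0;
}

The pass/fail decision here rests on artifact counts the scientist expects; for comparing computed metric values, the tolerance question discussed below applies.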
The software engineer asked the
scientist about the possible impact of
a code error that escaped testing and
inspection. The biggest impact would
be false negatives—in other words,
missed artifacts in the astronomical images. Of less concern were false positives—that is, incorrect identification
of artifacts or miscalculations of their
metrics. Because the scientist manually
examines each identified artifact, these
errors would be immediately obvious
and the output would be rejected. By
considering risk, we refined the testing
exercise goals into something more specific: detecting errors that would cause
false-negative output.
Testing from the
Software Engineer’s Viewpoint
Segal’s case study looked at scientists
and software engineers working together to write new software and the
differences that hampered their efficiency and productivity.1 In our case,
the software engineer had a background in engineering physics, which
gave him the fundamentals of the application area. However, he lacked domain knowledge specific to processing
astronomical images, which made it
difficult for him to conduct effective
testing without the scientist’s help. His
comments reveal the breadth of the
difficulty: “I didn’t know which parts
of the software were most important
(therefore deserving of more attention),
and I didn’t know how much error tolerance each function should be given.
In short, I didn’t have enough domain
experience to develop the intuition and
expertise that would allow me to test
the routines effectively.”
At the same time, the software engineer felt he had a positive influence on
the scientist’s testing practices. For example, the scientist’s first batch of tests
focused on robustness testing—that is,
testing with nonsense inputs to ensure
the software handles them appropriately. However, because the significance
of such problems in this context was
low compared to the time expended on
them, the software engineer suggested
focusing on accuracy problems using
realistic input data instead. This proved
to be a more valuable use of the scientist’s time.
In addition, the software engineer
provided expertise on different testing
techniques and coverage metrics that
the scientist could experiment with to
determine which were most useful. He
suggested statement coverage to the scientist, despite its known weaknesses.
The suggestion was based on the availability of tools that provide statement
coverage statistics and our previous experience in testing scientific software.7
We explored automating both the
running of the tests and the decisions
on test outcome. In cases where a test
oracle (expected output) was not obvious, the scientist wrote code or script to
evaluate the output’s correctness. However, as the project moved forward, it
became apparent that comparing test
output to expected values was a serious
issue.
Commonly, comparison of floating-point output, x, to some expected
value, y, is handled simply by providing an error band ε, where | x – y | < ε.
In many cases with scientific software,
the value of y is not clear—that is,
there is no test oracle. A subtler problem exists with the error ε. Normally,
we think of ε representing round-off error imposed by the limitations of working with finite-length representations in
computers. With scientific software, ε
includes errors, simplifications, and approximations from modeling, measurements, and solution techniques as well
as finite floating-point round-off error.
These various sources of error require
the scientist’s expertise to judge what’s
reasonable for both y and ε. A misjudgment in the size of ε can hide a code
defect.
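
The sketch below illustrates the usual tolerance check and how an overly generous ε can mask a defect; the numbers are invented purely for illustration.

#include <cmath>
#include <iostream>

// The usual acceptance test: |x - y| < eps.
bool withinTolerance(double x, double y, double eps) {
    return std::fabs(x - y) < eps;
}

int main() {
    const double expected = 1.000;  // y: the oracle value, itself often uncertain
    const double computed = 1.004;  // x: output contaminated by a code defect

    // A tolerance sized for round-off alone would flag the discrepancy...
    std::cout << withinTolerance(computed, expected, 1e-9) << '\n';  // prints 0

    // ...but a tolerance inflated to absorb modeling, measurement, and
    // approximation error can silently accept it, hiding the defect.
    std::cout << withinTolerance(computed, expected, 1e-2) << '\n';  // prints 1
    return 0;
}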
Hook went on to demonstrate the
impact of error tolerances on our ability to find code defects that affect accuracy in scientific software.8
Software Inspections
Industry’s adoption of software inspections has been slow to nonexistent.4
Even without the overhead of multiple
inspectors and organized meetings, inspections pose problems for scientific
software, mainly in identifying an effective reading technique. In our case
study, the scientist explored different
approaches to guide the inspection. In
the end, he developed two approaches
that substantially increased his understanding of the code and enabled him
to spot problems.
Typically, documentation is lacking for scientific software. The scientific theory might be documented and
the code authors might possibly still
be available as resources. In our particular case, the scientist was the third
person to inherit the original software
package. It included a significant code
base with which he was unfamiliar and
for which there was little documentation. Given these limitations, he could
inspect for self-consistency within the
code, duplication of code pieces, and
inconsistencies and invalid assumptions
on the basis of his own knowledge of
the application area.
Initially, the scientist followed an
inspection regime that resembled the
more traditional software inspection
approaches. He read the Matlab functions line by line, choosing the functions alphabetically from the StarImg
code base and then documenting each
function’s purpose. This approach was
laborious and seemed to reveal very little. Each function’s purpose had little
or nothing to do with the purpose of
the one just previously read, when chosen alphabetically.
The scientist discovered a far better
approach to choosing the order of the
functions for inspection. He selected
three typical images as input data and
set a breakpoint at the first line of
StarImg. The scientist then read the
code as it was executed. The scientist
reported that “this gave a much better feel for what each function was to
do.” The execution sequence provided
additional information about the software and let the scientist make useful
judgments on the source code’s correctness. The scientist used his knowledge and expectations for the code as
he cross-checked executed pieces of
it against pieces that hadn’t executed.
This approach uncovered the problem
of dead code—functions that would
never be called and could therefore be
discarded. He also found algorithms
for several functions that accomplished
the same task or were deprecated and
not documented as such. Some functions had hard-coded assumptions that
were no longer valid. Another function
had a hard-coded assumption that was
currently correct but would have to be
generalized as StarImg added new imagery types to its functionality. This affected maintainability and therefore fit
with the more focused goal of supporting planned software changes.
Interestingly, the scientist found creating the unit tests to be as beneficial
as running them. This form of task-directed inspection, in which the task
is “creation of tests,” requires careful
scrutiny of the code to determine test
cases. In our exercise, a thorough understanding of each code section led to
the identification of more problems.
In particular, we found a problem
in the code for removing background
noise, which had been ported from the
older C++ version of the software. It
contained several hard-coded values related to CCDs used for the images, but
the values were no longer valid for the
imagery types processed by StarImg.
The hard-coded values didn’t affect
the metrics calculated for identified artifacts, so their impact wasn’t immediately apparent in the StarImg output.
However, further investigation showed
that the hard-coded values did affect the
threshold for whether an artifact is identified at all. This problem contributed to
the 10 to 25 percent of artifacts that
StarImg missed—that is, it contributed
to the false-negative problem. Inspection by a knowledgeable scientist was
the only way to find this error.
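
The pattern is easy to caricature: a detection threshold built from constants that were correct for the original CCD but not for the newer imagery. The fragment below is our own simplified illustration of the pattern and of the obvious fix, not StarImg code.

#include <iostream>

// Before: a value tied to the original CCD is baked into the code.
// It was valid for the imagery the older C++ package processed,
// but not for images from newer observatories.
double detectionThresholdOld(double backgroundSigma) {
    const double kCcdNoiseFloor = 12.5;  // counts; original CCD only
    return kCcdNoiseFloor + 3.0 * backgroundSigma;
}

// After: the CCD-dependent value becomes a parameter supplied per
// observatory, so the threshold generalizes to new imagery types.
double detectionThreshold(double backgroundSigma, double ccdNoiseFloor) {
    return ccdNoiseFloor + 3.0 * backgroundSigma;
}

int main() {
    const double sigma = 4.0;  // invented background sigma
    std::cout << detectionThresholdOld(sigma) << ' '
              << detectionThreshold(sigma, 3.2) << '\n';  // 3.2: newer CCD
    return 0;
}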
Lessons Learned for
Testing Scientific Software
Our initial expectations and approaches evolved throughout the exercise. Here, we look again at the four
testing dimensions and discuss what we
learned.
Context Considerations
To achieve the testing goals that eventually evolved, we had to understand
much more than the source code in
front of us. We had to understand the
histories of different pieces of code,
their current and ultimate uses, and the
goals of the scientists using the software. Context identified risky code
areas, helped prioritize StarImg failure
types, and provided information for
understanding the pedigree of different parts of the code. Effective testing
was impossible without this full understanding of the science within the code.
We also had to use the scientist more
effectively. The scientist was both a
user and a developer. He had goals and
knowledge that blended both roles.
We found it better to make use of this
blend rather than artificially separate
the parts. This let us more effectively
define the goals, techniques, and adequacy criteria.
Goal Considerations
Common software engineering goals to
“improve quality” and “find bugs” are
too vague to effectively guide the assessment of scientific software. Once we
asked the question, “Who are we testing
this for?” and answered, “the scientist,”
we formulated more realistic goals.
These goals all involved the scientist
and included improving scientific understanding, identifying code parts involved in future changes, and mitigating
problems in high-risk areas of the code.
Once we better understood the
goals, we refined techniques and adequacy criteria to match them.
Technique Considerations
The scientist said the most useful exercise was the line-by-line scrutiny of the
code while it executed with selected test
data. This exercise gave him a greater
The scientist found creating
the unit tests to be as benefcial
as running them.90 IEEE SOFTWARE | WWW.COMPUTER.ORG/SOFTWARE
FEATURE: SOFTWARE TESTING
sense of accomplishment because its activities were constrained by the test data’s execution path and it had a clear termination point.
Line-by-line scrutiny of selected functions to create unit test cases required
the scientist to form a deep understanding of the code, which in turn revealed
important problems. Both these activities are a type of code inspection.
When asked if he would use code inspection again, the scientist answered
with a definite “yes.” He will inherit
the Match code. Its increased sensitivity is needed for a space-based telescope, but like StarImg, Match must
be automated. He commented that a
“thorough code scrutiny provides a
level of scientific understanding that is
very much desirable.”
The extent of the oracle and tolerance problems in testing scientific software requires novel testing approaches
specific to this software type. We
found a problem-domain viewpoint to
be more viable than a code-coverage
viewpoint.
As the exercise progressed, we
streamlined the activities and made
them more efficient. The scientist commented that they gave him “a level of
knowledge and confidence in the code
that wouldn’t have been achieved
otherwise.”
Adequacy Considerations
Adequacy criteria shifted from the initial process-based criteria to criteria focused on the product and the scientist.
This fits with the goals of increasing the
scientist’s understanding of the code
and identifying and, as necessary, improving its high-risk areas. We judged
adequacy by the scientist’s ultimate satisfaction that he had a trustworthy tool
to work with.
By analyzing our testing exercise through the four dimensions of context, goals, techniques, and adequacy, we developed a
better understanding of how to effectively test a piece of scientific software.
Once we considered the scientist-tester
as part of the testing system, the exercise evolved in a way that made use
of and increased his knowledge of the
software. One result was an approach
to software assessment that combines
inspection with code execution. Another result was the suppression of
process-driven testing in favor of goal-centric approaches.
The combination of software engineer working with scientist was successful in this case. The software engineer
brings a toolkit of ideas, and the scientist
chooses and fashions the tools into something that works for a specific situation.
Unlike many other types of software systems, scientific software includes the scientist as an integral part of the system.
The tools that support the scientist must
include the scientist’s knowledge and
goals in their design. This represents a
different way of considering the juxtaposition of software engineering with scientific software development.
References
1. J. Segal, “Scientists and Software Engineers:
A Tale of Two Cultures,” Proc. Psychology
of Programming Interest Group (PPIG 08),
Lancaster Univ., 2008, pp. 44–51.
2. R. Sanders and D. Kelly, “Scientific Software:
Where’s the Risk and How Do Scientists Deal
with It?” IEEE Software, vol. 25, no. 4, 2008,
pp. 21–28.
3. T. Shepard and D. Kelly, Dimensions of Testing, tech. report TR-74.188-13, 2003; https://
www-927.ibm.com/ibm/cas/publications/
TR-74.188/13/index.pdf.
4. R.L. Glass, “Inspections—Some Surprising
Findings,” Comm. ACM, vol. 42, no. 4, 1999,
pp. 17–19.
5. D. Kelly and T. Shepard, “Task-Directed Software Inspection,” J. Systems and Software,
vol. 73, no. 2, 2004, pp. 361–368.
6. D. Kelly and T. Shepard, “Task-Directed
Software Inspection Technique: An Experiment and Case Study,” Proc. IBM Centers for
Advanced Studies Conf. (CASCON 2000),
IBM Press, 2000; http://portal.acm.org/
citation.cfm?id=782040.
7. D. Kelly, N. Cote, and T. Shepard, “Software
Engineers and Nuclear Engineers: Teaming Up
to Do Testing,” Proc. Canadian Nuclear Soc.
Conf., Canadian Nuclear Soc., June 2007.
8. D.A. Hook, “Using Code Mutation to Study
Code Faults in Scientific Software,” master’s
thesis, Queen’s Univ., Kingston, Canada,
2009; https://qspace.library.queensu.ca/
handle/1974/1765.
ABOUT THE AUTHORS
DIANE KELLY is an associate professor in the Department of Mathematics and Computer Science at the Royal Military College (RMC). Her
research interests are in identifying and improving software engineering techniques for use specifically with scientific software. Kelly has a
PhD in software engineering from RMC. Contact her at [email protected].
STEFAN THORSTEINSON is a researcher at the Royal Military
College (RMC) Center for Space Research. His research interests are
in small-aperture space-based astronomy, astrodynamics, and image
analysis. Thorsteinson has an MSc in physics from RMC. Contact him at
[email protected].
DANIEL HOOK is a software researcher and developer for Engineering Seismology Group Solutions in Kingston, Ontario. His research
interests are in the engineering and development of scientific software,
especially the impact of software engineering on scientific software
quality. Hook has an MSc in computing from Queen’s University, Kingston. Contact him at [email protected].