Debugging scientific applications in the .NET Framework


Future Generation Computer Systems 19 (2003) 665–678

David Abramson∗, Greg Watson1

School of Computer Science and Software Engineering, Monash University, Clayton, Vic. 3800, Australia

Abstract

The Microsoft .NET Framework represents a major advance over previous runtime environments available for Windows platforms and offers a number of architectural features that would be of value in scientific programs. However there are such major differences between .NET and legacy environments under both Windows and UNIX, that the effort of migrating software is substantial. Accordingly, software migration is unlikely to occur unless tools are developed for supporting this process. In this paper we discuss a ‘relative debugger’ called Guard which provides powerful support for debugging programs as they are ported from one environment or platform to another. We describe a prototype implementation developed for Microsoft’s Visual Studio.NET, a rich interactive environment that supports code development for the .NET Framework. The paper discusses the overall architecture of Guard under VS.NET and highlights some of the technical challenges that were encountered during its development. A simple case study is provided that demonstrates the effectiveness of relative debugging in locating subtle errors that occur when even a minor upgrade is attempted from one version of a language to another. For this example, we illustrate the use of relative debugging using a Visual Basic program that was ported from Visual Basic 6.0 to Visual Basic.NET.

© 2002 Elsevier Science B.V. All rights reserved.

Keywords: Microsoft .NET Framework; Common Language Specification; Relative debugger

1. Introduction

The .NET Framework is a major initiative by Microsoft that provides a uniform multi-lingual platform for software development [17]. It is based on a Common Language Specification (CLS) that supports a wide range of programming languages and runtime environments. In addition, it integrates web services in a way that facilitates the development of flexible and powerful distributed applications. Clearly this has applicability in the commercial domain of e-commerce and P2P networks which rely primarily on distributed applications.

An analysis of the features available in .NET suggests that the new architecture is equally applicable to scientific computing and to commercial applications. In particular .NET provides efficient implementations of a wide range of programming languages, including FORTRAN [12], because it makes use of just-in-time compilation strategies. Further, the Visual Studio development environment is a rich platform for performing software engineering as it supports integrated code development, testing and debugging from a single tool.

Some of the more advanced features of .NET, such as Web Services, could also have interesting application in scientific code. For example it would be possible to source libraries dynamically from the Web in the same way that systems like NetSolve [7] and NEOS [9] provide scientific services remotely. This functionality could potentially offer dramatic productivity gains for scientists and engineers, because they can focus on the task at hand without the need to develop all of the support libraries.

∗ Corresponding author. Tel.: +61-03-9905-1183; fax: +61-03-9905-5146; mobile: +61-0417-375-635. E-mail addresses: [email protected] (D. Abramson), [email protected] (G. Watson). URLs: http://www.csse.monash.edu.au/∼davida, http://www.acl.lanl.gov/cluster/people/gwatson

1 Present address: MS B287, Advanced Computing Lab, Los Alamos National Laboratory, Los Alamos, NM 87545, USA.

0167-739X/02/$ – see front matter © 2002 Elsevier Science B.V. All rights reserved. PII: S0167-739X(02)00176-0

Unfortunately the differences between .NET and legacy software systems, such as WIN32 and even UNIX, are substantial and as a result there is a significant impediment to porting codes from one environment to another. Not only are the environments different functionally, but the libraries and machine architectures may differ as well. It is well established that different implementations of a programming language and its libraries can cause the same program to behave erroneously. Because of this the task of moving code from one environment to another can be error prone and expensive. Many of these applications may also be used in mission critical situations like nuclear safety, aircraft design or medicine, so the cost of incorrect software can potentially be enormous. Unless software tools are developed that specifically help users in migrating software to the .NET Framework, it is likely that most scientists will continue to use legacy platforms for their software development.

Traditional debuggers are not particularly helpful at finding errors introduced during the porting process even when they have been designed with scientific or distributed computing in mind [6,8,15,16,19,25]. This is because they generally require the programmer to have a good understanding of the way the program works and have a mental model of the contents of the various data structures during execution. In this paper we describe a debugging tool called Guard, which specifically supports the process of porting codes from one language, operating system or platform to another. Guard has been available under UNIX for some time now, and we have proven its applicability for assisting the porting of programs many times. We have recently implemented a version of Guard that is integrated into the Microsoft Visual Studio.NET development environment. Not only can the system be used to support porting from WIN32 to .NET, but we have even demonstrated the ability to support cross-platform debugging between a UNIX platform and a Windows platform. This has shown that the tool is not only useful for supporting software development on a single platform, but can also support the porting of codes between Windows and UNIX.

The paper begins with a discussion of the Guard debugger, followed by a description of the .NET Framework. We then describe the architecture of Guard as implemented under Visual Studio.NET, and illustrate its effectiveness in locating programming errors in this environment.

2. Guard—a relative debugger

Relative debugging was first proposed by Abramson and Sosic in 1994. It is a powerful paradigm that enables a programmer to locate errors in programs by observing the divergence of key data structures as the programs are executing [1–5,20,24]. The technique of relative debugging allows the programmer to make comparisons of a suspect program against a reference code. It is particularly valuable when a program is ported to, or rewritten for, another language or computer platform. Relative debugging is effective because the user can concentrate on where two related codes are producing different results, rather than being concerned with the actual values in the data structures. Various case studies reporting the results of using relative debugging have been published [1–3,14,24], and these have demonstrated the efficiency and effectiveness of the technique. The concept of relative debugging is both language and machine independent. It allows a user to compare data structures without concern for the implementation, and thus attention can be focussed on the cause of the errors rather than implementation details.

To the user, a relative debugger appears as a traditional debugger, but also provides additional commands that allow data from different processes to be compared. The debugger is able to control more than one process at a time so that, once the processes are halted at breakpoints, data comparison can be performed. There are a number of methods of comparing data but the most powerful of these is facilitated by a user-supplied declarative assertion. Such an assertion consists of a combination of data structure names, process identifiers and breakpoint locations. Assertions are commands that are processed by the debugger before program execution commences and used to build an internal graph [5] which describes when the two programs must pause, and which data structures are to be compared. In the following example:

assert $reference::Var1@1000 = $suspect::Var2@2000

the assert statement compares data from Var1 in $reference at line 1000 with Var2 in $suspect at line 2000. A user can formulate as many assertions as necessary and can refine them after the programs have begun execution. This makes it possible to locate an error by placing new assertions iteratively until the suspect region of code is small enough to inspect manually. This process is efficient: even if the programs contain millions of lines of code, the debugging process refines the suspect region in a binary fashion, so it only takes a small number of iterations to reduce the region to a few lines of code.
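The binary refinement described above can be sketched as a bisection over candidate assertion locations. The Python fragment below is only an illustration and is not part of Guard: the names find_first_divergence and diverges_at are hypothetical, and the sketch assumes that once the two programs diverge they remain divergent at every later location.

```python
def find_first_divergence(points, diverges_at):
    """Locate the first candidate location at which the programs disagree.

    points      -- ordered list of candidate assertion locations
    diverges_at -- hypothetical callback: returns True if an assertion
                   placed at that location reports a difference
    Returns (index, probes), or None if the programs never diverge.
    Assumes divergence is monotonic (it persists once it has appeared).
    """
    if not points or not diverges_at(points[-1]):
        return None
    lo, hi = 0, len(points) - 1      # invariant: points[hi] diverges
    probes = 0
    while lo < hi:
        mid = (lo + hi) // 2
        probes += 1
        if diverges_at(points[mid]):
            hi = mid                 # error introduced at or before mid
        else:
            lo = mid + 1             # error introduced after mid
    return lo, probes
```

Under these assumptions, a million candidate locations are narrowed to a single one in at most 20 probes inside the loop, which is the sense in which the refinement is binary.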

Our implementation of relative debugging is embodied in a tool called Guard. We have produced implementations of Guard for many varieties of UNIX, in particular Linux, Solaris and AIX. A parallel variant is available for debugging applications on shared memory machines, distributed memory machines and clusters. Currently this is supported with UNIX System V shared memory primitives, the MPICH library, as well as the experimental data parallel language ZPL [24].

The UNIX versions of Guard are controlled by a command line interface that is similar in appearance to debuggers like GDB [21]. In this environment an assert statement such as the one above is typed into the debug interpreter and must include the actual line numbers in the source as well as the correct spelling of the variables. As discussed later in the paper Guard is now integrated into the Microsoft Visual Studio environment and so is able to use the interactive nature of the user interface to make the process of defining assertions easier.
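To make the components of a typed assertion concrete, the following Python sketch parses the single example given earlier into process names, variable names and line numbers. The grammar is inferred from that one example only and is an assumption; the real Guard interpreter may well accept a richer syntax. The names Operand, Assertion and parse_assertion are hypothetical.

```python
import re
from typing import NamedTuple


class Operand(NamedTuple):
    process: str    # process identifier, e.g. "reference"
    variable: str   # data structure name, e.g. "Var1"
    line: int       # source line of the breakpoint


class Assertion(NamedTuple):
    left: Operand
    right: Operand


# Assumed operand shape: $process::variable@line
_OPERAND = r"\$(\w+)::([\w.]+)@(\d+)"
_ASSERT = re.compile(rf"^\s*assert\s+{_OPERAND}\s*=\s*{_OPERAND}\s*$")


def parse_assertion(text: str) -> Assertion:
    """Split an assertion command into its two operands."""
    match = _ASSERT.match(text)
    if match is None:
        raise ValueError(f"not a valid assertion: {text!r}")
    p1, v1, l1, p2, v2, l2 = match.groups()
    return Assertion(Operand(p1, v1, int(l1)), Operand(p2, v2, int(l2)))
```

A misspelt variable or a wrong line number would still parse here; it would only be detected when the debugger tries to set the breakpoint, which is why interactive selection in a source window is attractive.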

3. Success stories

Over the last few years we have used Guard to debug a number of scientific codes that have been migrated from one platform to another or from one language to another (or both). In one case study we used Guard to isolate some discrepancies that occurred when a global climate model was ported from a vector architecture to a parallel machine [2]. This study illustrated that it is possible to locate subtle errors that are introduced when programs are parallelised. In this case both models were written in the same language, but the target architecture was so different that many changes were required in order to produce an efficient solution. Specifically, the mathematical formulation needed to be altered to reduce the amount of message passing in the parallel implementation, and other changes such as the order of the indexes on key array data structures needed to be made to account for a RISC architecture as opposed to a vector one.

In another case study we isolated errors that occurred when a photo-chemical pollution model was ported from one sequential workstation to another [1]. In this case the code was identical but the two machines produced different answers. The errors were finally attributed to the different behaviour of a key library function, which returned slightly divergent results on the two platforms.

In a more recent case study we isolated errors that occurred when a program was rewritten from C into another language, ZPL, for execution on a parallel platform [24]. This case study was interesting because the two codes were producing slightly different answers and the divergence had initially been attributed to different floating point precision. However by using Guard it was possible to show that there were actually four independent coding errors: three in the new ZPL program and, surprisingly, one in the original C code.

All of these case studies have highlighted the power of relative debugging in the process of developing scientific codes. We believe that many of the same issues will arise when migrating scientific software to the new .NET Framework and that Guard will be able to play an important role in assisting this process.

4. The .NET Framework

The Microsoft .NET Framework represents a significant change to the underlying platform on which Windows applications run [17]. The .NET Framework defines a runtime environment that is common across all languages. This means that it is possible to write applications in a range of languages, from experimental research ones to standard production ones, with the expectation that similar levels of performance and efficiency will be achieved. An individual program can also be composed of modules that are written in different languages, but that interoperate seamlessly. All compilers that target the .NET environment generate code in an Intermediate Language (IL) that conforms to the CLS. The IL is in turn compiled into native code using a just-in-time compilation strategy. These features mean that the .NET Framework should provide an efficient platform for developing computational models.

The Web Services features of .NET also offer significant scope for scientific applications. At present most computational models are built as single monolithic codes that call library modules using local procedure calls. More recent developments such as the NetSolve and NEOS application servers have provided an exception to this strategy. These services provide complex functions such as matrix algebra and optimisation algorithms using calls to external servers. When an application uses NetSolve, it calls a local ‘stub’ module that communicates with the NetSolve server to perform some computation. Parameters are sent via messages to the server and results are returned the same way. The advantage of this approach is that application programmers can benefit by using ‘state of the art’ algorithms on external high-performance computers without the need to run the codes locally. Further, the load balancing features of the systems are able to allocate the work to servers that are most lightly loaded. The major drawback of external services like this is that the application must be able to access the required server and so network connectivity becomes a central point of failure. Also, building new server libraries is not easy and requires the construction of complex web hosted applications. The .NET Framework has simplified the task of building such servers using its Web Services technology. Application of Web Services to science and engineering programs is an area of interest that requires further examination.

Visual Studio.NET (VS.NET) is the preferred code development environment for the .NET Framework. The VS.NET environment represents a substantial change from previous versions of Visual Studio. Older versions of Visual Studio behaved differently depending on the language being supported; for example, Visual Basic used a different set of technologies for building applications than Visual C++. The new VS.NET platform has been substantially re-engineered and as a consequence languages are now supported in a much more consistent manner.

VS.NET also differs from previous versions by exposing many key functions via a set of remote APIs known as ‘automation’. This means that it is possible to write a third party package that interacts with VS.NET. For example, an external application can set breakpoints in a program and start the execution without user interaction. A separate Software Development Kit (SDK) called VSIP (Visual Studio Integration Program) makes it possible to embed new functions directly into the environment. This allows a programmer to augment VS.NET with new functionality that is consistent with other functions that are already available and operates seamlessly with the user interface. This feature has allowed us to integrate a version of Guard with Visual Studio as discussed in the next section.

5. Architecture of Guard

Fig. 1 shows a simplified schematic view of the architecture of Guard under VS.NET. VS.NET is built around a core ‘shell’ with functionality being provided by commands that are implemented by a set of ‘packages’. These packages are conventional COM objects that are activated as a result of user interaction (such as menu selection) within VS.NET, and also when various asynchronous events occur. This component architecture makes it possible to integrate new functionality into the environment by loading additional packages.

Debugging within the VS.NET environment is supported by three main components. The Debugger package provides the traditional user interface commands such as ‘Go’, ‘Step’, ‘Set Breakpoint’, etc. that appear in the user interface. This module communicates with the Session Debug Manager, which in turn provides a multiplexed interface into one or more per-process Debug Engines. The Debug Engines implement low-level debug functions such as starting and stopping a process, setting breakpoints, and providing access to the state of the process. Debug Engines can cause events to occur in response to conditions such as a breakpoint being reached and these are passed back through the Session Debug Manager to registered event handlers. Each Debug Engine is responsible for controlling the execution of a single process. The VS.NET architecture also supports the concept of remote debugging, so a process being debugged may be running on a remote Windows system.

Fig. 1. Guard architecture.

The VS.NET implementation of Guard consists of three main components. A package is loaded into the VS.NET shell that incorporates logic to respond to specific menu selections and handle debugger events. This package executes in the main thread of the shell and therefore has had to be designed to avoid blocking for any extended time period. The main relative debugging logic is built into a local COM component called the Guard Controller. This is a separate process that provides a user interface for managing assertions and a dataflow interpreter that is necessary to implement relative debugging. Because the Guard Controller runs as a separate process it does not affect the response of the main VS.NET thread. The Guard Controller controls the programs being debugged using the VS.NET automation interface. We have also built a Debug Engine that is able to control a process running on an external UNIX platform. This works by communicating with the remote debug server developed for the original UNIX version of Guard using a TCP/IP socket and a custom protocol. The UNIX debug server, based on the GNU GDB debugger, is available for most variants of UNIX, and provides basic debug functions, including process startup, to Guard. We have modified GDB to provide support for an Architecture Independent Format (AIF) [24] for data structures, which means it is possible to move data between machines without being concerned about different architectural characteristics, such as word size, endianness, etc. AIF also facilitates machine independent comparison of data structures. It is the addition of this Debug Engine that allows us to compare programs executing on Windows and UNIX platforms.
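The role of an architecture independent format can be illustrated with a small Python sketch. This is not the actual AIF encoding, whose layout is not described here; it is a deliberately simplified, assumed scheme (one width byte followed by a big-endian two's-complement value), and the function names to_neutral and from_neutral are hypothetical. It shows only why word size and byte order stop mattering once both sides convert to a common form.

```python
def to_neutral(raw: bytes, byteorder: str, width: int) -> bytes:
    """Re-encode a signed integer from a machine-specific layout.

    raw       -- the bytes as they appear in the debugged process
    byteorder -- "little" or "big", the byte order of the source machine
    width     -- word size in bytes on the source machine
    The result is self-describing: a width byte, then the value in
    big-endian order (an assumed layout, not the real AIF).
    """
    value = int.from_bytes(raw[:width], byteorder, signed=True)
    return bytes([width]) + value.to_bytes(width, "big", signed=True)


def from_neutral(blob: bytes) -> int:
    """Decode a value produced by to_neutral into a Python integer."""
    width = blob[0]
    return int.from_bytes(blob[1:1 + width], "big", signed=True)
```

With such a scheme a 32-bit little-endian value from one machine and a 64-bit big-endian value from another decode to the same integer and can be compared directly.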

The architecture of Guard is consequently very flexible and allows debugging distributed processes as well as monolithic codes. For example, if an application were decomposed into a number of distinct processes, possibly implemented as Web services, then it would be possible to specify assertions between these individual components and the original sequential code version.

Fig. 2 shows a screen dump of Guard running under VS.NET. When a user wishes to compare two running programs they must first be loaded into a VS.NET ‘solution’ as separate ‘projects’. The solution is then configured to start both programs running at the same time under the control of individual Debug Engines. The source windows of each project can then be tiled to allow both to be displayed at once.

Fig. 2. Guard controller and VS.NET.

A user creates an assertion between the two programs using the Guard Controller, which is started by selecting the ‘VSGuard’ item from the ‘Tools’ menu. The Guard Controller has a separate Control Panel window as shown. An assertion is created in a few simple steps. A new, empty, assertion is created by selecting the ‘Add’ button. Guard displays the dialog box shown in Fig. 3, which allows the user to enter the information necessary to create an assertion. The left-hand side of the assertion can be automatically populated with the variable name, line number, source file and program information by selecting the required variable in the appropriate source window and then using a single right-mouse click. The right-hand side of the assertion can be filled in using the same technique in the other source window. Finally the user is able to specify properties about the assertion such as the error value at which output is generated, when the debugger should be stopped and the type of output to display. The user can create any number of assertions by repeating this process and then launch the programs using the ‘Start’ button on the Control Panel.

Fig. 3. New assertion dialog.

Before commencing execution Guard automatically sets breakpoints at the locations in the source files specified by the assertions. During execution Guard will extract the contents of a variable when its corresponding breakpoint is reached and then perform a comparison once data from each half of the assertion has been obtained. Once the appropriate error threshold has been reached (as specified in the assertion), Guard will either display the results in a separate window or stop the debugger to allow interactive examination of the programs’ state. Guard currently supports a number of display types including text, bitmaps and the ability to export data into a visualisation package.
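The evaluation of a single assertion, as just described, can be sketched in Python. This is a minimal illustration rather than Guard's implementation: the class name AssertionCheck, the side labels and the use of an absolute element-wise difference as the error measure are all assumptions made for the example.

```python
class AssertionCheck:
    """Holds the two halves of one assertion and compares them."""

    def __init__(self, threshold):
        self.threshold = threshold   # assumed: maximum tolerated absolute difference
        self.halves = {}             # side ("left" or "right") -> extracted data

    def on_breakpoint(self, side, data):
        """Record data extracted at a breakpoint for one side.

        Returns None while the other half is still missing; otherwise
        returns a list of (index, difference) pairs that exceed the
        threshold (an empty list means the data agree).
        """
        self.halves[side] = list(data)
        if len(self.halves) < 2:
            return None              # still waiting for the other program
        left = self.halves.pop("left")
        right = self.halves.pop("right")
        if len(left) != len(right):
            raise ValueError("data structures differ in size")
        return [(i, abs(a - b))
                for i, (a, b) in enumerate(zip(left, right))
                if abs(a - b) > self.threshold]
```

A non-empty result corresponds to the point at which the tool would either display the differences or stop the debugger, depending on the properties chosen for the assertion.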

6. Implementation issues

While the VS.NET debugger architecture has been designed with the ability to manage and debug multiple processes, it is not a true multi-process debugger. This is because:

(a) the debugger does not provide control operations, such as start/restart and single step, on a per-process basis. Instead these functions only operate on all processes collectively; and

(b) the debugger is constrained to operate in one of two modes: either a break event (such as a breakpoint being reached) in one process stops all processes together, or a break event only stops execution of the current process but the debugger is unable to access the other process to obtain state information.

In contrast, the existing Guard architecture assumes that independent control of individual processes is provided by the debugger infrastructure. To address this issue we have modified Guard so that processes are restarted as soon as possible after a breakpoint is reached and data has been extracted. However, because a restart command is issued to all processes Guard must also keep a record of the state of each process so that it can ensure that restart commands are only issued at the appropriate time. The result of this modification is some loss of functionality over the existing UNIX version. In particular it is not possible to stop both programs at a known location when an assertion threshold is exceeded, complicating the manual debugging that might occur after an assertion has triggered.

Our original intention was to integrate both the user interface and a dataflow interpreter into a single multi-threaded package in VS.NET. However due to limitations in relation to the thread safety and re-entrancy of the VS.NET shell we were not successful in this approach. Instead, by separating these parts of the debugger into a local component the threading issues were effectively eliminated.

Another implementation issue arose because of our use of the debugger automation interface that is provided by Visual Studio. This interface is used to extract the contents of program variables when a process is stopped at a breakpoint. Unlike our UNIX debug server API, which is designed to transfer large data structures efficiently, the automation interface is only able to extract data from a simple object with each call. This means that extracting data from complex objects such as arrays can be very time consuming since repeated calls must be made for each element. To solve this problem our package must expose its own interface to the low-level data access facilities provided by the Debug Engines.

A further consideration was in relation to the need to use AIF in the Guard debugger. If Guard operated exclusively in the WIN32 environment then it would be feasible to perform all comparisons using only the native data format and avoid the overhead incurred by using the AIF library routines. However since we wish to use Guard to compare data between UNIX, WIN32 and .NET systems we must employ an architecture neutral format. Because VS.NET employs its own internal format, data arriving at the Debug Engine from a UNIX debug server must first be converted into this format. Once Guard receives the data via the debugger automation interface it must then be converted back into AIF before being processed by the dataflow interpreter. This results in extra overheads because of the dual conversion but simplifies the implementation since we do not need to modify the dataflow interpreter code. We plan to investigate architectural modifications to Guard that remove the need for the multiple format conversions.

One final issue is in relation to the asynchronous behaviour of programs being debugged under the VS.NET environment. We have occasionally observed situations where one process of a multi-process debug session receives significantly more execution time than the others, particularly in cross-language situations or where VS.NET is controlling both managed (.NET) and un-managed (legacy) code. Since the dataflow architecture employed by Guard is designed to deal with this situation, it is not a serious issue, although it can only do so for a finite time before all its internal buffers become filled. We will be monitoring this situation to see if the problem manifests in later versions of .NET and if strategies need to be incorporated into Guard to deal with the issue.
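The buffering behaviour described above can be illustrated with a short Python sketch. It is not Guard's dataflow interpreter: the class PairingBuffer, its fixed capacity and the choice to raise an exception on overflow are assumptions made purely to show how a faster process can run ahead only until its queue is full.

```python
from collections import deque


class PairingBuffer:
    """Queues values from two processes and pairs them in arrival order."""

    def __init__(self, capacity):
        self.capacity = capacity     # assumed per-process buffer limit
        self.queues = {"reference": deque(), "suspect": deque()}

    def push(self, side, value):
        """Queue a value extracted from one process.

        Returns a (reference, suspect) pair as soon as both sides have
        data, otherwise None. Raises OverflowError if one process has
        run further ahead than the buffer can absorb.
        """
        queue = self.queues[side]
        if len(queue) >= self.capacity:
            raise OverflowError(f"{side} process is too far ahead")
        queue.append(value)
        ref, sus = self.queues["reference"], self.queues["suspect"]
        if ref and sus:
            return ref.popleft(), sus.popleft()
        return None
```

In this toy model an imbalance in execution time is harmless while the faster side stays within the capacity, which mirrors the finite tolerance noted in the text.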

In spite of these difficulties, the implementation has been fairly smooth and a prototype version of Guard has been produced. Visual Studio is one of the few interactive environments that have been designed with the goal of incorporating third party packages [11,13], and we have demonstrated that this integration is possible. Specifically, we have been successful in incorporating a tool that will support the migration of applications to the new .NET Framework.

7. The ‘Earth’ case study

As discussed in Section 1, porting a code from one platform to another poses significant challenges for the programmer. Many of these challenges are present even when the application is only migrated from one version of a language to another, regardless of whether the platform also changes. In this section we illustrate the power of relative debugging using a small scientific program called ‘Earth’, written in Visual Basic. The problems we experienced in doing this were no different from problems we have experienced in the past when we moved code from one platform and operating system to another.

‘Earth’ is a free program that uses the VSOP87 planetary theory to compute the heliocentric ecliptic longitude (L), latitude (B) and the distance to the sun (R) of the planet Earth over a period of several thousands of years [22]. It is based on the same mathematical formulations used to compute long-term, high-precision, heliocentric orbital positions of the planets when preparing astronomical almanacs. The program was originally written in Visual Basic 5 and Visual Basic 6, and we decided to upgrade it to Visual Basic.NET. Visual Basic 6 and Visual Basic.NET have minor, but subtle, syntactic and semantic differences. Whilst the example is performed on the same operating system, it highlights many of the same issues that arise when a program is moved across platforms, and serves as an adequate illustration of how Guard can be applied. The program was converted using a special wizard provided by Microsoft, and whilst there were a number of issues that required inspection, the wizard seemed to make sensible decisions during the translation process. In spite of this, the code did not run correctly and consequently required debugging.

Fig. 4. Exception generated in Earth.

7.1. Problem 1

Whilst Earth compiled without errors, it generated a runtime error during execution, as shown in Fig. 4.

Step 1. Using Guard to place an assertion on the statement in error showed that one of the arguments to the Mid function was different between the two versions; in particular, the variable M had a value of 1 in the Visual Basic 6 code and 0 in the Visual Basic.NET version. Fig. 5 shows the code used in the derivation of M.

Fig. 5. Code fragment for computing M.

Step 2. Placing assertions on Q and Month_Year_BCAD indicated that they were both wrong. Since Month_Year_BCAD was passed as an argument, we next inspected the location of the call to DAYS_IN_MONTH_OF. Fig. 6 shows the code fragment responsible.

Step 3. Placing assertions on M, Y, MonthSelect.Text and Year_Renamed.Text indicated that they were all incorrect; moreover, MonthSelect.Text and Year_Renamed.Text were actually uninitialised.

Step 4. Closer inspection of the code showed that there were multiple calls to the function ADJUST_MONTH_LENGTH, some of which were as a result of handling changed field events. Using the Visual Studio Call Stack view, we observed that ADJUST_MONTH_LENGTH had in fact been called from different places in the two versions of the program.

Fig. 6. Code fragment for calling DAYS_IN_MONTH_OF.

A code search showed that MonthSelect.Text and Year_Renamed.Text were initialised in the form-load event handler, but that the .NET version of the program was invoking a changed event handler as a side effect of initialising the form. This in turn caused the ADJUST_MONTH_LENGTH code to be executed earlier than in the Visual Basic 6 version, and thus before the fields it reads were properly initialised.
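The effect of the different event ordering can be mimicked with a small toy model. The sketch below is written in Python rather than Visual Basic, and every name in it (`ToyForm`, `on_changed`, the field values) is hypothetical; it is not the Earth code, only an illustration of a handler running before the form-load code has set the fields.

```python
class ToyForm:
    """Toy model of a form; all names are illustrative, not taken from Earth."""

    def __init__(self, changed_fires_during_init):
        self.month_text = ""  # stands in for MonthSelect.Text
        self.year_text = ""   # stands in for Year_Renamed.Text
        self.seen = []        # field values observed by each ADJUST_MONTH_LENGTH call

        # Control set-up happens before the form-load handler runs.
        if changed_fires_during_init:
            # .NET-like ordering: set-up raises the changed event as a side effect.
            self.on_changed()
        self.on_form_load()

    def on_form_load(self):
        # The form-load handler is where the fields receive their real values.
        self.month_text = "Jan"
        self.year_text = "2000"
        self.on_changed()

    def on_changed(self):
        self.adjust_month_length()

    def adjust_month_length(self):
        # Record what the routine would have read at this point.
        self.seen.append((self.month_text, self.year_text))


vb6_like = ToyForm(changed_fires_during_init=False)
net_like = ToyForm(changed_fires_during_init=True)
print(vb6_like.seen)  # [('Jan', '2000')]
print(net_like.seen)  # [('', ''), ('Jan', '2000')]
```

In the second run the first recorded call sees empty fields, which corresponds to the uninitialised values observed in Step 3.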

Fig. 7. Values of L, B and R in error.

7.1.1. The root cause of this error was different event ordering in Visual Basic 6 and Visual Basic.NET

Even a simple error such as this illustrates the power of relative debugging. Using conventional techniques we would have quickly been able to determine that the value of M was incorrect, since it was generating an invalid value for the Mid function. However, at this point we would have had no idea whether the root of the problem arose in the MONTH_NUM_FOR_ABBREV function or in the value of Q. In tracking the cause we might have spent considerable time examining code for errors, even though it was actually correct. Using relative debugging we immediately know where the source of the error is located.
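The comparison that a relative-debugging assertion performs can be sketched as follows. This is a minimal Python illustration, not Guard's implementation or its assertion syntax: the function `first_divergence` and the trace dictionaries are assumptions made for the example. Only the values M = 1 and M = 0 come from the case study; the values of Q are invented.

```python
def first_divergence(reference, suspect):
    """Return (index, reference_value, suspect_value) at the first mismatch, else None."""
    for index, (ref_value, sus_value) in enumerate(zip(reference, suspect)):
        if ref_value != sus_value:
            return index, ref_value, sus_value
    return None


# Values recorded each time the asserted statement is reached.
# M = 1 versus M = 0 is reported in the case study; the Q values are invented.
reference_trace = {"M": [1], "Q": ["Jan"]}
suspect_trace = {"M": [0], "Q": [""]}

report = {
    name: first_divergence(reference_trace[name], suspect_trace[name])
    for name in reference_trace
}
print(report["M"])  # (0, 1, 0)
print(report["Q"])  # (0, 'Jan', '')
```

Because the reference program supplies the expected values, the user does not have to know in advance what M or Q should be; the comparison alone shows which inputs already disagree.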


Fig. 8. Code fragment for computing L, B and R.

Fig. 9. Code fragment for JDE_FOR.

Fig. 10. Code fragments for JD_NUM_FOR.


Fig. 11. Fixing the coding error.

7.2. Problem 2

On correcting the first problem, the program ran without error. However, the values being computed were wrong, as shown in Fig. 7.

Step 1. Using Guard to place assertions on Q, INTERFACE_DATE and INTERFACE_TIME (see Fig. 8) we determined that the error was contained in the JDE_FOR function (shown in Fig. 9).

Step 2. Placing assertions on Date_String, W and Q indicated that the routine JD_NUM_FOR was in error. The relevant code is shown in Fig. 10.

Steps 3–5. Placing assertions on the result of JD_NUM_FOR and the define points of variables MM, MMM, Pointer, Q and DD allowed us to trace back through the expressions and determine that DD was correct but Len(DD) was incorrect. On closer inspection it transpired that in the Visual Basic 6 version of the code, the Len function only takes a string argument, and thus when it is passed a variant of type double this is first converted to a character string, and then the length of that string is returned. However, in Visual Basic.NET, the Len function takes an object as a parameter and then it returns the length of the object, in this case a value of 8. The code can be corrected by explicitly converting DD to a string as shown in Fig. 11.
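The two behaviours of Len described in Steps 3–5 can be imitated in a few lines. The sketch below is a Python approximation, not Visual Basic: `len_vb6_like` and `len_vbnet_like` are hypothetical helpers, the `.15g` format only roughly stands in for Visual Basic's number-to-string conversion, and the value of `dd` is invented.

```python
import struct


def len_vb6_like(value):
    """Approximate the VB6 behaviour: coerce to a string, then count characters."""
    text = value if isinstance(value, str) else format(value, ".15g")
    return len(text)


def len_vbnet_like(value):
    """Approximate the VB.NET behaviour: a string gives its character count,
    a double gives its storage size in bytes."""
    if isinstance(value, str):
        return len(value)
    if isinstance(value, float):
        return struct.calcsize("<d")  # an IEEE 754 double occupies 8 bytes
    raise TypeError("this sketch only models strings and doubles")


dd = 5.0  # invented value standing in for the variable DD
print(len_vb6_like(dd))                     # 1
print(len_vbnet_like(dd))                   # 8
print(len_vbnet_like(format(dd, ".15g")))   # 1
```

The last line mirrors the kind of correction shown in Fig. 11: once the value is converted to a string explicitly, both versions agree on the length.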

7.2.1. The root cause of this error was different handling of variant parameters with built-in functions

An important point with both of these errors is that whilst a conventional debugger would have allowed us to explore whether values were correct or not, it would also have required us to have a good understanding of the values that were expected and the algorithms being employed. On the other hand, with relative debugging we did not need to have such a mental model, and could focus instead on just comparing one program with another and tracing the define points of variables. When the data structures are complex and large, this becomes a very significant advantage over conventional approaches. Moreover, the errors were located very quickly and efficiently.
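Tracing back through define points amounts to finding the variables that differ between the two programs even though everything they were computed from agrees. A minimal Python sketch of that idea follows; the function `root_suspects` and the dependency table are assumptions made for illustration, and the table is a simplified, hypothetical rendering of Problem 2 rather than the actual structure of Earth.

```python
def root_suspects(defined_from, differs):
    """Return the differing variables none of whose source variables differ."""
    return sorted(
        name
        for name in differs
        if not any(source in differs for source in defined_from.get(name, ()))
    )


# Hypothetical, simplified define-use table: each variable maps to the
# variables used in the expression that defines it.
defined_from = {
    "L": ["JDE"],
    "JDE": ["JD_NUM"],
    "JD_NUM": ["MM", "len_DD"],
    "len_DD": ["DD"],
}

# Variables whose assertions failed when the two versions were compared.
differs = {"L", "JDE", "JD_NUM", "len_DD"}

print(root_suspects(defined_from, differs))  # ['len_DD']
```

Here `len_DD` is singled out because its only input, `DD`, matched in both versions, which is the same conclusion reached manually in Steps 3–5.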

8. Future work and conclusions

It is far too early to claim that .NET is a suitable platform for scientific computation since it has only been released for a short time and there are few commercial codes available, and virtually no scientific ones. As discussed in Section 1, we believe that .NET offers a number of potential benefits for large numeric models. However, the execution environment is very different from other platforms and so it is critical that as many tools as possible are available to facilitate the transition of existing legacy software. Guard is one such tool because it allows a user to compare two executing programs simultaneously on different platforms.

Whilst the implementation of Guard under UNIX alone is mature and has been used on many case studies, the current version under VS.NET is still in a pre-beta testing phase. Specifically, the control of programs by Visual Studio under UNIX at this stage is rudimentary and this is why the case study presented here focussed entirely on the Windows operating system. We are also planning a number of extensions that will be required if Guard is to be of practical use in supporting the migration to .NET. The current user interface is fairly simple and must be made more powerful if it is to be applied to large programs. At present only simple data types and arrays are supported. We need to extend this to encompass the range of types found in scientific codes, such as structures and other complex types. It must be possible to save assertions and restore them when the environment is restarted, and assertions should employ symbolic markers which are independent of the actual numeric line numbers.


We are also planning to integrate Guard into SourceSafe [10], Microsoft’s equivalent of SCCS [18] or RCS [23], making it possible to compare one version of a program with previous versions automatically. We have already experimented with a version of Guard under UNIX that provides explicit support for parallel programming [24], and we plan to enhance the support for multi-process programs in the Visual Studio version to make it feasible to debug programs running on a cluster of Windows machines. Finally, we are working on a new version of Guard that attempts to perform much of the generation and refinement of assertions automatically. Whilst this project is in the early stages, it appears that using powerful data flow analysis of the two programs would allow us to generate and refine a number of the assertions without user involvement.

Acknowledgements

This work has been funded by grants from the Australian Research Council and Microsoft Corporation. We wish to acknowledge the support of a number of individuals at Microsoft for their assistance on various issues related to Visual Studio and .NET. Particular thanks go to Todd Needham, Dan Fay and Frank Gocinski. We also wish to acknowledge our colleagues, Professor Christine Mingins, Professor Bertrand Meyer and Dr. Damien Watkins for many helpful discussions. Some of the coding for the UNIX debug engine was performed by Le Phu Dung, and the ‘Earth’ case study was performed by Tim Ka-chung Ho and Clement Chu.

References

[1] D. Abramson, I. Foster, J. Michalakes, R. Sosic, Relative debugging: a new paradigm for debugging scientific applications, Commun. Assoc. Comput. Mach. 39 (11) (1996) 67–77.

[2] D. Abramson, I. Foster, J. Michalakes, R. Sosic, Relative debugging and its application to the development of large numerical models, in: Proceedings of the IEEE Supercomputing, San Diego, CA, December 1995.

[3] D. Abramson, R. Sosic, A debugging and testing tool for supporting software evolution, J. Autom. Softw. Eng. 3 (1996) 369–390.

[4] D. Abramson, R. Sosic, A debugging tool for software evolution, in: Proceedings of the Seventh International Workshop on Computer-Aided Software Engineering (CASE-95), Toronto, Ont., Canada, July 1995, pp. 206–214; D. Abramson, R. Sosic, A debugging tool for software evolution, in: Proceedings of the Second Working Conference on Reverse Engineering, Toronto, Ont., Canada, July 1995.

[5] D. Abramson, R. Sosic, G. Watson, Implementation techniques for a parallel relative debugger, in: Proceedings of the International Conference on Parallel Architectures and Compilation Techniques (PACT’96), October 20–23, 1996, Boston, MA.

[6] T. Bemmerl, R. Wismüller, On-line distributed debugging on scaleable multicomputer architectures, High Performance Computing and Networking, vol. II: Networking Tools, Lecture Notes in Computer Science, vol. 797, Springer, Berlin, April 1994, pp. 394–400.

[7] H. Casanova, J. Dongarra, NetSolve: a network server for solving computational science problems, Int. J. Supercomput. Appl. High Perform. Comput. 11 (3) (1997) 212–223.

[8] D. Cheng, R. Hood, A portable debugger for parallel and distributed programs, in: Proceedings of the Supercomputing’94, November 1994, pp. 723–732.

[9] J. Czyzyk, J. Owen, S. Wright, Optimization on the internet, OR/MS Today, October 1997.

[10] http://www.seg.org/research/3Dmodel/SEPformat.html; http://msdn.microsoft.com/ssafe/.

[11] http://www.eclipse.org.

[12] http://www.lahey.com/netwtpr1.htm.

[13] http://www.sun.com/forte/.

[14] L. Snyder, A Programmer’s Guide to ZPL, MIT Press, Cambridge, MA, 1999.

[15] T.J. LeBlanc, J.M. Mellor-Crummey, Debugging parallel programs with instant replay, IEEE Trans. Computers C-36 (4) (1987) 471–482.

[16] J. May, F. Berman, Panorama: a portable extensible parallel debugger, in: Proceedings of the ACM/ONR Workshop on Parallel and Distributed Debugging, San Diego, CA, May 1993, pp. 96–106.

[17] B. Meyer, .NET is coming, IEEE Comput. 34 (8) (2001) 92–97.

[18] Programming Utilities and Libraries, Sun Release 4.1, Sun Microsystems, 1988.

[19] N. Ramsey, D. Hanson, A retargetable debugger, in: Proceedings of the SIGPLAN’92 Conference on Programming Language Design and Implementation, ACM, 1992, pp. 22–31.

[20] R. Sosic, D. Abramson, Guard: a relative debugger, Softw. Pract. Exp. 27 (2) (1997) 185–206.

[21] R. Stallman, Debugging with GDB: The GNU Source Level Debugger, 4.12 ed., Free Software Foundation, January 1994.

[22] J. Tanner, A. Mason, http://FreeVBCode.com.

[23] W. Tichy, RCS - a system for version control, Softw. Pract. Exp. 15 (7) (1985) 637–654.

[24] G. Watson, D. Abramson, Relative debugging for data parallel programs: a ZPL case study, IEEE Concurrency 8 (4) (2000) 42–52.

[25] R. Wismuller, M. Oberhuber, J. Krammer, Interactive debugging and performance analysis of massively parallel applications, Parallel Comput. 22 (3) (1996) 415–442.


David Abramson has been involved in computer architecture and high-performance computing research since 1979. Prior to joining Monash University in 1997, he held appointments at Griffith University, CSIRO, and RMIT. At CSIRO he was the program leader of the Division of Information Technology High-Performance Computing Program, and was also an adjunct Associate Professor at RMIT in Melbourne. He was also a program manager in the Co-operative Research Centre for Intelligent Decisions Systems. Abramson is currently the head of the School of Computer Science and Software Engineering (CSSE) at Monash University, Australia. CSSE consists of over 80 academic staff members across two campuses. He is a project leader in the Co-operative Research Centre for Distributed Systems Nimrod Project and also Chief Investigator on an ARC funded research project called Guard, a relative debugger. Abramson has chaired a number of international conferences, including the prestigious ACM International Symposium on Computer Architecture in 1992. He has published over 100 papers and technical documents. He has given seminars and received awards around Australia and internationally and has received nearly $2 million in research grants. He is a co-founder of Active Tools P/L with Dr. Rok Sosic, a company which was established to commercialise the Nimrod project and Guardsoft, a company focused on commercialising the Guard project. Abramson’s current interests are in high-performance computer systems design and software engineering tools for programming parallel and distributed supercomputers.

Greg Watson is a technical staff member in the Advanced Computing Lab at Los Alamos National Laboratory. His research interests focus on tools for parallel computers, program debugging, distributed computing and operating systems. He received his BS in computer science from the University of Tasmania and his PhD from Monash University. Prior to his current position, Greg worked as a senior research fellow in the School of Computer Science and Software Engineering at Monash University. He is a member of the IEEE, Immediate Past President of the Internet Society of Australia and Co-Chair of the Australian Domain Administration.