
SPECIAL THEME: Environmental Modelling

ERCIM News No. 61, April 2005

e-Collaboration and Grid-on-Demand Computing for Earth Science at ESA
by Luigi Fusco, Veronica Guidetti and Joost van Bemmelen

Information and Communication Technologies (ICT) have proven to be key instruments in activities targeted at protecting the environment and its integration in sustainable development policies. The European Space Agency (ESA) considers GRID computing capabilities as powerful tools to manage large volumes of Earth observation data and provide the Earth Science community with on-demand services. ESA is currently involved in several projects requiring a Grid infrastructure to support e-collaboration and digital library applications.

One of the key requirements faced by the earth and environmental science community is to improve collaboration across the many geographically distributed research groups and enable them to share data acquired by diverse means (ground, airborne and satellite), computational and storage resources, knowledge, and experimental results.

Complexity is caused by the fact that:
• very often a single instrument can serve several applications, and a given application wishes to access all available instruments;
• each application needs specific time and space data sampling;
• standards and data handling practices differ between research communities.

The Grid and emerging e-collaboration computing services are considered powerful new instruments, enabling and strengthening remote collaboration among researchers. For example, a Grid-based infrastructure can permit alternative approaches to access and exploit large data sets. Instead of traditional remote sensing data ordering and delivery from the acquisition/storage facilities to the user sites, user specialised processing modules could be located wherever data and computing resources are available. Specific on-demand data products, based on the best user-defined parameters, could then be generated or retrieved from archives and downloaded in real time. Such services can be available via the web to members of a virtual thematic community, using the emerging Web Services standards.
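As a concrete illustration of such an on-demand service call, the sketch below submits a product request over HTTP. It is a minimal sketch only: the endpoint URL, parameter names and JSON response fields are invented for the example and do not describe ESA's actual interface.

import requests  # third-party HTTP client

GRID_SERVICE = "https://example.org/grid-on-demand/products"  # hypothetical endpoint

request_spec = {
    "product_level": 3,                   # level-3: time/space remapped variable
    "variable": "chlorophyll",
    "region": [-30.0, 30.0, 40.0, 70.0],  # lon_min, lon_max, lat_min, lat_max
    "period": "2004-05",                  # one month of MERIS acquisitions
    "grid_resolution_deg": 0.05,
}

# The processing runs where the data are archived; only the small
# derived product travels back to the user.
response = requests.post(GRID_SERVICE, json=request_spec, timeout=60)
response.raise_for_status()
job = response.json()
print("submitted job", job.get("id"), "status:", job.get("status"))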

The European Space Agency at the ESRIN site (Frascati, Italy) has the mandate to acquire, process, archive and distribute data coming from the Earth observation satellites operated in Europe, organised in a fully distributed network of facilities. The ESA historical archive already includes some 2 PetaBytes of data, and the flagship ENVISAT mission, launched in 2002, is increasing this data holding by some 400 TeraBytes per year.

In recent years, ESA-ESRIN has experimented with the Grid and is currently involved in several applications and projects ranging from e-collaboration to the organisation of information in ad-hoc digital libraries, organised around the established Grid infrastructure. ESA intends to adopt the Grid computing philosophy for handling the instrument data of future Earth observation missions.

Grid-on-Demand Specific Demonstration Applications
Following successful demonstrations of GRID applications for Earth Science, both in studies funded by the ESA General Study Programme and through participation in EC-funded projects, a schedule was defined at ESRIN in 2004 to render mature GRID-based applications operational. In particular, the integration of the ENVISAT MERIS dedicated data processing tool (BEAM) in the so-called 'Grid on-Demand' activity was completed. Through Grid-on-Demand, authorised users can generate level-3 products (ie uniformly time and space remapped geophysical variables) used, for example, for the monthly mosaicking of the global chlorophyll or vegetation status derived from MERIS instrument data (see Figure 1).
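The level-3 remapping itself can be pictured as a binning step: every valid swath pixel is accumulated into the grid cell it falls in, and cells are averaged at the end. The following sketch shows the core idea with NumPy; it is not the BEAM implementation, which also handles quality flags and map projections.

import numpy as np

def bin_to_level3(lats, lons, values, res_deg=0.5):
    """Average irregular swath samples into a regular lat/lon grid (NaN = no data)."""
    ny, nx = int(180 / res_deg), int(360 / res_deg)
    acc = np.zeros((ny, nx))   # running sum per cell
    cnt = np.zeros((ny, nx))   # number of samples per cell
    iy = ((lats + 90.0) / res_deg).astype(int).clip(0, ny - 1)
    ix = ((lons + 180.0) / res_deg).astype(int).clip(0, nx - 1)
    np.add.at(acc, (iy, ix), values)
    np.add.at(cnt, (iy, ix), 1)
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(cnt > 0, acc / cnt, np.nan)

# Example: three samples, two of which share a grid cell.
grid = bin_to_level3(np.array([10.1, 10.2, -45.0]),
                     np.array([5.1, 5.2, 120.0]),
                     np.array([0.3, 0.5, 1.2]))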


Figure 1: MERIS mosaic image using data from the months of May, July, October and November 2004. The image is made up of true-colour images using four of the 15 MERIS spectral bands taken from Envisat (bands 2, 3, 5 and 7), with data combined from the selected separate orbital segments so as to minimise cloud cover as much as possible by using the corresponding data flags. In total, more than 1 TeraByte of data was processed to generate a final image of about 1 GigaByte.


Grid on-Demand supports scientific applications for future large ENVISAT data set access. Together with the high-performance processing capability of the Grid, it provides quick accessibility to data, computing resources and results. Figure 2 shows support for a science group interested in new algorithm development and fast validation of results.

The power of the Grid infrastructure will help to process and manage large amounts of satellite images, thus forming the basis for long-term data preservation, while digital library common functionality and third-party applications will allow users to retrieve, analyse and manage the contents, the services and the virtual organisation.

e-Collaboration
ESA also sees the Grid as a powerful means to improve the integration of data and measurements coming from very different sources to form a Collaborative Environment. The ESA-GSP project 'THE VOICE' (THEmatic Vertical Organisations and Implementation of Collaborative Environments, see http://www.esa-thevoice.org) aims at building an infrastructure that allows collaboration between different groups of researchers in the Earth observation field, and generates scientific prototypes in the domains of ozone data calibration and validation, innovative GMES (Global Monitoring for Environment and Security) application partnerships, agricultural and forest monitoring of rural areas, and marine applications.

Concluding Remarks
As already happened with the World Wide Web, the Grid together with e-Collaboration technology is expected to have a deep impact on our lives, not necessarily restricted to scientific applications. However, the extent to which Grid technology will be exploited in the future is closely connected to the adoption of common standards that allow different grids to collaborate, ie to work together.

For the Earth science community, it is important to continue investing in activities focusing on Grid and e-collaboration. Initiatives such as ESA Grid-on-Demand and projects like THE VOICE are demonstrating their relevance. These projects show that a Grid-based underlying infrastructure is a real asset for this community. It significantly improves the accessibility and usability of Earth science data, information and knowledge, and the way Earth science users collaborate.

Links: Grid-on-Demand: http://eogrid.esrin.esa.int

THE VOICE: http://www.esa-thevoice.org

Please contact: Veronica Guidetti, European Space Agency
Tel: +39 06 94180534
E-mail: [email protected]

Figure 2: MGVI (MERIS Global Vegetation Index) time composite (May 2004); algorithm for the FAPAR product developed by JRC (Nadine Gobron, 'A Time Composite Algorithm for FAPAR Products', Theoretical Basis Document, JRC Publication No. EUR 20150 EN).

ERAMAS — Environmental Risk Analysis and Management System
by Thilo Ernst, Andreas Hoheisel, Thomas Lux and Steffen Unger

The aim of the ERAMAS project is to develop a Grid-based system for analysing and managing pollutant-related environmental risks. Sophisticated simulation programs are used to forecast and evaluate the dispersion of carcinogenic and chemically toxic substances in the atmosphere, the soil and the groundwater, and to calculate the risk they pose to humans.

ERAMAS is a simulation-based analysis framework for calculating risks due to chemically toxic or carcinogenic substances being released, for example during accidents in industrial installations, the transport of dangerous goods or by terrorist attacks. It is designed to be applicable both for real-time emergency management and for risk mitigation activities such as simulation-aided studies concerning the design of approval procedures or emergency plans.

Figure 1 shows an overview of the simulation models involved in ERAMAS regarding various transportation paths for pollutants in the atmosphere and the soil. In the environmental simulation domain this is a typical scenario; nevertheless, integrating and coupling even a small number of heterogeneous simulation models, and making them available to technically unskilled users, amounts to a very complex and time-consuming effort.

ERAMAS is being developed using the technology of the Fraunhofer Resource Grid (http://www.fhrg.fraunhofer.de), which simplifies the coupling of heterogeneously distributed software, hardware and data resources (see Figure 2).


The Fraunhofer Resource Grid is a Grid initiative comprising five Fraunhofer institutes and funded by the German Federal Ministry of Education and Research. Its main objectives are to develop and implement a stable and robust Grid infrastructure within the Fraunhofer Gesellschaft, to integrate available resources, and to provide internal and external users with a user-friendly interface for controlling distributed applications and services on the Grid.

ERAMAS is a system with considerable resource demands. These arise not only from the inner complexity of its components, but also from complex workflows and usage scenarios in which a substantial number of component instances need to be executed, eg parameter studies. Such a system cannot be expected to run on a single workstation; parallel and distributed computing techniques are obviously necessary. However, the primary advantage of using Grid technology in ERAMAS is not the performance gain from having access to additional resources, but rather the organizational advantages in building and maintaining a distributed, highly heterogeneous simulation system. The Grid is used to organize the workflow of coupled simulations and to provide uniform access to a wide variety of hardware, software and data resources. The component abstractions offered by the Fraunhofer Resource Grid make the coupling of a wide range of models and data sources very easy: detailed knowledge of the internals of the components is no longer needed.

The simulation components integrated in ERAMAS are pure command-line applications, that is, they have no graphical user interface. Specialized simulators usually originate from the research sector, where this is considered normal. However, it conflicts severely with the goals of and application scenarios envisioned for ERAMAS, which call for strong support of the less technically skilled user, such as users from on-site emergency response teams. This gap is bridged by relying on the VirtualLab platform (http://vl.nz.dlr.de/VL), which is being developed through collaboration between the German Aerospace Center (DLR) and Fraunhofer FIRST. VirtualLab contains a subsystem for dynamically generating flexible and easy-to-use Web user interfaces for command-line applications, using abstract descriptions of their input datasets. Together with its generic Web portal features (such as protected user areas that persistently store simulation runs, integrated documentation management, and Web-based administration), VirtualLab is thus able to provide a powerful Web access layer for ERAMAS.
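The idea of driving both the command-line invocation and a generated web form from one abstract input description can be sketched as follows. The parameter set, tool name and description format below are invented for the example; VirtualLab's actual descriptors are richer.

from dataclasses import dataclass

@dataclass
class Param:
    name: str
    label: str
    default: float
    unit: str

# Abstract description of a hypothetical dispersion model's inputs.
PARAMS = [
    Param("release_rate", "Release rate", 2.5, "kg/s"),
    Param("wind_speed", "Wind speed at 10 m", 4.0, "m/s"),
    Param("duration", "Release duration", 600.0, "s"),
]

def build_command(tool, user_values):
    """Turn validated form values into the tool's command line."""
    args = [tool]
    for p in PARAMS:
        value = float(user_values.get(p.name, p.default))
        args.append(f"--{p.name}={value}")
    return args

def render_form():
    """Emit a minimal HTML form from the same description."""
    rows = [f'<label>{p.label} [{p.unit}]: '
            f'<input name="{p.name}" value="{p.default}"></label>'
            for p in PARAMS]
    return "<form>" + "<br>".join(rows) + "</form>"

print(build_command("dispersion_model", {"wind_speed": 6.0}))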

The ERAMAS system is being developed by Fraunhofer FIRST, in collaboration with Ingenieurbüro Beger für Umweltanalyse und Forschung and the Dresdner Grundwasser Consulting GmbH in Germany. The project is funded by the Arbeitsgemeinschaft industrieller Forschungseinrichtungen Otto von Guericke (AiF) in the programme Innovationskompetenz mittelständischer Unternehmen (PRO INNO).

The funded ERAMAS project commenced in July 2002 and finished in October 2004, and the result is a demonstration prototype of the ERAMAS system. ERAMAS can be viewed as a pilot project that is introducing Grid- and Web-based e-science methods into the environmental simulation and risk management community, and is developing and deploying a dedicated platform for the purpose.

Our aim for the future is to make ERAMAS a commercially operated service that can be used in a variety of ways, eg for advance analysis in licensing procedures or for drawing up action plans. Potential customers include chemical companies, haulage contractors and emergency services like the fire service. Another application area is analysis under real-time conditions, whether in the case of malfunctions in industrial plants, the transport of hazardous materials or terrorist attacks.

Links:
http://www.first.fraunhofer.de/en/eramas
http://www.fhrg.fraunhofer.de
http://www.first.fraunhofer.de/en/vlab

Please contact: Steffen Unger, Institute for Computer Architecture and Software Technology - FIRST, Fraunhofer ICT Group
E-mail: [email protected]

Figure 1: Simulation models of ERAMAS.
Figure 2: System architecture of ERAMAS.


GENIE: Grid Enabled Integrated Earth System Model
by Andrew Price, Tim Lenton, Simon Cox, Paul Valdes and the GENIE team

An understanding of the astonishing and, as yet, unexplained natural variability of past climate is an essential pre-requisite to increase confidence in predictions of long-term future climate change. GENIE is a new Grid-enabled modelling framework that can compose an extensive range of Earth System Models (ESMs) for simulation over multi-millennial timescales, to study ice age cycles and long-term human-induced global change. Grid technology is a key enabler for the flexible coupling of constituent models, subsequent execution of the resulting ESMs and the management of the data that they generate.

To predict the future, we must understand the past. In the case of planet Earth we do not yet fully understand the mechanisms that have driven the most fundamental change in climate over the past million years: the transitions between ice ages and warm inter-glacials. To improve understanding of the physical processes and feedbacks that are important in the Earth System, the GENIE project is creating a component framework that allows the flexible coupling of constituent models (ocean, atmosphere, land, etc.) of varying resolution (grid sizes), dimensionality (2D and 3D models) and comprehensiveness (resolved physics vs. parameterisations) to form new integrated ESMs. Through the systematic study of a hierarchy of GENIE models, the project aims to determine the spatial resolution and process complexity that actually need to be included in an ESM to exhibit past Earth System behaviour.

The GENIE project is funded by the Natural Environment Research Council (NERC) and brings together expertise from UK and international academic institutions. The Universities of Bristol and East Anglia, the Southampton Oceanography Centre and the Centre for Ecology and Hydrology have provided mature models of major Earth System components including atmosphere, ocean, sea ice, ocean biogeochemistry, sediments, land vegetation and soil, and ice sheets. The e-Science centres at the University of Southampton and Imperial College have been engaged to provide the software infrastructure for the composition, execution and management of the integrated Earth System Models and their output on the grid. We have strong international collaborations with researchers at the Frontier Research Centre for Global Change in Japan, the University of Bern in Switzerland and the University of British Columbia in Vancouver.

e-Science Challenge
The objectives of the GENIE project are to develop a Grid-based computing framework which will allow us:
• to flexibly couple together state-of-the-art components to form a unified Earth System Model (ESM),
• to execute the resulting ESM across a computational Grid,
• to share the distributed data produced by simulation runs, and
• to provide high-level open access to the system, creating and supporting virtual organisations of Earth System modellers.

Software
Grid computing technology is required to ease the construction of new instances of the Earth system model, automate the process of model tuning, speed up the execution of individual long integrations, enable large ensembles to be run, ease their execution, and feed and recycle data back into model development. A principal aim of the project is to ensure that the Grid is usable directly from the environment where the climate modellers are performing their work. The software deployed to meet these requirements is built upon products of the first phase of the UK e-Science programme. These include:

• Geodise Compute Toolbox
The Geodise computational toolbox for Matlab provides a suite of Matlab functions that give programmatic access to Globus Grid-enabled compute resources. The computational toolbox uses the APIs provided by the Java CoG toolkit to allow the submission of compute jobs to Globus-enabled resources, GridFTP data transfer and the management of proxy certificates. An interface to Condor resources is also provided.

• Geodise Database Toolbox
An augmented version of the Geodise Database Toolbox has been deployed to provide a distributed data management solution for the GENIE project. The Geodise system exploits database technology to enable rich metadata to be associated with any data file, script or binary submitted to the repository for archiving. XML schemas define the structure of the metadata and are mapped into the underlying Oracle 9i database. The database system is built on open W3C-compliant standards technologies and is accessed through a web services interface. Client tools are provided in Matlab and Jython which allow both programmatic and GUI access to the system.

• OptionsMatlab
OPTIONS is a design exploration and optimisation package that has been developed in the Computational Engineering and Design Centre at the University of Southampton. This software provides a suite of sophisticated multidimensional optimisation algorithms developed primarily for engineering design optimisation. The package has been made available to Matlab via the OptionsMatlab interface and has been exploited in conjunction with the Geodise Toolboxes to tune GENIE model parameters.




Tuning
A key challenge for the project is to tune or re-tune the parameterisations of individual model components so that the new coupled ESMs simulate reasonable climate states. In particular, it is imperative that the fluxes passed between components are compatible if the resulting coupled model is to be stable. We have exploited the Grid-enabled toolset in conjunction with the OPTIONS package to apply Response Surface Modelling techniques and Genetic Algorithms to optimise GENIE model parameters. In addition, the ensemble Kalman Filter, a data assimilation method, has also been employed. These techniques provide a comprehensive set of tools for a programme of extensive model tuning, which has progressed in step with model development.
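As an illustration of the evolutionary side of this tuning, the toy loop below evolves a small population of parameter sets against a misfit function. In GENIE the scoring step is a model run dispatched through OptionsMatlab and the Geodise toolboxes; here it is replaced by a synthetic stand-in, so everything below is a sketch of the loop structure only.

import random

def misfit(params):
    """Stand-in for running a model and comparing its output with observations."""
    target = [0.3, 1.7]  # hypothetical parameters giving a 'reasonable climate'
    return sum((p - t) ** 2 for p, t in zip(params, target))

def tune(pop_size=20, generations=50, sigma=0.1):
    pop = [[random.uniform(0, 2), random.uniform(0, 2)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=misfit)
        parents = pop[: pop_size // 2]          # keep the fittest half
        children = [[p + random.gauss(0, sigma) for p in random.choice(parents)]
                    for _ in range(pop_size - len(parents))]
        pop = parents + children                # mutation-only evolutionary step
    return min(pop, key=misfit)

print(tune())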

Current and Future Study
We are exploiting local resources (Condor pools, Beowulf clusters) and the UK National Grid Service to perform extensive studies of GENIE models. The computational Grid provides the means to perform large ensemble runs. To date, experiments have studied the stability of the thermohaline circulation to multi-parameter freshwater inputs, using typically ~1000 instantiations of the model and involving ~40 million years of model integration.

The database repository plays a central role in these studies as a resource both for steering computation and for sharing the data. Future work will involve the development of a distributed federated database system, deployment of the database on the National Grid Service data node(s) and further enhancements to the data management tools. The project will adopt the GeodiseLab Toolbox from the OMII (Open Middleware Infrastructure Institute) managed programme when this product is released to the community.

Links:
http://www.genie.ac.uk/
http://www.geodise.org/
http://www.omii.ac.uk/

Please contact: Andrew Price, Southampton Regional e-Science Centre, University of Southampton, UK
Tel: +44 23 8059 8375
E-mail: [email protected]

Figure 1: Schematic of the Earth System components that are modelled in GENIE.

Chemistry GRID and its Applications for Air Pollution Forecasting
by Róbert Lovas, István Lagzi, László Kullmann and Ákos Bencsura

Computational Grid systems are becoming increasingly popular in the natural sciences. In such systems, a large number of heterogeneous computer resources are interconnected to solve complex problems. The main aim of the national research project funded by the Hungarian Ministry of Education, 'Chemistry Grid and its Applications for Air Pollution Forecasting', was to look at feasible applications of Grid technology in computational chemistry from a practical point of view; for example, prevention of the harmful effects of high-level ozone concentrations.

In the project, the consortium (SZTAKI; Chemical Research Institute of the Hungarian Academy of Sciences; Department of Physical Chemistry, Eötvös University; Hungarian Meteorological Service) applied new Grid technologies to provide support for a specific research area. The developed infrastructure now provides chemists with access to both Hungarian computational Grid resources, called HUNGRID, and European-wide chemistry Grid infrastructures. The latter were established as the result of the EU-funded projects SIMBEX and EGEE.


SZTAKI has developed a product line comprising a Grid-monitoring tool called Mercury and two integrated application development environments: the P-GRADE parallel programming environment and the P-GRADE Grid portal (see Figure 2). These tools enable the efficient and transparent parallelization of sequential applications through their high-level graphical approach and special performance debugging and analyser tools. In the framework of the project, the P-GRADE portal was developed further to provide support for the efficient execution of complex programs in various Grids, eg in HUNGRID. It included the dynamic execution of applications across Grid resources according to the actual state and availability conditions provided by the new information system.

Consequently, HUNGRID is not only a virtual organization within EGEE: its new elements make it easier to use the infrastructure for solving complex problems, such as the modelling of air pollution.

The phytotoxic nature of ozone was recognized decades ago. Due to high emissions of ozone precursor substances, elevated ozone concentrations may cover large areas of Europe for shorter (episodic) or longer periods under certain meteorological conditions. These elevated concentrations can be potentially damaging to agricultural and natural vegetation. Occasional extreme concentrations may cause visible injury to vegetation, while long-term exposure, averaged over the growing season, can result in decreased productivity and crop yield.

A coupled Eulerian photochemical reaction-transport model and a detailed ozone dry-deposition model were developed to investigate ozone fluxes over Hungary. The reaction-diffusion-advection equations relating to air pollution formation, transport and deposition are solved on an unstructured triangular grid. The model domain covers Central Europe including Hungary, which is located at the centre of the domain and is covered by a high-resolution nested grid. The sophisticated dry-deposition model estimates the dry-deposition velocity of ozone by calculating the aerodynamic, quasi-laminar boundary-layer and canopy resistances. The meteorological data utilized in the model were generated by the ALADIN meso-scale limited-area numerical weather prediction model, which is used by the Hungarian Meteorological Service. The work demonstrates that the spatial distribution of ozone concentrations is a less accurate measure of the effective ozone load than the spatial distribution of ozone fluxes. The fluxes obtained show characteristic spatial patterns, which depend on soil moisture, meteorological conditions, ozone concentrations and the underlying land use (see Figure 1).
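The article does not reproduce its governing equations, but the generic forms behind such a model are standard; the following is an assumed sketch rather than the project's exact formulation. Each species concentration c_i obeys a reaction-diffusion-advection equation, and the dry-deposition velocity combines the three resistances named above in the usual resistance analogy:

\frac{\partial c_i}{\partial t} + \nabla \cdot (\mathbf{u}\, c_i) = \nabla \cdot (K\, \nabla c_i) + R_i(c_1, \dots, c_n) + E_i, \qquad v_d = \frac{1}{r_a + r_b + r_c}

where \mathbf{u} is the wind field, K the eddy diffusivity, R_i the chemical production and loss terms, E_i the emission rate, and r_a, r_b and r_c the aerodynamic, quasi-laminar boundary-layer and canopy resistances.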

This project has demonstrated that the Grid is an efficient computing platform for supporting complex collaborative work. Applications for air pollution forecasting (elaboration of smog alert response plans and a Gaussian plume simulation) have been developed and presented. The project partners have designed a collaborative application that runs on the Grid to forecast air pollution in Hungary. The same application can be used to simulate earlier smog events and to analyse the efficiency of smog alert response plans and the long-term effects of various measures against air pollution.

Link:
http://www.lpds.sztaki.hu/chemistrygrid

Please contact: Róbert Lovas, SZTAKI, Hungary
E-mail: [email protected]

Figure 1: Calculated ozone concentrations on 3 August 1998 at 17:00, with the wind field taken from the ALADIN weather prediction model.

Figure 2: (left) An application for modelling reaction-diffusion-advection systems in the P-GRADE portal (workflow representation and job description); (right) workflow- and job-level monitoring and visualization of execution on the HUNGRID infrastructure.


MEDARD – an Environmental Modelling Project for the Territory of the Czech Republic
by Kryštof Eben, Pavel Jurus, Jaroslav Resler, Michal Belda and Bernd C. Krueger

With the progress of the computing power of Linux clusters, operational installations of numerical weather prediction (NWP) models and chemistry transport models (CTM) have begun to spread through academic institutions across Europe. One such project, focusing on the territory of the Czech Republic, is MEDARD – a Meteorological and Environmental Data Assimilating system for Regional Domains. The system is run by the Institute of Computer Science of the Czech Academy of Sciences, in Prague. Its core consists of the NWP model MM5 (PSU/NCAR) coupled with the CTM model CAMx (ENVIRON Corp, USA). Data assimilation routines are under development.

In the Czech Republic, an operational statistical predictor of summer photochemical smog situations has been in use since 2000. This system uses a neural network predictor and a dynamic regression model. Its inputs are the daily maxima of tropospheric ozone concentrations and temperature, measured at the ground-level stations of the Automatic Immission Monitoring system (AIM) of the Czech Hydrometeorological Institute. The resulting concentration field is obtained by means of spatial interpolation. Although this system predicts daily ozone concentration maxima reasonably well, there is a need for a deterministic model which would give reliable values for regions sparsely covered by AIM stations and which would reflect the air circulation. On the other hand, deterministic models often suffer from uncertainties in emission inputs; systematic bias in some locations may occur. It is also desirable to make use of any available measurements. A system with incorporated data assimilation therefore represents a natural target for modelling efforts.

The primary aim of the MEDARD project is the development of such a system for air quality forecasting. Its origins date to 2001, when the first steps were made during the EU Framework V project APPETISE. The project is currently supported by the grant agency of the Academy of Sciences of the Czech Republic, within the framework of the 'Information Society' programme (No 1ET400300414). It involves researchers from the Institute of Computer Science (ICS) of the Czech Academy of Sciences, in collaboration with Charles University, the Czech Hydrometeorological Institute and the Institute of Meteorology at the University of Natural Resources and Applied Life Sciences in Vienna.

A suitable NWP-CTM model pair was sought for this purpose. The most natural choice for the NWP model was MM5, which is widely used in both the USA and Europe (eg by the EURAD group; www.eurad.uni-koeln.de). The model was configured for the territory of the Czech Republic: it has two nested domains with resolutions of 27 and 3 km, the NOAH land surface model is run, and the MRF scheme is used for the Planetary Boundary Layer.


Figure 1: Ozone concentrations – assimilated.
Figure 2: Ozone concentrations – free run.


The boundary conditions are taken from the GFS global forecast of NCEP. The CAMx model was chosen as the CTM part of the system. CAMx has pre-processors for meteorological fields from MM5; in our configuration it runs on two nested domains derived from the above MM5 domains, and uses the SAPRC99 chemistry mechanism and EMEP emission inventories.

Upon the MM5-CAMx pair, a common presentation layer was built. The results are available at a user-friendly Web site intended for the general public. It was designed so as to provide a quick orientation to the situation, and features two alternative displays, animation and quick switching of domains and products. In the field of data assimilation, a type of ensemble filter suitable for the CAMx model has been proposed and is in the testing phase. Pilot experiments are being run, with the aim of determining how much data assimilation will improve operational forecasts. Figures 1 and 2 show an example of the output of an ensemble run: the ensemble mean of ozone concentration (Figure 1) compared to the output of a free run without data assimilation (Figure 2).

Soon after the project commenced and the first weather prediction outputs became available, demand arose from the private sector for specialized outputs like lightning activity indices, risk of icing and local wind profiles. Some products of this kind have accordingly been developed. It also emerged that public interest is biased towards weather prediction, with interest in air quality being marginal. We are attempting to attract more interest in air quality issues by presenting the outputs of the CTM as a product of comparable importance to weather forecasts.

Commencing an operational run of the air quality forecast with incorporated data assimilation, however, requires a longer time schedule. Any such system will need on-line data, not only from the country immediately involved but also, due to transport phenomena, from neighbouring countries. Unfortunately there remain large differences between individual countries in the availability of on-line measurements of pollutant concentrations.

A big effort has to be made in designing a system for the downloading and validation of these on-line measurements. Together with the enormous computing-power demands of assimilation methods, the development of an operational data assimilating system for air quality prediction remains a significant challenge.

Link: http://www.medard-online.cz

Please contact: Kryštof Eben, Institute of Computer Science of the Czech Academy of Sciences / CRCIM
Tel: +420 266 053 720
E-mail: [email protected]

OASI: Integrated Monitoring and Decision Support for Environmental Systems
by Roberto Mastropietro, Lorenzo Sommaruga and Andrea E. Rizzoli

OASI is a system for regularly and automatically collecting environmental data from a network of hundreds of sensors distributed throughout the Canton of Ticino in southern Switzerland. It has been in successful operation for several months.

Canton Ticino is the southernmost part of Switzerland, and it lies right on one of the most important European transport axes, connecting Italy with Central Europe. It is a subalpine region with remarkable natural beauty, but its environment is subject to many pressures, among which road emissions play a major role. In 2001 the Swiss government decided to tackle the problem of growing traffic on the main national highways by funding initiatives aimed at making available the necessary data and tools to understand the impact of traffic on the environment, in order to enable policy makers to make appropriate decisions.

Within this context, the Land Management Department of Canton Ticino launched a project to develop an integrated monitoring and decision support system to collect, analyse and process multi-domain environmental data. The project outcome has been named OASI (Osservatorio Ambientale Svizzera Italiana), and was developed by SUPSI/DTI, the Department of Innovative Technologies at the University of Applied Science of Southern Switzerland.

OASI regularly and automatically collects environmental data from a network of hundreds of sensors distributed all over Canton Ticino. The collected data then automatically undergo a statistical, intersite and interdomain validation process before being made available to the end-users. In this process, past data, already validated, are used to detect anomalies in the measurements which have been recently collected.

Scientists can access the data repository via the OASI application in order to perform integrated data analyses. Users can select locations, parameters, time intervals and diagram types. Multiple curves can be displayed and compared in a single diagram. Data belonging to different domains can be shown on a single screen for comparison purposes, as shown in Figure 1.

The OASI software system has a scalable architecture that allows additional data domains, as diverse as basin levels and landslide monitoring indicators, to be seamlessly added to the set of monitored data. This is achieved thanks to its 3-tier architecture, composed of a data layer, a logical layer and a presentation layer, and to a flexible database design.




Data from different domains are collected in separate databases in the data layer, but they can be transparently integrated for validation and analysis in the logical layer. The software architecture therefore easily supports the integration of multiple databases, and it allows the deployment of the system in distributed configurations, making the solution flexible and scalable. The system allows the definition of user roles and data ownership: an organization may or may not allow its data to be seen or modified by users belonging to other organizations.

One of the most challenging aspects of the project relates to the number of measurements being collected and the size of the databases, which result in interesting storage management and system performance issues, solved using special design and advanced techniques made available by the database management system (Andretta et al. 2004).

Finally, the presentation layer is designed to serve both the needs of scientists and researchers and those of the general public, who want to be informed about the state of the environment.

The OASI system has been running in production mode for a few months, and it has raised the interest of other Cantons, which are now also integrating their data into the system for validation and analysis purposes.

Further developments of the OASI project will push towards even more open and mobile access to data, supporting interoperability among different client-server nodes and easy extensibility for integrating any kind of client device into the system, both for regular users and for system administrators. In particular, within the OASI context, Web services technology will be exploited to reconcile the need to disseminate analytical data about the environment, such as air, noise and traffic, with the accessibility requirements of different users and their devices, including distributed and heterogeneous systems and remote and mobile clients (Arauco and Sommaruga, 2004).

Link: OASI website: http://www.ti.ch/oasi

Please contact: Roberto Mastropietro, University of Applied Science of Southern Switzerland (SUPSI)
Tel: +41 91 6108578
E-mail: [email protected]

Figure 1: A screenshot of the OASI workspace.

Data Assimilation and Air Pollution Forecasting: the Berlin Case
by German Ariel Torres, Steffen Unger, Torsten Asselmeyer-Maluga, Vivien Mallet, Denis Quélo, Bruno Sportisse, Isabelle Herlin and Jean-Paul Berroir

In a joint French-German project, a three-dimensional chemistry-transport model called Polair3D has been improved in order to achieve better air pollution forecasts.

The first objective of this project was to prepare the Polair3D model for forecasting the air pollution in the Berlin-Brandenburg area. Polair3D is a 3D chemistry-transport model developed by Cerea (ENPC - EdF R&D), and is used for air quality forecasting on both continental (Europe) and regional scales (urban air pollution in Paris, Lille and Marseille). The second objective is to improve the quality of the forecasts by developing a sequential data assimilation framework adapted to Polair3D. Experiments are run by assimilating data provided by Berlin's air pollution monitoring network, Blume.



Running Polair3D on a new application site requires a set of input data to be prepared. This comprises:
• the meteorological fields, generated by the mesoscale meteorological model MM5, which was developed by Pennsylvania State University and the National Center for Atmospheric Research;
• the emission inventory, based on the EMEP European emission inventory and on the CityDelta project for urban emissions in the Berlin-Brandenburg area;
• the boundary conditions, obtained by nesting Berlin regional runs within Polair3D runs at the European scale, now routinely performed.

The methodology is systematic, and can be applied to regional air pollution forecasting with Polair3D in any European city for which a CityDelta emission inventory is available.

Following this phase, sequential data assimilation is used to adjust model inputs and parameters, in order to minimize the difference between forecasts and measurements. This process is performed each time a new measurement is acquired, hence the name 'sequential'. Data assimilation therefore helps to improve the quality of air pollution forecasts provided by the model.

One of the main advantages of sequential data assimilation is that it does not require the availability of an adjoint model, and can be implemented as a post-processing toolbox that can be plugged into any model with little effort. However, it carries a considerable computational cost, since it requires several direct runs for each new measurement.

During the project, two types of Kalman data assimilation procedure were implemented: a rank-reduced Kalman filter, which reduces the dimension of the problem, and an ensemble Kalman filter, where statistics of model errors are generated by performing many forward runs of the model. These procedures have been validated in case studies, and their use for the Berlin case is under development, since it requires the parallelization of the Kalman filter.

We are currently carrying out an evaluation of Polair3D results by performing a set of forecasts for the test period April-September 2001. Forecast error statistics are then computed by comparing the forecasts with measured in situ data. The project will be completed by quantifying the improvement brought by sequential data assimilation relative to direct runs without assimilation. A comparison with variational data assimilation methods is also possible.
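For readers unfamiliar with the ensemble variant, its analysis step can be sketched in a few lines: the forecast-error covariance is estimated from an ensemble of forward runs, and each member is nudged toward perturbed observations. The dimensions, station placement and error levels below are illustrative, not the project's configuration.

import numpy as np

def enkf_update(ensemble, obs, H, obs_err_std, rng):
    """Stochastic EnKF analysis. ensemble: (n_members, n_state);
    obs: (n_obs,); H: (n_obs, n_state) observation operator."""
    n_members = ensemble.shape[0]
    X = ensemble - ensemble.mean(axis=0)           # ensemble anomalies
    P_HT = X.T @ (X @ H.T) / (n_members - 1)       # B H^T estimated from ensemble
    S = H @ P_HT + np.eye(len(obs)) * obs_err_std**2
    K = P_HT @ np.linalg.inv(S)                    # Kalman gain
    for m in range(n_members):
        perturbed = obs + rng.normal(0.0, obs_err_std, size=len(obs))
        ensemble[m] += K @ (perturbed - H @ ensemble[m])
    return ensemble

rng = np.random.default_rng(0)
ens = rng.normal(50.0, 5.0, size=(40, 100))        # 40 members, 100 grid cells
H = np.zeros((3, 100))
H[0, 10] = H[1, 50] = H[2, 90] = 1.0               # 3 hypothetical stations
ens = enkf_update(ens, np.array([60.0, 55.0, 42.0]), H, 2.0, rng)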

This work has been carried out in the framework of a French-German project, funded under the Procope programme, and supported by an ERCIM post-doctoral fellowship. It involves the Clime team, a common INRIA and ENPC project located in the Paris area, and Fraunhofer FIRST, Berlin. Research was carried out at the three sites - INRIA, ENPC and Fraunhofer - during the period 2003-2004.

The project is continuing through cooperation with the University of Córdoba in Argentina. Polair3D will be used for air quality forecasts in this area, which will require the preparation of an emission inventory. Sequential data assimilation will be used to improve knowledge of the emissions. During this project, the feasibility of assimilating satellite measurements of chemical concentrations will be studied.

Please contact: Vivien Mallet, Clime Team (ENPC/INRIA), Ecole Nationale des Ponts et Chaussées, France
E-mail: [email protected]

Figure 1: Impact of assimilation. The blue curve is a forward run showing the ozone concentration over a one-day period at a single location. The black curve is obtained after assimilation of the measurements (shown as purple dots).

Figure 2: Forecast of ozone concentration with the Polair3D model during a one-day period (red curve), compared with ground station measurements at the same place (blue curve). The two curves show good agreement, except at the beginning of the day, because of an incorrect initialization.



Environmental Modelling – Extracting Maximum Information from Airborne Remote Sensing
by Eon O’Mongain and Liam Tuohey

Computerized systems for environmental assessment and decision support depend fundamentally on measurements from which the maximum amount of information must be extracted. Research on the case of airborne remotely sensed data is discussed.

There is an increasing need for local and regional government authorities to acquire computerized systems to assist in evaluating the current status of the physical environment in their area and in assessing the impact of proposed future developments. In general such authorities already have extensive environmental sampling programs (eg water characteristics at designated places on streams, rivers and lakes; air quality at specific locations; noise levels along certain roads) whose outputs must be incorporated in the new systems. Also to be incorporated are such data as population densities, patterns of transport usage including car ownership, agricultural practices and patterns, industrial development and so on. The computerized systems must include relevant models of various aspects of the environment (eg associations between water quality and population density, agricultural production and phosphate/nitrate levels, traffic and air quality, local weather patterns, social capital and car usage), and must be capable of incorporating new data sources and models in the future. The utility of such systems depends crucially on the quality of their input data, which involves maximising the information yield from available measurements. The particular focus of this article is on extracting environmental information from airborne hyperspectral measurements.

Spectral Signatures is a University College Dublin campus company which for several years has specialised in extracting environmental information on water quality and land productivity by means of optical instruments that have been designed and built in-house. Capabilities include in-vivo and non-contact sampling of water chlorophyll content, monitoring of suspended matter, dissolved organic matter and other pigments for the water industry, and remote-sensing (airborne) application of hyperspectral techniques for water quality and land productivity quantification and mapping.

Pigments are chemical compounds that reflect only certain wavelengths of visible light, which makes them appear 'colourful'. More important is the ability of pigments to absorb certain wavelengths. In particular, chlorophylls are greenish pigments, of which the most important for photosynthesis is chlorophyll a. In general, since each pigment reacts with only a narrow range of the spectrum, there is usually a need to produce several kinds of pigments, each of a different colour, to capture more of the sun's energy. Typically, each hyperspectral sample consists of 512 measurements, collected by means of an airborne 'Portable Multispectral Imaging Spectrometer', of the irradiance reflectance at equally spaced intervals in the visible spectrum. Flight campaigns have been carried out over several inland lakes and adjacent lands, with many samples collected during each flight.

Traditionally, broad-band ratio techniques, which make use of just a subset of the available measured reflectances, have been used to convert multispectral measurements into estimates or indices of vegetative biomass or photosynthetic activity. An important part of the present work is the development of full spectral vegetation indices, with the aim of extracting the maximum amount of information from measurements. Figure 1 provides an example of such an index, overlaid on simultaneously taken video images.
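The contrast between the two families of indices can be sketched on a synthetic 512-channel spectrum: a broad-band ratio uses two averaged bands, while a full-spectrum index weights every channel. The wavelength ranges, the synthetic red-edge spectrum and the weighting below are illustrative assumptions, not Spectral Signatures' algorithm.

import numpy as np

wavelengths = np.linspace(400.0, 900.0, 512)   # nm, visible to near-infrared
# Synthetic vegetation-like spectrum with a red edge around 715 nm.
reflectance = 0.05 + 0.4 / (1 + np.exp(-(wavelengths - 715) / 15))

def band_mean(lo, hi):
    sel = (wavelengths >= lo) & (wavelengths < hi)
    return reflectance[sel].mean()

# Broad-band ratio (NDVI-style): uses only a red and a near-infrared band.
red, nir = band_mean(640, 680), band_mean(780, 850)
ndvi = (nir - red) / (nir + red)

# Full-spectrum index: a weighted integral over all 512 channels,
# here with a simple contrast weighting centred on the red edge.
weights = np.tanh((wavelengths - 715.0) / 50.0)
full_spectrum_index = np.trapz(weights * reflectance, wavelengths)

print(f"NDVI-style ratio: {ndvi:.3f}, full-spectrum index: {full_spectrum_index:.1f}")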


Figure 1: Vegetation (Productivity) Index and Video Images of a Land Area.



In general, absorption data contain chemical information while scattering data contain information on the physical characteristics of particles. An important objective of the present research is to establish a systematic approach to extracting, from observations, maximum information on the ratio of scattering to total attenuation (scattering + absorption) in order to characterize lakes (and in-shore waters) comprehensively. The approach is to develop accurate mathematical models of the radiative transfer process, starting from classical solutions but drawing on modern tools such as Mathematica to make them more tractable (see, for example, W.G. Tuohey, 'Picard style iteration for anisotropic H-functions', Journal of Quantitative Spectroscopy and Radiative Transfer, in press, January 2005).
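For orientation, in the classical isotropic-scattering case (the cited paper treats the anisotropic generalisation) the H-function satisfies Chandrasekhar's nonlinear integral equation, and a Picard-style iteration solves it by repeated substitution starting from H_0 \equiv 1:

H(\mu) = 1 + \mu\, H(\mu) \int_0^1 \frac{\Psi(\mu')\, H(\mu')}{\mu + \mu'}\, d\mu', \qquad H_{n+1}(\mu) = 1 + \mu\, H_n(\mu) \int_0^1 \frac{\Psi(\mu')\, H_n(\mu')}{\mu + \mu'}\, d\mu'

where \Psi is the characteristic function of the scattering problem. Each iterate requires only a quadrature over the previous one, which is what makes symbolic and numerical tools such as Mathematica effective here.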

In conclusion, the aim is to establish an accurate, reliable, easily updateable, and relatively inexpensive source of information on the environmental quality of water bodies and on the productivity of their catchment areas. This source would constitute a key underlying component of a computerized environmental evaluation system as described at the outset. In particular, a decision support system might include integration of this information source with in-situ measurements of environmental quality within or near waste water systems or discharges.

Links: http://www.ucd.ie/spectral/

Please contact:
Eon O’Mongain, University College Dublin, Ireland
Tel: +353 1 7162233
E-mail: [email protected]

Liam Tuohey, Dublin City University, Dublin, Ireland
Tel: +353 1 7008728
E-mail: [email protected]

Getting the Most out of Earth Observation
by Martin Juckes

The imperative to understand our environment becomes ever more urgent as global change brings new uncertainties. At the same time, rapid technological advances are making it possible to observe the global environment from space in ever finer detail. In response to the challenges of handling an ever greater volume and diversity of data, the Multiple Instrument Stratospheric Tracer Assimilation (MISTA) project at Rutherford Appleton Laboratory has developed a new algorithm for extracting optimal information from satellite measurements. Thanks to gains in efficiency, a task normally done on supercomputers can now be carried out easily on a PC.

Photons leaving the Earth's atmosphere carry the signature of the emitting gases. Space instruments orbiting the Earth use a variety of filtering and detecting technologies to measure these photons and thence provide information which can be used to determine the state and composition of the atmosphere.

Imaging devices on geostationary satellites, measuring visible light reflected from clouds or solid surfaces, can give global coverage every 15 minutes or so. But this luxury is not available to those who want to study atmospheric composition in detail.

Our climate is significantly modulated by ozone, water vapour and methane in the stratosphere, all of which are present in concentrations of only 1 to 10 parts per billion (10^9) by volume. Detection of the weak signal emitted by these gases requires satellites flying much closer to the Earth. Europe's environmental monitoring research satellite, ENVISAT, launched in 2002 at a cost of one billion Euro, is typical of this class of satellites, flying at around 800 km altitude and orbiting the Earth every 100 minutes. The MISTA project has so far focused on results from one instrument on ENVISAT: the Michelson Interferometer for Passive Atmospheric Sounding (MIPAS). Of the three instruments on ENVISAT which were designed to detect stratospheric gases, MIPAS has been the most successful to date.

The greater detail which can be obtained by instruments flying closer to the Earth is gained at the expense of losing the regular global coverage provided by the geostationary instruments. MIPAS provides a string of observations along the satellite orbital path. As the Earth rotates under the orbit, a picture of the global atmosphere can be built up. However, many of the features which are observed evolve significantly over the course of a day because of physical processes, many of which are well understood.

A wide range of methods have been developed to combine the prior knowledge we have of the physical processes with the information stream coming from the measurements. The problem can be formulated in a Bayesian framework, combining observational information with prior information expressed in a computational approximation to the relevant physical laws. So far so simple, at least to those involved in this area of research. The novelty in this project is in the way it tackles the numerical challenges thrown up by the Bayesian framework. Many other research institutes working on the same problem are exploiting existing forecast models. These are computational representations of the physical laws designed to predict the future state of the system from a prescribed initial state.
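In a typical formulation of this kind (assumed here for illustration; the article does not spell out its equations), the analysis state x minimises a quadratic cost with a prior (background) term and an observation term, and setting the gradient to zero yields a large symmetric positive-definite linear system; it is this elliptic-like structure that the solver described next exploits:

J(x) = \tfrac{1}{2}\,(x - x_b)^{\mathsf{T}} B^{-1} (x - x_b) + \tfrac{1}{2}\,(y - Hx)^{\mathsf{T}} R^{-1} (y - Hx), \qquad \nabla J = 0 \;\Rightarrow\; (B^{-1} + H^{\mathsf{T}} R^{-1} H)\, x = B^{-1} x_b + H^{\mathsf{T}} R^{-1} y

where x_b is the prior state with covariance B, y the vector of measurements, H the observation operator and R the observation-error covariance.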



The MISTA project, by contrast, has started from scratch and developed code specifically designed for data analysis. This allows the full exploitation of the elliptic structure of the problem, a structure which follows naturally from a standard Bayesian formulation. A multigrid relaxation algorithm has been implemented. This type of algorithm is commonly used for the solution of elliptic partial differential equations, but has not been applied in this context before.

A particular advantage of the new algorithm is that it fully exploits observations both prior to and after an analysis time. Information propagates both forwards and backwards in time within the analysis system, with a typical half-life of around 2 days. This means that images such as the figure, which shows the ozone distribution in the stratosphere (at around 30km altitude) on July 6th, 2003, can be significantly more accurate than anything produced by a forecast system which, by its nature, can only exploit observations prior to the analysis.

Link: http://home.badc.rl.ac.uk/mjuckes/mista/

Please contact: Martin Juckes, British Atmospheric Data Centre Research, Space Science and Technology Department, Rutherford Appleton Laboratory, CCLRC.
Tel: +44 1235 445124
E-mail: [email protected]

Ozone concentration at 30km altitude (parts per billion by volume), 6 July 2003.

Meso-Meteorological Modelling for Air Pollution Applications in Mid-Latitudes
by Jose-Luis Palau, Goka Pérez-Landa and Millán M. Millán

Any approach to meteorological modelling needs to take into account a variety of processes that interact synergistically at different scales (mainly in mid-latitude and complex terrain areas). Because of their potential to resolve regional and local atmospheric circulations, meso-meteorological models represent interesting tools. They are highly complex, and their adaptation to a particular region requires an in-depth understanding of the limits of parameterization applicability.

Over the past few years there has been a growing need to simulate meteorological fields for complex situations at finer spatial resolutions. This has been partly stimulated by scientific and technological advances (eg in dynamics, computational methods and facilities) and partly by policy pressures requiring more detailed assessments of air pollution on urban to regional scales. As a consequence, complex dynamic models have increasingly been used in Europe and the USA for meteorological and air pollution applications. Models developed for short- or long-range applications, however, are not always appropriate in flow conditions involving intermediate mesoscale features and processes (of the order of 1-1000 km). This is because parameterizations, working hypotheses and configurations need to be different for differently scaled models (as stated in the explanatory memorandum for the implementation of the European Concerted Research Action designated COST Action 728: 'Enhancing mesoscale meteorological modelling capabilities for air pollution and dispersion applications').

In this context, our group is interested in situations of complex atmospheric flows for which mesoscale models are necessary (eg sea breezes, valleys and layered flows). It is generally acknowledged that current models are far from perfect; our goal is therefore to identify the gaps in our knowledge. We are performing this research within the following European frameworks: the Cluster of European Air Quality Research (CLEAR) project FUMAPEX (integrated systems for Forecasting Urban Meteorology, Air pollution and Population EXposure); the Network of Excellence ACCENT ('Atmospheric Composition Change: A European Network'); and the 'CarboEurope-IP, Assessment of the European Terrestrial Carbon Balance'.

Meteorological fields applied to air quality models (from global to local scales) may contain significant uncertainties that adversely affect simulations. A large number of meteorological variables are needed for 'generic' air quality models, including horizontal and vertical wind components, temperature, water vapour mixing ratio, cloud fraction and liquid-water content, precipitation (rain/snow), solar actinic flux, sea-level pressure, boundary layer depth, turbulence intensity, and surface fluxes of heat, moisture and momentum. In addition to these variables, the forecasting of air quality in mid-latitudes (as in Mediterranean coastal regions) is also highly sensitive to the fact that non-local (mesoscale) effects strongly determine flows at urban scales (as repeated experimental results have evidenced).

The aforementioned atmospheric state variables are insufficient under meteorological conditions marked by non-local dynamic effects (as, for example, the compensatory subsidences over the Mediterranean coasts associated with the orographic injections resulting from the coupling of sea breezes and inland orographic upslope winds). In this sense, the meteorological models must be configured in such a way that they are able to reproduce the mesoscale dynamics at these subtropical latitudes. For instance, the interaction between different scales must be reproduced: land-use, soil moisture, sea-surface temperature, grid nesting, domain configuration, and horizontal and vertical resolution are key magnitudes/parameters in describing wind flows for air-pollution forecasting purposes and must be set up properly (see Figure 1).
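As a purely illustrative sketch of what such a setup involves (the grid spacings are taken from the caption of Figure 1; every other name and value here is an invented assumption, not the authors' actual configuration), a nested-domain description might look like this:

# Hypothetical nested-domain configuration for a mesoscale run
config = {
    "domains": [
        {"id": 1, "dx_km": 13.5, "nx": 100, "ny": 100},            # outer grid
        {"id": 2, "dx_km": 4.5,  "nx": 100, "ny": 100, "parent": 1},
        {"id": 3, "dx_km": 1.5,  "nx": 100, "ny": 100, "parent": 2},
    ],
    # 'two_way' lets fine-grid features feed back to the coarse grids;
    # Figure 1 contrasts runs with and without this feedback.
    "two_way_nesting": True,
    "surface": {"land_use": "map", "soil_moisture": "initialised",
                "sea_surface_temperature": "analysis"},
}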

In recent years, different research projects around the world have demonstrated that the pollutant 'exchange rate' between different regions and under different meteorological (and climatic) conditions is driven by interactions and forcings between different meteorological scales, reaching distances of thousands of kilometres (see Figure 2).

For a believable evaluation of the impact of anthropogenic emissions from urban to global scales, one must therefore implement within numerical models all these spatial and temporal interactions, together with feedback from climate, regional air quality and transport (at local, regional and global scales). Moreover, in order to address certain issues (eg how different meteorological scales contribute to the Long-Range Transport - LRT - of regional pollution), it is necessary to resolve some scientific questions related to these meteorological interactions. In this sense, there are indications that the formation and distribution of photo-oxidants in urban plumes, at regional or continental scales, in the boundary layer and in the free troposphere, are all linked together. There are a number of EC research projects relevant to this, including MECAPIP - Meso-meteorological cycles of air pollution in the Iberian Peninsula; RECAPMA - Regional cycles of air pollution in the west central Mediterranean area; and SECAP - South European cycles of air pollution.

Experimental data and complementary modelling results from these projects have established links between atmospheric circulations from local, through regional, to sub-continental scales, particularly in summer and in the central and western Mediterranean basins.

Link: http://www.gva.es/ceam

Please contact: Jose Luis Palau, Fundacion CEAM, Spain
Tel: +34 96 131 82 27
E-mail: [email protected]

Figure 2: Feedbacks and interactions/forcings governing the transport of anthropogenic pollutants emitted at urban/local scales.

Figure 1: Meteorological (top) and dispersive (bottom) simulations performed by running RAMS v.4.3.0 and HYPACT v.1.2.0 on a Mediterranean coastal site for three days in July. Upper figures: time evolution of the vertical distribution of the simulated wind fields and boundary layer height, simulated on a Mediterranean coastal site using the same resolution grid. This analysis of the meteorological data shows important differences in the mesoscale model outputs between two different meteorological approaches. On the right, the high-resolution effects are included thanks to the two-way option between grids (resolving 1.5 km meteorological features); on the left, the model uses the resolution of the grid (13.5 km) without any feedback from the inner domains. Lower figures: to check the implications of the two different meteorological approaches in the simulation of a point-source plume in a Mediterranean coastal area, both model outputs were employed to run two respective Lagrangian dispersion simulations. As shown, the vertical distribution of the simulated SO2 concentration emitted from a power plant (black vertical line) is strikingly different at identical simulated times. Units are g/m3. (Panel labels: 'Resolution Grid: 13.5 km, without inner domains feedback' and 'Resolution Grid: 13.5 km, with inner domains feedback'.)


Air Pollution Modelling in Complex Terrain: 'Els Ports-Maestrat' Regional-Scale Study
by Jose-Luis Palau, Goka Pérez-Landa and Millán M. Millán

Characterization of atmospheric pollutant dispersion (advection and turbulent diffusion) requires a detailed description of the wind and turbulence fields, especially on complex terrain under summer conditions. Gaussian regulatory models frequently incorporate simplifications that are not valid under these circumstances. Software-engineering development and technological advances within the hardware industry have permitted the use of high-performance computing systems that are able to supply the computing power necessary for high-resolution meteorological and Lagrangian simulations.

At present, the numerical simulation of atmospheric dynamics is a necessary tool for meteorological diagnosis, analysis and forecasting. At the beginning of the seventies, global models based on primitive equations represented an important development in the characterization of synoptic-scale atmospheric processes. Afterwards, the need to analyse lower-scale phenomena, together with the availability of increased computing power, resulted in the development of regional models (or limited-area models), which were capable of resolving mesoscale atmospheric features. In the Western Mediterranean Basin, characterized by strong mesoscale circulations, these regional models (executed with high-resolution meshes) are required to resolve the atmospheric circulations/forcings.

During the last decade, the role of traditional supercomputers and workstations has been taken over by PCs. The huge PC market has benefited from the development of hardware technology and now shows an excellent price/performance ratio when compared with workstations. The growth of free software has facilitated the development of very productive software engineering, capable of efficiently interconnecting computers and parallelizing the demands of computing power. Consequently, PC clusters are now able to respond to high-performance computing needs. This has been an important advance for most atmospheric research groups, giving them more computing power for mesoscale modelling purposes and at a lower cost.

One of our scientific objectives is to incorporate this new technology into the air pollution research being undertaken at our institution (at both national and international levels). One of our national research projects - the 'Els Ports-El Maestrat' project - is being used as a pilot case to fit these new technologies/methodologies to complex terrain areas with strong thermal and orographic forcings within the lower troposphere.

The 'Els Ports-El Maestrat' field campaigns are sponsored by the government of the Autonomous Community of Valencia, Spain (specifically the 'Conselleria de Territori i Habitatge' and the 'Conselleria de Cultura, Educació i Sport'), and have been conducted on the south-western border of the Ebro basin (Northeast Iberian Peninsula) since November 1994. One of the objectives of these field campaigns is to monitor (both aloft and on the ground) the plume of sulfur dioxide (SO2) emitted from the 343m-tall chimney of the Andorra power plant located in Teruel (Spain).

Figure 1: Experimental and modelled SO2 plume, flowing NW from the power plant; (left) experimental measurements of the plume concentration distribution, obtained with the COSPEC over the road network around the power plant (150 ppm·m SO2; 25 July 1995, 11:40-12:06 UTC); (right) particle distribution simulated with the LPD, FLEXPART (particles between 0 and 2 h and between 2 and 4 h since emission; 25 July 1995, 10 UTC).

Figure 2: Experimental and modelled SO2 plume, turning from the east towards the north (from the power plant); (left) experimental measurements of the plume concentration distribution, obtained with the COSPEC over the road network around the power plant (150 ppm·m SO2; 26 July 1995, 10:47-11:57 UTC); (right) particle distribution simulated with the LPD, FLEXPART (particles between 0 and 2 h and between 2 and 4 h since emission; 26 July 1995, 13 UTC).


The 'Els Ports' database consists of three independent (but related) meteorological and air-quality databases, along with a fourth database composed of two different types of forest plots: forest experimental sites dominated by conifers, and plots where lichen transplant experiments have been performed. Using these four databases, we aim to understand how different (meso and synoptic) meteorological processes determine pollutant dispersion within this Iberian Peninsula basin. In addition to this, we are able to use the new technology to describe the atmospheric dispersion of SO2 emissions in summer from a power plant situated on very complex terrain in the Iberian Peninsula.

Through experimentation and modelling, the studies performed within this project attempt to characterize both the advection (through the reconstruction of three-dimensional wind fields) and the turbulent dispersion present during the period of analysis. Systematic SO2 plume tracking was carried out for three days in July, by means of a vehicle equipped with a COSPEC (COrrelation SPECtrometer). This passive remote sensor utilizes solar radiation to obtain measurements of SO2 distribution, both aloft and around the emission focus. In addition, the study used a non-hydrostatic mesoscale meteorological model, MM5, coupled to a Lagrangian Particle Dispersion (LPD) Model, FLEXPART.

Simulated dispersion results are generally checked against measurements of tracer-pollutant surface concentrations, with the dispersion analysis limited to the impact areas. The availability of measurements aloft enables us to verify the patterns of advection and turbulent diffusion that govern air pollution dynamics in the area. This is followed by an analysis of the cause-effect relation between the emission source and the ground-level concentration.

The mesoscale model uses a nested-grid configuration with five domains (100x100 grids spaced at 108, 36, 12, 4 and 1.3 km, respectively) centred over the power plant. The model predicts the wind field and turbulence parameters. The LPD model solves the turbulent wind components via a Markov process, which takes into account wind velocity variances and the three Lagrangian autocorrelations. To resolve the inhomogeneous turbulence in this complex terrain, the time step used is taken to be a function of the Lagrangian time scale.
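The following is a schematic of such a Markov step (not FLEXPART itself, and reduced to one dimension): each particle's turbulent velocity u' is updated with a memory term set by the Lagrangian time scale T_L, then used to advect the particle. The velocity variance sigma_u and all numerical values are illustrative assumptions.

import numpy as np

def lpd_step(x, u_turb, u_mean, sigma_u, T_L, dt):
    # One Langevin/Markov update for an ensemble of particles
    R = np.exp(-dt / T_L)                        # autocorrelation over dt
    xi = np.random.randn(*u_turb.shape)          # white-noise forcing
    u_turb = R * u_turb + np.sqrt(1.0 - R**2) * sigma_u * xi
    return x + (u_mean + u_turb) * dt, u_turb

# Release 1000 particles from the stack and track them for one hour,
# choosing dt as a fraction of T_L (cf. the time-step choice above).
n, T_L, sigma_u, u_mean = 1000, 200.0, 0.8, 3.0   # SI units, assumed
dt = 0.1 * T_L
x = np.zeros(n)
u_turb = sigma_u * np.random.randn(n)
for _ in range(int(3600 / dt)):
    x, u_turb = lpd_step(x, u_turb, u_mean, sigma_u, T_L, dt)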

From the point of view of advection (whole-body advection + differential advection), the coupled models were able to reproduce the typical stationary-dispersion scenarios as experimentally characterized with the COSPEC. However, a significant temporal delay was detected between the simulation and experimental measurements of the plume dispersion (see Figures 1 and 2).

From the point of view of turbulent dispersion (differential advection + turbulent diffusion), there is a significant discrepancy during the transition between dispersion scenarios (see Figure 2), between the experimental and modelled values of the horizontal distribution of plume concentration (σy, defined from the transversal axis to the average transport direction). This stands in contrast to the situation during stationary periods (see Figure 1). In the former situation, with no defined transport direction, classical dispersion parameters lose their physical meaning.

In conclusion, using an adequate configuration of these two models (MM5 and FLEXPART) and with the methodology shown, it is possible to simulate/characterize the main meso-meteorological and dispersive features in complex terrain under very strong insolation conditions (where Gaussian regulatory models fail when applied to air pollution sources).

Link: http://www.gva.es/ceam

Please contact: Jose Luis Palau, Fundacion CEAM, Paterna (Valencia), Spain
Tel: +34 96 131 82 27
E-mail: [email protected]

2-Days Ahead PM10 Prediction in Milan with Lazy Learning
by Giorgio Corani and Stefano Barazzetta

The lazy learning algorithm is used to predict PM10 air pollution levels in Milan, providing forecasts for the next two days with reasonable accuracy. In addition to the traditional data acquired by the air quality monitoring network, specific micrometeorological variables are algorithmically estimated and then used as input for our model.

Particulate matter (PM) is a mixture of particles that can adversely affect human health, damage materials and form atmospheric haze that degrades visibility. PM is usually divided up into different classes based on size, ranging from total suspended matter (TSP) to PM10 (particles less than 10 microns in aerodynamic diameter) to PM2.5 (particles less than 2.5 microns). In general, the smallest particles pose the highest human health risks. PM exposure can affect breathing, aggravate existing respiratory and cardiovascular disease, alter the body's defense systems against foreign materials, and damage lung tissue, contributing to cancer and premature death. Particulate matter includes dust, dirt, soot, smoke and liquid droplets directly emitted into the air by sources such as factories, power plants, cars, construction activity, fires and natural windblown dust.

The 'Air Sentinel' project, developed by the 'Agenzia Milanese per la Mobilità e l'Ambiente' (Agency for Mobility and the Environment, Milan, Italy), aims at publishing forecasts of pollutant concentrations in Milan. One-day statistical linear predictors of PM10 for different measuring stations have already been shown to provide satisfactory performance. These models compute a prediction of the daily PM10 average for the current day at 9 a.m. Evaluation of these predictors via cross-validation on a yearly scale shows a true/predicted correlation higher than 0.85, and a mean absolute error of about 10 µg/m3 (out of a yearly PM10 average of about 43 µg/m3).

We have now developed a two-day predictor; ie, at 9 a.m. this model predicts the daily PM10 average for the following day. We use the lazy learning (LL) prediction approach by Bontempi et al. LL has been shown to be viable for nonlinear time series prediction and, in particular, also for air pollution prediction. The strengths of the lazy learning approach are its predictive accuracy, its fast design (the development of an LL model is much quicker than a neural network), the easy model update procedure, and the readability of the model structure. The most relevant contribution to LL development and diffusion in recent years has probably been made by the Machine Learning group working at the University of Bruxelles, which continuously produces LL algorithmic enhancements and applications, and also releases the LL implementation as open-source code.
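A minimal sketch of the lazy-learning idea follows (not the ULB library, whose model selection is considerably richer): for each query, a simple local model is fitted on its k nearest examples, with k chosen by leave-one-out error. The feature names in the usage comment are illustrative.

import numpy as np

def lazy_predict(X, y, query, k_range=range(3, 30)):
    d = np.linalg.norm(X - query, axis=1)        # distances to all examples
    order = np.argsort(d)
    best = (np.inf, None)
    for k in k_range:
        idx = order[:k]
        # leave-one-out error of the local mean over the k neighbours
        loo = np.mean(((y[idx] - y[idx].mean()) * k / (k - 1)) ** 2)
        if loo < best[0]:
            best = (loo, y[idx].mean())
    return best[1]

# Hypothetical usage: predict tomorrow's PM10 from today's observations.
# X_train rows: [pm10_today, so2, temperature, pressure]; y_train holds
# the next day's PM10 average, as in the base model described below.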

The yearly average of PM10 in Milan has been substantially stable (about 45 µg/m3) since the beginning of monitoring in 1998 and, just to give an idea of the severity of the phenomenon, the PM10 daily average exceeds the limit (50 µg/m3) about 100 times every year. PM10 concentrations in Milan follow a typical weekly and yearly pattern: in particular, winter concentrations are about twice as high as summer ones, both because of unfavourable dispersion conditions and because of the additional emissions due to residential heating. Sunday concentrations are about 25% lower than on the other days of the week, because of the reduction in traffic volumes.

Our PM10 prediction application requires a careful investigation of the suitability of several variables; in addition to the classical data available from the air quality network (such as pollutant concentrations, wind speed and direction, temperature, atmospheric pressure etc, measured at ground level), we also consider micrometeorological variables (such as mixing layer height, Monin-Obukhov length, etc). These variables are algorithmically estimated and make it possible to characterize the dispersion conditions.
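One hedged example of such an algorithmic estimate is the Monin-Obukhov length from its standard definition, L = -rho * cp * T * u*^3 / (kappa * g * H). The friction velocity u* and sensible heat flux H would themselves have to be estimated from network data; here they are given directly, and all values are illustrative.

def monin_obukhov_length(u_star, heat_flux, temp_k,
                         rho=1.2, cp=1005.0, kappa=0.4, g=9.81):
    # Standard definition; negative L indicates unstable stratification
    return -rho * cp * temp_k * u_star**3 / (kappa * g * heat_flux)

L = monin_obukhov_length(u_star=0.3, heat_flux=150.0, temp_k=293.0)
# L < 0 flags convective (unstable) conditions; small |L| means strong
# convection, large |L| near-neutral: a compact dispersion indicator.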

Results and Discussion
We present the results obtained by cross-validating the model. At each run, a year of data is used as the testing set, in order to assess the ability of the model to predict previously unseen data. We compare the performance of the base model (whose inputs are PM10 (ie, the autoregressive term), SO2, temperature and atmospheric pressure) with that of the model with micrometeorological data. The probability of detecting over-threshold values and the false alarm rate refer to a threshold value set at 50 µg/m3.

Provided that predictions of the model are available as a range of values, rather than as a crisp number, the forecast accuracy can be considered satisfactory, especially if micrometeorological data are used. Figure 1 provides a simulation sample for January 2003.

Links:
Website of the Machine Learning group at the University of Bruxelles: http://www.ulb.ac.be/di/mlg/

Repository of papers presented at the 9th International Conference on Harmonisation within Atmospheric Dispersion Modelling for Regulatory Purposes (Garmisch, 2004), among them the paper by Barazzetta and Corani dealing with 1-day ahead prediction of PM10: http://www.harmo.org/conferences/Proceedings/_Garmisch/Garmisch_proceedings.asp

Chapter dedicated to air quality within the State of the Environment Report of Milan (published by Agenzia Mobilità e Ambiente): http://81.208.25.93/RSA/capitolo_3/main_capitolo.htm

Please contact: Giorgio Corani, Stefano Barazzetta, Agenzia Mobilità e Ambiente S.r.l., Italy
E-mail: [email protected], [email protected]

Table 1: Cross-validation performance of the base model.

Table 2: Cross-validation performance of the model enriched with micrometeorological indicators.

Figure 1: A simulation sample for January 2003.


Better Weather Forecasts with Use of Conservation Laws
by Jason Frank

Research at CWI aims to ensure structure in the midst of chaos. In long-time simulations of weather and climate, the Hamiltonian Particle Mesh method preserves Hamiltonian structure, and thereby energy, mass and vorticity. This technique could provide better quality solutions for weather forecasts.

Chaos was identified in meteorological systems in the 1960s by E.N. Lorenz. All solution curves of Lorenz's equations eventually become confined to a bounded region of space, yet two different solutions with only slightly perturbed initial conditions diverge from one another at an exponential rate. This divergence has dire consequences for any attempt to simulate the weather or climate with a computer, since errors that are necessarily made in the process of representing the original, analytical model in discrete form may be seen as perturbations of the solution. These unavoidable errors will, under the influence of chaos, cause the simulated solution to diverge from the true solution exponentially. With current numerical weather prediction technology, model and observation errors overwhelm the solution in roughly ten days.

On the other hand, the errors incurred in a numerical computer simulation are not random but systematic, taking the form of phase errors, numerical damping and so forth. For well-designed methods, the simulated solution may, while diverging exponentially from the true trajectory, still remain close to or 'shadow' some other solution satisfying a modified initial state, modified parameters, or even a perturbed model. In such cases, the solution may still be meaningful when properly interpreted. In other words, one may still be accurately simulating some weather; it is just unlikely that it will be the actual observed weather.

To deal with chaos, modern predictions of weather and climate often utilize an ensemble of simulations: a whole series of runs in each of which the initial state of the system is slightly perturbed. This yields a distribution of weather scenarios that can be subjected to statistical methods, and allows one to make statements like: "In 73% of all trials, rain was predicted for Amsterdam next Thursday".

When carrying out simulations on long time intervals in which the numerical map is iterated, say, a hundred thousand to a hundred million times, the qualitative differences between algorithms become amplified. It is therefore crucial to employ methods having properties similar to those of the physical model being simulated (such as energy and mass conservation) or having similar mathematical structure (such as symmetries).

Research at CWI, in collaboration with Potsdam University, focuses on preservation of Hamiltonian structure in simulations of atmospheric models. Hamiltonian structure lies at the root of classical mechanical systems, and can be used to unveil many of the conservation laws present in such systems. Since viscosity can be neglected in large-scale atmospheric flows, the equations of motion are conservative, and from the Hamiltonian structure one can derive conservation laws of mass, energy and vorticity. By ensuring that the computational algorithm also adheres to this structure, we can automatically guarantee that these conservation laws will also be inherited by the simulated solution. The preservation of Hamiltonian structure also has important implications for ensemble simulations. One can think of an ensemble of initial conditions as a set of points in space, each tracing out a solution. Initially the points are close together: for example, they form a ball. The ball is stretched and deformed as each point follows its solution trajectory. However, for Hamiltonian systems, the volume of the ball remains constant. This property is also inherited by the numerical method, and ensures that the members of an ensemble of simulations will each explore different possible scenarios, as opposed to being contracted to a smaller number of extreme solutions.

Preserving Hamiltonian structure using standard, purely grid-based computational techniques is difficult if not impossible. Instead we follow an approach in which a fluid is represented by a finite number of particles representing fluid masses, which satisfy classical mechanical equations of motion. For computational expediency, the potential field in which the particles move is represented on a grid. This is all carefully formulated to preserve Hamiltonian structure. The result is the Hamiltonian Particle-Mesh (HPM) method.
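As a schematic of the particle-in-potential idea (not the HPM code: HPM also updates the gridded field from the particle masses, whereas this sketch freezes it), the Stoermer-Verlet update below is symplectic, the simplest example of a structure-preserving integrator. Grid size, domain length and all other values are illustrative assumptions.

import numpy as np

ng, L_dom = 64, 1.0                  # grid size and domain length (assumed)
dx = L_dom / ng

def grid_force(x, phi):
    # Interpolate -dphi/dx from the grid to particle positions (linear)
    f_grid = -(np.roll(phi, -1) - np.roll(phi, 1)) / (2 * dx)
    j = np.floor(x / dx).astype(int) % ng
    w = x / dx - np.floor(x / dx)
    return (1 - w) * f_grid[j] + w * f_grid[(j + 1) % ng]

def verlet_step(x, v, phi, dt):
    # Symplectic Stoermer-Verlet: kick - drift - kick
    v = v + 0.5 * dt * grid_force(x, phi)
    x = (x + dt * v) % L_dom
    v = v + 0.5 * dt * grid_force(x, phi)
    return x, v

phi = 0.01 * np.cos(2 * np.pi * np.arange(ng) * dx / L_dom)  # gridded potential
x = np.random.rand(500) * L_dom
v = np.zeros(500)
for _ in range(1000):
    x, v = verlet_step(x, v, phi, dt=0.01)   # energy error stays bounded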

The HPM method has been under development since 2001. It has been used successfully for rotating planar and spherical shallow water equations, two-layer flows, and most recently, 2D vertical non-hydrostatic flow over topography. Figure 1 shows a simulation of a 2D non-hydrostatic flow over a mountain ridge using the HPM method, with the wind field (arrows) and potential temperature (colour contours) indicated. Interest in non-hydrostatic models has grown in recent years as increased computational power has allowed higher grid resolutions to be attained. Current plans in this project address the issues surrounding 3D simulations.

Figure 1: Simulation of a 2D non-hydrostatic flow over a mountain ridge, computed using the HPM method; potential temperature at t=30, with x and z in km. (Source: CWI.)

Links:
http://www.cwi.nl/projects/gi/HPM/
http://www.cwi.nl/~jason/

Please contact: Jason E. Frank, CWI, The Netherlands
Tel: +31 20 592 4096
E-mail: [email protected]

Modelling Ecological Health using AI Techniques
by Martin Paisley and Bill Walley

The conservation of the natural environment is crucial to our long-term survival. Scientists are now turning to artificial intelligence as a basis for a better understanding of our surroundings.

The Centre for Intelligent Environmental Systems (CIES), Staffordshire University, UK, specialises in the application of artificial intelligence (AI) to problems affecting the natural environment. Projects to date have concentrated on the development of intelligent systems for the biological monitoring of river quality, with the Centre's expertise in this field growing out of the pioneering work carried out by Bill Walley and Bert Hawkes in the early 1990s.

Biological monitoring of river quality has grown in importance over the past few decades, and the approach now underpins legislation recently introduced in the EU, the Water Framework Directive (WFD). According to the WFD, all freshwater bodies should be classified in terms of their ecology, with the target of reaching at least 'good' ecological status by 2015. The objective for CIES in the last few years has been to develop robust models for the interpretation of biological and environmental variables in terms of water chemistry and vice versa. Diagnosis is a key function; that is, determining likely pressures such as organic pollution, excess nutrients or acidity, from biological and environmental data. Prediction is also required; that is, forecasting the biological response to changes in pressures, perhaps as a result of modifications to land use or the treatment of waste water. Both of these functions are of fundamental importance for meeting the requirements of the WFD.

The methods used by CIES stem from the belief that experts use two complementary mental processes when diagnosis or prediction is required, namely probabilistic reasoning based upon their scientific knowledge, and pattern recognition based upon their experience of previous cases. Consequently, the group has followed two lines of AI research in parallel - probabilistic reasoning based on Bayesian methods, and pattern recognition based on neural networks and information theory.

The robust and holistic nature of Bayesian belief networks (BBNs) makes them well-suited to modelling complex systems, like river ecology, where the interaction between the elements is probabilistic in nature. This uncertainty is represented explicitly in the model by probability distributions that encode cause-effect relationships. CIES has used BBN technology to develop, in collaboration with the Environment Agency for England and Wales, a model of river ecology called RPBBN (River Pollution Bayesian Belief Network). The strength of this approach has been that the model can be used to diagnose the pressures that affect a site and/or predict the ecological changes resulting from a programme of remedial measures.
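A toy illustration of the BBN idea follows (not the RPBBN model itself): a single cause-effect link, with a prior on the pressure and likelihoods for an observed biological indicator, inverted by Bayes' rule for diagnosis. The states and probabilities are invented.

# P(organic pollution present): the prior 'pressure' probability
p_pressure = 0.2
# P(mayfly abundance low | pressure) and P(low | no pressure)
p_low_given_pressure, p_low_given_clean = 0.9, 0.25

# Diagnosis: observe 'mayfly abundance low', infer the pressure
evidence = (p_low_given_pressure * p_pressure
            + p_low_given_clean * (1 - p_pressure))
p_pressure_given_low = p_low_given_pressure * p_pressure / evidence
print(round(p_pressure_given_low, 2))   # ~0.47: belief roughly doubles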

The group has also developed its own pattern recognition system, MIR-Max (Mutual Information and Regression Maximisation), based upon information theory. This is now central to its pattern recognition systems and forms the basis of an application called RPDS (River Pollution Diagnostic System) that was developed for the Environment Agency. RPDS uses MIR-Max to categorise n-dimensional biological and environmental data into 'clusters' so that each cluster contains data samples that are similar. The clusters are then ordered in a two-dimensional hexagonal output 'map' so that similar clusters are close together. Since each cluster represents a particular set of biological and environmental variables, and a set of associated chemical variables, the map provides a powerful diagnostic system and data visualisation tool for examining patterns and relationships in the data (see Figure).
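The sketch below is a simplified stand-in for this cluster-and-map step: MIR-Max itself optimises mutual information, whereas here plain k-means plus a greedy layout merely conveys the two-stage idea (cluster, then arrange similar clusters near each other on a 2D map). Sample data and sizes are invented.

import numpy as np

def kmeans(X, k, iters=50):
    C = X[np.random.choice(len(X), k, replace=False)]
    for _ in range(iters):
        lab = np.argmin(((X[:, None] - C[None]) ** 2).sum(-1), axis=1)
        C = np.array([X[lab == j].mean(0) if np.any(lab == j) else C[j]
                      for j in range(k)])
    return C, lab

def layout_on_grid(C, rows, cols):
    # Greedily place similar clusters on neighbouring grid cells,
    # mimicking the ordered 2D output 'map' described above.
    order = [0]
    left = set(range(1, len(C)))
    while left:
        last = order[-1]
        nxt = min(left, key=lambda j: np.linalg.norm(C[j] - C[last]))
        order.append(nxt)
        left.remove(nxt)
    return np.array(order).reshape(rows, cols)   # map of cluster ids

X = np.random.rand(500, 8)            # 500 samples, 8 bio/env variables
C, lab = kmeans(X, 16)
grid = layout_on_grid(C, 4, 4)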

The current work of the group is focussed on integrating the BBN and pattern recognition approaches into a single system. This system, due for completion in 2007, will offer enhanced capacity for diagnosis, prediction and classification, and will provide decision support for those responsible for river management in England, Wales and Scotland.

Although bio-monitoring remains the principal application domain, the work of the group has diversified into other environmental applications such as oil spills on inland waters. The Oil Pollution Diagnostic System (OPDS) is a prototype computerised diagnostic tool designed to help the Environment Agency to determine the type of oil product involved, then to 'fingerprint' it to identify its source and to provide solid evidence for legal proceedings against the polluter. This is based on detailed matching of patterns in the gas chromatograms of the pollutant and the original product, after allowing for the effects of weathering. The pattern matching techniques of the OPDS can assist scientists in cases where the chromatograms are particularly complex and matches are hard to achieve.

Link: http://www.cies.staffs.ac.uk

Please contact:
Martin Paisley, E-mail: [email protected]
David Trigg, E-mail: [email protected]
Rob Buxton, E-mail: [email protected]
Bill Walley, E-mail: [email protected], Tel: +44 (0) 1785 353510

Distributions of three variables across an output map. The first two are biological variables (a mayfly and a snail) and the third is a chemical variable.

Mathematical Models for the Simulation of Environmental Flows: From the Strait of Gibraltar to the Aznalcollar Disaster
by Carlos Parés, Jorge Macías, and Manuel J. Castro

The realistic simulation of environmental flows is a subject of major economic, social, human and scientific interest. Human interaction with natural flows, through adding pollutants or being responsible for disasters such as the dam-break of Aznalcollar, feeds the necessity for numerical modelling and prediction.

The DAMFLOW project is looking at the development of efficient techniques for the numerical solution of hydrodynamic flows, with a particular focus on environmental applications. Among the project's aims is the development of robust and reliable numerical tools with low computational cost, which can predict and simulate hazards or emergency situations such as floods or oil spills. These numerical models may also assist, in some particular cases, in understanding the physical nature of the problem and the key processes involved.

Currently, a set of fascinating problems has attracted our attention. One of these is the exchange of two-layer water masses through the Strait of Gibraltar. The confinement of the flow by a strait can give rise to profound dynamic consequences, including choking or hydraulic control, a process similar to that by which a dam regulates the flow from a reservoir. The funnelling geometry can lead to enhanced tidal modulation and increased velocities, giving rise to local instabilities, mixing, internal bores, jumps and other striking hydraulic and fine-scale phenomena. In short, sea straits represent choke points which are observationally and dynamically strategic and which contain a range of interesting processes. The flow configuration in the Strait of Gibraltar is characterized by two counter-currents: at the surface, the less saline water of the Atlantic flows eastward, spreading into the Mediterranean, while at depth the waters of the Mediterranean flow westward toward the Atlantic Ocean. This situation means that it is possible and useful to use two-layer models to represent the region's dynamics and to better understand the key processes involved. As a result of the DAMFLOW project, a set of finite-volume two-layer shallow water models has been developed, comprehensively tested, and used to numerically simulate the dynamics of the Strait of Gibraltar. One-dimensional tidal modelling of the Strait (see Figure 1) revealed a complicated pattern of time-dependent hydraulic fluctuations involving changing interfacial levels, moving control points and reversal of the layer flows at different stages of the tide, in good agreement with observations. Two-dimensional modelling (see Figure 2) produces a more complex picture of tidal dynamics, for the interpretation of which several numerical and graphical tools have also been developed.
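For readers unfamiliar with the discretisation, here is a bare-bones sketch of a finite-volume shallow-water step (a single layer, flat bottom and a simple Rusanov flux); the DAMFLOW solvers are two-layer and include topography and other source terms, and every parameter below is illustrative.

import numpy as np

g = 9.81

def flux(h, q):
    # Physical flux of the 1D shallow-water system U = (h, q), q = h*u
    return np.array([q, q**2 / h + 0.5 * g * h**2])

def rusanov_step(h, q, dx, dt):
    U = np.array([h, q])
    F = flux(h, q)
    c = np.abs(q / h) + np.sqrt(g * h)           # local wave speeds
    a = np.maximum(c[:-1], c[1:])                # interface speed bound
    # Numerical flux at each interior cell interface
    Fi = 0.5 * (F[:, :-1] + F[:, 1:]) - 0.5 * a * (U[:, 1:] - U[:, :-1])
    Unew = U.copy()
    Unew[:, 1:-1] -= dt / dx * (Fi[:, 1:] - Fi[:, :-1])
    return Unew[0], Unew[1]                      # boundary cells held fixed

# Dam-break style initial state on a 1 km channel (hypothetical numbers)
n, dx = 200, 5.0
h = np.where(np.arange(n) < n // 2, 2.0, 1.0)
q = np.zeros(n)
for _ in range(200):
    dt = 0.4 * dx / np.max(np.abs(q / h) + np.sqrt(g * h))  # CFL condition
    h, q = rusanov_step(h, q, dx, dt)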

A second environmental problem that deserved our attention was the simulation of the Aznalcollar dam-break. On 25 April 1998, a massive spillage of tailings occurred from a pyrites mine lagoon at Aznalcollar (Seville, Spain) and damaged the environmentally sensitive Doñana region. This caused an estimated economic cost of 300 million euros and an unquantifiable cost in terms of environmental damage. Our project has focused on reproducing the Aznalcollar dam-break conditions in order to assess our model's performance. First, a digital model of the affected area was made from GIS data. A two-dimensional one-layer finite volume model was used (see Figure 3), which showed robust behaviour and produced good results; propagation speeds, the flooded area (see Figure 4) and draught sections were in good agreement with the available data. The model had previously been validated against test cases with analytical solutions and laboratory data. The results provided by the model have given us confidence that this numerical tool can produce and provide a full description of a potential flood wave and flooded area in other scenarios of potential risk.

At the present stage of the project:
• robust 1D and 2D finite-volume one- and two-layer solvers have been implemented, taking into account realistic topographical data
• an exhaustive model validation has been undertaken
• rigorous comparisons have been made with observed data
• input data for model problems were taken from observed measurements
• sensitivity studies have been performed on various physical parameters such as friction or density ratio in oceanographic applications
• fortnightly and monthly signals have been investigated in model variables (Strait of Gibraltar).

Numerical modelling and simulation require, besides algorithmic research and implementation, the use of 'interfaces' and other technological developments. In the framework of this project we have also undertaken the following tasks:
• development of pre-processing, post-processing, analysis and visualization tools
• parallelization of the numerical schemes on a PC cluster, providing good efficiency
• use of mesh adaptation techniques.

This research is partially funded by the C.I.C.Y.T. (Comisión Interministerial de Ciencia y Tecnología), project BFM2003-07530-C02-02, which commenced in December 2003 and will end in December 2006. The research team involved in this project belongs to the Differential Equations, Numerical Analysis and Applications group at the University of Málaga. Close collaborations exist with the Numerical Analysis in Fluid Mechanics group at Seville University led by Tomás Chacón, and other research groups at the Universities of Santiago de Compostela, Zaragoza and Cádiz, the I.E.O. (Spanish Institute of Oceanography), CITEEC (University of La Coruña) and the Université de Corse (France).

Link: http://www.damflow.com

Please contact: Carlos Parés, Universidad de Málaga, Spain
Tel: +34 9 5213 2017
E-mail: [email protected]

Figure 1: Two-layer 1D model showing the simulated interface separating the deeper Mediterranean waters from the surface Atlantic waters in the Strait of Gibraltar. The snapshot corresponds to spring tide conditions one hour before low tide.

Figure 2: Two-layer 2D model showing simulated two-layer exchange flow through the Strait of Gibraltar.

Figure 3: One-layer 2D model showing a simulation of the Aznalcollar dam-break. The flooded area is shown 2.5 hours after the dam-break.

Figure 4: One-layer 2D model, again showing the Aznalcollar case; (above) digital model of the affected area, with the flooded area in red; (below) numerical simulation and comparison with the actual flood area.


AWIIS: An Adaptive Web-Integrated Information System for Water Resources Modelling and Management
by Fadi El Dabaghi

The AWIIS is a Spatial Decision Support System for rational planning, operation and management involving water resources. It provides the user with a customized overview or partial view depending on the user profile and the use context. This is achieved through dynamic ontology, using the concept of relevance feedback and a thesaurus based on substitutes of documents.

Adaptive Web-Integrated Information System (AWIIS).

The management of water resources is a highly complex and tedious task that involves multidisciplinary domains including data acquisition and treatment, multi-processes and therefore multi-models, numerical modelling, optimization, data warehousing, and the analysis of exploitation, socio-economic, environmental and legal issues. In addition, given water's vital role in human life, the continuous decrease in its availability due to over-exploitation and pollution, and its scarcity in arid areas, there is a clear need for better design and optimal use of hydro-systems. In this context, a rational analysis must be based on an approach that considers all related causes and effects and systematically evaluates the various alternatives. These include preserving water for irrigation, managing watersheds, building dams to alleviate the impact of floods and improve the water supply, simulating floods for risk prevention, and oxygenating lakes to eliminate eutrophication. For this, we have developed a Web-Integrated Information System (WIIS) dedicated to the management and modelling of water resources. This system is an infrastructure for development: perennial, open, modular, robust, effective and convivial, offering a complete, dynamic and configurable environment.

The WIIS architecture consists of two levels: the first contains general modules such as viewers, GIS, grid generators, a data warehouse and simulators, while the other is domain-specific and includes all the relevant modules tackling numerical modelling for the water-related physical application under study, ie hydrology, hydraulics, hydrodynamics and floods. This architecture is based on integration principles linking software components with heterogeneous data sources. Within this objective, we confront an important conflict. On the one hand, there is a considerable need for flexibility in the simulation of water phenomena and the representation and handling of input/output data, which are characterized by the absence of fixed and rigid structures. On the other hand, these data, exchanged within the WIIS, are a priori unstructured, heterogeneous and distributed, which means they are not easily accessible or exploitable. Within this framework, we looked at the definition of a model for the representation and structuring of the data using XML techniques. Emphasis was placed on both the customization of the WIIS and its adaptation according to the user profile and the context of use.

In the first stage, we led a pseudo-XMLization of all the data accessed or exchanged within the WIIS through an Information Indexation and Research System (IIRS). This IIRS operates on a thesaurus based on Substitutes of Documents under XML structure of the data warehouse, characterized by strongly heterogeneous and distributed contents. The second stage aimed at setting up an Adaptive Web-Integrated Information System (AWIIS), which is based on the WIIS and is robust and effective in terms of integrated tools and data. This AWIIS must provide a convivial, adaptive interface, rich in information, and possibly with an overview or various partial views according to the situation. This customization is closely dependent on the user profile and the context of use. The AWIIS will also integrate a PIRS (Personalized Information Research System), which generalizes the IIRS. It should be dynamic and evolutionary, use the concept of relevance feedback, and its driving idea is the combination 'request-result-user'. This PIRS operates on dynamic ontologies built around a thesaurus, which is based on substitutes of documents of our data warehouse.
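To make the notion of a 'substitute of document' concrete, here is a hypothetical illustration of such a compact XML surrogate that an indexation system could store in a thesaurus in place of a bulky, heterogeneous source file. The element names, fields and values are all invented for the sketch, not the project's actual schema.

import xml.etree.ElementTree as ET

doc = ET.Element("substitute", attrib={"source": "gauge_station_042.dat"})
ET.SubElement(doc, "domain").text = "hydrology"
ET.SubElement(doc, "variable").text = "water level"
ET.SubElement(doc, "region").text = "catchment X"
keywords = ET.SubElement(doc, "keywords")
for kw in ("flood", "time series", "daily"):
    ET.SubElement(keywords, "kw").text = kw

print(ET.tostring(doc, encoding="unicode"))
# An IIRS-style query would then match user-profile terms against
# these surrogates instead of parsing the raw data files themselves.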

All the components of the WIIS are built through Java intelligent interfaces within an appropriate user-friendly framework under a client/server application. They are portable on both Unix and Windows platforms, and provide coherence in the distribution of data and/or treatments between client and server. The client prepares and sends requests to the server, a powerful Unix machine in terms of memory and input/output capacities, which provides direct access to a parallel computing machine or HPCN facilities. This platform allows inter-application navigation, cooperation between software and ease of use on the client and/or server station, cooperation between programs as an environment of development, and access to distant materials and software resources (eg real or virtual parallel machines, data, calling of remote procedures such as execution of heavy numerical codes on the Unix server, storage etc). Hence, it permits users to carry out an efficient impact assessment analysis for sustainable development, while taking into account socio-economic, legal and environmental issues.

This work was supported by the WADI Euro-Mediterranean Project and French bilateral cooperation programmes (CMIFM, CMEP and PLATON). The WADI project was managed by ERCIM through a consortium consisting of INRIA-France, FORTH/IACM-Greece, EMI/ONEP-Morocco, ENP-Algeria, Calabria University-Italy and ESIB/CREEN/EUCLID-Lebanon.

Please contact: Fadi El Dabaghi, INRIA, France
Tel: +33 1 3963 5343
E-mail: [email protected]

Numerical Modelling and Analysis of Water Free Surface Flows
by Fadi El Dabaghi

A number of environmental engineering applications related to water resources involve unsteady free surface flows. A full three-dimensional model based on Navier-Stokes equations can give good descriptions of the physical features of certain phenomena, including lake eutrophication, transport of pollutants, floods and watersheds. However, these models are characterized by significant computational cost. We aim to reduce this through the use of two-dimensional models or appropriate coupling of models of different dimensions.

The models developed in this context can be classified into two categories: a two-phase flow model based on Navier-Stokes equations, and a shallow-water flow model.

Two-phase flow models have been used to simulate the remedial aeration used to combat eutrophication effects in lakes. A water reservoir is generally considered eutrophized when the concentration of dissolved oxygen drops below 3 mg/L. The idea behind remedial aeration is to inject compressed air into the bottom of the reservoir in order to stir up and oxygenate the water. The numerical simulation of the resulting flow by conventional models such as two-fluid or Lagrangian models leads to many difficulties. This is mainly due to the models' complexity and the need for a fine grid in order to achieve a good representation of the effect of the bubbles.

These difficulties limit interest in these classical models and lead us to suggest some cheap and realistic alternatives. We consider a one-phase flow model based on velocity-pressure semi-compressible Navier-Stokes equations. Here, bubble dynamics are taken into account, firstly through boundary conditions related to the air injection velocity at the aerator position, and secondly by introducing correction terms representing the forces applied by the bubbles to the water. In the same framework, a more general two-fluid (air/water) model with a moving water free surface has been developed. We use realistic assumptions regarding the wind velocity and the atmospheric pressure, with a convection equation describing the void fraction function of the water, which determines the wet domain. This allows us to treat the free surface effects of the lake in a more suitable manner, and consequently improve the dynamic results of the bubbles' aeration effects.

The second category of model is the shallow-water model: these describe river flooding, tidal fluctuations, bay and estuary flows, breaking waves on shallow beaches etc. They are derived from 3D incompressible Navier-Stokes equations by depth-averaging of the continuum mass and momentum balances. These models, known also as Saint-Venant models, involve fluid-domain geometries characterized by their complexity and variability, and the large-scale computational aspects and fine grids necessary for realistic simulations.

To overcome these problems, we have used a priori and a posteriori error analysis. Geometric error indicators have been derived to improve the numerical simulation models by adaptive mesh techniques; these ensure an optimal quality solution at a required precision for a given computational cost. Moreover, the error indicators accurately predict in which parts of the fluid domain the flow model may be considered 1D, 2D or 3D, and can couple the resulting multi-dimensional models. In addition, we have integrated a conceptual hydrological model based on the HEC-HMS: this model computes run-off volume by subtracting the volume of water that is intercepted, infiltrated, stored, evaporated, or transpired from the precipitation in the catchment area. The result of this step, a flood hydrogram, forms the boundary conditions for the numerical models presented above (St-Venant) in simulating the flood.
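A schematic of that run-off bookkeeping is given below (the real HEC-HMS loss models are considerably more elaborate); all numbers are invented.

def runoff_volume(precipitation, interception, infiltration,
                  storage, evaporation, transpiration):
    # Run-off = precipitation minus all losses, floored at zero
    losses = (interception + infiltration + storage
              + evaporation + transpiration)
    return max(precipitation - losses, 0.0)

# Hourly rainfall depths (mm) over a catchment -> a crude flood hydrogram
rain = [0.0, 2.0, 8.0, 15.0, 6.0, 1.0]
hydrogram = [runoff_volume(p, 0.5, 3.0, 0.5, 0.1, 0.1) for p in rain]
# These values would then drive the Saint-Venant model's boundary
# conditions, as explained in the text.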

From the numerical point of view, the approximation of the models mentioned above is based on the characteristics method for the time discretization of the advection terms. This method leads to an upwind scheme, which has the double advantage of being physically close to the convection and of being, in the finite element context, an unconditionally stable explicit scheme, thereby allowing the use of large and reasonable time steps. At each time level, we have to solve a quasi-Stokes problem, approximated by P1/P1 mixed finite elements for the velocity-height St-Venant formulation, or by (P1 + bubble)/P1 mixed finite elements for the velocity-pressure Navier-Stokes formulation. This ensures the discrete LBB condition necessary in this context. For both formulations, the a priori error estimates were verified numerically on many academic test cases before the models were validated on real cases such as rivers in flood or 2D lake cross-sections. Moreover, given that the real-time and large-scale computing requirements are very important for this kind of simulation, we used the HPCN facilities; this permitted the execution of the developed codes, particularly on real test cases requiring finer and more complex meshes.
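The sketch below conveys the characteristics (semi-Lagrangian) idea for the advection step: the new value at a node is found by following the velocity field backwards over dt and interpolating there. It is 1D, periodic and linear-interpolation only; the project's finite-element setting is richer, and all values are illustrative.

import numpy as np

def characteristics_step(c, u, dx, dt):
    n = len(c)
    x = np.arange(n) * dx
    x_dep = (x - u * dt) % (n * dx)         # foot of the characteristic
    j = np.floor(x_dep / dx).astype(int)
    w = x_dep / dx - j
    return (1 - w) * c[j] + w * c[(j + 1) % n]

# The scheme stays stable even for Courant numbers above 1,
# which is what permits the 'large and reasonable time steps'.
n, dx, u = 100, 1.0, 1.5
c = np.exp(-0.5 * ((np.arange(n) - 20.0) / 5.0) ** 2)
for _ in range(50):
    c = characteristics_step(c, u, dx, dt=2.0)  # Courant number = 3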

This work was supported by many Euro-Mediterranean projects (WADI, ESIMEAU and CruCID) and French bilateral cooperation programmes (CMIFM, CMEP and PLATON). The EU projects were managed by ERCIM through a partnership constituted by INRIA-France, FORTH/IACM-Greece, EMI/ONEP-Morocco, ENP-Algeria, ENIT-Tunisia, RIKS-Holland, Calabria University-Italy and ESIB/CREEN/EUCLID-Lebanon.

Please contact: Fadi El Dabaghi, INRIA, France
Tel: +33 1 3963 5343
E-mail: [email protected]

Water depth iso-values during flood. Foum Tillicht river, Morocco, 1988.

SACADEAU: A Decision-Aid System to Improve Stream-Water Quality
by Marie-Odile Cordier

Water quality is a critical environmental issue. In this project, we use a pesticide transfer model to simulate effects on the quality of stream-water. Since a large number of parameters are involved in pollution phenomena, modelling, simulation and machine learning are useful techniques for acquiring knowledge in this poorly understood domain.

The objective of the SACADEAU project is to build a decision-aid tool to help specialists in charge of catchment area management to preserve stream-water quality. This is done by coupling a qualitative transfer model (simulating pesticide transfer through the catchment area) with a qualitative management model (simulating farmers' decisions concerning weeding strategies and herbicide application). This has two main advantages: it allows the impact of recommendations to be evaluated by simulating high-level scenarios, and it allows simulation results to be analysed by using machine learning and data mining techniques to discover discriminating variables and to acquire knowledge in this poorly understood domain.

The experimentation site (the Fremeur catchment area) is located in Brittany, France and covers about seventeen square kilometres.

A Transfer Model Coupled with Three Input Models
The transfer model simulates river contamination by pesticides. It models pesticide transfer through a catchment area and simulates the daily river contamination that this causes. This phenomenon depends on numerous parameters, including human activities, climate, soil type and catchment area topology. Since these parameters are complex and difficult to formalize, we created three sub-models to describe them (see Figure).

These are as follows:
• A decision model, which models farmers' strategies. This provides herbicide application characteristics (date, substance, quantity) and agricultural interventions (soil preparation, seeding date, weeding dates) according to predefined farmers' strategies and weather conditions.
• A climate model, which provides daily weather data such as the temperature and the quantity of rainwater.
• A spatial model, which is in charge of the spatial distribution of agricultural activities, according to the catchment area topology.

Using the outputs of these three sub-models, a biophysical transfer model determines pesticide transfer from application locations, through the catchment area, to the river. The model takes into consideration all the possible ways in which rainwater can flow through the catchment area (run-off and leaching).

A High-Level Language
In order to achieve qualitative results, a high-level language for the inputs and outputs of the model is required. The aim is to describe the numerical inputs and outputs of the model qualitatively, via a scenario. This can be seen as a process of discretization of quantitative data. This process is fundamental if we want to construct comprehensible results for decision-makers. The initial step was to gather a set of scenarios suggested by experts; for example, 'What happens if a post-emergence weeding strategy rather than a pre-emergence strategy is applied on all plots close to the river?' or 'What is the impact of pesticide application dates on stream-water quality?'

To simulate a scenario, a methodology was defined that consists in generating a large set of instances of the scenario. These instances are then simulated and the results generalized to give a qualitative description (in response to a qualitative question) using machine-learning techniques. For example, given the question above concerning the impact of application dates, a response could be: "Concentration peaks appear when pesticide application dates are close (less than two days) to significant showers (quantity > 10mm)".
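
The flavour of this methodology can be sketched in a few lines of Python. This is a hypothetical illustration only: the simulator, variable names and thresholds below are invented, and stand in for the project's actual transfer model and discretization scheme.

    import random

    def simulate(application_day, shower_day, shower_mm):
        """Toy stand-in for the transfer model: the peak concentration
        grows when spraying happens close to a rain shower."""
        gap = abs(application_day - shower_day)
        return shower_mm / (1.0 + gap) * random.uniform(0.8, 1.2)

    def discretize(peak):
        """Map a numeric concentration peak to a qualitative class."""
        if peak < 1.0:
            return "low"
        elif peak < 5.0:
            return "medium"
        return "high"

    # Generate many instances of one scenario ("what is the impact of
    # application dates?"), simulate them, and keep qualitative examples
    # that a rule learner could then generalize.
    examples = []
    for _ in range(1000):
        app = random.randint(0, 30)
        rain = random.randint(0, 30)
        mm = random.uniform(0.0, 20.0)
        examples.append({
            "date_gap": "close" if abs(app - rain) <= 2 else "far",
            "shower": "significant" if mm > 10 else "light",
            "pollution": discretize(simulate(app, rain, mm)),
        })

    print(examples[0])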

Learning from Simulation Results
The global model generates pesticide quantities and concentrations according to the parameters mentioned above. The set of simulation inputs and outputs is called an example in machine-learning vocabulary, and the set of examples is the set of instances of a scenario we have simulated. We used ICL (http://www.cs.kuleuven.ac.be/~wimv/ICL/), inductive logic programming software, to learn a set of rules summarizing the examples. These rules can be formatted according to the task being addressed. For example:
• qualitatively predicting water pollution
• identifying which variables play an important role in water pollution
• characterizing important risks (eg whether pollution is due to too many concentration peaks, or to a constant and significant source of pollution).

First results have been obtained with a simplified model, and according to the experts, they show both expected and surprising relationships. For instance, concerning stream-water quality, a post-emergence weeding strategy did not show better results than a pre-emergence strategy. This was unexpected, and a discussion on the impact of weeding strategies was generated, with some possible explanations given by experts.

Future work is three-fold, and involves validating the model, refining the high-level language, and providing recommendations to experts based on relationships discovered through the model.

The SACADEAU project is in its third year of development. Contributing members are M. O. Cordier, V. Masson, A. Salleb (IRISA/Univ. Rennes 1, Rennes), C. Gascuel-Odoux, F. Tortrat, P. Aurousseau, R. Trepos (INRA-ENSAR/UMR SAS, Rennes), F. Garcia (INRA/BIA, Castanet Tolosan), B. Chanomordic (INRA/LASB, Montpellier), M. Falchier, D. Heddadj and L. Lebouille (Chambre d'agriculture de Bretagne). SACADEAU is funded by Conseil Général du Morbihan and INRA.

Link: http://www.irisa.fr/dream

Please contact:
Marie-Odile Cordier, Université Rennes 1/IRISA, France
Tel: +33 2 9984 7100
E-mail: [email protected]

The four components of the pesticide transfer model in a catchment area.


Developing an Environmental Modelling Framework for Integrated Assessment of EU Agricultural Policies
by Andrea E. Rizzoli, Carlo Lepori, Roberto Mastropietro and Lorenzo Sommaruga

'SEAMLESS', System for Environmental and Agricultural Modelling; Linking European Science and Society, is an environmental modelling project aimed at creating a framework, named SEAMFRAME, which is specifically targeted at integrated assessment in the agricultural domain. The goal is to deliver a software toolkit to facilitate the comparison and integration of the many different models in existence today.

European agriculture is currently undergoing radical changes due to economic pressures imposed by the enlargement of the EU, the revision of WTO agreements, and changes in farm support payments. Such changes interact with the physical and natural environment (eg, climate change, loss of biodiversity). Meanwhile, society demands a green and clean landscape, and farming communities in rural areas are faced with continuous technological innovation.

Scientists and modellers are therefore confronted with the increasing need to deliver scientific results that can be used by policy makers to assess new policies, at the regional and national scale, that can facilitate agriculture's contribution to sustainable development. The process of evaluating a policy with respect to the three components of sustainable development (environment, economy, society) requires what has been defined as 'integrated assessment' (Parker et al. 2002).

Yet the obstacles to performing integrated assessment studies are substantial, given the dimension and complexity of the problem. Integration requires that researchers from different disciplines merge and compare their ideas, often scaling the dimension of the problem under study up and down. For instance, in the case of the integrated assessment of agricultural policies, the effect of a subsidy for a given product has a different impact in the various regions, and the related change in farm activity has a very local effect on the environment.

Science, through innovation and research, has progressively built a vast body of knowledge specific to various sectors, which can now be considered as components of an integrated system. For instance, for a given crop growth process, there may be tens of different formulations solving the same modelling problem. Thus, the main issue becomes making this knowledge accessible and re-usable. In fact, computer models are monolithic, difficult to reuse in a different context, and strongly tied to the modelling domain and scale of resolution. For this reason, many groups have been researching the problem of model integration and modelling frameworks to allow 'seamless' and transparent model integration and re-use across different scientific domains and scales (eg OpenMI (http://www.harmonit.org/), TIME (http://www.toolkit.net.au/) and OMS (http://oms.ars.usda.gov/), just to cite a few).

A modelling framework is basically a software environment that supports the development of models, but it also provides a set of facilities for setting up model runs, data visualisation and analysis, documentation, and the archival and discovery of models. A modelling framework also supports the packaging of end-user applications, which can be effectively used to support decision making. An environmental modelling framework adds domain-specific knowledge that enables environment-related issues to be solved; for instance, it considers spatial modelling.

A new Integrated Project, "SEAMLESS: System for Environmental and Agricultural Modelling; Linking European Science and Society", aims to develop an environmental modelling framework, named SEAMFRAME, which is specifically targeted at integrated assessment in the agricultural domain. This project was defined as a result of a research call from the European Commission within the EU 6th Framework Programme (Global Change and Ecosystems).


A snapshot taken at the SEAMLESS kickoff meeting, January 2005.


SEAMFRAME will capitalise on previous research in the development of environmental modelling frameworks. Moreover, it will support a 'declarative' approach to modelling (Muetzelfeldt and Massheder, 2003). A clear-cut separation between model representation (equations) and model manipulation (data processing, integration routines) will allow the effective re-use of a model in different contexts. The model equations will be semantically annotated by means of Semantic Web technologies such as RDF and ontologies. This will open the model structure to automated processing, making it possible to search for models according to their specifications, and greatly facilitating model linking and composition. Finally, another distinctive feature of SEAMFRAME will be the component-oriented approach of its software architecture. Declarative models will be deployed as software components, which can be manipulated by tools such as calibrators, simulators and optimisation routines, themselves also software components. The use of introspection (Rahman et al. 2004) will allow tools to adapt to the published interface of models, thus granting the required flexibility and extensibility of the framework.
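
As a hypothetical illustration of this idea (in Python, which is not necessarily the implementation language of SEAMFRAME, with an invented component and invented attribute names), a generic tool can use introspection to discover the published interface of a model component and adapt to it instead of hard-coding the inputs:

    import inspect

    class CropGrowthModel:
        """A toy model component: its published interface is an
        annotated state variable plus a step() method."""
        biomass: float = 0.0

        def step(self, radiation: float, temperature: float) -> None:
            self.biomass += 0.01 * radiation * max(temperature - 5.0, 0.0)

    def published_inputs(component) -> list:
        """A generic 'simulator' tool discovers by introspection which
        inputs the component's step() expects."""
        return list(inspect.signature(component.step).parameters)

    model = CropGrowthModel()
    drivers = {"radiation": 18.0, "temperature": 14.0}
    for day in range(3):
        # Feed only the inputs the component declares it needs.
        model.step(**{name: drivers[name] for name in published_inputs(model)})
    print(round(model.biomass, 3))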

Thirty research institutions from thirteen European countries, including several new member states, are involved in the project. These institutions bring together a vast amount of knowledge and expertise from the economic, environmental, agronomic, social and information technology disciplines. The project also includes co-operation with an African and an American research institute. The total budget is 15 million Euros. The first prototype should be available in 18 months, and in four years the system should be fully operational. The project is coordinated by Wageningen University (the Netherlands), while the development of SEAMFRAME is coordinated by the Swiss institute IDSIA, part of USI (University of Lugano) and SUPSI (the University of Applied Sciences of Southern Switzerland).

Link: SEAMLESS website: http://www.seamless-ip.org

Please contact:
Andrea Emilio Rizzoli
Tel: +41 91 610 8664
E-mail: [email protected]

Image Processing for Forest Monitoring
by Josiane Zerubia and Paul-Henry Cournede

Aerial and satellite imagery have a key role to play in forestry management. The increasing availability of data and their high spatial resolution, which is now submetric, allow automatic tools to be developed that can analyse and monitor forests by accurately evaluating the vegetation resources. The Ariana research group at INRIA Sophia Antipolis, with its strong experience in remote-sensing image analysis and feature extraction, is working on this topic as part of the joint research effort Mode de Vie.

Digitized aerial photographs and satellite images of forests represent convenient data for developing computerized assessments of forestry resources. Such automatic tools are useful for a number of reasons, including the help they provide to forest managers in classifying species. Such work is currently done by specialists, using difficult image analysis combined with ground verifications. Some tools already exist for this purpose, and use texture information and classification based on parameters such as covariance matrices. However, few take advantage of high data resolution. Nowadays, it is possible to study forests on the scale of individual trees, by resolving the extraction of tree crowns. This is one of the aims of the Ariana research group, which in the last year has adapted its knowledge of stochastic geometry (object processes) to forest-resource evaluation. This will allow the automatic assessment of economically and environmentally important parameters such as the number of tree crowns, the distribution of their diameters and the stem density.

Our approach consists in modelling the forestry images as realizations of a marked point process of trees. This stochastic framework aims at finding the best configuration of an unknown number of geometrical objects in the image, with respect to a probability density defined a priori. This density takes into account both the data, in order to fit the objects to the feature we want to extract, and the interactions between these objects, to favour or penalize certain arrangements. In the case of tree-crown extraction, we modelled the trees as ellipses, defined by the position of their centre, their orientation, and their major and minor axes. After simulation, we obtain a collection of objects and have access to several statistics such as the number of trees in the stand, their position and their diameter. If different species coexist, we can add a mark to the objects to describe their type, and obtain a classification during the extraction. Some tests have been performed on digitized aerial photographs of stands of poplars, courtesy of the French Forest Inventory (IFN). These photographs are taken in the infrared, which enhances the chlorophyll matter of vegetation. In future investigations, we will be studying texture parameters in order to distinguish different species during the simulation, and lower-level information to obtain a pre-segmentation of the image before the extraction process.
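
The object representation can be sketched as follows. This is a toy Python illustration only: the interaction term is invented, and the data term and the sampler the real system relies on (the density is optimized over configurations, typically by RJMCMC-style simulation) are omitted.

    import math
    import random

    class Ellipse:
        """A tree-crown candidate: centre, orientation and axes."""
        def __init__(self, x, y, theta, a, b):
            self.x, self.y, self.theta, self.a, self.b = x, y, theta, a, b

    def overlap_penalty(e1, e2):
        """Crude pairwise interaction: penalize two crowns whose centres
        sit closer than their mean radii allow (a stand-in for the real
        prior of the marked point process)."""
        d = math.hypot(e1.x - e2.x, e1.y - e2.y)
        reach = (e1.a + e1.b + e2.a + e2.b) / 2.0
        return max(0.0, reach - d)

    def prior_energy(config):
        return sum(overlap_penalty(p, q)
                   for i, p in enumerate(config) for q in config[i + 1:])

    config = [Ellipse(random.uniform(0, 100), random.uniform(0, 100),
                      random.uniform(0, math.pi), 3.0, 2.0)
              for _ in range(20)]
    print("prior energy:", round(prior_energy(config), 2))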



Another application relevant to forest monitoring has been developed by Ariana in collaboration with Alcatel Space, and addresses forest-fire detection in satellite images using random-field theory. It consists in modelling the image as a realization of a Gaussian field, in order to extract rare events such as potential fires that could grow and cause serious damage. For this purpose, the thermal infrared (TIR) channel is selected, since fires show up as peaks of intensity at these wavelengths.
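
The underlying idea can be sketched in Python with synthetic data. This is a deliberately simplified illustration, not the algorithm developed with Alcatel Space: a Gaussian model is fitted to the TIR intensities, and pixels that are extremely unlikely under that model are flagged as rare events.

    import random
    import statistics

    # Synthetic thermal-infrared intensities: mostly background,
    # plus a few hot pixels standing in for potential fires.
    random.seed(1)
    tir = [random.gauss(300.0, 5.0) for _ in range(10000)] + [380.0, 395.0]

    mu = statistics.mean(tir)
    sigma = statistics.stdev(tir)

    # Under the Gaussian-field assumption, pixels many standard
    # deviations above the mean are rare events worth flagging.
    alarms = [v for v in tir if (v - mu) / sigma > 6.0]
    print(f"mean={mu:.1f}, sigma={sigma:.1f}, flagged={len(alarms)}")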

Of further interest is the involvement of Ariana in the ARC joint research effort Mode de Vie, in collaboration with the MAS Laboratory (Ecole Centrale Paris), the Digiplante research group (INRIA Rocquencourt, CIRAD), and LIAMA (Sino-French Laboratory of Informatics and Automation, Academy of Sciences, Beijing, China). The purpose of this joint action is to link the assessment of forestry resources with the dynamical models of plant growth developed in Digiplante and LIAMA. The ultimate target would be to develop a complete tool for vegetation monitoring, which could both evaluate the present biomass and predict its evolution.

Links:
http://www.inria.fr/ariana

http://www.inria.fr/recherche/equipes/digiplante.en.html

http://www.mas.ecp.fr

http://liama.ia.ac.cn

Please contact:
Josiane Zerubia, INRIA, France
Tel: +33 4 9238 7865
E-mail: [email protected]

Paul-Henry Cournede, Digiplante/MAS Laboratory, France
Tel: +33 1 4113 1289
E-mail: [email protected]

Figure 1: (left) In the original image (courtesy of IFN), the plantation of poplars on the left-hand side can be isolated thanks to a Gabor filter; (right) the result of the extraction process by stochastic geometry (Guillaume Perrin).

GreenLab: A Dynamical Model of Plant Growth for Environmental Applications
by Paul-Henry Cournède and Philippe de Reffye

Some of the most critical issues for sustainable development and the protection of the environment concern the rationalization of agriculture and forest exploitation. For this purpose, a research project at INRIA, DigiPlant, aims at developing both mathematical models of plant growth and related computing tools. Over the years, an important Sino-European collaboration has emerged, involving research institutes and universities, mathematicians and biologists.

The cultivated areas of Europe, including agricultural land and exploitation forests, have a strong impact on global environmental conditions. Erosion, resource impoverishment due to over-exploitation, and pollution by fertilizers or pesticides are crucial problems that agronomy and forestry hope to solve through harmonious cultivation modes and exploitation strategies. For this purpose, they must take into account production needs on one hand and the environment on the other; that is to say, both quantitative and qualitative criteria. In this context, mathematical models of plant growth describing interactions between the architecture of the plant and its physiological functioning have a key role to play. They allow the exchanges (of water, carbon, minerals etc) between plants and their natural environment to be quantified.

GreenLab is just such a functional-structural model, and is the result of a long dialogue between botanists, physiologists and mathematicians. Derived from the AMAP model developed in the 1990s at CIRAD, GreenLab's new formulation was introduced at LIAMA (Beijing) in 2000. Today, the model is studied and improved upon through the DigiPlant project, which is run by a joint team of researchers from INRIA, CIRAD and Ecole Centrale Paris, and hosted by INRIA. Some very close partnerships exist with LIAMA, China Agriculture University, Wageningen University, Orsay University and INRA.

A number of choices have been made in order to simplify biological knowledge and write the dynamical equations of growth. Organogenetic growth cycles are defined, and an automaton describing the evolution of the buds determines the plant architecture.


The botanical concept of physiological age, defining a typology of the axes, allows a powerful factorization of the plant structure. Fresh biomass production is computed from transpiration and is then distributed among expanding organs (including buds) according to their demands (sinks). The available biomass determines the rules of the organogenetic automaton and the organ sizes, and plant architecture has a strong influence on photosynthesis. Thus, there is a very strong coupling between the growth equations.

Stresses are also taken into account, especially for environmental resources. In particular, we have studied the competition between plants for light and water in plantations or crop fields. Plant density is also an important control parameter of the model.

The introduced mathematical formalism has had a significant impact on model analysis. First, the simulation software is strictly derived from the mathematical formalism. Thanks to the structural factorization, the computation time grows linearly with the chronological age of the plant (in comparison with a bud-by-bud simulation, whose computing time grows exponentially with the plant's chronological age). Moreover, the mathematical study of the system has allowed theoretical results on plant viability and yield prediction to be obtained. The best prospects, however, rely on our ability to clearly formulate optimal control problems and to fit the theoretical models to real plants. These two points are the keys to applications in agronomy and forestry.
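
The gain from structural factorization can be illustrated with a toy Python sketch. The branching rule below is invented and GreenLab's actual growth equations are not reproduced; the point is only that memoizing identical substructures (indexed by physiological and chronological age) turns an exponential bud-by-bud computation into a linear one.

    from functools import lru_cache

    # Toy branching rule: an axis of physiological age p produces, each
    # cycle, one metamer plus a lateral axis of age p+1 (up to MAX_P).
    MAX_P = 3

    @lru_cache(maxsize=None)
    def organ_count(p, age):
        """Organs in a substructure of physiological age p developed for
        'age' growth cycles. Memoization is the point: each distinct
        substructure is evaluated once, so the cost is linear in the
        chronological age rather than exponential."""
        if age == 0:
            return 0
        lateral = organ_count(p + 1, age - 1) if p < MAX_P else 0
        return 1 + lateral + organ_count(p, age - 1)

    print(organ_count(1, 25))   # whole plant after 25 growth cycles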

The hidden parameters of the model are very few (for example, twelve for an annual plant like maize). Agronomic data on maize, sunflowers, tomato, rice and wheat have been fitted with very good accuracy, and fittings on different plants of the same type have shown that the parameters are very stable. Some more complex plants like coffee, cotton and young beech trees are currently being studied.

We have developed tools for a variety of objectives:
• Optimization and control of cultivation modes: in the case of limited resources, there is an optimal strategy for fertilizing and watering during plant growth. Likewise, controlling plant density or partial forest clearings can be beneficial. In this way, we can improve water-resource and land management and reduce pollution by fertilizers.
• Control of plant sanitation and pesticide treatments: by coupling the plant-growth model and insect-population dynamics, we can control the use of pesticides and thus reduce costs and pollution.
• Selection of crop variety: we are currently working with geneticists in order to prove that plant genes directly determine the physiological parameters of the GreenLab model. In this way, we expect to propose better strategies for crop selection.
• Virtual simulation and visualization of plantations: computer graphics techniques allow the results of numerical simulations to be visualized. This is very important in urbanism or landscaping for predicting the long-term evolution of projects.

The results of this research seem to show that in the near future, new tools of prediction, optimization and control could be effectively used in agriculture and forest exploitation on a large scale, and would drastically improve the management of the environment.

Links:
http://www.inria.fr/recherche/equipes/digiplante.en.html

http://www.mas.ecp.fr

http://amap.cirad.fr

Please contact:
Paul-Henry Cournède, MAS - Ecole Centrale Paris, France
Tel: +33 141 131 289
E-mail: [email protected]

Philippe de Reffye, INRIA, AMAP - CIRAD
Tel: +33 139 635 774
E-mail: [email protected]

Figure 1: Fitting results obtained on maize data from China Agriculture University (Ma Y.T.); (a) simulated maize, (b) internode weights, and (c) leaf weights. Red dots represent experimental data and green lines simulated data.

Figure 2: Architectural and physiological growth of a ginkgo tree (Kang M.Z.).


Lossless Compression of Meteorological Data
by Rodrigo Iza-Teran and Rudolph Lorentz

Weather forecasts are getting better and better. Why? The models include more physical processes and resolution has been improved. The consequence: more data is generated by every weather forecast. Much more data. Very much more data. What does one do with all this data? Answer: compress it.

In the Department for Numerical Software (NUSO) at the Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), a new group has been founded to look at the compression of data resulting from numerical simulations. One of the fields in which it has been active is the compression of meteorological data. The motivation was the drastically increased amounts of meteorological data needing to be stored. This is a consequence of improved weather models and of the higher resolution with which forecasts are being made. However, the data is also being used for new purposes: direct sales, data mining and reanalyses. All of these tend to increase the amount of data that is archived for longer periods of time.

The initial impetus was provided by the plans of the German Weather Service (Deutscher Wetterdienst) to switch from their Local Model (LM) to the Local Model-Europe (LME). This new model not only covers a larger area but also has a higher vertical resolution. The amount of data they plan to archive can be seen in the graph below. Medium-term storage requirements, ie for about a year, are planned to be just short of 4 Petabytes. That is 4,000,000,000,000,000 bytes.

Compression
Compression involves changing the form of data in a file, so that the compressed file takes up less space than the original file. There are two types of compression: lossy and lossless. If the compression is lossy, one cannot retrieve the original file from the compressed file. The advantage of this approach is that the data can be compressed to a much greater degree. Lossy compression is typically used to compress graphic and video files, especially in the context of the Internet. Typical lossy compression formats are JPEG and MPEG, which can reduce the size of a file by factors of ten to fifty.

On the other hand, if lossless compression is used, the original file can be retrieved exactly from the compressed file. Lossless compression is typical for text files, but also for files containing sensitive numerical data, eg medical or meteorological data. All Zip utilities perform lossless compression, with compression factors of around 1.5 to 3.

Meteorological data is typically stored in the GRIB format. This format, gridded data in binary form, is an international standard for the storage and exchange of meteorological data. The usual procedure is first to put the data obtained from a weather forecast into the GRIB format. This is lossy compression. Afterwards, however, any compression is required to be lossless. The new group established at SCAI has developed a program, GRIBzip, for the lossless compression of meteorological data stored in the GRIB1 format (ie GRIB Version 1). As an example, if the data is formatted in GRIB1 with 16-bit precision, then the GRIB files produced by a typical weather forecast can be reduced in size losslessly by, on average, a factor of three. Extrapolating this to the example of the German Weather Service, storage would be required for only 1.5 Petabytes of data instead of 4 Petabytes.
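
The techniques inside GRIBzip are patented and not described in the article, but the general principle (smooth gridded fields compress far better losslessly after a simple prediction step) can be shown with a generic Python sketch on a synthetic 16-bit field:

    import math
    import struct
    import zlib

    # A smooth synthetic 16-bit field, standing in for one gridded layer.
    W, H = 360, 180
    field = [int(30000 + 2000 * math.sin(x / 25.0) * math.cos(y / 25.0))
             for y in range(H) for x in range(W)]

    def delta_encode(values):
        """Replace each value by its difference from the previous one;
        smooth fields then consist mostly of tiny numbers, which a
        generic entropy coder squeezes much harder. The step is
        reversible (cumulative sums modulo 2**16), hence lossless."""
        out, prev = [], 0
        for v in values:
            out.append((v - prev) & 0xFFFF)   # wrap to unsigned 16 bit
            prev = v
        return out

    raw = struct.pack(f"<{len(field)}H", *field)
    deltas = struct.pack(f"<{len(field)}H", *delta_encode(field))

    print("plain zlib ratio: ", len(raw) / len(zlib.compress(raw, 9)))
    print("delta + zlib ratio:", len(raw) / len(zlib.compress(deltas, 9)))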

Archiving data in a compressed form has another advantage that may not be immediately apparent. Normally the archiving system consists of hardware separate from the other computers, meaning the connection to the archiving system can be a bottleneck. By using compressed data, the bandwidth of the connection is effectively increased by the same factor as the data has been compressed. In our example, the bandwidth would be effectively increased by a factor of three. A patent for the techniques used in the program has been applied for.

Future Activities
The programs developed at SCAI can compress GRIB data on topologically rectangular grids. Other types of grids, such as a triangular grid covering the whole globe, or grids which become sparser towards the poles, are also allowed by the GRIB format. We are planning programs able to losslessly compress this type of data. In addition, spectral data is sometimes stored in the GRIB format, and its compression is also desirable.

The group has also developed programs for compressing data produced by crash simulations (Pamcrash, LS-Dyna) in the automobile industry. Since this data is defined on irregular grids, the techniques used are quite different from those for data on regular grids. Using the expertise gained from these extreme cases, the group intends to compress data resulting from other types of simulations.

Link:
http://www.scai.fraunhofer.de/nuso.0.html?&L=1

Please contact:
Rudolph Lorentz, Institute for Algorithms and Scientific Computing - SCAI, Fraunhofer ICT Group
Tel: +49 (0) 2241 143 480
E-mail: [email protected]


Planned archive size, German Weather Service.


Air-Quality Information for Multi-Channel Services: On the Air with Mobile Phones
by Gertraud Peinel and Thomas Rose

The APNEE project (Air Pollution Network for Early warning and information exchange in Europe) has designed customizable services providing information on air quality for citizens. These services draw on various information channels, including mobile technology, interactive portals for the Internet and street panels.

Based on such high-quality dissemination services, even environmental information becomes an attractive product once it is perceived as providing citizens with indicators of their levels of comfort. The APNEE information services are based on the following concepts:
• public sector information is promoted as catalytic content for new and compelling applications
• citizens require information on air quality in a user-friendly form
• EU directives are calling for new ways to disseminate information on air quality to citizens.

Authorities and scientists have large amounts of data, with perfect means of visualization and well-researched models for calculation, forecasts and interpolations. Ordinarily, however, citizens are not targeted as potential consumers of these data. Yet citizens are now beginning to call for timely and high-quality environmental information for reasons of comfort and health care, especially when they are individually affected by respiratory diseases. While general information is available via Internet servers and TV video text, some things are still lacking:
• classification of measurement values by user-friendly indexes (ie 'good' or 'bad' instead of, say, 77 mg; see the sketch after this list)
• information regarding health impact and what action to take
• appropriate channels through which to reach citizens on the move (ie anytime and anywhere).
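
A minimal Python sketch of such an index mapping follows. The bands and labels are invented for illustration; they are not APNEE's official classification, and real air-quality indexes define pollutant-specific thresholds and units.

    # Illustrative bands only: real indexes (and APNEE's own
    # classification) use pollutant-specific limits.
    BANDS = [(25.0, "good"), (50.0, "fair"), (100.0, "bad")]

    def classify(concentration: float) -> str:
        """Map a raw measurement to a citizen-friendly index word."""
        for limit, label in BANDS:
            if concentration <= limit:
                return label
        return "very bad"

    # The raw value from the text becomes a word a citizen can act on:
    print(classify(77.0))   # -> "bad"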

To date, most warnings of adverse environmental conditions have made use of broadcast channels. However, radio or TV inevitably reaches a much larger audience than that section of the population actually at risk. National and regional authorities have found it difficult to 'narrowcast' specific environmental warnings or advice only to those living within a specific area. It is this capability that APNEE and APNEE-TU have implemented and tested by utilizing modern IT and telecommunication technology at locations across Europe.

Project APNEE commenced in January 1999 as an RTD project in the Fifth Framework Programme of the European Commission, IST Programme, Key Action I (Systems and Services for the Citizen), Action Line I.5.1. It concluded successfully in December 2001, and a take-up action named APNEE-TU (4/2003 – 3/2004) adapted and tested the APNEE systems at additional user sites in Europe, as well as employing new technologies (eg handhelds, smart phones, PDAs and new mobile protocols like GPRS) for the dissemination of air pollution information.

Field trials for the evaluation of these innovations in APNEE and APNEE-TU took place at nine test sites (Oslo, Grenland, Athens, Thessalonica, Marseilles, Canary Islands, Madrid, Andalusia, and the whole of Germany).

Figure 1: Channels in APNEE and APNEE-TU.
Figure 2: Partners and test-site locations in APNEE and APNEE-TU.


The projects included 21 partners from research, government, and the IT and telecommunications sectors. The test sites represent a diverse selection in terms of geography, pollution problems and cultural settings. In addition, their experience with and development of air-monitoring stations differed.

We designed and implemented the following 'business collaboration concept', which has proven successful:
• in each APNEE region, an authority provides and authorizes the measured air-pollution data
• research institutes and universities operate and control models for the forecasting of air pollution
• technological partners design and realize the Internet and WAP portals as well as street-panel interfaces
• mobile and Internet information portal providers integrate the APNEE solution into their portals
• telecommunication companies distribute the messages through SMS, MMS and WAP, as well as via smart phones and PDAs.

APNEE is, first of all, the architecture of a dissemination platform, built around a harmonized environmental database scheme. In this respect, APNEE supports the harmonization of air-quality management systems. This eases the exchange of data (eg among authorities), allows further standardization of methods and directives, and simplifies integration with other services and service providers. In addition, APNEE provides a reference implementation of the components required to set up an APNEE infrastructure at a new user site. APNEE is mostly based on royalty-free open-source software that has a very good developer community for support. It can be installed on a standard PC and is easily administrated once the set-up has been completed. The installation and operational costs of APNEE are very low, allowing cities and communities to set up an information portal for air-quality (or any other environment-related) information, even on a low budget.

APNEE created a set of reference core modules, including a common database scheme, service triggers, and regional server applications. These core modules are considered to be the heart of the system; however, they may be bypassed or not implemented in cases where only the electronic services are of interest for installation and operational usage, provided that a database and a pull-and-push scheme are offered via alternative software infrastructures. Thus, the project's e-services palette currently includes:
• pull services: WAP, J2ME, PDA, Internet-based GIS information services, voice server
• push services: SMS, e-mail, newsletter
• street panels.

These services may be applied individually, in combination, or in parallel with pre-existing services where they exist.

The complementarity and in particular the situatedness of services have proven decisive in reaching citizens, that is, in advising people of episodes that might cause harm to their health. MMS services sound attractive from a technology point of view, and WAP services allow for more detailed information when on the move. Nevertheless, SMS services have proven sufficient in several field trials.

APNEE has been built upon the use of complementary communication channels. Having successfully completed the field trials, we now know which channels are best suited to various kinds of environmental episode. The information dissemination platform is distinguished by its customizability: messages can be customized to the location of users, their preferences or the types of content that affect them. This facility means APNEE excels in providing early warning of hazardous events. Recipients can be targeted in a focused fashion, and the type of message can also be tailored.

The project was recently presented as a success story among IST projects by Commissioner Viviane Reding during a press conference in Brussels. Future applications of APNEE will be early-warning services. We are currently working on risk-management concepts and services for cities and urban regions.

APNEE was supported by the European Commission DG XIII under the 5th Framework Programme, IST-1999-11517. Partners included FAW Ulm, Germany (coordinator), Airmaraix, France, Aristotle University Thessaloniki, Greece, Ayuntamiento de Madrid, Spain, Expertel Consulting, France, NILU, Norway, NORGIT AS, Norway, Seksjon for kontroll og overvåking i Grenland, Norway, SICE, Spain, Telefonica I+D, Spain, and Universidad Politecnica de Madrid, Spain. APNEE-TU was supported by the EC under contract IST-2001-34154. Additional partners are the Storm Weather Center and National Road Protection Agency, Norway, ITC Canarias and Andalusia Network, Spain, OMPEPT and Siemens, Greece, and UMEG, Fraunhofer FIT (new coordinator), and t-info GmbH, Germany.

Links:
Project Web site: http://www.apnee.org

Spain:
http://panda.lma.fi.upm.es/andalusia
http://panda.lma.fi.upm.es/canary/
http://atmosfera.lma.fi.upm.es:8080/regional-eng/servlet/regional/template/Index.vm

Greece: http://www.apnee.gr (temporarily out of order)

Norway: http://www.luftkvalitet.info

Germany: http://www.t-info.de http://www.t-zones.de

France: http://www.airmaraix.com

Please contact:
Gertraud Peinel, Thomas Rose, Institute for Applied Information Technology - FIT, Fraunhofer ICT Group
Tel: +49 22 4114 2432 (-2798)
E-mail: [email protected], [email protected]


R&D AND TECHNOLOGY TRANSFER

Articles in this Section
• Building a Bridge for Communication between Patients, Family Doctors, and Specialists by Matteo Paoletti, Loriano Galeotti and Carlo Marchesi
• Connecting Remote Tools: Do it by yourSELF! by María Alpuente and Salvador Lucas
• Google Teaches Computers the Meaning of Words by Rudi Cilibrasi and Paul Vitányi
• Reconstruction, Modelling and Motion Analysis of the Human Knee based on MR Images by Gábor Renner and György Szántó
• Computer Controlled Cognitive Diagnostics and Rehabilitation Method for Stroke Patients by Cecília Sik Lányi, Julianna Szabó, Attila Páll and Ilona Pataky
• Web-based NeuroRadiological Information Retrieval System by Sándor Dominich, Júlia Góth and Tamás Kiezer
• ARGO: A System for Accessible Navigation in the World Wide Web by Stavroula Ntoa and Constantine Stephanidis
• BRICKS: A Digital Library Management System for Cultural Heritage by Carlo Meghini and Thomas Risse
• Student Programming Project becomes Multi-Million Dollar Company by Nancy Bazilchuk
• Software Engineering Institute goes International in Software Process Research by Mario Fusani


Building a Bridge for Communication between Patients, Family Doctors, and Specialists
by Matteo Paoletti, Loriano Galeotti and Carlo Marchesi

According to recent reports of the World Health Organization, the next two decades will see dramatic changes in the health needs of the world's populations, with chronic diseases, mental illness and injuries as the leading causes of disability. Increases in the senior population 'confined' within the home are also expected. ICT technologies must tackle this challenge by providing the means for fast communication and consulting services between the chronically ill, the general practitioner and the hospital specialist.

The objective of the AMICUS project, launched in 2003 at the BIM Lab, University of Florence, is to design a personalized communication system that can improve the quality of daily life and medical care for the chronically ill, at the same time helping to reduce the number and duration of hospital stays. AMICUS aims at contributing to the reduction of health costs and at providing easy access to medical assistance. This is in line with a recent position statement from the WHO that strongly encourages the development of devices able to give patients an active, conscious role in the management of their disease: actions are needed that emphasize collaborative goal setting, patient skill building to overcome barriers, self-monitoring, personalized feedback, and systematic links to community resources.

AMICUS proposes a number of innovative solutions, all aimed at increasing the control of patients over their health. First of all, patients should be encouraged to play an active role in monitoring their disease through the application of custom-designed procedures embedded in portable (possibly wearable) devices. In order to obtain the necessary medical and technical specifics for the implementation of devices of this type, AMICUS is studying the development of an instrument which will monitor the patient's vital signals and derive patterns of evolution over time. Although such devices are already potentially marketable, in our opinion a rigorous assessment phase is first needed in order to evaluate the level of knowledge and experience necessary for their correct employment. Fully reliable systems are not yet available, and a premature introduction to the market could be dangerous. The AMICUS project will build a prototype system, known as COMPASS, which should offer wide possibilities of in itinere intervention (hardware and software changes) as a result of experimentation and field trials. The system has a split architecture and consists of two units: one is a wearable Bluetooth biosignal transmitter whose range covers the patient's home area, and the other is a PC workstation, which receives data from the patients and classifies events, submitting the information to the health-care network when necessary.
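
A hypothetical Python sketch of the workstation-side screening follows. The signal, variable names and thresholds are invented for illustration and do not describe the COMPASS implementation, which classifies events against the personalized profile discussed below.

    from dataclasses import dataclass

    @dataclass
    class Profile:
        """Personalized limits, tuned on the patient's first admission."""
        hr_low: float = 45.0
        hr_high: float = 110.0

    def classify_events(heart_rates, profile):
        """Flag samples outside the patient's personalized range for
        submission to the health-care network."""
        return [(i, hr) for i, hr in enumerate(heart_rates)
                if hr < profile.hr_low or hr > profile.hr_high]

    stream = [72, 75, 74, 130, 71, 40]          # samples from the portable unit
    print(classify_events(stream, Profile()))   # -> [(3, 130), (5, 40)]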

A second objective of the project is the definition of an accurate personalization of system parameters based on contextual information about the patient's case history and current dynamic conditions.

Figure 1: Split architecture (COMPASS project): a portable unit (ECG front end, A/D conversion, user interface, Bluetooth transceiver) linked via Bluetooth to a Web/PC-based station offering the practitioner and hospital specialist services on demand.
Figure 2: Personalization procedure.


A preliminary tuning is obtained on the patient's first admission, through a set of multivariate analysis techniques that indicate his/her profile (see Figure 2). This profile (or set of features) is embedded in the personalization procedure, and the system's performance is then tuned accordingly.

Without the contribution of such contextual data, the signalling system would only be based on features derived from real-time signals, which are usually not sufficiently reliable. On the contrary, the signals of the system can normally be correctly interpreted through human intervention. In our opinion, communication between a patient and his family doctor can be usefully supported by personal devices bridging the time interval between successive medical examinations.

An additional but important feature of the system regards a set of aesthetic and ergonomic values provided by industrial designers (ISIA, Industrial Design Institute of Florence).

We intend to begin experimentation of the system with a very well-defined group of patients. In this respect, we are currently talking to the Respiratory Unit of the Main Hospital in Florence with reference to the population affected by COPD (Chronic Obstructive Pulmonary Disease). On-line long-term monitoring could play an important role in the well-being of these patients. It could provide critical information for long-term assessment and preventive diagnosis, for which trends over time and signal patterns are of special importance. Such trends and patterns are often ignored or scarcely identified by traditional examinations.

The aim of the operational phase (data acquisition and patient-profile definition) will be to produce a system prototype. During the clinical trials we hope to discover the answers to questions like: How far can we rely on cooperation from the patient? Can the patient feel confident of a timely intervention? How simple must the system be to be accepted by a typical user? How easy would it be for specialists to abandon current patient admission practice? Would such an approach be useful in reducing the number and/or duration of hospital stays?

Project partners:
• BIM Lab, DSI, University of Florence
• ISIA, Industrial Design Institute
• "Leonardo", General Practitioners Association
• Respiratory Unit, Critical Care Department, University of Florence
• SAGO s.p.a.

Link: http://www.dsi.unifi.it/bim

Please contact:
Carlo Marchesi, Matteo Paoletti, University of Florence, Italy
E-mail: [email protected], [email protected]

Connecting Remote Tools: Do it by yourSELF!
by María Alpuente and Salvador Lucas

Fifty years of research in computer science have seen a huge number of programming languages, theories, techniques, and program analysis tools come into existence. The World-Wide Web makes it possible to gain access to resources in a number of ways, ranging from remote downloads followed by local executions, to remote execution via WWW services. Unfortunately, however, few of the existing systems and tools are effectively connectable, even if they address similar problems or rely on similar theoretical bases.

As recently noticed by T. Hoare, the quest for a verifying compiler is a classic but still urgent problem for both the software industry and the computer science community. Of course, program verification, debugging and analysis (especially when oriented towards improving program efficiency) will be essential components of such a tool. Moreover, the effective development of such a system will require an incremental and cooperative effort from work teams all around the world. Thanks to the Internet, the physical distance separating those teams is becoming less and less important. Unfortunately, many existing systems and tools are not really suitable for working together, even if they address closely related problems or rely on similar theoretical bases. The Internet and middleware technology, however, provide numerous possibilities for removing integration barriers and dramatically improving the reusability of previous theoretical results and development efforts.

We can further motivate this point of view with two concrete examples extracted from our personal experience: model checking and termination analysis. Model checking is becoming a standard technique for automated software verification. Its success has made the term relatively popular for describing other verification methods. An example is the automatic analysis of general correctness properties of concurrent systems that do not require a particular representation with a property language (eg deadlock absence, dead code and arithmetic errors). Moreover, advances in model checking position it as an important contributor to the future development of the verifying compiler.

Still, model checking is not commonly used in, for example, object-oriented programming, where 'de facto' standard modelling methods (eg UML) and programming languages (eg C, C++ and Java) are common practice. The current pattern in verification tools for these languages essentially consists of automatic model extraction from the source code to the input language of existing, efficient model-checking tools like SPIN.


However, the use of these tools for verifying parts of a program currently being developed in, say, Java is not easy. First, the syntax of Java programs does not correspond to the input language of any of the above-mentioned tools (eg Promela). Some translation is therefore required, but it hardly covers even a subset of the language. Second, there is no easy way of calling one of these tools from the Java interpreter. Third, we lack methods to analyse the results, especially counterexamples for non-deterministic executions. This is partly due to the different interfaces of the tools. For instance, in contrast to the usual GUI of most Java interpreters, SPIN has a command-based interface accepting text from the standard input (stdin), and writes its results on the standard output (stdout). Fourth, the tools can be written in different programming languages, which may make the eventual embedding of one tool into another significantly more difficult. Finally, there is no documented API to gain external access to the functionalities of the applications.

As a different (although related) motivating example, we consider the termination analysis of programs written in programming languages such as CafeOBJ, Elan, Erlang, Maude and OBJ, whose operational principle is based on reducing expressions according to the well-known paradigm of term rewriting. Proofs of termination of term-rewriting systems can be used to prove termination of such programs. A number of termination tools exist which can, in fact, be used to address this problem: for instance, the tool MU-TERM can be used to prove termination of simple Maude programs (see Figure 1). Termination of rewriting has also recently been related to certain classes of standard verification problems that can be reduced to termination problems.

Unfortunately, however, it is not easy to connect independently developed analysis tools (like MU-TERM) to a practical environment such as the Maude interpreter.

The problems are analogous to those enumerated above.

Interoperability (ie making it possible for a program on one system to access programs and data on another system) is a general problem in software engineering, and a number of solutions (namely, middleware systems) have been devised to solve it. The developers of (formal) software tools should consider such solutions to be an essential part of their design and development. Tackling this problem seriously would also be a positive step for the research community in formal methods and declarative languages. These areas are often used (or should be used) to design and implement analysis tools. By isolating the various aspects of a complex tool (eg GUI, numeric computation, constraint solving, program transformation and manipulation) into different modules, possibly written in different languages, we gain the flexibility to use the most appropriate language. Tightly coupled support techniques such as RPC, CORBA and COM have not undergone sufficient experimentation and development in this setting. XML WWW services (or just WWW services) also provide a flexible architecture for achieving interoperability of loosely coupled systems that are developed in different programming languages. Again, few efforts have been made to reconcile the design and development of software tools with this technology. We claim that, when considering the design and use of software systems and analysis tools, not only correctness and efficiency but also interoperability across platforms, applications and programming languages must be systematically considered.
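
The loosely coupled approach can be sketched as follows: a stdin/stdout command-line tool (in the style of the interfaces just described) is wrapped behind a minimal HTTP service, so that any language with an HTTP client can invoke it. This Python sketch uses a placeholder command and plain HTTP rather than the SOAP/WSDL machinery; it is an illustration of the idea, not a description of any existing tool's API.

    import subprocess
    from http.server import BaseHTTPRequestHandler, HTTPServer

    TOOL = ["sort"]   # placeholder: substitute the analysis tool's command line

    class ToolService(BaseHTTPRequestHandler):
        """POST the tool's textual input; receive its textual output.
        A minimal stand-in for a WSDL-described WWW service."""
        def do_POST(self):
            body = self.rfile.read(int(self.headers["Content-Length"]))
            result = subprocess.run(TOOL, input=body, capture_output=True)
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.end_headers()
            self.wfile.write(result.stdout)

    if __name__ == "__main__":
        HTTPServer(("localhost", 8080), ToolService).serve_forever()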

Starting this year, we plan to develop these ideas within the research project SELF (Software Engineering and Lightweight Formalisms). This will include research on:
• suitable XML-like formats for expressing the most important features of currently used (families of) programming languages; a related effort has recently been made in developing the CLI/CTS/CLS/CLR framework, which forms the basis of Microsoft's .NET platform
• suitable XML sub-languages for expressing analysis/verification requirements to existing tools
• middleware translators and interfaces from existing programming languages to the formalisms or lower-level languages which underlie the program analysis tools
• the inclusion of existing verification and analysis tools into the SOAP/WSDL/UDDI framework of XML WWW services in order to gain systematic access to their functionality.

Links:
SELF: http://self.lcc.uma.es/

ELP group: http://www.dsic.upv.es/users/elp/elp.html

Please contact:
Salvador Lucas, Universidad Politécnica de Valencia/SpaRCIM
E-mail: [email protected]
http://www.dsic.upv.es/~slucas

Screenshot of the termination tool MU-TERM.


Google Teaches Computers the Meaning of Words
by Rudi Cilibrasi and Paul Vitányi

Computers can learn the meaning of words with the help of the Google search engine. CWI researchers Rudi Cilibrasi and Paul Vitányi found a way to use the World Wide Web as a massive database from which to extract similarity of meaning between words. The approach is novel in its unrestricted problem domain, simplicity of implementation, and manifestly ontological underpinnings.

Figure: The program, arranging the objects in a tree visualizing pairwise NGDs, automatically organized the colors towards one side of the tree and the numbers towards the other.

To make computers more intelligent, one would like to represent the meaning of words and phrases in computer-digestible form. Long-term and labor-intensive efforts like the Cyc project (Cyc Corporation) and the WordNet project (Princeton University) try to establish semantic relations between common objects, or, more precisely, names for those objects. The idea is to create a semantic web of such vast proportions that rudimentary intelligence and knowledge about the real world spontaneously emerge. This comes at the great cost of designing structures capable of manipulating knowledge, and of having knowledgeable human experts enter high-quality contents into these structures. While the efforts are long-running and large-scale, the overall information entered is minute compared to what is available on the World Wide Web.

The rise of the World Wide Web has enticed millions of users to type in trillions of characters to create billions of web pages of, on average, low-quality contents. The sheer mass of information available about almost every conceivable topic makes it likely that extremes will cancel and that the majority or average is meaningful in a low-quality approximate sense. We devise a general method to tap the amorphous low-grade knowledge available for free on the World Wide Web, typed in by local users aiming at personal gratification of diverse objectives, and yet globally achieving what is effectively the largest semantic electronic database in the world. Moreover, this database is available to all by using search engines like Google.

We developed a method that uses only the name of an object and obtains knowledge about the semantic (meaning) similarity of objects by tapping and distilling the great mass of available information on the web. Intuitively, the approach is as follows. The meaning of a word can often be derived from the words in the context in which it occurs. Two related words will be likely to give more hits (web pages where they both occur) than two unrelated words. For instance, the combined terms 'head' and 'hat' will give more hits in a Google search than 'head' and 'banana'.

The Google search engine indexes around ten billion pages on the web today. Each such page can be viewed as a set of index terms.



A search for a particular index term, say 'horse', returns a certain number of hits, say 46,700,000. The number of hits for the search term 'rider' is, say, 12,200,000. It is also possible to search for the pages where both 'horse' and 'rider' occur. This gives, say, 2,630,000 hits. Dividing by the total number of pages indexed by Google yields the 'Google probability' of the search terms. Using standard information theory, we take the negative logarithm of these probabilities to obtain code word lengths for the search terms. This 'Google code' is optimal in expected code word length, and hence we can view Google as a compressor.

We then plug these code word lengths into our formula, derived from a decade of theoretical and experimental work (by others and ourselves) on compression-based similarity distances, to obtain the normalized Google distance, or NGD, which represents the relative distance in meaning between the words. The lower this so-called normalized Google distance, the more closely the words are related. With the Google hit numbers above, we computed NGD(horse, rider) = 0.453. The same calculation when Google indexed only one-half of the current number of pages gave NGD(horse, rider) = 0.4445. This is in line with our contention that the relative frequencies of web pages containing search terms give objective absolute information about the true semantic distances between the search terms: that is, when the number of pages indexed by Google goes to infinity, the NGDs must stabilize and converge to definite limit values.
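
The published NGD formula of Cilibrasi and Vitányi can be tried directly. In this Python sketch the hit counts are those quoted above, while the total number of indexed pages N is an assumption (the text only says around ten billion), so the printed value comes out close to, but not exactly at, the 0.453 quoted:

    from math import log

    def ngd(fx, fy, fxy, n):
        """Normalized Google distance from hit counts fx and fy, the
        joint count fxy, and the total number of indexed pages n
        (the base of the logarithm cancels out)."""
        return ((max(log(fx), log(fy)) - log(fxy)) /
                (log(n) - min(log(fx), log(fy))))

    # Hit counts from the text; N = 8e9 is an assumed index size.
    print(round(ngd(46_700_000, 12_200_000, 2_630_000, 8_000_000_000), 3))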

By calculating NGDs, networks of semantic distances between words can be generated, with which a computer can internalize the meaning of these words through their semantic interrelations. In several tests, we demonstrated that the method can distinguish between colors and numbers, between prime numbers and composite numbers, and between 17th-century Dutch painters. It can also understand the distinction between electrical and non-electrical terms, and perform similar feats for religious terms and emergency-incident terms. Furthermore, we conducted a massive experiment in understanding 100 randomly selected WordNet categories. There, we obtained an 87.5 percent mean agreement in accuracy of classifying unknown terms with the PhD-expert-entered semantic knowledge in the WordNet database. The method also exhibited the ability to do a simple automatic English-Spanish translation.

This research has been supported in part by the Netherlands ICES-KIS-3 program BRICKS, by NWO, the EU and ESF.

Links:
http://www.arxiv.org/abs/cs.CL/0412098

http://homepages.cwi.nl/~paulv/

http://homepages.cwi.nl/~cilibrar/

http://www.newscientist.com/channel/info-tech/mg18524846.100

Please contact:
Paul Vitányi, CWI and University of Amsterdam, The Netherlands
Tel: +31 20 592 4124
E-mail: [email protected]

Rudi Cilibrasi, CWI, The Netherlands
Tel: +31 20 592 4232
E-mail: [email protected]


Reconstruction, Modelling and Motion Analysis of the Human Knee Based on Magnetic Resonance Images
by Gábor Renner and György Szántó

New computer methods and programs have been developed at SZTAKI in cooperation with the Orthopaedic Department of Semmelweis University and the CT/MR Laboratory of the International Medical Center, Budapest. They support the three-dimensional visualization, reconstruction and modelling of the tissues and organs of the human body. These methods were used in examining the morphology and movement of the knee.

The knee joint is a very delicate and frequently damaged component of the human motion system. A wide range of imaging techniques (MR, CT) is available to clinical practitioners investigating the injured parts. These methods offer invaluable support to the orthopaedic surgeon by making visible both the state of the articular cartilage and certain joint diseases. The same tools can be applied to the analysis of kinematic behaviour as well as the morphology of the articular cartilage.

When analysing articular motion, the determination of the contact points (regions) of the cartilage surfaces in three-dimensional (3D) space is especially important. Delineation of the contacting surfaces is extremely difficult, due firstly to the similar grey-scale representation of the synovial fluid between the opposite surfaces and the hyaline cartilage, and secondly to the fact that the two surfaces partly cover each other. We have developed computer methods for the image analysis, the 3D reconstruction, and the building of geometric models of the knee joint based on MR/CT images. An iso-surface ray-tracing method with a Phong illumination model is used to create surfaces from volume grid models according to a predefined grey-scale threshold value.

Surface points extracted from the iso-surface models represent only rough information on the shape. Accurate geometrical models are required for a detailed analysis of the shape and functionality of the knee structures, especially when studying the contact properties of proximal or touching cartilage surfaces.

Spatial geometrical models are created using a set of contours extracted from MR images. We have developed active contour and fast marching methods for the detection of tissue contours in the sequence of MR images relevant to the knee joint (bone, cartilage). Our algorithms and programs based on these two methods work fast and reliably for bone contours; however, they require several improvements for cartilage contours.
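The authors' own implementation is not published in this article; as a rough illustration of the active-contour idea on a single MR slice, here is a minimal sketch using scikit-image as a stand-in (the file name, circle centre and parameter values are placeholders):

import numpy as np
from skimage import filters, io
from skimage.segmentation import active_contour

# One grey-scale MR slice; the path is hypothetical.
slice_img = io.imread("mr_slice.png", as_gray=True)

# Initialise the snake as a circle roughly enclosing the bone of interest.
theta = np.linspace(0, 2 * np.pi, 200)
init = np.column_stack([128 + 80 * np.sin(theta), 128 + 80 * np.cos(theta)])

# Smooth the slice, then let the snake settle on the strong bone boundary.
snake = active_contour(filters.gaussian(slice_img, sigma=2),
                       init, alpha=0.015, beta=10, gamma=0.001)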

By using contours extracted from 2D images and suitable transformations, a digitized representation (point set) of the functional surfaces of the knee joint can be constructed. We have developed several methods and programs which allow us to visualize and evaluate the morphology of those surfaces (eg noise elimination, decimation and triangulation). Various geometrical properties (eg lines of intersection, curvature distributions and feature points) can be calculated, visualized and evaluated.

The study of the knee in motion is important from both anatomical and pathological points of view. This requires the alignment of images and spatial models corresponding to different flexions of the knee in a common coordinate system (registration). Different types of registration methods (such as iterative closest point methods and methods based on anatomical markers or feature points) have been developed for the accurate and fast registration of knee structures.
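The article does not detail its registration algorithms; the following is a minimal sketch of the iterative-closest-point idea it names, using NumPy and SciPy (the fixed iteration count and the absence of convergence tests are simplifications):

import numpy as np
from scipy.spatial import cKDTree

def icp(source, target, iterations=50):
    # Rigidly align source (N,3) to target (M,3): repeatedly pair each
    # source point with its nearest target point, then solve for the best
    # rotation and translation (Kabsch algorithm) and apply them.
    src = source.copy()
    tree = cKDTree(target)
    for _ in range(iterations):
        _, idx = tree.query(src)
        matched = target[idx]
        mu_s, mu_t = src.mean(axis=0), matched.mean(axis=0)
        h = (src - mu_s).T @ (matched - mu_t)   # cross-covariance
        u, _, vt = np.linalg.svd(h)
        d = np.sign(np.linalg.det(vt.T @ u.T))  # guard against reflections
        r = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
        t = mu_t - r @ mu_s
        src = src @ r.T + t
    return src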

Our experiments show that continuous and smooth representations of the active surfaces are needed for detailed biological or medical investigations (especially for estimating the size and shape of the contacting regions of cartilages).

Continuous surface fit starts with the topological ordering of points, which is done by creating a triangulation over the surface points. A good triangulation must meet a number of requirements: it must be topologically correct, must eliminate outlier points, triangles must have comparable side lengths and angles, their size must reflect the curvatures of the surface, and so forth. Algorithms and computer programs have been developed for the automatic generation of triangular structures. The mathematical description of surfaces with a high degree of continuity is performed by parametric surfaces commonly used in computer graphics and CAD (eg Bézier, B-spline and NURBS surfaces). The mathematical representation of continuous surfaces is computed by minimizing a functional. This contains the squared distances of data points to the continuous surface, as well as geometrical quantities reflecting continuity and smoothness.
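The article names the ingredients of this functional without writing it out; a common form, with data points p_i, their parameter values (u_i, v_i) on the surface S, and a smoothing weight lambda (the exact terms used by the authors may differ), is:

\[
\min_{S}\; \sum_i \bigl\| S(u_i, v_i) - p_i \bigr\|^2
  + \lambda \iint \Bigl( \|S_{uu}\|^2 + 2\,\|S_{uv}\|^2 + \|S_{vv}\|^2 \Bigr)\, du\, dv
\]

The first term pulls the surface towards the measured points; the second penalises bending, which produces the continuity and smoothness the authors require.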

Contact properties of cartilage surfaces (their shape and extension) can best be evaluated by a distance function defined between the two surfaces. This is the length of the shortest path from a surface point to the other surface. We have developed a procedure to evaluate the distance function for the continuous surface models of the cartilage. Surfaces can be colour-coded according to the distance function, which provides an extremely efficient tool for evaluating the shape and size of the contact regions. Figure 3 shows the colour-coded distances for the two contacting cartilage surfaces (femur and tibia).
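The authors evaluate this on their continuous models; a discrete approximation of the same distance function is easy to sketch on sampled surface points (the sampling step and cKDTree are stand-ins for the authors' continuous procedure):

import numpy as np
from scipy.spatial import cKDTree

def surface_distance(points_a, points_b):
    # For each sample on surface A, the distance to the nearest sample on
    # surface B; colour-mapping the result gives plots like Figure 3.
    tree = cKDTree(points_b)
    dist, _ = tree.query(points_a)
    return dist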

The above methods provide the basis for a prototype system to be used in the surgical treatment of orthopaedic patients. The complex nature of the problems in this area means that close cooperation is required within a multidisciplinary team. The project, which has run from 2002 to 2005, has been greatly supported by the National Research and Development Programme of the Ministry of Education of Hungary.

Please contact:
Gábor Renner, SZTAKI, Hungary
Tel: +36 1 279 6152
E-mail: [email protected]

György Szántó, SZTAKI, Hungary
Tel: +36 1 279 6187
E-mail: [email protected]

Figure 1: Plane intersections perpendicular to the tibia axis.
Figure 2: Active surface of the femur fitted to the point set.
Figure 3: Colour-coded distance function of cartilage surfaces.


Computer Controlled Cognitive Diagnostics and Rehabilitation Method for Stroke Patients
by Cecília Sik Lányi, Julianna Szabó, Attila Páll and Ilona Pataky

The HELEN (HELp Neuropsychology) computer-controlled diagnosis and therapeutic system was developed at the Colour and Multimedia Laboratory at the University of Veszprém, Hungary, during the period 2002-2004. It was prepared to support the diagnosis and therapy of stroke patients, to keep records of the rehabilitation process and to help the therapeutic process. The system has been successfully tested at the National Centre of Brain Vein Diseases.

The system we have developed provides help for neuropsychological clinicians by supporting them in setting up the anamnesis, providing tests and training material for rehabilitation work, and keeping track of patients’ progress.

The system contains a form for holding patients’ personal data, numerous tests and tutorial tasks, and a database in which the rehabilitation process is archived. The single tasks contain decision-making situations, and the system provides assistance in these cases if the patient seeks it. In case of failure the system provides the possibility of a new trial. The method used by the patient to solve the problems is stored, as this represents important information for the neurologist. To enable the patient to find his or her best solution to the problem, we have developed interactive procedures that permit the solution to be reached using different cognitive approaches. It was important that the different routes taken by patients should not influence the test results, which are expressed in points, but should guide the rehabilitation process.

During the diagnostic phase of the rehabilitation, the program keeps track of the patient’s problem-solving strategy and uses this in selecting the best rehabilitation plan. To do this we use syndrome analysis, meaning our system is not merely a collection of computerized tests, but has been built in such a way that it uses the solution provided by the patient to select the next task. The system contains tasks that relate to perception, memory and attention, including writing, reading and counting tasks. Furthermore, a new aspect of our solution is that through modern technical support one can change the difficulty of a task by using colouring, changing the local or global stroke intensity, or adding noise. The program also numerically presents the efficiency of the task solution. This means that diagnostic and therapeutic programs are not only used according to individual needs, but also supply exact results and comparative data. One can therefore investigate cognitive and behavioural deficits in such a way as to activate the intact functions of the brain, thereby allowing the patient to control the course of the procedures. The pace of the rehabilitation is thus influenced by the cooperation of the patient, and through this involvement one hopes that the rehabilitation speed will also increase. In addition, the main menu of the HELEN program is constructed so as to allow new tests and practice tasks to be added at any time.

The Specification of the Logical Structure of the HELEN Software
The following data are stored in the database of the software, as sketched below:
• personal data
• left- or right-handedness (this has to be known for the best strategy of certain tasks)
• data from the investigation, eg test results stored in the different sub-menus of the program, their solution strategy, drawings prepared by the patient, audio recordings etc.
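As an illustration only, such a record might be structured as follows; every field name here is invented for the sketch and not taken from the HELEN implementation:

from dataclasses import dataclass, field

@dataclass
class PatientRecord:
    name: str
    handedness: str                                      # 'left' or 'right'; guides task strategy
    test_results: dict = field(default_factory=dict)     # sub-menu -> score
    solution_strategies: list = field(default_factory=list)
    media: list = field(default_factory=list)            # drawings, audio recordings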

The tasks of the HELEN interactive and therapeutic test and teaching software are shown in the table.


Table: Tasks of the HELEN interactive and therapeutic test and teaching software.

Figure 1: The “Where?” tasks.
Figure 2: How many pictures do you see?


Steps of the Solutions of the Applied Research
The HELEN system contains several interactive computer-controlled rehabilitation programs. Using these programs the neurologist can reveal individual changes in the mental capacity of the patient, and can obtain new information on the cognitive functioning of the patient’s brain. The program can also be used in group therapies.

The information stored in the database helps the neurologist to follow the individual rehabilitation path of the patient. It can also provide data on the typical behaviour of the patient, how he or she recognizes something, and give further insight into the sequencing of perception.

In the future we intend to supplement the system with further sub-tasks, so as to increase the variety of tests the neurologist can use. Based on user feedback, we intend to include stories that are more interesting and understandable for the stroke patients, so that they become more interested and involved in the rehabilitation process.

The rapid progress in the image-processing capabilities of modern computers enables the use of more sophisticated pictures, the introduction of animation, the use of virtual-reality worlds etc. We are of the opinion that patients would appreciate more realistic presentations, stimulating them to become more involved and thus engage their mental capacity. All this could lead to quicker and more complete rehabilitation. Naturally, at the present moment this is still only speculation; nevertheless, we feel it is worth exploring, since it could help a non-negligible percentage of the currently inactive population to re-integrate into normal life.

The development of the system was supported by the National Research and Development Fund, Hungary, Grant no. NKFP 2/052/2001.

Please contact:
Cecília Sik Lányi, University of Veszprém, Hungary
Tel: +36 88 624601
E-mail: [email protected]

NeuRadIR: A Web-Based NeuroRadiological Information Retrieval System
by Sándor Dominich, Júlia Góth and Tamás Kiezer

The Cost-Effective Health Preservation Consortium Project was formed to develop novel, Web-based Computer Tomograph (CT) image retrieval techniques for use in medical practice, research and teaching. The project was carried out in the Department of Computer Science, Faculty of Information Technology, University of Veszprém, within the Center of Information Retrieval, between 2001 and 2004. It was sponsored by the National Research and Development Fund of the Széchenyi Plan, Hungary.

Medical images, such as CT images, are increasingly important in healthcare, therapy treatment and medical research. Our research consists in applying information retrieval methods to human brain CT images and report retrieval. This offers a solution to the problem radiologists have of finding cases that match exactly, match partially, or are similar to the information they need, depending on the user’s viewpoint. To satisfy these different requirements, three different information retrieval techniques were adopted in one system. The first is employed when the user requires cases that exactly match their information need; a Boolean retrieval technique using AND queries was used for this purpose. When the user wants to retrieve partially matching cases, the retrieval is performed by the Hyperbolic Information Retrieval (HIR) algorithm. Lastly, in order to find similar cases, the Interaction Information Retrieval (I2R) method was implemented.

The advantage of using all three strategies is that they complement each other: the Boolean search retrieves every case exactly matching the query, the hyperbolic search retrieves every case having a partial match to the query, while the interaction search returns the most closely associated cases based on the query. The last two techniques were developed in the Center of Information Retrieval (CIR).
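Of the three, only the Boolean strategy is standard enough to sketch without guesswork; HIR and I2R are CIR-specific and are not reproduced here. A minimal AND query over an inverted index (the index contents below are invented):

def boolean_and(query_terms, inverted_index):
    # Exact-match retrieval: the cases that contain every query term.
    # inverted_index maps a term to the set of case ids mentioning it.
    postings = [inverted_index.get(t, set()) for t in query_terms]
    return set.intersection(*postings) if postings else set()

index = {"lesion": {1, 2, 5}, "frontal": {2, 5, 9}}
print(boolean_and(["lesion", "frontal"], index))  # {2, 5}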

In medical practice, most image retrieval systems are designed to help experienced physicians with diagnostic tasks, and they require that users have prior knowledge of the field. This means they are not suitable for educational purposes. From the application viewpoint, the results of our research will enhance the quality of both specialist consultation and medical education. General practitioners may confirm a diagnosis or explore possible treatment plans by consulting the CT retrieval system over the Web. Medical students may want to relate images with diagnoses, or to see images corresponding to different pathological cases such as lesion, bleed or stroke. Our system enables users (both radiologists and general practitioners) to make use of medical text and image databases over the Web, both to facilitate health preservation and to assist diagnosis and patient care. An assessment by radiologists found the application to be both relevant and effective.

System Description
The NeuRadIR (NeuroRadiological Information Retrieval System) application (see Figure 1) consists of computer program modules written in several languages, as well as related documentation. Communication between the Web server and the search program is based on the CGI protocol. The CT Base Editor makes it possible to create and modify the database containing the images, corresponding reports and index files. The Search Module is used on-line on the Web, and consists of user interfaces and search programs. The query is formally analysed and interconnected with the database. A controlled vocabulary was created, based on both textual reports and standard specialist queries.

The title page of the ‘Search Screen’ contains a header at the top, whilst the area underneath is divided into two halves: the left half contains the terms of the controlled vocabulary, whilst the right half contains the search buttons and the query. Medical terms can be selected from the vocabulary or freely entered from the keyboard in the query line. The user selects one of the search strategies by clicking on the appropriate button. This returns a hit list, and by clicking on any hit, both textual information and CT images are displayed.

The NeuRadIR system has been demonstrated at several national and international forums and can be accessed from the project homepage. In the future, we plan to extend the application by making use of feature-extraction modules to index images.

Link: http://www.dcs.vein.hu/CIR/neuradir/app/

Please contact:
Sándor Dominich, Júlia Góth, Tamás Kiezer, Department of Computer Science, University of Veszprém, Veszprém, Hungary
Tel: +36 88 624711
E-mail: [email protected], [email protected], [email protected]


System diagram and architecture of the NeuRadIR application.

ARGO: A System for Accessible Navigation in the World Wide Web
by Stavroula Ntoa and Constantine Stephanidis

ARGO is a system for accessible navigation in the World Wide Web, which addresses the requirements of diverse target user groups, including people with disabilities. ARGO supports visual and non-visual interaction in order to satisfy the requirements of blind users, users with vision problems, and users with mobility impairments of the upper limbs. ARGO has been developed by ICS-FORTH in the context of the EQUAL PROKLISI Project.

In the context of the ‘Community Initiative EQUAL’ of the European Commission, the project PROKLISI (‘Identifying and combating simple and multiple discrimination faced by people with disabilities in the labour market’), coordinated by the Hellenic National Confederation of People with Disabilities, aims to identify and combat the difficulties that people with disabilities and their families face in accessing and remaining in the labour market. One of the goals of the PROKLISI project is to exploit the potential that new technologies provide in order to develop innovative systems and tools aimed at ensuring access to information and communication for groups that are usually discriminated against in the employment market.

Under this perspective, a system for accessible navigation in the World Wide Web has been developed, called ARGO, which addresses the requirements of diverse target user groups, including people with disabilities. The system supports visual and non-visual interaction in order to satisfy the requirements of:
• blind users and users with vision problems
• users with mobility impairments of the upper limbs.

The ARGO system operates on Microsoft Windows XP, supports all typical functionalities of a browser application, and additionally includes:
• innovative mechanisms for the representation of links and the structure of web pages
• innovative mechanisms for accelerating user interaction with the system
• facilities for finding words and phrases in web pages
• speech synthesis software support



• compatibility with alternative interaction devices
• high-level security provision for networking functionalities
• HTML 4.0 support.

In order to address the needs of the various target user groups, ARGO employs different operation modes (visual and non-visual interaction), as well as alternative input and output devices.

The system operates in three different modes:
• non-visual mode, appropriate for blind users and users with severe vision impairments
• visual mode enhanced with the scanning technique, appropriate for users with severe motor impairments of the upper limbs
• visual mode without assistive interaction techniques or input and output devices, appropriate for able-bodied users.

In non-visual interaction, user input is provided through the keyboard, while the system output is provided through synthetic speech and, when necessary, warning sounds. All visual interface elements are organised in a non-visual hierarchical tree structure. By using the appropriate keyboard input commands, the user can navigate the hierarchy and interact with the currently active interface element.

In visual interaction with scanning, the user provides input through three switches and receives the system output through the screen. All the interactive interface elements of the application are automatically scanned (highlighted), providing the user with the opportunity to interact with each one of them. Thus, there is only one active interface element for each dialogue step, indicated by a coloured border that highlights it. The user may interact with the indicated element, or move the scanning dialogue to the next interactive interface element, by pressing the appropriate switch.
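ARGO's internals are not published in this article; the step-scanning dialogue it describes can be sketched as a simple loop (highlight, read_switch and the switch names are assumptions of the sketch, not ARGO's API):

def scan(elements, read_switch):
    # One element is highlighted at a time; the 'next' switch advances the
    # highlight, the 'select' switch activates the highlighted element.
    i = 0
    while True:
        highlight(elements[i])          # stand-in for drawing the coloured border
        pressed = read_switch()         # blocks until a switch is pressed
        if pressed == "select":
            return elements[i]
        if pressed == "next":
            i = (i + 1) % len(elements)

def highlight(element):
    print("highlighted:", element)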

Users with less severe impairments, such as low vision or limited dexterity, can also be assisted by the system, by using the simple visual interface of ARGO in combination with the accessibility features of the operating system (eg magnifier, mouse keys, sticky keys, etc).

Evaluations with usability experts and users, which took place in the usability laboratory of ICS-FORTH, indicated that the system is usable and can be useful to people with disabilities.

According to the project requirements, the ARGO system has been designed to operate as an information point in public places. Therefore, specifications have been provided regarding the setting of the information system and the arrangement of the assistive devices for full accessibility. To facilitate blind users in locating the system and learning how to use it without any assistance, a camera is attached to the system, and when motion is detected, instructions for use are announced.

The ARGO system is currently installed and in use in two information kiosks in the municipalities of Cholargos (Attiki district) and Neapoli (Thessaloniki district).

Future work includes evaluating ARGO with real users in the municipalities where it is currently available, and improving the system according to the evaluation results.

Please contact:
Stavroula Ntoa, ICS-FORTH
Tel: +30 2810 391744
E-mail: [email protected]

Overview of the ARGO system.


BRICKS: A Digital Library Management System for Cultural Heritage
by Carlo Meghini and Thomas Risse

The aim of the BRICKS project is to design and develop an open, user- and service-oriented infrastructure to share knowledge and resources in the Cultural Heritage domain. BRICKS will give cultural heritage institutions and users the possibility to provide or share their content and services with other users.

BRICKS (Building Resources for Integrated Cultural Knowledge Services) is an Integrated Project of the 6th Framework Programme (IST 507457) within the research and technological development programme ‘Integrating and Strengthening the European Research Area (2002-2006)’. BRICKS began in January 2004 and has a duration of 42 months. The BRICKS Consortium consists of 24 partners: 7 from academia, and the rest equally distributed between users and industry, mostly SMEs. The target audience is very broad and heterogeneous, and involves cultural heritage and educational institutions, the research community, industry, and citizens. The project is coordinated by Engineering SpA, Italy.

Requirements
The BRICKS infrastructure will use the Internet as a backbone and has been designed to enable:
• Expandability, which means the ability to acquire new services, new content, or new users, without any interruption of service
• Graduality of Engagement, which means offering a wide spectrum of solutions to the content and service providers that want to become members of BRICKS
• Scalability
• Availability
• Interoperability.

In addition, the user community requires that an institution can become a member of a BRICKS installation with minimal investment, and that the maintenance costs of the infrastructure, and in consequence the running costs of members, are minimal. BRICKS membership will also be flexible; parties can join or leave the system at any point in time without administrative overheads.

Approach
It has been decided that the BRICKS architecture will be decentralised, based on a peer-to-peer (P2P) paradigm, ie no central server will be employed. Every member institution of a BRICKS installation is a node (a BNode in the BRICKS jargon) of the distributed architecture. BNodes communicate and use available resources for content and metadata management. Any BNode has direct knowledge of only a subset of the other BNodes in the system. However, if a BNode wants to reach a member with which it is not in direct communication, it will forward a request to some of its known neighbour BNodes; these will deliver the request to the final destination or forward it again to other nodes.
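The BRICKS P2P layer itself is not specified in the article; the forwarding behaviour just described can be sketched as a depth-first flood with a hop limit (node.id, node.neighbours, node.can_handle and node.handle are assumed interfaces, and the real system may route quite differently):

def forward(node, request, ttl=6, seen=None):
    # A BNode that cannot serve a request passes it on to its known
    # neighbours, which either serve it or forward it again.
    seen = set() if seen is None else seen
    if node.id in seen or ttl == 0:
        return None                     # already visited, or hop limit reached
    seen.add(node.id)
    if node.can_handle(request):
        return node.handle(request)
    for neighbour in node.neighbours:
        result = forward(neighbour, request, ttl - 1, seen)
        if result is not None:
            return result
    return None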

The Bricks of BRICKS
The figure shows the architecture of a BNode. The components (bricks, in the BRICKS jargon) making up the BRICKS architecture are Web Services, and are divided into three broad categories:
• Fundamental bricks: these are required on a BNode for its proper functioning and to maintain membership in the BRICKS community. In addition to the P2P layer, other fundamental services are: Decentralised XML Storage (providing data ubiquity on the P2P architecture), Service Registration and Discovery, and Index Manager
• Core bricks: these are needed if a local BNode installation wants to provide its local users with access to the BRICKS. They include: User Management, Authentication and Authorisation, and Search and Browse
• Basic bricks: these are optional, and are deployed on the BNode only in case of need (ie the Content Management brick will be deployed only if the institution running the BNode exposes content to the BRICKS installation). They include: Content Management, Metadata Management, Accounting, IPR Protection, Annotation Management, and Service Composition.

Pilot Applications
In order to render BRICKS operational in the Cultural Heritage domain, the project will develop four applications (Pillars, in the BRICKS jargon):
• Archaeological Site, which includes four scenarios: Cultural Landscape Discoverer (sharing knowledge about cultural landscapes), Finds Identifier (identification of objects brought to a museum by members of the public), Landscapes Reconstructed (reconstruction of knowledge about cultural landscapes), and Pompeii and Roma (access to visual artefacts)
• The European Museum Forum, which aims at introducing a digital application process for the European Museum of the Year Award
• Living Memory, which aims at developing a collaborative environment targeted at the general public, allowing the creation of cultural contents and their sharing among the rest of the community
• Scriptorium, which will facilitate the work of historians and archive professionals, universities, cultural research centres, libraries, history professors and teachers, and others, by defining a new way to exploit and manage distributed digital texts and historical documents. The specific scenario will be the creation of a critical edition of an ancient manuscript.

Link:
BRICKS Community: http://www.brickscommunity.org/

Please contact:
Carlo Meghini, ISTI-CNR, Italy
Tel: +39 050 315 2893
E-mail: [email protected]

Thomas Risse, Integrated Publication and Information Systems Institute (IPSI), Fraunhofer ICT Group
Tel: +49 (0) 6151 / 869-906
E-mail: [email protected]

Architecture of a BNode.


Student Programming Project becomes Multi-Million Dollar Company
by Nancy Bazilchuk

Everyone likes a sure bet. But two computer scientists with Master’s degrees from the Norwegian University of Science and Technology have found a way to turn bookmakers’ odds into money in their pockets with a clever information-gathering program.

In March 2005, the two graduates, now entrepreneurs, moved their rapidly growing business, Norwegian Market Monitor AS, to new headquarters in Trondheim, Norway.

The program developed by Petter Fornass and Tore Steinkjer, who studied for their Master’s degrees together at NTNU, uses a high-accuracy wrapper to crawl the Internet and collect information about the odds that have been set for 25 different sports across the globe. About 300 sites are regularly checked for information. This technology has become Betradar.com, which monitors odds and is updated every few minutes.

“Since we are collecting numbers, everything has to be 100 percent correct,” Fornass says. More than 100 bookmakers around the globe now use this web-crawling technology. Because even small differences in odds can result in huge losses, bookmakers can use Betradar.com to check that the odds they have set for various sporting events are in line with odds set elsewhere on the planet.

Petter Fornass is now CEO of Norwegian Market Monitor AS, which owns and operates Betradar.com. His interest in developing an information-collection technology began with a simple desire to finance the expenses associated with his Master’s degree in computer science at NTNU. He developed a web-crawling program with Steinkjer that allowed them to check and compare hundreds of websites for betting odds. The program enabled the pair to find differences in odds for the same sporting event. With this information, they could then place sure bets that guaranteed a win.

But as Fornass and Steinkjer discovered, placing bets and playing the odds can be cash-intensive, even if you are always guaranteed a win.
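The arithmetic behind such a sure bet is not spelled out in the article; for a two-outcome event with decimal odds from two different bookmakers it reduces to one inequality (the function and figures below are illustrative, not Betradar’s):

def sure_bet(odds_a, odds_b, bankroll=100.0):
    # If 1/odds_a + 1/odds_b < 1, splitting the bankroll in inverse
    # proportion to the odds pays out the same amount whichever side wins.
    margin = 1 / odds_a + 1 / odds_b
    if margin >= 1:
        return None                     # no guaranteed profit
    stake_a = bankroll * (1 / odds_a) / margin
    stake_b = bankroll - stake_a
    return stake_a, stake_b, stake_a * odds_a - bankroll

print(sure_bet(2.10, 2.10))  # stakes of 50 and 50 return 105: 5 profit either way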

They were also far more interested in the programming challenges of their project than in the financial benefit in and of itself.

Fornass and Steinkjer launched their company from their offices at the Gløshaugen campus of NTNU in 2000. The launch was a little more unusual than most: instead of participating in the Venture Cup competition in Norway, the business-plan competition that helps Scandinavian students, researchers and others take their business idea from concept to actual start-up, Fornass and Steinkjer decided to look for venture capital on their own, and found it. In late March 2005, the pair moved their rapidly growing business out of the Innovasjonssenter Gløshaugen to new offices on the Trondheim waterfront. Among the products developed by the group is Sports Fixtures, which reports starting times and dates for hockey, soccer and tennis matches and other sports, with daily updates on more than 10 sports one week in advance. Results are delivered 2-3 times per day.

Another key service is called Odds Radar, which detects the odds and starting times from more than 140 bookmakers. Bookmakers can view 8000 odds and 2000 starting times of their competitors each day, and are provided with a comprehensive monitoring service covering more than 200 Internet sources for odds each day. The monitoring module shows the odds key that was used in setting odds, and the odds history and date changes for each bookmaker detected as part of the program. The service allows bookmakers to compare their odds and starting times with the market average, with alerts of critical differences by e-mail or SMS. The program also detects incorrect results, errors from typos or matches that have been missed or left out, and presents the official schedules from federations or tournament organizers if available on the Internet.

The company employs 27 people, with 10 employees in Norway, 13 in Germany, three in Switzerland and one in Czechoslovakia, and has more than 100 clients spread across the globe.

Link:
http://www.betradar.com

Please contact:
Petter Fornass, Betradar.com, Norway
E-mail: [email protected]



Petter Fornass, co-founder of Norwegian Market Monitor AS.


Software Engineering Institute goes International in Software Process Research
by Mario Fusani

In industrial prototyping and production, the shifting of the object of study from product to process is more than a century old and has been at the basis of significant product improvement. In software technology, a similar effort has been attempted for a couple of decades, both with empirical software engineering approaches and with mathematical models, yet the results in software-dependent products and services are not as bright as might be expected.

On the empirical side, corporate and public standards for software development have flourished in recent years. It has been shown that standards have a much greater direct impact on industry (even the software industry) than solutions proposed in the specialised literature. For this very reason, standards should embody applicable results of scientific research, whereas frequently they do not. A consequence is that, despite an increasing tendency of developers to follow recognised international norms, there is growing concern about the ability of empirical software development processes to respond to the emerging needs of the global information society. On the formal-methods side, it is indicative that the application of mathematical models (together with the instruments that can be derived from them) by software process managers is still a somewhat rare occurrence. So the big questions are:
• what will be the successful paradigm(s) for the software process in the near future?
• what requirements must the software satisfy?
• how will these requirements be influenced by other (social, political, economic, geographical) factors?

At the Software Engineering Institute (SEI), this concern has been taken on board as an exciting challenge (the bigger the problem, the happier the researcher, one could say). ‘Exploring What’s Next in Software Process Research’ has been the leading concept of a recent initiative, sponsored by SEI and by several private corporations, which led in August 2004 to the constitution of the International Process Research Consortium (IPRC), a worldwide association of researchers. Six European countries are joining SEI in this effort, with Italy represented by a researcher from ISTI-CNR.

The main goal of the IPRC is to prepare a roadmap for software process research for the next 5-10 years. The aim is to provide indications on how the technology challenges of the near future should be addressed by the software industry. However, software is not the only issue of interest for IPRC researchers. The main activities are summarised below:
• a number of working groups have been set up to investigate distinct areas of interest
• six workshops will be held between August 2004 and August 2006, in which ideas of the IPRC member researchers will be proposed and debated; the working groups will continue their activity in the meantime
• different nationalities, cultures and needs are represented in the research teams, to avoid localisation and privileged solutions
• experts in non-technical disciplines, such as economics, psychology and organizational science, are invited to lecture to the technical team
• experts (from SEI) in working-group behaviour act as facilitators, catalysing reactions among researchers
• various interaction mechanisms are activated by the facilitators to obtain maximum benefit from the collaboration.

The current IPRC approaches and contents are as follows:
• technical and non-technical discussion has been solicited from the members in an unbiased way, allowing a large degree of divergence in points of view and perspectives; convergence is expected as ideas and activity mature around elements of the roadmap
• widely disparate content has been deployed so far in the form of statements, in order to establish relationships and priorities: needs, hot/warm/cold research topics, the state of the art and trends in software process solutions, human factors, and known and mysterious forces pushing forward or retarding progress
• a scenario-based approach is being explored to analyse the impact of a future multi-dimensional ‘trends space’ on any of the software process elements; the lack-of-crystal-ball uncertainty is mitigated by considering each trend in two opposing directions: this implies much work if completeness is the goal, but knowledge can be increased even with partial analyses.

The outcomes of the two workshops held so far are considered very promising, thanks to the large amount of material examined and the ideas generated by the discussions. Although much work is needed before there can be convergence towards concrete proposals, we already have the impression that, in addition to the roadmap, other results and new lines of research will emerge from the work of the Consortium.

Links:
http://www.sei.cmu.edu
http://www.sei.cmu.edu/iprc/

Please contact:
Mario Fusani, ISTI-CNR, Italy
Tel: +39 050 315 2916
E-mail: [email protected]
