Post on 25-Dec-2021
Australian Ocean Data Network utilises Amazon Web Service Batch processing for gridded data
Sebastien Mancini, Roger Proctor, Peter Blain, Kate Reid
University of Tasmania
5 November 2018
Australia’s Integrated Marine Observing System (IMOS)
12 minutes, plus 3 minutes for questions
Outline of the talk
1. About IMOS
2. Gridded datasets
3. Cloud computing and AWS Batch
4. Helpdesk and user statistics
5. Future improvements
IMOS gridded datasets
• Sea Surface Temperature (SST) products
• Ocean Colour products
• Satellite altimetry products
• Coastal radar products
• Climatology
• Bathymetry
What does the user want?
• Visualise data before downloading
• Retrieve a time series at a particular location and download data in CSV format
• Subset and aggregate data and generate output file in netCDF format
• Most of them do not want to access data on THREDDS
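The "time series at a particular location" request can be sketched as a nearest-grid-point lookup followed by a CSV export. This is only an illustration using the Python standard library and a tiny synthetic grid; the function names, the `grid[time][lat][lon]` layout and the sample values are assumptions, not the AODN implementation.

```python
import csv
import io


def nearest_index(values, target):
    """Return the index of the grid coordinate closest to target."""
    return min(range(len(values)), key=lambda i: abs(values[i] - target))


def point_timeseries_csv(times, lats, lons, grid, lat, lon, name="value"):
    """Extract the series at the grid cell nearest (lat, lon) as CSV text.

    grid is assumed to be indexed as grid[time][lat][lon].
    """
    i = nearest_index(lats, lat)
    j = nearest_index(lons, lon)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["time", "latitude", "longitude", name])
    for t, field in zip(times, grid):
        writer.writerow([t, lats[i], lons[j], field[i][j]])
    return out.getvalue()


# Synthetic example: 2 time steps on a 2 x 3 grid (values are made up)
times = ["2018-06-01", "2018-06-02"]
lats = [-43.0, -42.0]
lons = [147.0, 148.0, 149.0]
sst = [
    [[12.1, 12.3, 12.5], [13.0, 13.2, 13.4]],
    [[12.0, 12.2, 12.4], [12.9, 13.1, 13.3]],
]
csv_text = point_timeseries_csv(times, lats, lons, sst, lat=-42.2, lon=147.9, name="sst")
print(csv_text)
```

A production version would read the grid from netCDF files rather than nested lists, but the lookup and export steps would keep the same shape.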
What does the user want?
Most downloaded datasets on the AODN Portal for the period 01/2016 to 04/2018 using Google Analytics
Rank | Event Label | Total Events
1 | IMOS - Australian National Mooring Network (ANMN) Facility - Current velocity time-series | 944
2 | IMOS - Argo Profiles | 853
3 | IMOS - Australian National Mooring Network (ANMN) Facility - Temperature and salinity time-series | 776
4 | IMOS - SRS Satellite - SST L3S - 01 day composite - night time | 765
5 | IMOS - SRS - MODIS - 01 day - Chlorophyll-a concentration (OC3 model) | 679
6 | IMOS - Australian National Facility for Ocean Gliders (ANFOG) - delayed mode glider deployments | 501
7 | IMOS - SRS Satellite - SST L3S - 06 day composite - day time | 458
8 | IMOS - OceanCurrent - Gridded sea level anomaly - Delayed mode | 439
9 | IMOS National Reference Station (NRS) - Salinity, Carbon, Alkalinity, Oxygen and Nutrients (Silicate, Ammonium, Nitrite/Nitrate, Phosphate) | 420
10 | IMOS - SRS Satellite - SST L3S - 1 month composite - day and night time composite | 404
11 | IMOS - SRS SATELLITE - SST L3S - 01 day composite - day and night time composite | 376
12 | IMOS - OceanCurrent - Gridded sea level anomaly - Near real time | 368
15 | IMOS - SRS - MODIS - 01 day - Ocean Colour - SST | 324
17 | IMOS - SRS SATELLITE - SST L3S - 03 day composite - night time | 280
19 | IMOS - SRS Satellite - SST L3S - 03 day composite - day and night time composite | 227
20 | IMOS - SRS Satellite - SST L3S - 01 day composite - day time | 209
22 | IMOS - SRS Satellite - SST L3S - 06 day composite - day and night time composite | 169
23 | IMOS - SRS - MODIS - 01 day - Chlorophyll-a concentration (GSM model) | 157
26 | MARVL3 - Australian shelf temperature data atlas | 144
27 | IMOS - SRS Satellite - SST L3S - 1 month composite - day time | 141
Serverless architecture
• Server details get abstracted away
• Servers only run when needed
• Leave server management to a company that does that as their bread and butter
• Focus on what matters instead - the code and data
Image: Creator: John Voo, Url: https://www.flickr.com/photos/138248475@N03/ Licence: CC BY 2.0
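To make the serverless idea concrete, here is a minimal sketch of a Lambda-style request handler that validates a subset request and submits it to AWS Batch, so compute only runs when a job exists. The queue name, job definition name and request fields are hypothetical placeholders, not the AODN configuration; the only AWS call used is the boto3 Batch client's `submit_job`.

```python
import json
import re

# Hypothetical names for illustration only
JOB_QUEUE = "aggregation-queue"
JOB_DEFINITION = "aggregation-job"
REQUIRED_FIELDS = ("layer", "bbox", "time_start", "time_end")


def handler(event, context=None, batch_client=None):
    """Validate a subset request and hand it to AWS Batch."""
    if batch_client is None:
        import boto3  # imported lazily so the sketch can be exercised without AWS

        batch_client = boto3.client("batch")

    request = json.loads(event["body"])
    missing = [k for k in REQUIRED_FIELDS if k not in request]
    if missing:
        return {"statusCode": 400, "body": json.dumps({"missing": missing})}

    # Batch job names allow only letters, numbers, hyphens and underscores
    safe_layer = re.sub(r"[^A-Za-z0-9_-]", "-", str(request["layer"]))
    response = batch_client.submit_job(
        jobName=("subset-" + safe_layer)[:128],
        jobQueue=JOB_QUEUE,
        jobDefinition=JOB_DEFINITION,
        # Batch job parameters must be strings
        parameters={k: str(request[k]) for k in REQUIRED_FIELDS},
    )
    return {"statusCode": 202, "body": json.dumps({"jobId": response["jobId"]})}
```

Passing the client in as an argument keeps the handler testable with a stand-in object and keeps the AWS-specific part in one place.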
User statistics: February to September 2018
240 unique users
42 users per month
1600 downloads over the entire period
350 downloads in June
32 failed downloads over the entire period
Process time of < 8 min for 1000 jobs
10% of jobs have an average process time of 900 min
Queue time of < 7 min for 1000 jobs
10% of jobs have an average queue time of 550 min
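A few derived figures follow directly from the numbers above. The short calculation below assumes the reporting period covers eight full calendar months (February to September inclusive); the variable names are illustrative.

```python
downloads_total = 1600
downloads_failed = 32
downloads_june = 350
months = 8  # assumption: February to September 2018 inclusive

failure_rate = downloads_failed / downloads_total
mean_monthly_downloads = downloads_total / months
june_vs_mean = downloads_june / mean_monthly_downloads

print(f"failure rate: {failure_rate:.1%}")                         # 2.0%
print(f"mean downloads per month: {mean_monthly_downloads:.0f}")   # 200
print(f"June relative to the mean: {june_vs_mean:.2f}x")           # 1.75x
```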
Cost comparison: AWS Batch vs Normal approach
Approach | Components | Cost
AWS Batch | EC2 SPOT instance + local storage + Lambda | 80$ per month
Normal approach | EC2 instance (if reserved) + Local storage | 350$ per month
Additional cost for both approaches:
• Storage for output file
• S3 operations (data access)
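The size of the saving can be worked out from the two monthly figures. The calculation below leaves out the additional storage and S3 operation costs, since those apply to both approaches.

```python
batch_monthly = 80     # AWS Batch: EC2 Spot instance + local storage + Lambda
normal_monthly = 350   # normal approach: reserved EC2 instance + local storage

monthly_saving = normal_monthly - batch_monthly
relative_saving = monthly_saving / normal_monthly
annual_saving = monthly_saving * 12

print(f"monthly saving: ${monthly_saving}")        # $270
print(f"relative saving: {relative_saving:.0%}")   # 77%
print(f"annual saving: ${annual_saving}")          # $3240
```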
Vendor lock-in
+ majority of aggregation code written as a generic library/utility
- request handler Lambda has some AWS specific code, but could be refactored to a more generic API handler
- status service Lambda is quite specific to AWS Batch/S3 currently, but could be made more abstract
- the components are glued together by CloudFormation and do make assumptions about running on Lambda, however the majority of the classes are generic, so if running outside of AWS was a requirement, the project could be refactored to make concepts like storage and APIs more generic
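One way to picture "making storage more generic" is to hide the storage backend behind a small interface, so the aggregation code never touches S3 directly. The sketch below is an assumption about how such a refactor could look, not the project's actual classes: `OutputStore`, `LocalStore`, `S3Store` and `publish_result` are hypothetical names, and the S3 variant relies only on the boto3 S3 client's `put_object` call.

```python
import tempfile
from pathlib import Path
from typing import Protocol


class OutputStore(Protocol):
    """Anything that can persist a result and return a locator for it."""

    def put(self, key: str, data: bytes) -> str: ...


class LocalStore:
    """Writes results under a directory on the local filesystem."""

    def __init__(self, root):
        self.root = Path(root).resolve()

    def put(self, key: str, data: bytes) -> str:
        path = self.root / key
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)
        return path.as_uri()


class S3Store:
    """Writes results to an S3 bucket through an injected boto3 S3 client."""

    def __init__(self, bucket: str, client):
        self.bucket = bucket
        self.client = client

    def put(self, key: str, data: bytes) -> str:
        self.client.put_object(Bucket=self.bucket, Key=key, Body=data)
        return f"s3://{self.bucket}/{key}"


def publish_result(store: OutputStore, job_id: str, payload: bytes) -> str:
    """Generic code path: it does not know which backend it is writing to."""
    return store.put(f"jobs/{job_id}/output.nc", payload)


# Usage with the local backend (the payload is a placeholder, not real netCDF)
with tempfile.TemporaryDirectory() as tmp:
    uri = publish_result(LocalStore(tmp), "job-42", b"netcdf bytes")
    stored = (Path(tmp) / "jobs" / "job-42" / "output.nc").read_bytes()
print(uri)
```

Only the choice of `LocalStore` or `S3Store` would then depend on where the system runs.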
Future improvements
• Extend cloud computing to other elements of our infrastructure:
  • Development of our application stack (AODN Portal, Geonetwork, Geoserver, Geowebcache, ncWMS …)
• Improve subsetting/aggregating code:
  • Download multiple point time series at the same time
  • Improve efficiency to produce results quicker
• Use of system analytics to improve queue design
  • Multiple queues depending on size of jobs, type of users…
• Apply AWS Batch to other types of data downloads:
  • Large CSV downloads using WFS requests from Geoserver
  • Subset Geotiff (e.g. Bathymetry data)
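The "multiple queues depending on size of jobs" idea could be sketched as a routing function that estimates the size of a request and picks a queue before submission. The queue names, thresholds and size estimate below are hypothetical; in practice they would come from the system analytics mentioned above.

```python
from dataclasses import dataclass

# Hypothetical thresholds (in estimated work units) and queue names
QUEUE_LIMITS = [(100, "small-jobs"), (5000, "medium-jobs")]
DEFAULT_QUEUE = "large-jobs"


@dataclass
class SubsetRequest:
    n_files: int          # number of gridded files touched by the time range
    bbox_fraction: float  # fraction of the full grid covered, between 0 and 1


def estimated_size(request: SubsetRequest) -> float:
    """Crude size estimate: files to read scaled by the spatial coverage."""
    return request.n_files * request.bbox_fraction


def choose_queue(request: SubsetRequest) -> str:
    """Return the first queue whose limit covers the estimated size."""
    size = estimated_size(request)
    for limit, queue_name in QUEUE_LIMITS:
        if size <= limit:
            return queue_name
    return DEFAULT_QUEUE


print(choose_queue(SubsetRequest(n_files=30, bbox_fraction=0.1)))
print(choose_queue(SubsetRequest(n_files=3650, bbox_fraction=1.0)))
print(choose_queue(SubsetRequest(n_files=10000, bbox_fraction=1.0)))
```

Routing by user type would add a second key to the same lookup; it is left out here to keep the sketch short.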