THE FEASIBILITY AND UTILITY OF IMPLEMENTING TEMPORAL DATA CUBES

TO SUPPORT PROJECTION OR “FORECAST” MODELS AND LAND CHANGE TRENDS

A Report of the National Geospatial Advisory Committee Landsat Advisory Group

April 2018

NGAC Data Cube Feasibility for Forecasting Paper, April 2018

National Geospatial Advisory Committee (www.fgdc.gov/ngac)

THE FEASIBILITY AND UTILITY OF IMPLEMENTING TEMPORAL DATA CUBES TO SUPPORT PROJECTION OR “FORECAST” MODELS AND LAND CHANGE TRENDS

Executive Summary

In August of 2016, the U.S. Geological Survey (USGS) requested that the Landsat Advisory Group (LAG), a subcommittee of the National Geospatial Advisory Committee, study the feasibility and utility of implementing temporal data cubes to support projection or ‘forecast’ models of land change trends. This study was a follow-on to two previous LAG study papers on “Product Improvement” and “Cloud computing” that had both been published in 2013. The study was proposed to help address whether a deeper market demand for forecasting land change would develop. Several questions were also posed based on the presumptive use of a data cube with Landsat derived information, as a measure, and time, as a dimension, which this report discusses.

Background

The joint National Aeronautics and Space Administration (NASA)/United States Geological Survey (USGS) Landsat program provides the longest continuous and openly available space-based record of Earth's land in existence. Landsat missions have acquired moderate resolution multispectral data for over 40 years. The European Space Agency (ESA) has been gathering Earth observation data for a long time and initiated systematic archiving and analysis of data from other agencies’ satellites in the early 1980s. It began its own Earth observations with Europe’s first Earth Resources Satellite (ERS). The ESA Earth Observing Sentinel satellites over nearly the past four years have added to the amount, the complexity, and the relevance of readily accessible remotely sensed data. Having a facile, agile, and reliable way for all to interact, directly or indirectly, with these, already vast but also growing, collections has both national and international interest. These collections pose the “Big Data” technology challenge to previous data architectures and tools to manipulate or to interrogate priceless observations from a wide-range of sensors. Higher spatial, spectral, and temporal resolution of the collection compounds the challenge as well as the opportunities to better understand our Earth. Improved approaches to the management, preparation, distribution and analysis will relieve some of the data-to-information-to-knowledge progression stress. Algorithms for statistical analysis of increasingly larger samples (and perhaps significantly varying) “Big Data,” used under different conditions to address different issues and perspectives, must be wisely selected and used to avoid erroneous statistical inference or inadequate conclusion.

The Federal Geographic Data Committee (FGDC) requested, for the 2016 program, that the Landsat Advisory Group provide advice on “the feasibility and utility of implementing temporal data cubes to support projection or ‘forecast’ models of land change trends” and noted that this work was “intended as a follow on topic to the LAG study papers on Product Improvement and Cloud computing published in 2013.” Five questions were posed:

• In addition to Landsat, what other data sources (to include EO, SAR, and LIDAR) are optimally suited for leveraging (e.g., co-registered) to support data cube implementations for land change analysis and forecast modeling?

• What kinds of Landsat time-series products would have the broadest community use or most impactful contribution in specific areas?

• Which organizations with expertise in forecast modeling are best postured to evaluate and demonstrate the forecast potential from a Landsat-based temporal data cube?

• How far back in time into the Landsat archive should the staging of ‘analysis ready data’ be considered? E.g., early data collections such as multi-spectral scanner (MSS) data are less equipped (in terms of metadata) to support rigorous geometric and radiometric calibration compared to later collections.

• How could efficient synergy be realized among government and commercial roles for data cube development, and operations (processing, storage, distribution) to satisfy broad community needs?

The NGAC Paper, dated 11 December 2013, on “Product Improvement: Advise USGS on potential means of modifying the current products to make them more useful to commercial information providers and value-added analysts”[1] made the general recommendation that “USGS further improve Landsat products to both enhance the scientific value of the imagery, but also to provide additional value to the commercial and government organizations wishing to extract the maximum value from the imagery.” Seven points expanded that summary recommendation for USGS:

• Clearly define what USGS will produce and avoid competition with commercial work.

• Refine geometric accuracy and radiometric measurements to enable better change detection.

• Improve L1G product geometric accuracy and co-registration.

• Define a standard surface reflectance product.

• Consolidate scientific research and publish best practices for a range of products.

• Provide certification/validation facilities for products not produced by USGS.

• Simplify access to the L1T product.

The second NGAC Paper of the same date, entitled “Cloud Computing: Potential New Approaches to Data Management and Distribution”[1] endorsed the use of cloud computing and suggested how USGS/Earth Resources Observation and Science (EROS) should leverage that technology by:

• Supporting third-party cloud providers by providing bulk data download;

• Co-locating data and on-demand processing for only the desired information;

• Transmitting the required processing model to the cloud so massive data could be handled by multiple CPUs;

• Downloading subsets of L1T products;

• Giving attention to use of open software standards to avoid tying any services to proprietary software; and

• Streamlining security.

Introduction

Explaining interest in the spatio-temporal data cube

The options for storing and accessing relevant data offer a range of functionality but are somewhat limited with massive data and specific requirements to satisfy particular business cases. In many cases, a data warehouse can adequately support information processing as a stable platform for consolidated

[1] Two of the 2013 NGAC Key Documents found at https://www.fgdc.gov/ngac/key-documents

and transactional data. However, of increasing interest, online analytical processing (OLAP) more adequately allows for multi-faceted consumption of data to meet varied needs. The data cube provides not only a storage structure but also the “staging” space for analysis of the information. The OLAP cube is a multi-dimensional database, which has drawn increasing attention over the past several years for earth observation collection. A marketing promotion for an Earthserver Project workshop on data cubes described the daylong workshop focus in the following way. “The data cube concept promises to tackle some of the challenges that come along with large volumes of environmental and geospatial data. Data cubes offer a more on-demand and analysis-ready access to n-dimensional data, which can be accessed along any axis, allowing for efficient trim or slice operations. The data cube concept makes large volumes of environmental and geospatial data more manageable and thus, increases the general uptake of Big Earth data.”[2]

Examining a notional architecture

The Committee on Earth Observation Satellites (CEOS), of which the US is a member country, began an Open Datacube initiative in 2016. Brian Killough (NASA) and Robert Woodcock (CSIRO of Australia) have been principal advocates for the initiative. When the initiative launched, use of the data cube, with the dimensions of space, time, and data type, was already a proven concept by Geoscience Australia and the Australian Space Agency and within development for their Landsat data archive. The objective was to have 20 countries operationally involved by 2022.[3] The pace, however, is exceeding the July 2017 plan. In July 2017, three countries (Australia, Colombia, and Switzerland) had operational capability. Four others were under development and twenty-one countries were under review. During a teleconferenced discussion with Dr. Killough and the Task Team 2 in mid-October 2017, he commented that 29 countries were already under review. In March 2018, during a briefing at the CEOS 7th Working Group for Capacity Building and Data Democracy Annual Meeting in Brazil, it was mentioned that at least 40 countries have entered into some level of discussion although the objective does remain 20. The speaker noted that Australia, Colombia, and Switzerland are still doing well. The United Kingdom, Uganda, Vietnam, Taiwan, Georgia, and Moldova are making progress. There are African regional data cubes in Ghana, Kenya, Senegal, Sierra Leone, and Tanzania. Therefore, the notion of the data cube is gaining interest and support. The global nature of the interest, however, adds to the complexity of how the US plans to expand its efforts with the Landsat collections.

Finding 1: Internationally the utility of the data cube for organizing Landsat data over time and location has growing acknowledgement to support time series analysis.[3] Colombia has found value in examining land change since 2000 and enabling understanding the trends for forest mapping and management. The main objectives of the Swiss Data Cube (SDC) are to support the Swiss government for environmental monitoring.[4] The Vietnam Data Cube is intended to create broad applications for socio-economic sustainable development goals for Vietnam as well as other countries in the region.[5]

Analysis Ready Data (ARD)[6] feed the formation of a data cube. Landsat 8 Operational Land Imager (OLI)/Thermal Infrared Sensor (TIRS) Tier 1 and 2, Landsat 7 Enhanced Thematic Mapper Plus (ETM+) Tier 1,

[2] https://themes.jrc.ec.europa.eu/news/view/158675/earthserver-workshop-data-cubes-for-big-earth-data-19th-20th-october-2017-frascati-rm-italy
[3] Killough, Brian, Open Data Cube Background and Vision, https://www.opendatacube.org/events, July 7th, 2017
[4] http://www.swissdatacube.org/
[5] https://vnsc.org.vn/en/news-events/news/internal-news/introduction-of-satellite-data-sharing-system-vietnam-data-cube/
[6] https://landsat.usgs.gov/ard

and Landsat 4-5 Thematic Mapper (TM) Tier 1 comprise the contiguous US, Alaska, and Hawai’i ARD, which is available from the EROS Center, using EarthExplorer to download. Starting mid-March 2018, two new Landsat science products, Surface Temperature and Dynamic Surface Water Extent, will begin to be integrated.

Dr. Robert Woodcock, who has worked for almost two decades in the field of visualization, spatial information systems and analytics and its application to Earth Science with a focus on ensuring research

Figure 1. The architectural concepts of the Australian Geoscience Data Cube

innovation leads to business innovation, has been reinforcing the above-mentioned work with the Open Datacube initiative using his extensive experience. He, with some colleagues, prepared the diagram seen in Figure 1,[7] which describes a notional architecture employing the data cube. The four layers, from bottom to top, are as follows:

Data Acquisition and Inflow - Observations are collected and pre-processed to an ‘analysis ready’ level by various custodians;

Datacube Infrastructure - analysis ready data are indexed into the AGDCv2 including ingestion into multi-dimensional datasets, with a suite of tools for task execution, discovery, visualization and so on;

Data and Application Platform - Platforms and environments that allow routine generation of products, and, exploration of new products in a ‘virtual laboratory’ environment; and

UI and Application Layer - A diverse set of applications is enabled by the underlying infrastructure.

Finding 2: The recommendations from the aforementioned LAG papers can be aligned with this notional architecture “to both enhance the scientific value of the imagery, but also to provide additional value to the commercial and government organizations wishing to extract the maximum value from the imagery” and to offer “potential new approaches to data management and distribution.”

EROS Center and data cubes:

In November 2016, USGS/EROS provided the LAG team with a briefing on the Land Change Monitoring, Assessment, and Projection (LCMAP) initiative “to harness the Landsat record in order to provide state-of-the-art land change capabilities needed by scientists, resource managers, and decision makers.” As explained during the presentation, to manage the resultant land-change products required addressing the issue “that the Landsat archive, currently organized as path rows, is not sufficiently efficient for time series studies. Moving to a grid-based data cube approach with API’s that condition and serve data per user specification will reduce data preparation time.” The data structure to be used was identified as an OLAP cube. The data content itself is the Analysis Ready Data (ARD) in the diagram above. The tiling scheme is modeled upon the Web Enabled Landsat Data (WELD) and will use the Albers Equal Area Conic projection and the World Geodetic System 84 datum. ARD are standardized well-characterized radiometric and geometric products. Dr. Tom Loveland characterized the ARD as Landsat data processed to a level that enables direct use in applications.

• It will support geospatial, multi-spectral, and multi-temporal manipulations for the purposes of data reduction, analysis, and interpretation.

• It offers consistent radiometric processing scaled both to top-of-atmosphere (TOA) reflectance and surface reflectance.

• It is designed for consistent geometry including spatial coverage and cartographic projection – e.g., pixels align through time, <12 m RMSE.

• It provides metadata on data provenance, geographic extent, and data quality.
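
The multi-temporal manipulations and through-time pixel alignment listed above can be illustrated with a small sketch. It is an illustration only, under stated assumptions: the cube is a toy nested list indexed [time][row][col], the function names (make_cube, slice_at_time, trim, drill, temporal_mean) are hypothetical, and nothing here represents the USGS, LCMAP, or Open Data Cube software. The point is that when every acquisition shares one grid, a single (row, col) index addresses the same ground footprint on every date.

```python
from statistics import mean


def make_cube(n_times, n_rows, n_cols):
    """Build a toy cube as nested lists indexed [time][row][col].

    Values are synthetic (t*100 + row*10 + col) so results are easy to check.
    """
    return [[[t * 100 + r * 10 + c for c in range(n_cols)]
             for r in range(n_rows)]
            for t in range(n_times)]


def slice_at_time(cube, t):
    """Slice: fix the time axis and return one 2-D image."""
    return cube[t]


def trim(cube, t_range, row_range, col_range):
    """Trim: keep all three axes but narrow each to a half-open range."""
    (t0, t1), (r0, r1), (c0, c1) = t_range, row_range, col_range
    return [[row[c0:c1] for row in image[r0:r1]] for image in cube[t0:t1]]


def drill(cube, row, col):
    """Drill: return the time series for one pixel location."""
    return [image[row][col] for image in cube]


def temporal_mean(cube):
    """Data reduction: collapse the time axis to a per-pixel mean."""
    n_rows, n_cols = len(cube[0]), len(cube[0][0])
    return [[mean(image[r][c] for image in cube) for c in range(n_cols)]
            for r in range(n_rows)]


cube = make_cube(n_times=3, n_rows=4, n_cols=5)
series = drill(cube, row=2, col=3)   # [23, 123, 223]
```

A pixel time series obtained this way is the input that a trend or forecast model would consume; the sketch stops short of any modeling.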

[7] Adam Lewis, Simon Oliver, Leo Lymburner, Ben Evans, Lesley Wyborn, Norman Mueller, Gregory Raevksi, Jeremy Hooke, Rob Woodcock, Joshua Sixsmith, Wenjun Wu, Peter Tan, Fuqin Li, Brian Killough, Stuart Minchin, Dale Roberts, Damien Ayers, Biswajit Bala, Lan-Wei Wang, The Australian Geoscience Data Cube — Foundations and lessons learned, https://www.sciencedirect.com/science/article/pii/S0034425717301086

In simple words, ARD are intended to provide some pre-processed products that alleviate some work burden on the part of the users. Therein lies both its benefit for most consumers and concern for some other Landsat users that will be addressed later.

Questions Posed by USGS

In addition to Landsat, what other data sources (to include EO, SAR, and LIDAR) are optimally suited for leveraging (e.g., co-registered) to support data cube implementations for land change analysis and forecast modeling?

Among the efforts considered by the LCMAP team already has been to increase time series density by adding Sentinel-2.[8] In the CEOS initiative, Colombia and Switzerland are studying how to incorporate both Sentinel-1 (SAR) and Sentinel-2 (multi-spectral). The Vietnam prototype includes both Sentinel-2 and ALOS data. Progress was discussed on 6 March 2018, when the Vietnam National Space Center organized a workshop “Introduction of satellite data sharing system Vietnam Data Cube” in Hanoi.

One should not assume that all other data sources could, would, or should be housed by USGS/EROS. The data cube design must allow additional dimensions or layers to the cube. It will often be necessary for another government, academic or commercial organization to incorporate their own, sometimes proprietary, data set to improve the results or to prepare tailored analysis. Thus, in Figure 1, one could consider analysis ready data to be multiple data sets that have been readied by some pre-processing to enter into the data cube structuring. Here, as seen in Figure 2,[9] layers of different data source products and extensions of more locations or times can be adaptively incorporated to address either some specific or generic issue. The graphic may obscure the reality that prospective “layering” demands consideration of some standardizing structure and functional guidelines.

Figure 2. Graphic of Conceptual Data Cube
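
The caution that “layering” demands some standardizing structure can be made concrete with a short sketch. Everything in it is an assumption made for illustration: the Grid and Layer classes, the stack_layers function, and the sample grid values are hypothetical and are not drawn from the LCMAP design or any existing data cube software. The sketch only shows the idea that a layer from another source should be admitted to the cube when, and only when, it sits on the same projection, origin, pixel size, and shape as the reference layer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grid:
    """Hypothetical grid description shared by all layers of one cube."""
    crs: str            # projection and datum, as free text
    origin_x: float     # projected x of the upper-left corner
    origin_y: float     # projected y of the upper-left corner
    pixel_size: float   # ground size of one pixel, in metres
    n_rows: int
    n_cols: int


@dataclass
class Layer:
    """One data source prepared for the cube: a name, its grid, its values."""
    name: str
    grid: Grid
    values: list        # nested list indexed [row][col]


def stack_layers(layers):
    """Return {name: values} only if every layer shares the first layer's grid."""
    if not layers:
        return {}
    reference = layers[0].grid
    mismatched = [layer.name for layer in layers if layer.grid != reference]
    if mismatched:
        raise ValueError(f"layers not on the reference grid: {mismatched}")
    return {layer.name: layer.values for layer in layers}


# Illustrative values only.
albers = Grid("Albers Equal Area Conic, WGS 84", 0.0, 0.0, 30.0, 2, 2)
utm = Grid("UTM zone 14N, WGS 84", 0.0, 0.0, 30.0, 2, 2)

reflectance = Layer("landsat_ard_reflectance", albers, [[0.1, 0.2], [0.3, 0.4]])
elevation = Layer("elevation", albers, [[310, 312], [305, 300]])
other = Layer("other_source", utm, [[1, 2], [3, 4]])

stacked = stack_layers([reflectance, elevation])   # accepted: same grid
# stack_layers([reflectance, other]) raises ValueError: different projection
```

In practice a mismatched layer would be reprojected and resampled onto the reference grid rather than rejected; the sketch deliberately leaves that step out.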

The notion that a variety of possible sources of data would accompany the ARD within the framework of the USGS LCMAP initiative was characterized in the graphic of Figure 3 provided by Dr. Loveland.

[8] Dwyer, John, “USGS Analysis Ready Data” presentation to the Landsat Science Team on January 14, 2016 and recent release indicating interest: https://landsat.usgs.gov/february-17-2018-us-landsat-ard-special-issue-call-manuscripts
[9] Adapted from https://www.slideshare.net/algum/data-cubes-7923771/5

Figure 3. LCMAP Conceptual Framework and Flow[10]

What kinds of Landsat time-series products would have the broadest community use or most impactful contribution in specific areas?

The Analysis-ready data (ARD) preparation, in general, rests upon foundational technology that can benefit nearly all users of Landsat data, not just a few specific applications. For example, ensuring that all data are consistently calibrated and carry appropriate quality-assurance metadata is of benefit to everyone, regardless of whether they are using data in one of the existing UTM grids or a new country-specific grid. The U.S. Landsat ARD tiling system is a modified version of the WELD structure. Three tile grid extents are defined for CONUS, Alaska, and Hawaii. The grid origins are defined in relation to the WGS datum but adjusted to align with the National Land Cover Database. They are country specific. In addition, the development of a US-specific ARD-based data cube in an Albers Equal Area Conic mapping projection is well-aligned with the mission of the USGS serving its US customers, as is preprocessing other geographically-coincident datasets to be available in that same projection. That approach both enables and facilitates the development of a range of US-specific higher-level data products and services. However, as data cubes become ubiquitous, what works well for the US may be quite awkward

[10] Loveland, Thomas, An LCMAP Overview: Land Change Monitoring, Assessment, and Projection, a Discussion with the Landsat Advisory Group and AmericaView Members, November 16, 2016

for other countries. When the Open Geospatial Consortium (OGC®) first began its Discrete Global Grid System (DGGS)[11] working group, it sought to establish a specification to address collating spatial data from multiple places and sources and overcoming the challenges of working with different reference or grid systems.

ARD present the Big Data challenge. As explained by USGS/EROS, ARD are the foundation of LCMAP providing standardized well-characterized radiometric and geometric products (Level-1 Collection 1), the atmospheric correction and geo-physical surface reflectance and surface brightness retrievals (Level-2), and hierarchical metadata to include pixel-level attributes. Also available would be “open source” code to establish the processing and metadata standards, to accommodate scalable architectures, to deploy into public or private clouds. The latter are all points consistent with the recommendations from the Cloud Computing paper. Task team members endorsed this openness but recommend also the distribution of verification procedures that the methods and workflows have been replicated properly for any non-USGS production that incorporated other sources and initiates tailored analysis. Those procedures might mirror what USGS itself uses. At this time, the ARD’s “open source” code is accessible through https://github.com/USGS-EROS. The download of a tile still involves 5000 x 5000 pixels per collection event and any partitioning down to some smaller geographic footprint for a more local area occurs in the chosen environment of the user. Improvements to the lengthy and space demanding download and processing tasks are needed.

Finding 3: Non-USGS processing of data using the open-source code and algorithms available from USGS could necessitate that USGS also release procedures documentation and some verification test data sets.

The task team responsible for this report cautions that the USGS should ensure that its various efforts relating to Landsat data processing and distribution are well aligned with each other and clearly articulated to the user community. In particular, the relationship between the Collection 1 reprocessing effort, existing Surface Reflectance processing and distribution efforts, and the Analysis-Ready Data effort, may need to be clarified. Fundamental improvements to processes, like sensor calibration, should be applied equally to processing and delivery of both UTM and data cube data. Similarly, both TOA and Surface Reflectance data are of value in all product forms and should be made available in a consistent manner. Keeping all these efforts aligned may minimize duplication of effort, but more importantly, it will avoid user confusion, which could otherwise lead to erroneous use of data by end users.[12]

Recently brought to the attention of the Task Team has been the voice of those who worry about the impact of “normalizing” the reflectance product across all the collections. They agree that pre-processing the Landsat data into this “normalized” state, so that time-series analysis can span multiple collections over a large area, brings great efficiencies by reducing the processing burden on many or even most of the users. What the concerned group questions is what error is introduced in that pre-processing that might affect analysis of smaller footprints and more restricted time sequences. Importantly, they are not claiming that significant errors might result. Rather they are concerned that whatever analysis may have been completed before moving ahead with ARD has not been quantified for them. They endorse that the Level 1T products will remain available and will want to do more study on

[11] http://www.opengeospatial.org/projects/groups/dggsswg
[12] Steven J. Covington, Principal Systems Engineer for the USGS Land Remote Sensing Program, commented: Current thinking has Collection 2 encoded with Cloud Optimized GeoTIFF (COG) to enable efficient extraction of user-defined areas smaller than the planned storage granule (a WRS Scene)

the algorithms that have been used to create ARD so they can reliably assess the error, if any or negligible, introduced into the data. A recommendation would be that USGS EROS Center release any study analysis completed on the error impact of the preprocessing or initiate such a study. (It is recognized that if there is concern that cannot be resolved, one can reverse the process that produced the TOA product and have the historical radiance product.)

Which organizations with expertise in forecast modeling are best postured to evaluate and demonstrate the forecast potential from a Landsat-based temporal data cube?

Much has been written about problems of forecasting with any Big Data, including all the imagery and geospatial collections, with the foremost challenge being the lack of personnel skilled for this task. The tiling scheme chosen for ARD and applied to the Landsat images over the US should assure alignment of tiles so that “drilling” through several images over the same geography provides the same footprint for subsequent time series analysis that could lead to forecasting future conditions based upon past information. It is trusted that rigorous testing has been done to assure the layered footprints over time are positioned within some defined degree of positional accuracy. One objective of the ARD effort could be to improve use of “big geo data.” Research and analysis work with large quantities of geospatial data has included discussion of the frustrating insufficiency of traditional statistical techniques and of the challenging selection of the most appropriate statistical technique to obtain reliable and consistent forecasts from large quantities of data.

In the initial releases of the Landsat ARD and the temporal data cube, it would be wise to consider the use of academic research centers to assess how much the new structure actually facilitates analysis and to encourage universities to revise classroom modules that prepare the future analysts and information managers. Will ARD enable better forecasts with Big Data using a variety of novel techniques? Not only can academic organizations be excellent partners with the government using these vast stores of data but also several private companies will be eager to use the ARD and build versions of the data cube tailored to support processing that delivers the answers needed by their customers.

How far back in time into the Landsat archive should the staging of ‘analysis ready data’ be considered? E.g., early data collections such as multi-spectral scanner (MSS) data are less equipped (in terms of metadata) to support rigorous geometric and radiometric calibration compared to later collections.

The decision to include the MSS data has been strongly recommended within USGS at the EROS Center. Addressing the question may be a moot point, given its value in the long term of continuous Earth imaging and observation and its inclusion being strongly recommended by some members of the previous Landsat Science Team. However, this task team strongly recommends that prioritizing development work should be carefully scrutinized within USGS. Is global ARD without MSS of greater value, to a growing international community of users, than US ARD with MSS? In addition, following some of the concern about forecasts from massive data stores, the issue of signal to noise (not easily mitigated by the seriously diminished amount of metadata for MSS data) should also be evaluated.

How could efficient synergy be realized among government and commercial roles for data cube development, and operations (processing, storage, distribution) to satisfy broad community needs?

Caution was urged by team members about how much of the production workload should be assumed by USGS. The analysis of the ARD, as ingested into the data cube infrastructure, should not be solely

dependent on the computing infrastructure of the USGS, which is unlikely to have ready access to some of the latest technology advancements, given the budgeting processes. Many of the organizations very interested in the promise of LCMAP might need attention more focused on specific areas that the EROS Center had not planned to address immediately. Those specific areas might have larger or smaller footprints or they might be outside the US. It is not clear that the government is prepared for such flexible response for building such specific data cubes, nor has any good justification been provided for why the government should assume that role of production. Members of the Task Team highly recommended more consideration of the private public partnership concept in the end-to-end process from Landsat level 1 products to ARD to user-tailored data cube. The Task Team agreed that USGS, as the Landsat source experts, should be responsible for Landsat ARD quality and consistency, although they would likely benefit from commercial support for the processing and distribution infrastructure.

Finding 4: The commercial sector is ready to provide data cube tailoring assistance, given its increasing experience with global geospatial data. It is also prepared to provision infrastructure to assist in the production of ARD.

As the needed tools and techniques mature, the team similarly recommends that USGS should not undertake to scale this country-specific effort globally themselves. There is no one peerless global projection coordinate system. Given specific needs, any spatial multi-dimensioned data cube can be quite parochial, and each country or region that wants a data cube would likely select their own tiling grid to minimize distortion in their region and maximize interoperability with other existing regional datasets. The USGS should focus on opening up its tools and the necessary input datasets so that third parties in the private sector can offer a service of building these data cubes for global customers in accordance with USGS best practices. Such a scenario might also involve other countries producing their own ARD, and if from Landsat, that could require the USGS to release image data (perhaps Level 0), DEM, GCP data, and all other necessary inputs in addition to the code that USGS uses to create the US ARD product. In this way, the USGS could focus on developing expertise and on building operational systems for the US, without straying into building operational systems for the world.

The concern about USGS producing either ARD or data cubes for the global customer relates back to the earlier description of both a mapping projection and a grid system that do not apply well globally. That raised the question about the priorities of the USGS production plans and how and why the private sector can step forward. The CEOS initiative is not without questions for similar challenges. Even if global stakeholders agree that an Open Datacube vision has promise, will they make their contributions to mitigate the risk that the concept cannot be scaled with limited resources? Given the adoption of the concept and the development of national data cubes under the CEOS initiative, having excellent transformation algorithms for the projections would allow necessary flexibility. The tiling scheme, however, could be far more challenging, if and when adjacent countries build national data cubes and select differing schemes. The role of CEOS in establishing or instantiating standards and specifications, like those in the DGGS mentioned above, should not be underestimated. Previously this paper mentioned standards with respect to the open software standards needed to avail any requester of the software, who might require the algorithms used by USGS in preparing the ARD at any point in the anticipated improvements over time.

Finding 5: The data cube implementation involves a broad scope of standards issues.

• In February 2018, after an informal discussion of the topic, an OGC group prepared an OGC discussion paper: “In response to a recent discussion (via the OGC email lists) regarding perceptions

about data cubes and DGGS, it was suggested that we begin a more formal discussion on this topic within the OGC Technical Committee. This information document aims to initiate a discussion of the broader definition of a data cube and the complementary role that DGGS technologies play.”[13] The following future actions were identified. “…the Authors recommend some targeted actions with which we should proceed in collaboration with the community of the DGGS specification and domain groups. These actions mainly focus on investigating the efficiency on querying and exploring large multi-dimensional arrays while using the DGGS technologies on Datacubes. These activities will be exercised under specific ongoing big data research international projects.”

• Also in February, the Open Geospatial Consortium announced that it was seeking public comment on Web Coverage Service (WCS) 2.1 Candidate Standard. The qualifier statement for the announcement read “Updated WCS 2.1 Standard will simplify access to spatio-temporal ‘big data cubes’”.[14] The release also offers more explanation. “By supporting the more general data cube model of CIS 1.1, the WCS 2.1 standard will simplify access to spatio-temporal ‘big data cubes’, with an operation spectrum ranging from simple sub-setting in space and time up to complex spatio-temporal analytics through Web Coverage Processing Service (WCPS). WCPS offers a protocol-independent language for the extraction, processing, and analysis of multi-dimensional coverages representing sensor, image, or statistics data, such as might be enveloped within a data cube.

• In 2017, Dr. Peter Baumann, Professor of Computer Science, Jacobs University Bremen, published a positively provocative paper within the community of interest, named the Data Cube Manifesto,[15] in which he commented, “Recently, the term datacube is receiving increasing attention as it has the potential of greatly simplifying “Big Earth Data” services for users by providing massive spatio-temporal data in an analysis-ready way. However, there is considerable confusion about the data and service model of such datacubes.” That statement was followed by his six principles of data cube service concluding with the sixth being “Datacubes shall support a language allowing clients to submit simple as well as composite extraction, processing, filtering, and fusion tasks in an ad-hoc fashion… The OGC datacube standards, CIS and WCS/WCPS, are embraced by open-source and proprietary implementers, coming with compliance tests enabling interoperability down to the level of single pixels. Availability of datacube standards and tools is heralding a new era of service quality and, ultimately, better data insights.”

The Task Team recommends that OGC be encouraged to continue work on the standards that support the agile, reliable, and consistent use of a data cube approach. This would help address this section’s question about the efficient synergy between public and private sector use to meet customer/client requirements.

Another quite relevant point that has emerged during the months of discussion on this LAG task assignment has been the question of local or cloud storage and/or processing. Assumptions about the desire for nations to want all their data downloaded to their own servers rather than preferring the value-added solutions provided by the cloud service providers are not necessarily reinforced by the

[13] Purss, M., Peterson, P., Strobl, P., and Sabeur, Z., Discussion Paper: A DGGS Perspective on Datacubes, 18-006, 14 February 2018 (Permission to use: The OGC Working Group that developed the paper approved release for use by the NGAC membership. 26 March 2018.)
[14] http://www.opengeospatial.org/pressroom/pressreleases/2738
[15] Baumann, Peter, The Datacube Manifesto, http://earthserver.eu/tech/datacube-manifesto Research supported by EC contract 654367

emerging evidence.[16] In some countries, the tools to work with the massive data are either not available or the skill levels are currently inadequate. The private sector working closely with the national government’s imagery could dramatically simplify data use when solutions rather than data are the desired outcome. Cloud computing, in addition to or in lieu of cloud storage, may be the tailored approach. Even when data are what may be needed, the response could be a tailored data cube provision where the national data, like ARD, are layered with other data sources and refined to a particular footprint, consistent with the preceding discussion in this study. One question raised was how the private sector might collaborate to help with tiling the additional sources to match that of ARD as the layers of the data cube are incorporated. During the CEOS briefing in Brazil in March 2018,[17] the topic of cooperation with the private sector under some grant agreements was included as a needed facilitator of the global effort. Partnerships with Google, Amazon, and others were seen as enabling the “scalable solution.”

Major Recommendations

This report makes several recommendations specific to the U.S. Landsat Analysis Ready Data (ARD) and its potential for being incorporated in a variety of data cubes as a direct-use dataset in monitoring and assessing landscape change.

1. Task Team members endorse the openness of the EROS Center commitment to provide the source data and to publish, as “open source,” the software and algorithms used to produce ARD. The Team recommends that USGS publish verification procedures for confirming that the methods and workflows have been replicated properly in any non-USGS processing. These procedures would likely reflect the very processes that USGS has used in preparing ARD. The verification task itself would not be the responsibility of the EROS Center but rather of any other entity using the software and algorithms.

2. Studies may already exist that characterize how “normalizing” reflectance across sensors and years might affect values. The Task Team recommends that the USGS EROS Center release any error/difference study and analysis, which may have already been completed, comparing the reflectance values of traditional scene pixels with those of the ARD unit pixels to determine any radiometric changes resulting from the preprocessing that creates the ARD. Offering access to those studies could be beneficial to some researchers. If such an analysis has not been completed, the Task Team recommends that one be initiated.

3. The Task Team expects processing techniques, algorithms, and associated tools to improve over time. Reprocessing the entire dataset, rather than limiting new approaches to only data acquired after the development of improvements, would meet the “analysis-ready” objective of reducing the data processing load of data users. The Team believes that complete revision of the entire ARD could follow a MODIS approach. The Team was advised by USGS that such processing of so much data could take up to ten months, so a reasonable schedule for updates will need to be established. The Task Team recommends that when improved processing approaches are ready, the reprocessing should apply to the entire dataset in use and that users should not be required themselves to apply compatibility adjustments to any ARD received prior to the change.

16 The CEOS Data Cube, Three-Year Work Plan 2016-2018 http://ceos.org/document_management/Ad_Hoc_Teams/SDCG_for_GFOI/Meetings/SDCG-10/Cube%203-Year%20Work%20Plan%20-%20v1.0.pdf
17 Holloway, Kim “Open Data Cube Initiative” Agenda item #8, CEOS 7th Working Group for Capacity Building and Data Democracy Annual Meeting, INPE Jose dos Campos, Brazil, 6-8th March 2018

4. The Task Team does agree that MSS should be incorporated into ARD to optimize use of the entire forty-five years of collection history. However, the Team recommends that prioritizing development work should be carefully scrutinized with consideration given to whether globally extending ARD may be more important than spending available time incorporating the MSS collection. In general, it is recommended that USGS assess all needs and wants and establish criteria to prioritize Landsat work, including enhancements to the ARD initiative.

5. The Task Team recommends that USGS should not undertake to scale the US ARD coverage effort globally by itself, as the private sector is better prepared with needed tools, mature techniques, and, particularly, scalable infrastructure.

This report also makes recommendations about geospatial data cubes, as they become more globally employed to manage and exchange information for a variety of applications.

1. The Task Team recommends that the USGS representative to the Open Geospatial Consortium, where USGS is a Strategic Member, should advocate for and participate in more discussion about data cube standards within the OGC Technical Committee.

2. The Task Team recommends that preparing data cubes for specific uses should not be an objective of the government, which should be cautious about proceeding even with production of some generic forms of a data cube. The tailored data cubes should not be a federal government production responsibility.

Additional recommendations are made with reference to this report.

1. The Task Team recommends that a subsequent request be made to a future LAG Team to evaluate progress on the findings and recommendations of this paper and to update as needed.

2. The USGS has only fledgling experience with ARD, having first released it to the community of users at the end of October 2017. At this point, ARD users have not had extensive experience with it, and there is certainly not much evidence of the resulting data cubes. It would be helpful for USGS to survey those who request the ARD on some routine basis, gathering information for a subsequent report. Among the factors to be surveyed would be whether users are transitioning to ARD or still requesting the previous distribution formats. The Pecora Conference in mid-November 2017 provided an initial opportunity for groups of Landsat users to discuss their early reactions to the release of ARD. Since that time, use has increased, but not all users are fully comfortable knowing how to use the data to its best advantage. Similarly, ongoing information exchanges between the public and private sectors may provide more insight into defining the interdependencies to make data cubes the most effective way to advance use of imagery and expansion of GIS technology.


Acronym List for this Paper

ALOS  Advanced Land Observing Satellite, Japanese Earth-observation satellite, developed by JAXA (Japan Aerospace Exploration Agency)
ARD  Analysis Ready Data
CEOS  Committee on Earth Observation Satellites
COG  Cloud Optimized GeoTIFF
CSIRO  Commonwealth Scientific and Industrial Research Organisation is an independent agency of the Australian Federal Government responsible for scientific research in Australia.
DGGS  Discrete Global Grid System
EO  Electro-optical systems operate in the optical portion of the electromagnetic spectrum.
EROS  Earth Resources Observation and Science, a USGS Center near Sioux Falls, SD
ERS  Earth Resources Satellite, the first two remote sensing satellites launched by ESA
ESA  European Space Agency
ETM+  Enhanced Thematic Mapper Plus, a sensor onboard the Landsat 7 satellite
FGDC  Federal Geographic Data Committee
GeoTIFF  Georeferenced Tagged Image File Format, a public domain metadata standard which allows georeferencing information to be embedded within a TIFF file
L1  Level-1 Landsat products with the best available processing level for each particular scene
L1G  Level-1 Landsat radiometrically calibrated with systematic geometric corrections using spacecraft ephemeris
L1T  Level-1 Landsat radiometrically calibrated and orthorectified using ground control points and digital elevation model data to correct for relief displacement
LAG  Landsat Advisory Group
LCMAP  Land Change Monitoring, Assessment, and Projection, a USGS initiative implemented at EROS
LIDAR (Lidar, LiDAR)  Light Detection and Ranging, a remote sensing and surveying method that measures distance to a target by illuminating the target with pulsed laser light and measuring the reflected pulses with a sensor
MSS  Multi-spectral scanner, line scanning devices observing the Earth perpendicular to the orbital track on the first five Landsats
NASA  National Aeronautics and Space Administration
OGC®  Open Geospatial Consortium
OLAP  Online analytical processing, use of data organized multi-dimensionally to allow comparisons from different perspectives
OLI  Operational Land Imager, a pushbroom scanner on Landsat 8 that uses a four-mirror telescope with fixed mirrors
SAR  Synthetic-aperture radar, a technique for producing fine resolution images from an intrinsically resolution-limited radar system
SDC  Swiss Data Cube


TIRS  Thermal Infrared Sensor, a system on Landsat that measures land surface temperature in two thermal bands
TOA  Top-of-atmosphere
USGS  U.S. Geological Survey
UTM  Universal Transverse Mercator, a coordinate system which divides the Earth into 60 zones, each 6° of longitude in width
WCPS  Web Coverage Processing Service, a protocol-independent language for the extraction, processing, and analysis of multi-dimensional coverages representing sensor, image, or statistics data
WELD  Web Enabled Landsat Data
WRS  The Worldwide Reference System, a global notation system for Landsat data

Acknowledgements

This paper was approved by the NGAC Landsat Advisory Group (LAG) on March 20, 2018 and adopted by the NGAC as a whole on April 3, 2018. The LAG team developing this paper included Roberta Lenczowski, Roberta E. Lenczowski Consulting (Team Lead); Frank Avila, National Geospatial-Intelligence Agency; Peter Becker, ESRI; Steven Brumby, Descartes Labs; Rebecca Moore, Google Inc.; and Tony Willardson, Western States Water Council. Matthew Hancher (Google, Inc.) and Sara Larsen (Western States Water Council) also contributed to this paper.