A Tool for Verification of Big-Data Applications

30
DICE Horizon 2020 Research & Innovation Action Grant Agreement no. 644869 http://www.dice-h2020.eu Funded by the Horizon 2020 Framework Programme of the European Union A Tool for Verification of Big-Data Applications Jul 21 th , 2016 QUDOS 2016 Saarbrücken, Germany M.M. Bersani, F. Marconi, M.G. Rossi Politecnico di Milano Milan, Italy Madalina Erascu Institute e-Austria Timisoara & Western University of Timisoara Timisoara, Romania

Transcript of A Tool for Verification of Big-Data Applications

DICEHorizon2020Research&InnovationActionGrantAgreementno.644869http://www.dice-h2020.eu

FundedbytheHorizon2020FrameworkProgrammeoftheEuropeanUnion

AToolforVerificationofBig-DataApplications

Jul21th,2016

QUDOS2016Saarbrücken,Germany

M.M.Bersani,F.Marconi,M.G.RossiPolitecnicodiMilano

Milan,Italy

MadalinaErascuInstitutee-AustriaTimisoara&WesternUniversityofTimisoara

Timisoara,Romania

Ourworkataglance

o Approachandtoolfortheautomatedverificationoftopology-baseddata-intensiveapplications.§ Based(sofar)ontemporallogicmodel§ Performsautomatedtransformationfromhighlevelapplicationdescriptiontoformalmodel

§ Enablesverificationofsafetyproperties

2

Roadmap

§ Context• QualityassuranceinDIA

§ ResearchDesign• Researchquestion• Ourapproach

§ Conclusions• Contributions• Futureworks

3

CONTEXT

QualityAnalysisandVerificationfordata-intensiveapplications

4

FormalVerification

o GivenaModelMandaPropertyspecificationP,verificationcheckswhetherPholdsinM.

o MandPcanbeexpressedinmanydifferentways§ variouskindsofautomata(operationalmodels)§ variouskindsoflogics(descriptivemodels)

5

Data-IntensiveApplications(DIA)

6

DICEProject

o Horizon2020Research&InnovationAction(RIA)§ Quality-AwareDevelopmentforBigDataapplications§ Feb2015- Jan2018,4MEurosbudget§ 9partners(Academia&SMEs),7EUcountries

7

QualityDimensionsinDICE

o Reliability

o Efficiency

o Safety&Privacy

8

§ Availability§ Fault-tolerance

§ Performance§ Costs

§ Verification§ Dataprotection

BigDataTechnologies

Cloud(Priv/Pub)`

9

DICEIDE

Profile

Plugins

Sim Ver Opt

DPIM

DTSM

DDSMTOSCAMethodology

Deploy Config Test

Mon

AnomalyTrace

Iter.Enh.

DataIntensiveApplication(DIA)

Cont.Int. FaultInj.

WP4

WP3

WP2

WP5

WP1 WP6- Demonstrators

OurpositioninginDICEframework(1)

OurpositioninginDICEframework(2)

10FeaturingtheDICEH2020EUProject

DICE DPIM Meta-Model

Scenario: Tech. Comparison

DICE DTSM Meta-Model

Big-Data Technological Development

DICE DDSM Meta-Model

Big-Data PhysicalAssetsDICE

DMON

Operations

Spark

DICE Process Views

HadoopMR

DICE TOSCA Meta-Model

Big-Data Technological Deployment (TOSCA)

DICE DTSM Meta-Model

Big-Data Technological Logic

Safety Verification with ZOT

Reliability and Resource Management

with GreatSPNHadoop MR Monitoring

StormMonitoring

LEGENDASoftware

Architecture ViewsModel-to-Model Transformation

Storm Oryx 2

Scenario:Deployment

Safety Reliability

Resource Management

Config. Optimization

SparkMonitoring

Oryx 2Monitoring

Configuration Optimisation with BO4CO

JSON or YAML File Exchange

Model-to-Text Transformation

TOSCA *.yaml Blueprint

RESEARCHDESIGN

QualityAnalysisandVerificationfordata-intensiveapplications

11

Researchquestion

“Howcanweverifysafetypropertiesofadata-intensiveapplication?”

12

Stateoftheart

o Formalverificationofdistributedsystemsisamajorresearchareainsoftwareengineering

o FewworkstryingtoaddressformalverificationinthecontextofDIA§ Mainfocusonverifyingapplication-independentpropertiesrelatedtospecificframeworks• ReliabilityandloadbalancingofMapReduce• ValidityofmessagingflowinMapReduce

§ nomodelingandverificationofapplication-dependentproperties

o VerificationtoolshavebeenusedasverificationenginestobuildformalverificationtechniquesforUMLmodels§ Fewofthemdealwithreal-timeconstraints.§ Mainlyfocusedonfunctionalrequirements.

13

OurApproach

o Focusonaspecificsetoftechnologies§ Topology-basedstreamingapplicationsà ApacheStorm

o Identifysafetyissueso Deviseaformalmodel

§ Havinganappropriatelevelofabstraction§ Allowingtocapturemeaningfulsystembehaviorandproperties

§ Usingaformalismthatenablesautomaticverificationo Defineatool-supportedmechanismforformalverification§ Startingfromhighlevelapplicationdescription(annotatedUML)

14

ApacheStorm

o OpenSourceDistributedStreamProcessingSystem

o Analytics,LogEventprocessing,etc..o Reliability,at-least-onesemanticsoWideadoptioninproductionoMainconcepts

§ Streams§ Topologies

- 15 -

StormApplications

o ApplicationsdefinedbymeansofTopologies,graphsofcomputationscomposedof:§ Spouts

• Sourcesofdatastreams(tuples)

§ Bolts• Calculate,Filter,Aggregate,Join,Talktodatabases

- 16 -

SafetyIssues

o Importantrequirementsforstreamingapplications§ Latency§ Throughput

o Criticalpoints§ incorrectdesignoftimingconstraints§ nodefailures

o mightcause§ latencyinprocessingtuples§ monotonicgrowthofthesizeofusedmemory(queues).

17

DICEVerificationTool

oWewantto§ Verifywhetheratopologyreachesanunwantedconfiguration• e.g.,whereboltsarenotabletoprocessincomingtuplesontime

§ Lettheuserspecifythetopologybymeansofhighlevelmodels(UML)

18

D-VerT - DICEVerificationTool

19

DICEDTSM::StormUMLprofile

20

D-VerT - DICE-profiledUMLClassDiagram

21

DTSM2Jsonmodule

o ReliesonEclipseUML2 Javalibraryo “Navigates”DTSMclassdiagramandextracttopologystructure

andinformationo GathersverificationoptionfromEclipselaunchconfigurationo MapstopologycomponentstoJavaobjectso DirectlyconvertsJavaobjectstoJSONobjectviagson library

22

Json2MC- Module

o PythoncomponentbasedonJinja2templatingengine

o GeneratesFormalModelbasedonthecontentofJSONfileandontheselectedtemplate(TLorFOL).

23

————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————

————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————

TL Model Zot

JSON2MC

<—> <—> <———> ———— ————————— </———> </—> <————————> <———————> </——————> <—> </—> </————————></—>

JSON

————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————

TL ModelTemplate

————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————

FOL ModelTemplate

————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————

————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————

FOL Model Cubicle

X

WIP

VerificationApproacheso BoundedSatisfiabilityChecking(BSC)

§ Input:• Temporallogicformula(Model)• NegatedPropertyovertime

§ Outcome:• SATà counterexampletrace• UNSATà Propertyholdsfortheconsideredtimebound

§ WeuseZotverificationtool(https://github.com/fm-polimi/zot)o ReachabilityChecking(WIP)

§ ModeldefinedbyFOLArraybasedsystem• Setofinitialstates andtransitions• Formuladefiningundesiredstates(Negatedproperty)

§ Outcome:• UNSAFEà Traceshowingthatundesiredstatearereachablefrominitialstates

• SAFEà Noundesiredstatecanbereache frominitialstates

24

D-VerT – Outputtrace

o Whenatleastonequeuegrowswithanunboundedtrend§ aninfiniteultimatelyperiodicmodelisfound§ OutputParserprovidesgraphicalcounterexampletrace

25

CONCLUSIONS 26

Contributions

oWeenabledautomaticverificationontopology-based streamingapplicationsby§ Definingaformalmodelbasedontemporallogic§ definingautomaticmechanismsfortranslatingtotheformalmodelfromahighleveldescription.

§ extendingZotVerificationtooltosupporttheformalismandcarryoutBSConit

27

Preliminaryresults

o Validationthroughopensourceandindustrialusecases§ Meaningfulqualitativeresultsinidentifyingcriticalpointsintopologydesign

§ Executiontimestronglydependsonthesizeofthetopologyandontheconfigurationsofsinglecomponents

28

http://dice-project.github.io/DICE-Verification/

OngoingandFutureworks

o Identificationandverificationoffurtherproperties§ PrivacyandSecurity

o Toolimprovementso Modelingdifferenttechnologies(Spark,CEP,Tez)o DevelopingFOLmodelo Newtheoreticalresultsonthecorrectnessandcompletenessoftheformalanalysis

29

Questions?

30

Thankyou!