Data Mining for Aviation Safety · 2013-08-13

Data Mining for Aviation Safety. Nikunj C. Oza, Ph.D., NASA Ames Research Center, [email protected], http://ti.arc.nasa.gov/people/oza. American Statistical Association, San Francisco Bay Area Chapter, October 21, 2010


Page 1:

Data Mining for Aviation Safety

Nikunj C. Oza, Ph.D., NASA Ames Research Center, [email protected]

http://ti.arc.nasa.gov/people/oza

American Statistical Association, San Francisco Bay Area Chapter, October 21, 2010

Page 2:

Outline

•  The breadth/depth of the problems

•  Combining discrete and continuous sequences: Multiple Kernel Anomaly Detection (MKAD)
   –  Derived from anomaly detection methods on discrete and continuous sequences.

•  Text Mining: classification, topic modeling

•  Ongoing, future work

Page 3:

Aviation Safety Mapping

[Figure: data sources placed along a data continuum running between SUBJECTIVE and OBJECTIVE. Labels: Accident Data; Operational Data (Discrete and Continuous Data); Operational Surveys (Text); Anecdotal Reports; Projections.]

•  Forensics: What Happened?

•  Discovering Causal Factors: Why did it Happen?

•  Predictions: What will Happen Next?

From Irving Statler, Aviation Safety Monitoring and Modeling Project

Page 4:

Data Arises from Molecular to Global Scales

[Figure: scale axis with ticks from 10^-6 to 10^6, spanning Vehicle Technology (Molecules, Materials, Sensors, Software, Engines, Aircraft) and Operations.]

Page 5:

Data Mining in Support of Global Operations

Data sharing; Just culture/safety culture

[Figure: scale axis with ticks from 10^-6 to 10^6, with 10^5 marked.]

Page 6:

Data Mining

•  IVHM/SSAT project goals require leveraging substantial data.
   –  Aircraft-produced: Sensor data, flight-related data (e.g., origin, destination), covering many flights over many years.
   –  Other: Safety reports, trajectories

•  Transform data into useful knowledge
   –  Tools for Detection, Diagnosis, Prognosis, Mitigation
   –  Levels ranging from flight-level to national airspace level

Page 7:

Outline

•  The breadth/depth of the problems

•  Combining discrete and continuous sequences: Multiple Kernel Anomaly Detection (MKAD)
   –  Derived from anomaly detection methods on discrete and continuous sequences.

•  Text Mining: classification, topic modeling

•  Ongoing, future work

Page 8:

•  Need to model the behavior of discrete sensors and switches in an aircraft during flight.

•  Focus is on primary sensors that record pilot actions.

•  The aim is to discover atypical behavior that has possible operational significance.

Page 9:

•  We developed sequenceMiner: Each flight is analyzed as a sequence of events, accounting for
   –  the order in which switches change values
   –  the frequency of occurrence of switches

•  Two Tasks:
   –  Given a group of flights, find flights that are anomalous.
   –  Given an anomalous flight, describe the anomalies and the degree of anomalousness.

•  Method based on techniques used in bioinformatics.

Page 10:

[Figure: example sequences for a cluster prototype C, cluster members S1, S2, S3, and an outlier O.]

–  The sequences $S_i$ in a cluster are dependent on a prototype $C$

–  The outlier $O$ is dependent on the $S_i$

–  Therefore: $P(O \mid C) = \sum_i P(O \mid S_i) \times P(S_i \mid C)$

–  This forms the basis of our objective function F that allows us to describe why sequences are anomalous.

Page 11:

Alignment

Cluster members:    A B _ _ E _
                    A B C _ E F
                    A B C _ _ F
Outlier sequence:   A _ C D E F    (missing a “B”, has extra “D”)

•  P(O|Si) is proportional to the normalized length of the longest common subsequence of O and Si

•  Discovery of missing and extra symbols is done by aligning the sequences

•  Missing and extra symbols are those whose addition to or deletion from an outlier sequence O would make O more normal with respect to the objective function F
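The alignment step can be illustrated with a small sketch. It compares an outlier against a single reference sequence using a longest-common-subsequence traceback, which is a simplification: sequenceMiner as described above scores additions and deletions against the objective function F over a whole cluster. The function names `lcs_table` and `diff_against_reference` are hypothetical.

```python
def lcs_table(a, b):
    """Dynamic-programming table: t[i][j] is the LCS length of a[:i] and b[:j]."""
    t = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            t[i][j] = t[i - 1][j - 1] + 1 if x == y else max(t[i - 1][j], t[i][j - 1])
    return t


def diff_against_reference(outlier, reference):
    """Align `outlier` with `reference` by tracing back through the LCS table.

    Returns (missing, extra): reference symbols with no match in the outlier,
    and outlier symbols with no match in the reference.
    """
    t = lcs_table(outlier, reference)
    i, j = len(outlier), len(reference)
    missing, extra = [], []
    while i > 0 and j > 0:
        if outlier[i - 1] == reference[j - 1]:
            i, j = i - 1, j - 1                # matched symbol, part of the LCS
        elif t[i - 1][j] >= t[i][j - 1]:
            extra.append(outlier[i - 1])       # present only in the outlier
            i -= 1
        else:
            missing.append(reference[j - 1])   # present only in the reference
            j -= 1
    # Whatever is left at the front of either sequence is unmatched.
    extra.extend(reversed(outlier[:i]))
    missing.extend(reversed(reference[:j]))
    return missing[::-1], extra[::-1]


# The slide's example: the outlier lacks "B" and carries an extra "D".
print(diff_against_reference("ACDEF", "ABCEF"))
```

Comparing against one cluster member keeps the sketch short; a fuller version would aggregate over all members of the cluster.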

Page 12:

SequenceMiner

Normalized Longest Common Subsequence (NLCS)

$$\mathrm{NLCS}(s_i, s_j) = \frac{L(h(s_i, s_j))}{\sqrt{L(s_i) \times L(s_j)}}$$

where the functions $h$ and $L$ calculate the longest common subsequence and the length of a sequence, respectively.
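A minimal sketch of the NLCS similarity follows. It assumes the normalization is the square root of the product of the two sequence lengths, so that identical sequences score 1; the function names are illustrative, not taken from the sequenceMiner code.

```python
import math


def lcs_length(a, b):
    """Length of the longest common subsequence, using two rows of the DP table."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def nlcs(a, b):
    """LCS length divided by sqrt(len(a) * len(b)); 0.0 if either sequence is empty."""
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / math.sqrt(len(a) * len(b))


print(nlcs("ABCEF", "ABCEF"))  # identical sequences
print(nlcs("ACDEF", "ABCEF"))  # shares the subsequence A, C, E, F
```

The dynamic program runs in time proportional to the product of the two lengths, which matters for fleet-wide comparisons because every pair of flights needs one evaluation.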

Page 13:

SME opinion: Possible evidence of mode confusion.

Page 14:

MKAD for Fleet-wide Analysis

How to integrate all information in a concise and intuitive manner?

Compression, Feature extraction, Fusion, Anomaly detection

Interactions between discrete sequences D and continuous data streams C

Page 15:

MKAD Framework

[Diagram: three stages labeled Compression, Fusion, Detection. Block labels: Preprocessing discrete data; Sequence kernel on switching; Preprocessing continuous data; SAX representation; Kernel on measurements; Model.]

Page 16:

[Figure: a time series (time axis 0 to 120) and its SAX word, baabccbc.]

First convert the time series to Piecewise Aggregate Approximation (PAA) representation, then convert to symbols.

It takes linear time.

Slide courtesy of Eamonn Keogh and Jessica Lin
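The two-step conversion can be sketched in a few lines of standard-library Python. This is an illustrative version, not the authors' implementation: it assumes the usual SAX recipe of z-normalizing the series, averaging equal-width segments (PAA), and mapping each average to a letter using breakpoints that split a standard normal distribution into equally probable regions. The segment count and alphabet are free parameters chosen here for the example.

```python
from statistics import NormalDist, mean, pstdev


def paa(series, segments):
    """Piecewise Aggregate Approximation: the mean of each of `segments` chunks.

    Assumes segments <= len(series) so that no chunk is empty.
    """
    n = len(series)
    return [mean(series[i * n // segments:(i + 1) * n // segments])
            for i in range(segments)]


def sax(series, segments, alphabet="abc"):
    """Convert a numeric series to a SAX word of length `segments`."""
    mu, sigma = mean(series), pstdev(series)
    z = [(v - mu) / sigma if sigma else 0.0 for v in series]
    # Breakpoints cutting a standard normal into len(alphabet) equiprobable regions.
    cuts = [NormalDist().inv_cdf(k / len(alphabet)) for k in range(1, len(alphabet))]
    # Each PAA value gets the letter of the region it falls into.
    return "".join(alphabet[sum(v > c for c in cuts)] for v in paa(z, segments))


series = [0, 0, 0, 0, 10, 10, 10, 10, 5, 5, 5, 5]
print(sax(series, segments=3))  # low, high, middle
```

Each sample is visited a constant number of times, which is consistent with the linear-time remark on the slide.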

Page 17:

Optimization problem

[Equation: objective function with terms labeled Discrete kernel and Continuous kernel.]

In the objective function, each entry of the discrete kernel and the continuous kernel represents the score obtained using longest common subsequence (LCS) of discrete signals and SAXified continuous signals, respectively.

Unique Linear Decision Boundary

Page 18:

Pairwise Similarity Measure

Kernel on discrete: Normalized Longest Common Subsequence (NLCS)

Kernel on continuous: Inversely proportional to distance between SAX representations of sequences
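As a rough illustration of how two pairwise similarity matrices can be fused, the sketch below builds one kernel matrix from discrete event sequences and one from SAX words, then takes a convex combination with weights that sum to 1. Two assumptions are made for brevity: both kernels use an NLCS score (following the description of LCS scores on discrete and SAXified signals), and the weights are fixed by hand rather than learned. All names and the sample data are hypothetical.

```python
import math


def nlcs(a, b):
    """Normalized LCS: LCS length over sqrt(len(a) * len(b))."""
    if not a or not b:
        return 0.0
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1] / math.sqrt(len(a) * len(b))


def kernel_matrix(sequences):
    """Pairwise NLCS similarities between all sequences."""
    return [[nlcs(a, b) for b in sequences] for a in sequences]


def combine(kernels, betas):
    """Weighted sum of kernel matrices; the weights must sum to 1."""
    if abs(sum(betas) - 1.0) > 1e-9:
        raise ValueError("kernel weights must sum to 1")
    n = len(kernels[0])
    return [[sum(b * k[i][j] for b, k in zip(betas, kernels)) for j in range(n)]
            for i in range(n)]


# Three toy flights: switch sequences and SAX words of one continuous parameter.
discrete = ["ABCEF", "ABCEF", "ACDEF"]
sax_words = ["abca", "abca", "ccba"]
K = combine([kernel_matrix(discrete), kernel_matrix(sax_words)], betas=[0.5, 0.5])
for row in K:
    print([round(v, 3) for v in row])
```

The resulting matrix could then be handed to any kernel method that accepts precomputed similarities.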

Page 19:

General Case, Multiple Kernels

One-class SVM training algorithms require solving the quadratic problem (dual form):

$$\min Q = \frac{1}{2} \sum_{i,j} \alpha_i \alpha_j \sum_{\lambda} \beta_\lambda K_\lambda(x_i, x_j)$$

Subject to:

–  Linear equality constraint

–  Bounds on design variables

–  Control parameter

$\alpha_i$: Lagrange multipliers of the primal QP problem

Also: $\sum_{\lambda} \beta_\lambda = 1$

Page 20:

Anomaly scores

•  Data points with $\alpha_i > 0$ will be the support vectors.

•  Magnitude of h: degree of normality/anomalousness

•  Sign of h (indicator): if negative, outlier; if positive, normal

•  The decision boundary is determined only by the margin and non-margin support vectors obtained by solving the QP problem.
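A hedged sketch of the scoring step: it assumes the usual one-class SVM decision function, h(x) = Σ α_i K(x_i, x) − ρ, summed over the support vectors. The multipliers, the offset `rho`, and the toy kernel below are made-up values for illustration; in practice they come out of the QP solve.

```python
def anomaly_score(x, support_vectors, alphas, rho, kernel):
    """One-class SVM decision value h(x) = sum_i alpha_i * K(x_i, x) - rho."""
    return sum(a * kernel(sv, x) for sv, a in zip(support_vectors, alphas)) - rho


def label(h):
    """Sign of h: negative means outlier, otherwise normal."""
    return "outlier" if h < 0 else "normal"


def toy_kernel(a, b):
    """Hypothetical kernel: 1.0 for identical items, 0.0 otherwise."""
    return 1.0 if a == b else 0.0


support_vectors = ["A", "B"]
alphas = [0.5, 0.5]   # illustrative multipliers, all positive
rho = 0.25            # illustrative offset

for x in ["A", "Z"]:
    h = anomaly_score(x, support_vectors, alphas, rho, toy_kernel)
    print(x, h, label(h))
```

Only points with positive multipliers enter the sum, which is why the boundary depends on the support vectors alone.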

Page 21:

Synthetic Experiment

Simulation data

Type 1 – (Missing event) Flaps were not extended to normal full deployment at landing.

Type 2 - (Extra event) Landing gear was retracted after being deployed on final approach.

Type 3 – (Out of order event) Gear deployed before initial flaps below flaps limit.

Type 4 – (Continuous anomaly) High bank angles or rate of descent below 1,000 ft.
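The three discrete anomaly types can be mimicked on a toy event sequence, as sketched below. The event names and helper functions are hypothetical and only illustrate what "missing", "extra", and "out of order" mean for a switch sequence; they are not the simulation used in the experiment.

```python
def drop_event(seq, event):
    """Type 1 (missing event): remove the first occurrence of `event`."""
    out = list(seq)
    out.remove(event)
    return out


def add_event(seq, event, index):
    """Type 2 (extra event): insert `event` at position `index`."""
    out = list(seq)
    out.insert(index, event)
    return out


def swap_events(seq, first, second):
    """Type 3 (out of order event): exchange the positions of two events."""
    out = list(seq)
    i, j = out.index(first), out.index(second)
    out[i], out[j] = out[j], out[i]
    return out


# Hypothetical nominal approach sequence.
nominal = ["FLAPS_1", "GEAR_DOWN", "FLAPS_FULL", "TOUCHDOWN"]

print(drop_event(nominal, "FLAPS_FULL"))           # flaps never fully extended
print(add_event(nominal, "GEAR_UP", 2))            # gear retracted after deployment
print(swap_events(nominal, "FLAPS_1", "GEAR_DOWN"))  # gear deployed before initial flaps
```

Type 4 concerns continuous parameters such as bank angle and descent rate, so it is not covered by these sequence edits.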

Page 22:

Case study: FOQA anomaly detection

•  Traditional methods cannot detect and monitor these anomalous activities, which may have occurred simultaneously and are heterogeneous in nature.

[Figure label: Touchdown point]

Page 23:

MKAD Summary

Application

1.  Support flight safety experts
2.  Schedule maintenance

Performs

•  Anomaly detection on multivariate mixed attributes, where discrete events may influence the system dynamics, which is reflected in the continuous data streams.

Highlights

•  High detection rate on most operationally significant anomalies in fleet-wide analysis on large datasets

•  Discovers some “unknown unknowns”

Page 24:

Outline

•  Combining discrete and continuous sequences: Multiple Kernel Anomaly Detection (MKAD)
   –  Derived from anomaly detection methods on discrete and continuous sequences.

•  Text Mining: classification, topic modeling

•  Ongoing, future work

Page 25:

Aviation Safety Text Reports

•  Pilots are encouraged to report incidents, concerns, or unsafe conditions. Two repositories:
   –  The NASA-FAA Aviation Safety Reporting System (ASRS).
   –  Each individual airline has an Aviation Safety Action Program (ASAP).

•  Reports can be analyzed to improve aviation safety.

•  Reports need to be correctly and consistently categorized by event type to determine dangerous situations and track trends of incidents and events.

•  Event types are not enough to capture all anomalies. Topic identification is needed to identify new event types and combinations of event types.

Page 26:

Manually Classifying Reports

•  Review and analysis of reports is labor intensive.
   –  In ASRS, reviewers have classified about 135,000 of the 715,000 total reports submitted into 60 overlapping anomaly categories.
   –  Reports range from 0.5 to 4 pages in length.
   –  ASRS reports accrue at about 3,000 per month.
   –  Classifications are not always consistent, due to multiple reviewers and changing experiences.

•  Historical reports must be reread when new categories are added or changed.

•  Current systems rely heavily on human memories for historical perspectives.

Page 27:

ASRS Report Excerpt

JUST PRIOR TO TOUCHDOWN, LAX TWR TOLD US TO GO AROUND BECAUSE OF THE ACFT IN FRONT OF US. BOTH THE COPLT AND I, HOWEVER, UNDERSTOOD TWR TO SAY, 'CLRED TO LAND, ACFT ON THE RWY.' SINCE THE ACFT IN FRONT OF US WAS CLR OF THE RWY AND WE BOTH MISUNDERSTOOD TWR'S RADIO CALL AND CONSIDERED IT AN ADVISORY, WE LANDED...

Note: Industry-specific vocabulary and abbreviations.

Page 28:

Automatic Categorization of ASRS Reports

Sample of 60 ASRS Anomaly Categories:

•  Non Adherence to ATC Clearance
•  Critical Equipment Problem
•  Runway Incursion
•  Landing without a Clearance
•  Air Space Violation
•  Altitude Deviation Overshoot
•  Fumes
•  Altitude Deviation Undershoot
•  Ground Encounter, Less Severe
•  ...

ASRS Report Excerpt:

JUST PRIOR TO TOUCHDOWN, LAX TWR TOLD US TO GO AROUND BECAUSE OF THE ACFT IN FRONT OF US. BOTH THE COPLT AND I, HOWEVER, UNDERSTOOD TWR TO SAY, 'CLRED TO LAND, ACFT ON THE RWY.' SINCE THE ACFT IN FRONT OF US WAS CLR OF THE RWY AND WE BOTH MISUNDERSTOOD TWR'S RADIO CALL AND CONSIDERED IT AN ADVISORY, WE LANDED…

A single report can be in multiple categories.

Page 29:

Mariana Statistical Model Optimization Flowchart

1.  Start at C0, γ0, w0.
2.  Train SVM; build model.
3.  Test model with validation data.
4.  Calculate AUROC.
5.  Compare to ‘best’ AUROC. Better than previous ‘best’? If yes, update ‘best’ AUROC and save parameters.
6.  Converged to a solution? If yes, save model and quit. If no, apply the statistical optimization method, move to new C, γ, w, and return to step 2.

AUROC: area under the receiver operating characteristic (ROC) curve
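The loop in the flowchart can be sketched generically. The slide does not name the statistical optimization method, so this example substitutes a plain random perturbation search with a patience-based stopping rule as a stand-in for the convergence test. `evaluate` is a hypothetical callback standing for "train SVM, test on validation data, calculate AUROC"; here it is replaced by a toy function so the sketch runs on its own.

```python
import random


def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half. Assumes at least one positive and one negative label.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def search(evaluate, start, propose, patience=20, seed=0):
    """Keep the best-scoring parameters; stop after `patience` failed moves."""
    rng = random.Random(seed)
    best_params, best_score = start, evaluate(start)
    stale = 0
    while stale < patience:
        candidate = propose(best_params, rng)      # move to new C, gamma, w
        score = evaluate(candidate)                # train, validate, score
        if score > best_score:                     # better than previous best?
            best_params, best_score, stale = candidate, score, 0
        else:
            stale += 1
    return best_params, best_score


def toy_evaluate(params):
    """Stand-in for model training: peaks at C = 3 (purely illustrative)."""
    return 1.0 - abs(params["C"] - 3.0) / 10.0


def toy_propose(params, rng):
    """Perturb C by a small random step."""
    return {"C": params["C"] + rng.uniform(-1.0, 1.0)}


start = {"C": 1.0}
best_params, best_score = search(toy_evaluate, start, toy_propose)
print(best_params, round(best_score, 3))
print(auroc([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.1]))
```

Swapping `toy_evaluate` for a real training routine and `toy_propose` for a more principled proposal step would leave the control flow unchanged.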

Page 30:

Mariana vs. Methods using Natural Language Processing

The Mariana algorithm can give substantially better results than other methods (green box), even when processing raw text.

[Figure: area under ROC curve (axis 0.0 to 1.0) by category* for Mariana w/ raw text only, Support Vector Machine w/NLP, Decision Tree w/NLP, Random Forest w/NLP, Logistic Regression w/NLP, and Linear Discriminant Analysis w/NLP.]

Page 31:

Outline

•  Combining discrete and continuous sequences: Multiple Kernel Anomaly Detection (MKAD)
   –  Derived from anomaly detection methods on discrete and continuous sequences.

•  Text Mining: classification, topic modeling

•  Ongoing, future work

Page 32:

Subspace Approximation

•  Goals
   –  System often assumed explainable by simple model plus noise. Construction of simple approximation may reduce noise.
   –  Derive small, key set of features to explain system behavior.
   –  Lower storage requirements.

•  Issue
   –  Derived features more intuitive if they are positive, reflecting presence of key components that add up to characterize the item of interest.

Page 33:

Non-negative Matrix Factorization

•  NMF finds factors representing parts of faces (e.g., nose, mouth). Easy to interpret.

•  Vector Quantization and PCA find holistic representations (face +/- other stuff). Difficult to interpret.

Page 34:

Non-negative Matrix Factorization

Problem: Given a nonnegative matrix $A \in \mathbb{R}^{m \times n}$ and a positive integer $k < \min\{m, n\}$, find nonnegative matrices $W \in \mathbb{R}^{m \times k}$ and $H \in \mathbb{R}^{k \times n}$ to minimize a function of the difference between $A$ and $WH$.

$WH$ is a nonnegative matrix factorization. Columns of $W$ represent $k$ basis vectors captured from $n$ examples having $m$ feature values each. $H$ gives factor-example weights. $k$ is problem-specific and chosen by the user.
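To make the factorization concrete, here is a small sketch using the well-known multiplicative update rules of Lee and Seung. It assumes the objective is the squared Frobenius error between A and WH, which is a common choice but is an assumption here. The code is written in plain Python lists for self-containment; function names, the iteration count, and the sample matrix are illustrative.

```python
import random


def matmul(x, y):
    """Matrix product of two list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def transpose(x):
    return [list(col) for col in zip(*x)]


def frobenius_error(a, w, h):
    """Sum of squared entries of A - WH."""
    wh = matmul(w, h)
    return sum((a[i][j] - wh[i][j]) ** 2
               for i in range(len(a)) for j in range(len(a[0])))


def nmf(a, k, iters=200, seed=0, eps=1e-9):
    """Factor nonnegative A (m x n) into W (m x k) and H (k x n)."""
    rng = random.Random(seed)
    m, n = len(a), len(a[0])
    w = [[rng.random() + eps for _ in range(k)] for _ in range(m)]
    h = [[rng.random() + eps for _ in range(n)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T A) / (W^T W H), elementwise
        wt = transpose(w)
        num, den = matmul(wt, a), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n)]
             for i in range(k)]
        # W <- W * (A H^T) / (W H H^T), elementwise
        ht = transpose(h)
        num, den = matmul(a, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(m)]
    return w, h


# Four examples' worth of features arranged as a 4 x 3 nonnegative matrix.
A = [[1, 0, 2], [0, 1, 1], [2, 0, 4], [0, 3, 3]]
W, H = nmf(A, k=2)
print(round(frobenius_error(A, W, H), 6))
```

Because every update multiplies nonnegative numbers, W and H stay nonnegative throughout, which is what makes the resulting factors read as additive parts.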

Page 35:

NMF for classification

•  NMF factors can be used for clustering, classification, regression.

•  Classification of ASAP reports into one or more contributing factors.

Page 36:

Sample (ASRS) Basis Vectors

[Table: Basis Vector 1 and Basis Vector 2, each shown for runs 1, 2, and 3.]

Page 37:

ASRS Classification Results

Page 38:

Outline

•  Combining discrete and continuous sequences: Multiple Kernel Anomaly Detection (MKAD)
   –  Derived from anomaly detection methods on discrete and continuous sequences.

•  Text Mining: classification, topic modeling

•  Ongoing, future work

Page 39:

Ongoing, Future Work

•  Anomaly detection over discrete, continuous, and text (utilize when available, don’t penalize when not).

•  Anomaly cause/precursor identification.

•  Prediction over multiple scales: within flight, across flights, across fleets over years.

Page 40:

How Does Human Performance Affect Aviation Safety?

Understanding pilot fatigue

[Diagram*: data exchange involving the NASA Data Mining Lab (Mountain View, CA) and Luton, UK.]

•  To NASA Data Mining Team: Daily data, 300 GB, flights per month. Physiology, text, cockpit, engines and flight parameters, flight path, network information.

•  To EasyJet: Near real-time decision support

•  Physiological Measurements*

•  Sample Text Report: JUST PRIOR TO TOUCHDOWN, LAX TWR TOLD US TO GO AROUND BECAUSE OF THE ACFT IN FRONT OF US. ...

•  Multiple Kernel Anomaly Detection (KDD 2010)

*Diagram for notional purposes only

Page 41:

Mining Heterogeneous Data is the Key

[Diagram: a kernel K(xi, xj) drawing on four data types: Discrete, Continuous, Textual, Networks.]

•  Primary Source: Aircraft. Can answer what happened in an aircraft during an Aviation Safety Incident.

•  Primary Source: Humans. Can answer why an Aviation Safety Incident happened.

•  Primary Source: Radar data. Can answer what happened in the National Airspace during an Aviation Safety Incident.

Sample Text Report: JUST PRIOR TO TOUCHDOWN, LAX TWR TOLD US TO GO AROUND BECAUSE OF THE ACFT IN FRONT OF US. ...

Massive Data: Archives growing at 100K flights per month.

Page 42:

The Anatomy of an Aviation Safety Incident

[Figure: a timeline of states and transitions (State k-1, State k, State k+1, ..., State n-1, State n, State n+1, linked by Transition k-1 through Transition n), with regions labeled Safe States, Anomalous States, Compromised States, and Incident or Accidents.]

Scenario = (Context, Behavior, Outcome)

From Irving Statler, Aviation Safety Monitoring and Modeling Project

Page 43:

DASHlink: disseminate. collaborate. innovate.

https://dashlink.arc.nasa.gov/

DASHlink is a collaborative website designed to promote:

•  Sustainability
•  Reproducibility
•  Dissemination
•  Community building

Users can create profiles

•  Share papers, upload and download open source algorithms
•  Find NASA data sets.

Coming Soon… DASHlink 2.0.

Join DASHlink!

Page 44:

Conference on Intelligent Data Understanding 2010

Call for Participation

•  Conference focused on theory and applications of data mining and machine learning to Earth Science, Space Science, Engineering Systems

•  Location: Computer History Museum, Mountain View, CA

•  Date: October 5-6, 2010

•  Registration: Free ✔

•  Steering Committee
   –  Ashok Srivastava (chair)
   –  Stephen Boyd
   –  Jiawei Han
   –  Eamonn Keogh
   –  Vipin Kumar
   –  Zoran Obradovic
   –  Nikunj Oza
   –  Raghu Ramakrishnan
   –  Ramasamy Uthurusamy
   –  Ramasubbu Venkatesh
   –  Xindong Wu

•  Program Chairs: Nitesh Chawla and Philip Yu

Page 45:

References

•  S. Budalakoti, A. N. Srivastava, and M. E. Otey, Anomaly detection and diagnosis algorithms for discrete symbol sequences with applications to airline safety, in IEEE SMC Part C, 39(1), 2008.

•  Santanu Das, Kanishka Bhaduri, Nikunj C. Oza, and Ashok N. Srivastava, nu-Anomica: A Fast Support Vector Based Novelty Detection Technique, in ICDM 2009.

•  Santanu Das, Bryan L. Matthews, Ashok N. Srivastava, and Nikunj C. Oza, Multiple kernel learning for heterogeneous anomaly detection: algorithm and aviation safety case study, in KDD 2010.

•  Nikunj C. Oza, J. Patrick Castle, and John Stutz, Classification of Aeronautics System Health and Safety Documents, in IEEE SMC Part C, 39(6), 2009.