SAMS: Data and Text Mining for Early Detection of...
SAMS: Data and Text Mining for Early Detection of Alzheimer’s Disease
November, 2016
Dr Christopher Bull
Aim of talk
• What is SAMS
• Data Capture
  – Problems and solutions to acquiring this type of text/data
• NLP
  – Tools used
    • Existing
    • Bespoke
• Reflections
Who am I?
Dr Christopher Bull
[email protected] @ChrisBull88
[Insert dashing photo here]
• 2011 – PhD
• 2014 – SAMS (PDRA)
• 2016 – Mobile Age (PDRA)
------------------------------------------
• Software Engineering
• Education/Pedagogy
• Digital Health Technologies
SAMS Overview
Problem
• National Dementia Strategy (2009): early (‘timely’) diagnosis
• Only about 50% of people with dementia currently receive a diagnosis
• Diagnosis is often late - moderate or severe stages
What is Alzheimer’s Disease?
• Alzheimer’s is the most common cause of dementia (estimated 60%-80% of cases)
  – Dementia “describes symptoms that occur when the brain is affected by certain diseases or conditions”
• Symptoms include:
  – memory loss
  – difficulties with:
    • thinking
    • problem-solving
    • language
• Ultimately fatal
Source: Alzheimer’s Society
SAMS
Goal: Explore technology-dependent proxy markers of Alzheimer’s Disease
Aims:
• Non-intrusive capture of computer use
• Mine the data for trends and patterns
• Infer longitudinal changes in cognitive health
Team
• Professor Pete Sawyer, School of Computing and Communications, Lancaster University
• Dr Paul Rayson, School of Computing and Communications, Lancaster University
• Dr Christopher Bull, School of Computing and Communications, Lancaster University
• Professor Alistair Sutcliffe, School of Computing and Communications, Lancaster University
• Professor Alistair Burns, National Clinical Director for Dementia in England; Institute of Brain, Behaviour and Mental Health, University of Manchester
• Dr Iracema Leroi, Institute of Brain, Behaviour and Mental Health, University of Manchester
• Gemma Stringer, Institute of Brain, Behaviour and Mental Health, University of Manchester
• Dr Samuel Couth, Institute of Brain, Behaviour and Mental Health, University of Manchester
• Professor John Keane, School of Computer Science, University of Manchester
• Dr Ann Gledson, School of Computer Science, University of Manchester
• Professor Clive Ballard, Wolfson Centre for Age-Related Diseases, King's College London
Data Flows
Current Status
• Project funding ended September 2016
• On-going analysis
My Role in SAMS
…and Data Collection
My Role
• Data capture software
  – Software design/implementation
    • SAMS Manager
    • Browser extensions
  – Maintenance (obviously)
• Text Mining
  – Text extraction (reconstruction)
  – Reusing existing NLP pipeline (Wmatrix; UCREL)
  – Implementing extensions to pipeline for specific heuristics
• General Project Support (Team & Participants)
• Consider challenges
Challenges
• Volatility of participant computers
  – Unexpected updates
  – Varying shutdown procedures
  – Various software setups (anti-virus etc.)
• Weak-performing computers (and the need not to monopolise valuable resources)
  – Again, various hardware/software setups
• Ethical challenges
  – Privacy/Security
• Novel monitoring approaches
• Internet Explorer *sigh*
• Win10 roll-out mid project
Abstract Architecture (Data Collection)
Browser Extensions
Desktop/Application Monitor Processes
Encrypt Logs
Secure SAMS Server
Manager Process
Collecting context, not just raw data
Desktop/Application Monitor Processes
• C# input event listeners
• Variety of mouse, keyboard.
• Windows Automation API: UI Automation (UIA)
• Observe UI elements (and properties) a user interacts with.
• Provides context behind events.
Desktop/App Monitor
*Work of Dr Ann Gledson, Mancs
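The idea of logging context alongside each raw input event can be sketched as a small record type. This is a minimal illustration in Python, not the project's C# code; the class name `InputEvent` and the fields `window_title`, `element_name` and `control_type` are hypothetical, loosely modelled on the kind of properties a UI automation layer exposes.

```python
import json
import time
from dataclasses import dataclass, asdict, field


@dataclass
class InputEvent:
    """One raw input event plus the UI context it occurred in (hypothetical schema)."""
    kind: str  # e.g. "mouse_click" or "key_down"
    timestamp: float = field(default_factory=time.time)
    # Context fields: assumptions for illustration, not the SAMS log format.
    window_title: str = ""
    element_name: str = ""
    control_type: str = ""

    def to_log_line(self) -> str:
        """Serialise the event as one JSON line, ready for caching and encryption."""
        return json.dumps(asdict(self), sort_keys=True)


# A click on a "Save" button: the raw event alone would only say "mouse_click".
event = InputEvent(kind="mouse_click", timestamp=0.0,
                   window_title="Document1", element_name="Save",
                   control_type="Button")
line = event.to_log_line()
```

Keeping the context in the same record as the event means later analysis does not have to reconstruct which UI element a click or keystroke belonged to.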
Browser Extensions
Browser Extension
Web page black/whitelist (e.g. no https:// unless predefined)
JS DOM parsing (text fields and interactive elements)
JS event listeners & context identifier (Click, Mouse-Move, Focus etc.)
Log message caching (volatile)
Encryption
Write log files
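The black/whitelist step can be sketched as a single predicate. This is one possible reading of "no https:// unless predefined", written in Python for brevity rather than the extension's JavaScript; the function name `should_monitor` and the example hosts are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical lists for illustration only.
WHITELIST = {"www.example.org"}    # hosts explicitly allowed, even over https
BLACKLIST = {"bank.example.com"}   # hosts never monitored


def should_monitor(url: str, whitelist=WHITELIST, blacklist=BLACKLIST) -> bool:
    """Return True if events on this page may be logged."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if not host or host in blacklist:
        return False  # unknown or explicitly excluded page
    if host in whitelist:
        return True   # predefined page, allowed regardless of scheme
    # Assumed default rule: plain http is monitored, https is not.
    return parts.scheme == "http"
```

Checking the blacklist first makes exclusion win over every other rule, which is the conservative choice when privacy is the concern.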
Browser Monitoring - Challenges
• Context to events
• Constantly changing or dynamic DOM
Manager/Uploader
• Process management
• Server communication
• Remote updating
• Log message caching and encryption
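Caching log messages and encrypting them before they are written could look roughly like the sketch below. It assumes a pluggable `encrypt` callable and a `sink` that receives the encrypted bytes; the class name `LogCache` and both parameters are hypothetical, and the byte-reversing stand-in used in the demo is deliberately not real encryption.

```python
from typing import Callable, List


class LogCache:
    """Buffer log lines in memory and flush them as one encrypted blob."""

    def __init__(self, encrypt: Callable[[bytes], bytes],
                 sink: Callable[[bytes], None], max_lines: int = 100):
        self._encrypt = encrypt
        self._sink = sink
        self._max_lines = max_lines
        self._lines: List[str] = []

    def add(self, line: str) -> None:
        """Cache one log line; flush automatically when the cache is full."""
        self._lines.append(line)
        if len(self._lines) >= self._max_lines:
            self.flush()

    def flush(self) -> None:
        """Encrypt everything cached so far and hand it to the sink."""
        if not self._lines:
            return
        payload = "\n".join(self._lines).encode("utf-8")
        self._sink(self._encrypt(payload))
        self._lines.clear()


# Demo wiring. The "cipher" only reverses bytes so the example runs with the
# standard library; it is NOT encryption and stands in for a real cipher.
written: List[bytes] = []
cache = LogCache(encrypt=lambda b: b[::-1], sink=written.append, max_lines=2)
cache.add("click")
cache.add("focus")    # second line reaches max_lines and triggers a flush
cache.add("keydown")  # stays cached until the next flush
```

Because the cache is volatile, anything not yet flushed is lost on an unexpected shutdown, so a small `max_lines` trades some overhead for less data loss.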
Manager (2)
Early UI
Project Support
• Participant Status Checker
  – For clinical & Tech teams
  – + Android app
• Phone support
  – Clinical Team
  – Participants
• Participant visits (Installs)
Existing Study(s)
Nun Study:
• Measures obtained from autobiographies
• written over a 60-year span (age 22 to 83).
Grammatical complexity
  – No dementia: mean 4.78; declined .04 units per year
  – Dementia: mean 3.86; declined .03 units per year
Idea density
  – No dementia: mean 5.35 propositions per 10 words; declined .03 units per year
  – Dementia: mean 4.34 propositions per 10 words; declined .02 units per year
Propositional Idea Density (P-density)
• “Idea density […] is the number of expressed propositions divided by the number of words. In terms of semantics, idea density is a measure of the extent to which the speaker is making assertions (or asking questions) rather than just referring to entities”
  – “Automatic measurement of propositional idea density from part-of-speech tagging” (Brown et al, 2008)
• Existing Implementation
  – CPIDR (Computerized Propositional Idea Density Rater)
  – (pronounced “spider”)
  – only tool to automate this*
*At time of starting SAMS
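The definition above (propositions divided by words) can be illustrated with a rough part-of-speech approximation. The sketch below counts verbs, adjectives, adverbs, prepositions and conjunctions as propositions, which is only a first approximation: it omits the adjustment rules a tool like CPIDR applies, it is not the SAMS implementation, and it assumes Penn Treebank-style tags on pre-tagged input rather than CLAWS output. The function name `idea_density` is hypothetical.

```python
from typing import Iterable, Tuple

# Penn Treebank tag prefixes treated as proposition-bearing in this sketch:
# verbs, adjectives, adverbs, prepositions and conjunctions (an assumption).
PROPOSITION_TAG_PREFIXES = ("VB", "JJ", "RB", "IN", "CC")


def idea_density(tagged: Iterable[Tuple[str, str]]) -> float:
    """Approximate P-density as proposition-like tokens divided by words."""
    words = 0
    propositions = 0
    for token, tag in tagged:
        if not any(ch.isalnum() for ch in token):
            continue  # punctuation is not counted as a word
        words += 1
        if tag.startswith(PROPOSITION_TAG_PREFIXES):
            propositions += 1
    return propositions / words if words else 0.0


# Hand-tagged example sentence (tags supplied by hand, not by a tagger).
sample = [("The", "DT"), ("old", "JJ"), ("dog", "NN"), ("slept", "VBD"),
          ("quietly", "RB"), ("in", "IN"), ("the", "DT"), ("garden", "NN"),
          (".", ".")]
density = idea_density(sample)
```

In the example, four of the eight words (old, slept, quietly, in) count as propositions, giving a density of 0.5.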
Kusari (Toolchain manager)
“Toolchain and data dependency manager for use with conventional NLP toolchains”
Dr Steve Wattam
https://delta.lancs.ac.uk/Steve/kusari
https://delta.lancs.ac.uk/Steve/kusari-links
Toolchain
• Spelling Variation: VARD (ucrel.lancs.ac.uk/vard/), Java
• Part Of Speech Tagger: CLAWS (ucrel.lancs.ac.uk/claws/), C
• Semantic Tagger: USAS (ucrel.lancs.ac.uk/usas/), C
• Frequency Lists: Tmatrix (ucrel.lancs.ac.uk/wmatrix/), C
• SAMS software: SNOWCAT (delta.lancs.ac.uk/SAMS/SNOWCAT), Java
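A toolchain like the one listed above has to run its tools in dependency order. The sketch below shows that idea with Python's standard `graphlib`; it does not use or imitate Kusari's own interface, and the dependency edges in `DEPENDS_ON` are assumptions made for illustration (only SNOWCAT's use of Tmatrix and USAS output is stated in the talk).

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Each tool maps to the set of tools whose output it is assumed to need.
# These edges are illustrative assumptions, not taken from Kusari.
DEPENDS_ON = {
    "VARD": set(),
    "CLAWS": {"VARD"},
    "USAS": {"CLAWS"},
    "Tmatrix": {"USAS"},
    "SNOWCAT": {"Tmatrix", "USAS"},
}

# static_order() yields every tool after all of its dependencies.
run_order = list(TopologicalSorter(DEPENDS_ON).static_order())
```

Declaring dependencies rather than a fixed sequence means a stage can be re-run, or a new metric tool added, without rewriting the whole pipeline script.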
SNOWCAT
Sams aNalysis of Output from Wmatrix for the Cognitive Assessment of Text
• Input
  – Tmatrix (FQLs)
  – USAS (Sem)
• Output
  – CSV of metrics
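The frequency-list-in, CSV-out shape can be sketched as follows. This is an illustration only, not SNOWCAT itself (which is Java): the input is assumed to be a plain word-to-count mapping rather than the actual Tmatrix file format, and the function names `basic_metrics` and `to_csv` are hypothetical.

```python
import csv
import io
from typing import Dict


def basic_metrics(freq: Dict[str, int]) -> Dict[str, float]:
    """Derive a few simple lexical metrics from a word-frequency list."""
    total = sum(freq.values())                       # tokens
    vocab = len(freq)                                # types
    once = sum(1 for n in freq.values() if n == 1)   # words occurring once
    return {
        "Total Words": total,
        "Vocabulary size": vocab,
        "Type:Token (ratio)": round(vocab / total, 3) if total else 0.0,
        "Words occurring once": once,
    }


def to_csv(metrics: Dict[str, float]) -> str:
    """Write one 'metric name, value' row per metric."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for name, value in metrics.items():
        writer.writerow([name, value])
    return buf.getvalue()


# Toy frequency list (made-up data for the example).
freq_list = {"the": 3, "cat": 1, "sat": 1, "on": 1, "mat": 2}
metrics = basic_metrics(freq_list)
csv_text = to_csv(metrics)
```

For the toy list this gives 8 tokens, 5 types, a type:token ratio of 0.625 and 3 words occurring once.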
SNOWCAT: Sample Output (1/2)
• Total Words (MWE), 26278
• Total Words, 27787
• Vocabulary size (MWE), 3533
• Vocabulary size, 3444
• Type:Token (ratio; MWE), 0.134
• Type:Token (ratio), 0.124
• Type:Token (normalised ratio), 0.403
• Words occurring once (MWE), 1842
• Adjective (total; MWE), 1288
• Adjective (ratio; MWE), 0.049
• Noun (total; MWE), 4280
• Noun (ratio; MWE), 0.163
• …
SNOWCAT: Sample Output (2/2)
• Pronoun (total; MWE), 2672
• Pronoun (ratio; MWE), 0.102
• Verb (total; MWE), 6135
• Verb (ratio; MWE), 0.233
• Content words (total; MWE), 13757
• Content words (ratio; MWE), 0.524
• Filler words (total; MWE), 183
• Filler words (ratio; MWE), 0.007
• Noun:Verb (ratio; MWE), 0.698
• Mean Length of Utterance, 27.653
• VARD Variant (total), 69
• VARD Variant (ratio), 0.003
• Propositional Idea Density, 0.565
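Most of the ratios in the sample output can be reproduced from the totals. The check below assumes each ratio is the total divided by Total Words (MWE) and rounded to three decimals; that denominator is inferred from the numbers rather than stated in the slides, and the variable names are hypothetical.

```python
# Totals copied from the sample output.
TOTAL_WORDS_MWE = 26278
totals = {
    "Vocabulary size (MWE)": 3533,
    "Adjective": 1288,
    "Noun": 4280,
    "Pronoun": 2672,
    "Verb": 6135,
    "Content words": 13757,
    "Filler words": 183,
    "VARD Variant": 69,
}

# Assumed rule: ratio = total / Total Words (MWE), rounded to 3 decimals.
ratios = {name: round(count / TOTAL_WORDS_MWE, 3)
          for name, count in totals.items()}

# Noun:Verb is the ratio of the two totals rather than a share of all words.
noun_verb = round(totals["Noun"] / totals["Verb"], 3)
```

Under that assumption the computed values match the listed ones, for example 4280 / 26278 ≈ 0.163 for nouns and 4280 / 6135 ≈ 0.698 for Noun:Verb. The normalised type:token ratio (0.403) is not reproduced here because the talk does not say how it is normalised.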
Early (unpublished) Results
• Validate P-Density (comparison to CPIDR tool)
• Uses novelist study to explore usefulness of SNOWCAT metrics
• [Show spreadsheet of early (unpublished) results]
Charts
What’s next?
• Continue NLP analysis
• Correlate Data and Text Mining analyses
• …SAMS 2.0
Lessons Learnt
• Ethical process
  – Affects fundamental design decisions
• Complexity of data collection outside of “lab setting”
• Validating other studies/claims important
Publications
ucrel.lancs.ac.uk/sams/papers.php
• Combining data mining and text mining for detection of early stage dementia: the SAMS framework. Bull, C., Asfiandy, D., Gledson, A., Mellor, J., Couth, S., Stringer, G., Rayson, P., Sutcliffe, A., Keane, J., Zeng, X., Burns, A., Leroi, I., Ballard, C., & Sawyer, P. (2016). In LREC-2016 Workshop: RaPID-2016.
• From Click to Cognition: Detecting cognitive decline through daily computer use. Stringer, G., Sawyer, P., Sutcliffe, A., & Leroi, I. (2015). In D. Bruno (Ed.), The Preservation of Memory: Theory and Practice for Clinical and Non-Clinical Populations (pp. 93-103). Hove, UK: Psychology Press.
• Dementia and Social Sustainability: Challenges for Software Engineering. Sawyer, P., Sutcliffe, A., Rayson, P., & Bull, C. (2015). In 37th International Conference on Software Engineering (ICSE'15) (pp. 527-530). Florence, Italy: IEEE. DOI: 10.1109/ICSE.2015.188
• Discovering affect-laden requirements to achieve system acceptance. Sutcliffe, A., Rayson, P., Bull, C., & Sawyer, P. (2014). In 22nd IEEE International Requirements Engineering Conference (RE'14) (pp. 173-182). IEEE. DOI: 10.1109/RE.2014.6912259