Using Bayesian Networks to Predict Plankton Production from Satellite Data
By: Rob Curtis, Richard Fenn, Damon Oberholster
Supervisors: Anet Potgieter, John Field, Laurent Drapeau
Department of Computer Science
Overview
• Introduction
• Work Detail
• Knowledge Acquisition
• Knowledge Representation
• Bayesian Learning and Inference
• Topic Maps
Introduction
• Aim to predict plankton primary production using satellite data
• Daily satellite data on surface temperature, chlorophyll, winds, currents
• Archive of ships’ sub-surface details
• Predict likely sub-surface plankton profile from surface features
Current System
• Currently best solution uses Self Organising Maps (SOMs: a type of neural network) to classify data
– Resulting solution lacks accuracy
– Difficult to interpret
Proposed System
• Propose a system that uses Bayesian Networks to predict plankton production
– Use ships’ sub-surface profiles + satellite data to draw cause-effect relationships
– Will use Bayesian Inference and Learning
• Use Topic Maps to visualize network
Work Detail
[Diagram: project components (Requirements Elicitation, Knowledge Acquisition, Knowledge Representation, Learning Engine, Inference Engine, Topic Map) divided among Rob Curtis, Richard Fenn and Damon Oberholster]
Knowledge Acquisition
• “The process of analyzing, transforming, classifying, organizing and integrating knowledge and representing that knowledge in a form that can be used in a computer system. Typically the knowledge is based on what a human expert does when solving problems”
www.centc251.org/Ginfo/Glossary/tcglosk.htm
• Relating to this project:
– Huge amounts of data
– Data is poorly recorded in Excel spreadsheets
– Gaps in current data
Knowledge Acquisition: Amount of Data
• 2500 ship sub-surface readings
– Recorded over 10 year period
• Bayesian Network requires satellite data for the same time period
• Need to represent data in a form that can be used by the Bayesian Network
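One possible way to put continuous readings into a form a discrete Bayesian Network can use is to bin them into named states. The sketch below is only an illustration: the bin edges, the state names and the `discretize` helper are assumptions, not values or code from the project.

```python
from bisect import bisect_right

# Hypothetical bin edges for sea-surface temperature (degrees Celsius).
# Real edges would have to be chosen with the domain experts.
SST_EDGES = [12.0, 16.0, 20.0]
SST_STATES = ["cold", "cool", "warm", "hot"]


def discretize(value, edges, states):
    """Map a continuous value to the label of the bin it falls in.

    A value equal to an edge is placed in the bin above that edge.
    """
    if len(states) != len(edges) + 1:
        raise ValueError("need exactly one more state than bin edges")
    return states[bisect_right(edges, value)]


if __name__ == "__main__":
    for sst in (10.5, 15.9, 16.0, 23.1):
        print(sst, "->", discretize(sst, SST_EDGES, SST_STATES))
```

Fixed edges keep the states easy to interpret; equal-frequency bins would be an alternative if the readings are heavily skewed.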
Knowledge Acquisition: Current Data
Knowledge Acquisition: Gaps in Data
[Figure: ships’ sub-surface readings (discrete) alongside satellite data (continuous)]
Knowledge Acquisition: Challenges
• Making sense of all the available data (consultations with Dr John Field and Laurent Drapeau)
• Correlating the 2D continuous satellite data to 3D discrete ships’ sub-surface profile
• Representing all the data in a form easily used by the Bayesian Network
• Integration of disparate data
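The correlation challenge above could be approached by pairing each ship station with the satellite pixel recorded on the same day that lies closest to it. The following is a minimal sketch under that assumption; the record types, field names, sample values and the 25 km cut-off are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt


@dataclass(frozen=True)
class SatellitePixel:
    day: date
    lat: float
    lon: float
    chlorophyll: float  # surface value, in whatever units the source uses


@dataclass(frozen=True)
class ShipStation:
    day: date
    lat: float
    lon: float


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def nearest_pixel(station, pixels, max_km=25.0):
    """Return the same-day pixel closest to the station.

    Returns None when no pixel exists for that day or the closest one
    is further away than max_km (an assumed cut-off).
    """
    same_day = [p for p in pixels if p.day == station.day]
    if not same_day:
        return None

    def distance(p):
        return haversine_km(station.lat, station.lon, p.lat, p.lon)

    best = min(same_day, key=distance)
    return best if distance(best) <= max_km else None


# Made-up sample data, for illustration only.
PIXELS = [
    SatellitePixel(date(2000, 1, 1), -34.05, 18.05, 1.2),
    SatellitePixel(date(2000, 1, 1), -35.00, 19.00, 0.4),
    SatellitePixel(date(2000, 1, 2), -34.00, 18.00, 9.9),
]
```

Stations with no match stay unpaired, which makes the gaps in the data explicit instead of hiding them.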
Knowledge Representation
• “A search for formal ways to describe knowledge presented in informal terms (a prerequisite for its handling as computation)”
encyclopedia.laborlawtalk.com/Representation
• Relating to this project:
– Need to find causal relationships between environment variables
– Represent those relationships in a Bayesian Network
– Store the data in a database so that it will be easy for the Inference and Learning Engines of the Bayesian Network to manipulate
– Need to consider the temporal aspects of the data
Knowledge Representation: Causal Relationships
• Many variables that influence plankton production:
– Chlorophyll
– Surface Temp
– Wind
– Current
[Diagram: nodes for Chlorophyll, Surface Temp, Wind and Primary Plankton Production]
Knowledge Representation: Bayesian Network
• Directed graphical model
• Each node represents an influencing variable
• An edge from one node to another represents a causal relationship between those nodes
• Create Bayesian network structure based on the most relevant relationships found between the variables
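As a rough illustration of the structure described above, a network can be stored as a mapping from each node to its parents and checked for cycles, since a Bayesian Network must be a directed acyclic graph. The edges below are an assumed example built from the variables named in the slides, not the structure the project arrived at. The sketch uses the standard-library `graphlib` module (Python 3.9 or later).

```python
from graphlib import CycleError, TopologicalSorter

# Assumed example structure: node -> list of parent nodes.
PARENTS = {
    "Chlorophyll": [],
    "SurfaceTemp": [],
    "Wind": [],
    "Current": [],
    "Production": ["Chlorophyll", "SurfaceTemp", "Wind", "Current"],
}


def edges(parents):
    """List every (parent, child) edge in the network."""
    return [(p, child) for child, ps in parents.items() for p in ps]


def ordering(parents):
    """Return the nodes so that every parent comes before its children.

    Raises ValueError when the edges contain a cycle, because such a
    graph is not a valid Bayesian Network structure.
    """
    try:
        return list(TopologicalSorter(parents).static_order())
    except CycleError as exc:
        raise ValueError(f"structure is not acyclic: {exc}") from exc


if __name__ == "__main__":
    print(edges(PARENTS))
    print(ordering(PARENTS))
```

Storing parents per node (instead of children) matches how conditional probability tables are indexed, since each table is conditioned on the node's parents.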
Knowledge Representation: Temporal aspects
• Need to divide data up into time steps
• Each time step is dependent on the previous step
[Diagram: time steps t, t + 1, t + 2]
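One common way to express "each time step depends on the previous step" is to copy a one-step template of the network across consecutive time slices, as in a dynamic Bayesian network. The sketch below assumes that approach; the `unroll` helper and the example edges are hypothetical.

```python
def unroll(intra_edges, inter_edges, steps):
    """Copy a one-slice edge template across `steps` time slices.

    intra_edges: (parent, child) pairs inside a single time step.
    inter_edges: (parent at step t, child at step t + 1) pairs.
    Returns edges between nodes named as (variable, t) tuples.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    result = []
    for t in range(steps):
        result.extend(((p, t), (c, t)) for p, c in intra_edges)
        if t + 1 < steps:  # the last slice has no successor
            result.extend(((p, t), (c, t + 1)) for p, c in inter_edges)
    return result


if __name__ == "__main__":
    # Assumed template: chlorophyll influences production within a step,
    # and production at one step influences production at the next.
    for edge in unroll([("Chlorophyll", "Production")],
                       [("Production", "Production")], steps=3):
        print(edge)
```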
Learning Engine
• Each node of the Bayesian network will have a Conditional Probability Table (CPT)
• Learning engine will implement an algorithm to update the probabilities in each of these tables
– Nine years of satellite and ship data will be used in training the system
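The slides do not say which learning algorithm will be used. As one simple possibility, a CPT can be estimated from complete records by counting how often each child state occurs for each combination of parent states, with additive smoothing so unseen combinations still get valid probabilities. The `learn_cpt` function, the smoothing constant and the sample records are assumptions for illustration.

```python
from collections import Counter, defaultdict
from itertools import product


def learn_cpt(records, child, parents, states, alpha=1.0):
    """Estimate P(child | parents) by smoothed counting.

    records: list of dicts mapping variable name -> observed state.
    states:  dict mapping variable name -> list of possible states.
    alpha:   additive smoothing count; must be > 0 so that parent
             combinations never seen in the data still normalise.
    Returns {parent_state_tuple: {child_state: probability}}.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    counts = defaultdict(Counter)
    for row in records:
        key = tuple(row[p] for p in parents)
        counts[key][row[child]] += 1

    cpt = {}
    for key in product(*(states[p] for p in parents)):
        total = sum(counts[key].values()) + alpha * len(states[child])
        cpt[key] = {s: (counts[key][s] + alpha) / total for s in states[child]}
    return cpt


# Made-up training records, for illustration only.
STATES = {"Chlorophyll": ["low", "high"], "Production": ["low", "high"]}
RECORDS = [
    {"Chlorophyll": "high", "Production": "high"},
    {"Chlorophyll": "high", "Production": "high"},
    {"Chlorophyll": "high", "Production": "high"},
    {"Chlorophyll": "high", "Production": "low"},
]
```

Counting only works when every record has a value for every variable; with the gaps described earlier, an algorithm that handles missing values (for example expectation-maximisation) would be needed instead.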
Inference Engine
• The inference engine will be responsible for calculating the probability of a certain sequence of observations given certain input parameters
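If the temporal network is treated as a hidden Markov model (an assumption made here for illustration, not something the slides state), the probability of an observation sequence can be computed with the forward algorithm. All state names and probabilities below are invented.

```python
def sequence_probability(observations, prior, transition, emission):
    """Probability of an observation sequence under a hidden Markov model.

    prior:      {state: P(state at the first step)}
    transition: {state: {next_state: P(next_state | state)}}
    emission:   {state: {observation: P(observation | state)}}
    Uses the forward algorithm, summing over all hidden state paths.
    """
    if not observations:
        return 1.0
    states = list(prior)
    # alpha[s] = P(observations so far, hidden state is s)
    alpha = {s: prior[s] * emission[s][observations[0]] for s in states}
    for obs in observations[1:]:
        alpha = {
            s: emission[s][obs] * sum(alpha[r] * transition[r][s] for r in states)
            for s in states
        }
    return sum(alpha.values())


# Invented parameters: hidden production level, observed surface colour.
PRIOR = {"low": 0.5, "high": 0.5}
TRANSITION = {"low": {"low": 0.7, "high": 0.3},
              "high": {"low": 0.4, "high": 0.6}}
EMISSION = {"low": {"dim": 0.9, "bright": 0.1},
            "high": {"dim": 0.2, "bright": 0.8}}

if __name__ == "__main__":
    print(sequence_probability(["bright", "bright"], PRIOR, TRANSITION, EMISSION))
```

For long sequences the running values become very small, so a practical version would normalise at each step or work with log-probabilities.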
Testing
• Nine years of sub-surface data will be used to train the system.
• Compare the predicted results for the tenth year against the recorded results for that year.
• The project will be a success if predictions are very similar to those that were recorded.
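The test plan above amounts to a hold-out evaluation: train on nine years, predict the tenth, and compare with the recorded values. A minimal sketch follows, assuming each record carries a `year` field and that predictions are discrete states; both helper functions and the choice of plain accuracy as the similarity measure are assumptions.

```python
def split_by_year(records, test_year):
    """Separate records into training data and the held-out test year."""
    train = [r for r in records if r["year"] != test_year]
    test = [r for r in records if r["year"] == test_year]
    return train, test


def accuracy(predicted, recorded):
    """Fraction of predictions that exactly match the recorded states."""
    if len(predicted) != len(recorded):
        raise ValueError("predicted and recorded must have the same length")
    if not recorded:
        raise ValueError("cannot score an empty test set")
    hits = sum(p == r for p, r in zip(predicted, recorded))
    return hits / len(recorded)


if __name__ == "__main__":
    # Placeholder records: ten years with two readings each.
    data = [{"year": y, "reading": i} for y in range(1, 11) for i in range(2)]
    train, test = split_by_year(data, test_year=10)
    print(len(train), "training records,", len(test), "test records")
    print(accuracy(["high", "low", "low", "high"],
                   ["high", "low", "high", "high"]))
```

Holding out a whole year, instead of a random sample, avoids testing on readings taken at nearly the same time and place as the training data.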
Representing Bayesian Networks using Topic Maps
Topic Maps: Overview
• Brief introduction to topic maps and hypergraphs
• Applying topic maps to the system
• Testing
• Challenges
Topic Maps
• Topic maps provide means for indexing data
• ISO standard for describing knowledge structures and associating them with information resources
Topic Map Structure
• Topic
– Anything, subject, entity, concept
• Occurrence
– Link to information about topic
• Association
– Relationships between topics
Topic Map Structure
[Diagram: Topic, Occurrence and Association]
Representing Topic Maps
• Hypergraphs
– A hypergraph is a graph that can have smaller graphs (subgraphs) embedded within itself
Applying Topic Maps
• Bayesian Network
– Topics will represent nodes in the network
– Associations represent relationships between nodes in the network
– Occurrences will link to info about node
• Future System
– Web application linking topic maps for different regions of the ocean
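The mapping above (nodes to topics, edges to associations, node information to occurrences) could be sketched as follows. This is an in-memory illustration only, not an implementation of the ISO topic map interchange formats; the class names, the "influences" association type and the sample link are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    name: str
    # Occurrences: links to information about this topic.
    occurrences: list = field(default_factory=list)


@dataclass(frozen=True)
class Association:
    kind: str       # type of relationship, e.g. "influences"
    members: tuple  # names of the topics taking part, (parent, child)


def network_to_topic_map(parents, info_links):
    """Convert a Bayesian Network structure into topics and associations.

    parents:    {node: [parent nodes]}
    info_links: {node: [links to information about that node]}
    """
    topics = {
        node: Topic(node, list(info_links.get(node, [])))
        for node in parents
    }
    associations = [
        Association("influences", (p, child))
        for child, ps in parents.items()
        for p in ps
    ]
    return topics, associations


if __name__ == "__main__":
    topics, associations = network_to_topic_map(
        {"Chlorophyll": [], "Production": ["Chlorophyll"]},
        {"Production": ["cpt/production.csv"]},  # hypothetical link
    )
    print(topics)
    print(associations)
```

Because `members` is a tuple, an association can hold more than two topics, which is one reason topic maps are often drawn as hypergraphs.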
Testing
• Qualitative approach
• Low-Fi prototypes to test intuitiveness of proposed interface to Bayesian Network
• Test with the intended users of the system
Challenges
• Representing temporal information using topic maps
• Representing Bayesian Network relationships using topic maps
SUMMARY
• Represent data in a formal way using knowledge acquisition and representation
• Research the viability of using Bayesian Networks as a prediction mechanism
• Research the viability of using topic maps for intuitively representing Bayesian Networks
References
• Pepper, S. (2002), “The TAO of Topic Maps, Finding the Way in the Age of Infoglut”, retrieved 01/06/2005, URL: http://www.ontopia.net/topicmaps/materials/tao.html