William H. Hsu with Haipeng Guo, Rengakrishnan Subramanian, Ben Perry, and Julie A. Thornton
Kansas State University
Department of Computing and Information Sciences
Bioinformatics and Machine Learning: Building Probabilistic Models of Gene Expression from Microarray Data
William H. Hsu
with Haipeng Guo, Rengakrishnan Subramanian,
Ben Perry, and Julie A. Thornton
Department of Computing and Information Sciences
Kansas State University
Laboratory for Knowledge Discovery in Databases
http://www.kddresearch.org/Groups/Bioinformatics
Overview
• Computer Science: What We Do
– Software: operating systems, programming languages, software engineering, databases
– Hardware: logic design, organization and architecture
– Theory of Computation: algorithms, complexity, languages
– Artificial Intelligence (AI): learning, reasoning, planning, agents
– Computer Graphics, Geometry, and Vision
– Computational Science and Engineering (CSE)
• Artificial Intelligence (AI) – Fields of Study
– Areas: learning, planning, vision, robotics
– Applications in science, engineering, business, and defense
• Computer Graphics – Some Current Projects and Fun Stuff
– Computer-Aided Design (CAD) and Engineering (CAE)
– Information Visualization
– Computer-Generated Images (CGI) and Animation (CGA)
• High-Performance Computing: Linux and Beowulf
Information Retrieval (IR) and Text Mining: Commercial Applications
[Figure: SPIRIX software ThemeScapes (http://www.cartia.com), 6500 news stories from the WWW in 1997]
Visual Programming and Software Engineering
Stages of Data Mining and Knowledge Discovery in Databases
Knowledge Discovery in Databases (KDD) and Fraud Detection
Genetic Algorithms for Parameter Tuning in Bayesian Network Structure Learning [1]
Genetic Wrapper for Change of Representation and Inductive Bias Control
• Inputs
– D: Training Data
– eI: Inference Specification
• [1] Genetic Algorithm
– α: Candidate Representation
– f(α): Representation Fitness
– α̂: Optimized Representation
• [2] Representation Evaluator for Learning Problems
– Dtrain (Inductive Learning)
– Dval (Inference)
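The wrapper loop in this diagram can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the bit-tuple encoding of α, the tournament selection, one-point crossover, and all names (`genetic_wrapper`, `toy_fitness`) are invented for the example. On the slide, f(α) comes from the representation evaluator [2]; here a toy function stands in for it.

```python
import random

def genetic_wrapper(fitness, n_bits, pop_size=20, generations=30, p_mut=0.05, seed=0):
    """[1] Genetic algorithm: evolve candidate representations alpha (bit tuples)
    and return the fittest one seen (alpha-hat) under the supplied f(alpha)."""
    rng = random.Random(seed)
    pop = [tuple(rng.randint(0, 1) for _ in range(n_bits)) for _ in range(pop_size)]
    best = max(pop, key=fitness)

    def tournament(population):
        # Binary tournament: draw two candidates, keep the fitter one.
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        children = []
        while len(children) < pop_size:
            p1, p2 = tournament(pop), tournament(pop)
            cut = rng.randrange(1, n_bits)          # one-point crossover (needs n_bits >= 2)
            child = p1[:cut] + p2[cut:]
            # Flip each bit independently with probability p_mut.
            child = tuple(bit ^ 1 if rng.random() < p_mut else bit for bit in child)
            children.append(child)
        pop = children
        gen_best = max(pop, key=fitness)
        if fitness(gen_best) > fitness(best):
            best = gen_best
    return best

def toy_fitness(alpha):
    # Hypothetical stand-in for evaluator [2]: agreement with a fixed target mask.
    target = (1, 0, 1, 1, 0, 0, 1, 0)
    return sum(a == t for a, t in zip(alpha, target))

alpha_hat = genetic_wrapper(toy_fitness, n_bits=8)
```

Passing the fitness function in as an argument keeps the search [1] separate from the evaluator [2], which mirrors the two boxes in the diagram.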
Genetic Algorithms for Parameter Tuning in Bayesian Network Structure Learning [2]
[2] Representation Evaluator for Input Specifications
• Inputs
– α: Candidate Input Specification
– eI: Evidence Specification
• [A] Inductive Learning (Parameter Estimation from Training Data)
– Dtrain (Model Training)
– h: Hypothesis
• [B] Validation (Measurement of Inferential Loss)
– Dval (Model Validation by Inference)
• Output
– f(α): Specification Fitness (Inferential Loss)
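A rough sketch of the evaluator's two stages, [A] and [B], is given below. It assumes a much simpler model than a Bayesian network: α is taken to be a 0/1 mask over input variables, h is a smoothed conditional probability table for a binary target, and inferential loss is mean negative log-likelihood on Dval. These choices and all function names are assumptions for illustration only.

```python
import math
from collections import Counter

def estimate_parameters(d_train, alpha):
    """[A] Inductive learning: fit P(target | selected inputs) from Dtrain
    by smoothed counts and return the hypothesis h as a function."""
    selected = [i for i, keep in enumerate(alpha) if keep]
    joint, context = Counter(), Counter()
    for inputs, target in d_train:
        key = tuple(inputs[i] for i in selected)
        joint[key, target] += 1
        context[key] += 1

    def h(inputs, target):
        key = tuple(inputs[i] for i in selected)
        # Laplace smoothing over a binary target (assumed).
        return (joint[key, target] + 1) / (context[key] + 2)

    return h

def inferential_loss(h, d_val):
    """[B] Validation: mean negative log-likelihood of the held-out targets."""
    return -sum(math.log(h(x, y)) for x, y in d_val) / len(d_val)

def specification_fitness(alpha, d_train, d_val):
    """f(alpha): negated loss, so that higher fitness means lower inferential loss."""
    return -inferential_loss(estimate_parameters(d_train, alpha), d_val)

# Toy data: the target copies input 0; input 1 is irrelevant.
d_train = [((0, 0), 0), ((0, 1), 0), ((1, 0), 1), ((1, 1), 1)] * 5
d_val = [((0, 1), 0), ((1, 0), 1)]
scores = {a: specification_fitness(a, d_train, d_val)
          for a in [(0, 0), (0, 1), (1, 0), (1, 1)]}
```

On this toy data the mask (1, 0), which keeps only the relevant input, scores highest; a function like `specification_fitness` is what a genetic algorithm would call to rank candidate specifications.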
Learning Environment
• D: Microarray Data
• [A] Structure Learning
– G = (V, E): Graph Component of BN
• [B] Parameter Estimation
– B = (V, E, Θ): BN with Probabilities
• Dval (Model Validation by Inference)
– Specification Fitness (Inferential Loss)
• Example network nodes: G1, G2, G3, G4, G5 (node labels appear in two network drawings)
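Step [B] can be illustrated with a small sketch: given a fixed graph G = (V, E), estimate the conditional probability tables that turn it into a BN with probabilities. The sketch assumes binary (already discretized) expression values and a simple pseudocount; the edge G1 → G2, the data, and the name `estimate_cpts` are invented for the example and are not taken from the slide. Structure learning [A] is not shown.

```python
from collections import Counter
from itertools import product

def estimate_cpts(parents, data, pseudocount=1.0):
    """[B] Parameter estimation for a fixed graph.

    parents: {node: [parent, ...]} encodes the edges E of G.
    data:    list of dicts mapping node -> 0/1 (one dict per array).
    Returns {node: {parent_config: P(node = 1 | parent_config)}}.
    """
    theta = {}
    for node, pa in parents.items():
        ones, totals = Counter(), Counter()
        for row in data:
            cfg = tuple(row[p] for p in pa)
            totals[cfg] += 1
            ones[cfg] += row[node]
        # Smoothed relative frequency for every parent configuration.
        theta[node] = {
            cfg: (ones[cfg] + pseudocount) / (totals[cfg] + 2 * pseudocount)
            for cfg in product((0, 1), repeat=len(pa))
        }
    return theta

graph = {"G1": [], "G2": ["G1"]}            # hypothetical edge G1 -> G2
data = ([{"G1": 0, "G2": 0}] * 3
        + [{"G1": 1, "G2": 1}] * 3
        + [{"G1": 1, "G2": 0}])
theta = estimate_cpts(graph, data)
```

Here `theta["G2"][(1,)]` is the estimated probability that G2 is expressed when G1 is expressed.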
Microarrays
A Gene Network for Yeast [Friedman, Nachman, Linial, Pe’er, 2000]
Components of a Microarray Experiment: Hybridization
• Publication (e.g., PubMed)
• Source (e.g., Taxonomy)
• Gene (e.g., GenBank)
• Experiment
• Sample, Hybridization, Array
• Normalization/Discretization
• Data
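The Normalization/Discretization step might look like the sketch below, which maps log expression ratios to three levels. The symmetric threshold of 0.5 and the three-level scheme are assumptions chosen for illustration; the slide does not say which discretization was used.

```python
def discretize(log_ratios, threshold=0.5):
    """Map each log expression ratio to -1 (under-expressed),
    0 (baseline), or 1 (over-expressed)."""
    levels = []
    for value in log_ratios:
        if value > threshold:
            levels.append(1)
        elif value < -threshold:
            levels.append(-1)
        else:
            levels.append(0)   # values within [-threshold, threshold] count as baseline
    return levels

print(discretize([-1.2, -0.3, 0.0, 0.5, 0.9]))  # [-1, 0, 0, 0, 1]
```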
Components of a Microarray Experiment: Computational Gene Expression Modeling
• Computational Workflows (e.g., myGrid)
• Experimental Services & Metadata (MAGE-ML XML)
• Gene Expression Model
• Specifications
– Data Preprocessing Specification
– Feature Selection Specification
– Pathway & Network Learning Specification
– Parameter Learning Specification
– Model Analysis Specification
• Use Cases
– Discretization Use Case
– Data Mining Use Case
– Validation (e.g., Bootstrap) Use Case
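For the bootstrap validation use case, one common pattern is to resample the arrays with replacement and count how often a feature of interest reappears. The sketch below shows that pattern only; the feature test `coexpressed`, its 75% cutoff, and the toy data are hypothetical. In a full pipeline the feature would more likely be something such as the presence of an edge in a network learned from each resample.

```python
import random

def bootstrap_confidence(samples, has_feature, n_resamples=200, seed=0):
    """Fraction of bootstrap resamples (drawn with replacement, same size as
    the original data) for which has_feature(resample) is true."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_resamples):
        resample = [rng.choice(samples) for _ in samples]
        hits += bool(has_feature(resample))
    return hits / n_resamples

def coexpressed(rows):
    # Hypothetical feature: two genes take the same level in over 75% of arrays.
    return sum(a == b for a, b in rows) / len(rows) > 0.75

# Each tuple holds the discretized levels of two genes on one array (toy data).
arrays = [(1, 1)] * 9 + [(1, 0)] * 3
confidence = bootstrap_confidence(arrays, coexpressed)
```

A confidence near 1 means the feature is stable under resampling; a value near 0 suggests it depends on a few particular arrays.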
DESCRIBER: An Experimental Intelligent Filter
• Domain-Specific Repositories
– Experimental Data
– Source Codes and Specifications
– Data Models
– Ontologies
– Models
• DESCRIBER
– Personalized Interface
– Domain-Specific Collaborative Filtering
– Learning and Inference Components
– Decision Support Models
– Historical Use Case & Query Data
– New Queries
• Users of Scientific Document Repository
• Interface(s) to Distributed Repository
• Example Queries:
– What experiments have found cell cycle-regulated metabolic pathways in Saccharomyces?
– What codes and microarray data were used, and why?
DESCRIBER Overview
• Module 1: Intelligent Collaborative Filtering Front-End
• Module 2: Learning & Validation of Bayesian Network Models for Use Cases
• Module 3: Estimation of Constraint Parameters
• Module 4: Learning & Validation of Bayesian Network Models for MAGE Data & Codes
• Module 5: MAGE Data Model
• Data and models exchanged between modules
– Data
– Historical Use Case & Query Data
– Graphical Models of Use Cases
– Constrained Models of Use Cases
– Relational Models of MAGE Data
• User interaction
– User
– Personalized Interface
– New Queries
DESCRIBER Collaborative Filtering Module
• Module 1: Intelligent Collaborative Filtering Front-End
– Personalized Interface
– New Query from User
– Response to User
• Inputs
– Relational Models of (Domain-Specific) Data
– Constrained Models of Use Cases
– Constraints on Repository Content
• Components
– Relational Probabilistic Model
– Constraint Selector
– Integrated Reasoning Component: XML Validator and Constraint Checker