Brain Data & Knowledge Grid
(or: Towards Services for Knowledge-Based Mediation of Neuroscience Information Sources)

National Center for Microscopy and Imaging Research (NCMIR)
Mark Ellisman, Maryann Martone, Steve Peltier, Steve Lamont, ...

Data-Intensive Computing Environments, San Diego Supercomputer Center (SDSC)
Reagan Moore, Chaitan Baru, Amarnath Gupta, Bertram Ludäscher, Richard Marciano, Arcot Rajasekar, Ilya Zaslavsky, ...

University of California, San Diego

Infrastructure for Sharing Neuroscience Data

[Figure: example data from the sources: CCB, Montana SU; surface atlas, Van Essen Lab; NCMIR, UCSD; stereotaxic atlas, LONI; MCell, CNL, Salk]

SOURCES:
• NCMIR, U.C. San Diego
• Caltech Neuroimaging
• Center for Imaging Science, Johns Hopkins
• Center for Computational Biology, Montana State
• Laboratory of Neuro Imaging (LONI), UCLA
• Computational Neurobiology Laboratory, Salk Inst.
• Van Essen Laboratory, Washington University
• …

Data Management Infrastructure (DICE/NPACI)
• MIX Mediation in XML
• MCAT information discovery
• SRB data handling
• HPSS storage
• ...

Knowledge-based GRID infrastructure ? ? ??

Data Management Infrastructure (“Data Grid”): GTOMO, Telemicroscopy, Globus, SRB/MCAT, HPSS

Sharing Resources on the Brain Data Grid

• Scientific groups ...
– create data products (e.g., text data, images, simulation data …)
– put them in collections
– add metadata (who created it, what is the data about …)
– make it available for sharing (on the web, in data caches, in HPSS, …)
• Technical challenges ...
– size & packaging of data
– heterogeneity: data types, storage technologies, transport mechanisms, authentication, ...
– access levels: collection, object, fragment; data-specific functions (“data blades”)
• Data Grid technologies can help ...
– distributed data management, e.g., Storage Resource Broker/Metadata Catalog (SRB/MCAT), computing (Globus), ...
– focus is on resource sharing (data, networks, cycles)

Integration Issue: Semantic Integration/Mediation

[Figure: integration layers, top to bottom]
??? SEMANTIC INTEGRATION ???
SYNTACTIC/STRUCTURAL Integration (MIX: Mediation of Information using XML)
• Integrated Views (Src-XML => Intgr-XML)
• Schema Integration (DTD => DTD)
• Wrapping, Data Extraction (Text => XML)
SYSTEM INTEGRATION (side labels: SRB/MCAT, Distributed Query Processing)
• storage, query capabilities
• protocols & services
• TCP/IP, grid-ftp, HTTP, Globus, JDBC, DOM, CORBA

Standard Mediator/Wrapper Architecture

[Figure: mediator/wrapper stack, top to bottom]
• Client/User-Query
• INTEGRATED VIEW (integration logic; GRID federation services ???)
• XML Q/A (SRB/MCAT, DOM, X(ML)Query)
• Wrapper, Wrapper, Wrapper (protocol translation)
• (Neuro)Science (Re)Sources: Lab1, Lab2, Lab3 (DB, Files, WWW)
Levels covered: structure, transport, syntax, storage
Level still open: domain semantics ???
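
The stack above can be sketched in a few lines of Python. This is a minimal illustration, not the MIX or SRB/MCAT implementation: the class names `Wrapper`, `InMemoryWrapper`, and `Mediator`, the record layout, and the sample rows are all assumptions made for the example.

```python
from abc import ABC, abstractmethod


class Wrapper(ABC):
    """Hides one source's storage and protocol behind a uniform query call."""

    @abstractmethod
    def query(self, element):
        """Return the source's records for one element name as plain dicts."""


class InMemoryWrapper(Wrapper):
    """Hypothetical stand-in for a DB, file, or WWW wrapper."""

    def __init__(self, lab, records):
        self.lab = lab
        self.records = records

    def query(self, element):
        # Tag each record with its origin so the integrated view stays traceable.
        return [dict(r, source=self.lab) for r in self.records if r["element"] == element]


class Mediator:
    """Answers a client query by unioning the answers of all wrapped sources."""

    def __init__(self, wrappers):
        self.wrappers = list(wrappers)

    def integrated_view(self, element):
        results = []
        for wrapper in self.wrappers:
            results.extend(wrapper.query(element))
        return results


# Hypothetical sample data for three labs.
mediator = Mediator([
    InMemoryWrapper("Lab1", [{"element": "neuron", "name": "purkinje cell"}]),
    InMemoryWrapper("Lab2", [{"element": "neuron", "name": "Purkinje"}]),
    InMemoryWrapper("Lab3", [{"element": "protein", "name": "RyR"}]),
])
neurons = mediator.integrated_view("neuron")
```

The two neuron records name the same cell type differently, and nothing in this purely structural union reconciles them. That gap is the "domain semantics ???" level in the figure.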

The Need for Semantic Integration

Example cross-source queries:
“What is the cerebellar distribution of rat proteins with more than 70% homology with human NCS-1? Any structure specificity? How about other rodents?”

[Figure: ??? Mediator ??? with ??? Integrated View ??? and ??? Integrated View Definition ???, over wrapped sources: protein localization, morphometry, neurotransmission, Web (CaBP, Expasy)]

Figure annotations:
• Data, relationships, constraints are modeled (CMs)
• Cross-source relationships are modeled
• Semantic (knowledge-based) mediation services
• Cross-source queries

Hidden Semantics: Protein Localization

<protein_localization>
  <neuron type="purkinje cell" />
  <protein channel="red">
    <name>RyR</name>
    ….
  </protein>
  <region h_grid_pos="1" v_grid_pos="A">
    <density>
      <structure fraction="0.8">
        <name>spine</name>
        <amount name="RyR">0</amount>
      </structure>
      <structure fraction="0.2">
        <name>branchlet</name>
        <amount name="RyR">30</amount>
      </structure>

Hidden semantics (figure callouts):
• Molecular layer of Cerebellar Cortex
• Purkinje Cell layer of Cerebellar Cortex
• Fragment of dendrite

Hidden Semantics: Morphometry

<neuron name="purkinje cell">
  <branch level="10">
    <shaft>
      …
    </shaft>
    <spine number="1">
      <attachment x="5.3" y="-3.2" z="8.7" />
      <length>12.348</length>
      <min_section>1.93</min_section>
      <max_section>4.47</max_section>
      <surface_area>9.884</surface_area>
      <volume>7.930</volume>
      <head>
        <width>4.47</width>
        <length>1.79</length>
      </head>
    </spine>
    …

Hidden semantics (figure callouts):
• Branch level beyond 4 is a branchlet
• Must be dendritic because Purkinje cells don’t have somatic spines
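
The two callouts are domain rules that the XML itself never states. A minimal Python sketch of making them explicit might look as follows. The XML is a trimmed, closed version of the slide's fragment so that it parses, and the function name `explicit_facts`, the constants, and the output layout are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed and closed version of the slide's fragment (closing tags are an assumption).
MORPHOMETRY_XML = """
<neuron name="purkinje cell">
  <branch level="10">
    <spine number="1">
      <length>12.348</length>
      <volume>7.930</volume>
    </spine>
  </branch>
</neuron>
"""

BRANCHLET_ABOVE_LEVEL = 4                         # "branch level beyond 4 is a branchlet"
CELLS_WITHOUT_SOMATIC_SPINES = {"purkinje cell"}  # so their spines must be dendritic


def explicit_facts(xml_text):
    """Attach the hidden domain knowledge to every spine found in the document."""
    neuron = ET.fromstring(xml_text)
    cell = neuron.get("name")
    facts = []
    for branch in neuron.findall("branch"):
        level = int(branch.get("level"))
        for spine in branch.findall("spine"):
            fact = {"cell": cell, "branch_level": level, "spine": spine.get("number")}
            if level > BRANCHLET_ABOVE_LEVEL:
                fact["structure"] = "branchlet"
            if cell in CELLS_WITHOUT_SOMATIC_SPINES:
                fact["compartment"] = "dendrite"
            facts.append(fact)
    return facts


facts = explicit_facts(MORPHOMETRY_XML)
```

Once "branchlet" and "dendrite" are explicit, the morphometry record shares vocabulary with the protein localization data, which reports amounts per spine and branchlet.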

Knowledge-Based (Semantic) Mediation

• Multiple Worlds Integration Problem:
– compatible terms not directly joinable
– complex, indirect associations among attributes
– unstated integrity constraints (ICs)
• Approach:
– a “theory” under which terms can be “semantically joined”
=> lift mediation to the level of conceptual models (CMs)
=> formalize domain knowledge, ICs become rules over CMs
=> Knowledge-Based/Model-Based (Semantic) Mediation
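
One way to read "semantically joined" is sketched below in Python: two rows join when the term of one is the term of the other or one of its ancestors in a domain map. The `PART_OF` edges, the sample rows, and the function names are hypothetical, and the containment test stands in for whatever "theory" a real mediator would use.

```python
# Hypothetical domain map fragment: part -> whole.
PART_OF = {
    "spine": "branchlet",
    "branchlet": "dendrite",
    "dendrite": "purkinje_cell",
}


def ancestors(term):
    """Return the term followed by every whole that contains it."""
    chain = [term]
    while chain[-1] in PART_OF:
        chain.append(PART_OF[chain[-1]])
    return chain


def plain_join(left, right):
    """Syntactic join: terms must be literally equal."""
    return [(l, r) for l in left for r in right if l["term"] == r["term"]]


def semantic_join(left, right):
    """Join a left row with right rows about the same term or a containing structure."""
    right_by_term = {}
    for row in right:
        right_by_term.setdefault(row["term"], []).append(row)
    pairs = []
    for l in left:
        for term in ancestors(l["term"]):
            for r in right_by_term.get(term, []):
                pairs.append((l, r))
    return pairs


# Hypothetical rows from two sources that use different granularities.
localization = [{"term": "spine", "RyR": 0}]
morphometry = [{"term": "dendrite", "length": 12.3}]
```

Here `plain_join(localization, morphometry)` is empty, while `semantic_join` pairs the spine measurement with the dendrite it belongs to.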

XML-Based vs. Model-Based Mediation

[Figure: two mediation stacks over Raw Data and (XML) Objects]
XML-based mediation:
• layers: Raw Data, (XML) Objects, XML Elements, XML Models
• Integrated-DTD := XML-QL(Src1-DTD, ...)
• Structural Constraints (DTDs): A = (B*|C),D; B = ...; Parent, Child, Sibling, ...
• No Domain Constraints
Model-based mediation:
• layers: Raw Data, (XML) Objects, Conceptual Models
• Integrated-CM := CM-QL(Src1-CM, ...)
• Logical Domain Constraints (IF ... THEN ...)
• DOMAIN MAP (nodes C1, C2, C3, R): Classes, Relations, is-a, has-a, ...

CM ~ {Descr. Logic, ER, UML, RDF/XML(-Schema), …}
CM-QL ~ {F-Logic, OIL, DAML, …}

Knowledge-Based Mediator Prototype

[Figure: prototype architecture, top to bottom]
• USER/Client: CM Queries & Results (exchanged in XML)
• CM (Integrated View)
• Mediator Engine: FL rule proc., LP rule proc., Graph proc.; XSB Engine
• Inputs to the engine: Domain Map (DM), Integrated View Definition (IVD)
• Logic API (capabilities)
• CM Plug-In; GCM with CM S1, CM S2, CM S3
• Sources S1, S2, S3, each behind an XML-Wrapper and a CM-Wrapper

Mediation Services: Source Registration (System Issues)

Source
• Data Type: table, tree, file
• Access Protocol: SRB, HTTP, JDBC
• Query Capability: SQL, XMLQL, DOOD, ARC; Selections, SPJ
• Result Delivery: Tuple-at-a-time, Set-at-a-time, Stream, Binary for Viewer
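
A registration record along these four dimensions could be sketched as below. The allowed values come from the slide, but the class names, the validation logic, and the example settings for PROLAB and ANATOM are assumptions, not the prototype's actual registration interface.

```python
from dataclasses import dataclass

# Allowed values per dimension, taken from the slide.
ALLOWED = {
    "data_type": {"table", "tree", "file"},
    "access_protocol": {"SRB", "HTTP", "JDBC"},
    "result_delivery": {"tuple-at-a-time", "set-at-a-time", "stream", "binary"},
}
QUERY_CAPABILITIES = {"SQL", "XMLQL", "DOOD", "ARC", "selections", "SPJ"}


@dataclass(frozen=True)
class SourceRegistration:
    name: str
    data_type: str
    access_protocol: str
    query_capabilities: frozenset
    result_delivery: str


class SourceRegistry:
    """Hypothetical in-memory registry kept by the mediator."""

    def __init__(self):
        self._sources = {}

    def register(self, reg):
        # Reject values outside the dimensions the mediator knows how to handle.
        for attr, allowed in ALLOWED.items():
            value = getattr(reg, attr)
            if value not in allowed:
                raise ValueError(f"{reg.name}: unsupported {attr} {value!r}")
        unknown = reg.query_capabilities - QUERY_CAPABILITIES
        if unknown:
            raise ValueError(f"{reg.name}: unknown capabilities {sorted(unknown)}")
        self._sources[reg.name] = reg

    def supporting(self, capability):
        """Names of the sources that can evaluate the given kind of query."""
        return sorted(n for n, r in self._sources.items() if capability in r.query_capabilities)


registry = SourceRegistry()
# The settings below are illustrative only.
registry.register(SourceRegistration("PROLAB", "table", "JDBC", frozenset({"SQL", "SPJ"}), "set-at-a-time"))
registry.register(SourceRegistration("ANATOM", "tree", "HTTP", frozenset({"selections"}), "stream"))
```

A query planner could call `registry.supporting("SPJ")` to decide which sources can evaluate a join themselves and which need the mediator to do it.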

Mediation Services: Source Registration (Semantics Issues)

• Domain Map Registration
– provide concept space/ontology
  • … as a private object (“myANATOM”)
  • … merge with others (give “semantic bridges”)
  • … and check for conflicts
• Conceptual Model Registration
– schema: classes, associations, attributes
– domain constraints
– “put data into context” (linking data to the domain map)

Senselab (Yale) and NCMIR (UCSD) “Semantic Bridge”

anatom_dom(X) :- (ucsd_has_a(X,_) ; ucsd_has_a(_,X) ; ucsd_isa(X,_) ; ucsd_isa(_,X)).
senselab_dom(X) :- (sl_has_a(X,_) ; sl_has_a(_,X) ; sl_isa(X,_) ; sl_isa(_,X)).

% map Senselab anatom terms to equivalent UCSD ANATOM
sl2ucsd(X,X) :- senselab_dom(X), anatom_dom(X).
sl2ucsd('A',axon).
sl2ucsd('AH',axon).
sl2ucsd('Dad',spiny_branchlet).   % should map to a PATH not just the end of the path
sl2ucsd('Dam',main_branches).     % some of the main_branches based on the branch level
sl2ucsd('Dap',main_branches).
sl2ucsd('Dbd',spiny_branchlet).
sl2ucsd('Dbm',main_branches).
sl2ucsd('Dbp',main_branches).
sl2ucsd('Ded',spiny_branchlet).
sl2ucsd('Dem',main_branches).
sl2ucsd('Dep',main_branches).
sl2ucsd('T',axon).

% keep has_a edge if at least one node is known from UCSD
has_a(X,Y) :- sl2ucsd(_,X), ucsd_has_a(X,Y).
has_a(X,Y) :- sl2ucsd(_,Y), ucsd_has_a(X,Y).

% keep all and only UCSD is_a rels
isa(X,Y) :- ucsd_isa(X,Y).
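
For readers less familiar with Prolog, the bridge rules can be mirrored with Python sets. The `SL2UCSD` pairs are copied from the `sl2ucsd` facts above. The `UCSD_HAS_A` edges and the extra Senselab term "soma" are hypothetical, since the slide does not list the underlying relations.

```python
# Explicit Senselab -> UCSD term pairs, copied from the sl2ucsd facts.
SL2UCSD = {
    "A": "axon", "AH": "axon", "T": "axon",
    "Dad": "spiny_branchlet", "Dbd": "spiny_branchlet", "Ded": "spiny_branchlet",
    "Dam": "main_branches", "Dap": "main_branches", "Dbm": "main_branches",
    "Dbp": "main_branches", "Dem": "main_branches", "Dep": "main_branches",
}

# Hypothetical UCSD has_a edges (whole, part); not taken from the slide.
UCSD_HAS_A = {
    ("purkinje_cell", "dendrite"),
    ("purkinje_cell", "soma"),
    ("purkinje_cell", "axon"),
    ("dendrite", "main_branches"),
    ("dendrite", "spiny_branchlet"),
}


def bridge_targets(sl_terms, ucsd_terms):
    """UCSD terms reachable through sl2ucsd: shared names plus the explicit pairs."""
    shared = set(sl_terms) & set(ucsd_terms)  # the sl2ucsd(X,X) rule
    return shared | set(SL2UCSD.values())


def bridged_has_a(ucsd_has_a, targets):
    """Keep a has_a edge when at least one endpoint is a bridge target."""
    return {(x, y) for (x, y) in ucsd_has_a if x in targets or y in targets}


ucsd_terms = {term for edge in UCSD_HAS_A for term in edge}
# Assume, for illustration, that Senselab also uses the term "soma" directly.
targets = bridge_targets(set(SL2UCSD) | {"soma"}, ucsd_terms)
kept = bridged_has_a(UCSD_HAS_A, targets)
```

With this sample data the edge from purkinje_cell to dendrite is dropped, because neither endpoint is the target of any Senselab term, while the other four edges survive.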

Refinement of a Domain Map (Ontology): Putting Data in Context via Registration of new Classes & Relationships

[Figure: domain map graph. Nodes: Neuron, Spiny Neuron, Medium Spiny Neuron, MyNeuron, Compartment, Axon, Soma, Dendrite, MyDendrite, Neurotransmitter, GABA, Substance P, Dopamine R, Neostriatum, Substantia Nigra Pc, Substantia Nigra Pr, Globus Pallidus Int., Globus Pallidus Ext. Edge and connector labels: OR, AND, ALL:has, =, exp]

Mediation Services: Integrated View Definition

DERIVE
  protein_distribution(Protein, Organism, Brain_region, Feature_name, Anatom, Value)
FROM
  I:protein_label_image[proteins ->> {Protein}; organism -> Organism;
      anatomical_structures ->> {AS:anatomical_structure[name -> Anatom]}],   % from PROLAB
  NAE:neuro_anatomic_entity[name -> Anatom;                                   % from ANATOM
      located_in ->> {Brain_region}],
  AS..segments..features[name -> Feature_name; value -> Value].

• provided by the domain expert and mediation engineer
• declarative language (here: Frame-logic)
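
Read procedurally, the view joins PROLAB images with ANATOM entities on the shared name `Anatom` and then follows the path from each anatomical structure to its segment features. The Python below is a hedged rendering of that reading over hypothetical in-memory records. The dict layout and sample values are assumptions, and a real mediator would evaluate the F-logic rule declaratively rather than with nested loops.

```python
def protein_distribution(images, anatom_entities):
    """Yield (Protein, Organism, Brain_region, Feature_name, Anatom, Value) tuples."""
    # ANATOM side: name -> brain regions (the multi-valued located_in attribute).
    located_in = {e["name"]: e["located_in"] for e in anatom_entities}
    rows = []
    for image in images:                                   # I:protein_label_image
        for structure in image["anatomical_structures"]:   # AS:anatomical_structure
            anatom = structure["name"]
            for region in located_in.get(anatom, []):      # join on Anatom
                for segment in structure["segments"]:      # AS..segments
                    for feature in segment["features"]:    # ..features
                        for protein in image["proteins"]:
                            rows.append((protein, image["organism"], region,
                                         feature["name"], anatom, feature["value"]))
    return rows


# Hypothetical sample records for the two sources.
prolab_images = [{
    "proteins": ["RyR"],
    "organism": "rat",
    "anatomical_structures": [{
        "name": "spiny_branchlet",
        "segments": [{"features": [{"name": "protein_amount", "value": 30}]}],
    }],
}]
anatom = [{"name": "spiny_branchlet", "located_in": ["cerebellum"]}]

view = protein_distribution(prolab_images, anatom)
```

Structures whose name is unknown to ANATOM produce no tuples, which matches the join semantics of the rule body.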

Example Query Evaluation (I)

• Example: protein_distribution
– given: organism, protein, brain_region
– Use DOMAIN-KNOWLEDGE-BASE:
  • recursively traverse the has_a_star paths under brain_region and collect all anatomical_entities
– Source PROLAB:
  • join with anatomical structures and collect the value of attribute “image.segments.features.feature.protein_amount” where “image.segments.features.feature.protein_name” = protein and “study_db.study.animal.name” = organism
– Mediator:
  • aggregate over all parents up to brain_region
  • report distribution
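
The three steps can be sketched as follows. The `HAS_A` edges and `PROLAB` rows are hypothetical sample data, and the flat row layout is a simplification of the attribute paths quoted above. The sketch is only meant to show the traverse, select, and aggregate pattern.

```python
# Hypothetical has_a edges of the domain knowledge base: whole -> parts.
HAS_A = {
    "cerebellum": ["cerebellar_cortex"],
    "cerebellar_cortex": ["molecular_layer", "purkinje_cell_layer"],
    "molecular_layer": ["spiny_branchlet"],
}

# Hypothetical PROLAB rows, flattened for illustration.
PROLAB = [
    {"anatom": "spiny_branchlet", "protein_name": "RyR", "organism": "rat", "protein_amount": 30.0},
    {"anatom": "purkinje_cell_layer", "protein_name": "RyR", "organism": "rat", "protein_amount": 12.0},
    {"anatom": "spiny_branchlet", "protein_name": "RyR", "organism": "mouse", "protein_amount": 5.0},
]


def has_a_star(region):
    """Step 1: the region plus everything reachable from it via has_a edges."""
    seen, stack = set(), [region]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(HAS_A.get(node, []))
    return seen


def protein_distribution(organism, protein, brain_region):
    entities = has_a_star(brain_region)
    # Step 2: select the matching measurements from the source.
    rows = [r for r in PROLAB
            if r["anatom"] in entities
            and r["protein_name"] == protein
            and r["organism"] == organism]
    # Step 3: each entity's total covers itself and all of its parts.
    return {
        entity: sum(r["protein_amount"] for r in rows if r["anatom"] in has_a_star(entity))
        for entity in entities
    }


distribution = protein_distribution("rat", "RyR", "cerebellum")
```

For this sample data the spiny branchlet's amount rolls up through the molecular layer, and together with the Purkinje cell layer it accounts for the total reported at the cerebellar cortex and cerebellum.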

Example Query Evaluation (II)

"How does the parallel fiber output (Yale/SENSELAB) relate to the distribution of Ryanodine Receptors (UCSD/NCMIR)?"

@SENSELAB: X1 := select output from parallel fiber;
@MEDIATOR: X2 := “hang off” X1 from Domain Map;
@MEDIATOR: X3 := subregion-closure(X2);
@NCMIR: X4 := select PROT-data(X3, Ryanodine Receptors);
@MEDIATOR: X5 := compute aggregate(X4);

Mediation Services: Client Registration

[Figure: client taxonomy]
Client
• Client kinds: Update Client, Query Client, Fat Result Viewer, Thin Result Viewer
• Update options: CheckData, MergeBeforeInsert, DeriveBeforeInsert
• Query options: Navigate/Ad-hoc, Query Capability, Query on Schema
• Result viewer options: Client-side Buffer, Client-side Processing, Send Full Data, Server-side Buffer, Context Sensitive, Server-Push/Client-Pull

Example Client: Query Formulation and Result Display

• combination of ad hoc and navigational queries
• client side visualization (left)
• results are shown in semantic context (right)
Mediation Services: Semantic Annotation Tools
line drawing ==annotation==> (spatial) database for mediation

Mediator Architecture Blueprint

Mediation Services

Mediator Layer (Optimizer, Model Reasoner, Deductive Engine)
Query formulation:
• user query
• integrated view definition
Query processing:
• view unfolding
• semantic optimization
• capability-based rewriting
Source registration:
• domain knowledge
• model & schema
• query & computation capabilities
Source model lifting:
• domain knowledge reconciliation
• model transformation

Wrapper Layer
Result delivery interface (up API):
• SDLIP, SOAP, ...
• pull (tuple/set-at-a-time, DOM) vs. push (stream)
• synchronous/asynchronous
• direct data/data reference
Query interface (down API):
• SDLIP, SOAP, ...
• (subsets of) SQL, X(ML)-Query, CPL, ...
• DOM
• SRB-based access

Sources: XML Sources, RDB Sources, File Sources, HTML Sources, Spatial Sources, Digital Libraries (Collections)
Sites and interfaces shown: Boston Univ., NCMIR UCSD, Yale Univ., Montana Univ.; SDLIP, ARCIMS

Coming up: Knowledge-Based/Semantic Mediation of Brain Data

[Figure: Knowledge-Based Mediation over the sources CCB, Montana SU; surface atlas, Van Essen Lab; NCMIR, UCSD; stereotaxic atlas, LONI; MCell, CNL, Salk. Other labels: ANATOM, PROTLOC, Result (VML/SVG), Result (XML/XSLT)]

Some Open Issues

• Data/Knowledge Modeling
– Extensibility: how to handle a source with new data types and operations?
  • Temporal Data: instrument readings, video microscopy
  • Spatial Data: integrating with spatial database systems
  • Image database systems
– Conflict Management
  • Grades of certainty
  • Alternative hypotheses
• Integrating Services
– Registration and warping of my image slice to a reference
• Integrating into Larger Applications
– M-Cell simulation
– Telemicroscopy
– Visualization