Realities in Science Data and Information - Let's go for translucency

AGU FM10 IN13B-02

Peter Fox (RPI) pfox@cs.rpi.edu Tetherless World Constellation

And the reality?

It’s about the questions that are being asked, e.g.

• When was the last sensor calibration and who did it, why was it done and where are the results?

• Exactly what physics routines went into this model run and how do I know this is the actual output it generated (and that it has not been altered)?

The ecosystem?

• These are what enable scientists or anyone to explore/confirm/deny their ‘hunches’ or get answers to direct questions…

Accountability

Proof

Explanation

Justification

Verifiability

‘Transparency’ (the illusion of it)

Trust

Provenance - Internal/External

Identity

Why an illusion?

• It’s not that the word transparency is wrong, it is what it is being used for:
  – “If I let you see everything, you can get answers to your questions”

• Nope, not unless you are very lucky…

• It depends on
  – Who is asking the question (and why)
  – What the answer will be used for
  – CONTEXT and ROLE (poorly represented)

But back to reality

Fragmentation

Disconnection

Encapsulation

Data as service

… all are bad for the questions that are being asked

So … translucency

• See just what is necessary and sufficient

• Practical definition
  – As close as possible to the relevant data, information and knowledge artifacts, presented in an appropriate form
  – Damn, yes, I mean curation

• Methodological means
  – Lenses (with filters, roles if possible)
  – Bags
  – Logic, i.e. rules
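
One possible reading of "lenses (with filters, roles if possible)" and "bags", as a minimal Python sketch. It assumes a bag is an unordered collection of artifacts and a lens is a function that shows a viewer only the items relevant to their role. The names `Artifact`, `make_lens`, the `audience` field, the role labels and the sample records are all hypothetical and do not come from the talk.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Artifact:
    """Hypothetical record for a data, information, or knowledge artifact."""
    name: str
    kind: str             # e.g. "calibration", "model-output"
    audience: frozenset   # roles this artifact is relevant to (assumed field)


Lens = Callable[[Iterable[Artifact]], List[Artifact]]


def make_lens(role: str, *filters: Callable[[Artifact], bool]) -> Lens:
    """Build a lens: keep artifacts relevant to `role` that pass every filter."""
    def lens(bag: Iterable[Artifact]) -> List[Artifact]:
        return [a for a in bag
                if role in a.audience and all(f(a) for f in filters)]
    return lens


# A "bag" here is just an unordered collection of artifacts (sample data).
bag = [
    Artifact("sensor-cal-log", "calibration", frozenset({"instrument-scientist"})),
    Artifact("run-config", "model-input", frozenset({"modeler", "reviewer"})),
    Artifact("run-output-checksum", "model-output", frozenset({"modeler", "reviewer"})),
]

# A reviewer who only wants to see model outputs.
reviewer_outputs = make_lens("reviewer", lambda a: a.kind == "model-output")
print([a.name for a in reviewer_outputs(bag)])
```

The same bag seen through a different lens gives a different, smaller view, which is the point of translucency: the viewer's role and question decide what is shown, not the size of the archive.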

Some of this is, er…

• Provenance - Origin or source from which something comes, intention for use, who/what it was generated for, manner of manufacture, history of subsequent owners, sense of place and time of manufacture, production or discovery, documented in detail sufficient to allow reproducibility

• Knowledge provenance; enrich with semantics (especially the relations between concepts previously isolated, and retaining context) and semantically-aware tools

Complexity (see IN43C-05)

And some …

• Identity
  – YOUR identity
  – Friends, organizations
  – Communities
  – Peer and non-peer relations

• Accountability
  – By whom, to whom
  – When and how often

• Documentation – are you happy, Ted?

We need a Knowledge Base

Access Control Essential For Establishing Trust

• Licensing

• Intellectual property

• Security/defence

• Endangered species

• Sensitive Data / Information

• Defining authorized access

Proof Markup Language

PML

• Justification
  – Explanation
  – Causality graph

• Provenance
  – Conclusion
  – Source
  – Engine
  – Rule

• Trust
  – Trust/Belief metrics

[Figure: PML graph. Nodes: NodeSet, Justification, Conclusion, Engine, Rule, SourceUsage, Source, DateTime. Edges: hasAntecedentList, hasSourceUsage, hasInferenceRule, hasInferenceEngine.]
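
The node and edge names on this slide can be mirrored in a few Python dataclasses, as a rough sketch only. This is not the PML ontology or any PML library API: the class layout, the `explain` helper, the engine and rule names, and the sample timestamp are assumptions made for illustration. Only the relation names in the comments come from the slide.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SourceUsage:
    source: str      # where the asserted information came from
    date_time: str   # when the source was used (an ISO 8601 string here)


@dataclass
class Justification:
    engine: Optional[str] = None                                 # hasInferenceEngine
    rule: Optional[str] = None                                   # hasInferenceRule
    antecedents: List["NodeSet"] = field(default_factory=list)   # hasAntecedentList
    source_usage: Optional[SourceUsage] = None                   # hasSourceUsage


@dataclass
class NodeSet:
    conclusion: str
    justification: Justification


def explain(node: NodeSet, depth: int = 0) -> List[str]:
    """Walk the justification graph; return one indented line per conclusion."""
    j = node.justification
    if j.source_usage is not None:
        how = f"asserted by {j.source_usage.source} at {j.source_usage.date_time}"
    else:
        how = f"inferred by {j.engine} using {j.rule}"
    lines = ["  " * depth + f"{node.conclusion} ({how})"]
    for antecedent in j.antecedents:
        lines.extend(explain(antecedent, depth + 1))
    return lines


# Hypothetical example: a directly asserted fact supporting an inferred one.
calibration = NodeSet(
    "sensor was calibrated",
    Justification(source_usage=SourceUsage("calibration log", "2010-11-01T09:00:00Z")),
)
reading = NodeSet(
    "reading is trustworthy",
    Justification(engine="example-engine",
                  rule="calibrated-implies-trustworthy",
                  antecedents=[calibration]),
)
print("\n".join(explain(reading)))
```

Walking the antecedent list is what turns a stored proof into an explanation: each conclusion is printed with either the source it was asserted from or the engine and rule that derived it.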

Open Provenance Model

• Agents
  – Catalyst and controlling entity of a process

• Processes
  – Action or series of actions performed resulting in new artifacts

• Artifacts
  – Immutable piece of state

• Roles
  – Non-semantic flat tags used to provide context in relations

[Figure: OPM graph. Nodes: Artifact, Process, Agent. Edges: used(Role), wasGeneratedBy(Role), wasControlledBy(Role), wasDerivedFrom(Role), wasTriggeredBy(Role).]
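
As a small illustration of how these relations support provenance questions, the sketch below stores OPM-style edges as plain tuples and collects everything upstream of an artifact. The `Edge` tuple, the `upstream` helper, and every node name and role tag are hypothetical. Only the relation names are taken from the slide, and no OPM toolkit is assumed.

```python
from typing import List, NamedTuple, Set


class Edge(NamedTuple):
    effect: str     # node the edge starts from
    relation: str   # one of the relation names shown on the slide
    cause: str      # node the edge points to
    role: str       # flat tag giving context, as described on the slide


# Hypothetical model-run example; all node names and roles are made up.
edges: List[Edge] = [
    Edge("model-run", "used", "input-grid", "boundary-conditions"),
    Edge("model-run", "wasControlledBy", "modeler", "operator"),
    Edge("raw-output", "wasGeneratedBy", "model-run", "result"),
    Edge("render", "used", "raw-output", "data"),
    Edge("plot", "wasGeneratedBy", "render", "figure"),
    Edge("plot", "wasDerivedFrom", "raw-output", "visualization"),
]


def upstream(node: str, graph: List[Edge]) -> Set[str]:
    """Return every node reachable by following edges from effect to cause."""
    seen: Set[str] = set()
    stack = [node]
    while stack:
        current = stack.pop()
        for e in graph:
            if e.effect == current and e.cause not in seen:
                seen.add(e.cause)
                stack.append(e.cause)
    return seen


# Everything the plot depends on: processes, artifacts, and the agent.
print(sorted(upstream("plot", edges)))
```

Because every edge points from an effect back to its cause, a question such as "what went into this plot, and who controlled the run?" becomes a simple graph traversal.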

E.g. Knowledge Base – see Zednik et al. IN43C-06

My suggestion(s)

• Accommodation of dynamic content in an open (web) environment (distrust)

• Filter/lens models and implementations in tools/applications

• Declarative semantics to formalize the meaning/terms and relations - progress

• Rules to define the combinations of evidence required - starting

• “In their face” end-user modeling – getting real use cases for presentation of ‘facts’
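
The suggestion about rules for combinations of evidence could look something like the following sketch. It assumes each question maps to a list of alternative evidence combinations, any one of which is taken to be sufficient. The `RULES` table, the question names and the evidence labels are invented for illustration and are not from the talk.

```python
from typing import Dict, FrozenSet, List, Set

# Hypothetical rules: each question lists alternative combinations of evidence,
# any one of which is assumed to be sufficient.
RULES: Dict[str, List[FrozenSet[str]]] = {
    "output-unaltered": [
        frozenset({"checksum", "signed-run-log"}),
        frozenset({"rerun-matches"}),
    ],
    "sensor-calibrated": [
        frozenset({"calibration-record", "operator-identity"}),
    ],
}


def satisfied(question: str, evidence: Set[str]) -> bool:
    """True if the available evidence covers at least one required combination."""
    return any(combo <= evidence for combo in RULES.get(question, []))


print(satisfied("output-unaltered", {"checksum"}))
print(satisfied("output-unaltered", {"checksum", "signed-run-log"}))
```

Keeping the rules as data rather than code means a different community or role can swap in its own table of required evidence without touching the checking logic.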