Visual Analytics in Omics: why, what, how?

description

Presentation given at VisBio workshop in Bergen, Norway.

Transcript of Visual Analytics in Omics: why, what, how?

Page 1: Visual Analytics in Omics: why, what, how?

Visual Analytics in omics - why, what, how?

Prof Jan Aerts

STADIUS - ESAT, Faculty of Engineering, University of Leuven, Belgium

Data Visualization Lab

[email protected]@datavislab.org

creativecommons.org/licenses/by-nc/3.0/

Page 2: Visual Analytics in Omics: why, what, how?

• What problem are we trying to solve?

• What is Visual Analytics and how can it help?

• How do we actually do this?

• Some examples

• Challenges

Page 3: Visual Analytics in Omics: why, what, how?

A. So what’s the problem?

Page 4: Visual Analytics in Omics: why, what, how?

Scientific Research Paradigms (Jim Gray, Microsoft)

• 1st: thousands of years ago, empirical

• 2nd: hundreds of years ago, theoretical

• 3rd: last few decades, computational

• 4th: today, data exploration

hypothesis-driven -> data-driven

• I have a hypothesis -> need to generate data to (dis)prove it.

• I have data -> need to find hypotheses that I can test.

Page 5: Visual Analytics in Omics: why, what, how?

What does this mean?

• immense re-use of existing datasets

• much of initial analysis is exploratory in nature => what’s my hypothesis?

• biologically interesting signals may be too poorly understood to be analyzed in automated fashion

• visualization is very effective in facilitating human reasoning about complex data

• automated algorithms often act as black boxes => biologists must have blind faith in the bioinformatician (and the bioinformatician in his/her own skills)

Page 6: Visual Analytics in Omics: why, what, how?

Opening the black box

[Figure: pipeline diagram with the nodes input, filter 1, filter 2, filter 3, output A, output B and output]
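A minimal Python sketch of what opening the black box could mean for a filtering pipeline: every intermediate result is kept, so it can be inspected or plotted instead of only the final output. The record fields, thresholds and the helper `run_pipeline` are hypothetical and only serve as illustration.

```python
def run_pipeline(records, filters):
    """Apply named filters in order and keep every intermediate result."""
    stages = [("input", list(records))]
    for name, keep in filters:
        previous = stages[-1][1]
        stages.append((name, [r for r in previous if keep(r)]))
    return stages


# Hypothetical variant records; fields and thresholds are assumptions.
variants = [
    {"id": "v1", "quality": 50, "depth": 30},
    {"id": "v2", "quality": 10, "depth": 40},
    {"id": "v3", "quality": 60, "depth": 5},
    {"id": "v4", "quality": 45, "depth": 25},
]
filters = [
    ("filter 1: quality >= 30", lambda r: r["quality"] >= 30),
    ("filter 2: depth >= 10", lambda r: r["depth"] >= 10),
]

stages = run_pipeline(variants, filters)
for name, kept in stages:
    # Showing what survives each step is what makes the pipeline inspectable.
    print(f"{name}: {len(kept)} records -> {[r['id'] for r in kept]}")
```

Each entry in `stages` could feed a small plot, so the effect of a single filter is visible before trusting the end result.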

Page 7: Visual Analytics in Omics: why, what, how?

[Figure: panels A, B and C]

Page 8: Visual Analytics in Omics: why, what, how?

[Figure: panels A, B and C]

Page 9: Visual Analytics in Omics: why, what, how?

[Figure: panels A, B and C]

Page 10: Visual Analytics in Omics: why, what, how?

What’s my hypothesis?

Martin Krzywinski

Page 11: Visual Analytics in Omics: why, what, how?

Martin Krzywinski

Page 12: Visual Analytics in Omics: why, what, how?

Martin Krzywinski

Page 13: Visual Analytics in Omics: why, what, how?

B. What is Visual Analytics and how can it help?

Page 14: Visual Analytics in Omics: why, what, how?

Page 15: Visual Analytics in Omics: why, what, how?

What is visualization?

T. Munzner

Page 16: Visual Analytics in Omics: why, what, how?

What is visualization?

T. Munzner

cognition <=> perception

cognitive task => perceptual task

Page 17: Visual Analytics in Omics: why, what, how?

Why do we visualize data?

• record information

  • blueprints, photographs, seismographs, ...

• analyze data to support reasoning

  • develop & assess hypotheses

  • discover errors in data

  • expand memory

  • find patterns (see Snow’s cholera map)

• communicate information

  • share & persuade

  • collaborate & revise

Page 18: Visual Analytics in Omics: why, what, how?

pictorial superiority effect

[Figure: the word “information” after 72hr, reduced to “informa” (65%) and “i” (1%)]

Page 19: Visual Analytics in Omics: why, what, how?

Stevens’s psychophysical law

= proposed relationship between the magnitude of a physical stimulus and its perceived intensity or strength
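The law is usually written as a power function, perceived = k * intensity^a, where the exponent a depends on the type of stimulus. The Python sketch below shows the consequence for two visual channels. The exponent values (about 1.0 for length, about 0.7 for area) are commonly quoted approximations and should be treated as assumptions here, as should the function name.

```python
def perceived_magnitude(intensity, exponent, k=1.0):
    """Power-law form: perceived = k * intensity ** exponent."""
    return k * intensity ** exponent


# Approximate exponents, assumed for illustration only.
EXPONENTS = {"length": 1.0, "area": 0.7}

for channel, a in EXPONENTS.items():
    ratio = perceived_magnitude(2.0, a) / perceived_magnitude(1.0, a)
    print(f"{channel}: doubling the stimulus is perceived as x{ratio:.2f}")
```

With an exponent below 1, as assumed for area, differences are systematically underestimated, which is one reason to prefer length or position for quantitative values.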

Page 20: Visual Analytics in Omics: why, what, how?

Accuracy of quantitative perceptual tasks

Mackinlay

what/where (qualitative)

how much (quantitative)

Page 21: Visual Analytics in Omics: why, what, how?

Accuracy of quantitative perceptual tasks

Mackinlay

what/where (qualitative)

how much (quantitative)

Page 22: Visual Analytics in Omics: why, what, how?

Accuracy of quantitative perceptual tasks

Mackinlay

“power of the plane”

what/where (qualitative)

how much (quantitative)
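A rough Python sketch of how such a ranking can guide an encoding choice. The lists are the top of the rankings usually attributed to Mackinlay (1986), written from memory, so treat their exact order as an assumption and check the original figure. The greedy helper `assign_channels`, the example fields, and the simplification of position to a single channel are hypothetical.

```python
# Top of the per-data-type channel rankings (assumed, see lead-in).
RANKING = {
    "quantitative": ["position", "length", "angle", "slope", "area"],
    "ordinal": ["position", "density", "color saturation", "color hue", "texture"],
    "nominal": ["position", "color hue", "texture", "connection", "containment"],
}


def assign_channels(fields):
    """Greedily give each field the best-ranked channel that is still free."""
    used = set()
    assignment = {}
    for name, data_type in fields:
        for channel in RANKING[data_type]:
            if channel not in used:
                assignment[name] = channel
                used.add(channel)
                break
        else:
            raise ValueError(f"no free channel left for {name!r}")
    return assignment


# Hypothetical fields, listed from most to least important.
fields = [
    ("expression level", "quantitative"),
    ("tissue", "nominal"),
    ("fold change", "quantitative"),
]
assignment = assign_channels(fields)
for name, channel in assignment.items():
    print(f"{name} -> {channel}")
```

Position heads every list, which is one way to read “power of the plane”: the most important field should get the spatial layout.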

Page 23: Visual Analytics in Omics: why, what, how?

Pre-attentive vision

= ability of low-level human visual system to rapidly identify certain basic visual properties

• some features “pop out”

• used for:

  • target detection

  • boundary detection

  • counting/estimation

  • ...

• visual system takes over => all cognitive power available for interpreting the figure, rather than needing part of it for processing the figure

Page 24: Visual Analytics in Omics: why, what, how?

Page 25: Visual Analytics in Omics: why, what, how?

Page 26: Visual Analytics in Omics: why, what, how?

Limitations of preattentive vision

1. Combining pre-attentive features does not always work => would need to resort to “serial search” (most channel pairs; all channel triplets), e.g. is there a red square in this picture?

2. Speed depends on which channel is used (use one that is good for categorical data; see “accuracy”)
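A toy Python model of the first limitation, not a model of human vision: a target is assumed to pop out only if it differs from every distractor in at least one single channel. In the red-square example with red circles and blue squares, no single channel separates the target, so the items have to be scanned one by one. The function name and the scenes are hypothetical.

```python
def popout_channels(target, distractors):
    """Channels in which the target differs from every distractor."""
    return [
        channel
        for channel in target
        if all(d[channel] != target[channel] for d in distractors)
    ]


red_square = {"color": "red", "shape": "square"}

# Single-feature case: the target is the only red item.
easy_scene = [
    {"color": "blue", "shape": "square"},
    {"color": "blue", "shape": "circle"},
]

# Conjunction case: each distractor shares one feature with the target.
hard_scene = [
    {"color": "red", "shape": "circle"},
    {"color": "blue", "shape": "square"},
]

print(popout_channels(red_square, easy_scene))  # ['color']
print(popout_channels(red_square, hard_scene))  # []
```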

Page 27: Visual Analytics in Omics: why, what, how?

Gestalt laws - interplay between parts and the whole

Page 28: Visual Analytics in Omics: why, what, how?

Gestalt laws - interplay between parts and the whole

• simplicity

• proximity

• similarity

• connectedness

• good continuation

• common fate

• familiarity

• symmetry

Page 29: Visual Analytics in Omics: why, what, how?

Context affects perceptual tasks

Page 30: Visual Analytics in Omics: why, what, how?

C. How do we actually do this?

Page 31: Visual Analytics in Omics: why, what, how?

Talking to domain experts

Page 32: Visual Analytics in Omics: why, what, how?

Data visualization framework

Page 33: Visual Analytics in Omics: why, what, how?

Card sorting

Page 34: Visual Analytics in Omics: why, what, how?

Tools of the trade

Page 35: Visual Analytics in Omics: why, what, how?

Processing - http://processing.org

• Java

Page 36: Visual Analytics in Omics: why, what, how?

D3 - http://d3js.org/

• JavaScript

Page 37: Visual Analytics in Omics: why, what, how?

Vega - https://github.com/trifacta/vega/wiki

• HTML + JSON

Page 38: Visual Analytics in Omics: why, what, how?

To use Vega

• Create the JSON file

• Create the index.html

• Run “python -m SimpleHTTPServer” (Python 2; with Python 3 use “python -m http.server”)

• Go to http://127.0.0.1:8000/index.html

• Get help at https://github.com/trifacta/vega/wiki
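The steps above can be sketched in Python 3 using only the standard library. This is an illustration under assumptions: the file name `spec.json`, the toy data, and the helpers `write_files` and `make_server` are hypothetical. The specification is deliberately incomplete (scales, axes and marks are left out), and the HTML page contains only a placeholder where Vega would be loaded, because those parts should follow the Vega wiki.

```python
import functools
import http.server
import json
import tempfile
from pathlib import Path


def write_files(directory):
    """Write a skeleton spec.json and index.html into the given directory."""
    directory = Path(directory)
    spec = {
        "width": 400,
        "height": 200,
        "data": [
            {"name": "table", "values": [{"x": "A", "y": 28}, {"x": "B", "y": 55}]}
        ],
        # scales, axes and marks are omitted here; see the Vega wiki.
    }
    (directory / "spec.json").write_text(json.dumps(spec, indent=2))
    (directory / "index.html").write_text(
        "<!DOCTYPE html>\n<html><body>\n"
        '<div id="vis"></div>\n'
        "<!-- load Vega and render spec.json here, following the Vega wiki -->\n"
        "</body></html>\n"
    )
    return spec


def make_server(directory, port=8000):
    """Build (but do not start) a local static file server for the directory."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=str(directory)
    )
    return http.server.HTTPServer(("127.0.0.1", port), handler)


# Demo: write the two files into a temporary directory and list them.
demo_dir = Path(tempfile.mkdtemp())
spec = write_files(demo_dir)
print(sorted(p.name for p in demo_dir.iterdir()))
```

Calling `make_server(demo_dir).serve_forever()` afterwards plays the role of the “python -m SimpleHTTPServer” step and serves the page at http://127.0.0.1:8000/index.html.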

Page 39: Visual Analytics in Omics: why, what, how?

D. Examples

Page 40: Visual Analytics in Omics: why, what, how?

HiTSee

Bertini E et al. IEEE Symposium on Biological Data Visualization (2011)

Page 41: Visual Analytics in Omics: why, what, how?

Aracari

Ryo Sakai

Bartlett C et al. BMC Bioinformatics (2012)

Page 42: Visual Analytics in Omics: why, what, how?

Meander

Pavlopoulos et al. Nucl Acids Res (2013)

Georgios Pavlopoulos

Page 43: Visual Analytics in Omics: why, what, how?

ParCoord

Boogaerts T et al. IEEE International Conference on Bioinformatics & Bioengineering (2012)

Thomas Boogaerts

Endeavour gene prioritization

Page 44: Visual Analytics in Omics: why, what, how?

Data filtering (visual parameter setting)

TrioVis

Ryo Sakai

Sakai R et al. Bioinformatics (2013)

Page 45: Visual Analytics in Omics: why, what, how?

User-guided analysis

Spark

Nielsen et al. Genome Research (2012)

[Figure labels: clustering; chromatin modification; DNA methylation; RNA-Seq; data samples; regions of interest]

Page 46: Visual Analytics in Omics: why, what, how?

Bret Victor - Ladder of abstraction

Page 47: Visual Analytics in Omics: why, what, how?

E. Challenges

Page 48: Visual Analytics in Omics: why, what, how?

Many challenges remain

• scalability (data processing + perception), uncertainty, “interestingness”, interaction, evaluation

• infrastructure & architecture

• fast imprecise answers with progressive refinement

• incremental re-computation

• steering computation towards data regions of interest
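As a small illustration of “fast imprecise answers with progressive refinement” and “incremental re-computation”, the Python sketch below estimates a mean chunk by chunk: a first rough answer is available after the first chunk, and each later chunk only updates running totals instead of recomputing from scratch. The data are simulated and the function name is hypothetical. The approach assumes the values arrive in random order, otherwise early estimates can be biased.

```python
import random


def progressive_mean(values, chunk_size):
    """Yield (items_seen, running_mean) after each processed chunk."""
    total = 0.0
    seen = 0
    for start in range(0, len(values), chunk_size):
        chunk = values[start:start + chunk_size]
        # Incremental update: only the new chunk is touched.
        total += sum(chunk)
        seen += len(chunk)
        yield seen, total / seen


random.seed(0)
# Hypothetical per-position coverage values.
coverage = [random.gauss(30, 5) for _ in range(10_000)]

estimates = list(progressive_mean(coverage, chunk_size=1_000))
for seen, mean in estimates:
    print(f"after {seen:>6} values: mean ~ {mean:.2f}")
```

In an interactive tool, each intermediate estimate could be drawn immediately, so the user can decide early whether a region deserves the full computation.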

Page 49: Visual Analytics in Omics: why, what, how?

Thank you

• Georgios Pavlopoulos

• Ryo Sakai

• Thomas Boogaerts

• Data Visualization Lab (datavislab.org)

• Erik Duval

• Andrew Vande Moere
