Actionable data in life sciences
![Page 1: Actionable data in life sciences](https://reader033.fdocuments.us/reader033/viewer/2022052706/58edb9ce1a28ab3a0b8b45d3/html5/thumbnails/1.jpg)
Jorge Bouças, Bioinformatics Core Facility, MPI-AGE, Köln, Sunday 11 December 16
Actionable data in life sciences
![Page 2: Actionable data in life sciences](https://reader033.fdocuments.us/reader033/viewer/2022052706/58edb9ce1a28ab3a0b8b45d3/html5/thumbnails/2.jpg)
Performance
Timeline from request for data analysis to reply with results:
• background / scientific question
• metadata collection
• data transfer
• data analysis
• validation
• data transfer
![Page 3: Actionable data in life sciences](https://reader033.fdocuments.us/reader033/viewer/2022052706/58edb9ce1a28ab3a0b8b45d3/html5/thumbnails/3.jpg)
• No build test
• No integration test
• Tailor-cut validation
![Page 4: Actionable data in life sciences](https://reader033.fdocuments.us/reader033/viewer/2022052706/58edb9ce1a28ab3a0b8b45d3/html5/thumbnails/4.jpg)
• structured
• in place
• actionable
• 24/7
![Page 5: Actionable data in life sciences](https://reader033.fdocuments.us/reader033/viewer/2022052706/58edb9ce1a28ab3a0b8b45d3/html5/thumbnails/5.jpg)
Performance
☑ Network
☑ Storage
☑ CPUs
☑ Memory
☑ Software
☑ Algorithms
☐ Human
![Page 6: Actionable data in life sciences](https://reader033.fdocuments.us/reader033/viewer/2022052706/58edb9ce1a28ab3a0b8b45d3/html5/thumbnails/6.jpg)
“Only 8.3 percent of positions for computer scientists can be filled without difficulty.”
http://www.golem.de
![Page 7: Actionable data in life sciences](https://reader033.fdocuments.us/reader033/viewer/2022052706/58edb9ce1a28ab3a0b8b45d3/html5/thumbnails/7.jpg)
[Figure: Data Science Venn diagram. Circles: Computer Science, Math & Statistics, Subject Matter Expertise / biology. Overlaps: Machine Learning, Trad. Software, Trad. Research; Unicorn at the center.]
Copyright 2014 by Steven Geringer Raleigh, NC. Permission is granted to use, distribute, or modify this image, provided that this copyright remains intact
![Page 8: Actionable data in life sciences](https://reader033.fdocuments.us/reader033/viewer/2022052706/58edb9ce1a28ab3a0b8b45d3/html5/thumbnails/8.jpg)
“… It appears that the development of effective human cooperation and the development of man-computer symbiosis are "chicken-and-egg" problems. It will take unusual human teamwork to set up a truly workable man-computer partnership, and it will take man-computer partnerships to engender and facilitate the human cooperation. …if the required solutions are not ready, it would not be good to wait for them.”
Licklider JCR, Clark WE, On-line man-computer communication, Proceedings of the May 1-3, 1962, Spring Joint Computer Conference
![Page 9: Actionable data in life sciences](https://reader033.fdocuments.us/reader033/viewer/2022052706/58edb9ce1a28ab3a0b8b45d3/html5/thumbnails/9.jpg)
“On-line man-computer communication”
HPC
git
datashare
![Page 10: Actionable data in life sciences](https://reader033.fdocuments.us/reader033/viewer/2022052706/58edb9ce1a28ab3a0b8b45d3/html5/thumbnails/10.jpg)
Berlin
Garching
Köln
![Page 11: Actionable data in life sciences](https://reader033.fdocuments.us/reader033/viewer/2022052706/58edb9ce1a28ab3a0b8b45d3/html5/thumbnails/11.jpg)
TAPE
in-house
curl / wget md5sum
bit -g
www
rsync
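The slide pairs download tools (curl / wget) with md5sum, which suggests checking a file against a published checksum after transfer. A minimal sketch of that check, assuming the source publishes a hex MD5 string; the function names `md5_of_file` and `transfer_is_intact` are hypothetical and not part of `bit`:

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def transfer_is_intact(path, expected_md5):
    """Compare a local file's MD5 with the checksum published by the source.

    Assumption: expected_md5 is a hex string such as the first column
    printed by the md5sum command.
    """
    return md5_of_file(path) == expected_md5.strip().lower()
```

Reading in chunks keeps memory use flat, which matters for the multi-gigabyte files mentioned later in the talk.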
![Page 12: Actionable data in life sciences](https://reader033.fdocuments.us/reader033/viewer/2022052706/58edb9ce1a28ab3a0b8b45d3/html5/thumbnails/12.jpg)
results 8kb .. 8gb
private link 21d public link
write upload log on wiki with perma links
push code
https://to.data
bit -i <myfile.txt> -m <code and data message>
customer
![Page 13: Actionable data in life sciences](https://reader033.fdocuments.us/reader033/viewer/2022052706/58edb9ce1a28ab3a0b8b45d3/html5/thumbnails/13.jpg)
Binding of Results & Code
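One way to picture this binding is a single log record that carries the message, the code commit, and a checksum for every uploaded result file. The sketch below is an illustration under assumptions only: the record layout, the function name `build_upload_record`, and the use of MD5 are hypothetical and are not taken from `bit` itself.

```python
import hashlib
import json


def build_upload_record(files, message, commit_id):
    """Build one log entry tying uploaded result files to a code commit.

    files: mapping of file name -> file content as bytes (hypothetical shape).
    message: the shared "code and data" message.
    commit_id: identifier of the commit that produced the results.
    """
    return {
        "message": message,
        "commit": commit_id,
        # One checksum per file, so a result can later be matched to this entry.
        "files": {
            name: hashlib.md5(content).hexdigest()
            for name, content in sorted(files.items())
        },
    }


record = build_upload_record(
    {"myfile.txt": b"example results"},  # placeholder content
    "code and data message",
    "abc123",  # placeholder commit identifier
)
print(json.dumps(record, indent=2))
```

Because the same message and commit identifier travel with the file checksums, a reader of the upload log can find the code that produced a given result.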
![Page 14: Actionable data in life sciences](https://reader033.fdocuments.us/reader033/viewer/2022052706/58edb9ce1a28ab3a0b8b45d3/html5/thumbnails/14.jpg)
> 30 projects / 3 analysts
1 project:
• > 1000 GB data
• > 1000 files
• > 1000 lines of code (with dependencies)
• > 10-40 change actions
![Page 15: Actionable data in life sciences](https://reader033.fdocuments.us/reader033/viewer/2022052706/58edb9ce1a28ab3a0b8b45d3/html5/thumbnails/15.jpg)
![Page 16: Actionable data in life sciences](https://reader033.fdocuments.us/reader033/viewer/2022052706/58edb9ce1a28ab3a0b8b45d3/html5/thumbnails/16.jpg)
bit --start <DP_project_name>
![Page 17: Actionable data in life sciences](https://reader033.fdocuments.us/reader033/viewer/2022052706/58edb9ce1a28ab3a0b8b45d3/html5/thumbnails/17.jpg)
![Page 18: Actionable data in life sciences](https://reader033.fdocuments.us/reader033/viewer/2022052706/58edb9ce1a28ab3a0b8b45d3/html5/thumbnails/18.jpg)
bit -i <myfile.txt> -m <code and data message>
![Page 19: Actionable data in life sciences](https://reader033.fdocuments.us/reader033/viewer/2022052706/58edb9ce1a28ab3a0b8b45d3/html5/thumbnails/19.jpg)
![Page 20: Actionable data in life sciences](https://reader033.fdocuments.us/reader033/viewer/2022052706/58edb9ce1a28ab3a0b8b45d3/html5/thumbnails/20.jpg)
bit -c <folder_to_create>
![Page 21: Actionable data in life sciences](https://reader033.fdocuments.us/reader033/viewer/2022052706/58edb9ce1a28ab3a0b8b45d3/html5/thumbnails/21.jpg)
bit -g <folder_or_file_to_download>
![Page 22: Actionable data in life sciences](https://reader033.fdocuments.us/reader033/viewer/2022052706/58edb9ce1a28ab3a0b8b45d3/html5/thumbnails/22.jpg)
HPC HPC2
bit --sync <folder_or_file_to_sync> --sync_to <Uname@HPC2>
![Page 23: Actionable data in life sciences](https://reader033.fdocuments.us/reader033/viewer/2022052706/58edb9ce1a28ab3a0b8b45d3/html5/thumbnails/23.jpg)
bit --sync <folder_or_file_to_sync> --sync_from <Uname@HPC2>
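The two sync directions differ only in which side is the source. The sketch below builds an rsync command line for either direction. It assumes that syncing between HPC systems is done with rsync (rsync appears on an earlier slide, but the talk does not confirm what `--sync` runs); the function name `build_rsync_command` and its parameters are hypothetical.

```python
def build_rsync_command(local_path, remote_host, remote_path, direction):
    """Return an rsync argument list for syncing to or from a remote host.

    direction: "to" copies local -> remote, "from" copies remote -> local.
    Assumption: rsync over SSH with archive (-a), verbose (-v) and
    compression (-z) options; the real tool may use different options.
    """
    remote = f"{remote_host}:{remote_path}"
    if direction == "to":
        source, target = local_path, remote
    elif direction == "from":
        source, target = remote, local_path
    else:
        raise ValueError("direction must be 'to' or 'from'")
    return ["rsync", "-avz", source, target]


# The list can be passed to subprocess.run(); here it is only printed.
print(build_rsync_command("results/", "Uname@HPC2", "project/results/", "to"))
```

Returning a list rather than a shell string avoids quoting problems with file names that contain spaces.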
![Page 24: Actionable data in life sciences](https://reader033.fdocuments.us/reader033/viewer/2022052706/58edb9ce1a28ab3a0b8b45d3/html5/thumbnails/24.jpg)
HPC git
bit --adduser
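The commands shown on the preceding slides can be summarised as one command-line interface. The sketch below mirrors only the flags that appear in the slides (`--start`, `-i`, `-m`, `-c`, `-g`, `--sync`, `--sync_to`, `--sync_from`, `--adduser`) using `argparse`. It is not the actual `bit` source: the destination names, the number of values each flag accepts, and the mutual exclusion of the two sync directions are assumptions.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the flags shown in the slides (assumed semantics)."""
    parser = argparse.ArgumentParser(prog="bit")
    parser.add_argument("--start", metavar="DP_project_name",
                        help="start a project")
    parser.add_argument("-i", dest="input_files", nargs="+", metavar="FILE",
                        help="files to upload")
    parser.add_argument("-m", dest="message",
                        help="message shared by the code commit and the data upload")
    parser.add_argument("-c", dest="create_folder", metavar="FOLDER",
                        help="folder to create")
    parser.add_argument("-g", dest="get", nargs="+", metavar="PATH",
                        help="folder or file to download")
    parser.add_argument("--sync", nargs="+", metavar="PATH",
                        help="folder or file to sync")
    # Assumption: a sync goes in exactly one direction per call.
    direction = parser.add_mutually_exclusive_group()
    direction.add_argument("--sync_to", metavar="USER@HOST")
    direction.add_argument("--sync_from", metavar="USER@HOST")
    parser.add_argument("--adduser", action="store_true",
                        help="add a user to the project")
    return parser


args = build_parser().parse_args(["-i", "myfile.txt", "-m", "code and data message"])
print(args.input_files, args.message)
```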
![Page 25: Actionable data in life sciences](https://reader033.fdocuments.us/reader033/viewer/2022052706/58edb9ce1a28ab3a0b8b45d3/html5/thumbnails/25.jpg)
Garching HPC
Köln HPC
git
datashare
Berlin
Garching
results 8kb .. 8gb
private link 21d public link
write upload log on wiki with perma links
push code
https://to.data customer
user1
user2
user3
pull code
![Page 26: Actionable data in life sciences](https://reader033.fdocuments.us/reader033/viewer/2022052706/58edb9ce1a28ab3a0b8b45d3/html5/thumbnails/26.jpg)
github.com/owncloud/pyocclient
datashare
Garching
results 8kb .. 8gb
private link 21d public link
![Page 27: Actionable data in life sciences](https://reader033.fdocuments.us/reader033/viewer/2022052706/58edb9ce1a28ab3a0b8b45d3/html5/thumbnails/27.jpg)
REST API
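As a rough illustration of how pyocclient can upload a result file and return a share link, the sketch below follows the basic usage shown in the pyocclient README (`Client`, `login`, `put_file`, `share_file_with_link`, `get_link`). The wrapper name `upload_and_share`, the server URL, and the credentials are placeholders, and the remote folder is assumed to exist already.

```python
def upload_and_share(server_url, user, password, local_file, remote_path):
    """Upload one file to an ownCloud server and return a public share link.

    Assumptions: the pyocclient package is installed, the remote folder
    already exists, and link sharing is enabled on the server.
    """
    import owncloud  # provided by pyocclient; imported here so the sketch loads without it

    client = owncloud.Client(server_url)
    client.login(user, password)
    client.put_file(remote_path, local_file)
    share = client.share_file_with_link(remote_path)
    return share.get_link()


# Example call with placeholder values (requires a reachable server):
# link = upload_and_share("https://datashare.example.org", "user", "password",
#                         "myfile.txt", "results/myfile.txt")
```

How long the returned link stays valid depends on how the share and the server are configured and is not covered by this sketch.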
![Page 28: Actionable data in life sciences](https://reader033.fdocuments.us/reader033/viewer/2022052706/58edb9ce1a28ab3a0b8b45d3/html5/thumbnails/28.jpg)
![Page 29: Actionable data in life sciences](https://reader033.fdocuments.us/reader033/viewer/2022052706/58edb9ce1a28ab3a0b8b45d3/html5/thumbnails/29.jpg)
![Page 30: Actionable data in life sciences](https://reader033.fdocuments.us/reader033/viewer/2022052706/58edb9ce1a28ab3a0b8b45d3/html5/thumbnails/30.jpg)
Why?
ownCloud: HTTP temporary link >> download. Simplicity.
Github
“With statement-by-statement compiling and testing and with computer-aided book-keeping and program integration, a few very talented men may be able to handle in weeks programming tasks that ordinarily require many people and many months.”
Licklider JCR, Clark WE, On-line man-computer communication, Proceedings of the May 1-3, 1962, Spring Joint Computer Conference
ownCloud + Github data & metadata management
![Page 31: Actionable data in life sciences](https://reader033.fdocuments.us/reader033/viewer/2022052706/58edb9ce1a28ab3a0b8b45d3/html5/thumbnails/31.jpg)
Front-end
![Page 32: Actionable data in life sciences](https://reader033.fdocuments.us/reader033/viewer/2022052706/58edb9ce1a28ab3a0b8b45d3/html5/thumbnails/32.jpg)
“Back-end”
register
http://www.mpcdf.mpg.de/userspace/forms/onlineregistrationform
Sys. Admin. (MPI-AGE)
Github (MPI-MOLGEN)
user
![Page 33: Actionable data in life sciences](https://reader033.fdocuments.us/reader033/viewer/2022052706/58edb9ce1a28ab3a0b8b45d3/html5/thumbnails/33.jpg)
bit
![Page 34: Actionable data in life sciences](https://reader033.fdocuments.us/reader033/viewer/2022052706/58edb9ce1a28ab3a0b8b45d3/html5/thumbnails/34.jpg)
[b]ermuda [i]nformation [t]riangle
github.com/mpg-age-bioinformatics/AGEpy