Providing National Cyberinfrastructure to Biologists, esp. Genomicists. William K. Barnett, Ph.D....

Post on 21-Jan-2016


Providing National Cyberinfrastructure to Biologists, esp. Genomicists.

William K. Barnett, Ph.D. (Director)

Thomas G. Doak (Manager & Domain Biologist)

National Center for Genome Analysis Support

Lab of Mike Lynch

Thank You

OSG Team

Rob Quick and Soichi Hayashi

Questions?

Bill Barnett (barnettw@iu.edu)

Le-Shin Wu (lewu@iu.edu)

Carrie Ganote (cganote@iu.edu)

Tom Doak (tdoak@iu.edu)

help@ncgas.org

An outline:

• The science and research NCGAS addresses

• What tools and infrastructure NCGAS provides to researchers

• What is the near- to mid-term future of bioinformatics research

Genomics

Proteomics

Transcriptomics

MetaGenomics

MetaProteomics

MetaTranscriptomics

‘omics is expanding to include everything

then Population Genomics, etc. …

Cost per Genome

03/23/2015 http://omicsmaps.com/

Making it easier for Biologists

• Web interface to NCGAS resources

• Supports many bioinformatics tools

• Available for both research and instruction.

[Figure: Computational Skills, LOW to HIGH; labels: Common, Rare]

Researchers must balance cost, ease, and availability.

NCGAS’s primary goals:

• Provide bioinformatics expertise

• Maintain a curated set of applications

• Provide access to HPC resources, esp. large-memory clusters = Mason

• Build Galaxy instances for our software

• Pursue outreach to biologists

NCGAS is embedded in Research Technologies


THE FACTS

• 16 nodes, 500GB RAM

• 10TB project space

• Bioinformatics software

• Galaxy instance

• 50TB archive space/user

The fine print

We ask that you acknowledge our grant in any published work that uses our resources.

Collaborations and authorship are requested for intellectual contributions.

[Figure: Mason and NCGAS use over time; series: Users, Mason Use]

CASE STUDY

Suspect: Horned Dung Beetle

Scientific Name: Onthophagus taurus, O. sagittarius, and O. nigriventris

Wanted for: Nutrition, metabolism, and horn development.

Warning! Subject may be armed with horns which “vary in size, number, position on the body, and degree of sexual dimorphism”. Rapidly evolving genes in three closely related species may be implicated in the diversity of these structures. Suspects’ genomes are under current investigation for strong signals of selection.

PI: Melissa Pespeni (lab of Armin Moczek)

• Our role in Melissa’s research

We recommended assembly procedures and Unix commands: when and how to concatenate data sets together to retrieve the desired information

We wrote customized scripts to get the data into the format required by the programs requested

We troubleshot issues with the system that were beyond user experience

We assisted with the data moving process and advised steps to address data corruption and failures

We added new users to the project and brought them up to speed on the project and on Unix

…With a smile
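As an illustration of the "concatenate data sets" advice above, here is a minimal sketch, not the scripts NCGAS actually wrote. It assumes the data sets are gzip-compressed FASTQ files, which the slides do not state, and the function names are hypothetical. It relies on the fact that gzip files joined byte for byte form a valid multi-member gzip stream, so no decompression is needed to merge them.

```python
import gzip
import shutil
from pathlib import Path


def concatenate_files(parts, destination):
    """Append each file in `parts`, in order, to `destination` byte for byte.

    Hypothetical helper: works for plain text and for gzip files alike,
    because concatenated gzip members are still a valid gzip stream.
    """
    destination = Path(destination)
    with destination.open("wb") as out:
        for part in parts:
            with Path(part).open("rb") as src:
                shutil.copyfileobj(src, out)
    return destination


def count_fastq_records(path):
    """Count records in a gzip-compressed FASTQ file (assumes 4 lines per record)."""
    with gzip.open(path, "rt") as handle:
        line_count = sum(1 for _ in handle)
    return line_count // 4
```

Comparing the record count of the merged file with the sum over the inputs is a cheap check that nothing was truncated during the merge.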

GALAXY.NCGAS.ORG Model

Virtual box hosting Galaxy.ncgas.org

The host for each tool is configured individually

Quarry

Mason

Data Capacitor

Archive

NCGAS establishes tools, hardens them, and moves them into production.

Custom Galaxy tools can be made for moving data

Individual projects can get duplicate boxes – provided they support it themselves.

Policies on the Data Capacitor (DC) guarantee that untouched data is removed over time.
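Because untouched data on the DC is removed over time, a user may want to know which files have gone stale. The sketch below is one way to list them; it is not the DC's actual purge mechanism. The slides do not say how "untouched" is measured or how long the grace period is, so the access/modification-time test and the `max_age_days` threshold are assumptions, and the function name is hypothetical.

```python
import os
import time
from pathlib import Path


def find_untouched_files(root, max_age_days, now=None):
    """Return files under `root` not read or modified in `max_age_days` days.

    Assumption: "untouched" means both access time and modification time
    are older than the cutoff. The real policy may use different criteria.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            info = path.lstat()  # lstat so broken symlinks do not raise
            last_touched = max(info.st_atime, info.st_mtime)
            if last_touched < cutoff:
                stale.append(path)
    return sorted(stale)
```

Files the function reports are candidates to copy to the archive before they are removed.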

Simplify this!

From our recent NSF survey:

• “The biggest impediment to discovery by biologists is the need to rely on others with knowledge of impenetrable systems and obscure acronyms to process and interpret data. Don't know how to fix this, but on some level user friendly platforms programs like Geneious more than make up for their lack of power by providing an intuitive platform that encourages free exploration and experimentation with data.”

The end…
