

Genomic signatures reveal stressors induced by habitat degradation & climate change in a model reptile species

1 Environmental Laboratory, US Army, Engineer Research and Development Center, Vicksburg, MS. 2 US Army, Public Health Command, Edgewood, MD. 3 Oklahoma State University, Stillwater, OK.

Figure 1. Proposed bioinformatics system architecture.

Full Project Introduction

[Figure 1 component labels: Web Services Data Exchange Using XML-Based SOAP; High Performance and Throughput Computing using Supercomputers; Batch Processing: (1) Data Uploading, (2) Data Validation, (3) Data Analysis, (4) Data Processing; Oracle Relational Database; Private File Server; Public File Server; Data Management (Data Query, Data Upload via http); Data Management (Perl & Java)]

Table 1. Results of GS-FLX pyrosequencing of the normalized cDNA library for Western Fence Lizard (WFL).

Table 2. Summary of sequence clustering and assembly for Western Fence Lizard (WFL).

Table 3. Homology-based detection of coding potential and annotation of unigenes against the following protein databases: NR.aa (10,606,545 proteins), Refseq (6,392,535 proteins), UniProt-SwissProt (515,203 proteins), Uniref90 (6,544,144 proteins), Uniref100 (9,865,668 proteins).

Figure 2. Web dissemination of datasets, including web-based tools for transcriptome and unigene analysis. http://jeff.ifxworks.com/EGGT/Quail_Lizard.html

[Conceptual diagram labels: MEC Contamination; Habitat Loss / Climate Change; Chemical Stress; Resource Loss; Increased Parasitism; Genomics; Assess Stressor Priority: 1. Climate Change, 2. MEC Contamination, 3. Habitat Loss]

Kurt A Gust 1, Mitchell S Wilbanks 1, Xianfeng Chen 1, Craig A McFarland 2, Larry Talent 3, Edward J Perkins 1

Sequencing Parameters        WFL
Raw Wells                    2,125,263
Key Pass Wells               2,061,220
Passed Filter Wells          928,780
Total Bases                  328,540,934
Average Read Length (bp)     354
Median Read Length (bp)      397
Longest Read Length (bp)     2,043
Shortest Read Length (bp)    2
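The summary values in Table 1 can be cross-checked with simple arithmetic. The following is a minimal sketch, with illustrative function and variable names that do not come from the poster, that derives the average read length and the share of raw wells passing the filter from the reported counts.

```python
# Counts as reported in Table 1 for the WFL GS-FLX run.
raw_wells = 2_125_263
passed_filter_wells = 928_780
total_bases = 328_540_934


def mean_read_length(bases: int, reads: int) -> float:
    """Average read length in bases, given total bases and read count."""
    return bases / reads


def pass_rate(passed: int, raw: int) -> float:
    """Percentage of raw wells that passed filtering."""
    return 100.0 * passed / raw


avg_len = mean_read_length(total_bases, passed_filter_wells)
rate = pass_rate(passed_filter_wells, raw_wells)

print(f"Average read length: {avg_len:.1f} bp")
print(f"Wells passing filter: {rate:.1f}%")
```

Rounded to the nearest base, the computed average agrees with the 354 bp reported in the table.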

Sequence Assembly            WFL
Total ESTs Available         928,759
Total Assembled Contigs      53,897
Total Singlets               5,065
Total Unigenes               58,962

Unigene Dataset   Coding Detected   Non-Coding Detected   % Coding   Protein Database
WFL Contigs       23,385            30,512                43.39%     NR.aa
WFL Contigs       23,173            30,724                43.00%     Refseq
WFL Contigs       21,593            32,304                40.06%     UniProt-SwissProt
WFL Contigs       23,463            30,434                43.53%     Uniref100
WFL Contigs       23,508            30,389                43.62%     Uniref90
WFL Singlets      1,425             1,825                 44.33%     NR.aa
WFL Singlets      1,440             1,837                 43.94%     Refseq
WFL Singlets      1,457             1,820                 44.46%     UniProt-SwissProt
WFL Singlets      1,465             1,812                 44.71%     Uniref100
WFL Singlets      1,298             1,979                 39.61%     Uniref90
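The "% Coding" column is the share of unigenes in a dataset for which coding potential was detected. As a minimal sketch (the function name `percent_coding` is illustrative, not from the poster), the calculation for the WFL contigs against NR.aa, together with the unigene total from Table 2, looks like this:

```python
def percent_coding(coding: int, non_coding: int) -> float:
    """Percentage of sequences with detected coding potential."""
    total = coding + non_coding
    if total == 0:
        raise ValueError("dataset is empty")
    return 100.0 * coding / total


# Table 2: unigenes are the assembled contigs plus the singlets.
contigs, singlets = 53_897, 5_065
total_unigenes = contigs + singlets

# Table 3, WFL contigs against NR.aa: 23,385 coding and 30,512 non-coding.
nr_contigs = percent_coding(23_385, 30_512)

print(f"Total unigenes: {total_unigenes:,}")
print(f"WFL contigs vs NR.aa: {nr_contigs:.2f}% coding")
```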

Problem Identification:

• Multiple-stressor effects of common and emerging environmental stressors are unknown for native populations including the reptile model Western Fence Lizard (WFL).

• As habitat loss and global warming advance, resulting stressors may exacerbate chemical stressors (MECs) on Army Ranges.

• Tools are needed to forensically identify predominant environmental stressors to optimize field management.

Purpose:

• Assess and characterize the effects of chemical stress (TNT) in conjunction with environmental stressors induced by habitat loss and climate change in a model reptile species.

Null Hypotheses:

A. Multiple ecosystem-level stressors characteristic of habitat degradation and climate change have no interactive effects on lizard health and fitness.

B. Environmental stressors are uniquely identifiable via genomic signatures and these signatures can be used to identify predominant stressors in multiple-stressor scenarios.

Focus on Genomic Infrastructure Development

WFL normalized-cDNA Library Construction

• RNA was extracted from brain, liver, gut, heart, bone marrow, and gonad tissues.

• Tissues were collected from 5 male and 5 female unexposed WFL.

• All RNA was quality-assured using gel electrophoresis.

• SMART™ PCR cDNA Synthesis was used to reverse transcribe full-length cDNAs.

• The Trimmer cDNA Normalization Kit was used to normalize the cDNA library.

Materials and Methods:

[Gel image lane labels: 1 Kb Ladder; Post-Normalization; Pre-Normalization]

Sequencing, Annotation & Microarray Development

[Workflow figure labels: Roche GS-FLX; 929K Sequences; Annotation; Diamond Supercomputer at ERDC; Agilent G3 Custom 60K oligonucleotide microarray]

• Sequencing: One GS-FLX pyrosequencing run yielded 329 megabases in 929K reads with an average read length of 354 bp (Table 1).

• Clustering: A genome-scale transcriptome for WFL was generated by EST-based clustering and assembly via The Gene Indices Clustering Tools (TGICL), which uses megablast for homology-based clustering and CAP3 for assembly.

• Annotation: In all, 53,897 contigs and 5,065 singlets, totaling 58,962 unigenes, were identified (Table 2). Approximately 44% of unigenes were annotated for protein-coding potential via homology-based annotation against the NCBI NR.aa and Refseq and the EBI UniProt-SwissProt, Uniref90, and Uniref100 protein sequence reference knowledgebases (Table 3).

• Microarray Design: Transition the WFL transcriptome to an Agilent G3 Custom 60K oligonucleotide microarray.
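TGICL's own clustering procedure is not reproduced here. As a simplified illustration of homology-based clustering in general, the sketch below groups sequences by single linkage over pairwise hits (of the kind a megablast search might report) using a union-find structure. In this toy model, clusters with more than one member would be passed on to an assembler, and sequences with no hits remain on their own. The sequence IDs, hit pairs, and function name are invented for illustration.

```python
from collections import defaultdict


def cluster_by_hits(seq_ids, hits):
    """Single-linkage clustering: sequences joined by any chain of hits share a cluster."""
    parent = {s: s for s in seq_ids}

    def find(x):
        # Walk up to the root, shortening the path as we go (path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in hits:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two clusters

    groups = defaultdict(list)
    for s in seq_ids:
        groups[find(s)].append(s)
    # Sort members and clusters so the result is deterministic.
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


# Hypothetical input: five ESTs and two pairwise similarity hits.
ests = ["est1", "est2", "est3", "est4", "est5"]
hits = [("est1", "est2"), ("est2", "est3")]

clusters = cluster_by_hits(ests, hits)
multi_member = [c for c in clusters if len(c) > 1]
unclustered = [c[0] for c in clusters if len(c) == 1]

print("Clusters to assemble:", multi_member)
print("Unclustered sequences:", unclustered)
```

Single linkage is the simplest choice for a sketch like this; it merges clusters through any chain of pairwise hits, so a real pipeline would normally apply similarity and overlap thresholds before accepting a hit.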

Results and Discussion:

• Web Accessible Knowledgebase: We have implemented a mature bioinformatics and computational biology system, which includes:

(1) a relational database management system (RDBMS), Oracle (Oracle, Redwood Shores, CA), for quick data retrieval and integration;

(2) public and private access to data and results via network-shared file servers (http://jeff.ifxworks.com/EGGT/Quail_Lizard.html);

(3) data and results visualization via a publicly accessible web server (http://www.ifxworks.com/EnvironmentalSystemsBiology.html);

(4) high-performance and high-throughput computational analysis pipelines for quick data loading, retrieval, analysis, processing, integration, and validation (Figures 1 and 2).
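As a rough illustration of how a relational store supports quick retrieval and integration of annotation results, the sketch below uses Python's built-in sqlite3 module as a stand-in for Oracle. The table names, column names, and sample rows are hypothetical and are not taken from the project's actual schema or data.

```python
import sqlite3

# In-memory database standing in for the project's Oracle instance (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE unigene (
        unigene_id TEXT PRIMARY KEY,
        kind       TEXT NOT NULL          -- 'contig' or 'singlet'
    );
    CREATE TABLE annotation (
        unigene_id TEXT NOT NULL REFERENCES unigene(unigene_id),
        protein_db TEXT NOT NULL,         -- e.g. 'NR.aa', 'Refseq'
        is_coding  INTEGER NOT NULL       -- 1 if coding potential was detected
    );
    """
)

# Hypothetical sample rows, for illustration only.
conn.executemany(
    "INSERT INTO unigene VALUES (?, ?)",
    [("contig_1", "contig"), ("contig_2", "contig"), ("singlet_1", "singlet")],
)
conn.executemany(
    "INSERT INTO annotation VALUES (?, ?, ?)",
    [
        ("contig_1", "NR.aa", 1),
        ("contig_2", "NR.aa", 0),
        ("singlet_1", "NR.aa", 1),
        ("contig_1", "Refseq", 1),
    ],
)

# Summarize coding calls per dataset and protein database, as in a Table 3 style report.
rows = conn.execute(
    """
    SELECT u.kind, a.protein_db, SUM(a.is_coding) AS coding, COUNT(*) AS total
    FROM unigene AS u
    JOIN annotation AS a ON a.unigene_id = u.unigene_id
    GROUP BY u.kind, a.protein_db
    ORDER BY u.kind, a.protein_db
    """
).fetchall()

for kind, protein_db, coding, total in rows:
    print(f"{kind:8s} {protein_db:8s} coding={coding} of {total}")
```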

Conclusions & Future Efforts:

• We are the first to develop gene expression tools and a publicly accessible, extensive genomics infrastructure for a laboratory-amenable reptile model species, the Western Fence Lizard.

• These assets will enable the in-progress assessment of emerging stressors that are directly influenced by the non-point, multi-stakeholder impacts of habitat loss and climate change (see Platform Pres. # 386).