the Collaboratory 529 Boyer Hall

Ying Zhen is a postdoc working jointly with Tom Smith and Kirk Lohmueller in the Department of EEB and the Institute of the Environment and Sustainability at UCLA. Her current research focuses on using large-scale population genomic data to learn about demographic history and natural selection across a diverse array of non-model taxa from Central Africa. Ying was previously a postdoc with Peter Andolfatto at Princeton University, working on convergent evolution in an herbivore community and the evolution of non-coding sequences in Drosophila species. She received her Ph.D. from Kansas State University, where she worked with Mark Ungerer on natural variation of freezing tolerance in Arabidopsis.

Fall 2016

The Collaboratory, a key component of the Institute for Quantitative & Computational Biosciences (QCBio), provides collaborative computational expertise to all experimental basic and clinical life scientists at UCLA.

The Collaboratory’s main mission is to facilitate genomic data analysis in two ways: (1) provide free weekly classes covering a number of topics, ranging from an introduction to the UNIX command line and R to specific classes on RNA-seq, ChIP-seq, and variant calling; and (2) provide the opportunity to collaborate with expert bioinformaticians to undertake analyses or to quality-control prior analyses.

Both missions are accomplished by a group of QCBio Collaboratory Fellows funded by the QCBio Collaboratory. They represent a highly selected group with proven expert knowledge, as well as training and collaborative skillsets.

The Collaboratory welcomes new Fellows: Ying Zhen, Daria Merkurjev, Don Vaughn, Nick Mancuso, Mike Thompson, Yerbol Kurmangaliyev, and Motahareh (Bahar) Moghtadaei!

This issue:

●New Fellows

●New Collaboratory Workshops

●NCI Cancer ‘Moonshot’

●B.I.G. Summer

●Recently published collaborations

●Collaboratory Workshop Schedule

Daria Merkurjev is a postdoc at UCLA working in the laboratories of Dr. Jake Lusis, Dr. Caius Radu, and Dr. Grace Xiao. She did her undergraduate work in Mathematics at UCLA and received her Ph.D. in Bioinformatics and Systems Biology at UCSD under Dr. Michael Rosenfeld. Her Ph.D. research focused on investigating the molecular and architectural strategies responsible for integrating genome-wide transcriptional responses to diverse signaling systems critical for physiological and behavioral processes in vertebrates. Her research also includes understanding the factors affecting susceptibility to cardiovascular and metabolic disorders.

Nick Mancuso is a postdoctoral fellow in the Department of Pathology and Laboratory Medicine at UCLA working with Bogdan Pasaniuc. He is involved in developing computational methods to dissect the genetic basis of disease. In addition to method development, Dr. Mancuso is interested in applying these methods to large-scale datasets. Prior to joining UCLA, Nicholas completed his PhD in the bioinformatics lab of Alex Zelikovsky at Georgia State University.

Dr. Michael J. Thompson is a research scientist working with the laboratory of Prof. Pellegrini. Michael received a BA in high-energy physics from Boston University and a PhD in biophysics from the University of Michigan, where he developed methods to predict protein structure from evolutionary sequence information. His research in genomics began as a postdoctoral fellow at UCLA, comparing the genomes of extremophiles for insight into the evolution of protein thermostability. Out of this work, and in collaboration with fellow postdocs, several quantitative genomic analysis methods emerged that were patented and used to launch a bioinformatics start-up company. After 4 years at this company, Michael returned to UCLA to work with Prof. David Eisenberg on developing a method to predict prion proteins. He is currently investigating the role of DNA methylation in tumor development for multiple types of cancer.

Yerbol Kurmangaliyev is currently a postdoctoral researcher in the laboratory of Professor S. Lawrence Zipursky. He earned an M.S. degree in Biochemistry at Lomonosov Moscow State University in 2006 and a Ph.D. degree in Bioinformatics at the Kharkevich Institute for Information Transmission Problems of the Russian Academy of Sciences in 2011, under the supervision of Professor Mikhail S. Gelfand. Before joining UCLA, he was a postdoctoral researcher in the laboratory of Professor Sergey V. Nuzhdin at USC. His previous research has focused primarily on the genetic basis of gene expression.

New Collaboratory Fellows (continued)

Dr. Motahareh (Bahar) Moghtadaei is joining Dr. Radu’s Lab in the Department of Molecular and Medical Pharmacology, UCLA, as a Postdoctoral Fellow in bioinformatics. In collaboration with Dr. Timothy Donahue, Dr. Kym Faull, and Dr. Julian Whitelegge, she will be working on systematic and integrative analysis of proteomics and metabolomics data. Bahar obtained her PhD in Biomedical Engineering at Tehran Polytechnic (Amirkabir University of Technology), 2009-2013, and was a postdoc in Dr. Rose’s Lab in the Department of Physiology and Biophysics, Faculty of Medicine, Dalhousie University, Canada, 2014-2016. Her general research interests include algorithm development and computer programming for biomedical research.

NEW Collaboratory Workshops

Cancer Genomics - with Catie Grasso

Cancer Genomics will cover the fundamentals of analyzing tumor genomics data, including exome and transcriptome sequencing, and using it in a translational setting to identify diagnostic biomarkers and drivers that can be targeted with drugs. We will cover the fundamentals of calling somatic aberrations, including point mutations, indels, rearrangements, and copy number alterations, with an eye to immediate application. We will discuss targeted sequencing versus exome sequencing versus whole genome sequencing and how they differ in terms of methodology, time to results, and cost in the context of diagnostic testing. We will also discuss the integration of drug screen data, other high-throughput functional data, including germline genetics and epidemiology, experimental results, and pathway data for understanding the underlying cancer biology. Finally, we will discuss how to use this information to decide the best therapeutic strategy.

Intro to Modern Statistics - with Don Vaughn

Traditional statistical analysis emerged in the pre-computer age, and correspondingly, nearly all tractable methods required closed-form solutions (ones you can write as a single equation). Attached to these methods (t-test, chi-squared, Pearson correlation, regression, etc.), however, are a number of assumptions about the data: Gaussian distribution, homoscedasticity, linearity, equal variances, continuity, and large sample size, to name a few.

In the datasets that research scientists work with, these assumptions are rarely all met. Journals and reviewers are increasingly scrutinizing and rejecting submissions that apply traditional statistics when their requirements are not met. Fortunately, resampling techniques like bootstrapping and permutation tests make far fewer assumptions about the underlying data and can thus be applied much more generally than traditional statistics.
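To make the two resampling ideas concrete, here is a minimal sketch in Python using only the standard library. It is an illustration and not workshop material: the function names `bootstrap_ci` and `permutation_test`, the default of 2000 resamples, and the percentile method for the interval are all assumptions chosen for brevity.

```python
import random
import statistics


def bootstrap_ci(sample, stat=statistics.mean, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for stat(sample) (hypothetical helper)."""
    rng = random.Random(seed)
    n = len(sample)
    # Resample with replacement and recompute the statistic each time.
    estimates = sorted(stat(rng.choices(sample, k=n)) for _ in range(n_resamples))
    # Read the interval off the empirical quantiles of the resampled statistics.
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


def permutation_test(a, b, n_permutations=2000, seed=0):
    """Two-sided permutation test for a difference in group means (hypothetical helper)."""
    rng = random.Random(seed)
    observed = abs(statistics.mean(a) - statistics.mean(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_permutations):
        # Shuffling the pooled values simulates the null hypothesis
        # that group labels are exchangeable.
        rng.shuffle(pooled)
        diff = abs(statistics.mean(pooled[:len(a)]) - statistics.mean(pooled[len(a):]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (extreme + 1) / (n_permutations + 1)
```

Neither function assumes a Gaussian distribution or equal variances. The permutation test does assume that observations are exchangeable under the null hypothesis, and the bootstrap assumes the sample is representative of the population.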

Schedule of Collaboratory Workshops

Please follow links for full workshop descriptions. Classes are held in Boyer Hall 529.

●9/27-9/29, Workshop 4: Galaxy for NGS Data Analysis, 1:00pm

●10/4-10/6, Workshop 13: Cancer Genomics, 9:30am-12:00pm

●10/4-10/6, Workshop 5: RNA-seq Analysis, 1:00pm

●10/11-10/13, Workshop 3: Intro to R, 9:30am

●10/18-10/20, Workshop 8: Variant Calling, 10:30am

●10/25-10/27, Workshop 9: Python, 10:00am

●11/1-11/3, Workshop 10: Hi-C, 1:00pm

●11/8-11/10, Workshop 11: Metagenomics Analysis, 9:30am

●11/8-11/10, Workshop 13: Cancer Genomics, 1:00pm-3:30pm

●11/15-11/17, Workshop 1: Intro to UNIX, 2:00pm

●11/29-12/1, Workshop 2: Using NGS Analysis Tools, 1:00pm

●12/6-12/8, Workshop 6: BS-Seq, 1:00pm

●12/6-12/8, Workshop 13: Cancer Genomics, 9:30am-12:00pm

B.I.G. Summer 2016

By Ina Thorner

QCBio’s eight-week undergraduate summer research program, Bruins-In-Genomics (B.I.G.) Summer, came to an end on August 12th after an exciting summer.

Thirty-seven students attended Collaboratory workshops and contributed to research in the labs of QCBio faculty.

“My BIG summer student was outstanding! She contributed greatly to the analysis of an RNA-seq dataset, while working almost entirely independently. She went above and beyond my expectations in terms of analysis, background reading, and preparation for her lab meeting and poster. She will make a fantastic graduate student if she chooses that route,” said postdoctoral fellow Jeff Rasmussen from the Sagasti Lab.

B.I.G. Summer students also participated in professional development workshops, journal club, weekly seminars, and a concluding poster session.

B.I.G. Summer student Scott De Taboada said, “Honestly I felt that this program was amazing. Coming in I wasn't sure if graduate school was a path I wanted to take. This program showed me the opportunities I would have and gave me a great understanding of what it would take to get a PhD.”

B.I.G. Summer student Brenda Ji said, “B.I.G. Summer has influenced my thoughts on graduate school pretty drastically. Before this I was pretty set on medical school and didn't think graduate school would be the right fit for me, but now that I have seen the demand for bioinformatics and the need for people with computer science backgrounds, it has really changed the way that I see my future. Now, I'm not so sure if I want to go to medical school, but graduate school sounds fun and exciting and rewarding. My home institution is a small liberal arts college where I had very little exposure to bioinformatics, but I love both biology and computer science, so this has been a really valuable and enriching experience for me.”

The program culminated with a poster session and awards ceremony.

UCLA Collaboratory Takes First Step in NCI Cancer 'Moonshot'

By Catie Grasso

In the last few years, multiple new drug treatment strategies for previously untreatable metastatic cancers have ignited new hope for progress in cancer treatment. Immunotherapy techniques, including PD-1 inhibitors, have been putting 90% of patients who had failed standard of care into one- to two-year remissions for multiple cancer types. Similarly, drugs targeting patients with DNA repair deficiencies, like PARP inhibitors, have been yielding similar responses in other terminal patients (88% response rate). These are unparalleled response rates. The advent of next generation sequencing techniques able to rapidly and affordably monitor response and resistance during treatment makes possible the development of combined approaches and new options in real time with living patients.

The NIH National Cancer Institute Cancer Moonshot initiative is about making these treatment options as effective as possible, as quickly as possible, and the new UCLA Parker Institute for Cancer Immunotherapy, led by Dr. Toni Ribas, an innovator in cancer immunotherapy and targeted gene therapy, is a key part of that push. This fall I will be teaching a one-week course on the basics of cancer genomics, as well as introducing cutting-edge approaches. The course is open to anyone interested, and a key goal is mobilizing each person’s existing skills now to help them contribute to this exciting new initiative, with an eye to rapidly impacting patient care.

Dr. Catherine Grasso is an Adjunct Assistant Professor in Hematology Oncology and Director of Bioinformatics for the UCLA Parker Institute for Cancer Immunotherapy.

Recently Published Collaborations

Hoban MD, Lumaquin D, Kuo CY, Romero Z, Long J, Ho M, Young CS, Mojadidi M, Fitz-Gibbon S, Cooper AR, Lill GR, Urbinati F, Campo-Fernandez B, Bjurstrom CF, Pellegrini M, Hollis RP, Kohn DB. CRISPR/Cas9-mediated correction of the sickle mutation in human CD34+ cells. Mol Ther. 2016 Jul 13. doi: 10.1038/mt.2016.148. [Epub ahead of print]

Roux A, Bendib Le Lan I, Holifanjaniaina S, Thomas KA, Hamid AM, Picard C, Grenet D, De Miranda S, Douvry B, Beaumont-Azuar L, Sage E, Devaquet J, Cuquemelle E, Le Guen M, Spreafico R, Suberbielle-Boissel C, Stern M, Parquin F; Foch Lung Transplantation Group. Antibody-Mediated Rejection in Lung Transplantation: Clinical Outcomes and Donor-Specific Antibody Characteristics. Am J Transplant. 2016 Apr;16(4):1216-28. doi: 10.1111/ajt.13589. Epub 2016 Feb 4. PubMed PMID: 26845386.

Genomic Flatlining in the Endangered Island Fox. Curr Biol. 2016 Apr 20. pii: S0960-9822(16)30173-7. doi: 10.1016/j.cub.2016.02.062. [Epub ahead of print] PubMed PMID: 27112291.

Collaboratory Services

Analyses Service

The Collaboratory has implemented computational methods and procedures to analyze sequencing data for the UCLA community.

We have developed pipelines that are specifically designed to optimize the use of the Hoffman2 cluster resources.

As part of this service, QCBio Collaboratory Fellows will analyze next generation sequencing data submitted to us and provide users with their results. Examples of data that we can analyze include RNA-seq, ChIP-seq, and bisulfite seq, and we can also perform variant calling on whole genomes or exomes.

Please contact Matteo Pellegrini ([email protected]) if you are interested in participating. The service is provided at no cost to users, but we do ask that the Collaboratory Fellow who analyzes your data be acknowledged as an author on any eventual publications.

Genome Browser Tools

The Collaboratory hosts and administers a local server for the UCSC genome browser tools that allows users to view genomic data associated with genomes not supported by the UCSC site.

It contains the following major features:

●Graphical display of genes, gene structures, and gene annotations

●BLAT alignment of DNA sequences against a custom reference genome

●Graphical display of custom tracks alongside existing genomic annotation tracks

The common uses of the genome browser for NGS data analysis include:

●display and view gene expression profiles

●examine genomic data base-by-base

●display and examine genomic variations

●compare, in parallel, genomic data derived from different technologies, such as NGS and microarray

●share genomic data with collaborators

●prepare figures for publications
