the Collaboratory 529 Boyer Hall
Ying Zhen is a postdoc working jointly with Tom Smith and
Kirk Lohmueller in the Department of EEB and the Institute of the
Environment and Sustainability at UCLA. Her current research
focuses on using large-scale population genomic data to learn about
demographic history and natural selection across a diverse
array of non-model taxa from Central Africa. Ying was previously a
postdoc with Peter Andolfatto at Princeton University, working on
convergent evolution in an herbivore community and the evolution of
non-coding sequences in Drosophila species. She received her Ph.D.
from Kansas State University, where she worked with Mark
Ungerer on natural variation of freezing tolerance in Arabidopsis.
Fall 2016
The Collaboratory, a key
component of the Institute for
Quantitative & Computational
Biosciences (QCBio), provides
collaborative computational
expertise to all experimental basic
and clinical life scientists at
UCLA.
The Collaboratory’s main mission
is to facilitate genomic data
analysis in two ways:
(1) provide free weekly classes
covering a number of topics,
ranging from an introduction to
the UNIX command line and R to
specific classes on RNA-seq,
ChIP-seq, and variant calling,
and (2) provide the
opportunity to collaborate with
expert bioinformaticians to
undertake analyses or to
quality-control prior analyses.
Both missions are accomplished
by a group of QCBio
Collaboratory Fellows funded by
the QCBio Collaboratory. They
represent a highly selected group
with proven expert knowledge, as
well as training and collaborative
skillsets.
The Collaboratory welcomes new Fellows: Ying Zhen, Daria Merkurjev, Don Vaughn, Nick
Mancuso, Mike Thompson, Yerbol Kurmangaliyev, and Motahareh (Bahar) Moghtadaei!
This issue:
●New Fellows
●New Collaboratory Workshops
●NCI Cancer ‘Moonshot’
●B.I.G. Summer
●Recently published collaborations
●Collaboratory Workshop Schedule
Daria Merkurjev is a postdoc at UCLA working in the laboratories of Dr. Jake Lusis, Dr.
Caius Radu, and Dr. Grace Xiao. Dr. Daria Merkurjev did her undergraduate work in Mathematics at UCLA and
received her Ph.D. in Bioinformatics and Systems Biology at UCSD under Dr. Michael Rosenfeld. Her
Ph.D. research focused on investigating the molecular and architectural strategies responsible for integrating
genome-wide transcriptional responses to diverse signaling systems critical for physiological and behavioral
processes in vertebrates. Her research also includes understanding the factors affecting susceptibility to
cardiovascular and metabolic disorders.
Nick Mancuso is a postdoctoral fellow in the Department of Pathology and Laboratory Medicine at UCLA
working with Bogdan Pasaniuc. He develops computational methods to dissect the genetic basis
of disease. In addition to method development, Dr. Mancuso is interested in applying these methods to
large-scale datasets. Prior to joining UCLA, Nick completed his PhD in the bioinformatics lab of Alex
Zelikovsky at Georgia State University.
Dr. Michael J. Thompson is a research scientist working in the
laboratory of Prof. Pellegrini. Michael received a BA in high-energy physics from Boston University and a PhD in
biophysics from the University of Michigan, where he developed methods to predict protein structure from
evolutionary sequence information. His research in genomics began as a postdoctoral fellow at UCLA, comparing
the genomes of extremophiles for insight into the evolution of protein thermostability. Out of this work, and in
collaboration with fellow postdocs, several quantitative genomic analysis methods emerged that were patented
and used to launch a bioinformatics start-up company. After 4 years at this company, Michael returned to UCLA
to work with Prof. David Eisenberg on developing a method to predict prion proteins. He is currently
investigating the role of DNA methylation in tumor development for multiple types of cancer.
Yerbol Kurmangaliyev is currently a postdoctoral researcher in the laboratory of Professor S. Lawrence
Zipursky. He earned an M.S. degree in Biochemistry at Lomonosov Moscow State University in 2006. He earned a
Ph.D. degree in Bioinformatics at the Kharkevich Institute for Information Transmission Problems of the Russian
Academy of Sciences in 2011 under the supervision of Professor Mikhail S. Gelfand. Before joining UCLA he was
a postdoctoral researcher in the laboratory of Professor Sergey V. Nuzhdin at USC. His previous research interests
have been primarily focused on the genetic basis of gene expression.
New Collaboratory Fellows (continued)
Dr. Motahareh (Bahar)
Moghtadaei is joining Dr.
Radu’s Lab in the Department of
Molecular and Medical
Pharmacology, UCLA, as a
Postdoctoral Fellow in
bioinformatics. In collaboration
with Dr. Timothy Donahue, Dr.
Kym Faull, and Dr. Julian
Whitelegge, she will be working
on systematic and integrative
analysis of proteomics and
metabolomics data. Bahar
obtained her PhD in Biomedical
Engineering at Tehran
Polytechnic (Amirkabir
University of Technology), 2009-2013,
and was a Postdoc in Dr.
Rose’s Lab at the Department of
Physiology and Biophysics,
Faculty of Medicine, Dalhousie
University, Canada, 2014-2016.
Her general research interests
include algorithm development
and computer programming for
biomedical research.
NEW Collaboratory Workshops
Cancer Genomics - with Catie Grasso
Cancer Genomics will cover the fundamentals of analyzing tumor
genomics data, including exome and transcriptome sequencing,
and using it in a translational setting to identify diagnostic
biomarkers and drivers that can be targeted with drugs. We will cover
the fundamentals of calling somatic aberrations, including point
mutations, indels, rearrangements, and copy number alterations,
with an eye to immediate application. We will discuss targeted
sequencing versus exome sequencing versus whole genome
sequencing and how they differ in terms of methodology, time to
results, and cost in the context of diagnostic testing. We will also
discuss the integration of drug screen data, other high-throughput
functional data (including germline genetics and
epidemiology), experimental results, and pathway data for
understanding the underlying cancer biology. Finally, we will
discuss how to use this information to decide the best therapeutic
strategy.
Intro to Modern Statistics - with Don Vaughn
Traditional statistical analysis emerged in the pre-computer age,
and correspondingly, nearly all tractable methods required closed-form
solutions (ones you can write as a single equation).
Attached to these methods (t-test, chi-squared, Pearson correlation,
regression, etc.), however, are a number of assumptions about the
data: Gaussian distribution, homoscedasticity, linearity, equal
variances, continuity, and large sample size, to name a few.
In the datasets with which research scientists work, these
assumptions are rarely all met. Journals and reviewers are
increasingly scrutinizing and rejecting submissions that apply
traditional statistics when their requirements are not met.
Fortunately, resampling techniques like bootstrapping and
permutation tests make far fewer assumptions about the
underlying data and can thus be applied much more generally
than traditional statistics.
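To make the resampling idea concrete, here is a minimal sketch in Python. It is not taken from the workshop materials. The two small samples are hypothetical, and the function names (permutation_test, bootstrap_ci) are illustrative choices of ours. The first function estimates a p-value for a difference in means by shuffling group labels; the second builds a percentile bootstrap confidence interval by resampling with replacement.

```python
import random
import statistics


def permutation_test(a, b, n_perm=10_000, seed=0):
    """Two-sided permutation test for a difference in group means."""
    rng = random.Random(seed)
    observed = statistics.fmean(a) - statistics.fmean(b)
    pooled = list(a) + list(b)
    n_a = len(a)
    extreme = 0
    for _ in range(n_perm):
        # Shuffling the pooled values reassigns group labels at random.
        rng.shuffle(pooled)
        diff = statistics.fmean(pooled[:n_a]) - statistics.fmean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return observed, (extreme + 1) / (n_perm + 1)


def bootstrap_ci(sample, stat=statistics.median, n_boot=10_000,
                 alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for an arbitrary statistic."""
    rng = random.Random(seed)
    n = len(sample)
    # Each bootstrap replicate resamples n values with replacement.
    estimates = sorted(stat(rng.choices(sample, k=n)) for _ in range(n_boot))
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical measurements, invented purely for illustration.
treated = [5.9, 6.4, 5.2, 7.1, 6.0, 6.8, 5.7, 6.3]
control = [4.1, 5.0, 4.7, 5.3, 4.4, 4.9, 5.1, 4.6]

if __name__ == "__main__":
    diff, p_value = permutation_test(treated, control)
    print(f"difference in means: {diff:.4f}, permutation p-value: {p_value:.4f}")
    low, high = bootstrap_ci(treated)
    print(f"95% bootstrap CI for the treated median: ({low}, {high})")
```

Neither function assumes a Gaussian distribution or equal variances. The permutation test does assume the observations are exchangeable under the null hypothesis, and the bootstrap interval assumes the sample is reasonably representative of the population.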
Schedule of Collaboratory Workshops
Please follow links for full workshop descriptions. Classes are held in Boyer Hall 529.
9/27-9/29, Workshop 4: Galaxy for NGS Data Analysis, 1:00pm
10/4-10/6, Workshop 13: Cancer Genomics, 9:30am-12:00pm
10/4-10/6, Workshop 5: RNA-seq Analysis, 1:00pm
10/11-10/13, Workshop 3: Intro to R, 9:30am
10/18-10/20, Workshop 8: Variant Calling, 10:30am
10/25-10/27, Workshop 9: Python, 10:00am
11/1-11/3, Workshop 10: Hi-C, 1:00pm
11/8-11/10, Workshop 11: Metagenomics Analysis, 9:30am
11/8-11/10, Workshop 13: Cancer Genomics, 1:00pm-3:30pm
11/15-11/17, Workshop 1: Intro to UNIX, 2:00pm
11/29-12/1, Workshop 2: Using NGS Analysis Tools, 1:00pm
12/6-12/8, Workshop 6: BS-Seq, 1:00pm
12/6-12/8, Workshop 13: Cancer Genomics, 9:30am-12:00pm
B.I.G. Summer 2016
By Ina Thorner
QCBio’s eight-week undergraduate summer research program, Bruins-In-Genomics (B.I.G.) Summer, came to an end on August 12th after an exciting summer.
Thirty-seven students attended Collaboratory workshops and contributed to research in the labs of QCBio faculty.
“My BIG summer student was outstanding! She contributed greatly to the analysis of an RNA-seq dataset, while working almost entirely independently. She went above and beyond my expectations in terms of analysis, background reading, and preparation for her lab meeting and poster. She will make a fantastic graduate student if she chooses that route,” said postdoctoral fellow Jeff Rasmussen from the Sagasti Lab.
B.I.G. Summer students also participated in professional development workshops, journal club, weekly seminars, and a concluding poster session.
B.I.G. Summer student Scott De Taboada said, “Honestly I felt that this program was amazing. Coming in I wasn't sure if graduate school was a path I wanted to take. This program showed me the opportunities I would have and gave me a great understanding of what it would take to get a PhD.”
B.I.G. Summer student Brenda Ji said, “B.I.G. Summer has influenced my thoughts on graduate school pretty drastically. Before this I was pretty set on medical school and didn't think graduate school would be the right fit for me, but now that I have seen the demand for bioinformatics and the need for people with computer science backgrounds, it has really changed the way that I see my future. Now, I'm not so sure if I want to go to medical school, but graduate school sounds fun and exciting and rewarding. My home institution is a small liberal arts college where I had very little exposure to bioinformatics, but I love both biology and computer science, so this has been a really valuable and enriching experience for me.”
The program culminated with a poster session and awards ceremony.
UCLA Collaboratory
Takes First Step in
NCI Cancer
'Moonshot'
By Catie Grasso
In the last few years, multiple
new drug treatment strategies for
untreatable metastatic cancers
have ignited new hope for
progress in cancer
treatment. Immunotherapy
techniques, including PD-1
inhibitors, have been putting 90%
of patients who had failed
standard of care into one- to
two-year remissions for multiple
cancer types. Similarly, drugs
targeting patients with DNA
repair deficiencies, like PARP
inhibitors, have been yielding
similar responses in other
terminal patients (88% response
rate). These are unparalleled
response rates. The advent of
next generation sequencing
techniques able to rapidly and
affordably monitor response and
resistance during treatment make
possible the development of
combined approaches and new
options in real time with living
patients.
The NIH National Cancer
Institute Cancer Moonshot
initiative is about making these
treatment options as effective as
possible, as quickly as possible,
and the new UCLA Parker
Institute for Cancer
Immunotherapy led by Dr. Toni
Ribas, an innovator in cancer
immunotherapy and targeted
gene therapy, is a key part of
that push. This fall I will be
teaching a one-week course on
the basics of cancer genomics, as
well as introducing cutting-edge
approaches. The course is open
to anyone interested, and a key
goal is mobilizing each person’s
existing skills now to help them
contribute to this exciting new
initiative with an eye to rapidly
impacting patient care.
Dr. Catherine Grasso is an
Adjunct Assistant Professor in
Hematology Oncology and
Director of Bioinformatics for the
UCLA Parker Institute for Cancer
Immunotherapy.
Recently Published
Collaborations
CRISPR/Cas9-mediated correction of the sickle mutation in human CD34+ cells.
Hoban MD, Lumaquin D, Kuo CY, Romero Z, Long J, Ho M, Young CS, Mojadidi M, Fitz-Gibbon S, Cooper AR, Lill GR, Urbinati F, Campo-Fernandez B, Bjurstrom CF, Pellegrini M, Hollis RP, Kohn DB.
Mol Ther. 2016 Jul 13. doi: 10.1038/mt.2016.148. [Epub ahead of print]

Antibody-Mediated Rejection in Lung Transplantation: Clinical Outcomes and Donor-Specific Antibody Characteristics.
Roux A, Bendib Le Lan I, Holifanjaniaina S, Thomas KA, Hamid AM, Picard C, Grenet D, De Miranda S, Douvry B, Beaumont-Azuar L, Sage E, Devaquet J, Cuquemelle E, Le Guen M, Spreafico R, Suberbielle-Boissel C, Stern M, Parquin F; Foch Lung Transplantation Group.
Am J Transplant. 2016 Apr;16(4):1216-28. doi: 10.1111/ajt.13589. Epub 2016 Feb 4. PubMed PMID: 26845386.

Genomic Flatlining in the Endangered Island Fox.
Curr Biol. 2016 Apr 20. pii: S0960-9822(16)30173-7. doi: 10.1016/j.cub.2016.02.062. [Epub ahead of print] PubMed PMID: 27112291.
Collaboratory Services
Analyses Service
The Collaboratory has
implemented computational
methods and procedures to
analyze sequencing data for the
UCLA community.
We have developed pipelines that
are specifically designed to
optimize the use of the Hoffman2
cluster resources.
As part of this service, QCBio
Collaboratory Fellows will
analyze next-generation
sequencing data submitted to us
and provide users with their
results. Examples of data that we
can analyze include RNA-seq,
ChIP-seq, and bisulfite-seq, and
we can also perform variant
calling on whole genomes or
exomes.
Please contact Matteo Pellegrini
[email protected], if you
are interested in participating.
The service is provided at no cost
to users, but we do ask that the
Collaboratory Fellow who analyzes
your data be acknowledged as an
author on any eventual
publications.
Genome Browser Tools
The Collaboratory hosts and administers a local server for the UCSC genome browser tools that allows users to view genomic data associated with genomes not supported by the UCSC site.
It contains the following major features:
●Graphical display of genes, gene structures, and gene annotations
●BLAT alignment of DNA sequences against a custom reference genome
●Graphical display of custom tracks alongside existing genomic annotation tracks
The common uses of the genome browser for NGS data analysis include:
●Display and view gene expression profiles
●Examine genomic data base-by-base
●Display and examine genomic variations
●Compare, in parallel, genomic data derived from different technologies, such as NGS and microarray
●Share genomic data with collaborators
●Prepare figures for publications