Page 1: Characterisation of RNA in milk and prediction of microRNA …dro.deakin.edu.au/.../kumar-characterisation-2014A.pdf · Characterisation of RNA in milk and prediction of microRNA

Characterisation of RNA in milk and prediction of microRNA functions

by

Amit Kumar (MSc)

Submitted in fulfilment of the requirements for the degree of

Doctor of Philosophy

Deakin University

November 2013

Acknowledgements

This thesis was written with generous support from countless individuals who shared their knowledge and expertise. First and foremost, I would like to thank my supervisor Christophe Lefèvre for introducing me to the multidisciplinary field of bioinformatics. I appreciate all his invaluable, outstanding, and inspiring contributions, which made my PhD experience productive and stimulating. His selfless time and care were sometimes all that kept me going.

My profound thanks to Rob Moore and Mark Tizard at CSIRO AAHL for countless brainstorming discussions that helped me to focus. I am especially grateful to Rob Moore for his advice and patience in getting me through tough times in the PhD pursuit. You also introduced me to soccer and, after years of Wednesday afternoon games, I enjoyed being able to last one hour on the soccer field. I am grateful to Kevin Nicholas for his invaluable input and expert guidance in lactation biology.

I would like to acknowledge Adam and Philip Church from the School of IT for their valuable support in high-performance cluster computing. I enjoyed the interesting talks about designing clouds on your servers. It was a truly multidisciplinary effort and I appreciate the help from lab members at both Deakin University and CSIRO AAHL. These groups have been a source of lifelong friendships as well as good advice and collaborations.

I will forever be thankful to Pragati for all her constant support and unconditional love. Thank you for being a source of happiness, laughter and lifelong companionship. Last but not least, I would like to take this opportunity to thank my parents and family for their love and encouragement; they took the best possible care in raising me and supported me in all my endeavors.

Amit Kumar

Deakin University

November 2013

List of publications

miRNA_Targets: A database for miRNA target predictions in coding and non-coding regions of mRNAs. Amit Kumar, Adam K.-L. Wong, Mark L. Tizard, Robert J. Moore, and Christophe Lefèvre. In: Genomics 2012 Dec;100(6):352-6. URL http://mamsap.it.deakin.edu.au/mirna_targets/

The emerging role of micro-RNAs in the lactation process. Amit Kumar, Laurine Buscara, Sanjana Kuruppath, Khanh Phuong Ngo, Kevin R. Nicholas, Christophe Lefèvre (2012). In: Lactation: Natural Processes, Physiological Responses and Role in Maternity. NOVA publishers (In press). https://www.novapublishers.com/catalog/product_info.php?products_id=31495

Buffalo milk transcriptomics. Sanjana Kuruppath, Amit Kumar, Vengama Naidu Modepalli, Ngo Khanh Phuong, Sally Louis Gras, Christophe Lefèvre (2013). Published in: Buffalo World Congress 2013.

Lactating fur seal case study suggests mir21 may contribute to mammary gland involution evasion during off-shore lactation via regulation of PTEN. Laurine Buscara, Amit Kumar, Kevin R. Nicholas, Christophe Lefèvre (2013). In: AJAR 2013 April(1):5-8.

Abbreviations

AGO Argonaute protein family

cDNA Complementary deoxyribonucleic acid

CDS Coding sequences

Col. Colostrum

CTGF Connective tissue growth factor

DGCR8 DiGeorge Syndrome Critical Region Gene 8

Diff. Difference

DNA Deoxyribonucleic acid

ESC Embryonic Stem Cell

FBS Fetal bovine serum

HITS-CLIP High Throughput Sequencing Crosslink Immunoprecipitation

hsa-miR Homo sapiens microRNA

kb Kilobase

LNA Locked nucleic acid

M.M. Milk

miR MicroRNA

miRNAs microRNAs

mmu-miR Mus musculus microRNA

mRNA Messenger RNA

ng Nanogram

nt Nucleotide

PBS Phosphate buffered saline

Pre-miRNA Precursor microRNA

pri-miRNAs Primary microRNAs

RISC RNA-induced silencing complex

RNA Ribonucleic acid

RNAi Ribonucleic acid interference

RNase Ribonuclease

siRNA Small interfering ribonucleic acid

UTR Untranslated region

Abstract

MicroRNAs (miRNAs) are a class of small non-coding RNA molecules. These regulatory molecules target hundreds of genes in a sequence-specific manner, leading to inhibition of translation and the regulation of gene expression at the post-transcriptional level. However, their functions and mechanisms of action have not been fully characterised. Computational tools combined with deep sequencing technologies are being applied to decode RNA interaction networks. The aim of this thesis was to characterise miRNAs in milk and to study their functions. First, methods for the prediction and analysis of miRNA target networks were developed to assist the investigation of the role of miRNAs as gene regulatory elements. When this study was initiated, all publicly available miRNA target prediction web resources were limited to 3′ UTR sequences, although experimental evidence had started to reveal functional target sites in coding and 5′ UTR regions. The miRNA_Targets database was therefore developed to comprehensively cover predicted miRNA target sites across full-length mRNAs in various model organisms. Bioconductor packages for ontology classification were integrated into this framework to classify predicted miRNA target networks into Gene Ontology (GO) categories, and the integration of the Circos visualisation software allowed chromosomal position maps of target gene networks to be drawn. Following the recent discovery of secretory miRNAs in milk, a number of milk miRNA profiling experiments conducted in the lab were analysed. Specifically, the presence of miRNAs in milk was confirmed across various mammalian species. First, mature miRNAs and mRNAs were cataloged in buffalo colostrum and milk. The results confirm that small RNA isolated from colostrum and milk contains a large population of miRNAs. Over 700 putative miRNAs were identified and quantified by RNA sequencing in buffalo milk and colostrum. The analysis of buffalo colostrum and milk cell transcriptomes also detected the expression of thousands of genes, including 5657 newly annotated buffalo genes. miRNAs were then characterised in milk from various species including platypus, echidna, seal, wallaby and alpaca.
Highly conserved miRNAs were found by comparative analysis of miRNAs in milk. Common highly expressed milk miRNAs were miR-148, miR-191, miR-30a, miR-184, let-7f and miR-375. Further, the differing expression profiles of miRNAs in wallaby milk and serum indicated a mammary gland origin of milk miRNAs, although no evidence for epithelial cell biogenesis could be found in a study of the buffalo milk cell transcriptome. Interestingly, significant temporal variations were characterised in milk miRNA profiles during the lactation cycle of the marsupial wallaby and between colostrum and milk of the buffalo. The work presented here has identified some of the key secretory miRNAs that may be involved in such processes, providing a more complete comparative catalog of milk composition and illustrating some of the approaches that can be deployed to interpret the data and generate new hypotheses for further investigation of the functional relevance of milk miRNAs.

List of tables

Table 1.1 A list of miRNA databases.
Table 2.1 Number of genes from the Ensembl database and miRNAs from miRBase (release 18) in the webserver.
Table 3.1 Raw data processing steps.
Table 3.2 Summary of small RNA alignments to the cow genome.
Table 3.3 Top ranked miRNAs expressed in individual colostrum and milk samples.
Table 3.4 Log2-transformed expression of the 10 most highly expressed miRNAs in colostrum, milk and colostrum exosome only samples.
Table 3.5 Statistically significantly differentially expressed miRNAs (without applying multiple-testing correction).
Table 3.6A Top differentially enriched miRNAs in colostrum compared to milk.
Table 3.7A Most abundant miRNAs in colostrum exosomes compared to colostrum whey.
Table 3.7B Most abundant miRNAs in colostrum whey compared to colostrum exosomes.
Table 3.8 Predicted miRNAs with more than 500 reads.
Table 3.9 Number of unaligned reads left after cow genome alignments. Clustered reads represent the number of unique sequences.
Table 3.10 The most abundantly expressed sequences greater than 26 nt.
Table 3.11 Comparative abundance of milk miRNA.
Table 3.12 Ontology classification and enrichment scores of selected miRNA target genes.
Table 4.1 Number of raw sequence reads and reads filtered out for impurities. CC: colostrum samples, MM: milk samples.
Table 4.2 Number of novel and previously annotated genes expressed in each milk sample.
Table 4.3 Gene expression levels calculated by Cufflinks.
Table 4.4A Top 10 abundant genes in colostrum.
Table 4.4B Top 10 genes expressed in milk.
Table 4.5A Top 10 most statistically significantly colostrum-enriched genes with log-normalized expression values and differences.
Table 4.5B Top 10 most statistically significantly milk-enriched genes with log-normalized expression values and differences.
Table 4.6 Ontology classification of colostrum-enriched genes (p-value < 0.05, without applying a multiple-testing correction filter).
Table 5.1A Processing of wallaby small RNA libraries.
Table 5.1B Processing of platypus, echidna, alpaca and seal small RNA libraries.
Table 5.2A MicroRNA analysis using miRanalyzer: wallaby milk miRNAs aligned against monDom5 miRNAs (miRBase 19).
Table 5.3 Top 10 miRNAs found in individual samples.
Table 5.4 Wallaby milk miRNA clusters obtained with the self-organizing tree algorithm (SOTA) from expression levels across 5 time points of the lactation cycle.
Table 5.5 Milk- and serum-specific miRNAs in wallaby, combined for days 118 and 175.
Table 5.6 Highly abundant milk- or serum-enriched miRNAs.
Table 5.7 Platypus and echidna miRNAs aligned against ornAna1 miRNAs from miRBase 19.
Table 5.8 Top 10 most abundant miRNAs in platypus, echidna, seal and alpaca milk.
Table 5.9 miRNAs in human milk samples.
Table 5.10 miRNAs in pig milk samples.

List of figures

Figure 1.1 Biogenesis of microRNA.
Figure 1.2 Overview of micro-RNAs involved in mammary gland development.
Figure 1.3 Comparative analysis of milk micro-RNA.
Figure 1.4 Milk content profiles of candidate micro-RNA markers.
Figure 1.5 Modes of miRNA action.
Figure 1.6 miRNA hairpin entries in miRBase (2002 to 2012).
Figure 2.1 Flow chart diagram of sequence datasets and algorithms used to make this database.
Figure 2.2 Comparison of miRNA target prediction algorithms on CLIP-Seq datasets from the starBase database.
Figure 2.3 Snapshot of the miRNA_Targets database web interface.
Figure 2.4 let-7c miRNA targets.
Figure 3.1 Distribution of sequence lengths in RNA samples from sequenced libraries.
Figure 3.2A Cumulative distribution plot of colostrum, milk and colostrum exosome only samples.
Figure 3.2B Cumulative distribution of RNA quantification before and after normalization to the 75th percentile.
Figure 3.3 Data tree of all annotated buffalo miRNAs in colostrum and milk samples.
Figure 3.4A Box-and-whisker plot of 75th-percentile normalisation of colostrum, milk and colostrum exosome only samples.
Figure 3.4B Box-and-whisker plot of quantile-quantile normalized colostrum and milk samples.
Figure 3.5 Overlap of known colostrum and milk miRNAs.
Figure 3.6A Scatter plot of buffalo milk miRNA probes.
Figure 3.6B miRNA expression profiles of colostrum and exosome only samples.
Figure 3.7 miRNA read alignments to miR-101.
Figure 3.8 Comparison of colostrum 26-40 nt and 26-80 nt size-selected samples.
Figure 4.1 Workflow of sample processing and bioinformatic analysis of RNA datasets.
Figure 4.2 Comparison of Bowtie and TopHat alignments on the alpha-casein (CASA1) gene.
Figure 4.3A Count vs dispersion plot by condition for all genes.
Figure 4.3B The squared coefficient of variation (CV²) of cross-replicate variability between gene and isoform levels.
Figure 4.3C Density plots of the distributions of gene and isoform FPKM scores across colostrum and milk sample replicates.
Figure 4.4 Data tree of normalized RNA-seq samples from colostrum and milk.
Figure 4.5 Normalized feature probes of colostrum and milk samples after removing the MM4 milk sample.
Figure 4.6 Differentially expressed genes in colostrum and milk from the Cuffdiff algorithm.
Figure 4.7B Scatterplot of differentially expressed genes in colostrum and milk.
Figure 4.7A Functional classification of colostrum-enriched genes viewed as a pie chart.
Figure 4.7B Functional classification of milk-enriched genes viewed as a pie chart.
Figure 4.8B ECM-receptor interaction pathway.
Figure 4.8C Axon guidance pathway.
Figure 4.9 Annotated miRNA cluster on chromosome 7.
Figure 5.1 Development of tammar wallaby young after birth from day 0 to 350.
Figure 5.2A Wallaby small RNA sequence length distributions.
Figure 5.2B Platypus, echidna, seal and alpaca small RNA sequence length distributions.
Figure 5.3 Wallaby milk miRNA abundance profiles across 5 lactation time points and two serum samples.
Figure 5.4 Clustering of wallaby milk and serum miRNA profiles by hierarchical clustering (single linkage, Pearson correlation).
Figure 5.5 Clustering with HCL across 5 time points of the lactation cycle illustrating individual miRNA expression.
Figure 5.6 Venn diagrams of miRNAs found in milk and serum.
Figure 5.7A Two Venn diagrams with all platypus miRNAs versus miRNAs with ≥ 5 reads.
Figure 5.7B Two Venn diagrams with all echidna miRNAs versus miRNAs with ≥ 5 reads.
Figure 5.7C Two Venn diagrams with all platypus and echidna miRNAs versus miRNAs with ≥ 5 reads.
Figure 5.8 The most abundant miRNAs (percentage distribution) among 32 samples.
Figure 5.9A Heat map of 22 miRNAs expressed across all milk and 2 wallaby blood serum samples.
Figure 5.9B Heat map of 89 miRNAs expressed in more than 20 samples.
Figure 5.9C Clustering of all miRNAs expressed in all milk and blood serum samples.
Figure 5.10 A Cytoscape network diagram of the number of predicted miRNA target genes for the most highly expressed miRNAs using the miRNA_Targets database.

Table of contents

Characterisation of RNA in milk and prediction of microRNA functions
Acknowledgements
List of publications
Abbreviations
List of tables
List of figures
Table of contents
Chapter 1 Introduction
Part A The emerging role of micro-RNAs in the lactation process
Abstract
Introduction
Overview of micro-RNAs
The emerging role of miRNAs in cellular communication
Lactation evolution and origin of the mammary gland
miRNAs in mammary gland biology
Milk miRNAs
Lactation diversity: a treasure chest for milk miRNA biology
Conclusion
Part B MicroRNA Targets
Databases
Problem statement
Key questions & motivation
Thesis Outline
Chapter 2 miRNA_Targets database
Methods and Results
Implementation
Web interfaces
Experimental example
Discussion
Conclusions
Chapter 3 MicroRNAs in buffalo milk
Introduction
Materials and Methods
Milk collection and RNA extraction
RNA-Seq data analysis
Gene target predictions and GO ontology analysis
Results
Small RNA libraries, sequencing, pre-processing and global analysis of RNA-seq data
Annotation of RNA-seq data
Comparison of colostrum and milk RNA-seq data sets
MicroRNAs in colostrum exosomes
Novel miRNAs in buffalo milk
Analysis of unaligned reads after cow genome alignments
Characterisation of small RNA sequences greater than 26 nt
Comparative analysis of buffalo, human and pig milk miRNAs
Target gene prediction, GO ontology enrichment and pathway analysis
Discussion
Chapter 4 Buffalo milk transcriptome
Introduction
Materials and Methods
Results
RNA-seq genome alignments
Comparison of Bowtie2 and Tophat2 alignments
Highly expressed genes in buffalo colostrum and milk
Differential gene expression analysis of colostrum and milk somatic cells
Comparison of buffalo and cow milk transcriptomes
Ontology and pathway enrichment analysis of colostrum and milk enriched genes
MiRNAs in PolyA selected transcriptome
Discussion
Chapter 5 Comparative analysis of milk miRNAs in seven species
Introduction
Materials and methods
Milk sample collections
RNA extraction
RNA-seq data analysis
Results
miRNAs are the most abundant small-RNA subtype in milk samples
Platypus & echidna milk miRNA expression
Seal and Alpaca milk miRNAs
Human and pig milk miRNAs
Distribution of miRNAs among all milk samples from 8 species
MiRNA target gene predictions
Discussion
Chapter 6 General discussion
Summary of work
Conclusions
Future work
References

Chapter 1

Introduction

MicroRNAs (miRNAs) are a class of non-coding RNAs playing a critical role in

post-transcriptional gene regulation in a wide variety of organisms. The literature

review has two parts (Part A and Part B). Part A was published as a book chapter

by Nova Publishers. It reviews the literature on

miRNA biogenesis and the role of miRNAs in mammary gland biology and lactation in

monotreme, marsupial and eutherian animal model species. This is followed by Part

B on miRNA target prediction and associated bioinformatics tools for the analysis of

miRNA target networks. Finally, a thesis outline is given.

Part A The emerging role of micro-RNAs in the lactation process

Amit Kumar, Laurine Buscara, Sanjana Kurupath, Khanh Phuong Ngo, Kevin R.

Nicholas and Christophe Lefèvre

Key words: milk, microRNA, lactation, mammary gland, neonate, development

Abstract

Micro-RNAs (miRNAs) are small RNA molecules known to participate in important

regulatory mechanisms through the targeting of mRNAs by sequence specific

interactions, leading to specific inhibition of gene expression. Ongoing studies have

revealed the role of miRNAs in the regulation of mammary gland development but a

role in lactation is not yet completely clear. Recently, the identification of significant

quantities of selective miRNAs in the milk of a number of mammals, together with

the recent characterisation of plant food miRNAs in the blood of people, has

precipitated an investigation of the potential role of miRNAs in the regulation of the

lactation process. This investigation should include both the process of milk

production by the mother and the post-partum development of the young. In order to

examine the role of milk miRNAs in the lactation process, we propose a comparative

framework for the analysis of lactation. We review mammalian lactation diversity

and animal models of lactation and recent literature on milk miRNA. We also

perform comparative and functional analyses of milk miRNAs and discuss the

function of milk miRNAs as informative markers of both lactation status and

maternal physiology, as well as information-carrying signals facilitating the timely

delivery of maternal developmental signals to the young.

Introduction

MicroRNAs (miRNAs) are genomically encoded small non-coding RNAs, 18 to 26

nucleotides long, that regulate the genetic information flow by controlling translation

and stability of mRNAs. Recent research has led to a rapidly expanding set of

important biological mechanisms and functions associated with miRNAs. This has

revealed their broad post-transcriptional inhibitory effect on gene expression, and

subsequent effects on the control of differentiation, metabolism, cell morphology,

polarity, migration, signal transduction, cellular communication, organogenesis,

epigenetic and developmental processes (Chen et al., 2012b, Inui et al., 2010). These

discoveries point to a new layer of regulatory mechanisms directed at the fine-tuning

of genome expression in a tissue- and time-specific manner. A broader picture is

emerging where miRNAs appear to act during the dynamic events of cell-lineage

decisions and morphogenesis, linking dynamic physiological processes with gene

expression in response to mechanical and environmental changes. Advances in the

understanding of miRNA biogenesis, target recognition and participation in

regulatory networks have demonstrated their importance in lineage decisions of

progenitor cells or organogenesis, and future discoveries in this area are likely to

reveal developmental regulation and disease mechanisms related to miRNAs (Small

and Olson, 2011). The mammary gland is the main organ directly involved in the

lactation process. The repeating developmental cycle of the mammary gland,

comprising successive cell growth, differentiation, metabolic activity, apoptosis

and remodelling, implies a process highly controlled by precise temporal gene

regulation. Below, we review the current knowledge regarding the role of miRNA

during mammary gland development.

The identification of high levels of a large number of miRNAs in the milk of

eutherian mammals (human, mouse and cow) over the last two years (Chen et al.,

2010b, Hata et al., 2010, Kosaka et al., 2010, Zhou et al., 2012a), and the recent

observation that exogenous miRNAs consumed from plant foods may directly

influence gene expression in animals (Zhang et al., 2012a), raise questions about the

use of miRNAs as biomarkers and their putative role in mammary gland physiology

together with their potential effects on growth and development of the young. Here,

we explore these aspects including the idea of miRNA signalling by milk. After

reviewing mammalian lactation diversity, we focus on the unique reproduction

strategy of marsupials and monotremes. These animals have a short gestation and a

relatively long lactation during which the newborn undergoes a significant part of its

development (corresponding to intrauterine development in the eutherian lineage).

During lactation marsupials also exhibit extensive mammary gland development, and

changes in milk composition which can be correlated with developmental stages of

the suckled young. Thus marsupials and monotremes are valuable animal models to

study the potential role of milk miRNA in the lactation process, and potentially, to

better understand the role of miRNA in signalling developmental processes in the

young.

Overview of micro-RNAs

Molecular function of miRNAs

miRNAs are small non-coding RNAs, 18 to 26 nucleotides long, which recognise

mRNAs by partial sequence complementarity to specific target sites in association

with protein complexes. This miRNA-mRNA target recognition process brings about

either the degradation or translation inhibition of targeted mRNAs, leading to the

post-transcriptional sequence specific silencing of expression of the gene (Valencia-

Sanchez et al., 2006). Due to its short sequence and partial recognition of the

complementary mRNA target sequence, any miRNA typically targets a large number

of genes. Currently, it is thought that over one thousand miRNAs potentially regulate

more than 60% of the human genome (Gunaratne et al., 2010). Therefore miRNAs

have rapidly emerged as important regulators of gene expression to control metabolic

and developmental processes, including the lactation cycle.

miRNA biogenesis

The biogenesis of miRNA is presented in figure 1.1. Most miRNAs are encoded in

the genome within RNA polymerase II dependent non-coding transcription units

(Lee et al., 2004). The promoter organization of miRNA genes is similar to coding

genes, encompassing TATA box sequences, initiator elements, CpG islands and

histone modification marks (Corcoran et al., 2009). The primary (pri-miRNA)

transcript is capped with 7-methylguanylate at the 5’ end and carries a polyA tail at

the 3’ end. Often these genes encompass several miRNAs in clusters of miRNA

families. Other miRNAs may be found within introns of protein coding genes (Cai et

al., 2004). Thus miRNA gene expression is likely integrated within the same

machinery controlling mRNA expression, including transcription factors, enhancers,

silencers and chromatin modifications. Pri-miRNA transcripts adopt secondary stem

loop structures due to the presence of imperfect complementary sequences

encompassing the miRNA. The post-transcriptional maturation of the kilobase range

pri-miRNA into ~22 nt miRNAs involves two successive cleavage steps. The first

cleavage occurs in the nucleus where the Drosha complex, composed of at least 20

proteins, hydrolyses the RNA at the base of the stem loop to release a stem loop

pre-miRNA. Nuclear export of pre-miRNA to the cytoplasm is facilitated by Exportin 5,

a Ran guanosine triphosphate (RanGTP)-dependent double-stranded RNA-binding

protein (Bohnsack et al., 2004, Corcoran et al., 2009).

Figure 1.1 Biogenesis of microRNA

In the cytoplasm the pre-miRNA is recognised by the Dicer complex composed of

the RNAse III enzyme Dicer in association with TAR RNA binding protein TRBP.

The endonuclease Dicer cleaves the pre-miRNA to release double-stranded RNA about

22 nucleotides in length. The guide strand corresponding to the miRNA is then

loaded into the RNA-induced silencing complex (RISC) to function as a mature

miRNA in the miRISC complex while the other strand, called either the passenger or

star strand (miRNA*) is usually degraded. The level of miRNA* present in the cell is

generally significantly lower relative to the corresponding miRNA, but widespread

regulatory activities have also recently been associated with passenger strand

miRNA* (Okamura et al., 2008, Yang et al., 2012). In cases where there is a higher

proportion of passenger strand present in the cell, the nomenclature miRNA-

3p/miRNA-5p is used instead of miRNA/miRNA*. miRNA-3p is the miRNA

derived from the 3' arm of the precursor miRNA, whereas miRNA-5p is the miRNA

derived from the 5' arm of the precursor miRNA.
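
As a rough illustration of the 5p/3p naming rule, the following minimal Python sketch labels a mature sequence by the arm of the precursor on which it lies. It assumes, as a simplification not made in the text, that the arm can be decided from whether the mature sequence sits in the first or second half of the hairpin. The function name arm_of and the toy sequences are hypothetical and do not come from any database.

```python
def arm_of(precursor: str, mature: str) -> str:
    """Return '5p' or '3p' for a mature sequence found in a precursor.

    Simplifying assumption: the hairpin loop sits near the middle of the
    precursor, so a mature sequence centred in the first half is on the
    5' arm and one centred in the second half is on the 3' arm.
    """
    start = precursor.find(mature)
    if start < 0:
        raise ValueError("mature sequence not found in precursor")
    centre = start + len(mature) / 2
    return "5p" if centre < len(precursor) / 2 else "3p"


# Toy hairpin (hypothetical): 5' arm + loop + 3' arm.
five_arm = "UGAGGUAGUAGGUUGUAUAGUU"
loop = "GGGAAACCC"
three_arm = "CUAUACAACCUACUACCUCAUU"
precursor = five_arm + loop + three_arm

print(arm_of(precursor, five_arm))
print(arm_of(precursor, three_arm))
```

In practice the arm would be read from the annotated hairpin structure rather than inferred from position alone.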

The RISC complex is made of a number of proteins, including members of

the Argonaute (Ago) family, which encompasses 8 proteins in the human genome

(Ago1-4 and PIWI1-4) (Hock and Meister, 2008, Sasaki et al., 2003). Ago2 is

considered to be a critical component for the functionality of the RISC complex as it

is the only protein carrying RNAse activity necessary for the degradation of the

target mRNA recognised by the loaded miRISC complex due to partial

complementarity to the loaded miRNA. RISC proteins have been located in two

subcellular compartments called P-bodies and GW-bodies (Gibbings et al., 2009).

However miRISC associated with mRNA can also be localised with membrane

fractions from the Golgi apparatus and the endoplasmic reticulum (ER) (Cikaluk et

al., 1999). P-bodies carry decapping activity. They are dispersed in the cytoplasm,

not associated with any other identifiable cell structure, enhanced during cellular

stress and are believed to be associated with miRISC degradation (Gibbings et al.,

2009, Liu et al., 2005). More interestingly, GW-bodies are closely associated with

membranes in the multivesicular body (MVB), a key subcellular structure of the

endolysosomal pathway where the sorting of cellular and endosomal components

takes place for either recycling or targeting toward lysosome or exosome excretion

pathways (Lemmon and Traub, 2000). GW-bodies appear to play an important role

in the loading of miRNA into the functional miRISC complex (Lee et al., 2009b).

They also possibly have a direct role in mRNA targeting by miRNAs since mRNAs

and miRNAs are highly enriched in MVB membrane fractions, and blocking

turnover of MVBs into lysosomes by loss of the tethering factor HPS4 enhances

siRNA and miRNA mediated silencing in Drosophila and humans, also triggering

over-accumulation of GW-bodies (Li et al., 2004). Conversely, blocking MVB

formation by ESCRT (Endosomal Sorting Complex Required for Transport)

depletion results in impaired miRNA silencing and loss of GW-bodies (Lee et al.,

2009b). GW182, a protein of the RISC complex associated with P-bodies,

ubiquitinated proteins and membranes of the MVB, plays an important role in the

localisation and turnover of the miRISC complex in the MVB, and it has been

suggested that GW182 may act as a lock by binding to AGO and inhibiting the

loading of miRNA (Gibbings et al., 2009). Thus, the functionality of miRNAs seems

to be closely associated with their activation in the MVB of the endosomal pathway

and this is probably relevant to miRNA function and the secretion of miRNA in

general, and in milk in particular.

Mechanisms of miRNA action

The relatively recent discovery of miRNA in 1993 identified the key role of lin-4,

encoding a small non-coding miRNA, in the control of larval development of the

nematode through its inhibitory control of lin-14 gene expression (Lee et al., 1993).

The mechanism of action of miRNA loaded miRISC on gene expression is mediated

by the recognition and binding of miRISC to mRNAs in a sequence specific manner.

While in plants the recognition has a high degree of sequence specificity, in animals

sequence complementarity between miRNA and mRNA preferentially covers the

seed region of the miRNA within nucleotide positions 2 to 8 and more extensive

variation exists in the 3’ sequence of the miRNA beyond these seed positions. This

has greatly impaired the development of miRNA target recognition algorithms for

the investigation of miRNA function in animals. The seed region facilitates the

nucleation of the miRNA-mRNA hybrid but a better understanding of the role of the

3’ sequence in the context of AGO protein binding in the RISC complex may help

improve sequence based target prediction methods (Wang et al., 2010b). Once

bound to the mRNA, miRISC may influence its expression by different mechanisms,

including inhibition of translation elongation, co-translational protein degradation,

competition for the cap structure, inhibition of ribosomal subunit joining or polyA tail

deadenylation and decapping (Eulalio et al., 2008).
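
The seed-based recognition described above can be sketched as a simple sequence search. This is a minimal illustration, not one of the published target prediction algorithms: it only reports perfect Watson-Crick matches to miRNA positions 2 to 8 and ignores 3’ pairing, site context and conservation. The function names and the example sequences are hypothetical.

```python
# Watson-Crick pairing for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def seed_sites(mirna: str, utr: str) -> list[int]:
    """Return 0-based start positions in `utr` that pair perfectly
    with the miRNA seed (nucleotide positions 2 to 8, 1-based)."""
    seed = mirna[1:8]                 # positions 2..8 of the miRNA
    site = reverse_complement(seed)   # sequence expected on the mRNA
    width = len(site)
    return [i for i in range(len(utr) - width + 1)
            if utr[i:i + width] == site]


# Toy inputs (illustrative only).
mirna = "UGAGGUAGUAGGUUGUAUAGUU"
utr = "AAACUACCUCAGGGCUACCUCUU"

print(seed_sites(mirna, utr))  # two seed-complementary sites
```

Because a 7-nucleotide match occurs by chance in many transcripts, a search of this kind returns a large number of candidate sites per miRNA, which is consistent with the observation that a single miRNA typically targets many genes.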

miRNA Secretion

A diversity of miRNAs has been identified in cells and, more recently, body fluids

including milk, milk vesicles and milk exosome fractions (Chen et al., 2010b, Hata et

al., 2010, Kosaka et al., 2010). In addition serum has been shown to contain protein

bound miRNAs, including proteins such as AGO2 and high-density lipoprotein HDL

(Arroyo et al., 2011, Vickers et al., 2011). MiRNAs do not seem to simply diffuse

passively out of the cells where they are produced, but may instead be selectively

exported inside vesicles (Wang et al., 2010a). One important property of secretory

miRNAs is that they are very stable in acidic conditions and are protected from

RNAse digestion (Kosaka et al., 2010, Mitchell et al., 2008). Secreted miRNA

profiles do not fully overlap with cellular miRNA profiles suggesting some

selectivity in miRNA secretion in an energy-dependent manner (Wang et al., 2010a).

In cow milk, there is no clear correlation between serum and milk miRNA contents

and a large enrichment of miRNA is observed in milk compared to serum, indicating

selectivity in diffusion of miRNAs through the epithelial barrier as well as the high

miRNA secretory activity of the lactating mammary gland (Weber et al., 2010, Hata

et al., 2010). These studies indicate the importance of mammary gland miRNA

biogenesis on milk composition.

There are at least four different pathways by which miRNAs are secreted leading to

different secretory particles: protein-miRNA complex secretion, microvesicle

budding, exosome and apoptotic body secretion (figure 1.1). The secretory pathways

were first recognised from the identification of miRNAs in cell culture media or

serum (Turchinovich et al., 2011). The secretion of miRNA complexed with HDL is

regulated by neutral sphingomyelinase, in a manner opposite to exosome secretion

suggesting that these secretory pathways are different. It was shown that HDL can

deliver miRNA signals to recipient cells and this was demonstrated to be scavenger

receptor BI-dependent (Vickers et al., 2011). In addition, variation in serum HDL

miRNA composition can be associated with familial hypercholesterolemia (Vickers

et al., 2011). The interaction of miRNA with HDL is thought to involve interaction

between miRNA and lipids. In the serum it has been claimed that the majority of

circulating miRNAs are found in the form of a protease sensitive complex with the

AGO2 protein, while only a small proportion of serum miRNAs are associated with

microvesicles (Arroyo et al., 2011).

Microvesicles and exosomes are often named circulating microvesicles. However,

while they are both surrounded by lipid membranes, these secretory vesicles are

produced by two distinct pathways. Microvesicles (also called microparticles or

ectosomes) vary in size from 50 to 1000 nanometers in diameter and are produced by

budding and fission of the cellular plasma membrane (Keller et al., 2006, van Niel et

al., 2006). This process, which is regulated by calcium, involves local lipid organization

of the cell membrane as well as reorganisation of the cell structural scaffold.

Contractile molecules play a direct role in bringing opposite parts of the membrane

together to allow budding of small vesicles sequestering part of the surrounding

cytoplasm inside the vesicle, including RNA (pre-miRNA, mRNA, miRNA) and

cellular proteins. Exosomes are about 100 nm in diameter and are formed inside the

cell. Endocytic invaginations of cell membranes form intracellular endosomes.

Endosomes then segregate within the MVB structure which later fuses with the

plasma membrane to release its exosome content. As we have seen above, miRNA

biogenesis appears to be highly integrated with the MVB and inhibition of MVB

turnover has profound effects on miRNA function.

Apoptotic bodies, also called apobodies, are small membrane vesicles produced by

cells undergoing cell death by apoptosis. The formation of apoptotic bodies is

thought to prevent the release of free, potentially toxic or immunogenic, cellular

content by dying cells in order to avoid inflammation or autoimmune reactions as

well as tissue destruction. Apoptotic bodies are actively taken up by phagocytosis by

phagocytes and other cells for degradation. They have been shown to carry nucleic

acids and have been implicated in the transmission of viral DNA (Halicka et al.,

2000).

In the mammary gland the secretion of milk miRNA may not be limited to these

four mechanisms. It is likely that milk fat globules (MFG) also contain miRNA as it

has been shown that fragments of cytoplasm are incorporated in the MFG during

exocytosis of lipid droplets and that cellular RNA can be recovered from MFG

fractions (Maningat et al., 2009). Profiling of miRNA in these different fractions

will further our understanding of the selective excretion of milk miRNA and how

these compartments differentially reflect the physiological activity of the mammary

gland. This will potentially allow the development of new and powerful markers of

mammary gland physiology, enable the identification of new potential milk miRNA

signalling pathways, and permit the evaluation of the impact of the environment on

the lactation process.

Significantly, in human milk the majority of miRNAs are found in vesicle or

exosome fractions with higher abundance of immune related miRNAs during the

first 6 months of lactation (Kosaka et al., 2010, Zhou et al., 2012a). Our unpublished

results also show a large exosomal milk miRNA fraction in the milk of other species.

The emerging role of miRNAs in cellular communication

The mammalian genome is currently estimated to encode more than 1400

miRNA species. A diversity of miRNAs has been identified in cells and more

recently, body fluids including milk (Hata et al., 2010, Kosaka et al., 2010). At the

same time, a new hypothesis about the role of miRNA in inter-cellular

communication has also been suggested in which secreted miRNAs serve as

signalling molecules (Chen et al., 2012b). Human blood cells and cultured THP-1

cells selectively package miR-150 into actively secreted microvesicles and deliver

miR-150 into human microvascular endothelial HMEC-1 cells (Zhang et al., 2010).

The elevated exogenous miR-150 effectively reduced c-Myb expression and

enhanced cell migration in these cells. Moreover, intravenous injection of THP-1 cell

culture purified microvesicles significantly increased the level of miR-150 in mouse

blood vessels. Microvesicles isolated from the plasma of patients with

atherosclerosis contained higher levels of miR-150 and promoted cell migration

more effectively than microvesicles from healthy donors (Zhang et al.,

2010). Communication between endothelial and smooth muscle cells through

circulating miRNA-143/145 has also been reported, providing a novel treatment

approach for atherosclerosis based on circulating miRNA microvesicles

(Hergenreider et al., 2012). In plants, circulating miRNA-165/166 species have been

shown to specify the radial position of xylem vessels in roots of Arabidopsis and

miRNAs provide signals that are transported from the shoot to the root (Furuta et al.,

2012). miRNAs have also been implicated in oocyte-somatic cell communication in

the ovary (Hawkins and Matzuk, 2010).

These results demonstrate that cells can secrete miRNAs and deliver them into

recipient cells where these exogenous miRNAs can regulate target gene expression

and recipient cell function. This is further supported by a ground-breaking study,

which recently showed that exogenous miRNAs consumed from plant foods pass

through the intestinal mucosa into the plasma and are delivered to specific organs

where they directly influence gene expression in animals (Zhang et al., 2012a).

These observations suggest that miRNA concentrations in body fluid compartments

may be at least partially under biological control. This points to a possible role of

milk miRNA as new markers or essential players of mammary gland function as well

as potential agents communicating signals of maternal origin to the young during

lactation. The challenge is now to fully understand the extent of specific biological

function of miRNAs. The possibility for such mother-young signalling may open

new perspectives toward the understanding of a healthy start in life.

Lactation evolution and origin of the mammary gland

Evolution of lactation

The nourishment of the young with copious milk secretion by the mammary gland

during lactation is an important aspect of mammalian reproductive physiology. The

naming of the class Mammalia by Linnaeus in 1758 (Linnaeus, 1758) implicates

lactation as a dominant character of mammalian species. Lactation is an essential

period of the mammalian reproduction cycle. Young mammals are totally dependent

upon milk for survival and the nursing of the young during lactation may have

facilitated the development of affectivity and learning in man (Peaker, 2002). Thus,

lactation is an important aspect of mammalian evolution. Fossil and molecular

analyses place the appearance of early mammals on the synapsid branch of the tree of

life during the end of the Triassic period (from 210 to 166 million years ago), but a

complex lactation system was already established in these ancient mammals as

confirmed by comparative genome analysis (Oftedal, 2002a, Lefevre et al., 2010a).

For example, in the egg-laying monotremes, milk cell cDNA sequencing has shown

the highly conserved organization of casein genes in all mammalian genomes

suggesting that the establishment of concerted expression of caseins and whey

proteins in the mammary gland predates the common ancestor of living mammals

166 to 240 Mya (Lefevre et al., 2010a). Thus, it is recognised that lactation

gradually evolved into a complex system during the Triassic, between 320 and 200

million years ago alongside other mammalian characters such as the development of an

integument, endothermy and fur.

The exact adaptive pathways by which the mammary gland and lactation may have

been gradually established during synapsid evolution are controversial due to both

the lack of direct fossil evidence for mammary gland development or parental

care, and the absence of living representatives of these early lineages. In

contemporary mammals, lactation is a complex process involving morphological,

physiological, biochemical, ecological, and behavioural adaptations, which could not

have arisen suddenly, and a simpler process must have taken place at the origin of

lactation and gradually evolved into a more complex system. Unfortunately, little

evidence of simpler intermediary processes has survived through therapsid evolution

to be observed today. The survival advantage that could be gained through the

neonate’s chance consumption of a new secretion from the cutaneous glands of a

parent has been a challenge to the theory of evolution. Darwin himself addressed

this in the sixth edition of “The Origin of Species” (1872) (Darwin, 1872),

only to be misled by hypothesising a common origin for the uterine secretion of fish

that give birth to live young and milk produced by the mammary gland. It is now

recognized that mammary glands instead evolved from a glandular integumentary

system (Blackburn et al., 1989, Oftedal, 2002a).

Origin of the mammary gland

The structure of the mammary gland is similar to skin glands, but there is some

controversy regarding the exact type of ancestral gland from which it evolved.

Blackburn has suggested that the mammary gland evolved through the combination

of different skin gland populations into new functional units (Blackburn, 1993), but

Oftedal argued that the mammary gland most likely derives from an ancestral

apocrine-like gland associated with hair follicles (Oftedal, 2002a), an association that

is still observed in features of mammary gland development of monotremes and

marsupials. This suggests that lactation has coevolved with hair in the cynodont

lineage, possibly first in the ventral patch, where it played a functional role in the

nursing of eggs through proto-lacteal secretory glands. Lactation could have

preceded the appearance of endothermy and fur and may have played a direct role in

their evolution. The exact pathways of mammary gland evolution remain speculative.

However, it is now accepted that mammary glands likely derived from ancestral skin

glands, probably of the apocrine type, and that these glands acquired a functional

role in the nursing of the egg or the young of mammalian ancestors by providing

protolacteal secretions.

Scenario for the evolution of lactation

Several hypotheses have been advanced for the selective value of primitive

protolacteal secretions including thermoregulation of eggs and young through

evaporative cooling (Haldane, 1965) or heat transfer from the mother (Bresslau,

1907), prevention of desiccation of eggs (Haldane, 1965, Oftedal, 2002b),

maintenance of proximity between the mother and the egg or the young through a

glue-like secretion (Gregory, 1910), pheromones operating as a milk precursor (Duvall, 1983), and

protection of the egg via secretions with antibacterial activity (Blackburn, 1985).

While the exact original drivers of the selective advantage of lactation remain

elusive, it seems possible that the secretion by chance of a primitive molecule of

immunoprotective, nutritional, or communicative value may have provided selective

advantage to favour the incremental establishment of the lactation process. The true

origin of lactation and the timing of the evolution of its most primitive form remain

speculative, but by 200 Mya, a complex lactation system with a highly differentiated

mammary gland specialized in the production of complex milk secretions, including

a large number of milk-specific proteins, had evolved (Lefevre et al., 2010a). During

subsequent mammalian radiation, a variety of lactation strategies have been adopted

in different lineages to account for the diversity of environmental and behavioural

adaptations found in extant mammalian species. It is likely that miRNAs have

contributed to these processes and the study of their participation in the lactation

process may enable new insights on the evolution of lactation.

miRNAs in mammary gland biology

Mammary gland development

The mammary gland, a mammal-specific organ, is an exocrine tubulo-alveolar

gland whose function is to produce milk, a complex mixture of carbohydrates,

proteins and lipids, essential for the development and health of the new-born

(Picciano, 2001). This gland is composed of an epithelial network surrounded by

stroma. The epithelium consists of a layer of secretory luminal cells (ductal and

alveolar cells) directly in contact with the lumen and a layer of myo-epithelial cells

in contact with the basement membrane. The stroma is further subdivided between

the periductal stroma, mainly composed of fibroblasts and collagen-rich extracellular

matrix, which directly surrounds the ducts, and the fat pad composed of adipocytes,

fibroblasts, endothelial cells and mast cells, which provide support for the gland.

The mammary gland differs from all the other organs, as its development mainly

occurs after birth, undergoing important developmental cycles under control of

female reproductive hormones and local factors (oestrogens, progesterone, etc.)

(Medina, 1996). Indeed, even though the mammary anlage is established during

foetal development, at birth, the mammary gland is rudimentary, mainly composed

of short ducts restricted to the nipple region. Ductal elongation and branching are

mainly achieved during puberty, with proliferation taking place in the end bud area,

located at the duct apex. During pregnancy, dramatic alveolar differentiation

occurs. During lactation, the alveolar cells start producing milk while the

myoepithelial cells allow the ejection of milk from the alveolar lumen into the duct.

Upon weaning, the cessation of suckling induces a rapid down-regulation of milk

protein gene expression, followed by alveolar epithelial cell death concomitant

with the collapse and remodelling of the mammary gland, returning the gland to a

virgin-like stage. This is the involution phase (Stein et al., 2007).

miRNA in mammary gland development

Although numerous studies have addressed the distribution of miRNAs in

breast cancer, and more generally cancer where it is believed that the power of

miRNA profiling is to allow a better classification of tumours, few studies have yet

been conducted during normal mammary development (Sdassi et al., 2009, Silveri et

al., 2006). Even though the role of miRNAs in mammary gland development remains

generally unclear and no miRNA has been found to be mammary gland specific so

far (Liu et al., 2004), it has been reported that some miRNAs present a specific

expression pattern during the different stages of mammary gland development

(figure 1.2). Early experiments examining about 10% of the human miRNome have

revealed a breast-specific signature composed of at least 23 miRNAs (Sdassi et al.,

2009, Silveri et al., 2006). Similar profiling studies in mice have estimated that up to

half of all miRNAs tested were expressed at different stages of mammary gland

development and cloning experiments have identified miRNAs, which are

differentially expressed during organogenesis (Avril-Sassen et al., 2009, Wang and

Li, 2007). Other reports have described miRNA signatures across a more comprehensive

set of mammary gland development stages, including lactation and involution (Wang

and Li, 2007). Twenty-one miRNAs were specifically down-regulated (mmu-miR-292-5p,

mmu-miR-290, mmu-miR-126-3p, mmu-miR-138, mmu-let-7g, mmu-miR-106a,

mmu-miR-10b, mmu-miR-142-3p, mmu-miR-152, mmu-miR-183, mmu-miR-195,

mmu-miR-291a-3p, mmu-miR-376b, mmu-miR-409, mmu-miR-431, mmu-miR-376a,

mmu-miR-467a, mmu-miR-30e, mmu-miR-1, mmu-miR-221, mmu-miR-129-5p) and 17 were

up-regulated (mmu-miR-434-3p, mmu-miR-30d, mmu-miR-129-3p, mmu-miR-365,

mmu-miR-483, mmu-miR-345, mmu-miR-542-5p, mmu-miR-381,

mmu-miR-291a-5p-291b-5p, mmu-miR-133b, mmu-miR-484, mmu-miR-133a-133b,

mmu-miR-324-5p, mmu-miR-375, mmu-miR-434-5p, mmu-miR-361, mmu-miR-146b) in the

Gestation/Lactation stage compared to the Virgin/Involution stage (Avril-Sassen et

al., 2009). In 2009, Avril-Sassen et al. undertook to characterize each murine

mammary developmental stage by specific miRNA clusters (Avril-Sassen et al.,

2009). Globally, they found that the majority of miRNAs were expressed

intermittently throughout development and were absent in one or more stages. They

therefore identified seven expression clusters with distinct complex temporal

expression profiles. While some clusters are characterized by decreased expression

during lactation and involution (miR-196a/b and miR-203), others are highly

expressed during lactation and early involution (among them miR-141, miR-200a,

miR-429 and also miR-146b, miR-210 and multiple members of the miR-148 and

miR-181 families), or during puberty, maturity and early gestation (first five days), such as

members of the let-7 family. Interestingly, miRNAs belonging to the same cluster often

share identical seed sequences, which suggests a common evolutionary origin

and common targeting of gene expression. Expression clusters are often located

inside genomic miRNA clusters and can be co-expressed in polycistronic transcripts.

Thus, it is likely that some miRNAs expressed during puberty and gestation could

have a role in proliferation and invasion during mammary gland development, while

miRNAs expressed during lactation and early involution could have a role in the

regulation of innate immune response and inflammation, as it has been observed for

miR-146b (Monticelli et al., 2005). Inversely, miRNAs up-regulated during lactation

and early involution could be associated with increased contact between epithelial

cells during the early reversible stages of involution and prevent involution during

lactation. This is the case for miR-200 family members, known to inhibit epithelial to

mesenchymal transition; a process similar to the loss of contact between epithelial

cells during the transition from lactation to involution (Ucar et al., 2011, Gregory et

al., 2008). These observations that each stage of the lactation cycle is characterized

by a particular mammary gland miRNA expression profile suggest that miRNAs

play an important role in mammary gland biology.
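
The observation that co-expressed miRNAs often share a seed sequence can be checked computationally. The following is a minimal sketch, assuming the common convention that the seed corresponds to nucleotides 2 to 8 of the mature miRNA; the function names and the three sequences are hypothetical placeholders introduced for illustration and are not real miRNA sequences or data from the cited studies.

```python
from collections import defaultdict


def seed(mature_sequence, start=2, end=8):
    """Return the seed region (1-based positions start..end, inclusive).

    Positions 2-8 are an assumed convention; DNA-style input (T) is
    converted to RNA (U) so that both notations group together.
    """
    rna = mature_sequence.upper().replace("T", "U")
    return rna[start - 1:end]


def group_by_seed(mirnas):
    """Group miRNA names by their seed sequence."""
    groups = defaultdict(list)
    for name, sequence in mirnas.items():
        groups[seed(sequence)].append(name)
    return {s: sorted(names) for s, names in groups.items()}


# Hypothetical placeholder sequences, NOT real miRNAs.
toy_mirnas = {
    "toy-miR-A": "UGGACUUAGCCAAGGUUCAGCA",
    "toy-miR-B": "AGGACUUAGGUUCCAAGUCAUG",
    "toy-miR-C": "UCCGAAUAGCUAGGCAUUCGAA",
}

if __name__ == "__main__":
    for seed_sequence, names in group_by_seed(toy_mirnas).items():
        print(seed_sequence, names)
```

In practice the input would be mature sequences from an annotation database, and miRNAs falling in the same group would be candidates for a shared family and overlapping target sets.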

One direct functional proof that miRNAs participate in the control of

mammary gland development was provided by Ucar et al. (Ucar et al., 2011) when

they showed how the miR-212/132 family was necessary for epithelial-stroma cell

communication during mammary gland pubertal development. Using miR-212/132

null mice, the authors observed that while the embryonic and pre-puberty mammary

gland development are normal, ductal outgrowth and invasion of the fat pad are

impaired during puberty. Interestingly, during gestation, the lobulo-alveolar

differentiation and the milk secretion were successful, even though restricted to the

nipple area. By performing transplantation experiments, they showed that the

miR-212/132 family was required by stromal cells to maintain a low expression of the

matrix metalloproteinase 9 (MMP9) in the periductal stroma. This proteinase is

responsible for the degradation of collagen, which acts as a reserve of

transforming growth factor TGFβ, maintained in an inactive form (Annes et al.,

2003). Thus, in miR-212/132 null mice, MMP9 escapes the down-regulation

normally imposed by these miRNAs and degrades collagen, leading to the release

of activated TGFβ. This activation inhibits epithelial cell

proliferation, which is responsible for the impaired ductal outgrowth. Thus, by restricting

the expression level of MMP9, the miR-212/132 family prevents the anti-proliferative

effect of TGFβ activation on epithelial cells, allowing a normal fat pad invasion and

the establishment and maintenance of a functional epithelium network during

puberty. These observations demonstrate how miRNAs participate in the

maintenance of a proper balance in the complex regulatory mechanisms of mammary

gland development by fine-tuning gene expression of key effectors.

Figure 1.2 Overview of micro-RNAs involved in mammary gland development

In 2009, Tanaka et al. (Tanaka et al., 2009) identified miR-101a as a potential

mediator of mammary gland development, notably of the involution process. In

this study, the up-regulation of miR-101a observed at the onset of involution

suppressed the expression of the cyclooxygenase-2 (cox-2), a protein mediating cell

proliferation (Liu et al., 2001). Interestingly, this decrease of cox-2, known to happen

during involution (Tanaka et al., 2009), suggests a decisive role for miR-101a in this

process. This supposition was reinforced by the negative correlation between miR-

101a and β-casein gene expression, a cell differentiation marker also down-regulated

during involution (Liu et al., 2001) in a process shown to be related to cox-2

repression.

Recently, Cui et al. (Cui et al., 2011) highlighted a role for miR-126-3p in

mammary gland development via the regulation of the progesterone receptor,

mammary epithelial cell proliferation and beta-casein gene expression. This miRNA

was found to be down-regulated during pregnancy and lactation, probably in order to

allow side-branching and alveologenesis mediated by progesterone and other

hormones during gestation, and beta-casein gene expression during lactation. Finally

the regulation by miR-214 of lactoferrin, an important protein with immune and

apoptotic functions, has also been reported (Liao et al., 2010).

These initial studies highlight the importance of miRNA in normal development

processes associated with the lactation cycle and further studies are essential to

identify direct targets and functional mechanisms associated with the regulation of

miRNA gene expression to fully understand the normal development processes in the

breast.

Milk miRNAs

Recent studies using high throughput sequencing techniques have reported the

presence of a large number of miRNAs in bovine and human milk. Hata et al.

reported 6 bovine milk miRNAs from milk derived exosomes using real time PCR

(Hata et al., 2010). Chen et al. reported the purification of RNA from cow milk and

colostrum, including some mRNA, accounting for less than 1% of the total RNA

(Chen et al., 2010b). In addition they reported a high proportion of miRNA and

ribosomal RNA (50% of total RNA each) and identified 245 miRNAs in fresh milk

exosomes (Chen et al., 2010b). Amongst these, 47 miRNAs were present in cow

milk but not identifiable in serum, while serum contained 162 other miRNAs not

detected in milk. 108 miRNAs were significantly more highly expressed in cow

colostrum than in milk while only 8 other miRNAs showed higher expression in milk

compared to colostrum. Expression of another 129 miRNAs remained unchanged

from colostrum to milk. In total 105 miRNAs were reported to be milk specific or

milk enriched. Seven miRNAs were consistently found in both colostrum and milk

(miR-26a, miR-26b, miR-200c, miR-21, miR-30d, miR-99a, and miR-148a). These

milk specific miRNAs, apparently constantly expressed throughout lactation, were

proposed as potential biomarkers for the quality control of raw milk and other milk-

related products. A PCR assay panel for the quantitative profiling of these

miRNAs was shown to be sensitive enough to accurately estimate milk concentration

in milk dilutions or formula milk powders for children, highlighting the low miRNA

concentrations in formula milk compared to fresh milk and the value of PCR panels

for new milk quality control assays.

In human milk, Kosaka et al. reported RNA concentrations of 10-200 ng/ml

and identified 281 miRNAs in human milk using miRNA microarrays (Kosaka et al.,

2010). They showed that highly expressed miRNAs were enriched for immune

related functions. MiR-146b is present specifically in milk compared to plasma,

whereas miR-181 and miR-155, which are highly expressed in milk, are also present

at similar levels in plasma. MiRNAs were also found to be associated with

microvesicle fractions purified by centrifugation. In a more recent study using

sequencing, Zhou, et al. reported the presence of 602 unique miRNAs in breast milk

exosomes and estimated exosomal RNA concentrations in milk around 160-280

ng/ml (Zhou et al., 2012a). The ten most highly expressed miRNAs accounted for 62%

of the total miRNA content (miR-148a, miR-30b, let-7f, miR-146b-5p, miR-29a,

let-7a, miR-141, miR-182, miR-200a, miR-378). The majority (8) of these miRNAs

were also found in cow milk and were reported to be milk specific or milk enriched.

Milk exosome miRNAs

As discussed above, exosomes are tiny membranous vesicles (~30-100 nm in

diameter) that are generated in the MVB and can mediate intercellular

communication (Wang et al., 2010a). They are complex structures composed of a

lipid bilayer containing trans-membrane proteins. These vesicles can affect the

physiology of recipient cells by binding to receptors (Lässer et al., 2012). They are

secreted by various tissues and are present in different body fluids (such as amniotic

fluid, blood, malignant ascites fluid, milk, saliva and urine) (Zhou et al., 2012a,

Weber et al., 2010). There is a link between exosome biogenesis and miRISC

loading and exosomes are known to carry mRNA and miRNA molecules (Hata et al.,

2010, Valadi et al., 2007) (figure 1.1).

The characterisation of the relatively large amount of RNA in human milk exosome

fractions suggests that a large proportion of milk miRNAs are contained in exosomes

(Zhou et al., 2012a). We have obtained similar results with the milk of other species.

It will be interesting to analyse further the difference in miRNA composition of

exosomes derived from milk in comparison to whole milk, fat or whey fractions. As

exosomes may be able to travel further in the infant digestive system, this may be

relevant to the differential functional targeting of milk miRNA. Hata et al. (2010)

demonstrated the uptake of milk exosomes by cultured cells and the transfer of 6

bovine miRNAs and milk specific genes into these cells, suggesting the possible

transfer of genetic information encoded by RNA between mother and child through

milk exosomes (Hata et al., 2010).

Milk miRNA Stability

The RISC effector protein AGO2 binds post-transcriptionally to miRNAs and

increases their stability. Milk miRNAs have been shown to withstand harsh

environmental conditions including prolonged room temperature, multiple

freeze-thaw cycles, RNase digestion and even boiling (Chen et al., 2010b). Human breast milk

miRNAs were able to withstand acidic (pH 1) solution for 1 hour (Kosaka et al.,

2010). Like mRNAs, miRNAs possess differential stability (Bail et al., 2010). In

mouse embryonic fibroblasts following Dicer1 ablation, the average half-life of six

miRNAs was 119 hours (Gantier et al.). However, cellular miRNAs can be degraded at

different rates. When cells are grown at low density or cells are detached by

trypsinization or EGTA treatment, mature miR-141 is rapidly downregulated while

miR-200c from a common primary transcript (pri-miR-200c 141) remains

unaffected. This differential decay of miR-141 was shown to involve sequence

specific RNase activity (Kim et al., 2011). These studies suggest that milk miRNAs may

withstand the acidic environment of the gut and could enter cells to target specific cellular

pathways, possibly including early development of the young.
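
To give a sense of what an average half-life of 119 hours implies, the short sketch below computes the fraction of a miRNA pool remaining after a given time. It assumes simple first-order (exponential) decay, which is an illustrative assumption and not a model stated in the cited study; the function name is hypothetical.

```python
def fraction_remaining(hours, half_life_hours=119.0):
    """Fraction of miRNA remaining after `hours`, assuming first-order decay.

    The 119 h default is the average half-life quoted in the text;
    exponential decay is an assumption made for illustration only.
    """
    if hours < 0 or half_life_hours <= 0:
        raise ValueError("hours must be >= 0 and half_life_hours must be > 0")
    return 0.5 ** (hours / half_life_hours)


if __name__ == "__main__":
    for t in (0, 24, 119, 238):
        print(f"{t:>4} h: {fraction_remaining(t):.3f}")
```

Under this assumption, most of the pool would still be present after one day, which is in line with the notion that miRNAs are comparatively stable molecules.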

Immune related miRNAs in Milk

Exosomes are well known communication channels between immune cells (Admyre

et al., 2007). Four out of the top ten most highly expressed miRNAs in human breast

milk exosomes (miR-148a-3p; miR-30b-5p; miR-182-5p; and miR-200a-3p) have

designated immune-related functions (Zhou et al., 2012a). In total, 59 well-

characterized immune related miRNAs were enriched in breast milk exosomes

including T- and B- cell related miRNAs (miR-181a-b, miR-155, miR-17) and innate

immune system related miRNAs such as miR-146b, miR-92 and miR-125b. These

observations are consistent with the report that breast milk-derived exosomes can

increase the number of Foxp3+ CD4+ CD25+ regulatory T cells in infants and that

human breast milk miRNAs may induce B-cell differentiation (Admyre et al., 2007).

Comparative analysis of milk miRNAs

Comparative approaches allow a more detailed analysis of the functional evolution

of specific molecular components of lactation, emphasising the ancient origin of the

essential components of the lactation system at the molecular level, and the

evolutionary paths that have allowed the adaptation of a variety of lactation strategies

during mammalian radiation (Lefevre et al., 2010a). Despite major differences

between plant and animal miRNA biogenesis and mechanisms of action, there is

support for a common origin of miRNA in a common ancestor, most

likely associated with the appearance of multicellularity. However, miRNAs may

originate from different evolutionary mechanisms leading to the formation of

expressed hairpin RNA structures (Liu et al., 2008) and rapid rates of miRNA

evolution have been observed (Nozawa et al., 2010). While some miRNA are highly

conserved, showing purifying selection indicating their important functionality, other

more recent miRNAs seem to evolve neutrally in search of novel functions. The

identification of conserved miRNAs, both in terms of miRNA structure and

expression levels in milk will potentially allow the characterisation of miRNA

essential for the lactation process. Towards this end, we can compare datasets from

bovine and human milk, although one has to be cautious about the different

preparations used to obtain these datasets as the bovine data are from raw milk while

the human data are from milk exosomes (Chen et al., 2010b, Zhou et al., 2012a). The

results are also confounded by normalisation issues, reference database versions,

and bioinformatic annotation pipelines. Annotated online data were retrieved from

the depository for the bovine milk miRNA study, and raw data from the human study

were annotated using the online deep sequencing analysis pipeline tool (Huang et

al.). Data were normalised by estimating proportions, taking the average of the four

individual human samples from the human milk exosome study. Available data on

human milk exosomes and cow milk miRNA contents are compared in figure 1.3a,

showing the log normalised expression plot of cow versus human milk RNA after

averaging values for the 4 individual human samples. In total we were able to

identify 172 common miRNAs representing a large proportion (70%) of all miRNA

identified in the smaller cow dataset (245 miRNAs). In contrast to the relatively

strong correlation (R² ~0.9) between cow colostrum and milk from the same dataset

shown in figure 1.3b with the grouping of the data points along the diagonal, the

comparison of human and cow milk does not show this trend as the data are spread on

each side of the diagonal. If we compare cow milk and cow serum, also from the

same dataset as shown in figure 1.3c, we find a similar result with perhaps a larger

spread of data points, confirming a large difference in milk and serum miRNA

content and, possibly, only a slightly better correlation between human and cow milk

(R² 0.35 versus 0.3). This analysis shows a large overlap of miRNA populations

present in human and cow milk but a relatively poor correlation in apparent global

expression levels between the species. However, it is always possible to identify a

subpopulation of miRNAs present at concentrations within an arbitrarily close range.

For example 64 miRNAs have concentrations with less than five-fold difference

between cow and human milk, including 5 of the 7 miRNAs previously proposed as

milk-enriched quality markers in cow studies (miR-26b, miR-21, miR-30d, miR-26a

and miR-148a). 46 miRNAs are within a 3-fold difference, including four

quality-marker candidates: miR-26b, miR-21, miR-30d and miR-26a. The most highly and

consistently expressed miRNAs in this group can be identified as let-7f, miR-21, and

miR-30d, each representing 1 to 2% of the total milk miRNA content, followed by

miR-92a, miR-320a, miR-25, miR-103, let-7g, miR-26b, miR-107, miR-23a and

miR-29a with expression level estimates between 0.1 and 1%. These miRNAs

emerge as potential candidates for conserved, perhaps universal, milk markers.

Normalized expression profiles of these miRNAs in cow milk, colostrum, serum and

human milk exosomes are presented in figure 1.4. It is interesting to note that relative

concentrations in serum and milk are very variable. MiRNA levels also vary between

human individuals and often overlap between one of the individual human milk

samples and cow milk. This shows that individual variation can also be relatively

large and should be taken into consideration, as noted previously. Additional

controlled studies in the same and other species will be needed to fully address this

issue and allow the validation of conserved lactation candidates or universal milk

markers. This will also allow further identification of specific milk miRNAs together

with the influence of physiological, dietary or environmental conditions on their

expression in milk. Other studies are also needed to integrate miRNA and mRNA

gene target expression datasets in order to further explore the potential benefits of

miRNAs in milk. It is, however, becoming clear that miRNA research

opens a novel window of observation and provides a new approach for the

analysis of lactation, offering exciting opportunities.
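
The comparison described above (normalisation to proportions, averaging of the individual human samples, identification of common miRNAs and selection of those within a given fold difference) can be sketched as follows. This is a minimal illustration under stated assumptions and not the pipeline actually used for this analysis; all function names, miRNA names and read counts are hypothetical.

```python
def to_proportions(counts):
    """Convert raw read counts to proportions of the total miRNA content."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("total read count must be positive")
    return {name: count / total for name, count in counts.items()}


def average_profiles(profiles):
    """Average several proportion profiles; a missing miRNA counts as 0."""
    names = set().union(*profiles)
    n = len(profiles)
    return {name: sum(p.get(name, 0.0) for p in profiles) / n for name in names}


def within_fold(profile_a, profile_b, max_fold):
    """Return miRNAs present in both profiles whose fold difference is below max_fold."""
    kept = {}
    for name in profile_a.keys() & profile_b.keys():
        a, b = profile_a[name], profile_b[name]
        if a > 0 and b > 0:
            fold = max(a, b) / min(a, b)
            if fold < max_fold:
                kept[name] = fold
    return kept


# Hypothetical read counts, for illustration only.
cow_counts = {"mir-a": 50, "mir-b": 30, "mir-c": 20}
human_counts = [
    {"mir-a": 40, "mir-b": 10, "mir-d": 50},
    {"mir-a": 60, "mir-b": 10, "mir-d": 30},
]

cow = to_proportions(cow_counts)
human = average_profiles([to_proportions(c) for c in human_counts])

if __name__ == "__main__":
    print(sorted(cow.keys() & human.keys()))   # miRNAs common to both datasets
    print(sorted(within_fold(cow, human, 5)))  # less than five-fold difference
```

Working with proportions rather than raw counts makes datasets of different sequencing depth comparable, although it does not remove the preparation and annotation differences noted in the text.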

Figure 1.3 Comparative analysis of milk micro-RNA.

3a) miRNA levels of bovine versus human milk on the logarithmic scale. 3b) Bovine milk compared to bovine colostrum. 3c) Bovine milk versus bovine serum.
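
The correlations quoted for these panels can be estimated as the squared Pearson correlation of log-transformed proportions over the miRNAs shared by two profiles. The sketch below assumes this definition of R², since the exact method is not specified in the text; the function names are hypothetical.

```python
import math


def r_squared(xs, ys):
    """Squared Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return (sxy * sxy) / (sxx * syy)


def log_r_squared(profile_a, profile_b):
    """R² of log10 proportions over miRNAs with non-zero values in both profiles."""
    shared = sorted(
        name
        for name in profile_a.keys() & profile_b.keys()
        if profile_a[name] > 0 and profile_b[name] > 0
    )
    xs = [math.log10(profile_a[name]) for name in shared]
    ys = [math.log10(profile_b[name]) for name in shared]
    return r_squared(xs, ys)
```

Restricting the calculation to shared, non-zero miRNAs avoids taking the logarithm of zero, at the cost of ignoring miRNAs detected in only one of the two samples.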

Figure 1.4 Milk content profiles of candidate micro-RNA markers.

These are highly expressed at comparable levels in human and cow milk. Proportion of total miRNA in milk, colostrum and serum from cow and in four human subjects plus their average.

Food miRNA in milk

It was recently reported that plant food miRNAs can be found in the blood of people

and that plant food miRNA can regulate gene expression in the host, demonstrating

that miRNA can be horizontally transferred between species through the digestive

system (Zhang et al., 2012a). These surprising results show the potential for the

functional transfer of milk miRNA from the mother to the child. They also raise the

logical question of whether or not plant food miRNA ingested by the mother may

also be transferred into milk. To address this possibility, we have revisited published

data on the sequencing of short RNA in human milk exosomes. A number of

candidate plant miRNAs were identified at variable levels between individuals,

including MIR-159, a plant miRNA highly conserved from moss to flowering plants

(Li et al., 2011b), at frequency 10⁻⁴ in 3 out of 4 human subjects, and MIR-168a, a

major plant miRNA previously identified in human serum, at comparable frequency

10⁻³ in 2 out of the 4 human milk samples (Zhang et al., 2012a). This result supports

the notion that food miRNA may be transferred into milk. Additional experiments

will be needed to confirm the true plant origin of candidate plant food miRNAs by

chemical analysis. Contrary to animal miRNAs, plant miRNAs are commonly

methylated on the 2’ OH position of the ribose ring of the last 3’ terminal nucleotide.

This modification renders plant miRNAs resistant to oxidation and it is possible to

use this property to confirm the plant origin of particular miRNAs (Zhang et al.,

2012a).

Lactation diversity: a treasure chest for milk miRNA biology

Lactation, an important characteristic of mammalian reproduction, has evolved for

over 200 million years by exploiting a diversity of strategies across mammal

lineages. Comparative genomics and transcriptomics experiments have now allowed

a more in depth analysis of the molecular evolution of lactation and have emphasized

the ancient origin of the essential components of the lactation system at the

molecular level (Lefevre et al., 2009). We have also reported milk cell and mammary

gland transcriptomics experiments revealing conserved milk proteins and other non-

coding RNA components of the lactation system of monotreme, marsupial, and

eutherian lineages, confirming the ancient origin of the lactation system and

providing useful insight into the function of specific milk proteins in the control of

lactation (Nicholas et al., 1995, Lemon and Bailey, 1966). These studies illuminate

the role of milk in the regulation of mammary gland function and the regulation of

growth and development of the young beyond simple nutritive aspects (Green et al.,

1983). In a similar manner, comparative studies of milk miRNA deployed over a

larger spectrum of mammalian lineages promise to allow a more detailed analysis

of the functional evolution of specific miRNAs involved across the lactation process.

Monotremes

Monotremes diverged from other mammals at least 166 Mya and are regarded

as the best living representatives of early mammals with a more primitive prototherian

lactation system. These egg-laying mammals are now confined to Australia and New

Guinea and display a relatively long lactation cycle compared to gestation and egg

incubation with reported changes in milk protein content. These animals secrete milk

from a series of ducts opening directly on the surface of the ventral skin patch of the

areola. Hatchlings are highly altricial and depend completely on milk as a source of

nutrition during the period of suckling. Much of the development of the monotreme

young occurs after hatching and before weaning and the role of the milk in this

process needs to be examined. Genomics and transcriptomics approaches have

recently enabled the molecular analysis of cells in milk from monotremes (Lefevre et

al., 2009, Sharp et al., 2007, Warren et al., 2008). In a similar way, the study of

monotreme milk miRNA may allow the identification of conserved miRNAs

corresponding to the most ancient pathways of their biogenesis and function. We are

currently collecting and testing milk from these animals to undertake these studies.

Marsupials

Marsupials present one of the most sophisticated lactation programs yet described.

These animals are found on the Australian and American continents and

diverged from eutherians about 140 Mya. After a short gestation, marsupials give

birth to highly altricial young that are totally dependent upon the progressive changes

in milk composition for normal growth and development during the extended

lactation period (Tyndale-Biscoe, 1987, Brennan et al., 2007, Nicholas et al., 2012a).

Under certain conditions, some species of macropods such as the tammar wallaby

(Macropus eugenii) produce milk of different composition from adjacent mammary

glands, a phenomenon called concurrent asynchronous lactation, demonstrating the

importance of local control of the lactation program including major shifts in the

relative milk content in carbohydrates, lipids and proteins as well as the phase

specific expression of major milk proteins (Nicholas et al., 1995). Thus, the tammar

wallaby (Macropus eugenii), with an extensively studied lactation system, represents

one of the most interesting models to analyse the variation of milk miRNAs during

the lactation cycle in order to identify specific miRNA expression that may be

correlated with the demand of the young or mammary gland development.

Preliminary results from our lab indicate that tammar milk contains a significant

amount of miRNA, and profiling experiments are ongoing.

Eutherians

Eutherians are the most common mammals and are the only mammals found on

continents other than Australia and the Americas. In contrast to monotremes and

marsupials, eutherians have invested in extended intrauterine development of the

young and produce a milk of relatively constant composition, apart from the initial

colostrum. However, there are interesting adaptations of the lactation system within

eutherian species. For example, the pinnipeds, a group of marine mammals, present the

most diverse lactation strategies. There are three families of pinnipeds with very

different lactation characteristics: phocids (true seals), odobenids (walrus) and

otariids (sea lions, fur seals). The walrus has the lowest reproductive rate of any

pinniped species, with extended lactation for at least 2 years. Phocid seals have

adopted a fasting strategy of lactation whereby amassed body reserves of stored

nutrients allow fasting on land during continuous milk production over relatively

short periods (Oftedal et al. 1987). Finally, otariid seals have adopted an alternating

strategy of short periods of copious milk production on shore, and extended periods

of maternal foraging at sea while the young remain on shore (Bonner, 1984). Thus,

the study of milk miRNA in the pinniped family may allow the analysis of the role

that miRNAs may have played in the rapid adaptation of the lactation system in this

family. For example, preliminary studies have implicated miR-21 in the maintenance

of lactation in the fur seal during extended foraging periods (L Buscara, 2013).

Comparative approaches have started to allow a detailed analysis of the functional

evolution of specific molecular components of lactation and rapid advances in

genome sequencing of a number of mammalian species have provided invaluable

resources for the comparative evolutionary analysis of genes and, more recently,

miRNAs involved in lactation. There is much to learn about the genetics of lactation

from the rich natural resource of animal diversity through the comparative analysis

of gene expression in a variety of lactating mammalian lineages. Non-invasive milk

sequencing approaches will enable a still broader exploration of lactation diversity to

facilitate a broader understanding of the role of miRNAs in the lactation process.

Conclusion

MiRNAs are established mediators of mammary gland development and have

recently emerged as new and powerful biomarkers. Milk miRNAs provide a

fascinating new way to look at the lactation process, creating novel opportunities for

the bioinformatic analysis of milk and for gaining a unique and deeper insight into lactation

and the role of milk. Milk is so far the body fluid found to have the highest

concentration of miRNA and is a good source of natural exosomes. This observation

alone should promote milk as a rich source of miRNA and rapidly raise the interest

of lactation research and milk related industries. During the last two years, a number

of reports have shown the great potential of miRNA to be used as biomarkers for

milk quality, and suggested that miRNA are new signalling molecules for the

immuno-protection or the development of the human infant. Rapid advances in

miRNA sequencing, combined with the natural resource of mammalian lactation

diversity, provide a fantastic platform to address the role of miRNA during lactation.

We have started to analyse milk miRNA in animal models with extreme adaptation

of lactation, and are developing bioinformatics tools for the comparative analysis of

gene expression (Church et al., 2012) or the creation of genome-wide miRNA target

databases (Kumar et al., 2012b) with associated data mining tools to mine a growing

number of milk miRNA datasets for evolutionary conserved or newly adapted

miRNA functionalities for innovative applications of milk miRNA.

A number of important questions on miRNA biology in the lactation process are only

starting to be addressed. First, the distribution of miRNA in different milk fractions

(whey, fat, cells, microvesicles and exosomes) and their biosynthetic origin need to

be resolved. This will allow further understanding of milk miRNA biogenesis, and

the development of more informative assays of the quality and the origin of milk and

milk products. Preliminary studies have highlighted the major contribution of

exosomes in milk miRNA content. Second, statistical analysis needs to be conducted

to assess individual variability in relation to genetic variation. Third, more detailed

longitudinal studies of how milk miRNA content may change during the lactation

cycle are necessary. Preliminary reports in eutherians have revealed similarities and

differences between colostrum and milk, but complementary studies in marsupial or

monotreme species with large and gradual changes in milk composition will provide

the most interesting models to study the link between milk miRNA and mammary

gland or infant development. Fourth, the effect of physiological and disease

conditions of the mother on milk miRNA composition needs to be evaluated. This

could lead to innovative assays to improve lactation and milk quality. Fifth, the

passage of plant food miRNA into milk needs to be confirmed. Ultimately, proof for

the direct role of milk miRNAs in mammary gland and infant development will most

likely require the construction and analysis of transgenic lines of animals with

modified miRNA expression. Finally, the effect of purified milk miRNAs and

exosome fractions in cell culture systems and other biological models should be

evaluated further for the functional analysis of milk miRNA signals. Further research

in these and other areas will rapidly result in the characterisation of the emerging role

of miRNA as biomarkers and signalling molecules of the lactation process and

beyond. Milk miRNAs may contribute to the apparent advantage of breastfeeding for

the health of the young and the health benefit associated with the consumption of

dairy products. The identification of milk miRNA signals would potentially provide

new opportunities for the maintenance of good health, and the prevention and

treatment of disease.

Part B MicroRNA Targets

MicroRNAs modulate gene expression by controlling protein synthesis and mRNA

cleavage post-transcriptionally. MiRNAs recognize their targets by complementary

base pairing to sequence motifs on target genes. For most metazoan miRNAs, partial

complementarity to the target has been shown to be sufficient to down-regulate gene

expression, whereas plant miRNAs are mostly fully complementary to their target

motifs. The small size of miRNAs, combined with imperfect base pairing in animals

and the long length of mRNAs, expands the number of potential interaction sites. This

requires computational tools to map a large number of possible interactions. More

than 2000 mature miRNAs have been characterized in the human genome and

individual miRNAs can target hundreds of genes. The network of possible

interactions also depends on different cell types, cell cycle, external stimuli and other

environmental conditions affecting transcription status. MiRNA binding to mRNA

results in reduction of translational efficiency or degradation of mRNA through

argonaute mediated mRNA cleavage. As shown in figure 1.5, miRNAs are thought to

act in one of three modes of action. First, miRNAs can cleave the target mRNA at

the site of miRNA-mRNA interaction. Second, miRNAs can attach to the mRNA and

stop translation by restricting the movement of ribosomes. Third, they can trigger

mRNA degradation through mRNA deadenylation followed by decapping and

degradation, leading to reduced levels of target mRNAs in cells (Eulalio et al., 2009,

Wu et al., 2006). Guo et al. used ribosome profiling to show that 84% of

the decrease in protein production was a result of decrease in mRNA levels (Guo et

al., 2010). With the current estimation that up to 60% of all human genes may be

targeted by miRNAs, appropriate techniques to assist the exploration of miRNA

target networks are now required. Here I will review the current experimental and

computational approaches to study miRNA-gene interactions.

Figure 1.5 Modes of miRNA action

miRNAs, along with the RISC complex, can (1) cleave the mRNA at the interaction site, (2) bind to the mRNA and stop translation by restricting the movement of ribosomes, or (3) cause deadenylation and decapping of the target mRNA (Wittmann and Jack, 2011).

Experimental Approaches for miRNA target identification

MiRNAs inhibit gene expression by guiding the RISC complex to semi-

complementary target mRNAs. Various experimental techniques have been applied

to validate miRNA-mRNA interactions such as mRNA quantification, Ago-protein

immunoprecipitation, affinity purification of tagged mRNA and miRNAs, analysis of

polysomal associated mRNAs and quantification of protein production (Orom and

Lund, 2010). These approaches target different levels of transcription and translation.

All these techniques have their own advantages and disadvantages.

Transcription and translation are similar across phylogenetically divergent

species and the presence of a high degree of conservation among several miRNA

families suggests that miRNA mediated regulatory networks share some common basic

framework, although many aspects of mRNA-miRNA interactions may have species

and cell specific differences. However experiments may be plagued by additional

complexity. For example, most of the miRNA studies are being carried out in

specific cell lines which might not reflect the true behaviour of a normal healthy cell

in a tissue, so there could be some bias in the data collected from these cloned cell

lines.

A combination of experimental and computational approaches can be used to find

genome wide miRNA targets. First, miRNAs are over-expressed in a model organism

and altered gene expression is captured by harvesting mRNA, proteins or miRNA-

mRNA cross-linking complexes using microarrays, deep sequencing or proteomics

approaches. These large datasets are compared in case vs. control or disease vs.

normal tissues. miRNA target prediction algorithms combined with ontology and

pathway tools are then used to refine the experimental datasets to pinpoint the genes

targeted by miRNAs.
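The refinement step described above can be sketched as a set intersection between genes that respond to miRNA over-expression and genes predicted as targets. This is only a minimal illustration: the function name, the log2 fold-change threshold and the gene labels are all hypothetical and are not taken from any specific study or tool.

```python
def candidate_targets(log2_fold_change, predicted_targets, threshold=-1.0):
    """Return genes that are down-regulated and also predicted as targets.

    log2_fold_change: dict mapping gene name to log2(treated / control)
    predicted_targets: set of gene names from a target prediction algorithm
    threshold: assumed cut-off; genes at or below it count as down-regulated
    """
    down_regulated = {
        gene for gene, lfc in log2_fold_change.items() if lfc <= threshold
    }
    return down_regulated & predicted_targets


# Hypothetical values, for illustration only.
expression = {"GENE_A": -1.8, "GENE_B": -0.2, "GENE_C": -1.1, "GENE_D": 0.9}
predicted = {"GENE_A", "GENE_B", "GENE_D"}
print(sorted(candidate_targets(expression, predicted)))
```

In this toy case only GENE_A is both repressed and predicted; GENE_C is repressed but not predicted, so it could be an indirect effect or a missed target.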

Computational approaches for miRNA target identifications

A number of computational algorithms have been developed to predict

miRNA target sites. They employ miRNA-mRNA base-pairing pattern matching,

evolutionary conservation, secondary structure and nucleotide composition of target

transcripts to look for miRNA-mRNA interactions. DIANA-microT (Maragkakis et

al., 2011), MicroInspector (Rusinov et al., 2005), miRanda (Enright et al., 2003),

MirTarget2 (Wang, 2008), miTarget (Kim et al., 2006), PicTar (Krek et al., 2005),

PITA (Kertesz et al., 2007), RNA22 (Miranda et al., 2006b), RNAhybrid

(Rehmsmeier et al., 2004), and TargetScan (Grimson et al., 2007) are some of them.

MiRExpress and mirTools can extract miRNA expression profiles from high-

throughput sequencing datasets (Zhu et al., 2010, Wang et al., 2009). Other

algorithms use machine learning (Bandyopadhyay and Mitra, 2009, Yang et al.,

2008, Yousef et al., 2007), but all algorithms implement some form of modelling of

miRNA-mRNA interaction through a collection of informative features.

miRNA-mRNA feature based tools

These programs use the biological information known about miRNA-mRNA

interactions, such as seed region, minimum free energy, site location, presence of

multiple sites, expression profiles and data training. The definition of seed and site

location are two main features that distinguish most tools from each other.

Seed region: miRNAs bind to their targets from the 5’ end of miRNA by

complementary base pairing, while sequences at 3’ end of miRNA are usually

partially complementary to target sequences. Any mutation in the seed region can

lead to the loss of miRNA activity. So the 5’ extremity of miRNA at positions 2-7 is

more crucial for miRNA target interactions. This region is therefore called the “seed

region”. Seed matches are further subdivided into four subclasses: 8mer, 7mer-m8,

7mer-A1 and 6mer, depending on the length and combination of nucleotides from position

1-8. An adenine is mostly present at position 1 in 8mer and 7mer-A1 seed regions.

Complementary pairing at the 3’ end of the miRNA enhances target site

recognition, and miRNA nucleotides 13-19 usually complement the target sequences.
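The four seed subclasses can be illustrated with a short sketch. It assumes the commonly used definitions (6mer: pairing to miRNA positions 2-7; 7mer-m8: positions 2-8; 7mer-A1: positions 2-7 plus an adenine opposite position 1; 8mer: both), considers only Watson-Crick pairs, and ignores 3’ supplementary pairing. The function names and the target fragments are made up for illustration; only the let-7a sequence is a real miRNA.

```python
# Watson-Crick complement for RNA (G:U wobble pairs are ignored in this sketch).
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the reverse complement of an RNA string."""
    return rna.translate(COMPLEMENT)[::-1]


def classify_seed_sites(mirna, mrna):
    """Return (position, site_type) for each seed match found in mrna.

    position is the 0-based index in mrna where the match to miRNA
    positions 2-7 begins. Both sequences are read 5' to 3'.
    """
    mirna = mirna.upper().replace("T", "U")
    mrna = mrna.upper().replace("T", "U")
    six_mer = reverse_complement(mirna[1:7])  # pairs with miRNA positions 2-7
    m8_base = reverse_complement(mirna[7])    # pairs with miRNA position 8

    sites = []
    start = mrna.find(six_mer)
    while start != -1:
        # Position 8 pairs with the base just upstream (5') of the 6mer match.
        has_m8 = start >= 1 and mrna[start - 1] == m8_base
        # The adenine opposite miRNA position 1 lies just downstream (3').
        end = start + 6
        has_a1 = end < len(mrna) and mrna[end] == "A"
        if has_m8 and has_a1:
            site_type = "8mer"
        elif has_m8:
            site_type = "7mer-m8"
        elif has_a1:
            site_type = "7mer-A1"
        else:
            site_type = "6mer"
        sites.append((start, site_type))
        start = mrna.find(six_mer, start + 1)
    return sites


# let-7a mature sequence; the target fragments are invented for illustration.
LET7A = "UGAGGUAGUAGGUUGUAUAGUU"
print(classify_seed_sites(LET7A, "AAACUACCUCAGGG"))
print(classify_seed_sites(LET7A, "GGUACCUCGG"))
```

A real prediction tool would combine such seed classes with further features (free energy, conservation, site context), which this sketch deliberately leaves out.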

Site location: miRNA target sites exist on all parts of mRNA in 3’ UTR, 5’ UTR and

coding regions. But most studies have been focused on 3’ untranslated regions.

Computational studies have shown that miRNA target genes have longer 3’ UTRs

compared to non-miRNA target genes that are mostly housekeeping genes. In long

3’ UTRs, target sites exist near either UTR extremity. In shorter UTRs, sites are

concentrated near the stop codon (the end of translation), after ~15 nt. However, recent studies have

also shown functional miRNA target sites in 5’ UTRs and coding regions of mRNA.
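Assigning a predicted site to the 5’ UTR, coding region or 3’ UTR only requires the coordinates of the coding sequence on the mRNA. The following sketch assumes 0-based, half-open coordinates and a hypothetical transcript; the function name and the example numbers are illustrative and not taken from any annotation.

```python
def site_region(site_start, site_end, cds_start, cds_end):
    """Name the mRNA region that contains a predicted target site.

    All coordinates are 0-based and half-open on the mature mRNA, so the
    coding sequence occupies mrna[cds_start:cds_end].
    """
    if site_start < 0 or site_end <= site_start:
        raise ValueError("site coordinates must satisfy 0 <= start < end")
    if site_end <= cds_start:
        return "5' UTR"
    if site_start >= cds_end:
        return "3' UTR"
    if site_start >= cds_start and site_end <= cds_end:
        return "coding region"
    return "spans a region boundary"


# Hypothetical transcript: coding sequence from position 100 to 1000.
for start, end in [(20, 42), (500, 522), (1015, 1037), (990, 1012)]:
    print(start, end, site_region(start, end, 100, 1000))
```

Sites that straddle a boundary are reported separately here; how to treat them is a design choice that the source does not specify.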

Databases

miRBase database is the primary catalogue of miRNA sequences (Griffiths-Jones,

2006, Griffiths-Jones, 2010b, Kozomara and Griffiths-Jones, 2011). It is updated

frequently and the latest release (19th) contains 21264 miRNA hairpin precursors

and 25141 mature miRNA products, in 193 species. Ensembl (Kersey et al., 2012),

NCBI (Pruitt et al., 2012) and UCSC (Meyer et al., 2012) are the primary

repositories and comprehensive genome browsers. Most of the algorithms developed

for miRNA target predictions have made their databases available through web

servers. Other specialized web servers have been developed to provide a comparative

view of prediction results from many other prediction algorithms in combination

with experimental miRNA microarray expression results. Since the characterization

of miRNAs in the early 2000s, the number of newly discovered miRNAs has been

increasing steadily (figure 1.6) and the roles of miRNAs are being investigated in an

increasing number of organisms. New statistical methods, computer algorithms and

publicly available web servers have been developed (table 1.1). These

bioinformatic resources have greatly facilitated miRNA research.

Figure 1.6 miRNA hairpin entries in miRBase (2002 to 2012)

The first version of miRBase database was released in December 2002. Current version (v.19) was released in August 2012. It contains 21264 hairpin precursor miRNAs.

| Name | Description | Webpage |
|---|---|---|
| DIANA-mirExTra | Six nucleotide long motifs on 3’ UTRs in given list of genes from human and mouse | www.microrna.gr/mirextra |
| ExprTarget | Computational prediction and microarray | http://www.scandb.org/apps/microrna/ |
| HOCTAR | Intragenic miRNAs annotated in human | http://hoctar.tigem.it/ |
| MAGIA | miRNA target prediction with different algorithms | http://gencomp.bio.unipd.it/magia/start/ |
| MicroCosm | miRanda algorithm on 3’ UTRs of most of ensembl listed species | http://www.ebi.ac.uk/enright-srv/microcosm/ |
| microRNA.org | miRanda on 3’UTRs of 5 selected species | http://www.microrna.org/ |
| mirConnX | mRNA and microRNA (miRNA) gene regulatory networks from expression datasets | http://www.benoslab.pitt.edu/mirconnx |
| miRDB | Using MirTarget2 (SVM learning machine) algorithm | http://mirdb.org/ |
| miRecords | Validated and predicted targets | http://mirecords.biolead.org/ |
| miReg | Manually curated microRNA Regulation Resource | http://www.iioab-mireg.webs.com/ |
| miRGator | Ensemble of multiple miRNA related tools (miRNA identification, quantification and target analysis etc.) | http://genome.ewha.ac.kr/miRGator/miRGator.html |
| miRiam | Human targets for viral-encoded microRNAs by thermodynamics and empirical constraints | http://ferrolab.dmi.unict.it/miriam.html |
| miRNAmap | miRanda, RNAhybrid and TargetScan | http://mirnamap.mbc.nctu.edu.tw/ |
| MiRror | Combination of 9 algorithms | http://www.proto.cs.huji.ac.il/mirror/ |
| miRSel | Text mining of publication abstracts (human, mice, rat) | http://services.bio.ifi.lmu.de/mirsel/ |
| mirTar | Human (full length mRNA and promoter regions target predictions) | http://mirtar.mbc.nctu.edu.tw/human/index.php |
| miRvar | MicroRNA variation | http://genome.igib.res.in/mirlovd/ |
| miRWALK | miRNA targets in full length mRNAs using 6 different algorithms | http://www.ma.uni-heidelberg.de/apps/zmf/mirwalk/index.html |
| PicTar | miRNA targets using Pictar algorithm | http://pictar.mdc-berlin.de/ |
| RepTar | Predicted cellular targets of host and viral miRNAs | http://bioinformatics.ekmd.huji.ac.il/reptar/ |
| SigTerms | GO, miranda, targetscan, Pictar (human, mouse) | http://sigterms.sourceforge.net/ |
| starBase | miRNA-mRNA interaction maps from Argonaute CLIP-Seq and Degradome-Seq data | http://starbase.sysu.edu.cn/ |
| Sylamer | Analysis of off-target effects from expression data | http://www.ebi.ac.uk/enright/sylamer/ |
| TarBase | Validated targets | http://diana.cslab.ece.ntua.gr/tarbase/ |
| TargetScan | miRNA targets using Targetscan algorithm | http://www.targetscan.org/ |
| TargetSpy | Supervised machine learning tool | http://www.targetspy.org/ |
| ViTa | Viral miRNA on host genes | http://vita.mbc.nctu.edu.tw/ |

Table 1.1 A list of miRNA databases

Problem statement

DNA, RNA and proteins are the main constituents of all living organisms.

Their complex interactions give rise to viruses, bacteria, fungi, plants and animals.

Protein coding genes are highly conserved in all life forms from C. elegans to

human. In the early 2000s another class of genes called non-coding RNAs was identified.

These genes do not make any proteins but often work as genomic regulatory agents.

One important subclass of non-coding genes is miRNA. MiRNA communicate with

mRNA in a sequence specific manner. Along with the RISC complex, miRNA can

recognize complementary target sites on mRNA. This leads to a very high number of

possible miRNA-mRNA interaction sites. Various tissue types, cell types, diseases

and response to ever changing environmental conditions are known to change the

expression of miRNAs. The central problem in miRNA biology is the identification

of exact miRNA interaction sites. Several computational and experimental methods

have been developed to validate individual miRNA target sites, but each has its own

advantages and disadvantages. Computational tools combined with the latest sequencing

technologies can be used to characterize new miRNAs and predict their functions. In

the next paragraph I will discuss the key research questions addressed in this thesis.

Key questions & motivation

As discussed earlier, several computational tools have been developed to

identify miRNA target sites. However, it is still difficult to reduce the high numbers of

false positive predictions and missed true target sites. One possible explanation for

this problem is that we still know very little about the cellular events taking place during

miRNA-mRNA interaction. Differences in annotations between the NCBI, UCSC and

Ensembl genomic repositories can also lead to differences in predicted target sites.

Experimental validation of predicted target sites is even more challenging. Studying

a single miRNA-mRNA interaction or genome wide miRNA targets both require long

laboratory experiments. The complexity of different living organisms, tissues and cell

types may alter the functions of most conserved miRNAs. Experimental

conditions in different labs can give different results. A bioinformatic approach, in

which computational tools are used to analyse huge genomic and

protein datasets, can help. Further development and refinement of computational tools to

predict miRNA targets, and the combination of computational methods with modern

experimental techniques such as sequencing technologies, should facilitate the study

of genome wide miRNA target interactions. Here, bioinformatics tools were

developed and applied towards the comparative and functional analysis of miRNAs

in lactation biology.

Thesis Outline

Broadly, the two main areas of contribution by this work relate to the

characterisation of miRNAs using the Illumina sequencing platform and the

development of web based tools for prediction of miRNA target genes. The

following is a brief outline of the work presented in this thesis: Chapter 2 deals with

the development of online miRNA target prediction tools for full length mRNAs in

several model organisms. Chapter 3 provides the detailed analysis of miRNAs in

buffalo milk using the Illumina sequencing platform. In chapter 4, the buffalo milk

transcriptome was analysed for protein coding mRNAs found in somatic milk cells.

Chapter 5 provides a comparative analysis of milk miRNAs in human, buffalo, pig,

wallaby, platypus, echidna and alpaca milk. This is concluded by a general

discussion, summary of findings and future directions in chapter 6.

Chapter 2

miRNA_Targets database

First, a repository of genome wide miRNA targets was developed to address the

lack of miRNA target predictions on full-length mRNAs. The main features of this

database are target predictions on whole mRNAs (5' and 3' UTRs and coding region)

using two algorithms on seven genomes. Versatile search capabilities using miRNA or

mRNA individually or in groups, gene ontology classification of sets of target genes

and visualization of miRNA and target gene location networks on chromosomes were

implemented by integrating MySQL, PHP and various Bioconductor packages. The

work discussed in this chapter was published in the journal Genomics (Kumar et al.,

2012b).

miRNA_Targets: A database for miRNA target predictions in coding and non-

coding regions of mRNAs

Amit Kumar1, 2*, Adam K.-L. Wong3, Mark L. Tizard1, Robert J. Moore1, and

Christophe Lefèvre2 , (2012), Genomics, 100, 352-356.

1CSIRO Livestock Industries, Australian Animal Health Laboratory, Geelong, Victoria 3220, Australia

2Institute for Frontier Materials, Deakin University, Geelong, Victoria 3216, Australia

3School of Information Technology, Deakin University, Geelong, Victoria 3216, Australia

*Corresponding author

E-mail: [email protected]

Tel: +61 3 522 72037

Abstract

MicroRNAs (miRNAs) are small non-coding RNAs that play a role in post-

transcriptional regulation of gene expression in most eukaryotes. They help in fine-

tuning gene expression by targeting messenger RNAs (mRNA). The interactions of

miRNAs and mRNAs are sequence specific and computational tools have been

developed to predict miRNA target sites on mRNAs, but miRNA research has been

mainly focused on target sites within 3’ untranslated regions (UTRs) of genes. There

is a need for an easily accessible repository of genome wide full length mRNA -

miRNA target predictions with versatile search capabilities and visualization tools.

We have created a web accessible database of miRNA target predictions for human,

mouse, cow, chicken, zebrafish, fruit fly and Caenorhabditis elegans using two

different target prediction algorithms. The database has target predictions for

miRNAs on 5’ UTRs, coding regions and 3’ UTRs of all mRNAs. This database can

be freely accessed at http://mamsap.it.deakin.edu.au/mirna_targets/.

Key Words: Database, microRNA, Target Predictions, Coding region, 5 Prime UTR

Introduction

MicroRNAs (miRNAs) are a class of small (~22 nucleotides) non-coding RNAs

that post-transcriptionally regulate gene expression by interacting with mRNAs. In

animals the mRNA-miRNA interaction is semi-complementary, whereas in plants

miRNAs bind with near perfect complementarity on mRNA coding regions (Bonnet

et al., 2010). A miRNA can interact with hundreds of genes and a gene can be

targeted by many miRNAs. This results in a very high number of possible

interactions. Computational approaches have been used to predict mRNA-miRNA

interactions (miRanda (John et al., 2004), RNAhybrid, TargetScan (Lewis et al.,

2003, Lewis et al., 2005), PITA (Kertesz et al., 2007), PicTar (Krek et al., 2005),

RNA22 (Miranda et al., 2006a), microT and miRtarget etc) (Thomas et al., 2010).

These algorithms use knowledge of experimentally proven mRNA-miRNA

interactions to develop a scoring system (i.e. mRNA-miRNA partial

complementarity, seed region, target position, sequence conservation features etc.),

which is then used to predict mRNA-miRNA interactions. Each algorithm uses

slightly different scoring techniques, resulting in differences in prediction results.

A number of miRNA target prediction algorithms have been developed and tested

for accuracy and precision using both computational and laboratory techniques.

When results from miRNA knockout experiments were compared to results from

computational approaches, computational algorithms were shown to produce high

false negative (undetected miRNA target genes) and false positive (nonfunctional

miRNA target sites) results. One possible explanation for false negative outcomes

could be that most of these studies applied computational algorithms to only 3’ UTR

regions of mRNAs. It is now recognized that miRNAs can also interact with mRNAs

in coding regions and 5’ UTRs as well (Lee et al., 2009a, Tay et al., 2009, J. Robln

Lytle, 2007). Secondly, it would be wrong to assume that every possible target site for a

miRNA is functional under all biological conditions. Gene repression also

depends on a number of other factors such as the balance between quantity, half-life

and location of miRNAs and target mRNAs. Current miRNA target prediction

algorithms do not take into account these important factors. In general, results from

target prediction algorithms should be carefully scrutinized and should be treated

only as a guide to mRNA–miRNA interactions. The construction of advanced

integrated miRNA target prediction resources such as ours can help guide the

development of experimental approaches to target validation and database mining

will enable a more detailed analysis of the complex interactions occurring across the

network of miRNAs and mRNAs.

Previously designed web servers focused on 3’ UTR targets only (John et al., 2004,

Betel et al., 2008). In the last few years many high throughput experiments have

reported experimentally validated functional miRNA target sites located in 5’ UTR

and coding regions (Joshua J. Forman, 2008, Zhou et al., 2009). The miRNA target

database miRWalk used 7-mer seed sequence matches as the main criterion to predict

miRNA targets on mRNAs in promoter and flanking regions from human, mouse

and rat species (Dweep et al., 2011). The miRTar.human database used a

combination of prediction algorithms (miRanda, TargetScan, RNAhybrid and PITA) to

scan full length mRNAs for predicted miRNA targets using mainly the miRNA seed

sequence (1-8nt) and conservation filters. This approach is likely to achieve the best

accuracy to date for conserved miRNA target sites but will miss non-

conserved/species specific miRNA target sites (Hsu et al., 2011). Here we designed a

web server for miRNA target predictions on mRNA 5’ and 3’ UTRs and coding regions

using precompiled genome wide target predictions on human, mouse, cow, chicken,

zebrafish, fruit fly and C. elegans using the miRanda and RNAhybrid algorithms. Both

these algorithms apply commonly accepted miRNA target features and are not highly

focused towards miRNA seed regions and highly conserved miRNA targets. This

combination favours sensitivity for target site predictions. We have

incorporated versatile search capabilities and tools to help visualize results. This will

provide a much needed resource for the biological research community.

Methods and Results

Implementation

Full length mRNA sequences were downloaded from the Ensembl database using the

BioMart tool (Haider et al., 2009). Mature miRNA sequences were downloaded from

miRBase (Release 18)(Griffiths-Jones, 2010a). miRNA target prediction algorithms

miRanda (John et al., 2004) and RNAhybrid (Kruger and Rehmsmeier, 2006) were

downloaded from their respective web servers. These target prediction algorithms

were used to predict miRNA targets on all sequence datasets of the respective

species. Both types of target predictions use the full miRNA sequence for searching

target genes and are not highly conservation biased. This gives maximum sensitivity

to the miRNA target search.

Database

The miRNA_Targets MySQL database stores annotated mRNA sequences and

miRNA target prediction results. Target prediction results are available for Homo

sapiens, Mus musculus, Gallus gallus, Danio rerio, Bos taurus, Drosophila

melanogaster and Caenorhabditis elegans (table 2.1). This MySQL-PHP based

pipeline can be extended to all the species present in the Ensembl database (figure

2.1). Ensembl gene IDs are used as the main reference in the database structure.

Where multiple transcripts were available for a gene, the longest mRNA isoform

was used. For RNAhybrid, miRNA targets with a p-value < 0.05 were

selected.
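The two selection rules just described (longest isoform per gene, and a p-value cut-off for RNAhybrid hits) can be sketched as follows. This is a minimal illustration under assumed data structures: the record layout, field names and identifiers are hypothetical and do not reflect the actual pipeline or the output format of RNAhybrid.

```python
from collections import namedtuple

# Hypothetical record layout for one transcript of a gene.
Transcript = namedtuple("Transcript", "gene_id transcript_id sequence")


def longest_isoforms(transcripts):
    """Keep, for each gene, the transcript with the longest sequence."""
    best = {}
    for t in transcripts:
        current = best.get(t.gene_id)
        if current is None or len(t.sequence) > len(current.sequence):
            best[t.gene_id] = t
    return best


def significant_hits(hits, alpha=0.05):
    """Keep predicted target sites whose p-value is below alpha."""
    return [hit for hit in hits if hit["p_value"] < alpha]


# Invented example data, for illustration only.
transcripts = [
    Transcript("GENE1", "GENE1-T1", "AUGGCCUAA"),
    Transcript("GENE1", "GENE1-T2", "AUGGCCGCCGCCUAA"),
    Transcript("GENE2", "GENE2-T1", "AUGUAA"),
]
hits = [
    {"mirna": "miR-x", "gene_id": "GENE1", "p_value": 0.01},
    {"mirna": "miR-y", "gene_id": "GENE1", "p_value": 0.20},
]
print({g: t.transcript_id for g, t in longest_isoforms(transcripts).items()})
print(significant_hits(hits))
```

Using only the longest isoform keeps one entry per Ensembl gene ID, at the cost of missing sites that occur only in shorter alternative transcripts.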

| Species | Ensembl gene ids | Mature miRNA | miRanda Target sites | RNAhybrid Target sites |
|---|---|---|---|---|
| Human (GRCh37.p3) | 54,283 | 1921 | 18,340,081 | 25,772,789 |
| Mouse (NCBIM37) | 37,681 | 1157 | 9,889,849 | 6,573,689 |
| Chicken (WASHUC2) | 17,934 | 544 | 2,099,138 | 984,979 |
| Zebra fish (Zv9) | 32,307 | 247 | 1,627,051 | 709,606 |
| Cow (UMD4) | 26,015 | 676 | 3,117,593 | 733,956 |
| C. elegans (WS220) | 45,435 | 368 | 1,375,889 | 765,604 |
| Drosophila melanogaster (BDGP5.25) | 14,867 | 430 | 1,359,496 | 1,327,560 |

Table 2.1 Number of genes from Ensembl database and miRNAs from miRBase

(release 18) in the webserver.

miRNA target sites using miRanda with default settings and RNAhybrid at < 0.05 p-value.

Figure 2.1 Flow chart diagram of sequence datasets and algorithms used to make this database.

Sequence datasets were downloaded from Ensembl and miRBase. mRNA sequences were scanned for miRNA targets using miRanda and RNAhybrid algorithms. Results

were stored in a MySQL database and displayed using CIRCOS and goProfiles algorithms.

Figure 2.2 Comparison of miRNA target prediction algorithms on CLIP-Seq datasets from StarBase database.

Total miRNA-mRNA interactions reported in more than one CLIP-seq experiment. The miRanda algorithm (applied to 3’ UTR sequences) shows the maximum coverage of miRNA target sites, followed by PITA, TargetScan and PicTar.

Web server

The PHP-MySQL web interface allows the user to search for miRNA targets either

by using a common name, Ensembl gene ID or miRBase mature miRNA ID. Users

can search for miRNAs targeting a gene or group of gene IDs. The target list is sorted by

best energy scores. A diagram in the results shows the position of miRNA targets on

mRNA 5’, 3’ UTRs and coding region of each gene. miRNAs predicted to target a

gene by both algorithms are listed first, followed by miRNAs predicted only by

miRanda and then those predicted only by RNAhybrid.
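The ordering of results described above can be sketched in Python as follows. This is an illustration only: the `Prediction` record and `rank_predictions` function are hypothetical names, and the sketch assumes that the "best" energy is the most negative free energy reported by either algorithm.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Prediction:
    mirna: str
    miranda_energy: Optional[float]    # kcal/mol; None if miRanda found no site
    rnahybrid_energy: Optional[float]  # kcal/mol; None if RNAhybrid found no site


def rank_predictions(predictions: List[Prediction]) -> List[Prediction]:
    """Order: predicted by both, then miRanda only, then RNAhybrid only.

    Within each group, the most negative energy comes first (assumption).
    Records without any predicted site are dropped.
    """
    def sort_key(p: Prediction):
        if p.miranda_energy is not None and p.rnahybrid_energy is not None:
            group = 0
        elif p.miranda_energy is not None:
            group = 1
        else:
            group = 2
        energies = [e for e in (p.miranda_energy, p.rnahybrid_energy) if e is not None]
        return (group, min(energies))

    with_sites = [
        p for p in predictions
        if p.miranda_energy is not None or p.rnahybrid_energy is not None
    ]
    return sorted(with_sites, key=sort_key)
```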

These prediction algorithms also use full-length mature miRNA sequences for target

mRNA interactions, and thus are not heavily seed-biased and give different results for

different members of a miRNA family. In contrast, the TargetScan algorithm

considers only seed regions of miRNA families for greater accuracy. To test the

sensitivity of target prediction algorithms we used HITS-CLIP datasets for human,

mouse and C. elegans downloaded from the online starBase database (Yang et al.,

2011). It contains a collection of 21 CLIP-seq experiments. As shown in figure 2.2,

with overall miRNA-mRNA interactions from this dataset, miRanda showed the best

coverage of prediction results. Our aim was to get the greatest coverage of target

predictions for all mature miRNAs, so that miRNA target prediction results could be

used for further filtering of gene targets from microRNA overexpression and miRNA

gene knockout studies.

Web interfaces

Genome-wide miRNA target gene predictions:

Single or multiple miRNA names can be used as inputs to identify target genes using

miRanda or RNAhybrid prediction algorithms. miRNA target genes can be searched

by applying different energy cut-offs to find the gene targets with the best sequence

complementarity. The user can also restrict the miRNA target search to selected gene names,

for example to interrogate the down-regulated genes from miRNA knockout or

overexpression studies.

Using gene IDs as input:

Single or multiple Ensembl gene IDs or official gene names can be used to search

for miRNA targets of interest. miRNAs predicted by both algorithms are listed first,

followed by those predicted by miRanda alone and then by RNAhybrid alone.

Using gene sequence:

miRanda and RNAhybrid miRNA target prediction algorithms can also be used to

scan an mRNA sequence for miRNA targets by selecting all miRNAs from a given list

of 17 species including viruses or using any mature miRNA sequence to pin-point

the exact position of a miRNA target on an mRNA sequence. It is also possible to

investigate viral miRNA interactions with host mRNAs. miRNA-mRNA alignments

can be viewed on the sequence web page. Users can also screen for miRNA targets

on an mRNA sequence of interest with different levels of energy stringency. Figure 2.3

gives a snapshot of the database interfaces. A lower binding energy cut-off results in

higher specificity, greater sequence complementarity and lower sensitivity (fewer

targets).

Most of the results predicted by miRanda were also predicted by RNAhybrid,

showing a good overlap in the prediction results. All output gene common names are

linked to the NCBI gene database and Ensembl IDs are linked to the Ensembl

database. miRNA IDs are linked to the miRBase database.

Figure 2.3 Snapshot of the miRNA_Targets database web interface.

Multiple interfaces provide different ways of querying miRNA target predictions (miRNA against a given list of genes, genome-wide predictions for multiple miRNAs or genes, and using an mRNA sequence).

Ontology Analysis

We integrated goProfiles, an R (Bioconductor) package for the functional profiling

of lists of genes at the second level of the Gene Ontology (Alex Sanchez et al., 2010).

This package is based on the functional classification of gene ontology developed by

Sánchez-Pla et al. (A. Sánchez-Pla et al., 2007). Genes targeted by a miRNA or a group of

miRNAs can be classified into molecular function, cellular component and biological

process at the second level of GO classification. Two lists of miRNAs can also be

compared against each other to get an idea of the collective ontology differences of

their target genes.

miRNA and target circular diagrams

miRNAs are mapped to the target genes on respective chromosomes in all given

species using the Circos algorithm (Krzywinski et al., 2009). This network

visualization presents insight into any preferential targeting of certain chromosomes

by particular miRNAs.

Experimental example

To demonstrate the value of the miRNA_Targets database we used the data from a

study published by Melton et al. (Melton et al., 2010a). In this paper it is

shown that the inhibition of the let-7 miRNA family promotes de-differentiation of

somatic cells to induced pluripotent stem cells. We used their microarray dataset to

show the presence of miRNA target sites in 5’ UTRs and coding regions in addition

to 3’ UTRs of the target genes. Expression of let-7c miRNA down-regulated 694

genes. The Ensembl BioMart tool was used to match the gene-associated names to

unique Ensembl gene IDs (559). Using our miRNA_Targets database, we found that

488 of the 559 genes have predicted let-7c miRNA target sites. Melton et al. (2010)

reported only 294 genes as having miRNA target sites when they analysed only the

3’ UTRs. From the same paper, expression of miR-294 down-regulated 1899 genes.

Again, only 638 genes had predicted miRNA target sites in the 3’ UTR compared

to 1371 on full-length mRNA (5’ UTR, coding region and 3’ UTR combined).

Figure 2.4 shows a snapshot of predicted miRNA targets for the given set of

down-regulated genes. Ontology classification and chromosomal locations of genes are

shown on the mouse genome. This circular diagram shows the connecting network of

miRNA-gene interactions on different chromosomes. Multiple genes on different

chromosomes can be controlled simultaneously at the translation step, in a sequence

specific manner, by miRNA interactions. Chromosomes 4, 5 and 11 have higher

densities of closely located genes interacting with let-7c miRNA. This approach can

also give insights into chromosomal biases in miRNA-mRNA interactions and can

highlight over-represented gene ontologies in the list of potential target genes.

miRNA target prediction algorithms produce some false predictions, but when they are applied to

differentially expressed genes, possible miRNA interaction sites can be mapped on a

given list of down-regulated genes.

Figure 2.4 Let-7c miRNA targets.

Let-7c targets from the down-regulated set of genes, with ontology analysis and a chromosomal location map of the target genes on the mouse genome. The percentage of genes classified in each GO category is given in front of the respective GO term.

Discussion

miRNA target prediction algorithms and publicly available databases are

continuously evolving. As more information about miRNA–mRNA interactions

becomes available, new publicly available database tools are being developed to

incorporate the new data. Here we applied the two well-known target prediction

algorithms miRanda and RNAhybrid to full-length mRNA sets. Differences in

prediction results from different algorithms are due to different weightings of the

different properties of miRNA-mRNA interactions. The seed region is the best-

known indicator of possible interaction, but this does not cover all interactions

(Melton et al., 2010b). The presence of complementarity at the 3’ end of miRNAs is

also known to affect miRNA target identification (Lee et al., 2009a). Multiple

experimental studies have reported that a large number of miRNA targets are present in

coding regions and 5’ UTRs of mRNA. The miRNA_Targets database fills the

gap in publicly available databases by providing full-length miRNA target site

prediction for multiple species.

This database is particularly helpful for screening through the differentially regulated

genes from miRNA related experimental studies. Within such studies a proportion of

the down regulated genes will be directly modulated by miRNA interactions whereas

others are not subjected to miRNA interactions but rather are regulated by indirect

effects of a regulatory cascade. New algorithms are required to computationally

screen through a large number of genes linked in pathways, which are not direct

targets of miRNAs. Currently this step can only be performed manually for

individual pathways. As more sequencing datasets become available,

expression of miRNA and mRNA transcripts at multiple time points will shed

further light on the quantification of repression caused by each miRNA. Algorithms

and publicly available databases have to keep up with each other for the smooth

translation of bioinformatics studies to laboratory experiments. In future updates,

experimental data on miRNA-mRNA interactions will be integrated into the

miRNA_Targets database.

Conclusions

This database provides researchers with a query platform to investigate miRNA

interactions with non-coding and coding regions in whole transcriptomes and will promote

research activity on the otherwise neglected 5’ UTRs and coding regions of mRNA. Users

can query the database for precompiled miRNA targets from 7 species, and miRNA

target predictions can be performed for multiple genes using mature miRNAs from

17 species and viruses with the miRanda and RNAhybrid algorithms. This is a more

complete platform than previously available databases for the analysis of miRNA

targeting biology. In the future, we will continue to update and maintain this

database with the addition of new miRNAs and gene annotations, and will incorporate more

advanced open source pathway and ontology analysis algorithms as they become

available.

Chapter 3

MicroRNAs in buffalo milk

Introduction

MicroRNAs are a class of small non-coding RNAs, 18-26 nucleotides long, that

regulate the genetic information flow by either mRNA degradation or translational

repression (Valencia-Sanchez et al., 2006). Research in this area has highlighted the

broad spectrum of important biological mechanisms and functions associated with

miRNAs (Gunaratne et al., 2010). Several developmental and physiological

processes including the control of cell differentiation, metabolism, morphology,

polarity, migration, signal transduction, cellular communication, organogenesis,

epigenetic and developmental processes have been shown to be regulated by miRNAs

(Chen et al., 2012b, Inui et al., 2010). Many miRNAs are evolutionarily conserved

and expressed in a tissue and stage specific manner (Cao et al., 2008). High levels of

miRNAs have been found in human, mouse and cow milk and reported to be

packaged into small exosome-like secretory vesicles (Chen et al., 2010b, Hata et al.,

2010, Kosaka et al., 2010, Zhou et al., 2012a). The recent observation that exogenous

miRNAs consumed from plant foods may pass across the gut and directly influence

gene expression in animals (Clarke et al., 2012), raises new questions about the

potential effects of milk miRNA on growth and development of the young besides a

putative role in mammary gland physiology (Kumar et al., 2012a). The current study

explored buffalo milk small-RNA content using sequencing technology.

Buffalo (Bubalus bubalis) is an important livestock animal. Dairy production

of buffalo represents 13% of total milk production worldwide. Buffalo milk has up to

58% more calcium, 40% more protein and 58% less cholesterol than cow’s milk

(Australian buffalo industry council, 2013, Spanghero and Susmel, 1996). Study of

miRNAs in buffalo milk may highlight further differences in milk composition. In

particular, the differences between colostrum (0-5 days after birth) and mature

milk (6 months after birth) are investigated below.

The results confirm that small RNA isolated from colostrum and milk

contains a large population of miRNA. Over 700 putative miRNAs were identified

and quantified by RNA sequencing in buffalo milk and colostrum. The small RNA

content of colostrum was also compared with small RNA derived from a partially

purified exosome fraction from the same sample, suggesting that buffalo milk

miRNA may be carried inside exosome-like vesicles. In addition, the small RNA

profile of colostrum was investigated further by sequencing a single colostrum

sample after size selection of 18 to 40 nucleotides or 18 to 80 nucleotides in order to

identify other RNA species that may be secreted in colostrum, although no evidence

for such molecules could be found.

Materials and Methods

Milk collection and RNA extraction

Buffalo milk samples were collected at the Shaw River Dairy, Victoria, Australia.

Approximately 2 to 4 litres of colostrum (CM, n=3) and milk (MM, n=4) samples

were collected from different animals and transported to the lab on ice (3 to 12

hours). The samples were filtered through gauze (150 μm) to remove hair and

impurities and the filtrate was subjected to centrifugation at 2,000 × g for 15 minutes

at 4°C to separate the cell pellet, fat and skim milk fractions. Cell pellets were

washed 3 times in PBS before RNA extraction. A purified exosome fraction was

prepared with ExoQuick™ solution (SBI-System Biosciences) following the

manufacturer’s instructions.

RNA extractions from skim milk were performed with the Ambion miRNA Isolation

Kit (Life Technologies), according to the manufacturer’s instructions for body fluids. The

final RNA extract was immediately snap-frozen in liquid nitrogen and stored at −80 °C.

RNA quality and quantity were evaluated on the Agilent Bioanalyzer (data not

shown). Poly-A selection or size selection of RNA fragments below 40 or 80

nucleotides was applied before mRNA or small RNA sequencing, respectively.

RNA sequencing was contracted to BGI, Shenzhen. For each sample 10 to 15 million

Illumina (GA-II or HiSeq) reads were obtained (50 nucleotide maximum read

length), corresponding to over 2 gigabytes of raw sequence data in total.

RNA-Seq data analysis

Raw sequences (fastq output) were processed and adaptors were removed using

FastQ and FastX programs (Ben Bimber, 2010). Because an annotated genome

sequence of buffalo is not yet available, read mapping was performed using the

closely related bovine genome (version UMD3.1 (Zimin et al., 2009)) using Bowtie2

(Langmead and Salzberg, 2012). Alignment details were also visualized and

investigated further in SeqMonk software (Andrews, 2013), allowing log-

normalisation, visualization, and annotation of genomic alignments. Subsequently

miRanalyzer (Hackenberg et al., 2011), a specialized miRNA annotation tool, was

used to identify the miRNA populations in RNA-seq data and predict new miRNAs.

This tool uses the fast read aligner Bowtie to align sequences first to known bovine

miRNAs from miRBase, then to non-coding RNA sequences from Rfam, and finally

the remaining sequences are aligned to the cow genome. DSAP (Deep Sequencing

Small RNA Analysis Pipeline) (Huang et al., 2010) was used for the identification of

putative miRNAs related to miRNAs already identified in other species using the

miRBase reference database containing a full catalogue of miRNAs identified in any

species (Griffiths-Jones, 2006).

Gene target predictions and GO ontology analysis

The miRNA_Targets database was used to build a catalogue of predicted miRNA

target sites on full-length mRNA (including the 5’ UTR, coding regions and 3’

UTR) (Kumar et al., 2012b). Functional cluster analysis was done with the

online tool DAVID (Huang da et al., 2009b, Huang da et al., 2009a).

Results

Small RNA libraries, sequencing, pre-processing and global analysis of RNA-

seq data

In total, nine small RNA libraries from buffalo milk were sequenced by high-throughput

(Illumina) sequencing. These included three colostrum samples (CM-1 to

3) and four milk samples (MM-1 to 4). In addition, to test for the association of

miRNA and exosome-like vesicles in milk, an exosome preparation from one of the

colostrum samples was sequenced (CM-E). Finally, to test for the presence of

additional small RNA species, the same colostrum sample was also

sequenced after selection of RNA sizes between 18 and 80 nucleotides (CM-80) rather than

the standard selection between 18 and 40 nucleotides. Approximately 10 to 15

million high quality sequences were obtained for each sample. After adaptor removal

and quality checks, including the removal of sequences smaller than 18 nucleotides

and sequences containing polyA tracts, each small RNA library yielded

approximately 5 to 13 million clean reads, as shown in table 3.1.
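As an illustration of the length and polyA filters described above, a minimal Python sketch is given here. It is not the pipeline actually used (adaptor trimming is omitted), and the polyA run length of 10 and the function names are assumptions chosen for the example.

```python
def is_clean(read, min_len=18, polya_run=10):
    """Return True if an adaptor-trimmed read passes both filters.

    min_len follows the 18 nt cut-off stated in the text; polya_run
    (length of an A-run treated as a polyA tract) is an assumed value.
    """
    if len(read) < min_len:
        return False
    if "A" * polya_run in read.upper():
        return False
    return True


def clean_reads(reads, min_len=18, polya_run=10):
    """Return the retained reads and the fraction of reads retained."""
    kept = [r for r in reads if is_clean(r, min_len, polya_run)]
    fraction = len(kept) / len(reads) if reads else 0.0
    return kept, fraction
```

The retained fraction corresponds to the "Clean reads" percentage reported per sample in table 3.1, which there is expressed relative to the high quality read count.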

The distributions of small RNA sequence lengths presented in figure 3.1a show two

main populations centred on 22 and 32 nucleotides. In most of the samples a major

peak was seen in the 22 nucleotides region corresponding to the length of miRNA

(18 to 26 nucleotides), except for CM-2 and MM-4 where a larger number of

sequences were reported around 32 nucleotides. However, a peak centred at 22

nucleotides was also clearly present in these samples. Figure 3.1b represents

the proportion of sequences of different lengths in each sample. It can be seen that in

most samples 40 to 95 % of all sequences are in the 18-26 nt range

corresponding to miRNA length, with the exception of samples CM-2 and MM-4

where less than 10% of sequences fall in that range and longer oligonucleotides are

more abundant. In milk samples more than 80 % of sequences are consistently

in the miRNA range, while in colostrum samples this proportion is more

variable, from 40% to 90%. Overall the results show that over 50 % of small RNAs

obtained after size selection below 40 nucleotides may represent putative miRNA.

In order to focus on sequences in the miRNA size range, the datasets were divided

into two subsets: sequences shorter than 27 nt and sequences of 27 nt or

longer. The new length distributions for the subset shorter than 27

nucleotides are presented in figure 3.1c. It can be seen that, even in samples

where a large proportion of sequences longer than 26 nt were found previously, the

distributions of shorter sequences are more comparable and the most prevalent size in

this subset is 22 nt, generally representing 20 to 30% of each trimmed sample. 22 nt

is the canonical length of miRNA, and 70 to 90% of sequences have lengths

between 20 and 24 nucleotides. These results suggest that, despite variable

contamination by longer RNA species, milk and colostrum contain a large proportion

of miRNA-like molecules with comparable length distributions.
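The length profiles discussed above amount to a simple tabulation of read lengths. The following Python sketch shows one way to compute the per-length proportions and the fraction of reads in the 18-26 nt miRNA range; the function names are illustrative and are not part of the tools used in this study.

```python
from collections import Counter


def length_distribution(reads):
    """Map each read length to its proportion of all reads."""
    counts = Counter(len(r) for r in reads)
    total = sum(counts.values())
    return {length: n / total for length, n in sorted(counts.items())}


def fraction_in_range(reads, shortest=18, longest=26):
    """Fraction of reads whose length lies in [shortest, longest] nt."""
    if not reads:
        return 0.0
    in_range = sum(1 for r in reads if shortest <= len(r) <= longest)
    return in_range / len(reads)
```

For example, a toy library of three 22 nt reads and one 32 nt read gives a distribution of 0.75 at 22 nt and 0.25 at 32 nt, with 0.75 of reads in the miRNA range.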

Colostrum samples        CM-1                CM-2                CM-3
Type                     Count      %        Count      %        Count      %
Total reads              11408450            10954359            10901894
High quality             11385376   100%     10935510   100%     10858159   100%
3'adapter null           52604      0.46%    46696      0.43%    51677      0.48%
Insert null              1459590    12.82%   250056     2.29%    1169248    10.77%
5'adapter contaminants   142584     1.25%    10177      0.09%    994702     9.16%
Smaller than 18nt        718087     6.31%    49235      0.45%    2925901    26.95%
PolyA                    129        0.00%    62         0.00%    8          0.00%
Clean reads              9012382    79.16%   10579284   96.74%   5716623    52.65%

Colostrum samples        CM-E                CM-80
Type                     Count      %        Count      %
Total reads              10917820            12872584
High quality             10895007   100%     12846556   100%
3'adapter null           61207      0.56%    1256139    9.78%
Insert null              558885     5.13%    1805116    14.05%
5'adapter contaminants   591223     5.43%    303778     2.36%
Smaller than 18nt        546789     5.02%    790276     6.15%
PolyA                    335        0.00%    197        0.00%
Clean reads              9136568    83.86%   8691050    67.65%

Milk samples             MM-1                MM-2
Type                     Count      %        Count      %
Total reads              10886460            14749581
High quality             10857047   100%     14712446   100%
3'adapter null           33693      0.31%    55322      0.38%
Insert null              225367     2.08%    109243     0.74%
5'adapter contaminants   100587     0.93%    30757      0.21%
Smaller than 18nt        1520601    14.01%   1443943    9.81%
PolyA                    47         0.00%    0          0.00%
Clean reads              8976752    82.68%   13073181   88.86%

Milk samples             MM-3                MM-4
Type                     Count      %        Count      %
Total reads              11442843            11009867
High quality             11414369   100%     10991978   100%
3'adapter null           61416      0.54%    24395      0.22%
Insert null              2775662    24.32%   17653      0.16%
5'adapter contaminants   2688172    23.55%   2741       0.02%
Smaller than 18nt        884295     7.75%    61204      0.56%
PolyA                    98         0.00%    3          0.00%
Clean reads              5004726    43.85%   10885982   99.04%

Table 3.1 Raw data processing steps.

Read counts at various stages of data processing in colostrum and milk samples. Three colostrum samples (CM-1, CM-2 and CM-3), colostrum exosome only RNA sample (CM-E), a colostrum sample with sequences from 18 to 80 nucleotides (CM-80) and four milk samples.

[Figure 3.1, panels (a), (b) and (c)]

Figure 3.1 Distribution of sequence lengths in RNA samples from sequenced libraries.

(a) MM-4 has the fewest sequences in the miRNA length (~22 nt) category. All samples showed two peaks at around 22 and 32 nucleotides in length. (b) Proportions of 18-26 nt sequences in each sample; MM-1 and MM-2 have the highest and MM-4 and CM-2 the lowest proportions of sequences in this length range. (c) Length distributions for the subset of sequences shorter than 27 nt.

Annotation of RNA-seq data

Bowtie2 and SeqMonk were used to align, visualize and analyse differential

composition of colostrum and milk RNA. The miRanalyzer tool available online was

also used to identify miRNAs and predict new miRNA. As a genome reference

sequence for the buffalo is not yet available, the cow UMD3.1 genome assembly was

used as a reference for the mapping and annotation of RNA-seq data. The Bowtie2

short read alignment program was used to map the data to the reference genome

using the following command:

bowtie2 bos_tarus_umd3_1_67 -f -p 8 sample.fa -S sample.sam

where bos_tarus_umd3_1_67 is the indexed bovine genome reference sequence file,

sample.fa is the file containing pre-processed RNA-seq sequences and sample.sam

is the name of the result file.

Alignment result files (sam format) were then imported into SeqMonk software for

visualisation and analysis. A total of 209685 overlapping contig probes (nt sequence

probes) were generated using a depth cutoff of a minimum of 3 overlapping reads to

generate a probe. Read count quantitation was done using all reads within probes,

normalised to count per million reads, corrected for probe length and log

transformed.
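The quantitation steps listed above (counts per million reads, probe length correction, log transformation) can be sketched as follows. SeqMonk's exact formula is not given in the text, so the per-kilobase length correction, the base-2 logarithm and the pseudocount in this Python sketch are assumptions, as is the function name.

```python
import math


def normalise_probe(count, total_reads, probe_length_nt, pseudocount=1.0):
    """Log2 of length-corrected counts per million for one probe.

    Assumed scheme: reads per million, divided by probe length in
    kilobases, plus a pseudocount to avoid log(0).
    """
    counts_per_million = count * 1e6 / total_reads
    per_kilobase = counts_per_million / (probe_length_nt / 1000.0)
    return math.log2(per_kilobase + pseudocount)
```

Under these assumptions, a probe of 1000 nt covered by 1024 reads in a library of one million reads has a value of 10 when the pseudocount is set to zero.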

After visualising the cumulative distribution of the data as shown in figure 3.2a it

was observed that the scales of the distributions were not comparable. This was due

to different proportions of reads overlapping the probes selected after filtering in

each sample with major differences between sub-datasets above or below 27 nt.

Additional correction was necessary to bring the data onto a comparable scale, and a

75-percentile normalisation was applied to achieve this, as illustrated in figure 3.2b.
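A 75-percentile normalisation of this kind can be sketched in Python as below. The sketch assumes log-transformed values, so samples are aligned by an additive shift, and it assumes the common target is the mean of the per-sample upper quartiles; neither detail is specified in the text, and the function names are illustrative.

```python
import statistics


def upper_quartile(values):
    """75th percentile using linear interpolation between data points."""
    return statistics.quantiles(values, n=4, method="inclusive")[2]


def percentile75_normalise(samples, target=None):
    """Shift each sample so that all samples share the same 75th percentile.

    `samples` maps a sample name to a list of log-scale values.
    If no target is given, the mean of the sample quartiles is used.
    """
    quartiles = {name: upper_quartile(vals) for name, vals in samples.items()}
    if target is None:
        target = statistics.mean(quartiles.values())
    return {
        name: [v - quartiles[name] + target for v in vals]
        for name, vals in samples.items()
    }
```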

In order to analyse the similarity of quantitative miRNA profiles computed from the

samples, the hierarchical clustering program in SeqMonk software was used to

cluster the samples as shown in the data tree of figure 3.3. It can be seen that the

samples are grouped in two clusters containing colostrum or milk data and that two

samples (CM-3 and MM-4) lay outside their respective clusters. During pre-

processing it was already noted that the length distribution in colostrum sample CM-

3 looked more like those of milk, while milk sample MM-4 was of poor quality with

low number of shorter sequences (figure 3.1b). These two outlier samples were

excluded from comparative analysis of milk and colostrum miRNA secretion

profiles. A boxplot of normalized miRNA probes in figure 3.4a shows the

distribution of expression values in respective datasets. All the distributions have a

similar mean but a larger spread of values is present in milk compared to colostrum.

Alternatively, assuming that the shape of the distribution should be similar in

colostrum and milk, a quantile-quantile normalisation step can be applied instead of

the simpler 75-percentile normalisation shown in figure 3.4a. Figure 3.4b shows BoxWhisker

distribution plots of the same set of samples after quantile-quantile normalisation.

The distributions have been fitted onto each other for further statistical analysis. Finally,

samples CM-1 and CM-2 or MM-1, MM-2 and MM-3 were grouped into their

respective colostrum and milk replicate sets for differential analysis.
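For illustration, quantile normalisation forces every sample to share the same distribution by replacing each value with the mean of the values holding the same rank across samples. The Python sketch below is a generic implementation of that idea, not the SeqMonk code; it assumes all samples are quantified on the same probes and breaks ties by input order.

```python
def quantile_normalise(samples):
    """Quantile-normalise samples given as {name: [values per probe]}."""
    names = list(samples)
    if not names:
        return {}
    n = len(samples[names[0]])
    if any(len(samples[name]) != n for name in names):
        raise ValueError("all samples must have the same number of probes")

    # Mean of the values found at each rank across all samples.
    sorted_values = [sorted(samples[name]) for name in names]
    rank_means = [
        sum(column[i] for column in sorted_values) / len(names)
        for i in range(n)
    ]

    normalised = {}
    for name in names:
        values = samples[name]
        # Probe indices ordered from lowest to highest value.
        order = sorted(range(n), key=lambda i: values[i])
        result = [0.0] * n
        for rank, probe_index in enumerate(order):
            result[probe_index] = rank_means[rank]
        normalised[name] = result
    return normalised
```

After this step every sample has an identical set of values, differing only in which probe carries which value.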

Figure 3.2a Cumulative distribution plot of colostrum, milk and colostrum exosome only samples.

The X axis represents the percentage of sequences and the Y axis represents the spread of gene expression values. Sequence sets in l26 and m26 RNA datasets.

Figure 3.2b Cumulative distribution of RNA quantification before and after normalization to the 75 percentile.

The X axis represents the percentage of sequences and the Y axis represents the spread of gene expression values. Sequence sets in l26 and m26 RNA datasets.

Figure 3.3 Data tree of all annotated buffalo miRNA in colostrum and milk samples.

There are two major groups, (1) colostrum samples (top) and (2) milk samples (below), plus two outlying samples (CM-3 and MM-4). Branch lengths indicate the degree of similarity.

Figure 3.4a BoxWhisker plot of 75-percentile normalisation of colostrum, milk and colostrum exosome only samples.

The X axis represents the gene expression values.

Figure 3.4b BoxWhisker plot of quantile-quantile normalized colostrum and milk samples

The miRanalyzer algorithm was also utilised as an alternative annotation tool to

analyse the RNA-seq data. The miRanalyzer pipeline is designed to remove all

sequence reads longer than 27 bases and re-cluster the sequences after trimming.

This tool uses the fast read aligner Bowtie to align sequences first to known bovine

miRNAs from miRBase, then to non-coding RNA sequences from Rfam, and finally the

remaining sequences are aligned to the cow genome. The results are listed in table 3.2,

which summarises the number or proportion of RNA sequences at various steps of RNA-seq

data processing using miRanalyzer.

Three colostrum samples CM-1, CM-2 and CM-80 were analysed. MM-1, MM-2

and MM-3 were used for milk. For the reasons described earlier CM-3 and MM-4

were not included in this analysis. In colostrum 50 - 70 percent of total reads shorter

than 27 nt aligned to mature miRNAs compared to 60 - 85 percent in milk. In

colostrum approximately 2.5 to 4 % of reads aligned to genes, whereas in milk this

varied from 0.9 to 1.1 %. These results suggest a higher proportion of RNA

represents miRNAs in milk than in colostrum. However a larger number of known

individual miRNAs were identified in colostrum (265-318) than in milk (204-234),

suggesting that a small subset of miRNA is expressed at higher levels in milk. In

total, after pooling the results from individual samples in their respective groups, 361

distinct known miRNAs were reported in colostrum compared to 288 distinct known

miRNAs in milk as depicted in figure 3.5. Novel miRNAs were also predicted in

each data set based on their expression profiles and a machine-learning algorithm

employed by miRanalyzer. The total number of predicted miRNA in colostrum

ranged from 170 - 428 compared to 99 - 125 in milk, again suggesting a larger

distribution of individual miRNAs in colostrum.

Overall miR-30a-5p was reported to be the most highly expressed miRNA in milk

samples. Table 3.3 summarizes the 10 most highly expressed miRNAs in all

colostrum and milk samples in decreasing order of expression. There is more

variation in individual colostrum samples than milk samples. miR-148a was

also highly expressed in all samples. The most abundant miRNA accounted for

approximately 35% of total miRNA content, while the 10 most abundant miRNAs

covered approximately 75% of all aligned reads. A list of the most abundant mature

miRNAs identified in colostrum and milk samples is given in table 3.4.
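The abundance shares quoted above are simple proportions of aligned read counts. A minimal Python sketch of the calculation is shown below; the function name is illustrative and any counts passed to it in an example are hypothetical, not data from this study.

```python
def top_n_share(read_counts, n=10):
    """Return the n most abundant miRNAs with their shares, and their combined share.

    `read_counts` maps a miRNA name to its number of aligned reads.
    """
    total = sum(read_counts.values())
    if total == 0:
        return [], 0.0
    ranked = sorted(read_counts.items(), key=lambda item: item[1], reverse=True)
    top = ranked[:n]
    shares = [(name, count / total) for name, count in top]
    combined = sum(count for _, count in top) / total
    return shares, combined
```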

These results show that a large proportion of milk RNA consists of miRNA and

identify a large number of miRNA species, including a small number of miRNA

species secreted at very high levels, and miRNA species with apparent differential

distribution in colostrum and milk.

                            CM-1      CM-2     CM-80     MM-1      MM-2       MM-3
Total reads                 4896486   691812   4593624   8658578   11159143   4479401
Unique reads                164808    77505    172803    41127     89887      47597
% reads aligned to miRNAs   70.7%     50.8%    71.5%     85.8%     60.7%      73.8%
% reads aligned to Rfam     0.48%     0.46%    0.48%     0.07%     0.12%      0.07%
% reads aligned to genes    2.52%     3.91%    2.28%     1.10%     1.33%      0.91%
miRNA known                 315       265      318       234       204        225
Predicted miRNAs            404       170      428       99        125        107
Total miRNAs                719       435      746       333       329        332

Table 3.2 Summary of small RNA alignments to cow genome.

Figure 3.5 Overlap of known colostrum and milk miRNAs.

Comparison of colostrum and milk RNA-seq data sets

Bowtie2 and SeqMonk were used to identify the miRNAs expressed in colostrum and milk. As shown in the scatterplot of miRNA probes quantified in colostrum and milk (figure 3.6a), miRNAs were among the most highly expressed sequences, and most miRNAs showed little change in expression from colostrum to milk. Statistical tests were employed to identify miRNAs that are differentially expressed between colostrum and milk.

First, using a statistical test based on intensity differences of the annotated miRNA probes, only miR-375 and miR-18b were significantly differentially expressed at P < 0.05. miR-18b was colostrum enriched and miR-375 was milk enriched. When conditions were relaxed by removing the stringent multiple-testing correction filter in order to harvest more candidates for putative differential expression, 41 probes (including 29 miRBase-annotated miRNAs) were potentially differentially expressed at P < 0.05, as shown in table 3.5. Interestingly, a 45 kb cluster on chromosome 21 was found to contain 12 of these miRNAs enriched in milk. In addition, 83 and 91 miRNAs presented 2-fold enrichment in colostrum and milk respectively.
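The effect of the multiple-testing correction filter on the number of candidates can be illustrated with a small sketch. The text does not name the correction method, so the Benjamini-Hochberg procedure used here is an assumption, and the p-values are hypothetical rather than taken from the buffalo data.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that each adjusted value is the
    # minimum of its own scaled p-value and those of all larger ranks.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for five probes (assumption, for illustration only).
raw = [0.001, 0.008, 0.039, 0.041, 0.60]
adj = benjamini_hochberg(raw)

raw_hits = sum(p < 0.05 for p in raw)  # candidates without correction
adj_hits = sum(p < 0.05 for p in adj)  # candidates after correction
print(raw_hits, adj_hits)
```

In this toy example fewer probes pass the threshold after correction than before, which mirrors the pattern described above: two miRNAs with the correction applied and 41 probes without it.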

Colostrum-enriched miRNAs had log2 expression values from 7.5 to 15, with few miRNAs above 15, whereas milk-enriched miRNAs were distributed more evenly from 10 up to 26 and reached higher intensities. The 10 most colostrum-enriched miRNAs are listed in table 3.6a. Among them, miR-145, miR-455 and miR-34c showed the highest log-intensity differences, with 6.7, 6.2 and 6.1 respectively. Milk-enriched miRNAs showed greater intensity differences. As listed in table 3.6b, miR-122, miR-375 and miR-381 showed log-intensity differences of 10.2, 8.9 and 8.5 respectively. The cluster of miRNAs on chromosome 21 also contained some of the most milk-enriched miRNAs.
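Because the expression values are on a log2 scale, a 2-fold enrichment corresponds to a log-intensity difference of 1. The sketch below shows this relationship using a few values copied from tables 3.6a and 3.6b; the function name and the threshold parameter are hypothetical.

```python
# Log2 expression values (colostrum, milk) copied from tables 3.6a and 3.6b.
log2_expr = {
    "miR-145": (14.1, 7.4),
    "miR-455": (13.7, 7.5),
    "miR-122": (7.5, 17.7),
    "miR-375": (16.7, 25.6),
}


def classify(colostrum, milk, min_log2_diff=1.0):
    """Label a miRNA by the sign and size of its log2 difference.

    A log2 difference of 1 is equivalent to a 2-fold change.
    """
    diff = milk - colostrum
    if diff >= min_log2_diff:
        return "milk-enriched", diff
    if diff <= -min_log2_diff:
        return "colostrum-enriched", diff
    return "unchanged", diff


for name, (colostrum, milk) in log2_expr.items():
    label, diff = classify(colostrum, milk)
    print(f"{name}: {label}, log2 difference {diff:+.1f}")
```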

Rank   CM-1         CM-2         CM-80        MM-1         MM-2         MM-3
1      miR-30a-5p   miR-191      miR-30a-5p   miR-30a-5p   miR-30a-5p   miR-30a-5p
2      miR-27b      miR-30a-5p   miR-148a     miR-148a     miR-148a     miR-148a
3      miR-148a     miR-148a     miR-27b      let-7a-5p    miR-375      miR-375
4      miR-191      miR-181a     miR-26a      let-7f       let-7f       let-7a-5p
5      miR-181a     miR-26a      miR-191      miR-375      miR-191      let-7f
6      miR-26a      miR-92a      miR-181a     miR-141      let-7a-5p    miR-141
7      miR-141      let-7a-5p    miR-141      miR-191      miR-141      miR-191
8      miR-186      miR-182      miR-186      miR-22-3p    miR-181a     miR-21-5p
9      let-7a-5p    miR-27b      let-7a-5p    miR-101      miR-22-3p    miR-181a
10     let-7f       miR-30d      miR-22-3p    miR-27b      miR-182      miR-143

Table 3.3 Top ranked miRNAs expressed in colostrum and milk individual samples.

Rank Colostrum Log-exp. Milk Log-exp. Col. exosomes only Log-exp.

1 miR-30a-5p 26.7 miR-30a-5p 35.2 miR-148a 22.5

2 miR-191 8.9 miR-148a 20.3 miR-30a-5p 20

3 miR-148a 8.1 miR-375 6 miR-181a 7.9

4 miR-27b 6.2 let-7f 5 miR-191 5.5

5 miR-181a 5.3 let-7a-5p 4.7 let-7a-5p 5.4

6 miR-26a 5.1 miR-141 3.1 miR-27b 4.5

7 miR-141 3.6 miR-191 2.9 miR-182 3.7

8 miR-186 3.1 miR-181a 1.9 let-7f 3.5

9 let-7a-5p 2.9 miR-22-3p 1.8 miR-26a 3.3

10 miR-182 2.1 miR-21-5p 1.5 miR-141 2.4

Table 3.4 Log 2 transformed expression of the 10 highest expressed miRNAs in

colostrum, milk and colostrum exosome only samples.

Figure 3.6a A scatter plot of buffalo milk miRNA probes.

A scatter plot of miRNA probes with milk-enriched miRNAs highlighted in green and colostrum-enriched miRNAs in blue (change of less than 2). A further 41 miRNAs were found to be differentially expressed after removing the multiple-correction filter (red).

Figure 3.6b miRNA expression profiles of colostrum and exosome only samples.

miRNA      Colostrum L26   Milk L26   Diff P-value   Chr
miR-18b    9.5             6.0        0.00           X
miR-375    16.7            25.6       0.00           2
miR-411    12.3            20.2       0.00           21
miR-122    7.5             17.7       0.00           24
miR-381    10.6            19.2       0.00           21
miR-410    10.4            18.0       0.00           21
miR-409a   9.7             17.4       0.00           21
miR-24-1   8.8             6.8        0.01           8
miR-143    18.1            23.2       0.01           7
miR-3431   10.9            16.8       0.02           X
miR-486    17.4            21.9       0.02           27
miR-145    14.1            7.4        0.02           7
miR-494    7.6             14.2       0.03           21
miR-376c   7.5             14.0       0.03           21
miR-487b   7.6             14.2       0.03           21
miR-369    7.7             14.0       0.03           21
miR-455    13.7            7.5        0.03           8
miR-493    9.4             15.0       0.04           21
miR-34c    13.6            7.5        0.04           15
miR-299    9.3             14.9       0.04           21
miR-340    15.0            19.0       0.04           7
miR-127    7.6             13.6       0.04           21
miR-1296   8.9             7.2        0.04           28
miR-760    12.2            7.3        0.04           3
miR-23b    19.5            15.7       0.04           8
miR-374a   15.5            19.3       0.05           X
miR-101    18.1            21.8       0.05           8
miR-92     8.9             7.2        0.05           X
miR-1388   13.9            17.9       0.05           13
miR-223    9.9             7.3        0.05           X

Table 3.5 Statistically significantly differentially expressed miRNA (without applying multiple correction).

miRNA                Colostrum L26   Milk L26   Diff.   Chr.   Start       End
miR-145              14.1            7.4        6.7     7      62810747    62810772
miR-455              13.7            7.5        6.2     8      105147098   105147122
miR-34c              13.6            7.5        6.1     15     22135438    22135461
ENSBTAG00000046154   13.2            7.5        5.7     19     19989559    19989582
miR-181d             15.0            10.0       5.0     7      12915272    12915296
miR-20b              14.2            9.3        4.9     X      17918172    17918196
miR-452              12.3            7.4        4.9     X      34665670    34665695
miR-125a             12.5            7.6        4.9     18     58015587    58015609
miR-760              12.2            7.3        4.8     3      49797118    49797144
miR-1-1              12.3            7.6        4.7     13     55237550    55237571

Table 3.6a Top differentially enriched miRNAs in colostrum compared to milk.

miRNA      Colostrum L26   Milk L26   Difference   Chr.   Start       End
miR-122    7.5             17.7       10.2         24     58095656    58095680
miR-375    16.7            25.6       8.9          2      107667521   107667552
miR-381    10.6            19.2       8.5          21     67583119    67583143
miR-411    12.3            20.2       7.9          21     67563112    67563134
miR-409a   9.7             17.4       7.7          21     67603260    67603315
miR-410    10.4            18.0       7.6          21     67603914    67603935
miR-127    7.9             15.6       7.6          21     67429798    67429821
miR-154c   8.8             15.6       6.8          21     67575295    67575316
miR-494    7.6             14.2       6.6          21     67569725    67569747
miR-487b   7.6             14.2       6.5          21     67583608    67583630

Table 3.6b Top differentially enriched miRNAs in milk compared to colostrum. A cluster of 8 miRNAs on chromosome 21 was milk enriched.

MicroRNAs in colostrum exosomes

Recent publications have suggested that miRNAs are packaged inside exosomes (Chen et al., 2010b, Kosaka et al., 2010). In order to evaluate the contribution of the exosome fraction to total colostrum miRNA content, exosomes were purified from the skim milk. As previously mentioned in the methods section, starting with 1.5 ml of whey from the same colostrum sample previously used in small RNA sequencing, about 100-300 μl of ExoQuick precipitate was recovered and used for RNA preparation, with a yield estimated at 25% of total milk RNA. One sample of the small RNA fraction (<40 nt) was subsequently sequenced. From a total of 9,136,568 reads, 8,285,028 (90%) could be mapped onto the bovine genome sequence. In general, as shown in figure 3.6b, the small RNA profile of exosomes was very similar to the profile from the whey fraction of the same colostrum. As shown in table 3.4, miR-148a (22.5 % of reads shorter than 27 nt) was the most abundant miRNA extracted from the exosome fraction, followed by miR-30a-5p (20 % of reads shorter than 27 nt). These two mature miRNAs accounted for more than 40% of all reads mapped to the cow genome. There were no significantly differentially expressed miRNAs between exosomal and raw colostrum samples, although colostrum-enriched and exosome-enriched miRNAs could be identified based on quantitative difference (>2 fold). As shown in figure 3.6b, there were twice as many miRNAs enriched in colostrum as in exosomes, with 21 exosome-enriched and 43 colostrum-enriched miRNAs at the 2-fold level. This scatter plot showed that most highly expressed miRNAs were found in relatively higher quantities in colostrum exosomes compared to colostrum whey. The ten most highly enriched miRNAs in colostrum exosomes and colostrum whey are listed in tables 3.7a and 3.7b respectively. Raw colostrum-enriched miRNAs (miR-223: 6.5 and miR-142: 5.3) showed greater log-intensity differences than the top exosome-enriched miRNAs (miR-216b: 3.2, miR-2903: 3.1). However, further experiments will be required to test for any significant difference before and after exosome preparation, and the current experiment indicates that miRNA distributions from these preparations are highly correlated.

Novel miRNAs in buffalo milk

MiRanalyzer was used to predict novel miRNAs in buffalo milk by aligning the reads to the genome of the cow, a closely related species. A total of 1031 novel miRNAs were predicted, with between 1 and 61071 reads aligned to each contig. Novel mature miRNA sequences with more than 500 sequence reads (69 miRNAs) are listed in table 3.8 with their hairpin chromosomal positions on the cow genome (bosTau4). For each new miRNA the number of unique reads and the strand are also listed. The number of unique reads gives an insight into the heterogeneity of miRNA sequences, also referred to as iso-miRs in recent literature (Cloonan et al., 2011).
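The read-count cut-off used to select candidates for table 3.8 can be sketched as a simple filter. The record type and function name below are hypothetical; three rows are copied from table 3.8 and one low-count row is invented purely to show the filter at work.

```python
from typing import NamedTuple


class Prediction(NamedTuple):
    chrom: str
    start: int
    end: int
    strand: str
    unique_reads: int
    read_count: int


predictions = [
    # Rows copied from table 3.8.
    Prediction("chr28", 25971595, 25971721, "-", 99, 61025),
    Prediction("chr4", 121288207, 121288285, "+", 118, 58437),
    Prediction("chr25", 41475482, 41475612, "-", 19, 506),
    # Hypothetical low-count prediction, for illustration only.
    Prediction("chrN", 1, 80, "+", 2, 12),
]


def abundant(preds, min_reads=500):
    """Keep predictions with more than min_reads reads, most abundant first."""
    kept = [p for p in preds if p.read_count > min_reads]
    return sorted(kept, key=lambda p: p.read_count, reverse=True)


top = abundant(predictions)
for p in top:
    print(p.chrom, p.start, p.end, p.strand, p.unique_reads, p.read_count)
```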

miRNA       Colostrum L26   Exosome only   Diff.   Chr.
miR-216b    7.9             11.1           3.2     11
miR-2903    7.5             10.6           3.1     9
miR-139     8.4             11.3           2.9     15
miR-195     7.6             10.5           2.9     19
miR-380     8.6             11.3           2.7     21
miR-92      8.9             11.6           2.7     X
miR-128-1   10.8            13.4           2.6     2
miR-190a    8.1             10.6           2.5     10
miR-132     8.2             10.7           2.5     19
miR-3413    10.9            13.3           2.4     X

Table 3.7a Most abundant miRNAs in colostrum exosomes compared to colostrum whey.

miRNA       Colostrum L26   Exosome only   Diff.   Chr.
miR-223     13.9            7.5            6.5     X
miR-142     12.8            7.5            5.3     19
miR-33b     12.4            7.9            4.6     19
miR-33a     10.4            6.2            4.3     5
miR-339b    12.8            8.8            4.0     7
miR-142     16.2            12.6           3.6     19
miR-424     11.0            7.6            3.4     X
miR-24-1    16.4            13.0           3.4     8
miR-150     11.1            7.8            3.4     18
miR-409a    11.2            8.0            3.2     24

Table 3.7b Most abundant miRNAs in colostrum whey compared to colostrum exosomes.

Chr.             Start       End         Strand   Unique reads   Read count   Read cluster sequence
chr17            75694754    75694830    +        99             61071        AGGGTCTGCCGGGCGTCGGCC
chr28            25971595    25971721    -        99             61025        TGGGTCTGCGGGCCGTCGGCA
chr4             121288207   121288285   +        118            58437        TGGGGGTCTGCCGGCCGTCATC
chr11            74676110    74676206    -        301            53955        CTGGTTCGATTCCGGCTCGAAGGACT
chr16            48446665    48446781    +        50             49121        TGGGGCTAAGCCTGTCTGAAC
chr2             132707809   132707903   -        220            40984        CCCACATGGTCTAGCGGTTAGGATTCCTGG
chr26            33243035    33243137    +        131            27736        TGCCAAACCAGTTGTGCCTGTAGA
chr27            7419639     7419773     +        235            21458        CCCTCGGATCGGCCCCGCCGGGGTCGGCCC
chr2             132707807   132707901   -        220            15435        CGGGTTCGACTCCCGGTGTGGGAAG
chr12            15516110    15516190    +        92             10746        GGGCGCGTGGGTCCCCCCCGCCG
chr23            45127609    45127749    -        48             7499         CCAGTCCTTCTGATCGAGGCCCC
chr6             113648913   113649053   +        122            6610         TTGAACATGGGTCAGACGGTCCTGAGAG
chr27            7419626     7419778     +        110            6308         GGAGCGCTGAGAAGACGGTCGAACTTGAC
chr4             14134486    14134628    -        110            6308         GGAGCGCTGAGAAGACGGTCGAACTTGAC
chr19            56966011    56966099    +        35             5831         GGGGCGCGGGCCACGAGGAGGG
chr3             86297808    86297902    -        35             5831         CCCTGTGGTCTAGTGGTTAGGACTC
chr10            19075601    19075741    -        38             4182         AAGGACCCAAATGAACTTTTTGG
chrUn.004.318    127047      127123      +        42             3737         TGGGAATACCGGGTGCTGTAGGCTAA
chr7             6422106     6422182     -        40             3735         TGGGAATACCGGGTGCTGTAGGCTAA
chr21            22903123    22903263    -        42             3142         GTACACCACGTTCCCCTGGAGAT
chr7             91477771    91477909    +        34             2756         ACAGAAGAATCTGAACGAATCTTTTGG
chr17            53706977    53707123    +        77             2682         CTGGTTCGATCCCGGGTTTCGGCAGC
chr23            30069297    30069401    +        71             2658         CTGGTTCGATCCCGGGTTTCGGCAG
chr12            68043677    68043821    -        67             2633         CTGGTTCGATCCCGGGTTTCGGCAG
chr12            67940759    67940903    -        67             2626         CTGGTTCGATCCCGGGTTTCGGCAG
chr7             42830879    42830961    -        68             2616         CTGGTTCGATCCCGGGTTTCGGCA
chr4             14134515    14134629    -        80             2584         GATTGGATGGTTTAGTGAGGCCCTCGGATC
chr26            33700944    33701082    +        25             2559         GCCCACCACGTTCCCATGGGAA
chr12            27908343    27908475    -        31             2518         GAGAAATCTGAACGAACTTTTTGG
chr10            25844456    25844598    +        66             2336         TGGTTCGACTCCGGCTCGAAGGAA
chrUn.004.2      1722135     1722259     +        42             2218         TATTCATTTATCTCCCAGCCTACAA
chr11            89654156    89654256    -        56             2180         ATACAGTGACCAGGTGACGACGGATTTC
chr2             132707813   132707909   -        36             2067         TCCCACATGGTCTAGCGGTTAGGATT
chrUn.004.361    53625       53727       -        35             2005         GATTGGATGGTTTAGTGAGGCCCT
chr26            965890      966032      +        78             1855         CTCTCCTCCCGCCCCCCCGCCTCT
chr15            55209145    55209239    -        51             1852         GTTCAAATCCCGGACGAGCCCCCCT
chr5             35742351    35742489    +        36             1771         CTGGAGGTAGAGCACTGGATTC
chr9             74541597    74541727    -        6              1587         TGCGACTCCATGGACTGTCGGCT
chr7             16300382    16300480    +        23             1518         GCGGGCCCCCTCCCCCGCCCCA
chr21            9439392     9439530     +        5              1489         CCCTCCATCTCCGGGCGGGGGC
chr2             112999223   112999361   +        58             1460         CTGGGAGAAGGATTGGCTCTGA
chr3             67781207    67781319    -        47             1329         GGCGTGGAATGCGAGTGCCTAGTGGGCCA
chr7             109751352   109751432   -        39             1302         AAACTGCCACCACGTTCCCGAAG
chr7             109649195   109649275   -        38             1301         AAACTGCCACCACGTTCCCGAAG
chr15            83653289    83653361    +        33             1262         GGCCCCACCCCGCGCGGCGCCCTC
chr19            46779999    46780077    +        39             1262         CTGGCGCTGTGGGATGAAGCAA
chr13            40499195    40499333    +        20             1189         CACATCCTCCACGTTCCCGGTG
chr13            54580788    54580920    -        23             1188         ATTCTCAGGTTGGACAGTCCTGAA
chr6             106089697   106089839   +        10             1156         CTAAAAGTTTGGTTGGGTTTTTCT
chr1             107758081   107758223   -        26             1153         CTCCAGGTCCCTGTTCGGGCGCCA
chr27            7419603     7419755     +        24             1056         ACTACCGATTGGATGGTTTAGTGAGGCCC
chr3             91642792    91642892    +        9              1037         CTAAAAGTTTGTTTGGATTTTTC
chr14            16237870    16237950    +        31             1009         TGCCTAGCAGCCGACTTAGAACTGGTGC
chr26            965894      966032      +        43             999          GGGGTGGGGGGGCGGCGGGACC
chr4             105408583   105408719   -        20             989          TCAGATTCCACCACGCTCCCA
chrUn.004.1189   25386       25528       -        43             881          CAACTGTTAGGAGGCTTGGCTGCT
chr2             5346240     5346350     +        13             871          CAAAAAGTTCGTCCAGATTTTTC
chr18            59082158    59082248    +        37             864          GGGGTGTAAATCTCGCGCCGGCC
chr20            73410962    73411048    +        8              827          AGCGCGTCGGGGGGCGGGGGC
chr22            12931210    12931306    -        9              744          GCCCGTGGACTGTGTGAGGCT
chr8             103731644   103731788   -        39             734          CGCGGTGGCGCGCGCCTGTAGTCCCAGCTA
chr10            105141661   105141763   -        26             616          GGGGCTGTAGCTCAGATGGTAGAG
chr3             11657299    11657419    -        26             594          CCCTGGCGGAGCGCTGAGAAGAC
chr10            42908210    42908322    -        25             589          CGCGCGCCTGTAGTCCCAGCTACTCGGGA
chr8             103731643   103731775   -        25             589          CGCGCGCCTGTAGTCCCAGCTACTCGGGA
chr28            29020957    29021081    +        43             553          ACCTGGGTATAGGGGCGAAAGATTA
chr25            30022766    30022894    -        20             543          CGCGCCTGTAGTCCCAGCTACTCGGGA
chr26            965884      966036      +        29             534          CCCCGGCTCTCCTCCCGCCCCCCCGCCTC
chr25            41475482    41475612    -        19             506          CCTCACACCACGTTCCCGTCCAC

Table 3.8 Predicted miRNAs with more than 500 reads.

                  CM-1     CM-2    CM-3      CM-80    CM-E
All reads         284041   68136   1423659   357044   691081
Clustered reads   44466    29716   16525     50029    64223

                  M1-M     M2-M      M3-M      M4-M    All
All reads         295063   1441366   2454378   10540   7025308
Clustered reads   16961    51889     42473     7325    265366

Table 3.9 Number of unaligned reads left after cow genome alignments. Clustered reads represents the number of unique sequences.

Analysis of unaligned reads after cow genome alignments:

After cow genome alignment, unaligned reads were collected for each sample from the Bowtie alignment files (BAM format). For each sample, the numbers of unaligned reads and clustered reads are listed in table 3.9. Because of the low number of reads in individual samples, the sequences were pooled together, giving approximately 7 million individual reads, and re-clustered into 265366 sequences. These clustered reads were aligned to the miRBase database, which references all known miRNAs from all species, using DSAP (Deep Sequencing Small RNA Analysis Pipeline). After filtering for a minimum of 10 reads to extract the most reliable novel miRNAs, 2.46 % of reads were found to align to miRNAs and 124 new miRNAs were identified (supplementary table 3.1). The highest number of reads (140884) aligned to cfa-miR-101, and eca-miR-101 was the second most abundant with 4639 reads. As shown in figure 3.7, eca-miR-101 has an additional adenosine compared to cfa-miR-101. The full list of novel buffalo miRNAs identified with this approach is given in the supplementary data.
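The pooling and re-clustering step described above amounts to collapsing identical reads into unique sequences with counts, then applying a minimum read threshold. The following is a minimal sketch of that idea only; it is not the clustering performed by DSAP, the function name is hypothetical, and the toy reads are invented.

```python
from collections import Counter


def collapse(samples, min_reads=10):
    """Pool reads from all samples, count identical sequences, and keep
    sequences supported by at least min_reads reads."""
    pooled = Counter()
    for reads in samples.values():
        pooled.update(reads)
    return {seq: n for seq, n in pooled.items() if n >= min_reads}


# Hypothetical toy reads; real input would be the unaligned reads per sample.
samples = {
    "CM-1": ["ACGTACGT"] * 6 + ["TTTTGGGG"] * 3,
    "MM-1": ["ACGTACGT"] * 5 + ["GGGGCCCC"] * 2,
}

clusters = collapse(samples)
print(clusters)
```

In the toy input no sequence reaches 10 reads within a single sample, but one does after pooling, which is the motivation given above for combining the samples before filtering.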

Figure 3.7 miRNA read alignments to miR-101.

Characterisation of small RNA sequences greater than 26 nt

To improve identification of miRNAs, small RNA datasets were divided into two groups: sequences shorter than 27 nt and sequences longer than 26 nt. An analysis of colostrum and milk RNA sequences longer than 26 nt showed a very high level of tRNA and ribosomal RNA sequences with very few alignments to miRNAs. As expected, the sample processed with a length cut-off below 80 nucleotides (the CM-80 sample), instead of the standard 40 nt cut-off protocol, showed a higher number of enriched contig probes in the longer RNA-seq data (figure 3.8). The 5 most abundant genes identified in colostrum whey and milk samples are listed in tables 3.10a and 3.10b, showing that they mainly represent tRNA (predominantly glycine and valine tRNA). SnoRNAs and ribosomal RNAs were also represented at high levels in the CM-80 sequences (tables 3.10c & 3.10d).
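The length-based division of reads can be expressed in a few lines. This sketch assumes reads are available as plain strings; the function name and the example reads are hypothetical.

```python
def split_by_length(reads, max_short=26):
    """Split reads into those of at most max_short nt (shorter than 27 nt)
    and those longer than max_short nt (longer than 26 nt)."""
    short = [r for r in reads if len(r) <= max_short]
    long_ = [r for r in reads if len(r) > max_short]
    return short, long_


# Hypothetical reads of 22, 26, 27 and 35 nt.
reads = ["A" * 22, "C" * 26, "G" * 27, "T" * 35]
short, long_ = split_by_length(reads)
print([len(r) for r in short], [len(r) for r in long_])
```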

Name       Chr.   Start      End        Colostrum MT26   Milk MT26
tRNA:Gly   28     40330919   40330955   24.1             18.9
tRNA:Gly   3      21254306   21254371   24.0             18.8
tRNA:Glu   7      44000403   44000475   23.5             16.3
tRNA:Gly   3      21225220   21225255   23.3             18.3
tRNA:Gly   18     1453949    1454023    23.1             18.1

(10a)

Name       Chr.   Start      End        Colostrum MT26   Milk MT26
tRNA:Val   3      21072781   21072814   22.9             24.0
tRNA:Val   23     31093036   31093069   22.7             23.5
tRNA:Val   3      21230135   21230168   21.6             23.1
tRNA:Val   3      8152757    8152792    22.2             23.0
tRNA:Val   3      21251227   21251261   21.4             22.8
tRNA:Val   23     30910583   30910618   21.4             22.8

(10b)

Name      Chromosome   Start      End        CM_80_m26
snoRNA    29           41853678   41853744   25.1
rRNA      28           1894641    1894767    23.2
rRNA      21           60891275   60891399   22.6
unknown   6            84048412   84048486   21.8
snoRNA    X            20461139   20461165   20.5
snoRNA    19           49114706   49114777   20.3
snoRNA    9            71976829   71976898   20.1
snoRNA    7            41766068   41766132   20.0

(10c)

Name       Chr.   Start      End        CM_80_m26   CM_1_m26   Difference
snoRNA     24     14627782   14627831   19.7        7.3        12.5
snoRNA     5      21914202   21914250   17.3        7.3        10.0
snoRNA     19     20706675   20706724   15.5        7.3        8.3
unknown    4      97593471   97593512   15.8        7.5        8.3
tRNA:Ser   23     31147438   31147467   16.1        7.8        8.3

(10d)

Table 3.10 The most abundantly expressed sequences from the greater than 26 nt datasets: (a) colostrum, (b) milk, (c) CM-80 (sample with RNA length up to 80 nt) and (d) CM-80 enriched.

Figure 3.8 Comparison of colostrum 26-40 nt and 26-80 nt. size selected samples.


Comparative analysis of buffalo, human and pig milk miRNAs

Additional small RNA sequencing datasets were retrieved from the publicly available Gene Expression Omnibus (GEO) database (Barrett et al., 2011), including data from previously reported studies on human and pig milk (Gu et al., 2012, Zhou et al., 2012a). Unfortunately, the sequencing data from a report on bovine milk RNA sequencing (Hata et al., 2010) were not publicly available for direct comparison. All these datasets were processed using the online miRanalyzer miRNA analysis pipeline (Hackenberg et al., 2009) and compared with similar data generated in the current study on buffalo milk. Table 3.11 presents the most abundant miRNAs identified in the quantification results, expressed as a percentage of total mapped RNA reads. miR-148a is consistently present at high concentration in the milk of all the species considered (ranked first in human and pig and second in buffalo), representing from 20% to 50% of all miRNA sequences. However, while some miRNAs were found at comparable levels in all species (for example miR-30a, miR-101, miR-3596d), others are highly enriched in particular species. For example, miR-486, miR-185 and miR-103 are enriched in pig, miR-30b in humans and miR-423 in buffalo. It should also be noted that, when multiple data points or duplicate samples were available (pig and human), the abundance of a particular miRNA could vary significantly between individual samples from the same species. It remains to be established whether this variation represents genetic variability of milk miRNA between individuals, physiological variation associated with lactation status, or technical issues such as sequencing depth, RNA quality and annotation methodology. However, the results confirm the conserved high abundance of miR-148a and miR-30a in mammalian milk and suggest their potential use as universal milk miRNA biomarkers. These results also highlight the potential differences between milk miRNAs from different species and the requirement for additional experiments to control further for technical, genetic and physiological factors affecting milk miRNA content.
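The cross-species comparison can be illustrated by intersecting the top-10 lists of table 3.11. This is a sketch under a stated assumption: miRNA names are made comparable by dropping a trailing -3p/-5p arm suffix, a simplification introduced here and not a step described in the text. The function name is hypothetical.

```python
import re

# Top-10 miRNAs per species, copied from table 3.11.
top10 = {
    "buffalo": ["miR-30a-5p", "miR-148a", "miR-375", "let-7f", "let-7a-5p",
                "miR-141", "miR-191", "miR-181a", "miR-22-3p", "miR-21-5p"],
    "human": ["miR-148a-3p", "miR-200a-3p", "miR-30a-5p", "miR-141-3p",
              "miR-378a-3p", "miR-146b-5p", "miR-101-3p", "miR-182-5p",
              "miR-375", "miR-21-5p"],
    "pig": ["miR-148a-3p", "miR-30a-5p", "miR-182", "miR-378", "miR-92a",
            "miR-30d", "miR-30c-5p", "miR-191", "let-7c", "miR-574"],
}


def base_name(name):
    """Drop a trailing -3p/-5p arm suffix so names are comparable (assumption)."""
    return re.sub(r"-[35]p$", "", name)


per_species = {sp: {base_name(n) for n in names} for sp, names in top10.items()}
shared = set.intersection(*per_species.values())
print(sorted(shared))
```

Under this naming assumption the only miRNAs present in all three top-10 lists are miR-148a and miR-30a, in line with the conserved high abundance noted above.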

Rank   Buffalo      %      Human         %       Pig           %
1      miR-30a-5p   35.2   miR-148a-3p   24.31   miR-148a-3p   53.68
2      miR-148a     20.3   miR-200a-3p   7.70    miR-30a-5p    9.39
3      miR-375      6      miR-30a-5p    5.71    miR-182       5.52
4      let-7f       5      miR-141-3p    5.29    miR-378       3.66
5      let-7a-5p    4.7    miR-378a-3p   4.38    miR-92a       2.96
6      miR-141      3.1    miR-146b-5p   3.61    miR-30d       2.61
7      miR-191      2.9    miR-101-3p    3.14    miR-30c-5p    2.00
8      miR-181a     1.9    miR-182-5p    2.88    miR-191       1.78
9      miR-22-3p    1.8    miR-375       2.78    let-7c        1.65
10     miR-21-5p    1.5    miR-21-5p     2.60    miR-574       1.55

Table 3.11 Comparative abundance of milk miRNA (% of total miRNA content, '-' when <0.1%) in wallaby (t3: late lactation), Buffalo (bce: colostrum exosome, bc: colostrum or b: milk), human (4 subjects) and pig (lactation day 0 to 27) milk.

Target gene prediction, GO ontology enrichment and pathway analysis

The miRNA_Targets database interface was used to identify potential miRNA target genes for the miRNAs identified in colostrum and milk. Experimentally confirmed miRNA target genes were obtained from the miRWalk miRNA targets database. Gene targets for the 10 most abundant miRNAs present in colostrum and milk were used to find significantly associated functional ontology terms using the DAVID ontology database interface. No specific ontology categories could be obtained among target genes using the bovine annotation reference. This could have been due to the poor quality of annotations attached to cow genes, so we tested the human annotation of these genes by mapping common gene names to human ontology annotation using the DAVID ID conversion tool. The list of enriched miRNA target gene ontology categories is given in table 3.12 along with enrichment scores. Interestingly, the genes predicted to be targeted by miR-30a-5p and miR-141 were enriched for angiogenesis and vasculature development. The highly abundant miR-148a and the drastically milk-enriched miR-375 predominantly targeted genes enriched for immune system functions. miR-191 is predicted to target genes associated with circadian rhythm and negative regulation of gene expression. Colostrum-enriched miR-27b may target transcription factors, miR-26a bone development, and the let-7 family may regulate hormone response. This shows a dominance of putative signalling functions associated with immunity and angiogenesis.
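The general idea behind ontology enrichment can be sketched as an over-representation test: given a list of target genes, how surprising is the number that carry a given ontology term? The code below is a plain hypergeometric tail probability written for illustration. It is not DAVID's scoring procedure, which differs in detail, and all counts are hypothetical.

```python
from math import comb


def hypergeom_pvalue(hits, targets, annotated, background):
    """Probability of observing at least `hits` annotated genes when drawing
    `targets` genes from `background` genes, of which `annotated` carry the term."""
    upper = min(targets, annotated)
    total = comb(background, targets)
    tail = sum(
        comb(annotated, k) * comb(background - annotated, targets - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical counts (not from this study): 8 of 50 target genes carry a term
# that annotates 200 of 20000 background genes.
p = hypergeom_pvalue(hits=8, targets=50, annotated=200, background=20000)
print(f"{p:.3g}")
```

With these invented numbers only about 0.5 annotated genes would be expected by chance, so observing 8 gives a very small probability; a real analysis would also need to correct for the number of ontology terms tested.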

Discussion

Milk is known to provide postnatal immunity and nutrition to neonates, but the molecular mechanisms of this transfer are incompletely understood. Recent studies have reported the presence of miRNAs in human and cow milk (Chen et al., 2010a, Zhou et al., 2012b). MicroRNAs are estimated to account for 1-5 % of total protein coding genes (Berezikov et al., 2005). Each miRNA is predicted to regulate hundreds of genes (Krek et al., 2005, Reyes-Herrera and Ficarra, 2012, Chi et al., 2012). The emergence of miRNAs as a new class of post-transcriptional regulatory molecules and their involvement in virtually all biological processes warranted their characterisation in buffalo milk. Despite buffalo being an important milk producer with unrivalled nutritional and economic value, the RNA content of buffalo milk had not yet been studied. This study was also inspired by the need for much more intensive studies on the role of miRNAs in lactation. By employing advanced sequencing technology and bioinformatics techniques, small RNA profiling in milk may also provide a non-invasive methodology to investigate the lactation status of buffalo and, in particular, to characterize the changes between colostrum and milk synthesis. In this study, small RNA from buffalo colostrum and milk samples was sequenced by high-throughput sequencing, and sequence reads were aligned against known miRNAs and the bovine genome reference sequence. The use of the bovine rather than a buffalo reference is one limitation of the current study. In the future, access to the buffalo genome sequence would greatly benefit the analysis of sequencing results and the identification of putative buffalo-specific miRNAs.

miRNA        Enrichment score   Ontology classification
miR-30a-5p   6.61               Angiogenesis, blood vessel development, vasculature development
miR-148a     11.1               Hemopoietic or lymphoid organ development, immune system development, regulation of cell proliferation, Pathways in cancer, leukocyte differentiation
miR-191      2.9                Circadian rhythm, negative regulation of gene expression
miR-27b      7.34               Regulation of transcription from RNA polymerase II promoter
miR-375      1.99               Immune response, inflammatory response
miR-181a     2.94               Wound healing, growth, developmental growth, tissue regeneration
miR-26a      6.93               Skeletal system development, regulation of ossification, bone development, skeletal system morphogenesis
let-7f       12.86              Response to organic substance, response to estrogen stimulus
let-7a-5p    20.07              Negative regulation of programmed cell death, response to hormone stimulus
miR-141      7.51               Vasculature development, blood vessel development, blood vessel morphogenesis, angiogenesis

Table 3.12 Ontology classification and enrichment scores of selected miRNA target genes.

Colostrum and milk samples contained a large amount of small RNA with a large proportion of miRNA, and over 700 known and novel candidate miRNAs were characterized in buffalo colostrum and milk. Colostrum apparently carried more miRNA species than milk, but this might be due to milk containing high levels of a few miRNA species, possibly masking other low-abundance miRNAs (Lemay et al., 2009), as most of the highly expressed miRNAs had higher read counts in milk than colostrum (figures 3.5a & 3.5b). The most abundant miRNAs were present at comparably high levels at both stages of lactation (table 3.3). Two miRNAs, miR-30a-5p and miR-148a, were consistently the most abundant in colostrum whey and milk. In milk samples especially, these two accounted for more than 50 % of all miRNAs. miR-375 was another abundant miRNA and was highly enriched in milk compared to colostrum. Milk had a slightly, but not significantly, higher number of enriched miRNAs (91) compared to colostrum (83), and expression values of milk-enriched miRNAs were found to be among the highest, as shown in figure 3.6a. One possible explanation for this is that in milk fewer miRNAs are secreted at high levels to convey genetic signals to the young. On the other hand, colostrum-enriched miRNAs may play a role in the remodelling of mammary tissues at the start of milk production, as suggested by Ibarra et al. (2007) and Ucar et al. (2010). This is partially supported by the prediction that colostrum-specific miR-27b may regulate transcriptional activity.

Sequencing of exosomal RNA supports the notion that, as reported in other

species (Hata et al., 2010, Zhou et al., 2012a), buffalo milk miRNA is secreted in

exosomes. When exosomes were purified using a standard reagent (Exoquick), it was

shown that most of the colostrum miRNA sequences and their expression levels were

the same in the purified fraction, suggesting the miRNAs were packaged in exosomes. However, it remains

to be established whether this particular reagent is appropriate for purifying milk

exosomes, and only one RNA sample was analysed by this method. Although the

graph of figure 3.5b shows that the most highly expressed miRNAs are apparently

packaged in exosomes, this needs to be confirmed by additional biochemical analysis

and characterisation of buffalo milk exosomes.

Ontology analysis of the genes targeted by the most highly expressed miRNAs miR-

30a-5p and miR-148a in colostrum and milk revealed putative functional signalling

associated with the regulation of immunity and vascular development, indicating the

potential of milk miRNA to convey regulatory signals to the infant. Most of the

miRNAs identified so far in milk from human, buffalo, and pig are similar and this

may imply their role in the regulation of common molecular processes involved in

milk synthesis or in providing the developmental signals to the new-born. However,

significant variation in the miRNA profiles between species or individuals of the

same species has also consistently been observed and larger studies will be required

to fully address the emerging significance of milk miRNA in lactation biology. This

study has only paved the way by providing a comprehensive catalogue of buffalo

milk and colostrum miRNA in comparison to current knowledge of other species to

inform new functional hypotheses.

Chapter 4

Buffalo milk transcriptome

Introduction

Milk is the primary source of nutrition for newborn mammals. It contains macro- and

micronutrients required to provide appropriate nutrition for growth and

development of the neonate (German et al., 2002, Newburg, 2001). However, milk

also contains bioactive molecules that play a central role in regulating

developmental processes in the young and providing a protective function for both

the infant gut and the mother’s mammary tissues during lactation (Trott et al.,

2003, Kwek et al., 2009a, Nicholas et al., 2012b). Identifying bioactive molecules

and their physiological functions can be difficult and requires extensive screening of

milk components using the latest technologies (Sabikhi, 2007). The need to support

growth and development in early life differs from the maintenance of muscle, eyes

and brain function in the elderly. Therefore, functional foods that both improve

general nutrition and wellbeing, and also target prevention and treatment of health

problems are very attractive. New animal models are now becoming increasingly

relevant to search for these factors. This chapter focuses on the buffalo model and

provides novel insights into buffalo milk.

According to the Food and Agriculture Organization of the United Nations, there are

approximately 170 million buffaloes in the world (FAO, 2004). Buffaloes are found in

Asia (97%) mainly in India, Pakistan, and Indonesia (Kumar et al., 2007, Yindee et

al., 2010) followed by Africa (2%) and Europe-Italy (0.2%). Recently they have been

introduced into Australia and America (Kierstein et al., 2004). Buffaloes produce

approximately 5 % of the world milk supply. Buffalo milk contains less water and more

fat, lactose, protein and minerals than cow milk (Cockrill, 1981, Abd El-Fattah et al.,

2012). Despite these advantages, buffaloes remain underutilized and research

on buffalo milk is lagging. For example, the draft genome assembly of

buffalo is still incomplete (Michelizzi et al., 2010).

Similar to human and cow milk, buffalo milk also contains heterogeneous

populations of somatic cells (Boutinaud and Jammes, 2002). Previous studies have

highlighted the protein and RNA components of milk (Yang et al., 2013, Hinz et al.,

2012) and showed that the composition of buffalo milk-fat proteins changes during

the lactation cycle (Singh and Ganguli, 1976), but the RNA content of buffalo milk has

not been thoroughly investigated. Here we used Illumina sequencing technology to

examine the RNA contents in buffalo milk cells. Previous gene expression studies

conducted on buffalo have used microarrays (Rao et al., 2011). To the best of our

knowledge this is the first gene expression study using RNA-seq technology in

buffalo. Unlike microarrays this technology is not dependent on any previous

knowledge of genes in the organism and can both precisely quantify the RNA and

identify new genes and mRNA transcripts. Colostrum and milk samples were

processed to sequence the RNA from somatic cells. The sequences of over 40000

transcripts were characterised and more than 5000 novel transcribed loci were

revealed by buffalo milk transcriptome analysis. The analyses of highly expressed

genes and differentially expressed genes showed significant differences between

colostrum and milk cells. Further ontology analysis of gene expression confirmed the

previous findings that the colostrum transcriptome is rich in immune-protective

elements (Singh et al., 1993, Uruakpa et al., 2002). Other genes differentially

expressed in colostrum were enriched for ontology terms related to the regulation of

cell motion and migration, and for the focal adhesion, axon guidance and ECM-receptor interaction

pathways. In the future, RNA sequencing may also be developed to provide a

snapshot of animal health and milk quality without using invasive procedures.

Materials and Methods

Milk fractionation

As described in the previous chapter, approximately 2 to 4 litres of buffalo milk and

colostrum were collected at the Shaw River Dairy in the state of Victoria, Australia.

Samples were stored on ice and brought to the laboratory within 3 to 12 hours. They were filtered

through a large gauze (150 μm) to remove hair and impurities and subjected to

centrifugation at 2,000 g for 15 minutes at 4°C to pellet cells and separate fat and

skim milk fractions. Cell pellets were washed 2-3 times in PBS before RNA

extraction. Fractions were stored at -80°C.

RNA preparation and sequencing

Cellular RNA was prepared with the RNeasy minikit (Qiagen, Sydney, Australia)

following the manufacturer’s instructions. RNA quality and quantity were evaluated

on the Agilent Bioanalyser. RNA sequencing (Illumina HiSeq2000 RNA-seq

quantification procedure) was contracted to BGI, Shenzhen, China. The RNA was

processed by poly-A selection before mRNA sequencing and about 10 million reads

(50 nucleotide maximum read length) were obtained for each sample.

Bioinformatics analysis

An annotated genome sequence of buffalo is not yet available; therefore, read

mapping was performed against the closely related bovine genome reference

(ftp://ftp.ensembl.org/pub/release-70/fasta/bos_taurus/dna/Bos_taurus.UMD3.1.70.dna.toplevel.fa.gz;

Zimin et al., 2009). Tophat2 (version tophat-2.0.3.Linux_x86_64), the Cufflinks

package (version cufflinks-2.1.1.Linux_x86_64; Trapnell et al., 2010) and SeqMonk

software (version v0.24.1; http://www.bioinformatics.babraham.ac.uk/projects/seqmonk/)

were used for transcriptome quantification and sequence annotation. Alignment details were also

visualized and investigated further by SeqMonk software as described in RNA-seq

data analysis guidelines (SeqMonk online video tutorials). Functional cluster analysis

was done with the online tool DAVID (Huang da et al., 2009b, Huang da et al.,

2009a) and Panther database (Mi et al., 2013). A workflow of sample preparation

and RNA sequence analysis is given in figure 4.1.
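
The order of the Tophat2 and Cufflinks steps described above can be sketched as a small Python helper that only builds the command lines and does not run them. This is a minimal sketch under stated assumptions: the file and directory names, the thread count and the function name `tuxedo_commands` are hypothetical, and the options shown (`-G`, `-g`, `-s`, `-L`, `-o`, `-p`) are standard options of these tools rather than the settings recorded for this study.

```python
from typing import Dict, List


def tuxedo_commands(groups: Dict[str, List[str]], bowtie2_index: str,
                    annotation_gtf: str, genome_fasta: str,
                    threads: int = 4) -> List[List[str]]:
    """Build (but do not run) Tophat2 -> Cufflinks -> Cuffmerge -> Cuffdiff commands.

    All paths are hypothetical placeholders chosen for illustration.
    """
    commands: List[List[str]] = []
    bams: Dict[str, List[str]] = {}
    for group, samples in groups.items():
        bams[group] = []
        for sample in samples:
            # Spliced alignment, guided by the reference annotation (-G).
            commands.append(["tophat2", "-p", str(threads), "-G", annotation_gtf,
                             "-o", f"tophat/{sample}", bowtie2_index,
                             f"reads/{sample}.fastq"])
            bam = f"tophat/{sample}/accepted_hits.bam"
            bams[group].append(bam)
            # Reference-guided assembly (-g) so that novel transcripts are reported.
            commands.append(["cufflinks", "-p", str(threads), "-g", annotation_gtf,
                             "-o", f"cufflinks/{sample}", bam])
    # assemblies.txt (hypothetical) lists one transcripts.gtf path per sample.
    commands.append(["cuffmerge", "-p", str(threads), "-g", annotation_gtf,
                     "-s", genome_fasta, "-o", "merged", "assemblies.txt"])
    # One comma-separated list of replicate BAM files per condition.
    commands.append(["cuffdiff", "-p", str(threads), "-o", "cuffdiff",
                     "-L", ",".join(groups), "merged/merged.gtf"]
                    + [",".join(bams[g]) for g in groups])
    return commands


commands = tuxedo_commands(
    {"colostrum": ["CC-1", "CC-2", "CC-3"], "milk": ["MM-1", "MM-2", "MM-3"]},
    bowtie2_index="index/Bos_taurus.UMD3.1",
    annotation_gtf="Bos_taurus.UMD3.1.70.gtf",
    genome_fasta="Bos_taurus.UMD3.1.70.dna.toplevel.fa",
)
```

Returning argument lists instead of executing them keeps the sketch inspectable; each list could be passed to `subprocess.run` once the paths have been checked.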

Figure 4.1 Workflow of sample processing and bioinformatic analysis of RNA datasets.

Results

RNA-seq genome alignments

To determine the RNA content of buffalo milk, three colostrum and four milk RNA

libraries were sequenced. Approximately 7.8 to 13.2 million sequence reads were

obtained in each sample. In the absence of a buffalo genome reference assembly,

Bowtie2 and Tophat2 programs were used to align the reads to the closely related

cow genome sequence and annotation (Ensembl UMD 3.1) in order to map the reads

to exons, across splice junctions, introns and intergenic regions. As shown in table

4.1 using the Tophat2 algorithm, a small proportion of reads were filtered out as poor

quality sequences and 45 to 74 % of reads aligned to the cow genome in individual

samples.
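
The mapping rates in table 4.1 follow directly from the read counts. A minimal sketch that recomputes them from the total and mapped read counts listed in the table is given below; the function and variable names are ours and are not part of the original analysis.

```python
def percent_mapped(total_reads: int, mapped_reads: int) -> float:
    """Percentage of sequenced reads that aligned to the reference genome."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * mapped_reads / total_reads


# (total reads, reads mapped) per sample, copied from table 4.1.
table_4_1 = {
    "CC-1": (8794607, 3990815),
    "CC-2": (11092872, 8220510),
    "CC-3": (8377306, 6043630),
    "MM-1": (11027853, 6733701),
    "MM-2": (10079141, 5884631),
    "MM-3": (7807251, 3816871),
    "MM-4": (13200023, 9075021),
}

rates = {sample: round(percent_mapped(total, mapped), 1)
         for sample, (total, mapped) in table_4_1.items()}
lowest, highest = min(rates.values()), max(rates.values())
```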

Comparison of Bowtie2 and Tophat2 alignments

RNA-seq datasets were mapped to the cow genome using the Bowtie and Tophat2

algorithms. Bowtie was faster than Tophat2 because it aligns RNA sequences directly to the

chromosomes, and thus cannot map reads that span exon splice junctions.

Tophat also uses Bowtie software, but it first aligned RNA-sequences to mRNAs

using Ensembl genomic annotations to facilitate the detection of exon junctions and

then aligned the remaining sequences to chromosomes. Overall, Tophat was able to

perform better alignments than Bowtie. For example, in a comparison of genomic

alignments at the alpha-casein locus using Bowtie and Tophat, Tophat

found all the annotated exons as well as possible new exons in the buffalo

alpha-casein gene (figure 4.2). Although both programs ranked this gene among the most

highly expressed, Tophat ranked this gene as the second most highly expressed with

326647 RPKM compared to third rank at 181486 RPKM using Bowtie. This large

discrepancy is, in part, explained by the presence of a number of short exons in the gene,

illustrating the importance of correctly mapping reads overlapping splice junctions to

more accurately estimate gene expression from transcriptome data.
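
The effect of splice junctions on read mapping can be illustrated with a toy example. The sketch below uses exact string matching on an invented two-exon gene; it is not how Bowtie or Tophat are implemented (both rely on indexed alignment), and the sequences and coordinates are hypothetical. It only shows why a read spanning an exon-exon junction has no contiguous match on the genome but does match the spliced transcript.

```python
from typing import List, Tuple


def spliced_transcript(genome: str, exons: List[Tuple[int, int]]) -> str:
    """Concatenate exon sequences (0-based, end-exclusive coordinates)."""
    return "".join(genome[start:end] for start, end in exons)


# Hypothetical gene: exon 1 (upper case), intron (lower case), exon 2 (upper case).
genome = "AAACCCGGG" + "gtnnnnnnag" + "TTTACGTAC"
exons = [(0, 9), (19, 28)]

transcript = spliced_transcript(genome, exons)
read = transcript[5:13]                    # 8-nt read covering the exon-exon junction

unspliced_hit = genome.find(read)          # contiguous genomic match: none (-1)
spliced_hit = transcript.find(read)        # match on the spliced transcript
```

Reads of this kind are lost by an unspliced aligner, which lowers the read count of genes with many short exons, as observed for the alpha-casein gene.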

Gene expression quantification

After aligning the reads to the cow reference genome, two different programs (the Cufflinks

suite and SeqMonk) were employed to quantify gene expression from RNA-seq

alignments obtained with the Tophat2 program. Cufflinks was used to assemble transcripts and

calculate gene expression values (FPKM, Fragments Per Kilobase of exon per

Million fragments mapped). Cuffmerge from the Cufflinks package was then used to

combine the transcript assemblies for novel or referenced annotations across all

samples. The cow genome reference assembly (UMD 3.1) had a total of 24525

annotated genes and 26679 transcripts, including 20258 multi-exonic transcripts.

After merging these genes with novel genes using the Cufflinks package a total of

30073 genes and 43061 transcripts were compiled from the RNA-seq data. These

included 5657 novel genes and 12009 novel exons. In total 31301 of 43061

transcripts were multi-exon transcripts. The numbers of reference and novel genes

identified in each sample are listed in table 4.2.
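
The FPKM unit used above can be written out as a short calculation. This is a minimal sketch of the plain definition only; Cufflinks estimates transcript abundances with a statistical model before reporting FPKM, so this function is not a reimplementation of Cufflinks, and the example numbers are hypothetical.

```python
def fpkm(fragments: float, transcript_length_bp: float,
         total_mapped_fragments: float) -> float:
    """Fragments Per Kilobase of exon per Million fragments mapped."""
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Hypothetical example: 500 fragments on a 2 kb transcript in a library
# of 10 million mapped fragments.
example_fpkm = fpkm(500, 2000, 10_000_000)
```

Dividing by transcript length and library size makes values comparable between genes of different length and between samples of different sequencing depth.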

The Bioconductor package cummeRbund was used to visualize normalised gene

expression values and differential expression datasets. A plot of dispersion against mean

counts of reads mapped across genes in the colostrum and milk datasets is

shown in figure 4.3a. Both colostrum and milk samples showed similar dispersion.

Cross-replicate variability in gene expression was visualized as the squared

coefficient of variation in figure 4.3b. Gene expression was more variable

between colostrum samples than between milk samples. This is

consistent with the short timeframe of colostrum production during the transition

period leading to milk production after parturition: colostrum samples may

have been collected at slightly different time-points along this transition, when

major changes in gene expression occur, resulting in greater variation. The distributions of

FPKM scores for genes and isoforms between replicates of colostrum and milk

groups were also visualised as density plots (figure 4.3c). In these plots, good

agreement between the distributions of expression values of transcripts and genes

was observed, which supported further downstream statistical analysis.

Sample   Total reads   Reads mapped   Not mapped   Reads filtered out   % Reads mapped
CC-1     8794607       3990815        4282493      521299               45.4
CC-2     11092872      8220510        2823582      48780                74.1
CC-3     8377306       6043630        2325182      8494                 72.1
MM-1     11027853      6733701        4129425      164727               61.1
MM-2     10079141      5884631        3423485      771025               58.4
MM-3     7807251       3816871        3958668      31712                48.9
MM-4     13200023      9075021        4070467      54535                68.8

Table 4.1 Number of raw sequence reads and reads filtered out for impurities. CC:

colostrum samples, MM: milk samples.

Sample name   Novel genes (FPKM > 0.3)   Reference like genes (FPKM > 0.3)
Colostrum
CC-1          11385                      9027
CC-2          11967                      7092
CC-3          11828                      8033
Milk
MM-1          10082                      8535
MM-2          7422                       9648
MM-3          7542                       9402
MM-4          16050                      6980

Table 4.2 Number of novel and previously annotated genes expressed in each

sample.

Genes expressed in the MM-4 sample did not match the gene expression profiles in the other

colostrum and milk samples (above 0.3 FPKM cutoff).

Figure 4.2 Comparison of Bowtie and Tophat alignments on the alpha-casein (CASA1) gene.

In this diagram the top four tracks are genomic annotation tracks of the CASA1 gene, represented by gene, mRNA and coding region tracks. The lower two tracks show the alignments obtained with Bowtie (upper) and Tophat (lower). It can be seen that Bowtie missed several short exons in the middle.

Figure 4.3a Count vs dispersion plot by condition for all genes.

Higher dispersion leads to a lower number of differentially expressed genes.

Figure 4.3b The squared coefficient of variation (CV2) of cross-replicate variability between gene and isoform levels.

Figure 4.3c Density plots of the distributions of gene and isoform FPKM scores across colostrum and milk sample replicates.

Gene expression level (FPKM)    Colostrum   Milk
Highly expressed (> 500)        706         562
Medium expressed (10 - 500)     8733        7336
Low expressed (> 0 to < 10)     12129       12250
Not expressed (= 0)             8505        9925

Table 4.3 Gene expression levels calculated by Cufflinks.

As shown in table 4.3, approximately 2.3 and 1.8 % of genes were highly expressed

in colostrum and milk samples respectively, with 706 genes in colostrum and 562

genes in milk showing an FPKM value greater than 500. Colostrum had a slightly

higher number of genes with medium expression levels (10 – 500 FPKM), 8733

compared to 7336 genes in milk. The number of genes detected at low expression

(above 0 and below 10 FPKM) was comparable in milk and colostrum with 12250 and

12129 genes, respectively. RNA expression was not reported for 8505 known genes

in colostrum and 9925 genes in milk. Overall, buffalo colostrum had approximately

5 % more expressed genes than milk.
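
The expression classes of table 4.3 can be expressed as a simple binning rule. In the sketch below the treatment of the boundary values (exactly 10 or exactly 500 FPKM) is our assumption, since the table does not state it; the counts are copied from table 4.3 and the names are hypothetical.

```python
def expression_class(fpkm_value: float) -> str:
    """Assign a gene to one of the expression classes used in table 4.3."""
    if fpkm_value > 500:
        return "high"
    if fpkm_value >= 10:        # assumption: 10 and 500 FPKM count as medium
        return "medium"
    if fpkm_value > 0:
        return "low"
    return "not expressed"


# Gene counts per class, copied from table 4.3.
colostrum = {"high": 706, "medium": 8733, "low": 12129, "not expressed": 8505}
milk = {"high": 562, "medium": 7336, "low": 12250, "not expressed": 9925}

total_genes = sum(colostrum.values())
expressed_colostrum = total_genes - colostrum["not expressed"]
expressed_milk = sum(milk.values()) - milk["not expressed"]

# Difference in expressed genes as a share of all genes in the merged annotation.
difference_percent = 100.0 * (expressed_colostrum - expressed_milk) / total_genes
```

Both columns add up to the 30073 genes of the merged annotation, and the difference in expressed genes corresponds to a little under 5 % of that total.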

Genome alignments of RNA-seq data were also visualized using SeqMonk software.

The feature probe read count quantitation pipeline was used and datasets were

normalized per million bases and corrected for probe lengths. They were further

normalized by applying the percentile normalization quantitation pipeline in

SeqMonk and grouped into colostrum and milk replicate sets. Figure 4.4 shows the

data tree obtained with hierarchical clustering of the gene expression profiles of all

samples. It can be seen that milk samples MM-1, MM-2 and MM-3 were clustered

together while colostrum samples were not as tightly clustered. We have seen

previously that gene expression estimates were more variable in colostrum and this

may be due to the rapid changes in colostrum during the short transition period to

lactation. It can also be seen that milk sample MM-4 is apparently more similar to

colostrum sample CC-2 than other milk samples. Based on this observation, this

sample was left out of the milk group for further differential expression analysis

between milk and colostrum. Boxplots of the normalized feature probe gene expression distributions

are shown in figure 4.5. Complete lists of genes with expression values are

given in supplementary tables 4.1 and 4.2.
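
The logic behind flagging MM-4 as an outlier can be sketched with a correlation-based nearest-neighbour check. This is a simplified stand-in for the hierarchical clustering performed in SeqMonk, not a reproduction of it, and the four-gene expression profiles below are invented purely for illustration.

```python
import math
from typing import Dict, List


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient between two expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


def nearest_sample(name: str, profiles: Dict[str, List[float]]) -> str:
    """Return the other sample whose profile correlates best with `name`."""
    others = [s for s in profiles if s != name]
    return max(others, key=lambda s: pearson(profiles[name], profiles[s]))


# Hypothetical log-scale expression values for four genes per sample.
profiles = {
    "CC-2": [1.0, 5.0, 2.0, 8.0],
    "MM-1": [6.0, 1.0, 7.0, 2.0],
    "MM-2": [5.5, 1.5, 6.5, 2.5],
    "MM-4": [1.5, 4.5, 2.5, 7.0],
}

# A milk sample whose closest neighbour is not a milk sample is flagged.
outliers = [s for s in profiles
            if s.startswith("MM") and not nearest_sample(s, profiles).startswith("MM")]
```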

Figure 4.4 Data tree of normalized RNA-seq samples from colostrum and milk.

The MM-4 sample is grouped away from the other milk samples and closer to the colostrum CC-2 sample, and is considered a milk outlier. Milk samples were more uniform than colostrum samples.

Figure 4.5 Normalized feature probes of colostrum and milk samples after removing the MM-4 milk sample.

Feature   Description                                 Chr.   Colostrum FPKM   Milk FPKM
CASA1     Alpha-S1-casein                             6      10512.7          38734.6
CASB      Beta-casein                                 6      10424.0          34662.6
E7E1Q6    Beta-lactoglobulin                          11     10011.7          80038.9
CASA2     Alpha-S2-casein                             6      7851.0           16234.9
A3FJ56    Kappa-casein                                6      7312.0           15002.0
FRIH      Ferritin heavy chain                        29     6266.8           3785.6
FTH1      Ferritin, heavy polypeptide 1               15     5246.2           3145.8
TYB4      Thymosin beta-4                             X      4044.7           1789.9
TMSB4X    Thymosin beta 4                             11     3758.3           1649.3
TCTP      Translationally-controlled tumour protein   25     3731.1           4262.5

Table 4.4a Top 10 abundant genes in colostrum.

Feature   Description                                        Chr.   Milk FPKM   Colostrum FPKM
E7E1Q6    Beta-lactoglobulin                                 11     80038.9     10011.7
CASA1     Alpha-S1-casein                                    6      38734.6     10512.7
CASB      Beta-casein                                        6      34662.6     10424.0
LALBA     Alpha-lactalbumin                                  5      19696.5     2968.7
GLCM1     Glycosylation-dependent cell adhesion molecule 1   5      18059.2     1830.1
CASA2     Alpha-S2-casein                                    6      16234.9     7851.0
A3FJ56    Kappa-casein                                       6      15002.0     7312.0
RLA1      60S acidic ribosomal protein P1                    10     6833.7      3300.1
TCTP      Translationally-controlled tumour protein          25     4262.5      3731.1
TCTP      Translationally-controlled tumour protein          12     4163.5      3642.1

Table 4.4b Top 10 genes expressed in milk.

Highly expressed genes in buffalo colostrum and milk

CASA1, the gene encoding Casein alpha-S1, was the most highly expressed gene in

colostrum, followed by beta-casein and beta-lactoglobulin. The 10 most highly

expressed genes in colostrum are listed in table 4.4a. Among these genes were also

FRIH and FTH1, encoding two ferritin heavy subunits, intracellular iron storage

proteins found in prokaryotes and eukaryotes. These proteins store iron in a soluble,

non-toxic, readily available form and also play a role in delivery of iron to cells

(Hentze et al., 1986, Lawson et al., 1991). Two other highly expressed genes, on

chromosomes X and 11 respectively, encoded thymosin beta-4, an actin-sequestering

protein that plays a role in the regulation of actin polymerization and

is thus involved in cell proliferation, migration, and differentiation.

As expected, most of the highly expressed genes in milk were caseins and whey

proteins. Beta-lactoglobulin (E7E1Q6) was the most highly expressed gene. As listed in

table 4.4b, well known milk proteins alpha-lactalbumin (LALBA), beta-casein

(CASB), alpha-S1-casein (CASA1) and kappa-casein (A3FJ56) were among the

most highly expressed genes. Glycosylation-dependent cell adhesion molecule

(GLCM1) and translationally controlled tumor protein (TCTP) were also highly

expressed genes. GLCM1 is also known as GlyCAM 1. This protein is secreted by

lactating mammary epithelial cells (Groenen et al., 1995, Dowbenko et al., 1993).

Translationally-controlled tumour protein is well known to bind calcium ions and

microtubules (Bommer and Thiele, 2004, Xu et al., 1999) and milk is high in calcium

ions. The 60S acidic ribosomal protein P1 (RLA1) was also highly expressed and a

large number of other ribosomal proteins were identified, apparently indicating a

high level of protein synthesis, also compatible with the activity of milk producing

mammary epithelial cells.

Overall, transcripts of all known major milk proteins were prominent in the RNA-seq data

collected from colostrum and milk cells. As major milk proteins are expressed

predominantly in mammary epithelial cells, the results suggest that a large

proportion of colostrum and milk cells are derived from the mammary epithelia,

and that milk cell transcriptomics provides a non-invasive approach to

examine gene expression status of epithelial cells during lactation in buffalo.

Differential gene expression analysis of colostrum and milk somatic cells

As discussed previously in the methods section, two separate algorithms, Cuffdiff (from

the Cufflinks package) and SeqMonk, were employed to analyse differentially

expressed genes in colostrum and milk cells.

Cufflinks:

Three colostrum and three milk samples were compared after merging Cufflinks

output annotation files using the Cuffmerge program and submitting the full data set

to Cuffdiff for statistical analysis of differential gene expression between colostrum

and milk groups. Overall, 136 contigs (51 milk enriched and 86 colostrum enriched)

were identified as statistically significantly differentially expressed (figure 4.6).

These included 59 novel and 77 annotated genes in the cow genome reference used

(supplementary table 4.3).

SeqMonk:

Cufflinks software was useful for assembling novel genes and transcripts, but lacks

capabilities for data visualisation. In contrast, owing to its visual interface, SeqMonk

provided more data mining features and visualization options to investigate the

RNA-seq dataset. Figure 4.7a shows a scatter plot of gene expression in colostrum

versus milk. It can be seen that, in general, there is a high level of correlation between

the genes expressed in colostrum and milk cells, illustrated by a general linear trend

along the diagonal. Above the diagonal it can be seen that a few genes are apparently

expressed at higher levels in milk, in particular some highly expressed genes

representing the major caseins and whey proteins.

In order to address differential gene expression more specifically, intensity

difference filtering with multiple test corrections (P-value < 0.05) was applied to

filter differentially expressed genes. A total of 186 genes were reported as

differentially expressed, with 42 genes enriched in colostrum and 144 genes in milk.

A majority of genes in this group of 186 genes were also reported within the 136

genes found to be statistically differentially expressed by Cuffdiff above. For

example, 6 of the ten colostrum- or milk-enriched genes shown in tables 4.5a and

4.5b were included in both results. SeqMonk software can find

novel exons, but cannot assemble novel multi-exonic transcripts on a genome wide

scale. This explains, in part, why fewer genes were identified with Cuffdiff. The ten

most abundant colostrum-enriched genes, sorted based on the difference in gene

expression in colostrum and milk, are listed in table 4.5a. Similarly, the top 10

milk-enriched genes are listed in table 4.5b. When we further relaxed the statistical filter

by removing the multiple testing correction in order to harvest more candidate genes,

we found 2212 genes to be differentially expressed (P-value < 0.05, figure 4.7b).

They were further grouped into two clusters of 1003 genes enriched in milk and 1211

colostrum-enriched genes.
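
The drop from several thousand to a few hundred genes when the multiple testing correction is applied can be illustrated with a small example. The text does not state which correction was used; the sketch below assumes a Benjamini-Hochberg false discovery rate adjustment, a common choice, and the P-values are hypothetical.

```python
from typing import List


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical P-values for eight genes.
raw = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
adjusted = benjamini_hochberg(raw)

significant_raw = sum(p < 0.05 for p in raw)
significant_adjusted = sum(p < 0.05 for p in adjusted)
```

In this toy set five genes pass the uncorrected threshold but only two remain after adjustment, which mirrors the direction of the difference between the 2212 uncorrected and 186 corrected genes reported above.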

Figure 4.6 Differentially expressed genes in colostrum and milk identified by the Cuffdiff algorithm.

Figure 4.7a Scatterplot of genes expressed in colostrum and milk.

Figure 4.7b Scatterplot of differentially expressed genes in colostrum and milk.

186 significantly differentially expressed genes (P-value < 0.05) were clustered into 42 colostrum-enriched and 144 milk-enriched genes (orange). Red (colostrum enriched) and green (milk enriched) dots show the genes differentially expressed when the multiple testing correction filter is not applied.

Gene           Description                               Col.   M.M.   Diff.
bTrappin-5     Peptidase inhibitor activity              4.8    -1.0   5.8
A1A4Q6         Elafin (Peptidase inhibitor)              3.7    -1.3   5.0
bTrappin-4     Peptidase inhibitor activity              5.9    0.9    5.0
LOC100295842   Trappin-6-like                            4.7    -0.2   4.9
A6H734         Plexin domain-containing protein 1        -1.4   -6.0   4.6
LOC100297932   Bos taurus interstitial collagenase-like  0.3    -4.3   4.6
Q9XS50         Vascular endothelial growth factor C      -3.0   -7.4   4.4
MMP1           Interstitial collagenase precursor        0.5    -3.8   4.3
F1MRD4         Alpha-actinin-2                           -3.2   -7.2   4.0
LOC782545      No description                            -1.0   -4.9   3.9

Table 4.5a Top 10 most statistically significantly colostrum-enriched genes with log

normalized expression values and differences.

Feature        Description                           Col.   M.M.   Diff.
CCDC108        Coiled-coil domain containing 108     -5.2   2.1    7.4
E1BLR5         Uncharacterized protein               -4.8   2.2    7.0
A4IFV9         Proteoglycan 3                        -5.8   0.9    6.7
DJC12          Dnaj homolog subfamily C member 12    -3.6   2.7    6.3
PNPLA3         No description                        -7.5   -1.4   6.2
CRBA2          Beta-crystallin A2                    -3.7   2.1    5.8
A4IFU0         Proteoglycan 3                        -3.8   1.6    5.4
G3N0C7         No description                        2.5    7.6    5.1
ENHO           Adropin                               -1.4   3.7    5.1
LOC100297709   No description                        -6.5   -1.6   4.9

Table 4.5b Top ten most statistically significantly milk-enriched genes with log

normalized expression values and differences.

Comparison of buffalo and cow milk transcriptomes

A study of the bovine milk transcriptome was published in 2012 (Wickramasinghe et al.,

2012). In this study, cow milk RNA contents at day 15 (early lactation), 90 (mid

lactation) and 250 (late lactation) were sequenced. The RNA-seq data generated in

this study were not released publicly and only a list of the most highly expressed

genes at each stage of lactation was reported in the publication. In the absence of

the RNA-seq data, a whole transcriptome level comparison of cow and buffalo could not

be performed. The report described genome level expression of genes in three

categories and found 140 genes (gene list not published) in mid-lactation with higher

than 500 RPKM values. As shown in table 4.3, in buffalo milk we found a

comparable number of 562 genes with more than 500 RPKM. We also found 9925

genes not expressed at all compared to 8274 genes reported in bovine milk. Similar

to their findings, we found genes encoding LGB (β-lactoglobulin), CSN2

(β-casein), CSN1S1 (α-S1-casein), LALBA (α-lactalbumin), CSN3 (κ-casein),

GLYCAM1 (glycosylation dependent cell adhesion molecule-1) and CSN1S2

(α-S2-casein) to be highly expressed during lactation.

Ontology and pathway enrichment analysis of colostrum and milk enriched

genes

Ensembl gene IDs were used for investigating the enrichment of ontology terms

related to colostrum and milk enriched genes. Functional annotation tools from

DAVID Bioinformatics Resources 6.7 and the Panther database were used for ontology

analysis. Colostrum enriched genes mapped to 35 DAVID ontology database IDs, grouped into 4

clusters containing terms related to extracellular matrix with an enrichment score

(E.S.) of 1.48, cell junctions and parts of the plasma membrane (E.S. 1.21), calcium,

zinc and metal ion bindings (E.S. 0.69) and glycoproteins (E.S. 0.55).

Genes significantly enriched in milk were clustered into 17 groups with terms related

to milk and mammary gland proteins (E.S. 5.08), casein (E.S. 4.64), coenzyme and

cofactor binding (E.S. 1.54) and immunoglobulin subtype (E.S. 0.53).
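
To help interpret the enrichment scores quoted above, the sketch below assumes the definition given in the DAVID documentation for functional annotation clustering, in which the score of a cluster is the geometric mean of its member P-values expressed on a -log10 scale. Under that assumption a score of about 1.3 corresponds to a P-value of 0.05. The function names are ours.

```python
import math
from typing import List


def cluster_enrichment_score(pvalues: List[float]) -> float:
    """Geometric mean of the member P-values, reported on a -log10 scale."""
    if not pvalues or any(p <= 0 or p > 1 for p in pvalues):
        raise ValueError("P-values must lie in (0, 1]")
    return -sum(math.log10(p) for p in pvalues) / len(pvalues)


def score_to_pvalue(score: float) -> float:
    """Geometric-mean P-value corresponding to an enrichment score."""
    return 10.0 ** (-score)


# Enrichment scores of the milk-enriched clusters reported above.
milk_clusters = {
    "milk and mammary gland proteins": 5.08,
    "casein": 4.64,
    "coenzyme and cofactor binding": 1.54,
    "immunoglobulin subtype": 0.53,
}
equivalent_p = {term: score_to_pvalue(es) for term, es in milk_clusters.items()}
threshold_score = cluster_enrichment_score([0.05, 0.05])
```

Read this way, clusters with scores well below 1.3, such as immunoglobulin subtype (E.S. 0.53), should be treated as weak evidence of enrichment.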

We also investigated colostrum-enriched genes that were statistically significant (P-

value < 0.05) without applying a multiple corrections filter. Associated ontology

terms were related to blood vessel development, morphogenesis, angiogenesis and

vasculature development (E.S. 4.2), extracellular region parts (E.S. 3.65), chordate,

embryonic and in utero development (E.S. 3.34), cell motion, migration, mobility and

localization (E.S. 3.26), blood vessel development and morphogenesis (E.S. 2.52) and

mesenchymal cell differentiation, development and migration (E.S. 2.21). The

functional annotation chart for colostrum-enriched genes is given in table 4.6.

There were 1003 milk-enriched genes that were statistically significant (P-value <

0.05) without applying the multiple corrections filter. They were enriched for milk,

milk proteins and caseins (E.S. 4.02), nucleotide and GTP binding (E.S. 2.15),

mitochondria-related (E.S. 1.93) and amino acid biosynthesis process (E.S. 1.7) ontology terms.

The PANTHER database was also used to compare the enrichment of

ontology terms between buffalo colostrum and milk genes. As shown in figure 4.7a,

genes in colostrum were related to binding (32%) followed by catalytic activity

(24%), whereas milk genes had functions related to catalytic activity (32%) closely

followed by binding (30%).

Genes differentially expressed in colostrum were enriched for focal adhesion (P-

value: 1.80E-08), ECM-receptor interaction (4.70E-06), axon guidance (4.80E-05),

pathways in cancer (9.60E-05) and cell adhesion molecules (2.30E-04) pathways in

the KEGG database. In the focal adhesion pathway 28 genes were enriched in

colostrum including actin-beta (ACTB), actin-gamma (ACTG1), caveolin 1 & 2

(CAV1, CAV2), epidermal growth factor receptor (EGFR) and vascular endothelial

growth factor C (VEGFC). As shown in figures 4.8a and 4.8b, colostrum enriched

genes were involved in focal adhesion and ECM-receptor pathways. This included

many variants of integrin (ITGA9, ITGA2, ITGAV, ITGB5, ITGB6) and laminin

(LAMA3, LAMA5, LAMB3 & LAMC2). Integrins mediate attachment of the cell to the

ECM and signal transduction from the ECM to cells (Hynes, 2002, Shattil et al., 2010).

Laminins are an integral part of the basal lamina. They bind to cell membranes through

integrin receptors (Beck et al., 1990). Axon guidance pathway (figure 4.8c)

contained 17 colostrum genes including ephrin receptors (EPH) A2 & B4, plexin B1

& B3 (PLXNB1, PLXNB3) and sema-domain containing genes (SEMA3A,

SEMA3C, SEMA4B & SEMA6A). Ephrin receptors are known to play important

roles in axon guidance, cell migration and formation of tissue boundaries (Egea and

Klein, 2007, Rohani et al., 2011, McClelland et al., 2010). Plexin proteins act as

receptors for sema (semaphorin) ligands and are important for neural system development

(Roney et al., 2013, Nkyimbeng-Takwi and Chapoval, 2011, Staton et al., 2011). All

these genes likely reflect the changes in the mammary gland during secretory

activation in early lactation when most mammary glands begin to produce milk.

Figure 4.7a Functional classification of colostrum-enriched genes, shown as a pie chart.

The largest share of genes was related to binding (32%), followed by catalytic

activity (24%).

Figure 4.7b Functional classification of milk-enriched genes, shown as a pie chart.

The largest share of genes had functions related to catalytic activity (32%), closely

followed by binding (30%).

Term P-Value

Regulation of cell motion 1.90E-06

Regulation of cell migration 2.70E-06

Regulation of locomotion 4.00E-06

Glycosylation site:N-linked 6.20E-06

Blood vessel morphogenesis 9.80E-06

Signal peptide 1.20E-05

Focal adhesion 1.40E-05

Extracellular region part 2.50E-05

Basic-leucine zipper (bzip) transcription factor 3.30E-05

Tube development 3.40E-05

Positive regulation of cell motion 3.80E-05

Response to protein stimulus 3.80E-05

Cell fraction 5.40E-05

Cell-matrix adhesion 6.20E-05

Cell motion 6.40E-05

Enzyme linked receptor protein signaling pathway 7.50E-05

Response to endoplasmic reticulum stress 7.80E-05

Protein amino acid phosphorylation 9.80E-05

Blood vessel development 1.00E-04

Angiogenesis 1.10E-04

Positive regulation of cell migration 1.30E-04

Vasculature development 1.40E-04

BRLZ 1.40E-04

Cell-substrate adhesion 1.50E-04

Plasma membrane part 2.00E-04

Cell migration 2.20E-04

Chordate embryonic development 2.30E-04

Extracellular region 2.30E-04

Cell surface 2.40E-04

Embryonic development ending in birth or egg 2.50E-04

TGF-beta signaling pathway 2.70E-04

ER-nuclear signaling pathway 2.70E-04

Regulation of cellular protein metabolic process 2.90E-04

Disulfide bond 3.40E-04

Phosphorylation 3.70E-04

Positive regulation of locomotion 3.80E-04

Signal 3.90E-04

Phosphorus metabolic process 4.00E-04

Phosphate metabolic process 4.00E-04

Regulation of cell proliferation 4.00E-04

Protein complex binding 5.40E-04

Localization of cell 5.80E-04

Cell motility 5.80E-04

Pathways in cancer 7.80E-04

Transmembrane receptor protein tyrosine kinase signaling 7.90E-04

Glycoprotein 8.20E-04

Endoplasmic reticulum unfolded protein response 9.50E-04

Cellular response to unfolded protein 9.50E-04

Muscle cell differentiation 9.80E-04

Protein amino acid autophosphorylation 9.90E-04

Table 4.6 Ontology classification of colostrum-enriched genes (P-value < 0.05,

without applying multiple corrections filter)

Figure 4.8a Focal adhesion pathway. Colostrum enriched genes are denoted by

stars.

Figure 4.8b ECM receptor interaction pathway.

Figure 4.8c Axon guidance pathway.

Figure 4.9 Annotated miRNA cluster on chromosome 7.

MiRNAs in PolyA selected transcriptome

As described in the previous chapter, high quantities of miRNAs were reported in the

small RNA datasets (size selected for 18 – 40 nt.). To look for pre-miRNAs in

polyA-selected datasets we used the contig probe generator in SeqMonk to define

probes for chromosome regions aligned with more than 50 reads (to screen for highly

expressed miRNAs). This resulted in 121635 probes overlapping with 24 annotated

pre-miRNAs. Closer inspection of these miRNAs showed that all but 7 probes

mapped inside exons and represented coding gene expression instead of precursor

miRNA. From a pool of intronic miRNAs, 3 (miR-27a, miR-24-2, miR-23a) were

from a cluster on chromosome 7 (figure 4.9). However, alignments of this sequence

to the nr database at NCBI demonstrated that this was an unannotated ribosomal

sequence. Therefore we could not find any clear evidence for the expression of

precursor (poly-adenylated) miRNA transcripts in the RNA-seq data. Whether this is

due to absence of expression, technical limitations in the co-purification of precursor

miRNA, or rapid turnover of miRNA in somatic milk cells remains to be established.
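The probe definition step described above (chromosome regions covered by more than 50 aligned reads) can be sketched as follows. This is not SeqMonk's implementation. It is a simplified illustration that merges overlapping read intervals on one chromosome into contigs and keeps those above a read-count threshold. The function name, parameter name and toy coordinates are hypothetical.

```python
def build_contig_probes(reads, read_threshold=50):
    """Merge overlapping (start, end) read intervals into contigs.

    Returns (start, end, read_count) for contigs supported by more than
    read_threshold reads.
    """
    contigs = []
    for start, end in sorted(reads):
        if contigs and start <= contigs[-1][1]:
            # Read overlaps the current contig: extend it and count the read.
            contigs[-1][1] = max(contigs[-1][1], end)
            contigs[-1][2] += 1
        else:
            # No overlap: open a new contig seeded by this read.
            contigs.append([start, end, 1])
    return [tuple(c) for c in contigs if c[2] > read_threshold]


# Toy data: 60 reads piled on one locus, 2 overlapping reads on a second
# locus and a single read on a third.
reads = [(100 + i, 150 + i) for i in range(60)]
reads += [(1000, 1050), (1010, 1060), (5000, 5050)]
probes = build_contig_probes(reads)
```

Only the first locus survives the threshold here. Each surviving probe would then be compared with annotated pre-miRNA coordinates.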

Discussion

In this study the transcriptomes of somatic cells isolated from buffalo colostrum and

milk were characterized using Illumina RNA-seq technology. The Tophat and Bowtie

software programs were used for genome alignments. Bowtie does not use genome

annotation to align sequenced RNA reads with chromosomes, whereas Tophat first

aligns sequence reads to mRNAs using gene annotation information and then aligns

the remaining reads to the chromosomes. Tophat therefore produced better

alignments by allowing reads to map across the splice sites of mature mRNAs.

Although Tophat took longer to perform genome alignments, it aligned the

sequence reads to annotated genes more effectively. This study discovered 5657 novel buffalo

genes while identifying a total of 30073 genetic loci and revealed the molecular

sequence of 43061 buffalo transcripts, including milk protein genes. Significant

differences were observed in the transcriptomes of colostrum and milk samples.

Similar to a previous report of bovine somatic milk cell transcriptome, buffalo

colostrum had more diverse gene expression compared to milk (Wickramasinghe et

al., 2012). Similar to human milk, ferritins were found to be highly expressed in cells

from buffalo colostrum (Lemay et al., 2013).

As expected, the most highly expressed genes in milk were well-known milk

proteins. Casein genes accounted for 4 of the top 6 expressed genes. The

beta-lactoglobulin precursor was the most highly expressed gene, followed by

alpha-lactoglobulin, beta-casein and alpha-casein. High concentrations of major milk

proteins and glycosylation-dependent cell adhesion molecule RNAs also pointed to

the mammary epithelial origin of cells isolated from milk.

Colostrum enriched genes

bTrappin-5 was the most significantly differentially expressed gene in cells from

colostrum (P-value < 0.05). Notably, bTrappin-4 and Trappin-6

(LOC100295842) were also prominent differentially enriched genes in colostrum

(table 4.5a). These genes are associated with negative regulation of peptidase activity

in the extracellular region. Along with Elafin (A1A4Q6), which is also a peptidase

inhibitor, Trappins were found to be the most highly expressed genes identified. Plexin

domain-containing protein 1 (A6H734) carries the plexin domain, a disulphide-rich motif with an alpha+beta

core structure that contains two conserved disulphides. Plexin receptors are involved

in the development of neural and epithelial tissues (Basile et al., 2007) and regulate

axon guidance, immune function and angiogenesis (Love et al., 2003). Next on the

list was Bos taurus interstitial collagenase-like (LOC100297932), and Vascular

endothelial growth factor C (Q9XS50), a protein predicted to play a role in organ

morphogenesis and lymphangiogenesis through positive regulation of cell division

and proliferation (Orpana and Salven, 2002). The matrix metalloproteinase-1 (MMP-1) gene,

also known as interstitial collagenase or fibroblast collagenase, was also

colostrum enriched. The MMP-1 enzyme holds five metal ions, three Ca2+ and two Zn2+

(Spurlino et al., 1994). It breaks down the interstitial collagens and is involved

in tissue remodelling (Reviglio et al., 2003). It may therefore play an essential role in

transforming non-lactating mammary tissue to the lactating stage. Next on the list was

Alpha-actinin-2 (F1MRD4). This protein plays a role in actin crosslink formation

and focal adhesion assembly (Zimin et al., 2009, Lu et al., 2009). The last gene in

this list was a predicted interleukin 4 induced 1-like gene (LOC782545). Another

highly expressed gene in colostrum was connective tissue growth factor (CTGF),

also known as CCN2. CTGF plays important roles in cell adhesion, migration,

proliferation, angiogenesis and skeletal development (Hall-Glenn and Lyons, 2011).

Another important gene differentially enriched in colostrum was the toll-like receptor

adaptor molecule 1 (TICAM1). This gene is a critical component of the innate

immune system (West et al., 2006) and was also found to be highly expressed in

milk.

These preliminary results show that the major genes expressed in the cells from

colostrum code for proteins involved in tissue remodelling. The activities of a

combination of extracellular protease inhibitors and growth factors are most likely

the key morphological effectors of secretory activation in the mammary gland during

the transition from pregnancy, during which the mammary gland is primed for milk

production, towards full milk secretion into lactation.

Milk enriched genes

The most differentially expressed gene in milk was CCDC108 (coiled-coil domain

containing 108), a coiled transmembrane protein followed by E1BLR5, an

uncharacterized protein (table 4.5b). Two milk enriched proteoglycan genes A4IFV9

and A4IFU0 are major components of the animal extracellular matrix. DnaJ homolog

subfamily C, member 12 (DJC12) is associated with complex assembly, protein

folding and export (GeneCards). Patatin-like phospholipase domain-containing

protein 3 (PNPLA3) also known as adiponutrin is involved in the balance of energy

usage/storage in adipocytes (Kienesberger et al., 2009) indicating a possible role in

regulating the overall increase in metabolism of lactating mammary glands. Beta-

crystallin A2 (CRBA2) gene was next on the list. This protein is known to inhibit

cell toxicity associated with amyloid fibril formation by k-casein protein misfolding

(Dehle et al., 2010). The energy homeostasis associated gene ENHO was at ninth

place. It is predicted to be involved in the regulation of glucose homeostasis and lipid

metabolism. Overall, milk-enriched genes were involved in the synthesis of

carbohydrates and lipids. An increase in genes involved in immune function, protein

synthesis and export machinery was indicated. However, the top differentially

expressed genes have little associated literature, so they are potentially good

subjects for further functional investigations of their role in regulating metabolism

and mammary gland physiology during lactation.

For ontology analysis, the number of genes that were robustly and significantly differentially

expressed was low. To mine further ontological terms associated with

colostrum or milk gene expression, the criteria for gene selection were relaxed by

removing the multiple-testing correction filter, resulting in approximately 1000 differentially

expressed genes in either colostrum or milk. Colostrum enriched genes were also

significantly enriched for biological ontology terms associated with development and

extracellular matrix remodelling. Genes expressed in cells from colostrum were also

highly associated with blood vessel and vasculature development, cell mobility,

differentiation and localization related activities with the addition of proteins related

to immune defence. These observations reflect changes in somatic mammary cell

gene expression during the transition from a non-lactating to the lactating stage. On

the other hand, milk cells expressed more genes with catalytic activity than colostrum cells, with

nutrition (milk protein) and immune-related genes expressed at higher levels.

In summary, buffalo milk RNA-seq reference datasets were produced across

colostrum and milk cell samples. Preliminary results showed the presence of thousands

of novel genes and transcripts. Downstream differential expression analysis

suggested that somatic cell transcriptomics provides insight into the bio-molecular

events happening in the mammary gland during lactation. For example, ontology

analysis provided an insight into the possible functions of the genes identified in

buffalo milk cells from the mammary gland. These findings may boost the research

in buffalo lactation genomics and help future food security in fast growing

developing nations where buffalo milk is an important resource. Further

experimental studies are needed to probe the significance of presence and transfer of

these important RNA molecules through milk. As milk cells are directly transferred

to the baby in the mother's milk, their function could be to deliver bioactivities to the

gut of the newborn, thus contributing nutrition, immune-related elements or other

activities toward the wellbeing of the newborn.

Chapter 5

Comparative analysis of milk miRNAs in seven species

Introduction

Understanding the molecular components of milk has been a central point of focus in

lactation biology. In addition to comparative genomics, analysis of protein

composition changes and gene expression alterations, the molecular basis of

mammary cell biology has been investigated from an evolutionary perspective

(Lefevre et al., 2010b, Hayssen, 1993). More recently, microRNAs (miRNAs) have

been reported in bovine and human milk and this discovery has opened a new line of

inquiries on the composition and bioactivity of milk miRNAs (Munch et al., 2013,

Chen et al., 2010a, Gu et al., 2012). MiRNAs are short (~ 22 nucleotides) noncoding

RNAs involved in most biological processes as key regulators of post-transcriptional

gene expression (Gangaraju and Lin, 2009, Yates et al., 2013, Bartel, 2004). Like

many protein-coding and other non-coding genes, miRNAs have been found to be

evolutionarily conserved and are thought to be involved in the regulation of a wide

range of biological processes of multicellular organisms (Cuperus et al., 2011). The

mammary gland is a complex organ with a repeating developmental cycle, comprised

of successive waves of cell growth, differentiation, metabolic activity, apoptosis and

tissue remodeling. This implies a process highly controlled by precise temporal and

spatial gene regulation. Over the last two years, high levels of miRNAs have been

reported in the milk of eutherian mammals (human, mouse and cow) (Chen et al.,

2010b, Hata et al., 2010, Kosaka et al., 2010, Zhou et al., 2012a), but no study has

yet looked at the comparative analysis of miRNA recovered in milk from a larger

number of more diverse mammalian species. The potential use of miRNAs as

biomarkers and their putative role in mammary gland physiology, together with their

potential effects on growth and development of the young, needs to be investigated

further. In this study, sequencing technology has been employed to analyse and

compare the miRNA composition of milk from a large representative set of

mammals, including monotremes, marsupials and eutherians. In addition, temporal

changes in milk miRNA during the marsupial lactation cycle were investigated in the

tammar wallaby (Macropus eugenii).

Previously, milk cell and mammary gland transcriptomics experiments have revealed

conserved milk proteins and other non-coding RNA components of the lactation

system of monotreme, marsupial, and eutherian lineages, providing useful insight

into the function of specific milk proteins in the control of lactation (Nicholas et al.,

1995, Lemon and Bailey, 1966). However little is known about the role of newly

identified miRNA secretions in milk. In this chapter, miRNAs recovered from the

milk from eight species were investigated using high throughput miRNA sequencing

datasets from human, buffalo, wallaby, platypus, echidna, pig, alpaca and seal.

Buffalo milk miRNAs were discussed in a previous chapter. Here, in order to assess

the common and species-specific miRNA components of milk, we enlarged the study

by quantifying milk miRNA in a larger number of species to allow a more significant

comparison throughout the mammalian kingdom.

Monotremes (platypus and echidna) are regarded as the best living representatives of

early mammals with a more primitive prototherian lactation system. These egg-

laying mammals are now confined to Australia and New Guinea and display a

relatively long lactation cycle compared to gestation and egg incubation with

reported changes in milk protein content (Kuruppath et al., 2012). Much of the

development of the monotreme young occurs after hatching and before weaning and

the role of the milk in this process is yet to be fully examined. Genomics and

transcriptomics approaches have enabled the molecular analysis of cells in milk from

monotremes (Lefevre et al., 2009, Sharp et al., 2007, Warren et al., 2008). Two

libraries of small RNAs from milk samples were sequenced from both platypus and

echidna.

Marsupials (tammar wallaby: Macropus eugenii) present one of the most

sophisticated lactation programs yet described. After a short gestation, marsupials

give birth and the young tammar is totally dependent upon the progressive changes

in milk composition for normal growth and development during the extended

lactation period (Tyndale-Biscoe, 1987, Brennan et al., 2007, Nicholas et al., 2012a).

Development of tammar wallaby young after birth from day 0 to 350 is shown in

figure 5.1. Moreover, lactating females can produce milk of different composition

from adjacent mammary glands, a phenomenon called concurrent asynchronous

lactation, demonstrating the importance of local control of the lactation program

including major shifts in the relative milk content in carbohydrates, lipids and

proteins as well as the phase specific expression of major milk proteins (Nicholas et

al., 1995, Wanyonyi et al., 2013). Therefore we can anticipate changes in milk

miRNA concomitant with changes in protein, lipid and carbohydrate composition of

tammar milk. In this study, tammar wallaby milk miRNA libraries from five

different lactation stages were sequenced. In addition, two small RNA libraries from

blood serum of lactating mothers were also sequenced in order to assess the putative

transfer of blood miRNA into milk or infer, indirectly, a more direct contribution

from mammary gland miRNA biogenesis.

Figure 5.1 Development of tammar wallaby young after birth from day 0 to 350.

Eutherians give birth to relatively more developed young owing to a longer gestation

period. The large majority of extant mammalian species, including humans, belong

to this group of mammals and a diversity of reproductive adaptations of the lactation

cycle can be encountered throughout the eutherian radiation. For example, the Antarctic

fur seal (Arctocephalus gazella) is another interesting model to study lactation.

Females nurse the offspring for at least 4 months, and then depart the breeding

colony to forage at sea and, for the remainder of lactation, alternate between short

periods ashore suckling their young with longer periods at sea foraging (Cane et al.,

2005, Arnould and Boyd, 1995). However, the ability of these animals to maintain

active lactation during extended foraging periods of up to 3 weeks is quite unique as

in most other mammals the interruption of suckling quickly leads to a rapid

involution of the mammary gland that becomes irreversible after a few days. Thus,

the study of milk miRNA in Antarctic fur seal may allow the analysis of the role that

miRNAs may have played in the rapid adaptation of the lactation system in this

family. In a separate study, we have also reported a possible involvement of miR-21

in this process. Here, one small RNA library from fur seal milk collected during

onshore lactation was sequenced to profile milk miRNA in this species.

A comparison of milk miRNAs from a larger set of eutherian species (human,

buffalo, pig and alpaca) may allow a more detailed analysis of the functional

evolution of specific molecular components of lactation and the identification of both

species specific (biomarkers) and broadly conserved miRNAs (essential for

lactation). Meunier and colleagues proposed that an expansion of the miRNAs gene

catalogue has occurred in placental and marsupial animals (Meunier et al., 2013).

They also reported a correlation between higher miRNA conservation and miRNA

expression among 6 organs (not mammary gland), suggesting that miRNA may have

played an important role in mammalian evolution. Here we examine the evolution of

milk miRNA secretion with the comparison of milk miRNA profiles obtained with

platypus, echidna, wallaby, buffalo, seal and alpaca and publicly available data from

human and pig milk miRNA profiling.

Materials and methods

Milk sample collections

Wallaby milk samples:

Tammar wallaby (M. eugenii) milk samples were collected from the Deakin

University wallaby facility (May–August 2012) and the Canberra CSIRO Wallaby

facility (June-September 2012). The wallaby lactation cycle is commonly divided into

three major phases, with phase 2 split in two: phase 1: pregnancy, phase 2A: first 100 (1 to 90-110) days,

phase 2B: 100-200 days, phase 3: 200-330 days. To examine the temporal variation

of milk miRNA secretion, milk was obtained from mothers at 30, 75, 120, 175 and

250 days (phase 3) of lactation. These samples were chosen to cover the main

transitions between phase 2A and 2B and phase 2B and 3 as well as the initial period

of phase 2A.
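The phase boundaries given above can be turned into a small lookup that assigns each sampling day to a lactation phase. This is a sketch only: the cut-offs at exactly day 100 and day 200 are an assumption, since the text gives the first boundary as a range (90-110 days), and the function name is hypothetical.

```python
def lactation_phase(day):
    """Map a day of lactation to a tammar wallaby lactation phase.

    Uses nominal boundaries of 100 and 200 days; the true transitions
    are gradual, so days near a boundary are only approximately assigned.
    """
    if day < 1 or day > 330:
        raise ValueError("day is outside the 1-330 day lactation window")
    if day <= 100:
        return "2A"
    if day <= 200:
        return "2B"
    return "3"


# Sampling days listed in the text.
sample_days = [30, 75, 120, 175, 250]
phases = [lactation_phase(d) for d in sample_days]
```

Under these nominal boundaries the five samples fall into phases 2A, 2A, 2B, 2B and 3, so each phase from 2A onward is represented.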

Platypus and Echidna milk samples:

Dr Thomas Ritchie Grant from the University of New South Wales provided the platypus

milk samples. Platypus milk was collected on 13/12/2011 from animals in peak

lactation in the Upper Shoalhaven River, NSW, Australia. Assoc. Prof. Stewart Nicol

from the University of Tasmania provided echidna milk samples from the peak lactation

stage. Animals were anaesthetized and injected intramuscularly with 0.2ml of

synthetic oxytocin (2 IU, Syntocin, Sandoz). The mammary glands were gently

massaged and milk collected.

Seal and Alpaca milk samples:

The milk sample from an onshore nursing seal was collected in March 2003 during

the 2002–2003 breeding season. The female was selected at random and captured

using a hoop-net (Fuhrman Diversified, Flamingo, TX, USA) (Lavigne, 1986). The

animal was given an intramuscular injection (ca. 0.15 mg kg−1) of the

sedative Midazolam (Hypnovel, Roche Products, Dee Why, New South Wales,

Australia) to facilitate handling and reduce capture stress. The sedated seal was

transferred to a restraint board, and following an intramuscular injection (0.5–1.0

mL, 10 IU mL−1) of oxytocin (Heriot Agvet, Rowville, Victoria, Australia) a small

milk sample (30 mL) was collected by manual expression. The milk sample was

stored at −80 °C until analyzed. An alpaca milk sample was obtained from an alpaca

farm near Deakin University at Waurn Ponds, Victoria.

Human and pig milk samples:

Human and pig milk miRNA datasets from multiple lactation stages were

downloaded from the publicly available Gene Expression Omnibus (GEO) Database

(Barrett et al., 2011, Barrett et al., 2013). Four replicates of non-coding RNAs in

human breast milk exosomes were taken from accession number GSE32253 datasets

(Zhou et al., 2012b). Non-coding RNA from porcine breast milk exosomes was

taken from the dataset with accession number GSE36590 (Gu et al., 2012).

RNA extraction

Milk samples were stored at −80 °C. They were thawed on ice, filtered

through gauze (150 μm) to remove hair and impurities, and the filtrate was

subjected to centrifugation at 2,000 × g for 15 minutes at 4 °C to separate the cell

pellet, fat and skim milk fractions. A commercially available miRNA isolation kit

(mirVana™ miRNA Isolation Kit, Ambion) was used to isolate miRNAs from

skim milk as instructed in the kit manual. Small RNAs were eluted with 100 μl of the elution

solution provided in the kit. The miRNAs from wallaby blood serum were extracted

by using miRNeasy Serum/Plasma Kit (Qiagen) following the standard protocol.

The final RNA extract was immediately snap-frozen in liquid nitrogen and stored at

−80 °C. RNA quality and quantity were evaluated on the Agilent Bioanalyser (not

published here). Tammar wallaby miRNA extraction was done by Vengama Naidu

Modepalli at Deakin University, Comparative Lactation Group. Platypus milk RNA

samples were prepared by Dr. Julie Sharp from the Comparative Lactation Group at

Deakin University. Echidna milk RNA samples were prepared by Ashwantha Kumar

Enjapoori, and buffalo, alpaca and seal milk RNA were prepared by Sanjana

Kuruppath. Illumina HiSeq 2000 RNA sequencing was contracted to BGI, Shenzhen.

RNA-seq data analysis

Raw fastq sequence files from the sequencer were cleaned and adaptors were

trimmed using FastQ and fastx programs (Ben Bimber, 2010). Because a complete

annotated chromosomal sequence of wallaby is not yet available, the closely related

Opossum genome reference and annotations were downloaded from Ensembl

(Monodelphis_domestica.BROADO5.67). Chromosomal sequences were indexed

using Bowtie2 software and wallaby read mapping was performed against

the monDom5 genome assembly using Bowtie2 (Langmead and Salzberg, 2012).

Genome alignments were visualized and further investigated in SeqMonk (Andrews,

2013). A specialized miRNA annotation tool, miRanalyzer, was used to standardize

the identification of miRNA populations in small RNAs from all milk samples. This

tool employs the fast read aligner Bowtie to first align sequences to known miRNAs

from miRBase version 19 (Griffiths-Jones, 2006), then to noncoding RNA sequences

from Rfam and, finally, to investigate the remaining sequences in the context of the

respective genome references. Wallaby milk miRNAs were aligned against

Monodelphis domestica genome assembly (monDom5) miRNAs from miRBase 19.

Alpaca and seal small RNA datasets were aligned against human miRNAs in

miRBase. Four small RNA sequencing libraries from monotremes were aligned

against miRNAs from the Ornithorhynchus anatinus (ornAna1) genome assembly.

miRNA target predictions were made with the miRNA_Targets database described in a

previous chapter (Kumar et al., 2012b) and the miRNA-mRNA network map was

designed with Cytoscape software (Lotia et al., 2013). Further ontology analyses

were done with DAVID Bioinformatics Resources (Huang da et al., 2009b).
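The tiered annotation described above (known miRNAs first, then other non-coding RNAs, then the genome) can be illustrated with a short sketch. It is not miRanalyzer code: exact dictionary lookups stand in for Bowtie alignment, and the sequences, names and function are all hypothetical placeholders.

```python
from collections import Counter


def annotate_reads(reads, mirna_db, ncrna_db):
    """Assign each read to the first reference tier that contains it.

    Reads matching neither tier are returned so that they can be passed
    on to genome mapping.
    """
    counts = Counter()
    leftover = []
    for seq in reads:
        if seq in mirna_db:
            counts[("miRNA", mirna_db[seq])] += 1
        elif seq in ncrna_db:
            counts[("ncRNA", ncrna_db[seq])] += 1
        else:
            leftover.append(seq)
    return counts, leftover


# Toy reference tiers (placeholder sequences, not real miRBase or Rfam entries).
mirna_db = {"AAAACCCCGGGGUUUUAAAACC": "toy-miR-1"}
ncrna_db = {"GGGGCCCCAAAAUUUUGGGG": "toy-snoRNA-1"}

reads = [
    "AAAACCCCGGGGUUUUAAAACC",
    "AAAACCCCGGGGUUUUAAAACC",
    "GGGGCCCCAAAAUUUUGGGG",
    "ACGUACGUACGUACGUACGU",
]
counts, leftover = annotate_reads(reads, mirna_db, ncrna_db)
```

Checking the miRNA tier first means a read is counted as a miRNA even if it would also match a later tier, which mirrors the ordering described in the text.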

Results

miRNAs are the most abundant small-RNA subtype in milk samples

Thirteen small RNA libraries were sequenced on the Illumina HiSeq 2000. Briefly, as

described in the methods, wallaby, platypus, echidna, seal and alpaca were

represented by 7, 2, 2, 1 and 1 samples respectively. All miRNA samples were

isolated from milk at various lactation stages except for two samples from blood

serum of wallaby. Each small RNA library sequencing produced approximately 9 to

13 million raw reads, which were cleaned by removal of low quality reads, reads

contaminated with 3' and 5' adaptor sequences, adaptor-only sequences

and sequences shorter than 18 nucleotides. The numbers of sequences removed at

each filtering step are listed in tables 5.1a (wallaby small RNA) and 5.1b

(platypus, echidna, alpaca and seal). Between 4 and 13 million clean reads per

library were used for further analysis. At this stage, echidna-2 had the lowest number of clean

reads (4583630 reads, table 5.1b) and wallaby day 118 blood serum had the highest

number of clean reads (13445118 reads, table 5.1a). Sequence length

distributions of small RNA from the milk samples showed a clear peak at 22 nt,

indicating miRNA-like sequences were the most abundant sequences identified in the

majority of samples (figure 5.2a), although wallaby samples from day 72 and phase 3

showed a high proportion of 30-31 nt sequences and, as shown in figure 5.2b,

platypus samples 1 and 2 also contained a larger number of 29-nt sequences.

Day-35 Day-72 Day-118 Day-175 Phase3

Total reads 12381908 12134643 13784187 12309932 13709829

High quality 12335856 12089154 13751725 12280357 13658689

3'adapter 92168 88219 59397 64848 52910

Insert 439041 323346 237928 81011 79542

5'adapter 101546 18404 33582 54636 44400

< 18nt 1821675 376879 2041588 1288530 1392025

polyA 5 186 311 87 190

Clean reads 9881421 11282120 11378919 10791245 12089622

D-118 serum D-175 serum

Total reads 14309026 10205531

High quality 14256868 10180431

3'adapter 72041 89445

Insert 194066 232346

5'adapter 105075 99390

< 18nt 440562 669988

polyA 6 63

Clean reads 13445118 9089199

Table 5.1a Processing of wallaby small RNA libraries.

Platypus-1 Platypus-2 Echidna-1 Echidna-2

Total reads 10736722 11516869 15658727 10136813

High quality 10675604 11453353 15595535 10097594

3'adapter 36562 39708 59236 43775

Insert 399858 373602 2081779 2033392

5'adapter 103783 60782 793277 1557562

<18nt 1219449 1144954 3550031 1878807

polyA 76 53 179 428

Clean reads 8915876 9834254 9111033 4583630

Seal Alpaca

Total reads 11631244 9970474

High quality 11585689 9929732

3'adapter 32353 38029

Insert 90795 463002

5'adapter 34096 171075

<18nt 868241 537003

polyA 63 37

Clean reads 10560141 8720586

Table 5.1b Processing of platypus, echidna, alpaca and seal small RNA libraries.
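
The clean-read totals in tables 5.1a and 5.1b appear to equal the high quality read count minus the reads removed at each filtering step. The sketch below checks this for three libraries using counts copied from the tables. The function name and the dictionary layout are illustrative assumptions and are not part of the original pipeline; the check also assumes that the filtering categories do not overlap.

```python
# Counts copied from tables 5.1a and 5.1b for three of the thirteen libraries.
# The layout of this dictionary is an illustrative assumption.
LIBRARIES = {
    "wallaby day-35 milk": {
        "high_quality": 12335856,
        "removed": {"3' adapter": 92168, "insert": 439041,
                    "5' adapter": 101546, "<18 nt": 1821675, "polyA": 5},
        "clean_reported": 9881421,
    },
    "wallaby day-118 serum": {
        "high_quality": 14256868,
        "removed": {"3' adapter": 72041, "insert": 194066,
                    "5' adapter": 105075, "<18 nt": 440562, "polyA": 6},
        "clean_reported": 13445118,
    },
    "echidna-2 milk": {
        "high_quality": 10097594,
        "removed": {"3' adapter": 43775, "insert": 2033392,
                    "5' adapter": 1557562, "<18 nt": 1878807, "polyA": 428},
        "clean_reported": 4583630,
    },
}


def clean_reads(high_quality, removed):
    """Subtract the reads removed at every filtering step (assumed disjoint)."""
    return high_quality - sum(removed.values())


for name, lib in LIBRARIES.items():
    computed = clean_reads(lib["high_quality"], lib["removed"])
    print(f"{name}: computed {computed}, reported {lib['clean_reported']}")
```

For these three libraries the computed and reported clean-read totals agree.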

Figure 5.2a Wallaby small RNA sequence length distributions.

Figure 5.2b Platypus, echidna, seal and alpaca small RNA sequence length distributions.

Variation of wallaby milk profiles during the lactation cycle

As discussed earlier, the wallaby lactation cycle is divided into three main stages:

phase 2A (up to day 100), phase 2B (from day 100-200) and phase 3 (after day 200),

with phase 1 corresponding to the period of gestation. The number of small RNA

sequences, unique reads and miRNAs expressed in these milk samples were not

generally correlated (table 5.2a). All milk samples had more diverse small RNA

populations than blood serum samples (table 5.2a unique reads), except for the day

35 sample. The lowest number of miRNAs (different types) was also reported in the

wallaby day 35 milk sample. Lactation days 35 and 72 both fall within phase 2A, but

we found a large difference in the number of known and predicted

miRNAs at these two time points, with twice as many miRNA types identified at

day 72. This result is probably due to the poorer quality of the day 35 RNA

preparation, when only small quantities of milk could be collected from the animal.

The number of different miRNAs expressed in phase 2B milk was also lower at day

118, but this sample also had the lowest percentage of reads aligning to miRNAs (table 5.2a),

suggesting increased contamination of this sample by degradation products of longer

RNAs.

The ten most abundant miRNAs in skim milk at five lactation time points and

in the two blood serum samples are summarized in table 5.3. MiR-191 and miR-184

were the most abundant miRNAs in milk from day 35 to day 118. MiR-191 was also

found in very high quantity in the blood serum of mothers. MiR-184 was highly

abundant in phase 2A up to early phase 2B. Three miRNAs of the let-7 family (7f,

7a and 7i) were also among the top ten miRNAs expressed during the lactation cycle.

MiR-181 was found among the top 6 most abundant miRNAs in all wallaby milk and

serum samples. MiR-148 showed a gradual increase in abundance from early

lactation to the end of phase 2B and was the most abundant miRNA at day 175.

MiR-375 also showed a gradual increase in abundance from early lactation

with a peak in phase 3.

D-35 D-72 D-118 D-175

Total reads 6781022 7084407 9559993 10596179

Unique reads 28968 52333 72537 40811

% reads aligned to miRNAs 25.80 50.30 20.80 37.50

% reads aligned to genes 0.00% 0.00% 0.00% 0.00%

miRNA known 49 90 60 85

Predicted miRNAs 45 131 89 101

Total miRNAs 94 221 149 186

Phase3 D-118 serum D-175 serum

Total reads 6664261 12360382 4776956

Unique reads 48558 22639 34143

% reads aligned to miRNAs 62.30 62.30 58.50

% reads aligned to genes 0.00% 0.00% 0.00%

miRNA known 90 77 76

Predicted miRNAs 141 60 55

Total miRNAs 231 137 131

Table 5.2a MicroRNA analysis using miRanalyzer. Wallaby milk miRNAs were aligned

against mondom5 miRNAs (miRBase 19).

Table 5.3 Top 10 miRNAs found in individual samples.

MiRNAs miR-148 and miR-375 are highlighted to show the increase in their

abundance levels along the lactation cycle.

D-35 D-72 D-118 D-175 Phase3

1 miR-191 miR-191 miR-191 miR-148 miR-375

2 miR-184 miR-184 miR-184 miR-30a miR-191

3 let-7f miR-148 miR-148 miR-375 miR-30a

4 miR-181c,a miR-181c,a miR-375 miR-191 miR-148

5 miR-148 let-7f let-7f let-7f miR-22

6 let-7a miR-30a miR-181c,a miR-181c,a miR-181c,a

7 miR-92 miR-375 miR-10b miR-143 miR-141

8 miR-10b let-7a miR-30a miR-141 miR-143

9 miR-375 miR-141 let-7a miR-10b let-7f

10 let-7i miR-204 miR-92 let-7a let-7a

D-118 serum D-175 serum

1 miR-10b miR-10b

2 miR-10a miR-191

3 miR-191 miR-10a

4 miR-181c,a miR-92

5 miR-92 miR-181c,a

6 miR-215 miR-215

7 let-7i miR-375

8 miR-143 miR-143

9 miR-375 miR-22

10 miR-25 miR-25

Figure 5.3 Wallaby milk miRNA abundance profiles across 5 lactation time points and two serum samples.

The milk miRNA signature was very different from serum miRNA samples.

Co-differentially expressed miRNA in wallaby milk

Wallaby milk miRNA data were normalized by dividing the total number of reads

mapped to each miRNA by the total number of mapped reads in respective samples

(proportions). The normalised data were then clustered into 5 groups by the Self

Organizing Tree Algorithm (SOTA) in MeV (Version 4) software with default

settings. Hierarchical clustering of samples (figure 5.5a-e) showed that blood samples

were closely grouped, while milk samples were grouped together on a different branch.

Cluster 5 contained the largest number of miRNAs (41%). All these miRNAs

showed an increase in expression from early to late stage of lactation. This cluster

consisted of milk miRNAs expressed at all stages of lactation. A heat map of

quantified miRNAs in this cluster is shown in figure 5.5e. The most notable miRNAs

among this group are miR-375, miR-30a and miR-148. As shown in figure 5.5a, this

group also included the let-7 family members let-7i, let-7a and let-7f. The

smallest cluster was cluster 2 with only 7 miRNAs, which showed their highest expression

on day 35 and decreased expression thereafter. In cluster 3, miRNAs were

highly expressed up to day 118 and showed a gradual decrease in lactation phase 2B

and phase 3 (figure 5.5c). In cluster 4, miRNAs showed similar expression across the

whole lactation cycle. As shown in figure 5.5d, miR-191 was among the top three

miRNAs expressed in all samples. MiR-10b also featured in the top 10 miRNAs expressed

among all samples.
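
The normalisation described at the start of this section (reads mapped to each miRNA divided by all mapped reads in the sample) can be sketched as follows. This is a minimal illustration only: the function name and the read counts are hypothetical and are not taken from the wallaby data.

```python
def normalise_to_percent(counts):
    """Convert raw read counts per miRNA into percentages of all mapped reads."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no mapped reads")
    return {mirna: 100.0 * n / total for mirna, n in counts.items()}


# Hypothetical read counts for one sample (not taken from the thesis data).
sample = {"miR-191": 500, "miR-184": 300, "miR-148": 200}
percent = normalise_to_percent(sample)

for mirna, value in sorted(percent.items(), key=lambda kv: -kv[1]):
    print(f"{mirna}\t{value:.2f}%")
```

Expressing each miRNA as a proportion makes libraries of different sequencing depth comparable before clustering.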

Cluster miRNA in Cluster % of miRNA in Cluster

1 24 26%

2 7 8%

3 11 12%

4 12 13%

5 38 41%

Table 5.4 Wallaby milk miRNA clusters obtained with the Self Organizing Tree Algorithm

(SOTA) from expression levels across 5 time points of the lactation cycle.

Figure 5.4 Clustering of wallaby milk and serum miRNA profiles by hierarchical clustering (single linkage, Pearson correlation).

Grey color in the heat maps denotes the absence of these miRNAs among the respective samples and black color is for miRNAs expressed below 0.001%.

Figure 5.5a Cluster 1 clustering with HCL across 5 time points of lactation cycle

illustrating individual miRNA expression.

Figure 5.5b Cluster 2

Figure 5.5c Cluster 3

Figure 5.5d Cluster 4

Figure 5.5e Cluster 5

Figure 5.5 Clustering with HCL across 5 time points of lactation cycle illustrating individual miRNA expression.

Comparison of wallaby milk and serum miRNAs

Overall, a similar number of miRNAs were identified in milk (86) and serum (82).

However, miRNA composition differed, as shown by the clustering in figure 5.4. Both

serum samples were clustered separately from all the milk samples. Out of these,

13 miRNAs were found to be milk specific and 9 serum specific (table 5.5). Most of

the serum specific miRNAs had low read counts with the highest number of reads

mapping to miR-1 (731 reads). Milk specific miRNAs showed comparatively high

expression with miR-204 represented by 58,142 reads. As shown in table 5.5, other

miRNAs with more than 500 read counts were miR-200b, miR-200c, miR-200a-5p

and miR-1549. The miR-200a and miR-200b genes are present as a cluster. The most

abundant milk-enriched and serum-enriched miRNAs are listed in table 5.6

with the normalized expression values (% of all mapped miRNA in respective

samples). While miR-191 was most highly abundant in milk during the first three

time points of lactation, its expression went down on lactation day-175 and, at the

same time, showed a marked increase in serum. This observation, together with the

disparity of milk and serum miRNA profiles in general, suggests that there is no

direct correlation between serum and milk miRNA contents.

Figure 5.6 Venn diagrams of miRNAs found in milk and serum.

The distributions of miRNAs in milk and serum were compared at lactation day 118 and

day 175, and for combined milk vs serum.

Milk Reads Blood Serum Reads

miR-107 104 miR-1 731

miR-1541 36 miR-129 194

miR-1549 560 miR-132 29

miR-17-3p 41 miR-142 34

miR-190a 86 miR-187 148

miR-200a-5p 515 miR-196b 393

miR-200b 4927 miR-214 10

miR-200c 1948 miR-499 29

miR-204 58142 miR-93 343

miR-218 93

miR-23b 129

miR-338 119

miR-34a 60

Table 5.5 Milk and serum specific miRNAs in wallaby combined for day 118 and

175.

D118-Milk D118-Serum D175-Milk D175-serum

miR-184 8.56 0.01 1.45 0.00

miR-148 8.32 0.20 14.61 0.09

miR-30a 3.36 0.37 13.67 0.31

let-7f 5.371 0.238 6.04 0.168

miR-141 0.684 0.002 4.962 0.01

miR-375 8.225 1.202 9.782 1.487

miR-191 43.83 5.77 9.66 14.51

miR-10a 0.48 7.20 0.71 10.55

miR-10b 3.42 69.17 3.76 55.40

miR-92 2.504 2.628 0.849 4.486

Table 5.6 Highly abundant milk or serum enriched miRNAs.

Green colored cells are milk enriched and yellow colored cells are serum enriched.
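
One way to make the milk versus serum enrichment in table 5.6 explicit is a log2 ratio of the normalised values. The sketch below uses the day-118 values copied from table 5.6. The pseudocount, the two-fold threshold and the function names are assumptions made for illustration; the criterion actually used to colour the table cells is not stated in the text.

```python
import math

# Day-118 values (% of mapped miRNA reads) copied from table 5.6:
# miRNA -> (milk, serum)
DAY118 = {
    "miR-184": (8.56, 0.01),
    "miR-148": (8.32, 0.20),
    "miR-10a": (0.48, 7.20),
    "miR-10b": (3.42, 69.17),
    "miR-92": (2.504, 2.628),
}


def log2_ratio(milk, serum, pseudo=0.001):
    """log2(milk / serum); the pseudocount (an assumption) avoids division by zero."""
    return math.log2((milk + pseudo) / (serum + pseudo))


def classify(milk, serum, threshold=1.0):
    """Label a miRNA using an assumed two-fold (|log2| >= 1) threshold."""
    ratio = log2_ratio(milk, serum)
    if ratio >= threshold:
        return "milk-enriched"
    if ratio <= -threshold:
        return "serum-enriched"
    return "similar"


for mirna, (milk, serum) in DAY118.items():
    print(f"{mirna}\t{log2_ratio(milk, serum):+.2f}\t{classify(milk, serum)}")
```

Under these assumptions miR-184 and miR-148 are labelled milk-enriched, miR-10a and miR-10b serum-enriched, and miR-92 similar in both fluids at day 118.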

Platypus & echidna milk miRNA expression

In order to profile milk miRNA in monotremes, small RNA libraries from two

platypus and two echidna milk samples were sequenced. As shown in table 5.7,

platypus samples had similar numbers of high quality clean reads and unique reads.

In the echidna samples, however, sample 1 had more than double the number of small

RNA sequence reads, but sample 2 had almost double the number of unique reads

(sample 1: 31243, sample 2: 61215). Approximately similar numbers of annotated

miRNAs were identified for each sample in individual species. However, the number

of newly predicted miRNAs was more than 600 for platypus and only approximately

80 for echidna. The reason is most likely the use of the available platypus genome

reference, as an echidna genome sequence is not yet available. MiR-148 was

the most abundant miRNA in echidna, whereas miR-191 was the most abundant

miRNA in platypus milk samples. The 10 most abundant miRNAs in platypus and

echidna milk are listed in table 5.8. In total, 180 pre-annotated miRNAs were found

in platypus milk. A large majority were found in both samples (figure 5.7a). To

reduce false positives, a cut-off of at least 5 reads was applied, leading to the

identification of 125 miRNAs with high similarity. Similarly in echidna,

applying a cut-off of at least 5 reads reduced the number of annotated miRNAs

from 142 to 134 (figure 5.7b). A Venn diagram of

platypus and echidna miRNAs showed 94 miRNAs were common in both species

with 31 miRNAs found only in platypus and 40 miRNAs found only in echidna milk

(figure 5.7c). Most of the highly expressed miRNAs were found in both species.

However, large quantitative differences were observed between individual milk

miRNAs in the two species, for example miR-191.
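
The minimum read cut-off and the Venn comparison described above can be sketched with plain set operations. The read counts below are hypothetical and serve only to illustrate the filtering step; the function names are likewise assumptions.

```python
def detected(counts, min_reads=5):
    """Return the set of miRNAs supported by at least `min_reads` reads."""
    return {mirna for mirna, n in counts.items() if n >= min_reads}


def venn_counts(a, b):
    """Sizes of the three regions of a two-set Venn diagram."""
    return {"shared": len(a & b), "only_a": len(a - b), "only_b": len(b - a)}


# Hypothetical read counts (not taken from the thesis data).
platypus = {"miR-191": 1200, "miR-148": 300, "miR-92a": 4, "miR-425": 60}
echidna = {"miR-148": 900, "miR-191": 15, "let-7f": 400, "miR-425": 2}

regions = venn_counts(detected(platypus), detected(echidna))
print(regions)
```

The counts reported in the text are internally consistent with this kind of comparison: 94 shared miRNAs plus 31 platypus-only miRNAs give the 125 platypus miRNAs, and 94 plus 40 give the 134 echidna miRNAs.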

Seal and Alpaca milk miRNAs

Seal and alpaca milk small RNA libraries produced a similar number of high quality

reads after cleaning. However, after clustering of the reads into individual sequences,

the data from alpaca contained three times more individual sequences than seal milk.

Neither species has a sequenced genome, so these libraries were aligned to

the human miRNA collection (> 2000 miRNAs) and reference genome. Interestingly,

more annotated miRNAs were identified in alpaca than seal milk. The number of

predicted miRNAs was also higher in alpaca than in seal. The ten most abundant

miRNAs found in seal and alpaca milk are listed in table 5.8.

Platypus-1 Platypus-2 Seal Alpaca

Total reads 8915876 9834254 6688361 7107032

Unique reads 161710 190498 56083 165689

% reads aligned to miRNAs 10.70% 9.20% 65.40% 80.40%

miRNA known 137 169 258 358

Predicted miRNAs 608 636 70 133

Total miRNAs 745 805 328 491

Echidna-1 Echidna-2

Total reads 8399862 3482275

Unique reads 31243 61215

% reads aligned to miRNAs 12.60% 16.00%

miRNA known 117 125

Predicted miRNAs 78 80

Total miRNAs 195 205

Table 5.7 Platypus and echidna miRNAs aligned against ornana1 miRNAs from

miRBase 19.

The highest numbers of unique small RNA reads were found in the Platypus-2 (190498)

and alpaca (165689) milk samples.

Platypus-1 Platypus-2 Echidna-1 Echidna-2 Seal Alpaca

1 miR-191 miR-191 miR-148 miR-148 miR-148a miR-148a

2 miR-92a miR-92a let-7f let-7f miR-182 miR-30a

3 miR-181a miR-181a miR-146b miR-21 miR-101 miR-141

4 miR-148 miR-148 miR-30a miR-146b miR-486 miR-182

5 miR-192 miR-30d miR-21 let-7e miR-30a miR-191

6 miR-425 miR-425 miR-22 miR-101 let-7f miR-22

7 miR-30d miR-192 miR-143 miR-30a miR-181a miR-375

8 miR-22 miR-22 let-7e let-7b miR-21 let-7a

9 miR-10b miR-99 miR-10b let-7g miR-26a miR-151a

10 let-7f miR-10b miR-10a miR-10a miR-22 let-7f

Table 5.8 Top 10 most abundant miRNA in platypus, echidna, seal and alpaca milk.

Figure 5.7a Two Venn diagrams: all platypus miRNAs versus miRNAs with >= 5 reads

Figure 5.7b Two Venn diagrams: all echidna miRNAs versus miRNAs with >= 5 reads

Figure 5.7c Two Venn diagrams: all platypus and echidna miRNAs versus miRNAs with >= 5 reads

Human 1 Human 2 Human 3 Human 4

Total reads 31323775 29656785 7836132 17557335

Unique reads 848462 1028195 487253 868511

% reads aligned to miRNAs 49.10% 63.30% 63.20% 30.40%

miRNA known 1030 998 784 804

Table 5.9 miRNAs in human milk samples

Pig Day0-1 Day0-2 Day0-3 Day-3 Day-7

Total reads 20468495 24189635 18224844 21123540 23984540

Unique reads 1115906 1098762 404707 814057 594418

% reads aligned to miRNAs 1.7% 2.1% 19.2% 0.6% 0.6%

miRNA known 211 204 230 198 192

Pig Day-14 Day-21 Day-28

Total reads 14590531 25074431 28141425

Unique reads 814147 526088 1067351

% reads aligned to miRNAs 1.5% 1.1% 1.6%

miRNA known 201 210 231

Table 5.10 MiRNAs in pig milk samples.

Human and pig milk miRNAs

The publicly available human and pig milk miRNA sequence datasets were also

reprocessed using the same pipeline as previously described for all other species. For

human datasets, more than 700 miRNAs were found in each sample (table 5.9).

Approximately 200 miRNAs were found in pig samples (table 5.10). Pig day-0

sample-3 was discarded, as the profile was very different from the other two

duplicates available at the same time point.

Distribution of miRNAs among all milk samples from 8 species

In this study milk miRNAs from 8 mammalian species were compared. They

included 32 small RNA samples (30 milk and 2 blood serum); 3 colostrum and 4

milk samples from buffalo, 4 milk samples from human, 8 milk samples from pig, 5

milk and 2 blood serum samples from wallaby, 2 platypus milk, 2 echidna milk

samples, 1 alpaca and 1 seal sample. Overall, 436 miRNAs (each with more than 0.001% of

reads in at least one sample) were expressed among these samples and 22 miRNAs were

detected in all milk and serum samples (figure 5.9a). These miRNAs also included

the most highly expressed milk miRNAs, miR-148a and miR-30a. Hierarchical

clustering was used to cluster these expression profiles in three sets: (1) all miRNAs

expressed in any sample, (2) miRNAs expressed in more than 20 samples and (3)

miRNAs common to all 32 samples. The 22 miRNAs common to all samples were

clustered and visualized by heatmap (Figure 5.9a). Wallaby serum samples were

clustered separately from milk samples. Platypus miRNAs were closely grouped

with wallaby samples. Echidna milk miRNAs samples were placed on a different

branch with human and pig milk. Wallaby day-35 and day-118 milk samples were

more closely related than day-72, while wallaby day-175 and phase 3 samples were

clustered with buffalo. The alpaca milk profile was more closely related to human

and pig. The pig milk day0-3 sample was closer to the alpaca, but it was located

away from other pig milk samples from day0, suggesting atypical miRNA expression

in this pig at the onset of lactation.
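
The three miRNA sets used for clustering (expressed in any sample, in more than 20 samples, or in all samples) amount to a prevalence filter on the normalised profiles. The sketch below is an illustration under stated assumptions: the function name, the data layout and the toy values are hypothetical, and only the 0.001% expression threshold is taken from the text.

```python
def prevalence_filter(profiles, min_percent=0.001, min_samples=None):
    """Return miRNAs expressed above `min_percent` in at least `min_samples` samples.

    profiles: {sample name: {miRNA: percent of mapped reads}}
    min_samples=None means the miRNA must be expressed in every sample.
    """
    required = len(profiles) if min_samples is None else min_samples
    seen = {}
    for values in profiles.values():
        for mirna, pct in values.items():
            if pct > min_percent:
                seen[mirna] = seen.get(mirna, 0) + 1
    return sorted(m for m, k in seen.items() if k >= required)


# Hypothetical normalised profiles for three samples (not thesis data).
profiles = {
    "s1": {"miR-148a": 12.0, "miR-30a": 5.0, "miR-204": 0.5},
    "s2": {"miR-148a": 8.0, "miR-30a": 0.0005, "miR-204": 0.0},
    "s3": {"miR-148a": 20.0, "miR-30a": 3.0},
}

print(prevalence_filter(profiles))                 # expressed in all samples
print(prevalence_filter(profiles, min_samples=2))  # expressed in at least 2 samples
print(prevalence_filter(profiles, min_samples=1))  # expressed in any sample
```

With 32 samples, "more than 20 samples" would correspond to `min_samples=21` in this sketch.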

After applying the second cut-off filter (miRNAs expressed in more than 20 samples),

89 miRNAs were selected and clustered using the same hierarchical method (figure

5.9b). Wallaby milk samples moved further apart. Milk samples from day-35, day-72

and day-118 were again clustered with platypus milk samples. However, unlike with the

previous cut-off, seal milk was clustered with human milk and alpaca milk was

clustered with wallaby day-175 and buffalo milk. After clustering all 436 miRNAs

expressed across the samples, sample positions showed little change (figure 5.9c). Most

closely clustered samples were from individual species. All human, echidna, and

platypus samples were located in groups. All but one of the pig samples were also grouped

together. The pig day0-3 sample was grouped with the seal milk sample. Alpaca and

wallaby day-175 samples were grouped with buffalo milk. Two outliers in buffalo

milk samples were buffalo colostrum-2 and milk-4. Buffalo milk miRNA expression

patterns were closely related to alpaca milk, followed by seal, pig, human and

echidna. Interestingly, platypus milk was more closely related to wallaby milk while

echidna samples were grouped with human and pigs.

Figure 5.8 The most abundant miRNAs (percentage distribution) among 32 samples.

Figure 5.9a Heat map of 22 miRNAs expressed across all milk and 2 wallaby blood serum samples.

Cluster distance trees are displayed for both miRNAs and individual samples (with

log expression values).

Figure 5.9b Heat map and sample cluster distance trees of 89 miRNAs expressed in more than 20 samples (with log expression values).

Figure 5.9c Clustering of all miRNAs expressed in all milk and blood serum samples.

MiRNA target gene predictions

MicroRNAs are known to negatively control gene expression by binding in a

sequence specific manner to their target mRNAs. Computational tools are being

developed to predict miRNA and target gene interactions. The miRNA_Targets

database described in the second chapter was used to predict the miRNA targets for a

selected set of highly expressed miRNAs in milk using a predicted binding energy

cut-off below -26 kcal/mol.
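
The binding energy cut-off described above can be sketched as a simple filter over predicted miRNA-target pairs. Only the -26 kcal/mol threshold comes from the text. The record layout, the function names and the example predictions (including the miRNA-gene pairings and energies) are invented for illustration and do not reflect the structure or content of the miRNA_Targets database.

```python
from collections import defaultdict

ENERGY_CUTOFF = -26.0  # kcal/mol, the threshold quoted in the text


def select_targets(predictions, cutoff=ENERGY_CUTOFF):
    """Keep predicted targets whose binding energy is below the cut-off.

    predictions: iterable of (miRNA, gene, predicted binding energy) tuples.
    """
    targets = defaultdict(set)
    for mirna, gene, energy in predictions:
        if energy < cutoff:
            targets[mirna].add(gene)
    return dict(targets)


def shared_targets(targets, mirna_a, mirna_b):
    """Genes predicted as targets of both miRNAs."""
    return targets.get(mirna_a, set()) & targets.get(mirna_b, set())


# Hypothetical predictions (pairings and energies are invented).
predictions = [
    ("miR-184", "CAV1", -30.2),
    ("miR-184", "ACTN1", -27.5),
    ("miR-148a", "CAV1", -26.4),
    ("miR-148a", "MYLK", -22.0),
    ("miR-30a", "ACTN1", -25.9),
]

targets = select_targets(predictions)
print({m: sorted(g) for m, g in targets.items()})
print(sorted(shared_targets(targets, "miR-184", "miR-148a")))
```

Counting the genes retained per miRNA and the overlaps between miRNAs gives the kind of node sizes and shared edges summarised in a network view.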

In this study, the number of predicted target genes was not proportional to miRNA

abundance. Most of the mRNA targets were specific to individual miRNAs, with

comparatively little overlap in gene targets, except for miR-184, which had targets

in common with other miRNAs in this list. As shown in figure 5.10, miR-184 had the

highest number of predicted targets. Highly expressed miR-148a and miR-30a had

comparatively fewer target predictions. They also had very few common target

mRNAs. Interestingly, the blood serum enriched miRNA miR-10b had few targets in

common with the milk-enriched miRNAs miR-148a and miR-30a. These miRNAs were

identified as enriched in milk from most species in this study. Ontology enrichment

analysis using the DAVID Bioinformatics database of gene targets for miR-30a, miR-148a

and miR-184 showed enrichment for ATP binding, nucleotide binding and

phospholipid binding. Interestingly, the focal adhesion pathway was the top

pathway, with the highest gene enrichment for target genes of these miRNAs. Genes

involved in this pathway were also most differentially expressed in buffalo

mammary gland somatic cells (in chapter 2). Common gene targets in this pathway

included actinin alpha 1 and 3, caveolin 1, collagen, cyclin, integrin alpha, laminin

alpha and myosin light chain kinase. MiR-375 is known to regulate blood

glucose levels by targeting the 3'-phosphoinositide-dependent protein kinase-1 (PDK1) gene and to maintain

insulin gene expression (El Ouaamari et al., 2008, Li et al., 2010). MiR-375 predicted

gene targets were enriched for membrane invagination and endocytosis. Its

predicted gene targets included GAPVD1, arrestin 3, and adaptor-related protein

complex 3. This miRNA may be playing a role in controlling the expression of

genes regulating exosome vesicle formation (Hunker et al., 2006, Kim et al., 2002,

Dong et al., 2007). Predicted gene targets for miR-191 were enriched for helicase

activity, suggesting that it may regulate the levels of helicase enzymes that drive the unwinding of

DNA or RNA helices. Predicted targets of the blood serum enriched miRNAs miR-10b and miR-22 were

enriched for cofactor binding, oxidoreductase and hydrolase, phosphoprotein

phosphatase activity, respectively. Although miRNA target prediction algorithms

need further development, with improved miRNA-mRNA interaction features,

we were able to find separate functional classes of ontologies for milk and

serum enriched miRNAs.

Figure 5.10 A Cytoscape network diagram of the number of predicted miRNA target genes for the most highly expressed miRNAs using the miRNA_Targets database.

Node size is proportional to the number of predicted target genes and black lines

represent individual gene targets.

Discussion

Milk provides optimal postnatal nutrition for newborn mammals (Greenhill, 2013).

Milk proteins have been characterized in many species (Restani et al., 1999) and

together with carbohydrates (Messer and Kerry, 1973, Grimmonprez, 1971, Stary

and Cindi, 1955) have been considered to make the major contribution to milk

bioactivity (Picciano, 2001). However, newly identified RNA components of milk

represent new and exciting signalling molecules in milk. Although ingested plant

miRNAs have been detected in serum and milk exosome fractions have been

shown to be taken up by cultured cells (Zhang et al., 2012a, Hata et al., 2010), it is not known

if ingested miRNA from milk can cross the gut, and signaling responses in the

suckled young have not yet been well defined. In recent years, with the advances in

sequencing technologies, RNA contents are being increasingly investigated. In this

study, for the first time, milk miRNA contents were compared across eight

mammalian species. Milk miRNAs were characterised in buffalo, wallaby, platypus,

echidna, seal and alpaca milk and compared with publicly available human

and pig datasets. Due to the significant costs of RNA sequencing technology,

single samples were generally sequenced for most biological conditions. Overall,

miRNA-like sequences, approximately 22 nucleotides long, made up the

highest proportion of sequences shorter than 40 nt in milk samples across all species.

The quantitative analysis of our sequence data confirms the presence of miRNAs as a

major class of small non-coding RNAs in milk and reveals milk miRNA composition

in great detail.

Fostering young to a mother at a more advanced stage of lactation leads to an

increased rate of growth and gut development (Kwek et al., 2009b, Kwek et al.,

2009a). This indicates a role of milk composition in the development of the young. In

wallaby, a gradual increase in the number of miRNAs was observed from day 35 to

phase 3 except for the day 72 sample, although in the absence of replicates this day

72 time point needs to be further verified. Moreover, when all known and

co-differentially expressed miRNAs were clustered into 5 groups, the highest number of

miRNAs fell in a cluster showing a steady increase in expression from early to late

lactation. Thus the study suggests a steady increase in the number of miRNAs and a

general quantitative increase of a large group of miRNAs during the course of

marsupial lactation, which correlates with milk production, mammary gland

development in this organism, and the development of new organs and improved

physiological conditions in the developing young wallaby. In contrast, the lowest

number of miRNAs was in the cluster showing a trend to decrease from early to late

lactation. The long lactation period of the wallaby allowed for the identification of

interesting dynamic patterns of miRNA expression. One of the most differentially

expressed miRNAs was miR-375, showing a gradual increase from early lactation to

late lactation. Another miRNA with an increase in expression during the lactation period

was miR-148. In contrast, miR-184 was highly expressed up to the day-118 time point

and showed a big drop in expression thereafter. As shown in figure 5.1, the wallaby

young shows significant development up to mid phase-2B. This suggests that

miR-184 may be a good candidate for a possible developmental function in

the accelerated development of the young reported in the foster experiment (Kwek et al.,

2009b).

The most dominant miRNA in wallaby serum samples was miR-10b. It was

comparatively low in milk samples. Two other highly serum enriched miRNAs were

miR-10a and miR-92 (table 5.6). Day 118 and Day 175 serum samples had

comparatively fewer miRNAs than milk from the same time points in the same

animal (table 5.7). Even in the absence of replicates there was high correlation

among serum samples, and serum miRNA profiles were significantly different from

all milk samples. This observation points to the mammary gland biogenesis of

miRNAs and secretion in milk, rather than transfer from serum, and suggests that, in

normal conditions, serum miRNA are not able to diffuse into milk, as is the case of

some milk proteins (Geursen and Grigor, 1987). Although further detailed analysis

of specific miRNA may still be needed to characterise the diffusion of particular

sequences, these results also suggest the possible development of milk quality

control and diagnostic tests to detect any infections in mammary glands based on

the detection of serum miRNA in milk.
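
The comparison of serum and milk profiles described above can be illustrated with a minimal sketch. It assumes that each sample is available as a table of read counts per miRNA and that similarity is measured as a Pearson correlation on log2-transformed counts; the thesis does not state which correlation measure was used, and all read counts below are invented for illustration only.

```python
import math


def log_profile(counts, mirnas):
    """Return log2(count + 1) for each miRNA in the given order (missing = 0)."""
    return [math.log2(counts.get(m, 0) + 1) for m in mirnas]


def pearson(x, y):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / math.sqrt(var_x * var_y)


def profile_correlation(sample_a, sample_b):
    """Correlate two samples over the union of their detected miRNAs."""
    mirnas = sorted(set(sample_a) | set(sample_b))
    return pearson(log_profile(sample_a, mirnas), log_profile(sample_b, mirnas))


# Hypothetical read counts (not taken from the thesis data).
serum_day118 = {"miR-10b": 9000, "miR-10a": 4000, "miR-92": 3000, "miR-148": 200}
serum_day175 = {"miR-10b": 8000, "miR-10a": 4500, "miR-92": 2500, "miR-148": 300}
milk_day118 = {"miR-10b": 150, "miR-10a": 100, "miR-92": 400, "miR-148": 7000}

serum_vs_serum = profile_correlation(serum_day118, serum_day175)
serum_vs_milk = profile_correlation(serum_day118, milk_day118)
print(f"serum vs serum: {serum_vs_serum:.2f}")
print(f"serum vs milk:  {serum_vs_milk:.2f}")
```

With these invented counts the two serum samples correlate strongly while serum and milk do not, which is the pattern reported above for the wallaby samples.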

Noticeably, a higher number of miRNAs were predicted for platypus samples

compared to echidna, seal and alpaca milk samples (Table 7). This difference was

likely due to the alignment of platypus small RNA datasets to the more specific

platypus genome, while the others had to be processed using a homologous genome

reference (platypus or human). In the case of the platypus analysis, the software was

able to more accurately align sequences and predict new miRNA genes. Echidna,

seal and alpaca genomes were not available, so these datasets were aligned to

the genomes of closely related species. This may have resulted in lower numbers of predicted

miRNAs. Echidna sample 1 had approximately double the number of sequence reads

compared to sample 2, but also had half the number of distinct reads (Table 7).

However, both had approximately the same number of miRNAs. Therefore the total

number of reads and distinct reads in these two samples did not produce a marked

difference in the number of miRNAs or in miRNA profiles in echidna. However, when the

echidna genome reference becomes available, a higher number of echidna specific

miRNAs may be revealed.

To reduce false positive results, miRNA datasets were also compared after

removing miRNAs with fewer than 5 reads. This filtered out approximately 30% of miRNAs

from platypus samples and less than 5% from echidna samples, apparently

validating the suggestion that low abundance miRNAs are more accurately identified

in con-specific genomes (differences in annotated miRNA populations are

represented in Venn diagrams, figure 5.7a, 5.7b). Human and pig milk miRNAs were

downloaded from the GEO database and processed by the same pipeline to reduce

annotation processing artefacts. In total 32 samples were compared and the

examination of miRNAs expressed among all samples showed miR-148 and miR-30a

to be among the most highly expressed miRNAs in milk from the eutherian

species: pig, seal, alpaca, human and buffalo (figure 5.8). In contrast, although significant

quantities of miR-148 and miR-30a were also present, miR-191 was the most

highly expressed miRNA in milk from the monotreme platypus and the marsupial wallaby.

MiR-191 was also detected at low levels in eutherian milk.
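
The read-count filter mentioned above (removal of miRNAs with fewer than 5 reads) can be sketched as follows. This is only an illustration under the assumption that each sample is a simple mapping from miRNA name to read count; the function names and the counts are hypothetical and do not reproduce the pipeline used in this study.

```python
def filter_low_abundance(read_counts, min_reads=5):
    """Keep only miRNAs supported by at least `min_reads` reads."""
    return {name: n for name, n in read_counts.items() if n >= min_reads}


def fraction_removed(read_counts, min_reads=5):
    """Fraction of annotated miRNAs discarded by the read-count filter."""
    if not read_counts:
        return 0.0
    kept = filter_low_abundance(read_counts, min_reads)
    return 1 - len(kept) / len(read_counts)


# Hypothetical sample (counts invented for illustration).
sample = {
    "miR-191": 52000,
    "miR-148": 31000,
    "miR-30a": 12000,
    "candidate-1": 5,  # exactly at the threshold, so it is kept
    "candidate-2": 4,  # fewer than 5 reads, removed
    "candidate-3": 2,  # fewer than 5 reads, removed
}

kept = filter_low_abundance(sample)
removed = fraction_removed(sample)
print(sorted(kept))
print(f"{removed:.0%} of miRNAs removed")
```

Comparing the removed fraction between samples, as done above for platypus and echidna, indicates how much of each annotation rests on very low read support.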

Three different filters were used to cluster the milk miRNA profiles of all species

(32 samples): i) present in all samples, ii) present in more than 20 samples, iii) present

in any sample. No major changes were observed when using these filters and a general

trend was recovered after clustering. Serum miRNA samples were well separated

from milk samples. The miRNA expression profiles of the monotremes platypus and echidna

were very different: platypus miRNA profiles were grouped with wallaby milk

profiles, while echidna profiles were consistently more eutherian-like.
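
The three presence filters described above can be expressed as a single selection with a different minimum number of samples. The sketch below assumes one read-count table per sample; the sample names and counts are invented, and the toy example uses 4 samples rather than the 32 analysed here. For the real dataset the thresholds would be 32 (all samples), 21 (more than 20 samples) and 1 (any sample).

```python
def mirnas_present_in(profiles, min_samples):
    """Return miRNAs detected (count > 0) in at least `min_samples` samples."""
    detected = {}
    for counts in profiles.values():
        for mirna, reads in counts.items():
            if reads > 0:
                detected[mirna] = detected.get(mirna, 0) + 1
    return sorted(m for m, n in detected.items() if n >= min_samples)


# Hypothetical profiles (invented counts, 4 samples instead of 32).
profiles = {
    "platypus_milk": {"miR-191": 900, "miR-148": 300, "miR-30a": 200},
    "wallaby_milk": {"miR-191": 700, "miR-148": 400, "miR-184": 350},
    "buffalo_milk": {"miR-148": 800, "miR-30a": 500, "miR-191": 20},
    "wallaby_serum": {"miR-10b": 950, "miR-148": 40, "miR-92": 300},
}

in_all = mirnas_present_in(profiles, len(profiles))  # filter i
in_most = mirnas_present_in(profiles, 3)             # analogue of filter ii
in_any = mirnas_present_in(profiles, 1)              # filter iii

print(in_all)
print(in_most)
print(in_any)
```

Each resulting miRNA list would then define the rows of the expression matrix passed to the clustering step.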

Overall highly expressed miRNAs in milk were miR-148, miR-191, miR-30a, miR-

184, let-7f and miR-375. In wallaby milk miR-148 showed an increase in expression

from early to late lactation (figure 5.8). It was also comparatively more highly

expressed in buffalo milk than in colostrum and was one of the most highly expressed

miRNAs in human, pig, seal, alpaca and echidna milk. It was also among the most

abundant miRNAs in platypus milk, but platypus samples were highly dominated by

miR-191, which also showed a decrease in milk from wallaby during late lactation.

This suggests either that platypus and echidna produce milk of different composition

or that the milk samples represent different stages of lactation, suggesting possible

change in milk miRNA composition during lactation in monotremes. Analysis of

additional samples collected during the lactation cycle of monotremes should highlight

the changes in milk composition. Further indication of changes in monotreme milk

composition is also given by the observation that, while echidna milk miRNA

profiles cluster with eutherian milk and marsupial milk from late lactation, the

platypus samples cluster apart together with early marsupial lactation samples.

The role of milk miRNAs is being intensely debated in lactation biology. Variation

of miRNA profiles may represent changes in mammary glands during different stages

of lactation and be used as biomarkers, or miRNAs may be specifically secreted by the

mammary gland to be transported to the newborn as milk bio-actives, possibly

through a sophisticated exosomal transport system (Zhang et al., 2012a). miR-148

showed a gradual increase in abundance in milk from early lactation to the end of

phase-2B and was the most abundant miRNA in wallaby milk at day-175. It is

also among the most highly abundant miRNAs in all other species included in this

study. MiR-148 has been shown to be important for growth and development of

normal tissues (Chen et al., 2013). Therefore, it may have a role in growth of

epithelial cells in mammary glands (Richert et al., 2000). MiR-184 was highly

abundant in wallaby milk during early lactation phase-2A until early phase-2B.

Previous studies showed that this miRNA is important for central nervous system and

brain development (McKiernan et al., 2012, Li et al., 2011a). Three members of the

let-7 family (7f, 7a and 7i) were also among the top ten most abundant miRNAs found in

milk. MiRNAs from this family are well known from many cancer studies for their

involvement in deciding stem cell transition into various cell types (Zhang et al.,

2012b). The abundance of these miRNAs may also reflect the dynamic and tightly controlled

apoptosis of mammary gland cells (Chen et al., 2012a).

Another highly concentrated miRNA in milk was miR-30a. This miRNA suppresses

breast tumour growth and metastasis by targeting metadherin (Zhang et al., 2013).

Another study in cancer cells showed its contribution to regulate autophagy through

beclin-1 and demonstrated that it can be used to sensitise tumour cells to

cis-platinum (Zhu et al., 2009, Zou et al., 2012). It has also been shown to regulate expression

of brain-derived neurotrophic factor (BDNF) in prefrontal cortex (Mellios et al.,

2008), and to regulate liver development and function (Hand et al., 2009). Further, it has

been shown to inhibit epithelial to mesenchymal transition by targeting the Snai1 gene

(Kumarswamy et al., 2012). Functions of this miRNA may be related to both the

control of cell division and cellular integrity in mammary glands during lactation,

and regulation of early development in newborns.

MiR-181 was among the 6 most abundant miRNAs found in all

wallaby milk and serum samples. This miRNA has been implicated in many cellular

processes such as cell fate determination and cellular invasion (Neel and Lebrun,

2013). It also plays a key role in the inflammatory response involving the central nervous

system (Hutchison et al., 2013). Thus, it could be playing an important role in keeping

inflammation under control in lactating mammary glands. Up-regulation of miR-181

was shown to contribute to neuronal survival following both mild and severe

seizures. It has also been shown to be involved in regulating genes closely related to eye

development (Hoffmann et al., 2012, Iliff et al., 2012). As shown in figure 5.1, the

wallaby young shows significant eye and brain development up to mid phase-2B.

MiR-181 abundance in milk correlates with the development of nervous tissues in

the brain. MiR-375 also showed a gradual increase in milk from early lactation and

peaked in lactation phase-3. This miRNA is a known tumour suppressor (Isozaki et

al., 2012, Ward et al., 2013). However, most of the early functional studies on

miRNAs have been related to cancer studies and further studies need to be carried

out to find the functions of these miRNAs in milk.

In this study miRNA populations have been quantified in monotreme, marsupial and

eutherian milk. In the marsupial model of lactation, pronounced temporal variation

patterns of miRNA milk composition have revealed miRNA signatures that may be

used to understand their involvement in the regulation of mammary gland activity and

in the transport of a molecular message to protect the young and assist the timing of

development. The comparison of milk miRNA profiles across the mammalian

kingdom has highlighted the value of comparative quantitative transcriptomics to

improve our understanding of milk composition and, more importantly, has paved

the way forward to address the true functionality of milk and the full impact of milk

consumption on health and wellbeing.

Chapter 6

General discussion

Summary of work

MicroRNAs are small regulatory molecules known to participate in cellular

processes through targeting mRNAs of hundreds of genes in a sequence-specific

manner, leading to inhibition of translation. This widespread activity of

miRNA-mRNA interaction networks may account for a layer of genetic regulation acting at

the post-transcriptional level. In plants the miRNA sequence is generally exactly

complementary to target sequence, but in animals partial complementarity is

sufficient to repress target genes. Computational tools combined with deep

sequencing technologies are being applied to decode RNA interaction signals.

Previously it was thought that miRNAs mainly bind to target mRNAs on 3 prime

UTRs, but a growing body of literature has shown the presence of miRNA target sites on

5 prime UTRs and coding regions (Thieme et al., 2012, Wen et al., 2011). All the

freely available miRNA target prediction web resources at the time this study was

initiated were limited to 3 prime UTR sequences. To address this issue, in chapter 2,

we developed miRNA_Targets, a freely accessible miRNA target prediction web

server (Kumar et al., 2012b). This database employed two miRNA target prediction

algorithms, miRanda and RNAhybrid, to catalog predicted miRNA target sites on

full-length mRNAs from genome-wide transcript reference annotations for human,

mouse, cow, chicken, fruit-fly, zebra-fish and C. elegans. Pre-computed miRNA

target sites for these species were stored in a MySQL database. A combination of

HTML, PHP and JavaScript was used to build a live website to query

this database through a flexible framework and a web interface allowing different

types of data mining options. Bioconductor packages for ontology classification were

integrated into this framework to classify predicted miRNA target networks into GO

ontologies. The integration of the CIRCOS tool also allowed the representation

of chromosomal position maps of target networks. In addition, various miRNA target

query options were provided for live target predictions of user defined miRNA or target mRNA

sequences, thereby providing a valuable data mining service to a broader base of

potential users. The number of miRNAs characterised in any species has been

increasing every year and databases need to be maintained and regularly updated to

accommodate newly characterized miRNAs. The miRNA_Targets bioinformatics

resource presented here implements a framework for the management of workflows

to process new data.
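
The idea of searching for target sites along the full-length mRNA, rather than the 3 prime UTR only, can be illustrated with a deliberately simple sketch. It is not the miRanda or RNAhybrid algorithm (both score alignments and hybridisation energies); it only looks for exact matches to the reverse complement of miRNA nucleotides 2 to 8 (the canonical seed) and reports the transcript region of each hit. The function names, the toy transcript and the region boundaries are hypothetical.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_site(mirna):
    """Reverse complement of miRNA nucleotides 2-8, i.e. the site sought in the mRNA."""
    seed = mirna.upper().replace("T", "U")[1:8]
    return "".join(COMPLEMENT[base] for base in reversed(seed))


def region_of(position, cds_start, cds_end):
    """Classify a 0-based transcript position as 5'UTR, CDS or 3'UTR."""
    if position < cds_start:
        return "5'UTR"
    if position < cds_end:
        return "CDS"
    return "3'UTR"


def find_seed_sites(mirna, mrna, cds_start, cds_end):
    """Return (position, region) for every exact seed match along the transcript."""
    site = seed_site(mirna)
    mrna = mrna.upper().replace("T", "U")
    hits = []
    position = mrna.find(site)
    while position != -1:
        hits.append((position, region_of(position, cds_start, cds_end)))
        position = mrna.find(site, position + 1)
    return hits


# Illustrative 22-nt miRNA sequence and a toy transcript built from three parts.
mirna = "UGAGGUAGUAGGUUGUAUAGUU"
utr5 = "GGCUACCUCAA"
cds = "AUGGCCCUACCUCUAA"
utr3 = "UUUCUACCUCGG"
mrna = utr5 + cds + utr3

hits = find_seed_sites(mirna, mrna, len(utr5), len(utr5) + len(cds))
for position, region in hits:
    print(position, region)
```

A scan restricted to the 3 prime UTR would report only the last of these three sites, which is the limitation that motivated the full-length approach.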

Previously, miRNAs have been shown to regulate mammary

gland development, but small non-coding RNA molecules such as miRNA are also

thought to function as signaling molecules for cellular communication. The secretion

of miRNAs has been reported in various body fluids, where miRNA may be

protected from nucleases, by either binding to the Ago2 protein, or residing inside

small vesicles called exosomes. Some of the highest concentrations of secretory

miRNA were found in milk and miRNA had been identified in human, pig and cow

milk. In cow milk, 7 miRNAs expressed across all lactation stages were proposed as

potential biomarkers for milk quality control (Chen et al., 2010b). Given the

importance of milk to newborns in all mammalian species, further characterization of

miRNAs in milk from other species was warranted, including studies of

representatives of the more ancestral marsupial and monotreme lineages.

Therefore, as a first step, the third chapter of this thesis was focused on

characterisation of small non-coding RNAs, especially miRNAs in buffalo milk. The

results confirm that small RNA isolated from colostrum and milk contains a large

population of miRNA. Over 700 putative miRNAs were identified and quantified by

RNA sequencing in buffalo milk and colostrum. The small RNA content of

colostrum was also compared with small RNA derived from a partially purified

exosome fraction from the same sample, suggesting that buffalo milk miRNA may

be carried inside exosome-like vesicles.

In chapter 4, an analysis of buffalo colostrum and milk cell transcriptomes was also

conducted. The sequences of thousands of genes were identified, over 5657 novel

buffalo genes were annotated and their expression in milk cells was quantified by

deep sequencing. The transcriptome of milk cells contained a full set of highly

expressed milk proteins, showing that a majority of somatic milk cells are

epithelial cells detached from the mammary epithelium, and suggesting that milk cell

transcriptomics can be used as a non-invasive method to infer changes in gene

expression in the mammary gland during the course of lactation. This was further

supported by the comparison of milk and colostrum transcriptomes, showing that major

milk proteins like caseins, coenzyme, cofactor binding and immunoglobulins were

enriched in milk. Colostrum enriched genes were found to be involved in the regulation

of cell motion and migration ontology terms and in the focal adhesion, axon guidance and ECM

receptor interaction pathways, apparently revealing some of the molecular

mechanisms involved in shaping the mammary gland tissue during the initial period

of milk production activation. Milk is the primary source of nutrition for newborns

and somatic cells carrying RNAs are transferred along with other milk

proteins. This study provides a new level of understanding on the origin and activity

of somatic milk cells and describes new methods and reference datasets for buffalo

lactation genomics that will further advance the knowledge base on the buffalo and

milk evolution.

In chapter 5, for the first time, comprehensive datasets of quantitative milk miRNA

profiles from eight mammalian species were integrated and compared. To achieve

this, miRNAs were purified from buffalo, wallaby, platypus, echidna, seal and alpaca

milk, characterised with RNA-seq technology, and compared after integration with

publicly available human and pig milk miRNA datasets. In the marsupial, a study of

the variation of the milk miRNA profile during the lactation cycle was conducted. In

buffalo, milk was compared to colostrum. The results confirmed that all mammals studied

secrete milk miRNA and revealed some of the commonalities and specificities of

milk miRNA secretions. Common highly expressed milk miRNAs were miR-148,

miR-191, miR-30a, miR-184, let-7f and miR-375. Interestingly, the closely related

platypus and echidna presented remarkably different milk miRNA profiles, with

platypus more similar to early lactation milk from the marsupial lactation cycle

study, and echidna more similar to late marsupial milk or milk from eutherians. For

example, miR-191 was one of the most highly expressed miRNAs in platypus and

early wallaby milk. The dynamics of milk miRNA composition were analysed in

more detail in the wallaby model of complex marsupial lactation, and temporal

trends of increase or decrease during the lactation cycle were identified by clustering

techniques. Temporal changes of a large number of miRNAs have been characterised.

Together with our study of buffalo and other available studies in human and cow

milk, this will allow further exploration of the potential functional role of specific

putative milk miRNA signals.
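
As a much simpler stand-in for the clustering of temporal profiles, the direction of change of a single miRNA during lactation can be labelled from the correlation between lactation day and log-transformed abundance. The sketch below is an assumption-laden illustration and not the clustering procedure used in chapter 5; the time points, read counts and the 0.5 cut-off are all invented.

```python
import math


def pearson(x, y):
    """Pearson correlation, or None when either series is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return None
    return cov / math.sqrt(var_x * var_y)


def label_trend(days, counts, cutoff=0.5):
    """Label a temporal profile as increasing, decreasing or no clear trend."""
    logged = [math.log2(c + 1) for c in counts]
    r = pearson(days, logged)
    if r is None or abs(r) < cutoff:
        return "no clear trend"
    return "increasing" if r > 0 else "decreasing"


# Hypothetical lactation time points (days) and read counts.
days = [35, 72, 118, 175, 250]
profiles = {
    "miR-375": [100, 300, 900, 2000, 5000],
    "miR-184": [8000, 7500, 7000, 500, 300],
    "stable-miRNA": [1000, 1100, 950, 1050, 1000],
}

trends = {name: label_trend(days, counts) for name, counts in profiles.items()}
for name, trend in trends.items():
    print(name, trend)
```

A proper clustering of all profiles groups miRNAs with similar shapes, including patterns such as a late drop that a single correlation value cannot distinguish from a gradual decline.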

Finally, in the wallaby, milk miRNA profiles were compared to profiles from serum

and no direct correlation was found between milk and serum miRNA content.

Some of the most abundant miRNAs in wallaby serum were miR-10b, miR-10a and

miR-92. This suggests that milk miRNAs are not secreted passively by diffusion

from serum but are instead actively synthesised locally by the mammary gland.

However, although our analysis of the transcriptome of buffalo milk cells revealed

high expression of secreted milk protein, the study failed to identify the expression of

any miRNA. Although this lack of observation could be due to high turnover or

technical limitation, it suggests that secretory miRNA may not be directly secreted

by the epithelial cells and that stromal cells could play a direct role in their

biogenesis. This is an interesting hypothesis as it has been shown previously that

stromal miRNA can regulate mammary gland development.

Conclusions

In this work, methods for the prediction and analysis of miRNA target networks were

developed to assist the investigation of the role of miRNAs as gene regulatory

elements. The miRNA_Targets database was developed to comprehensively cover

predicted miRNA target sites across full-length mRNAs in various model organisms.

With the recent discovery of secretory miRNA in milk, a number of milk miRNA

profiling experiments conducted in the lab were analysed. Specifically, the presence

of miRNAs in milk was confirmed across various mammalian species. First,

mature miRNAs and mRNAs were cataloged in buffalo colostrum and milk. Then

miRNA were characterised in milk from various species including platypus, echidna,

seal, wallaby and alpaca. Highly conserved miRNAs were found by comparative

analysis of miRNAs in milk. Further, diverse expression profiles of miRNAs in milk

and serum indicated the mammary gland origin of miRNAs, although no evidence

for epithelial cell biogenesis could be found in a study of the buffalo milk cell

transcriptome. Interestingly, significant temporal variations have been characterized

in milk miRNA profiles during the lactation cycle of the marsupial wallaby, or

between colostrum and milk of the buffalo. Further studies will need to probe for the

role these tiny RNA molecules might play in the regulation of mammary gland

function during lactation or their putative effects on the regulation of newborn

development. However the work presented here has identified some of the key

secretory miRNA that may be involved in such processes, providing a more

complete comparative catalog of milk composition and illustrating some of the

approaches that can be deployed for the interpretation of the data to create new

hypotheses for further investigations on the functional relevance of milk miRNA.

Future work

The first miRNAs were discovered in the early 1990s, and advances in sequencing

technology in the last five years have helped drive research at the RNA level. After

the relatively low number of new genes discovered in the initial human genome sequencing

project, unexpectedly high variation in mRNA transcripts and in conserved and

non-conserved non-coding elements across various genomes has been uncovered at the RNA level. These

functional genomic elements may be used to connect the dots to understand the complex

biological networks that code for species-specific traits. Some of the areas that we

are now exploring for future research include, but are not limited to:

• Regular updates of the miRNA_Targets predictions database to incorporate novel

miRNAs, genes and advanced bioinformatics algorithms.

• Extending this database to incorporate other less popular model organisms.

• Re-analysis of publicly available human and mouse miRNA sequencing

datasets in various body tissues and fluids.

• Designing experiments to test our hypothesis that milk miRNAs can pass

through the gut into the recipient’s bloodstream and control gene expression.

References

SÁNCHEZ-PLA, A., SALICRÚ, M. & OCAÑA, J. 2007. Statistical methods for the

analysis of high-throughput data based on functional profiles derived from

the gene ontology. Journal of Statistical Planning and Inference.

ABD EL-FATTAH, A. M., ABD RABO, F. H. R., EL-DIEB, S. M. & EL-KASHEF,

H. A. 2012. Changes in composition of colostrum of Egyptian buffaloes and

Holstein cows. Bmc Veterinary Research, 8.

ADMYRE, C., JOHANSSON, S. M., QAZI, K. R., FILEN, J. J., LAHESMAA, R.,

NORMAN, M., NEVE, E. P., SCHEYNIUS, A. & GABRIELSSON, S.

2007. Exosomes with immune modulatory features are present in human

breast milk. J Immunol, 179, 1969-78.

ALEX SANCHEZ, JORDI OCANA & SALICRU, M. 2010. goProfiles: an R

package for the statistical analysis of functional profiles. R package version

1.14.0.

ANDREWS, S. 2013. SeqMonk [Online]. Available:

http://www.bioinformatics.babraham.ac.uk/projects/seqmonk/ 2013].

ANNES, J. P., MUNGER, J. S. & RIFKIN, D. B. 2003. Making sense of latent

TGFbeta activation. J Cell Sci, 116, 217-24.

ARNOULD, J. P. Y. & BOYD, I. L. 1995. Temporal Patterns of Milk-Production in

Antarctic Fur Seals (Arctocephalus-Gazella). Journal of Zoology, 237, 1-12.

ARROYO, J. D., CHEVILLET, J. R., KROH, E. M., RUF, I. K., PRITCHARD, C.

C., GIBSON, D. F., MITCHELL, P. S., BENNETT, C. F., POGOSOVA-

AGADJANYAN, E. L., STIREWALT, D. L., TAIT, J. F. & TEWARI, M.

2011. Argonaute2 complexes carry a population of circulating microRNAs

independent of vesicles in human plasma. Proc Natl Acad Sci U S A, 108,

5003-8.

AUSTRALIAN BUFFALO INDUSTRY COUNCIL, A. 2013. Buffalo dairy

farmers.

AVRIL-SASSEN, S., GOLDSTEIN, L. D., STINGL, J., BLENKIRON, C., LE

QUESNE, J., SPITERI, I., KARAGAVRIILIDOU, K., WATSON, C. J.,

TAVARE, S., MISKA, E. A. & CALDAS, C. 2009. Characterisation of

microRNA expression in post-natal mouse mammary gland development.

BMC Genomics, 10, 548.

BAIL, S., SWERDEL, M., LIU, H. D., JIAO, X. F., GOFF, L. A., HART, R. P. &

KILEDJIAN, M. 2010. Differential regulation of microRNA stability. Rna-a

Publication of the Rna Society, 16, 1032-1039.

BANDYOPADHYAY, S. & MITRA, R. 2009. TargetMiner: microRNA target

prediction with systematic identification of tissue-specific negative examples.

Bioinformatics, 25, 2625-31.

BARRETT, T., TROUP, D. B., WILHITE, S. E., LEDOUX, P., EVANGELISTA,

C., KIM, I. F., TOMASHEVSKY, M., MARSHALL, K. A., PHILLIPPY, K.

H., SHERMAN, P. M., MUERTTER, R. N., HOLKO, M., AYANBULE, O.,

YEFANOV, A. & SOBOLEVA, A. 2011. NCBI GEO: archive for functional

genomics data sets--10 years on. Nucleic Acids Res, 39, D1005-10.

BARRETT, T., WILHITE, S. E., LEDOUX, P., EVANGELISTA, C., KIM, I. F.,

TOMASHEVSKY, M., MARSHALL, K. A., PHILLIPPY, K. H.,

SHERMAN, P. M., HOLKO, M., YEFANOV, A., LEE, H., ZHANG, N. G.,

ROBERTSON, C. L., SEROVA, N., DAVIS, S. & SOBOLEVA, A. 2013.

NCBI GEO: archive for functional genomics data sets-update. Nucleic Acids

Research, 41, D991-D995.

BARTEL, D. P. 2004. MicroRNAs: genomics, biogenesis, mechanism, and function.

Cell, 116, 281-97.

BASILE, J. R., GAVARD, J. & GUTKIND, J. S. 2007. Plexin-B1 utilizes RhoA and

Rho kinase to promote the integrin-dependent activation of Akt and ERK and

endothelial cell motility. J Biol Chem, 282, 34888-95.

BECK, K., HUNTER, I. & ENGEL, J. 1990. Structure and function of laminin:

anatomy of a multidomain glycoprotein. FASEB J, 4, 148-60.

BEN BIMBER, J. W. 2010. FASTX-Toolkit [Online]. Available:

http://hannonlab.cshl.edu/fastx_toolkit/index.html.

BEREZIKOV, E., GURYEV, V., VAN DE BELT, J., WIENHOLDS, E.,

PLASTERK, R. H. & CUPPEN, E. 2005. Phylogenetic shadowing and

computational identification of human microRNA genes. Cell, 120, 21-4.

BETEL, D., WILSON, M., GABOW, A., MARKS, D. S. & SANDER, C. 2008. The

microRNA.org resource: targets and expression. Nucleic Acids Res, 36,

D149-53.

BLACKBURN, D. G. 1993. Lactation: historical patterns and potential for

manipulation. J Dairy Sci, 76, 3195-212.

BLACKBURN, D. G., HASSEN, V. & MURPHY, C. J. 1989. The origin of

lactation and the evolution of milk: A review with new hypotheses. Mammal

Rev, 19, 1-26.

BLACKBURN, V. H. A. D. G. 1985. alpha-Lactalbumin and the origin of lactation.

Evolution, 39, 1147.

BOHNSACK, M. T., CZAPLINSKI, K. & GORLICH, D. 2004. Exportin 5 is a

RanGTP-dependent dsRNA-binding protein that mediates nuclear export of

pre-miRNAs. Rna-a Publication of the Rna Society, 10, 185-91.

BOMMER, U. A. & THIELE, B. J. 2004. The translationally controlled tumour

protein (TCTP). Int J Biochem Cell Biol, 36, 379-85.

BONNER, W. N. 1984. Lactation strategies in pinnipeds: problems for a marine

mammalian group. Symp zool Soc London, 51, 253-272.

BONNET, E., HE, Y., BILLIAU, K. & VAN DE PEER, Y. 2010. TAPIR, a web

server for the prediction of plant microRNA targets, including target mimics.

Bioinformatics, 26, 1566-8.

BOUTINAUD, M. & JAMMES, H. 2002. Potential uses of milk epithelial cells: a

review. Reproduction, nutrition, development, 42, 133-47.

BRENNAN, A. J., SHARP, J. A., LEFEVRE, C., TOPCIC, D., AUGUSTE, A.,

DIGBY, M. & NICHOLAS, K. R. 2007. The tammar wallaby and fur seal:

models to examine local control of lactation. J Dairy Sci, 90 Suppl 1, E66-75.

BRESSLAU, E. 1907. Die entwickelung des Mammaraparates der Montremen,

Marsupialier, und einiger Placentalier, ein Beitrag zur Physiologie der

Saugethiere I. Entwicklung and Ursprung des Mammaraparatus von Echidna.

Semon's Zoologische Forschungsreisen, 4, 59.

CAI, X., HAGEDORN, C. H. & CULLEN, B. R. 2004. Human microRNAs are

processed from capped, polyadenylated transcripts that can also function as

mRNAs. Rna-a Publication of the Rna Society, 10, 1957-66.

CANE, K. N., ARNOULD, J. P. & NICHOLAS, K. R. 2005. Characterisation of

proteins in the milk of fur seals. Comparative biochemistry and physiology.

Part B, Biochemistry & molecular biology, 141, 111-20.

CAO, H., YANG, C. S. & RANA, T. M. 2008. Evolutionary emergence of

microRNAs in human embryonic stem cells. PLoS One, 3, e2820.

CHEN, P. Y., QIN, L., BARNES, C., CHARISSE, K., YI, T., ZHANG, X., ALI, R.,

MEDINA, P. P., YU, J., SLACK, F. J., ANDERSON, D. G.,

KOTELIANSKI, V., WANG, F., TELLIDES, G. & SIMONS, M. 2012a.

FGF regulates TGF-beta signaling and endothelial-to-mesenchymal transition

via control of let-7 miRNA expression. Cell reports, 2, 1684-96.

CHEN, X., GAO, C., LI, H., HUANG, L., SUN, Q., DONG, Y., TIAN, C., GAO, S.,

DONG, H., GUAN, D., HU, X., ZHAO, S., LI, L., ZHU, L., YAN, Q.,

ZHANG, J., ZEN, K. & ZHANG, C. Y. 2010a. Identification and

characterization of microRNAs in raw milk during different periods of

lactation, commercial fluid, and powdered milk products. Cell research, 20,

1128-37.

CHEN, X., GAO, C., LI, H., HUANG, L., SUN, Q., DONG, Y., TIAN, C., GAO, S.,

DONG, H., GUAN, D., HU, X., ZHAO, S., LI, L., ZHU, L., YAN, Q.,

ZHANG, J., ZEN, K. & ZHANG, C. Y. 2010b. Identification and

characterization of microRNAs in raw milk during different periods of

lactation, commercial fluid, and powdered milk products. Cell Res, 20, 1128-

37.

CHEN, X., LIANG, H., ZHANG, J., ZEN, K. & ZHANG, C. Y. 2012b. Secreted

microRNAs: a new form of intercellular communication. Trends Cell Biol, in

press.

CHEN, Y., SONG, Y. X. & WANG, Z. N. 2013. The MicroRNA-148/152 Family:

Multi-faceted Players. Molecular cancer, 12, 43.

CHI, S. W., HANNON, G. J. & DARNELL, R. B. 2012. An alternative mode of

microRNA target recognition. Nature structural & molecular biology, 19,

321-7.

CHURCH, P. C., GOSCINSKI, A. & LEFEVRE, C. 2012. EXP-PAC: providing

comparative analysis and storage of next generation gene expression data.

Genomics, 100, 8-13.

CIKALUK, D. E., TAHBAZ, N., HENDRICKS, L. C., DIMATTIA, G. E.,

HANSEN, D., PILGRIM, D. & HOBMAN, T. C. 1999. GERp95, a

membrane-associated protein that belongs to a family of proteins involved in

stem cell differentiation. Mol Biol Cell, 10, 3357-72.

CLARKE, C., HENRY, M., DOOLAN, P., KELLY, S., AHERNE, S., SANCHEZ,

N., KELLY, P., KINSELLA, P., BREEN, L., MADDEN, S. F., ZHANG, L.,

LEONARD, M., CLYNES, M., MELEADY, P. & BARRON, N. 2012.

Integrated miRNA, mRNA and protein expression analysis reveals the role of

post-transcriptional regulation in controlling CHO cell growth rate. BMC

genomics, 13, 656.

CLOONAN, N., WANI, S., XU, Q. Y., GU, J., LEA, K., HEATER, S.,

BARBACIORU, C., STEPTOE, A. L., MARTIN, H. C., NOURBAKHSH,

E., KRISHNAN, K., GARDINER, B., WANG, X. H., NONES, K., STEEN,

J. A., MATIGIAN, N. A., WOOD, D. L., KASSAHN, K. S., WADDELL,

N., SHEPHERD, J., LEE, C., ICHIKAWA, J., MCKERNAN, K.,

BRAMLETT, K., KUERSTEN, S. & GRIMMOND, S. M. 2011.

MicroRNAs and their isomiRs function cooperatively to target common

biological pathways. Genome Biology, 12.

COCKRILL, W. R. 1981. The water buffalo: a review. The British veterinary

journal, 137, 8-16.

CORCORAN, D. L., PANDIT, K. V., GORDON, B., BHATTACHARJEE, A.,

KAMINSKI, N. & BENOS, P. V. 2009. Features of mammalian microRNA

promoters emerge from polymerase II chromatin immunoprecipitation data.

PLoS One, 4, e5279.

CUI, W., LI, Q., FENG, L. & DING, W. 2011. MiR-126-3p regulates progesterone

receptors and involves development and lactation of mouse mammary gland.

Mol Cell Biochem, 355, 17-25.

CUPERUS, J. T., FAHLGREN, N. & CARRINGTON, J. C. 2011. Evolution and

functional diversification of MIRNA genes. The Plant cell, 23, 431-42.

DARWIN, C. 1872. The origin of species by means of natural selection, or the

preservation of favored races in the struggle for life. Sixth edition, with

additions and corrections, London: John Murray, Albemarle Street.

DEHLE, F. C., ECROYD, H., MUSGRAVE, I. F. & CARVER, J. A. 2010. alphaB-

Crystallin inhibits the cell toxicity associated with amyloid fibril formation

by kappa-casein and the amyloid-beta peptide. Cell Stress Chaperones, 15,

1013-26.

DONG, H., YUAN, H., JIN, W., SHEN, Y., XU, X. & WANG, H. 2007.

Involvement of beta3A subunit of adaptor protein-3 in intracellular

trafficking of receptor-like protein tyrosine phosphatase PCP-2. Acta Biochim

Biophys Sin (Shanghai), 39, 540-6.

DOWBENKO, D., KIKUTA, A., FENNIE, C., GILLETT, N. & LASKY, L. A.

1993. Glycosylation-dependent cell adhesion molecule 1 (GlyCAM 1) mucin

is expressed by lactating mammary gland epithelial cells and is present in

milk. J Clin Invest, 92, 952-60.

GRAVES, B. M. & DUVALL, D. 1983. A role for aggregation pheromones in the evolution

of mammal-like reptiles. Am. Nat., 122, 835.

DWEEP, H., STICHT, C., PANDEY, P. & GRETZ, N. 2011. miRWalk - Database:

Prediction of possible miRNA binding sites by "walking" the genes of three

genomes. J Biomed Inform.

EGEA, J. & KLEIN, R. 2007. Bidirectional Eph-ephrin signaling during axon

guidance. Trends Cell Biol, 17, 230-8.

EL OUAAMARI, A., BAROUKH, N., MARTENS, G. A., LEBRUN, P.,

PIPELEERS, D. & VAN OBBERGHEN, E. 2008. miR-375 targets 3'-

phosphoinositide-dependent protein kinase-1 and regulates glucose-induced

biological responses in pancreatic beta-cells. Diabetes, 57, 2708-17.

ENRIGHT, A. J., JOHN, B., GAUL, U., TUSCHL, T., SANDER, C. & MARKS, D.

S. 2003. MicroRNA targets in Drosophila. Genome biology, 5, R1.

EULALIO, A., HUNTZINGER, E. & IZAURRALDE, E. 2008. Getting to the root

of miRNA-mediated gene silencing. Cell, 132, 9-14.

EULALIO, A., HUNTZINGER, E., NISHIHARA, T., REHWINKEL, J., FAUSER,

M. & IZAURRALDE, E. 2009. Deadenylation is a widespread effect of

miRNA regulation. RNA (New York, N.Y.), 15, 21-32.

FAO 2004. Livestock census. Available from http://www.fao.org.

FURUTA, K., LICHTENBERGER, R. & HELARIUTTA, Y. 2012. The role of

mobile small RNA species during root growth and development. Curr Opin

Cell Biol.

GANGARAJU, V. K. & LIN, H. 2009. MicroRNAs: key regulators of stem cells.

Nature reviews. Molecular cell biology, 10, 116-25.

GANTIER, M. P., MCCOY, C. E., RUSINOVA, I., SAULEP, D., WANG, D., XU,

D., IRVING, A. T., BEHLKE, M. A., HERTZOG, P. J., MACKAY, F. &

WILLIAMS, B. R. 2011. Analysis of microRNA turnover in mammalian

cells following Dicer1 ablation. Nucleic Acids Res, 39, 5692-703.

GERMAN, J. B., DILLARD, C. J. & WARD, R. E. 2002. Bioactive components in

milk. Current opinion in clinical nutrition and metabolic care, 5, 653-8.

GEURSEN, A. & GRIGOR, M. R. 1987. Serum albumin secretion in rat milk. J

Physiol, 391, 419-27.

GIBBINGS, D. J., CIAUDO, C., ERHARDT, M. & VOINNET, O. 2009.

Multivesicular bodies associate with components of miRNA effector

complexes and modulate miRNA activity. Nature Cell Biology, 11, 1143-9.

GREEN, B., GRIFFITHS, M. & LECKIE, R. M. 1983. Qualitative and quantitative

changes in milk fat during lactation in the tammar wallaby (Macropus

eugenii). Aust J Biol Sci, 36, 455-61.

GREENHILL, C. 2013. Nutrition: drinking cow's milk alters vitamin D and iron

stores in young children. Nat Rev Endocrinol, 9, 126.

GREGORY, P. A., BERT, A. G., PATERSON, E. L., BARRY, S. C., TSYKIN, A.,

FARSHID, G., VADAS, M. A., KHEW-GOODALL, Y. & GOODALL, G. J.

2008. The miR-200 family and miR-205 regulate epithelial to mesenchymal

transition by targeting ZEB1 and SIP1. Nature Cell Biology, 10, 593-601.

GREGORY, W. K. 1910. The orders of mammals. Bull. Am. Mus. Nat. Hist., 27, 1.

GRIFFITHS-JONES, S. 2006. miRBase: the microRNA sequence database. Methods

Mol Biol, 342, 129-38.

GRIFFITHS-JONES, S. 2010a. miRBase: microRNA sequences and annotation.

Curr Protoc Bioinformatics, Chapter 12, Unit 12 9 1-10.

GRIFFITHS-JONES, S. 2010b. miRBase: microRNA sequences and annotation.

Current protocols in bioinformatics / editorial board, Andreas D. Baxevanis

... [et al.], Chapter 12, Unit 12 9 1-10.

GRIMMONPREZ, L. 1971. [Milk carbohydrates]. Ann Nutr Aliment, 25, A39-79.

GRIMSON, A., FARH, K. K., JOHNSTON, W. K., GARRETT-ENGELE, P., LIM,

L. P. & BARTEL, D. P. 2007. MicroRNA targeting specificity in mammals:

determinants beyond seed pairing. Molecular cell, 27, 91-105.

GROENEN, M. A., DIJKHOF, R. J. & VAN DER POEL, J. J. 1995.

Characterization of a GlyCAM1-like gene (glycosylation-dependent cell

adhesion molecule 1) which is highly and specifically expressed in the

lactating bovine mammary gland. Gene, 158, 189-95.

GU, Y., LI, M., WANG, T., LIANG, Y., ZHONG, Z., WANG, X., ZHOU, Q.,

CHEN, L., LANG, Q., HE, Z., CHEN, X., GONG, J., GAO, X., LI, X. & LV,

X. 2012. Lactation-related microRNA expression profiles of porcine breast

milk exosomes. PLoS One, 7, e43691.

GUNARATNE, P. H., CREIGHTON, C. J., WATSON, M. & TENNAKOON, J. B.

2010. Large-scale integration of MicroRNA and gene expression data for

identification of enriched microRNA-mRNA associations in biological

systems. Methods Mol Biol, 667, 297-315.

GUO, H., INGOLIA, N. T., WEISSMAN, J. S. & BARTEL, D. P. 2010. Mammalian

microRNAs predominantly act to decrease target mRNA levels. Nature, 466,

835-40.

HACKENBERG, M., RODRIGUEZ-EZPELETA, N. & ARANSAY, A. M. 2011.

miRanalyzer: an update on the detection and analysis of microRNAs in high-

throughput sequencing experiments. Nucleic Acids Res, 39, W132-8.

HACKENBERG, M., STURM, M., LANGENBERGER, D., FALCON-PEREZ, J.

M. & ARANSAY, A. M. 2009. miRanalyzer: a microRNA detection and

analysis tool for next-generation sequencing experiments. Nucleic acids

research, 37, W68-76.

HAIDER, S., BALLESTER, B., SMEDLEY, D., ZHANG, J., RICE, P. &

KASPRZYK, A. 2009. BioMart Central Portal--unified access to biological

data. Nucleic Acids Res, 37, W23-7.

HALDANE, J. B. S. 1965. The possible evolution of lactation. Zoologische

Jahrbuch, 92, 41-48.

HALICKA, H. D., BEDNER, E. & DARZYNKIEWICZ, Z. 2000. Segregation of

RNA and separate packaging of DNA and RNA in apoptotic bodies during

apoptosis. Exp Cell Res, 260, 248-56.

HALL-GLENN, F. & LYONS, K. M. 2011. Roles for CCN2 in normal physiological

processes. Cell Mol Life Sci, 68, 3209-17.

HAND, N. J., MASTER, Z. R., EAUCLAIRE, S. F., WEINBLATT, D. E.,

MATTHEWS, R. P. & FRIEDMAN, J. R. 2009. The microRNA-30 family is

required for vertebrate hepatobiliary development. Gastroenterology, 136,

1081-90.

HATA, T., MURAKAMI, K., NAKATANI, H., YAMAMOTO, Y., MATSUDA, T.

& AOKI, N. 2010. Isolation of bovine milk-derived microvesicles carrying

mRNAs and microRNAs. Biochem Biophys Res Commun, 396, 528-33.

HAWKINS, S. M. & MATZUK, M. M. 2010. Oocyte-somatic cell communication

and microRNA function in the ovary. Ann Endocrinol (Paris), 71, 144-8.

HAYSSEN, V. 1993. Empirical and theoretical constraints on the evolution of

lactation. Journal of Dairy Science, 76, 3213-33.

HENTZE, M. W., KEIM, S., PAPADOPOULOS, P., O'BRIEN, S., MODI, W.,

DRYSDALE, J., LEONARD, W. J., HARFORD, J. B. & KLAUSNER, R. D.

1986. Cloning, characterization, expression, and chromosomal localization of

a human ferritin heavy-chain gene. Proceedings of the National Academy of

Sciences of the United States of America, 83, 7226-30.

HERGENREIDER, E., HEYDT, S., TREGUER, K., BOETTGER, T.,

HORREVOETS, A. J., ZEIHER, A. M., SCHEFFER, M. P., FRANGAKIS,

A. S., YIN, X., MAYR, M., BRAUN, T., URBICH, C., BOON, R. A. &

DIMMELER, S. 2012. Atheroprotective communication between endothelial

cells and smooth muscle cells through miRNAs. Nature Cell Biology, 14,

249-56.

HINZ, K., O'CONNOR, P. M., HUPPERTZ, T., ROSS, R. P. & KELLY, A. L. 2012.

Comparison of the principal proteins in bovine, caprine, buffalo, equine and

camel milk. The Journal of dairy research, 79, 185-91.

HOCK, J. & MEISTER, G. 2008. The Argonaute protein family. Genome Biol, 9,

210.

HOFFMANN, A., HUANG, Y. S., SUETSUGU-MAKI, R., RINGELBERG, C. S.,

TOMLINSON, C. R., DEL RIO-TSONIS, K. & TSONIS, P. A. 2012.

Implication of the miR-184 and miR-204 Competitive RNA Network in

Control of Mouse Secondary Cataract. Molecular Medicine, 18, 528-538.

HSU, J. B., CHIU, C. M., HSU, S. D., HUANG, W. Y., CHIEN, C. H., LEE, T. Y.

& HUANG, H. D. 2011. miRTar: an integrated system for identifying

miRNA-target interactions in human. BMC Bioinformatics, 12, 300.

HUANG DA, W., SHERMAN, B. T. & LEMPICKI, R. A. 2009a. Bioinformatics

enrichment tools: paths toward the comprehensive functional analysis of

large gene lists. Nucleic Acids Res, 37, 1-13.

HUANG DA, W., SHERMAN, B. T. & LEMPICKI, R. A. 2009b. Systematic and

integrative analysis of large gene lists using DAVID bioinformatics

resources. Nat Protoc, 4, 44-57.

HUANG, P. J., LIU, Y. C., LEE, C. C., LIN, W. C., GAN, R. R., LYU, P. C. &

TANG, P. 2010. DSAP: deep-sequencing small RNA analysis pipeline.

Nucleic Acids Res, 38, W385-91.

HUNKER, C. M., GALVIS, A., KRUK, I., GIAMBINI, H., VEISAGA, M. L. &

BARBIERI, M. A. 2006. Rab5-activating protein 6, a novel endosomal

protein with a role in endocytosis. Biochem Biophys Res Commun, 340, 967-

75.

HUTCHISON, E. R., KAWAMOTO, E. M., TAUB, D. D., LAL, A.,

ABDELMOHSEN, K., ZHANG, Y., WOOD, W. H., 3RD, LEHRMANN, E.,

CAMANDOLA, S., BECKER, K. G., GOROSPE, M. & MATTSON, M. P.

2013. Evidence for miR-181 involvement in neuroinflammatory responses of

astrocytes. Glia.

HYNES, R. O. 2002. Integrins: bidirectional, allosteric signaling machines. Cell,

110, 673-87.

IBARRA, I., ERLICH, Y., MUTHUSWAMY, S. K., SACHIDANANDAM, R. &

HANNON, G. J. 2007. A role for microRNAs in maintenance of mouse

mammary epithelial progenitor cells. Genes & development, 21, 3238-43.

ILIFF, B. W., RIAZUDDIN, S. A. & GOTTSCH, J. D. 2012. A single-base

substitution in the seed region of miR-184 causes EDICT syndrome.

Investigative ophthalmology & visual science, 53, 348-53.

INUI, M., MARTELLO, G. & PICCOLO, S. 2010. MicroRNA control of signal

transduction. Nat Rev Mol Cell Biol, 11, 252-63.

ISOZAKI, Y., HOSHINO, I., NOHATA, N., KINOSHITA, T., AKUTSU, Y.,

HANARI, N., MORI, M., YONEYAMA, Y., AKANUMA, N.,

TAKESHITA, N., MARUYAMA, T., SEKI, N., NISHINO, N., YOSHIDA,

M. & MATSUBARA, H. 2012. Identification of novel molecular targets

regulated by tumor suppressive miR-375 induced by histone acetylation in

esophageal squamous cell carcinoma. International journal of oncology, 41,

985-94.

LYTLE, J. R., YARIO, T. A. & STEITZ, J. A. 2007. Target mRNAs are repressed

as efficiently by microRNA-binding sites in the 5' UTR as in the 3' UTR.

PNAS, 104, 9667-9672.

JOHN, B., ENRIGHT, A. J., ARAVIN, A., TUSCHL, T., SANDER, C. & MARKS,

D. S. 2004. Human MicroRNA targets. Plos Biology, 2, 1862-1879.

FORMAN, J. J., LEGESSE-MILLER, A. & COLLER, H. A. 2008. A search for

conserved sequences in coding regions reveals that the let-7 microRNA

targets Dicer within its coding sequence. PNAS, 105, 14879-14884.

KELLER, S., SANDERSON, M. P., STOECK, A. & ALTEVOGT, P. 2006.

Exosomes: from biogenesis and secretion to biological function. Immunol

Lett, 107, 102-8.

KERSEY, P. J., STAINES, D. M., LAWSON, D., KULESHA, E., DERWENT, P.,

HUMPHREY, J. C., HUGHES, D. S., KEENAN, S., KERHORNOU, A.,

KOSCIELNY, G., LANGRIDGE, N., MCDOWALL, M. D., MEGY, K.,

MAHESWARI, U., NUHN, M., PAULINI, M., PEDRO, H., TONEVA, I.,

WILSON, D., YATES, A. & BIRNEY, E. 2012. Ensembl Genomes: an

integrative resource for genome-scale data from non-vertebrate species.

Nucleic acids research, 40, D91-7.

KERTESZ, M., IOVINO, N., UNNERSTALL, U., GAUL, U. & SEGAL, E. 2007.

The role of site accessibility in microRNA target recognition. Nature

genetics, 39, 1278-84.

KIENESBERGER, P. C., OBERER, M., LASS, A. & ZECHNER, R. 2009.

Mammalian patatin domain containing proteins: a family with diverse

lipolytic activities involved in multiple biological functions. J Lipid Res, 50

Suppl, S63-8.

KIERSTEIN, G., VALLINOTO, M., SILVA, A., SCHNEIDER, M. P., IANNUZZI,

L. & BRENIG, B. 2004. Analysis of mitochondrial D-loop region casts new

light on domestic water buffalo (Bubalus bubalis) phylogeny. Molecular

phylogenetics and evolution, 30, 308-24.

KIM, S. K., NAM, J. W., RHEE, J. K., LEE, W. J. & ZHANG, B. T. 2006.

miTarget: microRNA target gene prediction using a support vector machine.

BMC Bioinformatics, 7, 411.

KIM, Y. K., YEO, J., HA, M., KIM, B. & KIM, V. N. 2011. Cell adhesion-

dependent control of microRNA decay. Molecular cell, 43, 1005-14.

KIM, Y. M., BARAK, L. S., CARON, M. G. & BENOVIC, J. L. 2002. Regulation

of arrestin-3 phosphorylation by casein kinase II. J Biol Chem, 277, 16837-

46.

KOSAKA, N., IZUMI, H., SEKINE, K. & OCHIYA, T. 2010. microRNA as a new

immune-regulatory agent in breast milk. Silence, 1, 7.

KOZOMARA, A. & GRIFFITHS-JONES, S. 2011. miRBase: integrating microRNA

annotation and deep-sequencing data. Nucleic acids research, 39, D152-7.

KREK, A., GRUN, D., POY, M. N., WOLF, R., ROSENBERG, L., EPSTEIN, E. J.,

MACMENAMIN, P., DA PIEDADE, I., GUNSALUS, K. C., STOFFEL, M.

& RAJEWSKY, N. 2005. Combinatorial microRNA target predictions.

Nature genetics, 37, 495-500.

KRUGER, J. & REHMSMEIER, M. 2006. RNAhybrid: microRNA target prediction

easy, fast and flexible. Nucleic Acids Research, 34, W451-W454.

KRZYWINSKI, M., SCHEIN, J., BIROL, I., CONNORS, J., GASCOYNE, R.,

HORSMAN, D., JONES, S. J. & MARRA, M. A. 2009. Circos: an

information aesthetic for comparative genomics. Genome Research, 19,

1639-45.

KUMAR, A., BUSCARA, L., KURUPPATH, S. & LEFÈVRE, C. 2012a. The

emerging role of micro-RNAs in the lactation process. In Lactation: Natural

Processes, Physiological Responses and Role in Maternity. Nova publishing,

Series: Obstetrics and Gynecology Advances. Editors: Lisa M. Reyes Cruz

and Douglas C. Ortiz Gutierrez.

KUMAR, A., WONG, A. K., TIZARD, M. L., MOORE, R. J. & LEFEVRE, C.

2012b. miRNA_Targets: a database for miRNA target predictions in coding

and non-coding regions of mRNAs. Genomics, 100, 352-6.

KUMAR, S., NAGARAJAN, M., SANDHU, J. S., KUMAR, N., BEHL, V. &

NISHANTH, G. 2007. Mitochondrial DNA analyses of Indian water buffalo

support a distinct genetic origin of river and swamp buffalo. Animal genetics,

38, 227-32.

KUMARSWAMY, R., MUDDULURU, G., CEPPI, P., MUPPALA, S.,

KOZLOWSKI, M., NIKLINSKI, J., PAPOTTI, M. & ALLGAYER, H. 2012.

MicroRNA-30a inhibits epithelial-to-mesenchymal transition by targeting

Snai1 and is downregulated in non-small cell lung cancer. Int J Cancer, 130,

2044-53.

KURUPPATH, S., BISANA, S., SHARP, J. A., LEFEVRE, C., KUMAR, S. &

NICHOLAS, K. R. 2012. Monotremes and marsupials: comparative models

to better understand the function of milk. J Biosci, 37, 581-8.

KWEK, J., DE IONGH, R., NICHOLAS, K. & FAMILARI, M. 2009a. Molecular

insights into evolution of the vertebrate gut: focus on stomach and parietal

cells in the marsupial, Macropus eugenii. J Exp Zool B Mol Dev Evol, 312,

613-24.

KWEK, J. H., IONGH, R. D., DIGBY, M. R., RENFREE, M. B., NICHOLAS, K. R.

& FAMILARI, M. 2009b. Cross-fostering of the tammar wallaby (Macropus

eugenii) pouch young accelerates fore-stomach maturation. Mech Dev, 126,

449-63.

BUSCARA, L., KUMAR, A., SHARP, J. A., NICHOLAS, K. R. & LEFÈVRE, C. 2013. Lactating

fur seal case study suggests mir21 may contribute to mammary gland

involution evasion during off-shore lactation via regulation of PTEN. Asian

Journal of Animal Research, 1, 5-8.

LANGMEAD, B. & SALZBERG, S. L. 2012. Fast gapped-read alignment with

Bowtie 2. Nat Methods, 9, 357-9.

LÄSSER, C., ELDH, M. & LÖTVALL, J. 2012. Isolation and Characterization of

RNA-Containing Exosomes. J Vis Exp., 59.

LAVIGNE, D. M. 1986. Fur Seals - Maternal Strategies on Land and at Sea -

Gentry, R. L., Kooyman, G. L. Science (New York, N.Y.), 233, 1208-1209.

LAWSON, D. M., ARTYMIUK, P. J., YEWDALL, S. J., SMITH, J. M.,

LIVINGSTONE, J. C., TREFFRY, A., LUZZAGO, A., LEVI, S., AROSIO,

P. & CESARENI, G. 1991. Solving the structure of human H ferritin by

genetically engineering intermolecular crystal contacts. Nature, 349, 541-4.

LEE, I., AJAY, S. S., YOOK, J. I., KIM, H. S., HONG, S. H., KIM, N. H.,

DHANASEKARAN, S. M., CHINNAIYAN, A. M. & ATHEY, B. D. 2009a.

New class of microRNA targets containing simultaneous 5'-UTR and 3'-UTR

interaction sites. Genome Research, 19, 1175-1183.

LEE, R. C., FEINBAUM, R. L. & AMBROS, V. 1993. The C. elegans heterochronic

gene lin-4 encodes small RNAs with antisense complementarity to lin-14.

Cell, 75, 843-54.

LEE, Y., KIM, M., HAN, J., YEOM, K. H., LEE, S., BAEK, S. H. & KIM, V. N.

2004. MicroRNA genes are transcribed by RNA polymerase II. Embo J, 23,

4051-60.

LEE, Y. S., PRESSMAN, S., ANDRESS, A. P., KIM, K., WHITE, J. L., CASSIDY,

J. J., LI, X., LUBELL, K., LIM DO, H., CHO, I. S., NAKAHARA, K.,

PREALL, J. B., BELLARE, P., SONTHEIMER, E. J. & CARTHEW, R. W.

2009b. Silencing by small RNAs is linked to endosomal trafficking. Nature

Cell Biology, 11, 1150-6.

LEFEVRE, C. M., SHARP, J. A. & NICHOLAS, K. R. 2009. Characterisation of

monotreme caseins reveals lineage-specific expansion of an ancestral casein

locus in mammals. Reprod Fertil Dev, 21, 1015-27.

LEFEVRE, C. M., SHARP, J. A. & NICHOLAS, K. R. 2010a. Evolution of

lactation: ancient origin and extreme adaptations of the lactation system.

Annu Rev Genomics Hum Genet, 11, 219-38.

LEFEVRE, C. M., SHARP, J. A. & NICHOLAS, K. R. 2010b. Evolution of

lactation: ancient origin and extreme adaptations of the lactation system.

Annual review of genomics and human genetics, 11, 219-38.

LEMAY, D. G., BALLARD, O. A., HUGHES, M. A., MORROW, A. L.,

HORSEMAN, N. D. & NOMMSEN-RIVERS, L. A. 2013. RNA sequencing

of the human milk fat layer transcriptome reveals distinct gene expression

profiles at three stages of lactation. PLoS One, 8, e67531.

LEMAY, D. G., LYNN, D. J., MARTIN, W. F., NEVILLE, M. C., CASEY, T. M.,

RINCON, G., KRIVENTSEVA, E. V., BARRIS, W. C., HINRICHS, A. S.,

MOLENAAR, A. J., POLLARD, K. S., MAQBOOL, N. J., SINGH, K.,

MURNEY, R., ZDOBNOV, E. M., TELLAM, R. L., MEDRANO, J. F.,

GERMAN, J. B. & RIJNKELS, M. 2009. The bovine lactation genome:

insights into the evolution of mammalian milk. Genome Biol, 10, R43.

LEMMON, S. K. & TRAUB, L. M. 2000. Sorting in the endosomal system in yeast

and animal cells. Curr Opin Cell Biol, 12, 457-66.

LEMON, M. & BAILEY, L. F. 1966. A specific protein difference in the milk from

two mammary glands of a red kangaroo. Aust J Exp Biol Med Sci, 44, 705-7.

LEWIS, B. P., BURGE, C. B. & BARTEL, D. P. 2005. Conserved seed pairing,

often flanked by adenosines, indicates that thousands of human genes are

microRNA targets. Cell, 120, 15-20.

LEWIS, B. P., SHIH, I. H., JONES-RHOADES, M. W., BARTEL, D. P. & BURGE,

C. B. 2003. Prediction of mammalian microRNA targets. Cell, 115, 787-98.

LI, P., PENG, J., HU, J., XU, Z., XIE, W. & YUAN, L. 2011a. Localized expression

pattern of miR-184 in Drosophila. Molecular Biology Reports, 38, 355-8.

LI, W., RUSINIAK, M. E., CHINTALA, S., GAUTAM, R., NOVAK, E. K. &

SWANK, R. T. 2004. Murine Hermansky-Pudlak syndrome genes: regulators

of lysosome-related organelles. Bioessays, 26, 616-28.

LI, Y., LI, C., DING, G. & JIN, Y. 2011b. Evolution of MIR159/319 microRNA

genes and their post-transcriptional regulatory link to siRNA pathways. BMC

Evol Biol, 11, 122.

LI, Y., XU, X., LIANG, Y., LIU, S., XIAO, H., LI, F., CHENG, H. & FU, Z. 2010.

miR-375 enhances palmitate-induced lipoapoptosis in insulin-secreting NIT-1

cells by repressing myotrophin (V1) protein expression. Int J Clin Exp

Pathol, 3, 254-64.

LIAO, Y., DU, X. & LONNERDAL, B. 2010. miR-214 regulates lactoferrin

expression and pro-apoptotic function in mammary epithelial cells. J Nutr,

140, 1552-6.

LINNAEUS, C. 1758. Tomus I. Systema naturae per regna tria naturae, secundum

classes, ordines, genera, species, cum characteribus, differentiis, synonymis,

locis. Editio decima, reformata. Holmiae. (Laurentii Salvii), [1-4], 1-824.

LIU, C. G., CALIN, G. A., MELOON, B., GAMLIEL, N., SEVIGNANI, C.,

FERRACIN, M., DUMITRU, C. D., SHIMIZU, M., ZUPO, S., DONO, M.,

ALDER, H., BULLRICH, F., NEGRINI, M. & CROCE, C. M. 2004. An

oligonucleotide microchip for genome-wide microRNA profiling in human

and mouse tissues. Proc Natl Acad Sci U S A, 101, 9740-4.

LIU, C. H., CHANG, S. H., NARKO, K., TRIFAN, O. C., WU, M. T., SMITH, E.,

HAUDENSCHILD, C., LANE, T. F. & HLA, T. 2001. Overexpression of

cyclooxygenase-2 is sufficient to induce tumorigenesis in transgenic mice. J

Biol Chem, 276, 18563-9.

LIU, J., RIVAS, F. V., WOHLSCHLEGEL, J., YATES, J. R., 3RD, PARKER, R. &

HANNON, G. J. 2005. A role for the P-body component GW182 in

microRNA function. Nature Cell Biology, 7, 1261-6.

LIU, N., OKAMURA, K., TYLER, D. M., PHILLIPS, M. D., CHUNG, W. J. &

LAI, E. C. 2008. The evolution and functional diversification of animal

microRNA genes. Cell Res, 18, 985-96.

LOTIA, S., MONTOJO, J., DONG, Y., BADER, G. D. & PICO, A. R. 2013.

Cytoscape app store. Bioinformatics, 29, 1350-1.

LOVE, C. A., HARLOS, K., MAVADDAT, N., DAVIS, S. J., STUART, D. I.,

JONES, E. Y. & ESNOUF, R. M. 2003. The ligand-binding face of the

semaphorins revealed by the high-resolution crystal structure of SEMA4D.

Nat Struct Biol, 10, 843-8.

LU, L., TIMOFEYEV, V., LI, N., RAFIZADEH, S., SINGAPURI, A., HARRIS, T.

R. & CHIAMVIMONVAT, N. 2009. Alpha-actinin2 cytoskeletal protein is

required for the functional membrane localization of a Ca2+-activated K+

channel (SK2 channel). Proc Natl Acad Sci U S A, 106, 18402-7.

MANINGAT, P. D., SEN, P., RIJNKELS, M., SUNEHAG, A. L., HADSELL, D. L.,

BRAY, M. & HAYMOND, M. W. 2009. Gene expression in the human

mammary epithelium during lactation: the milk fat globule transcriptome.

Physiol Genomics, 37, 12-22.

MARAGKAKIS, M., VERGOULIS, T., ALEXIOU, P., RECZKO, M.,

PLOMARITOU, K., GOUSIS, M., KOURTIS, K., KOZIRIS, N.,

DALAMAGAS, T. & HATZIGEORGIOU, A. G. 2011. DIANA-microT

Web server upgrade supports Fly and Worm miRNA target prediction and

bibliographic miRNA to disease association. Nucleic Acids Research, 39,

W145-8.

MCCLELLAND, A. C., HRUSKA, M., COENEN, A. J., HENKEMEYER, M. &

DALVA, M. B. 2010. Trans-synaptic EphB2-ephrin-B3 interaction regulates

excitatory synapse density by inhibition of postsynaptic MAPK signaling.

Proc Natl Acad Sci U S A, 107, 8830-5.

MCKIERNAN, R. C., JIMENEZ-MATEOS, E. M., SANO, T., BRAY, I.,

STALLINGS, R. L., SIMON, R. P. & HENSHALL, D. C. 2012. Expression

profiling the microRNA response to epileptic preconditioning identifies miR-

184 as a modulator of seizure-induced neuronal death. Experimental

Neurology, 237, 346-354.

MEDINA, D. 1996. The mammary gland: a unique organ for the study of

development and tumorigenesis. J Mammary Gland Biol Neoplasia, 1, 5-19.

MELLIOS, N., HUANG, H. S., GRIGORENKO, A., ROGAEV, E. & AKBARIAN,

S. 2008. A set of differentially expressed miRNAs, including miR-30a-5p, act

as post-transcriptional inhibitors of BDNF in prefrontal cortex. Hum Mol

Genet, 17, 3030-42.

MELTON, C., JUDSON, R. L. & BLELLOCH, R. 2010a. Opposing microRNA

families regulate self-renewal in mouse embryonic stem cells. Nature, 463,

621-626.

MELTON, C., JUDSON, R. L. & BLELLOCH, R. 2010b. Opposing microRNA

families regulate self-renewal in mouse embryonic stem cells. Nature, 463,

621-6.

MESSER, M. & KERRY, K. R. 1973. Milk carbohydrates of the echidna and the

platypus. Science, 180, 201-3.

MEUNIER, J., LEMOINE, F., SOUMILLON, M., LIECHTI, A., WEIER, M.,

GUSCHANSKI, K., HU, H., KHAITOVICH, P. & KAESSMANN, H. 2013.

Birth and expression evolution of mammalian microRNA genes. Genome

research, 23, 34-45.

MEYER, L. R., ZWEIG, A. S., HINRICHS, A. S., KAROLCHIK, D., KUHN, R.

M., WONG, M., SLOAN, C. A., ROSENBLOOM, K. R., ROE, G., RHEAD,

B., RANEY, B. J., POHL, A., MALLADI, V. S., LI, C. H., LEE, B. T.,

LEARNED, K., KIRKUP, V., HSU, F., HEITNER, S., HARTE, R. A.,

HAEUSSLER, M., GURUVADOO, L., GOLDMAN, M., GIARDINE, B.

M., FUJITA, P. A., DRESZER, T. R., DIEKHANS, M., CLINE, M. S.,

CLAWSON, H., BARBER, G. P., HAUSSLER, D. & KENT, W. J. 2012.

The UCSC Genome Browser database: extensions and updates 2013. Nucleic

acids research.

MI, H., MURUGANUJAN, A. & THOMAS, P. D. 2013. PANTHER in 2013:

modeling the evolution of gene function, and other gene attributes, in the

context of phylogenetic trees. Nucleic Acids Res, 41, D377-86.

MICHELIZZI, V. N., DODSON, M. V., PAN, Z., AMARAL, M. E., MICHAL, J. J.,

MCLEAN, D. J., WOMACK, J. E. & JIANG, Z. 2010. Water buffalo

genome science comes of age. International journal of biological sciences, 6,

333-49.

MIRANDA, K. C., HUYNH, T., TAY, Y., ANG, Y.-S., TAM, W.-L., THOMSON,

A. M., LIM, B. & RIGOUTSOS, I. 2006a. A pattern-based method for the

identification of MicroRNA binding sites and their corresponding

heteroduplexes. Cell, 126, 1203-17.

MIRANDA, K. C., HUYNH, T., TAY, Y., ANG, Y. S., TAM, W. L., THOMSON,

A. M., LIM, B. & RIGOUTSOS, I. 2006b. A pattern-based method for the

identification of MicroRNA binding sites and their corresponding

heteroduplexes. Cell, 126, 1203-17.

MITCHELL, P. S., PARKIN, R. K., KROH, E. M., FRITZ, B. R., WYMAN, S. K.,

POGOSOVA-AGADJANYAN, E. L., PETERSON, A., NOTEBOOM, J.,

O'BRIANT, K. C., ALLEN, A., LIN, D. W., URBAN, N., DRESCHER, C.

W., KNUDSEN, B. S., STIREWALT, D. L., GENTLEMAN, R.,

VESSELLA, R. L., NELSON, P. S., MARTIN, D. B. & TEWARI, M. 2008.

Circulating microRNAs as stable blood-based markers for cancer detection.

Proc Natl Acad Sci U S A, 105, 10513-8.

MONTICELLI, S., ANSEL, K. M., XIAO, C., SOCCI, N. D., KRICHEVSKY, A.

M., THAI, T.-H., RAJEWSKY, N., MARKS, D. S., SANDER, C.,

RAJEWSKY, K., RAO, A. & KOSIK, K. S. 2005. MicroRNA profiling of

the murine hematopoietic system. Genome Biology, 6, R71-R71.

MUNCH, E. M., HARRIS, R. A., MOHAMMAD, M., BENHAM, A. L.,

PEJERREY, S. M., SHOWALTER, L., HU, M., SHOPE, C. D.,

MANINGAT, P. D., GUNARATNE, P. H., HAYMOND, M. & AAGAARD,

K. 2013. Transcriptome profiling of microRNA by Next-Gen deep

sequencing reveals known and novel miRNA species in the lipid fraction of

human breast milk. PLoS One, 8, e50564.

NEEL, J. C. & LEBRUN, J. J. 2013. Activin and TGFbeta regulate expression of the

microRNA-181 family to promote cell migration and invasion in breast

cancer cells. Cellular signalling, 25, 1556-1566.

NEWBURG, D. S. 2001. Bioactive components of human milk: evolution,

efficiency, and protection. Advances in experimental medicine and biology,

501, 3-10.

NICHOLAS, K., SHARP, J., WATT, A., WANYONYI, S., CROWLEY, T.,

GILLESPIE, M. & LEFEVRE, C. 2012a. The tammar wallaby: A model

system to examine domain-specific delivery of milk protein bioactives. Semin

Cell Dev Biol.

NICHOLAS, K., SHARP, J., WATT, A., WANYONYI, S., CROWLEY, T.,

GILLESPIE, M. & LEFEVRE, C. 2012b. The tammar wallaby: a model

system to examine domain-specific delivery of milk protein bioactives. Semin

Cell Dev Biol, 23, 547-56.

NICHOLAS, K. R., WILDE, C., BIRD, K., HENDRY, K., TREGENZA, K. &

WARNER, B. 1995. Asynchronous concurrent secretion of milk proteins in

the tammar wallaby (Macropus eugenii). In: WILDE, C., KNIGHT, C. &

PEAKER, M. (eds.) Intercellular signalling in the mammary gland. London:

Plenum Press.

NKYIMBENG-TAKWI, E. & CHAPOVAL, S. P. 2011. Biology and function of

neuroimmune semaphorins 4A and 4D. Immunol Res, 50, 10-21.

NOZAWA, M., MIURA, S. & NEI, M. 2010. Origins and evolution of microRNA

genes in Drosophila species. Genome Biol Evol, 2, 180-9.

OFTEDAL, O. T. 2002a. The mammary gland and its origin during synapsid

evolution. J Mammary Gland Biol Neoplasia, 7, 225-52.

OFTEDAL, O. T. 2002b. The origin of lactation as a water source for parchment-

shelled eggs. J Mammary Gland Biol Neoplasia, 7, 253-66.

OKAMURA, K., PHILLIPS, M. D., TYLER, D. M., DUAN, H., CHOU, Y. T. &

LAI, E. C. 2008. The regulatory activity of microRNA* species has

substantial influence on microRNA and 3' UTR evolution. Nat Struct Mol

Biol, 15, 354-63.

OROM, U. A. & LUND, A. H. 2010. Experimental identification of microRNA

targets. Gene, 451, 1-5.

ORPANA, A. & SALVEN, P. 2002. Angiogenic and lymphangiogenic molecules in

hematological malignancies. Leuk Lymphoma, 43, 219-24.

PEAKER, M. 2002. The mammary gland in mammalian evolution: a brief

commentary on some of the concepts. J Mammary Gland Biol Neoplasia, 7,

347-53.

PICCIANO, M. F. 2001. Nutrient composition of human milk. Pediatr Clin North

Am, 48, 53-67.

PRUITT, K. D., TATUSOVA, T., BROWN, G. R. & MAGLOTT, D. R. 2012.

NCBI Reference Sequences (RefSeq): current status, new features and

genome annotation policy. Nucleic acids research, 40, D130-5.

RAO, J. U., SHAH, K. B., PUTTAIAH, J. & RUDRAIAH, M. 2011. Gene

expression profiling of preovulatory follicle in the buffalo cow: effects of

increased IGF-I concentration on periovulatory events. PLoS One, 6, e20754.

REHMSMEIER, M., STEFFEN, P., HOCHSMANN, M. & GIEGERICH, R. 2004.

Fast and effective prediction of microRNA/target duplexes. RNA, 10, 1507-

17.

RESTANI, P., GAIASCHI, A., PLEBANI, A., BERETTA, B., CAVAGNI, G.,

FIOCCHI, A., POIESI, C., VELONA, T., UGAZIO, A. G. & GALLI, C. L.

1999. Cross-reactivity between milk proteins from different animal species.

Clin Exp Allergy, 29, 997-1004.

REVIGLIO, V. E., HAKIM, M. A., SONG, J. K. & O'BRIEN, T. P. 2003. Effect of

topical fluoroquinolones on the expression of matrix metalloproteinases in

the cornea. BMC Ophthalmol, 3, 10.

REYES-HERRERA, P. H. & FICARRA, E. 2012. One decade of development and

evolution of microRNA target prediction algorithms. Genomics, proteomics

& bioinformatics, 10, 254-63.

RICHERT, M. M., SCHWERTFEGER, K. L., RYDER, J. W. & ANDERSON, S.

M. 2000. An atlas of mouse mammary gland development. J Mammary

Gland Biol Neoplasia, 5, 227-41.

ROHANI, N., CANTY, L., LUU, O., FAGOTTO, F. & WINKLBAUER, R. 2011.

EphrinB/EphB signaling controls embryonic germ layer separation by

contact-induced cell detachment. PLoS Biol, 9, e1000597.

RONEY, K., HOLL, E. & TING, J. 2013. Immune plexins and semaphorins: old

proteins, new immune functions. Protein Cell, 4, 17-26.

RUSINOV, V., BAEV, V., MINKOV, I. N. & TABLER, M. 2005. MicroInspector:

a web tool for detection of miRNA binding sites in an RNA sequence.

Nucleic Acids Research, 33, W696-700.

SABIKHI, L. 2007. Designer milk. Advances in food and nutrition research, 53,

161-98.

SASAKI, T., SHIOHAMA, A., MINOSHIMA, S. & SHIMIZU, N. 2003.

Identification of eight members of the Argonaute family in the human

genome. Genomics, 82, 323-30.

SDASSI, N., SILVERI, L., LAUBIER, J., TILLY, G., COSTA, J., LAYANI, S.,

VILOTTE, J. L. & LE PROVOST, F. 2009. Identification and

characterization of new miRNAs cloned from normal mouse mammary

gland. BMC Genomics, 10, 149.

SHARP, J. A., LEFEVRE, C. & NICHOLAS, K. R. 2007. Molecular evolution of

monotreme and marsupial whey acidic protein genes. Evol Dev, 9, 378-92.

SHATTIL, S. J., KIM, C. & GINSBERG, M. H. 2010. The final steps of integrin

activation: the end game. Nat Rev Mol Cell Biol, 11, 288-300.

SILVERI, L., TILLY, G., VILOTTE, J. L. & LE PROVOST, F. 2006. MicroRNA

involvement in mammary gland development and breast cancer. Reprod Nutr

Dev, 46, 549-56.

SINGH, A., AHUJA, S. P. & SINGH, B. 1993. Individual variation in the

composition of colostrum and absorption of colostral antibodies by the

precolostral buffalo calf. Journal of Dairy Science, 76, 1148-56.

SINGH, S. & GANGULI, N. C. 1976. Compositional and electrophoretic changes in

buffalo milk-fat globule membrane proteins during lactation. The Journal of

dairy research, 43, 381-8.

SMALL, E. M. & OLSON, E. N. 2011. Pervasive roles of microRNAs in

cardiovascular biology. Nature, 469, 336-42.

SPANGHERO, M. & SUSMEL, P. 1996. Chemical composition and energy content

of buffalo milk. The Journal of dairy research, 63, 629-33.

SPURLINO, J. C., SMALLWOOD, A. M., CARLTON, D. D., BANKS, T. M.,

VAVRA, K. J., JOHNSON, J. S., COOK, E. R., FALVO, J., WAHL, R. C.,

PULVINO, T. A., et al. 1994. 1.56 Å structure of mature truncated

human fibroblast collagenase. Proteins, 19, 98-109.

STARY, Z. & CINDI, R. 1955. [Protein bound carbohydrates of milk]. Istanbul Tip

Fak Mecmuasi, 18, 69-77.

STATON, C. A., SHAW, L. A., VALLURU, M., HOH, L., KOAY, I., CROSS, S.

S., REED, M. W. & BROWN, N. J. 2011. Expression of class 3 semaphorins

and their receptors in human breast neoplasia. Histopathology, 59, 274-82.

STEIN, T., SALOMONIS, N. & GUSTERSON, B. A. 2007. Mammary gland

involution as a multi-step process. J Mammary Gland Biol Neoplasia, 12, 25-

35.

TANAKA, T., HANEDA, S., IMAKAWA, K., SAKAI, S. & NAGAOKA, K. 2009.

A microRNA, miR-101a, controls mammary gland development by

regulating cyclooxygenase-2 expression. Differentiation; Research in

Biological Diversity, 77, 181-187.

TAY, Y., ZHANG, J. Q., THOMSON, A. M., LIM, B. & RIGOUTSOS, I. 2009.

MicroRNAs to Nanog, Oct4 and Sox2 coding regions modulate embryonic

stem cell differentiation (vol 445, pg 1124, 2008). Nature, 458, 538-538.

THIEME, C. J., SCHUDOMA, C., MAY, P. & WALTHER, D. 2012. Give It AGO:

The Search for miRNA-Argonaute Sorting Signals in Arabidopsis thaliana

Indicates a Relevance of Sequence Positions Other than the 5'-Position

Alone. Front Plant Sci, 3, 272.

THOMAS, M., LIEBERMAN, J. & LAL, A. 2010. Desperately seeking microRNA

targets. Nat Struct Mol Biol, 17, 1169-74.

TRAPNELL, C., WILLIAMS, B. A., PERTEA, G., MORTAZAVI, A., KWAN, G.,

VAN BAREN, M. J., SALZBERG, S. L., WOLD, B. J. & PACHTER, L.

2010. Transcript assembly and quantification by RNA-Seq reveals

unannotated transcripts and isoform switching during cell differentiation. Nat

Biotechnol, 28, 511-5.

TROTT, J. F., SIMPSON, K. J., MOYLE, R. L., HEARN, C. M., SHAW, G.,

NICHOLAS, K. R. & RENFREE, M. B. 2003. Maternal regulation of milk

composition, milk production, and pouch young development during lactation

in the tammar wallaby (Macropus eugenii). Biol Reprod, 68, 929-36.

TURCHINOVICH, A., WEIZ, L., LANGHEINZ, A. & BURWINKEL, B. 2011.

Characterization of extracellular circulating microRNA. Nucleic Acids Res,

39, 7223-33.

TYNDALE-BISCOE, H. & RENFREE, M. B. 1987. Reproductive physiology of Marsupials,

Cambridge University Press, Cambridge England.

UCAR, A., VAFAIZADEH, V., CHOWDHURY, K. & GRONER, B. 2011.

MicroRNA-dependent regulation of the microenvironment and the epithelial

stromal cell interactions in the mouse mammary gland. Cell Cycle

(Georgetown, Tex.), 10, 563-565.

UCAR, A., VAFAIZADEH, V., JARRY, H., FIEDLER, J., KLEMMT, P. A.,

THUM, T., GRONER, B. & CHOWDHURY, K. 2010. miR-212 and miR-

132 are required for epithelial stromal interactions necessary for mouse

mammary gland development. Nature genetics, 42, 1101-8.

URUAKPA, F. O., ISMOND, M. A. H. & AKOBUNDU, E. N. T. 2002. Colostrum

and its benefits: a review. Nutrition Research, 22, 755-767.

VALADI, H., EKSTROM, K., BOSSIOS, A., SJOSTRAND, M., LEE, J. J. &

LOTVALL, J. O. 2007. Exosome-mediated transfer of mRNAs and

microRNAs is a novel mechanism of genetic exchange between cells. Nature

Cell Biology, 9, 654-U72.

VALENCIA-SANCHEZ, M. A., LIU, J., HANNON, G. J. & PARKER, R. 2006.

Control of translation and mRNA degradation by miRNAs and siRNAs.

Genes & Development, 20, 515-524.

VAN NIEL, G., PORTO-CARREIRO, I., SIMOES, S. & RAPOSO, G. 2006.

Exosomes: a common pathway for a specialized function. J Biochem, 140,

13-21.

VICKERS, K. C., PALMISANO, B. T., SHOUCRI, B. M., SHAMBUREK, R. D. &

REMALEY, A. T. 2011. MicroRNAs are transported in plasma and delivered

to recipient cells by high-density lipoproteins. Nature Cell Biology, 13, 423-

33.

WANG, C. & LI, Q. 2007. Identification of differentially expressed microRNAs

during the development of Chinese murine mammary gland. J Genet

Genomics, 34, 966-73.

WANG, K., ZHANG, S., WEBER, J., BAXTER, D. & GALAS, D. J. 2010a. Export

of microRNAs and microRNA-protective protein by mammalian cells.

Nucleic Acids Res, 38, 7248-59.

WANG, W.-C., LIN, F.-M., CHANG, W.-C., LIN, K.-Y., HUANG, H.-D. & LIN,

N.-S. 2009. miRExpress: analyzing high-throughput sequencing data for

profiling microRNA expression. BMC Bioinformatics, 10, 328.

WANG, X. 2008. miRDB: a microRNA target prediction and functional annotation

database with a wiki interface. RNA, 14, 1012-7.

WANG, Y., LI, Y., MA, Z., YANG, W. & AI, C. 2010b. Mechanism of microRNA-

target interaction: molecular dynamics simulations and thermodynamics

analysis. PLoS Comput Biol, 6, e1000866.

WANYONYI, S. S., LEFEVRE, C., SHARP, J. A. & NICHOLAS, K. R. 2013. The

extracellular matrix locally regulates asynchronous concurrent lactation in

tammar wallaby (Macropus eugenii). Matrix Biol, 32, 342-51.

WARD, A., BALWIERZ, A., ZHANG, J. D., KUBLBECK, M., PAWITAN, Y.,

HIELSCHER, T., WIEMANN, S. & SAHIN, O. 2013. Re-expression of

microRNA-375 reverses both tamoxifen resistance and accompanying EMT-

like properties in breast cancer. Oncogene, 32, 1173-1182.

WARREN, W. C., HILLIER, L. W., MARSHALL GRAVES, J. A., BIRNEY, E.,

PONTING, C. P., GRUTZNER, F., BELOV, K., MILLER, W., CLARKE,

L., CHINWALLA, A. T., YANG, S. P., HEGER, A., LOCKE, D. P.,

MIETHKE, P., WATERS, P. D., VEYRUNES, F., FULTON, L., FULTON,

B., GRAVES, T., WALLIS, J., PUENTE, X. S., LOPEZ-OTIN, C.,

ORDONEZ, G. R., EICHLER, E. E., CHEN, L., CHENG, Z., DEAKIN, J.

E., ALSOP, A., THOMPSON, K., KIRBY, P., PAPENFUSS, A. T.,

WAKEFIELD, M. J., OLENDER, T., LANCET, D., HUTTLEY, G. A.,

SMIT, A. F., PASK, A., TEMPLE-SMITH, P., BATZER, M. A., WALKER,

J. A., KONKEL, M. K., HARRIS, R. S., WHITTINGTON, C. M., WONG,

E. S., GEMMELL, N. J., BUSCHIAZZO, E., VARGAS JENTZSCH, I. M.,

MERKEL, A., SCHMITZ, J., ZEMANN, A., CHURAKOV, G., KRIEGS, J.

O., BROSIUS, J., MURCHISON, E. P., SACHIDANANDAM, R., SMITH,

C., HANNON, G. J., TSEND-AYUSH, E., MCMILLAN, D.,

ATTENBOROUGH, R., RENS, W., FERGUSON-SMITH, M., LEFEVRE,

C. M., SHARP, J. A., NICHOLAS, K. R., RAY, D. A., KUBE, M.,

REINHARDT, R., PRINGLE, T. H., TAYLOR, J., JONES, R. C., NIXON,

B., DACHEUX, J. L., NIWA, H., SEKITA, Y., HUANG, X., STARK, A.,

KHERADPOUR, P., KELLIS, M., FLICEK, P., CHEN, Y., WEBBER, C.,

HARDISON, R., NELSON, J., HALLSWORTH-PEPIN, K.,

DELEHAUNTY, K., MARKOVIC, C., MINX, P., FENG, Y., KREMITZKI,

C., MITREVA, M., GLASSCOCK, J., WYLIE, T., WOHLDMANN, P.,

THIRU, P., NHAN, M. N., POHL, C. S., SMITH, S. M., HOU, S.,

NEFEDOV, M., et al. 2008. Genome analysis of the platypus reveals unique

signatures of evolution. Nature, 453, 175-83.

WEBER, J. A., BAXTER, D. H., ZHANG, S., HUANG, D. Y., HUANG, K. H.,

LEE, M. J., GALAS, D. J. & WANG, K. 2010. The microRNA spectrum in

12 body fluids. Clinical chemistry, 56, 1733-41.

WEN, J., PARKER, B. J., JACOBSEN, A. & KROGH, A. 2011. MicroRNA

transfection and AGO-bound CLIP-seq data sets reveal distinct determinants

of miRNA action. RNA, 17, 820-34.

WEST, A. P., KOBLANSKY, A. A. & GHOSH, S. 2006. Recognition and signaling

by toll-like receptors. Annu Rev Cell Dev Biol, 22, 409-37.

WICKRAMASINGHE, S., RINCON, G., ISLAS-TREJO, A. & MEDRANO, J. F.

2012. Transcriptional profiling of bovine milk using RNA sequencing. BMC

Genomics, 13.

WITTMANN, J. & JACK, H. M. 2011. microRNAs in rheumatoid arthritis: midget

RNAs with a giant impact. Annals of the rheumatic diseases, 70 Suppl 1, i92-

6.

WU, L., FAN, J. & BELASCO, J. G. 2006. MicroRNAs direct rapid deadenylation

of mRNA. Proceedings of the National Academy of Sciences of the United

States of America, 103, 4034-9.

XU, A., BELLAMY, A. R. & TAYLOR, J. A. 1999. Expression of translationally

controlled tumour protein is regulated by calcium at both the transcriptional

and post-transcriptional level. Biochem J, 342 Pt 3, 683-9.

YANG, J. H., LI, J. H., SHAO, P., ZHOU, H., CHEN, Y. Q. & QU, L. H. 2011.

starBase: a database for exploring microRNA-mRNA interaction maps from

Argonaute CLIP-Seq and Degradome-Seq data. Nucleic Acids Research, 39,

D202-9.

YANG, J. S., PHILLIPS, M. D., BETEL, D., MU, P., VENTURA, A., SIEPEL, A.

C., CHEN, K. C. & LAI, E. C. 2012. Widespread regulatory activity of

vertebrate microRNA* species. RNA, 17, 312-26.

YANG, Y., BU, D., ZHAO, X., SUN, P., WANG, J. & ZHOU, L. 2013. Proteomic

Analysis of Cow, Yak, Buffalo, Goat and Camel Milk Whey Proteins:

Quantitative Differential Expression Patterns. Journal of proteome research.

YANG, Y., WANG, Y. P. & LI, K. B. 2008. MiRTif: a support vector machine-

based microRNA target interaction filter. BMC Bioinformatics, 9 Suppl 12,

S4.

YATES, L. A., NORBURY, C. J. & GILBERT, R. J. 2013. The Long and Short of

MicroRNA. Cell, 153, 516-9.

YINDEE, M., VLAMINGS, B. H., WAJJWALKU, W., TECHAKUMPHU, M.,

LOHACHIT, C., SIRIVAIDYAPONG, S., THITARAM, C.,

AMARASINGHE, A. A., ALEXANDER, P. A., COLENBRANDER, B. &

LENSTRA, J. A. 2010. Y-chromosomal variation confirms independent

domestications of swamp and river buffalo. Animal genetics, 41, 433-5.

YOUSEF, M., JUNG, S., KOSSENKOV, A. V., SHOWE, L. C. & SHOWE, M. K.

2007. Naive Bayes for microRNA target predictions--machine learning for

microRNA targets. Bioinformatics, 23, 2987-92.

ZHANG, L., HOU, D., CHEN, X., LI, D., ZHU, L., ZHANG, Y., LI, J., BIAN, Z.,

LIANG, X., CAI, X., YIN, Y., WANG, C., ZHANG, T., ZHU, D., ZHANG,

D., XU, J., CHEN, Q., BA, Y., LIU, J., WANG, Q., CHEN, J., WANG, J.,

WANG, M., ZHANG, Q., ZHANG, J., ZEN, K. & ZHANG, C. Y. 2012a.

Exogenous plant MIR168a specifically targets mammalian LDLRAP1:

evidence of cross-kingdom regulation by microRNA. Cell Res, 22, 107-26.

ZHANG, N., WANG, X., HUO, Q., SUN, M., CAI, C., LIU, Z., HU, G. & YANG,

Q. 2013. MicroRNA-30a suppresses breast tumor growth and metastasis by

targeting metadherin. Oncogene.

ZHANG, P., MA, Y. L., WANG, F., YANG, J. J., LIU, Z. H., PENG, J. Y. & QIN,

H. L. 2012b. Comprehensive gene and microRNA expression profiling

reveals the crucial role of hsa-let-7i and its target genes in colorectal cancer

metastasis. Molecular Biology Reports, 39, 1471-1478.

ZHANG, Y., LIU, D., CHEN, X., LI, J., LI, L., BIAN, Z., SUN, F., LU, J., YIN, Y.,

CAI, X., SUN, Q., WANG, K., BA, Y., WANG, Q., WANG, D., YANG, J.,

LIU, P., XU, T., YAN, Q., ZHANG, J., ZEN, K. & ZHANG, C. Y. 2010.

Secreted monocytic miR-150 enhances targeted endothelial cell migration.

Molecular cell, 39, 133-44.

ZHOU, Q., LI, M., WANG, X., LI, Q., WANG, T., ZHU, Q., ZHOU, X., GAO, X.

& LI, X. 2012a. Immune-related MicroRNAs are Abundant in Breast Milk

Exosomes. Int J Biol Sci, 8, 118-23.

ZHOU, Q., LI, M. Z., WANG, X. Y., LI, Q. Z., WANG, T., ZHU, Q., ZHOU, X. C.,

WANG, X., GAO, X. L. & LI, X. W. 2012b. Immune-related MicroRNAs

are Abundant in Breast Milk Exosomes. International journal of biological

sciences, 8, 118-123.

ZHOU, X., DUAN, X. C., QIAN, J. J. & LI, F. 2009. Abundant conserved

microRNA target sites in the 5'-untranslated region and coding sequence.

Genetica, 137, 159-164.

ZHU, E., ZHAO, F., XU, G., HOU, H., ZHOU, L., LI, X., SUN, Z. & WU, J. 2010.

mirTools: microRNA profiling and discovery based on high-throughput

sequencing. Nucleic Acids Research, 38, W392-7.

ZHU, H., WU, H., LIU, X., LI, B., CHEN, Y., REN, X., LIU, C. G. & YANG, J. M.

2009. Regulation of autophagy by a beclin 1-targeted microRNA, miR-30a,

in cancer cells. Autophagy, 5, 816-23.

ZIMIN, A. V., DELCHER, A. L., FLOREA, L., KELLEY, D. R., SCHATZ, M. C.,

PUIU, D., HANRAHAN, F., PERTEA, G., VAN TASSELL, C. P.,

SONSTEGARD, T. S., MARCAIS, G., ROBERTS, M., SUBRAMANIAN,

P., YORKE, J. A. & SALZBERG, S. L. 2009. A whole-genome assembly of

the domestic cow, Bos taurus. Genome Biol, 10, R42.

ZOU, Z., WU, L., DING, H., WANG, Y., ZHANG, Y., CHEN, X., CHEN, X.,

ZHANG, C. Y., ZHANG, Q. & ZEN, K. 2012. MicroRNA-30a sensitizes

tumor cells to cis-platinum via suppressing beclin 1-mediated autophagy. J

Biol Chem, 287, 4148-56.