CentoMD®5.0_Handbook_V5_July2018
Handbook
Precautions/warnings:
For professional use only.
To support clinical diagnosis.
Contents
Introduction
Intended use
Facts and Features
Technologies used
Data acquisition and curation policy
    Database curators
    Data acquisition
    Curation workflow
    Quality status
Variant-related information
    Genetic variants
    Variant location
    Variant type on DNA level
    Coding effect
    Variant zygosity
    Allele frequency at CentoMD®
    Publication status
    Clinical significance according to CentoMD®
Information on disease and inheritance
Individual-related information on phenotype and demographics
Clinical statement of CENTOGENE AG
Vcf upload and annotation
    Input file
    Annotated File
Appendix
    Abbreviations used in CentoMD®
    Evidence-based annotation rules to determine the clinical statement
Glossary
Introduction
Diagnosing a patient with a rare disease is a complex task because not all existing genetic
variants have been described or precisely annotated. Medical professionals need to obtain
all available knowledge about the detected genetic variants in a patient in order to
establish the most accurate diagnosis possible.
CentoMD® is a holistic database that combines phenotype and genotype information
gathered from genetic tests conducted at CENTOGENE AG. This means that every variant
reported in CentoMD® is linked to at least one clinically described individual analyzed at
CENTOGENE AG through a standardized workflow with accredited quality. CentoMD® is
a growing database; newly generated data are imported quarterly.
This handbook describes the content of CentoMD®, how this content is generated, how
clinical significance classes are defined, and how quality standards are fulfilled. The
accompanying CentoMD® user guide provides a detailed description of how to use this
web-based database.
Intended use
CentoMD® is browser-based software that provides a genetic diagnosis by correlating
information from a comprehensive and unique repository of genetic, biochemical (where
available) and clinical information from consented and manually curated patient data sets
and probands of different geographical backgrounds. It will be available only to medically
trained professionals for the evaluation of the genetic variants that have been identified
in their own patients. This enhances the validity of the genetic analytical workflow and
provides the health care professional with a result interpretation and variant assertion
recommendation for evaluating treatment options for patients with rare hereditary diseases.
The software will not allow independent review of data generated by the health care
professional. The software is designed to provide a reliable recommendation of clinical
diagnosis.
Facts and Features
The CentoMD® database provides detailed information on variants detected in individuals who
were referred to genetic testing by their physicians in order to evaluate whether they are
affected by or are carriers of variants which cause rare hereditary diseases. This patient
cohort is a unique representation of the global population originating from more than 115
countries. The allele frequencies stated in CentoMD® reflect the frequency observed in this
particular worldwide cohort. For every analyzed individual, CentoMD® provides information
about the genotype-phenotype correlation based on tested clinical cases. Therefore, all
genetic variants are associated with epidemiological data and clinical information – such as
signs and symptoms of the disease – if described by the physician.
CentoMD® 5.0 contains almost 60,000 variants which are classified and curated (see Variant
quality status). In total, more than 3,400 phenotypes and more than 1.3 billion alleles have
been identified in ~180,000 analyzed individuals. The current release contains more than
13,200 HPO (Human Phenotype Ontology) terms and almost 75,000 individuals are linked
with HPO term(s).
CentoMD® 5.0 provides the following key features:
o Advanced Genotype to Phenotype module: Based on approved gene symbols given
by the users, CentoMD® provides detailed data on corresponding genetic variants and
the associated epidemiological data and clinical information following HPO
nomenclature.
o Advanced Phenotype to Genotype module: Based on HPO terms given by the users,
CentoMD® provides hints on candidate genes and related variants underlying the
phenotype of interest.
o Annotation of genetic variants contained in a single sample vcf file with CentoMD®
clinical significance
o 58% of CentoMD® classified and curated clinically relevant (CRV) and uncertain (VUS)
variants are unpublished in the literature: Users can access data of CRV/VUS which
have not been previously described in literature.
o The annotation and classification of genetic variants is strictly curated by medical
professionals: Users have access to high quality data.
o Explanation of variant classification, statistics and detailed individual-related data
are available: data can be retrieved at 4 different levels: variant rationale, positive
individuals, statistics, and individual view.
Rationale: Summary supporting the clinical significance according to the ACMG
guidelines and internal evidence. In the current release, more than 75,000
variants are linked with rationales.
Positive individuals: Detailed information on individuals tested positive for the
variant of interest.
Statistics: Statistical analyses of individuals tested positive for the variant of
interest.
Individual view: Information on individuals tested positive for the variant of
interest as well as classified and curated CRV/VUS variants associated with each
individual.
o Co-occurrences are indicated: Users can view the association of the variant of
interest with other classified and curated CRV/VUS variants in the same gene or
other genes.
o Interactive search interface: Users are given the flexibility to perform searching,
sorting, filtering and access specific data contents by simple clicks.
o Data export functions: Users can export data for activated variants into a read-only
Excel file.
o Users are notified via e-mail when activated variants are re-classified: Users get the
latest information on the clinical significance class of variants of interest.
Technologies used
The following validated technologies are used at CENTOGENE AG to detect changes at the
genetic level and to identify the cause of the disease:
o Sanger: Classical method of DNA sequencing, developed by Fred Sanger, using
chemically altered "dideoxy" bases to terminate newly synthesized DNA fragments
at specific bases (either A, C, T or G). These fragments are then size-separated, and
the DNA sequence can be read.
o NGS: Next-Generation Sequencing: High-throughput sequencing technology,
allowing the parallel sequencing of multiple genes, producing thousands or millions
of sequences concurrently.
o qPCR: Quantitative Polymerase Chain Reaction: Method to amplify and
simultaneously quantify a targeted DNA molecule. Used especially for detecting
large/gross and gene rearrangements.
o MLPA: Multiplex Ligation-dependent Probe Amplification: Variation of the multiplex
PCR that permits multiple targets to be amplified with only a single primer pair.
Used especially for detecting large/gross and gene rearrangements.
o CES: Clinical Exome Sequencing: Application of next-generation sequencing technology to
determine the variations of coding regions of genes which have been associated with
human disease.
o WES: Whole Exome Sequencing: Brute-force approach that involves modern-day
sequencing technology and DNA sequence assembly tools to piece together all coding
portions of the genome. The sequence is then compared to a reference genome and
any differences are noted.
o WGS: Whole Genome Sequencing: Modern-day technology for sequencing the
entire genome, including coding and non-coding regions.
o Other method: Used when another methodology (for example fragment length
analysis) has been employed to detect the variants.
Interpretations of the enzymatic activities and biomarker levels are provided, when
available, as supporting evidence for the relevance of the detected genetic change. For
example, for Fabry disease, which is an X-linked rare genetic lysosomal storage disease,
measurements of enzymatic activities are conducted in males, and measurements of the
biomarker levels are conducted in both males and females.
The terms used to describe the results of biochemical analyses are explained below:
o Biochemical analysis: Method to analyze enzymatic activity or levels of biomarkers
in samples obtained from patients usually suspected of being affected by a metabolic
disorder. This is a test performed via Tandem Mass Spectrometry to detect,
diagnose, and monitor diseases, disease processes, and susceptibility, and to
determine a course of treatment.
o Biomarker interpretation: Evaluation of the biomarker levels compared to the
reference interval
Normal: Biomarker levels are within the normal range (no change).
Pathological: Biomarker levels are significantly increased compared to the
normal range.
Slightly increased: Biomarker levels are only slightly increased compared to
the normal range.
o Enzyme interpretation: Evaluation of the enzyme activity compared to the reference
interval
Normal: Levels of activity are within the normal range (no change).
Pathological: Levels of activity are significantly decreased compared to the
normal range.
Slightly decreased: Levels of activity are only slightly decreased compared to
the normal range.
Data acquisition and curation policy
Curation is the process of collection, association, update and review of genetic and
phenotypic data of patients genetically analyzed at CENTOGENE AG into a structured and
standardized format. It utilizes a combination of computer-based tools and manual review
in order to assure the accuracy, efficiency and quality of the curation process.
Database curators
CentoMD® curators are biologists with a strong background in human genetics. They
continuously undergo extensive training to ensure curation consistency and
standardization. They confirm that CentoMD® is error-free (items properly associated and
interpreted, no inconsistencies, and/or discrepancies against detected observations in-
house and external sources), and close the curation process by manual approval that
reviewed and curated data agree with standard procedures established in-house.
Data acquisition
Data gathering and variant curation are procedures developed and implemented in
web-based software that is compliant with the HGNC, HGVS and HPO nomenclatures, allowing
collection of variants detected in nuclear coding, nuclear non-coding and mitochondrial
genes. The software integrates in-house sample management systems and analysis
platforms with external databases providing the curator with a comprehensive and
straightforward overview of the evidence regarding genotype-phenotype correlation
available in-house as well as external information.
The data is gathered by a combination of manual submission and data import following an
individual-oriented model where characteristics belonging to a particular individual
(patient information, clinical data, methodology and detected genetic variants) are stored
and associated together.
Curation workflow
To provide high-quality data, the curation process at CENTOGENE AG is divided into three phases:
variant-wise, individual-wise and warnings-wise procedures.
Curation by variant: To begin the curation process, the variant-linked information is
reviewed. This includes approval of variant nomenclature, terminology, accuracy,
consistency and record completeness.
Curation by individual: In order to start curation by individual, all variants detected in this
individual must be approved. It aims at assuring that the entries belonging to an individual
follow the rules for clinical statement closely, and that all associated data is in agreement
with the agreed guidelines. The following factors are considered critical for the clinical
statement: variant clinical significance, patient genotype (number of clinically relevant
changes, their zygosity and location, i.e. cis vs. trans), inheritance pattern of the disorder,
the sex of the patient (for X-linked diseases), the phenotypic description and, if available,
levels of biomarkers.
Curation by warning: The database generates warnings at different levels (variant,
individual, gene, database levels) to detect errors, invalid terms and nomenclatures,
inconsistencies, and can provide hints where updates and reviews are necessary. These
warnings are mostly due to additional evidence obtained internally (medical reports issued
at CENTOGENE AG) or detected externally (e.g. additional articles, publications and
external databases). Each warning is manually resolved.
Quarterly, all approved individuals are anonymized and then released to CentoMD®,
offering the most complete and up-to-date information possible to its users.
CentoMD® is a constantly growing and enriched database. Whenever additional evidence
provided by the medical professionals in-house or by peer-reviewed literature becomes
available, the variants are revised and re-classified accordingly. A detailed overview of the
clinical significance classes captured in CentoMD® is provided in the chapters “Variant-
related information” and “Clinical significance according to CentoMD®”.
Quality status
CentoMD® offers a dataset of variants derived from various technologies of genetic testing
and processed through a standardized workflow which follows international standards and
ensures high data quality. In CentoMD®, different types of variant and individual quality
status are indicated.
There are three types of variant quality status:
o Classified and curated (++): A variant has been assigned to a clinical significance
class based on the confirmed genotype-phenotype associations and curated by
strictly following the ACMG guidelines and internal expertise.
o Classified (+): A variant has been assigned to a clinical significance class according
to ACMG guidelines but has not yet been curated in the context of genotype-
phenotype association.
o Unclassified (0): A variant has not yet been assigned to any clinical significance class
due to the lack of genotype-phenotype associations. Further evaluation is required,
once additional information is available.
There are two types of individual quality status:
o G2P individuals (++): An individual with confirmed genotype to phenotype
association for the gene in question and linked with at least one classified and
curated variant (++) during manual curation process.
o Non-G2P individuals (+): An individual with not yet confirmed genotype to phenotype
association (unresolved individual) for the gene in question during manual curation
process. This individual could not be linked to any classified and curated variant (++)
in the respective gene. Non-G2P individuals are periodically reviewed against the
most up-to-date knowledge in the context of genotype-phenotype correlations.
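The quality status symbols above can be sketched as a small data model. This is an illustrative sketch only: the names `VariantQuality`, `IndividualQuality` and `individual_quality` are hypothetical and are not part of CentoMD®. It encodes only what the definitions above state for one gene: an individual is G2P when the genotype to phenotype association is confirmed and at least one classified and curated variant (++) is linked.

```python
from enum import Enum


class VariantQuality(Enum):
    # Symbols as listed in the handbook.
    CLASSIFIED_AND_CURATED = "++"
    CLASSIFIED = "+"
    UNCLASSIFIED = "0"


class IndividualQuality(Enum):
    G2P = "++"
    NON_G2P = "+"


def individual_quality(variant_statuses, g2p_confirmed):
    """Return the individual quality status for one gene (hypothetical helper).

    variant_statuses: VariantQuality values of the variants linked to the
    individual in the gene in question.
    g2p_confirmed: True if the genotype to phenotype association was
    confirmed during manual curation.
    """
    has_curated = VariantQuality.CLASSIFIED_AND_CURATED in variant_statuses
    if g2p_confirmed and has_curated:
        return IndividualQuality.G2P
    return IndividualQuality.NON_G2P
```

For example, an individual whose only variant in the gene is classified (+) but not yet curated remains a non-G2P individual under this sketch.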
Variant-related information
Genetic variants
CentoMD® includes germline and de novo genetic variants detected in all types of genes. A
collection of variants detected in nuclear coding, nuclear non-coding and mitochondrial
genes is available. The HGNC-approved gene symbols are used.
A gene is defined by a sequence of DNA that represents a basic unit of heredity, being
expressed in RNA and proteins.
o Nuclear coding: A gene located in the cell nucleus of a eukaryote that encodes a
protein.
o Nuclear non-coding: A gene located in the cell nucleus that does not encode a
protein product.
o Mitochondrial: A gene located in the mitochondria.
In CentoMD®, each gene is linked with a transcript or reference sequence, i.e. a digital
nucleic acid sequence, assembled by scientists as a representative example of a species'
set of genes. All variant-type annotations provide mapping to genomic coordinates (genome
build hg19). Coding DNA reference sequence refers to a cDNA-derived sequence containing
the full length of all coding regions and non-coding untranslated regions.
According to the reference sequence used, the genetic variants are linked with the
corresponding location within the gene, with a particular mutation type on three different
levels: genomic/mitochondrial, cDNA, and protein, closely following the HGVS guidelines
and recommendations, for both small and gross gene rearrangements.
o Genomic DNA change: Change at gDNA level following numbering based on genomic
DNA reference sequence.
o Coding DNA change: Change at cDNA level following numbering based on coding DNA
reference sequences.
o Protein change: Change at protein level following numbering based on the amino
acid sequence, using the one-letter amino acid code and X to designate a translation
termination codon.
Variant location
Variant location refers to the location of the DNA change relative to the transcription
initiation site, initiation codon, polyadenylation site, or termination codon of the
corresponding gene.
o Upstream: The region located 5' (upstream) from the 5'UTR region of the gene.
o 5'UTR (5'-Untranslated Region): Sequences on the 5' end of messenger RNA (mRNA)
but not translated into protein. It extends from the transcription start site to just
before the ATG translation initiation codon. 5' UTR may contain sequences that
regulate translation efficiency or mRNA stability.
o Exon: The protein-coding DNA sequence of the gene.
o Intron: The non-coding region of a gene that interrupts the protein-coding regions
(exons).
o 3'UTR (3'-Untranslated Region): Particular section of mRNA that starts with the
nucleotide immediately following the stop codon of the coding region. This region
contains transcription and translation regulating sequences.
o Downstream: The region located 3' (downstream) from the polyadenylation signal of
the gene.
For large deletions/duplications and gene rearrangements, the location is indicated by the
first and the last exon affected by the change (for example, e1_e9 stands for a large
deletion/duplication affecting exon 1 to exon 9). If, for example, only one exon is linked
with a large deletion, this indicates that particular exon is completely removed (see
mutation types below).
Please note that for mitochondrial genes, only the location exon 1 is valid.
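The exon range notation for large deletions/duplications (for example, e1_e9) can be read programmatically. The following is a minimal sketch, not CentoMD® code: the function name `parse_exon_range` is hypothetical, and the single-exon form `e3` is an assumption, since the handbook only shows the range form.

```python
import re

# "e1_e9" is the form shown in the handbook; a bare "e3" for a single
# affected exon is an assumption made for this sketch.
_EXON_RANGE = re.compile(r"e(\d+)(?:_e(\d+))?")


def parse_exon_range(location):
    """Return (first_exon, last_exon) for a location such as 'e1_e9'."""
    match = _EXON_RANGE.fullmatch(location)
    if match is None:
        raise ValueError(f"not an exon range: {location!r}")
    first = int(match.group(1))
    last = int(match.group(2)) if match.group(2) else first
    if last < first:
        raise ValueError(f"last exon precedes first exon: {location!r}")
    return first, last
```

With this sketch, `parse_exon_range("e1_e9")` yields `(1, 9)`, i.e. a change affecting exon 1 to exon 9.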
Variant type on DNA level
The variant type describes the different types of changes that can occur in the DNA
sequence. The following types are included in CentoMD®:
o Chromosomal deletion: Loss of part of chromosome.
o Complex rearrangement: A change involving the structure or number of chromosomes,
also referred to as a chromosome mutation or chromosomal rearrangement.
o Conversion: Non-reciprocal transfer of information between homologous
sequences; one DNA sequence replaces a homologous sequence such that the
sequences become identical after the conversion event.
o Deletion: A sequence change where one or more nucleotides are removed (deleted).
o Duplication: A sequence change where a copy of one or more nucleotides is
inserted directly 3' of the original copy.
o Gain of methylation: Gain of the normal DNA methylation level.
o Gene & regulatory region(s) deletion: Refers to loss of the entire gene and flanking
regions.
o Gene & regulatory region(s) duplication: Refers to the gain of the entire gene and
flanking regions.
o Gene deletion: Refers to loss of the entire gene.
o Gene duplication: Refers to gain/duplication of the entire gene.
o Gross deletion: Refers to loss of part(s) of a gene.
o Gross duplication: Refers to gain/duplication of part(s) of a gene.
o Gross inversion: Refers to 180-degree inversion of part(s) of a gene.
o Insertion/Deletion (Indel): Refers to a sequence change that includes a combination
of both insertions and deletions.
o Insertion: A sequence change where one or more nucleotides are added (inserted)
into a DNA sequence, or it may involve portions of a chromosome.
o Inversion: Chromosomal abnormality where a segment of a chromosome is rotated
180 degrees and reinserted.
o Loss of methylation: Loss of the normal DNA methylation level.
o Pathological allele (D4Z4 motif): Deletion of 3.3-kb repeats from a chromosomal
tandem repeat called D4Z4 located near the end of chromosome 4 at the 4q35-ter
location. D4Z4 contains an ORF encoding a putative homeobox protein called DUX4,
a large polymorphic repeat structure consisting of 1–100 KpnI units.
o Repeat expansion: Refers to an increased number of copies of a tandemly repeated
genomic DNA sequence.
o Retrotransposon insertion: Retrotransposons (also called transposons via RNA
intermediates) are genetic elements that can amplify themselves in a genome, and
can induce mutations by inserting near or within genes. Retrotransposon-induced
mutations are relatively stable, because the sequence at the insertion site is
retained as they transpose via the replication mechanism.
o Substitution: A sequence change where one nucleotide is replaced by one other
nucleotide. Substitutions are described using a ">" character (indicating "changes
to").
o Other/complex: Refers to all other types not included in any already mentioned
category above.
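As an illustration of the ">" notation used for substitutions, the sketch below parses a simple description such as c.76A>T. It is an assumption-laden example, not a full HGVS parser: the names `Substitution` and `parse_substitution` are hypothetical, and only plain numeric positions with the prefixes c. (coding DNA), g. (genomic) and m. (mitochondrial) are handled. Intronic offsets and UTR positions are deliberately out of scope.

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    level: str      # "c", "g" or "m"
    position: int   # position in the reference sequence
    ref: str        # reference nucleotide
    alt: str        # nucleotide it "changes to"


# Only simple numeric positions are supported in this sketch.
_SUBSTITUTION = re.compile(r"([cgm])\.(\d+)([ACGT])>([ACGT])")


def parse_substitution(description):
    """Parse a simple substitution such as 'c.76A>T'."""
    match = _SUBSTITUTION.fullmatch(description)
    if match is None:
        raise ValueError(f"not a simple substitution: {description!r}")
    level, position, ref, alt = match.groups()
    if ref == alt:
        # A substitution replaces one nucleotide by one *other* nucleotide.
        raise ValueError(f"reference and changed base are identical: {description!r}")
    return Substitution(level, int(position), ref, alt)
```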
Coding effect
The coding effect describes the sequence changes at protein level. The following types are
distinguished:
o Effect unknown: The coding effect on protein level has not been analyzed. An effect
is expected but difficult to predict.
o Frameshift: A sequence change caused by deletion/insertion of nucleotides affecting
an amino acid between the first (initiation, ATG) and last codon (termination, stop),
replacing the normal C-terminal sequence with one encoded by another reading
frame.
o Increased polyglutamine tract/expanded polyQ: Portion of a protein consisting of a
sequence of several glutamine (Gln; Q) units.
o In-frame: A sequence change that does not cause a shift in the triplet reading frame.
As a result, one or more amino acids are added, deleted or replaced by one or more
other amino acids.
o Missense: A single nucleotide change that results in a codon that codes for a different
amino acid. Not all missense mutations are deleterious; some changes can have no
effect. Because of the ambiguity of missense mutations, it is often difficult to
interpret the consequences of these mutations in causing disease.
o New translation initiation site: A change affecting the translation initiation codon
(Met-1) introducing a new upstream initiation codon extending the N-terminus of the
encoded protein.
o Non-coding: The change on DNA level that has no effect on protein, or the effect of
regulatory mutations is unknown.
o Nonsense: A sequence change that results in a premature stop codon, and in a
truncated, incomplete protein product.
o Silent: A sequence change that does not result in a change of amino acid and
functional change of the protein product.
o Splicing mutation: A sequence change that affects the splicing process (i.e. intron
removal and exons joining). Splice-site mutations occur within genes in the
noncoding regions (introns) just next to the coding regions (exons). Splice-site
mutations can eliminate an existing donor or acceptor site, which will cause an exon
to be skipped and possibly result in a frameshift.
o Start loss: A sequence change in the ATG start codon that prevents the original start
translation site from being used. This kind of mutation may eliminate gene function.
o New translation termination codon: A sequence change that affects the translation
termination codon (Ter/*) introducing a new downstream termination codon,
extending the C-terminus of the encoded protein.
Variant zygosity
Zygosity indicates if a variant is detected on one chromosome or on both chromosomes and
therefore describes the degree of similarity of the alleles for a trait in an organism.
The following zygosities are included in CentoMD®:
o Heterozygous (Het): Gene locus when cells contain two different alleles of a gene.
o Homozygous (Hom): Gene locus when identical alleles of the gene are present on
both homologous chromosomes.
o Hemizygous (Hem): Used for alleles detected in genes located on X-chromosome for
male cases.
For the mitochondrial variants, the zygosity must be read as the degree of heteroplasmy,
i.e. as a mixture of more than one type of mitochondrial DNA (mDNA) within a
cell/individual. In those cases, where a variant in mDNA is responsible for a disease, the
larger the proportion of mutant mitochondria, the more likely the person will show
symptoms of the disease.
Two degrees of heteroplasmy are included:
o Heteroplasmic: The cell has some mitochondria that have a mutation in the mDNA
and some that do not.
o Homoplasmic: The cell has a uniform collection of mDNA: either completely normal
mDNA or completely mutant mDNA.
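The two degrees of heteroplasmy can be expressed as a simple rule on the proportion of mutant mDNA. The sketch below follows the definitions above; the function name `heteroplasmy_status` and the use of a fraction between 0 and 1 as input are assumptions made for illustration.

```python
def heteroplasmy_status(mutant_fraction):
    """Classify a mutant mDNA fraction (0.0 to 1.0) per the definitions above.

    Homoplasmic: uniform mDNA, either completely normal (0.0) or
    completely mutant (1.0). Heteroplasmic: any mixture in between.
    """
    if not 0.0 <= mutant_fraction <= 1.0:
        raise ValueError("mutant_fraction must be between 0.0 and 1.0")
    if mutant_fraction in (0.0, 1.0):
        return "homoplasmic"
    return "heteroplasmic"
```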
Allele frequency at CentoMD®
This term indicates the number of observations of the allele of interest at a particular locus
in the CENTOGENE AG-unique population, expressed as a decimal.
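A decimal allele frequency of this kind can be sketched as a ratio. The handbook does not state the denominator, so the example assumes it is the total number of alleles analyzed at the locus in the cohort; the function name `allele_frequency` is hypothetical.

```python
def allele_frequency(observed_alleles, total_alleles):
    """Return observed/total as a decimal (denominator is an assumption).

    observed_alleles: number of observations of the allele of interest.
    total_alleles: number of alleles analyzed at that locus in the cohort.
    """
    if total_alleles <= 0:
        raise ValueError("total_alleles must be positive")
    if not 0 <= observed_alleles <= total_alleles:
        raise ValueError("observed_alleles must be between 0 and total_alleles")
    return observed_alleles / total_alleles
```

Under this assumption, 3 observations among 1,000 analyzed alleles give a frequency of 0.003.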
Publication status
The publication status indicates whether the identified variant has previously been published in
the literature. For published variants, the PubMed identifier (PMID) is indicated.
Additionally, the Single Nucleotide Polymorphism Database (dbSNP) ID is provided, if
available. The dbSNP is an archive of genetic variations within and across different species
developed and hosted by the National Center for Biotechnology Information (NCBI) in
collaboration with the National Human Genome Research Institute (NHGRI) and available
to the public.
Clinical significance according to CentoMD®
In CentoMD®, based on the likelihood to predispose to or to cause the observed
phenotype/disease, the detected genetic variants are classified into one of the three
groups: clinically relevant variants (CRV), clinically irrelevant variants (CIV) and variants of
unknown significance (VUS)/uncertain variants/predicted uncertain variants. The CRVs
include the following classes: pathogenic, likely pathogenic, risk factor, modifier and
premutation. The CIVs involve neutral, likely neutral, disease-associated polymorphisms,
Centogene (likely) neutral - published as (likely) pathogenic and mutable normal
(intermediate) (see Figure 1).
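The grouping of clinical significance classes described above can be written as a lookup table. This is a sketch for orientation only: the dictionary and function names are hypothetical, the class labels are lower-cased for matching, and "uncertain" is used as the single label for the VUS group.

```python
# Group -> classes, as listed in the paragraph above.
CLINICAL_SIGNIFICANCE_GROUPS = {
    "CRV": (
        "pathogenic",
        "likely pathogenic",
        "risk factor",
        "modifier",
        "premutation",
    ),
    "CIV": (
        "neutral",
        "likely neutral",
        "disease-associated polymorphism",
        "centogene (likely) neutral - published as (likely) pathogenic",
        "mutable normal (intermediate)",
    ),
    "VUS": ("uncertain",),
}


def significance_group(significance_class):
    """Return 'CRV', 'CIV' or 'VUS' for a clinical significance class."""
    wanted = significance_class.strip().lower()
    for group, classes in CLINICAL_SIGNIFICANCE_GROUPS.items():
        if wanted in classes:
            return group
    raise KeyError(f"unknown clinical significance class: {significance_class!r}")
```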
For classified and curated variants (++), the classification of genetic germline variants is
done according to the ACMG guidelines, which define five classes: pathogenic, likely
pathogenic, uncertain significance, likely neutral and neutral (class 1-5; Richards et al.
(2015), Genet. Med., doi:10.1038/gim.2015.30). In addition to the 5 classes specified by
ACMG, CentoMD® also annotates variants as risk factors, modifiers, premutations, disease-
associated polymorphisms, mutable normal (intermediate) and the CentoMD-specific
clinical significance class Centogene (likely) neutral-published as (likely) pathogenic.
Additionally, some modifications of the ACMG guidelines are applied. These modifications
arise from our continuously growing internal expertise in the field of molecular diagnostics
and are represented mainly by new evidence regarding internally observed frequencies,
segregation data, genotype-phenotype correlation, co-occurrence, enzymatic and
biomarker levels. The adjustments to the ACMG recommendations are specified below.
Figure 1: Classification of genetic variants in CentoMD®. The classification rules determining the clinical significance of a genetic variant are provided in the text. CG: CENTOGENE
[Figure 1 content: All variants are divided into clinically relevant variants (CRV: pathogenic, likely pathogenic, risk factor, modifier, premutation), variants of unknown significance (VUS), and clinically irrelevant variants (CIV: CG (likely) neutral - published as (likely) pathogenic, disease-associated polymorphism, likely neutral, neutral, mutable normal (intermediate)).]
Classification as pathogenic is additionally assigned to:
o Loss of function (LOF) variants which are associated with pathologically decreased
biochemical levels/activities.
o Non-LOF variants which are associated with pathologically decreased biochemical
levels/activities and where sufficient clinical information of the associated
individual clearly supports the presence of the metabolic disease.
Classification as likely pathogenic is additionally assigned to:
o LOF variants detected in the genes related to metabolic disorders with no
biochemical evidence.
o Non-LOF variants found in an individual for whom pathological biochemical data is
supporting but insufficient clinical information was provided to confirm the presence
of the disease.
Risk factors and modifiers are classified based on the distinct manner in which they influence the
presence or the severity of the disease. To be included in the sub-class of risk factors,
a variant should be reported as altering the risk for a disease by influencing the function of other
proteins. A modifier is a variant that operates through influencing gene expression and
affects the severity of the phenotype but alone is not sufficient to cause the disease.
Premutation is a repeat expansion variant in a range that may not result in the clinical
manifestation of the associated disease in the carrying individual, but that may result in
the manifestation of the disease in the offspring due to potential repeat instability.
A variant is classified as uncertain, when available information is not sufficient to state
pathogenicity. For example, in case of metabolic disorders, novel variants, which are non-
LOF and additionally are associated with inconclusive biochemical data, are annotated as
uncertain.
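The biochemical adjustments for metabolic disorders described above can be read as a small decision function. The following is a minimal sketch under stated assumptions, not the CentoMD® implementation: the function name, the parameter names and the three biochemistry states ("pathological", "inconclusive", or None for no data) are hypothetical labels chosen for illustration. Cases the text does not cover return None.

```python
from typing import Optional


def metabolic_adjustment(
    is_lof: bool,
    biochemistry: Optional[str],
    clinical_info_sufficient: bool,
    is_novel: bool = True,
) -> Optional[str]:
    """Return the class suggested by the biochemical adjustments, or None.

    biochemistry is a hypothetical label: "pathological", "inconclusive",
    or None when no biochemical data are available.
    """
    if is_lof:
        if biochemistry == "pathological":
            return "pathogenic"
        if biochemistry is None:
            # LOF variant in a metabolic-disorder gene, no biochemical evidence
            return "likely pathogenic"
        return None  # combination not described in the text
    if biochemistry == "pathological":
        # Non-LOF: the clinical information decides between the two classes
        return "pathogenic" if clinical_info_sufficient else "likely pathogenic"
    if biochemistry == "inconclusive" and is_novel:
        return "uncertain"
    return None  # combination not described in the text


print(metabolic_adjustment(is_lof=False, biochemistry="pathological",
                           clinical_info_sufficient=False))
```

The function only covers the additional rules for metabolic disorders; the underlying ACMG-based classification is out of scope for this sketch.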
Variants are classified as neutral or likely neutral based on evidence such as high frequency in
the population(s), no observed impact on disease presence/severity/susceptibility,
non-segregation with the disease, and/or co-occurrence with other deleterious variants.
A mutable normal (intermediate) variant is meiotically unstable and not convincingly
associated with an abnormal phenotype. Because of the instability of alleles in the mutable
normal range, an asymptomatic individual with a mutable normal allele may be predisposed
to having a child with an expanded allele.
The class of disease-associated polymorphism includes variants related to complex,
multigenic disorders with no clear Mendelian inheritance. In order to be classified as
disease-associated polymorphism, variants must have a maximum frequency of 5% in public
databases and the association should be replicated by at least 2 independent studies or in
1 study with functional evidence.
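The two conditions for a disease-associated polymorphism can be written as a simple check. This is an illustrative sketch only: the function and parameter names are hypothetical, and "maximum frequency of 5%" is read here as "frequency not exceeding 5% in any public database", which is an assumption about the intended meaning.

```python
def qualifies_as_disease_associated_polymorphism(
    max_public_frequency: float,
    independent_studies: int,
    has_functional_evidence: bool,
) -> bool:
    """Check the frequency and replication conditions described in the text."""
    # Assumption: the highest frequency seen in public databases must be <= 5%
    if max_public_frequency > 0.05:
        return False
    # Replicated by at least 2 independent studies ...
    if independent_studies >= 2:
        return True
    # ... or reported in 1 study that also provides functional evidence
    return independent_studies >= 1 and has_functional_evidence


print(qualifies_as_disease_associated_polymorphism(0.03, 1, True))
```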
When the internal evidence regarding the clinical significance of a variant is inconsistent
compared to other external sources, the class “CENTOGENE (likely) neutral - published as
(likely) pathogenic” is used in order to emphasize the importance of this observation. This
class of clinical significance is applied only to genetic variants of high penetrance. The variants
associated with this clinical significance class are reclassified based on internal evidence
only. When a variant is reclassified based on externally available information, the correct
clinical class is neutral/likely neutral, following strictly the corresponding definitions (see
schematic representation below, Figure 2).
Internal evidence refers to at least one of the following criteria:
o Is the DNA change found at CENTOGENE at a frequency above the reported incidence
of its associated disease?
o Is the DNA change identified in healthy/asymptomatic individuals?
o Does the DNA change fail to segregate with the disease in our identified families, or
among independent individuals?
o Does the DNA change co-occur with deleterious variants (in the same gene or other
genes) in screened individuals?
o Does any other evidence (such as enzymatic activities or biomarker levels) support a
likely neutral classification?
Figure 2: Schematic representation of reclassification of pathogenic variants
For the classification of (+) genetic germline variants (not curated, automatically classified), five
ACMG criteria are applied: BA1 (stand-alone), BP6, PVS1, PM2, and PP5 (for more information please see
https://www.acmg.net/docs/standards_guidelines_for_the_interpretation_of_sequence_variants.pdf).
Variants with an allele frequency higher than 5% in any of the
following databases are automatically classified as predicted neutral: gnomAD, ExAC, ESP,
1000 Genomes or CentoMD.
Rare variants to which PVS1 applies (null variants) and/or which reputable sources have
already reported as pathogenic are automatically classified as predicted pathogenic.
Rare variants for which conflicting criteria are identified, or for which not enough other evidence is available,
are automatically classified as predicted uncertain.
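The automatic classification of (+) variants can be sketched as a three-way decision. This is a simplified illustration, not the production logic: the function name, the dictionary keys and the boolean flags are hypothetical, and treating "PVS1 or PP5 together with BP6" as the conflicting case is an assumption made for this example.

```python
DATABASES = ("gnomAD", "ExAC", "ESP", "1000 Genomes", "CentoMD")
COMMON_THRESHOLD = 0.05  # allele frequency above 5% in any database


def predict_class(allele_frequencies: dict, pvs1: bool, pp5: bool,
                  bp6: bool = False) -> str:
    """Return the predicted class for an automatically classified variant.

    allele_frequencies maps a database name to the observed allele
    frequency; databases without an entry are treated as frequency 0.
    pvs1: null variant; pp5: reputable source reports pathogenic;
    bp6: reputable source reports benign.
    """
    if any(allele_frequencies.get(db, 0.0) > COMMON_THRESHOLD for db in DATABASES):
        return "predicted neutral"
    # Rare variant with pathogenic support and no conflicting benign report
    if (pvs1 or pp5) and not bp6:
        return "predicted pathogenic"
    # Conflicting criteria or not enough evidence
    return "predicted uncertain"


print(predict_class({"gnomAD": 0.0001}, pvs1=True, pp5=False))
```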
Variant re-evaluation and re-classification is a key feature of CentoMD® and is performed
regularly in the light of the literature, publicly available clinical databases and, most importantly,
CENTOGENE AG’s own continuously growing and improving proprietary
information.
Information on disease and inheritance
Every genetic disorder which has been suggested or suspected by the physician is described
according to the Online Mendelian Inheritance in Man® (OMIM®) catalog. OMIM® was
developed for the world-wide-web by NCBI and contains a list of human genes and genetic
diseases with links to other relevant resources (http://www.ncbi.nlm.nih.gov/omim).
Every entry in OMIM® has a unique identifier, which is also captured in CentoMD®.
Each genetic disorder is linked with the observed mode of inheritance (MOI). MOI is defined
by the manner in which a particular genetic trait or disorder is passed from one generation
to the next. The following MOIs are included in CentoMD®:
o Autosomal dominant (AD): The pattern of inheritance in which an affected individual
has one copy of a mutant gene and one copy of normal gene on a pair of autosomal
chromosomes.
o Autosomal recessive (AR): The pattern of inheritance in which both copies of an
autosomal gene must be abnormal for a genetic condition or disease to occur.
o Digenic (Di): The pattern of inheritance that is similar to recessive inheritance,
except that the trait only develops when mutations are found in one copy of each of
the two independent genes simultaneously.
o Imprinting/Epigenetic (Imp/Epi): The pattern of inheritance by mechanisms that do not
directly involve the nucleotide sequence, such as paramutation and parental imprinting.
o Mitochondrial (Mito): The pattern of inheritance of a trait encoded in the
mitochondrial genome.
o Multifactorial (MF): The pattern of inheritance caused by the interplay between
genetic factors and environmental factors.
o Pseudoautosomal dominant (P-AD): The inheritance pattern seen with genes in the
pseudoautosomal region of the X and Y chromosome that can exchange regularly
between the two sex chromosomes. Alleles for genes in the pseudoautosomal region
can show male-to-male transmission, and therefore mimic autosomal inheritance,
because they can cross over from the X to the Y chromosome during male
gametogenesis and be passed on from a father to his male offspring.
o X-linked (X): The mode of inheritance of a trait encoded in the X chromosome.
o Y-linked (Y): The pattern of inheritance that may result from a mutant gene located
on the Y chromosome. By definition, only males are affected.
o Unknown (?): This mode of inheritance is selected for genes not yet associated with
any pathological condition or disease, therefore no pattern of inheritance has been
observed.
Individual-related information on phenotype and demographics
All patient data in CentoMD® are fully anonymized. The following epidemiological and
clinical data are reported for individuals associated with classified and curated CRV and/or
VUS in CentoMD®:
o Random patient ID: Unique identifier assigned to each consented individual in
CentoMD®.
o Finding: Indicates if a variant is related to the indication for testing. Primary findings
are variants related to the indication for testing. Secondary (incidental) findings are
derived from whole exome sequencing (WES) and are pathogenic or likely pathogenic
variants identified in 59 genes recommended by ACMG for reporting of secondary
findings in clinical exome and genome sequencing (Genetics in Medicine, 2017).
Secondary findings are unrelated to the indication for testing.
o OMIM® disease: A number/identifier given by OMIM® to phenotype/disease. For
example, OMIM® disease 230800 stands for Gaucher disease, type I.
o MOI: Mode of inheritance, i.e. the manner in which a particular genetic trait or disorder
is passed from one generation to the next.
o Anonymized random family number (ARFN): Unique family number used to keep all
family members together when relationship links are provided.
o Pedigree: Indicates the connection/relation among individuals by blood, marriage, or
adoption in relation to the index patient. Based on the ARFN and the relationships within
one family, it is possible to reconstruct the family trees accordingly. In each family, the
index patient is indicated. The index patient represents the affected individual through
whom the family with a genetic disorder is first diagnosed.
o Sex: Indicates whether the individual is male, female, intersex, or of
unknown sex (when no information was provided or a prenatal case was analyzed).
o Age: Age at diagnosis. It is calculated as date of sample entry at CENTOGENE AG minus
date of birth, and is expressed in years. For patients referred to CENTOGENE AG several
times, the date of the first order entry is used by default to calculate the age at
diagnosis.
o Country: Country of sample origin. It indicates the area of the world the patient is
coming from. By default, this is the country from which the sample has
been sent to CENTOGENE AG. If the physician provides information about the ethnicity of
the patient (e.g. Canadian citizen of German origin), then the country of origin (in this example Germany) is
selected instead.
o Region: Continental region the sample is coming from.
o Clinical information (HPO terms): Description of features and characteristics that the
corresponding physician has provided as supporting evidence of the presence of a
particular disease translated into the vocabulary defined by the HPO
(http://www.human-phenotype-ontology.org/).
Sometimes it is not possible to describe the clinical picture accurately, because the
details are not given by the physician or only general assumptions have been made.
Such cases are documented in CentoMD® in the following manner:
   - No info/unknown: when no clinical information has been provided;
   - Healthy/asymptomatic: when the physician has explicitly indicated that the
     person is healthy, asymptomatic, or not affected;
   - Suspected/affected: when only very general statements are provided by the
     physician (e.g. “patient suffering from Fabry disease” or “clinical features of
     Marfan”).
o Variant zygosity: Indication if the variant is detected on one chromosome or on both
chromosomes.
o Total number of variants: Total number of detected variants for this case (clinically
relevant; clinically irrelevant) on this particular gene. For example, “10 (1 ; 9)” is to be
interpreted as follows: the total number of variants that were identified in this
proband/patient for this particular gene is 10, one of them is clinically relevant, while
9 are clinically irrelevant variants.
o Genotype: Genetic constitution of a case with respect to the number of alleles and their
clinical significance for this particular gene.
o Enzyme interpretation: Interpretation of the enzyme activity compared to the
reference interval.
o Biomarker interpretation: Interpretation of biomarker levels compared to the reference
interval.
o Clinical statement: The finding or the conclusion of the molecular genetic test
conducted at CENTOGENE AG.
o Sample type: Type of sample sent to CENTOGENE AG for testing. It includes DNA, Blood,
dry blood spot (DBS) or other (e.g. amniotic fluid).
o Age at onset: Refers to the age at which an individual acquires, develops, or first
experiences a condition or symptoms of a disease or disorder.
o Carrier testing: Indicates whether the individual requested carrier
screening after a specific genetic variant had already been detected in other
family members.
o Consanguineous parents: Refers to the marriage between two genetically related
persons.
o Family history: Indicates the presence or the absence of a particular disorder or
symptomatology in blood relatives of a patient.
o Detailed family history: Detailed description of disorders from which direct blood
relatives of the patient have suffered.
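Two of the fields above follow simple rules that can be illustrated in code: the age at diagnosis (date of the first sample entry minus date of birth, in years) and the "Total number of variants" string such as “10 (1 ; 9)”. The sketch below rests on assumptions: the function names are hypothetical, the age is returned as completed years (the handbook only says "expressed in years"), and the parser accepts only the layout of the example.

```python
import re
from datetime import date
from typing import Iterable, Tuple


def age_at_diagnosis(date_of_birth: date, sample_entry_dates: Iterable[date]) -> int:
    """Age in completed years at the first sample/order entry."""
    first_entry = min(sample_entry_dates)  # first order entry is used by default
    years = first_entry.year - date_of_birth.year
    # Subtract one year if the birthday has not yet been reached in that year
    if (first_entry.month, first_entry.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


def parse_total_variants(text: str) -> Tuple[int, int, int]:
    """Parse 'total (relevant ; irrelevant)' into a tuple of three integers."""
    match = re.fullmatch(r"\s*(\d+)\s*\(\s*(\d+)\s*;\s*(\d+)\s*\)\s*", text)
    if match is None:
        raise ValueError(f"unexpected format: {text!r}")
    total, relevant, irrelevant = (int(g) for g in match.groups())
    return total, relevant, irrelevant


print(age_at_diagnosis(date(2010, 6, 15), [date(2019, 1, 1), date(2018, 6, 14)]))
print(parse_total_variants("10 (1 ; 9)"))
```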
Clinical statement of CENTOGENE AG
The clinical statement is the finding or the conclusion of the molecular genetic test
conducted at CENTOGENE AG. The clinical statement may confirm or disprove the
suspected diagnosis, or serve to elucidate the genetic cause of an uncertain or questionable
condition or disease. When deriving the clinical statement, the following factors are
considered:
o Mode of inheritance of the disorder
o Patient’s genotype
o Clinical significance of all identified genetic variants
o Clinical data provided, if available
o Additionally, sex and/or biochemical evidence, if applicable
The evidence-based rules determining the clinical statement are summarized in Table 1
and Figure 3. The following clinical statements are used in CentoMD®:
o Affected: Individual with confirmed diagnosis at genetic level.
o Probably affected: Fabry male patients carrying an uncertain variant associated with
pathological enzymatic levels, but with biomarker levels within the normal range.
o At least carrier: Individual with clinical suspicion most likely confirmed at genetic
level. It includes individuals carrying in trans a pathogenic variant with an uncertain
variant in case of autosomal recessive mode of inheritance, or Fabry females
carrying uncertain variants.
o Probably carrier: Individual carrying an uncertain variant in the context of autosomal
recessive disorders or X-linked disorders (in this last situation it applies only to
female individuals).
o Carrier: Individual who inherited one mutated allele at genetic level in case of
autosomal recessive mode of inheritance, or female in case of X-linked mode of
inheritance.
o Increased risk of developing the disease: Individual with confirmed susceptibility at
genetic level to develop a particular medical condition.
o Increased risk of having affected offspring: Individual who carries a premutation
variant and who may not be clinically affected by the disease, but who has
a higher risk of having affected offspring due to potential repeat instability.
o Not determined: Individual carrying uncertain variant(s) for whom a clear statement on
either the presence of or the susceptibility to a particular disease was not possible.
o Unaffected: Individual for whom the susceptibility to the disease was not
confirmed at genetic level.
For example, for an autosomal dominant disorder where the patient’s genotype is
heterozygous, meaning that the patient carries one clinically relevant variant, the expected clinical
statement is either “Affected” or “Increased risk of developing the disease” (according to
the provided clinical information).
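The autosomal dominant, heterozygous example above can be expressed as a small lookup. This sketch is limited to that one case and makes several assumptions: the function name and the labels "relevant"/"uncertain" are hypothetical, and the mapping of the clinical information symbols (+ signs present, - signs absent, ? no information) to statements reflects our reading of the rule table in the appendix rather than a confirmed implementation.

```python
def statement_ad_heterozygous(variant_class: str, clinical_info: str) -> str:
    """Clinical statement for a heterozygous variant in an AD disorder.

    variant_class: "relevant" (pathogenic, likely pathogenic, modifier or
    risk factor) or "uncertain" (VUS).
    clinical_info: "+" (signs present), "-" (signs absent), "?" (not provided).
    """
    if clinical_info not in ("+", "-", "?"):
        raise ValueError(f"unknown clinical information: {clinical_info!r}")
    if variant_class == "uncertain":
        return "Not determined"
    if variant_class != "relevant":
        raise ValueError(f"unknown variant class: {variant_class!r}")
    by_clinical_info = {
        "+": "Affected",
        "-": "Increased risk of developing the disease",
        # Assumption: without clinical information both statements remain possible
        "?": "Affected/Increased risk of developing the disease",
    }
    return by_clinical_info[clinical_info]


print(statement_ad_heterozygous("relevant", "+"))
```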
Vcf upload and annotation
This functionality annotates genetic variants detected within genes confirmed to be
associated with or to cause human diseases/conditions, according to CentoMD®. A single VCF file is
uploaded, and the genetic variants are processed using two different approaches:
i) Variants which are identified in CentoMD® are annotated and classified according to the
current version, following the 5 standard ACMG classes (pathogenic, likely pathogenic,
uncertain, likely neutral and neutral) for classified and curated variants (++) and
predicted pathogenic, predicted uncertain and predicted neutral for classified variants
(+).
ii) Variants that are not yet identified in CentoMD® are subjected to on-the-fly
annotation using Variant Effect Predictor (VEP, see
https://www.ensembl.org/info/docs/tools/vep/index.html) and automatically
classified using the process described above for classified variants (+).
Input file
The input file must be in Variant Call Format (VCF). CentoMD® supports VCF v.4.1 and later
on the hg19 genome assembly. For the specification see
https://samtools.github.io/hts-specs/VCFv4.1.pdf.
The input file should contain the 8 fixed mandatory columns ('#CHROM', 'POS', 'ID', 'REF',
'ALT', 'QUAL', 'FILTER', 'INFO'), followed by a 'FORMAT' column and then at least one column
containing sample-specific genotype data. The genotype (GT) at every site is mandatory.
If it is not found, the respective variant call will be excluded. Acceptable formats are values
separated by forward slash (e.g., 0/1) or pipe (e.g., 0|1). The sample in the 10th column
is considered the "active sample". That is, if the file contains additional columns with
genotype data of additional samples, or of the same sample but produced by additional
variant callers, they will be ignored. The maximum allowed file size is 100 MB. Larger files
will not be accepted. The function does not support multi-sample VCF files.
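Before uploading, a file can be checked locally against the requirements above. The following is a minimal sketch under stated assumptions: the function names are hypothetical, it only inspects tab-separated text lines, and it mimics the described behaviour (header columns, GT taken from the 10th column, slash or pipe separators) without claiming to reproduce the validation performed by CentoMD®.

```python
import re
from typing import Optional

FIXED_COLUMNS = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]
# Allele indices separated by "/" (unphased) or "|" (phased), e.g. 0/1 or 0|1
GT_PATTERN = re.compile(r"^\d+([/|]\d+)+$")


def check_header(header_line: str) -> bool:
    """True if the header has the 8 fixed columns, FORMAT and >= 1 sample."""
    cols = header_line.rstrip("\n").split("\t")
    return len(cols) >= 10 and cols[:8] == FIXED_COLUMNS and cols[8] == "FORMAT"


def active_sample_genotype(record_line: str) -> Optional[str]:
    """Return the GT of the 10th column, or None if the call would be excluded."""
    fields = record_line.rstrip("\n").split("\t")
    if len(fields) < 10:
        return None
    format_keys = fields[8].split(":")
    if "GT" not in format_keys:
        return None
    sample_values = fields[9].split(":")  # only the "active sample" is read
    index = format_keys.index("GT")
    if index >= len(sample_values):
        return None
    genotype = sample_values[index]
    return genotype if GT_PATTERN.match(genotype) else None


record = "1\t100\t.\tA\tG\t50\tPASS\t.\tGT:DP\t0/1:30\t1/1:12"
print(active_sample_genotype(record))
```

In this example the second sample column (1/1:12) is ignored, in line with the "active sample" rule.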
Annotated File
Once the annotation of the uploaded vcf file is complete the annotated file is created for
download by the user. The annotated file will be provided as a csv file. It contains the
following information:
- Genomic position: The 1-based position of the variation on the given sequence on genome
build hg19
- Ref: The reference base (or bases in the case of an indel) at the given position on the given
reference sequence.
- Alt: The list of alternative alleles at this position.
- Gene: A gene is defined by a sequence of DNA that represents a basic unit of
heredity, being expressed in RNA and proteins. In CentoMD® the HGNC-approved
gene symbols are used. Every gene-variant combination is represented by one entry.
In case a variant maps on more than one gene it will be included more than once.
- Transcript: The coding DNA reference sequence is a cDNA-derived sequence
containing the full length of all coding regions and non-coding untranslated regions.
CentoMD® uses RefSeq transcripts (genome build hg19) and only one transcript per
gene-variant combination is included. When several transcripts are available
for a single gene, the transcript displayed is selected using the following criteria, in order: 1.
Transcript where the variant has the highest impact, 2. Longest transcript, 3.
Transcript with the largest number of exons, 4. Length of genomic locus, 5. Transcript
where the variant falls within an exon rather than the transcript where the variant
falls into a non-coding region.
- Coding DNA change
- Genomic DNA change
- Protein change
- Location
- Variant type on DNA level
- Coding effect
- Clinical significance according to CentoMD® or the predicted clinical significance
- HGMD accession number
- ClinVar classification, comma separated if more than one classification exists for the
variant
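The transcript selection described in the list above is a prioritized tie-break, which maps naturally onto a tuple sort key. The sketch below is illustrative only: the `Transcript` class, its field names and the numeric `impact_rank` are hypothetical, and it assumes that a longer genomic locus is preferred for criterion 4, which the handbook does not state explicitly.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Transcript:
    name: str
    impact_rank: int       # hypothetical scale: higher means more severe impact
    length: int            # transcript length in bases
    exon_count: int
    locus_length: int      # length of the genomic locus in bases
    variant_in_exon: bool  # True if the variant falls within an exon


def select_transcript(candidates: Iterable[Transcript]) -> Transcript:
    """Pick one transcript by comparing the five criteria in priority order."""
    return max(
        candidates,
        key=lambda t: (
            t.impact_rank,      # 1. highest variant impact
            t.length,           # 2. longest transcript
            t.exon_count,       # 3. largest number of exons
            t.locus_length,     # 4. genomic locus length (assumed: longer wins)
            t.variant_in_exon,  # 5. exonic before non-coding
        ),
    )


candidates = [
    Transcript("tx_A", impact_rank=3, length=4200, exon_count=10,
               locus_length=50000, variant_in_exon=True),
    Transcript("tx_B", impact_rank=3, length=4200, exon_count=12,
               locus_length=48000, variant_in_exon=True),
]
print(select_transcript(candidates).name)
```

Because tuples compare element by element, a later criterion is consulted only when all earlier ones tie.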
Appendix
Abbreviations used in CentoMD®
Evidence-based annotation rules to determine the clinical statement
(next 2 pages)
MOI Mode of Inheritance
Abbreviation Definition
AD Autosomal dominant
AR Autosomal recessive
Di Digenic
Imp/Epi Imprinting/Epigenetic
Mito Mitochondrial
MF Multifactorial
P-AD Pseudoautosomal dominant
X X-linked
Y Y-linked
? unknown
Genotype
Abbreviation Definition
Comp Het Compound heterozygote
Hem Hemizygote
Het Heterozygote
Hom Homozygote
Other Other/complex
WT Wild type
Zygosity
Abbreviation Definition
Hem Hemizygous
Het Heterozygous
Hom Homozygous
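Genotype 1) | MOI 2) (AD, AR, X-linked 7)) | Significance 3) (Path 5), VUS 6)) | Significance 2 3) (Path 5), VUS 6)) | CI 4) (-, +, ?) | Clinical statement
Hom/Hem | AD | Path | | - | increased risk
Hom/Hem | AD | Path | | + | affected
Hom/Hem | AD | Path | | ? | affected/increased risk
Hom/Hem | AD | VUS | | - | not determined
Hom/Hem | AD | VUS | | + | not determined
Hom/Hem | AD | VUS | | ? | not determined
Hom/Hem | AR | Path | | - | increased risk
Hom/Hem | AR | Path | | + | affected
Hom/Hem | AR | Path | | ? | affected
Hom/Hem | AR | VUS | | - | not determined
Hom/Hem | AR | VUS | | + | not determined
Hom/Hem | AR | VUS | | ? | not determined
Hom/Hem | X-linked | Path | | - | increased risk
Hom/Hem | X-linked | Path | | + | affected
Hom/Hem | X-linked | Path | | ? | affected/increased risk
Hom/Hem | X-linked | VUS | | - | not determined
Hom/Hem | X-linked | VUS | | + | not determined
Hom/Hem | X-linked | VUS | | ? | not determined
Het | AD | Path | | - | increased risk
Het | AD | Path | | + | affected
Het | AD | Path | | ? | affected/increased risk
Het | AD | VUS | | - | not determined
Het | AD | VUS | | + | not determined
Het | AD | VUS | | ? | not determined
Het | AR | Path | | - | carrier
Het | AR | Path | | + | carrier
Het | AR | Path | | ? | carrier
Het | AR | VUS | | - | probably carrier
Het | AR | VUS | | + | probably carrier
Het | AR | VUS | | ? | probably carrier
Het | X-linked | Path | | - | carrier
Het | X-linked | Path | | + | carrier
Het | X-linked | Path | | ? | carrier
Het | X-linked | VUS | | - | not determined
Het | X-linked | VUS | | + | not determined
Het | X-linked | VUS | | ? | not determined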
Table 2: Evidence-based annotation rules to determine the clinical statement at CentoMD®. See Figure 3 for further illustration of the decision process.
1) The most often detected annotation classes are included. The wild type genotype is excluded. For wild type the clinical statement is “Unaffected”. 2) Mode of Inheritance. 3) Indicates the clinical significance of the identified variant. 4) Clinical information:
- indicates the absence of signs and symptoms of the disease (i.e. healthy/unaffected); + indicates the presence of signs and symptoms of the disease; ? indicates that no clinical information was provided.
5) Refers to a variant annotated as pathogenic, likely pathogenic, modifier or risk factor. 6) Uncertain variant. 7) Two X-linked diseases (i.e. Fabry disease and Hunter disease) do not follow these definitions closely, as additional information is available and used as a decision factor when selecting the finding. For these two diseases, please see the decision trees presented in Figure 4.
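Genotype 1) | MOI 2) (AD, AR, X-linked 7)) | Significance 3) (Path 5), VUS 6)) | Significance 2 3) (Path 5), VUS 6)) | CI 4) (-, +, ?) | Clinical statement
Comp Het | AD | Path | Path | - | increased risk
Comp Het | AD | Path | Path | + | affected
Comp Het | AD | Path | Path | ? | affected/increased risk
Comp Het | AD | Path | VUS | - | increased risk
Comp Het | AD | Path | VUS | + | affected
Comp Het | AD | Path | VUS | ? | affected/increased risk
Comp Het | AD | VUS | VUS | - | not determined
Comp Het | AD | VUS | VUS | + | not determined
Comp Het | AD | VUS | VUS | ? | not determined
Comp Het | AR | Path | Path | - | increased risk
Comp Het | AR | Path | Path | + | affected
Comp Het | AR | Path | Path | ? | affected
Comp Het | AR | Path | VUS | - | at least carrier
Comp Het | AR | Path | VUS | + | at least carrier
Comp Het | AR | Path | VUS | ? | at least carrier
Comp Het | AR | VUS | VUS | - | not determined
Comp Het | AR | VUS | VUS | + | not determined
Comp Het | AR | VUS | VUS | ? | not determined
Comp Het | X-linked | Path | Path | - | increased risk
Comp Het | X-linked | Path | Path | + | affected
Comp Het | X-linked | Path | Path | ? | affected
Comp Het | X-linked | Path | VUS | - | at least carrier
Comp Het | X-linked | Path | VUS | + | affected
Comp Het | X-linked | Path | VUS | ? | at least carrier
Comp Het | X-linked | VUS | VUS | - | not determined
Comp Het | X-linked | VUS | VUS | + | not determined
Comp Het | X-linked | VUS | VUS | ? | not determined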
Figure 3: Decision trees that illustrate the evidence-based annotation rules which determine the clinical statement at CentoMD®. The decision levels illustrated are: MOI – Genotype – Clinical significance (variant effect) – Clinical information – Clinical statement (the caption of Table 1 also applies to this figure).
Figure 4: Decision trees that illustrate the evidence-based annotation rules which determine the clinical statement for Fabry and Hunter disease. The decision levels illustrated are: MOI – Genotype – Clinical significance (variant effect) – Clinical information – Clinical statement (the caption of Table 1 also applies to this figure).
Panels: Fabry disease; Hunter/MPS2 disease
Glossary
Term Explanation
Allele
One of two (or more) forms of a gene/genetic locus.
Allele frequency at CentoMD®
Indicates the number of observations of the allele of interest at a particular locus in CentoMD-unique population, expressed as decimal.
Biochemical analysis
Method to analyze enzymatic activity or levels of biomarkers in samples obtained from patients, usually suspected of being affected by a metabolic disorder. This is a test performed to detect, diagnose and monitor diseases, disease processes, susceptibility and determine a course of treatment.
Data File Upload-VCF
Variant Call Format: the format of a text file used in bioinformatics for storing gene sequence variations.
Degree of heteroplasmy
Mixture of more than one type of mitochondrial DNA (mDNA) within a cell/individual. In those cases where a mutant mDNA is responsible for a disease, the larger the proportion of mutant mitochondria, the more likely the person will show symptoms of the disease.
Degree of heteroplasmy-Heteroplasmic
Cell has some mitochondria that have a mutation in the mDNA and some that do not.
Degree of heteroplasmy-Homoplasmic
Cell has a uniform collection of mDNA: either completely normal mDNA or completely mutant mDNA.
Disease
Particular abnormal, pathological condition that affects part or all of an organism. It is often construed as a medical condition associated with specific symptoms and signs.
Disease name
Name of a disease according to Online Mendelian Inheritance in Man (OMIM) database.
Gene
Sequence of DNA that represents a basic unit of heredity, being expressed in RNA and proteins.
Gene symbol
A unique abbreviation for the gene name assigned by the HUGO Gene Nomenclature Committee (HGNC).
Gene-cDNA
DNA that is synthesized from a messenger RNA template; the single-stranded form is often used as a probe in physical mapping.
Gene-mDNA
An extranuclear double-stranded DNA found exclusively in mitochondria that in most eukaryotes is a circular molecule and is maternally inherited.
Gene-Mitochondrial
A gene located in the mitochondria.
Gene-Nuclear coding
A gene located in the cell nucleus of a eukaryote that encodes for protein.
Gene-Nuclear non-coding
A gene located in the cell nucleus that does not encode for a protein product.
Genotype to Phenotype module
Tool allowing search initiation using approved gene symbols. The corresponding genetic variants are associated with epidemiological data and clinical information following the HPO nomenclature.
Genotype to Phenotype module-Gene statistics-Screened individuals
Indicates the total number of individuals screened at genetic level.
HGVS
Human Genome Variation Society, which promotes: i) the collection, documentation and distribution of genomic variation information and associated clinical variations; ii) guidelines and recommendations for mutation and gene nomenclature (http://www.hgvs.org/).
HGVS nomenclature
Standardized system recommended by HGVS to describe and document variant sequences.
HPO term
Phenotypic description of individuals provided by medical experts and translated into the vocabulary defined by the HPO.
Individuals-Analyzed individuals
Indicates the screened individuals at genetic level, under General statistics
Individuals-G2P individuals
Indicates the screened and consented individuals where the manual curation confirmed the genotype-phenotype correlation.
Individuals-Non-G2P individuals
Indicates the screened and consented individuals where the manual curation could not confirm any genotype-phenotype correlation. These individuals are periodically reviewed in context of genotype-phenotype correlations.
Manual curation
Manual review of submitted entries to identify typing, mis-selection and omission errors. This process ensures that all collected items are properly documented, associated and interpreted.
Mutation
Rare difference and permanent change in a DNA sequence or gene at a given locus. In medical genetics it is often used to indicate a disease-causing allele.
OMIM
Online Mendelian Inheritance in Man: Database which contains a list of human genes and genetic diseases with links to other relevant resources, developed for the world-wide-web by NCBI (http://www.ncbi.nlm.nih.gov/omim).
Phenotype to Genotype module
Tool allowing search initiation using valid HPO terminology. Using the population of similar cases sharing the HPO terms, hints on the potential candidate genes explaining the particular phenotype are provided.
Phenotype to Genotype module-Search for similar cases-Candidate genes
Represents the most likely candidate genes associated with a particular combination of HPO terms (phenotype). Candidate genes linked with similar cases that are within 25% of the highest similarity score are displayed.
Phenotype to Genotype module-Search for similar cases-Case ID
Random patient ID referring to a consented case.
Phenotype to Genotype module-Search for similar cases-HPO ID
Unique HPO identifier for the attributed HPO term.
Phenotype to Genotype module-Search for similar cases-HPO name
Phenotypic description of individuals provided by medical experts and translated into the vocabulary defined by the HPO.
Phenotype to Genotype module-Search for similar cases-P-value
Defines the likelihood of obtaining the corresponding similarity score or higher by chance. The p-value is calculated by comparing individuals with random symptoms and their similarity scores, i.e. it is derived from the distribution of these similarity scores. The higher the p-value, the more likely it is to obtain the corresponding similarity score by chance. The p-value ranges from 0 to 1, where 0 is best.
Phenotype to Genotype module-Search for similar cases-Shared HPO terms
Indicates how many HPO terms of a case analyzed at CENTOGENE match the HPO terms provided by the user.
Phenotype to Genotype module-Search for similar cases-Similarity score
Phenotypic semantic similarity measure based on the HPO. The similarity score of two patients is a formal measure of their resemblance with respect to their standardized symptoms. The score is calculated by a pairwise comparison between each symptom of the two patients. The higher the score, the more similar the patients.
Phenotype to Genotype module-Similar cases
The cases analyzed at CENTOGENE which match the HPO terms provided by the user. Displayed are all cases that have a similarity score of 1 or higher.
Positive individual
Indicates an individual carrying a particular genetic variant.
Statistics-Carrier
Individual who has only one copy of a genetic variant for a recessive disease.
Statistics-Case
Indicates an individual where the diagnosis was confirmed by genetic testing at CENTOGENE.
Statistics-Geographical region
Indicates the area of the world the patient is coming from. The basis for this information is the region where the patient lives.
Statistics-Wildtype
Represents a person carrying only normal genetic variations.
Variant
A sequence variation in a gene.
Variant-Alt
The list of alternative alleles at this position.
Variant-cDNA change
Change at cDNA level following numbering based on coding DNA reference sequences.
Variant-Classified and curated
A variant which has been assigned to a clinical significance class based on confirmed genotype-phenotype association and curated by strictly following the curation workflow.
Variant-Clinical significance according to CentoMD®
Indicates the likelihood of this variant to predispose to or to cause the disorder.
Variant-Clinical significance -Disease associated polymorphism (DP)
Variant reported to be significantly associated with a phenotype/disease.
Variant-Clinical significance-CENTOGENE (likely) neutral - published as (likely) pathogenic
Variant published consistently in literature as (likely) pathogenic but re-classified as (likely) neutral based on internal evidence (observed allele frequency, family segregation studies, co-occurrence with other deleterious genetic variants, etc.). This class is applied only to genetic variants of high penetrance.
Variant-Clinical significance-Likely neutral
Variant reported to be likely neutral, prediction software indicates a probably non-pathological effect, and/or high frequency in population observed. This classification class is equivalent to "likely benign".
Variant-Clinical significance-Likely pathogenic
Variant with probable pathogenicity, or the effect on the protein function is predicted to be likely deleterious (>90% probability to cause the disease).
Variant-Clinical significance-Modifier
Variant that can alter the expression of another gene in the phenotype of an individual.
Variant-Clinical significance-Mutable normal (intermediate)
Variant that is meiotically unstable and not convincingly associated with an abnormal phenotype. Because of the instability of alleles in the mutable normal range, an asymptomatic individual with a mutable normal allele may be predisposed to having a child with an expanded allele.
Variant-Clinical significance-Neutral
Variant reported not to influence the disease risk of the individual, or predicted to be neutral based on the high frequency in population, no effect on protein or regulatory regions. This classification class is equivalent to "benign".
Variant-Clinical significance-Pathogenic
Variant that is known to cause the phenotype/disease.
Variant-Clinical significance-Pathological D4Z4 allele
Large, polymorphic repeat structure associated with a rough and inverse relationship between clinical severity and the residual repeat size, with the smallest repeats causing the most severe phenotype.
Variant-Clinical significance-Predicted neutral
Variant predicted not to influence the disease risk of the individual, or predicted to be neutral based on a high frequency in the population or on the absence of an effect on the protein or regulatory regions. This classification class is equivalent to predicted "benign".
Variant-Clinical significance-Predicted pathogenic
Variant that has been automatically predicted to cause a phenotype/disease based on 5 ACMG criteria (BA1, BP6, PVS1, PM2 and PP5).
Variant-Clinical significance-Predicted uncertain
Variant with predicted unknown or questionable impact on a particular clinical phenotype.
Variant-Clinical significance-Premutation
A repeat expansion variant in a range that may not result in the clinical manifestation of the associated disease in the carrying individual, but that may result in the manifestation of the disease in the offspring due to potential repeat instability.
Variant-Clinical significance-Risk factor
Variant reported to be associated with the phenotype/disease and influencing the function(s) of the protein.
Variant-Clinical significance-Secondary mitochondrial mutation
The primary molecular defect resides in a nuclear gene, which leads to secondary mtDNA abnormalities, such as loss of mtDNA copy number or multiple mtDNA deletions.
Variant-Clinical significance-Uncertain (VUS)
Variant with unknown or questionable impact on a particular clinical phenotype.
Variant-Coding effect
Describes the impact of the observed DNA change on protein level.
Variant-Coding effect-Effect unknown
The coding effect on protein level has not been analyzed. An effect is expected but difficult to predict.
Variant-Coding effect-Extension
A sequence change that affects either the first (start, translation initiation, N-terminus, ATG) or last codon (translation termination, stop) and as a consequence extends the protein sequence N- or C-terminally by one or more amino acids.
Variant-Coding effect-Frameshift
A sequence change caused by deletion/insertion of nucleotides affecting an amino acid between the first (initiation, ATG) and last codon (termination, stop), replacing the normal C-terminal sequence with one encoded by another reading frame.
Variant-Coding effect-Increased polyglutamine tract/expanded polyQ
Expansion of a polyglutamine tract, i.e. a portion of a protein consisting of a sequence of several glutamine (Gln; Q) units.
Variant-Coding effect-In-frame
A sequence change that does not cause a shift in the triplet reading frame. As a result one or more amino acids are replaced by one or more other amino acids.
Variant-Coding effect-Missense
A single nucleotide change that results in a codon that codes for a different amino acid. Not all missense mutations are deleterious; some changes have no effect. Because of this ambiguity, it is often difficult to interpret the role of missense mutations in causing disease.
Variant-Coding effect-New translation initiation codon
A sequence change that creates a new ATG start codon upstream of the original start translation site. If the new ATG is close enough to the original one (so that it is within the processed transcript and downstream of a ribosome-binding site) and in frame, it will be used to initiate translation, adding amino acids to the amino terminus of the original protein.
Variant-Coding effect-New translation initiation site
A sequence change affecting the translation initiation codon (Met-1) introducing a new upstream initiation codon extending the N-terminus of the encoded protein.
Variant-Coding effect-New translation termination codon
A sequence change that affects the translation termination codon (Ter/*) introducing a new downstream termination codon extending the C-terminus of the encoded protein.
Variant-Coding effect-Non-coding
The change on DNA level produces no effect on protein, or the effect of regulatory mutations is unknown.
Variant-Coding effect-Nonsense
A sequence change that results in a premature stop codon, and in a truncated, incomplete protein product.
Variant-Coding effect-Silent
A sequence change that results in a codon that codes for the same amino acid and without any functional change in the protein product.
Variant-Coding effect-Splicing mutation
A sequence change that affects the splicing process (i.e. intron removal and exon joining). Splice-site mutations occur within genes in the noncoding regions (introns) just next to the coding regions (exons). Splice-site mutations can eliminate an existing donor or acceptor site, which will cause an exon to be skipped and possibly result in a frameshift.
Variant-Coding effect-Start loss
A sequence change in the ATG start codon that prevents the original start translation site from being used. This kind of mutation may eliminate gene function.
Variant-dbSNP
The Single Nucleotide Polymorphism Database (dbSNP) is a free public archive for genetic variation within and across different species developed and hosted by the National Center for Biotechnology Information (NCBI) in collaboration with the National Human Genome Research Institute (NHGRI).
Variant-gDNA change
Change at genomic DNA level following numbering based on genomic DNA reference sequence.
Variant-genomic position
The 1-based position of the variation on the given sequence in genome build hg19.
Variant-Individual
Represents a unique individual who was tested for a certain disease, condition or carrier status at CENTOGENE.
Variant-Individual view-Quality status-G2P individual
Individual with confirmed genotype-phenotype association. At database level a G2P individual (++) is linked to at least one classified and curated variant (++).
Variant-Individual view-Quality status-Non-G2P individual
Individual with not yet confirmed genotype-phenotype association. At database level a Non-G2P individual (+) is linked to at least one classified variant (+).
Variant-Location
The location of the DNA change relative to the transcriptional initiation site, initiation codon, polyadenylation site or termination codon of the corresponding gene.
Variant-Location-3'UTR
3'-Untranslated Region: Particular section of messenger RNA that starts with the nucleotide immediately following the stop codon of the coding region. This region contains transcription and translation regulating sequences.
Variant-Location-5'UTR
5'-Untranslated Region: Sequences on the 5' end of messenger RNA but not translated into protein. It extends from the transcription start site to just before the ATG translation initiation codon. 5' UTR may contain sequences that regulate translation efficiency or messenger RNA stability.
Variant-Location-Downstream
The region located 3' (downstream) from the polyadenylation signal of the gene.
Variant-Location-Exon
The protein-coding DNA sequence of the gene.
Variant-Location-Intron
The non-coding region of a gene that interrupts the protein-coding regions (exons).
Variant-Location-Upstream
The region located 5' (upstream) from the 5'UTR region of the gene.
Variant-PMID
PubMed identifier or PubMed unique identifier (PMID): a unique number assigned to each PubMed record.
Variant-Positive individuals
Indicates how many times a particular variant was observed at CENTOGENE in comparison to the total number of analyzed individuals for a particular gene, expressed as fraction.
Variant-Positive individuals (%)
Indicates how many times a particular variant was observed at CENTOGENE in comparison to the total number of analyzed individuals for a particular gene, expressed as percent (%).
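As a rough illustration of the two "Positive individuals" entries above, the following sketch derives both forms from the same two counts. The function name, the argument names and the "n/N" string format are assumptions made for this example and are not taken from CentoMD®.

```python
def positive_individuals(variant_carriers: int, analyzed_for_gene: int) -> tuple[str, float]:
    """Return the observation count as a fraction string and as a percentage."""
    if analyzed_for_gene <= 0:
        raise ValueError("at least one individual must have been analyzed for the gene")
    if not 0 <= variant_carriers <= analyzed_for_gene:
        raise ValueError("carriers must lie between 0 and the number of analyzed individuals")
    # Hypothetical display format: carriers / individuals analyzed for the gene.
    fraction = f"{variant_carriers}/{analyzed_for_gene}"
    percent = 100.0 * variant_carriers / analyzed_for_gene
    return fraction, percent
```

For example, a variant seen in 3 of 200 individuals analyzed for the gene would be reported as "3/200" in the fraction form and as 1.5 in the percent form.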
Variant-Positive individuals-Age at onset
Refers to the age at which an individual acquires, develops or first experiences a condition or symptoms of a disorder.
Variant-Positive individuals-ARFN
Anonymized random family number: Family unique number used to keep all members together when relationship links are provided.
Variant-Positive individuals-Biomarker interpretation
Evaluation of the biomarker levels compared to the reference interval.
Variant-Positive individuals-Biomarker interpretation-Normal
Biomarker levels are within the normal range (no change).
Variant-Positive individuals-Biomarker interpretation-Pathological
Biomarker levels are significantly increased compared to the normal range.
Variant-Positive individuals-Biomarker interpretation-Slightly decreased
Biomarker levels are only slightly decreased compared to the normal range.
Variant-Positive individuals-Biomarker interpretation-Slightly increased
Biomarker levels are only slightly increased compared to the normal range.
Variant-Positive individuals-Carrier testing
Indicates whether the individual requested carrier screening after a specific genetic variant had already been detected in other family members.
Variant-Positive individuals-Clinical information (HPO terms)
Description of features and characteristics that the corresponding physician has provided as supporting evidence of the presence of a particular disease translated into the vocabulary defined by the HPO.
Variant-Positive individuals-Clinical information (HPO terms)-Healthy/asymptomatic
Selected when the physician has explicitly indicated that the person is healthy, asymptomatic or not affected.
Variant-Positive individuals-Clinical information (HPO terms)-No info/unknown
Selected when no clinical information has been provided.
Variant-Positive individuals-Clinical information (HPO terms)-Suspected/affected
Selected when only very general statements are provided by the physician (e.g. "patient is suffering from Breast Cancer" or "clinical features of Parkinson").
Variant-Positive individuals-Clinical statement of CENTOGENE
Refers to the finding or the conclusion of the molecular genetic test conducted at CENTOGENE. The clinical statement may confirm or disprove the suspected diagnosis, or serve to elucidate the genetic cause of an uncertain or questionable condition or disease.
Variant-Positive individuals-Clinical statement-Affected
Individual with confirmed diagnosis at genetic level.
Variant-Positive individuals-Clinical statement-At least carrier
Individual with clinical suspicion most likely confirmed at genetic level. It includes individuals carrying a pathogenic variant in trans with an uncertain variant in case of autosomal recessive mode of inheritance, or Fabry females carrying uncertain variants.
Variant-Positive individuals-Clinical statement-Carrier
Individual who inherited one mutated allele at genetic level in case of autosomal recessive mode of inheritance, or female in case of X-linked mode of inheritance.
Variant-Positive individuals-Clinical statement-Increased risk of developing the disease
Individual with confirmed susceptibility at genetic level to develop a particular medical condition.
Variant-Positive individuals-Clinical statement-Increased risk of having affected offspring
Individual who carries a premutation variant and who may not be clinically affected by the disease, but who has a higher risk of having affected offspring due to potential repeat instability.
Variant-Positive individuals-Clinical statement-Not determined
Individual carrying uncertain variant(s) where clear statement on either presence or susceptibility to develop a particular disease was not possible.
Variant-Positive individuals-Clinical statement-Probably affected
Fabry male patients carrying an uncertain variant associated with pathological enzymatic levels, but biomarker levels within normal range.
Variant-Positive individuals-Clinical statement-Probably carrier
Individual carrying an uncertain variant in the context of autosomal recessive disorders or X-linked disorders (in this last situation it applies only to female individuals).
Variant-Positive individuals-Clinical statement-Unaffected
Indicates an individual where the susceptibility or presence of the disease was not confirmed at genetic level.
Variant-Positive individuals-Clinically irrelevant variant (CIV)
Variants which do not cause or influence the presence or severity of the disease. It includes variants of the following significance: neutral, likely neutral, disease-associated polymorphism, CENTOGENE (likely) neutral - published as (likely) pathogenic.
Variant-Positive individuals-Clinically relevant variant (CRV)
Variants which do cause or influence the presence or the severity of the disease. It includes variants of the following significance: likely pathogenic, pathogenic, risk factor, modifier.
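The CIV and CRV entries above amount to a lookup from clinical significance class to relevance group. A minimal sketch of that lookup follows; the function name, the lower-cased class strings and the choice to return None for classes that neither entry lists (for example uncertain variants) are assumptions of this example.

```python
from typing import Optional

# Class names as listed in the CIV and CRV entries, lower-cased for matching.
CLINICALLY_IRRELEVANT = frozenset({
    "neutral",
    "likely neutral",
    "disease-associated polymorphism",
    "centogene (likely) neutral - published as (likely) pathogenic",
})
CLINICALLY_RELEVANT = frozenset({
    "likely pathogenic",
    "pathogenic",
    "risk factor",
    "modifier",
})


def relevance_group(clinical_significance: str) -> Optional[str]:
    """Map a clinical significance class to "CRV", "CIV" or None."""
    # Normalize case and collapse repeated whitespace before matching.
    key = " ".join(clinical_significance.lower().split())
    if key in CLINICALLY_RELEVANT:
        return "CRV"
    if key in CLINICALLY_IRRELEVANT:
        return "CIV"
    # Classes not named in either entry are left unassigned here.
    return None
```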
Variant-Positive individuals-Consanguineous parents
Refers to the marriage between two genetically related persons.
Variant-Positive individuals-Co-occurrence, other genes
Indicates the presence of other clinically relevant variant(s) or uncertain variant(s) in genes other than the gene of interest.
Variant-Positive individuals-Co-occurrence, same gene
Indicates the presence of other clinically relevant variant(s) or uncertain variant(s) in the gene of interest.
Variant-Positive individuals-Country
Indicates the area of the world the patient comes from. The basis for this information is the country where the patient lives. If the physician provides information about the ethnicity of the patient (e.g. Canadian citizen of German origin), the country of origin (in this case Germany) is selected instead.
Variant-Positive individuals-Detailed family history
Detailed description of disorders from which direct blood relatives of the patient have suffered.
Variant-Positive individuals-Enzyme interpretation
Evaluation of the enzyme activity compared to the reference interval.
Variant-Positive individuals-Enzyme interpretation-Normal
Levels of enzyme activity are within the normal range (no change).
Variant-Positive individuals-Enzyme interpretation-Pathological
Levels of enzyme activity are significantly decreased compared to the normal range.
Variant-Positive individuals-Enzyme interpretation-Slightly decreased
Levels of enzyme activity are only slightly decreased compared to the normal range.
Variant-Positive individuals-Enzyme interpretation-Slightly increased
Levels of enzyme activity are only slightly increased compared to the normal range.
Variant-Positive individuals-Family history
Indicates the presence or the absence of a particular disorder or symptomatology in blood relatives of a patient.
Variant-Positive individuals-Finding
Indicates if a variant is related or unrelated to the indication for testing.
Variant-Positive individuals-Finding-Primary
Variant related to the indication for testing.
Variant-Positive individuals-Finding-Secondary
Variant unrelated to the indication for testing (incidental finding).
Variant-Positive individuals-Genotype
Represents the genetic constitution of an individual with respect to the number of alleles and their clinical significance identified for a particular gene.
Variant-Positive individuals-Genotype-Compound heterozygote
An individual carrying two different, heterozygous, in trans, uncertain or clinically relevant alleles (likely pathogenic, pathogenic, risk factor, modifier) at a given locus.
Variant-Positive individuals-Genotype-Hemizygote
A male individual carrying one uncertain or clinically relevant allele (pathogenic, likely pathogenic, risk factor, modifier) located on X-chromosome.
Variant-Positive individuals-Genotype-Heterozygote
An individual carrying one uncertain or clinically relevant allele (pathogenic, likely pathogenic, risk factor, modifier).
Variant-Positive individuals-Genotype-Homozygote
An individual carrying two identical, uncertain or clinically relevant alleles (pathogenic, likely pathogenic, risk factor, modifier) at one locus.
Variant-Positive individuals-Genotype-Other/complex
An individual carrying uncertain or clinically relevant alleles (pathogenic, likely pathogenic, risk factor, modifier) in other combinations than described above (e.g. two alleles located in cis, three heterozygous mutations, one homozygous and one heterozygous, etc.).
Variant-Positive individuals-Genotype-Wild type
An individual carrying clinically irrelevant alleles (neutral, likely neutral, disease-associated polymorphism, CENTOGENE (likely) neutral - published as (likely) pathogenic).
Variant-Positive individuals-Mode of Inheritance (MOI)
The manner in which a particular genetic trait or disorder is passed from one generation to the next.
Variant-Positive individuals-MOI-Autosomal dominant (AD)
The pattern of inheritance in which an affected individual has one copy of a mutant gene and one normal gene on a pair of autosomal chromosomes.
Variant-Positive individuals-MOI-Autosomal recessive (AR)
The pattern of inheritance in which both copies of an autosomal gene must be abnormal for a genetic condition or disease to occur.
Variant-Positive individuals-MOI-Digenic (Di)
The pattern of inheritance that is similar to recessive inheritance, except that the trait only develops when mutations are found in one copy of each of the two independent genes simultaneously.
Variant-Positive individuals-MOI-Imprinting/Epigenetic (Imp/Epi)
The pattern of inheritance by mechanisms not directly involving nucleotide sequences, but paramutations and parental imprinting.
Variant-Positive individuals-MOI-Mitochondrial (Mito)
The pattern of inheritance of a trait encoded in the mitochondrial genome.
Variant-Positive individuals-MOI-Multifactorial (MF)
The pattern of inheritance caused by the interplay between genetic factors and environmental factors.
Variant-Positive individuals-MOI-Pseudoautosomal dominant (P-AD)
The inheritance pattern seen with genes in the pseudoautosomal region of the X and Y chromosome that can exchange regularly between the two sex chromosomes. Alleles for genes in the pseudoautosomal region can show male-to-male transmission, and therefore mimic autosomal inheritance, because they can cross over from the X to the Y chromosomes during male gametogenesis and be passed on from a father to his male offspring.
Variant-Positive individuals-MOI-Unknown (?)
This mode of inheritance is selected for genes not yet associated with any pathological condition or disease, for which no pattern of inheritance has therefore been observed.
Variant-Positive individuals-MOI-X linked (X)
The pattern of inheritance of a trait encoded on the X chromosome.
Variant-Positive individuals-MOI-Y linked (Y)
The pattern of inheritance that results from a mutant gene located on the Y chromosome. By definition, only males are affected.
Variant-Positive individuals-OMIM disease
Number of a disease according to Online Mendelian Inheritance in Man (OMIM) database.
Variant-Positive individuals-Pedigree
Indicates the connection/relation among individuals by blood, marriage or adoption.
Variant-Positive individuals-Pedigree-Index patient
Represents the affected individual through whom the family with a genetic disorder is brought to the attention of others.
Variant-Positive individuals-Random patient ID
Random patient ID referring to a consented individual.
Variant-Positive individuals-Region
Indicates the area of the world the patient is coming from. The basis for this information is the region where the patient lives.
Variant-Positive individuals-Sex
Indicates the biological state of the individual of being male (m), female (f), intersex or unknown (?) sex (when no information was provided or a prenatal case was analyzed).
Variant-Positive individuals-Total number of variants
The total number of variants detected for a case (clinically relevant; clinically irrelevant) in a particular gene.
Variant-Protein change
Change at protein level following numbering based on the amino acid sequence, using the one-letter amino acid code and X for designating a translation termination codon.
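To make the one-letter convention concrete, the sketch below converts a simple three-letter substitution description into the one-letter form with X for a termination codon. The three-letter input format (for example "p.Arg117His"), the function name and the omission of a "p." prefix in the output are assumptions of this example; the handbook states only that the one-letter code and X are used.

```python
import re

# Standard three-letter to one-letter amino acid codes; "Ter" maps to X
# because this handbook designates a translation termination codon with X.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Ter": "X",
}

# Only simple substitutions are covered: p.<Ref><position><Alt or *>.
_PROTEIN_SUBSTITUTION = re.compile(r"^p\.([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2}|\*)$")


def to_one_letter(protein_change: str) -> str:
    """Convert e.g. "p.Arg117His" to "R117H" and "p.Gln39Ter" to "Q39X"."""
    match = _PROTEIN_SUBSTITUTION.match(protein_change.strip())
    if match is None:
        raise ValueError(f"unsupported protein change: {protein_change!r}")
    ref, position, alt = match.groups()
    alt = "Ter" if alt == "*" else alt
    try:
        return f"{THREE_TO_ONE[ref]}{position}{THREE_TO_ONE[alt]}"
    except KeyError as exc:
        raise ValueError(f"unknown amino acid code in {protein_change!r}") from exc
```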
Variant-Publication status
Indicates whether or not the identified variant has previously been published in the literature.
Variant-Publication status-Published
Indicates that the identified genetic variant has been already published in the literature.
Variant-Publication status-Unpublished
Indicates that the identified genetic variant has not been previously published in the literature.
Variant-Quality status-Classified
A variant which has been assigned to a clinical significance class but has not yet been curated due to missing genotype-phenotype correlations.
Variant-Rationale
Summary supporting the clinical significance according to the ACMG guidelines and internal evidence.
Variant-Ref
The reference base (or bases in the case of an indel) at the given position on the given reference sequence.
Variant-Sample type
Type of samples sent to CENTOGENE for testing.
Variant-Sample type-Blood
Blood sample sent to CENTOGENE for testing.
Variant-Sample type-DBS
Dried blood spot sample sent to CENTOGENE for testing.
Variant-Sample type-DNA
Extracted DNA sample sent to CENTOGENE for testing.
Variant-Sample type-Other
A sample sent to CENTOGENE for testing whose type is other than blood, DBS or DNA (e.g. tissue).
Variant-Screening method
The test used to identify the cause of the disease.
Variant-Screening method-CES
Clinical Exome Sequencing: application of next-generation sequencing technology to determine the variations in coding regions of genes that have been associated with human disease.
Variant-Screening method-MLPA
Multiplex Ligation-dependent Probe Amplification: Variation of the multiplex PCR that permits multiple targets to be amplified with only a single primer pair. Used especially for detecting large/gross and gene rearrangements.
Variant-Screening method-NGS
Next-Generation Sequencing: High-throughput sequencing technology, allowing the parallel sequencing of multiple genes, producing thousands or millions of sequences concurrently.
Variant-Screening method-Other method
Refers to other methodology (like fragment length) used to detect the variants.
Variant-Screening method-qPCR
Quantitative Polymerase Chain Reaction: Method to amplify and simultaneously quantify a targeted DNA molecule. Used especially to detect large/gross and gene rearrangements.
Variant-Screening method-Sanger
Classical method of DNA sequencing, developed by Fred Sanger, using chemically altered dideoxy bases to terminate newly synthesized DNA fragments at specific bases (either A, C, T or G). These fragments are then size-separated, and the DNA sequence can be read.
Variant-Screening method-WES
Whole Exome Sequencing: application of next-generation sequencing technology to determine the variations in all coding regions, or exons, of known genes.
Variant-Screening method-WGS
Whole Genome Sequencing: modern day technology for sequencing of the entire coding and non-coding regions of the genome.
Variant-Statistics-Age at diagnosis
Calculated as the date of sample entry at CENTOGENE minus the date of birth, expressed in years. For patients referred to CENTOGENE several times, the date of the first order entry is used by default to calculate the age at diagnosis.
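The age-at-diagnosis rule above can be sketched as follows. The handbook does not say whether the value is reported in completed years or as a decimal; this example assumes completed years, and the function and argument names are hypothetical.

```python
from datetime import date


def age_at_diagnosis(date_of_birth: date, order_entry_dates: list[date]) -> int:
    """Age in completed years at the first order entry (assumed rounding)."""
    # For individuals referred several times, the first order entry is used.
    # min() raises ValueError if no entry date is supplied.
    first_entry = min(order_entry_dates)
    # Subtract one year if the birthday has not yet occurred in the entry year.
    had_birthday = (first_entry.month, first_entry.day) >= (
        date_of_birth.month,
        date_of_birth.day,
    )
    return first_entry.year - date_of_birth.year - (0 if had_birthday else 1)
```

With a date of birth of 15 June 2010 and order entries on 1 January 2019 and 14 June 2018, the earlier entry is used and the function returns 7, since the eighth birthday falls one day later.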
Variant-Statistics-Clinical information distribution-Frequency in cases
Indicates how many times a particular variant was observed in cases with a particular phenotype (HPO term) in comparison to the total number of analyzed cases for that variant.
Variant-Transcript used in CentoMD®
The transcript that is used at CENTOGENE/CentoMD® as a reference sequence.
Variant-Transcript/Reference Sequence
Digital nucleic acid sequence, assembled by scientists as a representative example of a species' set of genes. Coding DNA reference sequence refers to a cDNA-derived sequence containing the full length of all coding regions and non-coding untranslated regions.
Variant-Type of variant on DNA level
Different types of change that can occur in the DNA sequence.
Variant-Type of variant on DNA level-Chromosomal deletion
Refers to loss of parts of chromosomes.
Variant-Type of variant on DNA level-Complex rearrangement
Involves the structure or number of the chromosomes; also referred to as chromosome mutation or chromosomal rearrangement.
Variant-Type of variant on DNA level-Conversion
Non-reciprocal transfer of information between homologous sequences; one DNA sequence replaces a homologous sequence such that the sequences become identical after the conversion event.
Variant-Type of variant on DNA level-Deletion
A sequence change where one or more nucleotides are removed (deleted).
Variant-Type of variant on DNA level-Duplication
A sequence change where a copy of one or more nucleotides is inserted directly 3' of the original copy.
Variant-Type of variant on DNA level-Gain of methylation
Gain of the normal DNA methylation level.
Variant-Type of variant on DNA level-Gene & regulatory region(s) deletion
Refers to loss of the entire gene and flanking regions.
Variant-Type of variant on DNA level-Gene & regulatory region(s) duplication
Refers to the gain of the entire gene and flanking regions.
Variant-Type of variant on DNA level-Gene deletion
Refers to loss of the entire gene.
Variant-Type of variant on DNA level-Gene duplication
Refers to gain/duplication of the entire gene.
Variant-Type of variant on DNA level-Gross deletion
Refers to loss of parts of a gene.
Variant-Type of variant on DNA level-Gross duplication
Refers to gain/duplication of part(s) of a gene.
Variant-Type of variant on DNA level-Gross inversion
Refers to 180 degree inversion of part(s) of a gene.
Variant-Type of variant on DNA level-Insertion
Genetic mutation where one or more nucleotides are added (inserted) into a DNA sequence or it may involve portions of a chromosome.
Variant-Type of variant on DNA level-Insertion/Deletion (Indel)
Refers to the mutation class that includes a combination of both insertions and deletions.
Variant-Type of variant on DNA level-Inversion
Chromosomal abnormality where a segment of a chromosome is rotated 180 degrees and reinserted.
Variant-Type of variant on DNA level-Loss of methylation
Loss of the normal DNA methylation level.
Variant-Type of variant on DNA level-Other/complex
Refers to all other types not included in any category under Type of variant on DNA level.
Variant-Type of variant on DNA level-Pathological allele (D4Z4 motif)
Deletion of 3.3-kb repeats from a chromosomal tandem repeat called D4Z4 located near the end of chromosome 4 at the 4q35-ter location. D4Z4 is a large polymorphic repeat structure consisting of 1-100 KpnI units and contains an ORF encoding a putative homeobox protein called DUX4.
Variant-Type of variant on DNA level-Repeat expansion
Refers to an increased number of repeats of a genomic tandemly repeated DNA sequence.
Variant-Type of variant on DNA level-Retrotransposon insertion
Retrotransposons (also called transposons via RNA intermediates) are genetic elements that can amplify themselves in a genome, and can induce mutations by inserting near or within genes. Retrotransposon-induced mutations are relatively stable, because the sequence at the insertion site is retained as they transpose via the replication mechanism.
Variant-Type of variant on DNA level-Substitution
A sequence change where one nucleotide is replaced by one other nucleotide. Substitutions are described using a ">"-character (indicating "changes to").
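As an illustration of the ">" convention described above, the sketch below recognizes simple single-nucleotide substitution descriptions such as "c.76A>C". The regular expression is a simplified, HGVS-like pattern assumed for this example; it does not cover the full nomenclature, and the names used are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class Substitution(NamedTuple):
    level: str      # reference type prefix, e.g. "c" (coding) or "g" (genomic)
    position: str   # position string, kept as text (e.g. "76", "88+1", "-14")
    ref: str        # original nucleotide
    alt: str        # replacing nucleotide


# <prefix>.<position><ref>><alt>, where ">" reads as "changes to".
_SUBSTITUTION = re.compile(r"^([cgmn])\.([-*]?\d+(?:[+-]\d+)?)([ACGT])>([ACGT])$")


def parse_substitution(description: str) -> Optional[Substitution]:
    """Return the parsed substitution, or None if the text is not one."""
    match = _SUBSTITUTION.match(description.strip())
    return Substitution(*match.groups()) if match else None
```

Descriptions of other variant types, such as a deletion written without the ">" character, are not matched and yield None.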
Variant-Unclassified
A variant which has not yet been assigned to any clinical significance class due to the lack of information.
Variant-Zygosity
Indicates if a variant is detected on one chromosome or on both chromosomes. Describes the degree of similarity of the alleles for a trait in an organism.
Variant-Zygosity-Hemizygous (Hem)
Used for alleles detected in genes located on the X chromosome in male cases.
Variant-Zygosity-Het/Hom/Hem
Ratio indicating the number of individuals relative to variant zygosity.
Variant-Zygosity-Heterozygous (Het)
State of a gene locus at which cells contain two different alleles of a gene.
Variant-Zygosity-Homozygous (Hom)
State of a gene locus at which identical alleles of the gene are present on both homologous chromosomes.
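The Het/Hom/Hem ratio listed among the zygosity terms can be pictured as a simple tally of individuals per zygosity. In the sketch below, the input labels ("Het", "Hom", "Hem"), the function name and the "n/n/n" output format are assumptions of this example rather than the CentoMD® representation.

```python
from collections import Counter
from typing import Iterable


def zygosity_ratio(zygosities: Iterable[str]) -> str:
    """Count individuals per zygosity and format them as Het/Hom/Hem."""
    # One label per individual carrying the variant; case is ignored.
    counts = Counter(label.strip().lower() for label in zygosities)
    # Counter returns 0 for labels that never occurred.
    return f"{counts['het']}/{counts['hom']}/{counts['hem']}"
```

For instance, three heterozygous, one homozygous and one hemizygous individual would give "3/1/1" under this assumed format.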