A review of the Stockholm consensus
EQALM symposium
Berlin 1 July 2009
Quality requirements and quality goals
Hyltoft Petersen: Stockholm consensus EQALM Berlin 2009
A review of the Stockholm consensus on analytical quality specifications
Per Hyltoft Petersen [email protected]
NOKLUS, Bergen, Norway
Background
Analytical quality specifications based on
Competing approaches to goal-setting in clinical biochemistry
Hyltoft Petersen: Stockholm consensus EQALM Berlin 2009
Analytical quality specifications based on
• clinical and biological use of measurements
• expert groups
• EQAS and PT
• state of the art
Quality Specifications ? History ? Imprecision
1963 David Tonks
ALE = 2CV = [1/4 reference range/mean] x 100% {biological}
1968 Roy Barnett
“Medically significant CV” - opinions of clinicians and laboratory specialists {clinical}specialists {clinical}
1970 Cotlove, Harris and Williams
Biological variation - tolerable analytic variabilityCV < 1/2 CVwithin-subject {biological}
1976 CAP Aspen Conference (1977) {biological}
1978 Wiveka Elion-Gerritzen
“Medically significant CV” - opinions of clinicians {clinical}
1988 Elizabeth Gowans
Specifications for acceptable bias {biological}
1980s Analysis of clinical situations [Nordic countries] {clinical}
Quality Specifications ? History ? Bias
1991 Sverre Sandberg“Medically significant CV” - opinions of patients {clinical}“Medically significant CV” - opinions of patients {clinical}
1997 Callum G. FraserLevels of quality {biological}
1990s EGE-Lab Working GroupBiological variation and state of the art {biological}
European EQA Organisers Working Group {biological}
ISO TC 212/WG3 ISO 15196“Analytical Performance Goals Based on Medical Needs“ {clinical}
Analytical quality specifications for
coefficient of variation:
The Cotlove goal for imprecision
CVAnalytical < 0.5*CVWithin-Subject
Cotlove et al. Clin Chem 1970;16:1028-32
Hyltoft Petersen: Stockholm consensus EQALM Berlin 2009
Influence of analytical variation, CVAnalytical, on the
total variation, CVTotal-Monitoring, during monitoring .
CVTotal-Monitoring2 = CVWithin-Subject
2 + CVAnalytical2 =
Influence of imprecision on monitoring CV
Hyltoft Petersen: Stockholm consensus EQALM Berlin 2009
CVWithin-Subject2 + (0.5* CVWithin-Subject)
2 =
1.25* CVWithin-Subject2 1.12* CVWithin-Subject
Increase in CVTotal Monitoringt < 12 %
Specifications for Bias:
Bias < 0.25*CV
The Gowans goal for analytical bias
Hyltoft Petersen: Stockholm consensus EQALM Berlin 2009
Bias < 0.25*CVPopulation
Gowans et al. Scand J Clin Lab Invest 1988;48:757-64
Reference interval with limits
Hyltoft Petersen: Stockholm consensus EQALM Berlin 2009
Confidence intervals for reference limits
Hyltoft Petersen: Stockholm consensus EQALM Berlin 2009
Decreasing CI for increasing N
90 % CI 90 % CI
IFCC recommendations
Sample size, N,
IFCC recommendations
Hyltoft Petersen: Stockholm consensus EQALM Berlin 2009
Sample size, N, should be > 120
N = 120
90 % CI 90 % CI
Sample size N > 3000
The Gowans goal for analytical bias
Hyltoft Petersen: Stockholm consensus EQALM Berlin 2009
N = 120
90 % CI 90 % CI
Acceptable bias
Gowans et al. Scand J Clin Lab Invest1988;48:757-64
With the common reference interval produced
without errors and based on more than 3000
we can allow for bias and imprecision instead of sample size
The Gowans goal for analytical bias
Hyltoft Petersen: Stockholm consensus EQALM Berlin 2009
Thus the common reference interval is as good for all, as if each lab had produced it according to IFCC
and with this analytical quality we can have same reference intervals for homogeneous groups in same regions
S-PotassiumAction limits for S-Potassium among clinicians in 12 hospitals
“Medically significant CV”
Each clinician gets a questionnaire with the same information about a
Elion-Gerritzen W, Thesis, 1978, Drukkerij J.H. Pasmans, S-Gravenhage
Referenceintervals
Median of low action limits and high action limits
same information about a patient and indicates for which concentration he/she will react
Frequency Distributions for two Distributions
35
0.0
5.0
10.0
15.0
20.0
25.0
0 10 20 30 40 50 60 70 80
Fre
quency d
istrib
utio
n
Prevalence 0.167
Gauss 1: n = 300
mean-1 = 25
s-1 = 5
Gauss 2: n = 60
mean-2 = 45
s-2 = 7
Cut-off = 35
Clinical approach - Classification
Two groups:Healthy and diseased
Prevalence 16.7 %
Cut-off: 35 U/L
0 10 20 30 40 50 60 70 80
Concentration
Prevalence, FP, FN and sum (fraction of all)
Cut-off
0.00
0.05
0.10
0.15
0.20
20 25 30 35 40 45 50
Concentration
FP
, FN
or su
m (fractio
n o
f all)
Prevalence 0.167
Gauss 1: n = 300
mean-1 = 25
s-1 = 5
Gauss 2: n = 60
mean-2 = 45
s-2 = 7
Sum FP+FN
Min sum (%) = 0.028
Min at 36.58
Cut-off = 35
Prevalence and
Optimum of sum
Prevalence 16.7 %
Red curve: FN as function of cut-off
Green curve: FP as function of cut-off
Blue curve: sum of FP and FN
Frequency Distributions for two Distributions
35
0.0
5.0
10.0
15.0
20.0
25.0
0 10 20 30 40 50 60 70 80
Fre
quency d
istrib
utio
n
Prevalence 0.167
Gauss 1: n = 300
mean-1 = 25
s-1 = 5
Gauss 2: n = 60
mean-2 = 45
s-2 = 7
Cut-off = 35
Clinical approach - Classification
Red curve: FN as function of bias
Green curve: FP as function of bias
Blue curve: sum of FP and FN
Effect of bias when Cut-off: 35 U/L
0 10 20 30 40 50 60 70 80
Concentration
Prevalence, FP, FN and sum (fraction of all)
Cut-off
0.00
0.05
0.10
0.15
0.20
20 25 30 35 40 45 50
Concentration
FP
, FN
or su
m (fractio
n o
f all)
Prevalence 0.167
Gauss 1: n = 300
mean-1 = 25
s-1 = 5
Gauss 2: n = 60
mean-2 = 45
s-2 = 7
Sum FP+FN
Min sum (%) = 0.028
Min at 36.58
Cut-off = 35
Prevalence and
Optimum of sum
Blue curve: sum of FP and FN
Effect of Bias, Prevalence and Sum (fraction of all)
Bias = 0
0.00
0.05
0.10
0.15
0.20
-15 -10 -5 0 5 10 15
Bias (Cut-off - Concn)
FP
, F
N o
r su
m (
fra
ctio
n o
f a
ll) Cut-off = 35
Prevalence 0.167
Gauss 1: n = 300
Gauss 2: n = 60
Sum FP+FN
Min at Bias = -1.58
Min sum (%) = 0.028
Bias = 0
Effect of Bias, Prevalence
and Optimum of sumConcentration Bias
Clinical approach - Classification
Effect of Bias, Imprecision and Prevalence
Bias = 0
0.05
0.10
0.15
0.20
FP
, F
N o
r su
m (
fra
ctio
n o
f a
ll) Cut-off = 35
Prevalence 0.167
Gauss 1: n = 300
Gauss 2: n = 60
Sum FP+FN
Min at Bias = -1.58
Min sum (%) = 0.028
Bias = 0
Effect of bias and imprecision when Cut-off: 35 U/L
Can be advancedby use of weight
Imprecision
Prevalence, FP, FN and sum (fraction of all)
Cut-off
0.00
0.05
0.10
0.15
0.20
20 25 30 35 40 45 50
Concentration
FP
, FN
or su
m (fractio
n o
f all)
Prevalence 0.167
Gauss 1: n = 300
mean-1 = 25
s-1 = 5
Gauss 2: n = 60
mean-2 = 45
s-2 = 7
Sum FP+FN
Min sum (%) = 0.028
Min at 36.58
Cut-off = 35
Prevalence and
Optimum of sum
Effect of Bias, Prevalence and Sum (fraction of all)
Bias = 0
0.00
0.05
0.10
0.15
0.20
-15 -10 -5 0 5 10 15
Bias (Cut-off - Concn)
FP
, F
N o
r su
m (
fra
ctio
n o
f a
ll) Cut-off = 35
Prevalence 0.167
Gauss 1: n = 300
Gauss 2: n = 60
Sum FP+FN
Min at Bias = -1.58
Min sum (%) = 0.028
Bias = 0
Effect of Bias, Prevalence
and Optimum of sum
0.00
-15 -10 -5 0 5 10 15
Bias (Cut-off - Concn)
FP
, F
N o
r su
m (
fra
ctio
n o
f a
ll)
CV% at cut-off = 14
Effect of Bias, Imprecicion
and Optimum of sum
by use of weight factors for FP and FN
BiasConcentration
Bias
Economical approach - Costs
Cost of follow-up ($) as function of measured Ca-concentration
Bias: + 5 %
Gallaher MP, Mobley LR, Klee GG, Schryver P. The impact of calibration error in medical decision error. NIST 2004
Bias: + 5 %
Editorial:
Fraser CG, Hyltoft Petersen P.
Analytical performance
characteristics should be judged
against objective quality
specifications.
Clin Chem 1999;45:321-3
Stockholm - Consensus Conference
Organiser: Anders Kallner
Kenny et al. SJCLI 1999; 59:585
What was achieved in Stockholm?
The consensus from the conference describes a hierarchical structure of approaches to estimation of analytical approaches to estimation of analytical
quality specifications
What was not achieved in Stockholm?
The agreement from the Stockholm conference was not
followed directly by ISO/TC 212 - ISO 15 196, because the
chairman, Larry Kapland, completely changed his mind after
the conference
So we gave up and didn’t support the ISO-group furtherSo we gave up and didn’t support the ISO-group further
However, there is now a publication, ISO 15 196, with
recommendations very close to the consensus from the
Stockholm conference, but we do not know who are the
authors – and there is no reference to the Stockholm
consensus or to the editorial which was also very close to the
consensus
What was not achieved in Stockholm?
There was no discussion about matrix-effects and
consequently no specifications for allowable matrix
CoaguChek S, lot 568
30
%
Control of INR-kit
Bias = – 15 %
CVA = 3.1 %
Difference plotDifference (%)30
Hyltoft Petersen: Stockholm consensus EQALM Berlin 2009
-30
0
0,5 1,5 2,5 3,5 4,5 5,5
Roskilde, INR
CVMatrix = 7.9 %
CVTotal2 =
CVMatrix2 + CVA
2
CVTotal = 8.5 %
-30
0
Reference method (INR)Borrowed from Esther A. Jensen, Denmark
What was not achieved in Stockholm?
There was no discussion about matrix-effects and
consequently no specifications for allowable matrix
There was no discussion about measurements on ordinal scale
Ordinal scale – dichotomous test
U-hCG
0.99Medical limit = 5
Company limit = 25
2
3
-1 0 1 2 3 4 5 6ln of concentration
Fra
cti
on
of
1
Rankit-plot of Ordinal data (0 and 1), Fraction of 1
Urine-hCG
Fraction of
positive
0.90
0.50
0.10
0.01
31 10 30 100 300
Medical limit = 40
-3
-2
-1
0
1
Ran
kit
(z-v
alu
e)
Fra
cti
on
of
1
Concentration (IU/L)
Lower limit: 5 IU/L Upper limit: 40 IU/L
Hyltoft Petersen et al.
Scand J Clin Lab Invest
2008;68:298-311
Percentage of reported values versus concentrations
28.05.5 14.0 55.0
80.0
100.0
Kit-reference conc
Probability of measuring
Kit-value
Ordinal scale – semi-quantitative test
Urine-glucose
Kit-values
Percentage of measurements with
each Kit-value
0.0
20.0
40.0
60.0
80.0
0.1 1.0 10.0 100.0
True concentration
Pe
rce
nta
ge 0
Kit-value 5.5
Kit-value 14.0
Kit-value 28.0
Kit-value 55.0
Kit-value
5.5
14.0
28.0
55.0Bayer Visual
Urine-
Glucose
Kit-values
0 mmol/L
5.5 mmol/L
14 mmol/L
28 mmol/L
55 mmol/L
Concentration (mmol/L)Hyltoft Petersen et al.
Scand J Clin Lab Invest
2009; in press
What was not achieved in Stockholm?
There was no conclusion about absolute and relative quality:
Deviation from a ‘true’ value
There was no discussion about matrix-effects and
consequently no specifications for allowable matrix
There was no discussion about measurements on ordinal scale
Deviation from a ‘true’ value
Deviation from the method mean
5
10
15
De
via
tio
n fro
m M
eth
od
Me
an
Validation of Methods and
Validation of Participants
GOOD
METHOD
GOOD
RANGE
WITHIN
METHOD
-10
-5
0
De
via
tio
n fro
m M
eth
od
Me
an
-10 -5 0 5 10 15 Method Bias
GOOD
PER-
FOR-
MER
YOUR
RESULT
What was not achieved in Stockholm?
There was no conclusion about absolute and relative quality:
Deviation from a ‘true’ value
There was no discussion about matrix-effects and
consequently no specifications for allowable matrix
There was no discussion about measurements on ordinal scale
Deviation from a ‘true’ value
Deviation from the method mean
There was no agreement on which level of quality should be achieved
The effect of imprecision on test result variability
Desirable quality
Optimum
Minimum quality
Optimum
Fraser et al. Ann Clin Biochem 1997;34:8-12.
Effect of bias on reference values
Fraser et al. Ann Clin Biochem 1997;34:8-12.
What was not achieved in Stockholm?
There was no conclusion about absolute and relative quality:
Deviation from a ‘true’ value
There was no discussion about matrix-effects and
consequently no specifications for allowable matrix
There was no discussion about measurements on ordinal scale
Deviation from a ‘true’ value
Deviation from the method mean
There was no agreement on which level of quality should be achieved
There was no agreement on consequences of poor quality
NORIP project on common reference intervals
Creatinine
Hyltoft Petersen: Stockholm consensus EQALM Berlin 2009
Abbot Bayer Roche Cobas
Roche Hitachi-Modular Vitros
www.furst.no/norip
1
1,2
1,4
Rela
tive m
ed
ian
Elecsys AutoDELFIA Arckitect Byk Sangtek Access
External control of TSH
Deutsche Gesellschaft für Klinische Chemie
Youden plot
Co
ntr
ol 2
(co
nc)
30 %
30 %
Five dominating methods
0
0,2
0,4
0,6
0,8
2/00 4/00 1/01 1/02 1/03 2/03 3/03 1/04 2/04 1/05 2/05
Rela
tive m
ed
ian
Control 1 (conc)
Co
ntr
ol 2
(co
nc)
± 30 %
±30 %
30 %
Relative median
2000 Year 2005
Petersen et al. Symposium abstracts –IFCC – WorldLab Fortaleza, Clin Chem Lab Med 2008;46 special suppl:S148
What was not achieved in Stockholm?
There was no conclusion about absolute and relative quality:
Deviation from a ‘true’ value
There was no discussion about matrix-effects and
consequently no specifications for allowable matrix
There was no discussion about measurements on ordinal scale
Deviation from a ‘true’ value
Deviation from the method mean
There was no agreement on which level of quality should be achieved
There was no agreement on consequences of poor quality
There was no agreement on the relation between clinical/biological
specifications and specifications for EQAS and PT
Transferrin: External Quality Assessment
Deutche Gesellschaft
für Klinische Chemie
Youden plot
Transferrin results
Target ± 17 %
Transferrin results
for two controls
Target ± 7 %
Analytical quality specifications according to the EGE-Lab
criteria: ± 7 %
What was not achieved in Stockholm?
There was no conclusion about absolute and relative quality:
Deviation from a ‘true’ value
There was no discussion about matrix-effects and
consequently no specifications for allowable matrix
There was no discussion about measurements on ordinal scale
Deviation from a ‘true’ value
Deviation from the method mean
There was no agreement on which level of quality should be achieved
There was no agreement on consequences of poor quality
There was no agreement on the relation between clinical/biological
specifications and specifications for EQAS and PT
Top Related