Reliability and agreement between 2 strength devices used in the newly modified and standardized Constant score

Morten Tange Kristensen, PT, PhD a,b,c,*, Maria Aagesen, PT d, Signe Hjerrild, PT d, Pernille Lund Skov Larsen, PT d, Bente Hovmand, PT, MSc d, Ilija Ban, MD c,e

a Physical Medicine and Rehabilitation Research–Copenhagen (PMR-C), Copenhagen, Denmark
b Department of Physical Therapy, Copenhagen University Hospital Hvidovre, Copenhagen, Denmark
c Department of Orthopedic Surgery, Copenhagen University Hospital Hvidovre, Copenhagen, Denmark
d Faculty of Physical Therapy, Metropolitan University College Copenhagen, Copenhagen, Denmark
e Clinical Orthopaedic Research Hvidovre, Copenhagen, Denmark

Hypothesis: The new and standardized test protocol for the Constant score (CS) provides new methodology, but different devices are still used for shoulder strength testing. It was hypothesized that strength measurements using the IsoForceControl (IFC) dynamometer (MDS Medical Device Solutions, Oberburg, Switzerland) would provide results comparable with the IDO isometer (Innovative Design Orthopaedics, Redditch, UK).

Materials and methods: Sixty healthy subjects, aged 19 to 83 years, were studied, with 5 men and 5 women in each of 6 ten-year age groups. The IFC and IDO were used in randomized order with an 8-minute interval between testing. Subjects performed 3 successive trials with strong verbal encouragement, with 1 minute between trials. The best strength performance was used in the analysis. The rater and subjects were blinded to all results.

Results: The IFC produced 0.28-kg (0.62-lb) higher strength values on average than the IDO (P = .002). The intraclass correlation coefficient (ICC2,1) was 0.97 (95% confidence interval, 0.95-0.98), whereas the standard error of measurement and smallest real difference were 0.43 kg (0.95 lb) and 1.2 kg (2.63 lb), respectively. The total CS and strength reached mean values of 92.4 points (SD, 6.2 points) and 8.2 kg (SD, 2.6 kg) (18.0 lb [SD, 5.8 lb]), respectively, and were negatively associated with age (r > −0.407, P ≤ .001). The strength values decreased (P ≤ .001) by 1.3 CS points per decade, and women had strength values that were 8 CS points lower on average than those of men of the same age.

Conclusions: The relative (intraclass correlation coefficient) and absolute (standard error of measurement) reliability between the IFC and IDO is excellent, indicating that performances reported from settings using the IDO are comparable with those recorded with the IFC in other settings.

Level of evidence: Basic Science, Kinesiology.

© 2014 Journal of Shoulder and Elbow Surgery Board of Trustees.

Keywords: Shoulder; Constant score; standardization; strength devices; intrarater reliability; agreement

Institutional review board or ethical committee approval: not applicable.

*Reprint requests: Morten Tange Kristensen, PT, PhD, Department of Physical Therapy 236, Hvidovre University Hospital, Kettegaard Alle 30, DK-2650 Copenhagen, Denmark.
E-mail address: [email protected] (M.T. Kristensen).

J Shoulder Elbow Surg (2014) 23, 1806-1812
www.elsevier.com/locate/ymse
1058-2746/$ - see front matter © 2014 Journal of Shoulder and Elbow Surgery Board of Trustees.
http://dx.doi.org/10.1016/j.jse.2014.04.011

Figure 1 IDO isometer strength device.

The Constant score (CS)10 is extensively used to evaluate outcome for patients with various shoulder disorders.17,20,21,26,28 Correspondingly, a large number of studies have examined or reviewed the psychometric properties of the CS (0-100 points) or parts of the score, for example, the shoulder strength part (0-25 points).1,3,6-8,14-16,19,22-25,27,29

Various test protocols have routinely been used at different centers, probably because of authors' own interpretation of Constant's original work but also because of a lack of an internationally accepted and standardized test protocol. Constant et al9 published a new guideline report in 2008 to solve some of the methodology problems associated with the CS, but still without including a standardized test protocol. A standardized CS test protocol in Danish and English was provided in 2013,2 based on this new guideline report and the original article by Constant and Murley.9,10 However, this protocol has not been validated, which is a requisite if it is to be recommended for use in daily clinical practice. The standardized protocol provides a thorough description of the CS including the assessment method of maximum shoulder strength (part D of the CS)2 but does not recommend use of a specific strength device. Strength devices differ, ranging from an unsecured mechanical spring balance to several digital devices that differ greatly in cost. The Isobex dynamometer and the IsoForceControl (IFC) dynamometer (both devices from MDS Medical Device Solutions, Oberburg, Switzerland) have been used in several previous studies and are, by many investigators, considered the gold standard.1,6,14-16,18,23,27 However, the IFC is expensive (approximately €1,600), which limits many centers from acquiring it. Several newer and cheaper devices designed specifically for the CS strength component are now available. One of these is the IDO isometer (Innovative Design Orthopaedics, Redditch, UK), a transportable digital dynamometer, costing approximately €350, that is designed for the CS strength component. Whether the results of strength assessment using the low-budget IDO are comparable with the results of the expensive IFC is unknown.

The primary aim of this study was to test the hypothesis that the relative reliability between strength measures using the IFC and IDO is excellent and that measurement error (absolute reliability) is low when using the newly standardized CS test protocol.2 The secondary aim was to examine whether the total CS and the strength part (part D) are related to age and sex in healthy subjects.

Material and methods

Sixty adult volunteers (30 women and 30 men, aged 19-83 years) from the Copenhagen area were tested within a 2-week period. The following inclusion criteria were used: age 18 years or older, ability to give informed consent, ability to speak and understand Danish, and no current shoulder problems. Subjects were recruited by telephone or e-mail, according to 6 prespecified age groups, with 10 subjects (5 women and 5 men) in each of the following 6 age groups: 18 to 29 years, 30 to 39 years, 40 to 49 years, 50 to 59 years, 60 to 69 years, and 70 to 90 years.15 All subjects received written and oral information about the project and signed an informed consent form before they were tested.

Strength devices

The IDO isometer is a transportable, digital CE-registered dynamometer designed to measure the average isometric muscle strength in kilograms or pounds over a period of 3 seconds (Fig. 1).

The IFC is a transportable, digital CE-registered dynamometer that registers strength values in newtons at the start and end of a self-selected preset period between 5 and 20 seconds, as well as maximum and average strength values (Fig. 2). The peak force produced over the period from 2 to 4 seconds in a 5-second period was used in our analysis. Values in newtons were converted to kilograms by dividing by 9.82, because 1 kg of force corresponds to 9.82 N.
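The conversion described above can be sketched as a short function. This is a minimal illustration only; the function and constant names are hypothetical, and the input value in the usage line is an arbitrary example, not a measurement from the study.

```python
# Divisor used in the text to convert newtons to kilograms (gravitational
# acceleration in m/s^2, so that 1 kg of force corresponds to 9.82 N).
NEWTONS_PER_KG = 9.82


def newtons_to_kg(force_newtons: float) -> float:
    """Convert a force reading in newtons to kilograms of force."""
    return force_newtons / NEWTONS_PER_KG


def kg_to_newtons(force_kg: float) -> float:
    """Inverse conversion, from kilograms of force to newtons."""
    return force_kg * NEWTONS_PER_KG


# Arbitrary example reading: 98.2 N corresponds to about 10 kg.
example_kg = newtons_to_kg(98.2)
```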

Procedures

All test procedures and instructions were standardized according to the newly published CS test protocol.2 The 2 strength devices are self-calibrating when started, but the agreement between devices was examined with a 5- and 10-kg weight load. The best of 3 recordings for each device showed a difference between devices of 0.02 kg and 0.04 kg. Pilot testing of the protocol and all procedures was performed in 10 subjects not included in the project. Two senior physiotherapy students were used as test raters. The first rater, who instructed all test subjects on the objective parts of the CS (parts C and D) including strength assessment, was blinded to all measurements and kept unaware of the results until all subjects had been tested. The second rater noted the results during testing and ensured that all subjects completed the subjective parts of the CS (parts A and B). Both raters were within 6 months of graduation, and the role of physiotherapy students as raters corresponds with a previous study with a focus similar to the present study.14

Figure 2 IFC strength device.

The assessment of the subjective parts of the CS, as well as the assessment of shoulder movement (part C), was performed in all cases before each of the 2 strength measurement sessions. Shoulder movement was assessed using a long-armed goniometer. Each test series of maximum isometric muscle strength contained 3 repetitions for the dominant arm, separated by a 1-minute rest interval. There was an 8-minute break between the 2 test series. The order of strength device testing (IFC or IDO) was randomized by computer to ensure that each device was used as the first device in 30 subjects. During testing, the subjects stood with their feet pointing directly forward at shoulder width with their arm in 90° of abduction in the scapular plane, the elbow extended as much as possible, and the wrist pronated.2 The strap of the dynamometer was placed around the wrist over the caput ulnae of the subject, and the rater was positioned beside the arm to be tested.

The rater guided the subject verbally throughout the test, instructing the subject to push maximally upward for 5 seconds. Verbal encouragement was given simultaneously throughout each period: "Ready 3-2-1 push, push, push..." The highest score of 3 attempts for each device was used in the statistical analyses.

Data analysis

Descriptive statistics were calculated for age, sex, dominant arm, previous shoulder injuries, and whether the subjects were students, were employed, or had retired. Strength values capped at 11.36 kg (equal to 25 lb, the maximum score for the CS strength assessment [part D]) were used in the analyses and for calculation of the total CS (0-100 points). The paired t test (same rater) was used to examine for systematic bias between the 2 strength devices, whereas the intraclass correlation coefficient (ICC2,1; 2-way random-effects model, consistency, single measure) with 95% confidence interval was used to calculate the relative reliability.30 According to Fleiss' classification,12 an ICC above 0.75 indicates excellent reliability; between 0.40 and 0.75, fair to good reliability; and below 0.40, poor reliability.
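A consistency-type, single-measure ICC from a 2-way model can be computed from the ANOVA mean squares. The sketch below is a minimal illustration under that assumption and is not the SPSS routine used in the study; the function names and the small data set are hypothetical. Because the consistency definition excludes the between-device variance, a constant offset between devices does not lower this coefficient, which is why systematic bias is examined separately with the paired t test.

```python
def icc_consistency_single(scores):
    """Consistency, single-measure ICC from a 2-way ANOVA decomposition.

    scores: list of rows, one row per subject, one column per device.
    Returns (MS_subjects - MS_error) / (MS_subjects + (k - 1) * MS_error).
    """
    n = len(scores)          # number of subjects
    k = len(scores[0])       # number of devices
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_subjects = k * sum((m - grand) ** 2 for m in row_means)
    ss_devices = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_subjects - ss_devices

    ms_subjects = ss_subjects / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_subjects - ms_error) / (ms_subjects + (k - 1) * ms_error)


def fleiss_label(icc):
    """Classify an ICC using the thresholds cited in the text."""
    if icc > 0.75:
        return "excellent"
    if icc >= 0.40:
        return "fair to good"
    return "poor"


# Hypothetical data: 3 subjects, 2 devices (not values from the study).
demo_scores = [[1.0, 2.0], [2.0, 3.0], [3.0, 5.0]]
demo_icc = icc_consistency_single(demo_scores)
```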

To assess absolute reliability, the standard error of measurement (SEM) was calculated as $\mathrm{SEM} = \mathrm{SD} \times \sqrt{1 - \mathrm{ICC}}$, where SD is the standard deviation of all strength scores (mean of IFC and IDO) from all subjects.13,30 The SEM is the estimated standard deviation of measurement error, or the difference between the observed values and the true values, and gives a clinician a result in the same unit as the measurement.13 To quantify the required change in strength that must be observed (exceeding measurement error and reflecting a real change at the individual level), the smallest real difference (SRD) was calculated at a 95% confidence level as $\mathrm{SRD} = \mathrm{SEM} \times 1.96 \times \sqrt{2}$ (1.96 because of the 95% confidence interval and $\sqrt{2}$ because of the difference of 2 variances).4 In addition, we calculated SEM% and SRD% as follows: $\mathrm{SEM\%} = (\mathrm{SEM}/\mathrm{mean}) \times 100$ and $\mathrm{SRD\%} = (\mathrm{SRD}/\mathrm{mean}) \times 100$, in which mean is the mean of the IFC and IDO scores.
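The SEM and SRD formulas can be checked numerically. The sketch below is illustrative; the function names are hypothetical, and the SD of 2.5 kg in the usage lines is an assumed round value chosen for the example, not a figure reported in the study. Feeding the reported SEM of 0.43 kg into the SRD formula gives approximately 1.19 kg, in line with the reported SRD of 1.2 kg.

```python
import math


def standard_error_of_measurement(sd: float, icc: float) -> float:
    """SEM = SD * sqrt(1 - ICC)."""
    return sd * math.sqrt(1.0 - icc)


def smallest_real_difference(sem: float, z: float = 1.96) -> float:
    """SRD = SEM * z * sqrt(2), with z = 1.96 for the 95% level."""
    return sem * z * math.sqrt(2.0)


def percent_of_mean(value: float, mean: float) -> float:
    """Express SEM or SRD as a percentage of the mean score."""
    return value / mean * 100.0


# Assumed SD of 2.5 kg (illustrative) with the reported ICC of 0.97.
sem_demo = standard_error_of_measurement(2.5, 0.97)

# SRD computed from the reported SEM of 0.43 kg.
srd_from_reported_sem = smallest_real_difference(0.43)
```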

A Bland-Altman plot was used as a qualitative method5 to illustrate the magnitude of agreement between the 2 strength devices. The differences between the IFC and IDO were plotted against the respective average. A Pearson correlation coefficient was used to determine the correlation between age and the strength values and total CS, as well as to determine whether the numerical between-strength device differences were correlated with the mean strength values from the IFC and IDO from all subjects (heteroscedasticity, with a significantly larger variability for higher-strength values).5
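The quantities behind a Bland-Altman plot can be sketched without plotting. The example below is illustrative only: the function names and the 4 paired readings are hypothetical, the 95% limits of agreement are the conventional mean difference ± 1.96 SD, and the heteroscedasticity check correlates the absolute differences with the pair means, which is one reading of "numerical differences" and is an assumption here.

```python
import math
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def bland_altman(device_a, device_b):
    """Return bias, 95% limits of agreement, and a heteroscedasticity r."""
    diffs = [a - b for a, b in zip(device_a, device_b)]
    pair_means = [(a + b) / 2 for a, b in zip(device_a, device_b)]
    bias = mean(diffs)
    sd_diff = stdev(diffs)  # sample SD of the paired differences
    return {
        "bias": bias,
        "lower": bias - 1.96 * sd_diff,
        "upper": bias + 1.96 * sd_diff,
        # Assumption: absolute differences vs. pair means.
        "heteroscedasticity_r": pearson_r([abs(d) for d in diffs], pair_means),
    }


# Hypothetical paired readings in kg (not data from the study).
ifc_demo = [8.0, 10.5, 6.2, 9.1]
ido_demo = [7.8, 10.1, 6.0, 8.9]
result = bland_altman(ifc_demo, ido_demo)
```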

According to recommendations by Eliasziw et al,11 we needed to include approximately 50 subjects to yield 80% power at the 5% significance level to achieve an ICC set at 0.90 in our interdevice reliability study. To include 10 subjects in each of the 6 age groups, we included 60 subjects. Previous studies have shown that men, on average, present strength values that are 3.2 kg (7 lb) higher than those for women.14,28 A strength difference between sexes of at least 2.3 kg (5 lb) was considered an important difference in our study. With a standard deviation of the observations in each group set at 2.3 kg (5 lb), we required 16 subjects in each group.
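The group size for the sex comparison can be reproduced with the usual normal-approximation formula for comparing 2 means. The text does not state which formula was used, so the choice below is an assumption that happens to give the same answer; it also assumes 80% power and a 2-sided 5% significance level, carried over from the preceding reliability calculation. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def subjects_per_group(delta: float, sd: float,
                       alpha: float = 0.05, power: float = 0.80) -> int:
    """Normal-approximation sample size per group for a 2-sample comparison.

    n = 2 * ((z_{1-alpha/2} + z_{power}) * sd / delta)^2, rounded up.
    """
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_power = NormalDist().inv_cdf(power)
    n = 2.0 * ((z_alpha + z_power) * sd / delta) ** 2
    return math.ceil(n)


# Difference of 2.3 kg with an SD of 2.3 kg, as stated in the text.
n_required = subjects_per_group(delta=2.3, sd=2.3)
```

With 30 men and 30 women, the study exceeds this requirement for the overall sex comparison, although not within individual age groups.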

Finally, we used simple and multiple linear regression analyses (enter method) to examine the influence of age and sex on the total CS and the strength part of the CS (part D).28 Age was entered as a continuous variable and men as the reference value for sex.

Data are presented as mean and standard deviation or as number of patients and percentage, as appropriate. The statistical analyses were conducted using SPSS software, version 19.0 (SPSS, Chicago, IL, USA). The level of significance was set at P < .05, and all statistical tests were 2 tailed.

Results

Of the 60 subjects, 15 were students (bachelor's or higher level), 31 were working, and 14 were retired; 55 used the right arm as the dominant arm during testing; 48 reported that they were physically active; 9 reported a previous injury in the dominant shoulder; and the mean age for all subjects was 49.4 years (SD, 18.5 years). The 15 students, with a mean age of 29.6 years (SD, 7.4 years), presented significantly (P ≤ .02) higher total CS and strength values than the 14 retired subjects (mean age, 73.4 years [SD, 7.6 years]); otherwise, no significant differences were seen in the corresponding values between any of the abovementioned groups (P > .1). Ten subjects in total did not reach a maximum score in 1 or more of the subjective and objective parts of the CS (parts A-C). Thus, 3 patients achieved 13 out of the 15 possible points in the pain score (part A); 5 subjects achieved 10 (n = 1), 18 (n = 1), and 19 (n = 3) out of the 20 possible points for activities of daily living (part B); and 6 subjects achieved 34 (n = 1) and 38 (n = 5) out of the possible 40 points for range of motion (part C).

Figure 3 Bland-Altman plot comparing strength values of IFC and IDO strength measurement devices. The plot shows a nonsignificant correlation (r = −0.164, P = .2) for the numerical difference between the IFC and IDO and the mean of the respective measurements (no heteroscedasticity) but a systematic bias across trials (P = .001).

Figure 4 Comparison of raw strength values according to sex (A) and CS (B) by age group.

Strength devices

Performances measured with the IFC and IDO were highly correlated (r = 0.971, P < .001), but the IFC produced adjusted strength values that were 0.28 kg (SD, 0.64 kg) (0.62 lb [SD, 1.4 lb]) higher on average than those with the IDO (P = .001). The ICC2,1 between the IFC and IDO was 0.97 (95% confidence interval, 0.95-0.98), whereas the SEM and the SRD were 0.43 kg (0.95 lb) (SEM%, 4.6%) and 1.2 kg (2.63 lb) (SRD%, 12.9%), respectively. The Bland-Altman plot (Fig. 3) with adjusted strength values showed no signs of heteroscedasticity (r = −0.165, P = .2).
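As a rough check of the systematic bias, the paired t statistic can be recomputed from the reported mean difference (0.28 kg), its SD (0.64 kg), and n = 60. The limits of agreement in the second line are not reported in the text; they are derived here from the same rounded values using the conventional ± 1.96 SD definition and are therefore approximate.

```latex
% Paired t statistic from the reported mean difference and its SD (n = 60)
t = \frac{\bar{d}}{s_d / \sqrt{n}}
  = \frac{0.28}{0.64 / \sqrt{60}}
  \approx \frac{0.28}{0.0826}
  \approx 3.39

% Approximate 95% limits of agreement (derived, not reported)
\bar{d} \pm 1.96\, s_d = 0.28 \pm 1.96 \times 0.64
  \approx -0.97 \text{ to } 1.53 \ \text{kg}
```

A t value of about 3.39 with 59 degrees of freedom lies well beyond the 2-sided 5% critical value of approximately 2.00, consistent with the reported significant bias.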

Influence of age and sex

Seventeen of the 30 men presented raw strength values above the possible 25 points (part D), whereas all women had raw strength values below 19 points (Fig. 4, A). The total CS and adjusted strength values reached an average of 92.4 points (SD, 6.2 points) and 8.2 kg (SD, 2.6 kg) (18.0 lb [SD, 5.8 lb]), respectively, and were negatively associated with age (r > −0.407, P ≤ .001), as shown in Figure 4.
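The relation between raw and adjusted strength values can be sketched as follows. This is an illustration under 2 assumptions: 1 CS strength point per pound of force, and a conversion factor taken from the stated cap (11.36 kg equal to 25 lb). The function and constant names are hypothetical.

```python
MAX_STRENGTH_POINTS = 25.0          # maximum for part D of the CS
CAP_KG = 11.36                      # strength cap stated in the text
POINTS_PER_KG = MAX_STRENGTH_POINTS / CAP_KG  # assumed: 1 point per lb


def strength_points(strength_kg: float) -> float:
    """Convert a strength value in kg to CS strength points, capped at 25."""
    if strength_kg < 0:
        raise ValueError("strength must be non-negative")
    return min(strength_kg * POINTS_PER_KG, MAX_STRENGTH_POINTS)


# The mean adjusted strength of 8.2 kg corresponds to about 18 points.
mean_points = strength_points(8.2)

# A raw value above the cap (hypothetical 14 kg) is limited to 25 points.
capped_points = strength_points(14.0)
```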

Men presented higher (P < .001) total CS (mean, 96.0 points vs 88.8 points) and adjusted strength (mean, 10 kg [22.0 lb] vs 6.36 kg [14.0 lb]) values than women in general, as well as in most of the 6 age groups (Table I and Fig. 4). In comparison, subjects aged 70 years or older presented significantly (P ≤ .04) lower CS and strength values than those in the 4 age groups from 19 to 59 years but not in those aged 60 to 69 years. Furthermore, the influence of age and sex on the total CS and strength performances when evaluated in both simple and multiple linear regression analyses was significant (P ≤ .001) (Table II). Thus, women on average presented strength values that were 8 CS points lower than men (β weight, −0.701), whereas the corresponding values for age showed a decrease of 1.3 CS points (β weight, −0.403) per decade (Table II). The regression model was statistically stable and explained 66% (r = 0.809) of the variation in strength performances.

Table I Total CS and strength measurements according to sex and age group in 60 healthy subjects, aged 19 to 83 years

| Age group | CS (points)*: All (n = 10)† | CS: Men (n = 5)† | CS: Women (n = 5)† | P value for CS | Strength (kg), ≤11.36 kg (≤25 CS strength points): All (n = 10)† | Strength: Men (n = 5)† | Strength: Women (n = 5)† | P value for strength |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 19-29 y | 95.4 (4.7) | 99.6 (0.9) | 91.2 (2.0) | <.001 | 9.3 (2.1) | 11.2 (0.4) | 7.4 (0.9) | <.001 |
| 30-39 y | 94.3 (5.7) | 99.4 (0.9) | 89.2 (2.5) | <.001 | 9.0 (2.6) | 11.3 (0.2) | 6.6 (1.3) | <.001 |
| 40-49 y | 94.3 (5.3) | 98.3 (3.9) | 90.3 (2.9) | .006 | 8.8 (2.4) | 10.6 (1.8) | 7.0 (1.3) | .006 |
| 50-59 y | 92.8 (5.3) | 95.3 (6.6) | 90.3 (1.9) | .168 | 8.7 (2.1) | 10.5 (1.0) | 7.0 (0.9) | <.001 |
| 60-69 y | 90.5 (7.4) | 92.2 (9.9) | 88.7 (4.2) | .259 | 7.4 (3.0) | 8.5 (3.7) | 6.2 (1.9) | .500 |
| ≥70 y | 87.2 (6.1) | 91.53 (5.5) | 82.9 (2.5) | .013 | 6.0 (2.5) | 8.0 (2.0) | 4.0 (0.7) | .003 |
| All | 92.4 (6.2) (N = 60) | 96.0 (6.1) (n = 30) | 88.8 (3.8) (n = 30) | <.001 | 8.2 (2.6) (N = 60) | 10 (2.2) (n = 30) | 6.4 (1.6) (n = 30) | <.001 |

Values are reported as mean (standard deviation).
* For the CS, 0 points is the worst possible score and 100 points is the best possible score.
† The number of patients in each decade group is shown.

Table II Linear regression analysis of factors influencing CS and maximum isometric shoulder abduction strength values

| Variable | Crude β value | 95% CI for crude β value | P value for crude β value | Adjusted β value | 95% CI for adjusted β value | P value for adjusted β value |
| --- | --- | --- | --- | --- | --- | --- |
| CS* | | | | | | |
| Age (y) | −0.151 | −0.230 to −0.072 | <.001 | −0.134 | −0.186 to −0.082 | <.001 |
| Women | −7.3 | −9.9 to −4.7 | <.001 | −7.8 | −9.7 to −5.9 | <.001 |
| Maximum isometric shoulder abduction strength† | | | | | | |
| Age (y) | −0.127 | −0.203 to −0.052 | .001 | −0.127 | −0.176 to −0.078 | <.001 |
| Women | −8.1 | −10.3 to −5.9 | <.001 | −8.1 | −9.9 to −6.3 | <.001 |

CI, Confidence interval.
* Total CS (0-100 points).
† Part D of CS (0-25 points).

Discussion

This study, following the newly standardized guidelines for CS testing,2 found excellent relative reliability (ICC, 0.97) and very low measurement noise at a group level (<1 CS point) between the 2 strength devices examined: the IFC and the IDO. Thus, an increase or decrease by 1 or more CS strength points can be considered a true difference for a group of subjects, enabling comparison between data recorded with the IFC and IDO. Furthermore, we found that the total CS and both the raw and adjusted strength values were related to the age and sex of subjects.

Strength devices

The systematic error between devices in this study of 0.28 kg (0.62 lb) corresponds well to findings in previous studies evaluating whether different strength devices yield the same result.3,15,29 Some of the difference in this study could be explained by the different periods and values recorded by the 2 evaluated devices. The IDO records the average force for 3 seconds (the only available setting), whereas the IFC enables recording of both the mean and the peak force produced between 2 and 4 seconds of a 5-second trial. Thus, maximum strength values seem to be consistently higher than mean strength values15,29 and are recommended for use in the CS.2,9 Accordingly, we compared the maximum average force (IDO) with the maximum peak force (IFC) for 3 repetitions from each device, which resulted in very similar values. Thus, it seems as if results for a group of persons assessed with the low-cost IDO, though not able to measure the peak force, can be compared with results from another group assessed with the more advanced and costly IFC strength measurement device. Two of the previous studies evaluating the results of different strength devices did not report data for relative and absolute reliability,3,29 whereas our ICC values are very similar to the value reported for a comparison between 2 other strength devices (ICC, 0.96; 95% confidence interval, 0.91-0.99) also using maximum values of a pull force technique.15

Influence of age and sex

Our finding that age and sex affect the total CS and strength measurements indicates a high degree of external validity. The study was not powered to evaluate differences between age groups, and our subgroups were smaller than those of previous studies.15,28 However, our results are still comparable with those reported by Katolik et al16 in a larger group of patients (N = 441) divided into the same age groups and those reported by Walton et al29 in 108 patients aged between 50 and 89 years, both studies in patients without shoulder symptoms. In our study, with equally sized age groups of men and women (age range, 19-83 years), we found that the total strength values fell by 0.13 CS points per year (1.3 per decade) whereas the corresponding strength values were approximately 3.6 kg (8 CS points) lower for women than for men. Correspondingly, Walton et al reported that the CS fell by 0.3 points per year and that strength values, on average, were 7.5 CS points higher for men, whereas our CS and strength values in relation to age groups (Fig. 4) are quite similar to those shown by Katolik et al in Figures 2 and 3 in their study.16
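The per-decade and kilogram figures quoted above follow from the adjusted regression coefficients in Table II by simple scaling. The worked arithmetic below assumes 1 CS strength point per pound and uses the stated equivalence of 25 lb to 11.36 kg for the conversion.

```latex
% Age effect: adjusted coefficient per year scaled to a decade
0.127 \ \text{points/year} \times 10 \ \text{years}
  = 1.27 \approx 1.3 \ \text{points per decade}

% Sex effect: points converted to kilograms via 25 points = 11.36 kg
8 \ \text{points} \times \frac{11.36 \ \text{kg}}{25 \ \text{points}}
  \approx 3.6 \ \text{kg}
```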

Overall, our findings support results from previous studies and further emphasize the importance of the strength component in the CS, in addition to the need for age and sex stratification, when using the CS in research studies.

Conclusions

The results of this study show that performances of the standardized strength test in the CS, carried out with the IFC and the IDO, are comparable at a group level because high relative reliability and very low measurement noise were found. It is recommended that an individual be tested with the same device to measure change over time. Our findings that age and especially sex affected strength values and thereby the total CS of healthy subjects are in accordance with results previously reported, which indicate the need for stratification of patients when using the CS in research projects.

Disclaimer

The authors, their immediate families, and any research foundations with which they are affiliated have not received any financial payments or other benefits from any commercial entity related to the subject of this article.

References

1. Bafus BT, Hughes RE, Miller BS, Carpenter JE. Evaluation of utility in shoulder pathology: correlating the American Shoulder and Elbow Surgeons and Constant scores to the EuroQoL. World J Orthop 2012;3:20-4. http://dx.doi.org/10.5312/wjo.v3.i3.20

2. Ban I, Troelsen A, Christiansen DH, Svendsen SW, Kristensen MT. Standardised test protocol (Constant score) for evaluation of functionality in patients with shoulder disorders. Dan Med J 2013;60:A4608.

3. Bankes MJ, Crossman JE, Emery RJ. A standard method of shoulder strength measurement for the Constant score with a spring balance. J Shoulder Elbow Surg 1998;7:116-21.

4. Beckerman H, Roebroeck ME, Lankhorst GJ, Becher JG, Bezemer PD, Verbeek AL. Smallest real difference, a link between reproducibility and responsiveness. Qual Life Res 2001;10:571-8.

5. Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet 1986;1:307-10.

6. Blonna D, Scelsi M, Marini E, Bellato E, Tellini A, Rossi R, et al. Can we improve the reliability of the Constant-Murley score? J Shoulder Elbow Surg 2012;21:4-12. http://dx.doi.org/10.1016/j.jse.2011.07.014

7. Christie A, Hagen KB, Mowinckel P, Dagfinrud H. Methodological properties of six shoulder disability measures in patients with rheumatic diseases referred for shoulder surgery. J Shoulder Elbow Surg 2009;18:89-95. http://dx.doi.org/10.1016/j.jse.2008.07.008

8. Conboy VB, Morris RW, Kiss J, Carr AJ. An evaluation of the Constant-Murley shoulder assessment. J Bone Joint Surg Br 1996;78:229-32.

9. Constant CR, Gerber C, Emery RJ, Sojbjerg JO, Gohlke F, Boileau P. A review of the Constant score: modifications and guidelines for its use. J Shoulder Elbow Surg 2008;17:355-61. http://dx.doi.org/10.1016/j.jse.2007.06.022

10. Constant CR, Murley AH. A clinical method of functional assessment of the shoulder. Clin Orthop Relat Res 1987;(214):160-4.

11. Eliasziw M, Young SL, Woodbury MG, Fryday-Field K. Statistical methodology for the concurrent assessment of interrater and intrarater reliability: using goniometric measurements as an example. Phys Ther 1994;74:777-88.

12. Fleiss J. The design and analysis of clinical experiments. New York: John Wiley & Sons; 1986: pp 2-31.

13. Harvill LM. Standard error of measurement. Educ Meas Issues Pract 1991;10:33-41.

14. Hirschmann MT, Wind B, Amsler F, Gross T. Reliability of shoulder abduction strength measure for the Constant-Murley score. Clin Orthop Relat Res 2010;468:1565-71. http://dx.doi.org/10.1007/s11999-009-1007-3

15. Johansson KM, Adolfsson LE. Intraobserver and interobserver reliability for the strength test in the Constant-Murley shoulder assessment. J Shoulder Elbow Surg 2005;14:273-8. http://dx.doi.org/10.1016/j.jse.2004.08.001

16. Katolik LI, Romeo AA, Cole BJ, Verma NN, Hayden JK, Bach BR. Normalization of the Constant score. J Shoulder Elbow Surg 2005;14:279-85. http://dx.doi.org/10.1016/j.jse.2004.10.009

17. Kulshrestha V, Roy T, Audige L. Operative versus nonoperative management of displaced midshaft clavicle fractures: a prospective cohort study. J Orthop Trauma 2011;25:31-8. http://dx.doi.org/10.1097/BOT.0b013e3181d8290e

18. Leggin BG, Neuman RM, Iannotti JP, Williams GR, Thompson EC. Intrarater and interrater reliability of three isometric dynamometers in assessing shoulder strength. J Shoulder Elbow Surg 1996;5:18-24.

19. Lillkrona U. How should we use the Constant score? A commentary. J Shoulder Elbow Surg 2008;17:362-3. http://dx.doi.org/10.1016/j.jse.2007.06.013

20. Pape G, Bruckner T, Loew M, Zeifang F. Treatment of severe cuff tear arthropathy with the humeral head resurfacing arthroplasty: two-year minimum follow-up. J Shoulder Elbow Surg 2013;22:e1-7. http://dx.doi.org/10.1016/j.jse.2012.04.006

21. Postacchini R, Castagna A, Borroni M, Cinotti G, Postacchini F, Gumina S. Total shoulder arthroplasty for the treatment of failed hemiarthroplasty in patients with fracture of the proximal humerus. J Shoulder Elbow Surg 2012;21:1542-9. http://dx.doi.org/10.1016/j.jse.2011.12.007

22. Razmjou H, Bean A, MacDermid JC, van OV, Travers N, Holtby R. Convergent validity of the Constant-Murley outcome measure in patients with rotator cuff disease. Physiother Can 2008;60:72-9. http://dx.doi.org/10.3138/physio/60/1/72

23. Rocourt MH, Radlinger L, Kalberer F, Sanavi S, Schmid NS, Leunig M, et al. Evaluation of intratester and intertester reliability of the Constant-Murley shoulder assessment. J Shoulder Elbow Surg 2008;17:364-9. http://dx.doi.org/10.1016/j.jse.2009.04.008

24. Roy JS, MacDermid JC, Woodhouse LJ. A systematic review of the psychometric properties of the Constant-Murley score. J Shoulder Elbow Surg 2010;19:157-64. http://dx.doi.org/10.1016/j.jse.2009.04.008

25. Slobogean GP, Slobogean BL. Measuring shoulder injury function: common scales and checklists. Injury 2011;42:248-52. http://dx.doi.org/10.1016/j.injury.2010.11.046

26. Smekal V, Irenberger A, Attal RE, Oberladstaetter J, Krappinger D, Kralinger F. Elastic stable intramedullary nailing is best for mid-shaft clavicular fractures without comminution: results in 60 patients. Injury 2011;42:324-9. http://dx.doi.org/10.1016/j.injury.2010.02.033

27. van de Water AT, Shields N, Davidson M, Evans M, Taylor NF. Reliability and validity of shoulder function outcome measures in people with a proximal humeral fracture. Disabil Rehabil, 2013, http://informahealthcare.com/loi/dre (May 29, 2014)

28. von Heideken J, Bostrom WH, Une-Larsson V, Ekelund A. Acute surgical treatment of acromioclavicular dislocation type V with a hook plate: superiority to late reconstruction. J Shoulder Elbow Surg 2013;22:9-17. http://dx.doi.org/10.1016/j.jse.2012.03.003

29. Walton MJ, Walton JC, Honorez LA, Harding VF, Wallace WA. A comparison of methods for shoulder strength assessment and analysis of Constant score change in patients aged over fifty years in the United Kingdom. J Shoulder Elbow Surg 2007;16:285-9. http://dx.doi.org/10.1016/j.jse.2006.08.002

30. Weir JP. Quantifying test-retest reliability using the intraclass correlation coefficient and the SEM. J Strength Cond Res 2005;19:231-40.