Validity and Reliability of the Critical Care Pain Observation Tool: A Replication Study


Original Article

From the Maine Medical Center, Portland, Maine, and Boston College, Chestnut Hill, Massachusetts.

Address correspondence to Kathleen M. Keane, PhD (c), MS, BSN, CNL, CCRN, Cardiothoracic Intensive Care Unit, Maine Medical Center, 22 Bramhall St., Portland, ME 04102. E-mail: [email protected]

Received November 18, 2010; Revised January 12, 2012; Accepted January 15, 2012.

Supported by an American Association of Critical Care Nurses Small Grants Award.

1524-9042/$36.00
© 2013 by the American Society for Pain Management Nursing
doi:10.1016/j.pmn.2012.01.002

Kathleen Marie Keane, PhD (c), MS, BSN, CNL, CCRN

ABSTRACT: Critically ill patients are often not able to self-report the presence of pain. Currently there is no generally accepted assessment tool for this population. The Critical-Care Pain Observation Tool (CPOT) was developed for pain assessment of critically ill patients. The purpose of this study was to replicate the findings of the Gelinas et al. (2006) CPOT reference study and examine the interrater reliability (IRR), discriminant validity (DV), and criterion validity (CV) of the CPOT. This quantitative study used a repeated measures design with a convenience sample of 21 postoperative open heart surgery patients cared for in a tertiary-care teaching hospital. Testing for IRR in this sample yielded results ranging from fair to almost perfect agreement; the findings of this study suggest that the instrument’s IRR is acceptable but variable. Testing for DV demonstrated a significant difference in mean scores between noxious (painful) and nonnoxious (nonpainful) procedures. Testing for CV showed a weak nonsignificant Spearman correlation of 0.26 (P < .312) between CPOT scores and patient self-report during repositioning after extubation. This replication study adds to four studies that have examined psychometric attributes of the instrument and contributes to the process of translating the use of this instrument to the clinical setting.

Pain Management Nursing, Vol 14, No 4 (December), 2013: pp e216-e225

Critically ill patients can be intubated, sedated, and minimally interactive to noninteractive. For these patients, the “gold standard” of patient self-report is not available to the clinician (Pasero & McCaffery, 2011). Based on the recommendations of clinical guidelines (Herr, Coyne, Key, Manworren, McCaffery, Merkel, & Wild, 2006; Jacobi, Fraser, Coursin, Riker, Fontaine, Wittbrodt, & Lum, 2002), clinicians observe the patient for cues that indicate the patient may be experiencing pain and treat the patient based on those observations. Interdisciplinary clinical guidelines have called for the development of standardized instruments to aid in the assessment of pain in those who cannot self-report (Herr et al., 2006; Jacobi et al., 2002). In response to this need, instruments have been developed to aid the clinician in assessment of pain in critically ill adults (Gelinas, Fillion, Puntillo, Viens, & Fortier, 2006; Mateo & Krenzischek, 1992; Odhner, Wegman, Freeland, Steinmetz, & Ingersoll, 2003; Payen, Bru, Bosson, Lagrasta, Novel, Deschaux, & Jacquot, 2001; Puntillo, Stannard, Miaskowski, Kehrle, & Gleeson, 2002). The charge of research now is to further test and validate the instruments and thereby to recommend an instrument for general clinical use.

The purpose of the present study was to reproduce the findings of one study (henceforth referred to as the Gelinas reference study) that developed an instrument to assess pain in critically ill adults, the Critical-Care Pain Observation Tool (CPOT) (Gelinas et al., 2006). The aim was to contribute to research that examines the validity and reliability of the CPOT instrument and to contribute to translation of research findings to the clinical setting.

BACKGROUND

There is general consensus that an individual’s experience of pain is subjective and complex and that patients themselves can most accurately measure their pain experience; this is referred to as a patient’s self-report (Herr et al., 2006; Jacobi et al., 2002; Pasero & McCaffery, 2011; Puntillo, Pasero, Li, Mularski, Grap, Erstad, & Sessler, 2009). Critically ill patients may be intubated, sedated, and minimally interactive to noninteractive, and assessing pain in these patients is problematic because the gold standard of patient self-report is not available to the clinician (Pasero & McCaffery, 2011). Poor pain management directly affects patient care; it is linked to poor patient outcomes, increased cost of care, and decreased quality of life (Dalton, Brown, Carlson, McNutt, & Greer, 2000; Granja, Lopes, Moreira, Dias, Costa-Pereira, & Carneiro, 2005; Mularski, Curtis, Billings, Burt, Byock, Fuhrman, & Levy, 2006). Some consequences of undertreated pain include impaired immune response, respiratory complications such as atelectasis and infection, decreased mobilization and venous thromboembolism, the development of chronic pain syndromes, and psychologic disorders, such as posttraumatic stress disorder (Jones, Griffiths, Humphris, & Skirrow, 2001; Norman, Stein, Dimsdale, & Hoyt, 2008; Page, 2005; Pasero & McCaffery, 2011). Undertreated pain decreases patient and family quality of life and is a cause of moral distress in caregivers (Desbiens & Wu, 2000; Elpern, Covert, & Kleinpell, 2005; Ferrell, 2005).

Pain assessment in the critically ill is a concern because multiple studies have recognized that pain is prevalent in critically ill patients (Gelinas, 2007; Puntillo, Arai, Cohen, Gropper, Neuhaus, Paul, & Miaskowski, 2010; Puntillo, Morris, Thompson, Stanik-Hutt, White, & Wild, 2004; Puntillo, White, Morris, Perdue, Stanik-Hutt, Thompson, & Wild, 2001). Acute pain is often underrecognized, underassessed, and undertreated (Arroyo-Novoa, Figueroa-Ramos, Puntillo, Stanik-Hutt, Thompson, White, & Wild, 2008; Carr, Reines, Schaffer, Polomano, & Lande, 2005; Gelinas, Fortier, Viens, Fillion, & Puntillo, 2004; Marquie, Raufaste, Lauque, Marine, Ecoiffier, & Sorum, 2003; Stanik-Hutt, Soeken, Belcher, Fontaine, & Gift, 2001). This was illustrated by a seminal descriptive study (Thunder II) of procedural pain in adult intensive care unit (ICU) patients (n = 5,957) that was conducted at multiple sites (n = 169) in the United States, Canada, Great Britain, and Australia (Puntillo et al., 2001). Self-report of pain intensity using the numeric rating scale of 0 (no pain) to 10 (highest level of pain) was elicited from adults who underwent one of six common procedures (mean pain intensity is noted in parentheses after each procedure): turning (4.93), wound drain removal (4.67), wound care (4.42), tracheal suctioning (3.94), central venous catheter insertion (2.72), and femoral sheath removal (2.65) (Puntillo et al., 2004). Thunder II made clear the experience of procedural pain in the critically ill: patients reported turning to be the most distressing and painful of the procedures studied. In addition, <20% of patients were premedicated with opiates before procedures (Puntillo et al., 2001). It has been 10 years since the findings of Thunder II were reported, yet evidence suggests that the assessment and treatment of acute pain in patients remain problematic. In a national study on postoperative pain, 80% of randomly sampled patients (n = 250) reported experiencing acute pain, and 86% of these patients reported the pain to be moderate, severe, or extreme (Apfelbaum, Chen, Mehta, & Gan, 2003). In research on pain prevalence, 77.4% of cardiac surgery patients (n = 93) interviewed after their ICU stay remembered having pain; 46 of these patients ranked their pain as moderate to severe, and turning was reported as the most common cause of pain (Gelinas, 2007). Puntillo et al. (2010) surveyed ICU patients (n = 171) at high risk of dying and found that patients reported pain, along with shortness of breath, feeling scared, and feeling confusion, as symptoms that caused distress.

Recognition of the deleterious effects of untreated pain and a need for improved pain management of the critically ill has focused attention on standardizing the approach to pain assessment in this population. To this end, multiple observer-rated pain assessment instruments have been developed (Baiardi, Parzuchowski, Kosik, Ames, Courtney, & Locklear, 2002; Blenkharn, Faughnan, & Morgan, 2002; Gelinas et al., 2006; Mateo & Krenzischek, 1992; Odhner et al., 2003; Payen et al., 2001; Puntillo, Stannard, Miaskowski, Kehrle, & Gleeson, 2002; Webb & Kennedy, 1994). Health science stands at the cusp of integrating standardized instruments that assess pain in the critically ill adult into the clinical setting. Two of these instruments, the CPOT and the Behavioral Pain Scale (BPS), have come to the fore of research and have shown promising clinical utility in the adult population (Cade, 2008; Li, Puntillo, & Miaskowski, 2008; Marmo & Fowler, 2010; Pudas-Tähkä, Axelin, Aantaa, Lund, & Salanterä, 2009). A position statement by the American Society for Pain Management Nursing recommends the CPOT and BPS as tools for pain assessment in nonverbal adults (Herr et al., 2006). Table 1 describes and contrasts aspects of these two observer-rating scales, the CPOT and BPS. Li et al. (2008) reviewed six pain measures developed to assess pain in critically ill adults (Behavioral Pain Rating Scale [BPRS], Pain Assessment and Intervention Notation [PAIN], BPS, Nonverbal Pain Scale [NVPS], Pain Behavior Assessment Tool, and CPOT); only the CPOT and BPS instruments showed evidence of content validity, construct validity, criterion validity (CV), and interrater reliability (IRR). Based on their assessment of the psychometric properties of each of these instruments, the authors recommended further rigorous evaluation of these instruments (Li et al., 2008). Cade (2008), in a review of three clinical tools (BPS, CPOT, and NVPS), recommended the clinical implementation of the BPS in the ICU setting. This recommendation was based on the following criteria: 1) the BPS has been found to have good construct validity in three studies; 2) the BPS has been found to have acceptable interrater reliability; 3) the BPS has demonstrated good internal consistency; and 4) the BPS has had testing of its factor structure via principal component analysis. The CPOT was noted as a promising instrument with good reliability and validity, but requiring: 1) further validation in diverse populations of critically ill patients; and 2) testing of internal consistency and domain structure. A systematic review (Pudas-Tähkä et al., 2009) evaluated five instruments that were used to assess for pain in critically ill patients (BPS, CPOT, NVPS, PAIN, and the Pain Assessment Algorithm). Based on the scored quality of each instrument, the BPS had the highest score (12/20), followed by the CPOT and NVPS (11/20); the authors did not recommend one specific instrument for clinical use but recommended further testing of the psychometric properties of all of the instruments (Pudas-Tähkä et al., 2009). Using a repeated measures design, Marmo and Fowler (2010) compared the reliability of the CPOT, the Face, Legs, Activity, Crying, Consolability (FLACC) scale, and the NVPS in one study of 25 subjects; the percentage agreement between independent nurse observers using these scales ranged from 56% to 100%. Internal consistency of the CPOT and NVPS was calculated as Cronbach alpha of 0.89 for each instrument; a measure for the FLACC scale was not reported.

TABLE 1.
Comparison of Critical-Care Pain Observation Tool (CPOT) and Behavioral Pain Scale (BPS) for Critically Ill Adults

| Instrument | Behavioral Descriptors | Score Range | Content Validity | Criterion Validity | Discriminant Validity | Interrater Reliability | Sensitivity/Specificity |
|---|---|---|---|---|---|---|---|
| BPS | Facial expression, upper limb movement, ventilator compliance | 3-12 | No | No | Yes | Yes | No |
| CPOT | Facial expression, body movement, ventilator compliance, muscle tension | 0-8 | Yes | Yes | Yes | Yes | Yes |

One of the limitations of the BPS is that it was not designed to assess for pain in adults who are not intubated. A modification of the instrument to include assessment of patient vocalizations has been proposed and studied by Chanques, Payen, Mercier, de Lattre, Viel, Jung, et al. (2009) in one study of 30 patients. Because there is only one study published with this modification, further testing of the instrument is required before clinical use. Both the BPS and the CPOT were developed in French, the BPS in France and the CPOT in Canada. Although English-language versions of both instruments have been developed, only the CPOT has been noted to have been forward-backward translated into English (Gelinas & Johnston, 2007).

AIM

The purpose of the present research was to examine the reliability and validity of the CPOT via replication of the Gelinas reference study. This replication study sought to discover if the original findings of Gelinas et al. (2006) could be reproduced in a similar setting with a similar population of patients. The aim of this study was to contribute to research that examines the validity and reliability of the CPOT and to contribute to research findings on the instrument (Gelinas, Fillion, & Puntillo, 2009; Gelinas & Johnston, 2007; Marmo & Fowler, 2010; Tousignant-Laflamme, Bourgault, Gélinas, & Marchand, 2010).

The research questions asked were:

1. What are the measurements of the discriminant validity (DV) and criterion validity (CV) of the CPOT instrument?
2. What is the measurement of the interrater reliability (IRR) of the CPOT instrument?

METHODS

Study Design, Setting, Sample

This quantitative study used a repeated measures design to test the validity and reliability of the CPOT in assessing for pain in 21 open heart surgery patients. The study was conducted in a teaching hospital (>600 beds) located in the northeastern United States. As in the Gelinas reference study, data were collected on postoperative open heart surgery patients in a cardiothoracic intensive care setting. A convenience sample was used; 23 patients were enrolled. The patient population consisted of adults who were scheduled for open heart surgery procedures. Inclusion criteria were that patients: 1) be >21 years old; 2) be English speaking; and 3) require cardiothoracic surgery. Exclusion criteria were: 1) a left ventricular ejection fraction <25%; 2) receipt of neuromuscular blockers after surgery; 3) acute hemodynamic complications after surgery; 4) alcohol or drug dependence; and 5) a history of medical treatment for chronic pain.

Research Ethics

The principal investigator (PI) was responsible for the recruitment of prospective participants for the study and approached eligible patients 1-2 days before their surgery to explain the purpose of the study and to obtain written informed consent. Once they were enrolled, study participants were taught how to self-report pain with the use of the Pain Intensity Descriptive Scale (PDS). This study received expedited approval from the hospital Institutional Review Board as well as the university Institutional Review Board before beginning data collection. Other than the slight risk of loss of confidentiality of private health information, no risks to participants were identified. In addition to the PI, the staff nurses who participated in the reliability testing of this instrument completed an online training program on ethics and clinical research via a hospital training program.

Study Instrumentation

The CPOT is an observer rating scale of pain behaviors that has shown good reliability, validity, and clinical applicability in initial studies. Evaluative indicators include facial expression, body movements, muscle tension, and compliance with the ventilator or vocalizations. Each indicator can be scored according to scale criteria as a 0, 1, or 2, and the scale scoring has a total range of 0 to 8. An initial study of the sensitivity and specificity of the instrument suggests that during noxious stimuli, CPOT scores >2 are indicative of pain; however, scores that are >1 may be indicative of pain in patients before exposure to noxious stimuli (Gelinas, Harel, Fillion, Puntillo, & Johnston, 2009).
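
The scoring rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `INDICATORS`, `cpot_total`, and `suggests_pain` are hypothetical and not part of the CPOT, and the example ratings are made up. Only the 0-2 item range, the 0-8 total, and the >2 and >1 thresholds are taken from the text.

```python
# Hypothetical helper names; only the 0-2 item range, the 0-8 total,
# and the >2 / >1 thresholds come from the text above.
INDICATORS = (
    "facial_expression",
    "body_movements",
    "muscle_tension",
    "ventilator_compliance_or_vocalization",
)


def cpot_total(scores):
    """Sum the four indicator scores (each 0, 1, or 2) into a 0-8 total."""
    total = 0
    for name in INDICATORS:
        value = scores[name]
        if value not in (0, 1, 2):
            raise ValueError(f"{name} must be 0, 1, or 2, got {value!r}")
        total += value
    return total


def suggests_pain(total, during_noxious_stimulus):
    """Apply the thresholds quoted in the text: a total >2 during a
    noxious stimulus, or >1 before exposure to a noxious stimulus."""
    threshold = 2 if during_noxious_stimulus else 1
    return total > threshold


# Made-up ratings for one observation (not study data).
example = {
    "facial_expression": 2,
    "body_movements": 1,
    "muscle_tension": 0,
    "ventilator_compliance_or_vocalization": 0,
}
total = cpot_total(example)
print(total, suggests_pain(total, during_noxious_stimulus=True))
```

Keeping the threshold as a separate function reflects the point made above that the score taken as indicative of pain depends on when the assessment is made.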

The Confusion Assessment Method–Intensive Care Unit (CAM-ICU) was used to screen for delirium. The CAM-ICU (Ely, Inouye, Bernard, Gordon, Francis, May, et al., 2001) instrument has been well studied and validated for use in the critically ill population. Participant sedation levels were measured with the use of the Ramsay scale (Ramsay, Savege, Simpson, & Goodwin, 1974). This scale assesses patient responsiveness and agitation; it is scored as follows: 1 (anxious and agitated, restless, or both), 2 (cooperative, oriented, and tranquil), 3 (responds to commands only), 4 (exhibits brisk response to light glabellar tap or loud auditory stimulus), 5 (exhibits a sluggish response to light glabellar tap or loud auditory stimulus), or 6 (exhibits no response) (Ramsay et al., 1974).

The PDS is a self-report instrument that patients can use to rate their level of pain. The scale consists of five verbal descriptors of pain (none, mild, moderate, severe, and unbearable), each assigned a numeric value (0-4, respectively). The PDS has been previously studied and has been shown to be a reliable and valid instrument for pain measurement in postoperative patients (Mateo & Krenzischek, 1992). This self-report measure was used to correlate with the CPOT scores when assessing for instrument CV.

Study Procedure

Each study patient had assessments taken with the CPOT instrument at three different times on the day of their surgery (referred to as postoperative day 0). During the second and third assessment periods, if patients were interactive, they were prompted to self-report their pain scores. The study procedure is depicted in Figure 1. Two nurse observers performed the assessments with the CPOT independently and were blinded to each other’s scores. The nurses who participated in reliability testing were given one educational session on the instrument, which consisted of viewing a standardized videotape of patient scenarios that was obtained from the CPOT’s author. Scores were then reviewed with reference scores of the videotaped patient scenarios until there was ≥90% agreement between scores.

FIGURE 1. Diagram of study design and data collection.


Assessment Period One

The first assessment period took place in the intensive care unit �1 hour after the participant’s arrival from the operating room. Participants were assessed with the CPOT instrument at rest (time [T] 1), with repositioning (T2), and again at rest (T3). Two nurse observers performed the assessments and were blinded to each other’s ratings. A criterion for this time period was that the participant be intubated and unconscious as indicated by a Ramsay score of 5 or 6. Of the 21 participants assessed during this period, 15 participants met the criteria for evaluation.

Assessment Period Two

The second assessment period took place when the participant was still intubated and had become conscious, as evidenced by a Ramsay scale score of 2-4. The participant was assessed with the CPOT by two nurse observers at rest (T4), with repositioning (T5), and again at rest (T6). In addition, the participant’s self-report was solicited immediately after repositioning. A simple self-report system was used during this time; patients were asked to indicate whether pain was present or absent by nodding their head yes or no. This method of self-report was used because patients were generally too sleepy to use the more complex numeric PDS.

Assessment Period Three

The final assessment period was performed when participants were extubated and awake. Subjects were assessed at rest (T7), during positioning (T8), and again at rest (T9). They were also asked to indicate if pain was present and to self-report using the PDS. As in the Gelinas reference study, subjects in the study were screened for delirium after extubation; those who screened positive for delirium were excluded from further study participation.

Analytic Methods

Deidentified data were collected and statistical calculations conducted in MedCalc for Windows, version 10.02.0 (MedCalc Software, Mariakerke, Belgium), and version 15.0 of SPSS for Windows (SPSS, Chicago, Illinois). The DV of an instrument refers to the ability of an instrument to measure one intended variable and not another, in this case pain versus no pain. The DV of the CPOT instrument was tested by comparing the mean CPOT scores of patients before and immediately after turning in bed (repositioning is considered to be a noxious procedure). Paired Student t tests were performed to compare the mean scores of the testing periods T1 and T2, T4 and T5, and T7 and T8.
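
As a rough illustration of the paired comparison described above, the following Python sketch computes a paired Student t statistic by hand. It is not the MedCalc or SPSS procedure used in the study; the function name `paired_t` is hypothetical and the scores are made up for illustration.

```python
import math


def paired_t(before, after):
    """Return (t, df, mean_difference) for a paired Student t test.

    Differences are taken as before - after, so higher scores after
    the procedure give a negative t, as in a rest-minus-repositioning
    comparison.
    """
    if len(before) != len(after):
        raise ValueError("paired samples must have equal length")
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two pairs")
    mean_d = sum(diffs) / n
    # Sample variance of the differences (n - 1 in the denominator).
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    if var_d == 0:
        raise ValueError("all differences are identical; t is undefined")
    std_error = math.sqrt(var_d / n)
    return mean_d / std_error, n - 1, mean_d


# Made-up CPOT scores for illustration only (not study data).
rest = [0, 1, 0, 2, 1, 0]
repositioning = [2, 3, 1, 4, 2, 3]
t, df, mean_d = paired_t(rest, repositioning)
print(round(t, 3), df, round(mean_d, 3))
```

The t statistic is then compared against a t distribution with n - 1 degrees of freedom to obtain the P value; that lookup is left to a statistics package.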

CV refers to the ability of an instrument to accurately measure the phenomenon of interest, in this case the measurement of pain, and was evaluated by correlating the observed CPOT score to the “gold” standard of pain measurement, the patient self-report (using the PDS).
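
A minimal sketch of such a rank correlation is shown below, assuming the Spearman coefficient reported in the Results is computed in the usual way, as the Pearson correlation of average ranks. The function names and the paired scores are hypothetical and serve only to show the mechanics.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman coefficient: Pearson correlation of the average ranks."""
    if len(x) != len(y):
        raise ValueError("x and y must have equal length")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Made-up paired scores for illustration only (not study data).
cpot_scores = [1, 3, 2, 4, 2, 0]
pds_scores = [1, 2, 2, 3, 0, 0]
print(round(spearman_rho(cpot_scores, pds_scores), 3))
```

A rank-based coefficient suits this comparison because the PDS is an ordinal scale with only five levels, so ties are common and are handled here with average ranks.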

IRR measures the extent to which different users of the instrument can, under similar circumstances, obtain like measurements. IRR of the CPOT was measured by comparing the blinded independent assessments made by two nurse observers at T1-T9. Weighted kappa coefficients were calculated because they account for the degree of congruence between measurements on a scale that has multiple items (Landis & Koch, 1977).
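
The sketch below shows one common way to compute a weighted kappa for two raters and to label the result with the Landis and Koch (1977) bands. It is an illustration, not the study's computation: the source does not state whether linear or quadratic weights were used, so the weighting scheme is an assumption, and the function names and ratings are hypothetical.

```python
def weighted_kappa(rater_a, rater_b, categories, weighting="linear"):
    """Cohen's weighted kappa for two raters scoring the same subjects.

    categories is the ordered list of possible scores. weighting is
    "linear" or "quadratic"; which one the study used is not stated.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters need the same nonzero number of ratings")
    if weighting not in ("linear", "quadratic"):
        raise ValueError("weighting must be 'linear' or 'quadratic'")
    k = len(categories)
    if k < 2:
        raise ValueError("need at least two categories")
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    power = 1 if weighting == "linear" else 2
    disagree_obs = disagree_exp = 0.0
    for i in range(k):
        for j in range(k):
            w = (abs(i - j) / (k - 1)) ** power  # disagreement weight
            disagree_obs += w * observed[i][j]
            disagree_exp += w * row[i] * col[j]
    if disagree_exp == 0:
        # Both raters used one identical category throughout; kappa is
        # undefined in theory, and 1.0 is returned here by convention.
        return 1.0
    return 1 - disagree_obs / disagree_exp


def landis_koch_label(kappa):
    """Map a kappa value to the Landis and Koch (1977) descriptive bands."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Made-up CPOT totals from two observers (not study data).
nurse_1 = [0, 1, 2, 3, 2, 0, 1, 4]
nurse_2 = [0, 1, 3, 3, 2, 1, 1, 4]
kappa = weighted_kappa(nurse_1, nurse_2, categories=list(range(9)))
print(round(kappa, 2), landis_koch_label(kappa))
```

Weighting matters for a 0-8 scale because a disagreement of one point between observers is penalized less than a disagreement of several points, which an unweighted kappa would treat identically.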

RESULTS

Sample Characteristics

Thirty-three patient charts were screened for potential enrollment in the study based on study inclusion and exclusion criteria, and 29 of these patients were invited to participate. Six patients declined to participate in the study. Five of those six patients stated that they were too anxious to consider participation; the sixth patient stated that he was not interested in participating in research. Of the 23 subjects enrolled, one patient withdrew from the study citing anxiety as a factor in his decision; a second patient was dropped from the study owing to hemodynamic instability. Data from 21 subjects were used in this analysis. The ages of patients enrolled in the study ranged from 44 to 85 years, and the mean age was 64 years.

TABLE 2.
Discriminant Validity: Differences in CPOT Scores of Participants at Rest (T1, T4, and T7) Compared with Participant CPOT Scores with Repositioning (T2, T5, and T8, respectively)

| Score at Rest/Position | No. of Observations | t | df | 95% Confidence Interval, Lower | 95% Confidence Interval, Upper |
|---|---|---|---|---|---|
| T1/T2 | 30 | −5.784* | 29 | −3.92543 | −1.87457 |
| T4/T5 | 40 | −5.785* | 39 | −2.26064 | −1.08936 |
| T7/T8 | 32 | −7.662 | 31 | −2.61148 | −1.51352 |

*P < .0001.

Discriminant Validity

The mean observed CPOT scores at each assessment of T1-T9 are represented in Figure 2. When comparing mean CPOT scores during nonnociceptive periods (the rest periods of T1, T4, and T7) and periods of nociception (the positioning periods of T2, T5, and T8), statistically significant differences were noted. Table 2 provides the details of this analysis.
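
The entries in Table 2 can be unpacked with a little arithmetic. The sketch below assumes that each confidence interval is a symmetric interval for the mean paired difference (rest minus repositioning) and that t equals that mean difference divided by its standard error; the function name `unpack_interval` is hypothetical, and only the numbers are taken from Table 2.

```python
# Values copied from Table 2: (t statistic, df, CI lower, CI upper).
TABLE_2 = {
    "T1/T2": (-5.784, 29, -3.92543, -1.87457),
    "T4/T5": (-5.785, 39, -2.26064, -1.08936),
    "T7/T8": (-7.662, 31, -2.61148, -1.51352),
}


def unpack_interval(t, lower, upper):
    """Recover the mean difference, its standard error, and the
    critical t value implied by a symmetric confidence interval."""
    mean_diff = (lower + upper) / 2      # midpoint of the interval
    half_width = (upper - lower) / 2
    std_error = mean_diff / t            # because t = mean_diff / SE
    implied_critical_t = half_width / std_error
    return mean_diff, std_error, implied_critical_t


for label, (t, df, lower, upper) in TABLE_2.items():
    mean_diff, se, crit = unpack_interval(t, lower, upper)
    print(f"{label}: mean difference {mean_diff:.4f}, SE {se:.4f}, "
          f"implied critical t {crit:.3f} (df = {df})")
```

Under these assumptions the implied critical values come out a little above 2.0 for all three comparisons, which is what a two-sided 95% interval with roughly 30 to 40 degrees of freedom would use, so the reported t statistics and intervals are internally consistent.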

Criterion Validity

The Spearman coefficient was used to test the association between patient self-report (PDS) after extubation and the average CPOT score. A weak nonsignificant Spearman association of 0.26 (P < .312) at T8 was found.

Interrater Reliability

Findings on IRR were variable, with weighted kappa coefficients ranging from 0.34 to 1.0. These weighted kappa values are a measure of how closely the ratings of the nurse observers agreed. According to Landis and Koch (1977), these values correspond to levels of acceptability ranging from fair to almost perfect. IRR for each assessment from T1 to T9 is provided in Table 3.

TABLE 3.
Weighted κ Coefficients for Each Assessment from T1-T9

| Assessment | No. of Patients | Weighted κ Coefficient* |
|---|---|---|
| T1 | 15 | 1.0 |
| T2 | 15 | .72 |
| T3 | 15 | .34 |
| T4 | 21 | .61 |
| T5 | 21 | .47 |
| T6 | 21 | .56 |
| T7 | 17 | .36 |
| T8 | 17 | .57 |
| T9 | 17 | .46 |

*Levels of acceptability (Landis & Koch, 1977): <0 is poor, 0-0.20 is slight, 0.21-0.40 is fair, 0.41-0.60 is moderate, 0.61-0.80 is substantial, 0.81-1.00 is almost perfect.

FIGURE 2. Mean Critical-Care Pain Observation Tool (CPOT) scores at observation times (T) 1 through 9.

DISCUSSION

The purpose of this research was to examine the reliability and validity of the Critical Care Pain Observation Tool (CPOT) via replication of the Gelinas reference study. This replication study sought to discover if the original findings of Gelinas et al. (2006) could be reproduced in a similar setting with a similar population of patients. The research questions asked were:

1. What are the measurements of the DV and CV of the CPOT instrument?
2. What is the measurement of the IRR of the CPOT instrument?

An assumption of this study was that findings should be reproducible in similar populations of patients. Overall, sample demographics of this study were similar to those of the Gelinas reference study. See Table 4 for a comparison of the demographics of the sample in this study with the sample demographics in the Gelinas reference study. A larger percentage of double-procedure patients is present in the convenience sample for the present study and suggests that the sample for this study was of higher acuity than the reference study sample.

TABLE 4.
Comparison of Sample Demographics

| Demographic Variable | Gelinas Reference Study* | Replication Study |
|---|---|---|
| Age, y [mean (SD)] | 60 (80) | 64 (10) |
| Sex (%) | | |
| Male | 79 | 67 |
| Female | 21 | 33 |
| Type of surgery (%) | | |
| Coronary artery bypass graft | 79 | 57 |
| Heart valve replacement or repair | 10 | 9 |
| Coronary artery bypass graft and valve repair | 9 | 29 |
| Other | 2 | 5 |

*Gelinas, Fillion, Puntillo, Viens, & Fortier, 2006.

Compared with the Gelinas reference study, mean
CPOT scores were similar, with mean scores (with

standard deviations in parentheses) for the present

study ranging from 0.20 (0.61) to 3.17 (2.84) and for

the Gelinas reference study ranging from 0.55 (1.03)

to 3.38 (1.38) (Gelinas et al., 2006). Because the total

range of scores for the CPOT instrument is 0-8, the dis-

tribution of mean observed scores in both studies is re-stricted to the lower end of the scale. One possible

explanation is that because subjects were being as-

sessed after surgery, the effect of anesthesia may

have muted scores in this population. However,

Gelinas and Johnston (2007) explored mean CPOT

scores in another study of the instrument with a sample

of mixed ICU patients (trauma, postoperative, and

medical patients). The mean scores of the CPOT instru-ment in that study similarly ranged from 0.36 (SD 0.57)

to 2.2 (SD 1.32). That study also noted that the mean

scores of unconscious patients are lower than those

of conscious patients. This restricted range of scores

can be particularly problematic in two ways. First, because scale characteristics are based on a limited range of scores, it is difficult to interpret what a very high score would mean in the context of the scale criteria. Results on validity and reliability of the instrument pertain primarily to a range of scores that are restricted to the lowest parts of the scale, and it is not clear how upper range scale scores contribute to the prediction of pain in patients. Second, the compression of scale values at the lower end of the scale suggests that the instrument may not be sensitive enough to detect pain in patients who cannot self-report. Gelinas, Harel, et al. (2009) explored the properties of sensitivity and specificity of the instrument and found that the instrument specificity and sensitivity varied with the timing of assessment. The CPOT had high specificity and sensitivity (86% and 78%, respectively) during nociceptive procedures, but lower specificity and sensitivity (47% and 63%, respectively) when assessing patients before exposure to a noxious stimulus. The cutoff score for
a positive report of pain on the CPOT instrument was found to be a score of 2 (i.e., scores <2 indicate no clinically significant pain, and scores >2 indicate clinically significant pain). From a clinical point of view, this implies that finding a positive score gives a clinician high confidence that pain is present and should be treated, but a low score may not indicate that the patient is not in pain. Indeed, in the development of the instrument, the author is careful to note that even if the lowest score of the instrument is obtained, it may not necessarily indicate that the patient is not in pain (Gelinas, Harel, et al., 2009). It is possible that the scoring criteria of the scale could be made more sensitive by a weighting of scale criteria. There is evidence to suggest that a sensitive indicator of pain in patients is an analysis of facial expression (Dalton et al., 1999; Labus, Keefe, & Jensen, 2003). This perspective could be explored by factor analysis of the scale components. Further testing in different populations of patients should help to clarify the clinical significance of CPOT scores and the utility of the instrument in practice.
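To make the relationship between a cutoff score and the reported sensitivity and specificity concrete, the following is a minimal sketch of how the two quantities can be computed from observed CPOT scores and patient self-report. The function name, the example data, and the choice to count a score equal to the cutoff as a positive screen are assumptions made for illustration; they are not taken from the Gelinas studies or from the present study.

```python
def sensitivity_specificity(cpot_scores, pain_reported, cutoff=2):
    """Return (sensitivity, specificity) of a CPOT cutoff against self-report.

    A score at or above `cutoff` is counted as a positive screen. That
    handling of a score equal to the cutoff is an assumption of this sketch.
    """
    tp = fn = tn = fp = 0
    for score, in_pain in zip(cpot_scores, pain_reported):
        screen_positive = score >= cutoff
        if in_pain and screen_positive:
            tp += 1  # pain present and detected
        elif in_pain:
            fn += 1  # pain present but missed
        elif screen_positive:
            fp += 1  # no pain, but screen was positive
        else:
            tn += 1  # no pain and screen was negative
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one patient with and one without pain")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical scores and self-reports, not data from any study.
example_scores = [0, 1, 3, 4, 2, 0, 5, 1]
example_pain = [False, False, True, True, True, False, True, True]
sens, spec = sensitivity_specificity(example_scores, example_pain)
```

In this hypothetical sample, one patient reports pain while scoring below the cutoff, which illustrates why a low score alone may not rule out pain.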

The findings of this study regarding DV were similar to the findings in the Gelinas reference study. The CPOT instrument was able to discriminate between a patient's pain level at rest, before a noxious procedure, and after a noxious procedure. Testing for DV revealed a significant difference in mean scores at each testing interval. Analysis showed a statistically significant difference between scores at rest and scores with repositioning. These findings are consistent with the DV findings in the Gelinas reference article.

CV was not fully evaluated in this study, owing to incomplete collection of data. One measure, the association between patient self-report (n = 17) and CPOT scores, was tested at T8 in the assessment period, a time point that reflects the patient's pain score during a noxious procedure. The findings showed a weak Spearman correlation that was nonsignificant (Spearman coefficient 0.26; P < .312). Gelinas et al. (2006) found significant moderate correlations between patient self-report and observer CPOT scale scores in the evaluation of CV. Post hoc power analysis
indicates that this analysis of criterion validity is underpowered, which could explain why a significant effect was not found.
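The effect of sample size on this test can be illustrated with a rough calculation. The sketch below approximates the two-sided power to detect a correlation of a given size using the Fisher z transformation. This is a large-sample approximation borrowed from the Pearson case and is an assumption of this illustration; the text does not state how the post hoc power analysis in this study was performed, and the function name is hypothetical.

```python
from math import atanh, sqrt
from statistics import NormalDist


def approx_power_correlation(r, n, alpha=0.05):
    """Approximate two-sided power to detect a correlation r with n pairs.

    Uses the Fisher z transformation with standard error 1/sqrt(n - 3),
    a large-sample approximation (an assumption of this sketch).
    """
    if n <= 3 or not -1 < r < 1:
        raise ValueError("need n > 3 and -1 < r < 1")
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)  # critical value for the test
    shift = atanh(r) * sqrt(n - 3)  # expected test statistic under r
    # Probability that the statistic falls in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


# Values reported in the text: coefficient 0.26 with n = 17 self-reports.
power = approx_power_correlation(r=0.26, n=17)
```

Under these assumptions the approximate power is below 0.20, far short of the conventional 0.80 target, which is consistent with the interpretation that the analysis was underpowered.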

Testing for IRR in the present study showed a range of results; based on the criteria of Landis and Koch (1977), the results indicate fair to almost perfect IRR, with weighted kappa scores ranging from 0.34 to 1.0 in T1-T9. Weighted kappa scores in the Gelinas reference study were higher and ranged from 0.52 to 0.88, with IRR ranging from moderate to high (Gelinas et al., 2006). Lower IRR scores can be expected in the present study, because seven nurses were involved in using the instrument to evaluate IRR, compared with two nurses in the Gelinas reference study. Scoring of a "true" facial grimace required practice; using a standardized reference such as the Revised Faces Pain Scale could help to teach scoring of facial expression (Pasero & McCaffery, 2011).
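For readers less familiar with the statistic, the following minimal sketch shows one way a weighted kappa can be computed for two raters and mapped to the Landis and Koch (1977) agreement labels. Linear disagreement weights are an assumption here, because the text does not state which weighting scheme was used; the function names and example ratings are hypothetical.

```python
from collections import Counter


def weighted_kappa(rater_a, rater_b, categories):
    """Linearly weighted Cohen's kappa for two raters scoring the same patients."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must score the same non-empty set of patients")
    n = len(rater_a)
    k = len(categories)
    position = {category: i for i, category in enumerate(categories)}
    observed = Counter(zip(rater_a, rater_b))
    totals_a = Counter(rater_a)
    totals_b = Counter(rater_b)
    observed_disagreement = 0.0
    chance_disagreement = 0.0
    for a in categories:
        for b in categories:
            # Linear weight: 0 for agreement, 1 for the largest possible gap.
            weight = abs(position[a] - position[b]) / (k - 1)
            observed_disagreement += weight * observed[(a, b)] / n
            chance_disagreement += weight * totals_a[a] * totals_b[b] / (n * n)
    if chance_disagreement == 0:
        raise ValueError("kappa is undefined when both raters give every "
                         "patient the same single score")
    return 1 - observed_disagreement / chance_disagreement


def landis_koch_label(kappa):
    """Map a kappa value to the Landis and Koch (1977) agreement label."""
    if kappa < 0:
        return "poor"
    bands = [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
             (0.80, "substantial"), (1.00, "almost perfect")]
    for upper, label in bands:
        if kappa <= upper:
            return label
    raise ValueError("kappa cannot exceed 1")


# CPOT total scores range from 0 to 8; the ratings below are hypothetical.
cpot_range = list(range(9))
kappa_example = weighted_kappa([0, 1, 2, 3], [0, 1, 2, 3], cpot_range)
```

Applied to the range reported above, `landis_koch_label(0.34)` falls in the fair band and `landis_koch_label(1.0)` in the almost perfect band.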

During the course of this study, clinical scenarios evolved that may have contributed to disparate scoring using the CPOT. For example, the question arose about how to score when a patient is prompted to use a device for splinting his incision during repositioning. Inconsistencies in scoring during this type of procedure may have contributed to lower reliability scores in this study. Using standardized educational programs for tool users and clarifying tool use in different clinical scenarios would be useful for clinicians and would improve instrument reliability.

STUDY LIMITATIONS

The potential for the presence of a confounding variable in the study should be acknowledged. Anxiety is a component of critical illness and is well documented in patients' reports of their illness experience (Rotondi, Chelluri, Sirio, Mendelsohn, Schulz, Belle, et al., 2002; Stein-Parbury & McKinley, 2000). It is possible that some behaviors measured are related to anxiety and not pain; patient self-report of anxiety was not assessed in the present study.

Owing to incomplete data collection, it was not possible to fully replicate tests for CV in this study. The small sample size of this study also limits the power it has to detect statistically significant differences in the evaluation of CV.

Potential biases of the PI and study nurses could have influenced results. The small sample size of this study (n = 21) limits the generalizability of the results. This study was conducted on postoperative open heart surgery patients, and the results should not be generalized to all intensive care patients.

IMPLICATIONS

Practice

The CPOT is a promising instrument for use in assessing pain in critically ill open heart surgery patients. Formal research on the instrument's feasibility also supports the clinical utility of the tool (Gelinas, 2010). Assessment of pain in the vulnerable population of critically ill patients is essential to the management of pain in these patients and serves to inform the outcomes of intervention research. Without this standard, clinicians cannot systematically compare interventions nor assess the clinical significance of interventions. The CPOT instrument is a first step in measuring the pain experience of critically ill patients. The present study's findings show the reproducibility of the findings on discriminant validity and interrater reliability from the Gelinas reference study and support the evidence base for use of the instrument in the clinical setting.

Education

There is a need for interdisciplinary education on pain assessment in the critical care setting. As science changes and progresses, practicing clinicians should receive education in the use of assessment tools for critically ill patients. A well-developed, standardized educational module on the CPOT that addresses the learning needs of practicing clinicians is needed.

Research

Further research on the psychometrics of the tool, particularly factor and Rasch analysis, can inform refinement of the instrument. The current body of evidence supports the reliability and validity of the instrument, and there is a need for research to be directed toward effective implementation of the instrument and evaluating the effect of this change in practice on patient outcomes. Research has demonstrated that simply implementing the use of an assessment tool in clinical practice is not sufficient to change practice; theory-based interdisciplinary strategies to address pain assessment and pain management in the critical care environment are needed.

CONCLUSION

This replication study supports the DV and IRR of the CPOT instrument in assessing for pain in open heart surgery patients and supports the reproducibility of the findings of the Gelinas reference study. This study adds to four other studies that have examined the psychometrics of the CPOT and contributes to the process of translating its use to the clinical setting
(Gelinas, Fillion, et al., 2009; Gelinas & Johnston, 2007; Marmo & Fowler, 2010; Tousignant-Laflamme et al., 2010).

Acknowledgments

The author expresses her sincere thanks to the staff nurses who contributed their time and expertise to this project, especially Anne-Marie Gray, CCRN, Bethany Drabik, RN, Elaine Zappala, CCRN, CSC, Nicole Manchester, CNL, Laurie Shields, CCRN, Arthur Edgecomb, RN, and Kathleen Bennett, CCRN. The author also gratefully acknowledges the support and mentorship of her thesis committee members Gene Harkless, DNSc, APRN, FAANP, CNL, and Joanne G. Samuels, PhD, RN, CNL; both are faculty members of the Department of Nursing at the University of New Hampshire.

REFERENCES

Apfelbaum, J. L., Chen, C., Mehta, S. S., & Gan, T. J. (2003). Postoperative pain experience: Results from a national survey suggest postoperative pain continues to be undermanaged. Anesthesia & Analgesia, 97(2), 534–540.

Arroyo-Novoa, C. M., Figueroa-Ramos, M. I., Puntillo, K. A., Stanik-Hutt, J., Thompson, C., White, C. L., & Wild, L. R. (2008). Pain related to tracheal suctioning in awake acutely and critically ill adults: A descriptive study. Intensive & Critical Care Nursing, 24(1), 20–27.

Baiardi, J., Parzuchowski, J., Kosik, C., Ames, T., Courtney, N., & Locklear, J. (2002). Examination of the reliability of the FLACC pain assessment tool with cognitively impaired elderly. Poster presented at the Annual National Conference of Gerontologic Nurse Practitioners. Chicago, Illinois.

Blenkharn, A., Faughnan, S., & Morgan, A. (2002). Developing a pain assessment tool for use by nurses in an adult intensive care unit. Intensive & Critical Care Nursing, 18(6), 332–341.

Cade, C. H. (2008). Clinical tools for the assessment of pain in sedated critically ill adults. Nursing in Critical Care, 13(6), 288–297.

Carr, D. B., Reines, H. D., Schaffer, J., Polomano, R. C., & Lande, S. (2005). The impact of technology on the analgesic gap and quality of acute pain management. Regional Anesthesia and Pain Medicine, 30(3), 286–291.

Chanques, G., Payen, J. F., Mercier, G., de Lattre, S., Viel, E., Jung, B., & Jaber, S. (2009). Assessing pain in nonintubated critically ill patients unable to self report: An adaptation of the Behavioral Pain Scale. Intensive Care Medicine, 35(12), 2060–2067.

Dalton, J. A., Brown, L., Carlson, J., McNutt, R., & Greer, S. M. (1999). An evaluation of facial expression displayed by patients with chest pain. Heart & Lung, 28(3), 168–174.

Dalton, J. A., Carlson, J., Lindley, C., Blau, W., Youngblood, R., & Greer, S. M. (2000). Clinical economics: Calculating the cost of acute postoperative pain medication. Journal of Pain and Symptom Management, 19(4), 295–308.

Desbiens, N. A., & Wu, A. W. (2000). Pain and suffering in seriously ill hospitalized patients. Journal of the American Geriatrics Society, 48(5 Suppl), S183–S186.

Elpern, E. H., Covert, B., & Kleinpell, R. (2005). Moral distress of staff nurses in a medical intensive care unit. American Journal of Critical Care, 14(6), 523–530.

Ely, E. W., Inouye, S. K., Bernard, G. R., Gordon, S., Francis, J., May, L., Truman, B., Speroff, T., Gautum, S., Margolin, R., Hart, R., & Dittus, R. (2001). Delirium in mechanically ventilated patients: Validity and reliability of the Confusion Assessment Method for the Intensive Care Unit (CAM-ICU). JAMA, 286(21), 2703–2710.

Ferrell, B. (2005). Ethical perspectives on pain and suffering. Pain Management Nursing, 6(3), 83–90.

Gelinas, C. (2007). Management of pain in cardiac surgery ICU patients: Have we improved over time? Intensive and Critical Care Nursing, 23(5), 298–303.

Gelinas, C. (2010). Nurses' evaluations of the feasibility and the clinical utility of the Critical-Care Pain Observation Tool. Pain Management Nursing, 11(2), 115–125.

Gelinas, C., Fillion, L., & Puntillo, K. A. (2009). Item selection and content validity of the Critical-Care Pain Observation Tool for nonverbal adults. Journal of Advanced Nursing, 65, 203–216.

Gelinas, C., Fillion, L., Puntillo, K. A., Viens, C., & Fortier, M. (2006). Validation of the Critical-Care Pain Observation Tool in adult patients. American Journal of Critical Care, 15(4), 420–427.

Gelinas, C., Fortier, M., Viens, C., Fillion, L., & Puntillo, K. A. (2004). Pain assessment and management in critically ill intubated patients: A retrospective study. American Journal of Critical Care, 13(2), 126–135.

Gelinas, C., Harel, F., Fillion, L., Puntillo, K. A., & Johnston, C. C. (2009). Sensitivity and specificity of the Critical-Care Pain Observation Tool for the detection of pain in intubated adults after cardiac surgery. Journal of Pain & Symptom Management, 37(1), 58–67.

Gelinas, C., & Johnston, C. (2007). Pain assessment in the critically ill ventilated adult: Validation of the critical-care pain observation tool and physiologic indicators. Clinical Journal of Pain, 23(6), 497–505.

Granja, C., Lopes, A., Moreira, S., Dias, C., Costa-Pereira, A., & Carneiro, A. (2005). Patients' recollections of experiences in the intensive care unit may affect their quality of life. Critical Care (London, England), 9(2), R96–109.

Herr, K., Coyne, P. J., Key, T., Manworren, R., McCaffery, M., Merkel, S., Pelosi-Kelly, J., & Wild, L. (2006). Pain assessment in the nonverbal patient: Position statement with clinical practice recommendations. Pain Management Nursing, 7(2), 44–52.

Jacobi, J., Fraser, G. L., Coursin, D. B., Riker, R. R., Fontaine, D. K., Wittbrodt, E. T., Chalfin, D. B., Masica, M. F., Bjerke, S., Coplin, W. M., Crippen, D. W., Fuchs, B. D., Kelleher, R. M., Marik, P. E., Nasraway, S. A., Jr., Murray, M. J., Peruzzi, W. T., & Lumb, P. D. (2002). Clinical practice guidelines for the sustained use of sedatives and analgesics in the critically ill adult. Critical Care Medicine, 30(1), 119–141.

Jones, C., Griffiths, R. D., Humphris, G., & Skirrow, P. M. (2001). Memory, delusions, and the development of acute posttraumatic stress disorder-related symptoms after intensive care. Critical Care Medicine, 29(3), 573–580.

Labus, J. S., Keefe, F. J., & Jensen, M. P. (2003). Self-reports of pain intensity and direct observations of pain behavior: When are they correlated? Pain, 102, 109.

Landis, J. R., & Koch, G. G. (1977). The measurement of observer agreement for categorical data. Biometrics, 33(1), 159–174.

Li, D., Puntillo, K., & Miaskowski, C. (2008). A review of objective pain measures for use with critical care adult patients unable to self-report. The Journal of Pain.

Marmo, L., & Fowler, S. (2010). Pain assessment tool in the critically ill post–open heart surgery patient population. Pain Management Nursing, 11(3), 134–140.

Marquie, L., Raufaste, E., Lauque, D., Marine, C., Ecoiffier, M., & Sorum, P. (2003). Pain rating by patients and physicians: Evidence of systematic pain miscalibration. Pain, 102(3), 289–296.

Mateo, O. M., & Krenzischek, D. A. (1992). A pilot study to assess the relationship between behavioral manifestations and self-report of pain in postanesthesia care unit patients. Journal of Post Anesthesia Nursing, 7(1), 15–21.

Mularski, R. A., Curtis, J. R., Billings, J. A., Burt, R., Byock, I., Fuhrman, C., Mosenthal, A. C., Medina, J., Ray, D. E., Rubenfeld, G. D., Schneiderman, L. J., Treece, P. D., Truog, R. D., & Levy, M. M. (2006). Proposed quality measures for palliative care in the critically ill: A consensus from the Robert Wood Johnson Foundation Critical Care Workgroup. Critical Care Medicine, 34(11 Suppl), S404–S411.

Norman, S. B., Stein, M. B., Dimsdale, J. E., & Hoyt, D. B. (2008). Pain in the aftermath of trauma is a risk factor for post-traumatic stress disorder. Psychological Medicine, 38(04), 533–542.

Odhner, M., Wegman, D., Freeland, N., Steinmetz, A., & Ingersoll, G. L. (2003). Assessing pain control in nonverbal critically ill adults. Dimensions of Critical Care Nursing, 22(6), 260–267.

Page, G. G. (2005). Immunologic effects of opioids in the presence or absence of pain. Journal of Pain and Symptom Management, 29(5 Suppl 1), 25–31.

Pasero, C., & McCaffery, M. (2011). Pain assessment and pharmacologic management. St. Louis: Elsevier/Mosby.

Payen, J. F., Bru, O., Bosson, J. L., Lagrasta, A., Novel, E., Deschaux, I., Lavagne, L., & Jacquot, C. (2001). Assessing pain in critically ill sedated patients by using a behavioral pain scale. Critical Care Medicine, 29(12), 2258–2263.

Pudas-Tähkä, S. M., Axelin, A., Aantaa, R., Lund, V., & Salanterä, S. (2009). Pain assessment tools for unconscious or sedated intensive care patients: A systematic review. Journal of Advanced Nursing, 65(5), 946–956.

Puntillo, K. A., Arai, S., Cohen, N. H., Gropper, M. A., Neuhaus, J., Paul, S. M., & Miaskowski, C. (2010). Symptoms experienced by intensive care unit patients at high risk of dying. Critical Care Medicine, 38(11), 2155–2160.

Puntillo, K. A., Morris, A. B., Thompson, C. L., Stanik-Hutt, J., White, C. A., & Wild, L. R. (2004). Pain behaviors observed during six common procedures: Results from Thunder Project II. Critical Care Medicine, 32(2), 421–427.

Puntillo, K. A., Pasero, C., Li, D., Mularski, R. A., Grap, M. J., Erstad, B. L., Varkey, B., Gilbert, H. C., Medina, J., & Sessler, C. N. (2009). Evaluation of pain in ICU patients. Chest, 135(4), 1069–1074.

Puntillo, K. A., Stannard, D., Miaskowski, C., Kehrle, K., & Gleeson, S. (2002). Use of a pain assessment and intervention notation (P.A.I.N.) tool in critical care nursing practice: Nurses' evaluations. Heart & Lung, 31(4), 303–314.

Puntillo, K. A., White, C., Morris, A. B., Perdue, S. T., Stanik-Hutt, J., Thompson, C. L., & Wild, L. R. (2001). Patients' perceptions and responses to procedural pain: Results from Thunder Project II. American Journal of Critical Care, 10(4), 238–251.

Ramsay, M. A., Savege, T. M., Simpson, B. R., & Goodwin, R. (1974). Controlled sedation with alphaxalone-alphadolone. British Medical Journal, 2(5920), 656–659.

Rotondi, A. J., Chelluri, L., Sirio, C., Mendelsohn, A., Schulz, R., Belle, S., Im, K., Donahoe, M., & Pinsky, M. R. (2002). Patients' recollections of stressful experiences while receiving prolonged mechanical ventilation in an intensive care unit. Critical Care Medicine, 30(4), 746–752.

Stanik-Hutt, J. A., Soeken, K. L., Belcher, A. E., Fontaine, D. K., & Gift, A. G. (2001). Pain experiences of traumatically injured patients in a critical care setting. American Journal of Critical Care, 10(4), 252–259.

Stein-Parbury, J., & McKinley, S. (2000). Patients' experiences of being in an intensive care unit: A select literature review. American Journal of Critical Care, 9(1), 20–27.

Tousignant-Laflamme, Y., Bourgault, P., Gélinas, C., & Marchand, S. (2010). Assessing pain behaviors in healthy subjects using the Critical-Care Pain Observation Tool (CPOT): A pilot study. The Journal of Pain, 11(10), 983–987.

Webb, M. R., & Kennedy, M. G. (1994). Behavioral responses and self-reported pain in postoperative patients. Journal of Post Anesthesia Nursing, 9(2), 91–95.