Critical Appraisal #2 (Descriptive)


Critical Appraisal 1

Critical Appraisal of a Descriptive Study

Jill Radtke

University of Pittsburgh

Worksheet for Critical Appraisal of Descriptive (Correlation, Comparative) Design Study

Citation:

Palmeira, A.L., Teixeira, P.J., Branco, T.L., Martins, S.S., Minderico, C.S., Barata, J.T., et al. (2007, April 20). Predicting short-term weight loss using four leading health behavior change theories. International Journal of Behavioral Nutrition and Physical Activity, 4, Article 14. Retrieved June 15, 2007, from http://www.ijbnpa.org/content/4/1/14.

What type of article is this (e.g., research/data-based, clinical paper, review, editorial)?

Research/data-based

If this is a research/data-based article, what makes it this type of article? Identify 2-3 characteristics of the article.

1. End Product: The article presents original findings based on the conception of a study design and its implementation.

2. Methodology: The article/study seeks to obtain data in a systematic fashion (e.g., the introduction's literature search, the attempt to measure variables consistently and accurately in the methods section, the summation of findings in the results section, etc.).

3. Style: The article's findings and design are presented in an objective and frank manner (including a discussion of the limitations) so that the reader may judge, implement, question, and/or disregard the evidence.

State the research question posed by the authors:

How do key exercise and weight management psychosocial variables, derived from four health behavior change theories, predict weight change during a short-term behavioral obesity intervention?

What is my clinical question?

Can the same exercise and weight management psychosocial variables found in this study to predict short-term weight loss in women predict weight loss in women six weeks postpartum?

Using PICO, identify the following if applicable:

P (= population): Premenopausal women from a community who are older than 24 years of age, not pregnant, free from major disease, and have a BMI greater than 24.9 kg/m²

I (=intervention): 15 weekly weight management meetings of 120 minutes each, in which groups of 32-35 participants met with a mix of PhD- and master's-level exercise physiologists, as well as dieticians and psychologists, who delivered exercise, behavioral, and nutrition content. The content included didactic material (e.g., information on caloric content of food), motivational tools (e.g., giving pedometers), self-awareness instruments (e.g., food log, exercise log), and goal-setting (e.g., dietary and physical activity). The intervention was based on Social Cognitive Theory (SCT), but designed to include constructs from three other behavior change theories: Self-Determination Theory (SDT), Transtheoretical Model (TTM), and the Theory of Planned Behavior (TPB).

C (= comparison group): N/A

O (=outcome): Weight change (and the specific behavioral change theories and psychosocial constructs yielding the most predictive power for weight change)

APPRAISAL GUIDE / COMMENTS

I. Are the methods valid/trustworthy?

1. Was the research question clear? Was the need for the study adequately substantiated? Explain.

The research question was stated clearly in both the abstract and the background section of the research report (not under a separate purpose section): "the purpose of this study was to investigate the predictive value of changes in exercise and weight management related variables on weight change, in a sample of overweight and moderately obese women participating in a University-based weight management program" (background section). However, it was less clearly delineated how the content of the intervention translated into improvement of the psychosocial variables and weight loss. Although some examples were given (e.g., "the intervention had the underlying goals of improving autonomy ... These are highly motivational factors that should have an effect on SDT constructs"), the article does not describe how constructs from each behavioral change theory were incorporated into the intervention. Thus, the reader remains unsure what type of weight loss intervention program (i.e., which variables/constructs should be incorporated, and how) would yield the same predictive power for these behavioral change theories, as well as the improvements in psychosocial variables and weight loss found by the study team.

The need for the study was adequately substantiated in several instances in the background section. The authors comment that obesity has become an epidemic in industrialized countries, yet there has been a great void in the integration of biological, psychosocial, and environmental solutions in weight management programs. The authors hold that several psychosocial variables that they incorporate into the present study (and that form the basis of the four behavioral change theories in the study) are widely believed to explain weight management in this integration context, yet are underserved in the weight management literature. For example, the background section states, "Questions remain about which model or set of variables could better explain the outcomes of choice, which constructs may overlap, or if a set of variables from different theories could delineate the way to a new paradigm. Rothman highlights this last aspect as a likely cause of some of the disappointing results for most studies of behavior change interventions conducted to date."

2. What was the design of the study? How were the data collected (one time (cross-sectional) or repeated over time (longitudinal))? What were the limitations of the data collection methods?

The design of this study was descriptive correlational, and the data were collected prospectively at two time points, baseline and four months (the study is not longitudinal, per se, as it collected data at only two time points over a short span of time).

There were several limitations to data collection. One such limitation may be the weighing procedure. The article states that a standardized procedure was utilized in the weighing process and cites the specific scale used, but provides no further elaboration. We are therefore unsure how much clothing participants wore during weighing, what time of day they were weighed (e.g., morning versus later in the day), after what activities they were weighed (e.g., after working out, after eating, etc.), how the scale was calibrated, and whether participants weighed themselves on the scale without the study team present (i.e., self-report; this is not specified in the article). Moreover, we are unsure whether the conditions for weighing were similar for all participants.

Another limitation in data collection was the self-report used in the psychosocial variable questionnaires (as stated in the article). Although the instruments were validated, there is always a subjective limitation in self-report. For example, a participant may mark feeling competent and autonomous on an instrument at the follow-up because they feel that this is what the researchers would like to see, whether the researchers are communicating this subconsciously (experimenter effect) or not (Hawthorne effect).

Another limitation in the data collection (which could also be considered a design limitation), as stated in the article, was that participants were measured only twice: at baseline and at follow-up at four months. Perhaps there was more fluctuation between baseline and follow-up. Perhaps, as the authors suggest is likely, the relative predictive power of the weight management and exercise variables for weight loss would be reversed if the data were collected over a longer term (e.g., at 16 months).

Other limitations in this study are discussed under sources of bias in this paper, as they seemed to be design limitations more than data collection limitations, per se.

3. Describe the sample. How was the sample selected (eligibility criteria)? How is the sample representative of the population?

At the beginning of the program, the sample consisted of 142 women with BMIs of 30.2 ± 3.7 kg/m² (overweight and obese) and ages of 38.3 ± 5.8 years (the sample had 133 completers at the end of the program). The women were free of major disease, premenopausal, not pregnant, and recruited from a particular community. The sample was a purposive sample (due to the very specific eligibility criteria used for selection), recruited using advertisements in the community: newspaper ads, a website, email messages on listservs, and announcement flyers. I presume that these recruitment materials listed the eligibility criteria, though this is not explicitly stated (perhaps the criteria were printed directly on the flyer, or the interested party was directed to call a number for them). The eligibility criteria given by the authors are: premenopausal women older than 24 years of age, not pregnant, free from major disease, and with a BMI greater than 24.9 kg/m². It is unclear whether the participants self-selected (i.e., if they called and met criteria they were in the study) or were specifically chosen from among all applicants who met eligibility criteria, although the article seems to assume self-selection.

This sample is somewhat representative of the population, in that it satisfies all the eligibility criteria. However, the age distribution is relatively tight, concentrated within about 10 years of young to middle adulthood. The reported mean and standard deviation (38.3 ± 5.8 years) suggest that most participants were between roughly 32 and 45 years old, even though the population requirement specifies only an age greater than 24 years. Thus, the age of the sample is not very representative of the population. Additionally, the BMIs of the sample (30.2 ± 3.7 kg/m²) ranged mostly from overweight to moderately obese; few, if any, participants appear to have been severely or morbidly obese. Thus, BMI is not completely representative of the population (population requirement: BMI greater than 24.9 kg/m²). Also, we are not given demographics of the sample. Therefore, we cannot be sure that the findings can be generalized or applied to communities (populations) that differ significantly from the sample on these variables.

4. Describe the variables of interest. If a comparison study, on what variable(s) are the groups being compared? How were the groups similar? How were the groups different? If it is a correlation study, on what variables are associations being examined? Were there any confounding variables?

There were multiple variables of interest in this study. One variable was weight (at baseline and at 4 months; the average of two readings was taken each time and rounded to the nearest 0.1 kg). There were also weight management psychosocial variables from each behavioral change theory (except SDT), measured as scores on instruments administered to the participants: self-efficacy and outcome expectancy from SCT; self-efficacy, stages of change (SOC), and processes of change (POC), including both behavioral and cognitive processes, from TTM; and intentions, attitudes, subjective norms, and perceived behavioral control (PBC) from TPB. Exercise psychosocial variables from each behavioral change theory were likewise measured as instrument scores: self-efficacy, perceived barriers, and social support from SCT; self-efficacy, SOC, and POC, including both behavioral and cognitive processes, from TTM; intentions, attitudes, subjective norms, and PBC from TPB; and interest/enjoyment, perceived competence, importance/effort, pressure/tension, and intrinsic motivation from SDT. Typically, a higher score on the instruments for the exercise and weight management psychosocial variables indicated greater embodiment of that variable by the participant. Additionally, the four behavioral change theories (SCT, SDT, TTM, and TPB) served as variables of interest in the study. Time was also a variable of interest (from baseline measures to four months). Generally, weight, psychosocial variables, and the behavior change theories acted as dependent variables, while time served as the independent variable.

This was a correlation study, and several associations among these variables were examined. First, weight was examined for its association with time (i.e., weight change from baseline to four months). The exercise and weight management psychosocial variables were also each individually studied for their association with time (change from baseline to four months). Then weight change was correlated with baseline exercise and weight management psychosocial variables in order to determine any possible moderator variables. Weight change was also correlated with four-month change in exercise and weight management psychosocial variables. Finally, the correlation between weight change and the four different behavioral change theories (SCT, SDT, TTM, and TPB) was examined by entering the psychosocial variable scores present in each theory into separate regression models for each theory.

The study did not note any confounding variables.

5. Was the sample size large enough to detect a statistically significant association or difference? Was a power analysis performed?

Yes, the sample size was large enough to detect statistically significant associations, with 142 subjects at the start and 133 completers. The article does not mention that a power analysis was performed.
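Because no power analysis is reported, a rough after-the-fact check may help the reader judge the sample size. The sketch below is not the authors' method; it assumes a simple two-sided test of a single correlation at alpha = .05 with 80% power and uses the standard Fisher z approximation. The function name `min_detectable_r` is ours.

```python
from math import sqrt, tanh
from statistics import NormalDist

def min_detectable_r(n, alpha=0.05, power=0.80):
    """Smallest |r| detectable with a two-sided test of one correlation,
    using the Fisher z approximation (assumed here, not taken from the article)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    return tanh((z_alpha + z_beta) / sqrt(n - 3))

r_min = min_detectable_r(133)  # 133 completers, as reported in the article
print(f"With n = 133, the smallest detectable |r| is about {r_min:.2f}")
```

Under these assumptions, correlations of roughly .24 or larger would be detectable with the 133 completers; weaker associations could go undetected, and the many correlations tested in the study would tighten this further if a correction for multiple testing were applied.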

6. Were there any potential sources of bias? (Differences between groups not accounted for in the analysis, drop-outs, discounting outcomes, funding agency, etc.)

There were many potential sources of bias in this study. One such bias involves the method of recruitment: through advertisements in the newspaper, on a website, announcement flyers, and email messages on listservs in one community. This is a sampling bias, in that study participants appear to self-select for a purposive sample. These study participants, due to their presence in one particular community and willingness to volunteer for the study (i.e., they likely desire to lose weight), may differ from the population in several fundamental aspects. This limits the generalizability of the study findings.

Another source of bias may be that the SDT was not accounted for in the weight management psychosocial variables. The authors state that this is because no Portuguese instrument had been validated for the constructs in this theory with weight management. However, it is plausible that psychosocial variables in this theory still affect weight change (even though they are not tested).

A source of bias also possibly existed in questionable construct validity. In fact, the article states that some variables, such as outcome expectancies, were measured with less than ideal instruments. The article does not tell us the reliability and validity of the instruments used to measure the psychosocial variables, and we are left to look up the instruments on our own or just accept the authors' judgment.

The article also mentions a 6.3% attrition rate from baseline to four months, with 142 women starting the study and 133 completing it. This is not an especially high attrition rate, but if the subjects dropping out differed in some fundamental way from those staying in the study, then we would have attrition bias (i.e., our results would not reflect the population of interest, but those individuals who had certain characteristics that allowed or motivated them to complete the study). Because the characteristics of those dropping out (or those staying in) were not elucidated, and the point in the study when the drop-out occurred was not discussed, the reader is unable to make an informed decision as to whether attrition bias existed. Bias could also exist in the relatively small sample size in the study, which affects external validity.
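The attrition figure can be checked directly from the enrollment and completion counts reported in the article. A minimal sketch (the variable names are ours, not the authors'):

```python
enrolled = 142    # women starting the study, as reported
completers = 133  # women completing the four-month program, as reported

dropouts = enrolled - completers
attrition_rate = dropouts / enrolled

print(f"{dropouts} dropouts, attrition = {attrition_rate:.1%}")
# prints "9 dropouts, attrition = 6.3%"
```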

Another potential source of bias is testing effects. The same instruments (questionnaires) were apparently given at baseline and at four months. It is entirely feasible that the subjects became sensitized to the material on the instruments at baseline, and then answered the same questions differently at four months due to the pre-test rather than an actual intervention effect.

Bias could also result from maturation effects. The subjects could have changed from baseline to four months, regardless of the intervention. For example, as women move into middle age, their metabolism slows and weight gain occurs more easily. This weight gain (or lack of weight loss) would have little to do with the intervention.

Validity may have been affected in the study by the Hawthorne effect (i.e., the subjects answered the instruments in a certain way or lost more weight because they knew they were in a weight loss study). Experimenter effects could have also been present if the subjects perceived, for example, that the researchers wanted them to lose weight or to answer the instruments in a way indicating that their self-efficacy was improving.

Also, the study (as mentioned in limitations) did not include a control group. This is a source of bias: if a control group had been present, had been exposed to the same possible Hawthorne and experimenter effects, and had been comparable to the intervention group on fundamental aspects (such as race, income, etc.), we could say that the intervention was likely the cause of the changes in weight and psychosocial variables. However, one must also keep in mind that this is a correlation study and it did not claim causation.

Finally, a source of bias could exist in the outcome that the weight management psychosocial variables better explained weight change from baseline to four months as opposed to the exercise psychosocial variables. In fact, the authors note that in a similar study that was carried to 16 months, exercise psychosocial variables were better correlates of weight loss. If this study had been extended, perhaps they would have also found exercise psychosocial variables as more powerful predictors of weight change.

Biases could also exist in the data collection methods (e.g., self-report) as described in this paper previously.

7. Describe the reliability and validity of the measures.

Were the measures appropriate for the population or the variable being studied? Explain

The first instrument used, the Weight Efficacy Lifestyle Questionnaire, showed significant validity in a 1991 study using cross-validation with two different samples of subjects and with a different instrument measuring self-efficacy, the Eating Self-Efficacy Scale (convergent construct validity). The study also showed the instrument to have good reliability, with Cronbach alpha coefficients ranging from .70 to .90 for internal consistency. However, the article states that its subjects were women, the great majority over 40 years of age (Clark, Abrams, Niaura, Eaton, & Rossi, 1991). Therefore, this instrument may not be appropriate for the subjects in this study who are under 40. Additionally, the instrument is over 15 years old, and it is reasonable to expect that the instrument's constructs may be outdated.

I feel that the dream weight outcome expectancy score used in this study, derived from a portion of the Goals and Relative Weights Questionnaire, is mostly appropriate for this study population. The women on whom the instrument was tested were in the same general age range as our subjects; however, the test subjects were all obese women (Foster, Wadden, Vogt, & Brewer, 1997), whereas our study included overweight to obese women. Data regarding the reliability and validity of the instrument and of the dream weight construct were also difficult to come by. The 1997 study mentioned above did state that there was questionable reliability of the dream weight for the same subjects measured one week apart. This seems to indicate that dream weight can fluctuate based on changing expectations as one goes along in life and in a study. Thus, by measuring the dream weight expectancy at the beginning and end of this study, we see how expectations change. This particular usage of dream weight, however, has not been adequately validated or shown to be reliable.
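For readers less familiar with the internal consistency coefficient cited above, Cronbach's alpha is computed from the item variances and the variance of the total score. The sketch below uses small, entirely hypothetical item scores (three items, five respondents); it does not use data from Clark et al. or from the appraised study, and the function name is ours.

```python
from statistics import variance

def cronbach_alpha(items):
    """items: one list of scores per questionnaire item,
    with respondents in the same order in every list."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]   # total score per respondent
    sum_item_var = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - sum_item_var / variance(totals))

# HYPOTHETICAL responses on a 1-5 scale, for illustration only
item_scores = [
    [2, 4, 3, 5, 1],
    [3, 4, 3, 5, 2],
    [2, 5, 4, 4, 1],
]
alpha = cronbach_alpha(item_scores)
print(f"Cronbach's alpha = {alpha:.2f}")
```

Values of about .70 or higher are conventionally read as acceptable internal consistency, which is the sense in which the .70 to .90 range above is described as good reliability.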

For the SOC measure, the article states that SOC was measured by four questions developed by Suris (Suris, Trapp, DiClemente, & Cousins, 1998). However, the questions are not stated explicitly in the article cited. One has to assume that the questions are part of the URICA short form, which does demonstrate considerable reliability, as measured by internal consistency (Suris et al., 1998). For the POC, the Suris article states that the original form of the Weight Processes of Change Scale (which was used in our article) has good reliability and validity, although the shortened form (used by Suris et al.) has questionable reliability. Because the SOC in our article is measured by the four questions developed by Suris et al., who used a small sample of Mexican-American women, we have to question the validity of the measure, as we do not know the ethnic origin of our sample.

The specific scales using 18 and 17 items to address the constructs of intention, attitude, subjective norms, and PBC in the TPB for weight management and exercise, respectively, could not be located. However, these constructs are the basis of TPB, as stated in the article. The constructs seem to have good reliability as measured by internal consistency in this particular study, judging by the alpha levels given in the article. However, because the articles containing the specific scales used could not be located, we are unable to assess the scales' true validity and reliability.

The Self-efficacy for Exercise Behaviors Scale (SEEB) could not be located through the authors' citation, nor through an OVID search, but the abstract of the article was available in a different database (although one had to purchase the article to receive the full text). The abstract stated that the scale demonstrates good reliability and validity for measuring self-efficacy behaviors relating to exercise.

The Exercise Perceived Barriers Scale (EPB) showed considerable reliability and validity in measuring exercise perceived barriers in a 1989 study. However, the study was based on two large samples, including undergraduates from a college and a group of workers from a company classified as white, upper-middle class (Steinhardt & Dishman, 1989). This sample differs considerably from the middle-aged overweight and obese women in our sample. Additionally, the study was conducted 18 years ago. It would be remiss to assume that the same barriers apply today.

The validity and reliability of the Exercise Social Support Scales (ESS) could not be located using the authors' citation on OVID. However, our article and another (Marquez & McAuley, 2006) cite good internal consistency and, thus, reliability in measuring social supports of exercise behaviors with this scale.

Again, for the exercise SOC and POC, the exact scales could not be located using the authors' citations. Thus, there is no way to evaluate the validity and reliability of the measures here. The Exercise Processes of Change (EPC) did, however, have good internal consistency in the cognitive and behavioral domains in our article, indicating reliability for this study.

The study cited for the scale used for the Intrinsic Motivation Theory was accessed in its abstract form; the full article could not be obtained. The abstract stated that the scale had good validity (in that divergent models used to test motivation did not improve the goodness-of-fit as compared to the Intrinsic Motivation Theory model). It also stated that the model had adequate reliability. The make-up of the sample was not discussed in the abstract; however, the theory was tested on a sport team in 1989 (McAuley, Duncan, & Tammen, 1989). It should be noted that our population differs considerably from this type of sample. However, the constructs within the model did show good internal consistency and, thus, reliability in our article.

Finally, weight at baseline and at four months was measured, with several potential limitations discussed under data collection limitations earlier. I would suspect that the weighing procedure had good validity, in that the subjects' weight in kilograms (to the nearest 0.1 kg) was obtained at two set time periods during the study using an electronic scale, which is perceived to be an accurate and appropriate means of measuring weight. Reliability, however, could have been an issue with the weighing, although we are told that a standardized procedure was used: we do not know whether all subjects were weighed with clothes on or off, what time of day they were measured, etc. If these measurement conditions differed for any subjects, weight could be affected and we would not have a reliable measure. Finally, the authors might have considered measures such as waist circumference or skin fold caliper measurements obtained at baseline and four months instead of, or in addition to, weight. These measures may offer added validity and reliability in measuring true body mass/fat.

8. Were the analysis plans (statistical methods) described in detail?

How were the data distributed (e.g., normal versus skewed)?

Were the correlative and comparative tests appropriate for the type of data analyzed and the questions asked? Explain

The statistical methods were described in some detail in a separate section called "Statistical analysis." More detail about the analysis was given in the results section. We were told which statistical tests were used for each result and, in some cases, why they were used (for example, in the results section, "The first set of correlation was done between baseline values in predictors and weight change, to explore possible moderator effects.").

We are not told whether the data were normally distributed. However, in order to use regressions and t-tests, one would have to assume that the psychosocial variables and weight changes for the subjects were distributed approximately normally.

We cannot be sure that the correlative tests were appropriate for our data, as we were not made privy to detail about the actual tests.

II. What are the results/findings?

1. What were the findings? There was a significant decrease in weight overall from baseline to four months among group members, though there was wide individual variability. Most of the exercise and weight management psychosocial variables improved from baseline to four months, with the most improvement in the exercise variables. However, weight management variables predicted weight change more strongly and significantly than the exercise variables. Self-efficacy was the strongest statistically significant individual psychosocial variable predictor of weight change. Weight change was significantly predicted by each of the four behavior change theories noted above.

The SCT was the strongest model, followed by the TTM, though the only psychosocial variable that added statistically significant power to these theories was self-efficacy.

The importance/effort psychosocial variable was a strong independent predictor of weight change and was statistically significant (accounting for 4.8% of weight change variance), although its theory, the SDT, did not significantly predict weight change.

2. Was there clinical significance? Statistical significance?

As stated under the findings section, there was statistical significance found in this study. However, I would be wary of clinical significance. Most of the psychosocial measures were based on small scales, some measuring only four or five items. Therefore, because a large number of people (by statistical standards) participated in the study, a difference of less than one point/position on an item could and did yield statistical significance (e.g., the attitude exercise psychosocial variable change over time in the TPB). In reality, we would probably not consider this a meaningful difference, nor would we confidently predict that with a similar weight loss intervention an individual would see a significant improvement in attitude toward exercise.
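The gap between statistical and clinical significance can be illustrated with a paired t statistic. In the sketch below only the sample size (133 completers) comes from the article; the mean change and its standard deviation are hypothetical values chosen for illustration, not results reported by the authors.

```python
from math import sqrt

n = 133            # completers, as reported in the article
mean_change = 0.3  # HYPOTHETICAL mean change on an item scale (points)
sd_change = 1.2    # HYPOTHETICAL standard deviation of the change scores

# Paired t statistic: mean change divided by its standard error
t_stat = mean_change / (sd_change / sqrt(n))

# Standardized effect size for the change (Cohen's d for paired data)
cohen_d = mean_change / sd_change

print(f"t({n - 1}) = {t_stat:.2f}, d = {cohen_d:.2f}")
# The two-sided critical t at alpha = .05 with 132 df is roughly 1.98
```

Under these assumed values, a change of less than one third of a scale point exceeds the critical t value and so is statistically significant, yet the standardized effect (d = 0.25) is small by conventional benchmarks. This is the pattern described above.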

3. Did the authors put their findings in the context of the broader literature on this topic? Explain

The authors did put their findings in the context of the broader literature on weight loss and psychosocial variables in the results and discussion sections. The authors discussed how this study was different from similar studies, in that it compared several psychosocial variables from behavioral change theories in the same intervention and study. They also discussed how other studies have used different change variables (e.g., pre-post subtractions versus this study's residuals) but still found similar results. Additionally, the authors discuss how this study's findings may be limited, in that subjects were only measured to four months. They state that in a similar study that followed participants for 16 months, exercise psychosocial variables were found to be more predictive of weight loss than weight management psychosocial variables at 16 months, though, similar to our study, the trend was reversed in the first four months. The authors suggest that if this study were extended, they might find results comparable to the comparison study.

III. How can I apply the results/findings?

1. What relevance do the findings have to nursing practice?

As stated above, the results may not yield as much clinical significance as statistical significance. However, the findings are relevant to nursing practice, in that obesity is a major health concern that brings with it a host of comorbidities and issues that affect nursing care (e.g., being aware of the risk for Type II diabetes development, or the propensity of these individuals toward developing bedsores and being mindful of turning the patient frequently). Therefore, it is to the nurse's advantage that she be cognizant of emergent literature, like this study, that strives to understand the associations behind motivation for behavior change leading to weight loss. From this study, the nurse can internalize the fact that increasing her patient's self-efficacy (i.e., the feeling that he or she has the power to effect change in his or her own life) increases the patient's likelihood of weight loss.

2. Discuss how the findings can be applied to practice.

These findings may be applied in practice to create hospital or community weight-loss programs that focus on increasing the self-efficacy of participants toward weight loss and exercise (e.g., giving the participants tips to avoid over-eating on holidays and pointers on how to read food labels, providing pedometers to the participants to track exercise progress). Additionally, the programs could include modules that focus on the intrinsic motivation of the importance/effort of exercise (another strong psychosocial variable predictor of weight loss). For example, participants might be asked to self-organize a weekly plan for exercise, according to the types of exercises, times, and days that they feel they can achieve the best results.

References

Clark, M.M., Abrams, D.B., Niaura, R.S., Eaton, C.A., & Rossi, J.S. (1991). Self-efficacy in weight management. Journal of Consulting and Clinical Psychology, 59(5), 739-44.

Foster, G.D., Wadden, T.A., Vogt, R.A., & Brewer, G. (1997). What is reasonable weight loss? Patients' expectations and evaluations of obesity treatment outcomes. Journal of Consulting and Clinical Psychology, 65(1), 79-85.

Marquez, D.X., & McAuley, E. (2006). Social cognitive correlates of leisure time physical activity among Latinos. Journal of Behavioral Medicine, 29(3), 281-9.

McAuley, E., Duncan, T., & Tammen, V.V. (1989). Psychometric properties of the Intrinsic Motivation Inventory in a competitive sport setting: A confirmatory factor analysis. Research Quarterly for Exercise and Sport, 60(1), 48-58.

Steinhardt, M.A., & Dishman, R.K. (1989). Reliability and validity of expected outcomes and barriers for habitual physical activity. Journal of Occupational Medicine, 31(6), 536-46.

Suris, A.M., Trapp, M.C., DiClemente, C.C., & Cousins, J. (1998). Application of the transtheoretical model of behavior change for obesity in Mexican American women. Addictive Behaviors, 23(5), 655-668.