
Research in Nursing & Health, 1995, 18, 179-183

Focus on Qualitative Methods

Sample Size in Qualitative Research

Margarete Sandelowski

A common misconception about sampling in qualitative research is that numbers are unimportant in ensuring the adequacy of a sampling strategy. Yet, sample sizes may be too small to support claims of having achieved either informational redundancy or theoretical saturation, or too large to permit the deep, case-oriented analysis that is the raison d’être of qualitative inquiry. Determining adequate sample size in qualitative research is ultimately a matter of judgment and experience in evaluating the quality of the information collected against the uses to which it will be put, the particular research method and purposeful sampling strategy employed, and the research product intended. © 1995 John Wiley & Sons, Inc.

A common misconception about sampling in qualitative research is that numbers are unimportant in ensuring the adequacy of a sampling strategy. The “logic and power” (Patton, 1990, p. 169) of the various kinds of purposeful sampling used in qualitative research lie primarily in the quality of information obtained per sampling unit, as opposed to their number per se. Moreover, an aesthetic thrust of sampling in qualitative research is that small is beautiful.¹ Yet, inadequate sample sizes can undermine the credibility of research findings. There are no computations or power analyses that can be done in qualitative research to determine a priori the minimum number and kinds of sampling units required, but there are factors, including the aim of sampling and the type of purposeful sampling and research method employed, which researchers can consider to help them decide whether they have collected enough data. These factors are the subject of this article.

¹ I am indebted to one of the anonymous reviewers of this article for the phrasing “small is beautiful.”

NEITHER SMALL NOR LARGE, BUT TOO SMALL OR TOO LARGE

Adequacy of sample size in qualitative research is relative, a matter of judging a sample neither small nor large per se, but rather too small or too large for the intended purposes of sampling and for the intended qualitative product. A sample size of 10 may be judged adequate for certain kinds of homogeneous or critical case sampling, too small to achieve maximum variation of a complex phenomenon or to develop theory, or too large for certain kinds of narrative analyses.

Reported sample sizes are often too small to support claims of having achieved either informational redundancy (Lincoln & Guba, 1985) or theoretical saturation (Strauss & Corbin, 1990).

Margarete Sandelowski, PhD, RN, is a professor, Department of Women’s and Children’s Health, School of Nursing, University of North Carolina at Chapel Hill.

This article is part of the ongoing series, Focus on Qualitative Methods, edited or contributed by Dr. Sandelowski.

This article was received on September 7, 1994, revised, and accepted for publication November 28, 1994.

Requests for reprints should be addressed to Dr. Sandelowski, University of North Carolina at Chapel Hill, #7460 Carrington Hall, Chapel Hill, NC 27599-7460.

© 1995 John Wiley & Sons, Inc. CCC 0160-6891/95/020179-05


Impatience, an a priori commitment to what will be seen, or a disinclination to see any more may incline researchers to stop sampling prematurely. Seeing nothing new in newly sampled units or feeling comfortable that a theoretical category has been saturated are functions involving the recognition of what is there and what can be made out of the data already collected, and then deciding whether it is sufficient to create an intended product. These functions are acquired through experience. For example, I have noticed in my own development and that of students with whom I have worked that beginning qualitative researchers often require more sampling units than more experienced researchers to “see” and to “make.” One expert qualitative researcher (P. Stern, personal communication, 1989) intimated that we often have all the data we will need in the very first pieces of data we collect, but that we do not (or cannot) know that until we collect more. Ultimately, information can be deemed redundant or theoretical lines deemed saturated only for now (Morse, 1989).

Conversely, sample sizes may be too large to support claims to having completed detailed analyses of data, especially the microanalysis demanded by certain kinds of narrative and observational studies. Even in qualitative projects aimed at explicating regularities across pieces of data, a high premium is still placed on discerning the particularities or idiosyncrasies presented by each piece of data. While qualitative studies may involve what are considered large sample sizes (over 50), qualitative analysis is generically about maximizing understanding of the one in all of its diversity; it is case-oriented, not variable-oriented (Ragin & Becker, 1989). Any sample size interfering with the case-oriented thrust of qualitative work can, accordingly, be judged too large.

ISSUES IN PURPOSEFUL SAMPLING

One of the major differences between qualitative and quantitative research approaches is that qualitative approaches typically involve purposeful sampling, while quantitative approaches usually involve probability sampling (Kuzel, 1992; Morse, 1986, 1989; Patton, 1990). Patton (1990) described 14 different types of purposeful sampling, involving the selection for in-depth study of typical, atypical, or, in some way, exemplary “information-rich cases” (p. 169). Researchers in both domains of inquiry often have to resort to sampling they know is less than ideal for their purposes, but qualitative researchers value the deep understanding permitted by information-rich cases and quantitative researchers value the generalizations to larger populations permitted by random and statistically representative samples. Although a sample of one will never be sufficient to permit generalization of findings to populations, it may be sufficient to permit the valuable kind of generalizations that can be made from and about cases, variously referred to as idiographic, holographic, naturalistic, or analytic generalizations (Firestone, 1993; Lincoln & Guba, 1985; Ragin & Becker, 1992; Simons, 1980; Stake & Trumbull, 1982).

In qualitative research, events, incidents, and experiences, not people per se, are typically the objects of purposeful sampling (Miles & Huberman, 1994; Strauss & Corbin, 1990). People, in addition to sites, artifacts, documents, and even data that have already been collected, are sampled for the information they are likely to yield about a particular phenomenon. Sample size in qualitative research may refer to numbers of persons, but also to numbers of interviews and observations conducted or numbers of events sampled. People are certainly central in all kinds of inquiry approaches in the health sciences, but they enter qualitative studies primarily by virtue of having direct and personal knowledge of some event (e.g., illness, pregnancy, life transition) that they are able and willing to communicate to others and only secondarily by virtue of demographic characteristics (e.g., age, race, sex).

People Versus Purpose

When qualitative researchers decide to seek people out because of their age or sex or race, it is because they consider them good sources of information that will advance them toward an analytic goal and not because they wish to generalize to other persons of similar age, sex, or race. That is, a demographic variable, such as sex, becomes an analytic variable; persons of one or the other sex are selected for a study because, by virtue of their sex, they can provide certain kinds of information. Accordingly, only as many persons of a particular sex are included in a study as is necessary to obtain that information. There is no mandate to have equivalent numbers of women or men or numbers of persons of each sex in the proportions in which they appear in a certain population.

Sampling on the basis of demographic characteristics presents something of a problem in achieving both informational and size adequacy in qualitative studies. There is currently a strong impulse (and federal mandate) to eliminate gender, race/ethnicity, and class bias in research by including members of minority or traditionally disempowered groups typically underrepresented in research, and by including women and men typically underrepresented in certain domains of research, such as men in family studies and women in studies of heart disease. Trost (1986) described a “statistically nonrepresentative stratified” sampling strategy whereby researchers can select persons varying in demographic characteristics to achieve representative coverage and inclusion. That is, while the sample is statistically nonrepresentative, it is informationally representative in that data will be obtained from persons who can stand for other persons with similar characteristics. In her illustration involving a study of families with teenagers, five sets of naturally and artificially dichotomized variables (one- or two-parent family, one or two or more children, housed in an apartment or home, with a high or low income, and with a male or female teenager) were combined to yield 32 kinds of families to be sampled. A similar kind of sampling plan can be used to ensure inclusion of females and males, and persons varying in social class, race, cultural affiliation, religion, or other dimension.
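The count of 32 family types follows from crossing five two-level variables (2 × 2 × 2 × 2 × 2 = 32). The short Python sketch below is only an illustration of that arithmetic and is not part of Trost's (1986) presentation; the dictionary keys and level labels are names assumed here for readability.

```python
from itertools import product

# Five dichotomized variables from the families-with-teenagers illustration.
# Keys and level labels are assumed names chosen for this sketch.
variables = {
    "parents": ("one-parent", "two-parent"),
    "children": ("one child", "two or more children"),
    "housing": ("apartment", "home"),
    "income": ("high income", "low income"),
    "teenager": ("male teenager", "female teenager"),
}

# Every combination of one level per variable is one kind of family to sample.
family_types = list(product(*variables.values()))

print(len(family_types))  # 2 ** 5 combinations
for family in family_types[:3]:
    print(family)
```

Each added dichotomous variable doubles the number of cells, which is one way to see why a priori demographic stratification can quickly push a sample past the size that permits deep, case-oriented analysis.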

Although this kind of sampling accommodates a new, laudable, and necessary moral consciousness concerning underrepresented and, therefore, often misrepresented groups by partially accommodating the logic of probability sampling, it may wholly contravene the logic of purposeful sampling. Strictly speaking, sampling for variation in race, class, gender, or other such background or person-related characteristics ought to be done in qualitative studies when they are deemed analytically important and where the failure to sample for such variation would impede understanding or invalidate findings (Cannon, Higginbotham, & Leung, 1988). Deciding a priori that a sample will include a certain number or percentage of individuals in various demographic groups may meet federal and other mandates for inclusion of traditionally excluded persons, but it may also result in a sample with a kind of variation that has little analytic significance or detracts from analysis goals (Morse, 1989). More importantly, such a sample may be too small adequately to address the analytic importance of such factors as gender or race, or, alternatively, too large to favor the deep analysis that qualitative projects mandate.

One way to resolve this dilemma is to design studies in which a phenomenon is investigated in one group at a time (either simultaneously or sequentially). The design for such studies will include more than one purposeful sampling strategy: for example, homogeneous and maximum variation sampling, where person-related homogeneity is maintained while variation in the target phenomenon is sought. After a series of such studies has been completed, a larger synthesis of findings can be undertaken in which the researcher can more adequately address the question of whether and how a variable such as gender is important in understanding a phenomenon.

SAMPLE SIZE IN DIFFERENT KINDS OF PURPOSEFUL SAMPLING

Different kinds of purposeful sampling require different minimum sample sizes. For example, in deviant case sampling, where the intention is to understand a very unusual or atypical manifestation of some phenomenon, one case may be sufficient. Yet, even a sample of one requires within-case sampling (Miles & Huberman, 1994). The researcher must decide which of the varieties of data concerning the case to sample to explicate its atypicality. This is especially evident in cases involving aggregates of one, such as a family, community, or organization. Even when an individual is the focal one, the researcher must sample from the wealth of data obtainable from and about that individual. In short, any one case offers a variety of data that must be sampled in sufficient quantity to make the case.

Maximum variation is one of the most frequently employed kinds of purposeful sampling in qualitative nursing research and typically requires the largest minimum sample size of any of the purposeful sampling strategies. As in any kind of sampling, the more variability there is within the confines of a qualitative project, the more sampling units the researcher will require to reach informational redundancy or theoretical saturation. Researchers wanting maximum variation in their sample must decide what kind(s) of variation they want to maximize and when to maximize each kind. One kind of variation already described is demographic variation, where variation is sought on generally people-related characteristics.

A second kind of variation is phenomenal variation, or variation on the target phenomenon under study. For example, the target phenomenon in a study of couples who have obtained positive fetal diagnoses is diagnosis, which varies on such dimensions as type and time of diagnosis, and the instrumentation used to make it. Like the decision to seek demographic variation, the decision to seek phenomenal variation is often made a priori in order to have representative coverage of variables likely to be important in understanding how diverse factors configure a whole. This kind of sampling is also referred to as selective or criterion sampling, where sampling decisions are made going into a study on “reasonable” grounds, rather than on analytic grounds after some data have already been collected (Glaser, 1978, p. 37; Schatzman & Strauss, 1973).

A third kind of variation is theoretical variation, or variation on a theoretical construct that is associated with theoretical sampling, or the sampling on analytic grounds characteristic of grounded theory studies. A theoretical sampling strategy is employed to fully elaborate and validate theoretically derived variations discerned in the data. Initial sampling for phenomenal variation permits these theoretical variations to be identified. A program of research employing grounded theory typically begins with a selective or criterion sampling strategy aimed at phenomenal variation and then proceeds to theoretical sampling (Sandelowski, Holditch-Davis, & Harris, 1992).

Researchers control the number of sampling units required to achieve informational redundancy or theoretical saturation by deciding which category of variation to maximize and minimize. This decision is a matter of fitting the sampling strategy to the purpose of and method chosen for a particular study and appraising the resources (including number of investigators and financial support) available to conduct the study. For example, purposeful sampling for demographic homogeneity and selected phenomenal variation is a way a researcher working alone with limited resources can reduce the minimum number of sampling units required within the confines of a single research project, but still produce credible and analytically and/or clinically significant findings.

SAMPLE SIZES FOR DIFFERENT QUALITATIVE METHODS

Just as different purposeful sampling strategies require different minimum sample sizes, different qualitative methods require different minimum sample sizes. Morse (1994) has recommended that phenomenologies directed toward discerning the essence of experiences include about six participants, ethnographies and grounded theory studies, about 30 to 50 interviews and/or observations, and qualitative ethological studies, about 100 to 200 units of observation.

Additional considerations in matching sample size to method are within-method diversity and the multiple uses of a method. Phenomenology offers a good illustration of how within-method diversity and the particular use to which a method is put can alter the requirements for sample size. In a phenomenological case study, one case can be sufficient to show something about an experience that a researcher deems significant for special display (e.g., Wertz, 1983). One case will not be sufficient, however, if the researcher’s intention is to describe invariant or essential features of an experience. For example, a phenomenological study, as interpreted by Van Kaam (1959), will likely require 10 to 50 descriptions of a target experience in order to discern its necessary and sufficient constituents. When phenomenological techniques are used in the service of a goal other than to produce a phenomenology, such as generating items for an instrument, at least 25 descriptions of an experience will likely be required.

SAMPLE SIZES IN COMBINED QUALITATIVE AND QUANTITATIVE STUDIES

Studies combining qualitative and quantitative approaches involve additional considerations in determining sufficient sample size. Indeed, so-called methodologically triangulated studies present researchers with many dilemmas (beyond the scope of this article), the resolution of which depends on the researcher’s stance concerning the compatibility of the philosophies and practices of qualitative and quantitative inquiry.

With respect to sampling, the logics of probability and purposeful sampling are arguably sufficiently irreconcilable in most cases to preclude using the same subjects for both quantitative and qualitative purposes (Morse, 1991). Subjects selected for the purposes of statistical representativeness may not fulfill the informational needs of the study, while participants selected for information purposes do not meet the requirement of statistical representativeness. Accordingly, whether primarily quantitative or qualitative, or whether designed for purposes of completeness or confirmation (Breitmayer, Ayres, & Knafl, 1993), such combination studies would require two samples drawn simultaneously or sequentially according to the two logics of sampling.


Yet, it can also be argued that among persons chosen according to the logic of probability sampling, there will likely be articulate informants whose selection for the qualitative portion of a combined study can be justified as purposeful. The purposeful sample would have to be expanded only if the data obtainable from the participants already sampled was deemed informationally insufficient. Similarly, no additional sampling may be necessary in studies where further information obtainable from standardized instruments is desired about a purposefully drawn sample. The caveat here is that the researcher use the data from these instruments for purposes of fuller description, rather than to draw statistical inferences.

CONCLUSION

Determining an adequate sample size in qualitative research is ultimately a matter of judgment and experience in evaluating the quality of the information collected against the uses to which it will be put, the particular research method and sampling strategy employed, and the research product intended. Numbers have a place in ensuring that a sample is fully adequate to support particular qualitative enterprises. A good principle to follow is: An adequate sample size in qualitative research is one that permits (by virtue of not being too large) the deep, case-oriented analysis that is a hallmark of all qualitative inquiry, and that results in (by virtue of not being too small) a new and richly textured understanding of experience.

REFERENCES

Breitmayer, B. J., Ayres, L., & Knafl, K. A. (1993). Triangulation in qualitative research: Evaluation of completeness and confirmation purposes. Image: Journal of Nursing Scholarship, 25, 237-243.

Cannon, L. W., Higginbotham, E., & Leung, M. L. (1988). Race and class bias in qualitative research on women. Gender & Society, 2, 449-462.

Firestone, W. A. (1993). Alternative arguments for generalizing from data as applied to qualitative research. Educational Researcher, 22, 16-23.

Glaser, B. G. (1978). Theoretical sensitivity: Advances in the methodology of grounded theory. Mill Valley, CA: Sociology Press.

Kuzel, A. J. (1992). Sampling in qualitative inquiry. In B. F. Crabtree & W. L. Miller (Eds.), Doing qualitative research (pp. 31-44). Newbury Park, CA: Sage.

Lincoln, Y. S., & Guba, E. G. (1985). Naturalistic inquiry. Beverly Hills, CA: Sage.

Miles, M. B., & Huberman, A. M. (1994). Qualitative data analysis: An expanded sourcebook (2nd ed.). Thousand Oaks, CA: Sage.

Morse, J. M. (1986). Quantitative and qualitative research: Issues in sampling. In P. L. Chinn (Ed.), Nursing research methodology: Issues and implementation (pp. 181-193). Rockville, MD: Aspen.

Morse, J. M. (1989). Strategies for sampling. In J. M. Morse (Ed.), Qualitative nursing research: A contemporary dialogue (pp. 117-131). Rockville, MD: Aspen.

Morse, J. (1991). Approaches to qualitative-quantitative methodological triangulation. Nursing Research, 40, 120-123.

Morse, J. M. (1994). Designing funded qualitative research. In N. K. Denzin & Y. S. Lincoln (Eds.), Handbook of qualitative research (pp. 220-235). Thousand Oaks, CA: Sage.

Patton, M. Q. (1990). Qualitative evaluation and research methods (2nd ed.). Newbury Park, CA: Sage.

Ragin, C. C., & Becker, H. S. (1989). How the microcomputer is changing our analytic habits. In G. Blank, J. L. McCartney, & E. Brent (Eds.), New technology in society: Practical applications in research and work (pp. 47-55). New Brunswick, NJ: Transaction.

Ragin, C. C., & Becker, H. S. (1992). What is a case? Exploring the foundations of social inquiry. Cambridge: Cambridge University Press.

Sandelowski, M., Holditch-Davis, D., & Harris, B. G. (1992). Using qualitative and quantitative methods: The transition to parenthood of infertile couples. In J. F. Gilgun, K. Daly, & G. Handel (Eds.), Qualitative methods in family research (pp. 301-322). Newbury Park, CA: Sage.

Schatzman, L., & Strauss, A. (1973). Field research: Strategies for a natural sociology. Englewood Cliffs, NJ: Prentice-Hall.

Simons, H. (Ed.). (1980). Towards a science of the singular: Essays about case study in educational research and evaluation. Norwich: University of East Anglia, Center for Applied Research in Education.

Stake, R. E., & Trumbull, D. J. (1982). Naturalistic generalizations. Review Journal of Philosophy and Social Science, 7, 1-12.

Strauss, A., & Corbin, J. (1990). Basics of qualitative research: Grounded theory procedures and techniques. Newbury Park, CA: Sage.

Trost, J. E. (1986). Statistically nonrepresentative stratified sampling: A sampling technique for qualitative studies. Qualitative Sociology, 9, 54-57.

Van Kaam, A. L. (1959). Phenomenal analysis: Exemplified by a study of the experience of “really feeling understood.” Journal of Individual Psychology, 15, 66-72.

Wertz, F. J. (1983). From everyday to psychological description: Analyzing the moments of a qualitative data analysis. Journal of Phenomenological Psychology, 14, 197-241.