The tension between generalizability and inclusivity in population research

As I continue my journey in epidemiologic research, one of my primary aims is to illuminate health disparities within my projects. Because I am currently located in New York City, finding diverse populations is hardly an issue. However, as I apply to PhD programs and consider the projects of potential mentors, I have thought more about populations around the United States. One project in particular, out of a prestigious university in the Midwest, planned to include equal proportions of individuals from different racial groups (e.g. 25% non-Hispanic Black, 25% White, 25% Asian, 25% Hispanic) in an effort not to exclude any demographic.

This raises an important question: how do we choose the best sampling method and account for disparate demographics in research in a way that preserves the validity of the data?

Diversity in health research populations is crucial because data are needed not only on dominant populations but also on populations that may be disproportionately affected by the illnesses under study. When selecting individuals for a study, random sampling is beneficial because it reduces bias and improves validity. However, random sampling may under-represent affected demographics when those groups make up a small share of the population being sampled. For example, a simple random sample of 200 people from a population in which a group makes up 2% would be expected to include only about four members of that group. Essentially, investigators may need to choose between adequate representation of diverse groups and confidence in their ability to make inferences about a general population.

Let us review the most common methods of sampling and recruitment for studies. There are about six main sampling methods, which can generally be classified as probability sampling or non-probability sampling. Probability sampling includes random sampling, in which every individual has about an equal chance of being selected into the sample, and stratified sampling, in which the population is divided into subsets (useful when some subsets have low incidence relative to others) and individuals are drawn from each subset, for example by an interval selection method (e.g. every nth individual is selected).[1] In non-probability sampling, individuals are selected by non-random methods.[1] One example is quota sampling, which is the non-probability equivalent of stratified sampling. Convenience sampling is another method, in which accessible individuals in the population of interest are chosen.[1] In judgment sampling, the population used in the study is considered to be representative of a larger population (e.g. a town in the US is thought to be representative of the US population). Finally, in snowball sampling, participants may recruit other individuals into the study via their own networks.[1] Biases may occur with each sampling method; however, bias is lowest with random sampling methods.
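To make the probability-based methods above more concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the population of 10,000 people, the four group labels and their shares, and the function names are all hypothetical and do not come from any real study. It draws a simple random sample, an interval-based (every nth person) sample, and a stratified sample with equal numbers per group, then prints how many people from each group end up in each sample.

```python
import random
from collections import Counter


def make_population(n=10_000, shares=None, seed=0):
    """Build a hypothetical population as a shuffled list of group labels."""
    # Assumed group shares, chosen only for illustration.
    shares = shares or {"A": 0.70, "B": 0.20, "C": 0.08, "D": 0.02}
    rng = random.Random(seed)
    population = []
    for group, share in shares.items():
        population.extend([group] * round(n * share))
    rng.shuffle(population)
    return population


def simple_random_sample(population, k, rng):
    """Every individual has the same chance of selection."""
    return rng.sample(population, k)


def systematic_sample(population, k, rng):
    """Pick a random start, then take every nth individual."""
    step = len(population) // k
    start = rng.randrange(step)
    return population[start::step][:k]


def stratified_sample(population, per_group, rng):
    """Split the population into groups and sample randomly within each."""
    strata = {}
    for person in population:
        strata.setdefault(person, []).append(person)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, min(per_group, len(members))))
    return sample


rng = random.Random(42)
population = make_population()
srs = simple_random_sample(population, 200, rng)
systematic = systematic_sample(population, 200, rng)
stratified = stratified_sample(population, 50, rng)

for name, sample in [("simple random", srs),
                     ("systematic", systematic),
                     ("stratified", stratified)]:
    print(name, dict(sorted(Counter(sample).items())))
```

In a run like this, the simple random and systematic samples tend to mirror the population shares, so the smallest group contributes only a handful of people, while the stratified sample holds 50 from every group by construction.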

When it comes to ensuring that individuals of various groups or characteristics are included in the study population, some researchers turn to oversampling. In essence, individuals from a particular group are included in greater numbers than their share of the population would suggest, to compensate for an imbalance in the data.
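One common way to reconcile oversampling with inference about the general population is to weight each participant at the analysis stage by how many people in the population they stand for. The sketch below illustrates that idea in Python under stated assumptions: the group sizes, the outcome prevalences, and all function names are hypothetical, and the weighting shown (population count divided by sample count per group) is a simple design weight, not necessarily what any particular study uses.

```python
import random
from collections import Counter

# Hypothetical population: group -> (size, true prevalence of some outcome).
GROUPS = {"A": (7000, 0.10), "B": (2000, 0.20), "C": (800, 0.30), "D": (200, 0.40)}


def build_population(groups, seed=1):
    """Return a list of (group, outcome) pairs with the assumed prevalences."""
    rng = random.Random(seed)
    people = []
    for group, (size, prevalence) in groups.items():
        cases = round(size * prevalence)
        outcomes = [1] * cases + [0] * (size - cases)
        rng.shuffle(outcomes)
        people.extend((group, y) for y in outcomes)
    return people


def oversample_equal(people, per_group, seed=2):
    """Draw the same number of people from every group (equal allocation)."""
    rng = random.Random(seed)
    sample = []
    for group in sorted({g for g, _ in people}):
        members = [p for p in people if p[0] == group]
        sample.extend(rng.sample(members, per_group))
    return sample


def design_weights(people, sample):
    """Weight = people in the population represented by each sampled person."""
    pop_counts = Counter(g for g, _ in people)
    sample_counts = Counter(g for g, _ in sample)
    return {g: pop_counts[g] / sample_counts[g] for g in sample_counts}


def unweighted_prevalence(sample):
    return sum(y for _, y in sample) / len(sample)


def weighted_prevalence(sample, weights):
    total_weight = sum(weights[g] for g, _ in sample)
    return sum(weights[g] * y for g, y in sample) / total_weight


people = build_population(GROUPS)
truth = sum(y for _, y in people) / len(people)
sample = oversample_equal(people, per_group=150)
weights = design_weights(people, sample)

print("true prevalence:      ", round(truth, 3))
print("unweighted estimate:  ", round(unweighted_prevalence(sample), 3))
print("weighted estimate:    ", round(weighted_prevalence(sample, weights), 3))
```

The design choice here is the trade-off discussed above: equal allocation gives each group enough participants to be studied on its own, while the weights are what let the combined sample speak to the overall population. Without them, the pooled estimate leans toward the groups that were oversampled.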

The issue of racial diversity in study populations is subtle and needs to be addressed according to the specifics of each project. When a choice between generalizability and inclusion of disparate populations is necessary, it requires careful consideration, because it may introduce biases into study results and findings.


  1. Tyrer, S., & Heyman, B. (2016). Sampling in epidemiological research: issues, hazards and pitfalls. BJPsych Bulletin, 40(2), 57–60.

“The views, opinions and positions expressed within this blog are those of the author(s) alone and do not represent those of the American Heart Association. The accuracy, completeness and validity of any statements made within this article are not guaranteed. We accept no liability for any errors, omissions or representations. The copyright of this content belongs to the author and any liability with regards to infringement of intellectual property rights remains with them. The Early Career Voice blog is not intended to provide medical advice or treatment. Only your healthcare provider can provide that. The American Heart Association recommends that you consult your healthcare provider regarding your personal health matters. If you think you are having a heart attack, stroke or another emergency, please call 911 immediately.”