Appendix 1: How to Read an Epidemiologic Study

Key Takeaways

A standard epidemiology study (not counting the abstract—more on this later) has 4 parts:

  • Introduction
  • Methods
  • Results
  • Discussion

Usually these are labelled, but not always. Sometimes they have different labels (e.g., “background” instead of “introduction”). Even without labels, epidemiology papers are almost always organized in this order. Details about each section are discussed below.


The INTRODUCTION usually consists of three things:

  • What we already know about a topic
    • A very select summary of what we know! It is important to remember that intros are NOT exhaustive literature reviews. Furthermore, what things are included is entirely at the authors’ discretion (with some input from peer reviewers and editors), which means that you do see the occasional biased/incomplete introduction.
  • What we don’t know about the topic (ie, what is the gap in the literature?)
  • What this study will do to address that gap
    • Usually concluding with “our study question was…” or “our objective here was…”

The introduction is where you will find answers to questions like “What is the public health or clinical problem this study is trying to address?” and “What was their research question?”


The METHODS section is just that: a description of the methods used for this study. Ideally, the methods section will describe:

  • How they got their sample from the target population/what dataset was used
    • Including inclusion/exclusion criteria, with rationales as appropriate
  • What is the study design (including design-specific relevant details, such as how participants were randomized, if it’s a randomized controlled trial)
    • Occasionally, if a study has been done using a well-known dataset (e.g., the NHANES data—see Chapter 3), the methods section will just direct the reader to other publications in which these methods are described in detail, rather than re-printing all of the information
  • What was the exposure, how and when was it measured, and how was it operationalized in the analysis
    • e.g., did they ask people their ages, but then dichotomize into “old” (>65) vs. “young” (65 and younger)?
  • What was the outcome, how and when was it measured, and how was it operationalized in the analysis
    • e.g., what was the case definition used for diagnosis? Were cases identified via clinics, or self-report, or some other method?
  • What confounders and/or effect modifiers were included, how they were chosen, how they were measured, and how they were operationalized in the analysis
    • Collectively these are referred to as “covariables”
    • Any variables listed under “adjusted for” or “included in the model” are being treated as confounders
    • Any variables listed as “interactions” or “stratified by” are being treated as effect modifiers
  • The statistical methods used
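
To make “operationalized” concrete, the age example above could be coded roughly as follows. This is only a minimal sketch in Python: the function name and the list of ages are made up for illustration, and the only detail taken from the example is the cut-point of 65. A real study would describe its own coding in the methods section.

```python
# Hypothetical helper: turn a measured age (in years) into the two
# categories from the example above. Only the cut-point of 65 comes
# from the text; everything else is an assumption for illustration.
def dichotomize_age(age_years, cutoff=65):
    """Return "old" if age is strictly greater than the cutoff, else "young"."""
    return "old" if age_years > cutoff else "young"


# Made-up ages, not data from any study.
ages = [34, 65, 66, 80]
categories = [dichotomize_age(a) for a in ages]
print(categories)  # ['young', 'young', 'old', 'old']
```

Notice that a 65-year-old and a 66-year-old land in different groups even though they are nearly the same age. That is the kind of detail to look for when reading how a variable was measured versus how it was used in the analysis.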

The methods section also should include a sentence about ethics/IRB approval, and informed consent, if applicable.

As a beginning epidemiology student, do not be concerned if you do not understand everything in the methods section! This is particularly true if the study included laboratory assays (e.g., to measure blood lead levels), but also pertains to the statistical methods. Papers must include enough detail in the methods section so that other scientists can evaluate, and potentially replicate, the work. That means they are written for other epidemiologists who are publishing papers, all of whom potentially have many years’ worth of training in relevant methods.

Your task as a first-time epidemiology student (or as an end-user of epidemiologic research who has some, but not a lot of, training in the field) is to read the methods carefully enough to spot any potential sources of bias, given your level of understanding. For instance, after reading this book, my hope is that you could spot egregious selection bias by reading the authors’ methods and thinking through “who did they get, who did they miss?” However, I would not expect that you would be able to spot a bias introduced because the authors violated one of the assumptions of the statistical model that they used. Sometimes I read papers myself where I don’t quite follow the methods, particularly for laboratory-based measurements. In those cases, I just trust the peer review process: several pairs of eyes were on any given study before mine, so probably the methods are kosher. If it seems like maybe there’s a problem, I ask one of my laboratory (or statistics, or clinical, depending on what my question is) colleagues about it.

Bottom line: read the methods carefully, but if there are parts you don’t quite follow, don’t worry about it unless you are going to cite that work yourself. In that case, ask around and find someone who can help you interpret the methods.


The RESULTS section contains…results.  What did they find?  This section is usually very dense in terms of numbers. There will be odds ratios, risk ratios, confidence intervals, p-values, etc.  Usually the results section begins with a discussion of the sample that was in the study, and this often further includes a table of demographics and relevant risk factors (usually “Table 1”).  This table is a good place to get a feel for who was in the study (and therefore who was not). Then usually the authors will discuss the MAIN results:  what did they find pertaining to their primary research question?  They will not say what the results mean in this section (that’s for the “discussion,” below)—this section just presents the numbers. Expect to go back and forth between the text and the tables and the figures several times—most journals expressly ask authors not to duplicate results (meaning, if the results are presented in a table, don’t repeat them in the text). Thus, to understand all of the results yourself, you will need to read both the text and the tables/figures. The last few paragraphs of the results section are used to present subgroup analyses, or bias/sensitivity analyses.

As you read results sections, think about what you read in the methods section. Do you believe these results, given the methods used?
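
If it helps to see where numbers like risk ratios, odds ratios, and confidence intervals come from, here is a rough sketch in Python. It assumes the simplest possible case: a single 2x2 table with no adjustment for confounders, and the common large-sample (log-scale) approximation for the 95% confidence interval. The counts are made up, the function names are my own, and published papers will often use more elaborate models than this.

```python
import math

# Layout of the 2x2 table assumed here:
#   a = exposed, with outcome      b = exposed, without outcome
#   c = unexposed, with outcome    d = unexposed, without outcome


def risk_ratio(a, b, c, d, z=1.96):
    """Risk ratio with an approximate 95% CI computed on the log scale."""
    rr = (a / (a + b)) / (c / (c + d))
    se_log = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


def odds_ratio(a, b, c, d, z=1.96):
    """Odds ratio with an approximate 95% CI computed on the log scale."""
    odds = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds) - z * se_log)
    upper = math.exp(math.log(odds) + z * se_log)
    return odds, lower, upper


# Made-up counts: 30 of 100 exposed and 10 of 100 unexposed people
# developed the outcome.
rr, rr_lower, rr_upper = risk_ratio(30, 70, 10, 90)
odds, or_lower, or_upper = odds_ratio(30, 70, 10, 90)
print(f"RR = {rr:.2f} (95% CI {rr_lower:.2f}, {rr_upper:.2f})")
print(f"OR = {odds:.2f} (95% CI {or_lower:.2f}, {or_upper:.2f})")
```

With these made-up counts the risk ratio is 3 (a 30% risk divided by a 10% risk), and the odds ratio is somewhat larger, which is typical when the outcome is not rare.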


The DISCUSSION section is the authors’ opinions about what the results mean. It usually begins with a summary of the main findings, and then compares these findings to other published findings on the same or similar topics. It should include a limitations (or strengths and limitations) sub-section, in which the authors provide a very frank picture of what limitations their study had (where there might have been bias, etc).  If, when reading the methods and results sections, you thought of a potential bias that is NOT discussed here…pause.  Perhaps this study is not the best source, then?  All limitations should be acknowledged.  (Corollary:  no study is perfect!  All have limitations.  We can try to minimize, but need to ‘fess up to the ones that remain.)  The discussion section usually concludes with some kind of recommendation, either for policy, or further research.  Again, this is the authors’ opinion.  Indeed, you are welcome to disagree entirely with any or all of a given discussion section—discussion sections are opinion, not fact.


Finally, there is the ABSTRACT. Usually found at the beginning of the paper, often in its own separate box, the abstract is a brief summary of the entire paper. Sometimes these same subheadings will be in the abstract, other times not. WARNING: You cannot understand a paper just by reading the abstract. Often only one or two main results are presented in the abstract, and the methods are quite sparse, as abstracts are usually limited to a few hundred words. Never cite a paper if all you have read is the abstract. This will come back to haunt you.

A few other details

Below are a few more points for if/when you are looking for papers yourself. Consider these things as you determine whether it’s worth reading and/or citing a paper that you have found.

  • Just under the paper’s title is a list of the authors, their affiliations, and (usually) their credentials. Are these the kinds of people who need to be on this study?  For instance, if a study on appropriate treatment for congestive heart failure does not include a cardiologist as an author, maybe that’s a problem. If a study includes fancy statistical methods beyond basic logistic or linear regression, but the author list includes only clinicians, and no one with specialized statistical training, maybe that’s a problem.
  • Somewhere, usually on either the first or last page, or sometimes between the conclusion and the reference list, is a note about funding and other conflicts of interest. These are often illuminating. For instance, I know of a study casting doubt on the benefits of breastfeeding that was funded by the International Formula Council.[i]
  • Many journals will list dates—the date the article was submitted, the date it was received in revised form (meaning, it was received, sent out for review, the reviews were sent back to the authors, who then made the changes), and the date it was accepted. If the initial submission date and the acceptance date are not at least six weeks (and more realistically six months) apart, then it’s possible that the peer review process was circumvented in some way. This happens. Sad but true.
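
The date check in the last point is simple arithmetic, and could be sketched like this in Python. The dates are made up, the names are hypothetical, and the six-week threshold is just the rule of thumb given above, not a formal standard.

```python
from datetime import date

# Rule of thumb from the text: at least six weeks between initial
# submission and acceptance.
SIX_WEEKS_IN_DAYS = 6 * 7


def review_gap_days(submitted, accepted):
    """Number of days between initial submission and acceptance."""
    return (accepted - submitted).days


# Made-up dates, for illustration only.
submitted = date(2023, 1, 10)
accepted = date(2023, 2, 1)

gap = review_gap_days(submitted, accepted)
suspiciously_fast = gap < SIX_WEEKS_IN_DAYS
print(gap, suspiciously_fast)  # 22 True
```

A short gap is only a flag that deserves a closer look, not proof that anything went wrong.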


i. Cope MB, Allison DB. Critical review of the World Health Organization’s (WHO) 2007 report on “evidence of the long-term effects of breastfeeding: systematic reviews and meta-analysis” with respect to obesity. Obes Rev Off J Int Assoc Study Obes. 2008;9(6):594-605. doi:10.1111/j.1467-789X.2008.00504.x



Foundations of Epidemiology by Marit Bovbjerg is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License, except where otherwise noted.
