
Blog: Designing Cluster Randomized Trials to Evaluate Programs

Posted on May 25, 2016 by  in Blog

Associate Professor, Education, Leadership, Research, and Technology, Western Michigan University

The push for rigorous evaluations of the impact of interventions has led to an increase in the use of randomized trials (RTs). In practice, it is often the case that interventions are delivered at the cluster level, such as a whole school reform model or a new curriculum. In these cases, the cluster (i.e., the school) is the logical unit of random assignment, and I hereafter refer to these as cluster randomized trials (CRTs).

Designing a CRT is necessarily more complex than designing an RT, for several reasons. First, there are two sample sizes: the number of students per school and the total number of schools. Second, the greater the variability in the outcome across schools, the more schools you will need to detect an effect of a given magnitude. The percentage of variance in the outcome that lies between schools is commonly referred to as the intra-class correlation (ICC). For example, suppose I am testing an intervention where the outcome of interest is math achievement, there are 500 students per school, and a school-level covariate explains 50 percent of the between-school variation in the outcome. If the ICC is 0.20 and I want to detect an effect size difference of 0.2 standard deviations between the treatment and comparison conditions, 82 total schools, or 41 treatment and 41 comparison schools, would be needed to achieve statistical power equal to 0.80, the commonly accepted threshold. If instead the ICC is 0.05, the total number of schools would only be 24, a reduction of 58. Hence an accurate estimate of the ICC is critical in planning a CRT, as it has a strong impact on the number of schools needed for a study.
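To see where numbers like these come from, here is a rough sketch of the standard design-effect calculation for a two-level CRT. The function name and the multiplier of roughly 2.8 (a common normal-approximation shorthand for a two-tailed test at alpha = .05 with power = .80) are my choices for illustration, not taken from any particular software; dedicated tools such as Optimal Design Plus use t distributions with degree-of-freedom corrections, so their answers come out slightly larger than this approximation.

```python
import math

def schools_needed(effect_size, icc, n_per_school, r2_between=0.0,
                   p_treat=0.5, multiplier=2.8):
    """Approximate total number of schools J needed in a two-level CRT.

    Uses the common approximation:
        MDES ~= multiplier * sqrt(variance_per_J / J)
    where variance_per_J combines the between-school variance
    (reduced by a school-level covariate explaining r2_between of it)
    and the within-school variance, scaled by the treatment allocation.
    This is a planning sketch, not a replacement for power software.
    """
    variance_per_j = (icc * (1 - r2_between)
                      + (1 - icc) / n_per_school) / (p_treat * (1 - p_treat))
    j = (multiplier ** 2) * variance_per_j / effect_size ** 2
    return math.ceil(j)

# Scenario from the example: 500 students/school, covariate explains
# 50% of between-school variance, target effect size 0.2.
print(schools_needed(0.2, 0.20, 500, r2_between=0.5))  # 80
print(schools_needed(0.2, 0.05, 500, r2_between=0.5))  # 22
```

With the normal approximation the two scenarios give 80 and 22 schools; the 82 and 24 cited above reflect the small upward correction that exact t-based calculations apply. Either way, the dominant driver is the ICC term, which the large per-school sample does little to offset.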

The challenge is that the required sample size must be determined prior to the start of the study, so the ICC has to be estimated before any data have been collected. Recently there has been an increase in empirical studies that seek to estimate ICCs for different contexts. The findings suggest that the ICC varies depending on outcome type, unit of clustering (e.g., schools, classrooms), grade, and other features.

Resources have started popping up to help evaluators planning CRTs find accurate estimates of the ICC. Two widely used in education include:

  1. The Online Variance Almanac:
  2. The Optimal Design Plus Software:*

*Note that Optimal Design Plus is a free program that calculates power for CRTs. Embedded within the program is a data repository with ICC estimates.

In the event that empirical estimates are not available for your particular outcome type, a search of the relevant literature may uncover estimates, or a pilot study may be used to generate reasonable values. Regardless of the source, accurate estimates of the ICC are critical in determining the number of clusters needed in a CRT.
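If you do run a pilot study, the ICC can be estimated from the pilot data with a simple one-way ANOVA decomposition. The sketch below is a minimal illustration (the helper name is mine, and it assumes a balanced design with the same number of students in every school; real pilots with unequal cluster sizes would typically use a mixed-model package instead).

```python
def estimate_icc(groups):
    """ANOVA estimator of the ICC from pilot data.

    `groups` is a list of lists: one list of outcome scores per school.
    Assumes a balanced design (equal n per school).
    ICC = (MSB - MSW) / (MSB + (n - 1) * MSW)
    """
    k = len(groups)                         # number of schools
    n = len(groups[0])                      # students per school
    grand_mean = sum(sum(g) for g in groups) / (k * n)
    school_means = [sum(g) / n for g in groups]

    # Between-school mean square
    msb = n * sum((m - grand_mean) ** 2 for m in school_means) / (k - 1)
    # Within-school mean square
    msw = sum((x - m) ** 2
              for g, m in zip(groups, school_means)
              for x in g) / (k * (n - 1))

    return (msb - msw) / (msb + (n - 1) * msw)

# Hypothetical pilot: three schools, three students each.
pilot = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
print(estimate_icc(pilot))  # 0.4
```

A pilot-based ICC of this kind can then be fed directly into a power calculation, though with only a handful of clusters the estimate itself is noisy, so it is worth checking it against published values for similar outcomes.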