
Blog: Designing Cluster Randomized Trials to Evaluate Programs

Posted on May 25, 2016 by  in Blog

Associate Professor, Education, Leadership, Research, and Technology, Western Michigan University

The push for rigorous evaluations of the impact of interventions has led to an increase in the use of randomized trials (RTs). In practice, it is often the case that interventions are delivered at the cluster level, such as a whole school reform model or a new curriculum. In these cases, the cluster (i.e., the school) is the logical unit of random assignment, and I hereafter refer to these as cluster randomized trials (CRTs).

Designing a CRT is necessarily more complex than designing an RT, for several reasons. First, there are two sample sizes: the number of students per school and the total number of schools. Second, the greater the variability in the outcome across schools, the more schools you will need to detect an effect of a given magnitude. The percentage of variance in the outcome that lies between schools is commonly referred to as the intra-class correlation (ICC). For example, suppose I am testing an intervention where the outcome of interest is math achievement, there are 500 students per school, and a school-level covariate explains 50 percent of the between-school variation in the outcome. If the ICC is 0.20 and I want to detect an effect size difference of 0.2 standard deviations between the treatment and comparison conditions, 82 total schools, or 41 treatment and 41 comparison schools, would be needed to achieve statistical power equal to 0.80, the commonly accepted threshold. If instead the ICC is 0.05, the total number of schools would only be 24, a reduction of 58. Hence an accurate estimate of the ICC is critical in planning a CRT, as it has a strong impact on the number of schools needed for a study.
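To see where numbers like these come from, here is a rough sketch of the standard design-effect calculation for a two-level CRT. The function name and the multiplier of roughly 2.8 (a common normal-approximation shorthand for a two-tailed test at alpha = .05 with power = .80) are my choices for illustration, not taken from any particular software; dedicated tools such as Optimal Design Plus use t distributions with degree-of-freedom corrections, so their answers come out slightly larger than this approximation.

```python
import math

def schools_needed(effect_size, icc, n_per_school, r2_between=0.0,
                   p_treat=0.5, multiplier=2.8):
    """Approximate total number of schools J needed in a two-level CRT.

    Uses the common approximation:
        MDES ~= multiplier * sqrt(variance_per_J / J)
    where variance_per_J combines the between-school variance
    (reduced by a school-level covariate explaining r2_between of it)
    and the within-school variance, scaled by the treatment allocation.
    This is a planning sketch, not a replacement for power software.
    """
    variance_per_j = (icc * (1 - r2_between)
                      + (1 - icc) / n_per_school) / (p_treat * (1 - p_treat))
    j = (multiplier ** 2) * variance_per_j / effect_size ** 2
    return math.ceil(j)

# Scenario from the example: 500 students/school, covariate explains
# 50% of between-school variance, target effect size 0.2.
print(schools_needed(0.2, 0.20, 500, r2_between=0.5))  # 80
print(schools_needed(0.2, 0.05, 500, r2_between=0.5))  # 22
```

With the normal approximation the two scenarios give 80 and 22 schools; the 82 and 24 cited above reflect the small upward correction that exact t-based calculations apply. Either way, the dominant driver is the ICC term, which the large per-school sample does little to offset.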

The challenge is that the required sample size must be determined prior to the start of the study, so the ICC has to be estimated before any data have been collected. Recently there has been an increase in empirical studies that seek to estimate ICCs for different contexts. The findings suggest that the ICC varies depending on outcome type, unit of clustering (e.g., schools, classrooms), grade, and other features.

Resources have started popping up to help evaluators planning CRTs find accurate estimates of the ICC. Two widely used in education include:

  1. The Online Variance Almanac:
  2. The Optimal Design Plus Software:*

*Note that Optimal Design Plus is a free program that calculates power for CRTs. Embedded within the program is a data repository with ICC estimates.

In the event that empirical estimates are not available for your particular outcome type, a search of the relevant literature may uncover estimates, or a pilot study may be used to generate reasonable values. Regardless of the source, accurate estimates of the ICC are critical in determining the number of clusters needed in a CRT.
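If you do run a pilot study, the ICC can be estimated from the pilot data with a simple one-way ANOVA decomposition. The sketch below is a minimal illustration (the helper name is mine, and it assumes a balanced design with the same number of students in every school; real pilots with unequal cluster sizes would typically use a mixed-model package instead).

```python
def estimate_icc(groups):
    """ANOVA estimator of the ICC from pilot data.

    `groups` is a list of lists: one list of outcome scores per school.
    Assumes a balanced design (equal n per school).
    ICC = (MSB - MSW) / (MSB + (n - 1) * MSW)
    """
    k = len(groups)                         # number of schools
    n = len(groups[0])                      # students per school
    grand_mean = sum(sum(g) for g in groups) / (k * n)
    school_means = [sum(g) / n for g in groups]

    # Between-school mean square
    msb = n * sum((m - grand_mean) ** 2 for m in school_means) / (k - 1)
    # Within-school mean square
    msw = sum((x - m) ** 2
              for g, m in zip(groups, school_means)
              for x in g) / (k * (n - 1))

    return (msb - msw) / (msb + (n - 1) * msw)

# Hypothetical pilot: three schools, three students each.
pilot = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
print(estimate_icc(pilot))  # 0.4
```

A pilot-based ICC of this kind can then be fed directly into a power calculation, though with only a handful of clusters the estimate itself is noisy, so it is worth checking it against published values for similar outcomes.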