Blog

Subscribe here for quick access to our latest blog posts. New to RSS feeds? Click here


Best Practices for Two-Year Colleges to Create Competitive Evaluation Plans

Posted on September 28, 2016 by , in Blog ()
Ball
Kelly Ball
Ball
Jeff Grebinoski

Northeast Wisconsin Technical College’s (NWTC) Grants Office works closely with its Institutional Research Office to create ad hoc evaluation teams in order to meet the standards of evidence required in funders’ calls for proposals. Faculty members at two-year colleges often make up the project teams that are responsible for National Science Foundation (NSF) grant project implementation. However, they often need assistance navigating among terms and concepts that are traditionally found in scientific research and social science methodology.

Federal funding agencies are now requiring more evaluative rigor in their grant proposals than simply documenting deliverables. For example, the NSF’s Scholarships in Science, Technology, Engineering, and Mathematics (S-STEM) program saw dramatic changes in 2015: The program solicitation increased the amount of non-scholarship budget from 15% of the scholarship amount to 40% of the total project budget to increase supports for students and to investigate the effectiveness of those supports.

Technical colleges, in particular, face a unique challenge as solicitations change: These colleges traditionally have faculty members from business, health, and trades industries. Continuous improvement is a familiar concept to these professionals; however, they tend to have varying levels of expertise evaluating education interventions.

The following are a few best practices we have developed for assisting project teams in grant proposal development and project implementation at NWTC.

  • Where possible, work with an external evaluator at the planning stage. External evaluators can provide the expertise that principal investigators and project teams might lack as external evaluators are well-versed on current evaluation methods, trends, and techniques.
  • As they develop their projects, teams should meet with their Institutional Research Office to better understand data gathering and research capacity. Some data needed for evaluation plans might be readily available, whereas others might require some advanced planning to develop a system to track information. Conversations about what the data will be used for and what questions the team wants to answer will help ensure that the correct data are able to be gathered.
  • After a grant is awarded, have a conversation early with all internal and external evaluative parties about clarifying data roles and responsibilities. Agreeing to reporting deadlines and identifying who will collect the data and conduct further analysis will help avoid delays.
  • Create a “data dictionary” for more complicated projects and variables to ensure that everyone is on the same page about what terms mean. For example, “student persistence” can be defined term-to-term or year-to-year and all parties need to understand which data will need to be tracked.

With some planning and the right working relationships in place, two-year colleges can maintain their federal funding competitiveness even as agencies increase evaluation requirements.

yanowitz
Possible Selves: A Way to Assess Identity and Career Aspirations

Posted on September 14, 2016 by  in Blog ()

Professor of Psychology, Arkansas State University

Children are often asked the question “What do you want to be when you grow up?” Many of us evaluate programs where the developers are hoping that participating in their program will change this answer. In this post, I’d like to suggest using “possible self” measures as a means of evaluating if a program changed attendees’ sense of identity and career aspirations.

What defines the term?

Possible selves are our representations of our future. We all think about what we ideally would like to become (the hoped-for possible self), things that we realistically expect to become (the expected possible self), and things that we are afraid of becoming (the feared-for possible self).[1][2] Possible selves can change many times over the lifespan and thus can be a useful measure to examine participants’ ideas about themselves in the future.

How can it be measured?

There are various ways to measure possible selves. One of the simplest is to use an open-ended measure that asks people to describe what they think will occur in the future. For example, we presented the following (adapted from Osyerman et al., 2006[2]) to youth participants in a science enrichment camp (funded by an NSF-ITEST grant to Arkansas State University):

Probably everyone thinks about what they are going to be like in the future. We usually think about the kinds of things that are going to happen to us and the kinds of people we might become.

  1. Please list some things that you most strongly hope will be true of you in the future.
  2. Please list some things that you think will most likely be true of you in the future.

The measure was used both before and after participating in the program. We purposely did not include a feared-for possible self, given the context of a summer camp.

What is the value-added?

Using this type of open-ended measure allows for participants’ own voices to be heard. Instead of imposing preconceived notions of what participants should “want” to do, it allows participants to tell us what is most important to them. We learned a great deal about participants’ world views and their answers helped us to fine-tune programs to better serve their needs and to be responsive to our participants. Students’ answers focused on careers, but also included hoped-for personal ideals. For instance, European-American students were significantly more likely to mention school success than African-American students.  Conversely, African-American students were significantly more likely to describe hoped-for positive social/emotional futures compared to European-American students. These results allowed program developers to gain a more nuanced understanding of motivations driving participants. Although we regarded the multiple areas of focus as a strength of the measure, evaluators considering using a possible self-measure may also want to include more directed, follow-up questions.

For more information on how to assess possible selves, see Professor Daphna Oyserman’s website.

References

[1] Markus, H. R., & Nurius, P. (1986). Possible selves. American Psychologist, 41, 954–969.

[2] Oyserman, D., Bybee, D., &Terry, K. (2006). Possible selves and academic outcomes: How and when possible selves impel action. Journal of Personality and Social Psychology, 91, 188–204.

Hubbard 150
Six Data Cleaning Checks

Posted on September 1, 2016 by  in Blog ()

Research Associate, WestEd’s STEM Program

Data cleaning is the process of verifying and editing data files to address issues of inconsistency and missing information. Errors in data files can appear at any stage of an evaluation, making it difficult to produce reliable data. Data cleaning is a critical step in program evaluation because clients rely on accurate results to inform decisions about their initiatives. Below are six essential steps I include in my data cleaning process to minimize issues during data analysis:

1. Compare the columns of your data file against the columns of your codebook.

Sometimes unexpected columns might appear in your data file or columns of data may be missing. Data collected from providers external to your evaluation team (e.g., school districts) might include sensitive participant information like social security numbers. Failures in software used to collect data can lead to responses not being recorded. For example, if a wireless connection is lost while a file is being downloaded, some information in that file might not appear in the downloaded copy. Unnecessary data columns should be removed before analysis and, if possible, missing data columns should be retrieved.

2. Check your unique identifier column for duplicate values.

An identifier is a unique value used to label a participant and can take the form of a person’s full name or a number assigned by the evaluator. Multiple occurrences of the same identifier in a data file usually indicate an error. Duplicate identifier values can occur when participants complete an instrument more than once or when a participant identifier is mistakenly assigned to multiple records. If participants move between program sites, they might be asked to complete a survey for a second time. Administrators might record a participant’s identifier incorrectly, using a value assigned to another participant. Data collection software can malfunction and duplicate rows of records. Duplicate records should be identified and resolved.

3. Transform categorical data into standard values.

Non-standard data values often appear in data gathered from external data providers. For example, school districts often provide student demographic information but vary in the categorical codes they use. For example, the following table shows a range of values I received from different districts to represent students’ ethnicities:

Hubbard-Graphic

To aid in reporting on participant ethnicities, I transformed these values into the race and ethnicity categories used by the National Center for Education Statistics.

When cleaning your own data, you should decide on standard values to use for categorical data, transform ambiguous data into a standard form, and store these values in a new data column.  OpenRefine is a free tool that facilitates data transformations.

4. Check your data file for missing values.

Missing values occur when participants choose not to answer an item, are absent the day of administration, or skip an item due to survey logic. If missing values are found, apply a code to indicate the reason for the missing data point. For example, 888888 can indicate an instrument was not administered and 999999 can indicate a participant chose not to respond to an item. The use of codes can help data analysts determine how to handle the missing data. Analysts sometimes need to report on the frequency of missing data, use statistical methods to replace the missing data, or remove the missing data before analysis.

5. Check your data file for extra or missing records.

Attrition and recruitment can occur at all stages of an evaluation. Sometimes people who are not participating in the evaluation are allowed to submit data. Check the number of records in your data file against the number of recruited participants for discrepancies. Tracking dates when participants join a project, leave a project, and complete instruments can facilitate this review.

6. Correct erroneous or inconsistent values.

When instruments are completed on paper, participants can enter unexpected values. Online tools may be configured incorrectly and allow illegal values to be submitted. Create a list of validation criteria for each data field and compare all values against this list. de Jonge and van der Loo provide a tutorial for checking invalid data using R.

Data cleaning can be a time-consuming process. These checks can help reduce the time you spend on data cleaning and get results to your clients more quickly.

Goodyear
Three Tips for a Strong NSF Proposal Evaluation Plan

Posted on August 17, 2016 by  in Blog ()

Principle Research Scientist, Education Development Center, Inc.

I’m Leslie Goodyear and I’m an evaluator who also served as a program officer for three years at the National Science Foundation in the Division of Research on Learning, which is in the Education and Human Resources Directorate. While I was there, I oversaw evaluation activities in the Division and reviewed many, many evaluation proposals and grant proposals with evaluation sections.

In May 2016, I had the pleasure of participating in the “Meeting Requirements, Exceeding Expectations: Understanding the Role of Evaluation in Federal Grants.” Hosted by Lori Wingate at EvaluATE and Ann Beheler at the Centers Collaborative for Technical Assistance, this webinar covered topics such as evaluation fundamentals; evaluation requirements and expectations; and evaluation staffing, budgeting and utilization.

On the webinar, I shared my perspective on the role of evaluation at NSF, strengths and weaknesses of evaluation plans in proposals, and how reviewers assess Results from Prior NSF Support sections of proposals, among other topics. In this blog, I’ll give a brief overview of some important takeaways from the webinar.

First, if you’re making a proposal to education or outreach programs, you’ll likely need to include some form of project evaluation in your proposal. Be sure to read the program solicitation carefully to know what the specific requirements are for that program. There are no agency-wide evaluation requirements—instead they are specified in each solicitation. Lori had a great suggestion on the webinar:  Search the solicitation for “eval” to make sure you find all the evaluation-related details.

Second, you’ll want to make sure that your evaluation plan is tailored to your proposed activities and outcomes. NSF reviewers and program officers can smell a “cookie cutter” evaluation plan, so make sure that you’ve talked with your evaluator while developing your proposal and that they’ve had the chance to read the goals and objectives of your proposed work before drafting the plan. You want the plan to be incorporated into the proposal so that it appears seamless.

Third, indicators of a strong evaluation plan include carefully crafted, relevant overall evaluation questions, a thoughtful project logic model, a detailed data collection plan that is coordinated with project activities, and a plan for reporting and dissemination of findings. You’ll also want to include a bio for your evaluator so that the reviewers know who’s on your team and what makes them uniquely qualified to carry out the evaluation of your project.

Additions that can make your plan “pop” include:

  • A table that maps out the evaluation questions to the data collection plans. This can save space by conveying lots of information in a table instead of in narrative.
  • Combining the evaluation and project timelines so that the reviewers can see how the evaluation will be coordinated with the project and offer timely feedback.

Some programs allow for using the Supplemental Documents section for additional evaluation information. Remember that reviewers are not required to read these supplemental docs, so be sure that the important information is still in the 15-page proposal.

For the Results of Prior NSF Support section, you want to be brief and outcome-focused. Use this space to describe what resulted from the prior work, not what you did. And be sure to be clear how that work is informing the proposed work by suggesting, for example, that these outcomes set up the questions you’re pursuing in this proposal.

Endres 150
National Science Foundation-funded Resources to Support Your Advanced Technological Education (ATE) Project

Posted on August 3, 2016 by  in Blog ()

Doctoral Associate, EvaluATE

Did you know that other National Science Foundation programs focused on STEM education have centers that provide services to projects? EvaluATE offers evaluation-specific resources for the Advanced Technological Education program, while some of the others are broader in scope and purpose. They offer technical support, resources, and information targeted at projects within the scope of specific NSF funding programs. A brief overview of each of these centers is provided below, highlighting evaluation-related resources. Make sure to check the sites out for further information if you see something that might be of value for your project!

The Community for Advancing Discovery Research in Education (CADRE) is a network for NSF’s Discovery Research K-12 program (DR K-12). The evaluation resource on the CADRE site is a paper on evaluation options (formative and summative), which differentiates evaluation from the research and development efforts carried out as part of project implementation.  There are other more general resources such as guidelines and tools for proposal writing, a library of reports and briefs, along with a video showcase of DR K-12 projects.

The Center for the Advancement of Informal Science Education (CAISE) has an evaluation section of its website that is searchable by type of resource (i.e., reports, assessment instruments, etc.), learning environment, and audience. For example, there are over 850 evaluation reports and 416 evaluation instruments available for review. The site hosts the Principal Investigator’s Guide: Managing Evaluation in Informal STEM Education Projects, which was developed as an initiative of the Visitor Studies Association and has sections such as working with an evaluator, developing an evaluation plan, creating evaluation tools and reporting.

The Math and Science Partnership Network (MSPnet) supports the math and science partnership network and the STEM+C (computer science) community. MSPnet has a digital library with over 2,000 articles; a search using the term “eval” found 467 listings, dating back to 1987. There is a toolbox with materials such as assessments, evaluation protocols and form letters. Other resources in the MSPnet library include articles and reports related to teaching and learning, professional development, and higher education.

The Center for Advancing Research and Communication (ARC) supports the NSF Research and Evaluation on Education in Science and Engineering (REESE) program through technical assistance to principal investigators. An evaluation-specific resource includes material from a workshop on implementation evaluation (also known as process evaluation).

The STEM Learning and Research Center (STELAR) provides technical support for the Innovative Technology Experiences for Students and Teachers (ITEST) program. Its website includes links to a variety of instruments, such as the Grit Scale, which can be used to assess students’ resilience for learning, which could be part of a larger evaluation plan.

How Real-time Evaluation Can Increase the Utility of Evaluation Findings

Posted on July 21, 2016 by , in Blog ()
Peery Wilkerson
Elizabeth Peery Stephanie B. Wilkerson

Evaluations are most useful when evaluators make relevant findings available to project partners at key decision-making moments. One approach to increasing the utility of evaluation findings is by collecting real-time data and providing immediate feedback at crucial moments to foster progress monitoring during service delivery. Based on our experience evaluating multiple five-day professional learning institutes for an ATE project, we discovered the benefits of providing real-time evaluation feedback and the vital elements that contributed to the success of this approach.

What did we do?

With project partners we co-developed online daily surveys that aligned with the learning objectives for each day’s training session. Daily surveys measured the effectiveness and appropriateness of each session’s instructional delivery, exercises and hands-on activities, materials and resources, content delivery format, and session length. Participants also rated their level of understanding of the session content and preparedness to use the information. They could submit questions, offer suggestions for improvement, and share what they liked most and least. Based on the survey data that evaluators provided to project partners after each session, partners could monitor what was and wasn’t working and identify where participants needed reinforcement, clarification, or re-teaching. Project partners could make immediate changes and modifications to the remaining training sessions to address any identified issues or shortcomings before participants completed the training.

Why was it successful?

Through the process, we recognized that there were a number of elements that made the daily surveys useful in immediately improving the professional learning sessions. These included the following:

  • Invested partners: The project partners recognized the value of the immediate feedback and its potential to greatly improve the trainings. Thus, they made a concentrated effort to use the information to make mid-training modifications.
  • Evaluator availability: Evaluators had to be available to pull the data after hours from the online survey software program and deliver it to project partners immediately.
  • Survey length and consistency: The daily surveys took less than 10 minutes to complete. While tailored to the content of each day, the surveys had a consistent question format that made them easier to complete.
  • Online format: The online format allowed for a streamlined and user-friendly survey. Additionally, it made retrieving a usable data summary much easier and timelier for the evaluators.
  • Time for administration: Time was carved out of the training sessions to allow for the surveys to be administered. This resulted in higher response rates and more predictable timing of data collection.

If real-time evaluation data will provide useful information that can help make improvements or decisions about professional learning trainings, it is worthwhile to seek resources and opportunities to collect and report this data in a timely manner.

Here are some additional resources regarding real-time evaluation:

Articulating Intended Outcomes Using Logic Models: The Roles Evaluators Play

Posted on July 6, 2016 by , in Blog ()
Wilkerson Peery
Stephanie B. Wilkerson Elizabeth Peery

Articulating project outcomes is easier said than done. A well-articulated outcome is one that is feasible to achieve within the project period, measurable, appropriate for the phase of project development, and in alignment with the project’s theory of change. A project’s theory of change represents causal relationships – IF we do these activities, THEN these intended outcomes will result. Understandably, project staff often frame outcomes as what they intend to do, develop, or provide, rather than what will happen as a result of those project activities. Using logic models to situate intended outcomes within a project’s theory of change helps to illustrate how project activities will result in intended outcomes.

Since 2008, my team and I have served as the external evaluator for two ATE project cycles with the same client. As the project has evolved over time, so too have its intended outcomes. Our experience using logic models for program planning and evaluation has illuminated four critical roles we as evaluators have played in partnership with project staff:

  1. Educator. Once funded, we spent time educating the project partners on the purpose and development of a theory of change and intended outcomes using logic models. In this role, our goal was to build understanding of and buy-in for the need to have logic models with well-articulated outcomes to guide project implementation.
  1. Facilitator. Next, we facilitated the development of an overarching project logic model with project partners. The process of defining the project’s theory of change and intended outcomes was important in creating a shared agreement and vision for project implementation and evaluation. Even if the team includes a logic model in the proposal, refining it during project launch is still an important process for engaging project partners. We then collaborated with individual project partners to build a “family” of logic models to capture the unique and complementary contributions of each partner while ensuring that the work of all partners was aligned with the project’s intended outcomes. We repeated this process during the second project cycle.
  1. Methodologist. The family of logic models became the key source for refining the evaluation questions and developing data collection methods that aligned with intended outcomes. The logic model thus became an organizing framework for the evaluation. Therefore, the data collection instruments, analyses, and reporting yielded relevant evaluation information related to intended outcomes.
  1. Critical Friend. As evaluators, our role as a critical friend is to make evidence-based recommendations for improving project activities to achieve intended outcomes. Sometimes evaluation findings don’t support the project’s theory of change, and as critical friends, we play an important role in challenging project staff to identify any assumptions they might have made about project activities leading to intended outcomes. This process helped to inform the development of tenable and appropriate outcomes for the next funding cycle.

Resources:

There are several resources for articulating outcomes using logic models. Some of the most widely known include the following:

Worksheet: Logic Model Template for ATE Projects & Centers: http://www.evalu-ate.org/resources/lm-template/

Education Logic Model (ELM) Application Tool for Developing Logic Models: http://relpacific.mcrel.org/resources/elm-app/

University of Wisconsin-Extension’s Logic Model Resources: http://www.uwex.edu/ces/pdande/evaluation/evallogicmodel.html

W.K. Kellogg Foundation Logic Model Development Guide: https://www.wkkf.org/resource-directory/resource/2006/02/wk-kellogg-foundation-logic-model-development-guide

zalles2016
Student Learning Assessments: Issues of Validity and Reliability

Posted on June 22, 2016 by  in Blog ()

Senior Educational Researcher, SRI International

In my last post, I talked about the difference between program evaluation and student assessment. I also touched on using existing assessments if they are available and appropriate, and if not, constructing new assessments.  Of course, that new assessment would need to meet test quality standards or otherwise it will not be able to measure what you need to have measured for your evaluation. Test quality has to do with validity and reliability.

When a test is valid, it means that when a student responds with a wrong answer, it would be reasonable to conclude that they did so because they did not learn what they were supposed to have learned. There are all kinds of impediments to an assessment’s validity. For example, if in a science class you are asking students a question aimed at determining if they understand the difference between igneous and sedimentary rocks, yet you know that some of them do not understand English, you wouldn’t want to ask them the question in English. In testing jargon, what you are introducing in such a situation is “construct irrelevant variance.” In this case, the variance in results may be as much due to whether they know English (the construct irrelevant part) as to whether they know the construct, which is the differences between the rock types. Hence, these results would not help you determine if your innovation is helping them learn the science better.

Reliability has to do with test design, administration, and scoring. Examples of unreliable tests are those that are too long, introducing test-taking fatigue that interfere with their being reliable measures of student learning. Another common example of unreliability is when the scoring directions or rubric are not designed well enough to be sufficiently clear about how to judge the quality of an answer. This type of problem will often result in inconsistent scoring, otherwise known as low interrater reliability.

To summarize, a student learning assessment can be very important to your evaluation if a goal of your project is to directly impact student learning. Then you have to make some decisions about whether you can use existing assessments or develop new ones, and if you make new ones, they need to meet technical quality standards of validity and reliability. For projects not directly aiming at improving student learning, an assessment may actually be inappropriate in the evaluation because the tie between the project activities and the student learning may be too loose. In other words, the learning outcomes may be mediated by other factors that are too far beyond your control to render the learning outcomes useful for the evaluation.

zalles2016
Using Learning Assessments in Evaluations

Posted on June 8, 2016 by  in Blog ()

Senior Educational Researcher, SRI International

If you need to evaluate your project, but get confused about whether there should be a student assessment component, it’s important to understand the difference between project evaluation and student assessment. Both are about rendering judgments about something and they are often used interchangeably. In the world of grant-funded education projects however, they have overlapping yet quite different meanings. When you do a project evaluation, you are looking at to what extent the project is meeting its goals and achieving its intended outcomes. When you are assessing, you are looking at student progress in meeting learning goals. The most commonly used instrument of assessment is a test, but there are other mechanisms for assessing learning as well, such as student reports, presentations, or journals.

Not all project evaluations require student assessments and not all assessments are components of project evaluations. For example, the goal of the project may be to restructure an academic program, introduce some technology to the classroom, or get students to persist through college. Of course, in the end, all projects in education aim to improve learning. Yet, by itself, an individual project may not aspire to directly influence learning, but rather influence it through a related effort. In turn, not all assessments are conducted as components of project evaluations. Rather, they are most frequently used to determine the academic progress of individual students.

If you are going to put a student assessment component in your evaluation, answer these questions:

  1. What amount of assessment data will you need to properly generalize from your results about how well your project is faring? For example, how many students are impacted by your program? Do you need to assess them all or can you limit your assessment administration to a representative sample?
  2. Should you administer the assessment early enough to determine if the project needs to be modified midstream? This would be called a formative assessment, as opposed to a summative assessment, which you would do at the end of a project, after you have fully implemented your innovation with the students.

Think also about what would be an appropriate assessment instrument. Maybe you could simply use a test that the school is already using with the students. This would make sense, for example, if your goal is to provide some new curricular innovations in a particular course that the students are already taking. If your project fits into this category, it makes sense because it is likely that those assessments would have already been validated, which means they would have been piloted and subsequently modified as needed to ensure that they truly measure what they are designed to measure.

An existing assessment instrument may not be appropriate for you, however. Perhaps your innovation is introducing new learnings that those tests are not designed to measure. For example, it may be facilitating their learning of new skills, such as using new mobile technologies to collect field data. In this situation, you would want your project’s goal statements to be clear about whether the intention of your project is to provide an improved pathway to already-taught knowledge or skills, or a pathway to new learnings entirely, or both. New learnings would require a new assessment. In my next post, I’ll talk about validity and reliability issues to address when developing assessments.

????????????????????????????????????
Designing Cluster Randomized Trials to Evaluate Programs

Posted on May 25, 2016 by  in Blog ()

Associate Professor, Education, Leadership, Research, and Technology, Western Michigan University

The push for rigorous evaluations of the impact of interventions has led to an increase in the use of randomized trials (RTs). In practice, it is often the case that interventions are delivered at the cluster level, such as a whole school reform model or a new curriculum. In these cases, the cluster (i.e., the school), is the logical unit of random assignment and I hereafter refer to these as cluster randomized trials (CRTs).

Designing a CRT is necessarily more complex than a RT for several reasons. First, there are two sample sizes, i.e., the number of students per school and the total number of schools. Second, the greater the variability in the outcome across schools, the more schools you will need to detect an effect of a given magnitude. The percentage of variance in the outcome that is between schools is commonly referred to as the intra-class correlation (ICC). For example, suppose I am testing an intervention and the outcome of interest is math achievement, there are 500 students per school, and a school level covariate explain 50 percent of the variation in the outcome. If the ICC is 0.20 and I want to detect an effect size difference of 0.2 standard deviations between the treatment and comparison conditions, 82 total schools, or 41 treatment and 41 comparison schools, would be needed to achieve statistical power equal to 0.80, the commonly accepted threshold. Instead, if the ICC is 0.05, the total number of schools would only be 24, a reduction of 54. Hence an accurate estimate of the ICC is critical in planning a CRT as it has a strong impact on the number of schools needed for a study.

The challenge is that the required sample size needs to be determined prior to the start of the study, hence I need to estimate the ICC since the actual data has not yet been collected. Recently there has been an increase in empirical studies which seek to estimate ICCs for different contexts. The findings suggest that the ICC varies depending on outcome type, unit of the clusters (i.e., schools, classrooms, etc.), grade and other features.

Resources:
Resources have started popping up to help evaluators planning CRTs find accurate estimates of the ICC. Two widely used in education include:

  1. The Online Variance Almanac: http://stateva.ci.northwestern.edu/
  2. The Optimal Design Plus Software: http://wtgrantfoundation.org/resource/optimal-design-with-empirical-information-od*

*Note that Optimal Design Plus is a free program that calculates power for CRTs. Embedded within the program is a data repository with ICC estimates.

In the event that empirical estimates are not available for your particular outcome type a search of the relevant literature may uncover estimates or a pilot study may be used to generate reasonable values. Regardless of the source, accurate estimates of the ICC are critical in determining the number of clusters needed in a CRT.