EvaluATE – Evaluation Design

Blog: Designing a Purposeful Mixed Methods Evaluation

Posted on March 1, 2017 in Blog

Doctoral Associate, Western Michigan University

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

A mixed methods evaluation involves collecting, analyzing, and integrating data from both quantitative and qualitative sources. Sometimes I find that, while I plan evaluations with mixed methods, I do not think purposefully about how or why I am choosing and ordering these methods. Intentionally planning a mixed methods design can help strengthen evaluation practices and the evaluative conclusions reached.

Here are three common mixed methods designs, each with its own purpose. Use these designs when you need to (1) see the whole picture, (2) dive deeper into your data, or (3) know what questions to ask.

1. When You Need to See the Whole Picture
First, the convergent parallel design allows evaluators to view the same aspect of a project from multiple perspectives, creating a more complete understanding. In this design, quantitative and qualitative data are collected simultaneously and then brought together in the analysis or interpretation stage.

For example, in an evaluation of a project whose goal is to attract underrepresented minorities into STEM careers, a convergent parallel design might include surveys of students asking Likert questions about their future career plans, as well as focus groups to ask questions about their career motivations and aspirations. These data collection activities would occur at the same time. The two sets of data would then come together to inform a final conclusion.

2. When You Need to Dive Deeper into Data

The explanatory sequential design uses qualitative data to further explore quantitative results. Quantitative data is collected and analyzed first. These results are then used to shape instruments and questions for the qualitative phase. Qualitative data is then collected and analyzed in a second phase.

For example, instead of conducting both a survey and focus groups at the same time, the survey would be conducted and its results analyzed before the focus group protocol is created. The focus group questions can then be designed to enrich understanding of the quantitative results. For instance, while the quantitative data might tell evaluators how many Hispanic students are interested in pursuing engineering, the qualitative data could follow up on students’ motivations behind these responses.

3. When You Need to Know What to Ask

The exploratory sequential design allows an evaluator to investigate a situation more closely before building a measurement tool, giving guidance to what questions to ask, what variables to track, or what outcomes to measure. It begins with qualitative data collection and analysis to investigate unknown aspects of a project. These results are then used to inform quantitative data collection.

If an exploratory sequential design was used to evaluate our hypothetical project, focus groups would first be conducted to explore themes in students’ thinking about STEM careers. After analysis of this data, conclusions would be used to construct a quantitative instrument to measure the prevalence of these discovered themes in the larger student body. The focus group data could also be used to create more meaningful and direct survey questions or response sets.

Intentionally choosing a design that matches the purpose of your evaluation will help strengthen evaluative conclusions. Studying different designs can also generate ideas for new ways to approach future evaluations.

For further information on these designs and more about mixed methods in evaluation, check out these resources:

Creswell, J. W. (2013). What is Mixed Methods Research? (video)

Frechtling, J., & Sharp, L. (Eds.). (1997). User-Friendly Handbook for Mixed Method Evaluations. National Science Foundation.

Watkins, D., & Gioia, D. (2015). Mixed methods research. Pocket guides to social work research methods series. New York, NY: Oxford University Press.

Blog: Sustaining Career Pathways System Development Efforts

Posted on February 15, 2017 in Blog
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Debbie Mills
Director
National Career Pathways Network
Steven Klein
Director
RTI International

Career pathways are complex systems that leverage education, workforce development, and social service supports to help people obtain the skills they need to find employment and advance in their careers. Coordinating people, services, and resources across multiple state agencies and training providers can be a complicated, confusing, and at times, frustrating process. Changes to longstanding organizational norms can feel threatening, which may lead some to question or actively resist proposed reforms.

To ensure lasting success, sustainability planning and evaluation should be integrated into career pathways system development and implementation efforts at the outset, so that new programmatic connections are robust and positioned for longevity.

To support states and local communities in evaluating and planning for sustainability, RTI International created A Tool for Sustaining Career Pathways Efforts.

This innovative paper draws upon change management theory and lessons learned from a multi-year, federally-funded initiative to support five states in integrating career and technical education into their career pathways. Hyperlinks embedded within the paper allow readers to access and download state resources developed to help evaluate and sustain career pathways efforts. A Career Pathways Sustainability Checklist, included at the end of the report, can be used to assess your state’s or local community’s progress toward building a foundation for the long-term success of its career pathways system development efforts.

The paper identifies three actions that contribute to the sustainability of career pathways systems.

1. Craft a Compelling Vision and Build Support for Change

Lasting system transformation begins with lowering organizational resistance to change. This requires that stakeholders build consensus around a common vision and set of goals for the change process, establish new management structures to facilitate cross-agency communications, obtain endorsements from high-level leaders willing to champion the initiative, and publicize project work through appropriate communication channels.

2. Engage Partners and Stakeholders in the Change Process

Relationships play a critical role in maintaining systems over time. Sustaining change requires actively engaging a broad range of partners in an ongoing dialogue to share information about project work, progress, and outcomes, making course corrections when needed. Employer involvement also is essential to ensure that education and training services are aligned with labor market demand.

3. Adopt New Behaviors, Practices, and Processes

Once initial objectives are achieved, system designers will want to lock down new processes and connections to prevent systems from reverting to their original form. This can be accomplished by formalizing new partner roles and expectations, creating an infrastructure for ensuring ongoing communication, formulating accountability systems to track systemic outcomes, and securing new long-term resources and making more effective use of existing funding.

For additional information contact the authors:

Steve Klein; sklein@rti.org
Debbie Mills; fdmills1@comcast.net

Blog: Research Goes to School (RGS) Model

Posted on January 10, 2017 in Blog

Project Coordinator, Discovery Learning Research Center, Purdue University

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Data regarding pathways to STEM careers indicate that a critical transition point exists between high school and college. Many students who are initially interested in STEM disciplines and could be successful in these fields either do not continue to higher education or choose non-STEM majors in college. In part, this is because these students do not see what role they can have in STEM careers. For this reason, the STEM curriculum needs to reflect its applicability to today’s big challenges and show students how these issues are relevant to them on a personal level.

We proposed a project that infused high school STEM curricula with cross-cutting topics related to the hot research areas that scientists are working on today.  We began by focusing on sustainable energy concepts and then shifted to nanoscience and technology.

Pre-service and in-service teachers came to a large Midwestern research university for two weeks of intensive professional development in problem-based learning (PBL) pedagogy.  Along with PBL training, participants also connected with researchers in the grand challenge areas of sustainable energy (in project years 1-3) and nanoscience and technology (years 4-5).

We proposed a two-tiered approach:

1. Develop a model for education consisting of two parts:

  • Initiate a professional development program that engaged pre-service and in-service high school teachers around research activities in grand challenge programs.
  • Support these teachers to transform their curricula and classroom practice by incorporating concepts of the grand challenge programs.

2. Establish a systemic approach for integrating research and education activities.

Results provided a framework for creating professional development with researchers and STEM teachers that culminates in the integration of grand challenge concepts into education curricula.

Using developmental evaluation over a multi-year process, we saw core practices for an effective program begin to emerge:

  • Researchers must identify the basic scientific concepts their work entails. For example, biofuels researchers work with the energy and carbon cycles; nanotechnology researchers must thoroughly understand size-dependent properties, forces, self-assembly, size and scale, and surface area-to-volume ratio.
  • Once identified, these concepts must be mapped to teachers’ state teaching standards and Next Generation Science Standards (NGSS), making them relevant for teachers.
  • Professional development must be planned for researchers to help them share their research at an appropriate level for use by high school teachers in their classrooms.
  • Professional development must be planned for teachers to help them integrate the research content into their teaching and learning standards in meaningful ways.
  • The professional development for teachers must include illustrative activities that demonstrate scientific concepts and be mapped to state and NGSS teaching standards.

The iterative and rapid feedback processes of developmental evaluation allowed the program to evolve. Feedback from data provided the impetus for change, while debriefing sessions provided insight into the program and its core practices. To evaluate the core practices found in the biofuels topic from years 1-3, we used a dissimilar topic, nanotechnology, in years 4-5. We saw greater integration of research and education activities in teachers’ curricula as the core practices became more fully developed through iteration, even with a new topic. The core practices held true regardless of topic, and practitioners became better at delivery with more repetitions in years 4 and 5.

 

Blog: Evaluating Creativity in the Context of STEAM Education

Posted on December 16, 2016 in Blog
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Shelly Engelman
Senior Researcher
The Findings Group, LLC
Morgan Miller
Research Associate
The Findings Group, LLC

At The Findings Group, we are assessing a National Science Foundation Discovery Research K-12 project that gives students an opportunity to learn about computing in the context of music through EarSketch. As with other STEAM (Science, Technology, Engineering, Arts, Math) approaches, EarSketch aims to motivate and engage students in computing through a creative, cross-disciplinary approach. Our challenge with this project was threefold: 1) defining creativity within the context of STEAM education, 2) measuring creativity, and 3) demonstrating how creativity gives rise to more engagement in computing.

The 4Ps of Creativity

To understand creativity, we turned first to the literature. According to previous research, creativity has been discussed from four perspectives, or the 4Ps of creativity: Process, Person, Press/Place, and Product. For our study, we focused on creativity from the perspectives of the Person and the Place. Person refers to the traits, tendencies, and characteristics of the individual who creates something or engages in a creative endeavor. Place refers to the environmental factors that encourage creativity.

Measuring Creativity – Person

Building on previous work by Carroll (2009) and colleagues, we developed a self-report Creativity – Person measure that taps into six aspects of personal expressiveness within computing. These aspects include:

  • Expressiveness: Conveying one’s personal view through computing
  • Exploration:  Investigating ideas in computing
  • Immersion/Flow: Feeling absorbed by the computing activity
  • Originality: Generating unique and personally novel ideas in computing

Through a series of pilot tests with high school students, our final Creativity – Person scale consisted of 10 items and yielded excellent reliability (Cronbach’s alpha = .90 to .93); likewise, it was positively correlated with other psychosocial measures such as computing confidence, enjoyment, and identity and belongingness.
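For readers who want to compute this kind of internal-consistency estimate themselves, here is a minimal sketch of the standard Cronbach’s alpha formula applied to a respondents-by-items score matrix. The data and item count below are invented for illustration; this is not the study’s actual analysis code.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for a respondents-by-items score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                         # number of items in the scale
    item_vars = items.var(axis=0, ddof=1)      # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of the summed scale score
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical responses: 5 students x 4 Likert-type items (1-5 scale)
responses = [
    [4, 5, 4, 5],
    [3, 3, 4, 3],
    [5, 5, 5, 4],
    [2, 3, 2, 3],
    [4, 4, 5, 4],
]
print(round(cronbach_alpha(responses), 2))
```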

Measuring Creativity—Place

Assessing creativity at the environmental level proved to be more of a challenge! In building the Creativity – Place scale, we turned our attention to previous work by Shaffer and Resnick (1999), who assert that learning environments or materials that are “thickly authentic”—personally relevant and situated in the real world—promote engagement in learning. Using this as our operational definition of a creative environment, we designed a self-report scale that taps into four identifiable components of a thickly authentic learning environment:

  • Personal: Learning that is personally meaningful for the learner
  • Real World: Learning that relates to the real world outside of school
  • Disciplinary: Learning that provides an opportunity to think in the modes of a particular discipline
  • Assessment: Learning where the means of assessment reflect the learning process

Our Creativity – Place scale consisted of 8 items and also yielded excellent reliability (Cronbach’s alpha = .91).

Predictive Validity

Once we had our two self-report questionnaires in hand—Creativity – Person and Creativity – Place scales—we collected data among high school students who utilized EarSketch as part of their computing course. Our main findings were:

  • Students showed significant increases from pre to post in personal expressiveness in computing (Creativity – Person), and
  • A creative learning environment (Creativity – Place) predicted students’ engagement in computing and intent to persist. That is, through a series of multiple regression analyses, we found that a creative learning environment, fueled by a meaningful and personally relevant curriculum, drives improvements in students’ attitudes and intent to persist in computing (a minimal sketch of this kind of analysis follows this list).
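As a rough illustration of the kind of regression analysis described above, the sketch below regresses a hypothetical intent-to-persist score on Creativity – Place scores with one covariate. All variable names and data are simulated; the study’s actual models, covariates, and coefficients are not reproduced here.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 120  # hypothetical number of surveyed students

# Simulated scale scores (1-5 Likert averages); these stand in for real survey data
creativity_place = rng.normal(3.5, 0.6, n)       # Creativity - Place scale score
computing_confidence = rng.normal(3.2, 0.7, n)   # example covariate
intent_to_persist = (0.5 * creativity_place
                     + 0.3 * computing_confidence
                     + rng.normal(0, 0.5, n))

# Ordinary least squares regression: does Creativity - Place predict persistence?
X = sm.add_constant(np.column_stack([creativity_place, computing_confidence]))
model = sm.OLS(intent_to_persist, X).fit()
print(model.summary())  # coefficients, p-values, and R-squared
```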

Moving forward, we plan on expanding our work by examining other facets of creativity (e.g., Creativity – Product) through the development of creativity rubrics to assess algorithmic music compositions.

References

Carroll, E. A., Latulipe, C., Fung, R., & Terry, M. (2009). Creativity factor evaluation: Towards a standardized survey metric for creativity support. In C&C ’09: Proceedings of the Seventh ACM Conference on Creativity and Cognition (pp. 127-136). New York, NY: Association for Computing Machinery.

Engelman, S., Magerko, M., McKlin, T., Miller, M., Douglas, E., & Freeman, J. (in press). Creativity in authentic STEAM education with EarSketch. SIGCSE ’17: Proceedings of the 48th ACM Technical Symposium on Computer Science Education. Seattle, WA: Association for Computing Machinery.

Shaffer, D. W., & Resnick, M. (1999). “Thick” authenticity: New media and authentic learning. Journal of Interactive Learning Research, 10(2), 195-215.

Blog: The Value of Using a Psychosocial Framework to Evaluate STEM Outcomes Among Underrepresented Students

Posted on December 1, 2016 in Blog
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Drs. Dawn X. Henderson, Breonte S. Guy, and Chad Markert serve as Co-Principal Investigators of an HBCU UP Targeted Infusion Project grant. Funded by the National Science Foundation, the project aims to explore how infusing lab-bench techniques into the Exercise Physiology curriculum informs undergraduate students’ attitudes about research and science and intentions to persist in STEM-related careers.

The National Science Foundation aims to fund projects that increase retention and persistence in STEM-related careers. Developing project proposals usually involves creating a logic model and an evaluation plan. An intervention, particularly one designed to change an individual’s behavior and outcomes, relies on a combination of psychological and social factors. For example, increasing the retention and persistence of underrepresented groups in the STEM education-to-workforce pipeline depends on attitudes about science, behavior, and the ability to access resources that support exploration of and exposure to STEM.

As faculty interested in designing interventions in STEM education, we developed a psychosocial framework to inform project design and evaluation and believe it offers an insightful strategy for investigators and evaluators. When developing a theory of change or logic model, you can create a visual map (see figure below) to identify underlying psychological and social factors and assumptions that influence program outcomes. In this post, we highlight a psychosocial framework for developing theories of change—specifically as it relates to underrepresented groups in STEM.

Visual mapping can outline the relationship between the intervention and psychological (cognitive) and social domains.

What do we mean by psychosocial framework?

Both retention and persistence rely on social factors, such as financial resources, mentoring, and other forms of social support. For example, in our work, we proposed introducing underrepresented students to lab-bench techniques in the Exercise Physiology curriculum and providing summer research enrichment opportunities that include funding and mentoring. Providing these social resources introduced students to scientific techniques they would not encounter in a traditional curriculum. Psychological factors, such as individual attitudes about science and self-efficacy, are also key contributors to STEM persistence. For instance, self-efficacy is the belief that one has the capacity to accomplish a specific task and achieve a specific outcome.

A practical exercise in developing the psychosocial framework is asking critical questions:

  • What are some social factors driving a project’s outcomes? For example, you may modify social factors by redesigning curriculum to engage students in hands-on experiences, providing mentoring, or improving STEM teaching.
  • How can these social factors influence psychological factors? For example, improving STEM education can change the way students think about STEM. Outcomes then could relate to attitudes towards and beliefs about science.
  • How do psychological factors relate to persistence in STEM? For example, changing the way students think about STEM, along with their attitudes and beliefs, may shape their science identity and increase their likelihood of persisting in STEM education (Guy, 2013).

What is the value-added?

Evaluation plans, specifically those seeking to measure changes in human behavior, hinge on a combination of psychological and social factors. The ways in which individuals think and form attitudes and behaviors, combined with their ability to access resources, influence programmatic outcomes. A psychosocial framework can be used to identify how psychological processes and social assets and resources contribute to increased participation and persistence of underrepresented groups in STEM-related fields and the workforce. More specifically, the recognition of psychological and social factors in shaping science attitudes, behaviors, and intentions to persist in STEM-related fields can generate value in project design and evaluation.

Reference
Guy, B. (2013). Persistence of African American men in science: Exploring the influence of scientist identity, mentoring, and campus climate. (Doctoral dissertation).

Useful Resource

Steve Powell’s AEA365 blog post, Theory Maker: Free web app for drawing theory of change diagrams

Blog: Course Improvement Through Evaluation: Improving Undergraduate STEM Majors’ Capacity for Delivering Inquiry-Based Mathematics and Science Lessons

Posted on November 16, 2016 in Blog

Associate Professor, Graduate School of Education, University of Massachusetts Lowell

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

One of the goals of the University of Massachusetts (UMass) UTeach program is to produce mathematics and science teachers who not only are highly knowledgeable in their disciplines, but also can engage students through inquiry-based instruction. The Research Methods course is one of the core program courses designed to accomplish this goal; it is centered on a series of inquiry-based projects.

What We Did

Specifically, the first inquiry was a simple experiment. Students were asked to look around in their kitchens, come up with a research question, and carry out an experiment to investigate the question. The second inquiry required them to conduct a research project in their own disciplinary field. The third inquiry asked students to pretend to be teachers of a middle or high school math/science course who were about to teach their students topics that involve the concept of slope and its applications. This inquiry required students to develop and administer an assessment tool. In addition, they analyzed and interpreted assessment data in order to find out their pretend-students’ prior knowledge and understanding of the concept of slope and its applications in different STEM disciplines (i.e., using assessment information for lesson planning purposes).

Our Goal

We investigated whether the course achieved its goal of enhancing enrollees’ pedagogical skills for delivering inquiry-based instruction on the mathematical and scientific concepts embedded in the inquiry projects.

What We Learned

Examinations of the quality of students’ written inquiry reports showed that students were able to do increasingly difficult work with a higher degree of competency as the course progressed.

Comparisons of students’ responses to pre- and post-course surveys that consisted of questions about a hypothetical experiment indicated that students gained skills at identifying and classifying experimental variables and sources of measurement error. However, they struggled with articulating research questions and justifying whether a question was researchable. These results were consistent with what we observed in their written reports. As the course progressed, students were more explicit in identifying variables and their relationships and were better at explaining how their research designs addressed possible measurement errors. For most students, however, articulating a researchable question was the most difficult aspect of an inquiry project.

Students’ self-reflections and focus group discussions suggested that our course modeled inquiry-based learning quite well, which was a sharp departure from the step-by-step laboratory activities they were used to as K-12 students. Students also noted that the opportunity to independently conceptualize and carry out an experiment before getting peer and instructor feedback, revising, and producing a final product created a reflective process that they had not experienced in other university course work. Finally, students appreciated the opportunity to articulate the connection between practicing inquiry skills as part of their professional requirements (i.e., as STEM majors) and using inquiry as a pedagogical tool to teach the math and science concepts to middle or high school students. They also noted that knowing how to evaluate their own students’ prior knowledge was an important skill for lesson planning down the road.

Blog: Best Practices for Two-Year Colleges to Create Competitive Evaluation Plans

Posted on September 28, 2016 in Blog
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Kelly Ball
Jeff Grebinoski

Northeast Wisconsin Technical College’s (NWTC) Grants Office works closely with its Institutional Research Office to create ad hoc evaluation teams in order to meet the standards of evidence required in funders’ calls for proposals. Faculty members at two-year colleges often make up the project teams responsible for National Science Foundation (NSF) grant project implementation. However, they often need assistance navigating the terms and concepts traditionally found in scientific research and social science methodology.

Federal funding agencies are now requiring more evaluative rigor in grant proposals than simply documenting deliverables. For example, the NSF’s Scholarships in Science, Technology, Engineering, and Mathematics (S-STEM) program saw dramatic changes in 2015: The program solicitation increased the non-scholarship portion of the budget from 15% of the scholarship amount to 40% of the total project budget, both to expand supports for students and to investigate the effectiveness of those supports.

Technical colleges, in particular, face a unique challenge as solicitations change: These colleges traditionally have faculty members from business, health, and trades industries. Continuous improvement is a familiar concept to these professionals; however, they tend to have varying levels of expertise evaluating education interventions.

The following are a few best practices we have developed for assisting project teams in grant proposal development and project implementation at NWTC.

  • Where possible, work with an external evaluator at the planning stage. External evaluators can provide expertise that principal investigators and project teams might lack, as they are well versed in current evaluation methods, trends, and techniques.
  • As they develop their projects, teams should meet with their Institutional Research Office to better understand data-gathering and research capacity. Some data needed for evaluation plans might be readily available, whereas other data might require advance planning to develop a tracking system. Conversations about what the data will be used for and what questions the team wants to answer will help ensure that the correct data can be gathered.
  • After a grant is awarded, have an early conversation with all internal and external evaluative parties to clarify data roles and responsibilities. Agreeing on reporting deadlines and identifying who will collect the data and conduct further analysis will help avoid delays.
  • Create a “data dictionary” for more complicated projects and variables to ensure that everyone is on the same page about what terms mean. For example, “student persistence” can be defined term-to-term or year-to-year, and all parties need to understand which data will be tracked (a hypothetical entry is sketched below).
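As a hypothetical example of the data dictionary described in the last bullet, the sketch below defines “student persistence” so that every party tracks the same thing. The field names and definitions are illustrative only, not NWTC’s actual conventions.

```python
# Hypothetical data dictionary entries; adapt field names and definitions
# to your own project and Institutional Research Office conventions.
data_dictionary = {
    "student_persistence": {
        "definition": "Student re-enrolls in the subsequent term (term-to-term).",
        "source": "Institutional Research enrollment extract",
        "collected_by": "Institutional Research Office",
        "reporting_cycle": "End of each fall and spring term",
        "notes": "Year-to-year persistence is tracked as a separate variable.",
    },
    "stem_course_completion": {
        "definition": "Grade of C or better in a designated STEM course.",
        "source": "Registrar course records",
        "collected_by": "Grants Office, with IR support",
        "reporting_cycle": "End of each term",
    },
}

for variable, entry in data_dictionary.items():
    print(variable, "-", entry["definition"])
```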

With some planning and the right working relationships in place, two-year colleges can maintain their federal funding competitiveness even as agencies increase evaluation requirements.

Blog: Articulating Intended Outcomes Using Logic Models: The Roles Evaluators Play

Posted on July 6, 2016 in Blog
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Stephanie B. Wilkerson
Elizabeth Peery

Articulating project outcomes is easier said than done. A well-articulated outcome is one that is feasible to achieve within the project period, measurable, appropriate for the phase of project development, and in alignment with the project’s theory of change. A project’s theory of change represents causal relationships – IF we do these activities, THEN these intended outcomes will result. Understandably, project staff often frame outcomes as what they intend to do, develop, or provide, rather than what will happen as a result of those project activities. Using logic models to situate intended outcomes within a project’s theory of change helps to illustrate how project activities will result in intended outcomes.

Since 2008, my team and I have served as the external evaluator for two ATE project cycles with the same client. As the project has evolved over time, so too have its intended outcomes. Our experience using logic models for program planning and evaluation has illuminated four critical roles we as evaluators have played in partnership with project staff:

  1. Educator. Once funded, we spent time educating the project partners on the purpose and development of a theory of change and intended outcomes using logic models. In this role, our goal was to build understanding of and buy-in for the need to have logic models with well-articulated outcomes to guide project implementation.
  2. Facilitator. Next, we facilitated the development of an overarching project logic model with project partners. The process of defining the project’s theory of change and intended outcomes was important in creating a shared agreement and vision for project implementation and evaluation. Even if the team includes a logic model in the proposal, refining it during project launch is still an important process for engaging project partners. We then collaborated with individual project partners to build a “family” of logic models to capture the unique and complementary contributions of each partner while ensuring that the work of all partners was aligned with the project’s intended outcomes. We repeated this process during the second project cycle.
  3. Methodologist. The family of logic models became the key source for refining the evaluation questions and developing data collection methods that aligned with intended outcomes. The logic model thus became an organizing framework for the evaluation. Therefore, the data collection instruments, analyses, and reporting yielded relevant evaluation information related to intended outcomes.
  4. Critical Friend. As evaluators, our role as a critical friend is to make evidence-based recommendations for improving project activities to achieve intended outcomes. Sometimes evaluation findings don’t support the project’s theory of change, and as critical friends, we play an important role in challenging project staff to identify any assumptions they might have made about project activities leading to intended outcomes. This process helped to inform the development of tenable and appropriate outcomes for the next funding cycle.

Resources:

There are several resources for articulating outcomes using logic models. Some of the most widely known include the following:

Worksheet: Logic Model Template for ATE Projects & Centers: http://www.evalu-ate.org/resources/lm-template/

Education Logic Model (ELM) Application Tool for Developing Logic Models: http://relpacific.mcrel.org/resources/elm-app/

University of Wisconsin-Extension’s Logic Model Resources: http://www.uwex.edu/ces/pdande/evaluation/evallogicmodel.html

W.K. Kellogg Foundation Logic Model Development Guide: https://www.wkkf.org/resource-directory/resource/2006/02/wk-kellogg-foundation-logic-model-development-guide

Blog: Using Learning Assessments in Evaluations

Posted on June 8, 2016 in Blog

Senior Educational Researcher, SRI International

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

If you need to evaluate your project but get confused about whether there should be a student assessment component, it’s important to understand the difference between project evaluation and student assessment. Both involve rendering judgments, and the terms are often used interchangeably. In the world of grant-funded education projects, however, they have overlapping yet quite different meanings. When you do a project evaluation, you are looking at to what extent the project is meeting its goals and achieving its intended outcomes. When you are assessing, you are looking at student progress in meeting learning goals. The most commonly used instrument of assessment is a test, but there are other mechanisms for assessing learning as well, such as student reports, presentations, or journals.

Not all project evaluations require student assessments and not all assessments are components of project evaluations. For example, the goal of the project may be to restructure an academic program, introduce some technology to the classroom, or get students to persist through college. Of course, in the end, all projects in education aim to improve learning. Yet, by itself, an individual project may not aspire to directly influence learning, but rather influence it through a related effort. In turn, not all assessments are conducted as components of project evaluations. Rather, they are most frequently used to determine the academic progress of individual students.

If you are going to put a student assessment component in your evaluation, answer these questions:

  1. What amount of assessment data will you need to properly generalize from your results about how well your project is faring? For example, how many students are impacted by your program? Do you need to assess them all, or can you limit your assessment administration to a representative sample? (A rough sample-size sketch follows this list.)
  2. Should you administer the assessment early enough to determine if the project needs to be modified midstream? This would be called a formative assessment, as opposed to a summative assessment, which you would do at the end of a project, after you have fully implemented your innovation with the students.
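For the first question, one rough way to size a representative sample is the standard formula for estimating a proportion with a finite population correction, sketched below. The population size, margin of error, and confidence level are placeholders; the post itself does not prescribe a particular formula.

```python
import math

def sample_size(population, margin_of_error=0.05, z=1.96, p=0.5):
    """Sample size for estimating a proportion, with finite population correction."""
    n0 = (z ** 2) * p * (1 - p) / margin_of_error ** 2  # infinite-population size
    n = n0 / (1 + (n0 - 1) / population)                # finite population correction
    return math.ceil(n)

# Hypothetical program reaching 800 students: about 260 would need to be assessed
# for a +/-5 percentage-point margin of error at 95% confidence.
print(sample_size(population=800))
```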

Think also about what would be an appropriate assessment instrument. Maybe you could simply use a test that the school is already using with the students. This would make sense, for example, if your goal is to provide some new curricular innovations in a particular course that the students are already taking. An existing test is also likely to have been validated already, meaning it has been piloted and subsequently modified as needed to ensure that it truly measures what it is designed to measure.

An existing assessment instrument may not be appropriate for you, however. Perhaps your innovation is introducing new learnings that those tests are not designed to measure. For example, it may be facilitating students’ learning of new skills, such as using new mobile technologies to collect field data. In this situation, you would want your project’s goal statements to be clear about whether the intention of your project is to provide an improved pathway to already-taught knowledge or skills, a pathway to entirely new learnings, or both. New learnings would require a new assessment. In my next post, I’ll talk about validity and reliability issues to address when developing assessments.

Blog: Designing Cluster Randomized Trials to Evaluate Programs

Posted on May 25, 2016 in Blog

Associate Professor, Education, Leadership, Research, and Technology, Western Michigan University

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

The push for rigorous evaluations of the impact of interventions has led to an increase in the use of randomized trials (RTs). In practice, interventions are often delivered at the cluster level, such as a whole-school reform model or a new curriculum. In these cases, the cluster (i.e., the school) is the logical unit of random assignment; I refer to these designs hereafter as cluster randomized trials (CRTs).

Designing a CRT is necessarily more complex than designing an RT for several reasons. First, there are two sample sizes: the number of students per school and the total number of schools. Second, the greater the variability in the outcome across schools, the more schools you will need to detect an effect of a given magnitude. The percentage of variance in the outcome that is between schools is commonly referred to as the intra-class correlation (ICC). For example, suppose I am testing an intervention in which the outcome of interest is math achievement, there are 500 students per school, and a school-level covariate explains 50 percent of the variation in the outcome. If the ICC is 0.20 and I want to detect an effect size difference of 0.2 standard deviations between the treatment and comparison conditions, 82 total schools, or 41 treatment and 41 comparison schools, would be needed to achieve statistical power equal to 0.80, the commonly accepted threshold. If instead the ICC is 0.05, the total number of schools needed would be only 24, a reduction of 58. Hence, an accurate estimate of the ICC is critical in planning a CRT, as it has a strong impact on the number of schools needed for a study.
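The sketch below reproduces this kind of back-of-the-envelope calculation with a simple normal approximation. Tools such as Optimal Design Plus use noncentral t distributions and degrees-of-freedom corrections, so the exact counts cited above (82 and 24 schools) differ slightly from what this simplified formula returns.

```python
from math import ceil
from scipy.stats import norm

def schools_needed(effect_size, icc, n_per_school, r2_between=0.0,
                   power=0.80, alpha=0.05):
    """Approximate total schools (both arms, balanced) for a two-level CRT."""
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    # Variance inflation from clustering, with a school-level covariate
    # explaining r2_between of the between-school variance
    design_var = 4 * (icc * (1 - r2_between) + (1 - icc) / n_per_school)
    return ceil(design_var * (z / effect_size) ** 2)

print(schools_needed(effect_size=0.2, icc=0.20, n_per_school=500, r2_between=0.5))  # ~80 schools
print(schools_needed(effect_size=0.2, icc=0.05, n_per_school=500, r2_between=0.5))  # ~22 schools
```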

The challenge is that the required sample size needs to be determined prior to the start of the study; hence, I need to estimate the ICC before the actual data have been collected. Recently, there has been an increase in empirical studies that seek to estimate ICCs for different contexts. The findings suggest that the ICC varies depending on outcome type, unit of clustering (i.e., schools, classrooms, etc.), grade level, and other features.

Resources:
Resources have started popping up to help evaluators planning CRTs find accurate estimates of the ICC. Two that are widely used in education include:

  1. The Online Variance Almanac: http://stateva.ci.northwestern.edu/
  2. The Optimal Design Plus Software: http://wtgrantfoundation.org/resource/optimal-design-with-empirical-information-od*

*Note that Optimal Design Plus is a free program that calculates power for CRTs. Embedded within the program is a data repository with ICC estimates.

In the event that empirical estimates are not available for your particular outcome type, a search of the relevant literature may uncover estimates, or a pilot study may be used to generate reasonable values. Regardless of the source, accurate estimates of the ICC are critical in determining the number of clusters needed in a CRT.