Blog


Student Learning Assessments: Issues of Validity and Reliability

Posted on June 22, 2016 in Blog

Senior Educational Researcher, SRI International

In my last post, I talked about the difference between program evaluation and student assessment. I also touched on using existing assessments if they are available and appropriate, and constructing new assessments if they are not. Of course, a new assessment needs to meet test quality standards; otherwise, it will not measure what you need measured for your evaluation. Test quality has to do with validity and reliability.

When a test is valid, it means that when a student responds with a wrong answer, it is reasonable to conclude that they did so because they did not learn what they were supposed to have learned. There are all kinds of impediments to an assessment's validity. For example, suppose a science question is meant to determine whether students understand the difference between igneous and sedimentary rocks, but you know that some of the students do not understand English; you would not want to ask them the question in English. In testing jargon, what you are introducing in such a situation is "construct-irrelevant variance." In this case, the variance in results may be due as much to whether students know English (the construct-irrelevant part) as to whether they know the construct, the difference between the rock types. Hence, these results would not help you determine whether your innovation is helping them learn the science better.

Reliability has to do with test design, administration, and scoring. One example of an unreliable test is one that is too long, introducing test-taking fatigue that interferes with its ability to measure student learning consistently. Another common example of unreliability is a scoring guide or rubric that is not clear enough about how to judge the quality of an answer. This problem often results in inconsistent scoring, otherwise known as low interrater reliability.

To summarize, a student learning assessment can be very important to your evaluation if a goal of your project is to directly impact student learning. Then you have to make some decisions about whether you can use existing assessments or develop new ones, and if you make new ones, they need to meet technical quality standards of validity and reliability. For projects not directly aiming at improving student learning, an assessment may actually be inappropriate in the evaluation because the tie between the project activities and the student learning may be too loose. In other words, the learning outcomes may be mediated by other factors that are too far beyond your control to render the learning outcomes useful for the evaluation.

Using Learning Assessments in Evaluations

Posted on June 8, 2016 in Blog

Senior Educational Researcher, SRI International

If you need to evaluate your project but are unsure whether it should include a student assessment component, it's important to understand the difference between project evaluation and student assessment. Both are about rendering judgments, and the terms are often used interchangeably. In the world of grant-funded education projects, however, they have overlapping yet quite different meanings. When you do a project evaluation, you are looking at the extent to which the project is meeting its goals and achieving its intended outcomes. When you are assessing, you are looking at student progress toward meeting learning goals. The most commonly used assessment instrument is a test, but there are other mechanisms for assessing learning as well, such as student reports, presentations, or journals.

Not all project evaluations require student assessments, and not all assessments are components of project evaluations. For example, the goal of a project may be to restructure an academic program, introduce technology to the classroom, or help students persist through college. Of course, in the end, all education projects aim to improve learning. Yet an individual project may not aspire to influence learning directly, but rather through a related effort. Conversely, most assessments are not conducted as components of project evaluations; they are most frequently used to determine the academic progress of individual students.

If you are going to put a student assessment component in your evaluation, answer these questions:

  1. What amount of assessment data will you need to properly generalize from your results about how well your project is faring? For example, how many students are impacted by your program? Do you need to assess them all, or can you limit your assessment administration to a representative sample? (A rough sample-size sketch follows this list.)
  2. Should you administer the assessment early enough to determine if the project needs to be modified midstream? This would be called a formative assessment, as opposed to a summative assessment, which you would do at the end of a project, after you have fully implemented your innovation with the students.
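As a hedged illustration of question 1 (not from the original post), the sketch below shows one common way to gauge how many students you would need to assess in order to generalize to the full group of impacted students, assuming you want to estimate a proportion within a chosen margin of error. The population size, margin of error, and confidence level are hypothetical placeholders.

```python
# Rough sample-size estimate for a proportion, with finite population correction.
# All numbers below are illustrative, not recommendations.
import math

def sample_size(population, margin=0.05, z=1.96, p=0.5):
    """Students to sample to estimate a proportion within +/- margin at ~95% confidence."""
    n0 = (z ** 2) * p * (1 - p) / margin ** 2           # infinite-population sample size
    return math.ceil(n0 / (1 + (n0 - 1) / population))  # correct for the finite group

print(sample_size(population=400))  # about 197 of 400 impacted students
```

A sample of roughly this size supports generalizing to all impacted students without assessing every one of them; a more careful plan would also account for expected nonresponse and any stratification (e.g., by course section).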

Think also about what would be an appropriate assessment instrument. Perhaps you could simply use a test that the school is already using with the students. This would make sense, for example, if your goal is to provide new curricular innovations in a course the students are already taking; in that case, those assessments have likely already been validated, meaning they were piloted and subsequently modified as needed to ensure that they truly measure what they are designed to measure.

An existing assessment instrument may not be appropriate for you, however. Perhaps your innovation introduces new learning that existing tests are not designed to measure; for example, it may help students learn new skills, such as using mobile technologies to collect field data. In this situation, you would want your project's goal statements to be clear about whether the intention is to provide an improved pathway to already-taught knowledge or skills, a pathway to entirely new learning, or both. New learning would require a new assessment. In my next post, I'll talk about validity and reliability issues to address when developing assessments.

Designing Cluster Randomized Trials to Evaluate Programs

Posted on May 25, 2016 in Blog

Associate Professor, Education, Leadership, Research, and Technology, Western Michigan University

The push for rigorous evaluations of the impact of interventions has led to an increase in the use of randomized trials (RTs). In practice, interventions are often delivered at the cluster level, such as a whole-school reform model or a new curriculum. In these cases, the cluster (i.e., the school) is the logical unit of random assignment, and I hereafter refer to these designs as cluster randomized trials (CRTs).

Designing a CRT is necessarily more complex than designing an RT, for several reasons. First, there are two sample sizes: the number of students per school and the total number of schools. Second, the greater the variability in the outcome across schools, the more schools you will need to detect an effect of a given magnitude. The proportion of variance in the outcome that lies between schools is commonly referred to as the intra-class correlation (ICC). For example, suppose I am testing an intervention, the outcome of interest is math achievement, there are 500 students per school, and a school-level covariate explains 50 percent of the variation in the outcome. If the ICC is 0.20 and I want to detect an effect size difference of 0.2 standard deviations between the treatment and comparison conditions, 82 total schools, or 41 treatment and 41 comparison schools, would be needed to achieve statistical power equal to 0.80, the commonly accepted threshold. If instead the ICC is 0.05, only 24 schools would be needed, a reduction of 58. Hence, an accurate estimate of the ICC is critical in planning a CRT, as it has a strong impact on the number of schools needed for a study.
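As a rough illustration (not the exact calculation behind the figures above, which would typically come from power-analysis software such as Optimal Design Plus), the required number of schools can be approximated with the standard two-level minimum detectable effect size (MDES) formula. The Python sketch below uses the common multiplier of roughly 2.8 for 80 percent power at a two-tailed alpha of .05; being an approximation, it lands near, but not exactly on, the school counts cited above.

```python
# Approximate total schools needed in a two-level cluster randomized trial,
# using the common MDES approximation:
#   MDES ~= M * sqrt( rho*(1 - R2)/(P*(1-P)*J) + (1 - rho)/(P*(1-P)*J*n) )
# J = total schools, n = students per school, rho = ICC,
# R2 = outcome variance explained by school-level covariates,
# P = proportion of schools treated, M ~= 2.8 for power .80 at alpha .05.
import math

def schools_needed(effect_size, icc, n_per_school, r2_school=0.0,
                   p_treat=0.5, multiplier=2.8):
    per_school_var = icc * (1 - r2_school) + (1 - icc) / n_per_school
    j = (multiplier / effect_size) ** 2 * per_school_var / (p_treat * (1 - p_treat))
    return math.ceil(j)

print(schools_needed(0.2, icc=0.20, n_per_school=500, r2_school=0.5))  # ~80 schools
print(schools_needed(0.2, icc=0.05, n_per_school=500, r2_school=0.5))  # ~22 schools
```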

The challenge is that the required sample size must be determined before the start of the study, so I need to estimate the ICC before any data have been collected. Recently there has been an increase in empirical studies that seek to estimate ICCs for different contexts. The findings suggest that the ICC varies depending on outcome type, the unit defining the clusters (e.g., schools or classrooms), grade level, and other features.

Resources:
Resources are available to help evaluators planning CRTs find accurate estimates of the ICC. Two widely used in education are:

  1. The Online Variance Almanac: http://stateva.ci.northwestern.edu/
  2. The Optimal Design Plus Software*: http://wtgrantfoundation.org/resource/optimal-design-with-empirical-information-od

*Note that Optimal Design Plus is a free program that calculates power for CRTs. Embedded within the program is a data repository with ICC estimates.

If empirical estimates are not available for your particular outcome type, a search of the relevant literature may uncover estimates, or a pilot study may be used to generate reasonable values. Regardless of the source, accurate estimates of the ICC are critical in determining the number of clusters needed in a CRT.

Professional Development Opportunities in Evaluation – What’s Out There?

Posted on April 29, 2016 in Blog

Doctoral Associate, EvaluATE

To assist the EvaluATE community in learning more about evaluation, we have compiled a list of free and low-cost online and short-term professional development opportunities. There are always new things available, so this is only a place to start!  If you run across a good resource, please let us know and we will add it to the list.

Free Online Learning

Live Webinars

EvaluATE provides webinars created specifically for projects funded through the National Science Foundation's Advanced Technological Education program. The series includes four live events per year. Recordings, slides, and handouts of previous webinars are available. http://www.evalu-ate.org/category/webinars/

MEASURE Evaluation is a USAID-funded project with resources targeted to the field of global health monitoring and evaluation. Webinars are offered nearly every month on various topics related to impact evaluation and data collection; recordings of past webinars are also available. http://www.cpc.unc.edu/measure/resources/webinars

Archived Webinars and Videos

Better Evaluation’s archives include recordings of an eight-part webinar series on impact evaluation commissioned by UNICEF. http://betterevaluation.org/search/site/webinar

Centers for Disease Control’s National Asthma Control Program offers recordings of its four-part webinar series on evaluation basics, including an introduction to the CDC’s Framework for Program Evaluation in Public Health. http://www.cdc.gov/asthma/program_eval/evaluation_webinar.htm

EvalPartners offers several webinars on topics related to monitoring and evaluation (M&E). They also have a series of self-paced e-learning courses. The focus of all programs is to improve competency in conducting evaluation, with an emphasis on evaluation in the community development context. http://www.mymande.org/webinars

Engineers Without Borders partners with communities to help them meet their basic human needs. They offer recordings of their live training events focused on monitoring, evaluation, and reporting. http://www.ewb-usa.org/resources?_sfm_cf-resources-type=video&_sft_ct-international-cd=impact-assessment

The University of Michigan School of Social Work has created six free interactive web-based learning modules on a range of evaluation topics. The target audience is students, researchers, and evaluators. A competency skills test is given at the end of each module, and a printable certificate of completion is available. https://sites.google.com/a/umich.edu/self-paced-learning-modules-for-evaluation-research/

Low-Cost Online Learning

The American Evaluation Association (AEA) Coffee Break Webinars are 20-minute webinars on varying topics.  At this time non-members may register for the live webinars, but you must be a member of AEA to view the archived broadcasts. There are typically one or two sessions offered each month.  http://comm.eval.org/coffee_break_webinars/coffeebreak

AEA's eStudy program is a series of in-depth, real-time professional development opportunities; sessions are not recorded. http://comm.eval.org/coffee_break_webinars/estudy

The Canadian Evaluation Society (CES) offers webinars to members on a variety of evaluation topics. Reduced membership rates are available for members of AEA. http://evaluationcanada.ca/webinars

Face-to-Face Learning

AEA Summer Evaluation Institute is offered annually in June, with a number of workshops and conference sessions.  http://www.eval.org/p/cm/ld/fid=232

The Evaluators' Institute offers one- to five-day courses in Washington, DC, in February and July. Four levels of certificates are available to participants. http://tei.cgu.edu/

Beyond these professional development opportunities, university degree and certificate programs are listed on the AEA website under the “Learn” tab.  http://www.eval.org/p/cm/ld/fid=43

Maximizing Stakeholder Engagement by Bringing Evaluation to Life!

Posted on April 13, 2016 in Blog

CEO, SPEC Associates

Melanie is CEO of SPEC Associates, a nonprofit program evaluation and process improvement organization headquartered in downtown Detroit. Melanie is also on the faculty of Michigan State University, where she teaches Evaluation Management in the M.A. in Program Evaluation program. Melanie holds a Ph.D. in Applied Social Psychology and has directed evaluations for almost 40 years, both locally and nationally. Her professional passion is making evaluation an engaging, fun learning experience for both program stakeholders and evaluators. To this end, Melanie co-created EvaluationLive!, an evaluation practice model that guides evaluators in ways to breathe life into the evaluation experience.

Why is it that sometimes meetings with evaluation stakeholders seem to generate anxiety and boredom, while other times they generate excitement, a hunger for learning and, yes, even fun!?

My colleague, Mary Williams, and I started wondering about this, and working to define it, about eight years ago. With 60 years of collective evaluation experience between us, we documented and analyzed cases and conducted an in-depth literature review seeking an answer. We homed in on two things: (1) a definition of what exemplary stakeholder engagement looks and feels like, and (2) a set of factors that seem to predict when maximum stakeholder engagement exists.

To define “exemplary stakeholder engagement” we looked to the field of positive psychology and specifically to Mihaly Csikszentmihalyi’s  (2008) Flow Theory. Csikszentmihalyi defines “flow” as that highly focused mental state where time seems to stand still. Think of a musician composing a sonata. Think of a basketball player being in the “zone.” Flow theory says that this feeling of “flow” occurs when the person perceives that the task at hand is challenging and also perceives that she or he has the skill level sufficient to accomplish the task.

The EvaluationLive! model asserts that maximizing stakeholder engagement with an evaluation – having a flow-like experience during encounters between the evaluator and the stakeholders – requires certain characteristics of the evaluator/evaluation team, of the client organization, and of the relationship between them. Specifically, the evaluator/evaluation team must (1) be competent in the conduct of program evaluation; (2) have expertise in the subject matter of the evaluation; (3) have skills in the art of interpersonal, nonverbal, verbal and written communication; (4) be willing to be flexible in order to meet stakeholders’ needs typically for delivering results in time for decision making; and (5) approach the work with a non-egotistical learner attitude. The client organization must (1) be a learning organization open to hearing good, bad, and ugly news; (2) drive the questions that the evaluation will address; and (3) have a champion positioned within the organization who knows what information the organization needs when, and can put the right information in front of the right people at the right time. The relationship between the evaluator and client must be based on (1) trust, (2) a belief that both parties are equally expert in their own arenas, and (3) a sense that the evaluation will require shared responsibility on the part of the evaluator and the client organization.

Feedback from the field shows EvaluationLive!’s goalposts help evaluators develop strategies to emotionally engage clients in their evaluations. EvaluationLive! has been used to diagnose problem situations and to direct “next steps.” Evaluators are also using the model to guide how to develop new client relationships. We invite you to learn and get involved.

Summary of the EvaluationLive! Model

Color Scripting to Measure Engagement

Posted on March 30, 2016 in Blog

President, iEval

Last year I went to the D23 Expo in Anaheim, California, a conference for Disney fans everywhere. I got to attend panels where I learned about past Disney secrets and upcoming Disney plans. I went purely for myself, since I love all things Disney, and I never dreamed I would learn something applicable to my evaluation practice.

In a session with John Lasseter, Andrew Stanton, Pete Docter, and others from Pixar, I learned about a technique created by Ralph Eggleston (who was there too) called color scripting. Color scripting is a type of storyboarding in which the main colors of each panel are changed to reflect the emotion the animated film is supposed to convey at that moment. It helped the Pixar team understand at a quick glance what was going on emotionally in the film, and it also made it easier to create a musical score to enhance those emotions.

Then, a few weeks later, I was sitting in a large event for a client, observing from the back of the room. I started taking notes on the engagement and energy of the audience based on who was presenting. I created some metrics on the spot, including the number of people on their mobile devices, the number of people leaving the event, laughter, murmuring, applause, etc. I thought I would create a simple chart with a timeline of the event, highlighting who was presenting at different times and indicating whether engagement was high/medium/low and whether energy was high/medium/low. When analyzing the data, I quickly realized that engagement and energy were 100% related: if engagement was high, energy soon followed as high. So, instead of charting two dimensions, I really only needed to chart one: engagement and energy combined (see definitions of engagement and energy in the graphic below). That's when it hit me: color scripting! Okay, I'm no artist like Ralph Eggleston, so I created a simple color scheme to use.

Graphic 1

When I shared this with the clients who put on the event, they could clearly see how the audience reacted to its various elements. It was helpful in determining how to improve the event in the future. This was a quick and easy visual, made in Word, to illustrate the overall reactions of the audience.
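For evaluators who would rather script this kind of visual than assemble it by hand in Word, here is a minimal matplotlib sketch (my illustration, not the author's original graphic) of a color script: a timeline of event segments shaded by a combined engagement-and-energy rating. The segments, ratings, and color choices are all hypothetical.

```python
# A color script as a single horizontal timeline: each event segment is shaded
# by its combined engagement/energy rating. Data below are made up.
import matplotlib.pyplot as plt

segments = [                     # (segment label, length in minutes, rating)
    ("Welcome", 10, "high"),
    ("Keynote", 30, "medium"),
    ("Panel", 25, "low"),
    ("Student showcase", 20, "high"),
]
colors = {"high": "#2e7d32", "medium": "#fbc02d", "low": "#c62828"}

fig, ax = plt.subplots(figsize=(8, 1.5))
start = 0
for label, minutes, rating in segments:
    ax.barh(0, minutes, left=start, color=colors[rating], edgecolor="white")
    ax.text(start + minutes / 2, 0, label, ha="center", va="center", fontsize=8)
    start += minutes
ax.set_yticks([])
ax.set_xlabel("Minutes into the event")
plt.tight_layout()
plt.show()
```

Substituting the actual agenda times and observation ratings would produce a chart similar in spirit to Graphic 1.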

I have since applied this to a STEM project, color scripting how the teachers in a two-week professional development workshop felt at the end of each day, based on one word they shared as they exited the workshop. Mapping participant feelings across the different cohorts and comparing them with what was taught each day, and how, led to thoughtful conversations with the trainers about how they want participants to feel and what they need to change to bring reality in line with intention.

Graphic 2

You never know where you’re going to learn a technique or tool that could be useful in your evaluation practice and useful to the client. Be open to learning everywhere you go.

The Retrospective Pretest Method for Evaluating Training

Posted on March 16, 2016 in Blog

Director of Research, The Evaluation Center at Western Michigan University

In a retrospective pretest,1 trainees rate themselves before and after a training in a single data collection event. It is useful for assessing individual-level changes in knowledge and attitudes as one part of an overall evaluation of an intervention. This method fits well with the Kirkpatrick Model for training evaluation, which calls for gathering data about participants’ reaction to the training, their learning, changes in their behavior, and training outcomes. Retrospective pretest data are best suited for evaluating changes in learning and attitudes (Level 2 in the Kirkpatrick Model).

The main benefit of using this method is that it reduces response-shift bias, which occurs when respondents change their frame of reference for answering questions. It is also convenient, adaptable to a wide range of contexts, generally more acceptable to adult learners than traditional testing, and tends to yield more accurate self-report data than traditional pre-post self-assessment methods. Theodore Lamb provides a succinct overview of the strengths and weaknesses of this method in a Harvard Family Research Project newsletter article; see bit.ly/hfrp-retro.

The University of Wisconsin Extension’s Evaluation Tip Sheet 27: Using the Retrospective Post-then-Pre Design provides practical guidelines about how to use this method: bit.ly/uwe-tips.

Design

The focus of retrospective pretest questions should be on the knowledge, skills, attitudes, or behaviors that are the focus of the intervention being evaluated. General guidelines for formatting questions: 1) Use between 4 and 7 response categories in a Likert-type or partially anchored rating scale; 2) Use formatting to distinguish pre and post items; 3) Provide clear instructions to respondents. If you are using an online survey platform, check your question type options before committing to a particular format. To see examples and learn more about question formatting, see University of Wisconsin Extension’s Evaluation Tip Sheet 28: “Designing a Retrospective Post-then-Pre Question” at bit.ly/uwe-tips.

For several examples of Likert-type rating scales, see bit.ly/likert-scales—be careful to match question prompts to rating scales.

Analysis and Visualization

Retrospective pretest data are usually ordinal, meaning the ratings are hierarchical, but the distances between the points on the scale (e.g., between “somewhat skilled” and “very skilled”) are not necessarily equal. Begin your analysis by creating and examining the frequency distributions for both the pre and post ratings (i.e., the number and percentage of respondents who answer in each category). It is also helpful to calculate change scores—the difference between each respondent’s before and after ratings—and look at those frequency distributions (i.e., the number and percentage of respondents who reported no change, reported a change of 1 level, 2 levels, etc.).
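As an illustration of the analysis just described (with made-up ratings on a 1-5 scale rather than data from a real training), the pandas sketch below produces the pre and post frequency distributions and the change-score distribution.

```python
# Frequency distributions and change scores for retrospective pretest ratings.
# Ratings are hypothetical, on a 1-5 scale (e.g., "not at all skilled" to "extremely skilled").
import pandas as pd

df = pd.DataFrame({
    "pre":  [2, 3, 1, 2, 4, 2, 3, 1],
    "post": [4, 4, 3, 3, 5, 3, 4, 2],
})

# Frequency distributions for the pre and post ratings
print(df["pre"].value_counts().sort_index())
print(df["post"].value_counts().sort_index())

# Change scores: each respondent's post rating minus their pre rating
df["change"] = df["post"] - df["pre"]
print(df["change"].value_counts().sort_index())                        # counts by levels gained
print((df["change"].value_counts(normalize=True) * 100).sort_index())  # percentages
```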

For more on how to analyze retrospective pretest data and ordinal data in general, see the University of Wisconsin Extension's Evaluation Tip Sheet 30: "Analysis of Retrospective Post-then-Pre Data" and Tip Sheet 15: "Don't Average Words" at bit.ly/uwe-tips.

For practical guidance on creating attractive, effective bar, column, and dot plot charts, as well as other types of data visualizations, visit stephanieevergreen.com.

Using Results

To use retrospective pretest data to make improvements to an intervention, examine the data to determine whether some groups (based on characteristics such as job role, other demographics, or incoming skill level) gained more or less than others, and compare the results to the intervention's relative strengths and weaknesses in achieving its objectives. Make adjustments to future offerings based on lessons learned, and monitor whether the changes lead to improvements in outcomes.

To learn more, see the slides and recording of EvaluATE’s December 2015 webinar on this topic: http://www.evalu-ate.org/webinars/2015-dec/

For a summary of research on this method, see Klatt and Powell's (2005) white paper, "Synthesis of Literature Relative to the Retrospective Pretest Design": bit.ly/retro-syn.

1 This method has other names, such as post-then-pre and retrospective pretest-posttest.

Evaluating for Sustainability: How Can Evaluators Help?

Posted on February 17, 2016 in Blog

Research Analyst, Hezel Associates

Developing a functional strategy to sustain crucial program components is often overlooked by project staff implementing ATE-funded initiatives. At the same time, evaluators may neglect the opportunity to provide value to decision makers regarding the program components most vital to sustain. In this post, I suggest a few strategies to avoid both of these traps, developed through my work at Hezel Associates, specifically with my colleague Sarah Singer.

Defining sustainability is a difficult task in its own right, often eliciting a plethora of interpretations that could be deemed “correct.” However, the most recent NSF ATE program solicitation specifically asks grantees to produce a “realistic vision for sustainability” and defines the term as meaning “a project or center has developed a product or service that the host institution, its partners, and its target audience want continued.” Two phrases jump out of this definition: realistic vision and what stakeholders want continued. NSF’s definition, and these terms in particular, frame my tips for evaluating for sustainability for an ATE project while addressing three common challenges.

Challenge 1: The project staff doesn’t know what components to sustain.

I use a logic model to address this problem. Returning to the definition of sustainability provided by the NSF ATE program, it's possible to replace "product" with "outputs" and "service" with "activities" (taking some liberties here) to put things in terms common to typical logic models. This produces a visual tool useful for an open discussion with project staff regarding the products or services they want continued and which ones are realistic to continue. The exercise can identify program elements to assess for sustainability potential, while unearthing less obvious components not described in the logic model.

Challenge 2: Resources are not available to evaluate for sustainability.

Embedding data collection for sustainability into the evaluation increases efficiency. First, I create a specific evaluation question (or questions) focusing on sustainability, using what stakeholders want continued and what is realistic as a framework to generate additional questions. For example, “What are the effective program components that stakeholders want to see continued post-grant-funding?” and “What inputs and strategies are needed to sustain desirable program components identified by program stakeholders?” Second, I utilize the components isolated in the aforementioned logic model discussion to inform qualitative instrument design. I explore those components’ utility through interviews with stakeholders, eliciting informants’ ideas for how to sustain them. Information collected from interviews allows me to refine potentially sustainable components based on stakeholder interest, possibly using the findings to create questionnaire items for further refinement. I’ve found that resources are not an issue if evaluating for sustainability is planned accordingly.

Challenge 3: High-level decision makers are responsible for sustaining project outcomes or activities and they don’t have the right information to make a decision.

This is a key reason why evaluating for sustainability throughout the entire project is crucial. Ultimately, the decision makers to whom project staff report determine which program components are continued beyond the NSF funding period. A report consisting of three years of sustainability-oriented data, detailing what stakeholders want continued while addressing what is realistic, allows project staff to make a compelling case to decision makers for sustaining essential program elements. Evaluating for sustainability supports project staff with solid data, enabling comparisons between more and less desirable components that can easily be presented to decision makers. For example, findings focusing on sustainability might help a project manager reallocate funds to support crucial components (perhaps sacrificing others), change staffing, replace personnel with technology (or vice versa), or engage partners to provide resources.

The end result could be realistic, data-supported strategies to sustain the program components that stakeholders want continued.

Project Data for Evaluation: Google Groups Project Team Feedback

Posted on February 11, 2016 in Blog

President and CEO, Censeo Group

At Censeo Group, a program evaluation firm located in northeast Ohio, we are evaluating a number of STEM projects and often face the challenge of how to collect valid and reliable data about the impact of curriculum implementation: what implementation looks like, students' perceptions of the program, project leaders' comfort with lessons, and the extent to which students find project activities engaging and beneficial.

We use various methods to gather curriculum implementation data. Observations offer a glimpse into how faculty deliver new curriculum materials and how students interact and react to those materials, but are time-intensive and require clear observation goals and tools. Feedback surveys offer students and staff the opportunity to provide responses that support improvement or provide a summative analysis of the implementation, but not everyone responds and some responses may be superficial. During a recent project, we were able to use an ongoing, rich, genuine, and helpful source of project information for the purpose of evaluation.

Google Groups Blog

Project leaders created a Google Group and invited all project staff and the evaluation team to join with the following message:

“Welcome to our Google Group! This will be a format for sharing updates from interventions and our sites each week. Thanks for joining in our discussion!”

The team chose Google Groups because everybody was comfortable with the environment, and it is free, easy to use and easy to access.

Organizing the Posts

Project leaders created a prompt each week, asking staff to “Post experiences from Week X below.” This chronological method of organization kept each week’s feedback clustered. However, a different organizing principle could be used, for example, curriculum unit or school.

In the case of this Google Group, the simple prompt resonated well with project staff, who wrote descriptive and reflective entries. Graduate students, who were delivering a new curriculum to high school students, offered recommendations for colleagues who would be teaching the content later in the week about how to organize instruction, engage students, manage technology, or address questions that had come up during their lessons. Graduate students also referred to each other's posts, indicating that this interactive method of project communication was helpful as they worked in the schools, for example, in organizing materials or modifying lessons based on available time or student interest.

Capturing and Analyzing the Data 

The evaluation team used NVivo's NCapture, a web browser add-on for the NVivo qualitative data analysis software that allows blog posts to be quickly imported into the software for analysis. Once the posts were in NVivo, the team coded the data to analyze the successes and challenges of using the new curriculum in the high schools.

Genuine and Ongoing Data

The project team is now implementing the curriculum for the second time with a new group of students. Staff members are posting weekly feedback about this second implementation. This ongoing use of the Google Group blog will allow the evaluation team to analyze and compare implementation by semester (Fall 2015 versus Spring 2016), by staff type (revealing changes in graduate students' skills and experience), by school, and by other relevant categories.

From a strictly data management perspective, a weekly survey of project staff using a tool such as Google Forms or an online survey system, from which data could be transferred directly into a spreadsheet, likely would have been easier to manage and analyze. However, the richness of the data that the Google Groups entries generated was well worth the trade-off of the extra time required to capture and upload each post. Rather than giving staff an added “evaluation” activity that was removed from the work of the project, and to which likely not all staff would have responded as enthusiastically, these blog posts provided evaluation staff with a glimpse into real-time, genuine staff communication and classroom implementation challenges and successes. The ongoing feedback about students’ reactions to specific activities supported project implementation by helping PIs understand which materials needed to be enhanced to support students of different skill levels as the curriculum was being delivered. The blog posts also provided insights into the graduate students’ comfort with the curriculum materials and highlighted the need for additional training for them about specific STEM careers. The blog allowed PIs to quickly make changes during the semester and provided the evaluation team with information about how the curriculum was being implemented and how changes affected the project over the course of the semester.

You can find additional information about NVivo here: http://www.qsrinternational.com/product. The site includes training resources and videos about NVivo.

You can learn how to create and use a Google Group at the Google Groups Help Center:
https://support.google.com/groups/?hl=en#topic=9216

Good Communication Is Everything!

Posted on February 3, 2016 in Blog

Evaluator, South Carolina Advanced Technological Education Resource Center

I am new to the field of evaluation, and the most important thing I learned in my first nine months is that effective communication is critical to the success of a project evaluation. Whether your interactions are primarily virtual or face-to-face, it is important to know your client's communication preferences. Knowing the client's schedule is also important. For example, if you are working with faculty, having a copy of their teaching and office-hours schedule for each semester can help.

While having long lead times to get to know the principal investigator and project team is desirable and can promote strong relationship building in advance of implementing evaluation strategies, that isn't always possible. With my first project, contracts were finalized with the client and evaluators only days before a major project event. There was little time to prepare and no opportunity to get to know the principal investigator or grant team before launching into evaluation activities. In preparation, I had an evaluation plan, a copy of the proposal as submitted, and other project-related documents. I was also working with a veteran evaluator who knew the PI and had experience evaluating another project for the client. Nonetheless, there were surprises that caught both the veteran evaluator and me off guard. As the two of us worked with the project team to home in on the data needed to make the evaluation stronger, we discovered that the goals, objectives, and some of the activities had been changed during the project's negotiations with NSF prior to funding. As evaluators, we discovered that we were working from a different playbook than the PI and other team members! The memory of this discovery still sends chills down my back!

A mismatch in communication styles and expected response times can also get an evaluation off to a rocky start. If not addressed, unmet expectations can lead to disappointment and animosity. In our case, face-to-face interaction was key to keeping the evaluation moving forward. Even when a project is clearly doing exciting and impactful work, it isn't always possible to collect all of the data called for in the evaluation plan. I've learned firsthand that the tug-of-war between an evaluator's desire and preparation to conduct a rigorous evaluation and the need to be flexible and work within the constraints of a particular situation isn't always comfortable.

Lessons learned

From this experience, I learned some important points that I think will be helpful to new evaluators.

  • Establishing a trusting relationship can be as important as conducting the evaluation. Find out early if you and the principal investigator are compatible and can work together. The PI and evaluator should get to know each other and establish some common expectations at the earliest possible date.
  • Determine how you will communicate and ensure a common understanding of what constitutes a reasonable response time for emails, telephone calls, or requests for information from either party. Individual priorities differ and thus need to be understood by both parties.
  • Be sure to ask at the outset if there have been changes to the goals and objectives for the project since the proposal was submitted. Adjust the evaluation plan accordingly.
  • Determine the data that can be and will be collected and who will be responsible for providing what information. In some situations, it helps to secure permission to work directly with an institutional research office or internal evaluator for a project to collect data.
  • When there are differences of opinion or misunderstandings, confront them head on. If the relationship continues to be contentious in any way, changing evaluators may be the best solution.

I hope that some of my comments will help other newcomers to realize that the yellow brick road does have some potential potholes and road closures.