
Blog: Using Mutual Interviewing to Gather Student Feedback

Posted on September 18, 2017 in Blog

Ph.D., Applied Inference

Creative Commons License This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

It’s hard to do a focus group with 12 or more students! With so many people and so little time, you know it’s going to be hard to hear from the quieter students, but that may be who you need to hear from the most! Or, maybe some of the questions are sensitive – the students with the most to lose may be the least likely to speak up. What can you do? Mutual interviewing. Here is how you do it.

This method works great for non-English speakers, as long as all participants speak the same language and your co-facilitator and one person in each group are bilingual.

Warnings: The setup is time-consuming, it is very hard to explain, pandemonium ensues, people start out confused, they are often afraid they’ll do it wrong, and it is noisy (a big room is best!).

Promises: Participants are engaged, it accommodates much larger groups than a traditional focus group, everyone participates and no one can dominate, responses are anonymous except to the one individual who heard the response, it builds community and enthusiasm, and it is an empowering process.

Preparation (the researcher):

  1. Choose four distinct topics related to the project, such as program strengths, program weaknesses, suggestions for improvement, future plans, how you learn best, or personal challenges that might interfere. The challenge is to identify the most important topics that are far enough apart that people will not give the same answer to each question.
  2. Create detailed interview guides, perhaps including probe questions the group members can use to help their interviewees get traction on the question. Each group member will need four copies of his or her group’s interview guide (one for each of the three interviewees, plus one to fill out himself or herself).
  3. Prepare nametags with a group and ID number (to ensure confidentiality) (e.g., 3-1 would be Group 3, member 1). Make sure the groups are equal in size – with between 3 and 6 members per group. This method allows you to conduct the focus group activity with up to 24 people! The nametags help member-interviewers and member-interviewees from the designated groups find each other during the rounds.
  4. Create a brief demographic survey, and pre-fill it with Group and ID numbers to match the nametags. (The ID links the survey with interview responses gathered during the session, and this helps avoid double counting responses during analysis. You can also disaggregate interview responses by demographics.)
  5. Set up four tables for the groups and label them clearly with group number.
  6. Provide good food and have it ready to welcome the participants as they walk in the door.

During the Session:

At the time of the focus group, help people to arrange themselves into 4 equal-sized groups of between 3 and 6 members. Assign one of the topics to each group. Group members will research their topic by interviewing someone from each of the other groups (and being interviewed in return by their counterparts). After completing all rounds of interviews, they come back to their own group and answer the questions themselves. Then they discuss what they heard with each other, and report it out. The report-out gives other people the opportunity to add thoughts or clarify their views.

  1. As participants arrive, give them a nametag with Group and ID number and have them fill out a brief demographic survey (which you have pre-filled with their group/id number). The group number will indicate the table where they should sit.
  2. Explain the participant roles: Group Leader, Interviewer, and Interviewee.
    1. One person in each group is recruited to serve as a group leader and note taker during the session. This person will also participate in a debrief afterward to provide insights or tidbits gleaned during their group’s discussion.
    2. The group leader will brief their group members on their topic and review the interview guide.
    3. Interviewer/Interviewee: During each round, each member is paired with one person from another group, and they take turns interviewing each other about their group’s topic.
  3. Give members 5 minutes to review the interview guide and answer the questions on their own. They can also discuss with others in their group.
  4. After 5 minutes, pair each member up with a partner from another group for the interview (i.e., Group 1 with Group 2, Group 3 with Group 4; the full three-round rotation is sketched in the example after this list). The members should mark a fresh interview sheet with their interviewee’s Group-ID number and then take turns interviewing each other. Make sure they take notes during the interview. Give the first interviewer 5 minutes, then switch roles, and repeat the process.
  5. Rotate and pair each member with someone from a different group (Group 1 and 3, Group 2 and 4) and repeat the interviews using a fresh interview sheet, marked with the new interviewee’s Group-ID number. Again, each member will interview the other for five minutes.
  6. Finally, rotate again and pair up members from Groups 1 and 4 and Groups 2 and 3 for the final round. Mark the third clean interview sheet with each interviewee’s Group-ID number and interview each other for five minutes.
  7. Once all pairings are finished, members return to their original groups. Each member takes 5 minutes to complete or revise their own interview form, possibly enriched by the perspectives of 3 other people.
  8. The Group Leader facilitates a 15-minute discussion, during which participants compare notes and prepare a flip chart to report out their findings. The Group Leader should take notes during the discussion. (Tip: Sometimes it’s helpful to provide guiding questions for the report-out.)
  9. Each group then has about five minutes to report the compiled findings. (Tip: During the reports, have some questions prepared to further spark conversation).
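The pairing rotation in steps 4-6 follows a fixed round-robin pattern among the four groups. If it helps with planning, here is a minimal sketch in plain Python (the group size is a hypothetical placeholder; use your actual group sizes) that prints the nametag IDs and the three rounds of pairings:

```python
# Round-robin pairing of four topic groups across three interview rounds.
# group_size is hypothetical; adjust to your actual groups (3-6 members each).
group_size = 4

rounds = [
    {1: 2, 3: 4},  # Round 1: Groups 1 & 2, Groups 3 & 4
    {1: 3, 2: 4},  # Round 2: Groups 1 & 3, Groups 2 & 4
    {1: 4, 2: 3},  # Round 3: Groups 1 & 4, Groups 2 & 3
]

# Nametag IDs in the Group-ID format described above (e.g., "3-1" = Group 3, member 1).
nametags = [f"{g}-{m}" for g in range(1, 5) for m in range(1, group_size + 1)]
print("Nametags:", ", ".join(nametags))

for i, pairing in enumerate(rounds, start=1):
    print(f"\nRound {i}:")
    for g, partner in pairing.items():
        for m in range(1, group_size + 1):
            print(f"  {g}-{m} and {partner}-{m} interview each other")
```

This simply pairs the nth member of one group with the nth member of the partner group; in the room, any one-to-one matching between the two paired groups works just as well.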

After the Session:

  1. Hold a brief (10-15 minute) meeting with the Group Leaders and have them talk about the process, insights, or tidbits that did not make it to the flip chart and provide any additional feedback to the researcher.

Results of the process:

You will now have demographic data from the surveys, notes from the individual interview sheets, the group leaders’ combined notes, and the flip charts of combined responses to each question.

To learn more about mutual interviewing, see pages 51-55 of Research Toolkit for Program Evaluation and Needs Assessments Summary of Best Practices.

Blog: Not Just an Anecdote: Systematic Analysis of Qualitative Evaluation Data

Posted on August 30, 2017 in Blog

President and Founder, Creative Research & Evaluation LLC (CR&E)

Creative Commons License This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

As a Ph.D.-trained anthropologist, I spent many years learning how to shape individual stories and detailed observations into larger patterns that help us understand social and cultural aspects of human life. Thus, I was taken aback when I realized that program staff or program officers often initially think of qualitative evaluation as “just anecdotal.” Even people who want “stories” in their evaluation reports can be surprised at what is revealed through a systematic analysis of qualitative data.

Here are a few tips that can help lead to credible findings using qualitative data.  Examples are drawn from my experience evaluating ATE programs.

  • Organize your materials so that you can report which experiences are shared among program participants and what perceptions are unusual or unique. This may sound simple, but it takes forethought and time to provide a clear picture of the overall range and variation of participant perceptions. For example, in analyzing two focus group discussions held with the first cohort of students in an ATE program, I looked at each transcript separately to identify the program successes and challenges raised in each focus group. Comparing major themes raised by each group, I was confident when I reported that students in the program felt well prepared, although somewhat nervous about upcoming internships. On the other hand, although there were multiple joking comments about unsatisfactory classroom dynamics, I knew these were all made by one person and not taken seriously by other participants because I had assigned each participant a label and I used these labels in the focus group transcripts.
  • Use several qualitative data sources to strengthen a complex conclusion. In technical terms, this is called “triangulation.” Two common methods of triangulation are comparing information collected from people with different roles in a program and comparing what people say with what they are observed doing. In some cases, data sources converge, and in some cases they diverge. In collecting early information about an ATE program, I learned how important the program is to industry stakeholders. There was such a need for entry-level technicians that stakeholders, students, and program staff all mentioned ways that immediate job openings might take short-term priority over continuing immediately into advanced levels of the same program.
  • Think about qualitative and quantitative data together in relation to each other. Student records and participant perceptions show different things and can inform each other. For example, instructors from industry may report a cohort of students as being highly motivated and uniformly successful at the same time that institutional records show a small number of less successful students. Both pieces of the picture are important for assessing a project’s success; one shows a high level of industry enthusiasm, while the other can provide exact percentages about participant success.

Additional Resources

The following two sources are updated classics in the fields of qualitative research and evaluation.

Miles, M. B., Huberman, A. M., & Saldaña, J. (2014). Qualitative data analysis: A methods sourcebook (3rd ed.). Thousand Oaks, CA: Sage.

Patton, M. Q. (2015). Qualitative research & evaluation methods: Integrating theory and practice: The definitive text of qualitative inquiry frameworks and options (4th ed.). Thousand Oaks, CA: Sage.

Blog: Scavenging Evaluation Data

Posted on January 17, 2017 in Blog

Director of Research, The Evaluation Center at Western Michigan University

Creative Commons License This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

But little Mouse, you are not alone,
In proving foresight may be vain:
The best laid schemes of mice and men
Go often askew,
And leave us nothing but grief and pain,
For promised joy!

From To a Mouse, by Robert Burns (1785), modern English version

Research and evaluation textbooks are filled with elegant designs for studies that will illuminate our understanding of social phenomena and programs. But as any evaluator will tell you, the real world is fraught with all manner of hazards and imperfect conditions that wreak havoc on design, bringing grief and pain, rather than the promised joy of a well-executed evaluation.

Probably the biggest hindrance to executing planned designs is that evaluation is just not the most important thing to most people. (GASP!) They are reluctant to give two minutes for a short survey, let alone an hour for a focus group. Your email imploring them to participate in your data collection effort is one of hundreds of requests for their time and attention that they are bombarded with daily.

So, do all the things the textbooks tell you to do. Take the time to develop a sound evaluation design and do your best to follow it. Establish expectations early with project participants and other stakeholders about the importance of their cooperation. Use known best practices to enhance participation and response rates.

In addition: Be a data scavenger. Here are two ways to get data for an evaluation that do not require hunting down project participants and convincing them to give you information.

1. Document what the project is doing.

I have seen a lot of evaluation reports in which evaluators painstakingly recount a project’s activities as a tedious story rather than a straightforward account. This task typically requires the evaluator to ask many questions of project staff, pore through documents, and track down materials. It is much more efficient for project staff to keep a record of their own activities. For example, see EvaluATE’s resume. It is a no-nonsense record of our funding, activities, dissemination, scholarship, personnel, and contributors. In and of itself, our resume does most of the work of the accountability aspect of our evaluation (i.e., Did we do what we promised?). In addition, the resume can be used to address questions like these:

  • Is the project advancing knowledge, as evidenced by peer-reviewed publications and presentations?
  • Is the project’s productivity adequate in relation to its resources (funding and personnel)?
  • To what extent is the project leveraging the expertise of the ATE community?

2. Track participation.

If your project holds large events, use a sign-in sheet to get attendance numbers. If you hold webinars, you almost certainly have records with information about registrants and attendees. If you hold smaller events, pass around a sign-in sheet asking for basic information like name, institution, email address, and job title (or major if it’s a student group). If the project has developed a course, get enrollment information from the registrar. Most importantly: Don’t put these records in a drawer. Compile them in a spreadsheet and analyze the heck out of them (a minimal analysis sketch follows the list below). Here are example data points that we glean from EvaluATE’s participation records:

  • Number of attendees
  • Number of attendees from various types of organizations (such as two- and four-year colleges, nonprofits, government agencies, and international organizations)
  • Number and percentage of attendees who return for subsequent events
  • Geographic distribution of attendees
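To make the spreadsheet analysis concrete, here is a minimal pandas sketch that computes the data points listed above. The file name and column names (event, email, organization_type, state, attended) are hypothetical placeholders for whatever your sign-in sheets and webinar reports actually capture:

```python
import pandas as pd

# Hypothetical participation log compiled from sign-in sheets and webinar reports.
# Expected columns: event, date, name, email, organization_type, state, attended (True/False).
records = pd.read_csv("participation_records.csv")
attendees = records[records["attended"]]

# Number of attendees per event
print(attendees.groupby("event")["email"].nunique())

# Attendees by organization type (two-year college, four-year college, nonprofit, etc.)
print(attendees["organization_type"].value_counts())

# Number and percentage of attendees who returned for subsequent events
events_per_person = attendees.groupby("email")["event"].nunique()
returners = (events_per_person > 1).sum()
print(f"{returners} returning attendees ({returners / len(events_per_person):.0%})")

# Geographic distribution of attendees
print(attendees["state"].value_counts())
```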

Project documentation and participation data will be most helpful for process evaluation and accountability. You will still need cooperation from participants for outcome evaluation—and you should engage them early to garner their interest and support for evaluation efforts. Still, you may be surprised by how much valuable information you can get from these two sources—documentation of activities and participation records—with minimal effort.

Get creative about other data you can scavenge, such as institutional data that colleges already collect; website data, such as Google Analytics; and citation analytics for published articles.

Blog: Possible Selves: A Way to Assess Identity and Career Aspirations

Posted on September 14, 2016 in Blog

Professor of Psychology, Arkansas State University

Creative Commons License This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Children are often asked the question “What do you want to be when you grow up?” Many of us evaluate programs where the developers are hoping that participating in their program will change this answer. In this post, I’d like to suggest using “possible self” measures as a means of evaluating if a program changed attendees’ sense of identity and career aspirations.

What defines the term?

Possible selves are our representations of our future. We all think about what we ideally would like to become (the hoped-for possible self), things that we realistically expect to become (the expected possible self), and things that we are afraid of becoming (the feared-for possible self).[1][2] Possible selves can change many times over the lifespan and thus can be a useful measure to examine participants’ ideas about themselves in the future.

How can it be measured?

There are various ways to measure possible selves. One of the simplest is to use an open-ended measure that asks people to describe what they think will occur in the future. For example, we presented the following (adapted from Oyserman et al., 2006[2]) to youth participants in a science enrichment camp (funded by an NSF-ITEST grant to Arkansas State University):

Probably everyone thinks about what they are going to be like in the future. We usually think about the kinds of things that are going to happen to us and the kinds of people we might become.

  1. Please list some things that you most strongly hope will be true of you in the future.
  2. Please list some things that you think will most likely be true of you in the future.

The measure was used both before and after participating in the program. We purposely did not include a feared-for possible self, given the context of a summer camp.

What is the value-added?

Using this type of open-ended measure allows for participants’ own voices to be heard. Instead of imposing preconceived notions of what participants should “want” to do, it allows participants to tell us what is most important to them. We learned a great deal about participants’ world views, and their answers helped us to fine-tune programs to better serve their needs and to be responsive to our participants. Students’ answers focused on careers, but also included hoped-for personal ideals. For instance, European-American students were significantly more likely to mention school success than African-American students. Conversely, African-American students were significantly more likely to describe hoped-for positive social/emotional futures compared to European-American students. These results allowed program developers to gain a more nuanced understanding of the motivations driving participants. Although we regarded the multiple areas of focus as a strength of the measure, evaluators considering using a possible selves measure may also want to include more directed, follow-up questions.
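For evaluators who want to test whether coded response categories differ significantly between groups, as in the school-success comparison above, a chi-square test of independence is one straightforward option. This sketch uses invented counts, not the study’s actual data:

```python
from scipy.stats import chi2_contingency

# Invented counts of participants who did / did not mention school success, by group.
# Replace with your own tallies of coded open-ended responses.
#           mentioned  did not mention
table = [[18, 7],   # European-American students
         [9, 16]]   # African-American students

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi-square = {chi2:.2f}, p = {p:.3f}")
```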

For more information on how to assess possible selves, see Professor Daphna Oyserman’s website.

References

[1] Markus, H. R., & Nurius, P. (1986). Possible selves. American Psychologist, 41, 954–969.

[2] Oyserman, D., Bybee, D., & Terry, K. (2006). Possible selves and academic outcomes: How and when possible selves impel action. Journal of Personality and Social Psychology, 91, 188–204.

Blog: Six Data Cleaning Checks

Posted on September 1, 2016 in Blog

Research Associate, WestEd’s STEM Program

Creative Commons License This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Data cleaning is the process of verifying and editing data files to address issues of inconsistency and missing information. Errors in data files can appear at any stage of an evaluation, making it difficult to produce reliable data. Data cleaning is a critical step in program evaluation because clients rely on accurate results to inform decisions about their initiatives. Below are six essential steps I include in my data cleaning process to minimize issues during data analysis:

1. Compare the columns of your data file against the columns of your codebook.

Sometimes unexpected columns might appear in your data file or columns of data may be missing. Data collected from providers external to your evaluation team (e.g., school districts) might include sensitive participant information like social security numbers. Failures in software used to collect data can lead to responses not being recorded. For example, if a wireless connection is lost while a file is being downloaded, some information in that file might not appear in the downloaded copy. Unnecessary data columns should be removed before analysis and, if possible, missing data columns should be retrieved.
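If the codebook’s expected column names are stored in a file, a few lines of pandas can flag discrepancies automatically. The file and column names here are hypothetical:

```python
import pandas as pd

data = pd.read_csv("survey_data.csv")    # hypothetical data file
codebook = pd.read_csv("codebook.csv")   # hypothetical codebook with a "column_name" field

expected = set(codebook["column_name"])
actual = set(data.columns)

print("Unexpected columns (candidates for removal):", actual - expected)
print("Missing columns (to retrieve if possible):", expected - actual)
```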

2. Check your unique identifier column for duplicate values.

An identifier is a unique value used to label a participant and can take the form of a person’s full name or a number assigned by the evaluator. Multiple occurrences of the same identifier in a data file usually indicate an error. Duplicate identifier values can occur when participants complete an instrument more than once or when a participant identifier is mistakenly assigned to multiple records. If participants move between program sites, they might be asked to complete a survey for a second time. Administrators might record a participant’s identifier incorrectly, using a value assigned to another participant. Data collection software can malfunction and duplicate rows of records. Duplicate records should be identified and resolved.
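A quick way to surface duplicates, assuming a pandas data frame with a (hypothetical) participant_id column:

```python
import pandas as pd

data = pd.read_csv("survey_data.csv")  # hypothetical file; participant_id is the unique identifier

# keep=False flags every occurrence of a repeated identifier, not just the later ones,
# so each set of duplicates can be reviewed and resolved by hand.
dupes = data[data["participant_id"].duplicated(keep=False)]
print(dupes.sort_values("participant_id"))
```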

3. Transform categorical data into standard values.

Non-standard data values often appear in data gathered from external data providers. For example, school districts often provide student demographic information but vary in the categorical codes they use. The following table shows a range of values I received from different districts to represent students’ ethnicities:

[Table image: ethnicity values received from different districts]

To aid in reporting on participant ethnicities, I transformed these values into the race and ethnicity categories used by the National Center for Education Statistics.

When cleaning your own data, you should decide on standard values to use for categorical data, transform ambiguous data into a standard form, and store these values in a new data column.  OpenRefine is a free tool that facilitates data transformations.
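OpenRefine handles these transformations interactively; if you prefer to script them, a simple mapping in pandas works as well. The district codes and standard categories below are purely illustrative:

```python
import pandas as pd

data = pd.read_csv("district_records.csv")  # hypothetical file with a raw "ethnicity" column

# Map each district's raw code to a standard category (illustrative values only).
ethnicity_map = {
    "H": "Hispanic or Latino",
    "HISP": "Hispanic or Latino",
    "B": "Black or African American",
    "AA": "Black or African American",
    "W": "White",
    "WH": "White",
}

# Store standardized values in a new column and keep the originals for auditing.
data["ethnicity_std"] = data["ethnicity"].str.strip().str.upper().map(ethnicity_map)

# Any codes still unmapped need to be added to the dictionary.
print(data.loc[data["ethnicity_std"].isna(), "ethnicity"].unique())
```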

4. Check your data file for missing values.

Missing values occur when participants choose not to answer an item, are absent the day of administration, or skip an item due to survey logic. If missing values are found, apply a code to indicate the reason for the missing data point. For example, 888888 can indicate an instrument was not administered and 999999 can indicate a participant chose not to respond to an item. The use of codes can help data analysts determine how to handle the missing data. Analysts sometimes need to report on the frequency of missing data, use statistical methods to replace the missing data, or remove the missing data before analysis.
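A sketch of applying such codes in pandas, using the 888888/999999 conventions from the example above (file and column names are hypothetical):

```python
import pandas as pd

data = pd.read_csv("survey_data.csv")  # hypothetical file

NOT_ADMINISTERED = 888888   # instrument was not administered
NO_RESPONSE = 999999        # participant chose not to respond

survey_items = ["item_1", "item_2", "item_3"]

# Mark items from forms that were never administered to the participant.
not_administered = data["form_administered"] == "no"
data.loc[not_administered, survey_items] = NOT_ADMINISTERED

# Treat the remaining blanks as the participant choosing not to respond.
data[survey_items] = data[survey_items].fillna(NO_RESPONSE)
```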

5. Check your data file for extra or missing records.

Attrition and recruitment can occur at all stages of an evaluation. Sometimes people who are not participating in the evaluation are allowed to submit data. Check the number of records in your data file against the number of recruited participants for discrepancies. Tracking dates when participants join a project, leave a project, and complete instruments can facilitate this review.
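Comparing collected records against a recruitment roster is a quick set operation, assuming both files share a (hypothetical) participant_id column:

```python
import pandas as pd

data = pd.read_csv("survey_data.csv")         # hypothetical collected data
roster = pd.read_csv("recruited_roster.csv")  # hypothetical list of recruited participants

collected = set(data["participant_id"])
recruited = set(roster["participant_id"])

print("Submitted data but not on the roster:", collected - recruited)
print("Recruited but no data (possible attrition):", recruited - collected)
```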

6. Correct erroneous or inconsistent values.

When instruments are completed on paper, participants can enter unexpected values. Online tools may be configured incorrectly and allow illegal values to be submitted. Create a list of validation criteria for each data field and compare all values against this list. de Jonge and van der Loo provide a tutorial for checking invalid data using R.
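The tutorial referenced above uses R; the same idea in pandas, with illustrative validation criteria, looks roughly like this:

```python
import pandas as pd

data = pd.read_csv("survey_data.csv")  # hypothetical file

# Validation criteria per field (illustrative values only).
valid_grades = {"9", "10", "11", "12"}
invalid = {
    "grade_level": ~data["grade_level"].astype(str).isin(valid_grades),
    "pretest_score": ~data["pretest_score"].between(0, 100),
}

for field, mask in invalid.items():
    print(f"{field}: {mask.sum()} invalid value(s)")
    print(data.loc[mask, ["participant_id", field]])
```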

Data cleaning can be a time-consuming process. These checks can help reduce the time you spend on data cleaning and get results to your clients more quickly.

Blog: How Real-time Evaluation Can Increase the Utility of Evaluation Findings

Posted on July 21, 2016 in Blog
Creative Commons License This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Elizabeth Peery and Stephanie B. Wilkerson

Evaluations are most useful when evaluators make relevant findings available to project partners at key decision-making moments. One approach to increasing the utility of evaluation findings is by collecting real-time data and providing immediate feedback at crucial moments to foster progress monitoring during service delivery. Based on our experience evaluating multiple five-day professional learning institutes for an ATE project, we discovered the benefits of providing real-time evaluation feedback and the vital elements that contributed to the success of this approach.

What did we do?

With project partners we co-developed online daily surveys that aligned with the learning objectives for each day’s training session. Daily surveys measured the effectiveness and appropriateness of each session’s instructional delivery, exercises and hands-on activities, materials and resources, content delivery format, and session length. Participants also rated their level of understanding of the session content and preparedness to use the information. They could submit questions, offer suggestions for improvement, and share what they liked most and least. Based on the survey data that evaluators provided to project partners after each session, partners could monitor what was and wasn’t working and identify where participants needed reinforcement, clarification, or re-teaching. Project partners could make immediate changes and modifications to the remaining training sessions to address any identified issues or shortcomings before participants completed the training.
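For evaluators who want to turn the data around the same evening, a short script that summarizes each day's ratings can help. The export file and column names below are hypothetical; most online survey tools can produce a similar CSV:

```python
import pandas as pd

# Hypothetical export from the online survey tool for one day's sessions.
responses = pd.read_csv("day3_survey_export.csv")

rating_items = ["instructional_delivery", "hands_on_activities",
                "materials_resources", "understanding", "preparedness"]

# Mean rating and response count per item, for a quick same-evening summary.
print(responses[rating_items].agg(["mean", "count"]).round(2))

# Open-ended suggestions and questions, passed along verbatim.
for comment in responses["suggestions"].dropna():
    print("-", comment)
```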

Why was it successful?

Through the process, we recognized that there were a number of elements that made the daily surveys useful in immediately improving the professional learning sessions. These included the following:

  • Invested partners: The project partners recognized the value of the immediate feedback and its potential to greatly improve the trainings. Thus, they made a concentrated effort to use the information to make mid-training modifications.
  • Evaluator availability: Evaluators had to be available to pull the data after hours from the online survey software program and deliver it to project partners immediately.
  • Survey length and consistency: The daily surveys took less than 10 minutes to complete. While tailored to the content of each day, the surveys had a consistent question format that made them easier to complete.
  • Online format: The online format allowed for a streamlined and user-friendly survey. Additionally, it made retrieving a usable data summary much easier and timelier for the evaluators.
  • Time for administration: Time was carved out of the training sessions to allow for the surveys to be administered. This resulted in higher response rates and more predictable timing of data collection.

If real-time evaluation data will provide useful information that can help make improvements or decisions about professional learning trainings, it is worthwhile to seek resources and opportunities to collect and report this data in a timely manner.


Blog: Student Learning Assessments: Issues of Validity and Reliability

Posted on June 22, 2016 in Blog

Senior Educational Researcher, SRI International

Creative Commons License This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

In my last post, I talked about the difference between program evaluation and student assessment. I also touched on using existing assessments if they are available and appropriate, and if not, constructing new assessments. Of course, a new assessment would need to meet test quality standards; otherwise, it will not be able to measure what you need to have measured for your evaluation. Test quality has to do with validity and reliability.

When a test is valid, it means that when a student responds with a wrong answer, it would be reasonable to conclude that they did so because they did not learn what they were supposed to have learned. There are all kinds of impediments to an assessment’s validity. For example, if in a science class you are asking students a question aimed at determining if they understand the difference between igneous and sedimentary rocks, yet you know that some of them do not understand English, you wouldn’t want to ask them the question in English. In testing jargon, what you are introducing in such a situation is “construct irrelevant variance.” In this case, the variance in results may be as much due to whether they know English (the construct irrelevant part) as to whether they know the construct, which is the differences between the rock types. Hence, these results would not help you determine if your innovation is helping them learn the science better.

Reliability has to do with test design, administration, and scoring. Examples of unreliable tests are those that are too long, introducing test-taking fatigue that interferes with their being reliable measures of student learning. Another common example of unreliability is when the scoring directions or rubric are not clear enough about how to judge the quality of an answer. This type of problem will often result in inconsistent scoring, otherwise known as low interrater reliability.
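One common way to quantify interrater reliability for rubric-scored items is Cohen's kappa. A minimal sketch, assuming two raters have scored the same set of responses (the scores below are invented):

```python
from sklearn.metrics import cohen_kappa_score

# Invented rubric scores (0-3) assigned by two raters to the same ten responses.
rater_a = [3, 2, 2, 1, 0, 3, 2, 1, 1, 2]
rater_b = [3, 2, 1, 1, 0, 3, 3, 1, 2, 2]

kappa = cohen_kappa_score(rater_a, rater_b)
print(f"Cohen's kappa = {kappa:.2f}")  # values near 1 indicate strong agreement
```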

To summarize, a student learning assessment can be very important to your evaluation if a goal of your project is to directly impact student learning. Then you have to make some decisions about whether you can use existing assessments or develop new ones, and if you make new ones, they need to meet technical quality standards of validity and reliability. For projects not directly aiming at improving student learning, an assessment may actually be inappropriate in the evaluation because the tie between the project activities and the student learning may be too loose. In other words, the learning outcomes may be mediated by other factors that are too far beyond your control to render the learning outcomes useful for the evaluation.

Blog: Project Data for Evaluation: Google Groups Project Team Feedback

Posted on February 11, 2016 in Blog

President and CEO, Censeo Group

Creative Commons License This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

At Censeo Group, a program evaluation firm located in northeast Ohio, we are evaluating a number of STEM projects and often face the challenge of how to collect valid and reliable data about the impact of curriculum implementation: what implementation looks like, students’ perceptions of the program, project leaders’ comfort with lessons, and the extent to which students find project activities engaging and beneficial.

We use various methods to gather curriculum implementation data. Observations offer a glimpse into how faculty deliver new curriculum materials and how students interact and react to those materials, but are time-intensive and require clear observation goals and tools. Feedback surveys offer students and staff the opportunity to provide responses that support improvement or provide a summative analysis of the implementation, but not everyone responds and some responses may be superficial. During a recent project, we were able to use an ongoing, rich, genuine, and helpful source of project information for the purpose of evaluation.

Google Groups Blog

Project leaders created a Google Group and invited all project staff and the evaluation team to join with the following message:

“Welcome to our Google Group! This will be a format for sharing updates from interventions and our sites each week. Thanks for joining in our discussion!”

The team chose Google Groups because everybody was comfortable with the environment, and it is free, easy to use and easy to access.

Organizing the Posts

Project leaders created a prompt each week, asking staff to “Post experiences from Week X below.” This chronological method of organization kept each week’s feedback clustered. However, a different organizing principle could be used, for example, curriculum unit or school.

In the case of this Google Group, the simple prompt resonated well with project staff, who wrote descriptive and reflective entries. Graduate students, who were delivering a new curriculum to high school students, offered colleagues who would be teaching the content later in the week recommendations about how to organize instruction, engage students, manage technology, or address questions that came up during their lessons. Graduate students also referred to each other’s posts, indicating that this interactive method of project communication was useful and helpful for them as they worked in the schools, for example, in organizing materials or modifying lessons based on available time or student interest.

Capturing and Analyzing the Data 

The evaluation team used NVIVO’s NCapture, a Web browser add-on for NVIVO qualitative data analysis software that allows the blog posts to be quickly imported into the software for analysis. Once in NVIVO, the team coded the data to analyze the successes and challenges of using the new curriculum in the high schools.

Genuine and Ongoing Data

The project team is now implementing the curriculum for the second time with a new group of students. Staff members are posting weekly feedback about this second implementation. This ongoing use of the Google Group blog will allow the evaluation team to analyze and compare implementation by semester (Fall 2015 versus Spring 2016), by staff type (to reveal changes in graduate students’ skills and experience), by school, and by other relevant categories.
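The coding itself happens in NVIVO; once coded excerpts are exported to a spreadsheet, comparisons like those described above can be tabulated with a few lines of code. This sketch assumes a hypothetical export with one row per coded excerpt:

```python
import pandas as pd

# Hypothetical export of coded excerpts: one row per excerpt, with the code applied
# ("success" or "challenge") plus metadata about the post it came from.
excerpts = pd.read_csv("coded_excerpts.csv")

# Success vs. challenge codes by semester
print(pd.crosstab(excerpts["semester"], excerpts["code"]))

# The same comparison by staff type or by school
print(pd.crosstab(excerpts["staff_type"], excerpts["code"]))
print(pd.crosstab(excerpts["school"], excerpts["code"]))
```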

From a strictly data management perspective, a weekly survey of project staff using a tool such as Google Forms or an online survey system, from which data could be transferred directly into a spreadsheet, likely would have been easier to manage and analyze. However, the richness of the data that the Google Groups entries generated was well worth the trade-off of the extra time required to capture and upload each post. Rather than giving staff an added “evaluation” activity that was removed from the work of the project, and to which likely not all staff would have responded as enthusiastically, these blog posts provided evaluation staff with a glimpse into real-time, genuine staff communication and classroom implementation challenges and successes. The ongoing feedback about students’ reactions to specific activities supported project implementation by helping PIs understand which materials needed to be enhanced to support students of different skill levels as the curriculum was being delivered. The blog posts also provided insights into the graduate students’ comfort with the curriculum materials and highlighted the need for additional training for them about specific STEM careers. The blog allowed PIs to quickly make changes during the semester and provided the evaluation team with information about how the curriculum was being implemented and how changes affected the project over the course of the semester.

You can find additional information about NVIVO here: http://www.qsrinternational.com/product. The site includes training resources and videos about NVIVO.

You can learn how to create and use a Google Group at the Google Groups Help Center:
https://support.google.com/groups/?hl=en#topic=9216

Blog: Strategic Knowledge Mapping: A New Tool for Visualizing and Using Evaluation Findings in STEM

Posted on January 6, 2016 in Blog

Director of Research and Evaluation, Meaningful Evidence, LLC

Creative Commons License This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

A challenge to designing effective STEM programs is that they address very large, complex goals, such as increasing the numbers of underrepresented students in advanced technology fields.

To design the best possible programs to address such a large, complex goal, we need a correspondingly large, complex understanding that comes from looking at the big picture. It’s like when medical researchers seek to develop a new cure: they need a deep understanding of how medications interact with the body and with other medications, and how they will affect a patient based on age and medical history.

A new method, Integrative Propositional Analysis (IPA), lets us visualize and assess information gained from evaluations. (For details, see our white papers.) At the 2015 American Evaluation Association conference, we demonstrated how to use the method to integrate findings from the PAC-Involved (Physics, Astronomy, Cosmology) evaluation into a strategic knowledge map. (View the interactive map.)

A strategic knowledge map supports program design and evaluation in many ways.

Measures understanding gained.
The map is an alternative logic model format that provides broader and deeper understanding than usual logic model approaches. Unlike other modeling techniques, IPA lets us quantitatively assess information gained. Results showed that the new map incorporating findings from the PAC-Involved evaluation had much greater breadth and depth than the original logic model. This indicates increased understanding of the program, its operating environment, how they work together, and options for action.

Graphic 1

Shows what parts of our program model (map) are better understood.
In the figure below, the yellow shadow around the concept “Attendance/attrition challenges” indicates that this concept is better understood. We better understand something when it has multiple causal arrows pointing to it—like when we have a map that shows multiple roads leading to each destination.

Graphic 2

Shows what parts of the map are most evidence supported.
We have more confidence in causal links that are supported by data from multiple sources. The thick arrow below shows a relationship that many sources of evaluation data supported. All five evaluation data sources (project team interviews, the student focus group, a review of student reflective journals, observation, and student surveys) provided evidence that more experiments/demos/hands-on activities caused students to be more engaged in PAC-Involved.

Graphic 3

Shows the invisible.
The map also helps us to “see the invisible.” If something does not have arrows pointing to it, we know that there is “something” that should be added to the map. This indicates that more research is needed to fill those “blank spots on the map” and improve our model.

Graphic 4
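As a rough illustration of the underlying structure (not the IPA software itself), a causal map can be represented as a directed graph whose edges carry an evidence count. The concepts and counts below are invented:

```python
import networkx as nx

# Toy causal map: nodes are concepts, directed edges are causal claims, and the
# "evidence" attribute counts how many data sources supported each claim.
cmap = nx.DiGraph()
cmap.add_edge("More experiments/demos/hands-on activities", "Student engagement", evidence=5)
cmap.add_edge("Student engagement", "Attendance/attrition challenges", evidence=2)
cmap.add_edge("Transportation issues", "Attendance/attrition challenges", evidence=1)

# A concept is better understood when more causal arrows point to it.
for concept in cmap.nodes:
    print(concept, "- incoming causal links:", cmap.in_degree(concept))

# Concepts with no incoming arrows are the "blank spots" that need more research.
print("Needs more research:", [n for n in cmap.nodes if cmap.in_degree(n) == 0])
```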

Supports collaboration.
The integrated map can support collaboration among the project team. We can zoom in to look at what parts are relevant for action.

Graphic 5

Supports strategic planning.
The integrated map also supports strategic planning. Solid arrows leading to our goals indicate things that help. Dotted lines show the challenges.

Graphic 6

Clarifies short-term and long-term outcomes.
We can create customized map views to show concepts of interest, such as outcomes for students and connections between the outcomes.

Graphic 7

We encourage you to add a Strategic Knowledge Map to your next evaluation. The evaluation team, project staff, students, and stakeholders will benefit tremendously.

Blog: Using Embedded Assessment to Understand Science Skills

Posted on August 5, 2015 in Blog
Creative Commons License This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Cathlyn Stylinski, Senior Agent, University of Maryland Center for Environmental Science
Karen Peterman, President, Karen Peterman Consulting
Rachel Becker-Klein, Senior Research Associate, PEER Associates

As our field explores the impact of informal (and formal) science programs on learning and skill development, it is imperative that we integrate research and evaluation methods into the fabric of the programs being studied. Embedded assessments (EAs) are “opportunities to assess participant progress and performance that are integrated into instructional materials and are virtually indistinguishable from day-to-day [program] activities” (Wilson & Sloane, 2000, p. 182). As such, EAs allow learners to demonstrate their science competencies through tasks that are integrated seamlessly into the learning experience itself.

Since they require that participants demonstrate their skills, rather than simply rate their confidence in using them, EAs offer an innovative way to understand and advance the evidence base for knowledge about the impacts of informal science programs. EAs can take on many forms and can be used in a variety of settings. The essential defining feature is that these assessments document and measure participant learning as a natural component of the program implementation and often as participants apply or demonstrate what they are learning.

Related concepts that you may have heard of:

  • Performance assessments: EA methods can include performance assessments, in which participants do something to demonstrate their knowledge and skills (e.g., scientific observation).
  • Authentic assessments: Authentic assessments are assessments of skills where the learning tasks mirror real-life problem-solving situations (e.g., the specific data collection techniques used in a project) and could be embedded into project activities. (Rural School and Community Trust, 2001; Wilson & Sloane, 2000)

You can use EAs to measure participants’ abilities alongside more traditional research and evaluation measures and also to measure skills across time. So, along with surveys of content knowledge and confidence in a skill area, you might consider adding experiential and hands-on ways of assessing participant skills. For instance, if you were interested in assessing participants’ skills in observation, you might already be asking them to make some observations as a part of your program activities. You could then develop and use a rubric to assess the depth of that observation.
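As a sketch of what such a rubric might look like in practice (the criteria and point levels below are invented, not drawn from any particular project), observation depth could be scored along a few dimensions and totaled:

```python
# Invented rubric for scoring the depth of a participant's written scientific observation.
# Each criterion is scored 0-2; the total gives a rough depth score.
rubric = {
    "uses_descriptive_detail": "0 = none, 1 = some, 2 = rich, specific detail",
    "distinguishes_observation_from_inference": "0 = conflates the two, 1 = mostly distinct, 2 = consistently distinct",
    "notes_quantities_or_measurements": "0 = none, 1 = approximate, 2 = precise",
}

def score_observation(scores: dict) -> int:
    """Sum criterion scores, checking that every rubric criterion was rated."""
    assert set(scores) == set(rubric), "rate every criterion"
    return sum(scores.values())

example = {
    "uses_descriptive_detail": 2,
    "distinguishes_observation_from_inference": 1,
    "notes_quantities_or_measurements": 1,
}
print(score_observation(example), "out of", 2 * len(rubric))
```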

Although EA offers many benefits, the method also poses some significant challenges that have prevented widespread adoption to date. For the application of EA to be successful, there are two significant challenges to address: (1) the need for a standard EA development process that includes reliability and validity testing and (2) the need for professional development related to EA.

With these benefits and challenges in mind, we encourage project leaders, evaluators, and researchers to help us to push the envelope by:

  • Thinking critically about the inquiry skills fostered by their informal science projects and ensuring that those skills are measured as part of the evaluation and research plans.
  • Considering whether projects include practices that could be used as an EA of skill development and, if so, taking advantage of those systems for evaluation and research purposes.
  • Developing authentic methods that address the complexities of measuring skill development.
  • Sharing these experiences broadly with the community in an effort to highlight the valuable role that such projects can play in engaging the public with science.

We are currently working on a National Science Foundation grant (Embedded Assessment for Citizen Science – EA4CS) that is investigating the effectiveness of embedded assessment as a method to capture participant gains in science and other skills. We are conducting a needs assessment and working on creating embedded assessments at each of three different case study sites. Look for updates on our progress and additional blogs over the next year or so.

Rural School and Community Trust (2001). Assessing Student Work. Available from http://www.ruraledu.org/user_uploads/file/Assessing_Student_Work.pdf

Wilson, M., & Sloane, K. (2000). From principles to practice: An embedded assessment system. Applied Measurement in Education, 13(2), 181-208. Available from http://dx.doi.org/10.1207/S15324818AME1302_4