Daniel R. Zalles

Senior Educational Researcher, SRI International

Dr. Daniel R. Zalles is a Senior Educational Researcher at SRI International. He has a long history of evaluating STEM innovation products and of leading the research and development of technology innovations that advance student and teacher understanding of geoscience topics and contemporary environmental challenges. He has served as principal investigator for projects funded by NASA and NSF, and he has evaluated innovations in math teacher professional development, data and survey literacy education in formal and informal settings, and universal design for learning on science topics. For more information about Dr. Zalles and his projects, go to sesis.sri.com.

Blog: Student Learning Assessments: Issues of Validity and Reliability

Posted on June 22, 2016 in Blog


This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

In my last post, I talked about the difference between program evaluation and student assessment. I also touched on using existing assessments when they are available and appropriate, and constructing new assessments when they are not. Of course, a new assessment needs to meet test quality standards; otherwise, it will not be able to measure what you need measured for your evaluation. Test quality has to do with validity and reliability.

When a test is valid, it means that when a student responds with a wrong answer, it would be reasonable to conclude that they did so because they did not learn what they were supposed to have learned. There are all kinds of impediments to an assessment’s validity. For example, if in a science class you are asking students a question aimed at determining if they understand the difference between igneous and sedimentary rocks, yet you know that some of them do not understand English, you wouldn’t want to ask them the question in English. In testing jargon, what you are introducing in such a situation is “construct irrelevant variance.” In this case, the variance in results may be as much due to whether they know English (the construct irrelevant part) as to whether they know the construct, which is the differences between the rock types. Hence, these results would not help you determine if your innovation is helping them learn the science better.

Reliability has to do with test design, administration, and scoring. One example of an unreliable test is one that is too long, introducing test-taking fatigue that interferes with its being a reliable measure of student learning. Another common source of unreliability is scoring directions or a rubric that are not designed well enough to make clear how to judge the quality of an answer. This type of problem will often result in inconsistent scoring, otherwise known as low interrater reliability.
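One common way to quantify interrater reliability is Cohen's kappa, which corrects raw agreement between two raters for the agreement they would reach by chance alone. A minimal sketch in Python (the rubric scores below are hypothetical, purely for illustration):

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Agreement between two raters' scores, corrected for chance agreement."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    # Observed agreement: fraction of responses both raters scored identically.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected chance agreement, from each rater's marginal score frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[s] * freq_b[s] for s in freq_a) / n ** 2
    return (observed - expected) / (1 - expected)

# Two raters scoring the same ten open-ended answers on a 1-3 rubric.
a = [1, 2, 2, 3, 1, 2, 3, 3, 2, 1]
b = [1, 2, 3, 3, 1, 2, 3, 2, 2, 1]
print(round(cohens_kappa(a, b), 2))  # kappa well below 1.0 flags rater disagreement
```

A kappa near 1.0 indicates strong interrater reliability; values much lower suggest the rubric needs clearer scoring directions or the raters need calibration training.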

To summarize, a student learning assessment can be very important to your evaluation if a goal of your project is to directly impact student learning. Then you have to make some decisions about whether you can use existing assessments or develop new ones, and if you make new ones, they need to meet technical quality standards of validity and reliability. For projects not directly aiming at improving student learning, an assessment may actually be inappropriate in the evaluation because the tie between the project activities and the student learning may be too loose. In other words, the learning outcomes may be mediated by other factors that are too far beyond your control to render the learning outcomes useful for the evaluation.

Blog: Using Learning Assessments in Evaluations

Posted on June 8, 2016 in Blog



If you need to evaluate your project but get confused about whether there should be a student assessment component, it's important to understand the difference between project evaluation and student assessment. Both involve rendering judgments about something, and the terms are often used interchangeably. In the world of grant-funded education projects, however, they have overlapping yet quite different meanings. When you do a project evaluation, you are looking at the extent to which the project is meeting its goals and achieving its intended outcomes. When you are assessing, you are looking at student progress toward learning goals. The most commonly used assessment instrument is a test, but there are other mechanisms for assessing learning as well, such as student reports, presentations, or journals.

Not all project evaluations require student assessments, and not all assessments are components of project evaluations. For example, the goal of a project may be to restructure an academic program, introduce some technology to the classroom, or get students to persist through college. Of course, in the end, all projects in education aim to improve learning. Yet, by itself, an individual project may not aspire to influence learning directly, but rather through a related effort. Conversely, assessments are most frequently used to determine the academic progress of individual students, not to evaluate projects.

If you are going to put a student assessment component in your evaluation, answer these questions:

  1. What amount of assessment data will you need to properly generalize from your results about how well your project is faring? For example, how many students are impacted by your program? Do you need to assess them all or can you limit your assessment administration to a representative sample?
  2. Should you administer the assessment early enough to determine if the project needs to be modified midstream? This would be called a formative assessment, as opposed to a summative assessment, which you would do at the end of a project, after you have fully implemented your innovation with the students.
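On the first question, if assessing every impacted student is impractical, a representative random sample can still support generalization. A minimal sketch in Python (the roster names, sample fraction, and seed are illustrative assumptions, not a prescription):

```python
import random

def sample_students(roster, fraction=0.25, seed=42):
    """Draw a simple random sample of students to sit the assessment."""
    rng = random.Random(seed)  # fixed seed makes the draw reproducible for audit
    k = max(1, round(len(roster) * fraction))
    return rng.sample(roster, k)

# Hypothetical roster of 200 impacted students; assess a quarter of them.
roster = [f"student_{i:03d}" for i in range(200)]
assessed = sample_students(roster)
print(len(assessed))  # 50 students drawn at random from the 200
```

If subgroups matter to the evaluation (e.g., by grade level or school), a stratified draw within each subgroup would be the safer design choice than one simple random sample.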

Think also about what would be an appropriate assessment instrument. Maybe you could simply use a test that the school is already using with the students. This would make sense, for example, if your goal is to introduce curricular innovations into a course the students are already taking. In that case, those assessments have likely already been validated, meaning they were piloted and then modified as needed to ensure that they truly measure what they are designed to measure.

An existing assessment instrument may not be appropriate for you, however. Perhaps your innovation introduces new learnings that those tests are not designed to measure. For example, it may be facilitating students' learning of new skills, such as using mobile technologies to collect field data. In this situation, you would want your project's goal statements to be clear about whether the intention is to provide an improved pathway to already-taught knowledge or skills, a pathway to entirely new learnings, or both. New learnings would require a new assessment. In my next post, I'll talk about validity and reliability issues to address when developing assessments.

Blog: Developing a Theory of Change for Your Innovation Project

Posted on September 23, 2015 in Blog



Have you ever had to express a theory of change to convince funders that there’s a logical argument behind your innovation? Funders expect that data will be collected to evaluate whether innovations are meeting their goals. The goals should be tied to the innovation’s theory of change.

This type of thinking is very abstract, so the more you apply the abstractions to concrete situations, the clearer it will be. A useful exercise is to complete some sentences about your STEM education innovation. But first, try the same exercise for something that’s less intellectually challenging. In the table below, Columns C and D show examples of both. The sentence prompts are in Column B.

[Table (Zalles Chart): sentence prompts in Column B, with completed examples for an everyday scenario and a STEM education innovation in Columns C and D]

Of course, theories of change and the evaluation strategies that arise from them can get very complicated quickly. For example, let’s say you’re writing a “proposal” to your best friend, who happens to like your kids and is ready to pay for their ice cream if he’s convinced that it will make them happy. Yet, your friend is the skeptical type. He argues, “Why just analyze what they say twice? Wouldn’t it help if you also did it the day after to see if they revert back to their usual remarks? Also, what if the kids simply decide to say fewer things after they eat the ice cream? How would you interpret fewer statements rather than nicer statements? Wouldn’t it be better to simply ask them to report how happy they are on a scale or tell them that they have to make at least four remarks or they’ll only get one scoop next time?”

In the second case provided in the table, let's say your friend expresses concern about the random assignment. "What if the students won't like being randomly assigned?" he says. Are there other ways you could get legitimate comparison data from which to draw conclusions? How about getting a participant group by asking for volunteers, then using a design-task pretest to see how they match up with those who didn't volunteer? Then, when analyzing the post-task, you could limit your comparisons to participants and nonparticipants whose pretest designs were of about the same quality.
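That pre-task matching idea can be sketched in code. A minimal illustration, assuming each student record is a hypothetical (pre, post) score pair on a shared 0-10 design rubric; a real evaluation would use a principled matching method (e.g., propensity scores) rather than this greedy pairing:

```python
def matched_comparison(participants, nonparticipants, tolerance=1.0):
    """Compare mean post-task scores only between pre-task-matched students.

    Each record is a (pre_score, post_score) pair. For every participant we
    look for an unused nonparticipant whose pre-task score falls within
    `tolerance`, so both groups start from roughly the same design quality.
    """
    available = list(nonparticipants)
    matched_p, matched_n = [], []
    for pre_p, post_p in participants:
        for record in available:
            pre_n, post_n = record
            if abs(pre_p - pre_n) <= tolerance:
                matched_p.append(post_p)
                matched_n.append(post_n)
                available.remove(record)  # each comparison student used once
                break

    def mean(scores):
        return sum(scores) / len(scores)

    return mean(matched_p), mean(matched_n)

# Hypothetical (pre, post) scores: the third volunteer has no close match
# among nonparticipants, so that student drops out of the comparison.
volunteers = [(4, 7), (6, 8), (9, 9)]
others = [(4, 5), (7, 7), (2, 3)]
print(matched_comparison(volunteers, others))  # compares 2 matched pairs
```

Limiting the comparison this way trades sample size for a fairer contrast: unmatched students (like the high-scoring third volunteer) are excluded rather than allowed to bias the group means.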

In closing, it’s helpful to think of theory of change generation as an exercise in reflecting on and expressing the logic behind your innovation’s value. But first, do it around something ordinary, like getting more exercise, and be prepared for those tough questions!