For some reason I have become obsessed with reading about Race to the Top, and how the present U.S. Department of Education will use these funds to reform education. As with most large-scale efforts like this one, achievement testing has become a central aspect of any program, project, or effort proposed at the state or local education agency (LEA) level.
One of the core concepts is that the Department wants to use student achievement test scores and results to evaluate the effectiveness of individual teachers, administrators, and schools. Aside from irking most teachers around the country, the idea is not supported by scientific research.
Rick Biche commented on a recent post and pointed me to a “letter” written by the Board on Testing and Assessment (BOTA) of the National Research Council. Thank you, Rick. I’ve spent time reading the letter and thinking about its implications for Race to the Top administrators and teachers. I also followed the link to your website, and I’ve enjoyed exploring it.
Now, the letter in question is this one. It is entitled Letter Report to the U.S. Department of Education on the Race to the Top Fund, and as I said, it was authored by a committee of the Board on Testing and Assessment. It’s important to keep in mind that the purpose of BOTA is to raise questions about, and provide guidance for judging, the technical qualities of tests and assessments and the intended and unintended consequences of their use.
The letter will not make the top administrators of The Race to the Top Fund happy.
The Race to the Top Fund will require that states use achievement tests to measure the “growth” of students, and use this kind of data to assess teacher performance. As most of us would agree, tests do play an important role in evaluating programs, innovations, and projects, but as the BOTA report says, an adequate evaluation calls for more than tests alone. In fact, most evaluations “collect data” throughout the course of a project or, in this case, an entire course taught by an individual teacher. These evaluations would include both qualitative and quantitative data. The Race to the Top administrators want to use a single sit-down test as a measure of student academic performance and, within 72 hours, provide the feedback necessary to evaluate the teacher, administrator, or school. They’ve got to be kidding.
In this approach, the Department is trying to use a test as a way to isolate the performance impact of a teacher or administrator. Here is what the BOTA letter says about this idea:
Prominent testing expert Robert Linn concluded in his workshop paper: “As with any effort to isolate causal effects from observational data when random assignment is not feasible, there are reasons to question the ability of value-added methods to achieve the goal of determining the value added by a particular teacher, school, or educational program” (Linn, 2008, p. 3). Teachers are not assigned randomly to schools, and students are not assigned randomly to teachers. Without a way to account for important unobservable differences across students, VAM techniques fail to control fully for those differences and are therefore unable to provide objective comparisons between teachers who work with different populations. As a result, value-added scores that are attributed to a teacher or principal may be affected by other factors, such as student motivation and parental support.
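To see why non-random assignment matters, here is a minimal, hypothetical simulation (my own illustration, not from the letter). It imagines two teachers with identical true effectiveness, where one teacher happens to receive more motivated students. A naive value-added comparison of average score growth mistakes the unobserved motivation difference for a teacher effect:

```python
import random

random.seed(0)

# Hypothetical scenario: both teachers have the SAME true effect on
# student score growth, but students are not assigned randomly.
TRUE_TEACHER_EFFECT = 5.0

def simulate_growth(teacher):
    # Unobserved motivation: teacher A's students are, on average,
    # about 10 points more motivated than teacher B's.
    motivation = random.gauss(10 if teacher == "A" else 0, 3)
    # Score growth depends on the teacher AND on unobserved motivation.
    return TRUE_TEACHER_EFFECT + motivation + random.gauss(0, 2)

gains_a = [simulate_growth("A") for _ in range(200)]
gains_b = [simulate_growth("B") for _ in range(200)]

# Naive "value-added" estimate: mean score growth per teacher.
va_a = sum(gains_a) / len(gains_a)
va_b = sum(gains_b) / len(gains_b)

print(f"Estimated value-added, teacher A: {va_a:.1f}")
print(f"Estimated value-added, teacher B: {va_b:.1f}")
# Teacher A's estimate is inflated by roughly 10 points of unobserved
# student motivation, even though both teachers are equally effective.
```

This is, of course, a toy sketch; real VAM models condition on prior test scores and other covariates, but the letter’s point stands: whatever remains unobserved and correlated with assignment ends up attributed to the teacher.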
The BOTA letter also raises issues about using large scale, high-stakes, summative tests as a way to provide feedback on teaching and learning. To wit:
Tests that mimic the structure of large-scale, high-stakes, summative tests, which lightly sample broad domains of content taught over an extended period of time, are unlikely to provide the kind of fine-grained, diagnostic information that teachers need to guide their day-to-day instructional decisions. In addition, an attempt to use such tests to guide instruction encourages a narrow focus on the skills used in a particular test—“teaching to the test”—that can severely restrict instruction. Some topics and types of performance are more difficult to assess with large-scale, high-stakes, summative tests, including the kind of extended reasoning and problem-solving tasks that show that a student is able to apply concepts from a domain in a meaningful way. The use of high-stakes tests already leads to concerns about narrowing the curriculum towards the knowledge and skills that are easy to assess on such tests; it is critical that the choice of assessments for use in instructional improvement systems not reinforce the same kind of narrowing.
And finally, BOTA raised questions about the feasibility and soundness of using “common assessments” to make comparisons across states in the same way that NAEP currently does. As the letter points out, there simply are too many variables that can never be controlled to allow administrators to make comparisons across states and, I would add, across school districts within a state. One other point here: the U.S. Department of Education wants assessments that incorporate “international benchmarking.” Hold on, there!
Well, what do you think about this? Do you think the U.S. Department of Education will listen to the comments made by the Board on Testing and Assessment of the National Research Council? I hope they do. But I am not holding my breath. What do you think?