Post 2. Read Post 1 here.
This post was published on Anthony Cody’s blog, Living in Dialogue.
Practicing teachers, clinical professors, and researchers who work in the field know that assessing teachers or students requires much more than simply looking at test scores. Indeed, researchers who have examined the value-added assessment system, which purports to measure the “teacher effect” on student achievement test scores, question its validity and, more importantly, its reliability.
The Data Used to Make High-Stakes Decisions on Teachers and Students
Value Added Effect
For example, Terry Hibpshman, of the Kentucky Education Professional Standards Board, did an in-depth review of value-added models and concluded that even though VAM has been implemented in some locations (Tennessee and Dallas), the methodologies “should not be considered mature or well-formed at this point in its history.” Dr. Hibpshman goes on to explain that VAM models are, by their very nature, extremely complex, and that people who do not understand their statistical nature are quick to make policy decisions without understanding the models’ limitations.
That said, the U.S. Department of Education (ED) has figured out a way to mandate linking student achievement test scores to teacher assessment using VAM. If one reads the details of the NCLB Waivers, states must implement teacher and administrator evaluation that is tied in some way to student progress on high-stakes achievement tests. This was initially a requirement for states receiving Race to the Top funds. The Secretary of Education found a way to hold all states accountable for using VAM, because he knew most states were chomping at the bit to loosen the hold the U.S. Department of Education had on them through the nutty No Child Left Behind Act. Now, any state getting an NCLB Waiver will have to use VAM as part of its assessment of teachers and administrators.
Now we have created a situation where states will use a system (VAM) that has not been shown to be scientifically valid or reliable, feeding it high-stakes test scores on the assumption that the results tell us not only what students have learned in the course being tested, but also how much the teacher contributed (value added) to student progress.
Willis D. Hawley and Jacqueline Jordan Irvine explain why students’ cultural identities are integral to “measuring” teacher effectiveness.
Drs. Hawley and Irvine believe that the practices that teachers use should be part of any teacher assessment system. Teaching practices, to be used in teacher assessment, need to be observed, or need to be described by teachers themselves. In particular, the authors suggest that there are teaching practices called “culturally responsive pedagogy” (CRP), and that these need to be included in any “high-stakes teaching evaluation.”
As Hawley and Irvine point out, culturally responsive teachers:
- understand that all students, regardless of race or ethnicity, bring their culturally influenced cognition, behavior, and dispositions to school.
- understand how semantics, accents, dialect, and discussion modes affect face-to-face interactions.
- know how to adapt and employ multiple representations of subject-matter knowledge using students’ everyday lived experiences.
Hawley and Irvine identify six examples of CRP that, taken individually, can make a huge difference in accounting for racial and ethnic effects on student learning. These practices are not new, but they reflect a more indirect approach to teaching and learning, and in all cases, the nature of the students is seen as fundamental in teaching. Highly effective teachers use practices such as these, and they should be an integral part of the assessment of teachers.
- Learning from family and community engagement
- Developing caring relationships with students
- Engaging and motivating students
- Assessing student performance
- Grouping students for instruction
- Selecting and effectively using learning resources
Finally, we should add that the Board on Testing and Assessment (BOTA) of The National Academies issued a letter to the Department of Education on the Race to the Top Fund (RTTT). The essence of the letter was a critique of the RTTT Fund’s insistence on linking student test scores to teacher effectiveness. In the letter, the BOTA had this to say:
The initiative should support research based on data that links student test scores with their teachers, but should not prematurely promote the use of value-added approaches, which evaluate teachers based on gains in their students’ performance, to reward or punish teachers.
Achievement Test Scores
Do they measure what students have learned in a course of study?
Are achievement tests that are used as high-stakes assessments at the end of the school year a valid measure of the curriculum standards specific to each teacher’s classroom, or are they estimates of what the curriculum should be, and estimates of what students should learn?
High-stakes test scores that are reported for students, schools, and districts are far from the reality of what students do and should learn. We’ve been fooled into believing that test scores are valid measures of student performance. Let’s look into this claim.
Let’s say we want to design a high-stakes test for mathematics for 8th graders in the state of Georgia. The first item of business is to check the Georgia mathematics standards for grade 8. According to the Georgia Department of Education, 8th graders:
will understand various numerical representations, including square roots, exponents and scientific notation; use and apply geometric properties of plane figures, including congruence and the Pythagorean theorem; use symbolic algebra to represent situations and solve problems, especially those that involve linear relationships; solve linear equations, systems of linear equations and inequalities; use equations, tables and graphs to analyze and interpret linear functions; use and understand set theory and simple counting techniques; determine the theoretical probability of simple events; and make inferences from statistical data, particularly data that can be modeled by linear functions.
Please note this is only a summary of the 8th grade math standards!
There are 98 standards for 8th grade. You can only create a test that a student can take in 1.5 to 3 hours. We have to worry about student test stamina. How long can an 8th grader sit, for goodness’ sake? Let’s say that we design a test with 75 items for the 98 standards. First we note that not all of the curriculum can be “covered” in a single test, so the test makers must decide which standards not to test: with 75 items, at least 23 of the 98 standards go untested. And even for the standards we do “test,” we can allocate only about one item per standard, and then we use the results to claim that we have measured what a student has learned in 8th grade mathematics. All of this is done in one day, when in fact the student was enrolled in a course with at least 170 days of instruction.
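The arithmetic behind this example can be sketched in a few lines of Python. The 98 standards, 75 items, and 1.5 to 3 hour testing window come from the paragraph above; the function name `coverage_summary` and the simplifying assumption that each item addresses exactly one standard are illustrative only, not a description of how any real state test is built.

```python
# Figures taken from the example in the text.
STANDARDS = 98            # Georgia 8th grade math standards, as counted in the post
ITEMS = 75                # hypothetical number of test items
TEST_MINUTES = (90, 180)  # 1.5 to 3 hours of testing time


def coverage_summary(standards, items, test_minutes):
    """Best-case coverage, assuming each item addresses exactly one standard."""
    covered = min(standards, items)
    return {
        "max_standards_covered": covered,
        "min_standards_untested": standards - covered,
        "max_coverage_fraction": covered / standards,
        # Average time a student has per item at each end of the testing window.
        "minutes_per_item": tuple(m / items for m in test_minutes),
    }


summary = coverage_summary(STANDARDS, ITEMS, TEST_MINUTES)
for name, value in summary.items():
    print(name, value)
```

Under these assumptions, about a quarter of the standards cannot appear on the test at all, and each standard that does appear is judged on a single item answered in a minute or two.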
The Long and Short of It All
We’ve created a standards-based testing system that is remarkably short on telling us what students have learned. We have also learned that the statistical model (Value Added Model) has been shown to be inconsistent and of questionable reliability.
How can we possibly use data from a complex system, the education of American students, to determine the contribution a teacher makes to student learning?
What is the “tell” that creates this information on a teacher? None.
There is more to teaching than simply preparing students for the test. There is attitude and effort, collaboration and teamwork, and the development of character. There is inquiry, problem solving, creativity and innovation.
There is also more to preparing people to become teachers than dropping them into a classroom with little or no preparation. Why do we have it in our heads that teaching requires little to no preparation? Why do we entrust children to teachers who are not licensed, when in the state of Georgia a manicurist must take 9 months of intense training and pass two tests?
Do you think that a teacher’s effectiveness can be measured by using a complicated mathematical model that is based on student test scores?