If You Think Student Output as Measured by Achievement Tests Is a Way to Evaluate Teachers, You’d Be Plug Wrong!

What will it take to convince school boards, departments of education and administrators that using student achievement scores, one of the outputs that we constantly measure in American schools, is neither a scientific nor an ethical way to evaluate teachers? To do so is to ignore the research on this issue, and to perpetuate the myth that using a student test score is a valid way to determine the effectiveness of teachers.

Carrying out this plan, which will be implemented in the Cobb County Schools (where I live) and the rest of Georgia’s schools by 2015, reinforces the machine age conception of our schools. The machine age gave rise to factories, which became the model used to build and organize schools. The outputs of a factory, such as a shoe, a dress, a pot or pan, are analogous to the outputs of schools, such as grade point average, drop out rate, or student achievement. In this machine age example, many people believe that the outputs are explained by a cause-effect relationship. In our world of education there is the belief that student achievement as an output is caused (or added to) by the teacher. This is a false belief. And by the way, if a factory produced “bad” shoes, you can’t pin it on the factory workers, either.

If teachers don’t affect student achievement scores in substantial ways, what does? To answer this will require us to be willing to think in a different way. Albert Einstein is quoted by Russell Ackoff about thinking in different ways:

You can’t solve the problems created by the current pattern of thought using the current pattern of thought.

The current pattern of thought, based on causal thinking, derives from the acceptance of a cause as enough for its effect.  In the case of student achievement, this pattern of thought means that the teacher effect can be taken to explain rises or falls in student achievement.  Nothing else needs to be taken into account.  As Russell Ackoff has said, “Machine-Age thinking was, to a large extent, environment-free; it tried to develop understanding of natural phenomena without using the concept of environment.”

But here is the thing.

We’ve left the machine age. Or perhaps it might be safer to say we are in the midst of a transformation from the machine/factory age of thinking to another way of viewing the world. This transformation is to an ecological, interdisciplinary or systems view of the world, with writers from many fields describing this new way of thinking, including Rachel Carson (ecology), W. Edwards Deming (economics and business), Russell L. Ackoff (management), and Peter Barnard (systems thinking schools).

We need to think about school as a whole.  It’s a school system, and a more powerful way to look at schooling is to think of it as a system.  A system (according to many researchers in this field) is a whole that cannot be divided into independent parts.  Indeed, every part of a system has properties that it loses when separated from the system, and every system has some properties–its essential ones–that none of its parts do.

In order to improve school, we have to stand back and look at the school system. As we look at school as a system, researchers such as W. Edwards Deming suggest that 94% of the variation we see in the school system is due to the nature of the system, not the people who work in it or make it work. For many of us, this doesn’t make any sense. But if we are willing to move away from the linear factory model, and move to a vertical or system view, then we are led to ask what some causes of the variation are. What causes variation in student achievement, in drop-out rates, and the achievement gap?

The Importance of Understanding Variation

There are two types of variation in a system: common-cause (accounting for 94%) and special-cause. Common-cause variation is the noise in a system. It’s there in the background. It’s part of the natural pattern of the system. Special-cause variation is a clear signal, an unnatural pattern, an assignable cause. Variation falling within statistical limits means that any variation we see (test scores, graduation rates, achievement gaps) is the result of the natural behavior of the system, and as such, we cannot point to one reason that caused higher scores, lower graduation rates, or decreases in the achievement gaps. We need to accept the fact that student achievement scores are subject to the behavior of the system, and if you do the math, teachers have almost no control over this. So why do we continue to put the blame on teachers for kids learning or not learning?
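To make the idea of "statistical limits" concrete, here is a minimal sketch of how a control chart separates common-cause noise from a special-cause signal. It assumes an individuals (XmR) chart, which may or may not be the chart type used in the analyses discussed in this post. The function names and the scores are hypothetical and invented purely for illustration; they are not real test data.

```python
from statistics import mean


def xmr_limits(values):
    """Return (center, lower, upper) limits for an individuals (XmR) chart."""
    if len(values) < 2:
        raise ValueError("need at least two data points")
    center = mean(values)
    # Moving range: absolute difference between consecutive points.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    # 2.66 is the standard XmR scaling constant (3 / 1.128).
    spread = 2.66 * mean(moving_ranges)
    return center, center - spread, center + spread


def special_cause_points(values):
    """Return (index, value) pairs that fall outside the chart limits."""
    _, lower, upper = xmr_limits(values)
    return [(i, v) for i, v in enumerate(values) if v < lower or v > upper]


# Hypothetical scale scores, made up for this example (not real data).
scores = [210, 212, 209, 211, 213, 210, 212, 211, 230, 210, 209, 212]

center, lower, upper = xmr_limits(scores)
print(f"center={center:.1f}, limits=({lower:.1f}, {upper:.1f})")
print("special-cause signals:", special_cause_points(scores))
```

In this made-up series only the score of 230 falls outside the limits, so it is the only point that would justify looking for an assignable cause. Every other rise or fall is noise, and there is no single reason to point to for any of them.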

In research on the Trial Urban District Assessment (TUDA), reported here and based on Ed Johnson’s analysis of TUDA for the years 2002 – 2013, there was very little variation in test scores over this period for 21 urban districts. In fact, except for four instances in 4th grade reading, all the variation in test scores at the 4th and 8th grade in math and reading was due to common causes.

Figure 1. TUDA, Reading, 4th Grade Control Chart Showing Long Term Achievement Scores Across 21 Urban Districts. Source Ed Johnson, NAEP TUDA 2002 – 2011 Study

When we try to isolate the effect of teachers on any of the outputs of the school, we are sure to fail. When we try to break the system apart, it loses its essential properties. In this case the output as measured by student test scores is the product of the system, which is due to interactions and interdependencies of which the teacher is only one small part. How is student achievement affected by inadequate resources, living in poverty, not having a home, parents who struggle to earn a living, the size of the school and district, the location of the school, students coming to school each day hungry or inadequately fed, school policies, and so on?

 

Which Model Describes the Real World?

In a guest post on the W. Edwards Deming Institute Blog, Mike Stoecklein takes up this question. According to researchers in the field of systems thinking, the performance of a person cannot be separated from the system, and is unknown. The relationship between the individual (a teacher in this case) and the system (school system) is important to understand if we are going to evaluate teachers based on some measure of student performance. Stoecklein presents three models developed by a colleague of Dr. Deming, Heero Hacquebord. They are shown in Figure 1.

In World I the individual is independent of the system, and performance is independent, and in this model, pay for performance, ranking and rating make sense. But is it the real world? Of course not. In World II, the person is immersed in the system, and totally dependent on the system. All outcomes are attributable only to the system. Does this world exist? No. World III is a model in which the individual interacts with the system, performance of the individual cannot be separated from the system, and is unknown. Performance pay or ranking makes no sense. Performance is only improved by focusing on the union of the system and the person. Stoecklein believes this is the real world.

Figure 1. Three World Views showing the Interaction between the System and the Individual by Hacquebord, in Mike Stoecklein’s blog post.  (Stoecklein, Mike. “We Need to Understand Variation to Manage Effectively.” Deming Blog. W. Edwards Deming Institute, 07 Feb. 2013. Web. 26 Jan. 2014)

If the world of school were as depicted in World I, then using value-added model (VAM) scores might be valid. But World I is not real. Teachers are not separate from the school system any more than are students. So why does the state insist that teacher performance can be measured by student performance? It doesn’t make any sense. World II might be closer to the truth. But surely teachers have some sense of independence, and are not totally dependent on the system.

So we come to World III, where teacher performance is the result of an interaction between the individual and the system. Yet, even in this model, it is not possible to dissect how the system affects performance, any more than student achievement can be used as the reason to judge teacher performance. There are too many other variables and interactions that affect the performance of teachers and students. If we want to improve teacher performance, then we must focus on the union between the system and person. In this model we have to make the assumption that one’s ability as a teacher is related not only to his or her pedagogical abilities, but also to one’s interaction with the system. We could ask, What’s the contribution of the individual to the system? What’s the contribution of the system in which the teacher works? These are not easy questions to answer. To continue to believe student achievement score gains are directly related to personal teacher performance is a falsehood. It’s a misrepresentation of the complexity of teaching and learning.

Yet, in Georgia (and other Race to the Top winning states), large sums of money are being spent on hiring consultants to tell school districts how to manage their people. Heero Hacquebord made an important point about this in a comment on Mike Stoecklein’s blog post:

Our systems are cancerous diseases that consultants do not seem to have the courage to address, because that terminates their client contracts!!! “Performance appraisal”, “pay for performance”, “bonuses”, “productivity measurements” for nurses and physicians, are sold by consultants at great costs to the health care systems. We talk about respect for people, but then we destroy them by the systems we use. We do not motivate people, we only activate them, which means they do what leadership want them to do because of the consequences if they did not? We end up with fear and intimidation, and people have to go along to put bread on the table (note: for nurses and physicians, substitute administrators and teachers). (Stoecklein, Mike. “We Need to Understand Variation to Manage Effectively.” Deming Blog. W. Edwards Deming Institute, 07 Feb. 2013. Web. 26 Jan. 2014)

To Stoecklein, Hacquebord, and others, because system leaders do not understand variation, they continue to lack the knowledge to manage humanely; instead they prod along tampering with the system. Because of this lack of understanding of systems theory, they think that most of the problems of schools can be put on the shoulders of teachers, and they continue to think that simple causal relationships define the teacher-student relationship. Nothing could be further from the truth.

What is the effect of using student test scores to evaluate teachers? It’s demoralizing not only to teachers, but imagine the kid who says to herself, “today I am going to take a test that will decide if my teacher is hired or fired!” What’s the effect of this on the school culture? How would you approach the curriculum if you knew that student scores would affect your performance rating and job stability? Wouldn’t you teach to the test? Pre-test vs post-test scores, Value Added Measures, and high-stakes tests are unsubstantiated methods that have very low reliability on the one hand, and are simply invalid on the other. How can school board members vote to carry out such a plan in their own school district? What are they thinking if they do this?

Last year, a group of Georgia university professors, who are experts in the field of educational evaluation, posted a letter to Governor Deal, State School Superintendent Barge, as well as key politicians in the Georgia Legislature, and superintendents of school districts participating in Georgia’s Race to the Top. The researchers provided detailed evidence that the teacher evaluation system that the Georgia Department of Education has created is not based on supporting research. They recommended that the use of student achievement scores to evaluate teachers be postponed. Their concerns included the following:

  1. Value Added Models are not proven;
  2. GA is not prepared to implement this evaluation model;
  3. This model is not the most useful way to spend education funds;
  4. Students will be adversely affected by this Value Added Model.

We need not only to suspend the use of teacher evaluation systems based on student achievement gains, but also to think differently about schools. We need to heed Einstein’s warning that we can’t solve the problems created by the current pattern of thought using the current pattern of thought.

My dear colleagues, school board members, school leaders, if you think student output as measured by achievement tests is a way to evaluate teacher effectiveness, please consider that you might be wrong.

 

About Jack Hassard

Jack Hassard is a writer, a former high school teacher, and Professor Emeritus of Science Education, Georgia State University.
