If You Think Student Output as Measured by Achievement Tests Is a Way to Evaluate Teachers, You’d Be Plug Wrong!

What will it take to convince school boards, departments of education, and administrators that using student achievement scores, one of the outputs we constantly measure in American schools, is neither a scientific nor an ethical way to evaluate teachers?  To do so is to ignore the research on this issue, and to perpetuate the myth that a student test score is a valid way to determine the effectiveness of teachers.

Carrying out this plan, which will be implemented in the Cobb County Schools (where I live) and the rest of Georgia’s schools by 2015, reinforces the machine age conception of our schools.  The machine age gave rise to factories, which became the model used to build and organize schools.  The outputs of a factory, such as a shoe, a dress, a pot or pan, are analogous to the outputs of schools, such as grade point average, dropout rate, or student achievement.  In this machine age example, many people believe that the outputs are explained by a cause-effect relationship.  In our world of education there is the belief that student achievement as an output is caused (or added to) by the teacher.  This is a false belief.  And by the way, if a factory produced “bad” shoes, you couldn’t pin it on the factory workers, either.

If teachers don’t substantially affect student achievement scores, what does?  To answer this will require us to be willing to think in a different way.  Russell Ackoff quotes Albert Einstein on thinking in different ways:

You can’t solve the problems created by the current pattern of thought using the current pattern of thought.

The current pattern of thought, based on causal thinking, derives from accepting a cause as sufficient for its effect.  In the case of student achievement, this pattern of thought means that the teacher effect can be taken to explain rises or falls in student achievement; nothing else needs to be taken into account.  As Russell Ackoff has said, “Machine-Age thinking was, to a large extent, environment-free; it tried to develop understanding of natural phenomena without using the concept of environment.”

But here is the thing.

We’ve left the machine age.  Or perhaps it might be safer to say we are in the midst of a transformation from the machine/factory age of thinking to another way of viewing the world.  This transformation is to an ecological, interdisciplinary or systems view of the world, with writers from many fields describing this new way of thinking, including Rachel Carson (ecology), W. Edwards Deming (economics and business), Russell L. Ackoff (management), and Peter Barnard (systems thinking schools).

We need to think about school as a whole.  It’s a school system, and a more powerful way to look at schooling is to think of it as a system.  A system (according to many researchers in this field) is a whole that cannot be divided into independent parts.  Indeed, every part of a system has properties that it loses when separated from the system, and every system has some properties–its essential ones–that none of its parts do.

In order to improve school, we have to stand back and look at the school system.  As we look at school as a system, researchers such as W. Edwards Deming suggest that 94% of the variation we see in the school system is due to the nature of the system, not the people who work in it.  For many of us, this doesn’t make any sense.  But if we are willing to move away from the linear factory model to a systems view, then we are led to ask what causes the variation.  What causes variation in student achievement, in dropout rates, and in the achievement gap?

The Importance of Understanding Variation

There are two types of variation in a system: common-cause (accounting for 94%) and special-cause.  Common-cause variation is the noise in a system.  It’s there in the background; it’s part of the natural pattern of the system.  Special-cause variation is a clear signal, an unnatural pattern, an assignable cause.  Variation falling within statistical limits means that any variation we see (test scores, graduation rates, achievement gaps) is the result of the natural behavior of the system, and as such, we cannot point to one reason that caused higher scores, lower graduation rates, or decreases in the achievement gaps.  We need to accept the fact that student achievement scores are subject to the behavior of the system, and if you do the math, teachers have almost no control over this.  So why do we continue to put the blame on teachers for kids learning or not learning?
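The distinction can be made concrete with an individuals (XmR) control chart, the standard tool for separating common-cause noise from special-cause signals. Here is a minimal sketch in Python; the district average scores are hypothetical, invented only to show the mechanics:

```python
# A minimal individuals (XmR) control chart check.
# The district average scores below are hypothetical.
scores = [218, 221, 219, 224, 220, 222, 217, 223, 221, 248]

# Mean moving range between consecutive scores
moving_ranges = [abs(b - a) for a, b in zip(scores, scores[1:])]
mr_bar = sum(moving_ranges) / len(moving_ranges)

center = sum(scores) / len(scores)
# 2.66 is the standard XmR constant that converts the mean moving range
# into three-sigma "natural process limits"
upper = center + 2.66 * mr_bar
lower = center - 2.66 * mr_bar

# Points inside the limits are common-cause noise; points outside are
# special-cause signals worth investigating
special_causes = [s for s in scores if s > upper or s < lower]
```

In this made-up series, only the final jump falls outside the natural limits; every other wiggle is just the system behaving as it always does, and no one deserves credit or blame for it.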

In research on the Trial Urban District Assessment, reported here based on Ed Johnson’s analysis of TUDA for the years 2002 – 2013, there was very little variation in test scores over this period for 21 urban districts.  In fact, except for four instances on the 4th grade reading assessment, all the variation in test scores at the 4th and 8th grades in math and reading was due to common causes.

Figure 1. TUDA, Reading, 4th Grade Control Chart Showing Long Term Achievement Scores Across 21 Urban Districts. Source: Ed Johnson, NAEP TUDA 2002 – 2011 Study

When we try to isolate the effect of teachers on any of the outputs of the school, we are sure to fail.  When we try to break the system apart, it loses its essential properties.  In this case the output as measured by student test scores is the product of the system, which arises from interactions and interdependencies of which the teacher is only one small part.  How is student achievement affected by inadequate resources, living in poverty, not having a home, parents who struggle to earn a living, the size of the school and district, the location of the school, students coming to school each day hungry or inadequately fed, school policies, and so on?


Which Model Describes the Real World?

Mike Stoecklein wrote a guest post on the W. Edwards Deming Institute Blog in which he notes that, according to researchers in the field of systems thinking, the performance of the person cannot be separated from the system, and is unknown.  The relationship between the individual (a teacher in this case) and the system (school system) is important to understand if we are to try to evaluate teachers based on some measure of students’ performance.  Stoecklein presents three models developed by a colleague of Dr. Deming, Heero Hacquebord.  They are shown in Figure 1.

In World I the individual is independent of the system, performance is independent, and in this model pay for performance, ranking and rating make sense.  But is it the real world? Of course not.  In World II, the person is immersed in the system, and totally dependent on the system.  All outcomes are attributable only to the system.  Does this world exist? No.  World III is a model in which the individual interacts with the system; the performance of the individual cannot be separated from the system, and is unknown.  Performance pay or ranking makes no sense.  Performance is only improved by focusing on the union of the system and the person.  Stoecklein believes this is the real world.

Figure 1. Three World Views showing the Interaction between the System and the Individual by Hacquebord, in Mike Stoecklein’s blog post.  (Stoecklein, Mike. “We Need to Understand Variation to Manage Effectively.” Deming Blog. W. Edwards Deming Institute, 07 Feb. 2013. Web. 26 Jan. 2014)

If the world of school were depicted as shown in World I, then using VAM scores might be valid.  But World I is not real.  Teachers are not separate from the school system any more than students are.  So why does the state insist that teacher performance can be measured by student performance?  It doesn’t make any sense.  World II might be closer to the truth.  But surely teachers have some sense of independence, and are not totally dependent on the system.

So we come to World III, where teacher performance is the result of an interaction between the individual and the system.  Yet, even in this model, it is not possible to dissect how the system affects performance, any more than student achievement can be used as the reason to judge teacher performance.  There are too many other variables and interactions that affect the performance of teachers and students.  If we want to improve teacher performance, then we must focus on the union between the system and the person.  In this model we have to make the assumption that one’s ability as a teacher is not only related to his or her pedagogical abilities, but to one’s interaction with the system.  We could ask: What’s the contribution of the individual to the system?  What’s the contribution of the system in which the teacher works?  These are not easy questions to answer.  To continue to believe that student achievement score gains are directly related to personal teacher performance is a falsehood.  It’s a misrepresentation of the complexity of teaching and learning.

Yet, in Georgia (and other Race to the Top winning states), large sums of money are being spent on hiring consultants to tell school districts how to manage their people.  Heero Hacquebord made an important point about this in a comment on Mike Stoecklein’s blog post:

Our systems are cancerous diseases that consultants do not seem to have the courage to address, because that terminates their client contracts!!!  “Performance appraisal”, “pay for performance”, “bonuses”, “productivity measurements” for nurses and physicians, are sold by consultants at great costs to the health care systems. We talk about respect for people, but then we destroy them by the systems we use. We do not motivate people, we only activate them, which means they do what leadership want them to do because of the consequences if they did not? We end up with fear and intimidation, and people have to go along to put bread on the table (note: substitute the words nurses and physicians with administrators and teachers).  (Stoecklein, Mike. “We Need to Understand Variation to Manage Effectively.” Deming Blog. W. Edwards Deming Institute, 07 Feb. 2013. Web. 26 Jan. 2014)

To Stoecklein, Hacquebord, and others, because system leaders do not understand variation, they continue to lack the knowledge to manage humanely; instead they plod along, tampering with the system.  Because of this lack of understanding of systems theory, they think that most of the problems of schools can be put on the shoulders of teachers, and they continue to think that simple causal relationships define the teacher-student relationship.  Nothing could be further from the truth.

What is the effect of using student test scores to evaluate teachers?  It’s demoralizing not only to teachers, but imagine the kid who says to herself, “Today I am going to take a test that will decide if my teacher is hired or fired!”  What’s the effect of this on the school culture?  How would you approach the curriculum if you knew that student scores would affect your performance rating and job stability?  Wouldn’t you teach to the test?  Pre-test vs. post-test scores, Value Added Measures, and high-stakes tests are unsubstantiated methods that have very low reliability on the one hand, and are simply invalid on the other.  How can school board members vote to carry out such a plan in their own school district?  What are they thinking if they do this?

Last year, a group of Georgia university professors who are experts in the field of educational evaluation posted a letter to Governor Deal, State School Superintendent Barge, key politicians in the Georgia Legislature, and superintendents of school districts participating in Georgia’s Race to the Top.  The researchers provided detailed evidence that the teacher evaluation system the Georgia Department of Education has created is not supported by research, and recommended that using student achievement scores to evaluate teachers be postponed.  Their concerns included the following:

  1. Value Added Models are not proven;
  2. GA is not prepared to implement this evaluation model;
  3. This model is not the most useful way to spend education funds;
  4. Students will be adversely affected by this Value Added Model.

We need not only suspend the use of teacher evaluation systems based on student achievement gains, we need to think differently about schools.  We need to heed Einstein’s warning that we can’t solve the problems created by the current pattern of thought using the current pattern of thought.

My dear colleagues, school board members, school leaders, if you think student output as measured by achievement tests is a way to evaluate teacher effectiveness, please consider that you might be wrong.


NAEP Large City Study Sheds Light on the Effects of the Atlanta Public Schools’ Cheating Scandal

The National Assessment of Educational Progress (NAEP) created the Trial Urban District Assessment (TUDA) in 2002 to assess student achievement in the nation’s large urban districts.  Reading results were first reported in 2002 for six districts, and math results were reported in 2003 for 10 districts.

The NAEP provides data from 2002 through 2012 on math and reading, and the results are comparable to NAEP national and state results because the same assessments are used.

In July 2011, the Governor of Georgia released a report of the state’s investigation into the Atlanta cheating scandal, charging that 178 educators were involved.  According to the report, thousands of school children were harmed by widespread cheating in the Atlanta Public Schools (APS).

According to the Governor’s report, “a culture of fear and conspiracy of silence infected this school system, and kept many teachers from speaking freely about misconduct.” Although I’ve never condoned the cheating that occurred in the APS, the report falls short by not pursuing what caused the culture of fear to exist in the system, which apparently led to the cheating.  Who, besides the employees of the APS, were involved in the so-called conspiracy? What role did the following play in this scandal: the Georgia Department of Education, the Governor’s Office of Student Achievement, the Atlanta School Board, and the partners who contributed millions of dollars to the APS to boost the academic achievement of Atlanta’s students?

According to the report, the cheating took place in 2009.  By 2010, the scandal had been exposed by the Atlanta Journal-Constitution’s reports on the APS, and we can assume that there was very little, if any, cheating on the state’s 2010 – 2013 Criterion-Referenced Competency Tests (CRCT).

During the period leading up to, during, and after the cheating scandal, the NAEP tested students in Atlanta, as part of the Trial Urban District Assessment in mathematics and reading from 2002 to 2012.  Fourth and eighth grade students were tested using the NAEP tests.

Don’t you think that examining the data on the NAEP tests given as part of Trial Urban District Assessment might be helpful in several areas?

  • What is the trend of academic performance of Atlanta students (grades 4 and 8) in mathematics and reading during 2002 – 2012?
  • Are there significant changes (increases, decreases) or no changes in the Atlanta data during this period?
  • Is there evidence that the academic performance of students in the APS was harmed or diminished during and after the scandal?  Do student scores change appreciably after we can be sure that there was little if any cheating going on?  Were students victimized as a result of the testing scandal?

What is the trend of academic performance of Atlanta students (grades 4 and 8) in mathematics and reading during 2002 – 2012?

Figure 1 summarizes Atlanta eighth grade student scores on the NAEP test in mathematics given as part of the Trial Urban District Assessment.  I’ve plotted the average scores of students at the 25th, 50th and 75th percentiles. The trend at each level is upward, and there is no evidence here of a decline or slump in scores for 8th grade students.

The Atlanta NAEP scores in 2011 and 2013 did not decline following the 2009 cheating scandal.  This is an important finding in the context of the Atlanta cheating scandal.  If students had been harmed academically, then their scores might have dropped after the episode of cheating. For more details that go beyond the graph that I produced here, consult this page in the TUDA 2013 report.

Figure 1 is a graph showing the average score of students at the 25th, 50th and 75th percentiles. NAEP, TUDA 2013 Report

What’s interesting in the data are the scores in 2011.  The 2011 eighth grade students were in the sixth grade during the year of the cheating scandal.  If the students had been academically victimized by the CRCT answer-changing, then we would hypothesize that their scores would decrease in 2011 and 2013.  But they did not.  In fact, there is an increase in the scores at each percentile level.

Are there significant changes (increases, decreases) or no changes in the Atlanta data during this period?

Figure 2. NAEP Math scores for APS 8th grade students and large city districts.

The scores of Atlanta eighth grade students are plotted and compared to the average scores of other large cities that participated in the NAEP Trial Urban District Assessment.  Although Atlanta’s scores are lower than the average for each year, the overall trend is upwards, and the gap is closing.
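The claim that the trend is up and the gap is closing amounts to comparing least-squares trend slopes. Here is a sketch of that check, using hypothetical stand-ins for the Atlanta and large-city averages, not the actual NAEP numbers:

```python
# Compare ordinary least-squares trend slopes for two score series.
# The numbers below are hypothetical stand-ins, not actual NAEP averages.
years    = [2003, 2005, 2007, 2009, 2011, 2013]
atlanta  = [243, 245, 248, 251, 253, 255]
big_city = [256, 257, 258, 260, 261, 262]

def slope(xs, ys):
    """Least-squares slope of ys regressed on xs (points per year)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

atl_slope = slope(years, atlanta)      # Atlanta's gain per year
city_slope = slope(years, big_city)    # large-city average gain per year
gap_closing = atl_slope > city_slope   # rising faster => gap is closing
```

Both slopes being positive says scores are trending up; Atlanta's slope exceeding the large-city slope is what "closing the gap" means in this framing.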

Again, when we compare the years following 2009, the year of the test erasure scandal, two and four years later, Atlanta students are not only doing better, but they are closing the gap.  Is there evidence here that students were academically harmed by the scandal?

Is there evidence that the academic performance of students in the APS was harmed or diminished after the scandal?  Do student scores change appreciably after we can be sure that there was little if any cheating going on?  Were students victimized as a result of the testing scandal?

The NAEP is separate from, and administered differently than, the state CRCT.  Indeed, the NAEP tests are low-stakes, and according to many researchers, NAEP scores are more valid and reliable than the high-stakes CRCT if one wants to have an idea about student performance.  The CRCT is a high-stakes assessment that is used not only to assess students; its results are also used to evaluate teacher performance.

There clearly are reasons to wonder if students were harmed by the cheating that took place in 2009 in the APS.  Academically, there is little evidence of harm based on the average scores reported in the NAEP study.  However, we have to wonder about the social-emotional consequences caused not only by the cheating that took place, but also by the standardization and high-stakes testing reform movement that most likely contributed to the “culture of fear and conspiracy” in the APS.

Stephanie Jones, professor of education at the University of Georgia, has written extensively on the social-emotional consequences of the authoritarian standards and high-stakes testing dilemma.  She asks, “What’s the low morale and crying about in education these days?  Mandatory dehumanization and emotional policy-making–that’s what.”

Policy makers, acting on emotion and little to no data, have dehumanized schooling by implementing authoritarian standards in a one-size-fits-all system of education.  We’ve enabled a layer of the educational system (U.S. Department of Education and the state departments of education) to carry out the NCLB act, and high-stakes tests, and use data from these tests to decide the fate of school districts, teachers and students.  One of the outcomes of this policy is the debilitating effects on the mental and physical health of students, teachers and administrators.

If you don’t believe that, here is a quote from Professor Jones’ article:

I’ve witnessed sobbing children in school, tears streaking cheeks. When children hold it together at school, they often fall apart at home. Yelling, slamming doors, wetting the bed, having bad dreams, begging parents not to send them back to school.

More parents than ever feel pressured to medicate their children so they can make it through school days. Others make the gut-wrenching decision to pull their children from public schools to protect their dignity, sanity and souls. Desperate parents choose routes they had never thought they’d consider: home schooling, co-op schooling, or, when they can afford it, private schooling. But most parents suffer in silence, managing constant family conflict.

Were Atlanta’s Students Harmed by the Test Erasure Scandal?

Based on NAEP data, Atlanta students continued to improve in mathematics, even after cheating was discovered and eliminated from the district.  Although NAEP does not investigate the social-emotional effects of school, there is evidence that the current emphasis on high-stakes testing contributes to and has amplified emotional and behavioral disorders among youth.  How can it be good practice that in one district in Georgia, 70 of 180 days of the school year are devoted to some kind of state or federal testing?  For more data on this, please refer to a discussion of The Paradoxes of High Stakes Testing by Madaus, Russell, and Higgins (2009) in this blog article.

The NAEP large cities study does shed light on the Atlanta Public Schools.  There is evidence that any harm directed toward students was more psychological than academic.

In spite of the national and international attention that the testing scandal generated, teachers in the Atlanta Public Schools positively impacted their students in mathematics and reading.  If you dig deeper into the data, there is evidence of a continued need for more resources and more experienced teachers in schools populated by students living in poverty and students who are on free or reduced lunch.


Do Higher Science Standards Lead to Higher Achievement?

In a recent article in Scientific American, it was suggested that the U.S. should raise its standards in science, and that all 50 states should adopt them.

When you check the literature on science standards, the main reason given for aiming for higher standards (raising the bar) is that in the “Olympics” of international academic test taking, the U.S. never takes home the gold.  In fact, according to the test results reported by the Program for International Student Assessment (PISA), U.S. students never score high enough to even merit a bronze medal.  In the last PISA Science Olympics, Shanghai-China (population 23 million) took home the Gold, Finland (population 5.4 million) the Silver, and Hong Kong-China (population 7 million) the Bronze.  The United States’ (population 314 million) average score positioned it 22nd on the leaderboard of the 65 countries that participated in the PISA 2009 testing.

Some would argue that comparing scores across countries that vary so much in population, ethnic groups, poverty, health care, and housing is not a valid enterprise.  We’ll take that into consideration as we explore the relationship of standards to student achievement.

It’s assumed that there is a connection, or correlation, between the quality of the standards in a particular discipline such as science and the achievement levels of students as measured by tests.  So the argument is promoted that because U.S. students score near the bottom of the top third of countries that took the PISA test in 2009, the U.S. science education standards need to be ramped up.  If we ramp up the standards, that is to say, make them more rigorous and at a higher level, then we should see a movement upwards for U.S. students on future PISA tests.  It seems like a reasonable assumption, and one that has driven the U.S. education system toward a single set of standards in mathematics and reading/language arts (the Common Core State Standards–CCSS), and very soon there will be a single set of science standards.

There is a real problem here

There is no research to support the contention that higher standards mean higher student achievement.  In fact, there is very little evidence that standards make any difference in student achievement at all.  It could be that standards, per se, act as barriers to learning, not bridges to the world of science.

Barriers to Learning

I’ve reported on this blog research published in the Journal of Research in Science Teaching by professor Carolyn Wallace of Indiana State University indicating that the science standards in Georgia actually present barriers to teaching and learning. Wallace analyzed the effects of authoritarian standards language on science classroom teaching.  She argues that curriculum standards based on a content and product model of education are “incongruent” with research in science education, cognitive psychology, language use, and science as inquiry.  The Next Generation Science Standards are based on a content and product model of teaching, and in fact have not deviated from the earlier National Science Education Standards.

Over the past three decades, researchers from around the world have shown that students’ prior knowledge and the context in which science is learned are significant factors in helping students learn science.  Instead of starting with the prior experiences and interests of students, the standards are used to determine what students learn.  Even the standards in the NGSS or the CCSS are lists of objectives defining a body of knowledge to be learned by all learners.  As Wallace shows, it’s the individuals in charge of curriculum (read: standards) who determine the lists of standards to be learned. Science content to be learned exists without a context, and without any knowledge of the students who are required to master this stuff or the teachers who plan and carry out the instruction.

An important point that Wallace highlights is that teachers (and students) are recipients of the standards, rather than having been a part of the process of creating them. By and large, teachers are nonparticipants in the design and writing of standards. But more importantly, teachers were not part of the decision to use standards to drive school science in the first place. That was done by elite groups of scientists, consultants, and educators.

The Brown Center Report

According to the 2012 Brown Center Report on American Education, the Common Core State Standards will have little to no effect on student achievement. Author Tom Loveless explains that neither the quality nor the rigor of state standards is related to state NAEP scores. Loveless suggests that if there were an effect, we would have seen it, since all states have had standards since 2003.

For example, the Brown Center study reported (citing a separate 2009 study by Whitehurst) that there was no correlation between NAEP scores and the quality ratings of state standards. Whitehurst studied scores from 2000 to 2007, and found that NAEP scores did not depend upon the “quality of the standards”; he reported that this was true for both white and black students (The Brown Center Report on American Education, p. 9). The correlation coefficients ranged from -0.6 to 0.08.

The “cut score” a state establishes for proficient performance can be used to define the rigor, or expectations, of its standards. One would expect that, over time, achievement scores in states with more rigorous and higher expectations would trend upwards. The Brown study reported it this way:

States with higher, more rigorous cut points did not have stronger NAEP scores than states with less rigorous cut points.

The researchers found that whether states raised or lowered the bar made little difference to NAEP scores. The only positive and significant correlations reported between raising or lowering the bar and scores were in 4th grade math and reading. One cannot determine causality from simple correlations, but we can say there is some relationship here.

When the researchers examined whether standardization would cut the variation in scores between states, they found that the variation between states was relatively small compared to the variation within states. The researchers put it this way (The Brown Center Report on American Education, p. 12): The findings are clear.

Most variation on NAEP occurs within states not between them. The variation within states is four to five times larger than the variation between states.
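That comparison is an ordinary between-group versus within-group sum-of-squares decomposition. Here is a sketch with three hypothetical "states" and made-up district scores (the real magnitudes are in the Brown report):

```python
# Partition total variation into between-state and within-state components.
# The states and district scores below are hypothetical, for illustration.
states = {
    "A": [230, 245, 218, 252, 236],
    "B": [228, 249, 221, 244, 233],
    "C": [235, 250, 224, 247, 239],
}

all_scores = [s for districts in states.values() for s in districts]
grand_mean = sum(all_scores) / len(all_scores)
state_means = {k: sum(v) / len(v) for k, v in states.items()}

# Between-state: how far each state's mean sits from the grand mean
between = sum(len(v) * (state_means[k] - grand_mean) ** 2
              for k, v in states.items())

# Within-state: how far each district sits from its own state's mean
within = sum((s - state_means[k]) ** 2
             for k, v in states.items() for s in v)
```

In this invented example, as in the Brown report's real data, the within-state component dwarfs the between-state component: districts inside a state differ from each other far more than the states differ from one another.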

According to the Brown Report, the Common Core will have very little impact on national achievement (Brown Report, p. 12).  There is no reason to believe that won’t be true for science.

The researchers concluded that we should not expect much from the Common Core. In an interesting discussion of the implications of the findings, Tom Loveless, the author of the report, cautions us against being drawn into thinking that standards represent a kind of system of “weights and measures.” Loveless tells us that standards reformers use the word “benchmarks” as a synonym for standards, and that they use it too often. In science education, we’ve had a long history of using the word benchmarks, and Loveless reminds us that there are no real, measured benchmarks in any content area. Yet, when you read the standards–common core or science–there is the implication that we really know, almost in a measured way, what standards should be met at a particular grade level.

Loveless also makes a strong point when he says the entire system of education is “teeming with variation.” To think that creating a set of common core standards will cut this variation between states or within a state simply will not hold up. As he puts it, the common core (a kind of intended curriculum) sits on top of the implemented and achieved curriculum. The implemented curriculum is what teachers do with their students day-to-day. It is full of variation within a school. Two biology teachers in the same school will get very different results for a lot of different reasons. But as far as the state is concerned, the achieved curriculum is all that matters. The state uses high-stakes tests to decide whether schools have met Adequate Yearly Progress (AYP).

Now What?

If standards do not result in improved learning as measured by achievement tests, what should we be doing to improve schools?

Over on Anthony Cody’s blog on Education Week, we might find some answers to this question.  Cody has begun a series of dialogs with the Gates Foundation on educational reform by bringing together discussions between opposing views to uncover some common ground. Cody has already broken new ground because the Gates Foundation is not only participating with him on his website, but Gates is publishing everything on their own site: Impatient Optimists blog. Three of the five dialog posts have been written, and it is the third one written by Anthony Cody that I want to bring in here.

In his post, Can Schools Defeat Poverty by Ignoring it?, Cody reminds us that the U.S. Department of Education (through the Race to the Top and NCLB Flexibility Requests) is unwavering in its promotion of data-driven education, using student test scores to rate and evaluate teachers and administrators.  Cody believes that the Gates Foundation has used its political influence to support this.  There is also an alliance between the ED, and PARCC which is developing assessments to be aligned to the Common Core Standards.  The Gates Foundation is a financial contributor to Achieve, which oversees the Common Core State Standards, the Next Generation Science Standards, and PARCC.

There is a “no excuses” attitude suggesting that students from impoverished backgrounds should do just as well as students from enriched communities.  The idea here is that teachers make the difference in student learning, and if this is true, then it is the “quality” of the teacher that will decide whether students do well on academic tests.

Anthony Cody says this is a huge error.  In his post he states, and later supports with research:

In the US, the linchpin for education is not teacher effectiveness or data-driven management systems. It is the effects of poverty and racial isolation on our children.

As he points out, teachers account for only 20% of the variance in student test scores.  More than 60% of score variance on achievement tests correlates to out-of-school factors.  Out-of-school factors vary a great deal.  However, as Cody points out, the impact of violence, health, housing, and child development in poverty are factors that far outweigh the effect of the teacher on a test given in the spring to students whose attendance, interest, and acceptance are poor.
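The variance figures above can be made concrete with a toy simulation. This is a sketch under stated assumptions, not Cody’s actual data or model: the 20% and 60% shares are the only numbers taken from the text, and the normal distributions and independence of the components are assumptions made purely for illustration.

```python
import random

# Toy illustration (assumed model, not real data): simulate test scores in
# which out-of-school factors account for ~60% of the variance, teacher
# effects for ~20%, and unexplained noise for the rest. The coefficients are
# square roots of the target shares, because variance scales with the square
# of the coefficient.
random.seed(0)

N = 100_000  # simulated students
teacher = [random.gauss(0, 1) for _ in range(N)]  # teacher effect
outside = [random.gauss(0, 1) for _ in range(N)]  # poverty, health, housing, ...
noise   = [random.gauss(0, 1) for _ in range(N)]  # everything else

scores = [0.20**0.5 * t + 0.60**0.5 * o + 0.20**0.5 * e
          for t, o, e in zip(teacher, outside, noise)]

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

total = variance(scores)
teacher_share = variance([0.20**0.5 * t for t in teacher]) / total
outside_share = variance([0.60**0.5 * o for o in outside]) / total

print(f"teacher share of variance:       {teacher_share:.2f}")
print(f"out-of-school share of variance: {outside_share:.2f}")
```

The point of the sketch is simply that when out-of-school factors carry three times the weight of the teacher effect, even a perfect teacher can move a student’s score only modestly.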

In the Scientific American article I referenced at the beginning of this post, the author cites research from the Fordham Foundation that scores most state science standards as poor to mediocre.  We debunked the Fordham “research” here, and showed that its research method was unreliable and invalid.  Unfortunately, various groups, even Scientific American, accept Fordham’s findings and use them in articles and papers as if they were a valid assessment of science education standards.  They are not.

It’s not that we don’t have adequate science standards.  It’s that if we ignore the most important and significant factors that affect the life of students in and out of school, then standards of any quality won’t make a difference.

What is your view on the effect of changing the science standards on student achievement?  Are we heading in the wrong direction?  If so, which way should we go?


Why a Single Set of Science Standards in a Democracy?

Why are we supporting the notion of a single set of science standards, as has already been done in mathematics and reading/language arts?  We live in a democracy.  One of the founding principles of education is that elected school board members for the more than 15,000 school districts are charged with making decisions for each local school district.  What are we thinking?

For more than 20 years I collaborated with American teachers and our Soviet partners (we started this collaboration in 1981, when the Soviet Union still existed).  During this time we began working with science teachers and professors in several Soviet cities. Working within the Soviet curriculum, we taught lessons alongside Soviet teachers using inquiry, cooperative learning, and later problem-based learning.  The Soviets had a single curriculum, one set of texts, and a centrally controlled education system.  After Perestroika (restructuring) and Glasnost (openness), the Soviet system began to change. One of my colleagues, Mr. Vadim Zhudov, Director of School 710 in Moscow, told me that local schools would now have control over 25% of the curriculum.

And what are we doing?  We’re creating an education system that is controlled more and more by the Federal government, and less and less by local schools and teachers.  Why would a democratic country fall into this trap?  Do we want a system of education that is modeled after a central command system?

Ready or Not, the New Science Standards are on the way

The Next Generation Science Standards are under development by Achieve, Inc., and the draft version will be available very soon.  Achieve will identify the content and the science and engineering practices that all students should learn from K–12, regardless of where they live.  The science standards will cover the physical sciences, the life sciences, the earth and space sciences, and engineering, technology, and applications of science, but in so doing will create a landscape of factoids to be learned by students and used to develop assessments to measure student achievement.

Grade Band Endpoints: Factoids of Science

Although we haven’t seen any of the science standards, we can tell what they might look like by examining the document A Framework for K-12 Science Education: Practices, Crosscutting Concepts, and Core Ideas. The content of science is detailed in the Framework document, and in the context of the Framework, the standards appear as factoids, which taken as a whole define the field of science that all students should know.  There are example standards in this document.  Here are a few excerpts from a section on Weather and Climate, focused on the question: What regulates weather and climate?

  • By the end of grade 2, students will know that weather is the combination of sunlight, wind, snow or rain, and temperature at a particular time.
  • By the end of grade 5, students will know that weather is the minute-by-minute to day-by-day variation of the atmosphere’s condition on a local scale.
  • By the end of grade 8, students will know that weather and climate are influenced by interactions involving sunlight, the ocean, the atmosphere, ice, landforms, and living things.
  • By the end of grade 12, students will know that global climate is a dynamic balance on many different time scales among energy from the sun falling on Earth; the energy’s reflection, absorption, storage, and redistribution among the atmosphere, ocean, and land systems; and the energy’s radiation into space.

Continue reading “Why a Single Set of Science Standards in a Democracy?”

Shameful and Degrading Evaluations of Teachers by Politicians

Photo by Suleiman on flickr

Teacher bashing has become a contact sport played out by many U.S. governors.  The rules of the game are stacked against teachers by using measures that have not been substantiated scientifically.  For many governors and mayors, it is fair play to release the names of every teacher in the city along with their value-added scores, determined by analyzing student achievement test scores.  None of the data that has been published has been scientifically validated; in fact, the data that is provided is uneven, and unreliable from one year to the next.
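The year-to-year unreliability is easy to see with a toy simulation. This is a hypothetical sketch, not an analysis of any published value-added data: it simply assumes each year’s measured score is a stable true teacher effect plus a large dose of year-specific noise (different students, different test forms), and checks how well one year’s scores predict the next.

```python
import random

# Toy reliability check (assumed model, not real VAM data): each teacher's
# measured value-added score in a given year is a stable true effect plus
# year-specific noise. When the noise dominates, the same teacher's scores
# in consecutive years barely correlate.
random.seed(1)

N = 50_000  # simulated teachers
true_effect = [random.gauss(0, 1) for _ in range(N)]
year1 = [t + random.gauss(0, 2) for t in true_effect]  # noise std dev twice the signal's
year2 = [t + random.gauss(0, 2) for t in true_effect]

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    vx = sum((x - mx) ** 2 for x in xs) / n
    vy = sum((y - my) ** 2 for y in ys) / n
    return cov / (vx * vy) ** 0.5

r = pearson(year1, year2)
print(f"year-to-year correlation of measured scores: {r:.2f}")
```

Under these assumed noise levels the correlation lands around 0.2, which is why a teacher ranked highly one year can look mediocre the next even if nothing about the teaching changed.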

Steven Sellers Lapham, in a letter to the editor, wrote this on teacher evaluations:

…evaluating teachers on the basis of student test score data has been exposed as a fraud. The final nail in the coffin appeared in the March 2012 issue of the education journal Kappan. In the article “Evaluating Teacher Evaluation,” Stanford professor of education Linda Darling-Hammond and her colleagues echo the 2009 findings of the National Research Council.

Darling-Hammond writes in her research article:

However, current research suggests that VAM ratings are not sufficiently reliable or valid to support high-stakes, individual-level decisions about teachers.

Mr. Lapham adds that student test scores should not be used as a basis for evaluating teachers. The Bill & Melinda Gates Foundation just released a report on the best ways to evaluate teachers. It does not even mention such an absurd idea, much less recommend it.

With this in mind, I am going to put into perspective the reform initiative that many governors, mayors, politicians, and for-profit groups and foundations are pushing on us.  First, I’ll identify the three parts, or legs, of the reform: accountability, deregulation of schooling, and the erosion of teacher education.

Then I’ll report a few stories from several states that will give you a feel for the extent of how teachers are coming under fire, and being held hostage by unscientific methods of evaluation.

Continue reading “Shameful and Degrading Evaluations of Teachers by Politicians”