You can navigate to each inquiry from its landing page.
This blog post introduces the fifth inquiry, which focuses on the use of value-added measures to rate teachers.
E-valuating Teaching: It Doesn’t Add Up
Why the Use of Student Achievement Tests Is an Absurd System to Evaluate the Practice of Teaching
Teacher bashing has become a contact sport played out by many U.S. governors. The rules of the game are stacked against teachers by using measures that have not been substantiated scientifically. For many governors and mayors, it is fair play to release the names of every teacher in the city, along with each teacher’s value-added score, determined by analyzing student achievement test scores. None of the data that has been published has been scientifically validated, and in fact, the data that is provided is uneven and unreliable from one year to the next.
A VAM (value-added model) score is a number that is derived using a covariate adjustment equation (Figure 1). The idea is to rate teachers using student test scores. For example, in the Florida VAM big data release, VAM scores are reported for teachers who taught math and reading, and for those who didn’t teach math or reading. Next to each teacher’s name, the release reports a score that indicates the learning gains students made above or below what they were expected to learn (based on earlier performance, with OTHER teachers).
Figure 1 shows the equation used to figure teachers’ “value added effect.”
Using student achievement scores to compute a number which claims to find what a teacher “adds” to student learning simply doesn’t add up. This is what this inquiry is about.
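To make the covariate adjustment idea described above concrete, here is a minimal sketch in Python. It is not the Florida equation, which uses many more covariates. It is a toy under stated assumptions: a single covariate (last year's score), a straight-line prediction, and simulated students whose score distributions are made up for illustration. Every simulated teacher is identical by construction, so any difference in their "value-added" numbers comes from noise alone.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x (one covariate only)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    return a, b


def vam_scores(records):
    """records: list of (teacher, prior_score, current_score).

    Predict each student's current score from the prior score, then
    average "actual minus expected" over each teacher's students.
    """
    a, b = fit_line([r[1] for r in records], [r[2] for r in records])
    residuals = {}
    for teacher, prior, current in records:
        expected = a + b * prior
        residuals.setdefault(teacher, []).append(current - expected)
    return {t: mean(v) for t, v in residuals.items()}


def simulate_year(rng, teachers=20, class_size=25):
    """Hypothetical data: no teacher has any true effect, only noise."""
    records = []
    for t in range(teachers):
        for _ in range(class_size):
            prior = rng.gauss(300, 40)                      # made-up scale
            current = 50 + 0.9 * prior + rng.gauss(0, 30)   # made-up noise
            records.append((t, prior, current))
    return records


rng = random.Random(0)
year1 = vam_scores(simulate_year(rng))
year2 = vam_scores(simulate_year(rng))
for t in range(5):
    print(f"teacher {t}: year 1 {year1[t]:+.1f}, year 2 {year2[t]:+.1f}")
```

Running it prints a year 1 and a year 2 number for the same teachers. Because the simulation gives every teacher the same true effect, the spread between teachers and between years in this toy is produced entirely by which students happened to land in each class.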
For years now, I’ve written about the nonsense attributed to using student achievement scores to assess teachers. But there are others who have written more powerfully about the nonsense attributed to the use of these scores. I want to direct you to two websites where you can find important information on why using student test scores to evaluate teachers doesn’t add up.
Dr. Cathy O’Neil’s Blog (Mathbabe): Cathy O’Neil is the Program Director of The Lede Program at the Columbia University Journalism School. Prior to Columbia, she was a data scientist in the New York startup scene and co-authored the book Doing Data Science. She blogs daily at mathbabe.org, appears weekly on Slate’s The Big Money podcast, and is active in Occupy Wall Street’s Alternative Banking group. She has written extensively on education, and in particular on the use of value-added modeling to evaluate teaching. You can read her value-added posts here.
In this inquiry, we ask whether it is practical to use student achievement test scores as a measure of teacher effectiveness. Is the system viable, and will such a system have a detrimental effect on student learning?
The following are a few articles that were posted on the Art of Teaching Science blog that focus on these questions. You can find many more of these articles here and here.
This is another significant analysis of the use of VAM scores that are being used to make tenure and retention decisions about teachers. If you haven’t read any of Dr. O’Neil’s articles, here is a great one to start with, especially given the Vergara v California tentative decision in Los Angeles.
I think I’m supposed to come away impressed, but that’s not what happens. Let me explain.
Their data set for students scores start in 1989, well before the current value-added teaching climate began. That means teachers weren’t teaching to the test like they are now. Therefore saying that the current VAM works because an retrograded VAM worked in 1989 and the 1990’s is like saying I must like blueberry pie now because I used to like pumpkin pie. It’s comparing apples to oranges, or blueberries to pumpkins.
According to the Foundation Center, the Bill & Melinda Gates Foundation, the Walton Family Foundation, and the Eli & Edythe Broad Foundation are ranked 1, 13, and 38 respectively on the top 100 U.S. foundations by total giving. The total assets of these three foundations as of April 2014 were $37 billion for the Gates Foundation, $1.9 billion for the Walton Foundation, and $1.6 billion for the Broad Foundation.
The total grant making in 2012 for these organizations was:
The Bill & Melinda Gates Foundation $3.18 billion
The Walton Family Foundation $423 million
The Eli & Edythe Broad Foundation $153 million
If you count up the number of people who call the shots in these three foundations, here’s the math:
(Gates x 2) + (Walton x 6) + (Broad x 2) = 10 people
Diane Ravitch assigns the “big three” to the Billionaire Boys Club. No matter how you look at it, these organizations’ money and political influence steer American education reform toward the privatization of public education, and Common Core State Standards-High-Stakes Assessments accountability.
To be sure, there are many other Foundations that give grants to a variety of organizations whose goals merge with the Big Three, but it is the Big Three that dominate the agenda of education reform today.
Education for the People, by the People
In this blog post, I wonder if the deep pockets of just 10 people can be consistent with the ideals of public education. Most of you know that Diane Ravitch published her recent book, Reign of Error: The Hoax of the Privatization Movement and the Danger to America’s Public Schools (Public Library). On one of the end pages of her book she included a 1785 quote by President John Adams that I believe exposes the crux of the problem caused by the influx of money and influence from people such as the Gates, Waltons and Broads. Adams is quoted as saying this:
The whole people must take upon themselves the education of the whole people and be willing to bear the expense of it. There should not be a district of one mile square, without a school in it, not founded by a charitable individual, but maintained at the public expense of the people themselves.
Adams would be shocked by the “charitable” behavior of these 10 people.
The funded organizations that are identified on the Big Three websites are pawns or infantry sent into schools with lots of money, political influence, and carefully laid plans to carry out the aims of the Big Three. Although there are differences and some overlap among those who receive their marching orders from the Big Three, it becomes obvious what the end game is when you learn who is funded. Let’s take a look at the Big Three.
Bill and Melinda Gates Foundation
In an earlier post (Why Bill Gates Defends the Common Core), I reported that more than 1800 “college-ready” projects have been funded by the Gates Foundation over the past five years. Some organizations have been awarded multiple grants, and in some cases, these amounts exceeded $60 million. In the world of Charter Schools, Gates has awarded more than $279 million. In teacher education, Gates has given millions to Teach for America and the New Teacher Project, yet very little in funding to improve teacher education in American universities. In the research I’ve done analyzing the Gates Awarded Grants, it can be estimated that more than $2.3 billion has been allocated to the “college-ready” category.
If you look at the names of organizations that receive Gates awards, you soon discover how education is being shaped: charter schools, temp teacher training, common standards, venture capitalism, and market-based reforms. Figure 1 identifies some of the organizations that have received grants, as well as the amount they garnered over the past five years.
Here are the grant focal points for the Gates Foundation.
When I searched the Awarded Grants site at the Gates Foundation for “charter schools” it returned 134 hits. For example, in 2014, the Pacific Charter School Development, Inc. received an award totaling $3,998,633. They joined a long list of recipients whose total amount came to $279,428,324 (see Figure 1). Gates gives more to support charters than do Walton and Broad combined.
Without question, the Gates Foundation leads all organizations in the U.S. in developing and implanting a common set of standards in public schools. Achieve, Inc., the organization that wrote the Common Core State Standards in Math and Language Arts, and the Next Generation Science Standards, received more than $36 million from Gates. But this is only the tip of the common core iceberg. Finding out the extent of the funding for the common core is not as straightforward as you might think.
Achieve is part of a network of organizations that have spearheaded the drive to set up a common core of subjects in American schools that share the same set of performances for all students. As you can see in Table 1, the Gates Foundation funds projects in five program areas. You will find common core projects in the US Program, Global Policy & Advocacy, and other program areas. For example, the NewSchools Venture Fund has received more than $60 million from the Gates Foundation. As a venture capitalist organization, “their investors are betting hundreds of millions on the digital revolution in the classroom” (NewSchools Venture Fund website, extracted May 29, 2014).
One of the grants NewSchools received from Gates was for more than $10 million “to support the successful implementation of the Common Core State Standards and related assessments through comprehensive and targeted communications and advocacy in key states and the District of Columbia” (Gates Foundation Website, extracted May 29, 2014).
Implementing common core standards is a cornerstone of the Gates Foundation efforts to change American education.
Teacher training is supported by the Gates Foundation through its grants to Teach for America (TFA) and The New Teacher Project (TNTP). Based on my experience and research with alternative certification programs, these programs are simply alternative ways to get people into classrooms, even while lacking professional teaching qualifications.
Is there a similar plan to train élite college students in six weeks in medicine for the Doctor for America (DFA) program, who will be hired for two years as paid doctors in local hospitals and clinics where they will practice medicine, even though they are uncertified? Medical and teaching projects like these set up a pipeline of inexperienced and uncertified college graduates to teach in American schools, and bolster the overstretched medical profession. Students in these programs need to commit two years, and then move up or out of the system.
TNTP is a step-child of TFA having been founded by Michelle Rhee, who was a TFA “graduate.” TFA has net assets of $419,098,314 for fiscal year 2012. It receives 76% of its money from grants and gifts, and 22.3% from government grants.
In a separate investigation of TFA’s and TNTP’s role in the Race to the Top (RT3), I looked at Georgia’s RT3 Program and discovered that these organizations were receiving $15.6 million and $9.1 million to supply uncertified teachers in the greater Atlanta area, where there is no shortage of certified teachers.
The language used to describe this effort is tied up in the notion of increasing the pipeline of effective educators.
Increase the pipeline of effective teachers through partnership with Teach for America in Atlanta Public Schools, Clayton County, DeKalb County and Gwinnett with the first class of new TFA recruits beginning in the school year 2011-2012. (Funding included in section E project 24: $15,600,000).
A separate line in the budget points to the same kind of arrangement with The New Teacher Project, which will provide new teachers in Savannah, Augusta, and Southwest Georgia, for $7,568,395.
Although these two organizations provide a small share of teachers to American public schools, that the Gates Foundation and the Race to the Top programs support them is troubling. There is already legislation that supports redefining a certified teacher to include teachers who have received minimal education and no classroom experience. In areas where experienced teachers are clearly more successful, Gates and even the U.S. Department of Education (ED) ignore the research on teacher effectiveness.
What about the Medical program? DFA doesn’t exist, does it? But I wonder if such a program would be accepted by the medical profession and the local community?
The Gates Foundation in its funded Measures of Effective Teaching (MET) theorized that it was going to be easy to identify effective teaching, especially with the use of video tapes and student test scores. As John Thompson pointed out on Anthony Cody’s Living in Dialogue website over on Education Week,
The MET is a $45 million component of the “teacher quality” movement which studies test scores, teacher observations, and student survey data to isolate the elements of effective teaching. That’s great. But the MET’s assumptions about the outcomes they anticipated have been the basis for Arne Duncan’s test-driven policies — which require test scores to be a “significant part” of teacher evaluations in order for states to receive waivers for NCLB. Then, as evidence was gathered, preliminary reports noted problems with using test score growth for evaluations. The MET has continued to affirm the need for value-added (VAM) as a necessary component of their unified system of using improved instruction to drive reform, even as it reported disappointing findings.
Even though researchers have shown (using Gates Foundation data from the MET Study) that there are very low correlations between teachers’ instruction with state standards and state and alternative assessments, policy makers ignore such data and believe that teachers should be evaluated using student test scores. This study reported that there is no evidence of relationships between curriculum alignment and composite measures of teacher effectiveness. The researchers also reported a lack of relationship between VAM scores and both Danielson’s Framework for Teaching (used to measure teacher classroom behavior) and Tripod (student surveys).
One of the groups that Gates funds is the National Council on Teacher Quality (NCTQ). Since 2009, NCTQ has received more than $11 million in grants. The name of this organization is an oxymoron, yet with millions in funding from Gates, NCTQ publishes biased reports on teacher effectiveness and teacher education. In an earlier post I showed that NCTQ reporting is nothing short of junk science, yet here we have the billionaire funding such nonsense.
And then the Colorado Legacy Foundation has received more than $20 million to carry out the Common Core State Standards (CCSS) and pursue teacher evaluation systems using student test score growth.
The Walton Family Foundation
The Walton Family Foundation made grants totaling $423 million in 2013. According to the Walton Family Foundation website, its purpose in funding is to “infuse competitive pressure into America’s K-12 education system by increasing the quantity and quality of school choices available to parents, especially in low-income communities.”
The Walton Family Foundation funds school projects that shape public policy, lead to the creation of “quality schools,” and improve existing schools. The California Charter Schools Association and the Alliance for School Choice were the top two recipients of grants from Walton in 2013. Coming in third and fourth were The New Teacher Project and Teach for America.
The focus of funding of the Walton Foundation is school choice and parental choice (parent trigger) as policies supporting charter schools.
The Eli & Edythe Broad Foundation
The Broad funding was $153 million in 2013. The Broad Foundation, just like Gates and Walton, accuses public schools of being in distress. They all use the same statistics to claim that American students are not able to compete for jobs in a global market, and that corporations can’t find the “workers” who possess the skills needed to fill their positions. The Broad Foundation highlights the value of competition by giving various “Broad Prizes.” The Broad Prize and the Broad Prize for Public Charters are annual competitions among applicants.
The Broad Foundation also supports its Broad Residency in Urban Education and the Broad Superintendents Academy.
Each of these strategies is very much like the model used by Teach for America and The New Teacher Project. These are part-time training programs that train college graduates in five weeks to be full-time teachers.
The Broad programs train people to be principals and superintendents who, according to many writers, tend to be confrontational with teachers and their unions, and have no problem closing schools and then turning around and opening schools managed by charter companies.
The Broad Foundation funds more than fifty organizations in four larger categories, as listed below. I’ve also included two funded projects or organizations representative of each grouping.
Leadership: Broad Center for the Management of School Systems, KIPP Foundation
Institutions: Charter School Growth Fund, NewSchools Venture Fund
The corporate reform funded by Gates, Walton and Broad is a cobweb of organizations that has snared public schools by means of an accountability system that uses student achievement scores as the bottom line. The web also includes organizations whose goal is to shape policy by writing and rewriting state laws that benefit vouchers, choice, charters, and teacher evaluation.
There was an article today in the Atlanta Journal-Constitution that really got my dander up. The article, written by AJC blogger Maureen Downey, was entitled Grading on a curve. The article was about teacher evaluation systems. Downey’s article focused on classroom observation systems, indicating that only 22% of teachers will be evaluated with student test scores. This is the first error in the article. In Georgia, for instance, 50% of each teacher’s evaluation will be based on student test scores. And again, that’s every teacher. Even teachers who do not teach courses in which standardized tests are used will be evaluated by how all teachers in a school do.
She then reviews the study published by the Brookings Institution which examined teacher observation systems in four school districts. As she reports, teacher observation systems seem to be biased in favor of teachers teaching high-performing students, and unfair to teachers teaching low-performing students.
But then she cites Sandi Jacobs of the National Council on Teacher Quality (NCTQ). We’ve debunked the NCTQ here, as have many other bloggers, and so I was disappointed that Downey would refer to NCTQ in her article. The last group to go to for advice on improving teaching is the NCTQ. Most of the reporting done by the NCTQ is junk science.
Twin Methods of Teacher Evaluation
The twin methods that are put together to form a teacher evaluation system are absurd, muddled, and unreasonable. Even more, the assumptions which are used to evaluate teachers are rooted in false claims about what effective teaching is, and how one knows when effective teaching happens. At its stupidest level, bureaucrats who sit in front of their computer screens, and who’ve consulted with agronomists, believe they have the algorithms that will actually measure, in some quantifiable way, just how much a teacher adds to student academic achievement.
Framework of Classroom Teaching
Then there is the group that believes it is possible to quantify teacher effectiveness by observing teachers in action in the classroom. One of the common systems to measure teacher classroom effectiveness is Danielson’s Framework for Teaching. The mantra from the Danielson group is that the “framework” is comprehensive and coherent and based on those aspects of teaching (behaviors) that promote student learning.
But here is the thing: the “framework” reduces teaching to 22 components and 76 smaller elements organized into four domains of teaching (Planning and Preparation, the Environment, Delivery of Services, and Professional Responsibilities). This is a classic example of reductionism. And for reductionist researchers, the use of this kind of framework of teaching makes sense.
The Danielson Framework is not a new idea. For decades, educational researchers have developed and implemented dozens of “instruments” to see and quantify teacher behavior. Most of these instruments were analytic: teacher behavior was divided into categories or clusters of performance, as is done in the Danielson Framework.
And of course the most extreme reductionist measure is the quantification of learning by means of achievement test scores. Using the same logic used in evaluating teacher performance, student performance is measured using standardized tests which are based on content components and smaller elements that are organized into domains of content in fields such as science, mathematics, social studies, and English/language arts.
Teachers should not be scored or rated as if they were in a competition to win or lose something. The use of these systems borders on a sinister view of teachers, who for some reason need to be poked, prodded, and measured. If you read the value-added technical documents, such as this one in Florida, you will probably have a nervous breakdown. And you will wonder how the algorithms used have anything to do with teaching. Figure 1 is the algorithm used to figure VAM (for Florida teachers).
Using VAM scores is part of a larger plan to use standardization and high-stakes test accountability to privatize public education, and reduce teaching to “teaching to the test.”
We have to keep in mind that public education has become a place where the locus of success is based on student achievement test scores. The system of accountability is like mildew, a thin layer covering any sense of creativity and innovativeness, that results in a smell much like the fungus that created it in the first place.
Student achievement gains, according to VAM folks, can be traced back to a teacher’s contribution using an algorithm that most people who work at state departments of education cannot explain to teachers. They have no idea how to use the results of VAM to help teachers improve. All these scores do is offer a story for newspapers, which list the VAM scores of teachers and leave them out to dry.
Dr. Cathy O’Neil, a mathematician and professor at Columbia University, where she is director of the Lede Program at the Journalism School, blogs at mathbabe (exploring and venting about quantitative issues). I’ve read her blog regularly for the past year, and she’s brought me into a world that has pushed me into areas that I know very little about, but because of the way she writes, I’ve found a number of her ideas applicable to this blog.
Her interest in teaching is quite clear on her blog. If you search “teaching” on her blog you will find articles that are very pertinent to this blog post. She has a collection of articles on VAM, and discussions of VAM from a perspective that is crucial to efforts to fight against the use of VAM, let alone shaming teachers by posting VAM scores publicly. She discussed in one of her articles how detestable it was when New York City teachers’ VAM scores were released.
But read what she said about the nature of the VAM score. What is “underneath” the VAM score? What does it mean? She writes:
Just to be clear, the underlying test doesn’t actually use a definition of a good teacher beyond what the score is. In other words, this model isn’t being trained by looking at examples of what is a “good teacher”. Instead, it derived from another model which predicts students’ test scores taking into account various factors. At the very most you can say the teacher model measures the ability teachers have to get their kids to score better or worse than expected on some standardized tests. Call it a “teaching to the test model”. Nothing about learning outside the test. Nothing about inspiring their students or being a role model or teaching how to think or preparing for college.
A “wide margin of error” on this value-added model then means they have trouble actually deciding if you are good at teaching to the test or not. It’s an incredibly noisy number and is affected by things like whether this year’s standardized tests were similar to last year’s. (O’Neil, C. Teaching scores released, mathbabe, Feb. 26, 2012, extracted May 19, 2014.)
We are making a serious mistake to condone the use of VAM, and I was disturbed by Maureen Downey’s article’s lack of any criticism of VAM. She did point out some of the shortcomings of using classroom observation systems, but here is the thing. A classroom visit, especially by a colleague or someone who is informed about providing feedback to help improve instruction, is a much more valuable tool to improve teaching. The concern I have for the way classroom observation systems are being used is that these observations will result in a calculation or a number which will be used with VAM scores to rate, grade, and judge teachers.
This needs to be prevented.
If we want to improve teaching, then it needs to be accomplished in a collaborative, collegial way. Teachers, to take risks with their teaching style and methods, need to trust the people who visit their classroom to see them at work.
Trust. How can teachers trust the system when it uses complicated algorithms to rate them based on dubious academic achievement (standardized) tests that may or may not “test” the content that was part of their curriculum?
The system of teacher evaluation that is prevalent in most states is absurd.
Could the movement to Opt Out of high-stakes testing be the third strike against using high stakes testing to rate teachers? In an earlier post, two studies were reviewed that cast doubt on the use of VAM scores (which are based on student achievement scores) and classroom observation systems to rate teachers.
There is a movement that begins in the homes of children whose parents have had it with their children being subjected to tests that not only create high anxiety, but are a dubious snapshot of student learning.
In Marietta, Georgia, where I live, a courageous family decided not to allow their children in elementary school to take the state CRCTs two weeks ago. Met by the police claiming that they would be trespassing if they entered the school and their children did not take the CRCTs, the family went home, and stood their ground against the Marietta School District.
School officials use scare tactics claiming that there is no provision for parents to opt their children out of exams. But, there is no provision preventing parents from opting out. So, in Marietta, the parents were told their kids didn’t have to take the CRCTs.
In the article mentioned at the top of this post, Meg Norris was quoted as saying:
Georgia parents have been told they must withdraw their child from school if they do not wish them tested. Georgia parents have been told they will be brought up in front of tribunals, sent to court, referred to DFACS for keeping children at home. Children have been left out of parties and humiliated in front of their classmates.
But as you will see as you read ahead, there is support out there for parents who are brave enough to weather the resistance they will get, as the Marietta family did.
Edy Chamness, a former teacher, and parent in Austin, Texas, and professor Julie Westerlund founded the Texas chapter of the Opt Out Movement. I came in contact with Chamness and Westerlund when I reached out to Joyce Murdock Feilke to find out about what she called “psychological abuse” created by the state-wide obsession with high-stakes testing in an Austin elementary school where she was a school counselor.
Joyce reported her observations to authorities in the state and district and the Austin American-Statesman, but in the end her concerns were dismissed by the superintendent (Dr. Meria Carstarphen, Atlanta’s new superintendent). You can read Joyce’s report here.
Edy Chamness and Julie Westerlund were professional colleagues of Joyce’s and provided additional compelling evidence that children are being used in an experiment, rooted in punitive classical conditioning, to meet the goals of the school district, which are to increase student test scores and eventually graduation rates.
Edy Chamness wrote to me that Joyce and Julie are not exaggerating when they describe the horrible bullying practices used in Austin ISD. She says:
I was banned from my kids’ elementary school for sharing information about the Opt Out Movement with parents. Our son started a new school this year and the administration seems decent. Unfortunately, everything at the school is totally geared toward test prep and practice testing. The vast majority of our son’s assignments are worksheet packets. Tons of work is assigned as nightly homework; most of it is skills-based instruction and memorization. The two elementary schools in my neighborhood, Mills & Kiker, are horrible. No enrichment programs, no literature-based reading instruction, no games for reinforcement or outdoor education. NOTHING but practice tests and worksheets. The only thing that matters is test scores.
Now Director of Texas Parents Opt Out of State Tests, Edy Chamness is one of many leaders of the Opt Out movement. The Texas Opt Out group is very active, and offers a great deal of support to parents and teachers who acknowledge that the testing craze needs to be stopped, and that one way to do this is to refuse to participate in any high-stakes tests.
Testing has become perverse. It doesn’t have to be. But we’ve gone too far, and the testing that is being mandated is unnecessary. One Georgia Department of Education official said that we needed testing to find out how schools are doing, because, after all, the state is spending a lot of money on public education.
If we want to know how our schools are doing, there are better ways to answer that question than forcing millions of American students to spend their school days either preparing for tests, or sitting in front of a computer or at a desk to answer questions written by hired guns for corporations that charge at least $30 per student to do this!
The state of Georgia has just agreed to pay McGraw-Hill $110 million to develop a new battery of tests (to be called the GMAP, Georgia Measures of Academic Progress) to measure the academic progress of students on the Common Core.
Maybe the students should be paid for participating in these experiments. Maybe students should unionize, much like college students are doing who happen to be athletes.
We already have the “data” we need to assess the performance of schools. In 1969, the National Assessment of Educational Progress (NAEP), also called The Nation’s Report Card, was established, and it has been using low-stakes testing in reading, writing, math, science and other subjects to assess student achievement.
Not only do we have access to reports as they come out, but the NAEP has conducted long-term studies for us to look at, and more recently designed and carried out the Trial Urban District Assessment, bi-annual tests in reading, writing, math and science. Using sampling and stratification, the NAEP selects a sample of students from public and private schools large enough to estimate performance by state. Only about 60 students per school selected are tested. No student has to sit for the entire test. Each student takes part of the test, and scores are aggregated to set up averages.
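The sampling idea described above can be sketched in a few lines of Python. This is a simplified illustration, not NAEP's actual design, which also uses stratification and weighting. The school size (600 students), the number of test blocks (4), and the score distributions are all hypothetical values chosen for the example. The sketch samples 60 students, gives each one a single block of the test, and averages the results.

```python
import random
from statistics import mean


def estimate_school_average(students, n_sampled, n_blocks, rng):
    """Estimate a school's average from a sample, one test block per student.

    students: dict of student id -> list of per-block scores, i.e. what
    each student would score on each block if tested on it.
    """
    chosen = rng.sample(sorted(students), n_sampled)
    by_block = {b: [] for b in range(n_blocks)}
    for i, sid in enumerate(chosen):
        block = i % n_blocks                  # rotate blocks across students
        by_block[block].append(students[sid][block])
    # Average within each block, then across blocks.
    return mean([mean(scores) for scores in by_block.values()])


# Hypothetical school: 600 students, a test split into 4 blocks.
rng = random.Random(1)
N_BLOCKS = 4
students = {}
for sid in range(600):
    ability = rng.gauss(70, 10)               # made-up score scale
    students[sid] = [ability + rng.gauss(0, 5) for _ in range(N_BLOCKS)]

true_average = mean([mean(scores) for scores in students.values()])
estimate = estimate_school_average(students, 60, N_BLOCKS, rng)
print(f"everyone tested on everything: {true_average:.1f}")
print(f"60 students, one block each:   {estimate:.1f}")
```

The two printed numbers can be compared directly: the first requires every student to sit the whole test, while the second uses one tenth of the students and a quarter of the test for each of them.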
By the way, the state of Georgia could use the same research design methods as NAEP to assess students in the state. In fact, Georgia could use the CRCT, or the new GMAP, in a low-stakes approach, which would not only tell the state whether it’s getting its money’s worth out of teachers and students, but also tell the school districts about their performance. And there would be no need to test every kid.
Revoking high-stakes testing would be one of the most important decisions we could make in the lives of students and their families, and the educators who practice in our public schools. Banning tests, throwing them out, eliminating them, whatever you wish to call it, will open the door to more innovative and creative teaching, and an infusion of collaborative and problem-solving projects that will really prepare students for career and college.
Making kids endure adult anger is not what public education is about. Why in the world are we so angry and willing to take it out on K-12 students? Why do we put the blame on children and youth, and if they don’t live up to a set of unsubstantiated and unscientific standards and statistics, we take it out on teachers?
The best thing for students is to throw the bums (tests) out. The next best thing will be for teachers, because without standardized test scores, there will be no way to calculate VAM scores as a method to evaluate teachers.