Fordham Institute Review of New Science Standards: Fealty to Conservatism & Canonical Science

Fordham Institute has published its review of the draft of the Next Generation Science Standards. Achieve wrote the new science standards. Achieve also wrote the math and reading/language arts common core standards.

Unchanging fealty to a conservative agenda and a canonical view of science education restricts and confines Fordham’s review to an old school view of science teaching.  Science education has rocketed past the views in two reports issued by Fordham about science education standards.

The Fordham reviewers use a strict content (canonical) view of science education and dismiss any reference to the scientific practices (science processes) and pedagogical advances such as constructivism, and inquiry teaching.  Many of the creative ideas that emerged in science teaching in the past thirty years represent interdisciplinary thinking, the learning sciences, deep understanding of how students learn science, and yes, constructivism.

These creative ideas are not reflected in Fordham’s analysis of science teaching and science curriculum.

I have also studied and reviewed the draft of the Next Generation Science Standards and have written about them here, and here.

The Framework

In 2011, the Carnegie Corporation funded the National Research Council’s project A Framework for K-12 Science Education (Framework). The Framework was published last year, and it is being used by Achieve as the basis for writing the Next Generation Science Standards (Science Standards).

These two documents, The Framework and the Science Standards, will decide the nature of science teaching for many years to come.

In this post, I’ll focus on how Fordham has responded to these two reports.

In late 2011, the Carnegie Corporation provided financial support to the Fordham Institute to review the NRC Framework. The Fordham report was a commissioned paper (Review of the National Research Council’s Framework for K-12 Science Education), written by Dr. Paul Gross, Emeritus Professor of Biology. The Gross Report was not a juried review, but was written by one person, who appears to have an ax to grind, especially with the science education research community, as well as those who advocate science inquiry, STS, or student-centered ideology. Indeed, in his view, the only good standard is one that is rigorous, and clearly content and discipline oriented.

I’ve read and reviewed the Fordham review of the Framework, and published my review here. Here are some excerpts from my review.

Grade: B. In general, Dr. Gross, as well as Chester E. Finn, Jr. (President of the Fordham Foundation), are reluctant to give the Framework a grade of “A” and instead give the NRC’s thick report a grade of “B”.

Rigor.  Rigor is the measure of depth and level of abstraction to which chosen content is pursued, according to Gross. The Framework gets a good grade for rigor and limiting the number of science ideas identified in the Framework. The Framework identifies 44 ideas, which according to Gross is a credible core of science for the Framework.  The evaluator makes the claim that this new framework is better on science content than the NSES…how does he know that?

Practices, Crosscutting Concepts & Engineering. The Fordham evaluation has doubts about the Framework’s emphasis on Practices, Crosscutting Concepts, and Engineering/Technology Dimensions. For example, Gross identifies several researchers and their publications by name, and then says:

These were important in a trendy movement of the 1980s and 90s that went by such names as science studies, STS (sci-tech studies), (new) sociology or anthropology of science, cultural studies, cultural constructivism, and postmodern science.

For some reason, Gross thinks that science-related social issues and the radical idea of helping students construct their own ideas are not part of  mainstream science education, when indeed they are. Many of the creative Internet-based projects developed over the past 15 years have involved students in researching issues that have social implications.  The National Science Foundation made huge investments in creative learning projects.

Gross also claims that the NRC Framework authors “wisely demote what has long been held the essential condition of K-12 science: ‘Inquiry-based learning.’” The report does NOT demote inquiry, and in fact devotes much space to discussions of the Practices of science and engineering, which is another way of talking about inquiry. In fact, inquiry can be found in 71 instances in the Framework. Gross and the Fordham Foundation make the case that Practices and Crosscutting ideas are accessories, and that only the Disciplinary Core Ideas of the Framework should be taken seriously. This will result in a set of science standards that are based on only one-third of the Framework’s recommendations.

Gross cherry picks his resources, and does not include a single research article from a prominent research journal in science education.  Dr. Gross  could have consulted science education journals found here, here, here or here.  If he did, he might have found this article: Inquiry-based science instruction—what is it and does it matter? Results from a research synthesis years 1984 to 2002.  Journal of Research in Science Teaching (JRST) published this article in 2010. Here is the abstract of the research study:

Various findings across 138 analyzed studies show a clear, positive trend favoring inquiry-based instructional practices, particularly instruction that emphasizes student active thinking and drawing conclusions from data. Teaching strategies that actively engage students in the learning process through scientific investigations are more likely to increase conceptual understanding than are strategies that rely on more passive techniques, which are often necessary in the current standardized-assessment laden educational environment.

The Fordham review of the Framework is not surprising, nor is their review of the first draft of the standards.  Fordham has its own set of science standards that it uses to check other organizations’ standards such as the state standards.  They used their standards as the “benchmark” to check all of the state science standards, and concluded that only 7 states earned an A.  Most of  the states earned an F.

If you download Fordham’s report here, scroll down to page 208 to read their science standards, which they call content-specific criteria.

I analyzed all the Fordham standards against Bloom’s Taxonomy in the Cognitive, Affective and Psychomotor domains. Using Bloom’s Taxonomy, 52% of the Fordham science standards were rated at the lowest level. Twenty-eight percent of their standards were at the comprehension level, 10% at application, and only 10% above analysis. No standards were found for the affective or psychomotor domains.

All I am saying here is that Fordham has its own set of science standards, and I found them inferior to most of the state science standards, the National Science Education Standards (published in 1996), as well as the NAEP science framework.  You can read my full report here.  I gave Fordham’s science standards a grade of D.

Fordham Commentary on the New Science Standards

Given this background, we now turn our attention to Fordham’s Commentary & Feedback on Draft I of the NGSS.

The Fordham reviewers, as they did when they reviewed the NRC Framework for Science, felt the standards’ writers “went overboard on scientific and engineering practices.” From their point of view, crosscutting concepts and scientific and engineering practices create challenges to those who write standards.

Fordham science standards are reminiscent of the way  learning goals were written in the 1960s and 1970s.   Writers used one of many behavioral or action verbs such as define, describe, find, diagram, classify, and so forth to construct  behavioral objectives.  The Fordham standards were written using this strategy. Here are three examples from their list of standards:

  • Describe the organization of matter in the universe into stars and galaxies.
  • Identify the sun as the major source of energy for processes on Earth’s surface.
  • Describe the greenhouse effect and how a planet’s atmosphere can affect its climate.

The Fordham experts raised concerns about the way standard statements are written.  As shown in the examples from the draft of the NGSS, the standards integrate content with process and pedagogical components.

I agree with the Fordham reviewers that the Next Generation Science Standards are rather complex. Shown in Figure 1 is the “system architecture” that Achieve used for all of the standards. Figure 1 shows just four performance expectations (read standards), and their connection to practices, core ideas, and crosscutting concepts. Every science standard in the Achieve report is presented in this way.

Figure 1. System Architecture of the NGSS. Source: http://www.nextgenscience.org/how-to-read-the-standards, extracted May 12, 2012

The Fordham reviewers gave careful attention to each standard statement, and indeed in their report they include many examples of how the standards’ writers got the content wrong or stated it in such a way that was unclear.

But the Fordham reviewers take exception to the science education community’s research on constructivism. In their terms, science educators show fealty to constructivist pedagogical theory. To ignore constructivism, or to think that science educators have an unswerving allegiance to this well established and researched theory, is quite telling. To me it indicates that Fordham holds a traditional view of how students learn. It tells me that these reviewers have boxed themselves into a vision of science literacy by looking inward at the canon of orthodox natural science. Content is king.

To many science teachers and science education researchers, an alternative vision gets its meaning from the “character of situations with a scientific component, situations that students are likely to encounter as students.” Science literacy focuses on science-related situations (see Douglas Roberts’ chapter on science literacy in the Handbook of Research on Science Education).

The Fordham reviewers recommend that every standard be rewritten to cut “practices” where they are not needed. They also want independent, highly qualified scientists who have not been involved in the standards writing attempt to check every standard. The National Science Teachers Association, comprised of science teachers and scientists, is quite qualified to do this, and indeed the NSTA sent their recommendations to Achieve last week.

I would agree with the Fordham group that the next version of the standards should be presented in a clearer way, and easily searchable.  I spent a good deal of time online with the first draft, and after a while I was able to search the document, but it was a bit overwhelming.

Finally I would add that when you check the Fordham analysis of the new standards, the word “basic” jumps out.  Near the end of their opinion report, they remind us that the science basics in the underlying NRC Framework were sound.  What they are saying is that the NGSS writers need to chisel away anything that is not solid content from the standards.

One More Thing

Organizations such as Achieve and the Fordham Institute believe the U.S. system of science and mathematics education is performing below par, and if something isn’t done, then millions of students will not be prepared to compete in the global economy. Achieve cites achievement data from PISA and NAEP to make its case that American science and mathematics teaching is in horrible shape, and needs to be fixed.

The proposed solution to this problem, which is meant to make the American dream possible for all citizens, is to write new science (and mathematics) standards. One could argue that quality science teaching is not based on authoritarian content standards, but on much richer standards of teaching that form the foundation of professional teaching.

Whatever standards are agreed upon, they ought to be based on a set of values that are rooted in democratic thinking, including empathy and responsibility. Professional teachers above all else are empathic in the sense that teachers have the capacity to connect with their students, to feel what others feel, and to imagine oneself as another and hence to feel a kinship with others. Professional teachers are responsible in the sense that they act on empathy, and that they are not only responsible for others (their students, parents, colleagues), but themselves as well.

The dual forces of authoritarian standards and high-stakes testing have taken hold of K-12 education through a top-down, corporate-led enterprise. This is very big business, and it is having the effect of thwarting teaching and learning in American schools. A recent study by Pioneer Institute estimated that states will spend at least $15 billion over the next few years to replace their current standards with the common core. What will it cost to implement new science standards?

In research that I have reported here, standards are barriers to teaching and learning. In this research, the tightly specified nature of successful learning performances precludes classroom teachers from modifying the standards to fit the needs of their students. And the standards are removed from the thinking and reasoning processes needed to achieve them. Combine this with high-stakes tests, and you have a recipe for disaster.

According to the 2012 Brown Center Report on American Education, the Common Core State Standards will have little to no effect on student achievement. Author Tom Loveless explains that neither the quality nor the rigor of state standards is related to state NAEP scores. Loveless suggests that if there were an effect, we would have seen it, since all states had standards in 2003.

The researchers concluded that we should not expect much from the Common Core. In an interesting discussion of the implications of their findings, Tom Loveless, the author of the report, cautions us to be careful about not being drawn into thinking that standards represent a kind of system of “weights and measures.” Loveless tells us that standards’ reformers use the word—benchmarks—as a synonym for standards. And he says that they use it too often. In science education, we’ve had a long history of using the word benchmarks, and Loveless reminds us that there are not real, or measured benchmarks in any content area. Yet, when you read the standards—common core or science—there is the implication we really know–almost in a measured way–what standards should be met at a particular grade level.

Loveless also makes a strong point when he says the entire system of education is “teeming with variation.” Creating a set of common core standards simply will not succeed in reducing this variation between states or within a state.

As the Brown report suggests, we should not depend on the common core or the Next Generation Science Standards having any effect on students’ achievement.

What do you think?  Is Fordham’s view of science education consistent with your ideas about science teaching?

 

Curious Relationship Between NAEP Science Framework and the Next Generation Science Standards

There is a very curious relationship between the NAEP Science Framework and the Next Generation Science Standards that I discovered while studying the NGSS and wanting to find out what was emphasized on the NAEP Science Assessments. I had read on an NSTA list that I receive that someone had questioned the distribution of questions on the NAEP Science Assessment. They had reported that the questions were distributed as follows: 30% Physical Science; 30% Life Science and 40% Earth Science. I also wondered about that and went to the NAEP Website to find out.

I ended up at the NAEP 2009 Science Framework publication which you can download here.   The Commissioner of Education Statistics, who heads the National Center for Education Statistics in the U.S. Department of Education, is responsible by law for carrying out the NAEP project. The National Assessment Governing Board, appointed by the Secretary of Education but independent of the Department, sets policy for NAEP and is responsible for developing the framework and test specifications that serve as the blueprint for the assessments.

In 2009, the NAEP published the latest framework for science.  I admit that I had not read this document until today.  But I had read the NRC’s Framework for K-12 Science, and I have studied the Next Generation Science Standards.

We all know that the NGSS were developed by Achieve, which released the first public version of the new standards last week. We also know that the standards were based on the NRC’s Framework for K-12 Science that was developed and written by a 17-member task force set up by the NRC with funding from the Carnegie Corporation.

A Curious Similarity

What I found curious was how similar the NAEP document describing the rationale and design of the science framework, which is used to develop science assessment items, was to the NRC Framework for K-12 Science Education and the Next Generation Science Standards framework. The NAEP science framework was developed prior to the development of the NRC’s science framework, and of course before the Next Generation Science Standards.

Table 1 compares and contrasts the NAEP Framework, the NRC Framework for K-12 Science Education, and the Next Generation Science Standards. The language used in all three documents is very similar, especially when defining key ideas including content or disciplinary core ideas, science and engineering practices, and crosscutting concepts and ideas. When the NRC Framework was published in 2011, there was great fanfare over the new framework, and over the ideas that had led the NRC’s committee to design the framework along three lines, shown in Table 1: disciplinary core ideas, crosscutting concepts, and science and engineering practices.

These three big ideas would be used to develop the Next Generation Science Standards.

It turns out that NAEP had already developed its new science framework, which it uses to design and write test items for its science assessments. It is very similar to the NRC Framework, or maybe it would be better to say that the NRC Framework is similar to the Assessment framework. There is also an overlap in some key members of the planning, steering, and writing committees on the NAEP and NRC projects. How might this influence the direction taken by each of the frameworks? I am not questioning the credentials of the members of any of these groups. I am only wondering about the overlap.

There is another similarity among the three projects, and that is the lack of K-12 educators in the planning processes, and the writing and development process.  I couldn’t find one teacher on the NAEP committees.  There were no teachers on the NRC Framework committee.  And one of the members of the NRC was later hired by Achieve to head up the development of the Next Generation Science Standards.

Columns: NAEP Science Framework, 2009; NRC Framework for K-12 Science Education, 2011; Next Generation Science Standards, 2012

Science Content or Disciplinary Core Ideas

NAEP Science Framework, 2009:
  • Physical Science
  • Life Science
  • Earth and Space Science

NRC Framework for K-12 Science Education, 2011:
  • Earth and Space Sciences
  • Life Sciences
  • Physical Sciences
  • Engineering, Technology & Applications

Next Generation Science Standards, 2012:
  • Earth and Space Sciences
  • Life Sciences
  • Physical Sciences
  • Engineering, Technology & Applications

Crosscutting Content or Concepts

NAEP Science Framework, 2009:
  • Interaction of science content and practices of science

NRC Framework for K-12 Science Education, 2011:
  • Patterns
  • Cause and Effect
  • Stability
  • Systems and System Models
  • Energy and Matter
  • Interdependence

Next Generation Science Standards, 2012:
  • Patterns
  • Cause and Effect
  • Stability
  • Systems and System Models
  • Energy and Matter
  • Interdependence
  • Influence

Science Practices

NAEP Science Framework, 2009:
  • Identifying Science Principles
  • Using Science Principles
  • Using Scientific Inquiry
  • Using Technological Design

NRC Framework for K-12 Science Education, 2011:
  • Asking Questions and Defining Problems
  • Planning and Carrying Out Investigations
  • Using Mathematics and Computational Thinking
  • Constructing Explanations and Designing Solutions
  • Engaging in Argument from Evidence
  • Obtaining, Evaluating, and Communicating Information

Next Generation Science Standards, 2012:
  • Asking Questions and Defining Problems
  • Planning and Carrying Out Investigations
  • Using Mathematics and Computational Thinking
  • Constructing Explanations and Designing Solutions
  • Engaging in Argument from Evidence
  • Obtaining, Evaluating, and Communicating Information

Framework Development

NAEP Science Framework, 2009:
  • WestEd and CCSSO, AAAS, NSTA

NRC Framework for K-12 Science Education, 2011:
  • National Research Council

Next Generation Science Standards, 2012:
  • NRC, Achieve, NSTA, AAAS

Table 1. Comparison of Science Frameworks Designed by NAEP, NRC, & Achieve

NAEP is a low-stakes test, and is perhaps one of the most reliable measures we have of student performance in science, math and reading. However, the fact that the government assessment framework preceded the NRC and Achieve frameworks raises questions about what this sequence means, and what we are to expect in the future.

There is strong evidence that a national high-stakes science assessment will be developed and required of all states that adopt the NGSS. If you don’t believe me, then you should check to see what the record shows about the Common Core State Standards’ national computer-based assessment. A recent study by the Pioneer Institute reported that implementing the Common Core in the states will cost more than $15 billion, and that does not include testing.

There is also evidence that some influential groups have been involved in all three enterprises, including the Thomas Fordham Foundation, Achieve, Inc., the U.S. Department of Education, the Council of Chief State School Officers, and the National Governors Association. There were no requests for proposals for any of this work in the documents that I have read. In each case, organizations were appointed by boards, or foundations, or councils. There was no attempt in these developments to build in any kind of research and evaluation of the projects. And of course, this is really odd in that these groups are creating products (tests and standards) that will hold others accountable: teachers, students, administrators, schools, and school districts.

Do you think that the relationship among these three groups is curious?  Or is it simply of little concern, and we should move on?

 

 

Do Standards Impede Science Teaching and Learning?

Over the next few weeks I am going to focus on standards- and test-based educational reform with an eye toward opening a conversation about how standards and high-stakes tests might actually impede science teaching and learning.

We begin by examining the science standards, which have been an integral part of science education since the publication of the National Science Education Standards by the National Research Council in 1996. Then in 2011, the National Research Council published A Framework for K-12 Science Education: Practices, Crosscutting Concepts and Core Ideas. The Framework is now being used by Achieve, Inc. to develop the Next Generation Science Standards. Achieve, Inc. is the company that wrote the Common Core State Standards in K-12 English Language Arts and Mathematics.

Standards as Reform

Contemporary standards-based reform emerged during the early 1990s, and some science education researchers raised questions about the nature of the standards setting process leading up to the publication of the NSES in 1996. In a research document (read it here) with funding from the National Science Foundation, Linn, diSessa, Pea, and Songer contrasted the 1990s science standards reform effort with the NSF science curriculum projects of the 1960s. Remarkably, these researchers noted that the development of the NSES conjured up images of the 1960s reform in which the primary goal was to bring modern scientific ideas into the curriculum, focusing on the fundamentals of the discipline. Their view was that the curriculum projects of the 1960s reform and the standards reform of the 1990s were both being designed with future scientists as the target audience. They indicated in their research that they were concerned that science standards were heading in the same direction.

Can Inquiry Continue to be a Primary Goal of Science Teaching?

Can science as inquiry continue to be a primary goal of science teaching in the burgeoning culture of common standards, and high-stakes testing?

This is a question that I raised about a year and a half ago. I am returning to the question now since the National Research Council released its report entitled A Framework for K-12 Science Education. The question is not “should we have standards.” Instead, the question and concern is that the development of standards appears to be driven by high-stakes assessments, resulting in an educational system monitored by test makers and data analysts.

We live in a liberal democracy, and as such, education is a fundamental aspect of helping citizens become literate in not only language and reading, but in mathematics, social studies, art, music, and science. Our society is diverse and multicultural, and the recent movement to move American education toward a one-size-fits-all system seems to be the antithesis of education in a democracy.

In a liberal democracy we need an educational system that is decentralized, and that puts the responsibility to choose and develop curriculum and methods of teaching into the hands of able professional teachers at the local level. One of the hallmarks of liberal democracy has been the freedom accorded citizens to develop and express widely varying ideas and inventions. At the heart of this is creativity, and the development of lifelong aspirations for inquiry.

Admit it or not, we have a real problem here. Science teaching should encourage messing about, handling equipment and materials, measuring and estimating, wondering and hypothesizing, and asking all sorts of questions. Could these attributes eventually be lost to science teaching because of the collateral effect of high-stakes testing in which teachers are almost forced to teach to the test?

Tweakers and Tinkerers 

In a recent article in the New Yorker, entitled The Tweaker: The Real Genius of Steve Jobs, Malcolm Gladwell explores whether Steve Jobs was a large-scale visionary and inventor, or a tweaker. According to Walter Isaacson’s biography, Steve Jobs was more of a tweaker, although he clearly created a large-scale visionary company. He was a tweaker in the sense that, as Gladwell says, Jobs tweaked technologies that existed, like the mouse and icons on the screen, and created the Macintosh Computer. His genius, according to Gladwell and Isaacson, was his editorial ability, not inventive ability. He worked with what was in front of him, critiqued it, played with it, and refined it.

Science teachers have historically tried to provide opportunities for students to be “tweakers.” Years ago teachers referred to this as messing about in science class. Students were given materials and equipment (things) and were encouraged to construct, test, probe, and experiment without superimposed questions or instructions. In today’s constructivist model of science learning, “messing about” is integrated across several phases of learning.

In his article, Gladwell asks why the industrial revolution began in England rather than in other countries such as France or Germany. He suggests that in 18th Century Britain there was a large population of skilled engineers and artisans, resourceful and creative persons:

who took the signature inventions of the industrial age and tweaked them—refined and perfected them, and made them work (Gladwell, 2011).

In Britain, during this time, the culture supported the kind of thinking and doing that led to inventions and modifications that improved upon existing technologies. A kind of tinkering, or messing about in attempts to improve upon existing technologies.

Steve Jobs created an environment in Apple in which tinkering was his way of making new products that changed people’s lives.  This kind of thinking was not done in isolation, but was done in teams who had responsibility for solving problems and creating new devices.

In the same way, science teachers believe that hands-on and inquiry teaching are crucial in helping students understand the nature of science.  Hands-on activities provide the opportunity for students to work together in small teams to explore, invent, construct, and yes, learn to be tweakers and tinkerers.

Is the high-stakes testing environment being fostered in America’s schools today conducive to students and teachers who might be tweakers and tinkerers?

The New Science Standards

I know you realize that nearly all of the states have embraced and adopted the Common Core State Standards in English/language arts and mathematics. These standards also include specific Literacy Standards in History/Social Studies, Science, and Technical Subjects. The Common Core Standards were written by Achieve, a company that was created by the National Governors Association, and funded by private benefactors such as the Gates Foundation, the Carnegie Corporation of New York, and the Broad Foundation.

Last year the Carnegie Corporation provided funds to the National Research Council to create a new framework for K-12 science education. The framework was published last summer, and it is being used to write a new set of science standards for American schools, K-12. Guess who will write these standards? You’re correct: Achieve!

You are probably thinking that I am a conspiracy theorist. Actually, I am not, but it seems to me that the fact that one not-for-profit company has such power in developing standards for American schools has to make you wonder.

The Next Generation Science Standards have not yet been written. But the process has begun. Achieve announced that they have already recruited writers, and are going to work with the National Science Teachers Association (NSTA) and the American Association for the Advancement of Science (AAAS). NSTA appears to be working hand-in-hand with Achieve, and their website provides updates on the new framework, and a summary of the key ideas of the new framework. There is no evidence suggesting that NSTA has questions about the new framework.

Can Inquiry Flourish?

Can inquiry flourish in an environment in which singular sets of standards in the content areas will be written and then adopted by every state? Will the various states adopt the standards as they have in English/literacy and mathematics? Most likely they will. They will because there is pressure not only from groups like Achieve and the National Governors Association, but also from the U.S. Department of Education. You probably know that when the Race to the Top Request for Proposals was released, states were encouraged to adopt the Common Core State Standards. If they didn’t, their proposal would not fare as well: they would lose points on the evaluation of their proposal.

The problem with a single set of standards in a diverse culture such as ours is the imminent development of a common set of high-stakes science assessments. Funds are already available for the development of national high-stakes tests.

And this is the problem.

Some educators think that the standards movement is part of the assessment movement in which student achievement scores will be used to evaluate not just the students, but more dangerously the effectiveness of teachers and schools. Data analysts have convinced corporate and government leaders that they can indeed measure teacher effectiveness using the so-called “value added” approach, in which they claim to nail down how much of a student’s achievement progress from the beginning to the end of the year can be attributed to their teacher.

[Science] teachers will have to continue to navigate through this maze of new standards and assessments. They will have to prepare their students for bubble tests, but they will also want to instill in their students a sense of wonder, and help their students understand how science can influence their lives.

Science teaching needs to focus on the lived experiences of students, and engage them in inquiry and experimental ways of knowing that relate to their personal lives. Allowing common standards to determine what is taught, and how, is quite the opposite of a liberalizing and democratic approach to education.

 

References:

Gladwell, Malcolm, The Tweaker: The Real Genius of Steve Jobs, The New Yorker, November 14, 2011.

High-Stakes Testing = Negative Effects on Student Achievement

In earlier posts, I have advocated banning high-stakes testing as a means of making significant decisions about student performance (achievement in a course, passing a course through end-of-year tests, being promoted, and graduating from high school). I suggested this because the research evidence does not support continuing the practice in American schools.

The research reported here sheds light on high-stakes tests and shows why they should not be used to make decisions about students’ achievement or teachers’ performance, or to impose sanctions on or offer rewards to schools.

Research from the National Academies

The Board on Testing and Assessment of the National Research Council issued a report entitled Incentives and Test-Based Accountability in Education. The report concludes that using test-based (high-stakes testing) incentives has not created positive effects on student achievement. It says that school incentives such as those of the No Child Left Behind Act produce some of the largest effects among the programs studied, but only in elementary mathematics, and the improvements were minuscule. Exit exams, which are used in 25 states and are typically given in each of the major content areas at the end of the year, have actually decreased graduation rates.

What do tests measure?

We rely on tests to inform us about academic learning, but we fail to consider not only what tests don’t measure, but also the limitations on what they do measure. We get ourselves in real trouble when we think that a score on an NCLB test, or a CRCT-type test, is actually a good measure of student academic learning. We get ourselves in further trouble when we believe that the score represents what students know, and we dig the hole deeper when we think that changes in student test scores (positive or negative) can be attributed to the performance of teachers.

The authors of the National Research Council report on Incentives and Test-Based Accountability in Education had this to say about tests:

The tests that are typically used to measure performance in education fall short of providing a complete measure of desired educational outcomes in many ways. This is important because the use of incentives for performance on tests is likely to reduce emphasis on the outcomes that are not measured by the test.

Collateral Effects

One collateral effect of testing is that the curriculum becomes narrow as teachers “teach to the test” and consequently stray from activities that might be interesting or, in the case of science, involve students in project-based work or hands-on collaborative activities. Projects and hands-on activities take away time needed to drill students on the content of the test or, in the case of elementary schools, time to teach math and reading/language arts.

One of the constraints of test-based incentives is that many goals of teaching, such as curiosity, persistence, and the ability to solve problems or to collaborate, are not measured by bubble tests. Yet these might be as important as the content that is tested.

But as the Board on Testing and Assessment report reveals, the tests that we use do not do a great job of measuring performance even in the tested areas such as science, mathematics, English, or social studies. Since the tests in these areas are based on the outline of content represented in the content standards of each subject, there simply is not enough time to test students on each content standard.

Constructing a Test is not So Simple

For example, in the National Science Education Standards for grades K-8, there are seven major areas of standards (Science as Inquiry, Physical Science, Life Science, Earth and Space Science, Science and Technology, Science in Personal and Social Perspectives, and History and Nature of Science). In these seven areas there are 64 content standards just for grades K-8. If you then look at the details of the Science Standards for any one of the 64 content standards, you find at least three fundamental concepts and principles that underlie the standard. So, at the least, we have 64 × 3 = 192 concepts to measure on a test. What is a test maker to do?

National Science Education Standards "Content Standards" Grades 5-8

If you were to develop a test for Grade 5, you would need to develop a domain chart that included about 96 concepts. If you wrote one test item for each concept, then the test would be 96 items long. But that’s too long a test, so the number of items must be reduced to, say, 30 or 40, meaning that not all of the content standards are measured. And what is worse, we are only using one test item to “measure” performance on each standard. Wouldn’t it be more valid if we used two or more test items to “measure” each standard? If we do, then we end up testing fewer standards. So high-stakes tests fall short in measuring the standards in most content areas, yet we continue to use them to make decisions about student, teacher and school performance.
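
To make the arithmetic concrete, here is a minimal sketch in Python. It rests on two simplifying assumptions that are mine, not the report's: each test item measures exactly one concept, and every concept that is sampled gets the same number of items. The function name `coverage` is hypothetical, and the figure of 96 concepts is the estimate used above.

```python
def coverage(num_concepts, test_length, items_per_concept=1):
    """Return (concepts covered, share of all concepts covered).

    Assumes each item measures exactly one concept and every
    sampled concept receives the same number of items.
    """
    covered = min(num_concepts, test_length // items_per_concept)
    return covered, covered / num_concepts


GRADE5_CONCEPTS = 96  # estimate used in this post, not an official count

if __name__ == "__main__":
    for test_length in (30, 40):
        for items_per_concept in (1, 2):
            covered, share = coverage(
                GRADE5_CONCEPTS, test_length, items_per_concept
            )
            print(
                f"{test_length} items, {items_per_concept} per concept: "
                f"{covered} of {GRADE5_CONCEPTS} concepts ({share:.0%})"
            )
```

Under these assumptions, a 40-item test with one item per concept touches 40 of the 96 concepts, about 42 percent. Using two items per concept for better validity cuts that to 20 concepts, about 21 percent.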

As the National Research Council report suggested:

…tests also fall short in measuring performance in the tested subjects and grades in important ways. Some aspects of performance in many tested subjects are difficult or even impossible to assess with current tests. As a result, tests can measure only a subset of the content of a tested subject.

We can define what a test measures, but in the current era of high-stakes testing, the tests that are being used to measure performance in any subject (math, science, English) do not represent the full scope of the curriculum, and have been shown to be ineffective in increasing student achievement.  End-of-year tests, such as those given in Georgia, are high-stakes tests, and should not be used to determine if a student should graduate.  The evidence is that end-of-year tests actually result in decreasing graduation rates.

Suggestions

The authors of the Incentives and Test-Based Accountability in Education report recommend that, since we do not yet know how to use test-based incentives to produce consistently positive effects, policy makers should support and evaluate alternative models. Furthermore, policy makers should make use of basic research and choose from a number of options. They go on to say that:

We call on researchers, policy makers, and educators to examine the evidence in detail and not to reduce it to a simple thumbs-up or thumbs-down verdict. The school reform effort will move forward to the extent that everyone, from policy makers to parents, learns from a thorough and balanced analysis of each success and each failure.

We hope that policy makers will use the report to put a moratorium on using high-stakes tests to make decisions about students, teachers and schools.
Do you think this will happen? Comment and tell us what you think.