Georgia’s College & Career Ready Performance Index is Not Scientific But is a Media Darling


Figure 1. Are you here to measure me again? Really!

The Georgia Department of Education would have you believe that the College & Career Ready Performance Index is based on scientific research, and is a valid and reliable “index” of school performance.

Each year during the spring, about 1.6 million Georgia students need to be in school so that they can spend hours upon hours being measured with the state’s CRCT, a multiple-choice measurement device. Like the cow in the photo, students, starting at age six, go to school to be measured once again. How much have you grown? the state wants to know. We are beginning to hear from parents about letting the state measure their kids. Some are refusing.

Weights and Measures

The state uses the term “weights” as we do in “weights and measures.” I’ve reported on this blog that corporate reformers, including many state departments of education, actually believe that tests such as the Criterion-Referenced Competency Tests (CRCT) are testing college & career ready skills. How do they know that? They really do not. In fact, the CRCT measures content knowledge in math, reading, language arts, science and social studies. It does not measure communication skills, how to solve problems collaboratively, or how to be innovative. Success in a job and college might be related more to these qualities that don’t figure into the CRCT. Why aren’t the “competencies” that count measured?

Mechanistic Thinking

Another fact you should be aware of is that the state changes the “formula” used to calculate the CCRPI of a school. Last year, 70 points out of 110 were allocated to the “achievement” portion of the index. Because local districts in Georgia thought there was not enough credit given for a school’s “Progress,” or improvement on test scores, the DOE rebalanced the index by adding more points for Progress and decreasing the points allocated for achievement from 70 to 60. There is absolutely no scientific basis for this. It is pure opinion on the part of the DOE staff to reconcile complaints from the field. Perfectly valid, but not scientific.
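To see why a change in weights matters, here is a minimal sketch. It is not the DOE's actual formula: the function name, the lumping of all non-achievement points into a single "other" component, and the school's rates are assumptions made purely for illustration. Only the 110-point total and the 70 and 60 achievement points come from the description above.

```python
# Hypothetical illustration: one unchanged school scored under two weightings.
TOTAL_POINTS = 110  # total points stated in the post


def ccrpi_sketch(achievement_rate, other_rate, achievement_points):
    """Toy index: achievement points plus everything else lumped together.

    The rates are fractions (0.0 to 1.0) of the available points earned.
    This is an assumed simplification, not the state's calculation.
    """
    other_points = TOTAL_POINTS - achievement_points
    return achievement_rate * achievement_points + other_rate * other_points


# A made-up school: strong on achievement, middling on everything else.
school = {"achievement_rate": 0.90, "other_rate": 0.50}

old_score = ccrpi_sketch(**school, achievement_points=70)
new_score = ccrpi_sketch(**school, achievement_points=60)

print(f"70-point achievement weighting: {old_score:.1f}")
print(f"60-point achievement weighting: {new_score:.1f}")
```

In this made-up case the school's total moves from 83 to 79 although nothing about the school changed. The difference comes entirely from the choice of weights.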

But you must keep in mind that the DOE thinks in mechanistic terms. So, when you have a scale of 0 – 100 or 110, they immediately translate numbers to letter grades based on common knowledge. Maureen Downey quoted one DOE administrator who said:

We all know what a 100 is on a test.  If the score is a 65, there are some things that need to be improved.

This kind of thinking limits the way we view schools.  By thinking in traditional ways, such as thinking that a score of 65 means we are not working hard enough, we ignore the ecology of communities and think that performance in school is unrelated to the world around us.

Another truth is that these tests are not needed. They don’t contribute to the improvement of learning in any situation. In fact, there is evidence that without them students would be more successful in school, and teachers would be able to do what they know best: figure out ways to help their students learn to love learning. How can an average score tell you anything about a school’s performance, let alone an individual student’s?

Raising the Bar

The reformers enjoy raising the bar. They seem to like making it more and more difficult for kids to pass in school. The state plays a game, and an unfair one. They keep changing the rules to raise the stakes. It hurts students and their parents. It puts teachers in the middle. For instance, at the 5th grade level the state raised the bar by stating that students must pass 5 core courses (now including reading) and must also pass all CRCT tests. The former CCRPI required passing only 4 core courses and did not require passing CRCT scores for credit. They push the bar up at other levels, too. If you want to find out about other changes, and the rationale for them, you are led on a search to the CCRPI Accountability page, developed by Harter, Cardoza and Reichrath. Don’t look for an easy way to figure out the rationale used in any of this.

Keep in mind that the CRCT is based on Georgia’s content standards. And as some researchers have noted, standards represent a kind of system of “weights and measures.” Tom Loveless of the Brookings Institution tells us that standards reformers use the word “benchmarks” as a synonym for standards. And he says that they use it too often. In science education, we’ve had a long history of using the word benchmarks, and Loveless reminds us that there are no real, or measured, benchmarks in any content area. Yet, when you read the standards, whether common core or science, there is the implication that we really know, almost in a measured way, what standards should be met at a particular grade level.

The Georgia Department of Education, bound by the No Child Left Behind Act, the waiver it received, and Race to the Top, fits the model of using weights and measures to convince us that the scores reported under the banner of CCRPI have scientific meaning.

They do not.

Digging Deeper

If you are willing to look at Excel spreadsheets, you will soon become aware that the state has collected massive amounts of student data that it believes are real, and that it expects us to accept as measures of student performance in reading, English language arts, math, science, and social studies.

Figure 1. CRCT 2012 All Students by State in Reading. How is the “standard” determined?

Examine any grade in the chart. More than 120,000 students at each grade level (3rd – 8th) took the CRCT in reading. According to the state, from 4 to 9 percent of the students did not reach the standard. You can find similar data in English language arts, math, science and social studies.

But here is the question:  How did the state determine the “cut off” score that establishes the standard?  Did they use some mathematical model to do this?  Is there data out there that they consulted to lead them to a number?


There is no formula.  There is no science.  This is pure opinion by the state.  In fact, they can change the standard each year (which they do).

As you look over the newspaper articles that create league tables listing the schools with the highest CCRPI scores and those with the lowest, keep in mind that we really do not know what these scores mean.

Here are some additional pages to consult to enable you to dig deeper into the CCRPI.


Fordham Report on Next Generation Science Standards Lacks Credibility

On January 29, the Thomas Fordham Institute published a report, “Commentary & Feedback on the Next Generation Science Standards” (Commentary). Nine people wrote the report, none of whom are “experts” in the field of science education. Yes, most of them have Ph.D.s in science, but they lack experience in and content knowledge of science education, science curriculum development, and K – 12 classroom science teaching. The lead author of Commentary is Dr. Paul Gross, professor emeritus of life sciences at the University of Virginia.

Amazingly, news and media outlets will quote and not question the Fordham report, as if its authors had the last word on the Next Generation Science Standards in particular and science education in general. They do not have the final answer. In my opinion their answers and comments are flawed.

A Trilobite

Erik Robelen wrote an article today in Curriculum Matters entitled In Science Draft, Big Problems ‘Abound,’ Think Tank Says. The think tank is the Fordham Institute. Robelen reviewed the 70-page report, identifying the criticisms that the Fordham reviewers had about the NGSS. The Fordham group claimed that the authors of the NGSS omitted a lot of what they call “essential content.” They also insist that the practices of science and engineering dominate the NGSS, and claim that basic science knowledge, the goal of science education (again, according to the Fordham group), becomes secondary. In Fordham’s view, the goal of science education is to create a curriculum that is steeped primarily in science content, with little regard to practices, inquiry, and connections to other disciplines.

Fordham Science Standards: Return to the Past

The Fordham review used a set of science standards (called criteria) created by their science experts. They use this list of content goals to judge the worthiness of the NGSS & they used it two years ago when they reported on the state of state science standards.

They also use grades to summarize their opinion of science standards. When they reported on the state science standards, many states failed; that is, they received grades of D and F. They didn’t grade the NGSS standards, but I am sure they will.

I’ve reviewed their standards and analyzed them using Bloom’s taxonomies, and reported the results here. In my analysis, only 10% of the Fordham standards were above the analysis level; 52% were classified at the lowest level in Bloom. There was no mention of the affective or psychomotor domains.

One area that is completely missing from the lists of science content is standards for science inquiry. What is amusing here is that the Fordham authors criticized the states for “poor integration of scientific inquiry.” If any group showed poor integration of inquiry into the standards, it’s the Fordham group. They do not mention one inquiry science outcome or goal, yet they slam the states for not integrating science inquiry into the content of science. They need to get their own house in order before they go around the country laying it on the states, and now the NGSS.

Their standards are quite simply a list of content goals with little regard to the process of science & engineering (practices in the NGSS–inquiry in the 1995 NSES) or connections across disciplines.  They are a real embarrassment to science educators in the context of the research and development in science education over the past 20 years.   I gave their standards a grade of D.

Let me explain. The Fordham group wrote their “science standards” using the same format that was used in the earlier part of the last century. For example, here are a few of the Fordham science standards:

  • Know some of the evidence that electricity and magnetism are closely related (physical science)
  • Trace major events in the history of life on earth, and understand that the diversity of life (including human life) results from biological evolution (life science)
  • Recognize Earth as one planet among its solar system neighbors (earth science)
  • Be able to use Lewis dot structures to predict the shapes and polarities of simple molecules (chemistry)
  • Know the basic structures of chromosomes and genes down to the molecular level (biology)

These are simplistic statements that are juvenile compared to the 1995 National Science Education Standards, and the 2013 Next Generation Science Standards.  Here are some example standard statements from the NGSS:

  • Construct an argument using evidence about the relationship between the change in motion and the change in energy of an object.
  • Collect, analyze, and use data to describe patterns of what plants and animals need to survive.
  • Analyze and interpret data from fossils to describe the types of organisms that lived long ago and the environments in which they lived and compare them with organisms and environments today.
  • Use Earth system models to support explanations of how Earth’s internal and surface processes operate concurrently at different spatial and temporal scales to form landscapes and sea floor features.

The Fordham report is an extensive description of their own content specific and narrow view of what science for children and youth should be.  It was written by people who have little experience in science education, and there is some evidence in their reporting that they have little knowledge of science education research.  Their report is not juried, and there has never been an attempt by Fordham to solicit the opinions of science education researchers or curriculum developers.  It is an in-house report, and that is as far as it should go.

One More Thing

I have written several blog posts that are critical of the standards movement, including the Next Generation Science Standards. You can link to them here, here, here and here. I am not defending the NGSS, but the criteria that Fordham uses to “analyse” the NGSS are not a valid research tool; they lack reliability and validity, two criteria that would make their report believable. As it stands, I cannot agree with their ideas, nor should the NGSS writers consider them in their next stage. Fordham has been pulling the wool over the eyes of policy makers and the media. It’s time to call them out.

There is much to disagree with in their report.  What are your opinions about the Fordham report on the NGSS?

Anthony Cody: Designer of Value-Added Tests a Skeptic About Current Test Mania

Guest Post by Anthony Cody

Follow Anthony on Twitter at @AnthonyCody

Defenders of our current obsession over test scores claim that new, better tests will rescue us from the educational stagnation caused by a test prep curriculum. And one of those new types of tests is an adaptive test, which adjusts the difficulty of questions as students work, so that students are always challenged. This gives a better measure of student ability than a traditional test, and can be given in the fall and spring to measure student growth over the year. This approach is increasingly being used to determine the “value” individual teachers add to their students’ academic ability, which is then used as a significant factor in teacher evaluation — as required by the Department of Education as a condition for relief from No Child Left Behind.

One might expect the designer of these tests to be happy with the many uses now being found for the data they produce. But Jim Angermeyr, one of the architects of the value-added assessment, is not so thrilled. He worked with the Northwest Evaluation Association to develop tests, and more recently served as director of research and evaluation with the Bloomington Public Schools. In this fascinating interview with the Minneapolis Post, he shares some of his concerns as he prepares to retire from the field.

His first concern is the way test scores are being used to rate teachers:

We [test designers] have a healthy respect for error and how to measure it. And always a certain amount of caution when you’re interpreting results.
That caution grows as the groups get smaller, like looking at a classroom instead of a whole school. And that caution grows even more when the stakes increase because increasing the stakes can lead to all kinds of distortions, whether it’s the cheating that goes on in some of schools that you’ve been reading about around the country, or whether it’s just the general over-emphasis on testing to the exclusion of other things.
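
The point about group size can be illustrated with the textbook standard-error formula for a group average, sd / sqrt(n). The sketch below is a simplification and is not the model that value-added systems actually use. The spread of 15 score points and the group sizes are hypothetical values chosen only to show the trend.

```python
import math


def standard_error(sd, n):
    """Standard error of a group's mean score, assuming a simple random sample."""
    return sd / math.sqrt(n)


ASSUMED_SD = 15.0  # hypothetical spread of individual scores, in score points

# Hypothetical group sizes, from one classroom up to a statewide grade level.
groups = [("classroom", 25), ("school", 400), ("state grade level", 120_000)]

for label, n in groups:
    se = standard_error(ASSUMED_SD, n)
    print(f"{label:>18}: n = {n:>7}, standard error = {se:.2f}")
```

Under these assumptions the uncertainty in a single classroom's average is several times larger than in a whole school's, which is why conclusions about an individual teacher call for more caution than conclusions about a school or a state.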

Dr. Angermeyr helps us put testing in its place. He says,

Where the distortion comes in is that you can only test a limited amount of the domain. Even if it’s a domain like mathematics, you can’t cover everything. And so you make assumptions about kids’ skills in that broader domain. Do we have eighth graders who are good readers based on a pretty small sample of questions and items?
Testing professionals know that you’re just sampling the domain and you don’t try to make inferences further than that. But nonprofessionals do that all the time. “American students are 51st in the world in reading.” There are a lot of assumptions that are made before you can get to that conclusion, but people leap right over that.
If I was running the world, I would severely reduce the accountability stakes for tests. I would certainly eliminate things like No Child Left Behind. I would probably take away the current waiver. Even if it looks better, sometimes it’s still really the same wolf in different clothing.
I would do away with standards, to be honest. Even though on paper they sound kind of cool, they assume all kids are the same and they all make progress the same way and move in lockstep. And that’s just not accurate. Standards distort individual differences among kids. And that’s bad.
I would put testing back as a local control issue in school districts. I would take the emphasis off of evaluating and [compensating] teachers. I would put the emphasis on good training for principals and curriculum specialists and teachers on how to interpret data and use it for the kind of diagnosis and assessment that it was originally intended for.

This resonates powerfully with what teachers have been saying since the beginning of No Child Left Behind. It reminds me especially of the work that Doug Christensen led in Nebraska several years back, focused on developing local control of testing and standards.

But Jim Angermeyr is also aware of the power of data to provide our leaders with the ability to simplify complex issues.

It’s politicians and some policymakers who believe tests can do more than they really can. And there’s not enough people stopping and saying wait a minute. When you can summarize a whole bunch of complicated things in a single number, that has a lot of power and it’s hard to ignore, especially when it tells a story that you want to promote. And that’s where it gets really twisted.

There are quite a few of us saying “wait a minute.”   There is a National Resolution on High Stakes Testing that has gathered the support of hundreds of organizations and thousands of individuals.

This message is also echoed in the latest news out of Florida, where the state School Board Association recently adopted a resolution condemning the over-use of high stakes tests, and objecting to their use as the primary basis for evaluating teachers, administrators, schools and districts.

Perhaps if those designing the tests raise their voices alongside those of us who are giving the tests, and the students taking the tests, and their parents as well, we can bring about the change we need.

What do you think? Can we return testing to its proper place as a diagnostic tool? 

Anthony Cody spent 24 years working in Oakland schools, 18 of them as a science teacher at a high needs middle school. He is National Board certified, and now leads workshops with teachers focused on Project Based Learning. With education at a crossroads, he invites you to join him in a dialogue on education reform and teaching for change and deep learning. For additional information on Cody’s work, visit his Web site, Teachers Lead. Or follow him on Twitter.  This post was published with Anthony’s permission.

Fordham Institute Review of New Science Standards: Fealty to Conservatism & Canonical Science

The Fordham Institute has published its review of the draft of the Next Generation Science Standards. Achieve wrote the new science standards. Achieve also wrote the math and reading/language arts common core standards.

Unchanging fealty to a conservative agenda and a canonical view of science education restricts and confines Fordham’s review to an old school view of science teaching.  Science education has rocketed past the views in two reports issued by Fordham about science education standards.

The Fordham reviewers use a strict content (canonical) view of science education and dismiss any reference to the scientific practices (science processes) and pedagogical advances such as constructivism, and inquiry teaching.  Many of the creative ideas that emerged in science teaching in the past thirty years represent interdisciplinary thinking, the learning sciences, deep understanding of how students learn science, and yes, constructivism.

These creative ideas are not reflected in Fordham’s analysis of science teaching and science curriculum.

I have also studied and reviewed the draft of the Next Generation Science Standards and have written about them here, and here.

The Framework

In 2011, the Carnegie Corporation funded the National Research Council’s project A Framework for K-12 Science Education (Framework). The Framework was published last year, and it is being used by Achieve as the basis for writing the Next Generation Science Standards (Science Standards).

These two documents, The Framework and the Science Standards, will decide the nature of science teaching for many years to come.

In this post, I’ll focus on how Fordham has responded to these two reports.

In late 2011, the Carnegie Corporation provided financial support to the Fordham Institute to review the NRC Framework. The Fordham report was a commissioned paper (Review of the National Research Council’s Framework for K-12 Science Education), written by Dr. Paul Gross, Emeritus Professor of Biology. The Gross Report was not a juried review, but was written by one person, who appears to have an ax to grind, especially with the science education research community, as well as those who advocate science inquiry, STS, or student-centered ideology. In his view, the only good standard is one that is rigorous, and clearly content and discipline oriented.

I’ve read and reviewed the Fordham review of the Framework, and published my review here. Here are some excerpts from my review.

Grade: B. In general, Dr. Gross, as well as Chester E. Finn, Jr. (President of the Fordham Foundation), are reluctant to give the Framework a grade of “A” and instead give the NRC’s thick report a grade of “B.”

Rigor.  Rigor is the measure of depth and level of abstraction to which chosen content is pursued, according to Gross. The Framework gets a good grade for rigor and limiting the number of science ideas identified in the Framework. The Framework identifies 44 ideas, which according to Gross is a credible core of science for the Framework.  The evaluator makes the claim that this new framework is better on science content than the NSES…how does he know that?

Practices, Crosscutting Concepts & Engineering. The Fordham evaluation has doubts about the Framework’s emphasis on Practices, Crosscutting Concepts, and Engineering/Technology Dimensions. For example, Gross identifies several researchers and their publications by name, and then says:

These were important in a trendy movement of the 1980s and 90s that went by such names as science studies, STS (sci-tech studies), (new) sociology or anthropology of science, cultural studies, cultural constructivism, and postmodern science.

For some reason, Gross thinks that science-related social issues and the radical idea of helping students construct their own ideas are not part of  mainstream science education, when indeed they are. Many of the creative Internet-based projects developed over the past 15 years have involved students in researching issues that have social implications.  The National Science Foundation made huge investments in creative learning projects.

Gross also claims that the NRC Framework authors “wisely demote what has long been held the essential condition of K-12 science: ‘Inquiry-based learning.’” The report does NOT demote inquiry, and in fact devotes much space to discussions of the Practices of science and engineering, which is another way of talking about inquiry. In fact, inquiry can be found in 71 instances in the Framework. Gross and the Fordham Foundation make the case that Practices and Crosscutting ideas are accessories, and that only the Disciplinary Core Ideas of the Framework should be taken seriously. This will result in a set of science standards that are based on only one-third of the Framework’s recommendations.

Gross cherry-picks his resources, and does not include a single research article from a prominent research journal in science education. Dr. Gross could have consulted science education journals found here, here, here or here. If he had, he might have found this article: Inquiry-based science instruction—what is it and does it matter? Results from a research synthesis years 1984 to 2002. Journal of Research in Science Teaching (JRST) published this article in 2010. Here is the abstract of the research study:

Various findings across 138 analyzed studies show a clear, positive trend favoring inquiry-based instructional practices, particularly instruction that emphasizes student active thinking and drawing conclusions from data. Teaching strategies that actively engage students in the learning process through scientific investigations are more likely to increase conceptual understanding than are strategies that rely on more passive techniques, which are often necessary in the current standardized-assessment laden educational environment.

The Fordham review of the Framework is not surprising, nor is their review of the first draft of the standards.  Fordham has its own set of science standards that it uses to check other organizations’ standards such as the state standards.  They used their standards as the “benchmark” to check all of the state science standards, and concluded that only 7 states earned an A.  Most of  the states earned an F.

If you download Fordham’s report here, scroll down to page 208 to read their science standards, which they call content-specific criteria.

I analyzed all the Fordham standards against Bloom’s Taxonomy in the Cognitive, Affective and Psychomotor domains. Using Bloom’s Taxonomy, 52% of the Fordham science standards were rated at the lowest level. Twenty-eight percent of their standards were at the comprehension level, 10% at application, and only 10% above analysis. No standards were found for the affective or psychomotor domains.

All I am saying here is that Fordham has its own set of science standards, and I found them inferior to most of the state science standards, the National Science Education Standards (published in 1996), as well as the NAEP science framework.  You can read my full report here.  I gave Fordham’s science standards a grade of D.

Fordham Commentary on the New Science Standards

Given this background, we now turn our attention to Fordham’s Commentary & Feedback on Draft I of the NGSS.

The Fordham reviewers, as they did when they reviewed the NRC Framework for Science, felt the standards’ writers “went overboard on scientific and engineering practices.” From their point of view, crosscutting concepts and scientific and engineering practices create challenges to those who write standards.

Fordham science standards are reminiscent of the way  learning goals were written in the 1960s and 1970s.   Writers used one of many behavioral or action verbs such as define, describe, find, diagram, classify, and so forth to construct  behavioral objectives.  The Fordham standards were written using this strategy. Here are three examples from their list of standards:

  • Describe the organization of matter in the universe into stars and galaxies.
  • Identify the sun as the major source of energy for processes on Earth’s surface.
  • Describe the greenhouse effect and how a planet’s atmosphere can affect its climate.

The Fordham experts raised concerns about the way standard statements are written.  As shown in the examples from the draft of the NGSS, the standards integrate content with process and pedagogical components.

I agree with the Fordham reviewers that the Next Generation Science Standards are rather complex. Shown in Figure 1 is the “system architecture” that Achieve used for all of the standards. Figure 1 shows just four performance expectations (read standards), and their connection to practices, core ideas, and crosscutting concepts. Every science standard in the Achieve report is presented in this way.

Figure 1. System Architecture of the NGSS. Source:, extracted May 12, 2012

The Fordham reviewers gave careful attention to each standard statement, and indeed in their report they include many examples of how the standards’ writers got the content wrong or stated it in such a way that was unclear.

But the Fordham reviewers take exception to the science education community’s research on constructivism. In their terms, science educators show fealty to constructivist pedagogical theory. To ignore constructivism, or to think that science educators have an unswerving allegiance to this well-established and researched theory, is quite telling. To me it indicates that Fordham holds a traditional view of how students learn. It tells me that these reviewers have boxed themselves into a vision of science literacy by looking inward at the canon of orthodox natural science. Content is king.

To many science teachers and science education researchers, an alternative vision gets its meaning from the “character of situations with a scientific component, situations that students are likely to encounter as students.” Science literacy focuses on science-related situations (see Douglas Roberts’ chapter on science literacy in the Handbook of Research on Science Education).

The Fordham reviewers recommend that every standard be rewritten to cut “practices” where they are not needed. They also want independent, highly qualified scientists who have not been involved in the standards writing effort to check every standard. The National Science Teachers Association, made up of science teachers and scientists, is quite qualified to do this, and indeed the NSTA sent their recommendations to Achieve last week.

I would agree with the Fordham group that the next version of the standards should be presented in a clearer way, and easily searchable.  I spent a good deal of time online with the first draft, and after a while I was able to search the document, but it was a bit overwhelming.

Finally I would add that when you check the Fordham analysis of the new standards, the word “basic” jumps out.  Near the end of their opinion report, they remind us that the science basics in the underlying NRC Framework were sound.  What they are saying is that the NGSS writers need to chisel away anything that is not solid content from the standards.

One More Thing

Organizations such as Achieve and the Fordham Institute believe the U.S. system of science and mathematics education is performing below par, and if something isn’t done, then millions of students will not be prepared to compete in the global economy. Achieve cites achievement data from PISA and NAEP to make its case that American science and mathematics teaching is in horrible shape, and needs to be fixed.

Their solution to this problem, meant to make the American dream possible for all citizens, is to write new science (and mathematics) standards. One could argue that quality science teaching is not based on authoritarian content standards, but on much richer standards of teaching that form the foundation of professional teaching.

Whatever standards are agreed upon, they ought to be based on a set of values that are rooted in democratic thinking, including empathy and responsibility. Professional teachers above all else are empathic in the sense that they have the capacity to connect with their students, to feel what others feel, and to imagine themselves as another and hence to feel a kinship with others. Professional teachers are responsible in the sense that they act on empathy, and that they are not only responsible for others (their students, parents, colleagues), but for themselves as well.

The dual forces of authoritarian standards and high-stakes testing have taken hold of K-12 education through a top-down, corporate-led enterprise. This is very big business, and it is having the effect of thwarting teaching and learning in American schools. A recent study by Pioneer Institute estimated that states will spend at least $15 billion over the next few years to replace their current standards with the common core. What will it cost to implement new science standards?

In research that I have reported here, standards are barriers to teaching and learning. In this research, the tightly specified nature of successful learning performances precludes classroom teachers from modifying the standards to fit the needs of their students. And the standards are removed from the thinking and reasoning processes needed to achieve them. Combine this with high-stakes tests, and you have a recipe for disaster.

According to the 2012 Brown Center Report on American Education, the Common Core State Standards will have little to no effect on student achievement. Author Tom Loveless explains that neither the quality nor the rigor of state standards is related to state NAEP scores. Loveless suggests that if there were an effect, we would have seen it, since all states had standards in 2003.

The researchers concluded that we should not expect much from the Common Core. In an interesting discussion of the implications of their findings, Tom Loveless, the author of the report, cautions us against being drawn into thinking that standards represent a kind of system of “weights and measures.” Loveless tells us that standards reformers use the word “benchmarks” as a synonym for standards, and he says that they use it too often. In science education, we’ve had a long history of using the word benchmarks, and Loveless reminds us that there are no real, measured benchmarks in any content area. Yet, when you read the standards, whether common core or science, there is the implication that we really know, almost in a measured way, what standards should be met at a particular grade level.

Loveless also makes a strong point when he says the entire system of education is “teeming with variation.” Creating a set of common core standards in order to reduce this variation between states or within a state simply will not succeed.

As the Brown report suggests, we should not depend on the common core or the Next Generation Science Standards having any effect on students’ achievement.

What do you think?  Is Fordham’s view of science education consistent with your ideas about science teaching?


Curious Relationship Between NAEP Science Framework and the Next Generation Science Standards

There is a very curious relationship between the NAEP Science Framework and the Next Generation Science Standards, which I discovered while studying the NGSS and trying to find out what was emphasized on the NAEP Science Assessments. I had read on an NSTA list that I receive that someone had questioned the distribution of questions on the NAEP Science Assessment. They had reported that the questions were distributed as follows: 30% Physical Science, 30% Life Science, and 40% Earth Science. I also wondered about that and went to the NAEP website to find out.

I ended up at the NAEP 2009 Science Framework publication which you can download here.   The Commissioner of Education Statistics, who heads the National Center for Education Statistics in the U.S. Department of Education, is responsible by law for carrying out the NAEP project. The National Assessment Governing Board, appointed by the Secretary of Education but independent of the Department, sets policy for NAEP and is responsible for developing the framework and test specifications that serve as the blueprint for the assessments.

In 2009, the NAEP published the latest framework for science.  I admit that I had not read this document until today.  But I had read the NRC’s Framework for K-12 Science, and I have studied the Next Generation Science Standards.

We all know that the NGSS was developed by Achieve, which released the first public version of the new standards last week. We also know that the standards were based on the NRC’s Framework for K-12 Science, which was developed and written by a 17-member task force set up by the NRC with funding from the Carnegie Foundation.

A Curious Similarity

What I found curious was how similar the NAEP document, which describes the rationale and design of the science framework used to develop science assessment items, was to the NRC Framework for K-12 Science Education and the Next Generation Science Standards framework. The NAEP science framework was developed prior to the development of the NRC’s science framework, and of course before the Next Generation Science Standards.

Table 1 compares and contrasts the NAEP Framework, the NRC Framework for K-12 Science Education, and the Next Generation Science Standards. The language used in all three documents is very similar, especially when defining key ideas including content or disciplinary core ideas, science and engineering practices, and crosscutting concepts and ideas. When the NRC Framework was published in 2011, there was great fanfare over the new framework, and over the ideas that had led the NRC’s committee to design the framework along three lines, shown in Table 1: disciplinary core ideas, crosscutting concepts, and science and engineering practices.

These three big ideas would be used to develop the Next Generation Science Standards.

It turns out that NAEP had already developed its new science framework, which it uses to design and write the items for its science assessments. It is very similar to the NRC Framework, or maybe it would be better to say that the NRC Framework is similar to the assessment framework. There is also an overlap in some key members of the NAEP and NRC planning, steering, and writing committees. How might this influence the direction taken by each of the frameworks? I am not questioning the credentials of the members of any of these groups. I am only wondering about the overlap.

There is another similarity among the three projects, and that is the lack of K-12 educators in the planning processes, and the writing and development process.  I couldn’t find one teacher on the NAEP committees.  There were no teachers on the NRC Framework committee.  And one of the members of the NRC was later hired by Achieve to head up the development of the Next Generation Science Standards.

Science Content or Disciplinary Core Ideas

NAEP Science Framework, 2009:
  • Physical Science
  • Life Science
  • Earth and Space Science

NRC Framework for K-12 Science Education, 2011:
  • Earth and Space Sciences
  • Life Sciences
  • Physical Sciences
  • Engineering, Technology & Applications

Next Generation Science Standards, 2012:
  • Earth and Space Sciences
  • Life Sciences
  • Physical Sciences
  • Engineering, Technology & Applications

Crosscutting Content or Concepts

NAEP Science Framework, 2009:
  • Interaction of science content and practices of science

NRC Framework for K-12 Science Education, 2011:
  • Patterns
  • Cause and Effect
  • Stability
  • Systems and System Models
  • Energy and Matter
  • Interdependence

Next Generation Science Standards, 2012:
  • Patterns
  • Cause and Effect
  • Stability
  • Systems and System Models
  • Energy and Matter
  • Interdependence
  • Influence

Science Practices

NAEP Science Framework, 2009:
  • Identifying Science Principles
  • Using Science Principles
  • Using Scientific Inquiry
  • Using Technological Design

NRC Framework for K-12 Science Education, 2011:
  • Asking Questions and Defining Problems
  • Planning and Carrying Out Investigations
  • Using Mathematics and Computational Thinking
  • Constructing Explanations and Designing Solutions
  • Engaging in Argument from Evidence
  • Obtaining, Evaluating, and Communicating Information

Next Generation Science Standards, 2012:
  • Asking Questions and Defining Problems
  • Planning and Carrying Out Investigations
  • Using Mathematics and Computational Thinking
  • Constructing Explanations and Designing Solutions
  • Engaging in Argument from Evidence
  • Obtaining, Evaluating, and Communicating Information

Framework Development

NAEP Science Framework, 2009:
  • WestEd and CCSSO, AAAS, NSTA

NRC Framework for K-12 Science Education, 2011:
  • National Research Council

Next Generation Science Standards, 2012:
  • NRC, Achieve, NSTA, AAAS

Table 1. Comparison of Science Frameworks Designed by NAEP, NRC, & Achieve

NAEP is a low-stakes test, and is perhaps one of the most reliable measures we have of student performance in science, math and reading. However, the fact that the government assessment framework preceded the NRC and Achieve frameworks raises questions about what this sequence means, and what we are to expect in the future.

There is strong evidence that a national high-stakes science assessment will be developed and required of all states that adopt the NGSS. If you don’t believe me, then you should check to see what the record shows about the Common Core State Standards’ national computer-based assessment. A recent study by the Pioneer Institute reported that implementing the Common Core in the states will cost more than $15 billion, and that does not include testing.

There is also evidence that some influential groups have been involved in all three enterprises, including the Thomas Fordham Foundation, Achieve, Inc., the U.S. Department of Education, the Council of Chief State School Officers, and the National Governors Association. There were no requests for proposals for any of this work in the documents that I have read. In each case, organizations were appointed by boards, or foundations, or councils. There was no attempt in these developments to build in any kind of research and evaluation of the projects. And of course, this is really odd given that these groups are creating products (tests and standards) that will hold others accountable: teachers, students, administrators, schools, and school districts.

Do you think that the relationship among these three groups is curious?  Or is it simply of little concern, and we should move on?