Just a few years ago, it seemed like most of the country was heading towards common state assessments in math and reading. Two groups of states won federal grant funds to create higher-quality tests; these became the PARCC and Smarter Balanced test consortia. Now, despite the demonstrated rigor and academic quality of those tests, the testing landscape is almost as fractured as it was before, with states pursuing a variety of assessment strategies. Some states in the consortia are still waffling. Others that have left are already scrapping the tests they made on their own with no idea of what they’ll do next.
States should think carefully before going it alone or introducing a new testing overhaul without strong justification. There are some big tradeoffs at play in the testing world, and a state might spend millions on an “innovative” new test from an eager-to-please vendor only to find that it has the same issues as the “next generation” tests the state tossed aside, or worse ones.
But annual testing is only half the story. That’s because Alexander’s bill doesn’t just offer two statewide testing options for policymakers to fight about. It also offers a separate testing option for districts on top of the state choices. And although education wonks are up in arms over the merits of door #1 vs. door #2 for states, most have, unfortunately, ignored the giant local testing loophole that is behind door #3.
Through it, districts could opt out of statewide testing and use their own tests instead, regardless of whether Congress chooses door #1 or door #2. But the real kicker is that this loophole isn’t actually new at all. Alexander’s draft bill just makes it far easier for districts to take advantage of–and abuse–existing flexibility.
Districts would only need state approval that their local assessments meet the same federal requirements with which state tests comply. And given the increasing number of districts pushing back on state testing, door #3 would be an irresistible option for many, even as it undermines the comparability of data between schools for evaluation and accountability; states’ abilities to provide technical assistance, support, and professional development to districts; and state investments in new assessment systems aligned to college- and career-ready standards.
The American Federation of Teachers (AFT) and the Center for American Progress (CAP) have released a joint set of principles for ESEA reauthorization. They call for preserving statewide annual testing requirements for students, but they would base school-level accountability only on tests taken once per grade span—once in elementary school, once in middle school, and once in high school.
Like the Education Trust, we think this is a bad idea. Grade-span accountability solves none of the problems of our current system while making other problems worse. Namely:
It doesn’t address concerns about over-testing. Students could be taking the same number of tests as they have in the past, particularly if districts don’t reduce the number of duplicative and unnecessary local tests. CAP has rightly cited these local tests as the root of the problem, but this proposal would not reduce the number of federally mandated tests.
Rather than decreasing the stakes on standardized tests, the AFT/CAP proposal would amplify them. Under their plan, a 5th grader would no longer be taking tests that reflect just on the 5th grade. His or her results would be the basis on which the entire school was judged. How, exactly, does this help “de-link [academic standards] from high-stakes tests,” as AFT President Randi Weingarten suggested a year ago?
It makes it even harder to focus on specific subgroups. NCLB held schools accountable for every subgroup that had a sufficient number of students (called the minimum “n-size”). But under the CAP/AFT proposal, a school’s 5th grade African-American, ELL, or SWD groups could be too small to meet the minimum n-size and the whole school’s disadvantaged students could go uncounted. This may sound wonky and technical, but it becomes a pretty huge issue even at relatively small n-sizes (such as 10 or 20 students). Arne Duncan has estimated that hundreds of thousands of students were invisible to state accountability systems because of n-size issues. CAP has praised states in the past for lowering their n-sizes, but their plan to have fewer students “count” toward a school’s accountability rating would mean less attention on important subgroups of students.
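To make the n-size arithmetic concrete, here is a rough sketch in Python. Everything in it is hypothetical: the enrollment counts are invented, the minimum n-size of 20 is just one of the values mentioned above, and it assumes that subgroup counts under annual testing are pooled across a school’s tested grades while grade-span accountability counts only the single accountable grade.

```python
# Hypothetical numbers for illustration only; not real enrollment data.
MIN_N_SIZE = 20  # assumed state minimum n-size

# Invented count of English Language Learners in each tested grade
# of one elementary school.
ell_students_by_grade = {3: 8, 4: 9, 5: 7}


def counts_toward_accountability(group_size, min_n=MIN_N_SIZE):
    """Return True if the subgroup is large enough to be counted."""
    return group_size >= min_n


# Annual testing (assumption: grades 3-5 are pooled at the school level).
annual_pool = sum(ell_students_by_grade.values())

# Grade-span accountability (assumption: only 5th grade results count).
grade_span_pool = ell_students_by_grade[5]

print(annual_pool, counts_toward_accountability(annual_pool))          # 24 True
print(grade_span_pool, counts_toward_accountability(grade_span_pool))  # 7 False
```

In this made-up school, the same 24 students are visible to the accountability system when all tested grades are pooled, and invisible when only the 5th graders count.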
We already have anecdotes about teachers who prefer to avoid tested grades and subjects. They may prefer teaching in 2nd grade, where there are no required standardized tests, to teaching in 3rd grade, where there are. But it’s tough to avoid the current tests altogether because they’re given in 3rd through 8th grade. Grade-span testing would make it even tougher to attract teachers into those few areas with much higher stakes. Who wants to be a 5th grade teacher when they might be responsible for their entire school? In most places, they won’t even earn any extra money for all the added pressure! Moreover, this is exactly the kind of policy the AFT previously opposed for teacher evaluations. A year ago, Weingarten wrote: “In Florida, the system went completely haywire, giving teachers value-added scores for students they had never taught.” If it’s not okay for educator accountability, why is it okay for school accountability?
Standardized tests are often criticized for merely reflecting student demographics. While states and districts have been slow to implement accountability systems that incorporate student growth, with annual statewide testing, we at least had a hope of shifting attention to how much progress students make over time. CAP and the AFT once shared this hope. Yes, not all that long ago AFT advocated for an ESEA that “judges school effectiveness—the only valid and fair basis for accountability—by measuring the progress that schools achieve with the same students over time.” With longer gaps between tests that count for accountability purposes, we’re more likely to lean even more heavily on raw test scores, measures that are highly correlated with student demographics.
Under this plan, students and families would still get a sense of how much progress they’re making. That’s important, but it’s odd to then turn around and suggest that states and school districts should ignore this same information for determining school progress. As CAP’s 2011 NCLB recommendations suggest, “Measuring and reporting student data is not sufficient to improve our nation’s schools. Congress should take several steps to ensure schools act on that data to boost student outcomes.” We assume they did not mean several steps backwards.
AFT and CAP pitch their proposal as targeting interventions to schools with large achievement gaps. That’s true: it would identify schools with gaps. But, ironically, it would give no credit to schools that are actually closing those achievement gaps. CAP used to support annual gap-closing goals. But now, schools with large concentrations of economically disadvantaged and minority students, English Language Learners, or students with disabilities would all be penalized unfairly, worse than they are under NCLB.
Ultimately, this plan would move us closer to how other countries do testing: fewer tests with much higher stakes. Rather than having regular check-ups on student progress, with relatively low stakes on those results, we’d have much higher stakes attached to a smaller number of test scores. Fortunately, AFT and CAP have already told us why this is a bad idea.
No matter the final outcome, one thing is certain: the new Congress has energized the debate over ESEA reauthorization. In the span of a weekend, numerous organizations articulated key principles for overhauling No Child Left Behind, including state education chiefs, civil rights organizations, and the nation’s second-largest teachers union, the AFT.
The big news item here is Secretary Duncan’s “line in the sand”–keeping the requirement for students to be tested statewide in reading and math annually in grades 3-8 and once in high school. But what sets Duncan’s remarks apart from the statements released over the weekend isn’t testing, but how strongly he defended other potential federal responsibilities in a new ESEA, including requirements for states to:
adopt college- and career-ready standards;
continue producing annual information for families about their child’s learning, and the learning environment and results for their schools as a whole;
maintain school accountability systems that include consequences for schools where students don’t make academic progress; and
improve teacher preparation programs, and establish teacher evaluation systems that include evidence of students’ learning.
Duncan also highlighted ways the federal government could be even more active in promoting opportunity, such as resource accountability to ensure that low-income and minority kids are not shortchanged when it comes to course access, effective teachers, and fiscal resources; new support for innovation and research that helps schools continuously improve; and an expanded role within ESEA to help states deliver high-quality preschool.
In defending a robust federal role, Secretary Duncan even co-opted President Bush’s talking point by calling out “the soft bigotry of ‘it’s optional.’” That’s not just a great punch line. It also revealed much more about the politics of reauthorization, the confusing and convoluted federal education policy landscape, and the prospects of this particular effort to rewrite NCLB.
If you follow education news, politics, and social media, it’s clear that testing is having a moment. I was surprised it wasn’t listed alongside Taylor Swift as a nominee for Time magazine’s 2014 Person of the Year. Everyone–policymakers, unions, state leaders, local administrators, teachers, parents, you name it–seems to agree that the amount of testing and its role in America’s schools and classrooms merit reconsideration. But the momentum of this “over-testing” meme has overshadowed the fact that testing policy is complicated. And when the field talks about “over-testing,” it’s often not talking about the same kinds of tests or the same set of issues.
To help clarify and elevate our over-testing conversation (because it’s here to stay), here are four questions to ask, with considerations to weigh, when deciding whether testing is indeed out of control–and evaluating the possible options to change it.