Tag Archives: PARCC

States Need to Get Real on Testing Tradeoffs Before Making Another Big Switch

Just a few years ago, it seemed like most of the country was heading towards common state assessments in math and reading. Two groups of states won federal grant funds to create higher-quality tests; these became the PARCC and Smarter Balanced test consortia. Now, despite the demonstrated rigor and academic quality of those tests, the testing landscape is almost as fractured as it was before, with states pursuing a variety of assessment strategies. Some states in the consortia are still waffling. Others that have left are already scrapping the tests they made on their own with no idea of what they’ll do next.

States should think carefully before going it alone or introducing a new testing overhaul without strong justification. There are some big tradeoffs at play in the testing world, and a state might spend millions on an “innovative” new test from an eager-to-please vendor only to find that it has the same issues as, or worse ones than, the “next generation” tests it tossed aside.


Should Massachusetts PARCC the MCAS? Plus 5 Questions for Other States.

A recent Mathematica study found that new PARCC assessments were statistically no better at predicting which students were ready for college than Massachusetts’ old student assessments (called MCAS). Both tests were slightly superior to the SAT at identifying which students would be successful in college-level courses, but the study should prod all states’ thinking in a few ways:

1. Should we keep trying to build a better mousetrap? If $186 million and five years of work at PARCC* can’t produce something much better than what Massachusetts developed on its own in 1993, why continue this race?

2. If states pick a random test from the cheapest assessment vendor, will they get results even as good as what we’re seeing here? The study also found that cut scores matter. Although the two tests produced results that were statistically indistinguishable, the PARCC cut score did send a stronger signal than the cut score Massachusetts had been using on MCAS. To examine how that’s playing out for their students, all states should be doing studies like this one, and so should the other federally funded assessment consortium, Smarter Balanced (a rough sketch of what such a cut-score comparison looks like follows this list). I suspect the results would be no different, or perhaps worse, than what Mathematica found in Massachusetts. If even the highest-quality standards and the best tests we have at the K-12 level don’t tell us that much about readiness for college, what chance do individual states have of coming up with something better?

3. Related to #2, should states still be coming up with their own high school achievement tests? Why don’t more states opt to use the SAT or ACT** as their high school accountability tests? This study found that PARCC and MCAS had slightly more predictive value than the SAT, but there are trade-offs. The SAT and the ACT are older, shorter, and cheaper than what states typically offer, plus they’re familiar to parents and, unlike a given state’s assessment, SAT and ACT scores are accepted by colleges and universities all across the country. The ACT and SAT have proven themselves to be useful, if slightly flawed, measures of college-readiness. Why do states think they can do better?

4. How much should we value alignment between K-12 academic standards and tests? One objection to just using the SAT or ACT for official state purposes is that they’re not aligned to each state’s academic content standards. But so what? Both PARCC and MCAS are closely aligned to Massachusetts’ state academic standards, but neither one is all that closely aligned to actual student outcomes at Massachusetts colleges and universities.

5. If there’s only a moderate relationship between high school test scores and first-year college GPA (let alone longer-term GPA or college completion rates), why do we rely on these tests alone for accountability purposes? I happen to have a whole paper on this topic, but this study is yet another reminder that if states care about college-readiness, they need to be tracking actual college outcomes, not just test scores.
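To put question #2 in concrete terms, here is a minimal sketch, in Python, of the kind of cut-score comparison a validity study like Mathematica’s runs. Everything in it is illustrative: the simulated scores, the 3.0 GPA benchmark for “college ready,” and the two cut points are assumptions made up for the example, not figures from the study.

```python
# Illustrative sketch only: simulated data standing in for linked records of
# high school test scores and first-year college outcomes.
import numpy as np

rng = np.random.default_rng(0)
n = 5_000

# Simulated scale scores and first-year GPAs, loosely correlated -- the strength
# of that link is exactly what studies like this one try to measure.
test_scores = rng.normal(500, 50, n)
college_gpa = 2.0 + 0.004 * (test_scores - 500) + rng.normal(0, 0.6, n)
college_ready = college_gpa >= 3.0  # one common benchmark for "ready"

def cut_score_signal(scores, ready, cut):
    """Share of students at/above the cut, and below it, who go on to earn a
    B average -- a crude measure of how much information the cut carries."""
    above = scores >= cut
    return ready[above].mean(), ready[~above].mean()

# Two hypothetical cut points on the same test, one more lenient, one stricter.
for label, cut in [("lenient cut", 480), ("strict cut", 530)]:
    hit_above, hit_below = cut_score_signal(test_scores, college_ready, cut)
    print(f"{label} ({cut}): {hit_above:.0%} of passers vs. "
          f"{hit_below:.0%} of non-passers reach a 3.0 GPA")
```

A real study would swap the simulated arrays for students’ actual high school scores linked to their later college records; the point of the sketch is only that where the cut falls changes how strong a signal “passing” sends.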

*Disclosure: The PARCC consortium is a Bellwether client, but I am not involved in the work.

**ACT is a former Bellwether client. It was not part of the Mathematica study, but its correlations with first-year college outcomes are very similar.

Do New Common Core Test Results Tell Us Anything New?

What do new assessments aligned to the Common Core tell us? Not all that much more than what we already knew. There are large and persistent achievement gaps. Not enough students score at high levels. Students who performed well on tests in the past continue to perform well today. In short, while the new assessments may re-introduce these conversations in certain places, we’re not seeing dramatically different storylines.

To see how scores differ in the Common Core era, I collected school-level data from Maine. I chose Maine because it’s a small state with a manageable number of schools, it was one of the 18 states using the new Smarter Balanced test this year, and it has already made school-level data available from the tests given in the spring of 2015.

The graph below compares average math and reading proficiency rates over two time periods. The horizontal axis plots average proficiency rates from 2012-14 on Maine’s old assessments, while the vertical axis corresponds to average proficiency rates in Spring 2015 on the new Smarter Balanced assessments.* There are 447 dots, each representing one Maine public school with sufficient data in all four years. The solid black line represents the linear relationship between the two time periods.

[Figure: Maine school-level proficiency rates, 2012-14 average on the old state assessments (horizontal axis) vs. spring 2015 on Smarter Balanced (vertical axis), with linear fit]

There are a couple of things to note about the graph. The first is that, as has played out in many other places, proficiency rates fell. The average proficiency rate for these schools fell from 64 to 42 percent. While a number of schools saw average proficiency rates from 2012-14 in the 80s and even the 90s, no school scored above 82 percent this year (this shows up as white space at the top of the graph).

Second, there’s a clear linear relationship between the two sets of scores: the correlation between the two periods was .71, which is fairly strong. Schools that did well in the past also tended to do well, on a relative basis, in 2015.
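For readers who want to run the same comparison on another state’s data, here is a minimal sketch of the calculation behind the graph and the correlation above. It assumes a hypothetical CSV of school-level proficiency rates; the file name and the prof_2012 through prof_2015 column names are placeholders, not the layout of Maine’s actual data release.

```python
# Minimal sketch: correlation and linear fit between pre- and post-transition
# proficiency rates, using a hypothetical school-level CSV.
import numpy as np
import pandas as pd

schools = pd.read_csv("state_school_proficiency.csv")  # placeholder file name

# Average the three years on the old test, mirroring the 2012-14 average used
# in the graph, then keep only schools with data in all four years.
schools["old_avg"] = schools[["prof_2012", "prof_2013", "prof_2014"]].mean(axis=1)
schools = schools.dropna(subset=["old_avg", "prof_2015"])

r = schools["old_avg"].corr(schools["prof_2015"])  # Pearson correlation
slope, intercept = np.polyfit(schools["old_avg"], schools["prof_2015"], 1)

print(f"Schools with complete data: {len(schools)}")
print(f"Correlation between the two periods: {r:.2f}")
print(f"Linear fit: new_rate = {slope:.2f} * old_rate + {intercept:.2f}")
```

Averaging the three pre-transition years, as the graph does, smooths year-to-year noise in small schools before comparing against the first year of the new test.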

So what does all this mean?

Why Legislative Words Matter

This one’s for all the aspiring policy wonks.

In Newark, NJ, the superintendent recently attempted to revoke the tenure rights of a group of teachers deemed ineffective. The state has a statute (“TEACHNJ”) of recent vintage permitting such things.

Kind of. Well, at least eventually.

One of the affected teachers contested the district’s decision, and an arbitrator sided with the teacher. It turns out that, in an arbitrator’s estimation at least, the statute technically took effect later than the district contends.

The arbitrator ruled that the statute’s language officially started the evaluations-with-state-mandated-consequences clock in 2013-14, not 2012-13. That means the district has only one annual performance evaluation of the teachers in question, not the two that are needed to invoke the state’s tenure-removal provision. So even though the district’s action comports with the spirit of the state law, this personnel decision was overturned, and the “remedy is reinstatement with full back pay and benefits.”

Because of the exact wording of legislative language, dozens of teachers are either–depending on your worldview–being indefensibly shielded from the law’s clear intent or rightly defended from an illegitimate administrative action.

If this law’s lack of specificity frustrates you, consider Section 5 of this North Carolina statute. The legislature was so concerned that the state board would use its existing statutory and regulatory authority to procure an unpopular testing system (e.g., PARCC or SBAC) that it prohibits the board from acquiring any new assessment system until it is given new, explicit legislative permission to do so. The law goes even further, naming the kinds of tests that would probably be acceptable (e.g., NAEP, SAT, ACT).

This is the endless tug of war between legislative authority and administrative discretion. In the New Jersey case, a district gets its hand rapped for trying to squeeze too much power from what it considers sufficiently permissive language. In the North Carolina case, lawmakers craft uber-specific language to prevent the state school board from using its existing power to act against the legislature’s wishes.