I really am curious: Why are policy makers, and some educators, so deeply convinced of the value of value-added measures (VAMs), or “student growth” measures, that they are willing to set them up as arbiters of teachers’ work and worth? Maybe you can tell me why VAMs are getting such cred.
The question persists because the evidence of flaws in these systems keeps mounting, as does expert advice against relying on them in anything but the most guarded fashion.
I do understand the appeal of the basic logic, which can be cartooned sort of like this: Let’s model an input-output system. Measure a student’s score before she goes into the box, use that score to predict what her score should be on the way out, then measure her actual score when she emerges. A big enough difference between actual and predicted allows us to say the box did a good thing. A smaller difference justifies our saying that the box didn’t do such a good thing. If what’s in the box is a teacher, then we know what value that teacher added for that student. Repeat for other students passing through the same box. Voila! We know something about the teacher, who is the factor in the box!
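The cartoon above can be sketched in a few lines of code. This is purely a toy illustration, not any agency's actual model: the teachers, scores, and noise levels are all invented, and real VAMs use far more elaborate statistical machinery. The core move, though, is just this: predict the outgoing score from the incoming one, and credit (or blame) the teacher for the leftover residual.

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented data: 200 students, each passing through one of four "boxes".
n = 200
pre = rng.normal(70, 10, n)                      # score going into the box
teacher = rng.integers(0, 4, n)                  # which hypothetical teacher
true_effect = np.array([2.0, 0.0, -1.0, 1.0])    # the value each box "adds"
post = pre + true_effect[teacher] + rng.normal(0, 8, n)  # noisy outcome

# Predict what each post-score "should have been" from the pre-score alone
# (ordinary least-squares line), then treat the residual as the box's doing.
slope, intercept = np.polyfit(pre, post, 1)
predicted = slope * pre + intercept
residual = post - predicted

# A teacher's "value added" is the mean residual across her students.
vam = np.array([residual[teacher == t].mean() for t in range(4)])
print(np.round(vam, 2))
```

Notice that everything the model did not measure (the noise term here, or in real life: peer effects, home life, test-day luck) lands in the residual and gets attributed to the teacher.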
Except it turns out we don’t know nearly as much as we think, and not nearly as reliably. The National Association of Secondary School Principals (NASSP) has published (here) a very useful overview of the research, as part of their recommendation about how VAMs may be used and how they should not be. This follows a very thoughtful statement by the American Statistical Association (here). The bottom line is that VAMs are unstable and unreliable from year to year, are subject to “noise” from a lot of different sources, and are best used as a small part of a more comprehensive and coherent program for teacher (and school) evaluation.
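The year-to-year instability that the research describes is easy to see in a simulation. Again, the numbers here are invented for illustration: thirty hypothetical teachers with genuinely stable quality, observed for two years, with fresh test noise each year. Even though nothing about the teachers changes, the estimates a VAM-style calculation produces can correlate only weakly from one year to the next.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup: 30 teachers, 25 students each, observed two years running.
n_teachers, class_size = 30, 25
true_effect = rng.normal(0, 1, n_teachers)  # stable teacher quality by assumption

def yearly_vam_estimates():
    # Each teacher's estimate = mean of her students' noisy score gains.
    # The noise stands in for test error, class composition, home life, etc.
    noise = rng.normal(0, 8, (n_teachers, class_size))
    gains = true_effect[:, None] + noise
    return gains.mean(axis=1)

year1 = yearly_vam_estimates()
year2 = yearly_vam_estimates()

# How consistent are the estimates from one year to the next?
r = np.corrcoef(year1, year2)[0, 1]
print(f"year-to-year correlation of VAM estimates: {r:.2f}")
```

When the per-student noise is large relative to the true differences among teachers, the same teacher can look strong one year and weak the next without changing anything she does.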
But as Daniel Katz discusses, the Dept of Education is advocating that VAM-type or student growth measures should be used to judge — not just teachers, but teacher preparation programs (see the announcement at the ED website). Katz provides links to a range of useful resources, and a compelling examination of VAM misapplication in the case of a teacher in New York City. The research on VAMs and similar student growth measures makes clear that this anecdote is an illustration of an endemic problem, and is not merely cherry-picked sensation-mongering.
The ED’s confidence in, and advocacy of, VAM (or “student growth”) metrics seem to be adding new apartments to the air-castle of accountability measures that the accountability movement has been building, with enthusiastic support from many quarters, in defiance of the evidence, and at some considerable cost to our educational enterprise.
I have to say that as I read Katz’s post, followed his links, and revisited other studies I’d seen before, I was reminded of the argument of a favorite book (Voltaire’s Bastards: The Dictatorship of Reason in the West, by J.R. Saul). Saul argues that one of the unintended consequences of the Enlightenment was the rise of a cult of expertise, in which reasoned judgment is overruled by authorities whose metrics are protected from critique, even on the basis of the data that they claim to revere. Education is certainly not the only place where this problem can be espied; but it’s not the least important sphere in which to struggle against facile but ill-founded policies. There are real human consequences.
But maybe I am missing something. What do you think? Have you seen a way to square the circle, and improve the models? Have you been subjected to “growth models” that measured your impact on your students (or your teachers!), and did it feel like a truthful evaluation?