Is teacher evaluation statistical voodoo?
April 27, 2011 7:01 PM

"Value-added modeling is promoted because it has the right pedigree -- because it is based on "sophisticated mathematics." As a consequence, mathematics that ought to be used to illuminate ends up being used to intimidate." John Ewing, president of Math for America and former executive director of the American Mathematical Society, criticizes the "value-added modeling" approach used as a proxy for teacher quality, most famously in a Los Angeles Times story that called out low-scoring teachers by name. A Brookings Institution paper says value-added modeling is flawed but the best measure we have of teacher value, arguing that the metric's wide fluctuations from year to year are no worse than those of batting averages in baseball. (Though the weakness of that correlation is mostly a BABIP issue.) Can we assign a numerical value to teacher quality? If so, how?
posted by escabeche (61 comments total) 20 users marked this as a favorite

No.
posted by saulgoodman at 7:05 PM on April 27, 2011 [9 favorites]

Is anyone else's first reaction to note that batting average is so (not) brilliant that the entire field of sabermetrics exists?

It's a completely knee-jerk reaction, but I'm having a really hard time looking past it.
posted by hoyland at 7:09 PM on April 27, 2011

All right, maybe. But it would be way more complicated than how we do it now.
posted by saulgoodman at 7:11 PM on April 27, 2011

Can we assign a numerical value to teacher quality?

I'm sure you can.

However, like most performance metrics in the workplace, it should be part of comprehensive evaluation not taken as the sole arbiter of continued employment.
posted by madajb at 7:12 PM on April 27, 2011 [3 favorites]

Is the math really that complicated? I thought it was just a linear model of student test score improvement, with variables for teacher and student demographics.

If the math is what I think it is, then it's no different from how one would measure drug efficacy in a clinical trial for example.
posted by gyp casino at 7:13 PM on April 27, 2011

Well, clearly, we need some way to measure the averages of uneducated students.

If only we could get the drop-outs to come in for standardized testing as a control…
posted by klangklangston at 7:27 PM on April 27, 2011

Yes.

(Damn that felt good, but added nothing to the discussion.)
posted by mnemonic at 7:28 PM on April 27, 2011

(Oh, and batting average is a pretty useless stat in baseball, Brookings Institute. Try a OPS+!)
posted by klangklangston at 7:32 PM on April 27, 2011 [1 favorite]

As long as the premise, that it's fine to evaluate somebody's job performance and then based on that evaluation get rid of the low performers and reward the high performers, is agreed, then great, let's have a discussion of how best to do the evaluation.

My impression (perhaps incorrect) is that in public education, at least according to the teacher unions, it's so damn difficult to evaluate a teacher that we shouldn't even try.
posted by Long Way To Go at 7:42 PM on April 27, 2011

Part of the problem is standardization of student performance data.

Teachers grade and give out their own student grades. If you put too much pressure on a teacher to show letter grade improvements, they might be tempted to start grading generously.

Standardized testing data, on the other hand, is subject to numerous selection biases (in my experience as a database and data warehousing developer working on education related development efforts for the last five years), and is incomplete: there are big gaps in time when most states don't evaluate student performance on standardized tests, and even when they do, the subject matter is limited to the rudiments.

Math, reading, writing. How do you evaluate a history teacher's performance based on standardized test results when the tests don't and ultimately can't include history as a topic?

You can't. In the end, the data is too incomplete to be of much real value, IMO.
posted by saulgoodman at 7:48 PM on April 27, 2011 [2 favorites]

And in order to capture detailed enough data to get an accurate model of learning gains over time, and to establish a strong correlation to the performance of a particular teacher at a particular point in time in the school year, you'd need to track way too much. It'd be a bureaucratic nightmare that would suck what's left of the joy out of getting an education in America. Too many metrics. Too many overpaid, incompetent or politically Machiavellian administrators. Not enough attention to students and what goes on in the classroom--the content of an education.
posted by saulgoodman at 7:52 PM on April 27, 2011 [1 favorite]

I find the notion that "education" is "doing well on tests" and not, you know, learning skills that are useful to the growth and development of children pretty repugnant. It's short-sighted and cruel, and has killed things like music and art education - being a member of society means understanding and participating in its culture.
posted by Slap*Happy at 7:54 PM on April 27, 2011 [9 favorites]

Is the math really that complicated? I thought it was just a linear model of student test score improvement, with variables for teacher and student demographics.

I'm not real keen on this as our defining understanding of "quality", but I grant you it looks pretty simple.
posted by brennen at 8:00 PM on April 27, 2011

how do you evaluate the school's orchestra conductor?
posted by robbyrobs at 8:01 PM on April 27, 2011

I've spent most of the last 20 years teaching (university), and while I don't have k-12 experience, I can tell you that student achievement depends 80-90% on the students. Sure, a teacher has to have basic subject knowledge, public speaking and classroom management skills; and sure, a really charismatic teacher can work miracles. But here's the thing: we should not require our teachers to work miracles. We should send them kids who are primed for learning. Any child can learn, the consultants tell us over and over again. Well, sure.....as long as the child wants to learn. And where does desire for learning come from? It's instilled at home.

Student performance in my own courses varies widely from semester to semester. Sometimes the course average is an A. Sometimes it's a C. How can you account for this? I'm the same person, teaching the same subjects: it's the kids. And this is at a school where supposedly all admitted students have the same basic skills. Um, no. Some students are better than others. Some care more than others. This varies a lot, year to year, term to term, and it has NOTHING to do with my performance as a teacher.

What do people really think is going to happen when they match teacher pay to student performance? Test scores are still going to broadly echo kids' home lives, as they do now, but they will be peppered with widespread cheating/exam fixing from the unscrupulous. (This happened recently at a school where a family member was a teacher, in a large urban district that shall remain nameless.) And then, when the cheaters are widely praised for their amazing accomplishments in raising test scores, this is utterly demoralizing to those teachers in disadvantaged schools who are a little more ethical.

Paying teachers for student performance is ridiculous, because it leaves the effort that must be made by students entirely out of the equation. And that is the most important part.
posted by philokalia at 8:05 PM on April 27, 2011 [12 favorites]

As I said, it might look simple. But the data isn't currently there--it doesn't exist. Florida's Education Data Warehouse tracks education data from around the state for longitudinal analysis in one centralized data repository. It's good data, relatively speaking, and a lot of it.

Considering the paucity of resources in the state, Florida does an excellent job of collecting what data's out there to be collected, but in the end, much of the data a valid evaluation model would need isn't being captured, and nobody wants to spend the money to create the administrative overhead it would take to capture it (plus, that would take still more education money out of the classroom).

Meanwhile, even information as basic as a social security number is impossible to capture reliably because of recent immigrants who haven't been assigned numbers yet, reluctance on the part of some parents to disclose private social security information (which they have a right not to disclose), and data omissions/entry errors.
posted by saulgoodman at 8:08 PM on April 27, 2011 [1 favorite]

I think a big part of the problem is that political correctness has expanded to the point that it's unacceptable to say anything bad about a person (unless he's an alleged criminal, homosexual or member of the other political party). So we look for numbers-based solutions that (theoretically) allow us to dump somebody who we consider to be bad or incompetent, without saying anything impolite, that might hurt somebody's feelings. After all, you can't argue with numbers, right?

Teachers ought to be able to recognize good teachers -- and bad teachers -- and ought to have enough concern for their profession and their students to do something about the bad teachers. It's all well & good to beat the drum for more funding and smaller classes and longer school days, but what about the teachers who, frankly, should be doing something else? Will no teacher or principal take one of them aside and say "look, how about pursuing a different career? Or a different subject, or age group, or whatever, because what you're doing right now is little beyond a disservice to your students?"

I think it's similar to the bad cops whose stories show up here from time to time. We always hear that most cops are good, but if that's so, then why does no-one call out the bad apples until they shoot the unarmed black kid in the back? (And usually not even then?)

Teaching and policing are tough, important jobs, and I have enormous respect for those who do their best, and do it well. But that's not enough: both teachers and cops have to have the courage to tell uncomfortable truths about their peers, because when they don't, kids get stupid and people get shot.

Going along to get along is the U.S.'s "I was just following orders!" It's not a good excuse.
posted by spacewrench at 8:10 PM on April 27, 2011 [1 favorite]

teaching one child is a joy; teaching a million is a statistic.
posted by GuyZero at 8:23 PM on April 27, 2011 [2 favorites]

Oh, and batting average is a pretty useless stat in baseball, Brookings Institute. Try a OPS+!

Better yet, use wOBA, which weights OBP and SLG differently, and is scaled to look like OBP.
posted by ORthey at 8:27 PM on April 27, 2011 [1 favorite]

Sometimes the course average is an A. Sometimes it's a C. How can you account for this? I'm the same person, teaching the same subjects: it's the kids. And this is at a school where supposedly all admitted students have the same basic skills. Um, no. Some students are better than others. Some care more than others. This varies a lot, year to year, term to term, and it has NOTHING to do with my performance as a teacher.

philokalia, not to detract from the other points you've made, but the allure of the value-added model is apparently that it accounts for this, and much more:

...a method of teacher evaluation that measures the teacher's contribution in a given year by comparing current school year test scores of their students to the scores of those same students in the previous school year, as well as to the scores of other students in the same grade. In this manner, value-added modeling seeks to isolate the contribution that each teacher makes [wikipedia]

But value-added models (VAMs) are much more than merely comparing successive test scores. Given many scores (say, grades 3–8) for many students with many teachers at many schools, one creates a mixed model for this complicated situation. The model is supposed to take into account all the factors that might influence test results: past history of the student, socioeconomic status, and so forth. The aim is to predict, based on all these past factors, the growth in test scores for students taught by a particular teacher. The actual change represents this more sophisticated “value-added”: good when it’s larger than expected; bad when it’s smaller. [Ewing, from the post]
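
For anyone who wants to see the predicted-versus-actual idea in miniature, here is a rough sketch. It is not the mixed model Ewing describes: it assumes a single predictor (last year's score), ordinary least squares, and made-up records, and it treats a teacher's "value added" as the average gap between actual and predicted scores. The names `records`, `fit_line` and `value_added` are invented for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (teacher, prior-year score, current-year score).
records = [
    ("A", 60, 68), ("A", 70, 79), ("A", 80, 88),
    ("B", 60, 62), ("B", 70, 71), ("B", 80, 82),
]

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x with one predictor."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return my - slope * mx, slope

def value_added(records):
    """Mean of (actual - predicted) current score, grouped by teacher."""
    intercept, slope = fit_line([r[1] for r in records], [r[2] for r in records])
    residuals = defaultdict(list)
    for teacher, prior, current in records:
        residuals[teacher].append(current - (intercept + slope * prior))
    return {teacher: mean(values) for teacher, values in residuals.items()}

estimates = value_added(records)
for teacher, estimate in sorted(estimates.items()):
    print(teacher, round(estimate, 2))
```

With these invented numbers teacher A comes out above expectation and teacher B below, by the same amount. A real VAM would add many more covariates and years of data, which is where the disputes in this thread start.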
posted by kid ichorous at 8:29 PM on April 27, 2011

philokalia- it looks from the wiki article that kid ichorous linked that there are confidence intervals around each teacher's estimate of added value.

However, it seems like there would be other problems. I admit that I don't know whether other schools are like this, but for my education K-8, each student went through the same sequence of teachers, and there weren't many students that entered the school or left, so for my school using the teachers as predictor variables for student performance would have led to a lot of collinearity in the model (which may unfairly weight certain teachers' performance over others).
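
One way to see the problem, as a sketch only: assume a hypothetical school with exactly one teacher per grade and a model that includes both grade and teacher indicator columns. The teacher names and the `indicator` helper are invented for illustration.

```python
# Hypothetical school with one teacher per grade: every student meets the
# same teachers in the same order.
teacher_for_grade = {3: "Adams", 4: "Baker", 5: "Chen"}
students = ["s1", "s2", "s3", "s4"]

# One row per student-year observation: (student, grade, teacher).
rows = [(s, g, teacher_for_grade[g]) for s in students for g in teacher_for_grade]

def indicator(rows, position, value):
    """0/1 column marking rows whose field at `position` equals `value`."""
    return [1 if row[position] == value else 0 for row in rows]

grade_cols = {g: indicator(rows, 1, g) for g in teacher_for_grade}
teacher_cols = {t: indicator(rows, 2, t) for t in teacher_for_grade.values()}

# Each teacher column is identical to a grade column, so a regression
# cannot tell "Baker's effect" apart from "the effect of being in grade 4".
duplicated = {
    t: g for g, t in teacher_for_grade.items() if grade_cols[g] == teacher_cols[t]
}
print(duplicated)  # {'Adams': 3, 'Baker': 4, 'Chen': 5}
```

Real schools are rarely this extreme, but the closer a school gets to one fixed sequence of teachers, the less the data can separate a teacher from everything else that happens in that grade.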
posted by a snickering nuthatch at 8:45 PM on April 27, 2011

A Brookings Institution paper says value-added modeling is flawed but the best measure we have of teacher value

But no doubt neglected to mention that "best we have" doesn't actually mean "worth using".
posted by kenko at 9:01 PM on April 27, 2011 [1 favorite]

(Moreover, how do they know it's the best measure we have? Wouldn't they have had to, like, compare its measurements to actual teacher value and determine that its measurements are more accurate than those of other tools? Because if they have a mechanism for determining the standard according to which measuring devices should be assessed, well, maybe we should be using that.)
posted by kenko at 9:02 PM on April 27, 2011

Yeah, in Florida they use the metric of learning gains in part to attempt to account for the effect of individual kids who might have histories of poor performance (meaning they came in at a disadvantage relative to the pool), but it's not a precise science. It punishes (or at least, fails to provide a mechanism for rewarding) teachers whose students were predominantly already performing well when they entered the classroom, even though the teacher may have actually done an excellent job.
posted by saulgoodman at 9:10 PM on April 27, 2011

Where to start?

1) The inaccuracy of standardized tests, their focus on one linear/linguistic model of learning, and their obvious inability to measure what many teachers do for a living (special ed teachers, art teachers, science/history teachers: spottily and inaccurately measured even when compared with math and English teachers). I hate to bring this up, but I won't be the first: the temptation to doctor results comes up when the stakes are high and the doctoring is easy. 50% of a teacher's compensation package can be tied to test results, based on a recent Congressional bill.

2) At the risk of sounding like a mush-minded hippie, caring about the students makes a huge difference in a student's life, especially in what may end up being their default psychological attitude toward life-long learning. This cannot be measured, even by student/colleague ratings.

I could go on.

I do wish it were easier to fire bad teachers. (States with strong unions do a better job at this, according to statistical evidence, which may have more to do with the local cultures which value education - and unions - than the direct influence of unions on the process, IMO.)

Beware of simplistic answers, and beware especially of those who have their eyes on the money that can be made in privatized education. Public education, if it can be redeemed from its many faults, will be key in transforming the USA into a country where equality, public civic knowledge, and strong local communities can take root and flower.
posted by kozad at 9:10 PM on April 27, 2011 [8 favorites]

Teachers ought to be able to recognize good teachers -- and bad teachers -- and ought to have enough concern for their profession and their students to do something about the bad teachers.

I don't know if you can create an effective system out of this for evaluating teachers, but I have a hard time imagining a more effective way of figuring out if a teacher is doing a good job than having good teachers watch what they do and evaluate them.
posted by straight at 9:12 PM on April 27, 2011

This is an interesting econometric problem, but one that I am politically ill-disposed to engage in. It's like trying to figure out the best mileage you can get out of a '73 Yugo that's 12% duct tape. It's an interesting challenge, I suppose, but if you actually want to get somewhere, the first thing is really just to buy a better car. Sure, we can devote huge amounts of brainpower to figuring out how to do the best we can with our horrifically underpaid and undercapitalized education system. But if we were actually serious about doing justice to our children and competing with the well-educated countries, we'd do what they do: radically raise pay and provide scholarships to entice better-educated teachers, buy better schools and materials, eliminate inequalities due to local funding, and raise the cultural esteem of the profession. Until we do that, we're like Ryan trying to figure out how to balance the budget while cutting taxes for the rich by $1 trillion.* An interesting puzzle, perhaps, but not one we should be engaged in in the first place.

(* No, I don't think Ryan honestly cares about the deficit.)
posted by chortly at 9:50 PM on April 27, 2011 [12 favorites]

I don't know if you can create an effective system out of this for evaluating teachers, but I have a hard time imagining a more effective way of figuring out if a teacher is doing a good job than having good teachers watch what they do and evaluate them.

Yes! Yes! This, after all, is what the tenure system was intended to do (and how it originally functioned).
posted by philokalia at 10:18 PM on April 27, 2011 [1 favorite]

Still not sold that there is a "problem with our schools," beyond the problems that come directly from the poor and/or poorly-governed city the school happens to be in.

These "problems" always seem to be taken as gospel by everyone, but where is the evidence, outside of that rash of "principal vs the bad school" movies in the late 80s? I have never seen any hard evidence to convince me these "problems" are anything but a political invention.*

*Of course standardized-test mania and "No Child Left Behind" are causing horrible problems, but I mean problems that weren't created by the alleged solutions.
posted by drjimmy11 at 10:29 PM on April 27, 2011

One of the problems I see is that I'm not sure some grievances about inferior teaching aren't just standard issue crypto-racism. For example, Writing Teachers: Still Crazy After All These Years by Mary Grabar.

After spending four depressing days this month at a meeting of 3,000 writing teachers in Atlanta, I can tell you that their parent group, the Conference on College Composition and Communication, is not really interested in teaching students to write and communicate clearly. The group’s agenda, clear to me after sampling as many of the meeting’s 500 panels as I could, is devoted to disparaging grammar, logic, reason, evidence and fairness as instruments of white oppression. They believe rules of grammar discriminate against marginalized groups and restrict self-expression.

The best part is where Ms. Grabar calls out "all the usual ethnic grievance communities." She later states, "Since most scholarship in the field concerns the invention of increasingly convoluted conspiracies of white privilege, discovered through increasingly primitive forms of communication[.]" It gets better from there.
posted by ob1quixote at 10:30 PM on April 27, 2011 [2 favorites]

> Student performance in my own courses varies widely from semester to semester. Sometimes the course average is an A. Sometimes it's a C. How can you account for this? I'm the same person, teaching the same subjects: it's the kids.

Ewing does actually talk about this problem, quoting this paper:

VAM estimates have proven to be unstable across statistical models, years, and classes that teachers teach. One study found that across five large urban districts, among teachers who were ranked in the top 20% of effectiveness in the first year, fewer than a third were in that top group the next year, and another third moved all the way down to the bottom 40%. Another found that teachers’ effectiveness ratings in one year could only predict from 4% to 16% of the variation in such ratings in the following year.

This isn't really unexpected. Some classes are going to have a few disruptive kids in them. Some years weird things happen. Some classes are easier to teach than others. Mathematically this isn't hard to take into account. You simply have to ask, given the level of variability, how likely is it that differences in ratings are due to chance. Whether administrators understand this is another matter.
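
A rough sketch of that "how likely is it due to chance" question, using a permutation test on made-up score gains for two classes. This is just one simple way to ask it, not what any district actually runs; `permutation_p_value` and the numbers are invented for illustration.

```python
import random
from statistics import mean

def permutation_p_value(group_a, group_b, trials=10_000, seed=0):
    """Share of random relabelings whose mean gap is at least the observed gap."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(trials):
        # Shuffle students between the two "teachers" at random.
        rng.shuffle(pooled)
        gap = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if gap >= observed:
            hits += 1
    return hits / trials

# Hypothetical score gains for two teachers' classes.
gains_a = [4, 7, -2, 9, 3, 5, 0, 6]
gains_b = [3, 6, -1, 8, 2, 5, 1, 4]
p = permutation_p_value(gains_a, gains_b)
print(p)
```

A large value means random shuffling produces a gap this big most of the time, so a half-point difference between these two classes says very little about either teacher.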

This procedure could be used to weed out teachers whose students consistently perform poorly relative to previous performance. As imprecise as it is, you can identify some cases where something is clearly wrong. You wouldn't want to use it for much else though.
posted by nangar at 2:41 AM on April 28, 2011 [1 favorite]

What's funny is that the same models could more accurately assess administrator performance (the sample is much larger, incoming classes are much more homogeneous, the randomized sample within district is larger), but you don't hear that one suggested very often.
posted by a robot made out of meat at 5:18 AM on April 28, 2011 [6 favorites]

This is why, at the very best private schools in this country, extensive testing and data analysis are performed to ensure that teacher performance is maximized!
posted by ennui.bz at 5:39 AM on April 28, 2011

Any metric for evaluating a teacher that fails to include the effects of class overcrowding, the imposition of non-teaching mandates, budget restrictions, parental interference/neglect, and student involvement/disruption, is doomed to be wildly inaccurate and, frankly, unfair.
posted by Thorzdad at 5:50 AM on April 28, 2011

The conquest of nature is to be achieved through measure and number.
posted by ReWayne at 6:13 AM on April 28, 2011

Please raise your hand if you've actually spent classroom time in an American public school of any stripe in the last five years (and no, watching your daughter play a petunia or speaking at Career Day doesn't count). Anyone? Anyone?

Yeah.

I think I'll just let Jim Horn, a Cambridge, MA, educator, speak eloquently for me on the current disgusting, immoral, destructive and immensely stupid and counter-productive destruction of the teaching profession and public education.

From his "Statement at the Massachusetts Department of Elementary and Secondary Education Meeting, April 27, 2011"

"For the teachers who are growing our future today and can't be here, I speak against this latest plan by the Business Roundtable to further cripple our public schools, to more profoundly objectify our children, to pull apart the teacher-child relationship built on caring and trust.

"… for the past 30 years we’ve devoted enormous energies to more sorting the poor by testing, that deform children, debase our ethics, and blow up our public schools, thus leaving urban poor kids more intensely segregated in corporate welfare charter schools built on a chain gang pedagogy that accepts no excuses, not even hunger or homelessness.

"Even so, public school teachers of the Commonwealth persist in their noble work of teaching children, and teaching them well despite the unending attacks in the media.

"… One teacher recently interviewed spoke facetiously or cynically (it is hard to tell the difference these days) of how students may soon enter her classroom labeled as “pay cut” or “bonus.” This is harsh, but the reality is that a model that explicitly ties children’s scores to monetary worth creates such an atmosphere. Even effective and empathic teachers will be aware of how individual students may influence their own family’s economic security. Tying teacher pay or job security to test scores will not make teachers more accountable for student achievement, but it will have a deadly impact on the now tenuous relationship at the heart of student learning and growth.

"This whole business of using value-added testing to evaluate teachers requires much more research before it can ever be done responsibly. I urge you to heed the National Research Council findings instead of parroting papers by the New Teacher Project or Education Trust or NCTQ, whose funders control both sides of the aisle of that same corporate jet fueled by tax credits. Don’t turn children into Pay Cut Sally or Bonus Billy based on their socioeconomic status before they ever sit down at a desk. This is bad policy that threatens to finish off the profession and to turn teaching toward a low-level child management occupation of last resort.

"When the disgusted Spanish philosopher Unamuno confronted the fascist General Milan Astray in 1936, he said: 'You will win because you have more than enough brute force. But you will not convince. For to convince you need to persuade. And in order to persuade you would need what you lack: Reason and Right in the struggle. I consider it futile to exhort you to think of Spain.'

"I do not think it futile to exhort you to help preserve the teacher-child relationship in Massachusetts. We are not yet a corporate dictatorship. In the meantime, the teachers, parents, and other active citizens of the Commonwealth are not persuaded. Reason and Right are lacking. We shall continue to stand for Reason and Right and to resist all else."

But don't pay any attention to old, crotchety me. If you don't mind your children being treated as corporate widgets, numbers on a balance sheet, being taught from scripts read by temp babysitters, then by all means - go right ahead and keep posting and lauding garbage like this. You can go along with the swindle, but don't expect this professional teacher with a quarter-century of public school experience to go quietly (or politely) into the good night about it.
posted by AirBeagle at 6:30 AM on April 28, 2011 [15 favorites]

"I do not think it futile to exhort you to help preserve the teacher-child relationship in Massachusetts. We are not yet a corporate dictatorship."

FWIW, I think he's wrong about both of those. But nicely said.
posted by spacewrench at 6:50 AM on April 28, 2011

I have always loved the delicious irony of graders disliking being graded. It is very gradifying.
posted by srboisvert at 7:00 AM on April 28, 2011

Two points:

1) Every other person with a job in this country is evaluated according to a set of quantitative and qualitative metrics, not all of which are in that person's control. Why should teaching be any different? Yes, the imperfection of the metrics means that occasionally a good teacher will be fired. But this happens everywhere else, too; surely an imperfect measurement is better than no measurement at all.

2) Anti-ed reformers should consider that if you can't even begin to estimate the impact of individual teachers, that brings into significant question the value of teachers at all. From what I understand, statistically speaking, education has only a marginal impact on lifetime health, earnings, etc. when compared to bigger factors like home life, nutrition, race/ethnicity, etc. Throwing up your hands and saying that teachers can't control classroom outcomes makes the case for fewer, lower-paid teachers - not the opposite.
posted by downing street memo at 7:14 AM on April 28, 2011

My co-teachers and I have had this conversation many times... all of us feel that getting rid of inept teachers would be a good thing.

Currently, evaluations are done by a principal once every two years, and those teachers who get marks at the bottom of the scale have to participate in a program designed to improve the teacher's performance. The evaluation can be a one-time scheduled observation or a project that's simply turned in to the principal. There are maybe two or three teachers out of hundreds who are put into the program every year, so it's not doing much to identify poor teachers... or there just aren't that many.

Every other person with a job in this country is evaluated according to a set of quantitative and qualitative metrics, not all of which are in that person's control.

That's true, but I've never had a job in which so much is out of my control. Imagine staffing an office by going outside and rounding up the first 20 people you find. Where I teach, you'd be getting highly-educated people who have a lot of life experience and probably a good home life, with plenty to eat, etc. In my previous district you'd be getting people who might not speak English, dropped out of high school, etc. You can't fire any employee for any reason (besides possibly violence), and you have to round up new employees every year.

Your salary is determined by how efficiently your office runs. Which office do you want?

I do have a suggestion for how to evaluate teachers, but I have playground duty right now; I'll post after school. ;-)
posted by Huck500 at 8:01 AM on April 28, 2011 [3 favorites]

1) Every other person with a job in this country is evaluated according to a set of quantitative and qualitative metrics, not all of which are in that person's control

Hogwash. Metrics might guide evaluations of some classes of employees, but rarely are metrics alone used in other fields. What's more, that's a private sector model that doesn't necessarily apply here, because those kinds of evaluations are almost always tied to simple accounting--education is not analogous to business.

And the current metrics aren't even measuring teacher performance--they're only measuring student performance, and attempting to infer teacher performance indirectly. It's not logically sound to causally conflate the two, and the kinds of metrics we might use to meaningfully evaluate teacher performance, if they were comparable to private sector jobs, are not currently available. If we were to directly measure true indicators of teacher quality, we might look at certain measures of core competency, like the ability to clearly organize lesson content, the ability to engage students, knowledge of subject matter, the ability to present subject matter engagingly, knowledge and attentiveness to special needs stemming from students' personal backgrounds, responsiveness to parental concerns, etc.

But we aren't gathering or processing any such metrics directly related to teacher performance. We're attempting to evaluate teacher performance using data sets with only a weak correlation to what we're claiming to actually measure.

But this happens everywhere else, too; surely an imperfect measurement is better than no measurement at all.

Not if that imperfect measure leads to a wholly inaccurate view of the underlying reality we wish to measure. Current approaches to evaluating teacher performance are based on a fundamental fallacy: What we're doing is akin to trying to infer the weight of an object based only on the shape and size of the shadow it casts. It fundamentally makes no sense.
posted by saulgoodman at 8:07 AM on April 28, 2011 [2 favorites]

I'll ask one more time: Please tell us the amount of time you've spent in a public school in the last five years. It's quite a relevant question.

Insinuating that teachers don't want to be evaluated shows just how out of touch folks are with the daily environment of public education in this country, and the depth of misunderstanding in which we are foundering.

Teachers are evaluated, every day, every hour. We're videotaped, we're observed by administrators, we have parents watching, we have textbook publishers watching, we have in-service providers watching, we are on tv, radio, newspapers, and, these days, every anonymous internet commenter who went to elementary school in 1986 and believes, therefore, he/she is an expert, also evaluates us. I can't, in fact, name a single person in this country who does not evaluate and weigh in, officially and formally or unofficially and casually, on what we do and how we do it. On a daily basis, with or without scientific validity.

Also, "gradifying" is spelled "gratifying." (Let's see if you get that one.)

"Surely an imperfect measurement is better than no measurement at all" ... ibid.

Please cite the specific words where I threw up my hands and said teachers can't control classroom outcomes. Please also cite in my post where I, or Jim Horn, claim to be "anti-ed reform." I believe in law that's called "stipulating to facts not in evidence," but I could be wrong about the phrase.

Now please try again, this time concentrating on the issue at hand. And do try to come up with original sources, preferably not from the usual billionaires' think tanks. Points will be taken away for using Wikipedia or Google.

Oh well, why am I screaming into the wind? It's all a moot point. Several months ago, we teachers in Tennessee had 51% of our evaluations tied to test scores using this scam called "value added." Today, we had our collective bargaining rights stripped by the legislature in a bill that will be headed for the governor's signature inside a week. And soon, a blond 40-something guy who's never been married (wink, wink) will see his wet dream of six years passed: Tennessee teachers will be gagged from discussing, in any way, [whisper]teh gays[/whisper] before ninth grade.

Given this, please don't try to convince yourself or other people and especially not me that we don't want or are resisting evaluation. We have swallowed it whole with barely a squeak of protest except maybe by old commies like myself.

My bottom line remains this: Those who claim that kids, teaching and education are widgets, cogs and profit centers are missing something very vital to their basic humanity; their moral compass is spinning. Public education is the foundation of the democracy; we've lost the democracy, now the foundation is being plowed up. Brave new world, huh, kid?

(On preview, what Saul and Huck said!)
posted by AirBeagle at 8:11 AM on April 28, 2011 [6 favorites]

how do you evaluate the schools orchestra conductor?

In California, we sidestep the problem by firing the conductor due to lack of funding.
posted by RikiTikiTavi at 8:12 AM on April 28, 2011 [1 favorite]

I have always loved the delicious irony of graders disliking being graded. It is very gradifying.

Because once you've had to grade stuff, you know exactly how subjective and arbitrary the whole process is.

Every other person with a job in this country is evaluated according to a set of quantitative and qualitative metrics, not all of which are in that person's control.

And a great many of these are worse than useless as well. It's just that if a big company crashes and burns because the management are idiots, it's no skin off my back.

Anti-ed reformers should consider that if you can't even begin to estimate the impact of individual teachers, that brings into significant question the value of teachers at all.

The people who are "anti-ed reform" are really in the end anti-standardized testing - with good reason. The VAM method does actually give some data on teacher effects on student performance on standardized tests: there's a low signal-to-noise ratio, but with enough time, several years, you can get some useful data.
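To put rough numbers on the "low signal, but several years helps" point, here is a minimal sketch. It assumes a teacher has a stable true effect and that each year's estimate adds independent noise; the function name and the spread values (signal 1.0, noise 2.0) are hypothetical and are not taken from the MET study or any VAM implementation.

```python
def reliability(signal_sd: float, noise_sd: float, years: int) -> float:
    """Share of the variance in a multi-year average that reflects the stable teacher effect."""
    signal_var = signal_sd ** 2
    # Averaging independent yearly estimates divides the noise variance by the number of years.
    noise_var = noise_sd ** 2 / years
    return signal_var / (signal_var + noise_var)


# Hypothetical spreads: yearly noise twice as large as the real differences between teachers.
for years in (1, 3, 5, 10):
    print(years, round(reliability(1.0, 2.0, years), 2))
```

Under those assumptions a single year is mostly noise, and the average only becomes majority signal after several years. If the noise is not independent from year to year, the improvement would be slower than this.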

However, the strongest criticism of the MET study comes when you look at how the standardized tests compare as predictors of performance on one another. The correlation is worse than teacher effects on student test performance. And the combination, if we're trying to actually measure learning rather than skill at one particular test, means that VAM has very poor predictive capabilities.

It calls into question the use of standardized testing in the first place. I mean theoretically we could design and administer a test that might do a decent job of actually measuring student performance, but the costs to administer and develop such a test to ensure its statistical relevance are far in excess of what anyone wants to spend.

I mean math is a powerful tool. It can tell you a great deal. In this case it tells us that a mechanistic approach to education is pretty terrible.
posted by Zalzidrax at 8:21 AM on April 28, 2011 [1 favorite]

Playing off something Saul said — if we want to evaluate teachers every year, why not make them take the tests? Cover the material, as well as pedagogical fitness. Tie maybe the amount that year-to-year progress is predicted by value added testing — which looks to be most solid at around four percent — to their pay, but the rest should be on them.

(I also like the idea of using value added testing to see which administrators are the most effective.)
posted by klangklangston at 8:21 AM on April 28, 2011

Klang, we do take the tests, both in subject and pedagogy. I personally took (and passed) six different such tests (at my own expense) in order to obtain my California teaching certification. The number and type (CBEST, CSET, etc.) varies by grade and subject, but they are done on the teachers' time and dime.

Since I was still in grad school in Michigan, I took the CBEST at a test site in Chicago. The test fees, and my Amtrak fare, rental car and hotel were all at my expense, on my time. The CSET tests I took were in the $400 - $500 range and were located over 90 miles away from the town we moved to after arrival in California. These tests were required before I could be employed anywhere in the state. Tennessee adds on two to four additional tests to that for their own certification, since they don't recognize California's tests (even though they are created by the same corporation). No complaint, just fact.

A digression: We also ponied up over $1,500 bucks for teaching English language learners certification. Never reimbursed by any district. It was worth ... $70 in my paycheck per year. But I did learn how to deal effectively with those classes of fifth and sixth graders which were over 80% ELL students ...

At any rate, all of this testing is, very much, "on them."
posted by AirBeagle at 8:41 AM on April 28, 2011

To be fair, you take the CBEST once. And I was recommending that teachers be tested every year.

Further, I took the CBEST when I was thinking about becoming a substitute teacher for a bit. Anyone who fails the CBEST would fail seventh grade, but it's not at all a good test of teaching ability or mastery of any given subject.
posted by klangklangston at 8:45 AM on April 28, 2011

CBEST tests basic skills, yes. I should define my acronyms. California Basic Education Skills Test (CBEST); California Subject Examinations for Teachers (CSET).

But, you want teachers tested every single year on the same things? Color me confused. What would you be annually testing? In other words, I thought 2+2 equals 4 every year; I thought the consensus as reflected by "correct" answers on the CSET on, let's say, presenting a lesson on long vowel sounds to ELLs, was pretty much the same year after year. And so on.

What knowledge am I supposed to have that changes year by year? Or do I get to keep teaching until I go senile and answer 2+2 = 16? I'm not being facetious (this time), just truly wondering.
posted by AirBeagle at 8:54 AM on April 28, 2011

Several years back I decided to "get involved" and joined the PTA at my son's middle school.
I arrived expecting a considerable crowd as this was the first meeting and the transition to middle school is worrisome, for the parents anyway.

12 parents including myself and two enthusiastic teachers full of plans that would require tripling the night's turnout to accomplish egged to trick other parents into joining.

The classroom was abysmal. Two clocks on the wall neither of which worked. Posters urging the children to stay cool and stay in school. A dictionary had been tossed up onto the hanging fluorescents which all hummed at a migraine like pitch. The paint on the walls was a green hue and someone had taken the time and energy to cut brown paper bags into sheets to cover every inch of the windows.

Every chair/desk listed or swayed making it impossible to sit still for any reasonable time.

Looking around I could see that this Resource Room had once been the wood shop as the hand routed signs above the doors had been left, left to be painted green no doubt.

This was a warm evening in early September, so the door was opened, exposing an expanse of asphalt; it was also open because the heater was on.

My son became more and more morose with each day of attendance. A natural athlete, his biggest problem was P.E. P.E. involved upwards of 30 minutes of lining up to orderly move outside onto the tarmac, where the bell would then ring, signaling the end of the 40-minute class.

His refusal to participate in this required us to attend a meeting where we were given pamphlets on the subject of respect and discipline. This we reviewed sitting upon tottering desks as the alarm malfunctioned every ten minutes.

Shortly thereafter we officially withdrew him from school but would receive sporadic notices reminding us that our son was absent that particular day.
posted by pianomover at 9:16 AM on April 28, 2011 [2 favorites]

it's nice to see that the statistical techniques of management science, introduced to the automobile industry by Robert McNamara, that enabled US automanufacturers to beat the low-cost competition from abroad, then applied to the science of war, in Vietnam, where rigorous analysis of "body counts" helped the US Army to defeat the communists in South Vietnam, are now being brought to our public education system.
posted by ennui.bz at 9:33 AM on April 28, 2011 [6 favorites]

I'll ask one more time: Please tell us the amount of time you've spent in a public school in the last five years please. It's quite a relevant question.

Well, I've got a child who's currently in a VPK program, and it's absurd the degree to which they're already pushing school preparedness now at a young age. When I was a public school student, it was enough to finish Kindergarten if you could demonstrate that you knew your home address and telephone number. Now kids are expected to have basic math and reading skills by the time they get to kindergarten. Ratcheting up the performance pressure on kids, without recognizing the subtle and complex social interactions and developmental processes involved in truly educating them, is destructive.

Also, when it comes to the actual quality and scope of student performance data that is available, believe me, I know my stuff. I authored or helped author some of the ETL jobs used to load this data for my home state's DOE.

Now if I were to go along with the idea that learning gains or similar metrics should be determinative when it comes to teacher evaluation (although I think that's ultimately a bit like tying a bus boy's wage directly to metrics concerning the profitability of a restaurant franchise, rather than using those metrics to evaluate the performance of the managers and executives operating the franchise), I might suggest a different approach, but it would be cumbersome and still deeply flawed.

One approach that might not be so bad would require administering subject matter specific tests at the beginning and end of a particular school year, and then evaluating improvement in the subject over the same time period. You still couldn't use this approach to accurately measure the quality of teachers who start out with kids that are all already high-performers, but if you included some kind of mechanism to account for cases in which special circumstances beyond a teacher's control prevented a child from performing to their potential, then this approach might provide some kind of useful outcome evaluation metric. But in practice, this approach would require much more investment in program administration than most state and local governments are willing to make, and it would likely result in the same kind of myopic, teaching-to-the-test approach that made NCLB such an abysmal failure.
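A minimal sketch of that beginning-and-end-of-year idea, for illustration only. The records, the `excused` flag (standing in for "special circumstances beyond a teacher's control"), and the function name are all hypothetical; no real state data set or scoring rule is implied.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (teacher, fall_score, spring_score, excused)
records = [
    ("A", 40, 55, False),
    ("A", 90, 92, False),  # already high-performing: little room left to gain
    ("A", 50, 45, True),   # excused for circumstances outside the teacher's control
    ("B", 60, 70, False),
    ("B", 95, 96, False),
]


def mean_gains(rows):
    """Average spring-minus-fall gain per teacher, skipping excused students."""
    gains = defaultdict(list)
    for teacher, fall, spring, excused in rows:
        if not excused:
            gains[teacher].append(spring - fall)
    return {teacher: mean(values) for teacher, values in gains.items()}


print(mean_gains(records))
```

Even in this toy data the students who start in the 90s can only gain a point or two, which is the ceiling problem for teachers of high performers, and the result depends on who gets marked as excused.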

And ultimately, even taking this alternative approach, we would still be making the error of evaluating teachers based on metrics that more properly should be used to evaluate the performance of education officials and school administrators. We don't blame the line cook for slumping quarterly profits at the local McDonalds, and we shouldn't blame teachers for poor outcomes that could just as easily stem from poor management and other factors beyond their control.
posted by saulgoodman at 10:12 AM on April 28, 2011

Arrggh. I misread the first part of AirBeagle's comment. Didn't mean to seem defensive, but mistook this for one of those old canards about how anyone who's been in public school in the last five years knows the teachers are incompetent, or whatever. Apologies for the misread.
posted by saulgoodman at 10:17 AM on April 28, 2011

As a former public school teacher and public college instructor of 10 years, I would say that, based on my own experience and countless hours of conversation with similarly-employed people, the situation really is this:

1. Most good teaching is a result of dedicated and talented people who have a vocation to find ways on their own to succeed with their students.

2. There is very little effective supervision of or support for teachers from administrators, who usually have very little idea of what's happening in classrooms so long as it's quiet. Teachers' only support comes from other teachers in their school.

3. A fairly small minority of teachers are completely incompetent and should not be allowed to teach. Raw statistical evaluation of teachers, if used sparingly, will identify truly terrible teachers quite reliably, and could be used to weed them out.

5. Raw statistical evaluation of teachers is completely useless in distinguishing one competent teacher from another, or the excellent ones from the merely competent, because it is impossible to write a standardized test to measure with good reliability most of what good teachers teach.

6. The chief beneficiaries of testing models are incompetent administrators who can use them in place of evaluating teachers intelligently, much less supporting them.

7. If evaluating teachers through tests is tied to how teachers are retained or paid, then the benefit will be the removal of the terrible teachers, but the costs will be the loss of teachers who cannot satisfy their vocation by teaching to the test, and the loss of cooperation among teachers who suddenly find themselves competing with each other over whose students get the higher scores.
posted by GentleReader at 10:29 AM on April 28, 2011 [4 favorites]

On break, have a couple of minutes...

The only way of effectively evaluating teachers IMO is to create a sort of committee of veteran teachers and administrators that would tour schools in groups and formally and informally observe classrooms. There would be a set of criteria, but they would also be looking for that unquantifiable quality that good teachers have... anyone who's spent some time around teachers knows who's great and who's not within a couple minutes of being in a classroom... that air of confidence, empathy, and authority can't be faked.

Ideally there would be scheduled observations and informal walk-ins. To schedule an observation might seem useless, in that the teacher could put on a dog-and-pony show for that time, but that's really not how a classroom functions. Having wildly different expectations for your students one hour a month just wouldn't work.

Part of the evaluation could be based on academic improvement, but all of the other relevant factors would have to be taken into account... and there are a lot of them.

I'd be perfectly willing to give up the whole tenure concept if a system like this were in place.

This, of course, would be a lot more money in subs and other costs than running a test through a scantron... so I'm guessing we'll keep using the methods we have now, which are pretty useless.
posted by Huck500 at 10:35 AM on April 28, 2011 [1 favorite]

it's nice to see that the statistical techniques of management science, introduced to the automobile industry by Robert McNamara

what
posted by a snickering nuthatch at 11:06 AM on April 28, 2011

what
posted by klangklangston at 11:13 AM on April 28, 2011

"But, you want teachers tested every single year on the same things? Color me confused. What would you be annually testing? In other words, I thought 2+2 equals 4 every year; I thought the consensus as reflected by "correct" answers on the CSET on, let's say, presenting a lesson on long vowel sounds to ELLs, was pretty much the same year after year. And so on.

What knowledge am I supposed to have that changes year by year? Or do I get to keep teaching until I go senile and answer 2+2 = 16? I'm not being facetious (this time), just truly wondering."

Pedagogical techniques change as research dictates, and teachers forget things year by year.

But the real goal would be to show that the teacher's performance is consistent year to year, ergo the variable must be elsewhere.
posted by klangklangston at 11:16 AM on April 28, 2011

But the real goal would be to show that the teacher's performance is consistent year to year, ergo the variable must be elsewhere.

The key here is that we need to evaluate the teacher's performance, and you can't do that by evaluating the students and then applying those evaluations to the teacher, especially year-to-year... we have new students with different abilities and a different classroom dynamic every year.

If you tracked the academic progress of one group of students through all the grades, and one teacher's students had a big dip in the middle of steady progress, that might tell you something about the teacher, or it might tell you that one of the dads in the grade level committed suicide in the middle of the year and it disrupted the whole year for those kids... this happened at my school this year.

Again, I think the only way to accurately evaluate teachers is regular observation by veteran teachers and administrators who are still working in a school or a classroom.
posted by Huck500 at 12:13 PM on April 28, 2011 [1 favorite]

Metafilter: Damn that felt good, but added nothing to the discussion.
posted by Reverend John at 2:38 PM on April 28, 2011

Good discussion here, or at least good venting.
It breaks my heart that the current ed reform nonsense is pretty much bipartisan consensus.
I like GentleReader's list, the 7th being the critical point, that assessments are not neutral, and they change the system for everybody.
Thought I would share two more things that break my heart about education:
First: One of my twin sons comes home from second grade and says "Dad, I love reading, but I hate reading tests" - this is before the SOL's really hit him next year. They are only in second grade, so they only take "practice" SOL's, here in Virginia.

Second, reading my New Yorker this week (ok, maybe it was old, I am catching up) noticing the special advertising section for their (jointly sponsored with University of Phoenix) American Education in the 21st Century conference:
Here is one Q from the Q and A:
Isn't it possible to grade teachers on merit, pegged to student achievement year over year, accounting for slow learners and fast learners both?
Margaret Spellings: We can and we do. Annual assessment and disaggregated data in every school and state in this country are required under No Child Left Behind. These fundamental principles of accountability - embraced by the Obama Administration - are key to measurement and success.
Cynthia Brown: Right. This notion of how we're changing teacher evaluation systems, teacher compensation systems, the measures of school success - all of that is only possible because of the every-grade testing Margaret was talking about.
Madeleine Sackler: Well, I would even argue that's too slow. I visited schools that are closing achievement gaps in seriously underperforming districts and they evaluate the kids and the teachers every few weeks.
Dr. Craig Barrett: We also need to focus on the big picture, which is the competitiveness of the United States, and the competitiveness of our workforce. In Finland, 75% of the kids test proficient. We get excited when 30% of kids in this country test proficient. We need to raise the entire system.

This makes me want to scream.
It's going to get worse before it gets better.
posted by cogpsychprof at 7:07 PM on April 28, 2011

I teach in a private school. We don't pay much attention to standardized testing. Instead, we expend all our effort trying to figure out ways to make each student succeed. Teacher evaluation is conducted in a staged four-year cycle, with the full evaluation year requiring nine classroom observations and eighteen meetings over a one-year period by three separate people (peer, department chair, division head). We're less interested in firing teachers than in making sure they become good teachers and then great teachers, and we have a stepped salary scale that begins with new teachers, goes to experienced teachers, then master teachers, then faculty leaders. Our students are all extremely well prepared, many go to top-notch colleges, and all go to college. Our teachers work their butts off. Most of them also coach or do comparable work as part of their job.

And it costs a bucketload, even though a substantial number of our students get financial aid.

You get what you pay for.

Threatening people is a lousy way to motivate them, whether children or adults, as any classroom teacher can tell you.
posted by Peach at 2:36 PM on April 29, 2011 [1 favorite]

This thread has been archived and is closed to new comments
