It’s Standardized Testing Season Again

HOW “VALUE ADDED” REFORMS DESTROY THE VALUE OF SCHOOLS

It’s that time of year again. The days are getting longer. The air is getting hotter. Dreams of summer vacation are distracting us from the established curriculum. In this regard, the school year is playing out exactly the same way it has for a hundred years and more.

Except, for this generation, there’s an added element. It’s testing season. It’s that time again when my classes are half empty (half full). Students are spending hours at a time with their heads down, or staring at a screen, filling in bubbles because, in Florida, bubbles equal knowledge. It’s all about the bubbles. It’s that time of year when all of the meaningless STUFF that students crammed into their minds, combined with the useless “test taking skills” that they learned, culminates in one ultimate, mind-numbing ritual. The standardized test.

The gym and the media center/library will be shut down. Students will be shuffled around as teachers are used as proctors. Multiple days are disrupted for this ritual. When students do find their way back to their teachers, they are no longer in any receptive mood to actually learn. Not that they care, as they know that their classes constitute a colossal theft of their time and mental energies.

Nothing has done more damage to schools, students and the teaching profession than standardized, high-stakes testing. To thrive in today’s fast-paced, rapidly changing, globalized and networked society, citizens must be creative, innovative, critical and independent thinkers. Knowing a lot of information is far less important than knowing how to evaluate and analyze information efficiently. Yet our curriculum, designed around standardized testing, demands content knowledge, conformity, and acceptance of authority. In other words, through testing and other market-based “accountability” reforms, we’ve created a school system that does exactly the opposite of what we need it to do.

This is a colossal and possibly irreversible theft from our children.

Below are three of my best posts on this topic.

January 24, 2015: On Oak Trees and High Stakes Testing

Some time ago my school district showed us an interesting video and sent representatives around to explain the new Value Added Model (VAM) that they were using to assess teachers. The goal, of course, was to sell this woefully inadequate system and to convince us not to worry about the inherent inequities. Everything was taken care of. The representative was very nice and cheerful, reassuring us that all factors were taken into account when applying VAM to our teacher reviews.

To break it down so that even we simple teachers could understand the megacomplex formula that was being used, the video and rep offered us an example of VAM scoring using an oak tree farmer as a metaphor.

An oak tree farmer? Really?

It turns out that some tree farms really do sell oak trees. It’s unclear that they use a Value Added Model to evaluate their output.

Regardless, the VAM sales pitch was premised on looking at three oak tree farmers all growing oak trees under different conditions. One has ideal sun and water. Another has low water; the third has less sun, or some variation. Really, they lost me with oak tree farmer. The bottom line was that each was going to be judged based not on how much taller their oak trees grew, but rather on a comparison of how much an oak tree can be expected to grow given the various circumstances as compared to how much the oak trees grew with each individual farmer.
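
For readers who want the pitch spelled out, here is a minimal sketch of the comparison as the video described it: a farmer's score is actual growth minus the growth expected for that farmer's conditions. The conditions, the growth figures, and the `value_added` function are all hypothetical, made up for illustration; this is not the district's actual formula.

```python
# Hypothetical expected growth (inches per year) for each growing condition.
# These numbers are invented for illustration only.
EXPECTED_GROWTH = {"ideal": 24.0, "low_water": 16.0, "low_sun": 14.0}


def value_added(condition: str, actual_growth: float) -> float:
    """Return actual growth minus the growth expected under the condition."""
    return actual_growth - EXPECTED_GROWTH[condition]


# (farmer, condition, actual growth in inches) -- hypothetical data.
farmers = [("A", "ideal", 22.0), ("B", "low_water", 18.0), ("C", "low_sun", 14.0)]

scores = {name: value_added(cond, growth) for name, cond, growth in farmers}

for name, score in scores.items():
    print(f"Farmer {name}: value added = {score:+.1f}")
```

In this made-up example, Farmer A grows the tallest tree and still receives the lowest score, because everything hinges on the expected-growth figure, which someone has to estimate.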

So there. Don’t worry. It’s all perfectly fair and rational. Teachers working with kids under adverse conditions are not going to be measured against the expectations of those teachers working under ideal conditions. After all, that wouldn’t be fair. What we’ve done is gathered the demographic data, extrapolated the potential trend lines compared to sample populations, controlled for the various inputs and presto, a perfectly aligned and objective VAM score. See. Nothing to it.

Bless us and save us!

Look, no disrespect to oak tree farmers, but this sales pitch is an insult to the intelligence of anyone who is a teacher or who cares about teaching as a profession. News flash! Teaching kids is not raising oak trees! I’m sure raising oak trees has its challenges, but teaching is an infinitely more complex task not subject to simple metaphors.

If we were to make the metaphor work, the rep would have to concede that there are multiple farmers involved in the growth of each oak tree. As a high school oak tree farmer on a block schedule, I see my little seedlings two to three times a week for eighty-three minutes at a stretch and fertilize their minds with American History. Other teachers are involved in tilling and cultivating their reading, math, language arts, science and other subjects. Then my oak trees go off and spend time with people who aren’t even farmers in various soils of differential quality before coming back to me. Then they will be tested on material that has nothing to do with the particular fertilizer that I am laying down.

Let alone the fact that our little oak trees come to us having been tended by hundreds of other farmers, with new farmers every year. Each oak tree has developed differential levels of aptitudes and interests, skill sets, and values. Each oak tree has different goals, expectations and prospects. No two oak trees can be tended in exactly the same way.

Somehow, all of these complications were missing from the little animated video. There was one oak tree per farmer racing to grow a taller product. Easy peasy.

Which, of course, leads us to question the very validity of determining the value of an oak tree based on its height. The claim is that this is an objective measure, but upon examination it is subjective. How did the experts objectively determine that height was the best variable for determining the value of an oak tree?

How about breadth?

How about canopy?

How about width of the trunk?

Number of branches?

Acorn output?

Acorn quality?

Color?

Climbability?

Root span?

Number of squirrels?

Aesthetic quality?

Denseness of foliage?

I’ve seen many oak trees. Hell, I lived under oak trees for six years. Never once did I think, ‘wow, if this oak tree were just a little taller it would be a very valuable oak tree.’

The bottom line is that if determining the value added to an oak tree by a particular farmer is impossible, then what about doing the same for a human child or young adult? A model that, upon inspection, would not work for the relatively simple farmer/oak tree interaction is clearly inadequate for evaluating the Value Added by teachers.

Teaching isn’t a quantifiable process. It isn’t even a science. It is a craft. Part technical skill, part creativity. Part inspiration, part perspiration. The single greatest way to kill teaching as a profession is to try to quantify what teachers do, then hold them accountable for not doing more.

Of course, that’s the endgame.

No self-respecting oak tree farmer would submit to such a system.

July 25, 2011: My Students’ Test Scores Went Up

And my reasonable reaction to this news

Some years ago I sat in on a lecture by the great education reform advocate, Alfie Kohn.¹ During the lecture, Mr. Kohn stated that there are two reasonable responses to the news that test scores have increased in one’s district. The first² he referred to as the moderate response: “So what?” According to Kohn, and the prevailing research, there is no empirical evidence that test scores are a reliable measure of “learning.” In fact, he pointed out research that revealed the opposite influence; those who scored the highest on standardized tests tended to be the shallowest thinkers and least actively involved students. He also pointed out that standardized tests have virtually no predictive value for future academic success. In real terms, there is no reason to take pride in higher test scores.

Yet there is considerable social pressure on teachers to do just that. This last year, when improved test scores for my school were announced, there was resounding applause among the faculty. We were recognized for “working hard to raise our scores.” It was difficult not to get caught up in the exhilaration. Indeed, we did work hard. Much of our effort was focused on raising those test scores. Teaching methodologies, such as the Kagan method, were implemented, sold to us in part because they are proven to raise test scores. During two Kagan trainings, the trainer asked us about our goals as teachers. One question was, “Do you want your students’ test scores to go up?” There was a resounding “yes” and even applause from the faculty. I was somewhat dismayed by how effectively teachers had been socialized into understanding raising test scores as a reasonable goal.³

To be honest, when I look at my students’ test scores, I must admit to a certain visceral excitement when I see their “progress.” Then reason takes hold, as does my crippling sociological imagination. Are the increases in my students’ test scores something to be proud of, or should my response be “so what?”

Yes, my students, taken as a whole, showed higher overall scores on this year’s test as compared to last year’s test. I do not know if the difference between the two is statistically significant, as I’ve not taken the time to run the numbers. (Perhaps that will be a later post.) I can say that, as I scanned the data, I observed that the vast majority of students showed little significant improvement despite an overall positive change in their raw scores. According to one source, an increase of 78 points is the equivalent of a year’s learning. Based on this assumption, every student should score around 78 points higher than on the previous test. In fact, many of my students did just that. However, was that the result of my awesomeness as a teacher? I’d like to say it was, but the facts might be a little more slippery.

First, what caught my eye were the students whose scores rose by well over 78 points. Was I such a good teacher that I over-taught the year? And if that’s true, why didn’t all of my students gain more than 78 points? Or could there have been other variables, independent of my quality as a teacher?

Let’s look at one outlying student; I’ll call her Sally. Sally scored over 700 points higher on the reading portion of her FCAT this year than she did last year. Seven hundred points! Yay me! Hey, since I, as a teacher, must take the blame for all the failures of public schools, I should be able to take credit for one profound success. Shouldn’t I? Well, as much as I would like to, I’m afraid there are other factors that I would have to control for before I can understand just how much value added I contributed to this child’s amazing test results.

1. Perhaps the stars aligned. In other words, the student’s results might be due to nothing more significant than luck. Of course, as a sociologist, I’m not inclined to think in terms of luck; but I can think in terms of probabilities. There is a probability that she just happened to guess well on the stuff she did not know. I tend to discount this hypothesis because it’s demoralizing, but also because there was more than one student whose increase was extraordinary, though hers was the largest.
2. If this year’s test might be an outlier due to nothing more than probability, it’s also possible that last year’s test was a probabilistic low outlier. In other words, if she wasn’t lucky this year, she may have been unlucky last year. Evaluations of teachers and schools, however, are based on a comparison of the current year to the prior year, not a complicated evaluation of student trends.
3. It could be that she learned seven years’ worth of material this year; thus, her test results for this year and last year are accurate representations of her learning. That would be wonderful. It’s almost certainly not true. Florida has a pretty robust curriculum set, especially at the high school level. There’s very little time for “re-teaching” much from the year before, let alone from the previous six years. So, if Sally did learn six years’ curriculum this year, she must have done so on her own. More power to her. Can we say that this is the result of quality teaching? Perhaps inspiring teaching? It’s impossible to say.
4. An option that I think is plausible is that she was able to learn a foundational skill, or a skill set, that helped her understand the test or the test questions better as a whole. How did this happen? It could be that one of her teachers taught these foundational skills in a manner consistent with her learning modalities. But which teacher was it? It probably wasn’t me because I’m not a reading or language teacher. On the other hand, it could have been me since reading is a big part of my class. Or it could be that one teacher, or a combination of teachers, taught the skill, and I reinforced it. So how much credit should I get? How much “value” did I add? On the other hand, she might have learned the requisite skills from a previous teacher, but didn’t actually “get” or understand the skills until this year when it finally “clicked.” In this case, the right teacher will not get credit for Sally’s higher test scores.
5. Related to option 4 is the possibility that Sally acquired or mastered the foundational skill due to neurological development that occurred during this year. She may have been exposed to this hypothetical skill every year for the last five years, but did not possess the neurological “wiring” to assimilate it into her learning until this year. This is an often ignored aspect of learning. In this case, there is no way to assign credit for Sally’s own neurological development.
6. It could be that this year was the first in which she cared about the FCAT. In Florida, the tenth grade FCAT determines graduation. Understanding the importance of this test over previous exams may have compelled her to try harder.
7. Conditions of overall health and well-being may have influenced her test performance. It could be that this was the first year in which Sally received adequate nutrition, sleep, or exercise. Perhaps her home life was finally stable. Maybe her social status in the school was such that she developed a positive self-image. Again, these are not variables within the control of the teacher.
8. I hate to say it, but she could have been under the influence of performance enhancing drugs, legal or illegal, that helped her concentrate or increased her thinking capacity. Many stimulant drugs from Ritalin to cocaine have this effect.
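
The luck explanations above can be illustrated with a small simulation. This is only a sketch under loudly stated assumptions: every simulated student truly gains exactly 78 points, and each year's test adds random measurement noise. The starting score distribution, the noise level (`noise_sd`), and the function name `simulate_gains` are hypothetical choices of mine, not FCAT parameters.

```python
import random
import statistics


def simulate_gains(n_students=1000, true_gain=78.0, noise_sd=150.0, seed=0):
    """Simulate observed year-over-year gains when every student truly
    gains `true_gain` points but each test adds random noise.
    All parameter values are hypothetical."""
    rng = random.Random(seed)
    gains = []
    for _ in range(n_students):
        true_last_year = rng.gauss(1800.0, 200.0)  # hypothetical true score
        observed_last = true_last_year + rng.gauss(0.0, noise_sd)
        observed_this = true_last_year + true_gain + rng.gauss(0.0, noise_sd)
        gains.append(observed_this - observed_last)
    return gains


gains = simulate_gains()
print("mean observed gain:", round(statistics.mean(gains), 1))
print("smallest observed gain:", round(min(gains), 1))
print("largest observed gain:", round(max(gains), 1))
```

Under these assumptions the average gain lands near 78, yet individual students show gains far above and far below it even though nothing about their learning, or their teacher, differs. A single "Sally" or "Phil" is what noise alone can produce.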

Any of the above variables, and perhaps more, could have been deciding factors in her impressive testing gains. Alas, I can’t reasonably take credit for any of them definitively. Yet I’m expected to take great pride in her results and the cumulative results of all of my students for working hard to increase those test scores. Unfortunately, there’s almost no substantive reason for this pride.

On the other end of the spectrum, were there students who scored inordinately lower this year than last? The answer is yes, but fortunately, there were fewer of them than there were Sallies. Let’s focus on one such student, whom we’ll call Phil. Phil scored over five hundred points lower on this year’s reading FCAT than last year’s. Why? Was I such a bad teacher that I, somehow, untaught Phil six years’ worth of learning? I find this hard to believe. What happened?

Well, all of the variables attributed to Sally could be true in reverse for Phil. Perhaps he had a bout of bad luck. It could be that this year’s test score was an outlier on the negative end of the continuum, or that last year’s was an outlier year on the positive end and this year’s scores were consistent with his overall trend. Maybe Phil just didn’t care about this test enough to try hard. After all, the tenth grade FCAT can be retaken and passed at any time between 10th and 12th grades. It could be that Phil knows that he’s not going to graduate anyway, for whatever reason, or that the FCAT is irrelevant to Phil’s educational or personal goals. Perhaps he knows that he’s moving out of state where the FCAT does not matter.

During this time of economic instability, we cannot rule out the possibility that Phil’s test scores reflect a sudden destabilizing of his life. His parent/parents may be out of work, facing foreclosure, arguing over money. Maybe they cannot afford to provide proper nutrition or other health sustaining resources to Phil. It could very well be that Phil had to find a job to help provide for his family. Even the knowledge of reduced economic prospects may have squashed his personal investment in the future, or in school. My school emphasizes college, college, college. If Phil knows that he is not likely to go to college, why should he participate?

Instability does not have to be economic. His family may be unraveling for other reasons. Phil may be facing the prospects of parental conflict or divorce. New members may have entered his household, parental boyfriends or girlfriends. Might these relationships be unhealthy, destabilizing or even abusive?

Phil may have also fallen under the influence of drugs or other unhealthy influences that usurp the focus and goals of young people. This is not unusual for fifteen- or sixteen-year-olds. Even typical teenage relationships with peer groups or romantic interests can interfere with one’s academic prospects. Has Phil changed schools, been uprooted from a comfortable environment?

Then there’s the matter of neurological development. Now, neurology does not typically un-develop in youth, but that does not rule out the prospect of neurological damage caused by falls, toxification or drug use. In Phil’s defense, I do not believe he suffered from brain damage and it is my hope that he has not fallen into a drug habit or addiction (again, I don’t believe he has).

Regardless, there are so many possibilities that might have negatively influenced Phil that are not a reflection of my skills, or the skills of my colleagues. Yet some would penalize teachers for Phil’s poor performance.

One student who was of particular interest to me was a fellow whom we can call Tony. Throughout Tony’s career he has scored high on the FCAT. FCAT is scored between 1 and 5. Tony consistently scored in the high 4’s, last year scoring a 5 on the reading portion. I paid special attention to Tony this year because, no matter what I did, his classwork and assessments never matched his FCAT performance. Tony was bright. If I asked him a question he invariably had an adequate answer. However, he rarely did any of his assignments, and when he did turn something in it was sloppy, poorly written, not done according to directions and often incomplete. I spoke to his mother. I spoke to his other teachers. We had parent teacher conferences. None of us could understand how he could do so well on the FCAT, yet do so poorly in his classes.

This year, Tony resolved the contradiction. He scored a 2 on the FCAT Reading test. How much of this score was strategic on Tony’s part? It’s impossible to say. Which was the true measure of Tony’s abilities? I would hazard, based on observation, that his historic FCAT scores were the most accurate assessment of Tony’s potential. I offer that his class performance was rather more indicative of his work ethic. Of course, there may be some underlying problems that I have not identified.

Regardless, student performance is subject to so many variables, internal or external, and so many ecological pressures that it is impossible to impute the teachers’ skills and influence in a single test score. Yes, there are some statistical procedures that can be performed to parse out a correlation that might indicate positive or negative influence, but they are unlikely to provide significant results when so many variables must be controlled.

Kohn is correct. “So what” is the best response to the news that your students’ test scores have gone up. Teachers should not be working hard “to raise test scores.” Any professional teacher worthy of that title should be working hard to provide the best, most fulfilling, most inspiring education for their students. They should be working hard to prepare their students for a lifetime of learning and growth outside of the classroom and beyond the school building. Such is the laudable life mission of teachers. A mundane bureaucratic endeavor that reduces “education” to a number between 1 and 5 is not worthy of a teacher’s attention, and certainly nothing in which a teacher should take pride.

I have never been proud of a test score. I’ve been relieved by test scores, and disappointed by test scores, but never proud. What makes me proud, and what is a true assessment of the value added in the life of a student? It’s when former students reach out to me and say, “Thank you, Mr. Andoscia.” I have many former students now, building their lives and contributing to society. I don’t remember any of their test scores. Nor should I.

______________________

¹ The lecture was to promote his book, The Schools Our Children Deserve: Moving Beyond Traditional Classrooms and “Tougher Standards.”
² The second reasonable response, according to Kohn, was the more radical, “My dear God, no! What have you denied my child in order to get those higher test scores?” I agree with this more radical statement, but it is beyond the parameters of this essay. I could offer that when teachers are focusing on test scores, there is the possibility that they are not focusing on the holistic value of a child’s education. When teachers take pride in their test scores, they may be magnifying the false value of a purely bureaucratic endeavor at the expense of a truer value in the humanity of their students. To so “deny” a child in such a manner is not consistent with a teacher’s ethics.
³ I am very fortunate to work for a principal who truly believes that providing the best possible education is the best strategy for raising test scores. He has mastered a fine-tuned balance between the virtues of quality teaching and the practical reality that our school is judged based on test scores. In my school, teachers are treated as professionals and given every opportunity to better themselves. Such is the cross borne by quality principals and administrators everywhere. The Kagan method is a valuable tool for teachers and can be presented as such. That the presenter felt the need to justify the Kagan method by using test scores is a critique of our current state of education, not the Kagan method itself.

April 30, 2011: I’d Like to Thank Mr. Connelly, My First Tenth Grade World History Teacher

Fortunately, he didn’t have to prove “value added” in order to keep his job

When I was in 10th grade I was placed into Mr. Connelly’s world history class. It was an average level class with average level students who more or less didn’t want to be in school, or specifically didn’t want to be in a history class. They had very little understanding of history or of how to study the subject. As a responsible teacher faced with students who have little background knowledge, Mr. Connelly didn’t just teach history, he taught history study skills like graphic organizers, time-lines and outlines, etc.

These were skills, however, that I did not need. I always loved history and devoured every source I could on World War II and, at that time, the history of Rome. I was bored out of my mind in Mr. Connelly’s class, doing very little learning and walking away with straight A’s. It didn’t take long for my teacher to realize that I was learning at a level to which he was not teaching. Therefore, he did what any exceptional teacher would and should do. He transferred me to an honors history class where I would be challenged appropriately.

Mr. Connelly wasn’t my teacher for very long, but his professionalism and his judgment may very well have had a significant impact on who I am today in ways that cannot be quantified in the standard, value added manner. In Mr. Johnson’s honors class, I not only accessed a more challenging curriculum, but also associated with other students who thought like me and had similar interests and goals. I was able to refine my academic interests with my interpersonal interests as I was developing a sense of self. So, thank you, Mr. Connelly, no less than Mr. Johnson, who provided the actual learning experience.

This story makes me wonder what Mr. Connelly would do under Florida’s current “value added” model of teaching being pushed through the state legislature. Value added is an economic term that has come into vogue with regard to pedagogy. The merit of a teacher must be determined quantitatively and then acted on either by rewarding teachers who are successful at achieving an acceptable level of quantifiable learning, or by punishing a teacher who fails in this endeavor. The accepted method of quantifying value added among students, and thus determining the “value” of the teacher, is through test scores. According to Florida law, 50% of a teacher’s value is to be determined through test scores.

So let’s take a look at Mr. Connelly in such an environment that determines his value through quantitative test scores. Would it be in HIS best interest to move into another teacher’s class a student who will most likely improve his “value added” scores at the end of the year even if such an action is in the student’s best interest? Of course not. We could argue that holding onto a student who would be better served elsewhere is unprofessional. Indeed it is, but current laws regarding using tests as a reward and punishment system run contrary to any sense of professional ethics. In this case, the best interest of the student is not the only variable influencing the teacher’s decision. The best interest of the teacher, in this case, competes with his professional ethics.

Yes, Mr. Connelly may be a teacher, bound by professional ethics, but it is also likely that Mr. Connelly has a mortgage, a family, even children of his own. The pseudo-measurement of the teacher’s value added determines Mr. Connelly’s ability to meet his own personal and financial obligations and opportunities. Hell, the test scores may also determine his employment or professional status. He would be a fool to sacrifice his job, or even his own children’s well-being, by selflessly acting on the best interest of a student who would improve his own value added measure. If anything, he would want to get rid of students who are likely to test low.

So what we are seeing is the development of a role conflict between a teacher’s professional obligations to his students, and his personal obligations as a private citizen, homeowner or even parent. Exacerbating this role conflict is the fact that most of the variables associated with test scores are outside of the teacher’s control. Test scores are influenced by such factors as socio-economic status, cultural affiliation, nutritional access, parental education level, family stability, neighborhood dynamics, cultural diversity in the school and many other variables that the teacher cannot control. Yes, the quality of the teacher is the number one in-school variable influencing academic success, but it is not the most influential variable overall (parental education level is much more influential).

So when you add social strain to role conflict, what could possibly go wrong? Well, at the very least we can expect that otherwise selfless teachers will become more self-interested. The theory among conservatives is that self-interest is the best possible motivation to secure the high quality outcome. In this case, however, self-interest is detrimental to students. Teachers like Mr. Connelly will actually suffer negative consequences for selflessly locating the best placement for their higher-level student. What’s more, the same predictions can be applied to the lower-level students. Students who consistently score less than proficient or non-proficient are simply not statistically likely to improve a teacher’s value added scores. With limited time resources, and high stakes consequences, the teacher would be better served to concentrate on those students who are borderline or intermediary test takers as they are the ones most likely to improve significantly with the right help. Higher-level students will be expected to improve largely on their own, which they will most certainly do. With some persuasion, the lower-level students might be pawned off to the special education department, their value added calculators muted or silenced. After all, the chances are that the lowest scoring students are those students most influenced by negative outside forces and are the least likely to make significant gains.

Whereas the above might be logical strategies that a teacher or even a school could use to improve scores, certainly one would not argue that they are effective teaching strategies. The teacher is ethically responsible for all students, but only accountable for those from whom they can demonstrate value added. Such a teacher would be defined as effective or even highly effective when looking at the value added scores despite the reality that the actual teaching is of low quality. One certainly could not blame a teacher for following this course when it means protecting his livelihood.

On the other hand, we might also expect to see an increase in more questionable tactics. We already hear about the reality of teachers “teaching to the test.” I have been present during staff meetings in which testing experts pointed out those areas on the FCAT that are worth the most points. Administrators were instrumental in developing strategies for concentrating on those high-point areas at the expense of other parts of the test that were not worth as much. This might be ethically questionable, but in light of a system that penalizes low test scores, costing teachers and schools money and jobs, ethics must take a back seat to practical contingency.

At the extreme, we can also expect to see more cheating. This cheating will be justified by the teacher or administrators perpetrating the con as the only means within the control of the institution to effect change. The ends of keeping a school open and securing hard to come by jobs and funds will justify the means of changing a few answers on a couple of tests. When people feel that they have no legitimate control over their own lives, they will find illegitimate means to achieve that control. Students justify their own cheating in much the same way. There is no reason to suspect that teachers are less inclined to cheat to keep their jobs than are students to keep from being grounded.

The current push for showing “value added” is creating this noxious climate. Value added may effectively improve test scores; such is the measure for success throughout the country. Despite the improved test scores, however, this policy will discourage teaching and produce less educated students. Destroying the selfless professionalism of teachers like my own Mr. Connelly by coercing and rewarding self-interest can only hurt education in the end.
