The Testing Conundrum

Can teacher quality be assessed through student outcomes?

Recent research supports a direct link between teacher quality and student learning. At the same time, it is argued that the current teacher evaluation systems are not working. Under current teacher evaluations, most teachers are rated as exceptional. Yet students perform far below the exceptional level on state, national and international assessments. Although numerous factors contribute to student performance, there appears to be a disconnect between teacher ratings and student performance. Precisely which elements of teaching lead to improved student learning and how to measure those elements remains unclear.


The words above were written by Sarah Melvoin Bridich for her dissertation at the University of Denver in 2013. The focus of her graduate study was Colorado’s Senate Bill 10-191, an education mandate that “aimed to improve student learning by overhauling the teacher and principal evaluation system and eradicating teacher tenure.” What the Colorado legislators failed to understand is that by focusing on teachers and principals while ignoring those on the other side of the coin (students), they created a system that invites fraud.

This "disconnect between teacher ratings and student performance" has sparked knee-jerk responses from lawmakers across the country. In Colorado, the legislature decided that incorporating these four elements into educator evaluations would bring a sorely needed 'performance-based' aspect to them:

  • annual teacher and principal evaluation

  • collection of student growth data

  • collective responsibility

  • school improvement plan


School districts across Colorado scrambled in 2011 and 2012 to rewrite their teacher evaluation systems. All the public knew (or thought they knew) was that now, finally, teachers would be evaluated according to their performance, and nothing else.

This sentiment could not be further from the truth. In my book, Chaos in our schools, I share the details of my district's new plan, rolled out in 2013. It created a mostly two-pronged approach: the principal's ratings on performance (50%) and the teacher's own student growth data (15% in 2013; 30% by 2018).

The results of student testing in any one school, across grade levels, were now applied equally to all teachers in that school as 20% of their evaluation. This fulfilled the idea of "collective responsibility," something the power structure thought would encourage teachers to work together for the good of all students.

The "school improvement plan" portion was both created and assessed by the principal, for his own school. When it was introduced to the faculty in 2013, it was without any specifics as to what went into it, but it counted for 15% of our evaluation.
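
To make the arithmetic concrete, here is a minimal sketch of how the four 2013 weights described above (50%, 15%, 20%, 15%) could combine into a single composite rating. Only the weights come from the plan as described; the 0-to-4 rating scale, the sample scores, and every name in the code are assumptions made for illustration, and the district's actual formula may have worked differently.

```python
# Weights for the 2013 rollout as described in the text; they sum to 100%.
WEIGHTS_2013 = {
    "principal_rating": 0.50,           # principal's ratings on performance
    "own_student_growth": 0.15,         # the teacher's own student growth data
    "collective_school_results": 0.20,  # school-wide test results ("collective responsibility")
    "school_improvement_plan": 0.15,    # plan created and assessed by the principal
}

# Components that the principal assesses personally, per the text.
PRINCIPAL_ASSESSED = ("principal_rating", "school_improvement_plan")


def composite_rating(scores, weights=WEIGHTS_2013):
    """Return the weighted average of the component scores.

    `scores` maps each component name to a number on a hypothetical
    0-to-4 scale (the scale is an assumption, not the district's).
    """
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1.0")
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    return sum(weights[name] * scores[name] for name in weights)


def principal_assessed_share(weights=WEIGHTS_2013):
    """Fraction of the evaluation that rests on the principal's judgment."""
    return sum(weights[name] for name in PRINCIPAL_ASSESSED)


if __name__ == "__main__":
    # Hypothetical teacher: rated highly by the principal, weak test results.
    example = {
        "principal_rating": 4.0,
        "own_student_growth": 2.0,
        "collective_school_results": 2.0,
        "school_improvement_plan": 3.0,
    }
    print(round(composite_rating(example), 2))
    print(round(principal_assessed_share(), 2))
```

Under these assumed scores, the teacher's own student growth data moves the composite far less than the principal's rating does, since the two principal-assessed components together carry 65% of the weight.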


A teacher's Annual Evaluation, the only part of the new mandate that mirrored evaluations prior to 2013, has always been assessed by one person (the principal), from his subjective interpretation of a list of Quality Indicators regarding a teacher's performance:


Professional Preparation

  a) Demonstrates accurate, up-to-date knowledge of subject matter.
  b) Demonstrates knowledge of how to integrate subject matter and literacy across content areas.
  c) Implements research-based best practices in instruction.
  d) Develops lesson plans incorporating effective lesson design.
  e) Plans and implements district-adopted curriculum through alignment of resources and assessments.
  f) Aligns content within course and with previous and succeeding grades/courses.

Professional Practices

  a) Communicates to students expectations for learning.
  b) Models and facilitates higher-level thinking, problem-solving, creativity, and flexibility.
  c) Adapts instruction to meet the instructional needs of all students.
  d) Administers all building, District, and State assessments with fidelity.
  e) Uses a variety of assessments to make instructional decisions.
  f) Explicitly communicates criteria for student success.
  g) Develops a safe and welcoming learning environment.
  h) Collaboratively develops, models, and communicates clear expectations for student behavior within a learning environment.
  i) Develops and carries out appropriate consequences in the classroom.

Professional Responsibilities

  a) Participates in professional learning opportunities and applies what is learned.
  b) Establishes and maintains professional communication, which is clear, responsible, and respectful.
  c) Establishes and maintains meaningful two-way communication in a timely manner with students and guardians.
  d) Collaborates to accomplish team, school, and district goals and practices.
  e) Maintains up-to-date records of student progress according to District policy and school norms.


How can any principal know all this about any one of his teachers? The short answer is, he can’t. But he can pretend to, writing up narratives that either praise the teacher or ding her for incompetence. Every bit of the report-out of these indicators is subjective; it follows, then, that a teacher’s rating depends in large part on her favorable standing with the principal, which takes us back to an earlier statement by Bridich:

Under current teacher evaluations, most teachers are rated as exceptional.

We can probably add that “this is most likely because it is easy to rate someone you like as exceptional”. Most teachers hired and retained by a principal are probably also liked and valued by him or her.

Those outside education but with a front-row seat to its ineffectiveness had a hard time wrapping their heads around this, the idea that teachers could be rated excellent by their principal even with mediocre student performance. The entire teacher evaluation system, they reasoned, was no more than a façade, with all participants going through the motions of a choreographed dance, but without any real repercussions for missteps. This is actually true, by the way, as I explain in my book. Prior to 2013, most of us teachers went through the motions of our evaluations as if on autopilot, as did our principals. It was a necessary annoyance in order to have the job we loved, but it meant very little to our careers.


So now we come to testing: can it be used to rate teachers accurately?

Since No Child Left Behind (2001), the power structure has felt a need to tie teacher performance to student outcomes. Rating teachers on their students' test results might seem an obvious protocol, but that avenue had been stymied by the unions for years. (This might be the only area of union interference in education that I agree with.)

The problem with comparing supposed teaching performance with student outcome is that we are all assuming the evidence of that outcome (test results) has been obtained through earnest effort.

Has it?

Look at the questions on state assessments that are based on PARCC (Partnership for Assessment of Readiness for College and Careers). Here is a question for third grade math:


There's a lot of reading the child must do merely to answer the question 'Do you understand that 1 ten and 15 ones = 10 + 15, which = 25?' This question could have been written so much more directly if that is what we are trying to determine.

So just what are we testing? The kid's math prowess or his ability to interpret and stick with a fairly long question...? (Remember: this kid is 8 years old!)

And, if the child does not explain to the satisfaction of whoever grades his test why he answered the way he did, he does not pass that question, even if he wrote the correct answer (the first game). If the kid writes his 'explanation' by showing his work, like this:

10 + 15 = 25
20 + 1 = 21

25 > 21

does he still get credit for explaining it, even though he didn't use words?

Notice that if all he typed in the box are the numbers to show he can interpret place value correctly, like what I typed above, he didn't actually answer the question that was asked, either. (Did John score more points in the first game or the second game?) So how is that scored?


There is also a Part B to tackle:

Keep in mind that this is a computer test. There's a lot of typing involved. What if the child doesn't feel like typing or has difficulty with the interface? This is why, incidentally, teachers spend a lot of instruction time training kids how to test, using practice tests thoughtfully provided by the publishers.


Here's one more question from that third grade math test:


"Explain how the rows and columns can be used to model the total number of stickers on the two sheets." What?? The kid has received instruction on using rows and columns for multiplication, because that's how we've been teaching math for at least a decade. But when he sits down to a computer test with a convoluted question like this, the only way he's going to get the correct answer is if he takes the time to figure everything out (assuming he can) and type his response fully in the box. Having seen many children test over the years, I can say with confidence that more likely, the kid will shrug and think 'I don't know what to do with this, and I don't really care. Next question.'


[Click here to see examples of 4th and 5th grade math questions.]


The more I look at practice tests, the more I realize that the best way to get a kid to test well is to use practice testing as a mainstay in teaching. Twenty years ago, we teachers scoffed at 'teaching to the test' and vowed it would never happen.

Oh, how the times, they are a-changin'!


The full math battery for grades 3 through 6 is usually three to five tests, with around 50 questions per test and a time limit of 60 to 90 minutes, depending on grade level. A student only passes the math assessment if he does well on all parts of it. We're asking for a lot of stamina from fairly young students in order to display that they know grade-level math. I encourage you to look at more third grade math questions here.

Now let's look at “critical reading” for third grade:

Notice the complex sentence structure.

And the vocabulary used.

Here's the rest of the article:

And here is the first question:

How long did it take you to find the word "designed" in paragraph 6? How long do you think an eight-year-old would persist in looking for it?

And there's another angle to this question: if he gets the answer to Part A wrong, what hope does he have for Part B? As older and more mentally mature people, we adults probably understand that Parts A and B work in concert with each other; we would analyze them nearly simultaneously to figure out which answers have to be correct, in order to support one another.

If an eight-year-old has the cognitive ability to do that, he's an advanced learner. But every third grader is expected to do this, according to Common Core, the current trend that governs how these tests are written.


In regular class, the child is encouraged to ask questions when he doesn’t understand something and to work with partners to troubleshoot his reasoning when it falls short. Testing requires just the opposite. Is it any wonder that many students opt out of putting forth a best effort?

Yet society places tremendous stock on the results procured by these tests.

Perhaps it’s time to address the disconnect between testing and student motivation.

After all, if the Common Core folks insist that every third grader is capable of passing their tests, then the only real hindrance is the motivation to do well, because teachers have been teaching this material, in this way, for more than a decade. (For more on Common Core, read our FEATURE here.)


If a test requires a great deal of thought, interpretation, and problem-solving, why would any student push himself to do well on it?

The only reason I can think of is that the test is important to him: how he fares on it will have a direct impact on his education, such as classroom placement and even grades.

At present, this is not the case, and students know it.

This is where any reform efforts must begin, with the value the recipient places on his own learning. Without addressing that, any other reforms are doomed to failure. The evidence can be seen in test scores over the last 11 years. Teachers jump through a lot of hoops to achieve their excellent rating, while students continue their poor achievement.

The following table shows the percentage of 6th graders and 3rd graders who PASSED Colorado's state assessment from 2015 to 2021:


So just how effective were the massively expensive new teacher evaluation protocols in improving student performance? Not very, it would seem.


Post note:

Public education has been a bone of contention on our political landscape for decades. It will not be easily fixed because its problems date back to the 1980s, when the newly created Department of Education published its in-house investigative report "A Nation at Risk." This scathing report sent us down a slippery slope of mandates and policies that can only be supported by circular bureaucracy.

Our latest video explains how.


My book CHAOS in our schools provides the details behind how this bureaucracy has encouraged everyone in education to engage in subterfuge and fraud. Surely this cannot be what the power structure actually intended.

Or can it?

