The toxic myth of good and bad teachers


There are a number of claims made by various people about the effect on a student of having a good teacher versus a bad teacher. Most of these claims are nonsensical, and rather than increasing the likelihood of improvement in schools, they do a great deal of damage to teachers, students and schools, and make school improvement much less likely.


There aren’t many good teachers and there aren’t many bad teachers; most teachers are just average. We know this because teacher quality, measured across all teachers, results in a normal distribution, a bell curve. Sure, we’d find that there are a few high performing teachers at the top end and a few low performing teachers at the bottom, but the vast majority would be in the middle with not much separating them. If you’re a teacher who is reading this post, I’ve got bad news for you: you’re almost certainly an average teacher, just as I am almost certainly an average teacher. While we’d like to think that we’re high performing compared to our colleagues, the actual evidence points to the contrary.

It would be the same if we measured the quality of carpenters, golfers, doctors, lawyers, public servants, scientists, whoever… but for some reason we don’t complain about the quality of other professions. Teacher bashing has become a convenient excuse for far too many critics.

A few weeks ago, The Age newspaper identified some of the A+ teachers helping students to 40+ in VCE here in Victoria, Australia. In this article five teachers, whose VCE results stand out clearly above other teachers, are identified as great teachers. One teacher had ten out of the top 14 students in VCE Sociology, another taught 17 of the top 33 students in his Business subject, and another taught 2 of the top 8 students in their Australian History. The results these teachers have achieved are exceptional, and clearly it is impossible to believe that every teacher could produce these kinds of results. But it is also clearly wrong to suggest that other teachers are bad teachers just because they aren’t producing these kinds of results.

Obviously, every parent whose child is undertaking VCE would love to have teachers who produced these results. Yet to believe that it is possible for the majority of teachers to produce results similar to the teachers in this article is plainly wrong. It is impossible for most teachers to have a number of students in the top bracket of students. Of course it shouldn’t be surprising that there are teachers who produce results like these. Rather, it is to be totally expected, because the distribution of teacher quality is shaped like a bell curve.


This is why statements from people like John Hattie are so misleading. According to Hattie “teachers account for a variance of 30% in student achievement.” I’m not convinced this is true, but even if it is, is Hattie describing the maximum variance between the two limits of the bell curve? If he is, then the 30% variance only applies to a tiny fraction of exceptionally good teachers compared with the tiny number of exceptionally poor teachers. For the vast majority of teachers, whose quality lies in the middle of the curve, the variance will be close to non-existent. Sure, there may be a theoretical maximum 30% difference between the absolute best and the absolute worst, but for the vast majority the variance will be close to zero.


Looking at the graph above, Hattie’s 30% maximum variance between good and bad teachers, even if it is true, is vastly overstated. Three standard deviations either side of the 15% mean fall at 6% and 24%, which covers the middle 99.7% of teachers. As such there is only an 18% difference across the middle 99.7% of teachers. Looking at two standard deviations, which account for 95% of all teachers, slims the variance to 12%! Meanwhile the middle 68% of teachers (1 standard deviation) shows only a 6% variance in student achievement, a far cry from the stated 30%!
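The band widths quoted above can be recomputed with a short sketch. It assumes, as the figures in this post imply, a mean of 15% and a standard deviation of 3 percentage points; the function names `band` and `coverage` are illustrative only and are not taken from Hattie’s work.

```python
from math import erf, sqrt

MEAN = 15.0  # assumed midpoint of the 0-30% range, in percentage points
SD = 3.0     # assumption implied by the post: 3 SDs either side of 15% gives 6% and 24%

def band(k, mean=MEAN, sd=SD):
    """Return (low, high, width) of the band k standard deviations either side of the mean."""
    low, high = mean - k * sd, mean + k * sd
    return low, high, high - low

def coverage(k):
    """Share of a normal distribution lying within k standard deviations of the mean."""
    return erf(k / sqrt(2))

for k in (1, 2, 3):
    low, high, width = band(k)
    print(f"{k} SD: {low:.0f}% to {high:.0f}% (width {width:.0f} points), "
          f"covers {coverage(k):.1%} of teachers")
```

Under these assumptions the widths come out at 6, 12 and 18 percentage points, and the coverage values are the familiar 68-95-99.7 rule to within rounding.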

Of course Hattie might not believe that teacher quality fits a normal distribution, but if so he needs to justify why he believes this and say how many good and bad teachers there are in our schools. He may also suggest that the absolute maximum difference is more than 30% and that the 30% figure corresponds to the third or even the second standard deviation from the mean, but if that were the case, then surely that would be the figure promoted, or he would explain how many teachers are subject to this variance. But he doesn’t, so I believe it is safe and proper to fit a normal distribution to his variance claims.

Dylan Wiliam similarly tries to promote the myth of good and bad teachers. Speaking at the ALT-C conference in 2007, he said, “If you get one of the best teachers, you will learn in six months what an average teacher will take a year to teach you. If you get one of the worst teachers, that same learning will take you two years.” Again, I’m not agreeing with these figures; in fact I highly doubt them, unless of course Wiliam is speaking about pure memorisation and direct instruction, which he well may be.

Wiliam gives us a little more information than Hattie though. His figures imply that his data doesn’t fit a normal distribution, where the average would sit midway between the lower and upper bounds. If Wiliam’s data fitted a normal distribution then the average would be 15 months instead of 12 months, as 15 months is 9 months more than the lower bound of 6 months, and 9 months less than the upper bound of 24 months. As such, Wiliam’s assertion fits what is called a positively skewed distribution, as shown below.


By graphically representing Wiliam’s figures, it is obvious that his claims are overblown. It is clear that the far majority of teachers produce about the same results, and the teachers at the lower and upper ends of the impact are in the tiny minority. Also according to Wiliam, most teachers, that is more than half, are not producing a year’s worth of learning a year! While the mean is 12 months, in a positively skewed distribution the mode will be less than the mean. In Wiliam’s world, more than 50% don’t produce a year’s worth of learning in one year… and somehow it is their fault??


Even if the maximum difference between the best and the worst teaching is 18 months, the actual variance among what counts as an average teacher, and the variance across the vast majority (1, 2 or 3 standard deviations from the mean), would be much, much smaller, and again, just like Hattie’s 30% variance, extremely overstated for the majority of teachers and students. Even if Hattie’s and Wiliam’s figures are correct, the point of their message must be that this difference does not occur regularly in our classrooms but rather is an extremely rare exception.


The cumulative effect of Hattie, Wiliam and others suggesting that these rare and extreme examples of teacher quality variance are in fact common occurrences is that teacher quality is viewed as a much bigger problem than it is. Yes, if their figures are accurate, it should be a big concern that 0.15% of teachers (those more than three standard deviations below the mean) impact student learning outcomes much less than others (probably measured solely through test scores), but it is a rare problem rather than the systemic problem that many people believe, and it should be seen and treated as such. Also, it should be recognised that the problem of variance in professional quality is not unique to teaching but rather occurs at the same distribution and to the same degree in every profession.

Atul Gawande posed the question “What happens when patients find out how good their doctors really are?” in his 2004 article The Bell Curve. Gawande describes the efforts of 117 cystic fibrosis clinics across the US over the last 60 years. We’d like to think that when we go to hospital we would get the same quality of care and would have the same expected outcomes regardless of which hospital we attend and which doctor attends to us. Yet Gawande tells us that isn’t the case at all: there are good doctors and bad doctors, with the good hospitals in 1997 reporting life spans 16 years above the average for cystic fibrosis patients! Gawande points us to the bell curve and reports that the vast majority of doctors and hospitals, however, are average, and their patients have much shorter life expectancies.

Gawande then explores how hospitals reacted to the news that the care they were providing to their patients was average, and the efforts they made to increase the quality of the average majority. While there have been substantial improvements, Gawande insists that the bell curve remains, and will always remain, and the difference in life expectancy for cystic fibrosis patients (and patients with other life-threatening illnesses) will always be dependent on the quality of the care they receive.

Gawande finishes his article examining himself as a surgeon. What if he found out that he was just average, or worse? For Gawande, however, the problem of being average isn’t as big as settling for being average, something that, as I read him, he would rather quit being a surgeon than do.

So do the doctors and hospitals who provide average quality care for cystic fibrosis patients at their clinics want to improve? One would hope so. Simply identifying them as average doesn’t mean they are happy being average, and there is no evidence to suggest they are. The bell curve in itself does not and cannot distinguish those who want to improve from those who don’t. In fact, Gawande points to patients who chose to stay with their average doctors because of the care they feel has built up over a number of years.

It is distressing for teachers to acknowledge the bell curve. After all, we all want to view ourselves as being good teachers as opposed to being average, but a realistic understanding of how skills and knowledge are distributed across a cohort forces us to face this unwelcome truth.

Of course it would be easier if we actually could measure teacher quality, which would allow us to identify and quantify good, bad and average teachers. The problem is that we don’t have a universal way of understanding teacher quality; various groups have tried, but they haven’t done a good job of it. The previous government here in Victoria unsuccessfully tried to implement a system where school principals would rate their teachers from 1 to 5 in order to identify 20 to 40% of them as underperforming and not eligible for pay promotion. Clearly those suggesting this system believed that teacher quality in Victorian schools didn’t fit a normal distribution but rather a negatively skewed distribution. Of course, the only reason they had for this was budgetary.

This is where the toxic nature of talking about good and bad teachers is revealed. After all, what matters more: the actual distribution of teacher quality, or what people believe the distribution looks like? What happens when a myth is propagated that teacher quality doesn’t fit a bell curve but rather fits a negatively skewed distribution?

Furthermore, in the absence of appropriate data we do what most people do: we assume that we are a good teacher and therefore that we are the definition of a good teacher. And if we’re not teachers ourselves, we base our view of good teachers on the teachers we had when we were at school. It’s almost as if we say, “I might not know what teacher quality is, but I know a great teacher when I see one.” Which might sound reasonable… but in reality these ideas have an incredibly narrow view of what a teacher is, and quickly descend into discrimination and teacher bashing.

Discrimination and teacher bashing? How?

Well, some people believe that to be a mythical great teacher you need to be a highly passionate, caring teacher. In this narrative great teachers are in the mould of Miss Honey from Roald Dahl’s “Matilda”, with a rare gift to inspire and connect with their students. These people point to inspirational teachers who taught them when they were in school, or the inspirational teacher they believe themselves to be.

This narrow understanding of teacher quality creates unrealistic expectations. It really is impossible for every teacher, in most schools, to have an amazing rapport with each and every student. As a result, quality teaching comes to mean a teacher who displays their passion for teaching by working long hours and having teaching as their only real priority, who is always positive and never has a bad day!

While we’d all like every teacher to be passionate about teaching, discrimination happens when we expect every teacher to be thinking only about teaching and willing to put in every hour they can. Single parents and others who, for a range of reasons, are unable or unwilling to devote every waking hour to teaching are quickly labelled as bad teachers who should be moved on, overlooked for promotion or discriminated against in other ways. People with problems in their personal lives, or who suffer from medical conditions, might not always project this image of the inspirational teacher, and when we’re on the hunt for bad teachers these people can soon be in our sights…

Others believe that to be a mythical great teacher you need high level knowledge and skills. A good teacher is so much smarter and more knowledgeable than a bad teacher. Pretty soon though we’re lining up those teachers we don’t think are knowledgeable enough and moving them on. Tests have recently been proposed here in Australia to check that new teachers are literate enough to teach, despite them having passed their teaching degree and all their school teaching placements.

Older teachers who are not up with technology might be the first to go. Next might be women who have taken maternity leave and have a big gap in their experience or who are not able to (in our eyes) balance family/work. Next might be those who aren’t on Twitter day and night, attending professional conferences whenever they can in order to keep their skills up to date.

We’re all too quick to blame and label those teachers who aren’t just like us. Rather than celebrating diversity and considering what it might offer our students and our education system, we see diversity as being undesirable. We see diversity as being different from good, we blame people for not being exactly like our picture of an ideal teacher, and we make erroneous assumptions about them.

Again this is something that Dylan Wiliam gets really wrong when he says, “if we create a culture where every teacher believes they need to improve, not because they are not good enough but because they can be even better, there is no limit to what we can achieve.” While this might sound reasonable, sort of, where is Wiliam’s evidence that every teacher doesn’t currently want to improve? My confident guess is that teachers’ desire to improve is also distributed as a bell curve, and Wiliam’s assertion that there are many teachers who don’t want to improve is misguided and overstated.

Wiliam’s attempt to link a teacher’s desire to improve to the variance of teacher quality is also false. You cannot overcome the bell curve by wishing it away, any more than every golfer could play as well as professional golfers if only they wanted to improve! It is silly. And where does Wiliam’s faith in limitless potential derive from? Surely finding better approaches for learning and teaching is where limitless potential might be found, such as via new pedagogical approaches afforded by modern technologies?

But those who talk about good and bad teachers don’t want to find new pedagogical approaches, they’re happy with the system we’ve got. And shame on anyone who can’t be a good teacher in their system and can’t reap good results using their approaches. According to these experts, it’s not the bell curve that’s the problem, it is the teachers themselves.

Not only does this lead to discrimination, with anyone who doesn’t fit their mould being labelled a bad teacher, it also leads to not focussing on what could actually improve student learning outcomes. While we try to narrow the quality gap, whether it be Hattie’s 30% or Wiliam’s year and a half, we’re not concerned with why all teachers can’t successfully teach in Hattie’s and Wiliam’s systems. We’re not looking for pedagogical approaches (constructivism anyone? inquiry anyone?) that might not be so susceptible to such variances due to teacher quality.

Consider the Measures of Effective Teaching (MET) project, whose goal is to identify effective teaching. I’m still at a loss why you wouldn’t just use test scores as a predictor of future test scores, unless of course you’re trying to pretend that student learning isn’t just about test scores. Of course, if you want to try to pretend that you can measure effective teaching beyond test scores you can then appear to agree that learning isn’t just about test scores, which I guess is why MET suggests approaches that weight test scores somewhere between 33% and 50%…

In order for every student to achieve success we need learning and teaching approaches that are suitable for average teachers. We need to recognise that education of our students is far more than test scores. That is the first stage and until we’ve done that we need to lay off teacher quality. If Hattie, Wiliam, and others do believe that education is all about test scores then they need to be honest and upfront about that before we start labelling teachers as good and bad.

How many good and bad teachers there actually are matters a lot. Take, for example, the report “Great Teaching, Inspired Learning: What does the evidence tell us about effective teaching?”, where the authors say: “Modelling by the US economist Erik Hanushek estimates that if a student had a good teacher as opposed to an average teacher for five years in a row, the effect would be sufficient to close the average performance gap associated with low-socioeconomic status.”

But how likely is it that a student had a good teacher as opposed to an average teacher five years in a row? If we want the results that Hattie and Wiliam suggest the best of the best teachers can achieve, then we’re looking at the teachers above the third standard deviation, or 0.15% of teachers. How likely is it that a student would have these teachers for five years in a row? We can work this out by multiplying 0.15% (0.0015 as a fraction) by itself five times:

0.0015 x 0.0015 x 0.0015 x 0.0015 x 0.0015 ≈ 0.0000000000000076, or about 0.00000000000076%

This is so unlikely you wonder why Hanushek would even bother suggesting this.

If we believe that teacher quality fits a normal distribution, how many standard deviations are we going to choose to identify good teachers; that is, where do we set the bar? Say we set the middle 68% as the average (one standard deviation), which means the top 16% will be good teachers. How likely is a student to have a good teacher five years in a row? Only about 0.01%! Alternatively, if we believe that 80% of teachers are good, then only about 33% of students will have a good teacher for five years in a row. And where do Hattie, Wiliam and their peers set the bar: what range of teacher quality can their pedagogical approaches tolerate and still work?

We have two choices. We can follow the path of Hattie, Wiliam and their peers, who think that our pedagogical approaches are set in stone and appropriate and that teacher variance is the problem, or we can decide that teacher variance shouldn’t impact student learning and that our pedagogical approaches should instead ensure all students equally experience learning success. Make no mistake, a focus on teacher quality is incompatible with a focus on pedagogical innovation and improvement, and conversely a focus on pedagogical innovation and improvement is incompatible with a focus on teacher quality. We need to choose which focus we believe offers the biggest gains for increasing student learning and equity.

I believe that we need to find, and that we can find, learning and teaching approaches that work for almost all (99.85%) teachers. If we can find pedagogical approaches that work for 99.85% of teachers, then 99.25% of students will have access to exemplary learning experiences for five years in a row. This will not only result in better learning outcomes but also a system that is more inclusive, equitable and more diverse.
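The five-years-in-a-row percentages discussed in this post can be recomputed with a minimal sketch. It assumes that each year’s teacher is an independent draw from the whole teaching population, which is the simplification the multiplication relies on; real class allocation within a school is unlikely to be this random. The function name `streak_probability` and the scenario labels are illustrative only.

```python
def streak_probability(share, years=5):
    """Chance of drawing a teacher from a group of the given share every year.

    Assumes each year is an independent draw (a simplifying assumption).
    """
    return share ** years

# Shares of teachers discussed in the post, as fractions rather than percentages.
scenarios = {
    "top 0.15% (beyond 3 SD)": 0.0015,
    "top 16% (beyond 1 SD)": 0.16,
    "80% counted as good": 0.80,
    "99.85% covered by the approach": 0.9985,
}

for label, share in scenarios.items():
    p = streak_probability(share)
    print(f"{label}: {p:.3g} as a fraction, {p * 100:.3g}%")
```

The design point is simply that the result is extremely sensitive to where the bar is set: the same five-year calculation ranges from practically zero to almost certain depending on the share of teachers the approach is expected to work for.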

Improving our pedagogical approaches so that they work effectively for all students and teachers is a complex task, and one that we won’t be able to solve while we continue to lay the blame on bad teachers.

For me, the choice is clear. We need to stop speaking about good and bad teachers, we need to stop worrying about teacher variance, and instead we need to focus on what might actually make a difference in the lives of our students: developing higher quality learning and teaching approaches that are not limited by the variance of teacher quality.


Footnote: By the way, the bell curve also applies to good and bad school leaders. Sure, there may be a tiny few great school leaders, and a tiny few terrible ones, but most of them are just part of the average majority… as is the case for doctors, public servants, politicians, car drivers, golfers, singers…


Update: Feedback from Andrew Worsnop suggests that I’m misusing Hattie’s 30% figure, so I’ve expanded the section on Wiliam’s figures as it makes the same point. I haven’t changed my writing on Hattie’s 30% though, as I’m not sure that I agree with Andrew that I am misconstruing what Hattie is saying about the 30% variance in teacher quality/impact.


Image credit:  A visual representation of the Empirical (68-95-99.7) Rule based on the normal distribution. Creative Commons Attribution-Share Alike 4.0 International license.