Poor research and ideology: Common attempts used to denigrate inquiry

Time to read: 5 minutes

A few months ago a prominent Melbourne University academic tweeted “Pure discovery widens achievement gaps”, citing the paper “The influence of IQ on pure discovery and guided discovery learning of a complex real-world task”. I was immediately dubious of this research, because the research commonly quoted to show that inquiry learning doesn’t work is usually fundamentally flawed. I’m not a proponent of “pure discovery learning” per se, but I feel this type of research, and the reporting on it, is designed to denigrate all inquiry learning in an attempt to leave only teacher-led instructional approaches standing. Why these researchers don’t instead establish a theoretical basis for instruction is beyond me.

So I took a look at the research to see if educators should have any confidence in its reported findings.

TLDR: No, we shouldn’t have any confidence in this research, and it does not show that pure discovery or inquiry approaches widen achievement gaps.

 

Not surprisingly, this research fails the good educational research test because it doesn’t use a learning theory. That is, the research does not attempt to justify a theoretical basis for its findings. The researcher does use two other (non-learning) theories to defend the research, namely game theory and control-value theory. In essence, the author uses these theories to defend the research design, yet for some reason does not believe a learning theory is also required. I find this baffling.

Why doesn’t the author believe that a learning theory is required to define the scope of the research, given that the research is about learning? Why does the author believe that theory is required to explain games and emotional attainment, but not learning itself?

Anyway, let’s look at the research, as I’m always interested in how this type of research is used to investigate inquiry learning, or in this case pure discovery learning.

The author defines pure discovery learning as learning that occurs “with little or no guidance. Essentially, knowledge is obtained by practice or observation.” The author spends considerable time explaining how pure discovery occurs in so much of our lives, with ATMs and iPhones requiring people to use them correctly without instructions. He continues by explaining how Texas Hold’em poker requires people to “use multiple skills to reason, plan, solve problems, think abstractly, and comprehend complex ideas”, which he argues is similar to real-life pure discovery learning situations.

The author explains: “The poker application used for this study was Turbo Texas Hold’em for Windows, version four copyright 1997–2000 Wilson Software. This is a computerized simulation of a 10-player limit hold’em poker game.”

Wait!!! What????

A computer simulation of a game you play with real people is a suitable method for exploring pure discovery?

Interestingly enough, you can play Turbo Texas Hold’em, the software used in this research, in your browser thanks to archive.org. (Note: if you’re using a Mac, use Function + right arrow when it asks you to press End to play.) In playing it you will discover just what a poor attempt at simulating poker against nine other people this really is. It appears that the study data used in this paper is actually from a previous study by the same author, “Poker is a skill”, dated 2008. The 2008 date still doesn’t explain why such old DOS software was used! In this paper the author explains that 720 hands of Texas Hold’em over six hours is equivalent to thirty hours of casino play with real people as opponents. That is, six hours playing against a computer is supposedly the same as playing thirty hours against real people!

If you play the simulation at archive.org it is easy to see how thirty hours of real play can be “achieved” in six hours using this simulator: turns made by your computer opponents fly past, with short text messages popping up briefly on the screen. Two groups of students used this old software. The researchers provided the instruction group with instructions for a specific poker strategy; the pure discovery group were, for some reason, given documents detailing the history of poker! The success of players was determined by the money they had won (or lost), though it should be noted that the participants were not playing for real money. Instead, the highest-ranking players were promised entry into a raffle for an iPod. This was intended to give meaning to the otherwise valueless money each player was playing for.

So this study is designed to simulate a real-life complex problem, yet it doesn’t even simulate a real-life game of poker! The participants were not playing against real people. The participants were not playing for real money (though their success was measured as if they were). And… the participants were playing five times faster than real poker is played.
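For what it’s worth, the “five times faster” point falls straight out of the paper’s own figures. Here is a quick back-of-envelope sketch (in Python, purely illustrative), using the claim that 720 hands over six hours equals thirty hours of casino play:

```python
# Back-of-envelope using the paper's own figures: 720 hands in 6 hours of
# simulated play is claimed to be equivalent to 30 hours of live casino play.
hands = 720
sim_hours = 6
claimed_live_hours = 30

sim_rate = hands / sim_hours                     # ~120 hands per hour against the software
implied_live_rate = hands / claimed_live_hours   # ~24 hands per hour at a real table

print(f"Simulator: {sim_rate:.0f} hands/hour")
print(f"Implied live play: {implied_live_rate:.0f} hands/hour")
print(f"Speed-up: {sim_rate / implied_live_rate:.0f}x faster than live poker")
```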

All of this should make anyone question how the author can argue that this research design resembles “pure discovery” as it commonly occurs in real-life situations. Interestingly, though the author identifies differences between the instruction and control groups, neither group learned to play poker well enough to stop losing money. Further, both groups played far more hands, more than twice as many, as poker experts are reported to recommend that “good” poker players play. That is, neither group exhibited one of the main traits of good poker players: folding around 85% of the time and playing only 15% of hands.

 

How might research into pure discovery be better designed?

Duh.

 

Playing with real people would allow a learner using pure discovery to observe, and seek to understand, the decision making of other poker players. Depending on their relationship with the other players, the learner might ask questions of their opponents, seeking to clarify rules and strategies. Other players might also intervene in the play, offering advice and pointing out pivotal moments in the hand, and pivotal decisions being made by other players. If the participants had played against real people, surely they would have noticed that they were playing many more hands (much more than twice as many) than their more skilled opponents? Though the computer might be able to simulate the logic of poker, it cannot and does not simulate the interactions between players, a critical feature of playing any game, and especially poker.

Given that the instruction group did not learn to play Texas Hold’em poker to a satisfactory level, it is obvious that the instructional strategies used did not work. Of course, it must be noted that neither did the control group, who were left to battle the computer opponents on their own, armed only with a document on poker history. To suggest that players facing real opponents and using pure discovery or other inquiry approaches would also fail to learn poker satisfactorily is obviously outside the scope of the research, as the researcher did not explore this.

Of course, the lack of a learning theory is what has also led the researcher to his narrow definition of successful learning. Did the author ever consider why people play poker? Is money the only indicator of successful play? Or do people also play games for fun? Are the social aspects of playing with friends an important part of being a poker player?

A more complete understanding of what makes a poker player, a poker player, would consider other indicators, traits, characteristics and motivations.  Did the study participants continue playing poker after the study had finished? Did they enjoy playing poker?  Do they intend to teach friends? Do they feel playing poker with their friends strengthens their friendships? (Not that they were given this opportunity.) Have they developed their own theories and strategies they intend to try out in the future? What do they know about poker?

I believe a better understanding of poker players, and of the reasons people play poker, would greatly improve this poor study. It would also provide further evidence for the worth of learning to play poker by playing poker with friends. Not that this is an earth-shattering conclusion! After all, isn’t that how we all learn to play any new game? Or maybe you’re the one out on the oval by yourself with a ball and a sheet of paper documenting the history of football!

 

Driven By Ideology?
To suggest that individuals playing Texas Hold’em against a computer mirrors the inquiry that happens in our schools is complete nonsense.

To suggest that this research proves pure discovery “widens the achievement gap” is complete nonsense.

To suggest that learning poker by yourself on a computer, playing against a simulation, has anything at all to do with student learning and real inquiry is nonsense.

Do academics who favour high levels of teacher instruction really expect us to believe that inquiry classrooms operate the same way that people learn to play poker individually on their computer?

Do academics who favour high levels of teacher instruction really believe that playing poker on your own against a computer tells us anything about how teacher professional development or teacher pre-service training should be designed?

Do academics who favour instruction really believe that a piece of paper with strategies on it is the best way to learn anything?

Do academics who favour instruction really believe learning is just about knowing, and not about experiencing with others?

Do academics who favour instruction really believe we’re that gullible?

Is there evidence that Positive Education improves academic performance? No


Time to read: 5 minutes

Lately there has been quite a bit of talk in education circles about social aspects of learning, particularly well-being, grit, growth and other mindsets, positive psychology and other social and emotional programs.

My personal opinion is that these are a tacit recognition by proponents of direct instruction that their belief that learning and development is a linear cognitive process of memorising skills is insufficient. Maybe they are starting to understand that development is highly individual in nature, that it is not linear or maturational, and that it is a complex transition to qualitatively new understandings of concepts, new motivations, new relationships with others and the world, new directions, and new results?

Unfortunately, rather than re-examining the more appropriate learning theories of Vygotsky, Piaget and other dialectical approaches to development, these instructionists blindly continue down their misguided path, co-opting bits and pieces into their flawed framework. Rather than designing learning and teaching so that it IS social, they attempt to teach the social as if it were a discrete unit separate from other learning.

One such model is the Visible Wellbeing Instructional Model. Rather than admitting that direct instruction (Visible Learning) and performativity (Visible Thinking) don’t work, its proponents have misunderstood the fundamental idea from Vygotsky and Piaget that all learning is social, and have instead tried to stuff it into their broken Visible Learning and Visible Thinking model in the hope that it will fix it.

How do they justify this? Well, according to them, Positive Education has been shown to increase student academic results by 11 percent.

Unfortunately for the Visible Wellbeing Instructional Model, this is simply untrue.

 

In 2011, Durlak, Weissberg, Dymnicki, Taylor, and Schellinger released their meta-analysis of social and emotional interventions. Notice that their paper is concerned with school-based interventions, not with social and emotional practices that are embedded in standard learning and teaching practice. The finding that is widely reported as evidence that these programs improve academic results appears in the abstract, where they write:

“Compared to controls,  SEL (Social Emotional Learning) participants demonstrated significantly improved social and emotional skills, attitudes, behavior, and academic performance that reflected an 11-percentile-point gain in achievement.”

Seems clear cut right? Wrong!

If you, like me (and seemingly the subsequent researchers who quote this research), took “compared to controls” to mean compared to those who didn’t participate in these programs, you’d be wrong, because that’s not at all what they are saying… Let’s read the paper further.

In Table 5, they specify the results of their meta-analysis:
Skills: 0.57
Attitudes: 0.23
Positive social behaviours: 0.24
Conduct: 0.22
Emotional distress: 0.24
Academic performance: 0.27

Though I’m not a fan of effect sizes, as I believe they are completely flawed, consider what John Hattie says about them in his book Visible Learning:

“Ninety percent of all effect sizes in education are positive (d > .0) and this means that almost everything works. The effect size of d=0.4 looks at the effects of innovations in achievement in such a way where we can notice real-world and more powerful differences. It is not a magical number but a guideline to begin discussion about what we can aim for if we want to see student change.”
(Hattie, pp. 15–17, quoted at http://visiblelearningplus.com/content/faq)

You might notice that all except one of Durlak et al.’s effect sizes fall below Visible Learning’s guideline for even beginning a discussion about them. The only exception is skills (0.57), so according to their own figures the only worth of social and emotional interventions lies in developing social and emotional skills. Everything else, attitudes (0.23), positive social behaviours (0.24), conduct (0.22), emotional distress (0.24), and academic performance (0.27), falls a fair way below the Visible Learning cut-off.
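To make the comparison concrete, here is a minimal sketch (in Python, purely illustrative) that checks each of the effect sizes quoted above against Hattie’s d = 0.4 guideline:

```python
# Durlak et al. (2011) mean effect sizes, as quoted above.
effect_sizes = {
    "Skills": 0.57,
    "Attitudes": 0.23,
    "Positive social behaviours": 0.24,
    "Conduct": 0.22,
    "Emotional distress": 0.24,
    "Academic performance": 0.27,
}

HATTIE_GUIDELINE = 0.4  # Hattie's suggested starting point for "real-world" differences

for outcome, d in effect_sizes.items():
    verdict = "above" if d >= HATTIE_GUIDELINE else "below"
    print(f"{outcome}: d = {d:.2f} ({verdict} the 0.4 guideline)")
```

Only skills clears the bar; everything else sits well under it.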

You’re probably wondering where the 11-percentile-point gain in academic performance comes from, in light of its small effect size. To solve this one, we need to keep reading the paper.

“Aside from SEL skills (mean ES = 0.57), the other mean ESs in Table 2 might seem ‘‘small.’’ However, methodologists now stress that instead of reflexively applying Cohen’s (1988) conventions concerning the magnitude of obtained effects, findings should be interpreted in the context of prior research and in terms of their practical value (Durlak, 2009; Hill, Bloom, Black, & Lipsey, 2007).”
Durlak, Joseph A., et al. “The impact of enhancing students’ social and emotional learning: A meta-analysis of school-based universal interventions.” Child Development 82.1 (2011): 416.

The mean effect sizes in Table 2 (which contains the same figures as above, broken down into further groups, such as classes taught by teachers and classes run by non-school personnel) do seem “small”, because they are small! Very small: so small that Hattie would no doubt suggest you ignore social and emotional programs altogether, unless you’re teaching social and emotional “skills” (0.57).

But what do the authors mean when they say “instead of reflexively applying Cohen’s (1988) conventions”? I looked up the definition of “reflexively”; the Merriam-Webster dictionary gives the following meaning:

“showing that the action in a sentence or clause happens to the person or thing that does the action, or happening or done without thinking as a reaction to something”

Now, I’m not a methodologist like Durlak, whose other paper (Durlak, 2009) is provided as the reference for why the effect size of a social and emotional intervention shouldn’t be interpreted reflexively through Cohen’s conventions. Yet it does seem a bit of a stretch (to a non-methodologist) that the method the methodologist deems appropriate for determining practical value is comparison with prior research.

Table 5 (Durlak et al., 2011)

What the authors did, as far as I can tell as a non-methodologist, in order to “interpret the practical value of social and emotional interventions”, was to compare the results to other social and emotional interventions.

I’ll say that again: the 11-percentile-point improvement in academic results is not a gain relative to control groups who had no intervention at all; it is a gain relative to students in other social and emotional type programs, and all of these students experienced less improvement than those who did not participate in social and emotional programs.

We can see clearly from the last line of the table that the 11-percentile-point figure was produced by comparing the effect size of 0.27 to four other studies with effect sizes of 0.29, 0.11, 0.30 and 0.24.

I’ve taken a quick look at these studies. They describe: 1) Changing Self Esteem in Children, 2) Effectiveness of mentoring programs for youth, 3) Primary prevention mental health programs for children and adolescents, and 4) Empirical benchmarks for interpreting effect sizes in research.

I must admit (as a non-methodologist) that I don’t understand why or how the fourth study, “Empirical benchmarks for interpreting effect sizes in research”, fits the criterion of “prior research”, given that, as far as I can tell, it has nothing to do with social and emotional programs. What that particular paper does describe is that typical effect sizes are around 0.24 for elementary school and 0.27 for middle school. On that research alone, the effect sizes here are either level with or only slightly above what would be expected anyway, hardly a ringing endorsement, nor a source of much faith in the 11 percentile points of academic improvement.

A rudimentary understanding of mathematics also suggests that the extremely low effect size (0.11) of the study into the “Effectiveness of mentoring programs for youth” greatly increased the difference between the study in question and the “prior research”. I’d suggest that if that study had been deemed not to fit the “practical value” comparison, then the 11-percentile-point figure would have been much lower.
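As a rough illustration, and very much my own back-of-envelope rather than the authors’ actual method (which isn’t spelled out in this level of detail), here is a sketch comparing the 0.27 academic effect size with the mean of the four “prior research” effect sizes, with and without the 0.11 mentoring study:

```python
# Effect sizes quoted above: Durlak et al.'s academic performance ES and the
# four "prior research" comparators (labels are my shorthand for the four studies).
durlak_academic = 0.27
prior = {"self-esteem": 0.29, "mentoring": 0.11, "prevention": 0.30, "benchmarks": 0.24}

def compare(label, values):
    mean = sum(values) / len(values)
    print(f"{label}: prior mean = {mean:.3f}, difference = {durlak_academic - mean:+.3f}")

compare("With the mentoring study", list(prior.values()))
compare("Without the mentoring study", [v for k, v in prior.items() if k != "mentoring"])
```

On this crude comparison, 0.27 beats the prior mean (0.235) only because the 0.11 mentoring study drags that mean down; drop it and the prior mean (about 0.277) is essentially level with 0.27.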

So it seems clear to me that the 11 percentile points of academic improvement was determined by comparing the result to previous, similar studies which didn’t work as well. Any other comparison would not have produced the same figure.

 

Of course, to Vygotsky or Piaget these results would not be surprising, for they knew you can’t reduce learning and development to individual traits; we can only understand it as a complex system. Maybe the Visible Wellbeing Model is trying to move towards Vygotsky and Piaget? If so, they’re doing it wrong. By attempting to identify and promote the three traits of teacher effectiveness, teacher practice, and wellbeing, they’re not seeing them as a system but rather as three individual traits sitting side by side. Yet at the same time they’re only measuring one trait: test scores. And when you only measure one trait, guess what, the only trait that matters is that trait!

For Positive Education and wellbeing to ever produce a substantial effect size, what is measured would need to change, just as it did to produce the contrived 11-point figure. But can what Visible Learning’s effect sizes deem important change? Could they decide what matters while still believing in “evidence”?

Such is the conundrum the Visible Wellbeing Model finds itself in: theoretically baseless, considering only test scores worthwhile, what it finds worthwhile isn’t what its proponents know to be worthwhile… No wonder most of us still listen to Vygotsky and Piaget!

 

Personally, I believe that learning and development are social, so this post is not meant to belittle the wellbeing movement, but rather to suggest that reducing the social and emotional to skills to be learned through programs and interventions is, in my opinion, a missed opportunity. Further, to think we can bolt on wellbeing in order to improve test scores is to misunderstand how our students actually learn and develop.

 

Incidentally, inquiry-based learning comes in at 0.35 in the incredibly flawed Visible Learning meta-analysis; maybe it is time it replaced Positive Education, with its effect size of 0.27, as one of the three components of their model?