A.I. and Education: The Peril of Chatbots (2/x)
- Kevin D
- Nov 14, 2024
- 4 min read
If chatbots can usher in an era of greater personalized learning but more limited human interaction, what else should we fear as we weigh the deeper costs of introducing this tool into our classrooms and workplaces?
If the current Large Language Models (LLMs) that power our chatbots are starting to plateau, what ramifications might that have? It could mean chatbots iterate into a more user-friendly mode, akin to mobile websites morphing into apps, or the shift from Web 1.0 to Web 2.0. See Charlie Guo for more on this next needed step for commercial applications (especially when paired with the A.I. agent leap that appears to be coming next).
If the utility of chatbots resides in their ability to mimic human conversation, support student learning, and correct misconceptions via a typed or verbal chat interface, our biggest concern may be the limitations of the software itself. Hallucinations remain an issue: answers contain a hallucination roughly 1-5% of the time, and those hallucinations go undetected up to 20% of the time - not a great error rate when it comes to education. With the threat of hallucinations hovering around the support students receive, teacher checks and intervention apart from the help provided by the bot remain key.
On top of that are the general limitations of personalized, computer-driven learning. In addition to losing the essence of human interaction, education delivered in such a manner could be devoid of a greater sense of humanity. In an era of a (relative) explosion of homeschooling and classical learning that points back to the longer, deeper history of education, can chatbots work as a tool in a discussion-driven Socratic seminar? (Ignore socrative.ai for the purposes of this discussion.)
Others argue that ed tech as a whole has failed - even as chatbots march into our classrooms:
Since the 1980s, a number of meta-analyses (and meta-syntheses pooling these analyses) have been conducted exploring the impact of digital technologies within varied fields of learning. What do the weighted mean effect sizes show?
Math: ES = 0.33 (22 meta-analyses / 1060 studies / 1464 effect sizes)1
Literacy: ES = 0.25 (17 meta-analyses / 736 studies / 1547 effect sizes)
Sciences: ES = 0.18 (6 meta-analyses / 391 studies / 567 effect sizes)
Writing Quality: ES = 0.32 (6 meta-analyses / 75 studies / 85 effect sizes)2
Specific Learning Needs: ES = 0.61 (10 meta-analyses / 216 studies / 275 effect sizes)3
At first blush, this looks very promising: seeing as each effect size is larger than zero, surely this means EdTech is working, right?
Not quite.
In 2023, educational statistician John Hattie released Visible Learning: The Sequel. In this major work, he analyzes over 2,100 educational meta-analyses exploring 357 different moderators affecting learning within the typical classroom. What he found was equal parts surprising and predictable: nearly everything has a positive impact on student learning. In fact, of the 357 included learning moderators, only 33 reported a negative effect size (this includes things like abuse, malnutrition, illness, and mental disorders). In other words, 91% of everything a teacher does can be said to improve learning...
Data suggests that in order for students to maintain a 50th percentile rank nationwide, they must improve an average of 0.42 standard deviations per year (calculated using standardized reading and math data across K-12); anything below this will likely lead to declining rank and vice versa. A similar analysis places this value at 0.46, suggesting an effect size of around 0.44 would be a reasonable educational baseline. As a secondary estimate, when the effect size from all 357 moderators noted above are pooled, Hattie reports the average value sits at a less conservative effect size of 0.4. In fact, he calls this a ‘hinge-point’ and recommends that only those tools and/or strategies with values above this level can be seen to ‘work best’ and should be considered for mass inclusion across education, as those have the greatest chance to deliver the greatest impact to the greatest number of students.
Using the less conservative baseline of 0.4, the aforementioned meta-analyses look much weaker. In fact, the only realm within which digital tools seem to be meaningfully beneficial is in the realm of specific learning needs (a topic we will discuss later in this piece). Use of these tools outside of this context may well be driving learning; unfortunately, that learning will be slower, less robust, and likely lead to a drop in rank compared to other, more powerful, non-digital methods.
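The comparison above can be laid out in a few lines. A minimal sketch using only the effect sizes quoted earlier and Hattie's 0.4 baseline; nothing here goes beyond the figures already in the text:

```python
# Weighted mean effect sizes from the meta-analyses quoted above,
# measured against Hattie's 0.4 'hinge-point' baseline.
HINGE_POINT = 0.4

effect_sizes = {
    "Math": 0.33,
    "Literacy": 0.25,
    "Sciences": 0.18,
    "Writing Quality": 0.32,
    "Specific Learning Needs": 0.61,
}

for domain, es in effect_sizes.items():
    verdict = "above" if es > HINGE_POINT else "below"
    print(f"{domain}: ES = {es:.2f} ({verdict} the 0.4 hinge-point)")
```

Run it and only one domain clears the bar: specific learning needs, at 0.61. Every other category lands below 0.4, which is the whole force of the argument.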
So if edtech as a whole has a small, insignificant impact in classrooms, are chatbots the jet fuel that can show results (i.e., push those numbers above Hattie's hinge-point of 0.4)?
My intuition, until we have more data, is that chatbots should remain one tool in the toolbox - especially for students who need partial support: not the full support of a teacher, nor fully independent work on their own time, but a step up from a YouTube video or article toward a more engaging and supportive experience.
If we are counting on chatbots to be our silver bullet - eliminating educational achievement gaps driven by socioeconomic status, poor teaching, lack of resources, or other factors - we are continuing to hunt for a quick fix to a deeper issue.
