This article is authored by Meghan McCormick, Research and Impact Officer at Overdeck Family Foundation, and was originally published by The 74, a nonprofit, independent news organization focused on education in America.

The public release of ChatGPT in November 2022 sparked a wave of fear and excitement among educators. While some expressed hesitation about the ability of generative artificial intelligence to make cheating undetectable, others pointed to its potential to provide real-time, personalized support for teachers and students, making differentiated learning finally seem possible after decades of unmet promises.

Today, that potential has begun to come to fruition. Recent national survey data indicate 18% of teachers have used genAI, mostly to support differentiated lesson planning, and 56% of educators believe its use in schools will continue to grow. Increasingly, districts are introducing students to this technology, with products like Khanmigo — which provides individualized tutoring — already being adopted in Indiana, Florida and New Jersey. And students are experimenting with it outside the classroom as well. According to a recent survey, approximately half of 14- to 22-year-olds report having used genAI at some point.

But rapid changes in technology and the speed of adoption are far outpacing the field’s understanding of impacts on teaching and learning. Every day there is a new story about an exciting AI-related development, but given the time it takes to conduct careful evaluation, very limited evidence exists about whether any of these tools have positive benefits for students. As schools start facing hard choices about where to spend their resources in response to continued learning gaps and the ESSER funding cliff, it’s important to take a look at what we know about the impact of genAI on education and what more we need to learn.

What we know

Educators spend about 46% of their time on tasks that don’t directly involve teaching, ranging from taking attendance and submitting reports to giving written feedback to students. GenAI tools hold promise for speeding up and even automating these tasks, saving time that could be spent building meaningful relationships and deepening learning. For example, researchers from UC Irvine found that teachers in California and North Carolina who used the genAI product Merlyn Mind, which automates test question creation and lesson planning, reported spending less time on administrative tasks and more on teaching and learning after seven weeks of use compared to educators without access to the tool. And about 44% of teachers who have used genAI agree the technology has made their job easier.

To date, however, most of these findings rely on anecdotal reports. To quantify the impact of genAI on time saved, the field needs more rigorous evidence — such as through randomized controlled trials — to not only gauge the impact on administrative burden but to explore whether these tools help improve teaching quality.

A separate body of research is finding that genAI-based coaching tools, which aim to give regular, impartial, real-time feedback in a cost-effective way, can have small effects on targeted teacher practices. For example, researchers at Stanford and the University of Maryland developed “M-Powering Teachers,” an automated coaching tool that uses natural language processing to give educators feedback. Across two randomized controlled trials, the tool was shown to reduce teacher-directed talk, increase student contributions and improve completion of assignments. Another study found that feedback provided via TeachFX, an app that uses voice AI to assess key indicators of classroom quality, increased teachers’ use of focusing questions that probe students’ thinking by 20%.

Another randomized controlled trial found that a genAI-enabled coaching tool providing targeted feedback increased the quality of math tasks assigned to students and created a more coherent learning environment. Perhaps more impressive, the feedback resulted in a small positive improvement in students’ knowledge of ratios and proportional relationships, the area it focused on.

These studies show early promise, but the impacts they found have been small. As AI-enabled coaching products for teachers start to expand to more classrooms, more evaluation is needed to better understand the potential of genAI to truly improve teaching and, ultimately, student learning.

What we need to learn

Despite early evidence that AI has potential to make teachers’ jobs a bit easier and professional development more effective, the jury is still out on whether having students interact directly with genAI can improve academic and social-emotional outcomes. These technologies, especially in education, are changing rapidly, making rigorous studies challenging. This point was recently made by the Alliance for Learning Innovation in calling on Congress to budget almost $3 billion to address the issue.

While some tools — like Khan Academy’s Khanmigo (which has received funding from Overdeck Family Foundation) — are based on evidence that personalized learning can support better outcomes for some students, and some emerging research indicates that hybrid AI-human tutoring may boost achievement, it is not yet clear whether genAI tools themselves can strengthen and supplement student learning. As these types of products move into classrooms, there is a clear need for families, educators and policymakers to demand proof that they improve outcomes and do not unintentionally harm students most in need of effective support by providing incorrect guidance and feedback.

This is an exciting moment for education, with transformative technology finding its way into all our lives in a way that hasn’t been seen since the introduction of smartphones. Yet much research on genAI does not consider the types of ed tech products schools are actually buying. Instead, it comes from lab-based studies and tools that are not actually used or tested in the classroom.

Now is the time — before these technologies become pervasive — to rigorously evaluate what is being sold into and used in schools. The goal of educators should always be to ensure that students have the most effective tools for learning, not merely those with the best sales pitch.

Header image courtesy of Khan Academy