Generative AI has the potential to provide real-time, personalized support for teachers and students, making differentiated learning finally seem possible.

However, the rapid changes in AI technology and the speed of adoption are quickly outpacing the field’s ability to carefully study the impacts of genAI on teaching and learning. Given this fast pace of innovation, Overdeck Family Foundation has granted the Research Partnership for Professional Learning (RPPL) $900,000 to manage three studies that aim to identify the features and traits of genAI tools that show potential to improve teacher quality and student learning, and to disseminate the results to the education community at large. Our hope is that this research can guide future product development decisions, as well as district procurement choices.

As research and impact officer Meghan McCormick detailed in a recent op-ed, existing evidence on AI coaching tools suggests there is promise in leveraging AI to strengthen professional learning. However, the studies to date have been conducted in academic rather than real-world settings, have shown fairly small impacts on a very targeted set of teacher practices and student classroom behaviors, and have left the implications for practice and scale unclear. In partnership with organizations already working to scale effective professional learning, this new body of work will invest in coherent, rigorous research to identify practices that make K-12 professional learning more accurate, impactful, and cost-effective.

Below is a summary of the three research studies:

1. Comparing the impact of reflective versus directive coaching

Dr. Jing Liu from the University of Maryland College of Education, Dr. Heather Hill from the Harvard Graduate School of Education, and Dr. Dora Demszky of the Graduate School of Education at Stanford University will receive $250,000 to conduct an RCT to compare the impact of reflective versus directive coaching, delivered via a genAI coaching tool, in improving instructional quality and student outcomes. Previously, these researchers developed M-Powering Teachers, the most studied AI-based tool for teacher coaching to date. The tool has demonstrated initial proof of concept evidence for improving very targeted teacher practices and teacher-classroom interactions (e.g., increasing teacher-directed talk, increasing student contributions, and improving students’ completion of assignments), but questions remain about how human coaches can most effectively use the automated feedback the tool generates.

In this study, the researchers will conduct a large, well-powered RCT involving 180 fourth through eighth grade teachers across two school districts to examine the impact of feedback that is reflective (i.e., feedback that asks teachers themselves to reflect on what went well and what needs to improve) versus feedback that is directive (i.e., feedback that directly tells teachers what to do to change practice) on both observed instructional practices and student achievement. Leveraging primary and secondary data, the study will fit a series of multilevel regression models (with a dummy variable representing study condition) to estimate the difference between the two conditions on seven teacher and student outcomes: 1) teachers’ perceived utility of the feedback; 2) students’ perception of cognitive engagement; 3) students’ sense of classroom belonging; 4) student reasoning; 5) teacher uptake; 6) teacher questioning quality; and 7) student achievement as measured by test scores. This study has also received funding from the Bill & Melinda Gates Foundation and the National Science Foundation.
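As a rough illustration of what a multilevel regression model with a condition dummy can look like, here is a minimal sketch for a student-level outcome. The two-level nesting (students within teachers), the covariate vector, and all symbol names are assumptions made for illustration; the researchers' actual specification is not described in this post.

```latex
% Illustrative two-level random-intercept model (assumed structure, not the study's stated specification).
% i indexes students, j indexes teachers.
\begin{align*}
Y_{ij} &= \beta_0 + \beta_1\,\mathrm{Reflective}_j + \mathbf{x}_{ij}^{\top}\boldsymbol{\gamma} + u_j + \varepsilon_{ij}, \\
u_j &\sim \mathcal{N}(0, \tau^2), \qquad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
\end{align*}
```

In this sketch, Y_ij is an outcome for student i of teacher j, Reflective_j equals 1 if teacher j was assigned reflective feedback and 0 if assigned directive feedback, x_ij is a hypothetical set of baseline covariates, and u_j is a teacher-level random intercept that accounts for students sharing a classroom. The coefficient β1 is then the estimated difference between the two conditions. For teacher-level outcomes such as perceived utility of the feedback, the student level would drop out and any grouping would sit at a higher level, such as school or district.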

2. Studying coaches’ ability to center student learning

Teaching Lab’s research team, led by Dr. Shaye Worthman, will receive $250,000 to develop a genAI tool to improve coaches’ ability to center student learning and use an RCT to test the impact of that tool on coach practices, teacher satisfaction with coaching, teaching efficacy, and quality of teaching. In the development phase, the research team will record and analyze coach-teacher conversations from 40 math classrooms (N = 100 teacher-coach conversations) to better understand how to most effectively provide coaching that is focused on student experiences and student learning (rather than teachers themselves). They will focus on identifying themes and patterns related to setting student-centered goals, the use of student data to guide instructional decision-making, and the strategies coaches employ to encourage teachers to adapt practices that directly impact student learning.

In the testing phase, the team will use the data collected in the development phase to train an AI-engineered feedback tool for coaches designed to enhance student-centered coaching. The tool will identify and reinforce the use of student data in coaching conversations, highlight instances where coaches effectively guide teachers towards setting and achieving student-centered goals, and provide suggestions for improvement when conversations deviate from that focus. The researchers will use an RCT design (N = 60 math teachers) to compare the impact of the AI tool to business-as-usual coaching. Analyses will leverage a series of multilevel regression models to quantify the impact of the AI coaching tool on coach self-efficacy and self-reported coaching practices, teacher satisfaction with coaching, teachers’ self-efficacy to implement student-centered practices, and teachers’ self-reported teaching practices.

3. Examining the use and outcomes of TeachFX coaching

Dr. Elizabeth Chu at the Center for Public Research and Leadership (CPRL) at Columbia University and Mathew Moura from Teaching Matters will receive $250,000 to examine the implementation of coaching routines utilizing an emerging AI technology, TeachFX, which provides automated feedback by analyzing teacher questioning patterns and student talk time. This study uses a mixed methods approach to explore coaches’ and teachers’ use of TeachFX in coaching sessions; how, when, and to what extent coaching with TeachFX improves teachers’ understanding of their own practice and better calibrates their reflections; and how, when, and to what extent coaching with TeachFX leads to more actionable feedback and reflective coaching.

The project has two phases, both conducted in New York City Public Schools. The goal of Phase One is to pilot the coaching approach and measurement tools and to train an expanded team of coaches and reviewers for Phase Two. Phase Two aims to answer the research questions and will include eight schools in total, with half engaging in Teaching Matters coaching with TeachFX and half without. During Phase Two, the research team will conduct observations, teacher surveys, and interviews with teachers, coaches, and administrators. They will integrate qualitative and quantitative data to gain a clearer understanding of how TeachFX is being used, its perceived utility, how it shifts teachers’ understanding of their own practice, and whether and how it changes practice beyond what Teaching Matters supports achieve on their own.

Overall, these three research studies are designed to begin uncovering ways that genAI can make teacher coaching more impactful and cost-effective. By funding these studies, Overdeck Family Foundation’s ultimate goal is to strengthen professional learning and coaching products at scale, improving the experience of millions of teachers and students for decades to come. If you would like to learn more, contact our research team.

To stay up-to-date on the latest news from our Foundation, subscribe to our newsletter, and explore the findings from our latest research grants in our Research Repository.


Header image courtesy of Teaching Lab