Teachers Are Using Software To See If Students Used AI. What Happens When It's Wrong?

Ailsa Ostovitz, left, and her mother, Stephanie Rizk, at their home in the Maryland suburbs of Washington, D.C. In mid-November, Rizk met with Ostovitz's teachers to discuss accusations that her daughter had used AI to do some of her schoolwork. (Beck Harlan/NPR)

Ailsa Ostovitz has been accused of using AI on three assignments in two different classes this school year.

“It’s mentally exhausting because it’s like I know this is my work,” says Ostovitz, 17. “I know that this is my brain putting words and concepts onto paper for other people to comprehend.”

Ostovitz, a junior at Eleanor Roosevelt High School in the Maryland suburbs of Washington, D.C., shared with NPR one of the accusations she received from a teacher. The message, from September, included a screenshot from an AI detection program showing a 30.76% probability Ostovitz had used AI on a writing assignment that included a description of the music she listens to.

“I write about music. I love music. Why would I use AI to write something that I like talking about?” Ostovitz says.

Ostovitz reached out to her teacher about the assignment via the school’s online learning platform. “I said, seriously, I didn’t use AI. Can you try a different detector?”

The teacher didn’t respond, and docked Ostovitz’s grade.

Ostovitz’s mom, Stephanie Rizk, says her daughter is a high-achieving student who cares about doing well in school. Rizk says she was alarmed when the teacher jumped to conclusions about Ostovitz’s work so early in the school year.

“Get to know their level of skill, and then maybe your AI detector is useful,” Rizk says.

Rizk told NPR she met with the teacher in mid-November and the teacher said they never saw her daughter’s message.

Ostovitz says she now runs all her homework assignments through multiple AI detection tools before she turns them in. (Beck Harlan/NPR)

The school district, Prince George’s County Public Schools, made clear in a statement that Ostovitz’s teacher used an AI detection tool on their own and that the district doesn’t pay for this software.

“During staff training, we advise educators not to rely on such tools, as multiple sources have documented their potential inaccuracies and inconsistencies,” the statement said.

PGCPS declined to make Ostovitz’s teacher available for an interview. Rizk told NPR that after their meeting, the teacher no longer believed Ostovitz used AI.

But what happened to Ostovitz isn’t surprising.

More than 40% of surveyed 6th- to 12th-grade teachers used AI detection tools during the last school year, according to a nationally representative poll by the Center for Democracy and Technology, a nonprofit that advocates for civil rights and civil liberties in the digital age.

That’s despite numerous research studies showing that AI detection tools are far from reliable.

“It’s now fairly well established in the academic integrity field that these tools are not fit for purpose,” says Mike Perkins, a leading researcher on academic integrity and AI at British University Vietnam.

Perkins found that some of the most popular AI detectors — including Turnitin, GPTZero and Copyleaks — flagged some things as AI that weren’t, and vice versa. Their accuracy rates dropped even further when AI text was manipulated to appear more human.

“We saw some really concerning problems with some of the most prolific AI text detection tools,” he says.

Despite those problems, NPR found that school districts from Utah to Ohio to Alabama are spending thousands of dollars on these tools.

Why one of the nation’s largest districts uses AI detection software

Near Miami, Broward County Public Schools is spending more than $550,000 on a three-year contract with Turnitin. The long-standing ed-tech company has historically provided schools with plagiarism detection software; in 2023, it introduced an AI detection feature. When educators put student work through this tool, it generates a percentage, which reflects the amount of text the software determines was likely generated by AI. One caveat: According to the company, scores of 20% or lower are less reliable.

“The Turnitin tool is something that helps us facilitate conversation and feedback, not grading,” says Sherri Wilson, director of innovative learning for the Broward school district, which enrolls more than 230,000 students and is one of the largest school districts in the country.

Wilson says the district is “totally aware” of the research showing AI detection tools, including Turnitin, aren’t 100% accurate or reliable.

Turnitin also acknowledges this: On the company’s website, it says, “our AI writing detection may not always be accurate … so it should not be used as the sole basis for adverse actions against a student.”

Turnitin wrote in a statement to NPR that it’s more important to avoid falsely accusing students of cheating than to catch all AI writing.

Wilson says the Turnitin tool is still valuable because it saves teachers time by quickly scanning student work for suspected AI use.

Another reason that Broward teachers have access to the tool, Wilson says, is that the district participates in academic programs, such as International Baccalaureate, or IB, in which student work must be authenticated by teachers before it is sent out for external review.

Both of the programs Broward offers, IB and International Education at Cambridge, told NPR that schools are not required to use AI detection software as part of the authentication process. Nonetheless, Broward told NPR in a statement, “we have chosen to provide our teachers with [Turnitin] as one of the tools to meet the requirements.”

But Wilson says teachers are the ultimate authority on whether a student’s work is their own — not the AI detection tool.

“They’re using these tools as feedback to then have those teachable moments with students,” she says.

Why one teacher uses AI detection tools

Language and literature teacher John Grady says, for him, AI detection tools provide “a jumping off point” to start a conversation with a student who may have used AI.

Shaker Heights High School teacher John Grady says he puts all student essays through GPTZero – but it isn’t the only tool he relies on to determine if a student’s work is their own. (Dustin Franz for NPR)

“It’s certainly not foolproof,” he says. “But it gives you something to hang your hat on.”

Grady teaches at Shaker Heights High School, part of the Shaker Heights City School District outside Cleveland. The district serves roughly 4,400 students, and is paying GPTZero, another AI detection software company, about $5,600 this year for annual licenses for 27 of the district’s teachers. The tool calculates a percentage likelihood that a student’s work is AI-generated.

Grady says he puts all student essays through GPTZero; if the tool shows more than a 50% likelihood AI was used for the assignment, Grady digs deeper. That includes using revision history tools to see how much time a student spent on an assignment, and how many edits they made during the writing process. If it appears that a student made only a few edits and spent hardly any time writing, he’ll check in with that student.

“And I’ll say, ‘Hey, this flagged. Can you talk to me about why?’ I’d say the bulk of the time, like 75%, if it was AI, they’d be like, ‘Yeah, I did.’ And I’m like, ‘OK, well now you’ve got to rewrite it with less credit,’” Grady says.

Edward Tian, co-founder and CEO of GPTZero, says this is how educators should be using his company’s tool.

“We definitely don’t believe this is a punishment tool,” Tian says. “This needs to be a tool in the toolkit and not the final smoking gun.”

He says it’s important to understand that a GPTZero probability score under 50% means the text was more likely written by a human than generated by AI. He says scores over 50% warrant closer examination, like what Grady describes.

Tian doesn’t dispute the research that shows GPTZero isn’t always reliable. But he notes that there are educators, like Grady, who still find it valuable for the information it provides.

He says that tools like his offer a “signal on what’s happening in your classroom” but that teachers should always follow up with students if that signal shows something concerning.

The AI detection skeptics

Shaker Heights junior Zi Shi, whose first language is Mandarin, says his writing style can sometimes look like AI “because of the repetition of words I use. I feel like it’s because of how limited my vocabulary is.”

Shi — who isn’t a student of Grady’s — says he’s still working on his writing skills and he’s concerned that AI detection software might be biased against non-native English speakers like himself.

Some educators share this concern, though the research so far is limited and contradictory.

Shi says an assignment he completed for his English class earlier this fall was flagged by GPTZero as possibly AI-generated. He says his teacher suggested that his use of an online tool called Grammarly may have triggered the detection software. Grammarly uses AI to correct grammar and, if prompted, generate text. (The teacher confirmed Shi’s account with NPR.)

Shi says he only used Grammarly to clean up his writing and that he wrote the assignment himself. “It was definitely disappointing to see the comment of it being flagged as AI,” Shi says.

Shi thinks AI detectors should be thought of as a “smoke alarm, where it’s a sign, or warning. But, you know, sometimes it could be like a false alarm.”

He questions whether the school district should be spending thousands of dollars on AI detection software. He says that money could be better spent on professional development for teachers.

Carrie Cofer, a high school English teacher in the Cleveland Metropolitan School District — just a few miles from Shaker Heights — shares that view.

Last year, as an experiment, she uploaded a chapter of her Ph.D. dissertation into GPTZero. “And it came up with like 89% or 91% AI-written, and I’m like, ‘Oh, no, I don’t think that’s right, because it was all mine,’” Cofer says.

In Cleveland, English teacher Carrie Cofer says educators will need to adapt to AI by changing how they teach and assess student learning. (Dustin Franz for NPR)

Cofer is helping her district shape its AI policy and guidelines; she says Cleveland schools don’t currently pay for AI detection software and she’d advocate against it.

“I don’t think it’s an efficacious use of their money,” Cofer says. “The kids are going to get around it one way or the other.”

Some workarounds that students could turn to include using AI detection software themselves, to workshop assignments so they don’t get flagged, and using “AI humanizer” programs, which claim to make AI-generated writing appear more human.

Ultimately, she says, teachers will need to adapt to AI by changing how they teach and assess student learning.

Back in Maryland, high school junior Ailsa Ostovitz is also adapting. She now runs all her homework assignments through multiple AI detection tools before she turns them in.

The writing is her own, she says, but she’ll rewrite sentences the software identifies as possibly AI-generated, an extra step that adds about half an hour to every assignment.

“I think I’ve definitely become more vigilant about presenting my work as mine and not AI,” she explains.

She doesn’t want to take any chances.

This reporting was supported by a grant from the Tarbell Center for AI Journalism.

Edited by: Nicole Cohen
Visual design and development by: LA Johnson
Audio story produced by: Lauren Migaki

Transcript:

MARY LOUISE KELLY, HOST:

How can you tell if a student has used artificial intelligence to do their schoolwork? Teachers say it’s a huge challenge. Many are turning to AI detection software for help. Just one problem – this software doesn’t always work. So what does that mean for students? Reporter Lee Gaines has the story.

LEE GAINES: High school junior Ailsa Ostovitz has been accused of using AI to complete her homework assignments three times so far this school year in two different classes.

AILSA OSTOVITZ: It’s mentally exhausting because it’s, like, I know this is my work. I know that this is, like, my brain putting words and concepts onto paper for other people to comprehend.

GAINES: The 17-year-old attends Eleanor Roosevelt High School in Greenbelt, Maryland. She shared messages she received from a teacher. In one case from September, the teacher sent her a screenshot from an AI detection program. It showed about a 30% probability she had used AI to complete a writing assignment. That assignment included a description of the music she listens to.

OSTOVITZ: I write about music. I love music. Why would I use AI to write something that I like talking about?

GAINES: Ostovitz lost points on the assignment. She sent a message to her teacher.

OSTOVITZ: I said, seriously, I didn’t use AI. Can you try a different detector?

GAINES: But she never heard back. Ostovitz’s mom, Stephanie Rizk, told NPR she met with the teacher in mid-November. The teacher said they never saw Ostovitz’s message. Her district, Prince George’s County Public Schools, made clear in a statement that Ostovitz’s teacher used an AI detection tool on their own, and the district doesn’t pay for this software. It said, quote, “during staff training, we advise educators not to rely on such tools, as multiple sources have documented their potential inaccuracies and inconsistencies.” The district declined to make the teacher available for an interview.

OSTOVITZ: I think I’ve definitely become more vigilant with presenting my work as mine and not AI.

GAINES: Ostovitz says the experience changed the way she does her homework. She now runs all her assignments through AI detection software.

OSTOVITZ: That part is really frustrating, where I am putting it through AI checkers and then rewriting my own work.

GAINES: Rizk told NPR that after their meeting, the teacher said they no longer believe Ostovitz used AI. What happened to Ostovitz isn’t totally surprising. Numerous research studies have found that AI detection tools are far from perfect.

MIKE PERKINS: We saw some really concerning problems with some of the most prolific AI text detection tools.

GAINES: Mike Perkins is a leading researcher on academic integrity at British University Vietnam. He found that some of the most popular AI detectors flagged some things as AI that weren’t, and vice versa.

PERKINS: And it’s now fairly well established in the academic integrity field that these tools are not fit for purpose.

GAINES: Not fit for purpose – and yet, more than 40% of surveyed middle and high school teachers used AI detection tools during the last school year. That’s according to a nationally representative poll by the Center for Democracy and Technology, a nonprofit that advocates for digital rights.

SHERRI WILSON: The Turnitin tool is something that helps us facilitate conversation and feedback, not grading.

GAINES: Sherri Wilson is director of innovative learning for Broward County Public Schools in Florida, one of the largest districts in the country. It has a contract with a company called Turnitin for plagiarism and AI detection – a contract worth more than half a million dollars. Wilson says she knows AI detection tools like Turnitin aren’t always accurate.

WILSON: That is why the human agency can never be removed in this process.

GAINES: In a statement, Turnitin says their AI detection tool is just one data point in assessing whether a student’s work is their own. It also says it’s more important to avoid falsely accusing students than to catch all AI writing. Wilson says teachers aren’t automatically punishing students if their work is flagged as AI-generated.

WILSON: They’re using these tools as feedback to then have those teachable moments with students to recalibrate and resubmit.

GAINES: That’s how John Grady uses AI detection software. He teaches language and literature courses at Shaker Heights High School outside Cleveland.

JOHN GRADY: So usually I just call a student over, and I’ll show them the report. And I’ll say, hey, this flagged. Can you talk to me about why? I’d say the bulk of the time – like 75% – if it was AI they’d be like, oh, yeah, I did. And I’m like, OK. Well, now you got to rewrite it with less credit.

GAINES: Grady’s public school district is spending about $5,600 on annual subscriptions to GPTZero, another AI detection software. He knows it isn’t 100% reliable. He also uses revision history tools that allow him to see the progression of a student’s writing over time. One thing he likes about GPTZero is if he’s suspicious about a student’s assignment…

GRADY: It’s something to kind of hang your hat on, where I can say, like, look, it’s been flagged.

EDWARD TIAN: We definitely don’t believe this is a punishment tool.

GAINES: That’s GPTZero CEO, Edward Tian. He doesn’t dispute the research that says GPTZero and other tools aren’t always accurate, but he says they can still help teachers. For example, if his software finds a more than 50% probability that an essay was written by AI, Tian says that should trigger further investigation by the teacher. It should never be the sole measure of whether a student’s work is their own.

TIAN: But if this is a conversation starter, actually we found a lot of teachers get a lot of value there.

GAINES: Just a few miles away from Shaker Heights, teacher Carrie Cofer thinks AI detection tools are a waste of school resources.

CARRIE COFER: I don’t think the AI detection software is reliable.

GAINES: She teaches high school English in the Cleveland school district. Cofer says students who use AI have found ways to fool detectors.

COFER: Like, they go in and change a couple of words in or change something around, and it’s not going to detect that it’s AI-generated.

GAINES: Her district doesn’t currently pay for an AI detection tool, and Cofer says she’d advocate against it. Instead, Cofer says teachers are the best AI detectors.

COFER: You can’t replace a teacher’s experience and instinct when it comes to any kind of classroom work.

GAINES: That’s one thing all the educators NPR spoke with did agree on.

For NPR News, I’m Lee Gaines.

KELLY: And that reporting was supported by a grant from the Tarbell Center for AI Journalism.
