Does A.I. Really Encourage Cheating in Schools?

New technologies are raising suspicions about students’ work, but the controversy—like so many others swirling around American classrooms—misses the point of what we want our kids to learn.
Illustration by Till Lauer

For my columns during the back-to-school season, I thought it would be useful to go over the state of public education in America. This series will be similar to the one I wrote on parenting a few months back in that it will be wide-ranging in subject, so please bear with me.

This past spring, Turnitin, a company that makes anti-cheating tools to detect the use of A.I. in student papers, released its findings based on more than two hundred million samples reviewed by its software. Three per cent of papers had been more or less entirely written by A.I., and roughly ten per cent exhibited some traces of A.I. It’s never a great idea to rely on data that a for-profit company releases about its own product, but these numbers do not suggest some epidemic of cheating. Other research has shown that there hasn’t been a significant increase in student plagiarism since the unveiling and mass popularization of large language models such as ChatGPT. Students seem to cheat a lot, generally—up to seventy per cent of students reported at least one instance of cheating in the past month—but they cheated at the same rates before the advent of A.I.

What has increased is the number of teachers and adults who seem convinced that all the kids are cheating. A study by the Center for Democracy and Technology found that “a majority of teachers still report that generative AI has made them more distrustful of whether their students’ work is actually theirs.” Such suspicions have been paired with real questions about the efficacy of A.I.-detection tools, including one concerning finding that showed A.I. detectors were more likely to flag the writing of non-native English speakers. This uncertainty, along with the failure of many school districts to implement a clear and comprehensive A.I. policy, has led to another layer of debate among educators about how to handle instances of alleged cheating. A set of guidelines on the use of Turnitin, recently released by the Center for Teaching Excellence at the University of Kansas, warned teachers against making “quick judgments” based on the company’s software and recommended that educators instead “take a few more steps to gather information,” including comparing the paper in question with previous examples of the student’s work, offering second chances, and talking to the student. (Earlier this month, the Wall Street Journal reported that OpenAI, the company that developed ChatGPT, had built its own detection tool, which was much more accurate than its competitors’ software, but had held off releasing it, because admitting that students did indeed use ChatGPT to cheat might be bad for business.)

Educational data is notoriously unreliable. There’s a whole lot of it—kids take tests every day and have nearly every part of their educational journeys tracked from the age of five—but, if you dig into many education studies, you’ll find a whole lot of noise and almost no signal. When trying to parse what, for example, a small increase in statewide reading scores might mean about the efficacy of a given program, the best one can do is look at the data, try to eyeball some larger trend, and then present it somewhat halfheartedly. Here’s what I believe is happening in schools with ChatGPT: teachers are probably a little overly suspicious of students, in part because they have been given tools to catch cheaters. Those panoptic tools have likely scared some students straight, but cheaters are going to cheat. When I was in high school, graphing calculators were blamed for student cheating. Ten years later, the ubiquity of cell phones in classrooms stirred up visions of kids across the country texting one another test answers whenever a teacher’s back was turned. Wikipedia also had its moment as the destroyer of research and knowledge in schools; today, it’s clear that Wikipedia has been a net good for society and is probably more accurate and less biased than the Encyclopædia Britannicas it replaced.

The situation reminds me of the problem with sports-gambling apps. Gambling, like plagiarism, isn’t new. If you stick a hundred people who have never placed a bet in their lives in a casino, a small number of them will come back the next day, and the next, and the next. The rest will either never bet again or gamble only occasionally and in a responsible manner. Cheating in school strikes me as a similar phenomenon—maybe it’s true that most kids engage in a little bit of unethical schoolwork, but some portion of kids never will and many more likely do so only in the most trivial (or trying) situations. Technology does change the experience; it can encourage edge cases to start tossing dice at a craps table or asking ChatGPT to write a paper. But, for the most part, it’s not why adults gamble on sports or why kids cheat at school. And just as Wikipedia didn’t ruin the written word—and likely deepened the research of many student papers by simplifying the introductory task of getting to know a subject—the five-paragraph essay will survive large language models.

The rush to solve A.I. cheating and the myriad educational tools that have been developed and sold to schools across the country raise a different, and far more interesting, question than whether or not the written word will survive. When we think about students’ work, where do we draw the line between what has sprung out of their developing minds and what has not?

In STEM subjects, the lines are a little clearer. If a student just looks over a neighbor’s shoulder and writes down the same answer, most people agree that’s cheating. But if a student is trying to prove that he understands how to solve a complicated math problem that involves some multiplication, does the use of a calculator mean that the student is cheating? He is not being tested on whether he knows how to multiply or not, so why waste time and potentially introduce careless errors? I do not think that having ChatGPT write a paper is the same thing as using a calculator for more menial and elementary tasks within a larger math problem, but it’s worth asking why we feel differently about the automation of research and the written word. Even in the fine arts, patrons and appreciators have long accepted that the artist doesn’t need to actually perform each brushstroke, construct every sculpture, or build every bit of a large installation. Small armies of uncredited assistants have their hands all over the works of Andy Warhol, Damien Hirst, and Jeff Koons, which has kicked up periodic controversies, but not enough to end the practice. Would we think less of these artists if a machine just did all of the assistants’ work?

These questions are abstract and ridiculous, but they also reflect the arbitrary way in which we think about what constitutes cheating and what does not. Outside of blatant acts of plagiarism, the line between cheating and not cheating in the humanities seems to rely on the amount of time it takes to complete a task. For example, if a student visited a library archive to research what happened in the week after D Day, spooled some microfiche into an ancient machine, and dutifully jotted down notes, we would likely think more highly of that effort than if the student found the same article in a Google search, and certainly more so than if he paraphrased some Wikipedia editor’s reading of that article.

Under this logic, school isn’t about creating new scholarship or answering questions correctly—it’s about teaching proper work habits. A young person who takes the time to go into a library is more likely to develop the types of work habits that will allow him to find accompanying bits of information that might be useful in creating a novel, an algorithm, or a convincing argument. Setting aside the obvious offense of dishonesty, the problem with cheating isn’t so much that the student skips over the process of explaining what he learned—it’s that he deprives himself of the time-consuming labor of actually reading the book, typing out the sentences, and thinking through the prompt.

One of the fundamental crises that the Internet brought to classrooms was the sense that, because references to facts and history no longer needed to be stored in your brain, nothing really needed to be learned anymore. Search engines, Wikipedia, and ChatGPT all demanded the same explanation: If we have these tools, what’s the point of these lessons? Schools tend to change slowly, even if education trends come and go. This is a good thing and mostly owes to the fact that good teachers tend to have long careers. But, since the days when I was a teacher, in the mid-two-thousands, I’ve noticed a subtle shift in the way people think about what kids should learn in the humanities. The idea of memorization, for the most part, has gone away; children are no longer forced to rattle off the date of the First Defenestration of Prague (1419) or commit the same lists of vocabulary words to memory. At the same time, most of the political fights that people get into over schools these days hinge on curriculum choices, which have always struck me as both silly and wildly beside the point. It’s actually pretty hard to shake a child’s beliefs with a stray book or lesson. But I sometimes wonder if the doctrinaire push in today’s schools, the intense fights over how to teach history or math, the censorious book bans in some states, come from a collective fear that the knowledge-retention part of school might now be outdated. Since it’s hard to justify why kids should learn dates and vocabulary words and the like, we have subtly shifted the purpose of school to teaching them what to believe and how to go through life as a good person. This is an admirable goal but will usually end in bitter conflict over which values matter.

Opinions in education, as a rule, move very quickly and oftentimes in a reactionary way. But the actual implementation of any consensus can take decades to complete. This inefficiency can be harmful—it’s taken far too long to remove phones from schools, for example—but it also allows for little panics like the current one around large-language-model cheating. I do not think A.I. encourages cheating in some revolutionary way, and I imagine any rise in plagiarism might have more to do with the extraordinary pressure of college admissions and the overly competitive atmosphere in many high schools. Until that changes, some population of kids will convert any new app into a cheating tool, educational-technology companies will sell blockers, and the cycle will just repeat itself. It doesn’t have to be this way. The A.I.-cheating panic gives us a chance to reëmphasize the work-habit part of schooling and to walk away from claims that the books that children read are somehow dangerous or that only one version of history can be taught. This, it should be said, is not so different from the way that thousands of teachers across the country already think about their jobs, but the work part of school has become far more gauche than it used to be, with schools across the country eliminating homework and focussing more on developing a student’s love of a subject or the implied politics of a curriculum. A little revanchism, such as in-class essays written with pencil and paper in elementary and middle school, might go a long way. The lesson is almost always the actual doing of the lesson, not the facts that are learned. ♦