COMMENTARY by Ulises A. Mejias: The Core of Gen-AI is Incompatible with Academic Integrity

By encouraging GenAI, we are directly undermining the principles we have been trying to instill in our students.


Many years ago, I had a student whose written assignments were very hard to decipher. The language just sounded strange. It took me a while to realize that the student was lifting blocks of text found online (a definition from Wikipedia, for example) and replacing some of the words with synonyms, probably also generated online. This was obviously an attempt to circumvent plagiarism detectors. To that student: wherever you are, I hope you are getting credit for creating a precursor to ChatGPT.

That was back in the day when cheating was fairly rare, at least in my classroom. Since last year, however, plagiarism in my courses has increased three-fold, and ChatGPT is squarely to blame. But rather than frame this discussion around delinquent students, I want to call attention to our own responsibility and ask questions about the university’s role in upholding academic integrity.

The Plagiarism Machine

The apparent suicide in December 2024 of former OpenAI engineer Suchir Balaji, who was set to deliver key testimony regarding ChatGPT’s violation of copyright law, is a tragic reminder that there are important unresolved questions about how Generative AI (GenAI) acquired its “knowledge.” Unfortunately, schools are turning a blind eye to these questions.

My main concern is that, by encouraging the adoption of GenAI, we in the educational field are directly undermining the principles we have been trying to instill in our students. On the one hand, we tell them that plagiarism is bad. On the other hand, we give them a plagiarism machine, which, as an aside, may reduce their chances of getting a job, damage the environment, and widen inequality gaps in the process.

As we learn about how the technology works, we realize that GenAI is nothing but statistically derived plagiarism. Researchers at Apple have demonstrated that large language models (LLMs) perform “sophisticated pattern matching,” a fancy version of what my synonym-loving student was doing years ago.

GenAI can’t reason, it doesn’t know anything, and it can’t think intelligently about anything. It simply takes vast amounts of content created by us (text, music, images) and uses complex mathematical models to come up with a product that is, statistically speaking, the best match in response to our query. It’s a neat, albeit expensive, trick.

An important part of the trick is that the end product must not directly reference or look too much like the original material. However, the fact that the model is taking original work and intentionally manipulating it to camouflage its source constitutes nothing short of plagiarism. In academic lingo, this is called crafty, cunning, deceptive, or disguised plagiarism. We call out students who engage in this kind of behavior.

AI companies acknowledge that they could not operate without our original copyright-protected material. In several court cases, they have conveniently claimed that their actions are not plagiarism. But they are wrong!

When considering these cases, we may be tempted to examine the mechanics of plagiarism too narrowly. Because the result is not a word-for-word copy but a “derivative outcome” that constitutes “transformative fair use” (as OpenAI says), defendants claim it is not plagiarism.

But instead of focusing on the end product, we need to focus on the social relationship between the plagiarizer and the source content. Plagiarizers take someone else’s content and try to pass it off as their own. That is exactly what these companies are doing, regardless of whether they are using a single source or many, and in spite of their sophisticated paraphrasing.

Zac Zimmer writes, “Citation is the coin of the academic realm, so anything that degrades the credit-granting mechanisms of academic citation and reference would be antithetical to the pursuit of academic knowledge.” So why would we, as institutions of higher learning that try to instill in students a sense of academic integrity, embrace a plagiarism machine? Why would we endorse the narrative peddled by GenAI companies that they must be granted blanket permission to take our content to train their energy-sucking plagiarism machines?

We also need to address a future concern. At the moment, GenAI is free or relatively cheap to use. Following a model established by so-called ‘disruptors,’ AI companies are letting us use their tools for free at first. Why? Because they are using our interactions with their platforms to improve their technologies (in other words, we are providing free user testing), and in the process we are becoming increasingly dependent on their services.

At some point, this largesse will stop. AI companies will start charging a hefty subscription fee or, as we have seen in past cases, demand that users surrender more and more of their personal data to feed the machines in exchange for their free services. Does the university have a moral obligation to protect students against this future form of dependency?

The End of the Written Assignment

Last year, I was still somewhat confident I could detect assignments written by AI. This year, as GenAI models have grown more sophisticated, I’m not sure I can detect its use in my classes, even with the aid of technology. I am also tired and resentful of spending so much extra time checking for AI plagiarism while campuses embrace the technology to make themselves look cutting-edge.

Learning experts tell me there are ways to manage this problem. They say we can teach students to integrate GenAI responsibly into their coursework. (I tried, but the siren call of the plagiarism machine was too hard for some students to resist.) They say I should scaffold assignments and maybe place less importance on the written essay.

They say it’s perhaps time for the written essay to die, replaced by more active pedagogical tools, and maybe they are right. But I’m no longer sure alternative pedagogies, prevention, or detection matter. That is because two kinds of disastrous effects have already been unleashed.

First, writing as a practice is being redefined. Walter Ong has argued that writing represents a unique opportunity to exercise abstract thinking. If GenAI is corrupting students’ ability to write (and to read, since they can now ask GenAI to summarize their readings), I’m concerned that it is also undermining their ability to generate ideas by thinking theoretically and conceptually.

Second, I am concerned that GenAI’s net effect in education is to undermine trust between teachers and students. It’s like stepping into a strange dream where objects that appear to be real are not. My ability to assume that any student is the author of their work has been seriously compromised. This is probably not fair to the majority of my students, which is unfortunate.

We need to talk more about how GenAI is redefining trust relations between students and teachers.

Should We Ban GenAI?

We, as educators, are sometimes too quick to adopt the latest and ‘greatest’ technology, even when there is evidence that profits trump pedagogy. It’s easy to buy into the hype when GenAI companies generate lesson plans and educational activities to get educators to adopt their products (materials that have been criticized as little more than sales pitches). But I am not calling for an AI ban on campus. That would be unrealistic. So, what am I proposing instead?

For starters, I would like universities to commit to matching every dollar they spend on AI with a dollar spent asking critical questions about AI. This means promoting work, inside and outside the classroom, that explores some of the issues I’ve raised here and any critical perspective on AI. I’ll even settle for 50 cents on the dollar.

One thing that has become clear recently is that students are just as concerned about the impact of AI as we are. They are already seeing evidence that AI will be used to reduce employment opportunities in their fields and undermine their labor power. Many students in the creative fields see GenAI as a threat, and rightly so.

That is why, as a second step, I would like to see universities adopt transparency measures that would allow students to make informed decisions about the consumption of AI on campus.

If some of us professors are going to demand that students refrain from using GenAI in their assignments, students have an equal right to demand certain things from us. Through catalog or syllabus information, students should be able to identify “AI-free” courses: courses where the instructor refrains from using AI to grade work, plan lessons, or produce content. This is not a perfect solution (especially regarding required courses), but it’s a start.

Equally importantly, I believe students (and instructors) should have the right to demand that the university refrain from using data collected from their learning activities to train AI models, whether those models are the university’s own or a third party’s. This right to opt out should be enshrined in a Bill of Rights or campus code and should be accompanied by the appropriate verification mechanisms.

I realize this would fundamentally alter the relationship between universities and companies like Google, Blackboard, OpenAI, and Zoom, which have been providing ‘low-cost’ services to schools in exchange for the data we generate, which they use to train their AI models.

It is time to challenge the legality and ethics of that extractivism, and GenAI provides us with a good opportunity to do so.

_____

Cover graphic courtesy of “The Obscure Columnist’s Quill” on Facebook
