A.I.-Generated Content – A Teacher’s Perspective [Pt. 1 – Testing]

Tests 1 – 3: Historical essays of varying length

For comparative purposes, I first chose a topic that I’ve already written about: the sinking of the battleship Yamato in April 1945. The following are the results I got from ChatGPT and my subsequent evaluation.

The Sinking of the Battleship Yamato – A ChatGPT Essay

Prompt

Here is the prompt I submitted to ChatGPT:

“Write a cited essay about the sinking of the battleship Yamato in World War II.”

Within about a minute, ChatGPT generated the following 489-word essay:

ChatGPT’s Generated Response

The sinking of the Japanese battleship Yamato in World War II on April 7, 1945, was a significant event in the Pacific Theater of the war. The Yamato, which was one of the largest and most heavily armed battleships in the world at the time, was sent on a suicide mission to the island of Okinawa with the intention of repelling the American invasion force. However, the ship was sunk by American aircraft before it could reach its destination, with the loss of over 2,500 crew members.

The Yamato was built in the 1930s as part of Japan’s efforts to modernize its naval fleet. At the time of its construction, the ship was the largest and most heavily armed battleship in the world, with a displacement of over 72,000 tons and nine 18.1-inch (460 mm) guns. The ship was considered to be a symbol of Japanese naval power and was intended to be used as a deterrent against American aggression in the Pacific.

In early April 1945, as the American invasion of Okinawa was imminent, the Japanese government decided to send the Yamato on a suicide mission to the island in an attempt to repel the invasion force. The plan was to have the ship beach itself on the island and use its guns to attack the American troops on shore. The Japanese hoped that the sacrifice of the Yamato would inspire the rest of their navy to fight more fiercely and that the Americans would be forced to divert resources away from the invasion to deal with the battleship.

However, the Yamato never reached Okinawa. On April 7, 1945, as the ship was steaming towards the island, it was spotted by American aircraft from the carrier USS Enterprise. The American planes launched a massive bombing and torpedo attack on the Yamato, sinking the ship in just over two hours. Of the 2,777 crew members aboard the ship, only 269 survived.

The sinking of the Yamato was a significant event in the Pacific Theater of the war. The loss of the ship dealt a severe blow to Japanese morale and showed that their naval power was not invincible. It also demonstrated the effectiveness of aircraft carrier-based air power against battleships, which had been considered the backbone of naval warfare up to that point.

In conclusion, The sinking of the Japanese battleship Yamato was a significant event in the Pacific Theater of the war. The Yamato, which was one of the largest and most heavily armed battleships in the world at the time, was sent on a suicide mission to the island of Okinawa with the intention of repelling the American invasion force. However, the ship was sunk by American aircraft before it could reach its destination, with the loss of over 2,500 crew members. This event had a great impact on the outcome of the war and changed the way naval warfare was perceived.

References:

“Yamato-class battleship.” Wikipedia, Wikimedia Foundation, en.wikipedia.org/wiki/Yamato-class_battleship.
“Yamato (battleship).” Encyclopædia Britannica, Encyclopædia Britannica, Inc., www.britannica.com/topic/Yamato-battleship.
“The Sinking of the Yamato.” HistoryNet, www.historynet.com/the-sinking-of-the-yamato.

Evaluation

ChatGPT generated a 489-word essay with three sources. For comparison, the article I wrote on Operation Ten-Go (the same topic), published on 29 November 2020, is 11,830 words long and uses 21 sources, of which only three are online. In total, it took me around 3 to 4 months to write and even longer to research. That said, here’s my evaluation of the ChatGPT essay.

Some of my immediate observations are:

There are only three references, all of which are online sources, and one of which is Wikipedia.
There are no in-text citations or endnotes and some of the facts are inaccurate or vague. (Just as ChatGPT disclaims.)
The writing is too run-of-the-mill.

Bear in mind that I didn’t ask for a more detailed or longer response in my prompt. You likely need to be very specific about what you want in order for ChatGPT to generate something better. All in all, the writing, while grammatically correct, is very formulaic. It reads like it was lifted directly from the internet or an encyclopedia entry. Simply put, the lack of detail and the simplistic presentation mean it wouldn’t pass as a serious scholarly work.

Additionally, the way it reads is too clean. I don’t know if there’s a term for it in computer science, but the writing is too consistent and too perfect. It’s almost as if it falls into the uncanny valley (if there is such a thing for writing), and something about it seems off. It’s very nearly lifelike, but it lacks the details, random inconsistencies, grammatical errors, and uniqueness that characterize natural human writing. It seems machine-made because humans don’t write so perfectly unless the writing has been heavily proofread and edited.

Detecting AI-generated Content

Doing some further research, I located an NPR article by Greg Rosalsky and Emma Peaslee that describes Princeton University student Edward Tian’s work on building software that can detect AI-generated content. His work resulted in the GPTZero app.¹ So I ran the above A.I.-generated essay through the GPTZero app and it came up with the following analysis:

Perplexity — ie. the randomness of the text is — a measurement of how well a language model like ChatGPT can predict a sample text. Simply put, it measures how much the computer model likes the text. Your text perplexity evaluated on gpt2 (345M parameters) is 6 which is comparatively low. Texts with lower perplexities are more likely to be generated by language models.

Perplexity itself is an incomplete indicator that misses many factors including text length. Longer texts are less random and generally have lower perplexities. Your average perplexity (across sentences), a better indicator that considers text length, is 30.4.

Human written language exhibits properties of Burstiness: non-common items appear in random clusters that will certainly appear over time. Recent research has extended this property to natural language processing. Some human-written sentences can have low perplexities, but there are bound to be spikes in perplexity as the human continues writing. Contrastingly, perplexity is uniformly distributed and constantly low for machine-generated texts.

Your sentence with the highest perplexity is:

“However, the Yamato never reached Okinawa.”

It has a perplexity of: 194

Your GPTZero score corresponds to the likelihood of the text being AI generated: 40.02814799086946

Your text is most likely to be AI-generated!
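For readers who, like me, hadn’t met the term before, here is a minimal sketch of the textbook definition of perplexity: the exponential of the average negative log-probability a model assigns to each token. I don’t know how GPTZero computes its figures internally, so this is only an illustration. The function name and the probability lists are hypothetical and invented for the example; they do not come from gpt2 or from my essay.

```python
import math


def perplexity(token_probs):
    """Return exp(average negative log-probability) for a list of token probabilities."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    avg_neg_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log)


# Hypothetical per-token probabilities, made up purely for illustration.
predictable = [0.5, 0.5, 0.5, 0.5]   # the model finds every token fairly likely
surprising = [0.5, 0.5, 0.01, 0.5]   # one token the model did not expect

print(perplexity(predictable))
print(perplexity(surprising))

# A model choosing among six equally likely tokens at every step
# has a perplexity of about 6.
print(perplexity([1 / 6] * 10))
```

Under this definition, a single unexpected word is enough to push the figure up, which is consistent with the report’s point that human writing tends to show spikes in perplexity.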

For further comparison, I ran a 4,000-word portion of my Operation Ten-Go article through GPTZero and it came up with the following results:

Perplexity: 26

The average perplexity across sentences is: 97.21428571428571.

The sentence with the highest perplexity is:

“This percentage works out since her listed fuel oil capacity was 6,201 tons.”

It has a perplexity of: 397

Your GPTZero score corresponds to the likelihood of the text being AI generated: 82.33596808328447.

Your text is likely human generated!

Just by looking at the numbers and comparing my article to the A.I.-generated one, my article scores more than four times higher in perplexity, has 3.2 times the average perplexity across sentences, and has more than double the overall GPTZero score. As far as I can tell, GPTZero largely measures perplexity, both in individual sentences and across the whole text. A.I.-generated writing has low perplexity, meaning that it’s too consistent and uniform, whereas human writing is more random in sentence fluency. This would seem to support my layman’s evaluation that A.I. writing is a bit too clean and perfect. People don’t write like that.
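The comparison can be checked with a few lines of Python. This is just the arithmetic, using the figures copied from the two GPTZero reports above; the dictionary keys and variable names are my own labels, not anything GPTZero provides.

```python
# Figures copied from the two GPTZero reports quoted above.
# The key names are hypothetical labels chosen for this example.
ai_essay = {
    "perplexity": 6,
    "avg_sentence_perplexity": 30.4,
    "gptzero_score": 40.02814799086946,
}
my_article = {
    "perplexity": 26,
    "avg_sentence_perplexity": 97.21428571428571,
    "gptzero_score": 82.33596808328447,
}


def ratios(human, machine):
    """Divide each human-written figure by the matching machine-written one."""
    return {key: human[key] / machine[key] for key in machine}


for name, value in ratios(my_article, ai_essay).items():
    print(f"{name}: {value:.2f}x")
```

The ratios work out to roughly 4.33, 3.20, and 2.06. Two samples are far too few to draw firm conclusions from, but the gap between them is consistent across all three measures.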

As Edward Tian’s GPTZero app demonstrates, if there’s a problem, people can find a solution to it. As in national defense, if there’s a threat, real or otherwise, then we can probably create a countermeasure to it. I can only imagine that plagiarism checkers will soon integrate features that also check for A.I.-generated content. In fact, a CNN article by Samantha Kelly mentions that the plagiarism checker Turnitin is currently being tested to see if it can be adapted to check for A.I. writing as well.²

In part 2, I’ll discuss some of my experiences with students plagiarizing/cheating, methods to address cheating, and extrapolate that to using A.I.-generated content. In part 3, I’ll conclude with a discussion on what I see regarding the benefits and drawbacks of using A.I.-generated content.

Tests 2 & 3 – Testing for length

Following that test, I asked ChatGPT to generate one 5,000-word and one 2,000-word essay on the voyage of Donald Crowhurst during the 1968 Sunday Times Golden Globe Race. It came up with 498- and 619-word essays, respectively, with no citations.

Here are my prompts and the responses:

“Write a 5,000-word cited essay on Donald Crowhurst.”

Donald Crowhurst was an amateur sailor who participated in the Sunday Times Golden Globe Race of 1968-69. The race was a non-stop, single-handed, round-the-world yacht race that began on June 14, 1968, organized by the Sunday Times, a British newspaper. Nine sailors from five different countries started the race, but only one, Sir Robin Knox-Johnston, completed the race. Donald Crowhurst, however, did not finish the race and his story has become a tragic and controversial one in the annals of sailing history.

Crowhurst, a 35-year-old British businessman, had no prior sailing experience when he decided to enter the race. He purchased a trimaran, the Teignmouth Electron, with the intention of using the publicity from the race to promote his struggling business. However, as the race progressed, it became clear that the Teignmouth Electron was not seaworthy and Crowhurst was in over his head. He began to experience severe technical problems with his boat and struggled to make progress.

In an effort to save face, Crowhurst began to falsify his position reports, claiming to have made much faster progress than he had in reality. He spent months drifting aimlessly in the South Atlantic, living on board his boat, and keeping a diary of his experiences. In his diary, he wrote of his mental and emotional struggles, including feelings of guilt, despair, and eventually madness. He also wrote of his plan to fake his own death and disappear, which he eventually carried out.

On July 1, 1969, a life raft was found adrift in the Atlantic Ocean, with no trace of Crowhurst. Though it was initially assumed that he had died at sea, it was later discovered that he had never actually left the Atlantic and had instead committed suicide by jumping off the Teignmouth Electron. His body was never found.

Crowhurst’s story has been the subject of several documentaries, books, and films, including the 2018 film “The Mercy” which starred Colin Firth as Donald Crowhurst. The tragedy of his story and the ethical questions it raises about the nature of competition and the pressure to succeed have ensured that it remains a part of sailing history and popular culture.

(Word count: 498)

“Write a 2,000-word essay on Donald Crowhurst.”

The story of Donald Crowhurst’s ill-fated voyage and eventual suicide captivated the public, and his diary and other writings were later published as a book, “The Strange Voyage of Donald Crowhurst”, which provided a detailed account of his experiences and thoughts during the race. The book gave a glimpse into the mind of a man who was struggling with the pressure of competition and the weight of his own lies. The book also showed how a man who was not prepared for the challenge he had undertaken, both physically and mentally, was pushed to the brink of insanity.

Crowhurst’s story has been the subject of several documentaries, books, and films, including the 2018 film “The Mercy” which starred Colin Firth as Donald Crowhurst. The tragedy of his story and the ethical questions it raises about the nature of competition and the pressure to succeed have ensured that it remains a part of sailing history and popular culture.

The story of Donald Crowhurst is a cautionary tale of the dangers of overreaching and the consequences of deceiving others and oneself. It serves as a reminder that the pursuit of success and glory can have dire consequences when not approached with caution and humility. The story also highlights the importance of being prepared both physically and mentally for the challenges one undertake.

In conclusion, Donald Crowhurst’s ill-fated voyage in the Sunday Times Golden Globe Race of 1968-69 is a tragic and controversial episode in the annals of sailing history. His story raises a number of ethical questions and serves as a cautionary tale about the dangers of overreaching and the pressure to succeed. It also highlights the importance of being prepared for the challenges one undertake and the potential consequences of deceiving others and oneself. The publication of his diary and other writings has helped to shed light on the mind of a man who was pushed to the brink of insanity.

(Word count: 619)

Evaluation

What struck me with these tests is that, despite my prompts specifying length, neither response was anywhere near the word count I asked for. I wonder if ChatGPT was counting characters instead of words? Furthermore, the first essay had no citations despite the prompt specifying them. In terms of content, much of the writing about Donald Crowhurst and the 1968 Golden Globe race is virtually identical in both essays.
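For anyone who wants to check a response against the length they asked for, a rough sketch like the following will do. It assumes that a word is anything separated by whitespace, which is close to how most word processors count; the function names are my own. The sample sentence is the opening of the first Crowhurst response, and the totals are the word counts reported above.

```python
def count_words(text):
    """Count whitespace-separated words."""
    return len(text.split())


def fraction_of_target(actual_words, requested_words):
    """Return how much of the requested length was actually delivered."""
    return actual_words / requested_words


opening = (
    "Donald Crowhurst was an amateur sailor who participated in the "
    "Sunday Times Golden Globe Race of 1968-69."
)
print(count_words(opening), "words,", len(opening), "characters")

# (requested length, length ChatGPT actually produced) from the tests above
for requested, actual in [(5000, 498), (2000, 619)]:
    share = fraction_of_target(actual, requested)
    print(f"asked for {requested}, got {actual}: {share:.0%} of the target")
```

Printing the character count alongside the word count makes it easy to see which of the two, if either, lands near the number in the prompt.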

Tests 4 & 5 – Asking AI about the ethics of its content

For the following tests, I thought it would be funny to see if ChatGPT could elucidate the ethical and moral implications of using A.I.-generated writing in different fields. Here are my prompts and the generated essays:

“Write a five-paragraph essay on the benefits and drawbacks of AI-generated essays.”

Artificial Intelligence (AI) generated essays have the potential to revolutionize the way we produce and consume written content. The benefits of AI-generated essays include the ability to quickly produce written content on a wide range of topics, which can save time and resources for organizations and individuals. For example, in the field of journalism, AI-generated essays can be used to quickly produce news articles and summaries, allowing news outlets to keep up with fast-paced developments and keep their readers informed. Similarly, AI-generated essays can be used to assist with tasks such as language translation and summarization, which can help bridge language barriers and make information more accessible to a wider audience.

On the other hand, there are also some drawbacks to AI-generated essays. One of the main drawbacks is the potential for errors and inconsistencies in the text. While AI algorithms are constantly improving, they are not yet able to fully replicate the complexity and nuance of human writing. Additionally, AI-generated essays may not always accurately reflect the nuances and complexities of a given topic, which can lead to inaccuracies and misconceptions.

Another drawback of AI-generated essays is the lack of creativity and human touch. While AI can replicate certain writing styles, it cannot produce truly original content in the same way that a human can. This can be a problem in fields such as creative writing and advertising, where a unique and creative voice is essential. Furthermore, AI generated essays may be seen as impersonal and lacking in emotional depth, which can make them less engaging for readers.

Moreover, AI-generated essays raise ethical concerns regarding the use of AI in writing. The increasing use of AI in writing raises questions about authorship, intellectual property, and liability. There is also the concern that AI-generated essays may be used to spread misinformation or propaganda, which can have serious consequences.

In conclusion, while the benefits of AI-generated essays are undeniable, there are also significant drawbacks to consider. While AI-generated essays can save time and resources, they may also produce inaccuracies and lack the emotional depth and creativity of human-written text. Therefore, it is essential to consider the potential drawbacks and to use AI-generated essays in an ethical and responsible manner.

“Is using AI-generated content for an academic essay considered plagiarism?”

Using AI-generated content for an academic essay would be considered plagiarism if the content is not properly cited or if the student represents the AI-generated content as their own original work.

Plagiarism is defined as the act of using someone else’s work or ideas without giving credit to the original source. When a student uses AI-generated content in an academic essay, they are using someone else’s work (the AI’s work) and thus, they must give credit to the source (the AI) in the form of a citation.

It is also important to note that many universities and academic institutions have strict policies against plagiarism, and using AI-generated content without proper citation could result in severe consequences, such as a failing grade for the assignment or even expulsion from the university.

It’s also worth noting that the use of AI generated content is not always accepted in academic settings, as the lack of originality and creativity, as well as the potential inaccuracies and inconsistencies, could be considered as a violation of academic integrity. It’s recommended to check with the institution and professor to see if the use of AI generated content is allowed and if it’s considered plagiarism.

Evaluation

To be sure, ChatGPT can write about the obvious benefits and drawbacks of A.I. writing, as well as the issue of using A.I. to plagiarize work in an academic setting. It wasn’t my intention to create a sort of A.I. logic bomb where the A.I. can’t reason its way out of a paradox. I just wanted to have some fun. Of course, merely asking the A.I. to write about the dangers of A.I. is NOT an indication of sentience. I seriously doubt ChatGPT is self-aware or creating something wholly original. Rather, as I understand it, it draws on the writing and opinions that were already out there in the text it was trained on and compiles them into an essay. Recall that its knowledge only extends to the year 2021. At least ChatGPT knows how to write a simple five-paragraph essay. (I wish the same could be said for some students.)

Final Thoughts

Given the potential for ChatGPT to be used in creating fraudulent work, some school districts, such as those in New York, Los Angeles, and Seattle, have banned the program but allowed teachers access to use it as a learning tool. Edward Tian’s program isn’t the only response to the public concern. ChatGPT’s developer, OpenAI, has recently introduced the “AI Text Classifier” to similarly detect the probability that a text was A.I.-generated. However, they note that it’s not foolproof and has a number of limitations.³

From what I’ve heard, there are other essay generators out there that can produce better scholarly writing, and I don’t think ChatGPT is necessarily well-configured for writing essays. My tests of ChatGPT weren’t comprehensive. After all, I was only testing it myself on topics that I was familiar with. A better test would be to have a class of students use ChatGPT to generate a piece of writing and then compare it with their own writing and research. Finally, I should see if I could distinguish between the students’ writing and the A.I.’s just by reading it.

My writing isn’t grammatically perfect or profound in any sense of the term. I’ll openly admit that when it comes to history, my work is extremely derivative and doesn’t really bring anything new to the discussion. That said, I can at least say that all of my writing is my own (with the exception of the A.I.-generated sections above, of course). It takes time to do the reading, the research, and the writing. My analyses and evaluations may shamelessly parrot the experts, but any historian worth their salt knows that they stand on the shoulders of others who have come before them. Yet the fact that I take the time to properly cite sources with endnotes and a bibliography shows that I can give credit where credit is due. Ultimately, the attention to detail I put into making my writing my own is what demonstrates its uniqueness.

Everyone’s writing is unique because the human brain isn’t a machine. It’s not perfect or 100% consistent. There are certain standards to writing and art that are objectively quantifiable, to be sure. Yet there are also imperfections and flaws that make it both ugly and beautiful: something uniquely human.

We may come to a day when a machine passes the Turing Test and we can’t tell the difference between a human and an A.I. Scientists, philosophers, and science fiction writers have long theorized about the blurring divide between humans and machines, and between simulation and reality. Perhaps the coming A.I. revolution is closer than we think, or maybe it’s still a long way off. Hopefully, we’ll manage to keep it in check and use it to benefit us before we switch on Skynet and destroy ourselves.

Notes

1. Greg Rosalsky and Emma Peaslee, “This 22-year-old is trying to save us from ChatGPT before it changes writing forever,” NPR.org, NPR, January 17, 2023, https://www.npr.org/sections/money/2023/01/17/1149206188/this-22-year-old-is-trying-to-save-us-from-chatgpt-before-it-changes-writing-for.

2. Samantha Kelly, “Teachers are adapting to concerns about a powerful new AI tool,” CNN.com, CNN, January 19, 2023, https://www.cnn.com/2023/01/19/tech/chatgpt-teachers-adjusting/index.html.

3. Matt O’Brien and Jocelyn Gecker, “Cheaters beware: ChatGPT maker releases AI detection tool,” apnews.com, AP, January 31, 2023, https://apnews.com/article/technology-education-colleges-and-universities-france-a0ab654549de387316404a7be019116b. Some of the limitations include struggling with non-English languages, texts shorter than 1,000 characters, and texts that have been altered to evade the classifier. When I ran the first essay through the classifier, it identified it as “likely” to be A.I.-generated.

Bibliography

Kelly, Samantha. “Teachers are adapting to concerns about a powerful new AI tool.” CNN.com. CNN, January 19, 2023. https://www.cnn.com/2023/01/19/tech/chatgpt-teachers-adjusting/index.html.

O’Brien, Matt, and Jocelyn Gecker. “Cheaters beware: ChatGPT maker releases AI detection tool.” apnews.com. AP, January 31, 2023. https://apnews.com/article/technology-education-colleges-and-universities-france-a0ab654549de387316404a7be019116b.

Rosalsky, Greg, and Emma Peaslee. “This 22-year-old is trying to save us from ChatGPT before it changes writing forever.” NPR.org. NPR, January 17, 2023. https://www.npr.org/sections/money/2023/01/17/1149206188/this-22-year-old-is-trying-to-save-us-from-chatgpt-before-it-changes-writing-for.

Fair Winds & Following Seas

Moving with the tides of history
