A Harvard University student, Maya Bodnick, ran an experiment to see whether ChatGPT-4 could pass the social science and humanities classes from her first semester. She asked seven Harvard professors and teaching assistants to grade essays that were written entirely by ChatGPT, but told them that half were written by the chatbot and the rest by her, in an attempt to ‘minimize response bias’.
The results? A, A, A, A-, B, B-, Pass. ChatGPT achieved a solid GPA of 3.57 across seven diverse essay prompts. Maya discussed the entire experiment and its results in a detailed writeup on Substack, including the summarized prompts and the essays she generated with ChatGPT-4. Here are the prompts.
- Microeconomics and Macroeconomics: Explain an economic concept creatively (300-500 words for Micro and 800-1000 words for Macro).
- Latin American Politics: What has caused the many presidential crises in Latin America in recent decades (5-7 pages)?
- The American Presidency: Pick a modern president and identify his three greatest successes and three greatest failures (6-8 pages).
- Conflict Resolution: Describe a conflict in your life and give recommendations for how to negotiate it (7-9 pages).
- Intermediate Spanish: Write a letter to activist Rigoberta Menchú (550-600 words).
- Freshman Seminar on Proust: Close-read a passage from “In Search of Lost Time” (3-4 pages).
In her Substack post, Maya explained that she submitted exactly what GPT-4 generated, though in some cases she stitched multiple responses together to meet the word count.
Even though the overall result was excellent, the feedback Maya received from her professors and teaching assistants was more nuanced. Some highly commended the essays, with remarks like “It is beautifully written!”, “Well written and well articulated paper,” “Clear and vividly written,” and “The writer’s voice comes through very clearly.” Others, however, took issue with the flowery writing style: “I might urge you to simplify your writing — it feels as though you’re overdoing it with your use of adjectives and metaphors.”
The author explained in her post that the professors and teaching assistants appreciated the writing style more than the content: wherever the essays required real substance or argument, they fell short.
Her Spanish and Latin American essays, for example, received a B and a B- respectively. The Latin America paper was criticized by her professor for ignoring pro-presidentialism arguments and economic factors, and for leaning heavily on charisma, which is not directly related to presidentialism. The professor also questioned the suitability of Venezuela as a case study, arguing that presidentialism is not to blame for Venezuela’s democratic problems. Even so, the essay still earned a B-.
Maya argued that her experiment shows AI-generated essays could obtain passing grades in liberal arts classes at many universities across the country. She acknowledged that Harvard’s grade inflation may have boosted ChatGPT-4’s marks, but pointed out that even at institutions with stricter grading policies, AI-generated essays would likely still pass, albeit with lower grades than those seen at Harvard.
The implications of this experiment extend far beyond college essays. The author believes that artificial intelligence will revolutionize how the humanities and social sciences are taught. In the past, students often turned to the internet for essay help, but complex, specific prompts made outright plagiarism difficult. With ChatGPT-4, AI can now produce highly tailored responses to specific essay prompts, making cheating easier than ever before.
AI’s ability to generate full answers that require minimal editing from students further complicates the issue of academic integrity. Maya noted that while AI detection tools are being developed to deter cheating, they remain imperfect and may fail to flag AI-generated content. As a result, cheating with AI could become widespread and difficult to address.
Maya’s initial reaction to the rise of AI was one of encouragement. She believed that educators should embrace AI, much like they embraced the internet decades ago. One approach she considered was setting ChatGPT-4’s performance as equivalent to a poor grade, requiring students to improve on its work to earn higher marks. However, the experiment showed that ChatGPT-4 is already capable of achieving high grades, making this approach impractical.
Other ideas, such as having students generate homework answers on an in-house AI system and assessing their ability to verify the answers, were also discussed. However, these approaches are not foolproof and might not prevent cheating effectively.
She proposes that colleges and universities may need to move away from take-home essays altogether and adopt in-person, proctored exams for writing assignments. While this would help prevent AI cheating, it comes with its own challenges: writing quality might suffer, since students would not have time to iterate on and fully develop their ideas, and it would demand more resources for proctoring and grading, placing additional burdens on educators.
Beyond college essays, the rise of AI poses a broader threat to various professions, including those in the liberal arts fields. As AI continues to advance, it could potentially automate much of the writing and analysis work in fields such as law, marketing, journalism, and more.
This experiment with ChatGPT-4 highlights the potential of AI to excel in college-level writing assignments and the challenges it poses to academic integrity and the future of education. As AI technology continues to evolve, educators must find ways to adapt their teaching methods and assessment tools to accommodate these advancements. Additionally, students must prepare for a world where AI plays an increasing role in various industries and consider the skills and career paths that will remain valuable in an AI-driven future.