AI, LLMs, BS, and vibe coding
I highly recommend not using ChatGPT or similar large language models (LLMs) in this class.
I am not opposed to LLMs in many situations. I use GitHub Copilot for computer programming-related tasks all the time, and I have ongoing research where we’re experimenting with using open LLMs (like Meta’s Llama, run locally through Ollama) to automatically categorize thousands of nonprofit mission statements. Using LLMs well requires skill, attention, and practice, and they tend to be useful only in specific, limited cases.
Writing and thinking
I am deeply opposed to LLMs for writing.
Google Docs and Microsoft Word now have built-in text-generation tools where you can start writing a sentence and let the machine take over the rest. ChatGPT and other services let you generate multi-paragraph essays with plausible-looking text. Please do not use these.
There’s a reason most university classes require some sort of writing, like reading reflections, essay questions, and research papers. The process of writing is actually the process of thinking:
Writing is hard because the process of getting something onto the page helps us figure out what we think—about a topic, a problem or an idea. If we turn to AI to do the writing, we’re not going to be doing the thinking either. That may not matter if you’re writing an email to set up a meeting, but it will matter if you’re writing a business plan, a policy statement or a court case. (Rosenzweig 2023)
Using LLMs and AI to generate a reflection on the week’s readings will not help you think through the materials. You can create text and meet the suggested word count and finish the assignment, but the text will be meaningless. There’s an official philosophical term for this kind of writing: bullshit (Hicks, Humphries, and Slater 2024; Frankfurt 2005).1
1 I’m a super straight-laced Mormon and, like, never ever swear or curse, but in this case, the word has a formal philosophical meaning (Frankfurt 2005), so it doesn’t count :)
Bullshit
Philosophical bullshit is “speech or text produced without concern for its truth” (Hicks, Humphries, and Slater 2024, 2). Bullshit isn’t truth, but it’s also not lies (i.e. the opposite of truth). It’s text that exists to make the author sound like they know what they’re talking about. A bullshitter doesn’t care if the text is true or not—truth isn’t even part of the equation:
[A bullshitter] does not reject the authority of the truth, as the liar does, and oppose himself to it. He pays no attention to it at all (Frankfurt 2005, 61).
LLMs and AI systems like ChatGPT, Gemini, Claude, and so on are bullshit machines. That might sound hyperbolic, but at a technological level, that’s literally what they do. LLMs use fancy statistics to take an initial prompt or starting phrase and then generate the most plausible words or phrases that are likely to follow. They don’t care whether the text they create is true—they only care about whether it looks plausible.
This is why these systems will generate citations to articles that don’t exist, or generate facts that aren’t true, or fail at simple math questions. They aren’t meant to tell the truth. Their only goal is to create stuff that looks plausible:
The problem here isn’t that large language models hallucinate, lie, or misrepresent the world in some way. It’s that they are not designed to represent the world at all; instead, they are designed to convey convincing lines of text. (Hicks, Humphries, and Slater 2024, 3)
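To make the “most plausible continuation” idea concrete, here is a wildly simplified toy sketch in R. Real LLMs use enormous neural networks rather than a little lookup table, and the phrases and counts below are made up purely for illustration, but the core logic is the same: pick whatever usually comes next, with no step anywhere that checks whether the result is true.

# A made-up table of phrases and the words that most often follow them
# (illustrative only; real models learn these patterns from huge text corpora)
next_word_counts <- list(
  "the study" = c("found" = 120, "shows" = 95, "proves" = 10),
  "found that" = c("participants" = 80, "students" = 60, "unicorns" = 1)
)

# "Generate" the next word by picking the most common continuation.
# Nothing here checks whether the resulting claim is true.
most_plausible_next <- function(phrase) {
  counts <- next_word_counts[[phrase]]
  names(counts)[which.max(counts)]
}

most_plausible_next("the study")   # "found"
most_plausible_next("found that")  # "participants"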
Do not replace the important work of writing with AI bullshit slop. Remember that the point of writing is to help crystallize your thinking. Churning out words that make it look like you read and understood the content will not help you learn.
A key theme of the class is the search for truth. Generating useless content will not help with that.
In your session check-ins and assignments, I want to see good engagement with the readings. I want to see your thinking process. I want to see you make connections between the readings. I want to see your personal insights. I don’t want to see a bunch of words that look like a human wrote them. That’s not useful for future-you. That’s not useful for me. That’s a waste of time.
I will not spend time trying to guess if your assignments are AI-generated.2 If you do turn in AI-produced content, I won’t automatically give you a zero. I’ll grade your work based on its own merits. I’ve found that AI-produced content will typically earn a ✓− (50%) or lower on my check-based grading system without me even needing to look for clues that it might have come from an LLM. Remember that text generated by these platforms is philosophical bullshit. Since it has nothing to do with truth, it will not—by definition—earn good grades.
2 There are tools that purport to be able to identify the percentage of a given text that is AI, but they do not work and result in all sorts of false positives.
What about code?
I am less opposed to LLMs for coding. Kind of.
I often use LLMs like ChatGPT and GitHub Copilot in my own work (GitHub Copilot in particular is really really good for code—it’s better than ChatGPT, and free for students). These tools are phenomenal resources for coding when you already know what you are doing.
However, using ChatGPT and other LLMs while you are learning R is actually really detrimental, especially if you just copy/paste directly from what they spit out. They will give you code that is wrong or that contains extra stuff you don’t need, and if you don’t know enough of the language you’re working with, you won’t understand why or what’s going on.
When you ask an LLM for help with code, it will spit out a plausible-looking answer based on a vast trove of training data—millions of lines of code that people have posted to GitHub. The answer that it gives represents a sort of amalgamated average of everyone else’s code. Often it’ll be right and useful. Often it’ll be wrong and will send you down a path of broken code and you’ll spend hours stuck on a problem because the code you’re working with can’t work.
Vibe coding and learning
Remember that LLMs are (philosophical) bullshit machines. They don’t care if the answer they give you is wrong or right. They don’t care if the code runs or not. All they care about is whether the answer looks plausible.
That said, in part due to the massive amount of publicly available code that’s inside their training data, LLMs can work okay with code in limited circumstances. Think of them as fancy documentation—if a programming language or package has good documentation with lots of good examples online, the stuff that LLMs generate about it will generally have good coverage and won’t be completely made up.
They work well enough that a whole new approach to programming has emerged: vibe coding. When vibe coding, you essentially tell an LLM what you want it to make and then let it actually build the whole thing. Some code editors actually allow LLMs to create and edit files directly on your computer. Like, you can say
Make a website that has a login page where users can log in and then create and edit blog posts. Create a theme for it that is mostly blue, but with some nice contrasting colors, and use a warm, friendly sans serif font. Add some photos from Flickr too.
… and it will attempt to do that, creating all the necessary files and code. It feels neat and magical to code without coding.
If you know what you’re doing, and you mistrust everything the LLM spits out, and hand-check everything it says, vibe coding can save you some time. HOWEVER it is (1) potentially dangerous, and (2) bad for learning.
Danger!
Because LLMs are designed to make plausible-looking code, they can reference external packages that don’t exist. For instance, you might want to make a lollipop chart, and it’ll say something like

Certainly! First, run remotes::install_github("someone/gglollipop") and then run library(gglollipop) to load it.
BUT GUESS WHAT. There is no package named {gglollipop}. It’s not a thing. Malicious actors also vibe code and look for the names of packages that LLMs think should exist, then they go and create those packages themselves, with malware inside. If you’re using an editor that lets an LLM make changes directly to your computer, it can install and run that kind of code without you realizing it, and bad things can happen.
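One habit that helps: before installing anything an LLM suggests, check that the package actually exists somewhere reputable. Here’s a quick sketch using base R’s available.packages() to see whether a package name is on CRAN, using the made-up gglollipop as the example:

# Get the names of all packages currently on CRAN
cran_packages <- rownames(available.packages())

# Check whether the suggested package is real before installing it
"gglollipop" %in% cran_packages  # should be FALSE, assuming no one has published it
"ggplot2" %in% cran_packages     # TRUE, a real package for comparison

Packages that live only on GitHub won’t show up there, so for those, look at the actual repository and make sure it exists and looks legitimate before running install_github().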
Additionally, vibe-coded projects can run fine and still be full of security vulnerabilities, inefficiencies, and bugs. Remember that the generated code is essentially an average of all the other code online, bugs and security holes included.
It’s also incredibly easy to accidentally expose private information to LLMs, like feeding them passwords, usernames, private data, and so on, especially if you’re mindlessly generating and running code.
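If you do paste code back and forth with an LLM, keep credentials out of the code itself. One common pattern, sketched below with a hypothetical MY_API_KEY variable, is to store secrets in your .Renviron file and read them with Sys.getenv(), so the script you share never contains the actual value:

# In ~/.Renviron (not in your script, and never pasted into a chat):
# MY_API_KEY=super-secret-value

# In your R script, read the secret from the environment instead of hard-coding it
api_key <- Sys.getenv("MY_API_KEY")

# Stop early if the key isn't set, rather than silently sending an empty value
if (api_key == "") {
  stop("MY_API_KEY is not set. Add it to your .Renviron file and restart R.")
}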
Always distrust and be suspicious about all code that LLMs generate.
Bad for learning!
Some people claim that vibe coding helps them learn how to code. This is most likely not the case—pure vibe coding likely hurts learning. Think of learning to code like learning a language. You could try to learn French by watching hours and hours of French movies, and you’d certainly pick stuff up, but you wouldn’t learn nearly as well as you would by taking a dedicated French class.
You need the foundation first. If you take a few years of French classes and then watch hours and hours of French movies, you’ll learn a lot. If you jump right into a pure media-based approach to learning, it’ll be detrimental to learning the language.
Recent research explores this idea with LLMs:
Students who substitute some of their learning activities with LLMs (e.g., by generating solutions to exercises) increase the volume of topics they can learn about but decrease their understanding of each topic. Students who complement their learning activities with LLMs (e.g., by asking for explanations) do not increase topic volume but do increase their understanding. We also observe that LLMs widen the gap between students with low and high prior knowledge. (Lehmann, Cornelius, and Sting 2025)
Prior knowledge is essential if you want to do any code-related stuff with LLMs. Even then, LLMs are helpful when you ask them for explanations (e.g. “This code works, but I don’t know why! Can you explain what’s happening with geom_point() here?”), and far less helpful for learning when you just have them generate plausible-looking answers for you.
Using LLMs requires a good baseline knowledge of R to actually be useful. A good analogy for this is with recipes. ChatGPT is really confident at spitting out plausible-looking recipes. A few months ago, for fun, I asked it to give me a cookie recipe. I got back something with flour, eggs, sugar, and all other standard-looking ingredients, but it also said to include 3/4 cup of baking powder. That’s wild and obviously wrong, but I only knew that because I’ve made cookies before. I’ve seen other AI-generated recipes that call for a cup of horseradish in brownies or 21 pounds of cabbage for a pork dish. In May 2024, Google’s AI recommended adding glue to pizza sauce to stop the cheese from sliding off. Again, to people who have cooked before, these are all obviously wrong (and dangerous in the case of the glue!), but to a complete beginner, these look like plausible instructions.
I can typically tell when you submit code generated by an LLM; it has a certain vibe/style to it and often gives you extraneous code that you don’t actually need to use. Like this—this comes right from ChatGPT:
# Load the required libraries
library(dplyr)

# Read a file named cars.csv into the R session
data <- read.csv("data/cars.csv")

# Calculate the average value of cty by class
average_cty_by_class <- data %>%
  group_by(class) %>%
  summarize(average_cty = mean(cty, na.rm = TRUE), .groups = "drop")

# Show the results
print(average_cty_by_class)
Here’s how I can tell that that ↑ code comes from an LLM:

- Comments at every stage
- read.csv() instead of read_csv()
- The older %>% pipe instead of the newer |> pipe
- na.rm = TRUE inside mean()
- .groups = "drop" inside summarize()
- Using print() to show the created object
However, nothing in that ChatGPT code ↑ is actually wrong! It is 100% correct and is totally normal. If there are missing values in the data, you should use na.rm = TRUE in mean(), otherwise things mess up. The %>% pipe is fine (see this for more about that). Adding .groups = "drop" inside summarize() is valid too—it prevents extra messages about leftover groupings (see this for more about that). Adding comments about the different steps is good and normal and should happen too!
The issue with this code is that the vibes are slightly off. It uses features that you’d never have run into if you’re writing R code for the first time. It uses extra settings that you wouldn’t use in this particular circumstance (e.g. there’s technically no need for .groups = "drop" here because you’re only grouping by one column, not two, so summarize() drops the grouping automatically). Someone who already knows R will see that output and say “Yeah, that’s all fine, but I wouldn’t do it exactly that way”; brand new R users don’t know why the code looks like that and just hope it works.
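For comparison, here’s a sketch of how the same task might look written the way we do things in this class, assuming the same hypothetical data/cars.csv file with class and cty columns: read_csv() from the tidyverse, the newer |> pipe, and no settings you don’t actually need.

library(tidyverse)

# Read a file named cars.csv into the R session
cars <- read_csv("data/cars.csv")

# Calculate the average value of cty by class
cars |>
  group_by(class) |>
  summarize(average_cty = mean(cty, na.rm = TRUE))

This isn’t the only correct way to do it, but it matches the style you’ll see in the course materials.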
It’s like when people first discover the thesaurus. You’ve all seen overwrought, over-thesaurused sentences where the word choice is slightly off. The words can all be grammatically correct, but they still feel off in context.
In general, don’t ever just copy/paste from ChatGPT and hope it runs. If you do use these tools, use them to ask for explanations. Make sure you know what every line of code is doing. For example, if it spits out summarize(average_cty = mean(cty, na.rm = TRUE), .groups = "drop") and you don’t know what those na.rm = TRUE or .groups = "drop" things mean, look at the help page for mean() and see what na.rm actually does so you can decide if you really need to use it or not, or ask the LLM why it included it.
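For instance, here’s a quick way to see for yourself what na.rm does, using a tiny made-up vector:

city_mpg <- c(18, 21, NA, 24)

# With the default na.rm = FALSE, any missing value makes the whole mean NA
mean(city_mpg)
#> [1] NA

# With na.rm = TRUE, the missing value is dropped before averaging
mean(city_mpg, na.rm = TRUE)
#> [1] 21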
I will call out code that looks vibe-coded in your assignments. Please don’t do it.