Process
Status:
- Output: None
- Questions: None
- Claims: None
- Highlights: Done (see section below)
Highlights
Think of ChatGPT as a blurry JPEG of all the text on the Web. It retains much of the information on the Web, in the same way that a JPEG retains much of the information of a higher-resolution image, but, if you’re looking for an exact sequence of bits, you won’t find it; all you will ever get is an approximation.
But, because the approximation is presented in the form of grammatical text, which ChatGPT excels at creating, it’s usually acceptable. You’re still looking at a blurry JPEG, but the blurriness occurs in a way that doesn’t make the picture as a whole look less sharp.
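The distinction the analogy leans on is easy to see in code. Here is a minimal, purely illustrative sketch (my own, not from the article): a lossless round trip gives back the exact bits, while a toy lossy scheme keeps most of the information but never the exact sequence.

```python
import zlib

data = bytes(range(256))  # stand-in for "the original bits"

# Lossless compression: decompressing returns exactly the bytes we put in.
assert zlib.decompress(zlib.compress(data)) == data

# A toy "lossy" scheme: quantize each byte down to the nearest multiple of 16.
# Much of the information survives, but the exact bit sequence is gone for good.
approximation = bytes((b // 16) * 16 for b in data)
assert approximation != data                                      # not the original...
assert all(abs(a - b) < 16 for a, b in zip(approximation, data))  # ...only close to it
```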
This analogy makes even more sense when we remember that a common technique used by lossy compression algorithms is interpolation—that is, estimating what’s missing by looking at what’s on either side of the gap. When an image program is displaying a photo and has to reconstruct a pixel that was lost during the compression process, it looks at the nearby pixels and calculates the average.
✏️ ChatGPT takes two points in “lexical space” and fills in what would occupy the location between them (e.g. “tell me about world history as if you were a pirate”).
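As a concrete (and deliberately simplified) sketch of the interpolation idea, suppose the centre pixel of a 3×3 patch was lost; averaging the surviving neighbours produces a plausible fill-in. This is my own illustration, and real codecs reconstruct far more cleverly than this.

```python
# 3x3 patch of grayscale values; the centre pixel was lost during compression.
patch = [
    [52, 55, 61],
    [59, None, 65],
    [62, 64, 68],
]

# Reconstruct the missing pixel as the average of its surviving neighbours.
neighbours = [v for row in patch for v in row if v is not None]
patch[1][1] = sum(neighbours) / len(neighbours)  # 60.75 -- a reasonable guess
print(patch[1][1])
```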
Large-language models identify statistical regularities in text. Any analysis of the text of the Web will reveal that phrases like “supply is low” often appear in close proximity to phrases like “prices rise.”
✏️ It’s an advanced form of text completion. It picks up on phrases that repeatedly appear together across the internet, so that when you ask about one thing, it brings up the related material. Is that understanding, or just statistics?
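To make “statistical regularities” concrete, here is a deliberately crude sketch (my own, vastly simpler than any real language model): just counting which word follows which in a toy corpus already produces the “supply is low → prices rise” kind of completion.

```python
from collections import Counter, defaultdict

# Tiny stand-in corpus; the real model sees the whole Web.
corpus = (
    "when supply is low prices rise . "
    "when supply is low prices rise . "
    "when supply is high prices fall ."
).split()

# Count which word follows each word -- the crudest possible notion of
# "statistical regularities in text".
following = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1

def complete(word):
    """Predict the continuation seen most often in the corpus."""
    return following[word].most_common(1)[0][0]

print(complete("low"))     # -> 'prices'
print(complete("prices"))  # -> 'rise' (seen twice, vs. 'fall' once)
```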
GPT-3’s statistical analysis of examples of arithmetic enables it to produce a superficial approximation of the real thing, but no more than that.
✏️ Another example of statistical correlation without understanding. It hasn’t learned the rules of arithmetic; it can only reproduce the examples it has encountered, and it fails on anything outside them.
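A toy way to see the difference between recalling examples and knowing the procedure (my own illustration, not a claim about GPT-3’s internals): a lookup of memorised sums handles the familiar cases and has nothing to offer on unfamiliar ones, because it never learned the carrying rule.

```python
# Sums "seen" somewhere in the training text.
memorised = {(2, 3): 5, (7, 1): 8, (20, 30): 50}

def add_by_recall(a, b):
    """Answer from memorised examples only; there is no concept of carrying."""
    return memorised.get((a, b), "no idea")

print(add_by_recall(2, 3))          # 5 -- looks competent
print(add_by_recall(38457, 82950))  # "no idea" -- an unseen case exposes the gap
print(38457 + 82950)                # 121407 -- the real thing
```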
In human students, rote memorization isn’t an indicator of genuine learning, so ChatGPT’s inability to produce exact quotes from Web pages is precisely what makes us think that it has learned something. When we’re dealing with sequences of words, lossy compression looks smarter than lossless compression.
✏️ The “understanding” we see in ChatGPT’s essays is an illusion: its lossy compression rephrases material, and that rephrasing reminds us of how students restate things in their own words when they have genuinely learned them. Still, it’s an illusion.
if a model starts generating text so good that it can be used to train new models, then that should give us confidence in the quality of that text. (I suspect that such an outcome would require a major breakthrough in the techniques used to build these models.) If and when we start seeing models producing output that’s as good as their input, then the analogy of lossy compression will no longer be applicable.
✏️ This is when we know that we can better trust the output of these models.
Your first draft isn’t an unoriginal idea expressed clearly; it’s an original idea expressed poorly, and it is accompanied by your amorphous dissatisfaction, your awareness of the distance between what it says and what you want it to say. That’s what directs you during rewriting, and that’s one of the things lacking when you start with text generated by an A.I.
✏️ An argument against using AI drafts as a starting point. Not only do you deprive yourself of learning how to express ideas in the first place, you also start from a copy of other people’s ideas instead of whatever original thoughts you’re having.