Artificial intelligence (so-called) is typified by its boom-and-bust cycles, and we’re in a boom now. But as more and more money pours in for diminishing returns, a shakeout is coming, and hype is rushing in to stoke the enthusiasm. In other words, the con is on.
Dr Emily M. Bender and Dr Alex Hanna are co-hosts of the podcast Mystery AI Hype Theater 3000, and the authors of The AI Con: How to Fight Big Tech’s Hype and Create the Future We Want. They join us for this episode.
Patreon supporters
Thanks to all our patrons! Become a Patreon supporter yourself and get access to bonus episodes and more!
Show notes
The AI Con: How to Fight Big Tech’s Hype and Create the Future We Want
By Emily M. Bender, Alex Hanna
https://www.harpercollins.com/products/the-ai-con-emily-m-benderalex-hanna?variant=43065101189154
Mystery AI Hype Theater 3000
https://www.dair-institute.org/maiht3k/
Mark Zuckerberg Gets Roasted for Saying the Average American Has ‘Fewer Than Three Friends’ While Pushing AI Chatbots
https://sfist.com/2025/05/01/mark-zuckerberg-gets-roasted-for-saying-the-average-american-has-fewer-than-three-friends-while-pushing-ai-chatbots/
AI | 404 Media
https://www.404media.co/tag/ai/
Mushroom pickers urged to avoid foraging books on Amazon that appear to be written by AI
https://www.theguardian.com/technology/2023/sep/01/mushroom-pickers-urged-to-avoid-foraging-books-on-amazon-that-appear-to-be-written-by-ai
Artists warn of the harm AI-generated illustrations can do to their careers
https://www.marketplace.org/episode/2023/06/15/artists-warn-of-the-harm-ai-generated-illustrations-can-do-to-their-careers
AI Slopaganda feat. Ryan Broderick (E319) | QAA Podcast
https://soundcloud.com/qanonanonymous/ai-slopaganda-feat-ryan-broderick-e319
AI-Generated Code Packages Can Lead to ‘Slopsquatting’ Threat
https://devops.com/ai-generated-code-packages-can-lead-to-slopsquatting-threat/
A Comprehensive Survey of Hallucination Mitigation Techniques in Large Language Models
https://arxiv.org/abs/2401.01313
Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers
https://arxiv.org/abs/2409.04109
Powerful new AI software maps virtually any protein interaction in minutes
https://www.science.org/content/article/powerful-new-ai-software-maps-virtually-any-protein-interaction-minutes
This is a DOGE intern who is currently pawing around in the US Treasury computers and database
https://www.reddit.com/r/singularity/comments/1ijbtqf/this_is_a_doge_intern_who_is_currently_pawing/
AI’s $600B Question | Sequoia
https://www.sequoiacap.com/article/ais-600b-question/
Transcript
[Transcript provided by SpeechDocs Podcast Transcription]
[Because Language theme]
Daniel: Hello and welcome to this special episode of Because Language, a show about linguistics, the science of language. I’m Daniel Midgley and with me is my trusted colleague and totally a human, Hedvig Skirgård.
[laughter]
Hedvig: Yes. I am not generated. I think an AI would be more consistent in its dialect in English than I am.
Daniel: Oh, interesting.
Hedvig: I feel like–
Emily: Eh, depends on how it’s prompted, I suppose.
Alex: Yeah. Have you heard some of these things? They sound so metallic and stilted.
[laughter]
Hedvig: Yeah, but they sound the same way is what I’m thinking. And I’m thinking I sound different on different days. The other week, we had an Australian visiting and everyone at work was like, “You sound Commonwealth.”
Alex: Oh, interesting.
Emily: Yes.
Alex: Yeah, fascinating.
Emily: There is one nonhuman. It’s Alex’s cat.
Hedvig: Yes.
Alex: Yeah. This is Clara. In the mornings, she likes to get on the podcasts. [laughs]
Daniel: There you go. Beautiful. Well, those voices that you are hearing, it’s an old friend of ours and a new friend of ours. They’re the co-hosts of the podcast, Mystery AI Hype Theater 3000. And they’re both authors of The AI Con: How to Fight Big Tech’s Hype and Create the Future We Want. It’s linguist, Dr. Emily Bender, of the University of Washington and sociologist, Dr. Alex Hanna, Director of Research at the Distributed AI Research Institute, or DAIR. Hey, thanks to you for joining us. This is great to have you both.
Emily: Super excited to be here.
Alex: Thanks for having us.
Daniel: Well, let’s get stuck in. You have a podcast. How did you get started on that?
Emily: That’s a funny story. So, we were both doing sort of hype-busting work in textual format, blog posts sometimes– Well, eventually op-eds, blog posts, and Twitter posts. And we came across something that was just like too long to do that way. And the first one is actually a video. I’m like, “How do we do this?” And one of our friends, Dr. Margaret Mitchell, Meg, said, “Well, you’ve just got to give it the Mystery Science Theater 3000 treatment.”
Hedvig: Yeah.
Emily: Fast forward a couple of months, come across this big, long blog post. So, I’m like, “Okay, this one needs the MST3K treatment. Who’s in?” And Alex said– [laughs]
Alex: Yeah. I’m like, “Let’s go for it.” And one of the things that we thought about doing was doing it on a Twitch stream. And so, DAIR had a Twitch channel that I think we had used for a few events. For instance, we had a Stochastic Parrots Day to celebrate the release of the Stochastic Parrots paper.
Emily: It was actually the second anniversary of it, Stochastic Parrots Day, yeah.
Alex: Yeah. Yeah. And that was really well attended, like 3,000 people came to that and it was wild. And so, then we ended up going like, “Well–” Oh, actually, the thing is that we did the stream before that because I didn’t really know how to run a stream.
[laughter]
Emily: But you had the confidence that you could, I think, from your derby experience.
Alex: Yes. Well, no, because we didn’t have the derby experience. That came later. So, I’m like, “I could figure this out. How do you stream?” So, we did it by the seat of our pants. And I set up a stream and, just to start, the software didn’t work. We had to switch to a different platform. It was completely a mess. If you listen to the very first one, you’ll hear Emily and I going, “Are we back? Are we back?” just because we were complete noobs at this podcasting thing. And then, we did a few episodes. We onboarded a friend of mine, Christy Taylor, who’s a roller derby person I know and who is a professional producer, has done radio producing for over a decade.
Hedvig: Nice.
Alex: And she actually helped us [laughter] professionalize our setup, so it’s not just two wayward academics trying to do something we’ve never done before. So, it sounds a lot better now and it’s a lot more professional.
Emily: Yeah, a lot more professional. Unless you’re like a real hardcore fan of our show, don’t start with the early episodes. But we kept the Twitch stream format, so everything starts off as a live stream and we have a wonderful live audience that comments as we go. And that’s actually really fun and, I think, relatively unusual in the podcasting space, that conversation that we’ve got going on with our audience as we’re streaming, sort of like you all do on your live episodes.
Daniel: Yeah, that’s right.
Hedvig: Yeah, it’s really nice to do that. And especially when you have like a fun fan group who like ask fun questions and have interesting comments, when you feel like it’s a good community you’ve built, that’s really fun.
Emily: Yeah.
Alex: Yeah.
Hedvig: Does that mean that you say “chat” a lot?
Emily: No. And I think that in my case, that might just be an age thing.
Alex: Is that like the linguistics thing, like chat–? Is this something?
Daniel: Is this for real?
Alex: I used one of those recently on Bluesky, but I think it was because I was looking for an open-source privacy-preserving nutrition application. I’m like, “Chat, does this exist?” But yeah, we don’t say chat on the chat.
Hedvig: Really? That’s interesting, because I kind of feel like you’re talking to a many-headed hydra that’s in your comment field. And you’re like, “Chat, what do you think about this?” And they’re like, “[onomatopoeia]”
Alex: Well. I’m going to start using that though because I don’t think I knew the correct deployment of the term. So, yes, thank you. Thank you, Hedvig. I’m going to roll with that.
Emily: There’s this debate going on about whether it’s a new incipient pronoun used to refer to the audience. And I don’t know, where do you all stand on that debate?
Daniel: The elusive seven and a half plural. Yeah. No, it’s not a pronoun. It doesn’t inflect like a pronoun.
Emily: Well, pronouns inflect.
Hedvig: What do you mean, Daniel?
Daniel: Pronouns inflect for object and subject. In English-
Hedvig: Like you, like it.
Daniel: -you’ve got I, which is subject case. Yes, okay, you’ve picked two examples, you and it, where the subject case is the same as the object case.
Hedvig: They are the most common pronouns.
Daniel: Will everyone just please shut up and listen to the linguistics professor?
[laughter]
Alex: There’s multiple of you.
Daniel: Oh, yeah. All right. That wasn’t a very good referring expression. You’ve got I, subject case; me, object case. It inflects. We’ve got we, subject case; us, object case. Chat not only–
Emily: Okay, but here’s the thing.
Daniel: Yeah. Okay.
Hedvig: Lots of languages, that’s not how it works.
Emily: Yeah. And on top of that, English only has vestigial case. So, if we are gaining a new pronoun, we would not expect it to have case because the case that we have is remnants of a very old system.
Hedvig: It’s like a y’all. It’s like a specific y’all.
Alex: As a sociologist, I’m just here enjoying this because I think this is such deep nerdery. And I’m going to go back and be like, “Okay, but let’s talk about these social groups that are using this. And what is this a reference to?” And I’m like, “And then, where’s the emergence of this term and what internet subcultures have spread this?” These are my questions.
Daniel: These are great questions.
Emily: Yeah. And then, you’re doing sociolinguistics, right? That is also–
Alex: Yeah. Yeah. There you go.
Daniel: There you go. I want to share something with you because I’ve been reading The AI Con that I saw– So, I’ve been in touch with your fantastic publicists at HarperCollins. You can see my screen now. Okay, now you’re seeing the thing.
Hedvig: Yeah, we see the thing, I have the thing as well.
Daniel: The AI Con, there it is, the title page, and it’s in Adobe Acrobat. And it has, let’s just count the sparkles. How many AI assistants does it have? One, two, three, four, five, six. And if I select any text, I get “Ask AI Assistant,” I get seven. We’ve got no fewer than seven sparkles.
Alex: Wow, terrible.
Emily: That’s appalling.
Daniel: What is this like for you?
Alex: It’s a nightmare. They’re trying to put it in damn near everything. This is nothing new with tech companies: when there’s a feature they’re really trying to push on you, they use some dark patterns to put it in certain places. So, one of the things that absolutely sent me was that– I’m not using that correctly. One thing that actually pissed me off was that at DAIR, we were using Google and we have our G Suite, and in the email, where they have the Archive button, they replaced the Archive button with the Gemini button. And I’m like, “Oh, that is so–” Luckily, we got it disabled. But apparently, if you’re listening to this, if you complain enough to G Suite, they’ll remove it. But by default, they replaced it. And so, you have this all over.
And I think you had a question as well, I think in the preview of your questions, which is like, “Well, when you click these things, what does it do?” And it doesn’t really do anything. Like–
Daniel: I didn’t click, so I don’t know.
Alex: Yeah, I’ve clicked them on accident and then sometimes, I’ll admit that it’s like the intrusive thoughts. I’m like, “Well, what the fuck does this do?” Can I curse on this? [laughs]
Daniel: You can say whatever you want.
Hedvig: Yeah. Yeah. Curse as much as you want.
Alex: We always got to ask like, “What the fuck does this do?” And click it. And it’s like, “Ask me anything,” and it [unintelligible 00:09:36] anthropomorphize. And it’s like, I don’t have anything to ask. It’s just you’re adding a layer. I want to read the PDF. So, it’s just infuriating.
Emily: I can tell you what I would expect it to do. I also don’t click those things. I do my very best to never experience synthetic text. But this “generate a summary” thing, what that almost certainly is doing is taking whatever large training set it already has as background, inputting a prompt that says “create a summary of this text,” and reading in the text that’s in the PDF. And then it will output something that is basically papier-mâché of a selection of the text that you’re summarizing, with little bits added from the big training data.
Hedvig: Yeah. So, you don’t really know.
Emily: You don’t really know. Automatic summarization has multiple issues, especially if it’s driven by large language models: you might get stuff that’s just added because it was common in the pre-training data. But also, you’re not going to know what was missed, and the whole point of a summary is giving you the most important parts. And papier-mâché of the text, shaped to look like a summary, is not actually a reliable way to get to the most important parts. So, we thank you for actually reading the book.
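[Note: as a concrete illustration of the mechanism Emily describes here, and not of Adobe’s actual implementation, which isn’t public, a “generate a summary” button plausibly boils down to something like the sketch below. The OpenAI Python client, the pypdf package, the model name, and the file name are all stand-ins chosen purely for illustration.]

# Sketch only: what a "summarize this PDF" feature plausibly does under the hood.
# Assumes the pypdf and openai packages are installed; model and file names are illustrative.
from pypdf import PdfReader
from openai import OpenAI

# Pull the raw text out of the PDF (long documents would need to be chunked).
text = "".join(page.extract_text() or "" for page in PdfReader("the-ai-con.pdf").pages)

client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name
    messages=[{"role": "user", "content": "Create a summary of this text:\n\n" + text}],
)
print(response.choices[0].message.content)  # likely next words, not a vetted summary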
Alex: Yeah. [laughs]
Hedvig: I heard someone say it’s a bit like corn syrup. It’s in everything suddenly, but no one really asked for it to be there. So, WhatsApp has added it too. And what I find a little bit frustrating, and I feel like I’m going to take the role in our conversation here of saying this, is that there are places where I actually find large language models useful. For example, I write R code and sometimes I’m debugging a complex thing from a colleague. And I’m like, I don’t actually know where it’s going wrong. And I give the script to a large language model and I say, “Hey, do you know what’s going wrong?” And then, it tells me something and I implement it and see if it works or not, because I can actually evaluate it myself to see if I now get something that’s reasonable, and I can look at the solution and understand it. And that’s useful.
What’s confusing to me are these like– Meta also added Ask Gemini or whatever to WhatsApp and I’m like, “This is my direct messaging with my family. Why are you in this room?”
Alex: Yeah.
Hedvig: I talk to you at work about specific things. Like, what’s–
Daniel: Why are you here?
Hedvig: What am I meant to do with you here?
Alex: Yeah.
Emily: What data are you collecting-
Alex: Right.
Daniel: And for how long?
Emily: -for Meta?
Alex: It’s really interesting. I don’t know if you saw this thing posted today. We’re recording on May 1st. But speaking of Meta, Mark Zuckerberg said something where he’s using this line from the research on social capital about the number of friends that Americans typically have, which is a really weird line of research. And there’s a lot of controversy around this and around the General Social Survey and how they generated these lists. This is the sociologist hat coming on now. And so, it’s this Bowling Alone hypothesis, like that Americans don’t have friends or they don’t have close friends. And he said this thing that is absolutely like flooring, which is, “Well, Americans have three friends or less. And then they-”
Emily: They have demand or something?
Alex: They have capacity for 15 friends. I’m like, “Well, first off, what does capacity for friendship mean? I don’t know what you’re saying, bro.” And he’s suggesting that these AI companions can supplement that friendship. But that’s not what’s even happening. It’s not like there’s a consumer demand for friendship and we need to fill that quota somehow.
Emily: There’s a market, Alex. [laughter]
Alex: Well, that’s the thing. It’s more of an– It’s engagement farming. It’s kind of the thing is like, “How do you stay on the platform more?” And there was some great reporting on 404 Media recently on their podcast where they were talking about the Meta AI bots, and some of these are really horrific. Some of them are like whatever, talk to some celebrity. I think initially they had something like “cook with Snoop Dogg” or something.
Hedvig: Okay.
Alex: But then, they have these things and most of them are like “AI girlfriends” or they’re salacious or something. And now, they’ve had this thing and one of the– 404 did an investigation on this and apparently Wall Street Journal did too, where they were finding that in the AI studio that was available to kids, they were very easily prompting these for like sexual content.
Hedvig: Okay.
Alex: And so, this is tending very quickly towards maximizing engagement. So, when you see this on WhatsApp or Instagram, they’re trying to keep you on the platform. It’s like the tried-and-true thing about the “algorithm.” It’s another vector on which they’re trying to do engagement farming.
Daniel: The title of the book is The AI Con. Is it a con? Was it always meant to be or–?
Emily: Yes.
Daniel: Is it just a space where grifters can do their thing?
Emily: I would say historically, it’s always been a con. And in this present moment, it’s very much a con. And the con happens on multiple levels. So, the large language model parlor trick being sold as something that understands rather than just something that can mimic the shape of various different kinds of communications we do, sort of serves as a substrate on which you can build various other cons. Like, “Oh, we aren’t going to need to hire any more teachers because everyone has a personalized tutor.” Or, “Never mind that you can’t access mental health services, this thing can work as a psychotherapist.” And all these other cons on top of that, all the way up to, “You have to give us all of your data and let us accumulate all of this capital because if we don’t build the good AI, then someone else,” and it’s usually the highly xenophobic, Sinophobic discourse around China here, someone else is going to build the bad AI, so just cons on top of cons.
Daniel: Dang.
Hedvig: And what’s weird, which I was just talking about with our supporters on Patreon, on our Discord, the other day, is that there are things that I could imagine it being useful for, and for some reason, it’s not used for that. So, making art is not something I am interested in consuming from a robot. Whereas, for example, a decent autocorrect that actually correctly predicts my text and corrects typos would be helpful to me. But for some reason, they’re not doing that. And I think–
Emily: Autocorrect isn’t something that seems like it’s really intelligent. And this is where the con comes in. I’m not saying that all language technology is not useful. So, autocorrect, spell checkers, automatic transcription, machine translation, so long as they are transparent about just how accurate and inaccurate they might be, lots of good use cases. Autocorrect can be funny sometimes, but if what you’re building is autocorrect and you say, “I’m building autocorrect and I’m putting effort into making better autocorrect,” you don’t then get to also say, “And it’s going to be so smart, then it can do everybody’s jobs and make $100 billion.”
Hedvig: Right. But what I’m confused about is why are they making art and not a better autocorrect? Why are they making a product that I don’t want instead of improving a product that I actually would like? That’s what I’m confused about.
Alex: That’s because you can’t return on investment for better autocorrect. There [unintelligible 00:17:13] existed better autocorrect. I have never used Grammarly. I think it’s gotten too aggressive from what I’ve heard. And it’s kind of rewriting things in a way that is definitely not in people’s own voice, but that’s– This is not an endorsement of Grammarly, but there were elements of that where it may have helped and may have offered helpful selections or corrections. If you’re trying to justify, whatever, a $600 billion revenue figure that needs to be churned out, it can’t be relegated to Grammarly. It needs to seem like it can do so much more than that. So, that might be image generation. This might be this multimodal– Take any input and get some kind of other output. So, it has to look like it mimics a human kind of element.
There’s a great line in the book, to quote ourselves, and it goes something like. [laughs] Just to quote ourselves–
Emily: Yeah, yeah, go, go.
Alex: But there’s a great line which is something like, “Well, creativity is kind of the last line of kind of testing or proof for many of the boosters. And if you are just automating things, like if you’re automating customer support, if you’re answering telephone calls, still really terrible use case, but if you’re doing that, that’s industrial automation. But if you’re generating something which, let’s say, speaks to the human condition or attempts to, because it is not being produced by a human, then this is like moving into the realm of the creative or moving into the realm of the ‘Wow, these things are really, really intelligent.’” And so, the kind of art element of that really is trying to do that. It’s really trying to provide credence to intelligence, especially this kind of notion of general intelligence.
Hedvig: Right. But they’re just–
Daniel: At least if it’s doing art and it gets it horribly wrong, all you get is bad art. Whereas if it’s trying to do medical diagnoses and it gets it horribly wrong, you get really bad outcomes. I’m actually, oh, I don’t want the art, but let’s just keep it with art. No that’s bad because–
Emily: So, the art, it’s not immediately lethal in the way that some of these other things are, although there are the fake mushroom foraging guides that go up on Amazon, which are sort of on the boundary between– And people have actually been hospitalized by following one of these, not realizing they were fake. Yeah, Hedvig’s face is saying it all.
Hedvig: Oh, wait, I’m just putting it together. So, someone’s like generating a book that says, like, “These are the mushrooms you can eat,” and it’s images. And then, someone goes out and eats them and they’re toxic and they go to the hospital?
Emily: Yes.
Hedvig: I didn’t know this. Lovely.
Emily: So, there’s that. But also, if you think about the way that people make a living as an artist, there is sort of a small number of people who are just doing their own art and able to sell it on the art market.
Hedvig: Very small.
Emily: But most working artists are doing things like graphic design to illustrate magazine articles. And that kind of artwork is being supplanted. That kind of work by artists is being supplanted by these synthetic image machines. And so that is very damaging to artistic careers. It’s damaging to the people who are trying to survive doing that kind of work. And also, I think this is Molly Crabapple’s point. Those are the careers that like, allow people to hone their craft and then become the sort of very virtuoso artists that we celebrate as contributing to culture. And if you cut that off, then we are actually losing a lot of artwork. So, it is not as immediately lethal as these medical applications or the mushroom foraging guides, but it is still highly problematic.
Alex: I would also say that the mushroom foraging guide is part of this– I don’t know what to call this. I like to think about this as just doing a DDoS on particular industries through synthetic texts. And DDoS, for folks that don’t know, is a distributed denial-of-service attack. And the Amazon element of it is that people produce these texts with such rapidity, they just post them to Amazon. It’s an obvious scam, but they’re trying to get people to click it just because there’s so much they can produce. There’s so much on Amazon’s self-publishing platform. And that is doing harm, because then the people who are actually experts, mycological experts perhaps, who have spent a lot of time honing their craft in mycology– And in particular, the people that raised the alarm on the mushroom thing were the New York Mycological Society. And so, those are folks that are not receiving any kind of funds that they might be getting for publishing.
There’s also this thing that happens and there was some– I think 404 had some reporting on this. And we saw this in real time where scammers will write a text, give it the same name as a text that is like a bestseller, and then they will try to get sales from that. So, for instance, we saw this recently. There was that book, Careless People, that was from this former Facebook director of public policy, Sarah Wynn-Williams. And Emily and I were looking at book sales, and it came up in our book sales. It was not that book, but it was something with the same name, some random fake author. And then, it was still sort of like second in sales, even though it was a completely different book. It was a complete scam. And so, yeah, you’re seeing that harm to artists both on the training side and then also kind of on this purchasing consumer side.
Daniel: It’s also happening with languages. This is Basic Navajo Made Easy. It’s for sale, and it’s completely– It’s slop.
Emily: Yeah.
Alex: Yeah, yeah, we saw that. And then there was like, another one I think we talked about on the pod, which I think it was Anishinaabe. And folks were like, “We didn’t freaking write this. What the hell?” Yeah, yeah.
Hedvig: So, this kind of content was made before. Meme pages that have content creator programs with Facebook or whatever, and who want to create lots of things, that was happening before these kinds of generative models. And now it’s just supercharged, because people sitting in a room making this shit up can only make up so much shit in an hour, right?
Alex: Yeah.
Hedvig: But these models can make up so much more. And you get these– I was listening to the QAA podcast, and they’re talking about AI Slopaganda. And you get these interesting phenomena where there are a lot of sort of content creator farms abroad from America who want to go viral in America. So, they’re like, “What do Americans like? Veterans, pregnant women, and Jesus.”
Alex: Shrimp Jesus.
Hedvig: You get Shrimp Jesus. You also get lots of pregnant women being like, “We love our troops.” And you get these content and it’s such an interesting– As a European, I’m like, “This is an interesting [unintelligible 00:24:27],” because what goes viral says something– And then, what’s interesting is often things go more viral if they’re a little bit weird, if they’re a little bit off. Because then even me, a person who does not believe that Shrimp Jesus exists, will click on it to be like, “What? Why is the cat head coming out of the stomach? Like, what?” And that’s how you get clicks, right?
Emily: Yeah. I want to admire slopaganda and also put in slopsquatting as another slop neologism. Slopsquatting refers to the fact that when you use large language models to generate code, sometimes they will output fake packages. So, import whatever package. And slopsquatters write malware with those names.
Alex: Yeah.
Emily: So, Word of the Week candidate, maybe? Slop as a–?
Daniel: We’ve got a couple here. Thank you very much.
Hedvig: Emily, this is very important news because– So, I have not yet imported a package, but ChatGPT has told me of functions, and arguments of functions, that don’t exist. And I’m like, “I don’t think dplyr works like that. Don’t think there’s a function like that.”
Emily: Yeah. And you’ve got to be really, really careful if it’s import because then you’re opening-
Hedvig: Yep. You’re right.
Emily: -up a big vulnerability-
Alex: Completely.
Emily: …and it’s one that’s of course being taken advantage of.
Alex: Yeah.
Hedvig: Yeah. You’re right.
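[Note: the defence against slopsquatting is mostly “check before you install.” Below is a minimal sketch, using only the Python standard library and PyPI’s public JSON API, of verifying that a package name suggested by a language model actually exists before running pip install. Existence alone doesn’t prove a package is safe, since slopsquatters register real packages under the invented names, but a 404 is a strong signal the import was hallucinated.]

# Sketch: sanity-check an LLM-suggested package name against PyPI before installing.
# A 404 means the name doesn't exist (likely hallucinated); a hit still deserves
# scrutiny, because slopsquatters register malicious packages under such names.
import json
import urllib.request
from urllib.error import HTTPError

def exists_on_pypi(package_name: str) -> bool:
    url = f"https://pypi.org/pypi/{package_name}/json"  # PyPI's public JSON endpoint
    try:
        with urllib.request.urlopen(url) as response:
            json.load(response)  # valid package metadata came back
            return True
    except HTTPError as err:
        if err.code == 404:
            return False
        raise

for suggested in ["requests", "definitely-not-a-real-package-xyz"]:
    print(suggested, "->", exists_on_pypi(suggested))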
Daniel: Let me ask then, because we’re talking about hallucination, which I know is a term that we try to avoid because it’s kind of anthropomorphic, but it’s being called hallucination mitigation. It’s a thing that people are trying to do and you can do it by saying, “Please don’t hallucinate. Be very careful about what you say.” And for some reason, that brings down the error rates…
Emily: Does it?
Daniel: …a bit. Apparently in the research that I’ve seen, it is one tactic that’s being used. Another is you get another piece of software to examine it for claims, for claim detection and then check those via some sort of knowledge engine or something. But what do you think about attempts to mitigate hallucination? Have you seen anything that works?
Emily: I can tell you sort of from first principles that nothing could work perfectly. And those first principles sort of are on two levels. The first is that at base, a language model is just a model of the distribution of word forms in text, all right? And when they are then turned out to become these synthetic text extruding machines, we are using that model of the distribution of the word forms in text to repeatedly answer the question, what is a likely next word? That’s it.
And to the extent that the training data is close enough to the factual claims that we’re interested in, some of the time, what comes out is going to map to something that we want, but it is always just making papier-mâché of the training data. And to get to the levels of apparent fluency that we have, it has to be absolutely enormous training data. So, none of this stuff is curated, all right? And it’s going to mess up little things like, oh, negation, for example. So, we’re going to have all of these issues just there from the start.
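[Note: Emily’s “what is a likely next word?” framing can be made concrete with a toy example. The sketch below is nothing like a real large language model in scale or in how the distribution is estimated, but the generation loop is the same in spirit: build a distribution of next words from some training text, then repeatedly sample from it.]

# Toy sketch of "a model of the distribution of word forms in text":
# count which word follows which, then generate by repeatedly asking
# "what is a likely next word?" The output can look fluent while being
# nothing but papier-mache of the training data.
import random
from collections import defaultdict

training_text = "the cat sat on the mat and the dog slept on the mat near the cat"
words = training_text.split()

next_words = defaultdict(list)
for current, following in zip(words, words[1:]):
    next_words[current].append(following)

word = "the"
output = [word]
for _ in range(8):
    word = random.choice(next_words.get(word, words))  # sample a likely next word
    output.append(word)
print(" ".join(output))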
If you then look at the second level, it’s like, “Well, why would you expect to be able to have a general-purpose answer machine? Why do you think that kind of a thing could actually exist?” And that is based on a misapprehension of how information works, how information access works. And I’m very proud that in the book, we managed to work in a Douglas Adams reference in this context.
Daniel: Yes, you did. The number 42.
Emily: Yes. And a bit more actually.
Daniel: We need a bigger computer to fix the problems from the first computer.
Alex: Right.
Emily: Yes, yes. And it’s also like the idea that we could somehow abdicate the very human activity and very social activity of making sense of our world to some kind of automation just misunderstands the problem. And we are much better off saying, “Okay, what is the specific context that I am automating something in? Do I need access to a database of public-facing medical articles?” You could write a search engine. You could even use a language model as a component of a search engine that helps people enter a query and then land on one of these articles from the database, and not present it as “This answers your question,” but, “This is possibly relevant to your question,” and do much, much better, rather than trying to build a general-purpose answer machine.
Alex: Right.
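[Note: a minimal sketch of the narrower kind of tool Emily describes: search over a fixed, known collection of documents that returns “possibly relevant” articles rather than an answer. The sketch below uses scikit-learn’s TF-IDF purely as an illustration, and the articles are invented placeholders; a language model could optionally sit in front of this to help phrase the query, but it would not be doing the answering.]

# Sketch: a small, transparent retrieval tool over a known document set.
# It ranks documents by word overlap with the query and returns titles with
# scores -- "possibly relevant to your question", never "the answer".
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

articles = {  # placeholder stand-ins for a curated database of medical articles
    "Iron-deficiency anaemia: symptoms and treatment": "Fatigue, pallor, iron supplements, diet.",
    "Seasonal influenza overview": "Fever, cough, vaccination, rest and fluids.",
    "Migraine management": "Aura, triggers, triptans, preventive medication.",
}

titles = list(articles.keys())
vectorizer = TfidfVectorizer()
doc_matrix = vectorizer.fit_transform(articles.values())

def possibly_relevant(query: str, top_k: int = 2):
    scores = cosine_similarity(vectorizer.transform([query]), doc_matrix)[0]
    ranked = sorted(zip(titles, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:top_k]

print(possibly_relevant("constant fatigue, could it be low iron?"))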
Hedvig: That’s actually partially how I’ve– So, I was kind of resistant to using any large language model at all for the longest time. And the use cases I’ve found are, for example, in academia, you read a lot of different papers and I don’t know, Emily and Alex, if you are subject to this, but there are certain journals, certain social networks that you get exposed to more. You know certain colleagues, you’re more likely to read their paper than someone else’s paper. So, I was a bit worried that I was getting like a biased effect of that. So, I asked ChatGPT, I was like, “Oh, I’m thinking of writing a paper about this, I’m going to cite these in these and these people. Do you know of any other papers that would be reasonable for me to look into for this that I haven’t thought of yet?” And they recommended, like for example, some Chinese-authored paper that I had never heard of because they’re not in my social sphere. But then, I could go and click through to that and look at it and see if it’s reasonable and think about it and do something. It’s sort of like a, I don’t know, a smarter semantic search, but I know what my use case is and I use it for that. And I don’t think it’s going to give me the answer to meaning of life.
Emily: I suspect– It’s good to have a specific use case and it’s good to confine those use cases to ones where you can verify. I’m nervous about the “give me citations” use case because it is also going to be reproducing whatever biases there are. And the real gold standard for the answer to that question is go talk to a reference librarian.
Hedvig: Yes.
Emily: This is their job.
Alex: I would also say that there’s a lot of kind of issues in thinking about that. One of the concepts I think we talk about, or at least is in the references in the book, is this notion of citational justice, this idea of thinking about where certain ideas come from. And citational justice, I think, is also tied very closely to the Cite Black Women movement. This idea that there’s a lot of these ideas that are undercited that come from black women and other women of color authors.
And what happens there is you’re going to reproduce a lot of those biases that exist in the training data. You might be trying to do this work of getting out of your networks, but, yeah, is that going to be the best alternative? Are there other kinds of strategies? There are huge problems with the semantic search tools. And don’t get me wrong, Google Scholar is terrible. [laughs] It’s very, very bad for what it does and what it privileges. It privileges preprints and arXiv papers. They also don’t let you rename yourself if you’re a trans author or you need a name change.
Hedvig: They don’t? That’s weird.
Alex: Oh, there’s a whole thing around it.
Emily: Seems like an easy thing. That seems easy.
Alex: Oh, you would think it would be an easy thing until you get into a fight with that person who works on that product-
Hedvig: Well, I’m so sorry. That sounds rough.
Alex: -like at Google, and that’s actually very– And that was raised by many trans folks at Google. And for background, I was at Google. And so, there’s a lot of reasons not to use Google Scholar. And there’s a lot more reasons, as Emily said, to go to the reference librarian, or, if you’re trying to pursue references specifically from particular scholars, to try to identify those scholars, identify the citation networks, and do your own work there. I think doing that work is really an underrated skill too. For researchers, and in training researchers, really trying to identify those networks is something that we ought to cultivate in people. I don’t trust a synthetic text extruding machine to do that with any kind of proficiency.
Hedvig: I agree. I should talk to my librarian more often. I do think that I do a lot of good citation network finding things, but I am still worried that my personal academic network is already biased. And I do feel that when I do search Google Scholar or talk to ChatGPT about it, it does tell me about things that I hadn’t heard about before. And maybe it’s very bad that I didn’t find out about them before, but I think sometimes for some people– maybe I should just talk to my librarian more, that’s probably the answer, but it’s–
Emily: And as Alex pointed out, Google Scholar has lots of issues, but if it was a choice between Google Scholar and ChatGPT, I would pick Google Scholar.
Daniel: You kind of answered a question for me. I was going to ask, what’s wrong with getting a large language model to plow through tens of thousands of scientific papers looking for new avenues of research? And the answer that I’m getting from the book and from you now is the kinds of hypotheses that we are able to come up with are a product of our experience and our own biases. And that kind of stuff is best left to a human and not to a biased large language model that’s going to reflect an ability to only ask the kind of questions that are already being asked by the mainstream. Am I getting that–?
Emily: Yes, I think that’s part of it. You can take the papier-mâché machine and get new combinations of things for someone to consider. And there was this very ridiculous study that was evaluating the novelty of research ideas that came out of one of the large language models versus the novelty of research ideas provided by some actual researchers, who I think were like early PhD students. And the weird thing about this test was, first of all, the PhD students were not going to give their most favorite ideas that they’re working on, on the one hand. On the other hand, the ideas that came out of a large language model were pre-screened to take the most plausible ones. So, it was a completely bonkers experimental design.
But I want to point out that there are sensible ways to use text processing over scientific literature to work on specific questions. I know that there were projects in the past doing syntactic processing of– I think it was somewhere in the biomedical literature sort of looking at– Yeah, it was protein-protein interactions and then also like protein medication interactions. And the idea is that you would do automatic processing, not large language model processing, but automatic processing of a whole bunch of scientific papers to do named entity recognition. So, which proteins are being mentioned, which other chemical agents are being mentioned. And then, you could extract from that, okay, what are all the combinations that have been looked at and what has not been looked at.
So, you can do text processing, but it’s not “throw it into the large language model and ask ChatGPT to give you the answer.” But rather, a very intentional, “Here is how we are going to try to do text processing over this literature to answer a specific question that we have about what has already been studied in the literature.”
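[Note: a toy sketch of the kind of intentional pipeline Emily describes. It uses a hard-coded list of protein names in place of a trained biomedical named-entity recognizer, and three invented abstracts as stand-ins for a real literature collection: tag which proteins each paper mentions, record the pairs that co-occur, and report the pairs never seen together as candidates for “not yet examined.”]

# Toy sketch of intentional text processing over a literature collection:
# 1) find protein mentions per paper (here a simple gazetteer lookup, where a
#    real pipeline would use a trained biomedical NER model),
# 2) record which protein pairs co-occur in at least one paper,
# 3) report pairs that never co-occur as "not yet examined together".
from itertools import combinations

known_proteins = {"BRCA1", "TP53", "EGFR", "MYC"}
abstracts = [  # invented placeholder texts standing in for real papers
    "We examine the interaction of BRCA1 and TP53 in DNA repair pathways.",
    "EGFR signalling modulates TP53 activity under cellular stress.",
    "MYC expression levels were measured across cell lines.",
]

mentions_per_paper = [{p for p in known_proteins if p in abstract} for abstract in abstracts]

studied = set()
for mentions in mentions_per_paper:
    studied.update(frozenset(pair) for pair in combinations(sorted(mentions), 2))

all_pairs = {frozenset(pair) for pair in combinations(sorted(known_proteins), 2)}
print("Co-mentioned:", sorted(tuple(sorted(p)) for p in studied))
print("Never co-mentioned:", sorted(tuple(sorted(p)) for p in all_pairs - studied))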
Hedvig: So, I have a follow-up question about that because I have some friends now who are working in companies where people are using AI and they’re trying– A recurrent thing that seems to be happening is that you have a senior boss who wants to automate a process that they used to hire a human to do. They have been told that a large language model can do it instead, but they have not been told that there actually exists– like, we’ve had pretty good OCR technology for 15 years, so, you could probably use a much more simple, transparent and reliable tool to do the tasks that you’re after. But there’s like a knowledge gap of like 20 or 30 years where people don’t know about any tools between like, I don’t know, Microsoft Word and ChatGPT. And like you were just saying now, Emily, there are all these other kinds of processing automatization tools that are much more interrogatable– what do we call them? Like, you can [crosstalk]
Daniel: Interrogable.
Alex: Yeah, interrogable. I think that’s– yeah, sure, I like it.
Hedvig: Okay, great. There are all these tools that you could use. You might also be using a hypercomplex thing to do something very, very simple. You’re actually just trying to read in a bunch of like forms that you had and reliably find the numbers.
Emily: When Elon Musk took over the US Digital Service and renamed it DOGE, one of the people he hired, one of these kids fresh out of high school or college, apparently posted on Reddit saying, “What’s the best large language model to use to get Excel spreadsheets into JSON files?” or something.
Alex: Oh, my gosh.
Hedvig: Is that right?
Daniel: I remember that.
Hedvig: That’s two like super standardized formats. Also, Excel, don’t use it. But, like JSON is–
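[Note: for the record, this particular conversion needs no language model at all. A minimal sketch, assuming pandas and openpyxl are installed and using a placeholder file name; the result is deterministic, so the same spreadsheet in always gives the same JSON out.]

# Sketch: Excel to JSON with ordinary, inspectable tooling -- no LLM required.
import pandas as pd

df = pd.read_excel("spreadsheet.xlsx")  # one sheet into a DataFrame
df.to_json("spreadsheet.json", orient="records", indent=2)  # a JSON list of row objects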
Alex: Wait, I want to know your Excel beef.
Hedvig: Oh, my God, I have so much Excel beef. Don’t use Excel.
Alex: Well, after you answer this question, I do want to hear that. [laughter]
Emily: Yeah. To be continued some other time, yeah. So, yes, I think this actually comes back around to the AI hype and we haven’t yet in this conversation really expressed our beef with the term, AI.
Alex: Yeah.
Emily: So, when you say AI or the AI, you end up lumping together all of these different technologies that are not actually a coherent set of technologies, and furthermore sort of adding to the idea that, “Well, if it can create images based on what I said and it can also help me with my R programming and it can write a poem about getting a peanut butter and jelly sandwich out of a VCR,” or whatever that early example was, “then surely it must also be good for these things. And surely this is something brand new that we’ve never had before and it’s time to jump on this train.”
And that disconnect, I think, is what your friends’ bosses are experiencing, and unfortunately imposing on your friends: “Oh, we’ve heard about this thing, this one thing out in the world now that is brand new and can do everything. So, we’ve got to jump on that train.” As opposed to our old-fashioned procurement processes, where we look carefully at what’s the thing we need to do, what are the vendors that are providing it, how do we evaluate how each of the vendors’ systems will work for our purposes, and then move forward from there.
Hedvig: And what’s so funny about what has happened with this corn syrup, just put it in everything, also, where it’s not needed, is that you also get a bit of a consumer backlash, right? So, there were some surveys that show that when you say “with AI” or something, people feel more negatively towards it. So, why are we–
Daniel: Why are we still shoving AI into everything?
Alex: Yeah.
Emily: Because there’s a lot of money.
Alex: Right, because there’s a lot of money involved in it. That’s the sort of thing that is not as well publicized, but this stuff isn’t making money. The kind of orders-of-magnitude investment to build the infrastructure, to do the model training, to have the data centers, to do the chip fabrication, to ostensibly pay AI engineers, although they say that they don’t need AI engineers, which I don’t really believe because people are doing those things. It’s a huge, huge investment. And there’s so much money tied up in these networks of companies, between the hardware manufacturers like Nvidia and TSMC and all of them, and Google and OpenAI and Amazon. There’s so much infrastructure that needs to be built and needs to be justified. And so, if people– There is a backlash, in a way, where like Uber and Lyft were losing money for so many years.
Hedvig: I was just going to say that, yeah.
Alex: But at least Uber and Lyft, you can see what they do and people actually use them. Unfortunately, they killed taxi industries in certain cities. I live in the Bay Area, and they absolutely decimated the Bay Area taxi infrastructure. But at least, there’s some kind of replacement that you see. They’re trying to push AI on everything because there’s no other path to monetization. And this is why it’s really appearing a lot in enterprise products. So, if you have a subscription, if you have a G Suite, if you have a Zoom, things that you already pay for, you pay a little bit more and then you can justify this AI investment. But that is just not paying off.
Hedvig: How long can that go on?
Emily: It’s got to stop at some point. I got to speak with the deans of various West Coast US colleges and universities last spring, I think, and I got to say to them, “Look, the reason that this is your problem now, this AI stuff is your problem now, is not that there’s been some big technological advancement. It’s that there’s a lot of money trying to sell this to everybody, trying to sell it to your students, trying to sell it to you. And it’s important to sort of see it in that light.”
Daniel: Yeah, well, that was what I was going to ask. The history of machine learning is the history of boom-and-bust cycles like the– Yeah, the early days of machine translation. Initial promise, lots of funding. Then, it doesn’t work out and it dries up. And we see this in like 20- or 30-year cycles. And we’re in one now. So, what happens next? Are we seeing a retraction now or what?
Emily: Alex, you want to talk about the grimy residue?
Alex: Yeah, yeah, right. I think we are seeing this bust happening, but it seems to be happening very slowly. Most recently, there was– I pull up the $600 billion figure because there is this article written by David Cahn at Sequoia Capital, and it was called something like “AI’s $600 billion question.” And the kind of lede is that the AI bubble is reaching a tipping point, and navigating what comes next will be essential. And it’s kind of thinking about– And he’s kind of outlining, “Okay, this is basically how much money is just going out in data center spend.” And the estimate here is $300 billion just for the data centers. And then, there’s this Nvidia data center run rate, which I think is a term that’s just like how much it costs to– I don’t know, I’m not an investor, I can’t answer this. I’m assuming it has to do with operating costs. So, $600 billion in Q4, and we know that they’re not reaching that. It’s just a ridiculous amount of money going about.
And so, I think we’re really seeing indications that this boom cycle is on its last legs. We’ve seen a huge investment in OpenAI, and they think it’s like, “Well, this is really your– We’re going to turn into and really try to put this in,” but it seems like it’s on the precipice. And so, there’s different ways in which it implodes. The implosion can go quickly, which would be very concerning, since so much of the stock market is kind of premised on the growth of Nvidia and Google. Apple has kind of pulled back a bit, they’re like, “This is maybe not the best thing,” which, good for them. But Meta is trying to find the next big thing. Or, it could go very slowly, kind of like the Uber or Lyft model: keep throwing bad money after bad money and see what comes of it.
And so, we’re seeming to be at this place where it really needs to turn a profit, but it’s not going to happen. What happens after? Well, unfortunately, those data centers are going to keep on being there, right? There’s some reporting, I think it was in– I forgot which publication, maybe MIT Tech Review. But it was how China did this huge build out of data centers and many of them are just not running. Microsoft has frozen a lot of their projects on data center builds. And so, we’re seeing some indication of just data center construction slowing down.
But then, what’s going to happen to those jobs that we were told are not needed? What’s going to happen to those professions that have been decimated? Those aren’t magically going to come back. And so, it’s a bit of a depressing feature of this, but it makes this bust cycle a little more consequential than the ones we’ve had around AI and machine learning in the past.
Hedvig: So, thinking about services like Uber, for example, in many European countries, a lot of European states said either we’re not having Uber at all, or if you have actual employees who actually only work for you and you call them freelancers, that’s your problem. They’re employees and you should pay them benefits and pay taxes and etc., which means that in some places, Uber is more expensive and more similar to taxis in some European places. So, there’s been a sort of like regulation, and it’s become a bit more like it was before. So, there are taxis in our city, for example, that are just called Uber as well. You can do them with Uber app, but they’re actually kind of just like taxis.
Alex: Yeah. Yeah.
Hedvig: So, do you think that if there is a slowdown in this AI hype, does that mean that these 60-year-old bosses who said, “Let’s do everything with large language models,” are going to learn basic OCR software? Like, are they going to catch up on what they missed out on or what’s going to happen?
Emily: So, our hope with the book is that we empower a bunch of people to speak clearly in their own context about why the AI stuff doesn’t make sense. And to say, “We recognize the need. There’s something you want to automate here, let’s think about how to automate it differently.” With luck, that’ll happen.
Daniel: Well, in that case, let’s talk about how we can avoid getting conned. How can we dodge the AI con [Alex laughs] and navigate things sensibly? What do you want people to do as a result of the book?
Emily: So, ask questions is a big part of it. So, there’s both sort of individual action and collective action. And starting with the individual action, we can be very critical in the sense of critical thinking, not critical in the sense of criticizing, but critical consumers of any proposed automation. And just there’s a set of simple questions to ask. Okay, what are we automating? What’s the input to that automation? What’s the output? Does that pairing even make sense? Is there plausibly enough information in the input to get to the output? Because a bunch of these things, totally not, right? The things where they propose to detect whether or not someone’s a criminal based on a picture, that information is just not in the picture.
So, what’s being automated? What’s the input? What’s the output? Why are we automating it? How do we evaluate this automated system? How does that evaluation actually map onto the particular use case that we’re talking about here? And then, what’s the recourse? If this goes wrong, how have we set things up? We can also, as individuals, look at journalism with a high expectation of holding power to account. So, are the journalists asking those questions? Or, are they effectively serving as PR mouthpieces for the tech companies practicing access journalism? But also, there’s a lot of collective stuff we can do. So, pushing for regulation, but also organizing. And I think I want to pass it over to Alex for that one.
Alex: Totally. I think a lot of especially labor organizations have been doing very critical work in thinking about these, especially as these tools are kind of a labor breaking tool and things that kind of threaten livelihoods of folks. And then, kind of the big example here is the WGA, the Writers Guild of America, and how they pushed back on AI in the writers’ room. And so, we’re seeing this in kind of nursing and dock work and kind of shipping and ling– not linguistics, logistics, sorry. [laughter] Although I’m sure that there’s people that want to bring LLMs to your jobs too.
And so, thinking about what collectivities and I think there’s a lot of good political education and public kind of like collective understanding within particular jobs that are really helpful there. And so, if there’s any kind of AI working group, I think that’s very helpful.
Alex: Then, also, different kinds of ways that this is ruining things. We’ve seen this, and this has happened at a faster clip than we hoped. The DOGE example is instructive too, because “AI” has been very instrumental there. The people that were fired from the USDS, the US Digital Service, posted this thread on Bluesky that was about how they had this AI sandbox, where they were just testing these consumer tools, and then DOGE rolled it out and said, “This is going to replace everything in government,” and this thing called GSA AI. And so, pushing back and knowing that in federal services, in social services, well, what’s happening here? As a citizen, you also have the ability to ask these questions in public, in citizen groups, in civil society. And so, a lot of that collective action is very important. So, you bring these individual questions and individual understandings, and organize collectively with those understandings.
Daniel: So, it’s not inevitable that AI will-
Alex: Certainly not. [laughter]
Daniel: Will be everywhere.
Emily: And that claim that it is inevitable is really a bid to take away our agency, that you have to deal with this. It’s coming. And the thing that I really want people to hold on to is strategic refusal. We can say no. We are collectively building our future together, and we all have a part in that. And we do not have to give in to the big tech vision of what it will be.
Daniel: Don’t click the sparkles.
Alex: Don’t click the sparkles. And we’re going to reclaim the sparkles because those belong to the gays first, and we will take them back. And so, yeah.
Daniel: The book is The AI Con: How to Fight Big Tech’s Hype and Create the Future We Want. It’s available now from HarperCollins. We’re talking to the authors and cohosts of Mystery AI Hype Theater 3000, Dr. Emily Bender and Dr. Alex Hanna. Emily and Alex, thanks so much for hanging out with us and talking about your work today.
Emily: Thank you. This was amazing.
Alex: Thank you. Really enjoyed it.
[Transcript provided by SpeechDocs Podcast Transcription]