
This Week in AI: AI Isn’t the End of the World — But It’s Still Very Harmful

Hey everyone, and welcome to TechCrunch’s regular AI newsletter.

This week, a new study was released in the AI field arguing that generative AI isn’t all that bad after all, at least not in the apocalyptic sense.

In a paper submitted to the annual conference of the Association for Computational Linguistics, researchers from the University of Bath and the Technical University of Darmstadt argue that models like those in Meta’s Llama family are unable to learn on their own or acquire new skills without explicit instruction.

The researchers conducted thousands of experiments to test the ability of several models to perform tasks they had not encountered before, such as answering questions about topics outside the scope of their training data. They found that while the models could superficially follow instructions, they were unable to master new skills on their own.

“Our study shows that the fear that a model will go off and do something completely unexpected, innovative and potentially dangerous is not justified,” Harish Tayyar Madabushi, a computer scientist at the University of Bath and a co-author of the study, said in a statement. “The common narrative that this type of AI poses a threat to humanity is preventing the widespread adoption and development of these technologies, and is also distracting from the real problems that need our attention.”

There are limitations to the study. The researchers didn’t test the latest, most powerful models from vendors like OpenAI and Anthropic, and benchmarking models tends to be an inexact science. But the study isn’t the first to show that today’s generative AI technology doesn’t threaten humanity, and that assuming otherwise risks regrettable policymaking.

In an editorial in Scientific American last year, AI ethicist Alex Hanna and linguistics professor Emily Bender made the case that corporate AI labs are misdirecting regulators’ attention toward imaginary doomsday scenarios as a bureaucratic maneuver. They pointed to OpenAI CEO Sam Altman’s appearance at a May 2023 congressional hearing, during which he suggested, without evidence, that generative AI tools could go “quite wrong.”

“The general public and regulatory agencies must not be fooled by this maneuver,” Hanna and Bender wrote. “Instead, we should turn to scientists and activists who engage in peer review and have pushed past the hype surrounding AI to understand its harmful effects in the here and now.”

Hanna and Bender’s arguments, like Madabushi’s, are key points to keep in mind as investors continue to pour billions into generative AI and the hype cycle nears its peak. The stakes are high for companies backing generative AI technology, and what’s good for them (and their backers) isn’t necessarily good for the rest of us.

Generative AI may not cause our extinction. But it is already harming us in other ways: witness the spread of nonconsensual deepfake pornography, wrongful arrests driven by faulty facial recognition, and the hordes of underpaid data annotators. Let’s hope policymakers take notice and share this view, or eventually come around to it. If not, humanity may have something to fear.

News

Google Gemini and AI, oh my! Google’s annual Made By Google hardware event took place Tuesday, and the company announced a slew of updates to its Gemini assistant — as well as new phones, earbuds, and smartwatches. Check out TechCrunch’s roundup for the latest.

AI copyright lawsuit moves forward: A class action lawsuit filed by artists who claim Stability AI, Runway AI and DeviantArt illegally trained their AIs on copyrighted works may proceed, but only partially, the presiding judge ruled Monday. In the mixed ruling, several of the plaintiffs’ claims were dismissed while others survived, meaning the lawsuit could now proceed to trial.

Trouble for X and Grok: X, the social media platform owned by Elon Musk, has been hit with a series of privacy complaints after it used the data of users in the European Union to train AI models without asking for their consent. X has agreed to stop processing EU data to train Grok, at least for now.

YouTube tests Gemini brainstorming: YouTube is testing an integration with Gemini to help creators brainstorm video ideas, titles, and thumbnails. The feature, called Brainstorm with Gemini, is currently available only to select creators as a small, limited experiment.

OpenAI’s GPT-4o does weird things: OpenAI’s GPT-4o is the company’s first model trained on voice as well as text and image data. That sometimes leads it to behave in strange ways, such as mimicking the voice of the person speaking to it or randomly shouting in the middle of a conversation.

Research paper of the week

There are a ton of companies offering tools that claim to reliably detect text written by generative AI models, which would be useful for combating misinformation and plagiarism, for example. But when we tested a few of them a while ago, they rarely worked. And a new study suggests that hasn’t improved much.

Researchers at the University of Pennsylvania designed a dataset and leaderboard, the Robust AI Detector (RAID), of more than 10 million AI-generated and human-written recipes, news articles, blog posts, and more to measure the performance of AI text detectors. They found that the detectors they evaluated were “mostly useless” (in the researchers’ words), working only when applied to specific use cases and to text similar to the text they were trained on.

“If universities or schools relied on a narrowly trained detector to catch students using (generative AI) to write their papers, they could falsely accuse students of cheating when they weren’t,” Chris Callison-Burch, a professor of computer and information science and a co-author of the study, said in a statement. “They could also miss students who cheated while using a different (generative AI) to generate their homework.”

It seems there is no silver bullet when it comes to AI text detection; the problem may well be an intractable one.

OpenAI itself has reportedly developed a new text-detection tool for its AI models, an improvement over the company’s first attempt, but is refusing to release it out of concern that it could disproportionately affect non-English-speaking users and be defeated by minor text modifications. (Less philanthropically, OpenAI also has concerns about how a built-in AI text detector might affect how its products are perceived and used.)

Model of the week

It seems that generative AI is not just good for memes. MIT researchers are applying it to flag problems in complex systems like wind turbines.

A team at MIT’s Computer Science and Artificial Intelligence Lab has developed a framework called SigLLM that includes a component for converting time-series data—measurements taken repeatedly over time—into text-based input that a generative AI model can process. A user can feed this prepared data into the model and ask it to start identifying anomalies. The model can also be used to predict future time-series data points as part of an anomaly detection process.
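For a sense of what that time-series-to-text step might look like in practice, here is a minimal sketch in Python. It is not SigLLM’s actual code: the serialization format, the prompt wording, and the placeholder query_model function are all assumptions made for illustration.

```python
# Hypothetical sketch of the general idea: serialize numeric sensor readings into
# text, then ask a generative model to flag readings that look anomalous.
# `query_model` is a placeholder, not a real API.

def serialize_series(timestamps, values, decimals=2):
    """Turn paired timestamps/values into one 'timestamp: value' line per reading."""
    return "\n".join(f"{t}: {round(v, decimals)}" for t, v in zip(timestamps, values))

def build_prompt(series_text):
    """Wrap the serialized series in an anomaly-flagging instruction."""
    return (
        "Below are sensor readings from a wind turbine, one per line (timestamp: value).\n"
        "List the timestamps of any readings that look anomalous, and briefly say why.\n\n"
        f"{series_text}"
    )

def query_model(prompt):
    # Placeholder: swap in a call to whichever generative model you use.
    raise NotImplementedError

if __name__ == "__main__":
    timestamps = ["2024-08-01T00:00", "2024-08-01T01:00", "2024-08-01T02:00",
                  "2024-08-01T03:00", "2024-08-01T04:00"]
    values = [12.1, 12.3, 11.9, 87.4, 12.0]  # the 87.4 spike is the obvious outlier
    prompt = build_prompt(serialize_series(timestamps, values))
    print(prompt)  # in practice, this would be passed to query_model(prompt)
```

Rounding the values and keeping one reading per line keeps the prompt compact, since a model’s context window limits how much of a series can be handed over at once.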

The framework didn’t perform exceptionally well in the researchers’ experiments. But if its performance can be improved, SigLLM could, for example, help technicians flag potential problems in equipment such as heavy machinery before they occur.

“Since this is only the first iteration, we didn’t expect to get there on the first try, but these results show that there’s an opportunity to use (generative AI models) for complex anomaly detection tasks,” Sarah Alnegheimish, an electrical engineering and computer science graduate student and lead author of the SigLLM paper, said in a statement.

Grab bag

This month, OpenAI updated ChatGPT, its AI-powered chatbot platform, to a new base model, but it didn’t publish a changelog (well, barely a changelog).

So what to make of the update? What can one make of it, exactly? There’s nothing to go on except anecdotal evidence from subjective testing.

I think Ethan Mollick, a Wharton professor who studies AI, innovation, and startups, had the right take. It’s hard to write release notes for generative AI models because the models “feel” different from one interaction to the next; they rely heavily on vibes. At the same time, people are using, and paying for, ChatGPT. Don’t they deserve to know what they’re getting into?

It’s possible that the improvements are incremental, and OpenAI feels it’s unwise to flag this for competitive reasons. It’s less likely that the model somehow relates to OpenAI’s reported breakthroughs in reasoning. Regardless, when it comes to AI, transparency should be a priority. Without it, there can be no trust — and OpenAI has already lost much of that.