Google’s Veo 3 AI video generator is a slop monger’s dream

Even at first glance, there’s something unusual about the body on the street. The white sheet draping it is almost too pristine, and the officers’ actions lack any real purpose. “We need to clear the street,” one of them insists with a firm hand gesture, yet her lips remain still. It’s definitely AI. But here’s the twist: my prompt didn’t include any dialogue.

Veo 3, Google’s latest AI video creation model, generated that line all by itself. In just the past 24 hours, I’ve created a dozen clips featuring news reports, catastrophic events, and even whimsical cartoon cats with remarkably convincing audio, some of which the model invented spontaneously. It’s more than a bit unsettling and far more advanced than I had anticipated. Although I don’t think it’s about to lead us into a misinformation crisis just yet, Veo 3 seems like a bona fide AI content-generating powerhouse.

Google unveiled Veo 3 at I/O this week, spotlighting its key new feature: the ability to generate sound alongside AI video. “We’re entering a new era of creation,” Google’s VP of Gemini, Josh Woodward, stated during the keynote, describing it as “incredibly realistic.” I wasn’t entirely convinced, but then, a few days later, I used Veo 3 to create a video of a news anchor announcing a fire at the Space Needle. It only required a simple text prompt, a few moments, and a costly subscription to Google’s AI Ultra plan. And you know what? Woodward wasn’t overstating it. It’s remarkably lifelike.

I tried the news anchor prompt after seeing what Alejandra Caraballo, a clinical instructor at Harvard Law School’s Cyberlaw Clinic, was able to produce. One of her clips features a news anchor announcing the death of US Secretary of Defense Pete Hegseth. He is not dead, but the clip is incredibly convincing. A post including a string of videos with AI-generated characters protesting the prompts used to create them has 50,000 upvotes on Reddit. The scenes include disasters, a woman in a hospital bed using a breathing tube, and a character being threatened at gunpoint — all with spoken dialogue and realistic background sounds. Real lighthearted stuff!

Maybe I’m being naive, but after playing around with Veo 3 I’m not quite as concerned as I was at first. For starters, the obvious guardrails are in place. You can’t prompt it to create a video of Biden tripping and falling. You can’t have a news anchor announce the assassination of the president, or even generate a video of a T-shirt-and-chain-wearing tech company CEO laughing while dollar bills rain down around him. That’s a start.

That said, you can generate some troubling shit. Without any clever workarounds I prompted Veo 3 to create a video of the Space Needle on fire. Starting with my own photo of Mount Rainier, I generated a video of it erupting with smoke and lava. Coupled with a clip of a news anchor announcing said disaster, I can see how you could seed some mischief real easily with this tool.

Here’s the better news: it doesn’t seem like a ready-made deepfake machine. I gave it a couple of photos of myself and asked it to generate a video with specific dialogue and it wouldn’t comply. I also asked it to bring a pair of giant boots in a photo to life and have them walk out of the scene; it managed one boot stomping across the sidewalk with some comical crunching noises in the background.

I had an easier time generating videos when my prompts were less specific, which is how I confirmed something my colleague Andrew Marino pointed out: Veo 3 is excellent at creating the kind of lowest-common-denominator YouTube content aimed at kids.

If you’ve never been subjected to the endless pit of garbage on YouTube Kids, let me enlighten you. Imagine watching the worst 3D rendering of a monster truck driving down a ramp, landing in a vat of colored paint. Next to it, another monster truck drives down another ramp into another vat of paint, this time a different color. Now watch that again. And again. And again. There are hours of this stuff on YouTube designed to mesmerize toddlers. These videos, which make Cocomelon look like Citizen Kane, are usually harmless: just empty calories designed to rack up views. In about 10 minutes with Veo 3, I threw together a clip following the same basic formula, complete with jaunty background music. But the clip that’s even more troubling to me is the two cartoon cats on a pier.

I thought it would be funny to have the cats complain to each other that the fish aren’t biting. In just a couple of minutes, I had a clip complete with two cats and some AI-generated dialogue that I never wrote. If it’s this easy to make a 10-second clip, stretching it out to a seven-minute YouTube video would be trivial. In its current form, clips revert to Veo 2 when you try to extend them into longer scenes, which removes the audio. But the way that Google has been pushing these tools forward relentlessly, I can’t imagine it’ll be long before you can edit a full feature-length video with Veo 3.

Honestly, I wonder if this sort of use for AI-generated video is a feature and not a bug. Google showed us some fancy AI-generated video from real filmmakers, including Eliza McNitt, who is working with Darren Aronofsky on a new film with some AI-generated elements. And sure, AI video could be an interesting tool in the right hands. But I think what we’re most likely to see is a proliferation of the kind of bland imagery that AI is so good at generating — this time, in stereo.
