AI agents are science fiction not yet ready for primetime
Share this @internewscast.com

Introducing The Stepback, your go-to weekly update that breaks down a key story from the tech landscape. Keep up with the latest in AI by following Hayden Field. Subscribers receive The Stepback every week at 8 AM ET. Interested? Subscribe here.

It all started with J.A.R.V.I.S. Yes, that J.A.R.V.I.S. The one from the Marvel movies.

Perhaps Iron Man’s AI assistant wasn’t the beginning of it all, but it certainly popularized the idea of an AI agent. In conversations with AI experts about agentic AI, J.A.R.V.I.S. often emerges as the quintessential example of an ideal AI solution: one that anticipates your needs, processes and interprets vast data sets, offers tactical guidance, and manages specific aspects of your workflow. Definitions of AI agents differ, but at its core, an agent is a step beyond a chatbot: a system designed to execute complex, multistep tasks autonomously, without constant interaction with you. In effect, it drafts its own list of the tasks needed to achieve your desired outcome. That dream is edging closer to reality, but issues remain in making agents practically useful for everyday users, and some of those challenges may never be fully resolved.

The term “AI agent” has existed for some time, but its popularity surged in the tech community in 2023. That year, AI agents became a buzzword as people explored the concept and pathways to realization, yet tangible success stories were sparse. In the subsequent year, 2024, the focus shifted to the implementation phase—teams actively deployed AI solutions into real-world settings to test their capabilities. However, results were modest and often accompanied by numerous error notifications.

The enthusiasm around AI agents can be traced back to a milestone announcement: In February 2024, fintech firm Klarna disclosed that within a month, its AI assistant, leveraging OpenAI technology, had matched the output of 700 full-time customer service representatives, automating two-thirds of the company’s service interactions. These data points dominated discussions in the AI circles I encountered for some time.

The momentum didn’t falter, and soon after, tech industry leaders were consistently referencing AI agents in their financial reports. Executives from Amazon, Meta, Google, Microsoft, and various other tech giants began emphasizing their dedication to developing effective and efficient AI agents—and backing those promises with financial investments and strategic initiatives.

The ultimate ambition for AI agents included performing tasks spanning from organizing travel to crafting visual aids for business purposes. The perfect AI assistant would ideally identify optimal schedules and venues for gatherings, taking into account all participants’ availability, dietary preferences, and constraints. It would manage everything—right from securing dining reservations to arranging calendar updates for everyone involved.

Now let’s talk about the “AI coding” of it all: For years, AI coding has been carrying the agentic AI industry. If you asked anyone about real-life, successful, not-annoying use cases for AI agents happening right now and not conceptually in a not-too-distant future, they’d point to AI coding — and that was pretty much the only concrete thing they could point to. Many engineers use AI agents for coding, and they’re seen as objectively pretty good. Good enough, in fact, that at Microsoft and Google, up to 30 percent of the code is now being written by AI agents. And for startups like OpenAI and Anthropic, which burn through cash at high rates, one of their biggest revenue generators is AI coding tools for enterprise clients.

So until recently, AI coding has been the main real-life use case of AI agents, but obviously, that doesn’t cater to the everyday consumer. The vision, remember, was always a jack-of-all-trades sort of AI agent for the “everyman.” And we’re not quite there yet, but in 2025, we’ve gotten closer than we’ve ever been before.

In October 2024, Anthropic kicked things off by introducing “Computer Use,” a tool that allowed Claude to use a computer like a human might: browsing, searching, accessing different platforms, and completing complex tasks on a user’s behalf. The general consensus was that the tool was a step forward for technology, but reviews said that in practice, it left a lot to be desired. Fast-forward to January 2025, and OpenAI released Operator, its version of the same thing, and billed it as a tool for filling out forms, ordering groceries, booking travel, and creating memes. Once again, in practice, many users agreed that the tool was buggy, slow, and not always efficient. But again, it was a significant step. The next month, OpenAI released Deep Research, an agentic AI tool that could compile long research reports on any topic for a user, and that moved things forward, too. Some people said the research reports were more impressive in length than content, but others were seriously impressed. And then in July, OpenAI combined Deep Research and Operator into one AI agent product: ChatGPT Agent. Was it better than most consumer-facing agentic AI tools that came before? Absolutely. Was it still tough to make work successfully in practice? Absolutely.

So there’s a long way to go to reach that vision of an ideal AI agent, but at the same time, we’re technically closer than we’ve ever been before. That’s why tech companies are putting more and more money into agentic AI, by way of investing in additional compute, research and development, or talent. Google recently hired Windsurf’s CEO, cofounder, and some R&D team members, specifically to help Google push its AI agent projects forward. And companies like Anthropic and OpenAI are racing each other up the ladder, rung by rung, to introduce incremental features to put these agents in the hands of consumers. (Anthropic, for instance, just announced a Chrome extension for Claude that allows it to work in your browser.)

So really, what happens next is that we’ll see AI coding continue to improve (and, unfortunately, potentially replace the jobs of many entry-level software engineers). We’ll also see the consumer-facing agent products improve, likely slowly but surely. And we’ll see agents used increasingly for enterprise and government applications, especially since Anthropic, OpenAI, and xAI have all debuted government-specific AI platforms in recent months.

Overall, expect to see more false starts, starts and stops, and mergers and acquisitions as the AI agent competition picks up (and the hype bubble continues to balloon). One question we’ll all have to ask ourselves as the months go on: What do we actually want a conceptual “AI agent” to be able to do for us? Do we want them to replace just the logistics or also the more personal, human aspects of life (e.g., helping write a wedding toast or a note for a flower delivery)? And how good are they at helping with the logistics vs. the personal stuff? (Answer for that last one: not very good at the moment.)

  • Besides the astronomical environmental cost of AI — especially for large models, which are the ones powering AI agent efforts — there’s an elephant in the room. And that’s the idea that “smarter AI that can do anything for you” isn’t always good, especially when people want to use it to do… bad things. Things like creating chemical, biological, radiological, and nuclear (CBRN) weapons. Top AI companies say they’re increasingly worried about the risks of that. (Of course, they’re not worried enough to stop building.)
  • Let’s talk about the regulation of it all. A lot of people have fears about the implications of AI, but many aren’t fully aware of the potential dangers posed by uber-helpful, aiming-to-please AI agents in the hands of bad actors, both stateside and abroad (think: “vibe-hacking,” romance scams, and more). AI companies say they’re ahead of the risk with the voluntary safeguards they’ve implemented. But many others say this may be a case for an external gut-check.


