The Future of AI Agents -- My Deep Dive into AI Tooling, AGI, and Why We May Be Doing It Wrong

Introduction: I Went Down the AI Rabbit Hole and This Is What I Found

I recently went down a rabbit hole, diving back into coding and experimenting with AI tooling like Bolt, Replit, Lovable, VS Code Copilot (GPT-4o), and GitHub Copilot. I wanted to see how much time I could actually save by leveraging AI to generate a distributed full-stack app (UI frontend, backend API service, and a Postgres database, with third-party AI integrations like Claude), all in about 4-6 hours.

Sounds great, right? Except the reality wasn’t as smooth as people love to claim.

What started as an experiment quickly turned into me fighting against AI’s limitations, spending more time fixing hallucinations, debugging, and restructuring bad AI-generated logic than I would have spent coding it myself.

I’ll walk through exactly what I ran into, why today’s AI coding workflows are deeply flawed, and where we need to go if AI agents and AGI are ever going to replace traditional development.

AI Coding: Speed at a Cost

I started by defining specs for an application, detailing user needs, API interactions, and workflows. I spent about an hour making sure I covered 80% of core needs upfront—because I knew AI tools wouldn’t magically intuit what I wanted. The plan was to have the UI hosted on one platform, the API on another, and a Postgres database running separately—a distributed architecture, not some basic no-code hack.

Then I tested AI’s ability to implement it. One massive prompt? Nope. Every AI tool I tested (Bolt, Replit, Lovable) barfed and hallucinated non-functional garbage during this phase. Code was generated, but it was broken or wouldn’t build at all, and the agents got stuck in loops trying to fix it. So I had to break things down into a series of highly detailed, structured prompts for each use case. Another hour of work!
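
For illustration, one of those per-use-case prompts looked roughly like this (the stack details and component names here are made up for the example, not my actual spec):

```
Use case: Chat panel message history
Context: React UI, Node/Express API, Postgres; Claude handles chat completions.
Task: Add auto-scroll so the newest message is always visible.
Constraints:
  - Only touch the chat panel component and its stylesheet.
  - Do not change existing imports, props, routes, or color variables.
  - Do not modify the API service or the database schema.
Done when: the app builds, existing behavior is unchanged, and new messages scroll into view.
```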

Once I structured my asks correctly, AI started generating code that kind of worked—but not exactly.

AI Fixes One Thing, Breaks Three

  • The UI was a mess. Trying to have AI mock up the user experience? Total disaster once you added more functionality beyond the initially generated UI.
  • It would “fix” something simple like scrolling text in a chatbot and, in the process, wipe out colors, remove form elements, or break responsiveness.
  • The AI was constantly deleting valid imports and overwriting working code with garbage, making manual debugging a nightmare.
  • Looping issues --> AI tools would get stuck in cycles, trying to solve the same problem over and over, timing out instead of adjusting approaches.
  • Hallucinations "ugh" --> At one point, Copilot generated an API service, but instead of implementing my actual spec, it wrote documentation about a completely fictional API service, included “typical” OAuth and API key flows from AWS, Google, and Microsoft, and embedded that into the app as if it were real.
  • Code deletions and overwrites "Almost cancelled my subscription!" --> Replit and VS Code Copilot repeatedly deleted necessary code and overwrote working functions, leaving comments like "// rest of component logic remains the same...." where code had been removed. (I was being mocked by the machine, I tell you!) In some cases, the agent claimed to have modified files, but no actual changes were made. This forced me to rely heavily on Git versioning, as AI-generated changes often disrupted working logic and required frequent rollbacks and manual corrections (the checkpoint-and-rollback routine I fell back on is sketched just below).
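
Here’s roughly that routine, assuming nothing more than a plain Git repo (branch and commit names are just placeholders):

```bash
# Checkpoint before handing the repo to the AI, so a bad generation is cheap to undo
git checkout -b ai-experiment
git add -A && git commit -m "checkpoint before prompt"

# After the agent runs: verify it actually changed what it claims it changed
git status
git diff

# Keep the result, or throw it away and return to the checkpoint
git add -A && git commit -m "keep AI changes"   # if it worked
git reset --hard HEAD                           # if it broke working code
```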

Had I not had previous dev experience, I’d have been stuck for hours. Many PMs experimenting with AI-generated code will reach this exact point, struggle to debug the AI’s unpredictable behavior, and have no path forward.

This is where AI coding breaks down: it can write code, but it doesn’t always fully understand what it wrote, and without the engineering intuition and experience to re-prompt effectively, making sense of it is nearly impossible for PMs who have never coded.

AI speeds up initial development but often breaks existing functionality, making debugging a nightmare. PMs and non-engineers attempting AI-generated projects will likely hit major roadblocks due to AI’s unpredictable behavior, hallucinations, and lack of structure.

AGI and the Future of AI Agents: Why We’re Doing This Wrong

AI-generated code still relies on traditional human programming languages like Python and JavaScript. But this approach is just a temporary bridge: we’re forcing AI to operate within the constraints of legacy programming paradigms instead of rethinking how AI agents should function in a truly autonomous ecosystem.

The Wrong Path: AI Writing Traditional Code

  • AI shouldn’t need to write Java or Python. It should be able to self-discover the agents it needs and communicate natively.
  • Instead of APIs and hardcoded protocols, AI should dynamically negotiate communication protocols (a toy sketch of what that negotiation might look like follows this list).
  • Instead of traditional software architectures, AI should operate at the machine level without human-readable code.
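
To make that second point concrete, here’s a deliberately toy sketch in Python (a hypothetical illustration, not a real protocol or library): agents advertise their capabilities and supported message formats, then settle on a shared format at runtime instead of calling a hardcoded API.

```python
# Toy sketch: agents advertise what they can do and which message formats they
# speak, then agree on a common format at runtime. All names are illustrative.
from dataclasses import dataclass, field

@dataclass
class Agent:
    name: str
    formats: set                        # message formats this agent speaks
    capabilities: set = field(default_factory=set)

    def negotiate(self, other: "Agent") -> str:
        """Pick a message format both agents understand, or fail loudly."""
        common = self.formats & other.formats
        if not common:
            raise RuntimeError(f"{self.name} and {other.name} share no common format")
        return sorted(common)[0]

booking = Agent("booking", {"json-v1", "msgpack-v2"}, {"search_flights"})
payment = Agent("payment", {"json-v1"}, {"charge_card"})

print(booking.negotiate(payment))       # -> json-v1
```

The point isn’t the dozen lines of Python; it’s that the contract between the two agents is discovered at runtime rather than baked into an API spec a human wrote.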

Right now, AI is simply an accelerator for traditional development, but that’s not where AGI needs to go. True AI agents shouldn’t be generating and managing Python files. Agents should be creating their own optimized execution layers that don’t require human-written syntax at all but still require human oversight to ensure quality and prevent them from going rogue.

What the Future Looks Like: AI Agents That Just Exist & Act

Imagine a world where AI agents handle entire workflows without human intervention or predefined software rules.

Take an airline ticket booking as an example:

User says: “I want to fly to Miami for Christmas. Book me a flight.”

Your AI agent: Determines the best price, availability, and options by talking to other AI agents.

Other agents coordinate: Payment processing, airline reservations, and TSA clearance—seamlessly.

User receives a simple confirmation: No app needed, no manual check-ins, no booking pages.

At the airport: The user walks up, scans their eye/hand, and boards the flight.

Notice what’s missing? No UI-heavy apps, no long-winded workflows, no Java, no Python. Only seamless agent-to-agent interactions that adapt and automatically learn/heal from any hiccups on the fly.
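
A deliberately tiny sketch of that hand-off, with every agent, method, and value made up purely for illustration:

```python
# Hypothetical booking flow: a user-facing agent delegates to specialist agents
# it finds in a registry. In the vision above, intent parsing and agent
# discovery would be learned at runtime rather than hardcoded like this.
class FlightAgent:
    def find_best(self, dest: str, date: str) -> dict:
        return {"flight": "XX123", "dest": dest, "date": date, "price": 240}

class PaymentAgent:
    def charge(self, amount: int) -> str:
        return f"payment-confirmed-{amount}"

REGISTRY = {"flights": FlightAgent(), "payments": PaymentAgent()}

def user_agent(request: str) -> str:
    flight = REGISTRY["flights"].find_best("MIA", "2025-12-24")
    receipt = REGISTRY["payments"].charge(flight["price"])
    return f"Booked {flight['flight']} to {flight['dest']} ({receipt})"

print(user_agent("I want to fly to Miami for Christmas. Book me a flight."))
```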
This is where we are going, but so many people selling courses and giving talks are missing the fact that this is the vision and the end outcome! I’m all for a quick $ if you have the ability and time, but let’s be honest and forthcoming about it!

The Transition Phase: We’re Stuck Using Old Tools (For Now)

Of course, we can’t jump straight into fully autonomous AI agents overnight. Right now, we’re forced to rely on human-readable programming languages and API-driven architectures because that’s all we know.

But the long-term shift is clear:

  • AI must become self-discovering, capable of dynamically identifying and coordinating with other agents.
  • AI must remove human software limitations instead of conforming to them.
  • AI should operate at a protocol and execution level that doesn’t require traditional coding practices.

Bottom line?

We’re forcing AI into a mold that won’t exist in the future. AGI shouldn’t be geared towards learning how to write Python faster! It should be focused on eliminating the need for human-written code altogether.

We are currently training AI to operate within outdated software paradigms. AGI won’t generate Python code. Instead, it will dynamically discover, adapt, and execute tasks in a way that eliminates the need for traditional programming languages altogether. The future is agent-driven, with minimal human intervention, and we may currently be doing it wrong!
