Riding the vibe: Experiences and Reflections on AI-Generated Code

Over the past months, I've been experimenting with vibe coding while building an AI-agent application in the financial investment space. This journey gave me the chance to explore, hands-on, the true potential and real limitations of using LLMs to generate code.

Vibe coding is currently a buzzword referring to a programming technique where users describe what they need to an LLM, which then produces the code. The "vibe" comes from the fact that you're at the mercy of what the AI writes and don't have full control over the code. Basically, you ride with whatever the AI generates for you.

For this project, I used OpenAI's ChatGPT (o1 and o3-mini-high) and Anthropic's Claude (3.5 Sonnet and 3.7 Sonnet), as well as Cursor, which also relies on Claude.

Small note on the product

I created a product that provides recommendations on buying and selling stocks through the joint, orchestrated work of several specialized agents that handle news sentiment analysis, technical analysis, and the construction of a macroeconomic framework.
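To make the idea concrete, here is a minimal, hypothetical sketch of that kind of orchestration in Python. The agent names, scores, and thresholds are invented for illustration; the real product logic is of course far richer.

```python
from typing import Callable

# A minimal sketch of the orchestration idea -- NOT the actual product code.
# Each "agent" is just a function from a ticker to a score in [-1, 1];
# the bodies, the equal weighting, and the thresholds are all assumptions.

def sentiment_agent(ticker: str) -> float:
    return 0.4   # placeholder: would run news sentiment analysis

def technical_agent(ticker: str) -> float:
    return -0.1  # placeholder: would compute technical indicators

def macro_agent(ticker: str) -> float:
    return 0.2   # placeholder: would apply a macroeconomic framework

AGENTS: list[Callable[[str], float]] = [sentiment_agent, technical_agent, macro_agent]

def recommend(ticker: str) -> str:
    """Average the agent scores and map the result to a recommendation."""
    score = sum(agent(ticker) for agent in AGENTS) / len(AGENTS)
    if score > 0.15:
        return "buy"
    if score < -0.15:
        return "sell"
    return "hold"
```

The appeal of this shape is that each agent stays independently testable, and the orchestration layer is the only place where their outputs are combined.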

Model comparison

The experience of describing a feature or a logical flow you have in mind and watching it be transformed first into architecture and then into code is truly exciting. While both Claude and ChatGPT performed surprisingly well, my experience lines up with what many benchmarks suggest: I found Anthropic's Claude better.

It was more precise during brainstorming, better at keeping context, and more consistent in code generation. The user experience of writing code with Claude (often underrated as a factor) also felt smoother. In the final phase of the project, when I needed stronger debugging support, I switched to Cursor, which ended up being the most helpful coding tool of the three.

Practical advice

Vibe coding is extremely exciting at times and incredibly frustrating at others. Based on my practical experience, here is some advice to help you avoid the mistakes I made:

  1. Give more context than you think is necessary, especially for frontend tasks. Long prompts are your friends.
  2. Document reusable context: as mentioned, you're going to spend a lot of time prompting the AI with information that may be redundant for you but is fundamental for the AI, so you'll repeat yourself a lot. Keep notes with key definitions and explanations so you can quickly reintroduce them into the chat.
  3. Specify when you want to make additions to the code to "modify ONLY that specific part" or "DON'T TOUCH THE REST". Everyone says it, it seems obvious, but it works.
  4. Start small: I found AI-assisted development more effective with something closer to an agile approach than to waterfall. Starting from a small requirement, rather than the complete list of everything you'd like to see, helps the AI generate better output and gives you tighter control over the application. By going in small increments, you can validate and, if necessary, undo and redo extremely quickly (standard agile theory), but what really makes the difference is the speed at which these iterations happen (sometimes minutes). It can be hectic, but I believe it's one of the true revolutions of vibe coding.
  5. If you're not a developer, you're a bit lazy, and you're working with very long code, you'll probably be tempted, say for a small modification, to ask the AI to rewrite all the previous code with the small change applied, so you can copy-paste the entire codebase into the IDE. This isn't a good idea. AIs do have limits on the maximum output length they can generate, and advanced models try to work around this in different ways. OpenAI's models may summarize the code, with the concrete risk of eliminating portions of it, or split it across multiple responses; in the second case, it's up to you to stitch the blocks together, with the risk of running into formatting and indentation problems. Claude, in version 3.7, gets around the limit by generating multiple outputs in the same window, which allows a single copy-paste; even then, however, small unintended changes sometimes creep into the code. With Cursor, I haven't encountered these problems.
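Points 2 and 3 above can even be combined into a tiny helper: keep the reusable context in a file and prepend it, together with a scope guard, to every task prompt. A minimal sketch, where the file name, file contents, and exact wording are my own assumptions:

```python
from pathlib import Path

# Hypothetical file holding the project context you keep re-typing:
# key definitions, architecture notes, naming conventions, etc.
CONTEXT_FILE = Path("project_context.md")

def build_prompt(task: str, context_file: Path = CONTEXT_FILE) -> str:
    """Prepend the saved project context and a scope guard to a task
    description, so every new chat starts from the same baseline."""
    context = context_file.read_text() if context_file.exists() else ""
    scope_guard = (
        "Modify ONLY the part described below. "
        "DO NOT touch the rest of the code.\n"
    )
    return f"{context}\n{scope_guard}\nTask: {task}"
```

Nothing sophisticated, but it turns "repeat yourself a lot" into a one-line call at the start of each session.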

Where AI Shines, and Where It Struggles

It takes just a few hours of testing to notice that LLMs are exceptionally good at writing backend logic and the related architecture. For my app, I used Python, and the quality of the output was truly surprising. Conversely, when I moved on to building the interface and front-end logic, I didn't find the same level of quality: the outputs were often imprecise relative to the prompt or, in the worst cases, non-functional. Regressions were also more frequent.

Something as seemingly simple as styling a dashboard CTA ended up breaking the entire navbar and disconnecting the CTAs from their underlying logic.

The real madness: debugging

It's a huge mistake to think that AI-written code is free of bugs; in fact, the longer the code, the higher the risk that something won't work as hoped, or won't work at all. The most frequent problems I encountered were Python indentation issues and regressions. The real problem when vibe coding as a non-developer is that you can do very little about bugs except ask the AI itself to solve them, and this is where the trouble begins. AIs are not good at fixing problems in code they themselves have written, and I can assure you that writing "There's this error, SOLVE IT!" won't resolve the situation. It's common for the AI to claim it has identified the problem just to please you, while in reality the problem is still there. At that point all the euphoria can fade away, and debugging becomes the least "vibey" part of the whole experience. Here are two strategies that helped me:

  1. Ask multiple models to review the same problem: it may seem like an incredibly basic approach, but in my case it sometimes worked. It's plausible that the contextual memory of these models somehow locks them into addressing a problem in one particular way, causing them to keep proposing the same broken code.
  2. Learn some basic debugging skills: it's really the only way to solve the biggest problems. If you can form hypotheses about the causes, report errors from the console, and dig into logs for things that don't add up, the AI can use this information to proceed with a fix. This obviously moves away from the magic-wand idea because, besides being tiring, it requires a minimum level of experience. The good news is that even if you report nonsense, no one will judge you.
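For point 2, even a tiny amount of tooling helps. Here is a sketch of turning a Python exception into a structured report to paste into the chat instead of a bare "SOLVE IT"; the report wording and the example error are illustrative:

```python
import traceback

def error_report(source_hint: str = "") -> str:
    """Turn the exception currently being handled into a report you can
    paste into the chat. Must be called inside an except block."""
    tb = traceback.format_exc()
    return (
        "I ran the code and got this traceback:\n"
        f"{tb}\n"
        f"Context: {source_hint}\n"
        "Please explain the likely cause before proposing a fix."
    )

# Example usage with a deliberately broken computation:
try:
    result = 1 / 0
except ZeroDivisionError:
    report = error_report("happens when the technical agent receives an empty price list")
```

Asking the model to explain the cause before fixing it also makes it harder for it to "please the interlocutor" with a confident non-fix.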

[Image: the number of memes about vibe debugging speaks louder than a thousand words]

A Subtle Trap: Silent Code Simplification

Another issue worth mentioning occurs when dealing with large code blocks. ChatGPT in particular would sometimes "help" by silently simplifying the code, removing functionality without warning. If you’re lucky, you’ll notice the code is 200 lines shorter. If not, you might catch it during testing… or never at all. Claude and Cursor didn’t have this issue in my experience.
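A cheap guard against this, assuming you keep the previous version of each file before pasting in the AI's rewrite, is to diff the two versions and flag large silent deletions. A sketch using Python's standard difflib; the threshold is an arbitrary assumption to tune to your codebase:

```python
import difflib

def summarize_changes(old_code: str, new_code: str) -> dict:
    """Compare the code the AI returned with the previous version and
    flag changes that remove far more lines than they add."""
    diff = list(difflib.unified_diff(
        old_code.splitlines(), new_code.splitlines(), lineterm=""))
    removed = sum(1 for line in diff
                  if line.startswith("-") and not line.startswith("---"))
    added = sum(1 for line in diff
                if line.startswith("+") and not line.startswith("+++"))
    return {
        "lines_removed": removed,
        "lines_added": added,
        # Arbitrary heuristic: a small edit shouldn't delete 20+ net lines.
        "suspicious": removed > added + 20,
    }
```

Running this before committing a pasted-in rewrite catches the "200 lines shorter" case immediately instead of during testing, or never.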

Is this the best?

Vibe coding, as the term indicates, assumes that you accept giving autonomy to the AI in writing code and that you ride this vibe. It's all beautiful, and the results are incredible, but I've always had a constant worry: is this really the best way to write this code?

From a proof-of-concept or MVP perspective, the question is negligible given the objectives, but where you need to guarantee scalability, security, and speed, not being 100% sure that your product is written in the best possible way doesn't leave you completely at ease.

Will Vibe Coding Replace Developers?

Short answer: probably not, at least not any time soon.

It's natural to also reflect on the impact of this approach on the work of software engineers. From my perspective, AI-generated code is an incredibly fast way to arrive at satisfactory products in a short time and without a team, but as always, what really makes the difference is the quality and scalability of what you build. Even if the AI produces excellent code, you will always need a software engineer to confirm that it's excellent code; for that reason alone, software engineers remain fundamental.

I believe developers will evolve into higher-level architects: people who deeply understand product logic, business needs, and how to orchestrate systems. The ability to debug is also precious at the moment. What may become less relevant is the need to manually write low-level code.

Final Thoughts

My experience can only be positive given the results. However, this evaluation is only honest if you consider that I have the right background: I've worked for years in digital innovation, and while I'm not a developer, I've been involved in product development, architecture discussions, and QA processes. This means I had the right toolkit both to reach a good product and to do the troubleshooting. For the completely uninitiated, the experience will probably be incredibly complex and not satisfying at all.

Ironically, I think developers will get the most out of vibe coding as there will be fewer vibes and more ability to control the code – which, as you may have guessed by now, is the key to this approach.

Francesco Santo
Data Analyst & Strategy Manager

Very interesting article; my experience was very similar. Debugging is the real beast of vibe coding (or in my case, not being a developer and having no coding skills, I'd call it lamer coding 😂). Sometimes 10 minutes were enough to get the Python code I needed, but it took more than 3 days to fix the bugs. Besides trying the strategies you describe, I ended up searching for solutions on websites and forums and, not knowing exactly how to implement them, fed what I found back to ChatGPT. More than once this worked, breaking the loop you get into with AIs: they propose solution A, it doesn't work, so they propose solution B, it doesn't work, so they propose solution A again. Fortunately everything is constantly evolving, and compared to six months ago the number of bugs returned has dropped noticeably.