Riding the vibe: Experiences and Reflections on AI-Generated Code
Over the past months, I've been immersed in vibe coding while building an application based on AI agents in the financial investment space. This journey gave me the chance to explore, hands-on, the true potential and real limitations of using LLMs to generate code.
Vibe coding is currently a buzzword referring to a programming technique where users describe what they need to an LLM, which then produces the code. The "vibe" comes from the fact that you're at the mercy of what the AI writes and don't have full control over the code. Basically, you ride with whatever the AI generates for you.
For this project, I used OpenAI ChatGPT (o1 and o3-mini-high) and Claude (3.5 Sonnet and 3.7 Sonnet), as well as Cursor, which also relies on Claude.
Small note on the product
I created a product that provides buy and sell recommendations for stocks by orchestrating the joint work of several specialized agents, which handle news sentiment analysis, technical analysis, and the creation of a macroeconomic framework.
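To make the idea of orchestrated agents more concrete, here is a minimal sketch of how such a setup could be wired together. It is not the actual product code: the agent functions, the scores, the weights, and the threshold are all hypothetical placeholders, and a real agent would call an LLM or a market data source instead of returning a fixed value.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Signal:
    agent: str     # name of the agent that produced the signal
    score: float   # -1.0 (strong sell) to +1.0 (strong buy)
    weight: float  # how much the orchestrator trusts this agent


# Hypothetical stand-ins for the three specialized agents.
# The returned numbers are invented purely for illustration.
def news_sentiment_agent(ticker: str) -> Signal:
    return Signal("news_sentiment", 0.6, 0.3)


def technical_analysis_agent(ticker: str) -> Signal:
    return Signal("technical_analysis", -0.2, 0.5)


def macro_framework_agent(ticker: str) -> Signal:
    return Signal("macro_framework", 0.1, 0.2)


Agent = Callable[[str], Signal]

DEFAULT_AGENTS: Sequence[Agent] = (
    news_sentiment_agent,
    technical_analysis_agent,
    macro_framework_agent,
)


def recommend(
    ticker: str,
    agents: Sequence[Agent] = DEFAULT_AGENTS,
    threshold: float = 0.15,  # assumed cut-off, not taken from the article
) -> tuple[str, float]:
    """Run every agent and merge the signals into a weighted average."""
    signals = [agent(ticker) for agent in agents]
    total_weight = sum(s.weight for s in signals)
    combined = sum(s.score * s.weight for s in signals) / total_weight
    if combined > threshold:
        return "BUY", combined
    if combined < -threshold:
        return "SELL", combined
    return "HOLD", combined


if __name__ == "__main__":
    action, score = recommend("ACME")  # "ACME" is a made-up ticker
    print(action, round(score, 2))
```

Keeping each agent behind the same small interface (a function from ticker to `Signal`) is one way to let the orchestrator stay unchanged when an agent is added or rewritten.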
Model comparison
The experience of describing a feature or a logical flow you have in mind and seeing how this feature is transformed first into architecture and then into code is truly exciting. While both Claude and ChatGPT performed surprisingly well, my experience lines up with what many benchmarks suggest: I found Anthropic's Claude better.
Claude was more precise during brainstorming, better at keeping context, and more consistent in code generation. The user experience of writing code with Claude (often underrated as a factor) also felt smoother. In the final phase of the project, when I needed stronger debugging support, I switched to Cursor, which has proven to be the most helpful coding tool for me so far.
Practical advice
Vibe coding is an extremely exciting experience at times and incredibly frustrating at others. Based on my practical experience, I'd like to share some advice to help you avoid some of the mistakes I made:
Where AI Shines, and Where It Struggles
It takes just a few hours of testing to notice that LLMs are exceptionally good at writing backend logic and the related architecture. For my app, I used Python, and the quality of the output was truly surprising. Conversely, when I moved on to building the interface and front-end logic, I didn't find the same level of quality: the outputs were often imprecise relative to the prompt or, in the worst cases, non-functional. Regressions were also more frequent.
Something as seemingly simple as styling a dashboard CTA ended up breaking the entire navbar and disconnecting the CTAs from their underlying logic.
The real madness: debugging
It's a huge mistake to think that all code written by AIs is free from bugs; in fact, the longer the code, the higher the risk that something won't work as hoped or won't work at all. The most frequent problems I encountered were Python indentation errors and regressions. The real problem with vibe coding as a non-developer is that you can do very little when there are bugs, except ask the AI itself to solve them, and this is where the trouble begins. AIs are not good at solving problems in code they have written themselves, and I can assure you that writing "There's this error, SOLVE IT!" won't be enough to resolve the situation. In fact, it's common for the AI to claim it has identified the problem just to please you, while in reality the problem is still there. At that point, all the euphoria can completely fade away, and debugging becomes the least "vibey" part of the whole experience. Here are two strategies that have helped me:
A Subtle Trap: Silent Code Simplification
Another issue worth mentioning occurs when dealing with large code blocks. ChatGPT in particular would sometimes "help" by silently simplifying the code, removing functionality without warning. If you’re lucky, you’ll notice the code is 200 lines shorter. If not, you might catch it during testing… or never at all. Claude and Cursor didn’t have this issue in my experience.
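One way to catch this kind of silent simplification without reading every line is to compare which functions and classes exist before and after the AI rewrites a file. The sketch below is only an illustration of that idea, using Python's standard `ast` module; the helper names and the two sample snippets are made up for the example, and it only detects definitions that disappeared, not logic removed from inside a function that still exists.

```python
import ast


def defined_names(source: str) -> set[str]:
    """Collect the names of all functions and classes defined in the source."""
    tree = ast.parse(source)
    return {
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
    }


def missing_after_rewrite(old_source: str, new_source: str) -> list[str]:
    """Return definitions present in the old version but absent from the new one."""
    return sorted(defined_names(old_source) - defined_names(new_source))


# Hypothetical "before" and "after" versions of a module.
OLD = '''
def fetch_prices(ticker):
    return [1, 2, 3]

def moving_average(prices):
    return sum(prices) / len(prices)
'''

NEW = '''
def fetch_prices(ticker):
    return [1, 2, 3]
'''

if __name__ == "__main__":
    print(missing_after_rewrite(OLD, NEW))  # prints ['moving_average']
```

Because `ast.parse` raises a `SyntaxError` on malformed code (an `IndentationError` is a subclass of it), running the same check on freshly generated code also flags indentation problems before the program is executed.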
Is this the best?
Vibe coding, as the term indicates, assumes that you accept giving autonomy to the AI in writing code and that you ride this vibe. It's all beautiful, and the results are incredible, but I've always had a constant worry: is this really the best way to write this code?
From a proof-of-concept or MVP perspective, the question is negligible given the objectives. But when you need to guarantee scalability, security, and speed, not being 100% sure that your product is written in the best possible way doesn't leave you completely at ease.
Will Vibe Coding Replace Developers?
Short answer: Probably not, at least not in the short term.
It's natural to also reflect on the impact of this approach on the work of software engineers. From my perspective, AI-generated code is an incredibly fast way to arrive at satisfactory products in a short time and without teams, but as always, what really makes the difference is the quality and scalability of what you build. Even if the AI produces excellent code, you will always need a software engineer to confirm that it is excellent, so software engineers are, for this reason alone, fundamental.
I believe developers will evolve into higher-level architects: people who deeply understand product logic, business needs, and how to orchestrate systems. Obviously, the ability to debug is also precious at the moment. What may become less relevant is the need to manually write low-level code.
Final Thoughts
Given the results, I can only judge my experience as positive. However, this evaluation is only honest if one considers that I have the right background: I’ve worked for years in digital innovation, and while I’m not a developer, I’ve been involved in product development, architecture discussions, and QA processes. This means that I had the right toolkit both to make sure I ended up with a good product and to troubleshoot. For those who are completely uninitiated, the experience will probably be incredibly complex and not satisfying at all.
Ironically, I think developers will get the most out of vibe coding as there will be fewer vibes and more ability to control the code – which, as you may have guessed by now, is the key to this approach.
Data Analyst & Strategy Manager
1mo Very interesting article; my experience has been very similar. Debugging is the real bête noire of vibe coding (or, in my case, since I'm not a developer and have no coding skills, I'd call it lamer coding 😂). Sometimes 10 minutes were enough to get the Python code I needed, but it took more than 3 days to fix the bugs. Besides trying the strategies you described, I ended up searching for solutions online on sites and forums and, not really knowing how to implement them, I fed the solutions I found back to ChatGPT. More than once this worked, letting me break out of the loop that forms with AIs: they propose solution A, it doesn't work, so they propose solution B, it doesn't work, and then they propose solution A again. Fortunately everything is constantly evolving, and compared with just 6 months ago the number of bugs returned has already dropped considerably.
Data Analyst & Strategy Manager
1mo Saving this to read over the weekend 📖
Digital Marketing Manager - Corporate Communication @ Stellantis
1mo Very interesting and well written! Thanks Santolo!