Accelerating QA with GenAI: Our GPT-Engineer Experience
Today’s insights are brought to you by Mateusz Czajka, Chief Delivery Officer at Netguru.
At Netguru, we’re always exploring tools that can supercharge software delivery—without sacrificing quality. That’s where GPT-Engineer comes in.
We set out to answer one question: could GPT-Engineer automate functional testing across key product workflows?
GPT-Engineer is an open-source AI tool designed to generate entire codebases from natural-language prompts.
We focused on common application flows: login, account management, and user registration. We provided GPT-Engineer with structured prompts based on user stories and acceptance criteria, then evaluated the quality and accuracy of the code it produced.
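To give a sense of what "structured" meant in practice, here is a simplified sketch of the kind of natural-language prompt we fed the tool. The user story, routes, and error copy below are illustrative stand-ins, not material from a real client project:

```text
Write pytest functional tests for the login flow of a web application.

User story: As a registered user, I can sign in with my email and
password so that I can reach my account dashboard.

Acceptance criteria:
- Valid credentials redirect the user to /dashboard.
- Invalid credentials show the error "Invalid email or password."

Use Selenium WebDriver with the page object pattern, and read the
application's base URL from the BASE_URL environment variable.
```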
The results were better than expected.
GPT-Engineer was able to generate Python-based test scripts that followed best practices, including modular code structure, reusable components, and understandable naming conventions. In many cases, the output was almost production-ready, requiring only minor adjustments from our QA team.
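To make that concrete, here is a minimal sketch in the spirit of what the tool produced, using pytest and Selenium. The page object import, selectors, and credentials are hypothetical placeholders, not our actual suite:

```python
# test_login.py -- illustrative sketch, not verbatim generated output.
import os

import pytest
from selenium import webdriver

from pages.login_page import LoginPage  # hypothetical page object module


@pytest.fixture
def driver():
    # Headless Chrome keeps the suite CI-friendly.
    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")
    driver = webdriver.Chrome(options=options)
    yield driver
    driver.quit()


def test_valid_login_redirects_to_dashboard(driver):
    page = LoginPage(driver, base_url=os.environ["BASE_URL"])
    page.open()
    page.log_in("qa.user@example.com", "correct-password")
    assert driver.current_url.endswith("/dashboard")


def test_invalid_login_shows_error(driver):
    page = LoginPage(driver, base_url=os.environ["BASE_URL"])
    page.open()
    page.log_in("qa.user@example.com", "wrong-password")
    assert "Invalid email or password" in page.error_text()
```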
Most notably, we saw a reduction of more than 50% in the time required to produce test scripts. That included not just writing the tests themselves, but also setting up page object models, environment configuration, and test execution logic.
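The page object piece of that scaffolding looked roughly like the sketch below. Again, the class, locators, and timeout are illustrative assumptions rather than the generated code itself:

```python
# pages/login_page.py -- illustrative page object; selectors are hypothetical.
from selenium.webdriver.common.by import By
from selenium.webdriver.remote.webdriver import WebDriver
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait


class LoginPage:
    EMAIL = (By.ID, "email")
    PASSWORD = (By.ID, "password")
    SUBMIT = (By.CSS_SELECTOR, "button[type='submit']")
    ERROR = (By.CSS_SELECTOR, ".form-error")

    def __init__(self, driver: WebDriver, base_url: str):
        self.driver = driver
        self.base_url = base_url.rstrip("/")
        self.wait = WebDriverWait(driver, timeout=10)

    def open(self) -> None:
        self.driver.get(f"{self.base_url}/login")

    def log_in(self, email: str, password: str) -> None:
        self.driver.find_element(*self.EMAIL).send_keys(email)
        self.driver.find_element(*self.PASSWORD).send_keys(password)
        self.driver.find_element(*self.SUBMIT).click()

    def error_text(self) -> str:
        # Wait for the error banner so the assertion isn't flaky.
        return self.wait.until(EC.visibility_of_element_located(self.ERROR)).text
```

Reading the base URL from an environment variable, as in these sketches, is what let the same scripts run unchanged against staging and CI environments.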
Of course, this wasn’t a plug-and-play experience. The tool required thoughtful prompting and post-generation validation. It didn’t always align with our preferred architecture out of the box, and it sometimes made incorrect assumptions about business logic. But with guidance, GPT-Engineer became a powerful productivity multiplier.
If you’re exploring how GenAI can fit into your QA or development workflows, I’d be happy to share more about our experiment, what worked, and what to watch out for.
Best,
Mateusz