ChatGPT Likes Our Code! ChatGPT AI Agent Code Review

[Header image: ChatGPT AI agent code review - overall assessment]


We recently tested ChatGPT-based code review assistants using C/C++ source from our company's GitHub codebase, currently in production and deployed to dozens of customers. We used C/C++ for two reasons: (i) for high-performance, optimized systems (e.g. telecom media, robotics), C/C++ still produces the fastest executable output, and (ii) C/C++ is well known for its vulnerability risks, making expert coding and code review an ongoing requirement. We found the agent results both immediately useful and helpful in understanding likely trends in coding.

[Image: code review excerpt]

Results - Pros

AI agent code review recommendations focused on readability and safety, and produced several valid points and suggestions. The suggestions were sensible and detailed; examples included using snprintf() instead of printf(), defining additional numerical constants, adding helper functions to make code more readable, and adding error handling. These changes are not crucial for functionality, as the code we submitted has been deployed and in wide use for many years, but for purposes of making our published open source as "customer ready" as possible they should be implemented.
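
To make those suggestions concrete, here is a minimal sketch -- not taken from our codebase, names invented for illustration -- of the style of change the agent recommended: a named constant replacing a magic number, a bounded snprintf() write with an explicit buffer size, and a checked return value:

#include <stdio.h>

#define MAX_LABEL_LEN 64  /* named constant replaces a hard-coded buffer size */

/* bounded, checked formatting helper; snprintf() cannot overflow buf */

static int format_session_label(char* buf, size_t buflen, int session_id) {

   int n = snprintf(buf, buflen, "session %d", session_id);

   if (n < 0 || (size_t)n >= buflen) return -1;  /* error handling: encoding failure or truncation */

   return n;
}

int main(void) {

   char label[MAX_LABEL_LEN];

   if (format_session_label(label, sizeof(label), 42) < 0) fprintf(stderr, "label formatting failed\n");
   else printf("%s\n", label);

   return 0;
}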

A pleasant surprise was the detail and organization of the recommendations. It looks to me like agent recommendations can now be assigned to junior engineers: review code with AI tools, make the suggested improvements step by step, and perform regression tests after each step. This would make a coding team more efficient; less experienced engineers would need less hand-holding and supervision, as their tasks would be clearly delineated. Senior engineers could look at the initial review, sign off on changes (possibly making exceptions, such as the extern references example below), step out of the way, and re-enter the picture for review as each step gets knocked out.

In addition to improving team efficiency, increased review granularity and detail will help with "code discovery". For instance, there have been many times I needed something, say a small C++ class that does some very specific, well-defined task -- that should be simple, right? But I often had to look long and hard to find one that was well written, at least somewhat documented, included error handling, etc. That was time consuming. Asking an AI agent to help with this makes complete sense and will save time.

Results - Cons

One criticism I have is that in areas the AI agent "doesn't like", it should be able to read and understand associated comments to see if there is good reason for the non-recommended coding practice. For example, our submission contained a number of externs (references to global variables), as that code is a section of a larger codebase that was re-organized for readability purposes. In this case there is a comment in place:

/* as noted in Revision History, code was split from mediaMin.cpp; the following
   extern references are necessary to retain tight coupling with related source
   in mediaMin.cpp. There are no multithread or concurrency issues in these
   references */

The purpose of the comment is to make clear that in this particular case "tight coupling" is an intentional design choice, not a liability, but unfortunately that seemed to have no effect on the AI agents. At minimum the comment should be accounted for and noted in the code review, so it doesn't appear the externs were placed haphazardly.
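
For illustration, the pattern in question looks roughly like this (hypothetical variable names, not the actual mediaMin.cpp source):

/* mediaMin.cpp -- globals defined in the original, larger source file */

int nSessionCount = 0;        /* hypothetical name, for illustration only */
bool fQuitRequested = false;

/* split-out source file -- extern references retain tight coupling with
   mediaMin.cpp, per the comment above */

extern int nSessionCount;     /* defined in mediaMin.cpp */
extern bool fQuitRequested;   /* defined in mediaMin.cpp */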

Another criticism is that AI agents need a better understanding of header comments, for example revision history, which can be crucially important when debugging issues that show up over time. Revision history is mentioned in the review, but its analysis should be further granularized: graded on consistency, detail, attribution (who made the change), and date/time organization.
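
As a hypothetical example, a revision history header an agent could grade on those criteria might look like this (entries and initials invented for illustration):

/* Revision History

   Created Mar 2021 JHB, split from mediaMin.cpp into a separate source file
   Modified Jun 2021 JHB, add extern references to mediaMin.cpp globals (see comment below)
   Modified Jan 2022 CR, add error handling on session create/delete
*/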

Takeaways

First, for anyone expecting AI agents to solve performance and stability issues -- for example intermittent crashes, performance bottlenecks, or thread contention issues that occur one out of every 1B test runs -- that isn't likely to happen any time soon. Performance and stability debugging is much harder because it quickly gets into I/O throughput, threads, and other OS interaction. Expert senior engineers with debug experience know what to look for and what to try; even the smallest clues are crucial. Furthermore, they know how to frame the problem, create a minimum reproducible example, and ask appropriately on forums such as Stack Overflow. AI agents will not replace this level of expertise in the near term.

Second, if my junior engineer assessment above is accurate, it might mean that fears of coding job losses at the junior level are overblown.

What I find most promising is that I now fully expect GitHub at some point to display, by default, a rating or "grading" of each source code repository. GitHub already publishes several metrics, such as percentage breakdown of source code types, contribution / update rates, issue resolution rates, etc. Publishing source code review results will favor organizations and companies that invest robustly in documentation, both in source code and in formal API pages. Such efforts are painstaking, detailed work, but pay off over time, gradually increasing source code reliability and value.

I expect potential users and customers to run their own code review analysis (or use GitHub's, when implemented) before making license decisions. In the case of commercial licenses, why would a careful CFO office risk purchasing code that AI tools deem unorganized, unstructured, or undocumented? I can see upper management teams enforcing rules when their engineering/tech groups come to them with usage or license purchase decisions. Even in closed source situations, a customer might say "we want to see a cross-section of your source under strict NDA; either we or a third party will run AI analysis on it using private LLMs". And if the results are not solid, no sale.

Summary

Any source code intended for widespread free usage or licensed usage will need to generate strong AI agent reviews. As AI tools become more accessible, thorough documentation and well-structured code will be essential for credibility and adoption. There will be no escaping this, as anyone can use these tools!

