Measuring AI effectiveness beyond productivity metrics

Last year was a milestone in the field of artificial intelligence, characterized by enthusiasm, optimism and caution. AI-based productivity tools promise to increase productivity by generating code and automating repetitive, tedious coding tasks. A year later, organizations are struggling to quantify the impact of their AI initiatives and are re-evaluating metrics to ensure they reflect desired business outcomes.

Measuring developer productivity has long been a challenge, with or without the introduction of AI-based development tools. Last year, McKinsey & Company described measuring developer productivity as a “black box,” noting that in software development, “the link between inputs and outputs is much less clear” than in other functions.

Measuring the performance of AI-powered coding requires a more nuanced approach than traditional metrics such as lines of code produced, number of code commits, or tasks completed. It means shifting to an assessment of actual business outcomes that balances software development speed, software quality, and security.

While using AI to create more code faster can be beneficial, it can also lead to technical debt if the resulting code is not high quality and secure. AI-generated code often requires more time to review, test, and maintain. For example, developers can save time by using AI to write code, but that saved time will likely be spent later in the software lifecycle. Additionally, any vulnerabilities in AI-generated code will require the involvement of security teams and additional time to mitigate potential security incidents.

When assessing the value that AI brings to software development, it is important to consider that AI should be implemented and evaluated as a complement to human programmers, not a replacement.

Better productivity metrics

Instead of focusing on adoption rates or lines of code generated, organizations should strive to take a more holistic view of AI’s impact on productivity and the bottom line. This approach ensures that the real benefits of AI-powered software development are fully realized and appreciated.

The best approach is to combine quantitative data from across the software development lifecycle (SDLC) with qualitative insights from developers about how AI actually affects their day-to-day work and long-term development strategies. For example, a GitLab study found that developers spend about 75 percent of their time on tasks other than code generation, which means that a more productive use of AI could be to help developers spend less time reviewing, testing, and maintaining code.

One recommended measurement approach is the DORA framework, which analyzes the performance of a development team over a specific time period. DORA metrics measure deployment frequency, lead time for changes, mean time to restore, change failure rate, and reliability to provide insight into a team's agility, operational efficiency, and velocity. Together they serve as a proxy for how well an engineering organization balances speed, quality, and security.
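As a rough illustration of how several of these metrics could be derived from deployment records, here is a minimal Python sketch. The `Deployment` record, its field names, and the `dora_summary` helper are hypothetical and chosen only for this example; real teams would pull the data from their CI/CD and incident tooling, and definitions (for instance median versus mean lead time) vary between organizations. Reliability is left out because it depends on service-level targets that the article does not specify.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median
from typing import List, Optional


@dataclass
class Deployment:
    """Hypothetical record of one production deployment."""
    committed_at: datetime                   # when the change was committed
    deployed_at: datetime                    # when it reached production
    failed: bool = False                     # did it cause a production failure?
    restored_at: Optional[datetime] = None   # when service was restored, if it failed


def dora_summary(deployments: List[Deployment], period_days: int) -> dict:
    """Summarize four DORA-style metrics for one reporting period."""
    if not deployments or period_days <= 0:
        raise ValueError("need at least one deployment and a positive period")

    # Lead time for changes: commit to production, in hours.
    lead_hours = [
        (d.deployed_at - d.committed_at).total_seconds() / 3600
        for d in deployments
    ]

    failures = [d for d in deployments if d.failed]

    # Time to restore: deployment of the failing change to recovery, in hours.
    restore_hours = [
        (d.restored_at - d.deployed_at).total_seconds() / 3600
        for d in failures
        if d.restored_at is not None
    ]

    return {
        "deployments_per_day": len(deployments) / period_days,
        "median_lead_time_hours": median(lead_hours),
        "change_failure_rate": len(failures) / len(deployments),
        "mean_time_to_restore_hours": (
            sum(restore_hours) / len(restore_hours) if restore_hours else None
        ),
    }
```

Comparing such a summary for periods before and after an AI tool is introduced gives a view of speed and stability together, rather than code volume alone.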

Additionally, teams should consider using value stream analytics to evaluate the entire workflow from concept to production. Value stream analytics is not based on a single metric; it continuously monitors metrics such as lead time, cycle time, deployment frequency, and production defects. This approach focuses on business outcomes, not developer activities.
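To make the distinction between these measures concrete, the sketch below computes average lead time, average cycle time, and production defects per work item. The `WorkItem` record and `value_stream_summary` helper are hypothetical, and the definitions used here are assumptions: lead time runs from when an item is created to when it is released, and cycle time from when work starts to release. Teams often define these boundaries differently.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import mean
from typing import List

SECONDS_PER_DAY = 86400


@dataclass
class WorkItem:
    """Hypothetical work item tracked from concept to production."""
    created_at: datetime         # idea or request is logged
    started_at: datetime         # development work begins
    released_at: datetime        # change reaches production
    production_defects: int = 0  # defects later traced to this item


def value_stream_summary(items: List[WorkItem]) -> dict:
    """Average lead time, cycle time, and defect rate across work items."""
    if not items:
        raise ValueError("need at least one work item")

    # Lead time (assumed definition): creation to release.
    lead_seconds = [(i.released_at - i.created_at).total_seconds() for i in items]
    # Cycle time (assumed definition): start of work to release.
    cycle_seconds = [(i.released_at - i.started_at).total_seconds() for i in items]

    return {
        "avg_lead_time_days": mean(lead_seconds) / SECONDS_PER_DAY,
        "avg_cycle_time_days": mean(cycle_seconds) / SECONDS_PER_DAY,
        "defects_per_item": sum(i.production_defects for i in items) / len(items),
    }
```

A gap between lead time and cycle time that stays large after AI adoption would suggest that the bottleneck lies outside coding, for example in planning or review queues.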

Successful AI implementation

AI is still a new technology, and organizations should anticipate the typical growing pains of the transition while being aware that development and security teams may not yet trust AI. Introducing new AI tools into existing workflows may require additional changes to processes such as code review, testing, and documentation.

To start, teams should develop best practices by working in a lower-risk area before expanding their AI applications, to ensure they scale safely and sustainably. For example, AI code generation can help with scaffolding, test generation, syntax correction, and documentation. This way, teams can build momentum and motivation by seeing better results and learning to use the tool more effectively. Initially, productivity may decline as teams adapt to new workflows. Organizations should give their teams a grace period to determine how AI best fits their processes.

Artificial intelligence will play a key role in the evolution of DevSecOps platforms, transforming the way development, security and operations teams work together to accelerate software development without sacrificing quality and security. Business leaders will be asking how their investments in AI-powered tools are paying off, and developers should embrace this analysis and take the opportunity to demonstrate how their work aligns with the organization’s broader goals.

By taking a holistic approach that evaluates code quality, collaboration, bottom line costs, and developer experience, teams can leverage AI technologies to augment human efforts.

Image source: Irinayeryomina / Dreamstime.com

Taylor McCaslin is the AI/ML Lead at GitLab.