Navigating the 5 Pillars of Data & AI Quality: What a Quality Dashboard could look like
Ensuring success and responsible deployment of AI systems through effective tracking and decision-making.
In my previous article, I discussed the importance of establishing quality in Data & AI projects and introduced a comprehensive 5-pillar framework. The pillars are Data Quality, Algorithm/Model Quality, Infrastructure Quality, Ethics and Compliance, and Data Governance. To make the practical implications of this framework clearer, this article dives deeper into each pillar and gives concrete examples of key performance indicators (KPIs) that can be used to monitor, measure, and optimize the quality of your Data & AI initiatives. I present a few KPIs for each pillar, listed in order of priority, each with a formula used to measure it.
A report or dashboard that tracks the KPIs and metrics associated with each pillar lets teams make informed decisions to improve the health of AI projects.
The 5 pillars again
Before we get to the KPIs, let's revisit the 5 pillars and what each is supposed to measure.
KPIs for a Quality Dashboard
Pillar 1 - Data Quality
Most Data Quality KPIs are relatively easy to capture, because data profiling tools can be attached to data pipelines and/or queries can be run directly on storage systems. I discussed this briefly in a previous article, Data Quality on Pipelines (Good and Bad), along with a sample report that can be generated using tools like Great Expectations.
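As a rough illustration of how such KPIs reduce to simple formulas, here is a minimal sketch in plain Python. The two measures (completeness and uniqueness), the function names, and the sample column are assumptions made for illustration; they are not output from Great Expectations or any particular profiling tool.

```python
from typing import Any, Optional, Sequence


def completeness(values: Sequence[Optional[Any]]) -> float:
    """Share of values that are present (None is treated as missing)."""
    if not values:
        return 0.0
    return sum(v is not None for v in values) / len(values)


def uniqueness(values: Sequence[Optional[Any]]) -> float:
    """Share of distinct values among the values that are present."""
    present = [v for v in values if v is not None]
    if not present:
        return 0.0
    return len(set(present)) / len(present)


# Hypothetical column pulled from a pipeline stage.
emails = ["a@x.com", None, "b@x.com", "a@x.com"]

print(f"completeness: {completeness(emails):.2f}")
print(f"uniqueness:   {uniqueness(emails):.2f}")
```

In practice a profiling tool would compute these per column and per run, and the dashboard would track the trend over time.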
Pillar 2 - Algorithm/Model Quality
These metrics are a little harder to measure, as they don't necessarily come from tools. However, synthetic data, equivalence partitioning, and domain value analysis, as I described in an earlier article, can help bring objectivity to these measures. While it is easy to list these KPIs, measuring them regularly is a complex activity that requires considerable knowledge and technical know-how.
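To make the idea of equivalence partitioning a little more concrete, the sketch below scores a model separately on each partition of a synthetic test set, so that a weak partition is not hidden by a good overall average. The partition labels, the record layout, and the function name are hypothetical, and accuracy is used only as a stand-in for whichever model KPI is being tracked.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def accuracy_by_partition(
    records: Iterable[Tuple[str, str, str]]
) -> Dict[str, float]:
    """Compute accuracy per partition.

    Each record is (partition, expected_label, predicted_label).
    """
    hits: Dict[str, int] = defaultdict(int)
    totals: Dict[str, int] = defaultdict(int)
    for partition, expected, predicted in records:
        totals[partition] += 1
        hits[partition] += int(expected == predicted)
    return {p: hits[p] / totals[p] for p in totals}


# Hypothetical synthetic cases, one equivalence class per age band.
synthetic_results = [
    ("age<18", "reject", "reject"),
    ("age<18", "reject", "approve"),
    ("age>=18", "approve", "approve"),
    ("age>=18", "approve", "approve"),
]

for partition, score in accuracy_by_partition(synthetic_results).items():
    print(partition, score)
```

Here the overall accuracy is 75%, but the per-partition view shows that all the errors sit in one class of inputs.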
Pillar 3 - Infrastructure Quality
These KPIs are, once again, easier to capture, especially in today's world of cloud infrastructure. Most of them are available through all the major cloud service providers. Performance and security, though, are vast topics that warrant lengthy discussions in their own right.
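For readers who want to see what sits behind two commonly reported infrastructure numbers, here is a minimal sketch of availability and a latency percentile. The choice of these two measures, the nearest-rank percentile method, and the sample figures are assumptions for illustration; cloud providers may compute their own metrics differently.

```python
import math
from typing import Sequence


def availability(total_minutes: float, downtime_minutes: float) -> float:
    """Fraction of the period during which the service was up."""
    if total_minutes <= 0:
        raise ValueError("total_minutes must be positive")
    return (total_minutes - downtime_minutes) / total_minutes


def percentile_nearest_rank(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: smallest sample with at least pct% at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[rank - 1]


# Hypothetical figures: a 30-day month with 43.2 minutes of downtime.
monthly_availability = availability(30 * 24 * 60, 43.2)

# Hypothetical request latencies in milliseconds.
latencies_ms = [120, 80, 95, 300, 110, 105, 90, 100, 115, 85]
p95_latency = percentile_nearest_rank(latencies_ms, 95)

print(f"availability: {monthly_availability:.4%}")
print(f"p95 latency:  {p95_latency} ms")
```

The percentile is worth tracking alongside the average, since a single slow request (300 ms here) dominates the tail while barely moving the mean.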
Pillar 4 - Ethics and Compliance
Development teams and management alike generally have a solid understanding of the first three pillars. Sadly, this fourth pillar is frequently disregarded and misinterpreted. As with Data Governance, organization-wide initiatives are needed first to define and enumerate what responsible use of Data & AI implies; only then can quality in this pillar be measured. This definition will differ by industry, domain, and application.
However, some general KPIs can be defined. Perhaps someone with the expertise would be kind enough to direct me to good work done on this topic.
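As one example of what a general KPI in this pillar might look like, the sketch below computes a demographic parity difference: the gap between the highest and lowest rate of favourable outcomes across groups. This particular measure is an assumption chosen for illustration, not something prescribed by the framework, and the group labels and data are hypothetical. Whether it is the right measure depends entirely on the organization's own definition of responsible use.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def selection_rates(outcomes: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Rate of favourable outcomes per group.

    Each outcome is (group, favourable).
    """
    favourable: Dict[str, int] = defaultdict(int)
    totals: Dict[str, int] = defaultdict(int)
    for group, is_favourable in outcomes:
        totals[group] += 1
        favourable[group] += int(is_favourable)
    return {g: favourable[g] / totals[g] for g in totals}


def demographic_parity_difference(outcomes: Iterable[Tuple[str, bool]]) -> float:
    """Gap between the highest and lowest group selection rates (0 = parity)."""
    rates = selection_rates(outcomes)
    if not rates:
        return 0.0
    return max(rates.values()) - min(rates.values())


# Hypothetical model decisions for two groups.
decisions = [
    ("A", True), ("A", True), ("A", True), ("A", False),
    ("B", True), ("B", False), ("B", False), ("B", False),
]

print(selection_rates(decisions))
print(demographic_parity_difference(decisions))
```

A dashboard could show this gap next to an agreed threshold, with the threshold itself coming from the organization-wide definition discussed above.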
Pillar 5 - Data Governance
Much like the previous pillar, Data Governance requires context. There are tools and other methods to measure some of the KPIs mentioned below, but without an organization-wide (or at least unit-wide) understanding of what these mean in the context of the data, process, and technology, the measures by themselves do not matter much. Data Governance requires a fresh look at policies and processes, and a cataloging of the status quo, before correlating them with external policies and regulations such as HIPAA, GDPR, and Sarbanes-Oxley. While I have mentioned some measures below, the process of Data Governance requires much thinking and context-building from within.
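To show how cataloging the status quo can turn into a number, here is a minimal sketch of a catalog coverage measure: the share of datasets whose catalog entry has every required field filled in. The measure, the required fields (owner and classification), and the catalog entries are all hypothetical; a real organization would substitute the fields its own policies demand.

```python
from typing import Dict, Optional, Sequence


def catalog_coverage(
    datasets: Sequence[Dict[str, Optional[str]]],
    required_fields: Sequence[str] = ("owner", "classification"),
) -> float:
    """Share of datasets with a non-empty value for every required field."""
    if not datasets:
        return 0.0
    documented = sum(
        all(entry.get(field) for field in required_fields)
        for entry in datasets
    )
    return documented / len(datasets)


# Hypothetical catalog entries.
catalog = [
    {"name": "customers", "owner": "crm-team", "classification": "PII"},
    {"name": "orders", "owner": "sales-ops", "classification": "internal"},
    {"name": "clickstream", "owner": None, "classification": "internal"},
    {"name": "legacy_dump", "owner": "", "classification": None},
]

print(f"catalog coverage: {catalog_coverage(catalog):.0%}")
```

The number only becomes meaningful once the organization has agreed which fields are mandatory and why, which is the context-building this pillar depends on.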
And so...
As we continue to integrate AI into our daily lives and businesses, ensuring the quality of AI systems becomes paramount. The 5-pillar framework, along with the KPIs discussed in this article, provides an approach to tackling the unique challenges that Data & AI projects bring. However, it's important to recognize that there's still so much to learn and explore in this field.
This article merely scratches the surface, and I encourage you to ask questions and share your experiences. By doing so, we can collectively work towards building AI systems that are not only effective and reliable, but also ethical and accountable.