Heard about 3 V’s of Data. What about 3 D’s of Data?

How are all you Data Stalwarts doing? Having fun with your Datalake, or do you think the Datalake is not keeping its promise? I have seen amazing success stories and amazing failures in the data journeys of several organizations. It would not be fair to blame anyone for Datalake failures: the Datalake brought with it so many technologies, tools & approaches that many organizations and engineers either couldn’t cope with them or didn’t have the holistic perspective to do it right. Many orgs & leaders have even started to wonder whether Big Data was just hype or whether it can still add value. Well, the answer is both Yes and No. The question is: have you done it right?

I gather that in most cases there is one common factor behind success or failure: Data Governance, the most underestimated paradigm of the Big Data journey. I have seen Data Stewards & engineers trying to address Data Governance with meetings, checklists, signoffs, and eventually nagging. My Dear Friends, without robust Data Governance you wouldn’t know what is growing inside the Datalake, let alone derive insights out of it. With the burst of Data related technologies come unforeseen complexity, security, compliance, data management & operational challenges, which if not dealt with properly can derail your entire data driven enterprise strategy before you know it. And these problems grow even worse with ever evolving architectures & patterns such as hybrid environments, multi cloud architectures, federated data lakes, data mesh, data fabric, data virtualization, DaaS etc. As a result, the very definition of Data Governance has been extended and now demands Data Security, Policy Compliance, Lineage, Catalog, Quality, Universal Semantic Layer management, Active Metadata, Knowledge Graphs, Datalake management, monitoring, centralized automation frameworks, automated data classification, auditing, automated ETL ingestion & testing, and so on. An already underestimated paradigm has only grown more complex. So, what’s the solution?

Say Hello to 3 Ds of Data: DataGovOps, DataOps & DevSecOps.

The Trio together aims to implement Data Governance with a three-pronged strategy: Automation, Multi-Level Abstraction and Centralization. Automation in ETL ingestion & testing, CI/CD, auditing, security and quality can accelerate time to market (TTM) and reduce operational overhead & costs significantly. Multi-Level Abstraction can help to build a universal semantic layer, business glossary, active metadata catalog, knowledge graphs and a common business language, and can enable data democratization. Finally, Centralization can help in managing the Data Hub from a single pane of glass and in providing centralized framework services for security, audit & quality to support a complex architecture with a multitude of technologies. See figure below, depicting the evolving technologies & tools landscape:

[Figure: the evolving Data Governance technologies & tools landscape]
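To make the Automation and Centralization prongs a little more concrete, here is a minimal sketch of an automated quality gate for ingestion. Everything in it is an assumption made for illustration: the `QualityRule` type, the `RULES` registry, the `run_quality_gate` function and the sample records are hypothetical and do not come from any particular tool. The idea is only that rules live in one central place, every pipeline runs the same gate, and each rejection leaves an audit entry.

```python
from dataclasses import dataclass
from typing import Callable


# Hypothetical rule type: a name plus a predicate applied to one record.
@dataclass(frozen=True)
class QualityRule:
    name: str
    check: Callable[[dict], bool]


# Hypothetical central rule registry, shared by every ingestion pipeline.
RULES = [
    QualityRule("customer_id_present", lambda r: bool(r.get("customer_id"))),
    QualityRule(
        "amount_non_negative",
        lambda r: isinstance(r.get("amount"), (int, float)) and r["amount"] >= 0,
    ),
]


def run_quality_gate(records, rules=RULES):
    """Split records into accepted and rejected, and build an audit trail."""
    accepted, rejected, audit = [], [], []
    for index, record in enumerate(records):
        # Collect the names of all rules this record violates.
        failed = [rule.name for rule in rules if not rule.check(record)]
        if failed:
            rejected.append(record)
            audit.append({"record": index, "failed_rules": failed})
        else:
            accepted.append(record)
    return accepted, rejected, audit


# Made-up sample batch: one clean record and two that break a rule each.
batch = [
    {"customer_id": "C1", "amount": 120.0},
    {"customer_id": "", "amount": 35.5},
    {"customer_id": "C3", "amount": -4},
]
accepted, rejected, audit = run_quality_gate(batch)
print(f"accepted={len(accepted)} rejected={len(rejected)}")
for entry in audit:
    print(entry)
```

In a real platform the rule registry would more likely sit in a catalog or a policy service than in a Python list, but the shape stays the same: one definition of quality, applied automatically, with an audit record for every rejection.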

With such a vast & evolving ecosystem of technologies, you can’t just pick a technology stack and A/B test your way to whether it works or not. Do your due diligence to see what fits your Technology roadmap by assessing the business requirements, use cases, workload types, volumetrics, POCs, a weighted scorecard etc. There is no single solution that fits all scenarios. Embrace Automation in your Architecture using the 3 D's, and deriving insights from data will become a sport.
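For the weighted scorecard mentioned above, a minimal sketch could look like the following. The criteria, the weights, the 1 to 5 scores and the candidate names `stack_a` and `stack_b` are all made-up placeholders; plug in your own criteria and the results of your own POCs.

```python
# Hypothetical criteria weights; they should sum to 1.0.
WEIGHTS = {
    "use_case_fit": 0.4,
    "workload_fit": 0.3,
    "poc_result": 0.2,
    "cost": 0.1,
}

# Hypothetical 1-5 scores for two placeholder candidate stacks.
CANDIDATES = {
    "stack_a": {"use_case_fit": 4, "workload_fit": 3, "poc_result": 5, "cost": 2},
    "stack_b": {"use_case_fit": 3, "workload_fit": 5, "poc_result": 3, "cost": 3},
}


def weighted_score(scores, weights=WEIGHTS):
    """Return the weighted sum of one candidate's scores."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1.0")
    return sum(weight * scores[criterion] for criterion, weight in weights.items())


def rank_candidates(candidates, weights=WEIGHTS):
    """Return (name, score) pairs sorted from best to worst."""
    scored = [(name, weighted_score(scores, weights)) for name, scores in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranking = rank_candidates(CANDIDATES)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

The numbers matter less than the discipline: agreeing on the criteria and weights before scoring keeps the evaluation tied to business requirements rather than to whichever tool is trending.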

Happy Coding, Happy Architecting and Happy DATing!!! 

Rajashree Parida/Das

Vice President | India Capability Leader | India Country Board Member | Master Architect | Digital Transformation | Passionate Technologist | Diversity & Inclusion Custodian | Trainer | Mentor

3y

Very insightful and you nailed down the key requirement for a successful big data implementation.

Chandrashekhar Javeer

Lead Enterprise Data Architect | Data Mesh | Cloud Data Architecture | TOGAF® 9 Certified

3y

Data governance is key in Data Lake architecture. Really helpful article.


Article by Abhishek Mittal