Data Architecture

What is data architecture

A company’s data architecture shapes what the company is able to do. If a company were a chess piece, its data architecture would define the moves it can make on the board.

A primitive architecture allows your company to move like a pawn. An advanced architecture can make that pawn a queen.

Picture these different data architectures:

Storing a file as a .csv on a local hard drive and reading the file into Tableau on a person’s computer for analysis is a very simple kind of data architecture.
Streaming data from a set of point-of-sale registers to accounting is another kind of architecture.
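
The difference between these two architectures can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the sample data, the column names `register` and `amount`, and both function names are assumptions made for the example. The first function reads a whole file and then analyzes it; the second updates its result as each sale arrives.

```python
import csv
import io
from collections import defaultdict
from typing import Iterable, Iterator

# Hypothetical sample data standing in for a .csv file on a local drive.
SALES_CSV = """register,amount
r1,10.00
r2,5.50
r1,4.25
"""


def batch_totals(csv_text: str) -> dict[str, float]:
    """Batch style: read the whole file, then total sales per register."""
    totals: dict[str, float] = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["register"]] += float(row["amount"])
    return dict(totals)


def stream_totals(events: Iterable[tuple[str, float]]) -> Iterator[dict[str, float]]:
    """Streaming style: emit updated running totals after every sale."""
    totals: dict[str, float] = defaultdict(float)
    for register, amount in events:
        totals[register] += amount
        yield dict(totals)


batch_result = batch_totals(SALES_CSV)
stream_results = list(stream_totals([("r1", 10.0), ("r2", 5.5), ("r1", 4.25)]))
```

Both approaches end with the same totals. The streaming version simply makes an up-to-date answer available after every event instead of only after the file has been loaded.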

The data architecture largely determines how much freedom a company has to move.

When a company needs agility, whether to survive a slow season or to capitalize on the sudden popularity of a new product, a more advanced data architecture leaves it better able to act.

Explicitly, the data architecture:

Gives a fuller picture of what is happening in the company
Creates a better understanding of the company’s data
Offers protocols by which data moves from its source to being analyzed and consumed by its destinations
Ensures a system is in place to secure the data
Grants all teams the ability to make data-driven decisions

Components of data architecture

The components of a modern data architecture include:

Data pipelines
Cloud storage
APIs
AI & ML models
Data streaming
Kubernetes
Cloud computing
Real-time analytics
And more…

Data standards

Data standards are the overarching rules of a data architecture. They apply to areas such as data schemas and security.

Data schemas

The architecture is responsible for setting the data standards that define what kinds of data will pass through it.

These standards are expressed in a data schema. The data schema defines:

Each entity that should be collected. A schema for contact info, for example, might include name, phone number, email, and place of work.
The type of data each piece should be. For example, name, email, and place of work are text data. Phone number is usually best stored as text as well, since an integer type would drop leading zeros and cannot hold a plus sign or an extension.
The relationship of that entity to others in the database, such as where it comes from and where it’s going.
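
As a rough sketch, the contact-info example could be written down and enforced like this in Python. The names `CONTACT_SCHEMA` and `validate` are hypothetical, and `phone_number` is modeled as text, which is an assumption of this sketch. It covers only the first two points (fields and types); relationships between entities are left out.

```python
# Hypothetical schema for the contact-info entity: field name -> expected type.
CONTACT_SCHEMA = {
    "name": str,
    "phone_number": str,  # assumption: text, so "+" and leading zeros survive
    "email": str,
    "place_of_work": str,
}


def validate(record: dict, schema: dict) -> list[str]:
    """Return a list of problems; an empty list means the record fits the schema."""
    errors = []
    for field, expected_type in schema.items():
        if field not in record:
            errors.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            actual = type(record[field]).__name__
            errors.append(f"{field}: expected {expected_type.__name__}, got {actual}")
    # Fields the schema does not know about are reported too.
    for field in sorted(record.keys() - schema.keys()):
        errors.append(f"unexpected field: {field}")
    return errors


good = {
    "name": "Ada Lovelace",
    "phone_number": "+44 20 7946 0000",
    "email": "ada@example.com",
    "place_of_work": "Analytical Engines Ltd",
}
bad = {"name": "Ada Lovelace", "phone_number": 5551234, "email": "ada@example.com"}

good_errors = validate(good, CONTACT_SCHEMA)
bad_errors = validate(bad, CONTACT_SCHEMA)
```

Returning a list of problems rather than raising on the first one lets a pipeline log every issue with a record in a single pass.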

Most companies will version their data schema. As data becomes increasingly pervasive, companies will begin using non-relational (NoSQL) databases alongside, or in place of, more traditional relational (SQL) databases.

Non-relational (NoSQL) databases let you add data easily and piece it together more like a network of entities than a strict hierarchy of entities. Many of them can also grow much larger and accept new fields dynamically, something traditional SQL databases either could not do or were strongly advised against doing.

That’s why versioning is so vital. Versioning the data schema helps standardize:

What to find where
When a given piece of data was where
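
One common way to make a schema version usable in practice is to tag every record with the version it was written under and to keep small migration functions between versions. The sketch below assumes a hypothetical change in which version 1 stored a single `name` field and version 2 splits it into `first_name` and `last_name`; the field and function names are illustrative only.

```python
from typing import Callable


def v1_to_v2(record: dict) -> dict:
    """Hypothetical migration: split "name" into first and last name."""
    first, _, last = record["name"].partition(" ")
    upgraded = {k: v for k, v in record.items() if k != "name"}
    upgraded.update(first_name=first, last_name=last, schema_version=2)
    return upgraded


# Maps a version number to the function that upgrades it by one version.
MIGRATIONS: dict[int, Callable[[dict], dict]] = {1: v1_to_v2}
CURRENT_VERSION = 2


def upgrade(record: dict) -> dict:
    """Apply migrations in order until the record reaches CURRENT_VERSION."""
    while record["schema_version"] < CURRENT_VERSION:
        record = MIGRATIONS[record["schema_version"]](record)
    return record


old = {"schema_version": 1, "name": "Ada Lovelace", "email": "ada@example.com"}
new = upgrade(old)
```

Because each record carries its own `schema_version`, a reader always knows which layout to expect, and older data can be brought forward without guessing.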

Data security

Data standards also help set the security rules for the architecture. These rules can be visualized in the architecture and schema by showing what data gets passed where and how it is secured as it travels from point A to point B.

Security protocols can include:

Encrypting data in transit
Restricting access to specific individuals
Anonymizing data to reduce the value of the information once the receiving party has it
Additional actions
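
Two of these protocols, restricting access and anonymizing data, can be illustrated with Python's standard library. This is a minimal sketch under stated assumptions: the roles, the field names, and the key are hypothetical, and a keyed hash is strictly pseudonymization, which is weaker than full anonymization. Encryption in transit is normally handled by TLS at the connection level and is not shown.

```python
import hashlib
import hmac

# Hypothetical secret; in practice it would come from a secrets manager.
PSEUDONYM_KEY = b"replace-with-a-real-secret"

# Hypothetical mapping of roles to the fields each role may see.
ALLOWED_FIELDS = {
    "analyst": {"customer_id", "amount"},
    "support": {"customer_id", "email", "amount"},
}


def pseudonymize(value: str, key: bytes = PSEUDONYM_KEY) -> str:
    """Replace an identifier with a keyed SHA-256 hash (hex string)."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def view_for(role: str, record: dict) -> dict:
    """Return only the fields the role may see, with customer_id pseudonymized."""
    allowed = ALLOWED_FIELDS.get(role, set())  # unknown roles see nothing
    visible = {k: v for k, v in record.items() if k in allowed}
    if "customer_id" in visible:
        visible["customer_id"] = pseudonymize(str(visible["customer_id"]))
    return visible


record = {"customer_id": "c-42", "email": "ada@example.com", "amount": 19.99}
analyst_view = view_for("analyst", record)
unknown_view = view_for("intern", record)
```

The same input always maps to the same pseudonym, so the receiving party can still join and count records per customer without learning who the customer is.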

Shifting to a new architecture

McKinsey published a great article about six important changes to consider when building a data architecture in today’s world. It highlights the older architectural components and how they have been updated into the distributed, agile architecture used by today’s companies.

Here is the short version of these six changes:

From on-premise to cloud-based data platforms
From batch to real-time data processing
From pre-integrated commercial solutions to modular, best-of-breed platforms
From point-to-point to decoupled data access
From an enterprise warehouse to domain-based architecture
From rigid data models toward flexible, extensible data schemas

More articles by Darshika Srivastava
