Microsoft Fabric: Citizen Data Integrator and Pro Developer Data Integrator

Organizations today use data to open new possibilities for growing revenue and improving operational efficiency. Traditional data processing solutions cannot match the scale of analysis required to produce timely insight from the data. Any ETL/ELT solution should ensure data consistency, accuracy, and reliability across large and diverse data sources.

Traditional ETL tools can work well for on-premises solutions, but they run into issues on modern data ingestion and transformation projects: limited data integration, high cost, and difficulty scaling. Modern ETL tools like Azure Data Factory are flexible, scalable, cost-effective, and user-friendly.

Microsoft Fabric takes a different approach to managing the ETL process. It brings two of Microsoft's popular ETL tools, Azure Data Factory and dataflows, onto a single platform. It will be interesting to see how organizations benefit from this, because the users of these two tools approach data processing differently.

Dataflow Gen2:

Power BI dataflows have been available for processing data independently of datasets. Users can do data wrangling with the Power Query engine. Power Query is available in several Microsoft products, such as Excel and Power BI. It can connect to data sources, transform and cleanse data, combine or append tables, create aggregations, and add columns (conditional and formula-based), among other things. Power Query is popular because it is easy to use, makes reshaping data simple, and supports a broad set of data transformations as well as custom functions.
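
To make the kinds of steps listed above concrete, here is a minimal sketch in plain Python. It is only an analogy: it does not use Power Query or its M language, and the sample rows, column names, and the threshold of 100 are all made up for illustration. It cleanses two small tables, appends them, adds a conditional column, and aggregates by region.

```python
from collections import defaultdict

# Hypothetical sample rows standing in for two source tables.
sales_2023 = [
    {"region": " North ", "amount": "120.50"},
    {"region": "South", "amount": "80"},
    {"region": "South", "amount": None},  # missing value, removed during cleansing
]
sales_2024 = [
    {"region": "north", "amount": "200"},
    {"region": "East", "amount": "45.25"},
]


def cleanse(rows):
    """Drop rows with a missing amount, tidy the region text, convert amount to float."""
    cleaned = []
    for row in rows:
        if row["amount"] is None:
            continue
        cleaned.append({
            "region": row["region"].strip().title(),
            "amount": float(row["amount"]),
        })
    return cleaned


def add_size_column(rows, threshold=100.0):
    """Add a conditional column: 'Large' at or above the threshold, else 'Small'."""
    return [
        {**row, "size": "Large" if row["amount"] >= threshold else "Small"}
        for row in rows
    ]


def total_by_region(rows):
    """Aggregate: sum of amount grouped by region."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += row["amount"]
    return dict(totals)


# Append the two cleansed tables, then apply the remaining steps in order.
combined = cleanse(sales_2023) + cleanse(sales_2024)
shaped = add_size_column(combined)
totals = total_by_region(shaped)
print(totals)
```

Each function plays the role of one applied step. In Power Query the same sequence would be built visually, step by step, rather than written by hand.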

Microsoft Fabric takes this a step further by introducing Dataflow Gen2 for data processing. Its advantages include support for a wide range of data sources and the ability to see data lineage from the source to the destination. These dataflows can also be scheduled to refresh periodically.
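
Lineage from source to destination can be pictured as a path through a graph of items. The sketch below is a toy model in Python, not Fabric's lineage feature or any Fabric API; the item names and the `trace` helper are assumptions made up for illustration.

```python
# Hypothetical lineage: each item maps to the items that consume its output.
downstream = {
    "SQL database": ["Dataflow Gen2"],
    "CSV files": ["Dataflow Gen2"],
    "Dataflow Gen2": ["Lakehouse"],
    "Lakehouse": ["Semantic model"],
    "Semantic model": ["Report"],
}


def trace(source, destination, graph, path=None):
    """Return one path from source to destination, or None if there is none."""
    path = (path or []) + [source]
    if source == destination:
        return path
    for nxt in graph.get(source, []):
        if nxt in path:  # skip items already visited to avoid cycles
            continue
        found = trace(nxt, destination, graph, path)
        if found:
            return found
    return None


lineage = trace("CSV files", "Report", downstream)
print(" -> ".join(lineage))
```

Walking the graph this way answers the question a lineage view is meant to answer: which upstream items feed a given report.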

Data Factory:  

Azure Data Factory is an enterprise-scale data orchestration tool that helps with ingestion and processing at scale. Data Factory pipelines have activities such as Dataflow Gen2, Notebook, and Spark job definition to support data movement. Dataflows and Data Factory complement each other: data pipelines control the dataflows and other activities, whereas the dataflows handle data transformation to match the required data formats.
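
The division of labor described above, where a pipeline controls the order of activities and each activity does its own work, can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the Data Factory API: the `Activity` class, `run_pipeline`, and the three activity names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Activity:
    """A hypothetical pipeline activity with the names of activities it waits for."""
    name: str
    run: Callable[[Dict], None]
    depends_on: List[str] = field(default_factory=list)


def run_pipeline(activities: List[Activity]) -> Tuple[List[str], Dict]:
    """Run each activity once everything it depends on has finished."""
    context: Dict = {}
    done: List[str] = []
    pending = list(activities)
    while pending:
        ready = [a for a in pending if all(d in done for d in a.depends_on)]
        if not ready:
            raise ValueError("Circular or missing dependency in pipeline")
        for activity in ready:
            activity.run(context)
            done.append(activity.name)
            pending.remove(activity)
    return done, context


def ingest(ctx: Dict) -> None:
    ctx["raw"] = [" 10", "20 ", "x"]  # made-up raw values


def transform(ctx: Dict) -> None:
    # Stand-in for a dataflow: keep numeric values and convert them to int.
    ctx["clean"] = [int(v) for v in ctx["raw"] if v.strip().isdigit()]


def summarize(ctx: Dict) -> None:
    # Stand-in for a notebook step that consumes the transformed data.
    ctx["total"] = sum(ctx["clean"])


# Activities are listed out of order; the pipeline resolves the run order.
order, context = run_pipeline([
    Activity("notebook_summary", summarize, depends_on=["dataflow_transform"]),
    Activity("dataflow_transform", transform, depends_on=["ingest_raw"]),
    Activity("ingest_raw", ingest),
])
print(order, context["total"])
```

The pipeline itself knows nothing about how the data is transformed; it only decides when each activity runs, which mirrors the orchestration role described here.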

Now that we understand these two tools, let us see how they bring together the two sets of users: the citizen data integrator and the pro data developer.

For a successful outcome, the data analyst (citizen integrator) and the data engineer (pro data developer) must work in close collaboration. Most of the activities are heavy on the data engineering side, and the Power BI data analyst typically must wait until the data is available to them in the semantic layer. The data engineer may not necessarily understand how Power Query works in Power BI. Similarly, the data analyst may not understand how data ingestion and transformation take place. This often leads to work silos, even though both are working toward the same objective.

Microsoft Fabric brings the data analyst and the data engineer onto the same platform, which encourages close collaboration. Because the data is available to the data analyst from the initial ingestion onward, the analyst can keep working with their own tools to deliver their part of the solution, while the data engineer handles the heavy lifting required to get data from various data sources and perform complex transformations to deliver a robust data analytics platform.

Microsoft Fabric offers a good collaboration opportunity: it brings citizen integrators and pro developers together on a common platform, where the two groups can work more closely to build an efficient solution.

References:

Data warehousing and analytics - Azure Architecture Center | Microsoft Learn 

Lineage in Fabric - Microsoft Fabric | Microsoft Learn 

What is Data Factory - Microsoft Fabric | Microsoft Learn 


More articles by Sathish SN
