DATA MODELING IN POWER BI

DATA MODELING IN POWER BI

What is Data Modeling?

Data modeling is the process of creating visual representations of the connections between data structures, with information about the individual attributes contained within those data structures.

When talking about data modeling in general, the term you will hear most often is the star schema. This is a widely adopted approach to designing data warehouses and relational databases and is the recommended approach to take in Power BI as well. 

There are two main benefits of using a star schema:

  • Usability: A star schema makes your data model cleaner and more organized, and your report will be easier to use.
  • Performance: Star schemas allow you to easily scale your report to very large volumes of data. DAX measures also calculate faster and the Power BI report refreshes faster overall.

How does a star schema work?

A star schema is made up of a central fact table with multiple dimension tables branching off of this fact table, much like the appearance of a star. Ideally, it is good practice to have only one fact table in a data model. However, it is possible to include multiple fact tables in Power BI.


Article content
POWER BI STAR SCHEMA

A fact table is made up of values that can be summarized and aggregated, as well as one or more keys that link to the dimension tables. The purpose of the key is to make connections between tables. If you have complex data models or large volumes of data, it is a good idea for the keys to be numeric values rather than text strings, as this can improve the performance of the model.

Fact tables should be very narrow and include as few columns as possible. Any columns that describe the features of a dataset, such as Product Name or  Product Category, should be separated out of the fact table (if not already separated in the data warehouse) into their respective dimension tables. This process is called normalization of the data, and the goal is to avoid repetitiveness.

Dimension tables describe the features of the dataset, where features are logically grouped together in separate tables to avoid bloating a single table or making it difficult to create relationships. The dimension tables should contain unique values for the features they are describing, and their columns will be responsible for filtering or grouping the data in the Power BI report. 

An extension of the star schema is the snowflake schema. In this approach, the dimension tables are further divided if a feature has any further categories or subcategories. In our case, we have a Products table that contains the Product ID, Product Name, and Product Category. In a snowflake schema, this Products table will only contain the Product ID and the Product Name, while the Product Category would be further divided out into it’s own table with a Product Category ID and Product Category.


Article content
SNOWFLAKE SCHEMA

However, using a snowflake schema can unnecessarily add to the complexity of your data model; including these extra tables could decrease performance and make creating relationships difficult and confusing.

How to Create Relationships in Power BI

There are two ways you can create a relationship in Power BI:

  1. Select a field from one table and drag it onto the field in the second table with which you want the relationship to form. 
  2. Select Manage Relationships from the ribbon and select “New” to add a relationship using the same window that we will be discussing next (except that it will start as blank).

By default, Power BI will try to infer a relationship between tables; it doesn't always get this right, so you may wish to turn this feature off in the settings or delete any relationships that are created automatically. To edit the relationship, right-click the connecting line between them and select “Properties”. 

This window has 2 interesting options to choose from when defining a relationship: cardinality and cross filter direction. The choices for each of these options can have a big impact on the resulting report, so choose carefully. Let's break down each of these options.


Article content
CREATING RELATIONSHIPS IN POWER BI


Cardinality

There are 4 cardinality choices: many-to-one, one-to-one, one-to-many, or many-to-many. 

When creating relationships, it is recommended that the joining field contains unique values in at least 1 of the tables. This allows you to use the one-to-many or many-to-one options in your data model. 

In our data, we have a relationship between the Financials table and the Products table using the Product ID field. The Products table has unique values for the Product field (each product only appears once in the table). However, the Financials table can have each product showing up several times by date, country, segment, etc.

The many-to-many option is the most problematic, and while it can be very useful in complex models, it should only be selected if you fully understand the consequences of this relationship.

To illustrate where many-to-many relationships can cause problems, we create a yearly Sales Target table by Product Category in the data model. We need to be able to report sales numbers and these targets side-by-side in the same visual. This can be achieved by creating a many-to-many relationship between the Sales Target table and the Product table. 


Article content
MANY-TOMANY RELATIONSHIPS IN POWER BI


Pavan kumar Cheemala

ETL Developer | Informatica PowerCenter | IICS | Data Integration | SQL & Oracle Specialist | UNIX | Teradata | GitHub | ServiceNow|Rally|WinScp|Putty|Concourse|Harbour|Harness|Outlook|MS Excel For Data Analysis.

8mo

borrapadu

Like
Reply

To view or add a comment, sign in

Insights from the community

Others also viewed

Explore topics