Optimizing the Performance of LookML Queries
Ideally, you want to avoid re-running massive queries if nothing has changed, and instead, append new data to existing results to reduce repetitive requests
The most commonly used methods to optimize query performance in Looker: 1)persistent derived tables,
2)aggregate awareness, and
3)performantly joining views.
Persistent derived tables (PDTs)
Looker allows you to write SQL and LookML queries to your database as a temporary table. When this table is cached or persisted, it's called a PDT.
By saving these queries as a table, you have control over when or how they are built. Tables can rebuild every morning, once a month, or only when new data is added. Ideally, you configure your derived tables to reflect the nature of your data.
You can also create incremental PDTs to append new data without rebuilding the entire table. Applying incremental changes works well for large tables in which existing (older) data is not frequently updated because the primary update to the table is new records.
Recommended by LinkedIn
Aggregate awareness
For very large tables in your database, Looker's aggregate awareness can create smaller aggregate tables of data grouped by various combinations of attributes. The aggregate tables act as "rollups" or summary tables that Looker can use instead of the original large table for queries whenever possible. Aggregate awareness allows Looker to find the smallest, most efficient table available in your database to run a query while still maintaining accuracy
Join views in a performant manner
Data Engineering & BI | Google Looker | Apache Airflow | SQL | Redshift | Bigquery | Python | 12+ Years exp.
1yWell said, I have few points to add here. 1. Avoid using Incremental PDT's if your source table get update in certain period of time. Incremental PDT's best with create only table otherwise you will end up having duplicate records in destination. 2. Model level filter is good option while filtering a data.