In Power BI, references offer a powerful tool for building upon existing queries and streamlining the data transformation process.

When you create a reference to a query, you generate a new query that points to an existing one. This new query doesn’t contain its own data; instead, it refers back to the original query for its data. Any additional transformation steps and analyses performed on this new query won’t affect the original query’s data.

This distinguishes ‘Reference’ from ‘Duplicate.’ While both create new queries based on an existing one, ‘Reference’ is more efficient since it doesn’t contain its own copy of the data, thus using less memory. Furthermore, ‘Reference’ updates automatically to reflect changes in the original query and data source, whereas duplicates would have to be manually updated.

Here are two use cases that highlights the practicality of the ‘Reference’ function:

Migrating Data Source

‘Reference’ simplifies the migration of data sources. You can achieve this by initially creating a query that points to the folder locations of the data used in a report.

For this example, I’m creating a query named ‘Sharepoint Folder’ that points to a specific subfolder within my personal SharePoint Folder.

In the Queries pane on the left, you can right-click the original query to create a reference. Note that in the referenced query, the query field now displays ‘=  #”Sharepoint Folder’ instead of the original URL.

From this query, I can now access the files I need by clicking on ‘Binary’, and carry out the necessary transformations for each step. I can repeat this process for all the files I will be using from this folder.

Suppose I’ve moved my data to a new SharePoint site. If I had created each query manually or as duplicates, I would have to manually update the data source of each query to the new location.

Fortunately, because all my tables reference one query, updating the original query’s source URL in the query is sufficient for the rest to update automatically.

Next, navigate to the location of the new data.

The reference queries will adjust accordingly once they detect a file with the same name in the new location.

Build Upon Existing Query

You can also use ‘Reference’ to generate new queries that can be used for aggregations or to create data subsets.

In my example, I’ll be producing a list of users and their email addresses from the existing ‘Progress’ query.

After creating the ‘Users’ reference query, I’ll transform the data by removing unused columns and deduplicating the users.

 

Upon completion of this process, we’ll have a dynamically updated list of user details, based on the users in the original query.

Since the original ‘Users’ query is a reference to the ‘Sharepoint Folder’ query, any migration of the data source will cause the list of users to automatically update to align with the available data in the new source..

 

Hopefully, this example demonstrates how referencing allows you to create streamlined workflows that efficiently use memory, ensure data consistency, and reduce unnecessary duplication. It’s particularly helpful when dealing with complex data transformations and large datasets, or when you need to migrate data sources, and I hope that you will find some use for it in your own projects.

The Data School
Author: The Data School