Without question, the Expand feature that shows up when you are merging tables using Power Query is very useful. But one of the things I’ve never called out is that – when you are merging tables – you have the opportunity to aggregate data while expanding columns.
Scenario Background
Let’s say that we have these two tables of data:
The one of the left is Inventory items, and holds the SKU (a unique product identifier), the brand, type and sale price. The table on the right is our Sales table, and holds the transaction date, SKU sold (many instances here), brand (for some reason) and the sales quantity. And, as it happens, I already have two queries set up as connections to these tables:
(Both of these were created by selecting a cell in the table, creating a new query –> From Table, setting the data types, then going to Close & Load To… –> Only Create Connection.)
The goal here is to work out the total sales for each SKU number, and maybe the average as well. You can follow along with the workbook found here.
Step 1: Join the Sales table to the Inventory table
The first thing we need to do is merge the two tables together. We will use the default (Left Outer) join (as described in this post) to make this happen:
- Go to the Workbook Queries pane –> right click Inventory –> Merge
- Choose Sales for the bottom table
- Select the SKU column in each table
- Click OK
And we’ll end up with the following table in Power Query:
There are now two ways to get the totals we need.
Method 1: Expand then Group
This is the approach that I’ve usually taken, as it feels like a logical breakdown to me. So here goes:
- Click the Expand icon to the top right of the NewColumn column
- Choose to Expand the Date and Sales Quantity (as the other columns already exist in the Inventory table.)
- Uncheck the “Use original column name as prefix” checkbox
- Click OK
You should end up with a list of 20 items, as many of the sales items (like Granville Island Ale) are sold many times:
Next, we need to group them up, as this is too detailed.
- Go to Transform –> Group By
- Set up the grouping levels as follows
The key to understanding this is that the fields in the top will be preserved, the fields in the bottom will be aggregated (or grouped) together. Any columns in your original data set that you don’t specify will just be ignored.
- Click OK
The results are as we’d hoped for:
That’s cool, so let’s finalize this query:
- Call the query “Expand and Group”
- Go to Home –> Close & Load
Method 2: Aggregate data while expanding columns
Now let’s look at an alternate method to do the same thing…
Start by following Step 1 exactly as shown above. None of that changes. It’s not until we get to the part where we have the tables merged and showing a column of tables that the methods depart.
So this time:
- Click the expand button
- Click the Aggregate button at the top of the expand window:
Notice how the view changes immediately!
The logic here is that, if the field is a date or text, it defaults to offering a count of the data in that column for each sales item I have. But if I click on the Sum of Sales Quantity, I get the option to add additional aggregation levels:
After selecting the Sum and Average for Sales Quantity:
- Ensure “Use original column names as prefix” is unchecked
- Click OK
And, as you can see, the data is already grouped for us, with results consistent to what we created by first expanding and then grouping the data:
This is cool, as we don’t have to first expand, then group. And while I haven’t tested this, it only stands to reason that this method should be faster than having to expand all records then group them afterwards.
One thing that is a bit of a shame is that we can’t name the columns in the original aggregation, so we do have to do that manually now:
- Right click Sum of Sales Quantity –> Rename –> Total Units Sold
- Right click Average of Sales Quantity –> Rename –> Avg Units Sold
And finalize the query
- Rename the query to Expand and Aggregate
- Go to Home –> Close & Load
3 thoughts on “Aggregate Data While Expanding Columns”
Very cool. I have never noticed the aggregate option on that dialog box. Thanks for sharing.
Nicely done Mr Puls I found this useful a few days ago and was surprised you hadn't done a blog post about it. I'm really finding these tips very useful and time saving.
Why thank you Mr. Duchesne. I appreciate that!