Aggregate Data While Expanding Columns

Without question, the Expand feature that shows up when you are merging tables using Power Query is very useful.  But one of the things I’ve never called out is that – when you are merging tables – you have the opportunity to aggregate data while expanding columns.

Scenario Background

Let’s say that we have these two tables of data:

SNAGHTML2493b688

The one of the left is Inventory items, and holds the SKU (a unique product identifier), the brand, type and sale price.  The table on the right is our Sales table, and holds the transaction date, SKU sold (many instances here), brand (for some reason) and the sales quantity.  And, as it happens, I already have two queries set up as connections to these tables:

SNAGHTML249f8aea

(Both of these were created by selecting a cell in the table, creating a new query –> From Table, setting the data types, then going to Close & Load To… –> Only Create Connection.)

The goal here is to work out the total sales for each SKU number, and maybe the average as well.  You can follow along with the workbook found here.

Step 1:  Join the Sales table to the Inventory table

The first thing we need to do is merge the two tables together.  We will use the default (Left Outer) join (as described in this post) to make this happen:

  • Go to the Workbook Queries pane –> right click Inventory –> Merge
  • Choose Sales for the bottom table
  • Select the SKU column in each table

image

  • Click OK

And we’ll end up with the following table in Power Query:

image

There are now two ways to get the totals we need.

Method 1:  Expand then Group

This is the approach that I’ve usually taken, as it feels like a logical breakdown to me.  So here goes:

  • Click the Expand icon to the top right of the NewColumn column
  • Choose to Expand the Date and Sales Quantity (as the other columns already exist in the Inventory table.)
  • Uncheck the “Use original column name as prefix” checkbox

image

  • Click OK

You should end up with a list of 20 items, as many of the sales items (like Granville Island Ale) are sold many times:

SNAGHTML24ab8bbb

Next, we need to group them up, as this is too detailed.

  • Go to Transform –> Group By
  • Set up the grouping levels as follows

image

The key to understanding this is that the fields in the top will be preserved, the fields in the bottom will be aggregated (or grouped) together.  Any columns in your original data set that you don’t specify will just be ignored.

  • Click OK

The results are as we’d hoped for:

image

That’s cool, so let’s finalize this query:

  • Call the query “Expand and Group”
  • Go to Home –> Close & Load

Method 2: Aggregate data while expanding columns

Now let’s look at an alternate method to do the same thing…

Start by following Step 1 exactly as shown above.  None of that changes.  It’s not until we get to the part where we have the tables merged and showing a column of tables that the methods depart.

So this time:

  • Click the expand button
  • Click the Aggregate button at the top of the expand window:

image

Notice how the view changes immediately!

The logic here is that, if the field is a date or text, it defaults to offering a count of the data in that column for each sales item I have.  But if I click on the Sum of Sales Quantity, I get the option to add additional aggregation levels:

image

After selecting the Sum and Average for Sales Quantity:

  • Ensure “Use original column names as prefix” is unchecked
  • Click OK

And, as you can see, the data is already grouped for us, with results consistent to what we created by first expanding and then grouping the data:

image

This is cool, as we don’t have to first expand, then group.  And while I haven’t tested this, it only stands to reason that this method should be faster than having to expand all records then group them afterwards.

One thing that is a bit of a shame is that we can’t name the columns in the original aggregation, so we do have to do that manually now:

  • Right click Sum of Sales Quantity –> Rename –> Total Units Sold
  • Right click Average of Sales Quantity –> Rename –> Avg Units Sold

And finalize the query

  • Rename the query to Expand and Aggregate
  • Go to Home –> Close & Load

Share:

Facebook
Twitter
LinkedIn

3 thoughts on “Aggregate Data While Expanding Columns

  1. Nicely done Mr Puls I found this useful a few days ago and was surprised you hadn't done a blog post about it. I'm really finding these tips very useful and time saving.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Latest Posts