Merge Tables using Outer Joins in Power Query

The term “Join” comes from the database world, and I’ll admit that I’ve struggled with understanding it... especially when you combine it with some other keywords.  So this week I thought it would be good to explore it in a bit more detail, specially to show the different results when we merge tables using outer joins.

Different types of Outer Joins

There are actually three flavours of Outer Join you could come across, and each work in different ways.  Those three flavours are:

  • Left Outer join
  • Right Outer join
  • Full Outer join

Which… to an Excel person… mean very little.  To confuse matters more, there are also Inner joins, and Anti joins.  (We’ll look at those next week though.)

Sample Data

To illustrate the different join types, we are going to work with the set of data shown below (which you can download here.)

SNAGHTML1ebe0085

So two tables of data, one called Transactions, and one called ChartOfAccounts.

Now, the key piece you want to watch here is that we need the Account-Dept combination to exist in both tables in order to make a perfect join between them.  Think VLOOKUP for a second… if you tried to make a VLOOKUP for account 10045 from the Transactions table against the ChartOfAccounts table, what would you get?  Of course, you’d get #N/A since 10045 doesn’t exist in the ChartOfAccounts table.

In this case we have items in both tables that don’t exist in the other.  (The yellow records in the Transactions table don’t have a match in the ChartOfAccounts table, and the red records in ChartOfAccounts don’t have a match in the Transactions table.)  With these differences we can test how each of the first three join types available to us behave when we try to merge the data in both tables together.

Groundwork

Of course, the first thing we need is a pointer to each table for Power Query to work.  So let’s set that up first.

  • Click in the Transactions table –> New Query –> From Table
  • Go to Home –> Close & Load –> Close & Load To… –> Only Create Connection
  • Click in the COA Table –> New Query –> From Table
  • Go to Home –> Close & Load –> Close & Load To… –> Only Create Connection

I also right clicked each query in the Workbook Queries pane and choose to move them to a group I called Source, just to keep them organized:

image

This will work nicely.  We have a pointer to both tables, but didn’t cause any data duplication by loading a new copy to a worksheet.

Merge Tables using Outer Joins – Left Outer Join

Okay, let’s get started.  We need to join these based on the combination of the Account and Dept number.  Fortunately we can use the trick discussed in this post to do this with creating a concatenated key manually first:

  • Right click the Transactions query and choose Merge
  • Select ChartOfAccounts for the bottom table
  • For the top query (Transactions) select Account, hold down CTRL and select Dept
  • For the bottom query (ChartOfAccounts) select Account, hold down CTRL and select Dept

Your join selections should now look like this:

image

Notice that the Join Kind in the bottom is “Left Outer (all from first, matching from second).  Let’s click OK and see what that means.

When you get into the Power Query editor:

  • Right click the NewColumn column –> Rename –> COA
  • Click the Expand icon on the top right of the COA column
  • Leave all the defaults and click OK

The resulting query should look as follows (barring the colour of course):

image

So this is a “Left Outer Join” – the default choice for Power Query.  It returns all entries in the left table (or top in the case of the power query editor) and returns the matching values it finds based on the lookup column.  Essentially this is the same as VLOOKUP.  It returns a match, except where it can’t find a matching record.  (The main difference between VLOOKUP and Power Query is that if Power Query found multiple matching records, it would return all of them, not just one.)

Again, the key point here is that every value from the Left table is returned, whether there is a match or not in the Right table.  The yellow rows here match the yellow rows in the original Transaction table shown at the beginning of the post.

But… notice also that accounts 10040 and 11000 (the red accounts in the COA table) do not shown in the listing at all.  This is your Left Outer join in action.  It pulls all records from the left table, any matches from the right (or null if no records on the right match).  It never looks at the right side at all to see if records exist there that don’t exist in the left hand table.

Okay, so now we can see what’s happening here, let’s finish it off:

  • Change the name of the query to LeftOuter
  • Go to Home –> Close & Load –> Close & Load To… –> Only Create Connection

Merge Tables using Outer Joins – Right Outer Join

Rather than discuss the difference here, let’s just demo it.  One caveat… it’s important to get a clear understanding that you start with same base table in order to make parallel comparisons!

  • Right click the Transactions query and choose Merge
  • Select ChartOfAccounts for the bottom table
  • For the top query (Transactions) select Account, hold down CTRL and select Dept
  • For the bottom query (ChartOfAccounts) select Account, hold down CTRL and select Dept
  • Change the Join type to “Right Outer (all from second, matching from first)”
  • Click OK

And then we’ll do the same thing we did before so that we can compare the results:

  • Right click the NewColumn column –> Rename –> COA
  • Click the Expand icon on the top right of the COA column
  • Leave all the defaults and click OK

If you followed along correctly, your result should look like this:

image

 

Notice the big difference here?  This time the values from the Right table (ChartOfAccounts) show whether there is a match in the Left (Transactions) table or not.  The red rows (containing 10040 and 11000 from our original table) are now present, where they weren’t in the previous scenario.

But the (yellow) items that were in the Left (Transaction) table which don’t have a match in the Right (ChartOfAccounts) table?  They’re nowhere to be seen!

Where I now know I struggled with this when writing SQL code from scratch is that no one ever explained to me which table was Left and which was Right.  Lacking that knowledge there really isn’t anything to explain what these joins are truly doing.  But now that we can see that the table we start our merge from is the Left table, and the one we are joining to it is the Right table… it starts to make a LOT more sense.  The only kicker we have with Power Query is that the Left table is the top in our merge dialog, and the Right is the bottom.  But you can certainly see why the UI was designed this way (imagine trying to fit it on your screen if it was side by side?)

At any rate, we can now compare and contrast those two joins.  Let’s finish this one off and look at the final join we’ll examine today.

  • Change the name of the query to RightOuter
  • Go to Home –> Close & Load –> Close & Load To… –> Only Create Connection

Merge Tables using Outer Joins – Full Outer Join

By now, you know the drill…

  • Right click the Transactions query and choose Merge
  • Select ChartOfAccounts for the bottom table
  • For the top query (Transactions) select Account, hold down CTRL and select Dept
  • For the bottom query (ChartOfAccounts) select Account, hold down CTRL and select Dept
  • Change the Join type to “Full Outer (all from second, matching from first)”
  • Click OK

And now modify the query:

  • Right click the NewColumn column –> Rename –> COA
  • Click the Expand icon on the top right of the COA column
  • Leave all the defaults and click OK

And what do we have?

image

Aha!  So the Full Outer join makes sure we’ve got all items from both sides.  I can see this being SUPER useful for trying to compare to lists of transactions like in a bank reconciliation.  Transactions that match would get lined up nicely, and any that needed attention would have a bunch of nulls beside them on either side.  Very cool.  (I’ll have to do a post on that some day!)

Wrapping Up

This shows how to merge tables using Outer Joins… If you'd like to learn about the three remaining join types shown below, you can do so at this article:

  • Inner Join
  • Left Anti Join
  • Right Anti Join

Share:

Facebook
Twitter
LinkedIn

36 thoughts on “Merge Tables using Outer Joins in Power Query

  1. Hi Ken,
    this is pretty awesome - could have used it yesterday actually... so thanks a lot for sharing!

    If you just want to see the non-matches, you can use this filter:
    = Table.SelectRows(PrevStep, each [Account] = null or [COA.Account] = null

  2. I kinda knew how it would operate, having done it many times ... but the clarity with which you have explained it should leave no doubt in people's minds.

    Thanks.

  3. Ken,

    Excellent guidance. Thank you.

    I did notice one typo: (Fixed in the article. Thanks for reporting!)

  4. Thanks Ken,

    Very clear and well presented information.

    Minor typo also: (Has been fixed in the article. Thanks for reporting it!)

    Thanks again.

  5. An application of this that I have found useful is to create allocations. For instance if I have global value $100 in my left table. And global is owned 50/50 by entities A and B in my right table. A left join will duplicate the $100 in my left table and bring in the applicable ownership percentages from my right table. Then the product of global and ownership becomes my allocated total.

    The flip side of this is if you don't know that you have a duplicate in your right table your join will double (at least) count numbers that match. Easily enough to handle...if you know it's happening.

  6. Excellent article, I've struggled with brain fog on the concept of joins for a long time so look forward to the next instalment!

  7. Thanks everyone for the comments. It makes me feel a bit better to know I wasn't the only one. (Jimmy... that Venn diagram made my head hurt! LOL!)

  8. Great post. I was lucky enough to have a database professor who explained this much like you have in this post.
    For the outer join section should the join type be something besides "...(all from second, matching from first)?"
    I can't seem to figure out how to display the Join Kind dropdown box in my 2016 version. Any tips or suggestions how/where to search? Thank you.
    p.s. Ordered your "M" book as a little Christmas present to myself. Committed to not opening until 25th so if the answer is in it sorry for the question.

  9. Hey Nolan,

    Thanks for this. Re your questions:

    The Right Outer join is "all from second, matching from first". Both in the UI and in effect. The Left Outer join is "all from first, matching from second."

    Re Excel 2016, have you installed all the latest updates? Personally I'm running version 16.0.6366.2036 and the options are there...

  10. Which version of Excel, Oz? I'm hearing now that I may have a bit more advanced build since I'm on the Office 2016 insider program...

  11. Hi Ken,

    I'm using version 16.0.6001.1073. Unfortunately, the only join option I have is just "only include matching rows". I guess I got the same problem as Oz and Nolan. Already clicked the update button, but the problem still exist. Any tips? Thanks in advance bro

  12. What version of Power Query/Excel are you using? To my understanding this was in general release for all versions now.

    You say you clicked the update button. Did it take you to download the update? The download file does need to be opened in order to install.

  13. This option appears to have disappeared in Excel 2016 (PowerPro O365).

    I used to have to it in Excel 2013 using the PowerView plugin.

    Seems strange to remove such useful functionality.

  14. Actually, it's more likely that your Excel 2016 version doesn't have the latest updates. What version of Excel 2016 are you running (from File --> Account --> About Excel)?

  15. I believe Ben is right about this. I have merge option but i can't specify the type of merge I am doing. I have to go to advance editor
    #"Merged Queries" = Table.NestedJoin(#"Grouped Rows",{"Date"},#"Draw Data",{"Date"},"NewColumn",(((((ENTER JOIN TYPE))))),

    WARNING: It is Case Sensitive
    The syntax for the joins to enter inside enter join type are:
    JoinKind.Inner
    JoinKind.LeftOuter
    JoinKind.RightOuter
    JoinKind.LeftAnti
    JoinKind.RightAnti

  16. Hey Peter,

    What version of the software are you using? If it's Excel 2016 on subscription it is very likely that you are on either the standard release channel or the deferred release channel. If you're on Excel 2016 without subscription, then I'm actually not sure how long it will take to show up.

  17. Hi,

    *Full Outer Join:

    How resolve problem of non matchs? I have to create an additional column ?

  18. I have the same issue, I migrated from Excel 2013 with PowerQuery add-in to Excel 2016 64bit and the Join options disappeared.
    In which version of Excel was this corrected ?
    I currently have Excel 16.0.4266.1001

  19. Wow, Erik, I think that is the very first public Excel 2016 build (or at least very close to.) I'm assuming you did not install the Office 365 version, but rather got a non-subscription (MSI) installer? Based on that, I'm honestly not sure how long it will take. You can try going to File --> Account --> Update and see if anything is there. If you don't have that option, or it doesn't work, then I think it depends on how long it takes for Excel to roll up a service pack to deliver your updates.

    Power Query is one of those areas where the Office 365 subscription really becomes beneficial as the updates are just streamed to you. Without a subscription though... I'm not sure how long it will teak.

  20. Are the joins dependent on the data type you are trying to match? I had the joins working yesterday and lost it today, it brings back all null values which means it didn't find anything matching...

  21. Hi Ken, I have the same issue. After IT upgrade my excel 2013 to Office 2016, i lost the different join types option. My employer has Office 365 subscription. Could you please advise how do i get the different join types options on excel 2016? Thanks!

  22. Hi Lanie,

    It sounds an awful lot like you are still running an old build, as the first Excel 2016 build pre-dated this feature. Try going to File --> Account. On the right side you should see the details of your office. If it's not marked as subscription, you're not actually using a subscription product. If you are, then there shoudl be an Update button there.

  23. @ Hai Li, yes, most likely this is the case. Easy to check, just format the source data columns to be of a consistent type before triggering the merge.

  24. Im having difficulty getting all my rows when using Right Join. I get all my rows when I use Left Join but not Right.

  25. Hey Bob, in the preview window, or upon your final load? I can understand not seeing all the rows during a preview, as each table will have a preview limit, but upon loading the data it should all be there.

  26. Dear Friends. is the any way or guidance how to manage Full Outer Join in case there are some duplicated items available in both table (left table,right table)? i have an issue, if one items repeated 3 time in left table and same item available only 1 time in right table the result will be same: 3 items (in LT) = 3 items (in RT). In other words, its incorrect cause i should get 3 - 1 (other 2 items in right table should be blank). Will appreciate any feedback!
    have an idea to create couples of merge queries. 1st one: to merge exactly all items where qty will be only 1. 2nd one: to group by as per main ID, Name, Qty...

  27. Hi Alex,

    If you want a 1 to many style join, then right click the column you are using to join from one of the tables and choose Remove Duplicates. If you don't, then you'll get a Cartesian Product (many to many join). Hope that helps make some sense of what is happening!

  28. Hi, Ken
    thanks for feedback. actually, i cannot remove duplicates, because i need to compare 2 tables just to see which items (or rows) are missing in Table 2 and which one extra and not available in Table 1. I got support from other forum:
    1. group items
    2. suggested to add index in both tables inside of group by each Table.AddIndexColumn(_, "Index", 1, 1)
    3. expand table columns
    4. merge items + index

  29. I have merged 2 tables using a full outer Join by Address field.
    When I check the join column in the 2nd table, it indicates null ( concluding that no record was matched).
    However, when I highlight the row and view the details in the record selection the Column is populated correctly.Also, when I filter on "null" values from the 2nd join column the record does not appear, also record is populated correctly when exported to a table. I've tried refreshing and re merging, renaming the workbook, but the null value still appear in the Expanded columns
    Can you please suggest on this.

  30. It's almost 2019 and this post is still helping people - in this case me! Very clear and easy to follow steps even with minor discrepancies in the 2010 Power Query ribbon.

  31. Pingback: Merge Tables using Inner and Anti Joins in Power Query - Excelguru

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Latest Posts