Clean WhiteSpace in PowerQuery

The other day as I was working through a model, I once again tripped upon the fact that Power Query’s Text.Trim function doesn’t clean whitespace inside the text string, only at the ends.  For those who are used to Excel’s TRIM function, this is a frustrating inconsistency.

Just to circle on it, here’s the difference:

Source Function Result
Excel =TRIM(“  trim   me  “) “trim me”
Power Query =Text.Trim(“  trim   me  “) “trim   me“

Typically, I’ve just gone through the cycle of replacing a double space with a single space a few times on the same column to deal with this issue.  The issue, of course, is that you need to do this twice if there are 4 spaces, but add more spaces, and you have to do this more times.  Doesn’t seem like a really robust solution.

At any rate, this time I emailed one of my friends on the Power Query team and suggested that they should implement a function to make this a bit easier.

My Suggestion for a Clean Whitespace Function

The gist of my suggestion was to create a new function that would not only trim the whitespace internally, but would also allow you to specify which character you want to clear out.  This way it would work nicely to clean whitespace in the shape of spaces (the usual culprit in my world), but would also allow you to substitute in other characters if needed.  (Maybe you need to replace all instances of repeating 0’s with a single 0.)

It got referred to another friend on the team, (who wishes to remain nameless,) and he responded with some pretty cool code.  I’ve taken that code, broken it down and modified it a bit, and the end result is a slightly different version that can work the same as Excel’s TRIM() function, but adds an optional parameter to make it even more robust.  For lack of a better name, I’m going to call it “PowerTrim”.  (Just trying to do my part to keep the Power in Power Query!) 😉

Here’s the function:

(text as text, optional char_to_trim as text) =>
let
char = if char_to_trim = null then " " else char_to_trim,
split = Text.Split(text, char),
removeblanks = List.Select(split, each _ <> ""),
result=Text.Combine(removeblanks, char)
in
result

And to implement it, you’d take the following steps:

  • Copy the code above
  • Create a new query –> From Other Sources –> Blank Query
  • Change the query name to PowerTrim
  • Go into the Advanced Editor
  • Select all the text and replace it with the code above –> Done

Like this:

image

How it Works

We’d call this from a custom column, feeding in a column of text, and specifying the character (or even string of characters) we’d like to trim.  The function then works through the following process:

  • It checks to see if the char_to_trim was provided, and uses a space if not
  • It splits the text by that character, resulting in a list:

image

(This list shows the word “bookkeeper” split by “e”)

It then:

  • Filters out any blank rows
  • Combines the remaining items using the original character to split by

(The original version was actually all rolled up in one line, but I find it easier to debug, step through, examine and play with when it’s separated.)

Demo

Here’s some examples of the function in action. I started with a raw table from Excel.  (Create a new query –> From Table)

image

And added a Custom column by going to Add Column –> Add Custom Column

  • Name:  Trim_null
  • Formula:  =PowerTrim([Text])

image

Notice that in the first row it trimmed the leading, trailing and internal spaces.  Just like Excel!  (Remember that if you used Power Query’s default Text.Trim() function, you would return “trim   me”, not “trim me”.)

Now, let’s add another and try with an alternate character… like 0.  Again, we go to Add Column –> Add Custom Column:

  • Name:  Trim_0
  • Formula:  =PowerTrim([Text],”0”)

image

In this case the extraneous zeroes are trimmed out of row 3, leaving only a single one.  Cool stuff.  Now what about the “e”. Let’s see how that one goes.

Once more to Add Column –> Add Custom Column:

  • Name:  Trim_0
  • Formula:  =PowerTrim([Text],”e”)

image

The first time I looked at this, I thought there was an issue with the function.  But then I remembered in this case we are removing all leading and trailing e’s, as well as replacing any duplicate e’s with a single e.  You can see that this is indeed what happened in both rows 2 and 4.

Final Thoughts

I wish there was a way to get this to easily role into the Text functions category, so that I could call it something like Text.PowerTrim() or even replace the Text.Trim() function with my own.  Unfortunately a query name can’t contain the period character, which kind of sucks.  I guess it’s to to protect you from accidentally overwriting a function, but I’d like the ability to do it intentionally.

Share:

Facebook
Twitter
LinkedIn

34 thoughts on “Clean WhiteSpace in PowerQuery

  1. Pingback: Transform table column using own function in Power Query – Ivan Bond's blog

  2. useful function. Thanks! Do you know if you can make the replacement inside the same column, without adding a new one.

  3. Unfortunately there isn't a one click transformation to do this on an existing column, no. Would be nice to have though, so maybe submit it as an idea at http://excel.uservoice.com. Until we see it in the product though, the custom function is going to be the best route.

  4. Ken...
    I just thought I'd drop a note of appreciation for your blog. I end up here a lot when searching to solutions for problems in power query and find the information you post invaluable. This function was fantastic for cleaning up a text file that had irregular columns.
    Cheers,
    Mike

  5. Evening Ken. I'd just lifted the above and was about to deploy when I stumbled across the following by accident
    If you click on a column and hit the filter arrow you will an option, 5th one down entitled 'Remove Empty'
    I'm in Power BI desktop and can't say I've ever seen this option before
    Presumably it would get rid of rows where there is all white space, i.e. empty?
    Or it could be something entirely different?

  6. Anthony, I always thought that was truly empty, as in it didn't even say null or have any spaces or other characters. But could be wrong there too... I've never actually used that one.

  7. There are at least two types of spaces - ordinary and non-breaking space.
    For example "asdf zxcv" here are three spaces between asdf and zxcv. I.e. asdf[space][nbsp][space]zxcv. Not sure your website's parser displays it properly. So if you apply your function to this string, nothing changes

    This (a bit modified) function fixes this problem
    ====
    (text as text, optional char_to_trim as text) =>
    let
    char = if char_to_trim = null then " " else char_to_trim,
    nbsp = if char=" " then Character.FromNumber(160) else "",
    split = Text.SplitAny(text, char & nbsp),
    removeblanks = List.Select(split, each _ ""),
    result=Text.Combine(removeblanks, char)
    in
    result
    ====
    You can also add some more spaces (or not spaces) to nbsp variable.

    Read more: Whitespace character - https://en.wikipedia.org/wiki/Whitespace_character

  8. Thank you! This is exactly what I needed to clean up some merged columns. It has saved me a ton of time

  9. Ken,
    I was just wondering if there is a way to make this (or any) custom function always be available rather than add it every time I need it. I tried placing the function into the hidden personal.xlsb file, but you have to unhide the workbook to see the function.

    Thanks for your blog and site. This information is helping me cut all kinds of time at work.

    Mt

  10. Very helpful Ken - thanks.

    However, I've run into a snag when dealing with nulls in my data column.
    I've attempted to modify your function to check for nulls, but I'm new to this and can't get it to work.

    Are you able to NULL-proof it? Thanks, PJ

  11. Great article!!! Really helped. It's a shame the custom function has to be added to every workbook separately. This really should be on Microsoft's radar to add to the next release of PQ.

  12. Dear Sir,

    I have data as "D3Q20D #BGM" , i used your function but in the end what i got was "D3Q20D #BGM". there is still one space between the 2 words. i would like to join them together without any space. how to do it?

  13. Hey Rishi, that's even easier. Just right click the column --> Replace Values --> type a space in the Values to Replace part and click OK. That will remove all the spaces for you.

  14. Pingback: Power Query – A function to remove spaces within Text values – Power BI Business Analytics Solutions

  15. Thank you so flippin' much! Yes, Power Query needs to get on this level. Unfortunately, as of the November 2019 Power BI update, their regular Text.Trim function is still lacking.

  16. Unfortunately, due to the need to support query folding, and since SQL only has an LTRIM() and RTRIM() function, I don't see this changing...

  17. Thanks Ken. This has been super helpful. One would expect Trim in Power Query to work the same as Trim in excel but unfortunately not. Hopefully, they'll make the update soon.

  18. What's wrong with importing the data then right click on the column an choose - Transform - Trim?

  19. Hi Gordon,

    The formula that I list above is the formula that is created when you right click and choose Transform -> Trim. The contents of the article explain why this is an issue.

  20. I have made additional changes to fxPowerTrim and thought I'd share. I have a situation where a client is providing an Excel file that my users need to import, but the client has been inconsistent with linefeed characters and spaces in the column headings. I wanted a function that I can call right after #"Promoted Headers" that would scrub the column headings, then all the steps following (like column reorder) would not fail if the client were to inject spaces/linefeeds that are hard for a regular user to see and troubleshoot.

    This function gave me inspiration on how to make it happen (thank you Ken). See down below for how I applied it across the column headings.
    ///////////////////////////////////////
    // Function to remove special characters and multiple spaces from passed text.
    // Based on a script from Ken Puls
    // https://excelguru.ca/2015/10/08/clean-whitespace-in-powerquery/
    // Includes a modified handling of nbsp by Sergey Losev
    // Ben Sacherich 2/9/2019: added replacement of lf, cr, and tab. I replace these with char instead of "".

    (text as text, optional char_to_trim as text) =>
    let
    char = if char_to_trim = null then " " else char_to_trim,
    no_tab = Text.Replace(text, "#(tab)", char),
    no_cr = Text.Replace(no_tab , "#(cr)", char),
    no_lf = Text.Replace(no_cr, "#(lf)", char),
    no_nbsp = Text.Replace(no_lf , Character.FromNumber(160), char),
    split = Text.Split(no_nbsp, char),
    removeblanks = List.Select(split, each _ ""),
    result=Text.Combine(removeblanks, char)
    in
    result
    ///////////////////////////////////////

    To apply a function across all column headings use the line below. In my limited testing this adds lag to the script refresh. For me it's better than a phone call explaining to the user to look for linefeeds in the headings (and explaining what a linefeed is).

    ///////////////////////////////////////
    // If you are unsure if the headers will be consistent with spaces and linefeeds, after Promoted Headers it is a
    // good idea to use fxPowerTrim across all column headings. It does add some lag to the refresh so you may not want to add it until you deploy.
    PowerTrim_Headers = Table.TransformColumnNames(#"Promoted Headers", (columnName as text) as text => fxPowerTrim(columnName)),
    ///////////////////////////////////////

  21. Fantastic!! I don't know what I would have done, if I hadn't found that code. Thank you ... Power-Ken.

  22. Thanks very much, I wanted to remove all " " spaces within the text so I deleted
    "char" from your code line:
    result=Text.Combine(removeblanks, char)
    to get:
    result=Text.Combine(removeblanks)

    Which worked splendidly!

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Latest Posts