I recently received a comment on one of my blog posts asking how to separate values and text, especially when there is no common delimiter such as a space separating them. This is a bit of an interesting one, as there is no obvious function to make this happen.
Background
The scenario the user has here is a list of values with their unit of measure, similar to this:
This issue here is that we don’t really have anything to easily split this up, as there isn’t really a good pattern. Sometimes there are spaces after the values, sometimes not. The letters change, and there is non consistency to how many characters the values represent. So how would you approach this?
You can download the sample workbook here.
My Route
I think that a solution for this type of problem is going to be specific to the data in use. Looking at the sample data, I figured that I can probably bank on all the numbers being at the beginning of the string, and that I probably won’t see something like square meters expressed as m2. Of course, if that assumption wasn’t correct, I’d have to come up with another method.
At any rate, the angle I approached this was to build a custom function to remove the leading numeric values. That should leave me with the text values, which I could then replace in the original string. Let’s take a look.
Removing Numbers
As we recommend in M is for Data Monkey, the way to build a custom function is to start with a regular query that will let us step through each piece you need to do.
So focusing on doing this through the user interface, here’s how I started this solution.
- Create new Power Query –> From Other Sources –> Blank Query
- In the formula bar, I typed in 1.07Kg (no quotes, just that text) and pressed Enter
- I then right clicked the text in the Power Query window, and choose to convert it to a list
Of course, you can’t do a ton with Lists in the user interface, so I converted it to a table:
- List Tools –> Transform –> To Table –> OK
To be fair, I could have started by creating a record or a list from scratch (as we show you how to do in M is for Data Monkey,) but I didn’t really need to here in order to get up and running quickly. Regardless, I’m now sitting in a nice place where I have the entire UI exposed to do what I need (which was my original goal).
At this point, things become pretty easy:
- Right click Column1 –> Replace Values –> Replace 0 with nothing
- Repeat for 1 through 9 and the decimal character
This removed all numbers and decimals, leaving me with just text. But because I know some of the values had spaces in them as well, I should deal with that:
- Right click Column1 –> Transform –> Trim
The final thing I did was to drill into the data point there, as I don’t really want to return a table when I convert this into a function. To do that I needed to:
- Click the fx on the left of the formula bar
- Append the following to the text in the formula bar: [Column1]{0}
Notice that we now have just the data point, not the Column1 header.
Converting the Query to a Function
Now, we’ve got a neat little function that will let me take a data point, sanitize it, and turn it into data point with no leading values. But how can I re-purpose that to use it for every record? The answer is to turn this query into a custom function, as we describe in Chapter 22 of M is for Data Monkey. Here’s how we do it:
- Go to View –> Advanced Editor
- Right before the “let” line, add the following:
(Source) =>
- Go and place two / characters in front of the current Source line in order to comment it out (otherwise it would overwrite the function input)
//Source = “1.07Kg”,
- Click Done
- Rename the query to fxRemoveNumbers
That’s it. We’ve converted it to a function. Now you can go to Home –> Close & Load to save it and it’s ready for use. The interesting part here is that creating the logic is the hard part, converting it to a function is deadly easy.
Separate Values and Text
So now let’s use our new function to separate values and text. Here’s how I did this:
- Select any cell in the table –> create a new query –> From Table
- Go to Add Column –> Add Custom column
- New column name: Measure
- Column formula: fxRemoveNumbers([Quantity])
And we’ve got a nice new column with just the textual values.
Not bad, now we just need to figure out a way to replace the matching text in the Quantity column with nothing… After checking MSDN’s Power Query formula guide, I found a formula called Text.Replace() that seems it should do just that:
- Go to Add Column –> Add Custom column
- New column name: Value
- Column formula: =Text.Replace([Quantity],[Measure],"")
To summarize here, we’re going to look at what is in the Quantity column and replace any instance of the text in the Measure column with the value between the two sets of quotes (i.e. nothing.) The results are shown below:
Now it’s just a simple matter of doing some cleanup:
- Right click the Value column –> Change Type –> Decimal Number
- Right click the Quantity column –> Remove
And there you go. It’s finished. We simply need to go to Home –> Close & Load to commit it, and then refresh it any time we need it.
M is for Data Monkey
The book is now available and is packed with good information that will help you solve this issue as well as many others. Learn more about the book here.
16 thoughts on “Separate Values and Text in Power Query”
Great post! Thanks.
Ken,
Thanks for posting this example, I have a couple of projects at work where I could use a similar type of function. I believe that you could simplify the step where you drill into the data point because you did not need a table by just right clicking on that data point and selecting drill down. I think that would be simpler than appending the text in the formula bar to [Column1]{0}.
By the way, I just bought your book yesterday and I am really looking forward to reading it.
Good point, and hope you enjoy the book!
Pingback: Separate Values and Text in Power Query | MS Excel | Power Pivot | DAX
Another awesome article Ken!
You really have a knack of making seemingly complex issues become simple; almost obvious! It clearly shows how with a bit of creativity, out of the box thinking and using what the application already provides, one can REALLY do amazing things with data.
TX for your blog and please continue to provide these wonderful, clear step-by-step articles. They are most valued in the community!
Cheers!
Is there a way to save these custom functions so that they are available to be used in more than one workbook like you can with macros?
BTW great article and I plan on buying your book very soon.
Excellent article!
Is there a way of this extracting numbers when they can occur anywhere in the string?
Thanks!
Since the workshop all i use now is power query, an amazing tool.
Yes, but it involves using a Power BI Pro subscription in order to share them with your organization.
Great article! I will definitely be using this in my work. I've been using Power Query for a while now and starting to get into more complex scenarios and your book and blog are now my main go-to sources, especially on functions.
Great, Steven. Glad to hear that you're finding both the site and the book useful!
thank you very much.
Thanks, enjoying the book and this article but I need help to get to the next level:
I want to split a column (in Source table) based on values from another table or list (splitter table). The logic would be loop through splitter table until a match (splitter value contained in source table record) is found then split on that match. (I couldn't split on any single delimiter as nothing consistent as in your example).
Hi Simon,
I'm thinking that this might be a good question to post in our forums at https://www.excelguru.ca/forums This would allow you to upload a short sample so we can see what you're tying to accomplish.
Hope this helps!
Pingback: Keep Only Numbers in Power QueryThe Excelguru Blog
Pingback: Separate Values and Text in Power Query - Part 2
Pingback: Keep Only Numbers in Power Query