A couple of weeks ago I was teaching a course on Power Query. I distributed a bunch of sample files, started to walk the attendees through the steps, and a hand went up.
“I get an error in my date column…”
For a second it stumped me… I tested my examples 15 times before pushing them out to the class. Why now?
As it turns out it’s that pesky issue that drives many non-North Americans crazy. I keep my regional settings set to use the MM/DD/YYYY format. I just find it makes life a lot easier for me. What we ran up against here was a user who was using the Canadian standard of DD/MM/YYYY. Yuck.
I promised a way to show how to fix it, and am finally getting around to posting it…
Replicating the Issue
The file I’m working with here is a csv file (saved in a txt format) that uses the DD/MM/YYYY format, as that replicates the issue for me. So I import it and end up with this:
So what’s the problem? Well, if you look at the areas I circled in red, the ChangedType step changed the data type for the TranDate column to Text. That’s no good. So I highlight the TranDate column and change it to Date:
Great. An error.
So this is what I’m talking about. The dates above have been converted to dates, but not the 1st of February, 2nd of February and so on. Instead it’s January 2nd, February 2nd, etc. As soon as it gets to the 13 it gags as there is no 13th month.
This is not an uncommon issue, by any means. If you database exports in one format and your system settings are in another, you’ll run into this. Multinationals that exchange files across borders see this all the time (just look in any forum for the amount of help posts on the topic.)
I REALLY wish that Power Query were smarter about this. I also wish there was an easy way to just tell Power Query which date format your data (not your system) was using, or an easy way to convert it. Anything to let me do something in one click versus what we need to do to fix this. (As an added insult, the old text import wizard DID let you declare this. Go figure!)
The Solution
All right, so how do we fix it? I’m sure there are a variety of ways, and hopefully a new ribbon will come along to make this obsolete, but here is ONE way.
Step 1: Remove the garbage
First I deleted the “Changed Type” step of the query.
Step 2: Split by the / character
Next I selected the TranDate column, went to the Transform Tab (if you don’t have it download the latest Power Query update), chose Split Column, then By Delimiter, then Custom. I put in a / and clicked OK.
As an aside here, I wish Power Query suggested what it thought you wanted to use as the delimiter, without actually clicking OK for you. In this case I think it’s pretty obvious, even though it’s not in the list of defaults. Granted, I might want to override it with something crazy like a 0, which is why I wouldn’t want it to just click OK for me, but it might save me time if it made an intelligent suggestion.
At any rate, I end up with this:
Step 3: Rename the columns
This part is important for the next piece. Rename your columns to reflect which part of the date they belong to. You can do this by right clicking the column header and choosing Rename… (Alternately you can edit the M code in the formula bar manually, but you need to edit the “SplitColumnDelimiter” step first, then edit the ChangedType step to reflect the new column names. Probably easier to just right click and rename them:
Step 4: Putting the date together… correctly
OK, now the tricky part. We need to insert a column and write a Power Query formula. To do this, go to the Insert tab, and click Insert Custom Column. You’ll be presented with a dialog to write your formula. And here’s the formula you need:
=#date([Year],[Month],[Day])
Sadly, there really isn’t a lot of documentation yet that explains this stuff. I believe that the # sign tells Power Query that this is a data type, the “date” portion determines what type it is, and then we feed it the components that make it up (for each row). I also changed the name at the top of the dialog to “Date” and clicked OK:
Step 5: Force the data type
I’m not sure why this is, but even after declaring this as a date, it formats the column visually as a date, but the data type is left blank. For whatever reason, I had to select the column and force the data type to a date in order to make it recognize it properly as such.
Step 6: Cleanup
So this part is optional, but you can now delete the Day, Month and Year columns if you don’t need them any more. Unlike standard Excel, this move won’t leave you with a bunch of #REF! errors down the Date column. 🙂
Implications/Modification
The implications of this are somewhat interesting to me. Let’s assume that your data goes the other way. I send you a file in MM/DD/YYYY format, and you use DD/MM/YYYY format. You should be able to follow all of the above steps, with the only difference being which column you identify as which when you rename them. (In this case it would go Month Day Year after splitting them.)
I’d love to get some feedback from others on this, but I believe the formula in the custom column should work for you without modification as long as the columns are named correctly.
I’ve attached two files to this post for testing purposes. Right click the links below and SaveAs to get copies to play with:
Try them both, you should be able to generate a correct date for your format from either using these steps.
16 thoughts on “Importing Dates in Power Query”
Ummm... Power Query does have a way of handling this easily: http://office.microsoft.com/en-gb/excel-help/internationalization-HA104061425.aspx
Hi Chris,
Thanks for the comment, but I'm not sure I buy this as "easy". I gave it a go, but honestly, I ended up struggling to find what culture I needed to use if mine wasn't in line with the data. So basically I'm sent to guessing in a little dialog.
While I can't speak for all Excel users, I've NEVER run into the term "culture" before using Power Query, and I have to assume this is more of a database side term. If it weren't for you driving me to that page, I would have had no reason to explore it. To me this is not discoverable for Excel pros.
In the existing Excel text file Import Wizard, we get a very clear option to choose the format, listed in terms we understand: M/D/YY and such. That is really what we need imo.
Pingback: Importing Dates in Power Query | MS Excel | Power Pivot | DAX
Fair point - from a database person's point of view this makes perfect sense, but I can also see why from an Excel user's point of view it doesn't (and Power Query is aiming at the latter group, after all). Maybe someone from the Power Query team is reading this...?
Anyway, if you'll excuse the advert, I do have a few sections on this in my Power Query book which will (hopefully) be published later this week. The way Power Query handles things also covers other internationalisation issues such as the use of commas as decimal separators and full stops (or periods or whatever you want to call them) as thousand separators in European countries, which is the opposite of how they are used in the UK and North America.
Advert excused, Chris. I'm looking forward to reading it. 😉
I'm a little late to the party; however, I tried my hand at using a function to accomplish this, incorporating some of what Chris mentioned. "TransactionDate", "Start Date", and "End Date" all come from a csv where the format is DD/MM/YYYY HH:MM:SS AM/PM pasted into local table.
let
SalesData = Excel.CurrentWorkbook()
{[Name="tblSalesData"]}[Data]
ColumnSelect = Table.SelectColumns(SalesData,
{
"TransactionDate",
"Start Date",
"End Date"
}),
EUtoUSFunction = (x) => Date.From(DateTime.FromText(Text.From(x, "en-US"), "en-GB"), "en-US"),
ChangeTypeToDate = Table.TransformColumns(ColumnSelect,
{
{"TransactionDate", EUtoUSFunction},
{"Start Date", EUtoUSFunction},
{"End Date", EUtoUSFunction}
})
in
ChangeTypeToDate
Hi Ken,
i am located in mumbai, India and when i import some textual data containing dates, the date column shows "Any". When i convert it to Date/Time/TimeZone, it shows Error in some cells. The dates in some cells shows up in US format ("mm/dd/yy") with +5:30 (Time zone for Mumbai) appended to end.
e.g.
2/10/10 12:00:00 AM +05:30
11/6/11 12:00:00 AM +05:30
Error
Error
.....
.....
I tried your above suggestion, but the Year column still shows 2 digit year and hence the final date is still not recognized. How do i resolve this?
forgot to add that, the joined date column shows up as :
6/12/0010
7/10/0010
1/22/0011
3/19/0011
4/16/0011
the year has gone kaput!
also this is still in US format and not my local format (dd/mm/yyyy).
any help would be most appreciated.
Hi Ken,
I found out the solution myself. now you don't need to split the column into separate columns using '/' as delimiter and then merge them using #Date function. If you right-click on the column-name of this textual date column, you should get a right-click context menu. the last option in that menu gives you a 'Custom Format' option. Once you click on that, you get 2 dropdowns viz., format to choose & System Locale. I chose 'Date' as the format and System Locale as 'English - United States'. That made the textual dates formatted as Date format.
Hope this helps!
Ha! You beat me to it by a few minutes. I JUST published a blog post with the new method: https://excelguru.ca/2015/07/08/fix-date-errors/
Cheers,
Ken
Sorry, you still win!
when you right-click, its Change Type > Using Locale....i must have dozed off while typing the earlier message...hehe 😉
Hi guys!
I have solved it using following steps, that looks quicker than:
Split
Rename
Realocate column as your date location
Transform > merge columns > /
Change type of data by location
\0/\0/\0/
I could not use =#date(Day, Month, Year) because it has returned an error.
Ken, might it be possible the data type in my CSV file interfere Power Query recognizing dates?
No, that shouldn't be an issue. Regardless, I'd focus on using the Change Type --> Using Locale to deal with this. Don't follow the steps above, as they are waaaaaayyyy too difficult and out of date.
I'm a bit late to the game, but how about a Text to column?
Your header and date cells don't have a space so use space as a delimiter, then when doing the text to column, you can choose the format, being DMY or MDY to apply to the result, and you end up with correct dates.
I do this all the time as I have to work with reports from all over the world and merge them into a single report using Power Query.
Hi Laurent,
That will absolutely work if you're able to use Excel. But if you are connecting to a text file via Power BI desktop, that's going to cause you some issues.
Pingback: Fix date errors when importing data in Power Query