This post on changing a single file query to a From Folder query is a guest article by John MacDougall, a Microsoft Excel MVP from Canada and owner of HowToExcel.org.
Recently, I had to import a large-ish CSV file at around 1.5 million rows into Excel using Power Query.
I used a From Text/CSV query then proceeded to transform the data and merge a couple other tables with it. I did this all in the one query then loaded it into Power Pivot to analyze. It worked great and Power Query/Pivot handled it easily.
Then I had to look at a longer time period.
I tried to export a new set of data with a longer timeline. The system I was exporting from had trouble before with a much smaller time period, and sure enough, it wasn't working with the new larger time period.
This meant I had to export a couple smaller files and glue them together. I needed a From Folder query instead of the single CSV query.
My original query had a lot of transformation steps in it after importing, so I didn't want to just throw it away and start over.
In this post, we'll look at how we can convert a From Text/CSV query into a From Folder query.
You can download the before and after workbooks along with the CSV files used in this post here.
The Original Query
Ok, this isn't my original query or data and the transformations are spurious and just for example.
Here's a From Text/CSV query. Excel will automatically create the first three steps in the query which define the source of the data, promote the first row of data to headers and then changes the data types.
The remaining steps are the transformations I've added to the query. These are the transformations I want to preserve and don't want to have to go through creating again by starting from scratch with a new From Folder query.
I've loaded this query into a table in the workbook and named it ReportData.
In this example, there aren't a lot of added steps. But you can imagine with multiple merges, added custom columns and other transformations it could be a pain to reproduce within a new query.
Create a New From Folder Query
We are going to need to create a new From Folder query.
Go to the Data tab and press the Get Data button then choose From File and then From Folder.
In the Folder dialog box we can either copy and paste the path of the folder which contains the set of reports or browse to it.
The next window will show a preview of the files in the folder and we can press the Combine button and then choose the Combine & Load To option.
The Combine Files window will appear and we can go with the default options and press the Ok button.
Now select Only Create Connection in the Import Data dialog box.
Power Query creates a new query named after the folder. In our example here, mine is called Reports.
Edit the M Code of our Previous Query
Now we are going to edit the M code of our original query which was called ReportData. In the Queries & Connections window pane, right click on the ReportData query and select Edit to open up the query editor.
We need to make it reference our newly created folder query as its source.
Go to the Home tab and press the Advanced Editor button to open up the M code editor.
We now need to edit a couple of lines of the code highlighted in yellow above. These are the lines of code from the original single file CSV import.
- The first line of code is the source of the data. It points to a single CSV file with a folder path and file name. We need to delete this part and replace it with a reference to our new folder query. The new source is now going to be the Reports folder query so we can replace this with #"Reports".
- The second line is another step automatically added to our original query which promotes the first row of data to column headers. This step now happens in our new folder query, so we don't need this and can delete it.
- The third line changes data types and is not needed we can also delete this step.
- The fourth line is the first step in our data transformation, but it references the previous step which we deleted. We need to update it to reference the Source step.
Finishing and Loading the Query
The revised M code should now look like this:
We have updated the source to reference our new folder query. We have removed the promote headers and change data type steps. The remaining step's references have been updated to reference the previous step.
Now we can press the Done button and then Close & Load the query. We should see our table update to include all the data from the files in our folder.
Conclusions
It's not a terribly complicated process to change the source of a query and update it from a single file import to a folder import. But why not avoid it in the first place?
Making mistakes allows for the opportunity to learn new thing and in this instance, I learned two things about importing text, CSV or Excel files into Power Query:
- Import a clean untouched version of the data as a connection only, then reference it in a new query for any transformation steps.
- Use a From Folder query instead of a From Text/CSV or From Workbook query. This way you can easily add more files to the folder for import at a later time.
5 thoughts on “Changing a Single File Query to a From Folder Query”
Thank you very much for this Topic, it is very usefull for me.
I want to do the same but in my case I want to change the source from a Folder to a SharePoint Folder. Do you think that will be posible following the same procedure?
Thank you
Regards
You'd need to use the "From SharePoint Folder" functionality, but yes, it's pretty close!
Brother your write-up really saved me a grand melt-down, I don't remember how many number of hours I put in to design an excel Power Pivot dashboard that takes feed from 10 different queries. One query's length I could not determine well in advance and designed the model on the basis of a single file which now comes in so many different files with a combined size in excess of 6 million rows
I went nut finding a solution to this problem which almost seemed like i'll have to either re-do all the formulas and query modeling starting afresh for this data set or my model becomes irrelevant for larger periods but thankfully I bumped into this article and with a very simple M code modification I was successfully able to make the transition happen and I now really look forward to a cracking Monday! Thanks man, Well done and Keep rocking
Thank you very much, I am so happy to find this solution!
Thanks for the help!
I was able to adapt to switching from single Excel file with multiple aggregated tables to doing the same process against a folder.