I got sent this today from a friend. He was a bit frustrated, as he got a message from Power Query that read “Formula.Firewall: Query 'QueryName' (step 'StepName') references other queries or steps and so may not directly access a data source. Please rebuild this data combination.”
What Does This Mean?
This message is a bit confusing at first. For anyone who has worked with Power Query for a bit, you know that it’s perfectly acceptable to merge two queries together. So why when you do some merges does Power Query throw this error?
The answer is that you cannot combine an external data source with another query.
Looking At Some Code
Let’s have a quick look at the code that my friend sent:
Notice the issue here? The Merge step near the end references a query called DimShipper. No issue there. But it tries to merge that to an external data source which is called in the first line.
This is actually a bit irritating. But let’s break this down a bit further. Looking at the first line, it is made up as follows:
Source=Excel.Workbook(File.Contents(Filename())),
The Filename() portion is calling a function to return the desired filename from a workbook table, (based somewhat on this approach.) We already know that works, but we also know that this is definitely pulling data from an external workbook (even it the file path is for this current workbook!) And to be fair, this would be the same if we were pulling from a web page, database or whatever. As long as it’s an external source being combined with another query, we’re going to get smacked.
So it’s suddenly starting to become a bit clearer what Power Query is complaining about (even if we think it’s frustrating that it does complain about it!)
Let’s “rebuild this data combination”
Right. Who cares why, we care how to deal with this. No problem. It’s actually fairly straightforward. Here’s how I dealt with it:
- Right click the Query in Excel’s Query window
- Choose Edit
I’m now in the Power Query window. On the left side, I found the Queries pane and clicked the arrow to expand it:
Duplicate the Existing Query
The query in question here is “Purchase”…
- Right click “Purchase” and choose Duplicate
- Immediately rename the query to PurchaseList
- Go to View –> Advanced Editor
- Selected everything from the comma on the second line down to the last row of the query:
- Press Delete
- Change the final line of “Merge” to “Purchase_Sheet”
Perfect… so my new PurchaseList query looks like this:
- Click Done
This query will now load. Even though it is pointing to an external data source, it’s not being combined with anything else. So it’s essentially just a data stage.
Modify the Existing Query
Now we need to go back and modify our existing query. So in that left pane we’ll again select the Purchase query.
- Go to View –> Advanced Editor
- Select the first two lines and delete them
- Put a new line after the let statement that reads as follows
Source = PurchaseList,
NOTE: Don’t forget that comma!
And what we end up with is as follows:
EDIT: The "Removed Columns" line should reference Source, not Purchase_Sheet as follows:
#"Removed Columns" = Table.RemoveColumns(Source, ...
So What’s The Difference?
All we’ve really done is strip the “external” data source and put it in its own query. Yet that is enough for Power Query to be happy. The new Purchase query above now has two references that it is comfortable with, and it works. (And that’s probably the main thing.)
Designing To Avoid This Issue
I make it a practice to land all of my data sources into specific “Staging Tables”, which are set to load as connections only. From there I build my “finalization” tables before pushing them into my Data Model (or worksheet). You can visualize it like this:
The key takeaways here are:
- I always go from data source into a Staging Query (and do a bit of reshaping here)
- My staging queries are always set to load to “Connection Only” so that the aren’t executed until called upon
- All combining of data from disparate sources is done in queries that reference other queries, not external data sources
I’ve found this approach works quite well, and always avoids the “rebuild this data combination” error.
132 thoughts on “Power Query Errors: Please Rebuild This Data Combination”
That's super Ken.
I have a scenario where I can only connect to source data when I am on-site. I extract basic Customer Data (Code & Name) and load it to a worksheet. This is enhanced with additional classifications that are not available in the basic app. The Customer table loaded to the model is a merge of the worksheet that came from the external source and the Additions table.
Of course I could not load the Customers to the model without being on-site. Creating a "Staging Query" from the worksheet that was loaded with the external data did the trick.
Many thanks.
Thanks for the explanation - was not clear how I needed to solve that problem from the error description.
One question - in your other posts you mention query folding - to which I think you mean that powerquery is building the sql command and submitting it back to the data source to actually run the query. Seems ideal.
So I am wondering if your approach of using connection only staging queries still offers the benefits of query folding, or is the query now run by power query rather than the data source? Hope that makes sense
Hi Charlie,
Your understanding of Query Folding is correct, yes. Power Query rolls up its own SQL statement and sends it to the database.
With regards to the staging query, they will still offer the benefits of Query Folding, as they are pulling from the database. Any queries that refer to the staging queries, however, will not. So the majority of your filtering should be done in the staging query, before you get into serious reshaping at the next stage.
Thanks and btw - is there a way to enable email notification when someone replies to my post - other blogs have that feature and it makes it easier rather than having to frequently check the post for replies
Hi Charlie,
Not yet. Great feedback though, and I'll see if I can add that to the blog feature set when I have a bit of time. 🙂
let
FullPath=RawDataFolderPath{0}[RawDataFolderPath]&"\STEP.xls",
Source = Excel.Workbook(File.Contents(FullPath))
in
Source
I get this same error on the query above. The RawDataFolderPath query pulls a table that includes the text of the file path (excluding the file name itself) and then I append the name of the file. The first line correctly creates the full path as text, but throws an error when I try to actually access the file at that path.
Any suggestions? The RawDataFolderPath query is essentially a "staging query" as described in this post, right?
Thanks!
Hi Zachary,
If you were to modify your FullPath assignment so that the RawDataFolderPath was treated as a function, then this would work. Power Query doesn't like two separate tables being fed to the same query.
Have a read through this article: https://excelguru.ca/2014/11/26/building-a-parameter-table-for-power-query/ It should help you accomplish exactly what you're trying to do without running in to the firewall issue. 🙂
Hi Ken,
thanks for the tip - helped me solved an urgent issue! Really weird problem though...
Thanks, Peter
Ken, I am trying to utilize a parameter table in my spreadsheet to pass variables to an SQL statement that is pulling from Teradata. The Teradata query works great. The Parameter table works. But when I put them together by using ..."&Text.From(fnGetParameter("Period")&"... I get the rebuild this data combination error.
Any suggestions on how to get around this?
Thanks in advance.
Two options here.
You can try declaring a variable at the top of your query to load the value from fnGetParameter like this:
myPeriod=fnGetParameter("Period"),
Then using that variable in place of the function call in the middle of your document like this:
..."&Text.From(myPeriod&"...
That may help. If not, you may have to break the queries down and load them into staging queries first. Just be warned here... as soon as your first line of non-ui M code enters the equation, query folding stops. That's a big downside if you're filtering tables.
Hi Ken,
I have an interesting issue whereby I attempt to run two queries sequentially; the first a query generates a data table in excel that contains formulas within additional columns, and the second query takes this table and performs a table.Group operation. Attempting the second query throws this error...
In de-bugging I have striped the second query down so that it only replicates the table but it didn't work!
Regrettably the excel formulas cannot be replicated efficiently in the first query step (running total on 35,000 + rows) as per your blog article 'Create Running Totals in Power Query'.
So do you know if it is ever possible to reference excel tables that originate themselves from query tables?
Hey Joseph,
Yeah... that's problematic, for sure. Honestly, the way I've got around this in the past is actually to use a bit of VBA to refresh query 1, copy the results to another table (disconnected from PQ), then pull that table back into PQ to do further operations. It's a bit of a pain, but it allows the job to be done.
Cheers Ken, I've done that and it works OK now. I've actually never been able to reference a connection only table, it always throws this error... with or without an external data source. Is this still a bug?
That's odd... you should be able to reference a connection only query from another query without any issue.
Ah yes sorry I just tried with a separate connection only query and it does work...but it still gives the error when dealing with more complex query tables and merge operations, the query that throws the error, however, couldn't be simpler...
let
Source = RANGE_CORRECTION_APPEND,
#"Table Join" = Table.Join(Source,{"Index"},MIP_TEST_DATA_APPEND, "Index"})
in
#"Table Join"
Where RANGE_CORRECTION_APPEND and MIP_TEST_DATA_APPEND are connection only queries. Referencing either table without the Table.Join operation works fine. The particular error is slightly different and makes zero sense to me!...
"Query 'MIP_TEST_DATA_APPEND' (step 'Added Index') used 'MultipleUnclassified/Trusted' data when last evaluated, but now is attempting to use 'MultipleUnclassified/Trusted' data."
Sorry I got my errors confused the above post was intended for the ''MultipleUnclassified/Trusted error'' blog post discussion. I'll re-post it there.
Many thanks Ken,
I was stuck with this issue and your article save my day 🙂
I have found today that you can also replicate this issue if you are referencing `PurchaseList` using `Expression.Evaluate("PurchaseList",#shared)`.
http://stackoverflow.com/questions/35980096/call-a-table-using-a-dynamic-identifier
This has really killed some ideas I've been having about referencing queries dynamically based on conditionals (ie having staging use fixtures (seed data) in development and then SQL Server data in production).
Sad face.
What does "PurchaseList" do? If it connects directly to a data source, then that doesn't surprise me as you are essentially loading the M code from somewhere and running it. If PurchaseList just transforms data that is loaded from another source (but doesn't connect to the data source) then I'd be surprised you're getting this issue.
Thank you for your article.
I encounter the similar error in the following scenario.
Query1 imports data from a csv file (Source = Csv.Document(File.Contents("C:\...) ).
Query2 first references to Query1 and then gets the value of a named range from the CurrentWorkbook.
This produces the same error (...references other queries or steps, so it may not directly access a data source. Please rebuild this data combination.).
Note: if I do all steps in Query1 it works fine. However I can't because I have intermediate queries that do some manipulations and need to be separate (merge and append).
Thank you for any assistance
Hi there,
So the issue is not the first query, it's the second. Make Query2 pull in from the named range then load to connection only. Then create a new query that references BOTH Query 1 and Query 2. That will allow you to avoid the error.
Hi,
Thank you for your fast reply, and apologies for my late reaction - it's only now I'm back on this.
I'm trying to figure it out, but I don't see how to reference to two queries.
Note that I'm new with PQ.
It's obvious how to reference to one query (Source = ...), but how can a query reference to an additional query (Source2 = ...)?
Thank you.
I think I'm going to point you to our help forum for this at http://www.excelguru.ca/forums. We can help you out with this, but I think it's going to take a bit of back & forth. If you can post in the Power Query forum, we'll get you sorted. 🙂
Ken,
Very much appreciated.
I've posted the in the forum.
Many thanks.
Just stumbled across here and recognized that there are some new features out meanwhile that make this task a bit easier:
1) If you can accept the security aspects coming with it, you can simply enable "fast combine" and then wouldn't need to rebuild anything. ("Check: "Ignore the Privacy Levels...." under Settings -> Options -> Privacy)
2) Splitting queries is made easier than never before: In the query editor check your step "Removed Column" -> rightclick your mouse and choose "Extract Previous". This will separate the query automatically as you need it.
Hi Imke,
Re #1, I don't believe the "Fast Combine" setting exists any more. "Fast Data Load" does, but it does something different now. (Doesn't load to a temp memory space first.) You can still turn off privacy, but that's now done under Privacy --> Ignore (which I think is what you're saying.)
Re #2, absolutely true. 🙂
Hi Imke/Ken,
Firstly, thanks Ken - have just re-read this post and finally get it in relation to why you recommend the staging tables.
I've scooted past the issue before in PowerBI Desktop using Imke's method ("Check: "Ignore the Privacy Levels...." under Settings -> Options -> Privacy), but now I am trying to schedule a refresh on PowerBI.com using the personal gateway and am getting a failure:
Data source error: [Unable to combine data] Section1/Employees/AutoRemovedColumns1 references other queries or steps, so it may not directly access a data source. Please rebuild this data combination.
Am I right in thinking a) this is essentially the same issue that you've outlined in your post and b) that even though I've supressed the warnings in PowerBI Desktop - these can't be supressed on PowerBI.com - so I will need to follow your staging table method to get the refresh working?
very helpful post. thanks!
managed to solve my problem following the steps outlined in your tutorial 🙂
I couldn't use the Purchase_Sheet reference though (in last example). Had to put in the source name instead. I wasn't removing columns there though - I was defining the column type. Don't know if that makes a huge difference but I found that strange.
Anyway, my query works so I'm happy.
Hey Brian,
I'm going to say "I suspect so". I haven't actually tested this myself with PowerBI.com yet. I'd actually be curious to hear your results!
Thank you so much for this! I don't know how I would have solved this problem without this post. Using this in combination with the new parameters feature of power bi desktop made my day.
Thank you, Thank you, You saved the rest of the hair that I did not pull out.
LOL! Glad to be of service Mark!
Hey Ken - you should change your formula fnGetParameter in your book - Datamonkey too ( for the next print 😉 ), because the error pops up too, when you use the paramtable directly in the Function fnGetParameter. (i used it to filter a calendarweek in a query, which comes after the staging querys) Now i referenced it to a staging-query of the parameter -table and all works perfectly.
(ParameterName as text) =>
let
ParamSource = Parameter,
ParamRow = Table.SelectRows(ParamSource, each ([ParaName] = ParameterName)),
Value=
if Table.IsEmpty(ParamRow)=true
then null
else Record.Field(ParamRow{0},"Value")
in
Value
Bye Bye - Sven
Cheers Ken, solved my problem straight away.. Why I can't be trusted to set these permissions myself I don't know..
Thanks Sven. Yes, I've ended up playing with that a bit to try and figure out the best way to navigate the formula firewall with the parameter function. Such a pain!
My load queries are not stopping at the staging queries they point to, but rather kicking off all related queries. Any help on what may be causing this?
Hey Josh,
Are the staging queries set up as "Connection Only" queries, or do they load to a destination (worksheet or data model)? The other thing to check would be if those staging queries are referencing other queries. I assume you're just initiating the refresh on a single Load query?
Hi,
I'm confused as to why I get the error message in this context. I think I get the concepts here but any advice gratefully accepted
____________________________________________________________
Formula.Firewall: Query 'Data Filter' (step 'Filtered Rows') references other queries or steps, so it may not directly access a data source. Please rebuild this data combination.
____________________________________________________________
let
SolutionYear = fnGetParameter("Selected Year"),
SolutionMonth = fnGetParameter("Selected Month"),
Source = Data,
#"Filtered Rows" = Table.SelectRows(Source, each ([Year] = SolutionYear) and ([Month] = SolutionMonth))
in
#"Filtered Rows"
The Parameter model is working using other parameters like filepaths...
The Source 'Data' is a merge of 4 different data sources (csv files)
Hi Ken,
I have been using your blogs and your "M is for (Data) Monkey" book for my data projects - thanks for sharing your knowledge and expertise. I have noticed that your writings are very easy to understand/follow and implement compared to most other authors. Keep up the great work!
I am trying to write a function to get the current quarter into a PowerQuery step instead of too many table merging. Here is my first attempt to get the current quarter (based on today's date) from the enterprise date database using a function:
let fCurQtr = ()=>
SELECT QTR_NAME FROM [SHARED].[ENTERPRISE_DATE]
WHERE FULL_DATE=CAST(GETDATE() AS DATE)
in fCurQtr
I got the error that you mentioned in this post when I used this function. So I created a staging query (CurQtr) as you suggested to pull the current quarter and then I reference it in the function as follows
let fCurQtr = ()=>
CurQtr
in fCurQtr
Now, I don't get the error but when I use the function fCurQtr() it doesn't output the quarter directly but gives me a table to expand. Is there any way to avoid the expand table step and directly get the current quarter?
Any ideas or suggestion would be greatly appreciated!
Thanks, Anil
Hi Ken,
I changed the function as follows and it worked, please let me know if there is a better way to do it or appreciate comments if any (basically I am directly referencing the value in the table which always have one row)
let fCurQtr = ()=>
CurQtr{0}[QTR_NAME]
in fCurQtr
Thanks, Anil
Hey Anil,
Glad you got solution. The only question I'm really curious on here is why you're using a custom function to generate a quarter. Why not just add a custom column and use the Date.QuarterOfYear() function to generate that. It would avoid the firewall, and would also be calculated in one step.
I get this error although I am not using an external data source.
I am just trying to pass a dynamic source name to a query
I have created a simple parameter table with the names of the tables I want to pass as source, like this:
SourceTable
Table1
Table2
Table3
Then I have built a simple connection only query called Parameter using above table as source.
Now I am trying to build my actual query using this:
Source = Excel.CurrentWorkbook(){[Name=Parameter[SourceTable]{0}]}[Content]
Although all my tables and queries are in the same workbook I get the Formula Firewall error
Any help would be very much appreciated
Thanks, Panagiotis
PS: I am using Excel 2016 64 bit
Hi Ken,
I posted a question yesterday but I don't see it now.
Yesterday evening the question was still here with a status of "awaiting moderation"
What did I do wrong?
thanks
Panagiotis
Hi there,
You didn't do anything wrong. I'm travelling for business, and didn't have time to look at it (and still don't). A better place to get help would be to post this in our forum at http://www.excelguru.ca/forums as there are experts watching there.
James, probably best to post this question in our forum at http://www.excelguru.ca/forums. You can upload a sample workbook and someone can help you through the issue.
Ken -
I'm trying to combine your parameter tip with a merge query and I'm not getting this to work despite trying multiple combinations. I have a parameter that determines how many days of history will be loaded.
fnGetWorkBookParameter
(PName as text) =>
let
ParamSource = Excel.CurrentWorkbook(){[Name="tblParameter"]}[Content],
ParamRow=Table.SelectRows(ParamSource, each ([ParamName]=PName)),
ParamValue = if Table.IsEmpty(ParamRow) = true then null else Record.Field(ParamRow{0}, "Value")
in
ParamValue
This correctly returns the value in the table (7 in this case). My original query that generated the error was as follows:
Order History
let
Source = Table.NestedJoin(Orders,{"Business Unit", "Par Location ID", "Compartmnt", "Item ID", "PO/MSR No", "Line No"},Receipts,{"Business Unit", "Par Location ID", "Compartment", "Item ID", "PO No.", "PO Line Number"},"NewColumn",JoinKind.LeftOuter),
ChooseReceivedQty = Table.ExpandTableColumn(Source, "NewColumn", {"Received Qty"}, {"Received Qty"}),
/*
For smaller extracts, take only the dates according to the incremental value
*/
DateInc = fnGetWorkbookParameter("Days History") * -1,
//
LatestDatesOnly = Table.SelectRows(ChooseReceivedQty, each [Demand Date] > Date.AddDays(Date.From(DateTime.FixedLocalNow()), DateInc)),
AddColRecvVariance = Table.AddColumn(LatestDatesOnly, "Recv Variance", each if [Received Qty] = null then 0-[#"PO/MSR Qty"] else [Received Qty]-[#"PO/MSR Qty"]),
RecvVarTypeInt = Table.TransformColumnTypes(AddColRecvVariance,{{"Recv Variance", Int64.Type}})
in
RecvVarTypeInt
So I split these out into GetHistoryParameter
let
Source = Table.NestedJoin(Orders,{"Business Unit", "Par Location ID", "Compartmnt", "Item ID", "PO/MSR No", "Line No"},Receipts,{"Business Unit", "Par Location ID", "Compartment", "Item ID", "PO No.", "PO Line Number"},"NewColumn",JoinKind.LeftOuter),
ChooseReceivedQty = Table.ExpandTableColumn(Source, "NewColumn", {"Received Qty"}, {"Received Qty"})
in
ChooseReceivedQty
and Order History
let
Source = GetHistoryParameter,
//
DatesInc = fnGetWorkbookParameter("Days History") * -1,
LatestDatesOnly = Table.SelectRows(Source, each [Demand Date] > Date.AddDays(Date.From(DateTime.FixedLocalNow()), DatesInc)),
AddColRecvVariance = Table.AddColumn(LatestDatesOnly, "Recv Variance", each if [Received Qty] = null then 0-[#"PO/MSR Qty"] else [Received Qty]-[#"PO/MSR Qty"]),
RecvVarTypeInt = Table.TransformColumnTypes(AddColRecvVariance,{{"Recv Variance", Int64.Type}})
in
RecvVarTypeInt
But the error persists.
Hey GMF,
Can you ask this in our forum at http://www.excelguru.ca/forums ? That will allow you to upload a sample workbook to make it easier for us to test.
Thanks!
Actually what if you can only make a second query after you know the list of items in the first query and also you are only able to make a single request each time?
My idea was to have a list of items and for each of them run a custom function that would make a second query for that value...
This is giving that firewall problem. I cannot run the second query globally because it really needs the input parameter that is coming from first query.
Yes, so that is definitely an issue. You can turn off the firewall but turning off the Privacy settings, or you can use VBA to do a staged refresh where you land the data in a table, then suck it back into Power Query after. (The latter is kind of a pain though.)
I have repeatedly run into this issue and sometimes it makes sense but other times (like today) it does not.
Today I have written a script ...
1. Get a sheet from the current workbook, a table of cars (with other data).
2. Create a table of distinct makes from the table of cars.
3. For each make, access a webpage and retrieve ALL the tables from that page combined into a single table and store as a sub-table for the make.
P.S. I bought your book on my Kindle but this seems to be way beyond that!
4. Expand the sub-table and process the data.
5. Re-group the data by car.
6. Do some further processing of the data (using list.generate to add up values, etc and find first entries for certain data.
7. Re-expand the data.
All good, no problems.
But, if whilst the data is grouped by car, I try to reference the original Cars table to get the Car ID I get the dreaded error!
This makes absolutely no sense to me. Why should this cause a problem? This has to be an issue/bug with the lazy interpretation model of Power Query.
The only way I can think to resolve this is to do steps 1-7 to create a new temporary sheet in the workbook and then have an additional query to add the Car ID and output another sheet/table but that seems crazy.
Any advice/tips would be very welcome.
With regards to my earlier comment, I am really confused as to why accessing of an already accessed table causes issues. I stubbornly refused to create an intermediary (i.e. 2-stage) table so tried a few other things and came up with a working solution ... I grouped the All-Cars table by make and included a sub-table of cars and IDs ... I then loaded the web-page for each make into another sub-table and expanded it (which meant the cars sub-table was then on every line) and then queried the cars sub-table for the car to retrieve the ID and then deleted the cars sub-table ... that worked ... which begs the question, WTF was wrong with trying to access the All-Cars table to get the ID!?! Label me confused! P.S. This is nothing to do with buffering of data (which I tried), it quite simply seems to be a problem with the lazy interpretation model of Power Query.
I'd have to see it, but basically, if you take the data from the spreadsheet, that is considered connecting to a raw data source. If you then try to combine it without another data source (the web) in the same query, you'll hit the firewall.
Hi Ken. Thanks for the response. A lot of my scripts combine multiple data sources without any problems. I've moved the loading of common data into function scripts. I still get occasionally get the errors. As per my previous comment, I found a solution which, even though it was still combining the same data sources, worked in a different way/order and was successful, so the issue can't be with combining data sources ... I really think this is a problem with the lazy interpretation methodology of the M language. The scripts (there are calls to other function scripts) are too much to post on here.
Hi Nick. I hear you. I've seen parts where restructuring the query works, but I haven't found that it's always reliable either. My concern would be that if they tweak the algorithm it may cause issues later. I'm virtually certain that the issue IS about combining the data sources, but the order you have may be fooling the lazy evaluation engine. Having said that, it's only a theory.
Hi Ken. Thank you for your insight on this. I am fairly new to I follow the example provided; however my query looks at a folder with multiple excel workbooks to then return a specific tab in each. I am not sure how to then split this query up in to the 'Purchase_List' and 'Purchase_Sheet' you show since I do not have a second line of code that defines anything after "source =..."
I greatly appreciate your time/ consideration.
let
Source = Folder.Files("C:\Users\caroll.moya\Desktop\HC&S Database"),
#"Create File Paths" = Table.AddColumn(Source, "File Pointer", each [Folder Path]&[Name]),
#"Del Open File References" = Table.SelectRows(#"Create File Paths", each not Text.Contains([Name], "~")),
#"Del Non Excel File References" = Table.SelectRows(#"Del Open File References", each Text.Contains([Extension], ".xl")),
#"Excel Workbook Content" = Table.AddColumn(#"Del Non Excel File References", "Custom", each Excel.Workbook([Content],true)),
#"Removed Columns" = Table.RemoveColumns(#"Excel Workbook Content",{"Name", "Extension", "Date accessed", "Date modified", "Date created", "Attributes", "Folder Path"}),
#"Expanded Custom" = Table.ExpandTableColumn(#"Removed Columns", "Custom", {"Name", "Data"}, {"Custom.Name", "Custom.Data"}),
#"Filtered Data Set" = Table.SelectRows(#"Expanded Custom", each not Text.Contains([Custom.Name], "Print") and Text.StartsWith([Custom.Name], "TI Benchmark")),
#"Invoked Custom Function" = Table.AddColumn(#"Filtered Data Set", "f Extract Project Cost Data", each #"f Extract Project Cost Data"([File Pointer], [Custom.Name])),
#"Removed Other Columns" = Table.SelectColumns(#"Invoked Custom Function",{"f Extract Project Cost Data"}),
#"Expanded f Extract Project Cost Data" = Table.ExpandTableColumn(#"Removed Other Columns", "f Extract Project Cost Data", {"Project No.", "Project Name", "Location", "Region", "Year", "Quarter", "iSF", "Scope", "Current/Final Costs", "Cost/ SF", "Cost Data Marker Column"}, {"Project No.", "Project Name", "Location", "Region", "Year", "Quarter", "iSF", "Scope", "Current/Final Costs", "Cost/ SF", "Cost Data Marker Column"}),
#"Changed Type" = Table.TransformColumnTypes(#"Expanded f Extract Project Cost Data",{{"Project No.", type text}, {"Project Name", type text}, {"Location", type text}, {"Region", type text}, {"Year", type text}, {"Quarter", type text}, {"Scope", type text}, {"Cost Data Marker Column", type text}})
in
#"Changed Type"
Hi Caroll, would you mind posting this question in our forum at https://www.excelguru.ca/forums? That would be a better place for this discussion I think. It would also be helpful if you could include a sample workbook, as the issue is also going to require looking at your "f Extract Project Cost Data" function.
I just wrote a query which ...
1. Uses List.Generate to build a list of serial numbers from 1 to NNN.
2. Converts the list to a table.
3. Adds a column which calls a function to open a sheet (with the name "sheetNNN" using the serial number) and do some heavy processing on the sheet before returning it as a table.
4. Sort all the relevant data.
That query loaded 148 sheets and there were no problems!
As for doing intermediary steps, surely it shouldn't make any difference as the lazy evaluation method will only evaluate a step if it feels it is necessary to do so.
I had a very "interesting" error a couple days ago ...
"There was an error deserializing the evaluation results. The operation might succeed on a retry."
First time I've seen that one. I did retry the query and it worked fine!
I really wish they'd turn it into a proper development environment. My major gripes are ...
1. The inability to save your work from within the environment, I have lost quite a lot of (hours of) work when Excel crashed closing PQ, so much so that I now (if I remember) copy the open scripts to Notepad.
2. No simple editing commands such as search-replace in the editor, again I use Notepad if necessary.
3. No proper step-through of the code. Run a function script and you have literally no idea what's going on!
4. In Excel, moving the mouse over a query causes it to try to reload, which for a slow script (the most intensive one takes about 30 minutes!) I'd rather avoid, and moving the mouse off frequently causes it to stop the load and then show a warning triangle without any real explanation, the only way to fix it being to re-run the query!
5. I can't open a second workbook if PQ is open.
Don't get me wrong, I think that Power Query is an amazing tool and it has allowed me to write some "code" which is probably way out of scope (i.e. should have been written in another, proper programming language) e.g. http://rr3.wikia.com/wiki/User:QuickNick/Sandbox14g for which all calculations and data-manipulation have been done with PQ.
Hi Ken, always thankful for your articles. Have saved me too many times!
I hit the wall with this same issue this week and all of my queries were working just fine.
I used the fast combine option:
https://support.office.com/en-us/article/Privacy-levels-Power-Query-CC3EDE4D-359E-4B28-BC72-9BEE7900B540?ui=en-US&rs=en-US&ad=US
and it was solved.
I know it is risky, but in my case this wasn't a problem.
Hope it helps somebody else
For reference, ignoring privacy ALWAYS fixes this problem. The question is, is it the right thing to do? If the data is all internal to your org, I'd say it's pretty low risk though.
Here we are, more than 2 years later, and it is still a problem.
Turning off the firewall or allowing a by-pass does not help my situation, because we share files and even though I may be able to use the queries without a problem, the next person may not be able to.
But this solution solves the problem for everyone. I hope.
I can understand that at some point, someone decided to make all connections, which do not reference another query, a full path connection, even to tables within the same file, but why would it not require you to create a jump off query in the first place. If by default, you must create a staging query to keep a clean connection to your table, then they should have required it in the building of queries process, instead of leaving people to wander in the dark until they found the one blog that could offer a solution.
But Thank You. Thank you so much for this solution. Please don't allow it to disappear. Because there will be another person stumbling in from the dark, with this same problem.
Hi Ken,
Thank you for this post I am using Power BI and somehow this is becoming an Issue.
Do you have a post about the "Unexpected error: Evaluation was canceled"? I get both of them a lot after the May Update I change the privacy settings but still is really annoying and I lose a lot of time.
thanks and keep with this great blog!
Jorge
Hey Jorge
I don't have a post on that error, no. Typically I've found it happens when there is an error in the data set or when one data set fails to refresh. I haven't dug too deeply into why though, to be fair.
I ran into the same issue pulling multiple data frames in PowerBI using R.
One workaround I found was to click the down arrow beside "Merge Queries" and select instead "Merge Queries as New"
I am having a similar issue and I have applied your solution but it is still not working. I am getting this error message
Expression.Error: The import Risk_Sheet matches no exports. Did you miss a module reference?
Please can you advise if I am not doing something right?
Thanks a lot
The queries used are below:
QUERY 1 : RiskData
let
Source = Excel.Workbook(File.Contents("XXXX.xlsx"), null, true),
Risk_Sheet = Source{[Item="Data",Kind="Sheet"]}[Data]
in
Risk_Sheet
QUERY 2
let
Source = RiskData,
#"Changed Type" = Table.TransformColumnTypes(Risk_Sheet,{{"Column1", type text}, {"Column2", type text}}),
#"Promoted Headers" = Table.PromoteHeaders(#"Changed Type", [PromoteAllScalars=true]),
#"Invoked Custom Function" = Table.AddColumn(#"Promoted Headers", "GetRisks", each fnGetBatchRiskFile([File Location])),
#"Expanded GetRisks" = Table.ExpandTableColumn(#"Invoked Custom Function", "GetRisks", {"ID", "TYPE", "Batch / Project / Prog. Name", "CATEGORY", "DATE RAISED", "SUMMARY / TITLE", "DESCRIPTION", "IMPACT", "IMPACT COST", "PROBABILITY", "IMPACT_1", "SCORE", "RAG Status", "Impact Prob", "Risk Cost", "MITIGATING ACTION(S)", "ACTION DUE DATE", "OWNER (PERSON RESPONSIBLE)", "ACCOUNTABLE PERSON", "LAST REVIEWED", "COMMENTS SINCE LAST UPDATE", "OPEN / CLOSED", "STATUS", "TARGET CLOSE DATE", "DATE CLOSED", "REASON CLOSED", "ESCALATE / DOWNGRADE", "ESCALATION LEVEL", "Impact List", "Probability List", "Category List"}, {"ID", "TYPE", "Batch / Project / Prog. Name", "CATEGORY", "DATE RAISED", "SUMMARY / TITLE", "DESCRIPTION", "IMPACT", "IMPACT COST", "PROBABILITY", "IMPACT_1", "SCORE", "RAG Status", "Impact Prob", "Risk Cost", "MITIGATING ACTION(S)", "ACTION DUE DATE", "OWNER (PERSON RESPONSIBLE)", "ACCOUNTABLE PERSON", "LAST REVIEWED", "COMMENTS SINCE LAST UPDATE", "OPEN / CLOSED", "STATUS", "TARGET CLOSE DATE", "DATE CLOSED", "REASON CLOSED", "ESCALATE / DOWNGRADE", "ESCALATION LEVEL", "Impact List", "Probability List", "Category List"})
in
#"Expanded GetRisks"
Try making both those steps in query 1 RiskSheet instead of Risk_Sheet. That may help. Outside that, I'd suggest posting in our help forum at https://exelguru.ca/forums That way we can share your file so we can see it in action.
Thanks for this article, very useful.
Another variant I noticed is if you insert a value into a sql string that derives from another query your analysis will not refresh on the service, it will refresh fine in PBI desktop but not the service. You need to have the information inserted into the sql string from a parameter that is manually entered, or just type it in directly.
I do find it annoying that there are so many caveats around using M to get it to reliably work in the service it is really not worth trying to do anything smart in it at all, as you are likely to break what ever you are trying to do through refresh failures. I have ended up moving everything back to SQL as much as I can. I think moving forward I am going to move to an architecture of running all the analysis elsewhere and then simply pointing PBI at the result for the purposes of the data visualisation and sharing.
I think PBI should not be viewed as an analysis tool or a even a data transformation tool, but rather as a means of sharing visualisations, as it isn't really fit for purpose as far as analysis and data transformation goes.
Hey Ben, I can't argue with you on the frustration there. It's maddening.
Great ! Helped me a lot !
That's great to hear, Heraldo. Thanks for reading!
Just hit the same issue that Ben had re 5th August Post. Painful.
Your solution revealed to be a total game changer for my project. This issue was going to be a big show-stopper for our workbook set that allows to manage a complex registry of all our company' activities (hour by hour), all in Excel, where the single registries have to be distributed on different workstations (hence with different filepaths involved).
But what I actually ended up doing is using the "getParameter" fx straight in every query, rather than using staging queries.
The peculiar type of data filtering would have led to a proliferation of data in the workbook.
FYI-
I had the same error and fixed it "ignoring" the Privacy Levels.
File\Options and settings\Query Options\Privacy\Ignore the Privacy and Potentially improve performance.
Hey Vince,
That's not really so much a fix as just turning off your security. It works, but it's not really the best way to deal with the issue.
Pingback: Webinar Followup: Be a Full Stack #PowerBI Jedi - DataChant
Staging queries didn't work for me. I had to literally turn all my staging queries into function queries. That finally freed me from the error but it sure doesn't feel appropriate. What might be going on?
Hey Rob,
There are two different ways to deal with formula firewall issues. This is one, but Chris Webb also has another that you can find here: https://blog.crossjoin.co.uk/2017/06/26/data-privacy-settings-in-power-bipower-query-part-3-the-formula-firewall-error/ As he says there "I’ll admit I don’t understand this particular topic perfectly (I’m not sure anyone outside the Power Query dev team does)", and I'm in the same boat with him. I'd check out that article as well.
Hi Ken,
I came upon this article since I also experience the same issue. I have two sources, one from sharepoint and my custom function gets its data from web. I am able to refresh my data in Power BI desktop. However when I publish it online, I get the error:
"[Unable to combine data] Section1/MarketIndicators_Prices/AutoRemovedColumns1 references other queries or steps, so it may not directly access a data source. Please rebuild this data combination. Table: MarketIndicators_Prices."
Tables affected are those which invoked the custom function. I hope you could help me with this.
Thank you.
Hi Jenel,
I'll be honest, I've had no luck with this. My suggestion here would be to do this in desktop then use Power Update to publish it. My guess is that you disabled privacy checks in Desktop, but there is no way to do this in the Power BI service. Using Power Update, you can schedule it to be refreshed locally then pushed to the Power BI service, working around this issue.
Thank you very much! As far as me "Fast Combine " was enough. At the end I believe that the root cause of this issue is the "lazy evaluation" in M.
We are dealing with "the privacy levels for each source". When you use indirect references for opening a file (for example with functions) you cannot analyze such source directly, this analysis is deferred... when you remove such "privacy verification" removing "fast combine" then all works. No dicrect analisys.
I see that all workarounds use intermediate files just to get the "evaluation" of the file... now you have a real file wich is not in a "lazy " state. i.e. it is just evaluated.
I would like that someone agree or disagrees with me... I am an island here in Spain... 🙂
Interesting... I'm not sure on this Javier. I wish I understood the evaluation model better.
Hi Ken,
in my case it seems the method you suggest cannot be applied to solve the error 🙁 I'm trying to build a web query "Table1" where the URL is parametric and it depends on "Ticker" being a table query that pulls the stock Ticker value from a cell.
Any idea here?
Thanks!
let
Source = Web.Page(Web.Contents("https://finance.yahoo.com/quote/" & Ticker & "/analysis?p=" & Ticker)),
Data1 = Source{1}[Data],
#"Changed Type" = Table.TransformColumnTypes(Data1,{{"Column1", type text}, {"Column2", type text}})
in
#"Changed Type"
Hi Emiliano,
What happens if you try this?
let
vTicker = fnGetParameter("Ticker"),
vURL = "https://finance.yahoo.com/quote/" & vTicker & "/analysis?p=" & vTicker,
Source = Web.Page(Web.Contents(vURL)),
Data1 = Source{1}[Data],
#"Changed Type" = Table.TransformColumnTypes(Data1,{{"Column1", type text}, {"Column2", type text}}) in
#"Changed Type"
If that doesn't work, I'm afraid that I'd be left with turning off the formula firewall to avoid the error. 🙁
FYI: https://ideas.powerbi.com/forums/265200-power-bi-ideas/suggestions/35895718-fix-the-action-around-formula-firewall-so-instead
Hi Emiliano,
I do exactly this in my PQ spreadsheet which I use for extracting and building data-sets for a Wiki.
I ran into the formula-firewall (catch-all?) error a while back and found that moving all retrieving of data from excel sheets into external function calls helped a lot, initially I wrote set functions for each sheet (so that the data could be manipulated) but then I wrote a standard function which does nothing but retrieve the specified sheet ...
let funcLoadExcelSheet = (strSheet) => Excel.CurrentWorkbook(){[Name=strSheet]}[Content] in funcLoadExcelSheet
I then wrote another routine which retrieves my Parameters sheet and returns either the whole sheet or the specified parameter(s).
//------------------------------------------------------------
// function to load the Parameters sheet and return all or only the specified parameters
// tblParameters = fExcelParameters() // loads all parameters
// tblParameters = fExcelParameters({}) // loads all parameters
// tblParameters = fExcelParameters({"CashToGoldRatio"}) // loads the named parameter(s)
// currently allows for upto 5 values per parameter but more can be added if necessary
//------------------------------------------------------------
let LoadParameters = (optional lstParameters) as table =>
//----
let
//--------
// TEST-START
//----
// lstParameters = null,
// lstParameters = {},
// lstParameters = {"CashToGoldRatio"},
//----
// TEST-END
//--------
// convert parameter list to upper-case
listParameters =
if lstParameters is null
then {}
else List.Transform(lstParameters, each Text.Upper(_) ),
// retrieve the parameters from the spreadsheet
// NOTE: PQ doesn't like field names with the minus "-" symbol so replace it with underscore "_" or don't use it!
tblParameters =
// 3. retrieve the relevant columns
Table.SelectColumns(
// 2. ignore inactive parameters
Table.SelectRows(
// 1. rename the columns
Table.RenameColumns(
// 0. load the excel parameters sheet
Table.Buffer(fExcelSheet("Parameters"))
// 1.
,{
{"Column1","Active"}, {"Column2","Parameter"}
,{"Column3","Value1"}, {"Column4","Value2"}, {"Column5","Value3"}, {"Column6","Value4"}, {"Column7","Value5"}
} )
// 2.
,each Text.Upper([Active]) = "Y"
and [Parameter] <> null
and Text.TrimEnd([Parameter]) <> ""
and (if List.IsEmpty(listParameters) then true else List.Contains(listParameters, Text.Upper([Parameter]) ) ) )
// 3.
,{"Parameter", "Value1", "Value2", "Value3", "Value4", "Value5"} )
// done
in
tblParameters
//----
in LoadParameters
Sorry, I don't know how to format the code differently (as Ken does) so it stands out from my message. I hope this helps.
Nick
Corrections to my earlier message ....
I didn't make it clear that the sheet-load function is called fExcelSheet and is referred to in the subsequent section of code.
It stripped out the less-than and greater-than symbols from step 2 which should have read ...
and [Parameter] "not equal" null
and Text.TrimEnd([Parameter]) "not equal" ""
Ken, if you could correct my earlier message AND change the formatting of the code sections, that would be great, thanks.
Done Nick. 🙂
Hey Ken,
Would your solution of Query Staging work when one has a custom function to define the data source?
Hi Mike,
I doubt it, as it's still seen as a separate data source.
Pingback: Power BI - Issue in using Query List Parameter (Query references other queries or steps) - Simple BI Insights
Pingback: Power Query - Import Data from the Web - Excel off the grid
Thank you. This really helped.
I know this is years after the original post, but I am encouraged that you are still responding to comments. When I followed this process in Power BI, I found that the duplicate queries were re-querying the source. For example, I have 5 staging queries, and one or two queries that use these as the source. My status shows that the duplicates are querying the same source over API. This is extremely slow, as you can imagine.
Is there a way to do this staging method that will only query once, through the original connection query?
Hi Adam, no not really. Power Query actually requests previews, and should re-use cached queries automatically. I'm not saying that it isn't making multiple calls, but the user interface isn't always the most reliable to understanding what's truly going on.
I am having this issue with two queries to get a dynamic access file location and then pull in the data from a table in an access file.
This gets the location of the db via a table in the Excel file (tblFilePath) that holds the file path -
let
Source = Excel.CurrentWorkbook(){[Name="tblFilePath"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"File_Path", type text}}),
Source_Data_File_Path = #"Changed Type"{0}[File_Path]
in
Source_Data_File_Path
I then have a query to pull in the data from the db -
let
Source = Access.Database(File.Contents(SourceFileName&"\StockMovement.accdb"), [CreateNavigationProperties=true]),
GoodsIn = Source{[Schema="",Item="GoodsIn"]}[Data]
in
GoodsIn
When this is stepped through in PQ, it refreshes as required but I use code in a VBA module to perform the refreshes of the two queries
ActiveWorkbook.Connections("Query - SourceFileName").Refresh
ActiveWorkbook.Connections("Query - GoodsOut").Refresh
In the above, the SourceFileName refreshes but I get an error for the GoodOut refresh -
Run-time error '1004':
We couldn't refresh the connection 'Query - GoodsOut'. Here's the error message we got:
Query 'GoodsOut' (step 'AutoRemovedColumns1') references other queries or steps, so it may not directly access a data source. Please rebuild this data combination.
I have spent quite a bit of time trying to rebuild this query but I haven't been successful yet and I am confused by the step the error quotes as there isn't any removed columns in any steps.
I have solved this before by changing the privacy levels where only one user needs to use the file but the above process is in a file being used by multiple users some of which will need to open the file as Read-Only (which causes the error) or if a user opens the file as Read/Write after somebody else has opened, saved and closed it and the privacy setting have been changed for the first user.
Some advice would be super as this file is for a client who is getting a little frustrated with having to call me every time the error pops up.
TIA
Hi Michael,
I think in this case that you're going to have to suppress privacy to get it to work. As you pointed out, this change is not cached between users. There isn't anything we can do to automate this, but you could do this to advise the user what they need to do to solve the issue on their own: https://excelguru.ca/2014/10/29/tame-power-query-workbook-privacy-settings/
Hope that helps!
Thanks for the prompt response Ken, much appreciated!
Not exactly what I was hoping to hear and I doubt this will go down very well with my client but it is what it is I suppose!
With regards to the link in your reply, would the use have to open up the query and make the selections themselves or can this be automated in some way?
Thanks
Michael
Pingback: Power Query - Common Errors & How to Fix Them - Excel Off The Grid
Bonjour Ken,
Je suis novice sur Query. J'ai deux requêtes que je souhaiterai fusionné mais j'obtiens cette erreur :"Formula.Firewall : Requête « IMPORT » référence d'autres requêtes ou étapes et ne peut donc pas accéder directement à une source de données. Reconstruisez cette combinaison de données."
La première requête charge des données du fichier .txt avec ajout de 2 colonnes faisant référence à une table excel (BDD_CLIENT).
Ensuite j'ai souhaité faire un requête paramétrée pour traiter tous les fichiers .txt d'un dossier de la même manière (IMPORT)
Lorsque je construis la requête me permettant d’accéder au dossier et que je souhaite ajouter la colonne personnalisé avec l'utilisation de la requête paramétrée, j'obtiens toujours l'erreur.
Ci dessous l'éditeur avancé des différentes requêtes :
let
Source = Csv.Document(File.Contents("H:\Vitre\Commun\Identification IDEA\2020\Données IDEA\decembre2019.txt"),[Delimiter=";", Encoding=1252]),
#"En-têtes promus" = Table.PromoteHeaders(Source),
#"Requêtes fusionnées" = Table.NestedJoin(#"En-têtes promus",{"N° PUCE"},BDD_CLIENT,{"N° PUCE"},"NewColumn",JoinKind.LeftOuter),
#"NewColumn développé" = Table.ExpandTableColumn(#"Requêtes fusionnées", "NewColumn", {"Commentaire", "AFFECTATION"}, {"NewColumn.Commentaire", "NewColumn.AFFECTATION"})
in
#"NewColumn développé"
-----------------
(Fichier as text) =>
let
//Fichier = "H:\Vitre\Commun\Identification IDEA\2020\Données IDEA\decembre2019.txt"
Source = Csv.Document(File.Contents(Fichier),[Delimiter=";", Encoding=1252]),
#"En-têtes promus" = Table.PromoteHeaders(Source),
#"Requêtes fusionnées" = Table.NestedJoin(#"En-têtes promus",{"N° PUCE"},BDD_CLIENT,{"N° PUCE"},"NewColumn",JoinKind.LeftOuter),
#"NewColumn développé" = Table.ExpandTableColumn(#"Requêtes fusionnées", "NewColumn", {"Commentaire", "AFFECTATION"}, {"NewColumn.Commentaire", "NewColumn.AFFECTATION"})
in
#"NewColumn développé"
-------------
Merci pour votre aide
Maxence
Salut Maxence,
Le problème est que vous essayez de vous connecter au fichier et de fusionner dans la même requête. Comme cela accède à deux sources de données dans une même requête, il renvoie cette erreur.
Essayez de diviser votre étape Source en sa propre requête (appelez-la quelque chose comme "Fichier". Puis référencez la requête "Fichier" et faites tout à partir des en-têtes promus.
Alternativement, vous pouvez désactiver le pare-feu de formule en définissant vos paramètres de confidentialité sur ignorer, mais cela ne devrait pas être réellement nécessaire dans ce cas.
J'espère que cela pourra aider!
Hi Ken, I am a newbie and pardon my ignorance.
My Query is connected to csv files in a folder which are then filtered on filename and then transformed to get the desired data table.
I am trying to filter this table through parameter. using a function as detailed in https://www.howtoexcel.org/power-query/how-to-parameterize-your-power-query/
The function when introduced as last step gives the error rebuild data error. tried your technique doesn't work. My knowledge is copy/ paste limited.
Master query (without filter function)
let
Source = Folder.Files("C:\Users\NAVAID\OneDrive - Etihad Airways\In Work\WIP"),
#"Filtered Hidden Files1" = Table.SelectRows(Source, each [Attributes]?[Hidden]? true),
#"Filtered Rows3" = Table.SelectRows(#"Filtered Hidden Files1", each Text.Contains([Name], "ADAT")),
#"Invoke Custom Function1" = Table.AddColumn(#"Filtered Rows3", "Transform File", each #"Transform File"([Content])),
#"Renamed Columns1" = Table.RenameColumns(#"Invoke Custom Function1", {"Name", "Source.Name"}),
#"Removed Other Columns1" = Table.SelectColumns(#"Renamed Columns1", {"Source.Name", "Transform File"}),
#"Expanded Transform File" = Table.ExpandTableColumn(#"Removed Other Columns1", "Transform File", {"Column1", "Column2", "Column3", "Column4", "Column5", "Column6", "Column7", "Column8", "Column9", "Column10", "Column11", "Column12", "Column13", "Column14", "Column15", "Column16", "Column17", "Column18", "Column19", "Column20", "Column21", "Column22"}, {"Transform File.Column1", "Transform File.Column2", "Transform File.Column3", "Transform File.Column4", "Transform File.Column5", "Transform File.Column6", "Transform File.Column7", "Transform File.Column8", "Transform File.Column9", "Transform File.Column10", "Transform File.Column11", "Transform File.Column12", "Transform File.Column13", "Transform File.Column14", "Transform File.Column15", "Transform File.Column16", "Transform File.Column17", "Transform File.Column18", "Transform File.Column19", "Transform File.Column20", "Transform File.Column21", "Transform File.Column22"}),
#"Removed Top Rows" = Table.Skip(#"Expanded Transform File",4),
#"Removed Columns1" = Table.RemoveColumns(#"Removed Top Rows",{"Source.Name"}),
#"Promoted Headers" = Table.PromoteHeaders(#"Removed Columns1", [PromoteAllScalars=true]),
#"Filtered Rows" = Table.SelectRows(#"Promoted Headers", each ([Defect Type] = "Aircraft Technical Log")),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Log Series", "NR #", "Defect Type", "Due Date", "Counter", "UOM", "CAT", "NR Material", "Remarks", "Visit/Date", "NR Operation", "Column17", "Column18", "Column19", "Column20", "Column21", "Column22"}),
#"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"Log. No.", "ATL"}, {"A/C Reg", "REG"}, {"Date Raised", "Date_Raised"}, {"ATA Code", "ATA"}, {"Description of Defect", "Description"}}),
#"Changed Type2" = Table.TransformColumnTypes(#"Renamed Columns",{{"ATL", type text}, {"ATA", type text}, {"Date_Raised", type date}}),
#"Filtered Rows1" = Table.SelectRows(#"Changed Type2", each [Date_Raised] > #date(2020, 1, 31)),
#"Invoked Custom Function" = Table.AddColumn(#"Filtered Rows1", "KWSearch", each KWFunction([Description])),
#"Replaced Value" = Table.ReplaceValue(#"Invoked Custom Function",null,"00",Replacer.ReplaceValue,{"ATA"}),
#"Uppercased Text" = Table.TransformColumns(#"Replaced Value",{{"Description", Text.Upper, type text}}),
#"Added Custom" = Table.AddColumn(#"Uppercased Text", "Status", each if Text.StartsWith([ATA], "5") or Text.Contains([Description],"SCRATCH") or Text.Contains([Description],"DENT") or Text.Contains([Description],"NICK") or Text.Contains([Description],"GOUGE") or Text.Contains([Description],"CRACK") or Text.Contains([Description],"STRIKE") or Text.Contains([Description],"CARGO") then "P1-Not Started" else "P2-Not Started"),
#"Removed Duplicates" = Table.Distinct(#"Added Custom", {"ATL", "REG"}),
#"Filtered Rows2" = Table.SelectRows(#"Removed Duplicates", each ([KWSearch] = null) and ([Description] "NIL" and [Description] "NIL " and [Description] "NIL DEFECT" and [Description] "NIL DEFECT " and [Description] "NIL DEFECTS" and [Description] "NIL DEFECTS " and [Description] "NIL DERFECTS" and [Description] "NIL FURTHER" and [Description] "NLI DEFECT" and [Description] "VOID" and [Description] "VOID " and [Description] "VOIDED")),
#"Removed Columns2" = Table.RemoveColumns(#"Filtered Rows2",{"KWSearch"})
in
#"Removed Columns2"
Function
let PRaisedDate = (TableName,RowNumber) =>
let
Source = Excel.CurrentWorkbook() {[Name= TableName]}[Content],
value = Source {RowNumber} [Value]
in
value
in PRaisedDate
Applying filter as follows to Master Query above as last step throws rebuild data error
#"Filtered Rows4" = Table.SelectRows(#"Removed Columns2", each [Date_Raised] >= #date(Date.Day(fgeDate("PTable",0)), Date.Month(fgeDate("PTable",0)), Date.Year(fgeDate("PTable",0))))
Can you help with this?
Hi Navaid,
So the short answer is that you are making a data call inside a function and, by applying this on a row by row basis, you're going to hit this challenge. In addition, you are also calling against an Excel workbook at the beginning. Even if we did get this to work (you could disable privacy to get past it), the performance will be poor.
What I would do:
1) I your first query, go to the "Filtered Rows" step, right click it and Extract Previous into a new query called "Source Data". Go to Close & Load To-> Only Create Connection
2) Create a new query that reads from the table in the current workbook. Called it "RaisedDates" and load it to connection only
3) Modify your function to read as follows:
let PRaisedDate = (TableName,RowNumber) =>
let
Source = RasiedDates,
value = Source {RowNumber} [Value]
in
value
in PRaisedDate
The performance will still be bad, but at least this will get you on the correct path. (What you should really be doing is buffering the RaisedDates table and passing that into the function, but the other stuff needs to be done first.
Hi,
the last step does not work:
let
Source = PurchaseList
#"Removed Columns" = Table.RemoveColumns(Purchase_Sheet, {"Column47
You put "Source = PurchaseList" in, but references "Purchase_Sheet" in the next step. So I get an error message "Expression.Error: the import matches no exports"
Any idea?
Thanks in advance
Denxx
Ah, my bad. Yes, just swap "PurchaseList" in place of "Purchase_Sheet" and it will work fine.
Had this issue when working with an API in PowerBI all the tables were either created by DAX (Calendar tables) or directly from the API, ironically, PowerQuery started complaining about joining one external data source to another!
Great article! It explains a lot.
I found more easy way to solve it in my case. Just need to go to File – Options and settings – Options - Current File - Privacy and set Privacy Levels option to "Ignore the Privacy Levels and potentially improve performance".
Hi Hannadii,
To be completely technically accurate, you haven't actually solved the issue, as much as you have told Power Query to ignore the issue. There is a big difference. By disabling privacy you avoid this issue, and probably improve performance as well, but you lose the in-built checks to make sure that you aren't leaking private data to public sources. You'll also be able to build a solution that will not refresh in Power BI online, as you can't disable privacy there.
My office has been remotely deployed due to COVID-19 concerns. This kicked everyone into Citrix-based Excel 2016 rather than O365 Excel. I have discovered issues, to include the one addressed in this string which are very annoying. Thanks Ken for maintaining this blog and the historical posts, as I walk back in time to address Excel 2016 issues when running Power Queries developed using current O365.
Eww... Excel on Citrix. That takes me back a few years! Don't miss those days. 🙂
Hi. I have power query connecting by API to two data sources- one time recording the other HR. Within PQ I have a query that merges one of the imports with another to get employee data. Although I read the comments on staging, this query works on my machine without any problem. However, when I distribute the file the Firewall error comes up. I have looked at permissions and am about to experiment with staging but what I would like to understand is why this should work fine on my own pc with no staging but not when distributed? Is there an obvious explanation?
Did you disable privacy on your machine?
Hi. The privacy settings in permissions for all data sources is set to none. In PQ my global privacy level is set to Always ignore which the workbook inherits?
Ken
Thanks for that heads up, got correspondent to check privacy levels and everything is working. I still struggle a little to see why and found the MSFT explanation a tad too dense for me. However, thanks a lot- liked the book too.
Hi Ken,
I have merged couple of columns in Power BI.
Now i am trying to extract the same with below basic python command
#--------------------------------------------------------------------------------------
import pandas as pa
val = pa.DataFrame(dataset)
val.to_csv('c:/users/hp/documents/export.csv',index=false)
#--------------------------------------------------------------------------------------
I am receiving the error "Query .. references other queries or steps and so may not directly access a data source"
When i apply this query above merge step, it works fine, but gives an error when executed after the merge step.
As few transformations are done post merge, i need to extract the latest one to csv.
Please help me out..
Mahan, assuming that "dataset" is another query, the only way I think you could do this is to disable the privacy settings on the file. I don't have any experience using Phython, but I can't see an alternative here as you are combining two data set/operations.
Hi Ken,
I was curious if you wouldn't mind fixed the error you mention in the comments (into the blog post), so that I am sure that I am doing it correctly?
Also, I see that your answer above might be relevant to my problem. I am pulling in two dataframes in an R code, merging them together (using text analytics). This works if I have it ignore privacy settings, but I would like not to if possible.
Hi Julie,
I don't have the data any more to reshoot the image, but I've added a note below the image in question to point out what the code should be. With regards to your R code, I'm sorry but I don't know R, so have never tried to use it in a query, so can't really advise on this.
Hi, Ken.
Like so many others a huge THANK YOU for this information and all your other great pages.
Also excellent is your tip to start by building connection-only staging tables and then using them in all subsequent merges! Brilliant!
Mark
Hey Ken,
I must say, I can't keep this rather simple example not up and running. Excel firewall still kicks in...
STAGING QUERY named Subnets_import
let
Source = Sql.Databases("REMOVED IN RELATION TO SECURITY"),
Sizing = Source{[Name="Sizing"]}[Data],
SubnetsSheet = Sizing{[Schema="dbo",Item="subnets_view"]}[Data]
in
SubnetsSheet
SUBSEQUENT QUERY named Subnets_view
let
Source = Subnets_import,
#"Split Column by Delimiter" = Table.SplitColumn(Source, "subnet", Splitter.SplitTextByEachDelimiter({"/"}, QuoteStyle.Csv, true), {"subnet.1", "subnet.2"}),
#"Renamed Columns" = Table.RenameColumns(#"Split Column by Delimiter",{{"subnet.1", "IP"}, {"subnet.2", "Range"}}),
#"Split Column by Delimiter1" = Table.SplitColumn(#"Renamed Columns", "Range", Splitter.SplitTextByEachDelimiter({" "}, QuoteStyle.Csv, false), {"Range.1", "Range.2"}),
#"Renamed Columns1" = Table.RenameColumns(#"Split Column by Delimiter1",{{"Range.1", "Range"}, {"Range.2", "Comment"}}),
#"Replaced Value" = Table.ReplaceValue(#"Renamed Columns1",null,"",Replacer.ReplaceValue,{"name", "IP", "Range", "Comment", "customer"}),
#"Added Custom" = Table.AddColumn(#"Replaced Value", "Filter", each if GetValue("CustomerName")"*" then Text.Upper(Text.Middle([customer],0,Text.Length(GetValue("CustomerName"))))=Text.Upper(GetValue("CustomerName")) else true),
#"Filtered Rows" = Table.SelectRows(#"Added Custom", each ([Filter] = true)),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Filter"})
in
#"Removed Columns"
Firewall kicks in on the AddColumn function.
I'm using a very recent Excel version being Excel version 2011 (not the year 2011 ;-))
Hi Jules,
It's the GeetValue function that is causing the issue. What does it do? My guess it that it touches another data source, which is where the problem is going to lie, as that will trigger the formula firewall...
I am very amateur at PQ and developed my queries with help from web forums. The queries are executed by VBA code and refer to data in Excel files elsewhere on the same computer as the file in which the VBA and PQ code reside.
My queries worked perfectly form many months, until I had to have my computer repaired, which forced me to reinstall Excel and everything else.
Now they've stopped working, giving the error message about referencing other queries or steps. I keep thinking there was a setting in Excel that I failed to reset after re-installing. Can anyone think of such a setting? .... resetting it would certainly simplify the fix. My query code differs considerably from the code cited in the lead article here.
Hi David,
Most likely you had disabled privacy at a global level previously. I wouldn't suggest doing that, but rather to just disable privacy for the given workbook. To do that, go to Get Data --> Query Options --> Current Workbook --> Privacy Levels --> Ignore
Hi Ken,
I have tried splitting the first line of this query into another query, but I'm still getting the "references other queries or steps" error. I guess I'm not correctly understanding where the conflict arises. Any advice, please?
let
Source = Excel.CurrentWorkbook(){[Name="Table_Test_Results"]}[Content],
#"Removed Other Columns" = Table.SelectColumns(Source,{"Test No", "Docket No", "Date Sampled", "Mix", "Design Cement", "Actual Cement", "Water", "Air %", "Batch Yield", "Design Yield ", "Plastic Density", "Spec Slump", "Meas slump", "7 day Strength (MPa)", "7 Day Normalised", "28 Day Ave Cyl Density", "Avg 28 day Strength (Mpa)", "Within Test", "28 Day Normalised", "Exclude?"}),
#"Filtered Rows" = Table.SelectRows(#"Removed Other Columns", each ([#"Exclude?"] = null)),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Exclude?"}),
#"Changed Type" = Table.TransformColumnTypes(#"Removed Columns",{{"Date Sampled", type date}}),
#"Filtered Rows1" = Table.SelectRows(#"Changed Type", each [Date Sampled] >= #date(Date.Year(All_Mix_Params("All_Mix_From_To",1)), Date.Month(All_Mix_Params("All_Mix_From_To",1)), Date.Day(All_Mix_Params("All_Mix_From_To",1))) and [Date Sampled] <= #date(Date.Year(All_Mix_Params("All_Mix_From_To",2)), Date.Month(All_Mix_Params("All_Mix_From_To",2)), Date.Day(All_Mix_Params("All_Mix_From_To",2)))),
#"Added Custom" = Table.AddColumn(#"Filtered Rows1", "StdDevRate", each All_Mix_Params("Parameters",1)),
#"Merged Queries" = Table.NestedJoin(#"Added Custom", {"Mix"}, MixTable, {"Mix"}, "MixTable", JoinKind.LeftOuter),
#"Expanded MixTable" = Table.ExpandTableColumn(#"Merged Queries", "MixTable", {"Merge Mix ID", "Grade", "Maximum COV %", "Target mean strength MPa", "Target Air", "Minimum Strength Mpa", "Air AE mix limits max", "Std Dev"}, {"Merge Mix ID", "Grade", "Maximum COV %", "Target mean strength MPa", "Target Air", "Minimum Strength Mpa", "Air AE mix limits max", "Std Dev"}),
#"Added Custom1" = Table.AddColumn(#"Expanded MixTable", "Custom", each if [Std Dev] = null then [StdDevRate] else [Std Dev]),
#"Removed Columns1" = Table.RemoveColumns(#"Added Custom1",{"StdDevRate", "Std Dev"}),
#"Renamed Columns" = Table.RenameColumns(#"Removed Columns1",{{"Custom", "StdDev"}}),
#"Added Custom2" = Table.AddColumn(#"Renamed Columns", "Custom", each if [Merge Mix ID] =null then [Mix] else[Merge Mix ID]),
#"Changed Type1" = Table.TransformColumnTypes(#"Added Custom2",{{"Custom", type text}}),
#"Removed Columns2" = Table.RemoveColumns(#"Changed Type1",{"Merge Mix ID"}),
#"Renamed Columns1" = Table.RenameColumns(#"Removed Columns2",{{"Custom", "Merge mix"}}),
#"Parameter05" = Table.AddColumn(#"Renamed Columns1", "20 to 35MPa 7-Day Alert Line", each All_Mix_Params("Parameters",5)),
#"Parameter06" = Table.AddColumn(#"Parameter05", "40 to 50MPa 7-Day Alert Line", each All_Mix_Params("Parameters",6)),
#"Parameter07" = Table.AddColumn(#"Parameter06", "Normalised 28-Day Min", each All_Mix_Params("Parameters",7)),
#"Parameter08" = Table.AddColumn(#"Parameter07", "Normalised 28-Day Max", each All_Mix_Params("Parameters",8)),
#"Parameter09" = Table.AddColumn(#"Parameter08", "Normalised Slump Target Limits Min", each All_Mix_Params("Parameters",9)),
#"Parameter10" = Table.AddColumn(#"Parameter09", "Normalised Slump Target Limits Max", each All_Mix_Params("Parameters",10)),
#"Parameter11" = Table.AddColumn(#"Parameter10", "Yield Target Limits Min", each All_Mix_Params("Parameters",11)),
#"Parameter12" = Table.AddColumn(#"Parameter11", "Yield Target Limits Max", each All_Mix_Params("Parameters",12)),
#"Parameter13" = Table.AddColumn(#"Parameter12", "Air AE mix limits min", each All_Mix_Params("Parameters",13)),
#"Reordered Columns" = Table.ReorderColumns(Parameter13,{"Test No", "Docket No", "Date Sampled", "Mix", "Merge mix", "Design Cement", "Actual Cement", "Water", "Air %", "Batch Yield", "Design Yield ", "Plastic Density", "Spec Slump", "Meas slump", "7 day Strength (MPa)", "7 Day Normalised", "28 Day Ave Cyl Density", "Avg 28 day Strength (Mpa)", "Within Test", "28 Day Normalised", "Grade", "Maximum COV %", "Target mean strength MPa", "Target Air", "Minimum Strength Mpa", "Air AE mix limits min", "Air AE mix limits max", "StdDev", "20 to 35MPa 7-Day Alert Line", "40 to 50MPa 7-Day Alert Line", "Normalised 28-Day Min", "Normalised 28-Day Max", "Normalised Slump Target Limits Min", "Normalised Slump Target Limits Max", "Yield Target Limits Min", "Yield Target Limits Max"})
in
#"Reordered Columns"
Hi Ken, thanks for the helpful article.
One question - could the steps under 'Let’s “rebuild this data combination”' be replaced with the 'Extract Previous' tool? This shows up when you right click a step in the Query Settings sidebar, and it does all the work to split your query into two with all the references correct.
Hi Chris,
Short answer, yes. Longer answer - I'm not actually sure that the Extract Previous command was on the right click menu when I originally wrote this article. (A LOT of change has come to Power Query in the last 6 years!)
Hey Ken. I realize this is an old thread. My question is, do you still stand by this logic? The examples in your code above are simple for obvious reasons. But in the real world where we do multiple merges, I have found that I simply cannot "rebuild" the queries to satisfy the formula.firewall. My only option is to disable Privacy settings.
This presents a HUGE problem when it comes time to publish my reports to the service and my dataset will not refresh.
I've seen presentations by Chris Webb where he says the problem has to do with multiple "partitions". He actually said he has better luck using fewer queries (as opposed to staging queries).
Thanks in advance!!
Mike
Pingback: Power Query “Folder Import” auf Basis einer Excel-Fileliste realisieren – Linearis :: Self-Service Business Intelligence
Wow, this was weird.
I got the "please rebuild" admonishment when I tried inserting a few steps in the middle of a long query. I removed the steps then looked elsewhere in the query where I had essentially done the same thing.
In that case, about lines in a row gave me that warning, and then after that the warning went away and the query worked.
Anyway, I applied this fix, and everything is cool now.
Hi,
This was a very good article! I was wondering if this i still a challenge? I was wondering if this is solved by first connecting to source, and then use this PQ as a reference in the next "step" of transforming data?
Hi,
I have been facing a challenge of making schedule refresh for a web query. This web query uses two parameters start and end date from another sql table in the power query.
This report works well in Power BI desktop, but fails in schedule refresh. I find that if instead I use static parameters in the web query there is no problem, but as soon as I connect to sql source, it fails.
It may be same error as you describe here, any suggestions to resolve are appreciated.
It is ABSOLUTELY still a challenge Terje. While the referencing helps, you have to make sure that you don't bring a second data source in accidentally in subsequent steps.
Hi MKJ, this is a really challenging one. Pulling the two data points from SQL is easy enough to do (as you've seen), but if you are then using those to dynamically build a URL which is then password to the web query... that is combining multiple data points. You may be able to get around this by building a custom function (set it up with a static parameter to build your function, then invoke on a table/column with the combined values.) While the preferred method is to work around the privacy issues, you may also be able to set the privacy levels via the web. In this case, if you can get everything set to the same level, it should play okay. Just be aware of the risks of doing so before you go this route.
Pingback: Split Power Queries