Expanding Data from different functions in Power Query, in same column

My latest project work related project has offered me many first time experiences. Most prominent is, that this is my first fully fledged Power BI project. I have been playing around with Power BI for some time on a household basis and back in the summer of 2014 I ran some workshops with a selection of business people in Maersk Line.

The more curio first time experience is that the company we are developing this PBI Model for a company that was founded 368 years ago. Not months or weeks; years! They’ve been around and then some.

My main technical discovery with this project was the use of try … otherwise in M. I stumbled upon the construct via Gerhard Brueckl (b|l|t) in this blog post of his on error handling.

I found usage of this construct, not for direct error handling as such, but then again in some way it is error handling.

Data comes in many shapes and forms, and this project is no exception. In order for the project not to explode in hundreds of functions and exponentially as many queries, I had to come up with a way of containing this. So for educational purposes, let’s say we get three different file types from a single source. This single source needs to be presented to the end-user as one and only one source of data. Each of the three file types has a specific structure, which needs to be handled during import. There may be hundreds of files in total, but each file confers to one and one only format.

Generating the Steps

After going through a normal GUI import of each file type in Excel or Power BI Desktop, we can copy out the M code from the advanced editor. Once we have the code it’s fairly easy to convert into a function which we can invoke at will (read: on each file in a folder).

Let’s go over this, by a quick example. Given is a file containing data, for some specific period.

We obviously want to model this into a better shape, so reporting is made easier. We do so manually in Power BI by applying different steps. Here’s the result of my steps.

How I got there is not really important for the sake of this demonstration, only that I can pick up the code behind by looking at it in the advanced editor (apologies for the Danish labels). 

The code will look something like this

or as in the screen shot


Create a Function in M

Turning the above code into a function is actually pretty straight forward. We know we are to iterate a folder full of files, so we know we are going to have the location of each file. Our function will then take a file (location) as parameter.

This is done like this:

Notice only a few minute changes, wrapping the step a let #”Import” = (file) => … in #”Import” is almost all it takes. We only need to apply the parameter to the steps already defined. So we swap the fixed file path with the parameter file. Now we are set to create a new query that iterates files in a folder and invokes this new function, taking the file location as parameter.

Ones this has been done for all of the different types, three in this particular example we need a way of appending the data from the different queries to one another. This is where the try … otherwise structure comes in handy.

Imagine we have a function available in the model for each of the three file types. Let’s call them fnImportFileType1, fnImportFileType2 and fnImportFileType3.  A prerequisite for this to work properly is that we know for a fact that no one file will produce any meaningful result from more than one (1) of the functions. Otherwise we would have duplicate data, and while we love data, duplicating it should be done with careful consideration.

So rather than adding a column for each of the three file types, or simply generating a query for each one and then appending each of those in the end, we can accomplish this with a little cleverness. Now, some might argue that files should be sorted and only those relevant for a given function is actually iterated and parsed – but the world is not always perfect or maybe there is no way of sorting the files (in a reasonable amount of time).

So, without further ado, here is a way of appending data from multiple sources/functions into one single column which can be expanded, as two steps in M.

For each row, we invoke first fnLoadFileType1, if that fails we load fnLoadFileType2 and if that fails we invoke fnLoadFileType3, finally we return null if all loads fail to produce a result.

Conclusion

While I have no idea if this is killing performance in the long run (it doesn’t now, going through 500 files) I get that this is maybe not the perfect situation. However, I am able to produce a (valid) result for our client within a narrow time frame – which is something they really love.

I’d love any input on this, as I am learning as I go along.

 

 

 

Leave a Reply