In just a little over a week, the Power BI World Tour will be stopping by Copenhagen, Denmark – more precisely, in Lyngby at the Microsoft HQ. Luckily, Copenhagen was selected to host the World Tour again this year, which I think will be very beneficial to the local Power BI adoption and community, to say the least.
Last time around I was honoured to have one of my sessions selected. This year I get to deliver two sessions. I am super excited about that!
My first session, on Tuesday the 11th, will be about Power BI Best Practices. From the trenches of some of our own projects, I have gathered a list of things to do in a project to make your life easier.
My other session will be an Introduction to Power BI and Power Query (M). The query language is one of the few things of late that has blown my mind in terms of capability and versatility. I will be showing you how to get started with the basics.
I am so looking forward to spending a couple of days with other Power BI Professionals, foreign and domestic!
My latest work-related project has offered me many first-time experiences. Most prominent is that this is my first fully fledged Power BI project. I have been playing around with Power BI for some time on a personal basis, and back in the summer of 2014 I ran some workshops with a selection of business people in Maersk Line.
The more curious first-time experience is that the company we are developing this Power BI model for was founded 368 years ago. Not months or weeks; years! They’ve been around and then some.
My main technical discovery with this project was the use of try … otherwise in M. I stumbled upon the construct via Gerhard Brueckl (b|l|t) in this blog post of his on error handling.
I found a use for this construct not for direct error handling as such – but then again, in some way it is error handling.
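To illustrate the construct itself, here is a minimal sketch (the Amount column is made up purely for the example): try evaluates the expression, and otherwise supplies a fallback value whenever the expression raises an error.
let
    Source = Table.FromRecords({[Amount = "42"], [Amount = "N/A"]}),
    // Try to convert the text to a number; fall back to null when the conversion fails
    WithNumbers = Table.AddColumn(Source, "AmountAsNumber", each try Number.FromText([Amount]) otherwise null)
in
    WithNumbers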
Data comes in many shapes and forms, and this project is no exception. In order for the project not to explode into hundreds of functions and exponentially as many queries, I had to come up with a way of containing this. So, for educational purposes, let’s say we get three different file types from a single source. This single source needs to be presented to the end-user as one and only one source of data. Each of the three file types has a specific structure, which needs to be handled during import. There may be hundreds of files in total, but each file conforms to one and only one format.
Generating the Steps
After going through a normal GUI import of each file type in Excel or Power BI Desktop, we can copy the M code out of the Advanced Editor. Once we have the code, it’s fairly easy to convert it into a function which we can invoke at will (read: on each file in a folder).
Let’s go over this with a quick example. Given is a file containing data for some specific period.
We obviously want to model this into a better shape, so reporting is made easier. We do so manually in Power BI by applying different steps. Here’s the result of my steps.
How I got there is not really important for the sake of this demonstration, only that I can pick up the code behind it by looking at it in the Advanced Editor (apologies for the Danish labels).
The code will look something like this:
let
Kilde = Excel.Workbook(File.Contents("C:\Users\jve.CMS\OneDrive - CatMan Solution A S\eksempel.xlsx"), null, true),
Tabel1_Table = Kilde{[Item="Tabel1",Kind="Table"]}[Data],
#"Ændret type" = Table.TransformColumnTypes(Tabel1_Table,{{"EAN", Int64.Type}, {"Date", type date}, {"Copenhagen", Int64.Type}, {"London", Int64.Type}, {"Tokyo", Int64.Type}, {"Berlin", Int64.Type}, {"Paris", Int64.Type}}),
#"Fjernet pivotering af andre kolonner" = Table.UnpivotOtherColumns(#"Ændret type", {"EAN", "Date"}, "Attribut", "Værdi"),
#"Omdøbte kolonner" = Table.RenameColumns(#"Fjernet pivotering af andre kolonner",{{"Attribut", "Store"}, {"Værdi", "Value"}})
in
#"Omdøbte kolonner"
or as in the screenshot
Create a Function in M
Turning the above code into a function is actually pretty straightforward. We know we are going to iterate over a folder full of files, so we know we will have the location of each file. Our function will then take a file (location) as its parameter.
This is done like this:
let
#"Import" = (file) =>
let
Kilde = Excel.Workbook(File.Contents(file), null, true),
Tabel1_Table = Kilde{[Item="Tabel1",Kind="Table"]}[Data],
#"Ændret type" = Table.TransformColumnTypes(Tabel1_Table,{{"EAN", Int64.Type}, {"Date", type date}, {"Copenhagen", Int64.Type}, {"London", Int64.Type}, {"Tokyo", Int64.Type}, {"Berlin", Int64.Type}, {"Paris", Int64.Type}}),
#"Fjernet pivotering af andre kolonner" = Table.UnpivotOtherColumns(#"Ændret type", {"EAN", "Date"}, "Attribut", "Værdi"),
#"Omdøbte kolonner" = Table.RenameColumns(#"Fjernet pivotering af andre kolonner",{{"Attribut", "Store"}, {"Værdi", "Value"}})
in
#"Omdøbte kolonner"
in
#"Import
Notice only a few minute changes: wrapping the steps in a let #"Import" = (file) => … in #"Import" is almost all it takes. We only need to apply the parameter to the steps already defined, so we swap the fixed file path for the parameter file. Now we are set to create a new query that iterates the files in a folder and invokes this new function, taking the file location as a parameter – a sketch follows below.
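A minimal sketch of such a query could look like the following, assuming the function query is named Import and using a placeholder folder path. Folder.Files returns one row per file, including its Folder Path and Name columns, which together give us the full file location.
let
    Source = Folder.Files("C:\Data\Import"),
    // Invoke the import function on each file, passing the full file path
    Imported = Table.AddColumn(Source, "Data", each #"Import"([Folder Path] & [Name])),
    // Expand the resulting tables into one combined table
    Expanded = Table.ExpandTableColumn(Imported, "Data", {"EAN", "Date", "Store", "Value"})
in
    Expanded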
Once this has been done for all of the different types – three in this particular example – we need a way of appending the data from the different queries to one another. This is where the try … otherwise construct comes in handy.
Imagine we have a function available in the model for each of the three file types. Let’s call them fnImportFileType1, fnImportFileType2 and fnImportFileType3. A prerequisite for this to work properly is that we know for a fact that no single file will produce a meaningful result from more than one (1) of the functions. Otherwise we would have duplicate data, and while we love data, duplicating it should be done with careful consideration.
So rather than adding a column for each of the three file types, or simply generating a query for each one and then appending each of those at the end, we can accomplish this with a little cleverness. Now, some might argue that files should be sorted and only those relevant for a given function actually iterated and parsed – but the world is not always perfect, or maybe there is no way of sorting the files (in a reasonable amount of time).
So, without further ado, here is a way of appending data from multiple sources/functions into one single column which can be expanded, as two steps in M.
For each row, we first invoke fnImportFileType1; if that fails we invoke fnImportFileType2, and if that fails we invoke fnImportFileType3; finally, we return null if all loads fail to produce a result.
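In M, those two steps could look something like this minimal sketch (again with a placeholder folder path, and assuming the expanded column names from the earlier example):
let
    Source = Folder.Files("C:\Data\Import"),
    // Step 1: try each import function in turn; return null if no function can parse the file
    Imported = Table.AddColumn(Source, "Data", each
        try fnImportFileType1([Folder Path] & [Name])
        otherwise try fnImportFileType2([Folder Path] & [Name])
        otherwise try fnImportFileType3([Folder Path] & [Name])
        otherwise null),
    // Step 2: expand the single Data column into one combined table
    Expanded = Table.ExpandTableColumn(Imported, "Data", {"EAN", "Date", "Store", "Value"})
in
    Expanded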
Conclusion
While I have no idea if this will kill performance in the long run (it doesn’t now, going through 500 files), I accept that this is maybe not the perfect solution. However, I am able to produce a (valid) result for our client within a narrow time frame – which is something they really love.
I’d love any input on this, as I am learning as I go along.
This month’s T-SQL Tuesday is hosted by Jorge Segarra (b|t|l), and the invitation is found by following this link.
T-SQL Tuesday was started by Adam Machanic (b|t), and this is the SQL Server community’s monthly blog party, where everyone is invited to write about a single common topic. This time, the topic is PowerBI.
I am hoping this blog post makes it into the summary by Jorge. Admittedly, I am a bit late. I only saw the invitation tweet today, and I was kind of lucky it was on topic with what I was currently working on.
SSAS Performance Dashboard
My story with this half-baked product (the dashboard you are about to see) is that I needed some way of tracking performance on a couple of Analysis Services (SSAS) query servers. There are a lot of good posts and talks out there about how to collect and store performance counters and SSAS logs, and I suggest you look into this, this or that, if you need inspiration.
The current data set is about 200K rows, as I am sampling each server every five minutes.
The reason why I say this is half-baked is that the querying part of the equation is missing. Currently I am only logging/storing Windows performance counters, and due to release procedures, I have not been able to implement the SSAS eXtended Events that give us the link to the queries run at any given time. Windows performance counters by themselves are not that interesting, because we can’t correlate them with specific queries. So we can’t really tell what makes the servers go bonkers.
By Date
The By Date report is intended to let the end-user browse the data based on a calendar approach. This is to identify peak hours, days, etc.
By Server
The By Server report lets the end-user easily distinguish which workload is related to which server. The rest of the breakdown is again based on the calendar.
Brush
In this example the Brush Chart isn’t really fed the proper kind of data, but I put it in there to let you see the possibilities with it. Mark the lower chart to zoom in on the upper chart.
StreamGraph
This is also a very cool visualization; I am not sure it has any relevance to my data, but it looks awesome!
Final Thoughts
What I really miss for this kind of event-based reporting is a chart type that allows me to have a line for, say, CPU Utilization, and on top of that mark events and their duration, e.g. by a broken line chart or similar. Not sure how to correctly describe this, but kind of Gantt-style on top of a line chart.
I have been working with Power BI since it emerged, and I have been challenged to keep up with all the features and changes the Power BI team has released during the last year. I am really looking forward to seeing what will be served in the next year, even more so because I will be spending more time with Power BI now than before.
Lately, Microsoft seems to have stepped even harder on the gas, releasing amazing feature upon feature for SQL Server 2016 and Power BI, as well as their cloud platform Azure. This blog post will run through some of the features I find most interesting in the latest couple of releases. With that said, Microsoft is releasing new features on a weekly basis for the Power BI service, on a monthly basis for the Power BI Designer, and has made three Community Technology Previews (CTPs) available over the summer for SQL Server 2016.
Just back from PASS Summit, I am a happy camper, full of all the great new stuff already there, in preview, or about to hit us with the vNext of SQL Server.
See it all in a foundation presentation by James Philips (t|l), Corporate VP at Microsoft.
Without further ado, here’s my list of points of interest – it may be a mess to you, but reflects my personal interest in the Microsoft tool stack at this moment.
SQL Server 2016
Stretch Database, for individual tables. See blog post by Marco Freccia, CTP2
Quick test: is this feature for me?
Multiple TempDB files. See blog post by Jeff Shurak, CTP2.4
SQL Server now defaults to 8 data files or the number of cores (read: threads), whichever is less, for the TempDB files.
Always Encrypted, see blog post by TechNet, CTP3
Always Encrypted allows clients to encrypt sensitive data inside client applications and never reveal the encryption keys to the Database Engine
Native JSON Support, see blog post by Jovan Popovic, CTP3
The feature set is growing with each CTP release; currently SQL Server can format and export data as JSON strings, load JSON text into tables, extract values from JSON text, index properties in JSON text stored in columns, and more is to come.
Temporal Tables, see MSDN Article, CTP3
Allows you to keep a full history of data changes and allows easy point-in-time analysis. Temporal Tables are a new type of user table in SQL Server 2016.
Row Level Security (RLS), see MSDN Article, CTP3
RLS enables you to implement restrictions on data row access. For example ensuring that workers can access only those data rows that are pertinent to their department, or restricting a customer’s data access to only the data relevant to their company.
PolyBase, see MSDN Article
PolyBase allows you to use T-SQL statements to access and query, in an ad-hoc fashion, data stored in Hadoop or Azure Blob Storage.
Parallel Partition Processing in Tabular, see blog post by Haidong Huang, CTP2
Allows for faster processing of Tabular cubes. Mind the default settings though, read the blog post.
Single Package Deployment (Project Mode), see blog post by Andy Leonard (b|l|t)
Now you’ll be able to deploy a single package in a project, to fix a bug or similar scenario. Read the blog post, as there is no such thing, as a free lunch!
Azure
AlwaysEncrypted for Azure SQL Database, see blog post by MSDN, Preview
Azure Data Lake Store
The Data Lake Store provides a single repository where you can capture data of any size, type, and speed, simply and without forcing changes to your application as the data scales.
Azure Data Lake Analytics Service
The analytics service can handle jobs of any scale instantly by simply setting the dial for how much power you need. You only pay for your job while it is running, making it cost-effective.
Azure IoT Hub
Connect, monitor, and control millions of IoT assets running on a broad set of operating systems and protocols.
Data Catalog
Data Catalog lets users—from analysts to data scientists to developers—register, discover, understand, and consume data sources.
SQL Data Warehouse
Azure SQL Data Warehouse is an elastic data warehouse as a service with enterprise-grade features based on the SQL Server massively parallel processing architecture.
If that’s not enough for you, check out the impressive set of Azure services here.
Power BI
I can’t keep up!! (which is probably a good thing)
Duplicate Report Page, read the Power BI weekly update blog
A common scenario when creating reports is to have multiple pages that are identical except for having different filters applied.
Collapse Navigation Pane through a URL parameter, read the Power BI weekly update blog
Allows you to add a URL parameter to your link that will automatically collapse the left navigation pane for any visitor who clicks the link.
Full Screen Mode, read the Power BI weekly update blog
Enables Full Screen Mode for Power BI dashboards and reports.
I was very happy to get the news that I have been selected to speak at Campus Days 2014 in Copenhagen, Denmark. The session will mainly focus on how to integrate data into Power Map and Power View, what the new features are, and how to present that to the business… The session level is 200, which means that some prior knowledge of Power Pivot is expected, but if you’re new to it, you should probably still be able to tag along.