A Hello World Custom Assembly in SSAS

This is my second post in a series of entry-level posts, as suggested by Tim Ford (b|l|t) in this challenge.
I have not set any scope for the topics of my, at least, twelve (12) posts in this new blog category of mine, but I can assure you, it’ll be focused on SQL Server stuff. This one is going to be about how to extend Analysis Services (SSAS) with your own custom code.

The example in this post will be the well-known Hello World example, in the context of SSAS, and I trust it will illustrate the possibilities of this technique well enough for you to apply it to your own challenges.

Creating the Custom Code

First of all, we have to create the custom code. This can be done with the free tool Visual Studio (download). To get going, we create a new project via [Ctrl]+[Shift]+[N] or via the dialog below:

New Project Dialog

In the next dialog, we select both which programming language to use for the project and what kind of output the compiler should produce. In this example I select a Class Library output, programmed in Visual C#. This will provide us with an assembly, which we can import directly into the SSAS server.

Class Library Dialog

First, a bit of housekeeping. Even though this is just a tiny example, there’s no reason not to do things right. Naming conventions should be agreed on before diving into the code.
We begin by renaming the entire solution:

Rename Solution

I name this one ‘CustomAssembly’

Next we rename the project:

Rename Project

Finally we need to rename the Class File:

Rename Class File

I have named mine HelloWorld.cs. With this in order, we are now ready to implement our custom code.
Of course the code in this example is simple, but I trust you will be able to spot the potential of the technique.

The Class implementation looks as follows:
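Roughly along these lines; a minimal sketch, where the namespace and the SayHello method name are my own placeholders (any public method returning a type SSAS can handle will do):

    namespace CustomAssembly
    {
        public class HelloWorld
        {
            // Returns the same greeting on every call - no parameters, no state.
            public static string SayHello()
            {
                return "Hello World!";
            }
        }
    }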

As I said, very simple. The function simply returns the same text for every call.
In order to leverage this via SSAS, we need an assembly to register. We now just need to build the project successfully in Visual Studio and we are off.

Right-click the Solution name at the top of the Solution Explorer ([Ctrl]+[Alt]+[L]) and choose Build Solution, or press [Ctrl]+[Shift]+[B] to build.

Build Solution

This will produce the desired output and put it in a predefined location. Right-click the Project and select ‘Open Folder in File Explorer’.

Open Folder in File Explorer

In the bin folder, depending on your build configuration (Debug/Release), you will find your assembly. Remember the location, or move the assembly to a known location. Issues can occur if you register the assembly from the same location you build to.

Registering the Assembly with SSAS

We begin by opening SQL Server Management Studio (SSMS) and connecting to the server we want to test our custom code on. Once connected, we can right-click the Assemblies collection and choose to add a new Assembly.

Add New Assembly

This opens a new dialog, in which we can specify different settings, most importantly those for security and impersonation. For details, please see this MSDN description. The different permission levels are:

Safe – Provides internal computation permission. This permission bucket does not assign permissions to access any of the protected resources in the .NET Framework. This is the default permission bucket for an assembly if none is specified with the PermissionSet property.

ExternalAccess – Provides the same access as the Safe setting, with the additional ability to access external system resources. This permission bucket does not offer security guarantees (although it is possible to secure this scenario), but it does give reliability guarantees.

Unsafe – Provides no restrictions. No security or reliability guarantees can be made for managed code running under this permission set. Any permission, even a custom permission included by the administrator, is granted to code running at this level of trust.

In my example I will not use a dedicated account, which I would otherwise highly recommend if you were to take this into production.

I have selected the path where Visual Studio is configured to put the output, set the Permission Setting to Safe, and chosen to impersonate the Current User.

Register Assembly Dialog

When we click OK, the dialog will close, and the collection of assemblies on the instance will be updated to contain our newly created custom assembly.

Assembly Registered

Run the Query

The above steps enable us to query the function we created directly from MDX, as shown in the screenshot below:

Result Set
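The query behind a result like that is a one-liner; a rough sketch, reusing the placeholder SayHello method from above and with [YourCube] standing in for any cube on the instance:

    WITH MEMBER [Measures].[Greeting] AS CustomAssembly.SayHello()
    SELECT [Measures].[Greeting] ON COLUMNS
    FROM [YourCube];

Run from an MDX query window in SSMS, this returns a single cell containing the greeting.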

Note! The Assembly Name applied in the registration is also the name I use in the MDX query. Think of it as an alias for the class; the method name itself does not change. Had I named the Assembly MyFirstCode, the MDX would instead look like this:
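Again just a sketch, with the same placeholder method and cube names as before:

    WITH MEMBER [Measures].[Greeting] AS MyFirstCode.SayHello()
    SELECT [Measures].[Greeting] ON COLUMNS
    FROM [YourCube];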

Next steps

As described above, we can extend Analysis Services with all kinds of imaginable custom code. We can even pass in parameters, in order to give the algorithm in question some context, as sketched below. We can of course also have multiple algorithms per assembly, allowing us to select a specific one for each specific MDX query.
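For instance, the hypothetical HelloWorld class from earlier could expose an overload taking an argument; a minimal sketch:

    // Hypothetical overload of the earlier method, taking a parameter for context.
    public static string SayHello(string name)
    {
        return "Hello " + name + "!";
    }

From MDX, this could then be called as CustomAssembly.SayHello("World"), passing whatever context the query needs.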

An assembly can, as shown above, be registered at the server level. But we can also register our assembly in the private assembly collection of any number of databases on the server. This allows us to differentiate the code base, based on the solution at hand.

Now, I am not saying custom assemblies are the solution to all of our challenges on the platform, but in the course of my career I have implemented a couple of solutions where I have used this technique. The end result was far more scalable, robust and better performing than the equivalent MDX voodoo it was set to replace.

 


Permission Scope in Analysis Services

This is my first post in a series of entry-level posts, as suggested by Tim Ford (b|l|t) in this challenge.
I have not set any scope for the topics of my, at least, twelve (12) posts in this new blog category of mine, but I can assure you, it’ll be focused on SQL Server stuff. This one is going to be about permission scope in Analysis Services (SSAS).

What this post will not be about: how to set up basic dimension security in SSAS, nor how to manage security.

In this post, I will highlight the difference between standard NTFS permission scope and the way SSAS handles Allowed and Denied sets when dealing with multiple roles. So if you define multiple roles on your solution, you should be on the lookout, because SSAS has some surprises.

NTFS

In NTFS, permissions are defined so that deny generally takes precedence over allow. As illustrated in the diagram below, we can define an Allowed set as a subset of the total set. The blue rectangle is to be seen as the full set.

NTFS Allowed Set

As we then apply a Denied set, it restricts the previous Allowed subset; what remains allowed is the Allowed set minus the Denied set, illustrated below (the color indicates the final allowed set):

NTFS Denied Set

SSAS

In SSAS however, the Allowed set takes precedence over Denied. So if you apply your usual NTFS logic to your dimensional security in SSAS, you may well be in for a surprise.

When no security is defined in a cube, everything is accessible to everyone. The color indicates the allowed set (the complete rectangle).

Entire Cube Space

As soon as you introduce the first role, the cube is locked down for everyone who is not a member of said role.

Role w/ Allowed Set Introduced

If you then introduce restrictions in another role, you will get a different result than with NTFS-based security. Members of the roles will still be able to see the full Allowed set, even though we just denied part of it!

Role w/ Denied Set Introduced

Example

By creating two test roles, we can easily demonstrate the somewhat unexpected behavior of SSAS.

Allowed Set Defined


As seen in Role Test 1, I have defined the Allowed set to contain Weeks 41 and 42 of the Year 2015. By browsing the Cube through SQL Server Management Studio, we can verify that the Allowed set is working:

Allowed OK
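For reference, the Allowed member set behind Role Test 1 is just an MDX set expression on the week attribute, roughly like this (the dimension, hierarchy and member keys are made up and depend on your own date dimension):

    { [Date].[Calendar Week].&[2015]&[41],
      [Date].[Calendar Week].&[2015]&[42] }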


As I then introduce a new Role, Test 2, denying Week 42 of 2015, I would expect the browsing result to display only Week 41, but…

Denied Set


The result we get, when browsing the cube using both roles, shows all dates with data: (WTF! – Yes, you said it!)

Denied Set actually allowing everything?


Fix

Clearly, that was not the intention of denying Week 42. So how do we fix the above violation of the Allowed set?

By adding an empty Allowed set ‘{}’ to the role containing the Denied set, in this case Test 2, as depicted below:

With this empty set ‘{}’ in place, we can browse the Allowed set again; note, however, that the Denied set still does not restrict the result set.

Allow-Denied-Meh
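In set terms, Role Test 2 then ends up defined roughly like this (member keys made up, as before), while Test 1 keeps its original Allowed set:

    // Role Test 2 - Allowed member set (the fix: explicitly empty)
    { }

    // Role Test 2 - Denied member set
    { [Date].[Calendar Week].&[2015]&[42] }

Members of both roles consequently see Weeks 41 and 42, as described above.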

Final Thoughts

While the way SSAS interprets the Allowed and Denied sets may be straightforward for simple security tasks, I think the above examples show just how much testing one needs to do in order to be totally sure the desired permissions are reflected in the end-user experience.

As a side note, adding an empty set to the Denied Set of the first Role (Test1) does not alter the outcome.

 


The Circle of Work Life

Before my now roughly two and a half years in Maersk Line IT, I was working both as a consultant and as an internal developer at Rehfeld Partners, now part of IMS Health. At Rehfeld I was part of the Effektor team, but also worked in the Health Care team as well as the Private Sector team. Before joining Rehfeld, I had a short stop at KMD, which I will gracefully skip describing here. But about eight (8) years ago, I left Knowledge Cube, where I had been scooped up as a Business Intelligence rookie. Initially the company (A-Ware) had three (3) partners, of whom two stayed on to become partners in Knowledge Cube. One of these partners, Erik Svensen (b|l|t), is now owner of the company CatManSolution. Erik was one of the initial three who instilled BI into my career path, and now he will do so once more.

From today, I will (again) be working with the team on CatManSolution as a Business Intelligence Architect (effectively: BI Dude, so no change there). I am very much looking forward to taking on this opportunity, as I see a lot of fun and challenging tasks ahead – with great people around me. Not to mention a lot of potential for growth and research & development. I will, going forward, be working with the full suite of SQL Server as well as Azure offerings. I have missed out on that during my tenure at Maersk, so it’ll be good to get hands-on again, rather than just reading up on new features and products.

CatManSolution allows you, as an end user, to find report groups and concepts such as:

  • Status and ranking
  • Focus on development
  • Benchmarks and potentials
  • Promotion management
  • Zero sale and distribution
  • Price analyzer
  • New launchings
  • StorePictures and StoreEvent
  • Stock analysis

Data is coming in from a number of different sources, and obviously more sources are to come.


#TSQL2SDAY – Data Modeling Gone Wrong

This month marks the 72nd T-SQL Tuesday. Adam Machanic (b|l|t) started the T-SQL Tuesday blog party in December of 2009. Each month an invitation is sent out on the first Tuesday of the month, inviting bloggers to participate in a common topic. On the second Tuesday of the month all the bloggers post their contributions to the event for everyone to read. The host sums up all the participants’ entries at the end of the week.
This month Mickey Stuewe (b|l|t) is the host and the topic is …

Data Modeling Gone Wrong

I am really looking forward to the other entries – because “cleverness” never seems to know any boundaries when it comes to database design or modeling, and I just know there are some crazy things going on out there. So be sure to tune in to the host summary that will appear on Mickey’s blog in the near future.

For my own part, I will refer to the latest “bright idea” that I came across. The application in question was based on SQL Server and was used to track price quotes. The application on top was in use at that point in time, and enhancements were being developed, with the business heavily depending on it being “on-line”.

In general there were three (3) types of quotes. For each type, there was a set of tables in the database that was almost identical. Only some attributes were different, depending on the type, but an inconsistent naming convention still gave away the major relations between tables in the database.
This meant that the application relied on three (3) sets of almost identical tables. This could most definitely have been designed differently, but it’s not the real cluster f**k of this application, so I will not go into detail here.
In every table of the database, there was a column name with the postfix ‘key‘. There was also a column name with the postfix ‘id‘ in all of the tables. At first sight, that looked like two tech leads having a ball. But actually it was not. I discovered later that there had been only one “architect” behind this application. The good thing about that was that the problem was easy to “contain”, since that particular employee was no longer working on the project. :)

After some investigation and data profiling, I slowly homed in on the fact that the data in columns blablabla_key and blablabla_id wasn’t related in any way. Nor was key in one table related to key in any other table. Neither was id. In fact, there were no foreign keys defined in any of the tables, so no referential integrity was ensured. In theory, even if I found the right match, crappy data could actually obfuscate the fact that there really was a relation.

Further investigation led to the conclusion that id and key were in no way inter-related. So id in one table was not related to key in another table resembling the naming convention. No, it turns out that for every quote, the business operates with a term called validity. The id of the respective validity, one (1) of three (3), is related to a specific quote key, which was in no way reflected in the naming convention of either tables or columns. E.g. QuoteKey was related to ValidityId, in each of the respective quote type schemas. But that’s not all. For each of the three types of quotes, two validities had been combined into one table. In fact, additional logic had to be applied: if one validity was not present in said table, the other should take over. Sort of a fail-safe validity. This meant that keys, if not present, were to be treated differently…

Oh Come on!

Needless to say, I spent a good amount of time trying to figure out how to combine the data, so we could do some proper business intelligence on top. Conference call after conference call left me empty-handed, and to this day I am not sure I could have asked the developers any differently. The disconnect between me and them was huge; too huge to identify, it seems. Asking one thing and getting a seemingly sane answer just made it even more difficult, because none of the answers proved valid when querying the database. In the end, relentless profiling of the database made the relations stick out.

So, to wrap up: get serious about naming conventions, even if they seem to be a pain. Secondly, and more importantly, do define the relations in the database. There are several benefits to doing that, even though you will probably meet someone who will argue against it.
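To make the second point concrete, declaring a relation is a one-time effort; a minimal sketch, with table and column names made up to mirror the example above:

    -- Hypothetical tables; the constraint makes the Quote-to-Validity relation explicit
    CREATE TABLE dbo.Validity
    (
        ValidityId INT NOT NULL CONSTRAINT PK_Validity PRIMARY KEY
    );

    CREATE TABLE dbo.Quote
    (
        QuoteKey   INT NOT NULL CONSTRAINT PK_Quote PRIMARY KEY,
        ValidityId INT NOT NULL
            CONSTRAINT FK_Quote_Validity FOREIGN KEY REFERENCES dbo.Validity (ValidityId)
    );

With the relation declared, the engine guards referential integrity, and anyone profiling the database later can read the relation straight out of the metadata.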

As a smart exit remark, I wanted to end with a quote (only fitting for this post), but unfortunately I wasn’t able to find its original source. I did some extensive searching on Twitter yesterday, without any luck. To the best of my knowledge, this is something Adam Machanic (b|l|t) supposedly said. Don’t kill me if it’s not :)

“Performance doesn’t matter, when you have Dirty Data.”

What I read from this quote is that if you have dirty data, no performance in the world will ever make up for it. This is where Foreign Key constraints become your best friend. There are actually several good articles out there on Foreign Key Constraints:

Do Foreign Key Constraints Help Performance? by Grant Fritchey (b|l|t)
Foreign Key Constraints: Friend or Frenemy? by Mickey Stuewe (b|l|t)
Do Foreign Keys Matter for Insert Speed? by Jeremiah Peschka (b|l|t)

Thanks to Mickey Stuewe for hosting such a brilliant topic. I look very much forward to reading the other entries!


Everything Microsoft BI (as of late)

Lately Microsoft seems to have stepped on the gas even more, releasing amazing feature upon feature for both SQL Server 2016 and Power BI as well as their cloud platform Azure. This blog post will run through some of the features I find most interesting in the latest couple of releases. With that said, Microsoft is releasing new features on a weekly basis for the Power BI service, on a monthly basis for the Power BI Designer, and has made three Community Technology Previews (CTPs) available over the summer for SQL Server 2016.
I am just back from PASS Summit, and I am a happy camper, full of all the great new stuff already there, in preview, or about to hit us with the vNext of SQL Server.
See it all in a foundation presentation by James Phillips (t|l), Corporate VP at Microsoft.

Without further ado, here’s my list of points of interest – it may be a mess to you, but reflects my personal interest in the Microsoft tool stack at this moment.

SQL Server 2016

Stretch Database, for individual tables. See blog post by Marco Freccia, CTP2
Quick test, is this feature for me?

StretchDB

Multiple TempDB files. See blog post by Jeff Shurak, CTP2.4
SQL Server now defaults to 8 data files or the number of cores (read: threads), whichever is less, for the TempDB files.

AlwaysEncrypted, See blog post by TechNet, CTP3
Always Encrypted allows clients to encrypt sensitive data inside client applications and never reveal the encryption keys to the Database Engine

Native JSON Support, see blog post by Jovan Popovic, CTP3
The feature set is growing with each CTP release; currently SQL Server can format and export data as JSON strings, load JSON text into tables, extract values from JSON text, and index properties in JSON text stored in columns, with more to come.
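As a quick taste of the export side, a small sketch against a made-up table:

    -- Return the rows of a (hypothetical) table as a JSON array
    SELECT TOP (2) ProductId, ProductName, ListPrice
    FROM dbo.Product
    FOR JSON PATH;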

Temporal Tables, see MSDN Article, CTP3
Temporal tables allow you to keep a full history of data changes and enable easy point-in-time analysis. A temporal table is a new type of user table in SQL Server 2016.
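A system-versioned temporal table is declared directly in the CREATE TABLE statement; a minimal sketch with made-up names:

    CREATE TABLE dbo.ProductPrice
    (
        ProductId    INT       NOT NULL PRIMARY KEY CLUSTERED,
        ListPrice    MONEY     NOT NULL,
        SysStartTime DATETIME2 GENERATED ALWAYS AS ROW START NOT NULL,
        SysEndTime   DATETIME2 GENERATED ALWAYS AS ROW END   NOT NULL,
        PERIOD FOR SYSTEM_TIME (SysStartTime, SysEndTime)
    )
    WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.ProductPriceHistory));

Point-in-time analysis is then a matter of adding FOR SYSTEM_TIME AS OF to an ordinary SELECT against the table.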

Row Level Security (RLS), see MSDN Article, CTP3
RLS enables you to implement restrictions on data row access. For example ensuring that workers can access only those data rows that are pertinent to their department, or restricting a customer’s data access to only the data relevant to their company.

PolyBase, see MSDN Article
PolyBase allows you to use T-SQL statements to access and query in an ad-hoc fashion data stored in Hadoop or Azure Blob Storage.

Parallel Partition Processing in Tabular, see blog post by Haidong Huang CTP2
Allows for faster processing of Tabular cubes. Mind the default settings though, read the blog post.

Single Package Deployment (Project Mode), see blog post by Andy Leonard (b|l|t)
Now you’ll be able to deploy a single package in a project, to fix a bug or a similar scenario. Read the blog post, as there is no such thing as a free lunch!

Azure

AlwaysEncrypted for Azure SQL Database, see blog post by MSDN, Preview

Read more about all Azure Preview Features here

Azure Data Lake Store
The Data Lake Store provides a single repository where you can capture data of any size, type and speed simply, without forcing changes to your application as the data scales.

Azure Data Lake Analytics Service
The analytics service can handle jobs of any scale instantly; you simply set the dial for how much power you need. You only pay for your job while it is running, making it cost-effective.

Azure IoT Hub
Connect, monitor, and control millions of IoT assets running on a broad set of operating systems and protocols.

Data Catalog
Data Catalog lets users—from analysts to data scientists to developers—register, discover, understand, and consume data sources.

SQL Data Warehouse
Azure SQL Data Warehouse is an elastic data warehouse as a service with enterprise-grade features based on the SQL Server massively parallel processing architecture.

If that’s not enough for you, check out the impressive set of Azure services here.

Power BI

I can’t keep up!! (which is probably a good thing)

Duplicate Report Page, read the Power BI weekly update blog
A common scenario when creating reports is to have multiple pages that are identical except for having different filters applied.

Collapse the Navigation Pane through a URL parameter, read the Power BI weekly update blog
Allows you to add a URL parameter to your link that will automatically collapse the left navigation pane for any visitor who clicks the link.

Full Screen Mode, read the Power BI weekly update blog
Enables Full Screen Mode for Power BI dashboards and reports.

If you want to see some of the amazing things you can already do in Power BI, please visit the Power BI Best Visual Contest page

Not to mention all the features updated each month in the Power BI Desktop application, see latest updates here.

If you find something missing, Microsoft is actually listening, so please feel free to register and suggest your fantastic new idea at this site.
