What to make of it?
Currently I am conducting a review on all existing solutions. These solutions have seen their share of developers over the years. Hence there are several “trademarks” to be found. One such trademark threw me off a bit. It seemed as if the same method had been applied to all queries in the solution. For every expression in the where clause, there would be a similar sub select statement. So if there was a slice by [Product].[Category].&[4], there would be a Sub Select eg. ( SELECT [Product].[Category].&[4] ON COLUMNS FROM… )
I hadn’t seen this construct before, so I felt challenged to find out, if this had some purpose or it was poorly written MDX. I acknowledge, that sometimes development is rushed, and you leave behind unwanted code. But this looked as if it was written to stay. There was almost no complexity to the MDX, other than the odd construct… An example of a statement converted into Adventure Works lingo, would look something like this:
SELECT {[Measures].[Internet Sales Amount]} ON COLUMNS , NONEMPTY {[Promotion].[Promotion].[Promotion]} ON ROWS FROM ( SELECT [Scenario].[Scenario].&[1] ON COLUMNS FROM ( SELECT [Sales Territory].[Sales Territory].[Group].&[Europe] ON COLUMNS FROM ( SELECT [Product].[Category].&[4] ON COLUMNS FROM [Adventure Works] ) ) ) WHERE ( [Scenario].[Scenario].&[1] , [Sales Territory].[Sales Territory].[Group].&[Europe] , [Product].[Category].&[4] )
I looked up Sub Cubes in Analysis Services 2008 Unleashed, which is a ressource I would highly recommend it, if you don’t already know of it. In this book, a sub cube is described as an arbitrary shaped discret and limited sub space of the cube. This illustrated by the following query (Query 1) creating the following sub space (Figure 1 – grey area):
One of the important Things to notice about sub cubes are, that the Default member can change, if the “normal” default member is not a part of the sub space. Then the first member of the sub space will become the default member.
I was reconfirmed in most of my knowledge on sub cube when I dug into the documentation on the subject, and nothing would lead me to believe, that the construct was other than a misunderstanding. So I tweetet a question using hashmark #ssashelp, and Brent Ozar (blog|twitter), Chris Webb (blog|twitter) and Robert L Davis (blog|twitter) replied, after twisting some arms 🙂 This lead to this blog post, describing the issue, in just a little more than 140 characters…So lets see what replies are in store! To my best knowledge, the first query in this blog post, is more confusing than brilliant.
If this pattern is being applied to *every* query, then it’s definitely wrong and I’ll bet the person writing the queries didn’t know what they were doing. However, that doesn’t mean it might not be a good idea in some cases (and I’ll bet the person who wrote the queries probably found one of these scenarios and decided to apply it to every query).
I’ve seen several cases where using subselects changes the query plan used by SSAS and improves performance. However, in general subselects should be avoided because (before the very latest CUs of 2012) they prevent the effective use of the formula engine cache:
http://cwebbbi.wordpress.com/2013/08/07/subselects-and-formula-engine-caching-in-ssas-2012/
http://cwebbbi.wordpress.com/2008/10/28/reporting-services-generated-mdx-subselects-and-formula-caching/
Subselects also do something similar to the where clause, but not quite the same thing. Subselects should really be used for controlling how subtotals are displayed in a query; in the query you show above this is not relevant and you should get the same results without the subselect. If you have a hierarchy that is on a visible axis and the All Member is selected somewhere on that axis, and if that same hierarchy is present in the subselect, then you will see this impact on how the subtotals (ie the value of the All Member) is displayed.
The one of the main differences between a subselect and a where clause is that a subselect doesn’t affect what the currentmember function returns, whereas a where clause does. It could be that whoever wrote these queries started off using subselects, found that calculated members didn’t behave the way they were expecting, and then had to add a where clause to overcome this.
In summary, I’m 90% sure that there’s no good reason to follow this pattern. But as always, you need to test properly before you change anything…
Hi Chris
Thanks for your reply, it is very much appreciated!
The pattern seems to be in some of the solutions, indicating that a certain developer has had the notion, that this was a good practise.
I have run some test, tracing the queries, and they come up with alomst the same footprint. The results can be categorised in two, first one having only a regular where clause and a select from the cube. The second one is a combination of sub selects, either nested or non-nested, with or without the where clause. There is a slight increase in rows read in the first scenario, about 5% (depends of course on the data). But the second scenario all shows the same footprint in regards to reads, read_kb, rows_scanned and rows_returned. Are there be any other way of seing a difference in the footprint left from the different queries?
Thanks for sharing your articles, they were good to read.
Br
Jens