Just now I had the requirement to create a loop in SSIS that iterates over the alphabet.
To facilitate that, I wrote a T-SQL statement that produces the alphabet including the Danish Æ, Ø and Å. If you don't need these, you can leave out the latter part of the WHERE clause (the values 197, 198 and 216).
SELECT CHAR(number)
FROM master..spt_values
WHERE type = 'P'
AND (
    number BETWEEN 65 AND 90
    OR number IN (197, 198, 216)
)
Link to an ASCII table with possible values ranging from 0 to 255.
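Note that ordering by number yields ASCII order, which puts the Danish letters after Z as Å, Æ, Ø, while the Danish alphabet ends Æ, Ø, Å. If the loop must follow Danish alphabetical order, a small tweak should do it (a sketch, assuming the Danish_Norwegian collation is available on the server):
SELECT CHAR(number) AS Letter
FROM master..spt_values
WHERE type = 'P'
AND (
    number BETWEEN 65 AND 90
    OR number IN (197, 198, 216)
)
ORDER BY CHAR(number) COLLATE Danish_Norwegian_CI_AS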
A client of ours had the business requirement that they wanted to be able to search for members in certain dimensions in their cube, in particular two specific dimensions: patients and health care persons (doctors, dentists, nurses, etc.). They wanted to be able to search by first name, last name, age, social security number and other attributes.
Now, in Analysis Services (SSAS) this can be done by means of MDX, but it doesn't perform very well, at least not when the number of members in the dimension exceeds 10,000. This limit of course varies with the hardware, but the main point is that SSAS doesn't scale very well in this regard.
Solution
The solution was to create a custom assembly that could utilize the fact that the dimensions are built on tables from a datamart. In short, we query the underlying datamart rather than the cube itself. This of course requires a bit of configuration, since we want to be able to use the assembly in more than one cube.
The method I made in this example takes three string parameters: the dimension name, the attribute name and the search string.
The signature is chosen in order to control the output in accordance with the dimension/attribute we want to query through MDX (not T-SQL). The method uses the first two parameters to build the member that becomes part of the result, e.g. "[<dimensionName>].[<attributeName>].&[<key_from_sql_statement>]". One could argue for putting this into the config file as well, and yes, that could be done. The details of the configuration are not the point of this post, so I will leave it up to the individual situation to deal with that issue and choose how this is done.
Configuration
This really was the hardest part. I have a .NET background and took some things for granted when I tried to implement this assembly. First I thought that if I deployed the <Assembly Name>.dll.config, the assembly would be able to use the built-in .NET ConfigurationManager and read the settings in the file. Even when following the advice of this post (although about SSRS) by setting the permission level, I didn't succeed in getting the configuration values "the ordinary way"; the end result was a NullReferenceException every time. The only way, then, is to manually load the config file as an XML document and query it for the needed attributes. This is a bit of a hassle, and I had hoped the more direct and regular approach would have worked out. If anyone has clues on how to make this work, please let me know.
Execution
The really nice part about this solution is that once the custom assembly is defined, you can fire MDX at will from SQL Server Management Studio (SSMS), Reporting Services (SSRS) or any other tool that allows you to modify the MDX. Exceptions include Targit, amongst others…
An example of an MDX statement that includes a call to a custom assembly:
WITH
SET [Results] AS
    STRTOSET(
        CustomSearch.Find(
            'Customer'
            , 'Customer'
            , '#LASTNAME#=Car|#FIRSTNAME#=Rob'
        )
    )
SELECT
    {} ON COLUMNS
    , {[Results]} ON ROWS
FROM [Adventure Works]
The resulting T-SQL, based on the definition in the config file, looks like this:
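A minimal sketch of the statement it generates, assuming the config file maps the #FIRSTNAME#/#LASTNAME# tokens to the corresponding columns of a DimCustomer table in the datamart (whether the match is a prefix or a contains search is likewise up to the config):
SELECT c.[CustomerKey]
FROM dbo.DimCustomer c
WHERE c.[LastName] LIKE 'Car%'
AND c.[FirstName] LIKE 'Rob%'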
The method in the assembly then uses the c.[CustomerKey] to build the MDX member.
If the result is empty, an empty set '{}' is returned.
Flexibility
Certainly there are more ways to use this construct than the one I've shown here, but the part I really like about it is that I can write my own T-SQL and my own MDX on top of the result. There is no hidden part that does some magic trick; that's left entirely to the magician behind the keys.
A side effect that I first spotted while writing this post is that often there is no separate attribute in SSAS for first name and last name; they would typically be concatenated into a name attribute or similar. With this solution, due to the way the datamart is constructed, we can search for 'Rob' in the first name and 'Car' in the last name separately. Done with MDX alone, we would be flooded by members that contained 'Rob' in the last name and 'Car' in the first name.
Performance
This is probably the most compelling reason for doing this. I have no hard numbers at the moment, but it shouldn't be too hard to see that as the number of members climbs, the query time of an MDX InStr( … ) comparison skyrockets as well. The Analysis Services engine can in no way beat the SQL engine on its home turf.
Pitfalls
You need to be sure you are querying tables that have been processed into the cube. An easy mistake is to query an updated SQL Server table, get the result back and use it to query a dimension in Analysis Services that has not yet been processed.
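One way to catch this (a sketch, assuming a linked server named SSAS_LINKED pointing at the Analysis Services instance) is to compare the cube's last processing time, exposed through the $SYSTEM.MDSCHEMA_CUBES schema rowset, with the load time of the relational tables:
-- SSAS_LINKED is a hypothetical linked server to the SSAS instance.
SELECT *
FROM OPENQUERY(SSAS_LINKED,
    'SELECT [CATALOG_NAME], [CUBE_NAME], [LAST_DATA_UPDATE]
     FROM $SYSTEM.MDSCHEMA_CUBES')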
While looking for some other settings, I stumbled upon this very nice, and very well hidden, feature in SQL Server Management Studio (SSMS): the ability to set your default column type!
While searching for a fairly large dataset to use for performance baselining, I was looking at the option of duplicating data in the Adventure Works database to match our needs.
In order to check for weak spots, I need at least 10M fact rows and preferably at least one dimension with approximately 1M members.
One approach would be to generate the cross product of all customers by first, middle and last name, which makes approximately 11M records. These members could then be matched to an arbitrary number of facts.
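A rough sketch of that cross product, assuming the DimCustomer table from the Adventure Works DW sample with its FirstName, MiddleName and LastName columns:
-- Cross product of the distinct first, middle and last names;
-- with the sample data this comes to roughly 11M combinations.
SELECT f.FirstName, m.MiddleName, l.LastName
FROM (SELECT DISTINCT FirstName FROM dbo.DimCustomer) f
CROSS JOIN (SELECT DISTINCT MiddleName FROM dbo.DimCustomer) m
CROSS JOIN (SELECT DISTINCT LastName FROM dbo.DimCustomer) l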
SELECT TOP ' + CAST(@RecordCount AS VARCHAR(4)) + ' f.ProductKey, OrderDateKey, f.DueDateKey, f.ShipDateKey, '
    + CAST(@CustomerKey AS VARCHAR) + ' AS CustomerKey, f.PromotionKey, f.CurrencyKey, f.SalesTerritoryKey, '''
    + @SalesOrderNumber + CAST(@SONumber AS VARCHAR(10)) + ''' AS SalesOrderNumber,
    f.SalesOrderLineNumber, f.RevisionNumber, f.OrderQuantity, f.UnitPrice, f.ExtendedAmount,
    f.UnitPriceDiscountPct, f.DiscountAmount, f.ProductStandardCost, f.TotalProductCost,
    f.SalesAmount, f.TaxAmt, f.Freight, f.CarrierTrackingNumber, f.CustomerPONumber
    FROM dbo.FactInternetSales f
    ORDER BY NEWID()
)
INSERT INTO dbo.FactInternetSales
SELECT
ProductKey
, OrderDateKey
, DueDateKey
, ShipDateKey
, CustomerKey
, PromotionKey
, CurrencyKey
, SalesTerritoryKey
, SalesOrderNumber
, ( RANK() OVER ( ORDER BY NEWID() ) ) AS SalesOrderLineNumber
, RevisionNumber
, OrderQuantity
, UnitPrice
, ExtendedAmount
, UnitPriceDiscountPct
, DiscountAmount
, ProductStandardCost
, TotalProductCost
, SalesAmount
, TaxAmt
, Freight
, CarrierTrackingNumber
, CustomerPONumber
FROM Data'
--PRINT (@SQL)
EXEC(@SQL)
FETCH NEXT FROM customer_cursor
INTO @FirstName, @MiddleName, @LastName;
END
CLOSE customer_cursor;
DEALLOCATE customer_cursor;
COMMIT TRANSACTION
Running this script will generate between 1 and 275 million fact rows, depending on how long you let it run.
Alert:
This script was put together in the least amount of time possible, with absolutely no consideration towards performance. It runs incredibly slowly, so any hints on how to make it run in less than a week are highly appreciated! 😉
During one of my current tasks, which was to run the same test on multiple copies of the same database, I needed to set up a flow where I could:
1. deploy the database from VS2010, or set up the database with vsdbcmd
2. run the test (import 5.5 million facts + 250K dimension elements into a DW model)
3. rename the database, including the physical files
4. go to step 1 without overwriting the databases already created
Since both a deploy from VS2010 and a setup run through vsdbcmd produce the same physical file names over and over again, I needed to alter those names. Here's how the script looks:
USE [master]
DECLARE @Database NVARCHAR(255) = '<name of database>'
DECLARE @BackupDevice NVARCHAR(255) = N'<full path to backupmedia>'
DECLARE @CurrentDataFileName NVARCHAR(255) =
N'<name of logical database file>'
DECLARE @TargetDataFileFullPath NVARCHAR(255) =
N'<operating system file name for new database file>'
DECLARE @CurrentLogFileName NVARCHAR(255) =
N'<name of logical log file>'
DECLARE @TargetLogFileFullPath NVARCHAR(255) =
N'<operating system file name for new log file>'
DECLARE @SQL NVARCHAR(MAX) = N'
RESTORE DATABASE ' + @Database + N'
FROM DISK = N''' + @BackupDevice + '''
WITH FILE = 1,
MOVE N''' + @CurrentDataFileName + '''
TO N''' + @TargetDataFileFullPath + ''',
MOVE N''' + @CurrentLogFileName + '''
TO N''' + @TargetLogFileFullPath + ''', STATS = 10
'
--PRINT @SQL
EXEC (@SQL)
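With hypothetical values filled in (the database name, backup path and file names below are made up for illustration), the dynamic SQL that gets executed, and that the commented-out PRINT would show, looks like this:
RESTORE DATABASE TestDW_Copy2
FROM DISK = N'D:\Backup\TestDW.bak'
WITH FILE = 1,
MOVE N'TestDW_Data' TO N'D:\Data\TestDW_Copy2.mdf',
MOVE N'TestDW_Log' TO N'D:\Data\TestDW_Copy2_log.ldf', STATS = 10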