Saturday, 19 November 2016

What does Deterministic mean in SQL?

In reading SQL Server books, you may come across the term, "deterministic" without it being explained. So what does it mean?

Deterministic means that the SQL query determines two things about the result of the query:

  1. Firstly, it has to determine which rows you get back, but only if you're not selecting all rows. Let's say you had a table (dbo.Table) with three rows, and your query was SELECT * FROM dbo.Table. Your query would in this case be determining your result set, because its determining that it will return all rows from dbo.Table. Even if the data in the table changes (let's say a row was added) your query is still deterministic in that its determining that you'll get back all rows. But let's say we did SELECT TOP 1 * FROM dbo.Table. In this case, your query is not deterministic because its selecting a limited number of rows (in this case, 1 row) but it's not determining which row (out of the three in the table). Of course, if you only had one row in your table, it would be deterministic, because it would be selecting all rows. Again, lets take the query SELECT * FROM dbo.Table WHERE ID = 3. This query is deterministic even though there might be multiple rows with an ID of 3. This is because the query is telling SQL Server to return all rows where the ID is 3. So its determining the result set in that sense. But if you had SELECT TOP 1 * FROM dbo.Table WHERE ID = 3 then your query will only be deterministic if there is a single row with an ID of 3. If there are multiple rows with an ID of 3, then your query will only return one of those rows, but it hasn't determined which one, so it is called 'indeterministic'.
  2. Secondly, it has to determine the order in which those rows are presented. To do this in SQL you add an ORDER BY clause, though that is not always enough.

Making A Query Fully Deterministic

Lets say you have a table with three records in it:

   SELECT [Name], [Age] FROM [dbo].[NameAgeTable]

Name Age
John 45
Peter 23
Mike 87

How do you determine which results you get back?

In your query you have not specified a TOP clause, so your code has already determined which rows you get back, because it returns all of them! To determine the order in which the rows appear, you simply use ORDER BY, like this:

   SELECT [Name], [Age] FROM [dbo].[NameAgeTable] ORDER BY [Age]

Name Age
Peter 23
John 45
Mike 87

The ORDER BY clause sorts out the row order in-determinism, and it also makes sure that if you were to use a TOP clause, your query would determine which rows you got back! This is because TOP works in conjunction with ORDER BY where it does the ordering first, and then selects the top n records. So now you have sorted both row-order in-determinism and rows returned in-determinism, just by using the ORDER BY clause.

But there is a complication: what if the data in your ORDER BY column(s) had multiple instances of the same value in it, like this:

   SELECT [Name], [Age] FROM [dbo].[NameAgeTable] ORDER BY [Age]

NameAge
Peter45
John45
Mike45

If your ORDER BY column(s) has more than one instance of the same value, your query becomes in-deterministic again, because it has failed to determine the order in which the rows are returned.

So, to make your query fully deterministic, you need two things:
  1. An ORDER BY clause
  2. A unique result for each row returned by the ORDER BY list of columns. This can be achieved in various ways:
    1. By adding columns to the ORDER BY column list
    2. Using a SQL Server constraint
    3. Using the WITH TIES option. Don't know what one of those is? Read on...

Making A TOP Query Partially Deterministic without a unique ORDER BY list, using WITH TIES

Helpfully, if you are trying to make your TOP query deterministic, and you've set the ORDER BY column list you want, but the ORDER BY column list is not unique, there is still a way to determine which rows are returned by your query, without having a unique ORDER BY column list. The trick is to use the WITH TIES option, as follows. This gets rid of Rows Returned In-Determinism.

To ensure your query always returns the same rows (provided the data doesn't change) even when the ORDER BY list produces non-unique results, add the WITH TIES option. For example:

 SELECT TOP 3 WITH TIES [CityName] FROM [dbo].[Address] ORDER BY [CityName]

But be careful! The way this works is by returning all the duplicates from the table you're querying that appear in your result set. This means that if you do a TOP 2, you might more than 2 rows returned.

For example, lets say your data looked like this:

   SELECT * FROM [dbo].[LetterNumber]


Letter
A
B
B
C

If you used TOP with WITH TIES you would end up with the following:

   SELECT TOP 2 WITH TIES [Letter] FROM [dbo].[LetterNumber]

Letter
A
B
B

not:

Letter
A
B

Moreover, unless your ORDER BY list is unique, you will still have Row Order In-Determinism, because multiple rows with the same value (or combination of values over multiple columns) in the ORDER BY list will still have no determined order. So your query will only be partially deterministic. But of course, since it doesn't matter which way around your rows appear if they are identical, this doesn't matter much.

Now you know what deterministic means and how to make your queries deterministic.

Saturday, 12 November 2016

Where to type your DAX queries in SSMS and Excel

New to DAX querying and wondering where you are meant to type in your code? This is the post for you.

SSMS prior to 2017

Before one of the update to SSMS in 2017, you would have to type your DAX query into the MDX query window. Odd, huh? After the update, SSMS provided you with a seperate DAX query window. But if you still have an old version of SSMS, here is what you do:

In SSMS, click the Connect button in Object Explorer > right click the database you want to query > click New Query > select MDX > now start typing your DAX code. Yup, you read it right. You type DAX into the MDX query window and click Execute (or press F5)

Be careful with the Cube pane on the left - if you drag columns over from this pane the code will appear in the MDX language, not DAX. To query with DAX you have to type it manually.

Excel 2016

In Excel 2016, select the Data tab on the ribbon > click New Query > click From Database > click From SQL Server Analysis Services Database > type the server name and the database name > select the drop down arrow beneath to allow you to type your DAX query.

To edit your DAX query in Excel 2016, select the Data tab on the ribbon > click Show Queries > hover the mouse over the query (do not click) > click Edit > click the cog symbol next to Source on the right under Applied Steps > edit your DAX query.


Microsoft Database Query Languages


Microsoft provides various different types of database software with various compatible query languages:

Database Compatible Query Language
Microsoft Access SQL
Microsoft SQL ServerTSQL
Microsoft SQL Server Analysis Services Multidimensional Cubes MDX
Microsoft SQL Server Analysis Services Tabular Models DAX / MDX
Microsoft Power BI*
Microsoft Power Query for Excel*
DAX*

* With Power BI and Power Query for Excel one can query the data model stored in these programs using the DAX language. However, to manipulate the data as its being brought into the model, these programs allows you to use a language called Power Query Formula Reference (informally known as "M").