Saturday, 24 December 2016

A Quick Comment On Data Vault

If you don't know what Data Vault is, this blog post is not for you.

If you are considering using it in your business, here is my reaction:

Data Vault 2.0, as advanced in the major work Building a Scalable Data Warehouse with Data Vault 2.0 by Dan Linstedt (with a forward by none other than Bill Inmon) is a major investment for any business. If you thought the Inmon method was over-engineered, you ain't seen nothing yet.

Data Vault basically replaces your ODS layer providing an even more complicated data storage solution, which attempts to keep a record of literally everything (all data, regardless of how clean it is) that ever gets sent to your database. Data proceeds from the Data Vault through your ETL solution to your presentation layer, where your Star and Snowflake schemas are based.

For the vast, vast majority of businesses, Data Vault is the wrong route to go down. It will seriously complicate your ETL. It will be hard to hard to find developers who are familiar with the methodology. It will slow down the development of your data solutions.

For a select few businesses, who have a need to store almost everything that ever gets sent to their database, Data Vault is a breakthrough methodology that will help you out in a big way.


My SSRS Wish List

I have often felt that SSRS recieves the least love from Microsoft. Even Multidimensional Cubes received enough attention while being developed to allow them to become a mature product. I do not feel the same way about SSRS.

Yes, with SQL Server 2016 they have Mobile Reports now, but frankly, there are quite a few limitations of Mobile Reports, and I would say Mobile Reports have received even less love than Paginated reports.

Here are the features I would most like to see in SSRS:

  • Far more integration with Excel. I would like to be able to turn features on and off in excel when exporting from SSRS to Excel, and even publish data from Excel back to the SQL Server data source through the SSRS defined dataset.
  • Dynamic slicing and drill downs with linked charts! SSRS Mobile Reports can do it. Power BI can do it. Why, O, why, don't we have this feature in Microsoft's enterprise level reporting tool?
  • The ability to set the order in which parameters and data sets are loaded. I want to be able to load datasets before parameters, parameters before other datasets. Parameters and datasets are two of the most crucial concepts in any reporting tool. I would like to determine how they are ordered.

The relationship between Power BI and SQL Server

Power BI is constantly being updated. Therefore, if this post is out of date very soon, please forgive me.


Power BI and SQL Server are the two major data products sold by Microsoft. They share many of the same features and functionality, but work in different ways. This raises some questions:
  • How do Power BI and SQL Server relate to each other? 
  • How do Power BI and SQL Server compare against one another?
  • When should Power BI and SQL Server be used?
In this blog post I attempt to answer these questions, but be warned, my answers will not be comprehensive. There is so much more that could be said, but here are a few of the main things I have noticed:

How Power BI and SQL Server relate to each other

Power BI and SQL Server can both do a lot of the same things. If your organisation wants to take data from various source systems, transform it and use it in dashboards, both Power BI and SQL Server can do that. In fact, SQL Server Reporting Services will soon be able to integrate with Power BI (the rollout is planned late 2017; this post is being made late 2016).

SQL Server can do some things Power BI cannot, and Power BI can do some things SQL Server cannot. For example, SQL Server Integration Services can output data in formats other than dashboards. It can output data as XML files, CSV files, Excel files. It can also extract data from sources that Power BI can't, such as looping through a folder and extracting file names. But more on that in the next section.

Power BI is designed for power-users (non-technical users with a considerable knowledge of the product). It is good for use within a single department, as part of a single project. Any time that only a very limited number of users will need to access reporting, Power BI is ideal.

SQL Server, on the other hand, is designed as an enterprise level data tool. We're talking large-scale roll out to the entire business, huge amounts of data, data that has to conform to set business standards and government regulations.

Power BI grew out of Microsoft Excel. The Power BI team is separate now, but it still integrates with Excel well and brings many of the usability features of Office to the sphere of data manipulation and presentation.

SQL Server is still Microsoft's main data offering for business. It is a more mature product than Power BI, offering many more features and use cases than its younger brother.

How Power BI and SQL Server compare against one another

Reporting

Generally speaking, Power BI offers a better chart experience than SQL Server Reporting Services, with dynamic slicing, dynamic drill-downs and the ability to link charts together for slicing and drill down. Power BI also offers more charting 3rd party options to download. There is one feature of Power BI that I personally do not think is as helpful as some in the BI industry seem to think it is: that is, search. Power BI lets users construct their own reports in the search bar, but to do this they have to be pretty familiar with how to use it, and I'm not confident they are. 

SSRS, however, allows for better presentation of tables and matrices. It also allows for more customization in the way the report appears.

Data Integration

ETL in Power BI is a dream. It is so easy, one wonders why Microsoft didn't do ETL this way in SSIS. However, Power BI does not have all the features of SSIS. Pretty much everything that can be done in Power BI can be done in SSIS, but it cannot do many of the things SSIS can do, such as For Each Loops.

Moreover, SSIS is an enterprise level solution that can feed data into any data destination, whereas the Power BI ETL tool only ever works with Power BI reports, which rules it out for many uses, such as loading a SQL Server database. 

Power BI does not offer any scripting module, so one cannot do advanced things with it like scanning a folder structure for data about files.

Data Storage

Power BI is an in-memory database, whereas SQL Server offers a number of disk based alternatives, including the SQL Server Database Engine and SSAS Multidimentional Cubes. This is not so bad, but the biggest drawback to Power BI is that other data tools, such as SQL Server, cannot connect to it to extract the data! This is probably the worst feature of using Power BI, and the main reason to stick with SQL Server to create your data-sets. You can then import them into Power BI to create dashboards.

SQL Server boasts far more security features and customization than Power BI, although Power BI does have some, such as row based security. 

When should Power BI and SQL Server be used?

Because of the limitations of each product, Power BI and SQL Server should be used together. It is important that Microsoft BI professionals get to grips with both products. 

For most enterprise level needs, SQL Server will still be the first port of call for the BI professional. This is especially true with regards to data storage and data integration. However, this should be supplemented with the advanced reporting and dashboarding features in Power BI such as dynamic slicing in charts.

Thursday, 1 December 2016

What NOT to Use A Multi-Dimensional Cube For

The whole point of a Multidimensional Cube is make it easier for business users to slice aggregated numbers about their business. This means two things:
  1. If the data your business users request contains no numbers, a Cube is not the best option for your solution.
  2. If your business users do not want aggregations, a Cube is not the best option for your solution.
I feel this needed stating, as I have come across the sentiment that all reporting should come from a Cube, and that is simply not so.