Saturday, 11 March 2017

Types of Hierarchy in a Data Warehouse

In Data Modeling, there are various types of hierarchy (such as a balanced hierarchy), and then there are various ways to implement a particular type of hierarchy in a database (such as a parent-child hierarchy). This post aims to explain both.

Types of Hierarchy

Balanced Hierarchy

In a balanced Hierarchy all the branches descend to the same level.

Example 1: Year > Quarter > Month > Day
Example 2: Division > Branch > Sales person 

Here are two representations of a balanced hierarchy (both are balanced): 

Un-Balanced Hierarchy

In an un-balanced hierarchy all the branches do not descend to the same level.

Example: A director with two employees who report to him, one of whom is a manager with multiple employees reporting to him, whereas the other has no employees reporting to him.



Ragged Hierarchy

In a ragged hierarchy, branches extend to different levels.

Example: In a company where everybody's pay is set according to what pay band they fall into, a CEO might be in pay band A, and he has Directors reporting to him who are in pay band B, and he also has a PA reporting to him in pay band C.


Multi-Parent Hierarchy

Many may not consider the following to be an example of a real hiearchy, but since we might refer to it as a hierarchy in the real world, I thought I would include it. In a multi-parent hierarchy, a lower level has multiple parents.

Example: Division > Branch > Sales Person, where the same sales person works at two branches.


Hierarchy Implementations

That's all very well and good, but how do we implement these in the database?

Parent-Child Hierarchy Structure

Here is an example of how you might implement any of the types of hierarchy listed above, except a multi-parent hierarchy. This is called a Parent-Child hierarchy, and I have chosen to implement the ragged hierarchy above:

Parent Child
A A
A B1
A B2
A1 C1
B1 C1
B1 C2

To implement a multi-parent hierarchy, you would need to have a multiple rows for the same member with multiple parents.

Natural Hierarchy Structure

A Natural Hierarchy structure cannot represent a multi-parent or a ragged hierarchy. Obviously, you would usually not start with Level 1, since they are all the same. So if Level 1 was the company, and Level 2 was the regional divisions withing the company, and Level 3 was the branch within the regional division, you would just start with Level 2, the regional divisions.

Level 1 Level 2 Level 3
A B1 C1
A B1 C2
A B2 C3
A B2 C4

Many-to-Many Hierarchy Structure

To represent a multi-parent hierarchy, you need a many-to-many structure, and in cases where a node has multiple parents, you need three three tables, one for the parents, one for the child, and one connecting both together with foreign keys from both (this is called a Factless Fact or a Bridge table, in Kimball methodology).

In SSAS, you import a bridge table into SSAS as a Fact table, and use a many-to-many relationship to link together the two dimensions, with a Measure Group of Count, which is hidden.

No comments:

Post a Comment