Author: Mesa Jajan
What is a Business Vault?
Several posts about the Data Vault have been published here before. When looking more closely at the Data Vault and reading online discussions, however, the term ‘Business Vault’ keeps catching the attention. It took a long search to arrive at a clear definition of the Business Vault, and to become convinced of the business value it has to offer. This article gives an overview of the available information about the definition of the Business Vault and about the related terms ‘Raw Vault’ and ‘Staging Out’.
Definition of the Business Vault
It is difficult to arrive at a clear definition of the term ‘Business Vault’ because it does not seem to be applied universally. With some logical reasoning, though, one can form an impression. Judging by the name, the Business Vault is a layer that can be used directly by the business and that contains business rules. These business rules appear to be executed here to align the data with the business keys and to perform common transformations required by the enterprise. The Business Vault seems to be situated between the Data Vault and the data marts.
Let’s sum up the terms so far: Data Vault, Business Vault, Raw Vault and Staging Out, and in addition Operational Data Vault and Data Marts. Even for a senior data warehouse developer, all of these might be too much. And in what sequence do these layers come? Let’s not lose sight of the main goal: source data that needs to be transformed into business value.
The term Raw Vault is somehow attached to the term Business Vault. To make matters even less clear, this term also seems to carry different meanings. What one can conclude is that these are all layers that are supposed to integrate the Data Vault method efficiently into the data warehouse environment.
Let’s try to give a practical overview of these layers.
The staging layer is not part of the Data Vault methodology but rather a layer that is 100% generated from the source. It is mentioned here because it is a step that must be carried out before the Data Vault method can be applied. In this layer there is no integration and no business rules are applied. The format of the source data does not change either. This process is simply a copy of several data sources into the data warehouse.
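To make the idea of a pure copy concrete, here is a minimal sketch in Python. The function name load_staging and the sample sources crm and billing are made up for illustration; they do not come from any Data Vault tool. The point is only that values arrive in staging exactly as the source delivered them, untrimmed and untyped.

```python
from copy import deepcopy


def load_staging(sources):
    """Copy every source table as-is: no integration, no business rules.

    `sources` maps a (hypothetical) source name to a list of row dicts.
    The result has the same shape, so the source format is unchanged.
    """
    return {name: [deepcopy(row) for row in rows] for name, rows in sources.items()}


# Hypothetical sample data: note the untrimmed name and the amount kept as text.
crm = [{"cust_no": " c-001 ", "name": " alice smith "}]
billing = [{"customer": "C-001", "amount": "12.50"}]

staging = load_staging({"crm": crm, "billing": billing})
```

Nothing is cleaned or joined here: the two sources still disagree on how the customer number is written, and that is exactly what staging is supposed to preserve.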
Business Vault/Raw Vault/Operational Data Vault
Common sense says this is where the actual Data Vault should be situated. Why differentiate between all these Vaults? Which hiccup or business issue is being addressed here? And how raw can a Raw Vault be? When defining Hubs, Links and Satellites, one aligns business keys. And isn’t aligning business keys also a business rule? Let’s forget all these terms and layers at this stage and just call it ‘Data Vault’.
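The question about business keys can be illustrated with a small sketch. Everything in it is an assumption made for the example: the alignment rule (trim and upper-case), the use of SHA-256 for hash keys, and the field names of the Hub, Link and Satellite classes. It is not a reference implementation of the Data Vault method.

```python
import hashlib
from dataclasses import dataclass


def align_key(raw_key: str) -> str:
    # Assumed alignment rule: trim whitespace and upper-case the key.
    return raw_key.strip().upper()


def make_hash(*parts: str) -> str:
    # Assumed hashing scheme: SHA-256 over the pipe-joined key parts.
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class Hub:
    hash_key: str
    business_key: str
    record_source: str


@dataclass(frozen=True)
class Link:
    hash_key: str
    hub_hash_keys: tuple  # hash keys of the Hubs this Link relates


@dataclass(frozen=True)
class Satellite:
    parent_hash_key: str  # the Hub or Link being described
    attributes: dict      # descriptive attributes, stored as delivered


def make_hub(raw_key: str, record_source: str) -> Hub:
    business_key = align_key(raw_key)
    return Hub(make_hash(business_key), business_key, record_source)


def make_link(*hubs: Hub) -> Link:
    keys = tuple(hub.hash_key for hub in hubs)
    return Link(make_hash(*keys), keys)


# The same customer, written differently by two hypothetical sources.
customer_from_crm = make_hub(" c-001 ", "crm")
customer_from_billing = make_hub("C-001", "billing")
```

Both calls produce the same hash key, so the two sources integrate on one Hub. That only works because align_key was applied during loading, which is the point made above: even the so-called raw layer already contains a rule about business keys.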
So what comes after the Data Vault? Should one place a Business Vault here? To me, this doesn’t sound logical at all. This is the point where one models the data marts to make the data available to the business through transformations. Dan Linstedt proposes a solution in which PowerPivot is used as the actual Business Vault, claiming that PowerPivot is capable of producing and acting as the Business Vault by letting business users run their business rules against the raw integrated data warehouse. He doesn’t mention the data mart. This article does not intend to discuss the possibilities of PowerPivot, so this solution is only mentioned and not discussed.
As stated above, building a data mart seems the most logical step after the Data Vault. But the Data Vault community offers another layer before modelling the data marts. This layer is called ‘Staging Out’. Information about this layer comes from articles by Hans Hultgren and Ronald Damhof.
Staging Out/Enterprise Data Warehouse
This layer modularizes the downstream business rules. Here the data is cleansed and enriched. When business rules are generic, one needs to deploy them only once and can execute them many times. The main reasons for this are maintenance, reusability and auditability.
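The idea of deploying a rule once and executing it many times can be sketched as follows. The two rules, the revenue threshold of 1000 and the shape of the audit trail are all invented for this example; real cleansing and enrichment rules would come from the business.

```python
def cleanse_name(row):
    # Hypothetical cleansing rule: trim the name and normalise its casing.
    return {**row, "name": row["name"].strip().title()}


def enrich_segment(row):
    # Hypothetical enrichment rule: the threshold of 1000 is an assumption.
    segment = "large" if row["revenue"] >= 1000 else "small"
    return {**row, "segment": segment}


# Deployed once, in one place, in a fixed order.
RULES = [cleanse_name, enrich_segment]


def staging_out(rows, rules=RULES):
    """Apply every rule to every row and record which rule touched which row."""
    result, audit = [], []
    for index, row in enumerate(rows):
        for rule in rules:
            row = rule(row)
            audit.append((index, rule.__name__))
        result.append(row)
    return result, audit
```

Every data mart that reads from this layer gets the same cleansed names and the same segments, a rule has to be changed in only one place, and the audit list shows which rule was applied to which row.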
Finally, a data mart is modelled as a presentation layer. The data mart is a subset of the data warehouse that is used by the business. Assuming that the majority of the transformations have already been executed in ‘Staging Out’, one only needs to model the data into a star or snowflake schema.
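If the transformations are indeed done upstream, modelling the star schema is mostly a matter of splitting rows into dimensions and facts. The sketch below assumes already cleansed input rows with the hypothetical fields customer_key, segment and amount, and it hands out simple sequential surrogate keys.

```python
def build_star(rows):
    """Split cleansed rows into one customer dimension and one sales fact table."""
    dim_customer = {}  # business key -> dimension row
    fact_sales = []
    for row in rows:
        business_key = row["customer_key"]
        if business_key not in dim_customer:
            dim_customer[business_key] = {
                "customer_sk": len(dim_customer) + 1,  # sequential surrogate key
                "customer_key": business_key,
                "segment": row["segment"],
            }
        fact_sales.append(
            {
                "customer_sk": dim_customer[business_key]["customer_sk"],
                "amount": row["amount"],
            }
        )
    return list(dim_customer.values()), fact_sales
```

No business logic is left in this step: the function only reshapes the data, which is what one would expect from a presentation layer.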
As this article shows, the terminology related to the Data Vault method is not applied universally. I have not found any reason to add extra layers to the Data Vault method. These layers are extra layovers that do not add business value. The classical sequence is Staging, then Data Vault, then data mart, and I have not been convinced by any argument to change this view. But all reactions and arguments are welcome, as they help to create a better understanding.