One Giant Model or Split it Up?

I came across this question in social media. It was a question submitted to @GuyInACube who asked his guest, Marco Russo, “Is it OK to build one giant data model, or should you split it up into multiple models?”

He answered quickly, without wavering, “No.”

His reasoning against one giant model provided non-technical examples. For instance, the same entity, or measure, even, might have different names or different meanings by different departments in the same company. They might even need to be defined with different code!

I have experienced this first-hand. I was asked to develop a large data warehouse and publish it as a single Tabular Model. It contained HR data, sales and commissions data, revenue, service, and utilization (rental industry).

Right away, I noticed a difference in the definitions of sales type and revenue type. The rules classifying sales revenue types and actual revenue types (in the general ledger) were different, but they requested that they be presented as one in the Tabular Model. They also wanted departments, when using the model, to see 100% agreement with their existing transactional reports.

Having your analytic data match transactional reports is necessary; I agree. If they don’t match, the users won’t have confidence in the data. I solved this problem the only way that I could solve it.

I created separate measures:

  • Sales Revenue
  • Company Revenue

I also created separate dimensions:

  • Sales Revenue Type
  • Company Revenue Type

I also created perspectives: Sales, Corporate, Fleet, HR. This helped to reduce the confusion, but there were many users who used the entire model, and in my opinion, this caused problems with user-designed reports.

The correct approach, however, is to create one giant SQL data warehouse out of which you publish multiple data marts, each data mart having its own SSAS Tabular Model.