Inmon and Kimball have sparked a great debate in Information Technology over the last decade. Both strove relentlessly to conceptualize information management for decision support, but they approached the problem with different philosophies, design techniques, and implementation strategies.
This article is an analysis of these two approaches based on the issues raised and discovered.
INTRODUCTION TO WILLIAM INMON AND RALPH KIMBALL
Mr. William (Bill) Inmon is known as the “Father of Data Warehousing” and is credited with coining the term “Data Warehouse” in 1991. He defined a model to support a “single version of the truth” and championed the concept for more than a decade. He also created the “Corporate Information Factory” in collaboration with Ms. Claudia Imhoff. Mr. Inmon is known to have published 40+ books and 600+ articles.
Mr. Ralph Kimball is known as the “Father of Business Intelligence” for defining the concept behind “Data Marts”, for developing the science behind the analytical tools that use dimensional hierarchies, and for conceptualizing star and snowflake schemas. He defined a model to support analysis and championed data marts for more than a decade. Though Kimball’s writings do not exceed Inmon’s in quantity, Kimball’s books are all-time best sellers on data warehousing.
PHILOSOPHIES: QUEST FOR A COMMON GOAL
Inmon and Kimball are two pioneers who started different philosophies for enterprise-wide information gathering, information management, and analytics for decision support. Inmon believes in creating a single enterprise-wide data warehouse to achieve an overall business intelligence system. Kimball believes in creating several smaller data marts to achieve department-level analysis and reporting.
Inmon’s philosophy recommends starting with a large, centralized, enterprise-wide data warehouse, followed by several satellite databases that serve the analytical needs of departments (later known as “data marts”). Hence, his approach has received the “Top Down” title.
Kimball’s philosophy recommends starting with several data marts that serve the analytical needs of departments, followed by “virtually” integrating these data marts for consistency through an Information Bus. Hence, his approach has received the “Bottom Up” title. Mr. Kimball believes in data marts that store information in dimensional models so that they can quickly address the needs of various departments and various areas of the enterprise data.
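The idea of integrating data marts through an Information Bus can be illustrated with a small sketch. It assumes the common reading in which separate marts share a conformed dimension, so that their results can be combined consistently. The table and column names (`dim_product`, `fact_sales`, `fact_inventory`) and the `drill_across` helper are hypothetical and chosen only for illustration.

```python
import sqlite3

# Hypothetical example: two departmental data marts (sales and inventory)
# that share one conformed product dimension. All names are assumptions
# made for illustration; none of them come from the article.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE fact_sales (product_key INTEGER, units_sold INTEGER);
CREATE TABLE fact_inventory (product_key INTEGER, units_on_hand INTEGER);
INSERT INTO dim_product VALUES (1, 'Widget'), (2, 'Gadget');
INSERT INTO fact_sales VALUES (1, 10), (1, 5), (2, 7);
INSERT INTO fact_inventory VALUES (1, 40), (2, 3);
""")


def drill_across(conn):
    """Aggregate each mart separately, then join the results on the shared dimension."""
    query = """
    SELECT p.product_name, s.total_sold, i.total_on_hand
    FROM dim_product AS p
    JOIN (SELECT product_key, SUM(units_sold) AS total_sold
          FROM fact_sales GROUP BY product_key) AS s
      ON s.product_key = p.product_key
    JOIN (SELECT product_key, SUM(units_on_hand) AS total_on_hand
          FROM fact_inventory GROUP BY product_key) AS i
      ON i.product_key = p.product_key
    ORDER BY p.product_name
    """
    return conn.execute(query).fetchall()


report = drill_across(conn)
print(report)  # [('Gadget', 7, 3), ('Widget', 15, 40)]
```

Because both fact tables use the same `product_key`, the two departmental views line up without any reconciliation step. If each mart kept its own product list, this join would not be reliable.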
Beyond the difference in approach, Inmon and Kimball also differ on the structure of the data. Inmon believes in a relational model (third normal form, 3NF), whereas Kimball believes in a multi-dimensional model (star and snowflake schemas).
Inmon argues that once the data is in a relational model, it attains enterprise-wide consistency, which makes it easier to spin off data marts in dimensional models. Kimball argues that end users can understand, analyze, and aggregate the data, and explore its inconsistencies, more easily when it is structured in a dimensional model. Additionally, to enable the Information Bus, data marts are categorized [Imhoff, Mastering Data Warehouse Design] as atomic data marts and aggregated data marts, both of which use dimensional models.
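The structural contrast can be sketched as follows. This is a minimal illustration, not either author's prescribed design: it assumes a tiny normalized schema (`region`, `customer`, `orders`) and derives a star-style mart (`dim_customer`, `fact_orders`) from it, in the spirit of Inmon's argument that dimensional marts can be derived from a consistent relational store. All table names, column names, and data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Normalized (3NF-style) tables: each attribute is stored exactly once.
CREATE TABLE region (region_id INTEGER PRIMARY KEY, region_name TEXT);
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, customer_name TEXT,
                       region_id INTEGER REFERENCES region(region_id));
CREATE TABLE orders (order_id INTEGER PRIMARY KEY,
                     customer_id INTEGER REFERENCES customer(customer_id),
                     order_date TEXT, amount INTEGER);
INSERT INTO region VALUES (1, 'North'), (2, 'South');
INSERT INTO customer VALUES (1, 'Acme', 1), (2, 'Beta', 2), (3, 'Core', 1);
INSERT INTO orders VALUES (1, 1, '2005-01-10', 100), (2, 2, '2005-01-11', 50),
                          (3, 3, '2005-02-01', 25), (4, 1, '2005-02-03', 10);
""")


def build_sales_mart(conn):
    """Derive a star-style mart: one denormalized dimension plus one fact table."""
    conn.executescript("""
    CREATE TABLE dim_customer AS
        SELECT c.customer_id AS customer_key, c.customer_name, r.region_name
        FROM customer AS c JOIN region AS r ON r.region_id = c.region_id;
    CREATE TABLE fact_orders AS
        SELECT order_id, customer_id AS customer_key, order_date, amount
        FROM orders;
    """)


build_sales_mart(conn)

# In the star schema, an analytical question needs a single join.
region_totals = conn.execute("""
    SELECT d.region_name, SUM(f.amount)
    FROM fact_orders AS f
    JOIN dim_customer AS d ON d.customer_key = f.customer_key
    GROUP BY d.region_name
    ORDER BY d.region_name
""").fetchall()
print(region_totals)  # [('North', 135), ('South', 50)]
```

The normalized tables avoid repeating the region name for every customer, while the derived dimension repeats it on purpose so that analysts can group by region without navigating the full relational model.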
Irrespective of the structural differences in the model, both Inmon and Kimball agree that there is a need to separate detailed-level data from aggregated-level data.
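A minimal sketch of that separation, assuming a hypothetical `sales_detail` table kept at transaction grain and a `sales_monthly` summary table derived from it. The names, the monthly grain, and the data are illustrative assumptions rather than a design from either author.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Detailed level: one row per individual sale (hypothetical data).
CREATE TABLE sales_detail (sale_date TEXT, product TEXT, amount INTEGER);
INSERT INTO sales_detail VALUES
    ('2005-01-03', 'Widget', 20),
    ('2005-01-17', 'Widget', 30),
    ('2005-01-20', 'Gadget', 5),
    ('2005-02-02', 'Widget', 10);

-- Aggregated level: one row per month and product, stored separately.
CREATE TABLE sales_monthly AS
    SELECT substr(sale_date, 1, 7) AS sale_month,
           product,
           SUM(amount) AS total_amount,
           COUNT(*) AS sale_count
    FROM sales_detail
    GROUP BY substr(sale_date, 1, 7), product;
""")

monthly = conn.execute("""
    SELECT sale_month, product, total_amount, sale_count
    FROM sales_monthly
    ORDER BY sale_month, product
""").fetchall()

# Consistency check: the summary must add up to the detail it came from.
detail_total = conn.execute("SELECT SUM(amount) FROM sales_detail").fetchone()[0]
summary_total = conn.execute("SELECT SUM(total_amount) FROM sales_monthly").fetchone()[0]
print(monthly)
print(detail_total == summary_total)
```

Keeping the detail table intact means a different summary, for example by week or by region, can be built later without returning to the source systems.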
Another difference is in the granularity of the content. Inmon believes that the content of the data warehouse has to be at the most granular level possible and must include all available historical data within the enterprise. His argument is that end users will demand levels of detail that cannot be known at the time the data warehouse is built.
Though Mr. Inmon and Mr. Kimball take different philosophical approaches, they tend to agree with each other indirectly. Although Inmon’s approach is based on a single data warehouse, he stresses iterative development and discourages the “big bang” approach. Kimball’s philosophy, on the other hand, is to quickly create a few successful data marts at a time, yet he stresses integration for consistency via an Information Bus.
DATA WAREHOUSE vs. BUSINESS INTELLIGENCE
Business Intelligence = Inmon’s Corporate Data Warehouse + Kimball’s Data Marts + Data Mining + Unstructured Data.
ON THE STREET, IN PRACTICE
Over the years, almost every Fortune 500 company has implemented flavors of both Inmon’s and Kimball’s philosophies in pursuit of a “single version of truth” delivered through easily navigable analytics. In the early 90s, several conferences promoted, numerous magazines recommended, and almost all large corporations attempted to build Inmon’s centralized data warehouses. These were huge undertakings that took several years of hard work and complex design.
Because Kimball’s philosophy is easier to implement and yields quicker returns, several mid-size companies initially implemented data marts rather than an enterprise-wide data warehouse. There are also reviews claiming that every project that tried Inmon’s approach failed and that none was reported to have succeeded.
However, database designers using Kimball’s approach have often ended up with “siloed” data marts, both because they underestimated the complexity of the relationships between data marts and because departments pressed to develop their own data marts independently and simultaneously. As a result, the essence of Inmon’s centralized data warehouse, which integrates departmental data for consistency, has been reconsidered and pursued.
INMON, KIMBALL, … AND THE OTHERS
An analysis of Inmon vs. Kimball is not complete without referring to the first published work on data warehousing, written in 1988 by Barry Devlin and Paul Murphy of IBM Ireland. They coined the less widely known term “Information Warehousing”, which they defined as “A structured environment supporting end users in managing the complete business and supporting Information Systems in ensuring data quality”. Mr. Devlin eventually published his work as a book in 1997, titled “Data Warehouse: From Architecture to Implementation”.
THE LESS DISCUSSED DIMENSION: PROCESS
The one dimension that has been largely ignored in practice, even though several experts have stressed it, is “process”. For some reason, corporate processes have been treated as if they had no influence on the “analytics”.
STILL AN UNCONQUERED SCIENCE
Despite the great efforts of Inmon, Kimball, and the others, the world of Data Warehousing still faces great challenges. Even in 2005, 14 years after Inmon introduced the concept, more than 50% of today’s data warehouse projects are anticipated to fail [Robert Jaques]. In fact, Ted Friedman, a Principal Analyst at Gartner, wrote in 2005, “Many enterprises fail to recognize that they have an issue with data quality. They focus only on identifying, extracting, and loading data to the warehouse, but do not take the time to assess the quality.”
Today’s data warehouses suffer from poor data quality. Whether data quality was equally poor a decade ago is an open question. In the 90s, a new breed of software products and the ease of implementing data-moving techniques opened several avenues for data duplication. As a result, data inconsistencies in source systems have been remedied by scrubbing and cleansing “local copies” of the data sets rather than by correcting them at the source.
If Inmon or Kimball had foreseen the proliferation of software products in the 90s that duplicated functionality within an organization, they might have placed more stress on architecting for better quality.
Inmon and Kimball have seen the world of accessing enterprise-wide data through different sets of eyes. They both agree that easy, accurate, and timely access to enterprise data is the key success factor in creating an integrated solution for corporate information. They also both agree that independent silos (often misrepresented as Kimball’s data marts) can solve only a set of specialized needs, are difficult to support and maintain in the long run, and often require reconciliation and duplicated effort to migrate into an enterprise-wide solution.
Inmon has definitely foreseen the hurdles and issues of data management through integration, and he presented them in a very academic manner that cannot be ignored. Several failures in the market can be attributed to ignoring what Inmon warned about. Kimball, however, has brought forward a practical approach that corporations love to execute, one with a “project” mindset and a definite budget and timeline. Inmon’s writings usually tend to generalize a concept with little attention to the technical details. Kimball’s writings try to establish a definitive science that includes implementation techniques with abundant examples.
However, data accuracy is still the monster that needs to be conquered by several organizations.
In closing, Inmon and Kimball have created approaches that complement each other. They appear to contradict each other only if one insists on “picking” a single approach.
1. Using the Data Warehouse, W. H. Inmon, Richard D. Hackathorn, Wiley, July, 1994.
2. Managing the Data Warehouse, W.H. Inmon, John Wiley & Sons, December, 1996.
3. The Data Warehouse Toolkit: Practical Techniques for Building Dimensional Data Warehouses, Ralph Kimball, John Wiley & Sons, February, 1996.
4. The Data Warehouse Challenge: Taming Data Chaos, Michael H. Brackett, John Wiley & Sons, July, 1996.
5. Data Warehouse Project Management, Sid Adelman, Larissa T. Moss, Addison-Wesley Professional, December, 2000.
6. Building the Data Warehouse (3rd Edition), W.H. Inmon, Wiley, March, 2002.
7. Mastering Data Warehouse Design: Relational and Dimensional Techniques, Claudia Imhoff, Nicholas Galemmo, Jonathan G. Geiger, Wiley, 2003.