Part II: Some Words You Need to Know
In Part I of this series; Eliminating Uncertainty, I discussed the need to clearly define what you want to capture in your High Value Low Cost data warehouse.
In this segment I'm going to provide some basic definitions of the words used to describe the data the structures that you will create.
I advocate an incremental "evolutionary" implementation starting with spreadsheets versus "revolutionary" methods that require a large upfront expense and lots for solution evangelizing.
Each tab in Excel 2007 is capable of holding 1,048,576 rows and 16,384 columns of data. The amounts to over 17 billion data points in a single spreadsheet tab. I dare you to tell me Excel is not capable of storing huge amounts of data.
Is Excel the best way to store that much data reliably? No, but for our purposes it provides plenty of space to start a High Value Low Cost BI project and prove, with actual visualizations of the data, the value of this approach to the management of your company.
Let discuss some of the key words you need to understand as we go forward. These words have to do with the architecture of the data warehouse and the structure of the data itself.
Granularity or Gain
Refers to the level of detail or summarization of the units of data in the data warehouse. The more detail there is, the lower the level of granularity. The less detail there is, the higher the level of granularity." So when I say we want to capture the information lowest "grain" possible what I mean is that we want to get the most detailed information available.
For example; If your referring to the granularity of the data in a sales fact table your grain may be expressed at the invoice level or at the invoice line-item level of detail. The invoice line-item level of detail would be "more granular" or would provide a more detailed "grain" than the invoice level.
Fact Table and Facts
A fact table in a data warehouse is were the performance measurements are stored. Typically these are numeric and additive. So, facts are data elements that you can apply math toward.
For example; a sales fact able would include facts like the quantity of units sold and the price charged for each unit.
Dimension Table and Dimensions
A dimension is generally textual descriptive data that is used to filter or sort information. The dimension table is were these descriptive attributes are stored.
For example; a customer dimension table would include dimensions like the customer name, the customer type or the customer sales channel. All important ways that you might want to view sort or filter your data.
Key Records
The key record in a fact table or dimension table is a data item or combination of data items used to identify or locate a record contained in the database. A primary key is a unique attribute used to identify a single record in a database.
Key records are also used to join records in contained in different database tables. These joins can take several forms.
In Part III of this series; The Evolutionary Path to a High Value Low Cost BI Solution, I'll describe the migration path you can follow to achieve a flexible and visual information reporting system using Tableau Software and Excel spreadsheets.