The mathematical properties of junk dimensions are fascinating to me. Let me explain.
Junk dimensions are usually a collection of flags and other values with low cardinality. The records in a junk dimension represent unique combinations of all the columns. Since the number of possible values in each column is finite, using basic combinatorics, you can calculate how many records are going to be in that table if you know all the values a priori.
If we let x represent the number of possible values in a column, then total number of records in a table is defined by:
Total number of records = x1 * x2 * x3*…xn
Suppose we had a table with the following profile.
Column Name | Number Of Possible Values |
---|---|
Color | 5 |
Size | 3 |
isActiveProduct | 2 |
This means that the total number of records in this table will be:
5 * 3 * 2 = 30
This is not just an academic exercise. You can use this information to determine if there will be too many records in this table and maybe consider splitting up some of the columns into smaller logical groupings.
But the real fun of all this is loading the data completely before the first load even hits! Yeah, yeah. We could just let the table populate naturally but what fun is that?!
In our example above, we already know what all the values are. Additionally, there are not that many. We can take advantage of CROSS JOIN to put the data together and load the table!
Watch While I Science The **** Out Of This
USE demo
--create necessary tables
DECLARE @ProductInformation AS TABLE (Color NVARCHAR(20), Size NVARCHAR(1), isActiveProduct BIT)
DECLARE @Color AS TABLE(Color NVARCHAR(20))
DECLARE @Size AS TABLE(Size NVARCHAR(1))
DECLARE @isActiveProduct AS TABLE(isActiveProduct BIT)
--populate component tables
INSERT INTO @Color(Color)
SELECT 'Red'
UNION
SELECT 'Blue'
UNION
SELECT 'Brown'
UNION
SELECT 'Green'
UNION
SELECT 'Yellow'
INSERT INTO @Size(Size)
SELECT 'S'
UNION
SELECT 'M'
UNION
SELECT 'L'
INSERT INTO @isActiveProduct(isActiveProduct)
SELECT '1'
UNION
SELECT '0'
--Prepopulate table
INSERT INTO @ProductInformation(Color, Size, isActiveProduct)
SELECT Color, Size, isActiveProduct
FROM @Color
CROSS JOIN @Size
CROSS JOIN @isActiveProduct
--gloat over my math superiority
SELECT COUNT(*) AS TotalRecords
FROM @ProductInformation
--see the results
SELECT *
FROM @ProductInformation
Copyright © 2020, Mass Street Analytics, LLC. All Rights Reserved.