Note that there are some explanatory texts on larger screens.

plurals
  1. POGeneral understanding of star schema design
    text
    copied!<p>So, I think I understood what to put in dimensions, what in the fact table, and how to achieve this. Now I got the problem, that I have this dimension 'product' and a dimension 'productProperties'. I had to split this, cause otherwise my natural key in 'product' wouldn`t have been unique anymore. I asked this in <a href="https://stackoverflow.com/questions/3710182/star-schema-design-one-column-dimensions">this question</a>.</p> <p>So my 'productProperties' dimension table should have looked like this: Color | Material | Size</p> <p>1.)To achieve this, I had to create every possible permutation of the values of 'color', 'material', 'size' and so on, right?</p> <p>This would have been far over 200 million rows, so I decided to split this up. I have a dimension 'Color' now, which consists actually of the columns 'color', 'colorFront', 'colorBack'. </p> <p>2.) That`s fine I guess, but what about the dimension 'size', which consists only of the columns 'surrogate_key' and 'value'?</p> <p>I`ve read about 'degenerate dimensions' (in the reading recommendation given in my other question), which means, to make 'single column dimensions' one column in the fact table. That seems a bit impractical to me, since I would end up with about 5-6 extra columns in my fact table. </p> <p>What if I should do this? </p> <p>3.) Are those degenerate dimensions part of the primary key in the fact table?</p> <p>Most important question: I will have entries in my fact table with products, that don`t match every column in my dimensions or not all dimensions at all. Meaning, I may have an entry / product, which has the property 'color', but not 'colorFront' or 'colorBack'. Since I created every permutation of 'color', 'colorFront' and 'colorBack', when trying to populate my fact table I will get multiple surrogate keys, if the product only has the property 'color' which results in kind of duplicates in my fact table, right?</p> <p>4.) So do I have to filter those duplicates out, when querying my fact table? Or is this wrong at all?</p> <p>I could of course split dimension 'color' in three dimensions. But then I will get entries with NULL values in some columns. Same thing with entries / products that don`t use some dimensions at all.</p> <p>5.) How to handle those NULL values?</p> <p>Thanks in advance for any help.</p>
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload