
    <p>I think you are on the right track. All the questions above should be answerable from a single fact table covering the sales.</p> <p>I think one should start out unaggregated and aggregate later if needed. Considering that one sale can contain multiple products and multiple items, I'd organize it as follows: one fact row for each product in the sale (typically the lines on the invoice, so I'd call them "order lines" or "sale lines"), and maybe three counter attributes:</p> <ul> <li><code>NumItems</code> - number of items, e.g. 3 if the customer bought three of the same product.</li> <li><code>NumLines</code> - number of "order lines"; should always be 1. May be useful when aggregating data later (it is a big win to already have <code>sum(NumLines)</code> rather than <code>count(*)</code> in the SQL), or when adding correction items (<code>NumLines = -1</code>).</li> <li><code>NumSales</code> - a fractional number that can be summed up to yield the number of sales (e.g. 0.333 if the sale involves three different products and hence contains three order lines).</li> </ul> <p>Now, it becomes a problem to get the right count for, e.g., "number of sales involving black clothes": summing <code>NumSales</code> over only some of a sale's lines no longer adds up to a whole sale. We had this problem at my previous workplace. I'm sure there must exist some "best practice" for this, but we ended up introducing a <code>SaleID</code> (or <code>TransactionID</code>) in the fact table and doing <code>count(distinct SaleID)</code>. That lacks elegance, but it works.</p> <p>In our setup we had several money attributes; the most important were one for the revenue (what's left of the income after paying the direct costs attributed to the items sold) and one for the turnover (the price paid by the customer for the item). Sales tax or VAT may add more complications. One can make do with only one money attribute and then split the sales up into multiple lines in the fact table, but I would rather recommend multiple money columns in the sales line fact table.
Everything in the fact table was counted in the "base currency" (euros, in our case), and then we had an exchange rate dimension to track the exact amounts.</p> <p>I don't think it makes sense to have a date dimension containing the hour of the day. At my former job I kept my warehouse in Postgres, and I actually managed quite well without a date dimension at all. Although a date dimension is considered "best business practice", I found that for all our purposes we got much better performance by using the standard Postgres date functions instead of dragging in a date dimension. I experimented quite a lot with it, and in the end I found that the best option was to split date and time into two different attributes. (Time zones and daylight saving time gave me quite some extra headaches...)</p>
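<p>To make the counter attributes concrete, here is a minimal sketch using Python's built-in <code>sqlite3</code> module rather than Postgres. The table name <code>sale_lines</code>, the <code>Product</code> and <code>Colour</code> columns and the sample rows are assumptions made up for illustration; in a real warehouse the colour would more likely live in a product dimension. It shows the three counters summing up correctly over the whole table, and why a filtered query needs <code>count(distinct SaleID)</code> instead of <code>sum(NumSales)</code>.</p>

```python
import sqlite3

# In-memory database; the schema below is a hypothetical "sale lines" fact table.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE sale_lines (
        SaleID   INTEGER NOT NULL,  -- identifies the sale (transaction)
        Product  TEXT    NOT NULL,  -- illustrative; normally a dimension key
        Colour   TEXT    NOT NULL,  -- illustrative; normally in a product dimension
        NumItems INTEGER NOT NULL,  -- items bought on this line
        NumLines INTEGER NOT NULL,  -- always 1 (or -1 for a correction line)
        NumSales REAL    NOT NULL   -- 1 / (number of lines in the sale)
    )
    """
)

# Sale 1 has three lines (so NumSales = 1/3 each); sale 2 has a single line.
rows = [
    (1, "shirt", "black", 3, 1, 1 / 3),
    (1, "jeans", "black", 1, 1, 1 / 3),
    (1, "socks", "white", 2, 1, 1 / 3),
    (2, "shirt", "black", 1, 1, 1.0),
]
conn.executemany("INSERT INTO sale_lines VALUES (?, ?, ?, ?, ?, ?)", rows)

# Unfiltered: the counters sum to 7 items, 4 lines and (approximately) 2 sales.
totals = conn.execute(
    "SELECT sum(NumItems), sum(NumLines), sum(NumSales) FROM sale_lines"
).fetchone()

# Filtered on colour: sum(NumSales) only sees two thirds of sale 1,
# while count(DISTINCT SaleID) still counts both sales.
black = conn.execute(
    "SELECT sum(NumSales), count(DISTINCT SaleID) "
    "FROM sale_lines WHERE Colour = 'black'"
).fetchone()

print("totals:", totals[0], totals[1], round(totals[2], 3))
print("black clothes:", round(black[0], 3), "by sum(NumSales) vs", black[1], "by count(distinct)")
```

<p>With this sample data the filtered <code>sum(NumSales)</code> comes out at roughly 1.667, although two sales involved black clothes, which is the miscount described above.</p>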
 
