Note that there are some explanatory texts on larger screens.

plurals
  1. POPoint-in-time snapshots in data warehouse
    primarykey
    data
    text
    <p>I am attempting to re-create the status of a customer at an exact point in time. For instance, each customer has many attributes that can change at any time (e.g., risk-score, billings-to-date, customer satisfaction).</p> <p>Every time a customer submits an application for credit, I would like to see the value of all of these characteristics at the time of submission. Subsequently, I want to use these values to develop a predictive model.</p> <p>My first thought was to create a Type-2 slowly changing dimension with effective and expiration date/timestamps and to use a half-open join time_effective &lt;= date_of_application &lt; time_expired.</p> <p>However, most of these attributes are behavioral dimensions that require complicated calculations using historic data from fact tables. Moreover, the calculated values also cannot be grouped using ranges (0-$500, $500-$750, etc.). Keeping track of all of these attributes for each dimension cause it to explode. Note: Some values change daily, others change at arbitrary points in time.</p> <p>My ideal data extract would be as follows:</p> <ul> <li>ID# for Credit Application</li> <li>Time of submission</li> <li>Value of Attribute 1 at time of submission</li> <li>Value of Attribute 2...</li> <li>Value of Attribute N</li> </ul> <p>In addition to credit applications, there are other fact tables where I want to find characteristics that were in effect at the time of that event.</p> <p>What are recommendations for handling this? I see several approaches:</p> <ol> <li>Allow the dimension to explode</li> <li>Create separate tables that have one or several attributes and separately query the particular tables that have the attributes I am interested in</li> <li>Append a column to the credit applications fact table that contains a "snapshot" of all the attributes I am interested in.</li> </ol> <p>Some of these issues are discussed in Kimball's ETL Toolkit book (pp 190-192) and in his Data Warehouse Toolkit book (187-191). On pp 154-157, there is a discussion of "rapidly changing monster dimensions" that seems very relevant. Still, I'm having trouble implementing these recommendations.</p>
    singulars
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. COI really like #3. Think of those attributes as "degenerate dimensions" related specifically to that application, rather than as general attributes of the customer. I'm not sure this question's going to get a good answer here on SO, because it really depends on your end-game use case and just how powerful your DBMS is. A DW applicance could probably handle #1 for awhile (though it will really explode if you are storing a daily risk score for every customer)...
      singulars
    2. COIs #3 still a good option if the source system (during OLTP) is not able to record the values when the transaction occurs? In other words, it would have to be added in during post-processing ETL when constructing the fact table. Do you have any thoughts on #2? One possible benefit of that is that you can potentially build the attributes in a modular fashion.
      singulars
    3. COIf the source system doesn't track the attributes as-of the time of application, then yes, it would have to be added in during ETL. #2 is basically just creating an "independent" SCD table or set of tables for the interesting attributes. In any case you'll be generating the "snapshot" of data during your ETL process no matter which way you go, whether that snapshot is an update to an SCD or adding the data to the fact table...
      singulars
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload