Note that there are some explanatory texts on larger screens.

plurals
  1. POPoint-in-time snapshots in data warehouse
    text
    copied!<p>I am attempting to re-create the status of a customer at an exact point in time. For instance, each customer has many attributes that can change at any time (e.g., risk-score, billings-to-date, customer satisfaction).</p> <p>Every time a customer submits an application for credit, I would like to see the value of all of these characteristics at the time of submission. Subsequently, I want to use these values to develop a predictive model.</p> <p>My first thought was to create a Type-2 slowly changing dimension with effective and expiration date/timestamps and to use a half-open join time_effective &lt;= date_of_application &lt; time_expired.</p> <p>However, most of these attributes are behavioral dimensions that require complicated calculations using historic data from fact tables. Moreover, the calculated values also cannot be grouped using ranges (0-$500, $500-$750, etc.). Keeping track of all of these attributes for each dimension cause it to explode. Note: Some values change daily, others change at arbitrary points in time.</p> <p>My ideal data extract would be as follows:</p> <ul> <li>ID# for Credit Application</li> <li>Time of submission</li> <li>Value of Attribute 1 at time of submission</li> <li>Value of Attribute 2...</li> <li>Value of Attribute N</li> </ul> <p>In addition to credit applications, there are other fact tables where I want to find characteristics that were in effect at the time of that event.</p> <p>What are recommendations for handling this? I see several approaches:</p> <ol> <li>Allow the dimension to explode</li> <li>Create separate tables that have one or several attributes and separately query the particular tables that have the attributes I am interested in</li> <li>Append a column to the credit applications fact table that contains a "snapshot" of all the attributes I am interested in.</li> </ol> <p>Some of these issues are discussed in Kimball's ETL Toolkit book (pp 190-192) and in his Data Warehouse Toolkit book (187-191). On pp 154-157, there is a discussion of "rapidly changing monster dimensions" that seems very relevant. Still, I'm having trouble implementing these recommendations.</p>
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload