Note that there are some explanatory texts on larger screens.

plurals
  1. PO
    text
    copied!<p>At my workplace, we have home-brew ETL. I can see you raise a brow :). The minimal meta-data we have describes the following. Subscription details, Audit, Data-mapping, Run-order. </p> <p>The <strong>subscription details</strong> again fall into two categories, vendor from whom the data was purchased and teams/applications using it. The ftp/http details, access credentials are also stored. Fortunately we were asked to have absolutely zero SPs, Major exception "identity generators". </p> <p><strong>Audit details</strong> involve, date of data, last modified time, user who ran it, failure/success count.</p> <p><strong>Data-mapping table</strong> describes the tables and column names which hold the data. We used to have an additional composite-key descriptor table. However we decided to do away with that. The performance loss was compensated by asking data table owners to create proper partitioning strategy.</p> <p><strong>Run_order table</strong> is another table we have which determines if the user can run (Y/N) and the order in which runs can occur.</p> <p>The meta-data is also stored with an history (based on date). So if any one decides run an archived/historic subscription. The run would go ahead.</p> <p><strong>Uses of the above</strong>: We can prioritize the data loads based on importance of subscription. We can monitor failures at a generic level (birds-eye view). We can write generic code that can create dynamic sql queries (no hard-coding). Our load and extract processes are forced to use the data-mapping table, so no user can get away with stale information.</p> <p>This seemed to work so far in our experience.</p>
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload