Note that there are some explanatory texts on larger screens.

plurals
  1. POStrategy to avoid OutOfMemoryException during ETL in .NET
    text
    copied!<p>I have wrote a ETL process that perform ETL process. The ETL process needs to process more than 100+ million or rows overall for 2 years worth of records. To avoid out of memory issue, we chunk the data loading down to every 7 days. For each chunk process, it loads up all the required reference data, then the process open a sql connection and load the source data one by one, transform it, and write it to the data warehouse.</p> <p>The drawback of processing the data by chunk is it is slow.</p> <p>This process has been working fine for most of the tables, but there is one table I still run into out of memory. The process has loaded too many reference data. I would like to avoid chunk the data down to 3 days so that it has a decent performance.</p> <p>Is there any other strategies that I can use to avoid OutOfMemoryException? </p> <p>For example, local database, write the reference data to files, spawn another .NET process to hold more memory in Windows, use CLR stored procedure to do ETL...</p> <p>Environment: Windows 7 32 bit OS. 4 GB of RAM. SQL Server Standard Edition.</p> <p>The only one solution is to use a store procedure and let SQL Server handle the ETL. However, I am trying to avoid it because the program needs to support Oracle as well. Other performance improvement I tried are added indexes to improve the loading queries. Create custom data access class to only load the necessary columns, instead of loading the entire row into memory.</p> <p>Thanks</p>
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload