
Upload a Massive CSV File to SQL Server Database

I need to upload a massive (16 GB, 65+ million records) CSV file to a single table in a SQL Server 2005 database. Does anyone have any pointers on the best way to do this?

**Details**

I am currently using a C# console application (.NET Framework 2.0) to split the import file into files of 50,000 records, then process each file. I upload the records into the database from the console application using the `SqlBulkCopy` class in batches of 5,000. Splitting the files takes approximately 30 minutes, and uploading the entire data set (65+ million records) takes approximately 4.5 hours. The generated file size and the batch upload size are both configuration settings, and I am investigating increasing both values to improve performance. The application runs on a quad-core server with 16 GB of RAM, which is also the database server.

**Update**

Given the answers so far, please note that prior to the import:

- The database table is truncated, and all indexes and constraints are dropped.
- The database is shrunk, and disk space reclaimed.

After the import has completed:

- The indexes are recreated.

If you can suggest any different approaches, or ways to improve the existing import application, I would appreciate it. Thanks.

**Related Question**

The following question may be of use to others dealing with this problem:

- [Potential Pitfalls of inserting millions of records into SQL Server 2005 from flat file](https://stackoverflow.com/questions/141556/potential-pitfalls-of-inserting-millions-of-records-into-sql-server-2005-from-fla)

**Solution**

I have investigated the effect of altering the batch size and the size of the split files, and found that batches of 500 records and split files of 200,000 records work best for my application. Using `SqlBulkCopyOptions.TableLock` also helped. See the answer to this [question](https://stackoverflow.com/questions/779690/what-is-the-recommended-batch-size-for-sqlbulkcopy/869202#869202) for further details.

I also looked at using an SSIS DTS package and a `BULK INSERT` SQL script. The SSIS package appeared quicker, but did not offer me the ability to record invalid records, etc. The `BULK INSERT` SQL script, whilst slower than the SSIS package, was considerably faster than the C# application. It did allow me to record errors, and for this reason I am accepting the `BULK INSERT` answer from [ConcernedOfTunbridgeWells](https://stackoverflow.com/users/15401/concernedoftunbridgewells) as the solution. I'm aware that this may not be the best answer for everyone facing this issue, but it answers my immediate problem.

Thanks to everyone who replied.

Regards, MagicAndi
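For illustration, here is a minimal sketch of the `SqlBulkCopy` approach described in the question, using the batch size and `TableLock` option the post settled on. The connection string, split-file directory, destination table name, and the two-column CSV schema in `LoadSplitFile` are hypothetical placeholders, not taken from the original application.

```csharp
using System;
using System.Data;
using System.Data.SqlClient;
using System.IO;

class BulkLoader
{
    // Hypothetical connection string and table name, for illustration only.
    const string ConnectionString =
        "Data Source=.;Initial Catalog=ImportDb;Integrated Security=SSPI";
    const string DestinationTable = "dbo.ImportTable";

    static void Main(string[] args)
    {
        // Each split file (e.g. 200,000 records) is loaded and bulk copied in turn.
        foreach (string splitFile in Directory.GetFiles(@"D:\Import\Splits", "*.csv"))
        {
            DataTable records = LoadSplitFile(splitFile);

            // TableLock gave a noticeable speed-up in the tests described above.
            using (SqlBulkCopy bulkCopy =
                new SqlBulkCopy(ConnectionString, SqlBulkCopyOptions.TableLock))
            {
                bulkCopy.DestinationTableName = DestinationTable;
                bulkCopy.BatchSize = 500;     // batch size that worked best per the post
                bulkCopy.BulkCopyTimeout = 0; // 0 = no command timeout, for long-running loads
                bulkCopy.WriteToServer(records);
            }
        }
    }

    // Minimal CSV parser for illustration: assumes two columns (Id, Value) and
    // no quoted fields; a real import would match the destination table's schema.
    static DataTable LoadSplitFile(string path)
    {
        DataTable table = new DataTable();
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Value", typeof(string));

        using (StreamReader reader = new StreamReader(path))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                string[] fields = line.Split(',');
                table.Rows.Add(int.Parse(fields[0]), fields[1]);
            }
        }
        return table;
    }
}
```

Setting `BulkCopyTimeout` to 0 disables the command timeout, which matters for loads of this size; without explicit column mappings, `SqlBulkCopy` maps columns by ordinal position.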
 
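For the `BULK INSERT` route that was ultimately accepted, a sketch might look like the following. The file path, table name, terminators, and error-file location are illustrative assumptions; the `ERRORFILE` option (available from SQL Server 2005) is what allows rejected rows to be recorded, which was the deciding factor over the SSIS package.

```sql
-- Minimal BULK INSERT sketch; paths and table name are illustrative assumptions.
BULK INSERT dbo.ImportTable
FROM 'D:\Import\massive_file.csv'
WITH
(
    FIELDTERMINATOR = ',',      -- CSV column delimiter
    ROWTERMINATOR   = '\n',     -- one record per line
    BATCHSIZE       = 500,      -- commit in batches, mirroring the SqlBulkCopy setting
    TABLOCK,                    -- table-level lock to allow a minimally logged bulk load
    MAXERRORS       = 100,      -- tolerate a limited number of bad rows
    ERRORFILE       = 'D:\Import\massive_file.errors'  -- capture rejected rows
);
```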
