Note that there are some explanatory texts on larger screens.

plurals
  1. POFastest way to insert into a SQL Server table from .NET code?
    primarykey
    data
    text
    <p>What is the fastest way to do this:</p> <ul> <li>One table, no references that I cannot prefill (i.e. there is one reference key there, but i have all the data filled in)</li> <li>LOTS of data. We talk of hundreds of millions of rows per day, coming in dynamically through an API</li> <li>Requests must / should be processed as soon as feasible in a near real time scenario (i.e. no writing out to a file for upload one per day). 2 seconds is the normal maximal delay</li> <li>Separate machines for data / application and the SQL Server</li> </ul> <p>What I do now:</p> <ul> <li>Aggregate up to 32*1024 rows into an array, then queue it.</li> <li>Read the queue in 2-3 threads. Insert into database using SqlBulkCopy.</li> </ul> <p>I get about 60k-75k rows imported per second, which is not enough, but quite close. I would love to hit 250.000 rows.</p> <p>So far nothing is really used. I get 20% time "network I/O" blocks, have one core 80% loaded CPU side. Discs are writing out 7mb-14mb, mostly idle. Average queue length on a RAID 10 of 6 raptors is.... 0.25.</p> <p>Anyone any idea how to speed this up? Faster server (so far it is virtual, 8gb ram, 4 cores, physical disc pass through for data).</p> <hr> <p>Adding some clarifications:</p> <ul> <li>This is a 2008 R2 Enterprise SQL Server on a 2008 R2 server. machine has 4 cores, 8gb ram. All 64 bit. The 80% load average comes from this machine showing about 20% cpu load.</li> <li>The table is simple, has no primary key, only an index on a relational reference (instrument reference) and a unique (within a set of instruments, so this is not enforced) timestamp.</li> <li>The fields on the table are: timestamp, instrument reference (no enforced foreign key), data type (char 1, one of a number of characters indicating what data is posted), price (double) and volume (int). As you can see this is a VERY thin table. The data in question is tick data for financial instruments.</li> <li>The question is also about hardware etc. - mostly because i see no real bottleneck. I am inserting in multiple transactions and it gives me a benefit, but a small one. Discs, CPU are not showing significant load, network io wait is high (300ms/second, 30% at the moment), but this is on the same virtualization platform which runs JSUT the two servers and has enough cores to run all. I pretty much am open to "buy another server", but i want to identify the bottleneck first.... especially given that at the end of the day I am not grabbing what the bottleneck is. Logging is irrelevant - the bulk inserts do NOT go into the data log as data (no clustered index).</li> </ul> <p>Would vertical partitioning help, for example by a byte (tinyint) that would split the instrument universe by for example 16 tables, and me thus doing up to 16 inserts at the same time? As actually the data comes from different exchanges I could make a partition per exchange. This would be a natural split field (which is actually in instrument, but I could duplicate this data here).</p> <hr> <p>Some more clarifications: Got the speed even higher (90k), now clearly limited by network IO between machines, which could be VM switching.</p> <p>What I do now is do a connection per 32k rows, put up a temp table, insert into this with SqlBUlkdCopy, THEN use ONE sql statement to copy to main table - minimizes any lock times on the main table.</p> <p>Most waiting time is now still on network IO. Seems I run into issues where VM wise. Will move to physical hardware in the next months ;)</p>
    singulars
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload