Huge table (9 million records) in SQL Server
Our underwriting company just sent us a data extract of leads. There are 9 million rows. The rows consist of LeadID (guid), RawLeadXML (xml -- probably 3-4 KB max), and LeadStatusID (int).

I first tried to add an autonumber integer and make it the primary key of this table. Well, it added the field, but couldn't make it the primary key ("There is insufficient memory available in the buffer pool.").

What I need to do is take every record, one by one, get the XML, put it into an XmlDocument object in .NET, strip out all the fields I want (firstname, lastname, etc.) and store that parsed information in another table.

I can't even get this statement to run:

    select * from Leads where id between 1 and 1000

If I just select 1000 records at a time (select top 1000 * from Leads), that works, but how would I then select the next 1000 records without some sort of reference point?

My machine has 4 processors (2.53 GHz) and 12 GB of RAM. It's not a server, but it's a beefy machine. I don't know what to try next, honestly.

EDIT: I left out that the original file was actually an MDF (and the associated LDF) file, so I just attached them in SQL Server.

EDIT 2: I messed up and said that the RawLeadXML column was xml -- it isn't, it's just nvarchar(max). I honestly didn't know there WAS an xml datatype.

EDIT 3: I can't even issue a delete statement on this table: "delete from leads where leadid = '100a7927-5311-4f12-8fe3-95c079d32dd4'" blows up with:

    Msg 802, Level 17, State 20, Line 2
    There is insufficient memory available in the buffer pool.

I have no idea what to do next. How in the hell is this even a problem? There are thousands of databases in the world with more records than I've got.

EDIT 4: In case anyone cares, none of the below solutions worked. I'm sure this is a limitation of my machine, and definitely not a condemnation of the fine answers I received below. Currently, I am transferring the zipped DB (2.6 GB) to our servers at Rackspace, where I will attempt to add an index on that hardware, hopefully without taking down our production servers. Once the index is added, I'm hopeful that I can zip up the DB, bring it back down to my local machine, and actually do something with it.

EDIT 5: My machine was literally incapable of handling a table of this size. It has 12 GB of RAM, 64-bit Windows 7 Professional, a quad-core 2.53 GHz processor, an SSD drive, etc. It's pretty beefy for a development machine, and it couldn't handle this.

So, I moved the DB to our server at Rackspace in London. With 48 GB of memory, it was able to add the index I needed. Even after that, my machine here couldn't do anything all that useful with it, so I've written a .NET program, running in London, that pulls out the records 1000 at a time, parses them into another table, and then marks the original records as processed.

Once I do this, I'm going to have to leave the DB in London, because I doubt I could write any meaningful reports against this monstrosity locally. Which will make development fun.

Summary: I believe there is no good way to process a dataset this large without server-class hardware with at least 48 GB of RAM (in my case).
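For the paging question (how to select the next 1000 records without a reference point), keyset or "seek" pagination supplies the reference point: remember the last key of each batch and seek past it on the next query. A minimal T-SQL sketch, assuming the Leads columns from the question and an index (or the eventual primary key) on LeadID so the seek doesn't scan the whole table:

    -- Start below any possible guid; @lastId carries the highest key
    -- seen in the previous batch.
    DECLARE @lastId uniqueidentifier = '00000000-0000-0000-0000-000000000000';

    -- Fetch one batch, ordered by the key we seek on.
    SELECT TOP (1000) LeadID, RawLeadXML, LeadStatusID
    FROM dbo.Leads
    WHERE LeadID > @lastId
    ORDER BY LeadID;

    -- After processing the batch, set @lastId to the largest LeadID
    -- returned and rerun the query for the next 1000 rows.

SQL Server does define an ordering for uniqueidentifier, so the seek is stable even though the order itself looks arbitrary; without an index on LeadID, though, each batch still scans the table.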
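On the indexing failure itself, two things that sometimes get a large clustered-index build through on a memory-constrained box are letting the sort spill to tempdb rather than the buffer pool, and confirming that "max server memory" isn't capped unusually low. A hedged sketch (the PK_Leads constraint name is invented; table and column names are from the question):

    -- Check whether SQL Server's memory cap is set unusually low.
    EXEC sp_configure 'show advanced options', 1;
    RECONFIGURE;
    EXEC sp_configure 'max server memory (MB)';

    -- Build the clustered primary key with the sort pushed to tempdb,
    -- which eases buffer-pool pressure during the build.
    ALTER TABLE dbo.Leads
        ADD CONSTRAINT PK_Leads PRIMARY KEY CLUSTERED (LeadID)
        WITH (SORT_IN_TEMPDB = ON);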
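As an alternative to the row-by-row XmlDocument loop, the extraction can also be done set-based in T-SQL by casting the nvarchar(max) column to xml and shredding it in batches, marking each batch as processed so the loop advances. This is only a sketch under loud assumptions: the ParsedLeads destination table, the /Lead/FirstName and /Lead/LastName element paths, and the LeadStatusID codes (0 = unprocessed, 1 = processed) are all invented for illustration, since the question gives neither the XML shape nor the status values.

    SET NOCOUNT ON;

    DECLARE @batch TABLE (LeadID uniqueidentifier PRIMARY KEY);

    WHILE 1 = 1
    BEGIN
        DELETE FROM @batch;

        -- Grab the keys of the next 1000 unprocessed rows.
        INSERT INTO @batch (LeadID)
        SELECT TOP (1000) LeadID
        FROM dbo.Leads
        WHERE LeadStatusID = 0            -- assumption: 0 = unprocessed
        ORDER BY LeadID;

        IF @@ROWCOUNT = 0 BREAK;          -- nothing left to process

        -- Shred the XML for just this batch into the destination table.
        INSERT INTO dbo.ParsedLeads (LeadID, FirstName, LastName)
        SELECT l.LeadID,
               x.doc.value('(/Lead/FirstName)[1]', 'nvarchar(100)'),
               x.doc.value('(/Lead/LastName)[1]',  'nvarchar(100)')
        FROM dbo.Leads AS l
        JOIN @batch AS b ON b.LeadID = l.LeadID
        CROSS APPLY (SELECT CAST(l.RawLeadXML AS xml) AS doc) AS x;

        -- Mark the batch as processed so the loop advances.
        UPDATE l
        SET l.LeadStatusID = 1            -- assumption: 1 = processed
        FROM dbo.Leads AS l
        JOIN @batch AS b ON b.LeadID = l.LeadID;
    END;

One caveat: CAST(... AS xml) throws on the first malformed document, so if the extract contains dirty rows, the original per-record .NET loop (which can try/catch each row) is the more forgiving route.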