What is the fastest way to load an XML file into MySQL using C#?
<h3>Question</h3>
<p>What is the fastest way to dump a large (&gt; 1GB) XML file into a MySQL database?</p>
<h3>Data</h3>
<p>The data in question is the StackOverflow Creative Commons Data Dump.</p>
<h3>Purpose</h3>
<p>This will be used in an offline StackOverflow viewer I am building, since I am looking to do some studying/coding in places where I will not have access to the internet.</p>
<p>I would like to release this to the rest of the StackOverflow membership for their own use when the project is finished.</p>
<h3>Problem</h3>
<p>Originally, I was reading from XML and writing to the DB one record at a time. This took about 10 hours to run on my machine. The hacktastic code I'm using now throws 500 records into an array, then creates an insertion query to load all 500 at once (e.g. "<code>INSERT INTO posts VALUES (...), (...), (...) ... ;</code>"). While this is faster, it still takes hours to run. Clearly this is not the best way to go about it, so I'm hoping the big brains on this site will know of a better way.</p>
<h3>Constraints</h3>
<ul>
<li>I am building the application using C# as a desktop application (i.e. WinForms).</li>
<li>I am using MySQL 5.1 as my database. This means that features such as "<code>LOAD XML INFILE filename.xml</code>" are not usable in this project, as this feature is only available in MySQL 5.4 and above. This constraint is largely due to my hope that the project would be useful to people other than myself, and I'd rather not force people to use beta versions of MySQL.</li>
<li>I'd like the data load to be built into my application (i.e. no instructions to "Load the dump into MySQL using 'foo' before running this application.").</li>
<li>I'm using MySQL Connector/Net, so anything in the <code>MySql.Data</code> namespace is acceptable.</li>
</ul>
<p>Thanks for any pointers you can provide!</p>
<hr>
<p><strong>Ideas so far</strong></p>
<blockquote>
<p>stored procedure that loads an entire XML file into a column, then parses it using XPath</p>
</blockquote>
<ul>
<li>This didn't work, since the file size is subject to the limit set by the <code>max_allowed_packet</code> variable, which is 1 MB by default. This is far below the size of the data dump files.</li>
</ul>
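<p>For reference, the batching approach described under Problem can be sketched roughly as below. This is a minimal illustration, not the actual application code: the <code>PostRow</code> type, the <code>BuildInsert</code> helper, and the <code>id</code>/<code>body</code> column names are all hypothetical stand-ins, and the sketch only builds the statement text and parameter values for one batch, so it does not touch the database itself.</p>

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical record type; the real posts table has more columns.
public sealed class PostRow
{
    public int Id;
    public string Body;
}

public static class BatchInsert
{
    // Builds one parameterized multi-row INSERT for a batch of rows.
    // Returns the SQL text and fills 'parameters' with name/value pairs
    // that the caller would copy into the command's parameter collection.
    public static string BuildInsert(IList<PostRow> batch,
                                     IDictionary<string, object> parameters)
    {
        if (batch.Count == 0)
            throw new ArgumentException("Batch must not be empty.", nameof(batch));

        // Column names (id, body) are assumptions for illustration only.
        var sql = new StringBuilder("INSERT INTO posts (id, body) VALUES ");
        for (int i = 0; i < batch.Count; i++)
        {
            if (i > 0) sql.Append(", ");
            // One "(@idN, @bodyN)" group per row, with uniquely named parameters.
            sql.Append("(@id").Append(i).Append(", @body").Append(i).Append(")");
            parameters["@id" + i] = batch[i].Id;
            parameters["@body" + i] = batch[i].Body;
        }
        return sql.Append(";").ToString();
    }
}
```

<p>In the application, the returned text would presumably become the <code>CommandText</code> of a <code>MySqlCommand</code>, with each name/value pair added through <code>Parameters.AddWithValue</code>. Parameters are used here instead of concatenated values so that post bodies containing quotes do not break the statement.</p>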