
Painfully slow Azure table insert and delete batch operations
<p>I am running into a huge performance bottleneck when using Azure table storage. My desire is to use tables as a sort of cache, so a long process may result in anywhere from hundreds to several thousand rows of data. The data can then be quickly queried by partition and row keys.</p>

<p>The querying is working pretty fast (extremely fast when using only partition and row keys; a bit slower, but still acceptable, when also searching through properties for a particular match).</p>
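<p>For reference, a typical point query in my code looks roughly like the following (the keys are made up for this example, but the shape is the same):</p>

<pre><code>// a point query resolves directly to a single entity via partition key + row key
static DynamicTableEntity PointQuery( CloudTable table, string partitionKey, string rowKey )
{
    TableOperation retrieve = TableOperation.Retrieve&lt;DynamicTableEntity&gt;( partitionKey, rowKey );
    TableResult result = table.Execute( retrieve );
    return (DynamicTableEntity)result.Result; // null if the entity does not exist
}

// usage, with made-up keys:
// var entity = PointQuery( table, "filename.txt", "$item0000000042" );
</code></pre>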
<p>However, both inserting and deleting rows is painfully slow.</p>

<p><strong>Clarification</strong></p>

<p>I want to clarify that even inserting a single batch of 100 items takes several seconds. This isn't just a problem with the total throughput over thousands of rows. It affects me even when I insert only 100.</p>

<p>Here is an example of my code to do a batch insert to my table:</p>

<pre><code>static async Task BatchInsert( CloudTable table, List&lt;ITableEntity&gt; entities )
{
    int rowOffset = 0;

    while ( rowOffset &lt; entities.Count )
    {
        Stopwatch sw = Stopwatch.StartNew();

        var batch = new TableBatchOperation();

        // next batch
        var rows = entities.Skip( rowOffset ).Take( 100 ).ToList();

        foreach ( var row in rows )
            batch.Insert( row );

        // submit
        await table.ExecuteBatchAsync( batch );

        rowOffset += rows.Count;

        Trace.TraceInformation( "Elapsed time to batch insert " + rows.Count + " rows: " + sw.Elapsed.ToString( "g" ) );
    }
}
</code></pre>

<p>I am using batch operations, and here is one sample of debug output:</p>

<pre><code>Microsoft.WindowsAzure.Storage Information: 3 : b08a07da-fceb-4bec-af34-3beaa340239b: Starting asynchronous request to http://127.0.0.1:10002/devstoreaccount1.
Microsoft.WindowsAzure.Storage Verbose: 4 : b08a07da-fceb-4bec-af34-3beaa340239b: StringToSign = POST..multipart/mixed; boundary=batch_6d86d34c-5e0e-4c0c-8135-f9788ae41748.Tue, 30 Jul 2013 18:48:38 GMT./devstoreaccount1/devstoreaccount1/$batch.
Microsoft.WindowsAzure.Storage Information: 3 : b08a07da-fceb-4bec-af34-3beaa340239b: Preparing to write request data.
Microsoft.WindowsAzure.Storage Information: 3 : b08a07da-fceb-4bec-af34-3beaa340239b: Writing request data.
Microsoft.WindowsAzure.Storage Information: 3 : b08a07da-fceb-4bec-af34-3beaa340239b: Waiting for response.
Microsoft.WindowsAzure.Storage Information: 3 : b08a07da-fceb-4bec-af34-3beaa340239b: Response received. Status code = 202, Request ID = , Content-MD5 = , ETag = .
Microsoft.WindowsAzure.Storage Information: 3 : b08a07da-fceb-4bec-af34-3beaa340239b: Response headers were processed successfully, proceeding with the rest of the operation.
Microsoft.WindowsAzure.Storage Information: 3 : b08a07da-fceb-4bec-af34-3beaa340239b: Processing response body.
Microsoft.WindowsAzure.Storage Information: 3 : b08a07da-fceb-4bec-af34-3beaa340239b: Operation completed successfully.
iisexpress.exe Information: 0 : Elapsed time to batch insert 100 rows: 0:00:00.9351871
</code></pre>

<p>As you can see, this example takes almost 1 second to insert 100 rows. The average seems to be about 0.8 seconds on my dev machine (3.4 GHz quad core).</p>

<p>This seems ridiculous.</p>

<p>Here is an example of a batch delete operation:</p>

<pre><code>Microsoft.WindowsAzure.Storage Information: 3 : 4c271cb5-7463-44b1-b2e5-848b8fb10a93: Starting asynchronous request to http://127.0.0.1:10002/devstoreaccount1.
Microsoft.WindowsAzure.Storage Verbose: 4 : 4c271cb5-7463-44b1-b2e5-848b8fb10a93: StringToSign = POST..multipart/mixed; boundary=batch_7e3d229f-f8ac-4aa0-8ce9-ed00cb0ba321.Tue, 30 Jul 2013 18:47:41 GMT./devstoreaccount1/devstoreaccount1/$batch.
Microsoft.WindowsAzure.Storage Information: 3 : 4c271cb5-7463-44b1-b2e5-848b8fb10a93: Preparing to write request data.
Microsoft.WindowsAzure.Storage Information: 3 : 4c271cb5-7463-44b1-b2e5-848b8fb10a93: Writing request data.
Microsoft.WindowsAzure.Storage Information: 3 : 4c271cb5-7463-44b1-b2e5-848b8fb10a93: Waiting for response.
Microsoft.WindowsAzure.Storage Information: 3 : 4c271cb5-7463-44b1-b2e5-848b8fb10a93: Response received. Status code = 202, Request ID = , Content-MD5 = , ETag = .
Microsoft.WindowsAzure.Storage Information: 3 : 4c271cb5-7463-44b1-b2e5-848b8fb10a93: Response headers were processed successfully, proceeding with the rest of the operation.
Microsoft.WindowsAzure.Storage Information: 3 : 4c271cb5-7463-44b1-b2e5-848b8fb10a93: Processing response body.
Microsoft.WindowsAzure.Storage Information: 3 : 4c271cb5-7463-44b1-b2e5-848b8fb10a93: Operation completed successfully.
iisexpress.exe Information: 0 : Elapsed time to batch delete 100 rows: 0:00:00.6524402
</code></pre>

<p>Consistently over 0.5 seconds.</p>

<p>I ran this deployed to Azure (small instance) as well, and have recorded times of 20 minutes to insert 28,000 rows (roughly 23 rows per second).</p>

<p>I am currently using the 2.1 RC version of the Storage Client Library: <a href="http://blogs.msdn.com/b/windowsazurestorage/archive/2013/07/12/introducing-storage-client-library-2-1-rc-for-net-and-windows-phone-8.aspx">MSDN Blog</a></p>

<p>I must be doing something very wrong. Any thoughts?</p>

<p><strong>UPDATE</strong></p>

<p>I've tried parallelism, with the net effect of an overall speed improvement (and 8 maxed-out logical processors), but still barely 150 row inserts per second on my dev machine.</p>

<p>No better overall, as far as I can tell, and maybe even worse when deployed to Azure (small instance).</p>

<p>I've increased the thread pool, and increased the max number of HTTP connections for my WebRole, by following <a href="http://tk.azurewebsites.net/2012/12/10/greatly-increase-the-performance-of-azure-storage-cloudblobclient/">this advice</a>; the settings I changed are sketched below.</p>
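<p>Roughly, those tweaks amount to the following, run once at role startup (my paraphrase of that post; the values are illustrative, not verified optima):</p>

<pre><code>// run once at startup, before the first storage call
ServicePointManager.Expect100Continue = false;    // skip the 100-Continue handshake on each request
ServicePointManager.UseNagleAlgorithm = false;    // stop TCP from coalescing small writes
ServicePointManager.DefaultConnectionLimit = 100; // allow many concurrent connections per endpoint

ThreadPool.SetMinThreads( 100, 100 );             // avoid slow thread-pool ramp-up under bursts
</code></pre>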
<p>I still feel that I am missing something fundamental that is limiting my inserts/deletes to 150 ROPS.</p>

<p><strong>UPDATE 2</strong></p>

<p>After analyzing some diagnostics logs from my small instance deployed to Azure (using the new logging built in to the 2.1 RC Storage Client), I have a bit more information.</p>

<p>The first storage client log for a batch insert is at <code>635109046781264034</code> ticks:</p>

<pre><code>caf06fca-1857-4875-9923-98979d850df3: Starting synchronous request to https://?.table.core.windows.net/.; TraceSource 'Microsoft.WindowsAzure.Storage' event
</code></pre>

<p>Then almost 3 seconds later I see this log at <code>635109046810104314</code> ticks:</p>

<pre><code>caf06fca-1857-4875-9923-98979d850df3: Preparing to write request data.; TraceSource 'Microsoft.WindowsAzure.Storage' event
</code></pre>

<p>Then a few more logs, which take up a combined 0.15 seconds, ending with this one at <code>635109046811645418</code> ticks, which wraps up the insert:</p>

<pre><code>caf06fca-1857-4875-9923-98979d850df3: Operation completed successfully.; TraceSource 'Microsoft.WindowsAzure.Storage' event
</code></pre>

<p>I'm not sure what to make of this, but it is pretty consistent across the batch insert logs that I examined.</p>
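<p>For concreteness, a .NET tick is 100 nanoseconds, so those timestamps work out as follows:</p>

<pre><code>// gap between "Starting synchronous request" and "Preparing to write request data"
TimeSpan gap = TimeSpan.FromTicks( 635109046810104314 - 635109046781264034 );
// = 28,840,280 ticks, i.e. about 2.88 seconds before the request body is even written

// everything after that, up to "Operation completed successfully"
TimeSpan rest = TimeSpan.FromTicks( 635109046811645418 - 635109046810104314 );
// = 1,541,104 ticks, i.e. about 0.15 seconds
</code></pre>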
<p><strong>UPDATE 3</strong></p>

<p>Here is the code used to batch insert in parallel. In this code, just for testing, I am ensuring that I am inserting each batch of 100 into a unique partition.</p>

<pre><code>static async Task BatchInsert( CloudTable table, List&lt;ITableEntity&gt; entities )
{
    int rowOffset = 0;

    var tasks = new List&lt;Task&gt;();

    while ( rowOffset &lt; entities.Count )
    {
        // next batch
        var rows = entities.Skip( rowOffset ).Take( 100 ).ToList();
        rowOffset += rows.Count;

        string partition = "$" + rowOffset.ToString();

        var task = Task.Factory.StartNew( () =&gt;
        {
            Stopwatch sw = Stopwatch.StartNew();

            var batch = new TableBatchOperation();

            foreach ( var row in rows )
            {
                row.PartitionKey = row.PartitionKey + partition;
                batch.InsertOrReplace( row );
            }

            // submit
            table.ExecuteBatch( batch );

            Trace.TraceInformation( "Elapsed time to batch insert " + rows.Count + " rows: " + sw.Elapsed.ToString( "g" ) );
        } );

        tasks.Add( task );
    }

    await Task.WhenAll( tasks );
}
</code></pre>

<p>As stated above, this does help improve the overall time to insert thousands of rows, but each batch of 100 still takes several seconds.</p>
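<p>For what it's worth, a fully asynchronous variant that caps the number of batches in flight would look something like the sketch below (untested; it assumes each group of 100 rows already shares a partition key, since a single batch cannot span partitions):</p>

<pre><code>static async Task BatchInsertThrottled( CloudTable table, List&lt;ITableEntity&gt; entities, int maxInFlight )
{
    var throttle = new SemaphoreSlim( maxInFlight );
    var tasks = new List&lt;Task&gt;();

    for ( int offset = 0; offset &lt; entities.Count; offset += 100 )
    {
        var rows = entities.Skip( offset ).Take( 100 ).ToList();

        // wait until fewer than maxInFlight batches are outstanding
        await throttle.WaitAsync();

        tasks.Add( Task.Run( async () =&gt;
        {
            try
            {
                var batch = new TableBatchOperation();

                foreach ( var row in rows )
                    batch.InsertOrReplace( row );

                // async submit: no thread is blocked while waiting on the network
                await table.ExecuteBatchAsync( batch );
            }
            finally
            {
                throttle.Release();
            }
        } ) );
    }

    await Task.WhenAll( tasks );
}
</code></pre>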
<p><strong>UPDATE 4</strong></p>

<p>So I created a brand new Azure Cloud Service project, using VS2012.2, with the Web Role as a single page template (the new one with the TODO sample in it).</p>

<p>This is straight out of the box, no new NuGet packages or anything. It uses the Storage client library v2 by default, and the EDM and associated libraries v5.2.</p>

<p>I simply modified the HomeController code to be the following (using some random data to simulate the columns that I want to store in the real app):</p>

<pre><code>public ActionResult Index( string returnUrl )
{
    ViewBag.ReturnUrl = returnUrl;

    Task.Factory.StartNew( () =&gt;
    {
        TableTest();
    } );

    return View();
}

static Random random = new Random();

static double RandomDouble( double maxValue )
{
    // the Random class is not thread safe!
    lock ( random )
        return random.NextDouble() * maxValue;
}

void TableTest()
{
    // Retrieve storage account from connection-string
    CloudStorageAccount storageAccount = CloudStorageAccount.Parse(
        CloudConfigurationManager.GetSetting( "CloudStorageConnectionString" ) );

    // create the table client
    CloudTableClient tableClient = storageAccount.CreateCloudTableClient();

    // retrieve the table
    CloudTable table = tableClient.GetTableReference( "test" );

    // create it if it doesn't already exist
    if ( table.CreateIfNotExists() )
    {
        // the container is new and was just created
        Trace.TraceInformation( "Created table named " + "test" );
    }

    Stopwatch sw = Stopwatch.StartNew();

    // create a bunch of objects
    int count = 28000;
    List&lt;DynamicTableEntity&gt; entities = new List&lt;DynamicTableEntity&gt;( count );

    for ( int i = 0; i &lt; count; i++ )
    {
        var row = new DynamicTableEntity()
        {
            PartitionKey = "filename.txt",
            RowKey = string.Format( "$item{0:D10}", i ),
        };

        row.Properties.Add( "Name", EntityProperty.GeneratePropertyForString( i.ToString() ) );
        row.Properties.Add( "Data", EntityProperty.GeneratePropertyForString( string.Format( "data{0}", i ) ) );
        row.Properties.Add( "Value1", EntityProperty.GeneratePropertyForDouble( RandomDouble( 10000 ) ) );
        row.Properties.Add( "Value2", EntityProperty.GeneratePropertyForDouble( RandomDouble( 10000 ) ) );
        row.Properties.Add( "Value3", EntityProperty.GeneratePropertyForDouble( RandomDouble( 1000 ) ) );
        row.Properties.Add( "Value4", EntityProperty.GeneratePropertyForDouble( RandomDouble( 90 ) ) );
        row.Properties.Add( "Value5", EntityProperty.GeneratePropertyForDouble( RandomDouble( 180 ) ) );
        row.Properties.Add( "Value6", EntityProperty.GeneratePropertyForDouble( RandomDouble( 1000 ) ) );

        entities.Add( row );
    }

    Trace.TraceInformation( "Elapsed time to create record rows: " + sw.Elapsed.ToString() );

    sw = Stopwatch.StartNew();

    Trace.TraceInformation( "Inserting rows" );

    // batch our inserts (100 max)
    BatchInsert( table, entities ).Wait();

    Trace.TraceInformation( "Successfully inserted " + entities.Count + " rows into table " + table.Name );
    Trace.TraceInformation( "Elapsed time: " + sw.Elapsed.ToString() );

    Trace.TraceInformation( "Done" );
}

static async Task BatchInsert( CloudTable table, List&lt;DynamicTableEntity&gt; entities )
{
    int rowOffset = 0;

    var tasks = new List&lt;Task&gt;();

    while ( rowOffset &lt; entities.Count )
    {
        // next batch
        var rows = entities.Skip( rowOffset ).Take( 100 ).ToList();
        rowOffset += rows.Count;

        string partition = "$" + rowOffset.ToString();

        var task = Task.Factory.StartNew( () =&gt;
        {
            var batch = new TableBatchOperation();

            foreach ( var row in rows )
            {
                row.PartitionKey = row.PartitionKey + partition;
                batch.InsertOrReplace( row );
            }

            // submit
            table.ExecuteBatch( batch );

            Trace.TraceInformation( "Inserted batch for partition " + partition );
        } );

        tasks.Add( task );
    }

    await Task.WhenAll( tasks );
}
</code></pre>

<p>And this is the output I get:</p>

<pre><code>iisexpress.exe Information: 0 : Elapsed time to create record rows: 00:00:00.0719448
iisexpress.exe Information: 0 : Inserting rows
iisexpress.exe Information: 0 : Inserted batch for partition $100
...
iisexpress.exe Information: 0 : Successfully inserted 28000 rows into table test
iisexpress.exe Information: 0 : Elapsed time: 00:01:07.1398928
</code></pre>

<p>This is a bit faster than in my other app, at roughly 420 ROPS (28,000 rows in about 67 seconds). This is still unacceptable.</p>

<p>And again in this test, my CPU (8 logical processors) is nearly maxed out, while disk access is nearly idle.</p>

<p>I am at a loss as to what is wrong.</p>

<p><strong>UPDATE 5</strong></p>

<p>Round after round of fiddling and tweaking has yielded some improvements, but I just can't get much faster than 500-700(ish) ROPS doing batch InsertOrReplace operations (in batches of 100).</p>

<p>This test is done in the Azure cloud, using a small instance (or two). Based on comments below, I'm resigned to the fact that local testing will be slow at best.</p>

<p>Here are a couple of examples. Each example has its very own PartitionKey:</p>

<pre><code>Successfully inserted 904 rows into table org1; TraceSource 'w3wp.exe' event
Elapsed time: 00:00:01.3401031; TraceSource 'w3wp.exe' event

Successfully inserted 4130 rows into table org1; TraceSource 'w3wp.exe' event
Elapsed time: 00:00:07.3522871; TraceSource 'w3wp.exe' event

Successfully inserted 28020 rows into table org1; TraceSource 'w3wp.exe' event
Elapsed time: 00:00:51.9319217; TraceSource 'w3wp.exe' event
</code></pre>

<p>Maybe it's my MSDN Azure account that has some performance caps? I don't know.</p>

<p>At this point I think I'm done with this. Maybe it's fast enough to use for my purposes, or maybe I'll follow a different path.</p>

<p><strong>CONCLUSION</strong></p>

<p>All the answers below are good!</p>

<p>For my specific question, I've been able to see speeds up to 2k ROPS on a small Azure instance, more typically around 1k. Since I need to keep costs down (and therefore instance sizes down), this defines what I will be able to use tables for.</p>

<p>Thanks everyone for all the help.</p>