
# Moving Millions of items from one Storage Account to Another
I have somewhere in the neighborhood of 4.2 million images I need to move from North Central US to West US, as part of a large migration to take advantage of Azure VM support (for those who don't know, North Central US does not support them). The images are all in one container, split into about 119,000 directories.

I'm using the following from the Copy Blob API:

```csharp
public static void CopyBlobDirectory(
    CloudBlobDirectory srcDirectory,
    CloudBlobContainer destContainer)
{
    // Get a SAS token to use for all blobs.
    string blobToken = srcDirectory.Container.GetSharedAccessSignature(
        new SharedAccessBlobPolicy
        {
            Permissions = SharedAccessBlobPermissions.Read |
                          SharedAccessBlobPermissions.Write,
            SharedAccessExpiryTime = DateTime.UtcNow + TimeSpan.FromDays(14)
        });

    var srcBlobList = srcDirectory.ListBlobs(
        useFlatBlobListing: true,
        blobListingDetails: BlobListingDetails.None).ToList();

    foreach (var src in srcBlobList)
    {
        var srcBlob = src as ICloudBlob;

        // Create the appropriate destination blob type to match the source blob.
        ICloudBlob destBlob;
        if (srcBlob.Properties.BlobType == BlobType.BlockBlob)
            destBlob = destContainer.GetBlockBlobReference(srcBlob.Name);
        else
            destBlob = destContainer.GetPageBlobReference(srcBlob.Name);

        // Copy using the source blob URI plus the SAS token.
        destBlob.BeginStartCopyFromBlob(
            new Uri(srcBlob.Uri.AbsoluteUri + blobToken), null, null);
    }
}
```

The problem is, it's too slow. Way too slow. At the rate it's taking to issue commands to copy all of this stuff, it is going to take somewhere in the neighborhood of four days. I'm not really sure what the bottleneck is (connection limit client side, rate limiting on Azure's end, multithreading, etc.).

So, I'm wondering what my options are. Is there any way to speed things up, or am I just stuck with a job that will take four days to complete?

**Edit: how I'm distributing the work to copy everything**

```csharp
// Set up tracing.
InitTracer();

// Grab a set of photos to benchmark this.
var photos = PhotoHelper.GetAllPhotos().Take(500).ToList();

// Account to copy from.
var from = new Microsoft.WindowsAzure.Storage.Auth.StorageCredentials(
    "oldAccount", "oldAccountKey");
var fromAcct = new CloudStorageAccount(from, true);
var fromClient = fromAcct.CreateCloudBlobClient();
var fromContainer = fromClient.GetContainerReference("userphotos");

// Account to copy to.
var to = new Microsoft.WindowsAzure.Storage.Auth.StorageCredentials(
    "newAccount", "newAccountKey");
var toAcct = new CloudStorageAccount(to, true);
var toClient = toAcct.CreateCloudBlobClient();

Trace.WriteLine("Starting Copy: " + DateTime.UtcNow.ToString());

// Enumerate sub-directories, then move them to blob storage.
// Note: it doesn't matter how high I set the parallelism;
// console output indicates it won't run more than five or so at a time.
var plo = new ParallelOptions { MaxDegreeOfParallelism = 10 };

Parallel.ForEach(photos, plo, (info) =>
{
    CloudBlobDirectory fromDir = fromContainer.GetDirectoryReference(
        info.BuildingId.ToString());

    var toContainer = toClient.GetContainerReference(info.Id.ToString());
    toContainer.CreateIfNotExists();

    Trace.WriteLine(info.BuildingId + ": Starting copy, " +
        info.Photos.Length + " photos...");

    BlobHelper.CopyBlobDirectory(fromDir, toContainer, info);

    // This monitors the container, so I can restart any failed
    // copies if something goes wrong.
    BlobHelper.MonitorCopy(toContainer);
});

Trace.WriteLine("Done: " + DateTime.UtcNow.ToString());
```
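One client-side suspect worth checking (an assumption, not confirmed by the question): the observation that "it won't run more than five or so at a time" regardless of `MaxDegreeOfParallelism` is consistent with .NET's default outbound HTTP connection limit per host, which throttles concurrent requests well below what `Parallel.ForEach` schedules. A minimal sketch of the usual `ServicePointManager` tuning, applied before any storage calls are made:

```csharp
using System;
using System.Net;

class ConnectionTuning
{
    static void Main()
    {
        // Assumption: the fan-out is capped by the default per-host
        // connection limit, not by the Parallel.ForEach settings.
        // Raising it allows more copy requests in flight at once.
        ServicePointManager.DefaultConnectionLimit = 100;

        // Skip the Expect: 100-continue handshake, which adds a round
        // trip to every request before the body is sent.
        ServicePointManager.Expect100Continue = false;

        // Nagle's algorithm delays the small requests that storage
        // operations typically issue.
        ServicePointManager.UseNagleAlgorithm = false;

        Console.WriteLine(ServicePointManager.DefaultConnectionLimit);
    }
}
```

These settings only affect the rate at which copy requests are issued from the client; since `BeginStartCopyFromBlob` starts a server-side copy, the copies themselves proceed on Azure's end regardless.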
 
