Note that there are some explanatory texts on larger screens.

plurals
  1. POWhat's the standard algorithm for syncing two lists of related objects?
    text
    copied!<p>I'm pretty sure this must be in some kind of text book (or more likely in all of them) but I seem to be using the wrong keywords to search for it... :(</p> <p>A recurring task I'm facing while programming is that I am dealing with lists of objects from different sources which I need to keep in sync somehow. Typically there's some sort of "master list" e.g. returned by some external API and then a list of objects I create myself each of which corresponds to an object in the master list (think "wrappers" or "adapters" - they typically contain extended information about the external objects specific to my application and/or they simplify access to the external objects).</p> <h2>Hard characteristics of all instances of the problem:</h2> <ul> <li>the implementation of the master list is hidden from me; its interface is fixed</li> <li>the elements in the two lists are not assignment-compatible</li> <li>I have full control over the implementation of the slave list</li> <li>I cannot control the order of elements in the master list (i.e. it's not sortable)</li> <li>the master list does either not provide notification about added or removed elements at all or notification is unreliable, i.e. the sync can only happen on-demand, not live</li> <li>simply clearing and rebuilding the slave list from scratch whenever it's needed is not an option: <ul> <li>initializing the wrapper objects should be considered expensive</li> <li>other objects will hold references to the wrappers</li> </ul></li> </ul> <h2>Additional characteristics in some instances of the problem:</h2> <ul> <li>elements in the master list can only be identified by reading their properties rather than accessing them directly by index or memory address: <ul> <li>after a refresh, the master list might return a completely new set of instances even though they still represent the same information</li> <li>the only interface for accessing elements in the master list might be a sequential enumerator</li> </ul></li> <li>most of the time, the order of elements in the master list is stable, i.e. new elements are always added either at the beginning or at the end, never in the middle; however, deletion can usually occur at any position</li> </ul> <p>So how would I typically tackle this? What's the name of the algorithm I should google for?</p> <p>In the past I have implemented this in various ways (see below for an example) but it always felt like there should be a cleaner and more efficient way, especially one that did not require two iterations (one over each list).</p> <p>Here's an example approach:</p> <ol> <li>Iterate over the master list</li> <li>Look up each item in the "slave list"</li> <li>Add items that do not yet exist</li> <li>Somehow keep track of items that already exist in both lists (e.g. by tagging them or keeping yet another list)</li> <li>When done, iterate over the slave list and remove all objects that have not been tagged (see 4.) and clear the tag again from all others</li> </ol> <hr> <p><strong>Update 1</strong> Thanks for all your responses so far! I will need some time to look at the links.<br> [...] <em>(text moved to main body of question)</em></p> <p><strong>Update 2</strong> Restructered the middle-paragraph into a (hopefully) more easily parseable bullet lists and incorporated details added later in the first update.</p>
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload