Note that there are some explanatory texts on larger screens.

plurals
  1. POManaging efficiently pages of data in memory
    primarykey
    data
    text
    <p>I'm looking for an appropriate structure to handle the following problem:</p> <ul> <li>An application is receiving (e.g. from a web server) pages Pi of data of variable sizes, for example a page P1 can contain 20 elements, P2 3, P3 20, P4 20, etc...</li> <li>Each page contain elements Tj with a globally unique decreasing Id j, for example P2=[T150, T149, T120]. In this example, P1 Tj element Ids will be strictly lower than 120 and P3 elements strictly greater than 150.</li> <li>This means that the i in Pi does not represent the network receiving order but rather the final page order, which is unknown when we receive the page and which can change when new pages are inserted.</li> </ul> <p>These pages can be received in any order. An example for a set of page P1..P10:</p> <ol> <li>First P3 then P2 then P1</li> <li>Then P6 then P5 then P4</li> <li>Then P10 then P9</li> <li>Then P8 then P7 (note that P10 and P9 would be the 8th and 9th page before insertion of these pages).</li> </ol> <p>I want to find a structure that allows me to do the following:</p> <ul> <li>Inserting new pages anywhere in the middle, end or beginning of the sequence of pages (for example inserting P8 and P7 between P9 and P6), thus according to the inner Tj elements. But I'm looking for a better complexity than O(n).</li> <li>Deleting pages would be nice too.</li> <li>The interesting part is querying: I want to be able to query according to an interval: for example from the 15th element to the 25th. In the example presented first, I would retrieve the 5 last elements of P1 + the 3 of P2 + the two first of P3. Of course, here too, I'm expecting a complexity better than O(n)...</li> </ul> <p>Basically, what I'm trying to achieve is storing in memory pages of tweets efficiently as I receiving them (<a href="https://dev.twitter.com/docs/working-with-timelines" rel="nofollow">Twitter timeline</a>). I could use of course an Array or Linked list but that would mean O(n) insertion and query time... And of course I need to be able to query items according to their "position" in the list in order to display them in a ListView.</p> <p>I thought about a few solutions but none of them are appropriate:</p> <ul> <li>First, Interval trees but they allow insertion and queries on the "same range" of elements, i.e. insertion on the "j" but query on the "j" but not the "i". Not sure I can append a kind of prefix sum to it based on the "i".</li> <li>What I though about is using a Fenwick tree to store cumulative sums of pages number of items, the i in Pi being the "position" in the tree which would represent the key to which a value Tj is associated. But Fenwwick trees are not appropriate to insert new elements... I was wondering if it would be possible to implement a Fenwick tree with a Red-Black tree but I'm not sure about it...</li> <li>Another solution maybe would be to get rid of pages and insert directly elements in a kind of B-tree I guess. But I'm a bit concerned with the speed if I want to insert a page with many elements at once.</li> </ul> <p>I hope my problem is clearly stated. Any idea about a possible efficient solution that scales well?</p> <p><strong>Edit:</strong> I want to query pages not on the internal item Ids (e.g. T140, T150 or anything else) but rather on the Element (i.e. Tweet) index. For example in my first example, T120 will be the 21st item (as there is a page P1 of 20 elements). So I want to be able to query an interval [20-29] which should return the elements [T120, ...]. I don't want to search for 120 directly.</p>
    singulars
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload