  HBase schema for storing timeseries user data
    <p>I'm currently prototyping a solution for storing users' location history in an HBase table. (Assume there are hundreds of millions of users.) Each user's trail of locations is stored in an HBase table. This trail of locations is then used by a few offline data analysis jobs.</p>
    <p>Following are the 2 main data access patterns:</p>
    <ol> <li><p>I should be able to scan through all or a subset of locations (based on a time range) of a specific user from the stored location trail.</p></li> <li><p>For offline data analysis, I should be able to scan through all locations of all users within a time range.</p></li> </ol>
    <p>Given the above requirements, I came up with the following row-key design:</p>
    <pre><code>&lt;uid&gt;_&lt;timestamp&gt; </code></pre>
    <p>where 'uid' represents the user ID and 'timestamp' represents the time at which the location was detected and saved.</p>
    <p>With this row-key design, achieving access pattern #1 is straightforward: the scan request can have a start key and an end key made of the specific uid with the start and end timestamps of the range appended.</p>
    <p>However, the tricky part is access pattern #2, for which I'm seeking help from the HBase experts. Since I need to scan all users, say for the last 6 months, I will end up not using any keys with the scan operation. This means scanning through the entire HBase table, which I feel is inefficient. Moreover, my data size is expected to grow quickly, with a write load of 2K/sec.</p>
    <p>I had a look at OpenTSDB, which many people in open forums pointed to, but I'm not able to relate that solution to my data access patterns.</p>
    <p>I'm looking for help in optimizing this schema so that the full table scan can be avoided.</p>
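    To make the two access patterns concrete, here is a minimal sketch in plain Python that mimics the sorted-key behaviour of an HBase table without using any HBase client. It assumes the timestamp is epoch milliseconds, zero-padded to a fixed width so that keys sort chronologically as bytes; the padding width and all helper names (`row_key`, `user_scan_bounds`, `scan`, `all_users_in_range`) are illustrative and not part of the design above.

```python
TS_WIDTH = 13  # assumption: epoch milliseconds, zero-padded to 13 digits


def row_key(uid: str, ts_ms: int) -> bytes:
    """Build a '<uid>_<timestamp>' row key with a fixed-width timestamp."""
    return f"{uid}_{ts_ms:0{TS_WIDTH}d}".encode("utf-8")


def user_scan_bounds(uid: str, start_ms: int, end_ms: int):
    """Bounds for access pattern #1: start key inclusive, stop key exclusive."""
    return row_key(uid, start_ms), row_key(uid, end_ms)


def scan(keys, start: bytes, stop: bytes):
    """Stand-in for a range scan over lexicographically sorted row keys."""
    return [k for k in sorted(keys) if start <= k < stop]


def all_users_in_range(keys, start_ms: int, end_ms: int):
    """Access pattern #2 with this key design: every row has to be examined,
    because the timestamp is not the leading part of the key."""
    matched, examined = [], 0
    for k in sorted(keys):
        examined += 1
        _uid, ts = k.decode("utf-8").rsplit("_", 1)
        if start_ms <= int(ts) < end_ms:
            matched.append(k)
    return matched, examined
```

    With a handful of keys, `scan(keys, *user_scan_bounds("u1", 0, 1500))` touches only the contiguous block of rows for `u1`, whereas `all_users_in_range` reports an `examined` count equal to the size of the whole table, which is the inefficiency described for access pattern #2.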