StackOverflow2013

How to design an HBase schema?
<p>Suppose that I have this RDBMS table (<a href="http://en.wikipedia.org/wiki/Entity-attribute-value_model" rel="nofollow noreferrer">Entity-attribute-value_model</a>):</p>
<pre><code>col1: entityID
col2: attributeName
col3: value
</code></pre>
<p>and I want to use HBase due to scaling issues.</p>
<p>I know that the only way to access an HBase table is by its primary key (cursor). You can get a cursor for a specific key and iterate over the rows one by one.</p>
<p>The issue is that, in my case, I want to be able to iterate on all 3 columns. For example:</p>
<ul>
<li>for a given entityID, I want to get all its attributes and values</li>
<li>for a given attributeName and value, I want to get all the entityIDs ...</li>
</ul>
<p>So one idea I had is to build one HBase table that will hold the data (table DATA, with entityID as the primary index), and 2 "index" tables: one with attributeName as the primary key, and the other with value as the primary key.</p>
<p>Each index table will hold a list of pointers (entityIDs) into the DATA table.</p>
<p>Is this a reasonable approach, or is it an 'abuse' of HBase concepts?</p>
<p>In this <a href="http://blog.rapleaf.com/dev/?p=26" rel="nofollow noreferrer">blog</a> the <a href="http://blog.rapleaf.com/dev/?author=7" rel="nofollow noreferrer">author</a> says:</p>
<blockquote> <p>HBase allows get operations by primary key and scans (think: cursor) over row ranges. (If you have both scale and need of secondary indexes, don’t worry - Lucene to the rescue! But that’s another post.)</p> </blockquote>
<p>Do you know how Lucene can help?</p>
<p>-- Yonatan</p>
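To make the proposed layout concrete, here is a minimal in-memory sketch of one DATA table plus two index tables. It does not use the HBase client API. Plain Python dictionaries stand in for tables keyed by row key, and the class and method names (`EavStore`, `put`, `attributes_of`, `entities_with`) are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


class EavStore:
    """In-memory stand-in for one DATA table plus two index tables."""

    def __init__(self):
        # DATA table: row key = entityID, columns = attributeName -> value.
        self.data = defaultdict(dict)
        # Index tables: row key -> set of entityIDs (pointers into DATA).
        self.by_attribute = defaultdict(set)
        self.by_value = defaultdict(set)

    def put(self, entity_id, attribute, value):
        # Every write touches all three tables. This sketch does not clean up
        # index entries when a value is overwritten; stale entries are
        # filtered out at read time in entities_with().
        self.data[entity_id][attribute] = value
        self.by_attribute[attribute].add(entity_id)
        self.by_value[value].add(entity_id)

    def attributes_of(self, entity_id):
        # Query 1: a single lookup by the DATA table's row key.
        return dict(self.data.get(entity_id, {}))

    def entities_with(self, attribute, value):
        # Query 2: intersect the two index rows, then verify each candidate
        # against DATA, because an entity can carry this attribute with a
        # different value and this value under a different attribute.
        candidates = (self.by_attribute.get(attribute, set())
                      & self.by_value.get(value, set()))
        return sorted(e for e in candidates
                      if self.data[e].get(attribute) == value)
```

Note that the second query needs the verification step: with two independent index tables, the intersection alone can return entities that only match the attribute and the value separately. In a real deployment the three writes in `put` would also be separate operations, so keeping the indexes consistent with DATA becomes the application's job.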
