HBase scan with compare filters has long delay when returning last row
<p>I have HBase running in standalone mode and have run into a problem when querying tables through the Java API. The table has several million entries (and might grow to billions), with row keys in the following format:</p>
<pre><code>&lt;UUID&gt;-&lt;Tag&gt;-&lt;Timestamp&gt;
</code></pre>
<p>I use two compare-operation filters to query a specific row range that represents a time interval.</p>
<pre><code>import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.BinaryComparator;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.FilterList;
import org.apache.hadoop.hbase.filter.RowFilter;

Scan scan = new Scan();
// Upper bound (exclusive) of the row range
RowFilter upperRowFilter = new RowFilter(CompareOp.LESS,
    new BinaryComparator((securityId + eventType + intervalEnd).getBytes()));
// Lower bound (inclusive) of the row range
RowFilter lowerRowFilter = new RowFilter(CompareOp.GREATER_OR_EQUAL,
    new BinaryComparator((securityId + eventType + intervalStart).getBytes()));
FilterList filterList = new FilterList();
filterList.addFilter(lowerRowFilter);
filterList.addFilter(upperRowFilter);
scan.setFilter(filterList);
ResultScanner scanner = table.getScanner(scan);
Result result = scanner.next();
</code></pre>
<p>When I call ResultScanner#next(), everything works fine until the scanner reaches the last row of the key range specified by the filters. It takes up to 40 seconds until the ResultScanner returns that last row, which is lexicographically smaller than the upper limit of the row range.</p>
<p>When I change the order of the filters in the filterList from</p>
<pre><code>filterList.addFilter(lowerRowFilter);
filterList.addFilter(upperRowFilter);
</code></pre>
<p>to</p>
<pre><code>filterList.addFilter(upperRowFilter);
filterList.addFilter(lowerRowFilter);
</code></pre>
<p>the scanner takes up to 40 seconds before it starts to return any results, but there is no longer a delay on the last row. From this I concluded that the delay comes from the CompareOp.LESS filter.</p>
<p>The only workaround I know of is to omit the upperRowFilter and check manually whether the row keys are out of range. I am sure something must be wrong, though, because I found nothing about this problem when searching the internet.</p>
<p>I have also tried to get rid of the delay with scanner caching. With a cache size smaller than the number of rows returned, nothing changes. With a cache size larger than the number of rows returned, the delay is still there, but again it occurs before any results are returned.</p>
<p>Do you have any idea what could cause this kind of behaviour? Am I doing something wrong, or is there something I'm missing?</p>
<p>Thanks in advance!</p>
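<p>The manual range check mentioned above as a workaround can be sketched in plain Java, without the HBase client. This is only an illustration under stated assumptions: a sorted in-memory map stands in for the table, the class and method names (RangeScanSketch, scanWithFilters, scanWithManualStop) are hypothetical, and the filter-only variant assumes that row filters are evaluated against every row the scanner visits rather than ending the scan once the upper bound is passed.</p>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical stand-in for a table scan; not HBase API.
class RangeScanSketch {
    // Number of rows the most recent scan looked at.
    static int rowsExamined;

    // Filter-only variant (assumption: every row is tested against both bounds,
    // so the loop keeps going after the last matching row).
    static List<String> scanWithFilters(SortedMap<String, String> table,
                                        String lower, String upper) {
        rowsExamined = 0;
        List<String> hits = new ArrayList<>();
        for (String key : table.keySet()) {
            rowsExamined++;
            if (key.compareTo(lower) >= 0 && key.compareTo(upper) < 0) {
                hits.add(key);
            }
        }
        return hits;
    }

    // Workaround variant: keep the lower-bound check, drop the upper-bound
    // filter, and stop as soon as a key at or beyond the upper bound appears.
    static List<String> scanWithManualStop(SortedMap<String, String> table,
                                           String lower, String upper) {
        rowsExamined = 0;
        List<String> hits = new ArrayList<>();
        for (String key : table.keySet()) {
            rowsExamined++;
            if (key.compareTo(lower) < 0) {
                continue; // still before the range
            }
            if (key.compareTo(upper) >= 0) {
                break;    // keys are sorted, so nothing later can match
            }
            hits.add(key);
        }
        return hits;
    }

    public static void main(String[] args) {
        // ASCII keys, so String ordering matches byte-wise ordering here.
        SortedMap<String, String> table = new TreeMap<>();
        for (String k : new String[] {"a-tag-100", "a-tag-150", "a-tag-200",
                                      "b-tag-100", "c-tag-100"}) {
            table.put(k, "value");
        }
        System.out.println(scanWithFilters(table, "a-tag-100", "a-tag-200")
                + " examined=" + rowsExamined);
        System.out.println(scanWithManualStop(table, "a-tag-100", "a-tag-200")
                + " examined=" + rowsExamined);
    }
}
```

<p>Both variants return the same rows. They differ only in how many rows are examined after the last match, which is where the delay described in the question would show up if the assumption holds.</p>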