<h1>Text.getBytes() returns unexpected results</h1>
<p>I'm getting some behavior from the Text constructors that doesn't make any sense. If I construct a Text object from a String, it is not equal to another Text object that I constructed from bytes, even though getBytes() returns the same value for both objects.</p>
<p>So we get weird stuff like this:</p>
<pre><code>//This succeeds
assertEquals(new Text("ACTACGACCA_0"), new Text("ACTACGACCA_0"));

//This succeeds
assertEquals((new Text("ACTACGACCA_0")).getBytes(), (new Text("ACTACGACCA_0")).getBytes());

//This fails. Why?
assertEquals(new Text((new Text("ACTACGACCA_0")).getBytes()), new Text("ACTACGACCA_0"));
</code></pre>
<p>This manifests when I try to access a hashmap. Here, I'm doing a lookup based on a value returned by org.apache.hadoop.hbase.KeyValue.getRow():</p>
<pre><code>//This succeeds
assertEquals((new Text("ACTACGACCA_0")).getBytes(), keyValue.getRow());

//This returns a value
hashMap.get(new Text("ACTACGACCA_0"));

//This returns null. Why?
hashMap.get(new Text(keyValue.getRow()));
</code></pre>
<p>So what's going on here, and how do I deal with it? Does this have something to do with encoding?</p>
<h2>UPDATE : PROBLEM SOLVED</h2>
<p>Thanks to Chris for pointing me in the right direction with this. A little background: the keyValue object is captured (using a Mockito ArgumentCaptor) from a call to hTable.put(). I had this chunk of code:</p>
<pre><code>byte[] keyBytes = matchRow.getKey().getBytes();
RowLock rowLock = hTable.lockRow(keyBytes);
Get get = new Get(keyBytes, rowLock);

SetWritable&lt;Text&gt; toWrite = new SetWritable&lt;Text&gt;(Text.class);
toWrite.getValues().addAll(matchRow.getMatches(hTable, get));

Put put = new Put(keyBytes, rowLock);
put.add(Bytes.toBytes(MatchesByHaplotype.MATCHING_COLUMN_FAMILY),
        Bytes.toBytes(MatchesByHaplotype.UID_QUALIFIER),
        SERIALIZATION_HELPER.serialize(toWrite));
hTable.put(put);
</code></pre>
<p>where matchRow.getKey() returns a Text object.
You see the problem here? Text.getBytes() returns the whole backing array, and only the first getLength() bytes of it are valid. I was adding all the bytes, <strong>including the invalid ones.</strong> So I created a helper function that does this:</p>
<pre><code>public byte[] getValidBytes(Text text) {
    // Copy only the first getLength() bytes of the backing array
    return Arrays.copyOf(text.getBytes(), text.getLength());
}
</code></pre>
<p>And changed the first line of that block to this:</p>
<pre><code>byte[] keyBytes = SERIALIZATION_HELPER.getValidBytes(matchRow.getKey());
</code></pre>
<p>Problem solved! In retrospect: wow, what a nasty bug! I think what it comes down to is that the behavior of Text.getBytes() is very n00b-unfriendly. Not only does it return something you may not expect (invalid bytes), the Text object doesn't have a function to return only the valid bytes! You would think this would be a common use case. Maybe they'll add this in the future?</p>
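<p>For anyone who wants to see the effect without a Hadoop setup, here is a minimal sketch of the same pitfall. BufferedText is a hypothetical stand-in I made up for illustration, not Hadoop's Text, and the four spare bytes in its backing array are an assumption chosen to make the mismatch visible. It only mirrors the behavior described above: getBytes() hands back the whole backing array, while getLength() says how much of it is valid.</p>

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical stand-in for a Text-like class; NOT Hadoop's implementation.
final class BufferedText {
    private final byte[] bytes;
    private final int length;

    BufferedText(String s) {
        byte[] encoded = s.getBytes(StandardCharsets.UTF_8);
        // Assumption for illustration: the backing array is over-allocated.
        bytes = Arrays.copyOf(encoded, encoded.length + 4);
        length = encoded.length;
    }

    BufferedText(byte[] raw) {
        // Treats every byte of the input as valid, padding included.
        bytes = Arrays.copyOf(raw, raw.length);
        length = raw.length;
    }

    byte[] getBytes() { return bytes; }   // whole backing array
    int getLength() { return length; }    // number of valid bytes

    // Same idea as the getValidBytes helper: keep only the valid prefix.
    static byte[] validBytes(BufferedText t) {
        return Arrays.copyOf(t.getBytes(), t.getLength());
    }

    @Override public boolean equals(Object o) {
        if (!(o instanceof BufferedText)) return false;
        return Arrays.equals(validBytes(this), validBytes((BufferedText) o));
    }

    @Override public int hashCode() {
        return Arrays.hashCode(validBytes(this));
    }
}

final class BufferDemo {
    public static void main(String[] args) {
        BufferedText original = new BufferedText("ACTACGACCA_0");
        BufferedText fromRaw = new BufferedText(original.getBytes());
        BufferedText fromValid = new BufferedText(BufferedText.validBytes(original));

        System.out.println(original.equals(fromRaw));   // false: padding became part of the key
        System.out.println(original.equals(fromValid)); // true: only the valid bytes were copied
    }
}
```

<p>As far as I know, newer Hadoop releases also provide Text.copyBytes(), which returns a copy trimmed to the valid length, so check your version before writing your own helper.</p>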