`SequenceFile` is a key/value pair file format implemented in Hadoop. Even though HBase uses `SequenceFile` to store its write-ahead logs, `SequenceFile`'s block compression implementation is not used there.

The `Compression` class is part of Hadoop's compression framework and as such is used in HBase's HFile block compression.

HBase already has built-in compression of the following types:

- **HFile block compression on disk.** This uses Hadoop's codec framework and supports compression algorithms such as LZO, GZIP, and SNAPPY. This type of compression is applied only to HFile blocks stored on disk, because the whole block has to be decompressed to retrieve key/value pairs. It is configured per column family (see the sketch below).
- **In-cache key compression**, called "data block encoding" in HBase terminology; see [HBASE-4218](https://issues.apache.org/jira/browse/HBASE-4218). The implemented encoding algorithms include various types of prefix and delta encoding, and trie encoding is being implemented as of this writing ([HBASE-4676](https://issues.apache.org/jira/browse/HBASE-4676)). Data block encoding algorithms exploit the redundancy between sorted keys in an HFile block and store only the differences between consecutive keys. They currently do not touch values, so they are mostly useful when values are small relative to keys, e.g. counters. Because these algorithms are lightweight, it is possible to efficiently decode only the necessary part of a block to retrieve the requested key or advance to the next one, which is why they are good for improving cache efficiency. On some real-world datasets, however, delta encoding also saves up to 50% on top of LZO compression (i.e. applying delta encoding and then LZO vs. LZO alone), achieving significant savings on disk as well.
- **A custom dictionary-based write-ahead log compression**, implemented in [HBASE-4608](https://issues.apache.org/jira/browse/HBASE-4608). Note: even though `SequenceFile` is used for write-ahead log storage in HBase, `SequenceFile`'s built-in block compression cannot be used for the write-ahead log, because buffering key/value pairs for block compression would cause data loss. (A configuration sketch for enabling WAL compression also follows below.)

HBase RPC compression is a work in progress. As you mentioned, compressing the key/value pairs passed between client and HBase can save bandwidth and improve HBase performance. This has been implemented in Facebook's version of HBase, 0.89-fb ([HBASE-5355](https://issues.apache.org/jira/browse/HBASE-5355)), but has yet to be ported to the official Apache HBase trunk. The RPC compression algorithms supported in HBase 0.89-fb are the same as those supported by the Hadoop compression framework (e.g. GZIP and LZO).

The `setCompressedMapOutput` method is a MapReduce configuration method and is not really relevant to HBase compression.
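Since the first two mechanisms are configured per column family, here is a minimal sketch of turning both on through the Java client. It is written against the 0.94-era API (class and package names moved around in later releases), and the table name `t1` and family name `d` are made-up placeholders:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.io.encoding.DataBlockEncoding;
import org.apache.hadoop.hbase.io.hfile.Compression;

public class CompressionSetup {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);

        // "d" is a hypothetical column family name.
        HColumnDescriptor family = new HColumnDescriptor("d");

        // On-disk HFile block compression via Hadoop's codec framework.
        family.setCompressionType(Compression.Algorithm.GZ);

        // In-cache key compression ("data block encoding", HBASE-4218).
        family.setDataBlockEncoding(DataBlockEncoding.PREFIX);

        // "t1" is a hypothetical table name.
        HTableDescriptor table = new HTableDescriptor("t1");
        table.addFamily(family);
        admin.createTable(table);
        admin.close();
    }
}
```

The same attributes can also be set from the HBase shell when creating the table, e.g. `create 't1', {NAME => 'd', COMPRESSION => 'GZ', DATA_BLOCK_ENCODING => 'PREFIX'}`.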
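The dictionary-based WAL compression from HBASE-4608, by contrast, is a cluster-wide switch rather than a per-family attribute. To the best of my knowledge it is controlled by the `hbase.regionserver.wal.enablecompression` property, normally set in `hbase-site.xml` on every region server; the snippet below only demonstrates the property name and type programmatically:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class WalCompressionSwitch {
    public static void main(String[] args) {
        Configuration conf = HBaseConfiguration.create();

        // In practice this belongs in hbase-site.xml; it is set here
        // only to show the property name and its boolean type.
        conf.setBoolean("hbase.regionserver.wal.enablecompression", true);

        System.out.println("WAL compression enabled: "
                + conf.getBoolean("hbase.regionserver.wal.enablecompression", false));
    }
}
```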