StackOverflow2013

Bitcask ok for simple and high performant file store?
<p>I am looking for a simple way to store and retrieve millions of XML files. Currently everything is done in a filesystem, which has some performance issues.</p> <p>Our requirements are:</p> <ol> <li>Ability to store millions of XML files in a batch process. XML files may be up to a few MB in size, with most in the 100 KB range.</li> <li>Very fast random lookup by id (e.g. document URL)</li> <li>Accessible from both Java and Perl</li> <li>Available on the most important Linux distros and Windows</li> </ol> <p>I did have a look at several NoSQL platforms (e.g. CouchDB, <a href="http://wiki.basho.com/" rel="noreferrer">Riak</a> and others), and while those systems look great, they seem almost like overkill for us:</p> <ol> <li>No clustering required</li> <li>No daemon ("service") required</li> <li>No clever search functionality required</li> </ol> <p>Having delved deeper into Riak, I have found Bitcask (see <a href="http://downloads.basho.com/papers/bitcask-intro.pdf" rel="noreferrer">intro</a>), which seems like exactly what I want. The basics described in the intro are really intriguing. But unfortunately there is no means to access a Bitcask repo via Java (or is there?)</p> <p>So my question boils down to:</p> <ul> <li>Is the following assumption right: the Bitcask model (append-only writes, in-memory key management) is the right way to store/retrieve millions of documents?</li> <li>Are there any viable alternatives to Bitcask available via Java? (Berkeley DB comes to mind...)</li> <li>(For Riak specialists) Does Riak add much overhead in terms of implementation, management, and resources compared to "naked" Bitcask?</li> </ul>
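<p>To make clear what I mean by "append-only writes, in-memory key management", here is a minimal Java sketch of the idea as I understand it. It is not Bitcask's actual file format or API: the class name <code>AppendOnlyStore</code> and the record layout (key length, value length, key, value) are made up for illustration, and it leaves out checksums, compaction, thread safety, and rebuilding the index when an existing file is reopened.</p>

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch only; hypothetical names, not Bitcask's real format or API. */
class AppendOnlyStore implements AutoCloseable {
    // In-memory index: key -> { offset of the value in the log, value length }
    private final Map<String, long[]> index = new HashMap<>();
    private final RandomAccessFile log;

    AppendOnlyStore(Path file) throws IOException {
        // Assumption: the index starts empty; records already in the file are not scanned.
        log = new RandomAccessFile(file.toFile(), "rw");
    }

    /** Appends a record to the end of the log and points the index at the new value. */
    void put(String key, byte[] value) throws IOException {
        byte[] k = key.getBytes(StandardCharsets.UTF_8);
        log.seek(log.length());          // always write at the end, never in place
        log.writeInt(k.length);
        log.writeInt(value.length);
        log.write(k);
        long valueOffset = log.getFilePointer();
        log.write(value);
        // An older record for the same key stays in the file but is no longer referenced.
        index.put(key, new long[] {valueOffset, value.length});
    }

    /** Looks up the key in memory, then reads the value with a single seek. */
    byte[] get(String key) throws IOException {
        long[] entry = index.get(key);
        if (entry == null) {
            return null;
        }
        byte[] value = new byte[(int) entry[1]];
        log.seek(entry[0]);
        log.readFully(value);
        return value;
    }

    @Override
    public void close() throws IOException {
        log.close();
    }
}
```

<p>In this sketch a lookup costs one hash map access plus one disk seek, which is the property I am after. The price is that every key has to fit in memory.</p>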
