Reliable and efficient key--value database for Linux?
    <p>I need a fast, reliable and memory-efficient key--value database for Linux. My keys are about 128 bytes, and the maximum value size can be 128K or 256K. The database subsystem shouldn't use more than about 1 MB of RAM. The total database size is 20G (!), but only a small random fraction of the data is accessed at a time. If necessary, I can move some data blobs out of the database (to regular files), so the size gets down to 2 GB maximum. The database must survive a system crash without any loss in recently unmodified data. I'll have about 100 times more reads than writes. It is a plus if it can use a block device (without a filesystem) as storage. I don't need client-server functionality, just a library. I need Python bindings (but I can implement them if they are not available).</p> <p>Which solutions should I consider, and which one do you recommend?</p> <p>Candidates I know of which could work:</p> <ul> <li><a href="http://fallabs.com/tokyocabinet/" rel="nofollow noreferrer">Tokyo Cabinet</a> (Python bindings are <a href="http://pypi.python.org/pypi/pytc" rel="nofollow noreferrer">pytc</a>, see also <a href="http://github.com/turian/pytc-example/blob/master/hashdb.py" rel="nofollow noreferrer">pytc example code</a>, supports hashes and B+trees, transaction log files and more, the size of the bucket array is fixed at database creation time; the writer must close the file to give others a chance; lots of small writes with reopening the file for each of them are very slow; the Tyrant server can help with the lots of small writes; <a href="http://michael.susens-schurter.com/tokyotalk/tokyotalk.html" rel="nofollow noreferrer">speed comparison between Tokyo Cabinet, Tokyo Tyrant and Berkeley DB</a>)</li> <li><a href="http://repetae.net/computer/vsdb/" rel="nofollow noreferrer">VSDB</a> (safe even on NFS, without locking; what about barriers?; updates are very slow, but not as slow as in cdb; last version in 2003)</li> <li><a 
href="http://en.wikipedia.org/wiki/Berkeley_DB" rel="nofollow noreferrer">BerkeleyDB</a> (provides crash recovery; provides transactions; the <code>bsddb</code> Python module provides bindings)</li> <li><a href="http://tdb.samba.org/" rel="nofollow noreferrer">Samba's TDB</a> (with transactions and Python bindings, some users <a href="http://lists.samba.org/archive/samba/2009-January/145793.html" rel="nofollow noreferrer">experienced corruption</a>, sometimes <code>mmap()</code>s the whole file, the <code>repack</code> operation sometimes doubles the file size, produces mysterious failures if the database is larger than 2G (even on 64-bit systems), cluster implementation (<a href="http://ctdb.samba.org/" rel="nofollow noreferrer">CTDB</a>) also available; file grows too large after lots of modifications; file becomes too slow after lots of hash contention; no built-in way to rebuild the file; very fast parallel updates by locking individual hash buckets)</li> <li><a href="https://sourceforge.net/projects/aodbm/" rel="nofollow noreferrer">aodbm</a> (append-only so survives a system crash, with Python bindings)</li> <li><a href="http://www.hamsterdb.com/about/features" rel="nofollow noreferrer">hamsterdb</a> (with Python bindings)</li> <li><a href="http://en.wikipedia.org/wiki/C-tree" rel="nofollow noreferrer">C-tree</a> (mature, versatile commercial solution with high performance, has a free edition with reduced functionality)</li> <li>the old <a href="http://sourceforge.net/projects/tdb/" rel="nofollow noreferrer">TDB</a> (from 2001)</li> <li><a href="https://bitbucket.org/basho/bitcask" rel="nofollow noreferrer">bitcask</a> (log-structured, written in Erlang)</li> <li>various other DBM implementations (such as GDBM, NDBM, QDBM, Perl's SDBM or Ruby's; they probably don't have proper crash recovery)</li> </ul> <p>I won't use these:</p> <ul> <li><a href="http://memcachedb.org/" rel="nofollow noreferrer">MemcacheDB</a> (client-server, uses BerkeleyDB as a
backend)</li> <li><a href="http://cr.yp.to/cdb.html" rel="nofollow noreferrer">cdb</a> (needs to regenerate the whole database upon each write)</li> <li><a href="http://www.wildsparx.com/apbcdb/" rel="nofollow noreferrer">http://www.wildsparx.com/apbcdb/</a> (ditto)</li> <li><a href="http://code.google.com/p/redis/" rel="nofollow noreferrer">Redis</a> (keeps the whole database in memory)</li> <li><a href="http://www.sqlite.org/" rel="nofollow noreferrer">SQLite</a> (it becomes very slow without periodic vacuuming, see autocompletion in the location bar in Firefox 3.0, even though SQLite versions 3.1 and later allow <code>auto_vacuum</code>ing; beware: small write transactions can be very slow; beware: if a busy process is doing many transactions, other processes starve, and they can never get the lock)</li> <li><a href="http://www.mongodb.org/" rel="nofollow noreferrer">MongoDB</a> (too heavy-weight, treats values as objects with internal structure)</li> <li><a href="http://www.firebirdsql.org/index.php?id=about-firebird" rel="nofollow noreferrer">Firebird</a> (SQL-based RDBMS, too heavy-weight)</li> </ul> <p>FYI, a <a href="http://www.linux-mag.com/cache/7579/1.html" rel="nofollow noreferrer">recent article about key--value databases</a> in Linux Magazine.</p> <p>FYI, an <a href="http://linuxfinances.info/info/dbmsisam.html" rel="nofollow noreferrer">older software list</a>.</p> <p>FYI, a <a href="http://timyang.net/data/mcdb-tt-redis/" rel="nofollow noreferrer">speed comparison of MemcacheDB, Redis and Tokyo Cabinet Tyrant</a>.</p> <p>Related questions on StackOverflow:</p> <ul> <li><a href="https://stackoverflow.com/questions/775474/key-value-database-for-windows">Key Value Database For Windows?</a></li> <li><a href="https://stackoverflow.com/questions/639545/is-there-a-business-proven-cloud-store-keyvalue-database-open-source">Is there a business proven cloud store / Key=&gt;Value Database? (Open Source)</a></li> </ul>
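<p>To make the access pattern concrete, here is a rough sketch of the kind of library-style interface described above (byte keys of about 128 bytes, values up to 256K, many more reads than writes). It uses Python's standard <code>dbm.dumb</code> module purely as a placeholder backend: it is not one of the candidates and does not meet the memory or crash-safety requirements. The <code>KeyValueStore</code> class, its method names and the <code>demo</code> helper are hypothetical and only illustrate the shape of the API.</p>

```python
import dbm.dumb
import os
import tempfile

KEY_SIZE = 128                # keys are about 128 bytes
MAX_VALUE_SIZE = 256 * 1024   # values are at most 256K


class KeyValueStore:
    """Hypothetical wrapper; dbm.dumb is only a stand-in backend."""

    def __init__(self, path):
        # "c" opens the database for reading and writing, creating it if needed.
        self._db = dbm.dumb.open(path, "c")

    def put(self, key: bytes, value: bytes) -> None:
        if len(key) > KEY_SIZE:
            raise ValueError("key too long")
        if len(value) > MAX_VALUE_SIZE:
            raise ValueError("value too large")
        self._db[key] = value
        # Writes the index file out. This is NOT an fsync, so it does not
        # give the crash safety the question asks for.
        self._db.sync()

    def get(self, key: bytes, default=None):
        return self._db.get(key, default)

    def close(self) -> None:
        self._db.close()


def demo():
    """One write followed by 100 reads, mirroring the stated read/write ratio."""
    with tempfile.TemporaryDirectory() as tmp:
        store = KeyValueStore(os.path.join(tmp, "example"))
        key = b"k" * KEY_SIZE
        store.put(key, b"v" * 1000)
        values = [store.get(key) for _ in range(100)]
        store.close()
        return len(values), len(values[0])
```

<p>A real candidate would replace the <code>dbm.dumb</code> calls with its own bindings; the wrapper only pins down the operations the application actually needs.</p>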