Note that there are some explanatory texts on larger screens.

plurals
  1. POWhat are the options for storing hierarchical data in a relational database?
    primarykey
    data
    text
    <p><strong>Good Overviews</strong></p> <p>Generally speaking, you're making a decision between fast read times (for example, nested set) or fast write times (adjacency list). Usually, you end up with a combination of the options below that best fit your needs. The following provides some in-depth reading:</p> <ul> <li><a href="http://vadimtropashko.wordpress.com/2008/08/09/one-more-nested-intervals-vs-adjacency-list-comparison/" rel="noreferrer">One more Nested Intervals vs. Adjacency List comparison</a>: <em>the best comparison</em> of Adjacency List, Materialized Path, Nested Set and Nested Interval I've found.</li> <li><a href="http://www.slideshare.net/billkarwin/models-for-hierarchical-data" rel="noreferrer">Models for hierarchical data</a>: slides with good explanations of tradeoffs and example usage</li> <li><a href="http://mikehillyer.com/articles/managing-hierarchical-data-in-mysql/" rel="noreferrer">Representing hierarchies in MySQL</a>: very good overview of Nested Set in particular</li> <li><a href="http://troels.arvin.dk/db/rdbms/links/#hierarchical" rel="noreferrer">Hierarchical data in RDBMSs</a>: most comprehensive and well-organized set of links I've seen, but not much in the way of explanation</li> </ul> <p><strong>Options</strong></p> <p>Ones I am aware of and general features:</p> <ol> <li><a href="http://en.wikipedia.org/wiki/Adjacency_list" rel="noreferrer">Adjacency List</a>: <ul> <li>Columns: ID, ParentID</li> <li>Easy to implement.</li> <li>Cheap node moves, inserts, and deletes.</li> <li>Expensive to find the level, ancestry &amp; descendants, path</li> <li>Avoid N+1 via <a href="http://en.wikipedia.org/wiki/Common_table_expressions" rel="noreferrer">Common Table Expressions</a> in databases that support them</li> </ul></li> <li><a href="http://en.wikipedia.org/wiki/Nested_set_model" rel="noreferrer">Nested Set</a> (a.k.a <a href="https://www.caktusgroup.com/blog/2016/01/04/modified-preorder-tree-traversal-django/" rel="noreferrer">Modified Preorder Tree Traversal</a>) <ul> <li>Columns: Left, Right</li> <li>Cheap ancestry, descendants</li> <li>Very expensive <code>O(n/2)</code> moves, inserts, deletes due to volatile encoding</li> </ul></li> <li><a href="https://www.informationweek.com/software/information-management/kimball-university-five-alternatives-for-better-employee-dimension-modeling/d/d-id/1082326" rel="noreferrer">Bridge Table</a> (a.k.a. <a href="http://dirtsimple.org/2010/11/simplest-way-to-do-tree-based-queries.html" rel="noreferrer">Closure Table /w triggers</a>) <ul> <li>Uses separate join table with: ancestor, descendant, depth (optional)</li> <li>Cheap ancestry and descendants</li> <li>Writes costs <code>O(log n)</code> (size of subtree) for insert, updates, deletes</li> <li>Normalized encoding: good for RDBMS statistics &amp; query planner in joins</li> <li>Requires multiple rows per node</li> </ul></li> <li><a href="https://web.archive.org/web/20150909165001/http://www.ferdychristant.com/blog//articles/DOMM-7QJPM7" rel="noreferrer">Lineage Column</a> (a.k.a. <a href="http://communities.bmc.com/communities/docs/DOC-9902" rel="noreferrer">Materialized Path</a>, Path Enumeration) <ul> <li>Column: lineage (e.g. /parent/child/grandchild/etc...)</li> <li>Cheap descendants via prefix query (e.g. <code>LEFT(lineage, #) = '/enumerated/path'</code>)</li> <li>Writes costs <code>O(log n)</code> (size of subtree) for insert, updates, deletes</li> <li>Non-relational: relies on Array datatype or serialized string format</li> </ul></li> <li><a href="http://communities.bmc.com/communities/docs/DOC-9902" rel="noreferrer">Nested Intervals</a> <ul> <li>Like nested set, but with real/float/decimal so that the encoding isn't volatile (inexpensive move/insert/delete)</li> <li>Has real/float/decimal representation/precision issues</li> <li><a href="http://vadimtropashko.files.wordpress.com/2011/07/ch5.pdf" rel="noreferrer">Matrix encoding variant</a> adds ancestor encoding (materialized path) for "free", but with added trickiness of linear algebra.</li> </ul></li> <li><a href="http://evolt.org/node/4047" rel="noreferrer">Flat Table</a> <ul> <li>A modified Adjacency List that adds a Level and Rank (e.g. ordering) column to each record.</li> <li>Cheap to iterate/paginate over</li> <li>Expensive move and delete</li> <li>Good Use: threaded discussion - forums / blog comments</li> </ul></li> <li><a href="https://stackoverflow.com/a/6802687/684229">Multiple lineage columns</a> <ul> <li>Columns: one for each lineage level, refers to all the parents up to the root, levels down from the item's level are set to NULL</li> <li>Cheap ancestors, descendants, level</li> <li>Cheap insert, delete, move of the leaves</li> <li>Expensive insert, delete, move of the internal nodes</li> <li>Hard limit to how deep the hierarchy can be</li> </ul></li> </ol> <p><strong>Database Specific Notes</strong></p> <p><em>MySQL</em></p> <ul> <li><a href="http://explainextended.com/2009/09/29/adjacency-list-vs-nested-sets-mysql/" rel="noreferrer">Use session variables for Adjacency List</a></li> </ul> <p><em>Oracle</em></p> <ul> <li>Use <a href="http://www.ypl.com/oracle/sql/hierarchical_queries/html_deep/index.html" rel="noreferrer">CONNECT BY</a> to traverse Adjacency Lists</li> </ul> <p><em>PostgreSQL</em></p> <ul> <li><a href="http://www.postgresql.org/docs/current/static/ltree.html" rel="noreferrer">ltree datatype</a> for Materialized Path</li> </ul> <p><em>SQL Server</em></p> <ul> <li><a href="http://msdn.microsoft.com/en-us/magazine/cc794278.aspx" rel="noreferrer">General summary</a></li> <li>2008 offers <a href="http://msdn.microsoft.com/en-us/library/bb677290.aspx" rel="noreferrer">HierarchyId</a> data type appears to help with Lineage Column approach and expand the depth that can be represented.</li> </ul>
    singulars
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload