Note that there are some explanatory texts on larger screens.

plurals
  1. POHow to efficiently implement graph with a lot of big complete subgraphs?
    text
    copied!<p>I'm currently dealing with a performance issue of graph implementation.</p> <h2>Used technologies</h2> <p>It is programmed in C++. For the moment, the graph is implemented thanks to the BGL.</p> <h2>About the graph</h2> <p>The managed graph is dynamic and undirected. It has two kinds of structures: a lot of complete subgraphs and few of single edges. The only needed information is the direct neighborhood of a vertex.</p> <h2>Problem statement</h2> <p>At the beginning, the complete subgraphs were small (about 10 vertices) and numerous (about 13k). An adjacency list implementation, the BGL's one, was perfect. But now, it is asked to manage few subgraph of 5000 vertices. That means 5000x5000 edges. The performance in time and space are then very poor now.</p> <h2>Rejected solutions</h2> <p>My first thought was to use the adjacency matrix implementation provided by BGL. But it doesn't allow dynamic graph. To resolve this constraint, two solutions: provide a new implementation of adjacency matrix for dynamic graph or use a pool of available vertices in a static graph. After reflection, I think it's not a good idea: the space complexity is still VxV/2.</p> <h2>Final Solution and question</h2> <p>So, here my final solution: don't use the BGL, implement bags of vertices to represent complete subgraphs (no need of edges) and directly connect vertices for the few single edges. By doing so, the space complexity of the biggest subgraph falls to its number of vertices, about 5000.</p> <ol> <li>Do you think this last solution is the good one?</li> <li>If not, which implementation could I use? And why?</li> </ol> <h3>Update 1</h3> <p>More information about the graph: The graph has ~100k vertices, ~13k complete subgraphs of about 3 vertices, and ~100 complete subgraphs of size range [10-5000]. And each edge has bundled properties.</p> <h3>Update 2</h3> <p>I've recently learned thanks to <a href="http://www.loria.fr/~jouilisa/" rel="nofollow">Salim Jouilli</a> that the bag of nodes is a candid approach of <a href="http://en.wikipedia.org/wiki/Hypergraph" rel="nofollow">hypergraph</a> where a hyperedge consists in a subset of nodes. </p> <h3>Update 3</h3> <p>I've finished to implement the solution. I've effectively gain in memory consumption and runtime: from 6GB to 24MB and from 50min to 2min30.</p>
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload