Slow performing Cypher query that creates nodes to group existing nodes by property values
    <p>I have a performance issue with a modifying Cypher query. There is an origin node that has a huge number of outgoing relationships to child nodes. These child nodes all have a key property. The goal is to create new nodes between the origin and the child nodes that group all child nodes sharing the same value of the key property. A plot of that idea can be found at the neo4j console: <a href="http://console.neo4j.org/?id=vinntj" rel="nofollow">http://console.neo4j.org/?id=vinntj</a></p> <p>I use the query together with spring-data-neo4j 2.2.2.RELEASE and neo4j 1.9.2 embedded. The parameter for that query must be a node id and the result of that query should be the modified root node.</p> <p>The query currently looks like this (a bit more complex than in the linked neo4j console):</p> <pre><code>START root=node({0}) MATCH (root)-[r:LEAF]-&gt;(child) SET root.__type__='my.GroupedRoot' DELETE r WITH child.`custom-GROUP` AS groupingKey, root AS origin, child AS leaf CREATE UNIQUE (origin)-[:GROUP]-&gt;(group{__type__:'my.Group',key:'GROUP',value:groupingKey,origin:ID(origin)})-[:LEAF]-&gt;(leaf) RETURN DISTINCT origin </code></pre> <p>The property custom-GROUP is the key to group by. In SDN it is represented by a DynamicProperties object. I annotated it to be indexed, as well as the groupingKey and origin properties of the created group node.</p> <p>With 5000 child nodes it takes ~50sec to group them. For 10000 nodes ~90sec. For 20000 nodes ~380s and for 30000 nodes &gt; 50min! This looks much worse than O(n) scaling to me: the time more than quadruples when the node count doubles from 10000 to 20000. But my goal is O(n) scaling and to get 500000+ child nodes processed in under 30min. I assume that the CREATE UNIQUE part of the query causes the problem, because for each new group node it always needs to check which group nodes have already been created. And the amount to check grows with the number of already grouped child nodes.</p> <p>Does someone have an idea how to make this query faster? Or how to do the same thing faster with another query?</p>
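As a rough check of the scaling, the timings reported above can be turned into local growth exponents. This is only a sketch: it assumes the run time behaves like t ~ n**k between two consecutive measurements, and it treats the "> 50min" figure as exactly 50 minutes, so the last exponent is a lower bound. The names `TIMINGS` and `growth_exponents` are made up for this illustration.

```python
import math

# Timings reported in the question: (child nodes, seconds).
# The last entry is a lower bound ("> 50min"), so its exponent is a lower bound too.
TIMINGS = [(5000, 50), (10000, 90), (20000, 380), (30000, 50 * 60)]


def growth_exponents(samples):
    """Return the local exponent k for each consecutive pair, assuming t ~ n**k."""
    exponents = []
    for (n1, t1), (n2, t2) in zip(samples, samples[1:]):
        exponents.append(math.log(t2 / t1) / math.log(n2 / n1))
    return exponents


if __name__ == "__main__":
    pairs = zip(TIMINGS, TIMINGS[1:], growth_exponents(TIMINGS))
    for (n1, _), (n2, _), k in pairs:
        print(f"{n1} -> {n2} nodes: exponent ~ {k:.2f}")
```

An exponent close to 1 would mean linear scaling. With these numbers the exponent stays below 1 for the first step but exceeds 2 from 10000 to 20000 nodes and keeps rising after that, which fits the suspicion that each additional child node gets more expensive as the graph grows.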