Note that there are some explanatory texts on larger screens.

plurals
  1. POOptimizing tree branch data aggregation in SQL Server 2008 (recursion)
    text
    copied!<p>I have a table containing stages and sub-stages of certain projects, and a table with specific tasks and estimated costs.<br> I need some way to aggregate each level (stages/sub-stages), to see how much it costs, but to do it at a minimum performance cost.</p> <p>To illustrate this, I will use the following data structure:</p> <pre><code>CREATE TABLE stage ( id int not null, fk_parent int ) CREATE TABLE task ( id int not null, fk_stage int not null, cost decimal(18,2) not null default 0 ) </code></pre> <p>with the following data:</p> <pre><code>==stage== id fk_parent 1 null 2 1 3 1 ==task== id fk_stage cost 1 2 100 1 2 200 1 3 600 </code></pre> <p>I want to obtain a table containing the total costs on each branch. Something like this:</p> <pre><code>Stage ID Total Cost 1 900 2 300 3 600 </code></pre> <p>But, I also want it to be productive. I don't want to end up with extremely bad solutions like <a href="http://bosker.wordpress.com/2011/04/29/the-worst-algorithm-in-the-world/" rel="nofollow">The worst algorithm in the world</a>. I mean this is the case. In case I'll request the data for all the items in the <code>stage</code> table, with the total costs, each total cost will be evaluated <code>D</code> times, where <code>D</code> is the depth in the tree (level) at which it is situated. I am afraid I'll hit extremely low performances at large amounts of data with a lot of levels.</p> <p>SO,</p> <p>I decided to do something which made me ask this question here.<br> I decided to add 2 more columns to the <code>stage</code> table, for caching.</p> <pre><code>... calculated_cost decimal(18,2), date_calculated_cost datetime ... </code></pre> <p>So what I wanted to do is pass another variable within the code, a <code>datetime</code> value which equals to the time when this process was started (pretty much unique). That way, if the <code>stage</code> row already has a <code>date_calculated_cost</code> which equals to the one I'm carrying, I don't bother calculating it again, and just return the <code>calculated_cost</code> value.</p> <p>I couldn't do it with Functions (updates are needed to the <code>stage</code> table, once costs are calculated)<br> I couldn't do it with Procedures (recursion within running cursors is a no-go)<br> I am not sure temporary tables are suitable because it wouldn't allow concurrent requests to the same procedure (which are least likely, but anyway I want to do it the right way)<br> I couldn't figure out other ways.</p> <p>I am not expecting a definite answer to my question, but I will reward any good idea, and the best will be chosen as the answer.</p>
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload