Note that there are some explanatory texts on larger screens.

plurals
  1. PO
    primarykey
    data
    text
    <p>Although the Hungarian Algorithm is going to work, a much simpler algorithm can be used in your case. Your implicit cost matrix is a symmetric matrix of a special form:</p> <pre><code>ABS(SUBJ.match_field-CTRL.match_field) </code></pre> <p>Therefore, you can relatively easily prove that in an optimal assignment <em>{SUBJ<sub>i</sub>, CTRL<sub>j</sub>}</em> ordered by <code>SUBJ.match_field</code> the values of <code>CTRL.match_field</code> will be ordered as well.</p> <p><strong><i>Proof:</i></strong> Consider an assignment <em>{SUBJ<sub>i</sub>, CTRL<sub>j</sub>}</em> ordered by <code>SUBJ.match_field</code> that is not ordered by <code>CTRL.match_field</code>. Then you have at least one <em>inversion</em>, i.e. a pair of assignments <em>{SUBJ<sub>i1</sub>, CTRL<sub>j1</sub>}</em> and <em>{SUBJ<sub>i2</sub>, CTRL<sub>j2</sub>}</em> such that</p> <p><code>SUBJ.match_field</code><sub>i1</sub> &lt; <code>SUBJ.match_field</code><sub>i2</sub>, but</p> <p><code>CTRL.match_field</code><sub>j1</sub> > <code>CTRL.match_field</code><sub>j2</sub></p> <p>Then you can replace the inverted pair with a non-inverted one</p> <p><em>{SUBJ<sub>i1</sub>, CTRL<sub>j2</sub>}</em> and <em>{SUBJ<sub>i2</sub>, CTRL<sub>j1</sub>}</em></p> <p>of a cost that is less than or equal to the cost of the inverted assignment for all six relative placements of <code>SUBJ.match_field</code>(<sub>i1</sub>, <sub>i2</sub>) and <code>CTRL.match_field</code>(<sub>j1</sub>, <sub>j2</sub>) (<a href="http://www.wolframalpha.com/input/?i=b%3Ea,B%3CA,abs%28a-A%29%2babs%28b-B%29%3E=abs%28a-B%29%2babs%28b-A%29" rel="noreferrer">link to Wolfram Alpha</a>). <strong><i>:Proof</i></strong></p> <p>With this observation in hand, it is easy to prove that the <a href="http://en.wikipedia.org/wiki/Dynamic_programming" rel="noreferrer"><em>dynamic programming</em></a> algorithm below comes up with the optimal assignment:</p> <ul> <li>Make <code>N</code> duplicates of each subject; order by <code>match_field</code></li> <li>Order controls by <code>match_field</code></li> <li>Prepare an empty array <code>assignments</code> of size <code>N * subject.SIZE</code></li> <li>Prepare an empty 2D array <code>mem</code> of size <code>N * subject.SIZE</code> by <code>control.SIZE</code> for <a href="http://en.wikipedia.org/wiki/Memoization" rel="noreferrer"><em>memoization</em></a>; set all elements to <code>-1</code></li> <li>Call <code>Recursive_Assign</code> defined in pseudocode below</li> <li>The <code>assignments</code> table now contains <code>N</code> assignments for each subject <code>i</code> at positions between <code>N*i</code>, inclusive, and <code>N*(i+1)</code>, exclusive.</li> </ul> <hr/> <pre><code>FUNCTION Recursive_Assign // subjects contains each original subj repeated N times PARAM subjects : array of int[subjectLength] PARAM controls: array of int[controlLength] PARAM mem : array of int[subjectLength,controlLength] PARAM sp : int // current subject position PARAM cp : int // current control position PARAM assign : array of int[subjectLength] BEGIN IF sp == subjects.Length THEN RETURN 0 ENDIF IF mem[sp, cp] &gt; 0 THEN RETURN mem[sp, cp] ENDIF int res = ABS(subjects[sp] - controls[cp]) + Recursive_Assign(subjects, controls, mem, sp + 1, cp + 1, assign) assign[sp] = cp IF cp+1+subjects.Length-sp &lt; controls.Length THEN int alt = Recursive_Assign(subjects, controls, mem, sp, cp + 1, assign) IF alt &lt; res THEN res = alt ELSE assign[sp] = cp ENDIF ENDIF RETURN (mem[sp, cp] = res) END </code></pre> <hr/> <p><a href="http://ideone.com/Wotmmc" rel="noreferrer">Here is an implementation of the above pseudocode using C# on ideone.</a></p> <p>This algorithm is ready to be re-written as set-based in SQL. Trying to fit it into the original problem setting (with grouping by locations and making multiple copies of the subject) would add unnecessary layer of complexity to a procedure that is already rather complex, so I am going to simplify things quite a bit by using table-valued parameters of SQL Server. I am not sure if DB2 provides similar capabilities, but if it does not, you should be able to replace them with temporary tables.</p> <p>The stored procedure below is a nearly direct transcription of the above pseudocode into SQL Server's syntax for stored procedures:</p> <pre><code>CREATE TYPE SubjTableType AS TABLE (row int, id int, match_field int) CREATE TYPE ControlTableType AS TABLE (row int, id int, match_field int) CREATE PROCEDURE RecAssign ( @subjects SubjTableType READONLY , @controls ControlTableType READONLY , @sp int , @cp int , @subjCount int , @ctrlCount int ) AS BEGIN IF @sp = @subjCount BEGIN RETURN 0 END IF 1 = (SELECT COUNT(1) FROM #MemoTable WHERE sRow=@sp AND cRow=@cp) BEGIN RETURN (SELECT best FROM #MemoTable WHERE sRow=@sp AND cRow=@cp) END DECLARE @res int, @spNext int, @cpNext int, @prelim int, @alt int, @diff int, @sId int, @cId int SET @spNext = @sp + 1 SET @cpNext = @cp + 1 SET @sId = (SELECT id FROM @subjects WHERE row = @sp) SET @cId = (SELECT id FROM @controls WHERE row = @cp) EXEC @prelim = RecAssign @subjects=@subjects, @controls=@controls, @sp=@spNext, @cp=@cpNext, @subjCount=@subjCount, @ctrlCount=@ctrlCount SET @diff = ABS((SELECT match_field FROM @subjects WHERE row=@sp)-(SELECT match_field FROM @controls WHERE row=@cp)) SET @res = @prelim + @diff IF 1 = (SELECT COUNT(1) FROM #Assignments WHERE sRow=@sp) BEGIN UPDATE #Assignments SET cId=@cId, sId=@sId, diff=@diff WHERE sRow=@sp END ELSE BEGIN INSERT INTO #Assignments(sRow, sId, cId, diff) VALUES (@sp, @sId, @cId, @diff) END IF @cp+1+@subjCount-@sp &lt; @ctrlCount BEGIN EXEC @alt = RecAssign @subjects=@subjects, @controls=@controls, @sp=@sp, @cp=@cpNext, @subjCount=@subjCount, @ctrlCount=@ctrlCount IF @alt &lt; @res BEGIN SET @res = @alt END ELSE BEGIN UPDATE #Assignments SET cId=@cId, sId=@sId, diff=@diff WHERE sRow=@sp END END INSERT INTO #MemoTable (sRow, cRow, best) VALUES (@sp, @cp, @res) RETURN @res END </code></pre> <p>Here is how you call this stored procedure:</p> <pre><code>-- The procedure uses a temporary table for memoization: CREATE TABLE #MemoTable (sRow int, cRow int, best int) -- The procedure returns a table with assignments: CREATE TABLE #Assignments (sRow int, sId int, cId int, diff int) DECLARE @subj as SubjTableType INSERT INTO @SUBJ (row, id, match_field) SELECT ROW_NUMBER() OVER(ORDER BY match_field ASC)-1 AS row, id, match_field FROM subjects DECLARE @ctrl as ControlTableType INSERT INTO @ctrl (row, id, match_field) SELECT ROW_NUMBER() OVER(ORDER BY match_field ASC)-1 AS row, id, match_field FROM controls DECLARE @subjCount int SET @subjCount = (SELECT COUNT(1) FROM subjects) DECLARE @ctrlCount int SET @ctrlCount = (SELECT COUNT(1) FROM controls) DECLARE @best int EXEC @best = RecAssign @subjects=@subj , @controls=@ctrl , @sp=0 , @cp=0 , @subjCount=@subjCount , @ctrlCount=@ctrlCount SELECT @best SELECT sId, cId, diff FROM #Assignments </code></pre> <p>The call above assumes that both <code>subjects</code> and <code>controls</code> have been filtered by location, and that <code>N</code> copies of <code>subjects</code> has been inserted into the table-valued parameter (or the temp table in case of DB2) before making the call.</p> <p>Here is <a href="http://sqlfiddle.com/#!3/07cf1/12" rel="noreferrer">a running demo on sqlfiddle</a>.</p>
    singulars
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. VO
      singulars
      1. This table or related slice is empty.
    2. VO
      singulars
      1. This table or related slice is empty.
    3. VO
      singulars
      1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload