<p>Since you haven't gotten any answers yet, I thought I would at least contribute some thoughts. I have used a Python k-d tree module for fast nearest-neighbor searches:<br> <a href="http://code.google.com/p/python-kdtree/downloads/detail?name=kdtree.py" rel="nofollow">http://code.google.com/p/python-kdtree/downloads/detail?name=kdtree.py</a><br> It accepts points of any dimension, as long as every point has the same length.</p>
<p>I'm not sure how you would want to apply the "importance" weighting, but here is a brainstorm on how to use the kdtree module to at least get the closest "people" to each point in a given person's set:</p>
<pre><code>import numpy
from kdtree import KDTree
from itertools import chain

class PersonPoint(object):
    """A point tagged with the person it belongs to and an importance factor."""

    def __init__(self, person, point, factor):
        self.person = person
        self.point = point
        self.factor = factor

    def __repr__(self):
        return '&lt;%s: %s, %0.2f&gt;' % (self.person,
            ['%0.2f' % p for p in self.point], self.factor)

    # Sequence protocol, so the object can be used like a plain point.
    def __iter__(self):
        return iter(self.point)

    def __len__(self):
        return len(self.point)

    def __getitem__(self, i):
        return self.point[i]

# Six random 3-D points per person, each with a random importance factor.
people = {}
for name in ('bill', 'john', 'mary', 'jenny', 'phil', 'george'):
    factors = numpy.random.rand(6)
    points = numpy.random.rand(6, 3).tolist()
    people[name] = [PersonPoint(name, p, f) for p, f in zip(points, factors)]

bill_points = people['bill']
others = list(chain(*[people[name] for name in people if name != 'bill']))

tree = KDTree.construct_from_data(others)

for point in bill_points:
    # t=1 means only return the 1 closest.
    # You could set it higher to return more.
    print('%s =&gt; %s' % (point, tree.query(point, t=1)[0]))
</code></pre>
<p>Results:</p>
<pre><code>&lt;bill: ['0.22', '0.64', '0.14'], 0.07&gt; =&gt; &lt;phil: ['0.23', '0.54', '0.11'], 0.90&gt;
&lt;bill: ['0.31', '0.87', '0.16'], 0.88&gt; =&gt; &lt;phil: ['0.36', '0.80', '0.14'], 0.40&gt;
&lt;bill: ['0.34', '0.64', '0.25'], 0.65&gt; =&gt; &lt;jenny: ['0.29', '0.77', '0.28'], 0.40&gt;
&lt;bill: ['0.24', '0.90', '0.23'], 0.53&gt; =&gt; &lt;jenny: ['0.29', '0.77', '0.28'], 0.40&gt;
&lt;bill: ['0.50', '0.69', '0.06'], 0.68&gt; =&gt; &lt;phil: ['0.36', '0.80', '0.14'], 0.40&gt;
&lt;bill: ['0.13', '0.67', '0.93'], 0.54&gt; =&gt; &lt;jenny: ['0.05', '0.62', '0.94'], 0.84&gt;
</code></pre>
<p>With these results, you could pick the most frequently matched "person", or take the weights into account. For instance, you could total the importance factors per person in the results and take the highest-rated one. That way, if mary matched only once but with a factor of 10, and phil matched 3 times but totaled only 5, mary might be more relevant?</p>
<p>I know you have a more robust function for creating an index, but it would require going through every point in your collection.</p>
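<p>The two ranking ideas above (most frequent match versus highest summed factor) could be sketched roughly like this. The helper name <code>rank_matches</code> and the input shape, a list of (person, factor) pairs with one entry per nearest-neighbor hit, are assumptions made for illustration. The numbers simply mirror the hypothetical mary/phil case and are not real query output:</p>

```python
from collections import Counter, defaultdict


def rank_matches(matches):
    """Rank matched people by hit count and by summed importance factor.

    `matches` is a list of (person, factor) pairs, one per nearest-neighbor
    hit. This input shape is an assumption for this sketch.
    """
    counts = Counter(person for person, _ in matches)
    totals = defaultdict(float)
    for person, factor in matches:
        totals[person] += factor

    # People ordered by how often they were matched, most frequent first.
    by_count = [person for person, _ in counts.most_common()]
    # People ordered by their summed factor, largest first.
    by_total = sorted(totals, key=totals.get, reverse=True)
    return by_count, by_total, dict(totals)


# Hypothetical hits: mary matches once with factor 10,
# phil matches three times with factors totaling 5.
matches = [("mary", 10.0), ("phil", 2.0), ("phil", 2.0), ("phil", 1.0)]
by_count, by_total, totals = rank_matches(matches)

print("most frequent:", by_count[0])
print("highest total factor:", by_total[0], totals[by_total[0]])
```

<p>Here the two rankings disagree: phil wins on frequency, mary wins on summed factor. Which one is "more relevant" depends on how much a single strong match should count against several weaker ones.</p>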