<p>I don't suppose performance matters much here, but I can't resist. The zip() function completely recopies both vectors (more of a matrix transpose, actually) just to get the data in "Pythonic" order. It would be interesting to time the nuts-and-bolts implementation:</p>
<pre><code>import math

def cosine_similarity(v1, v2):
    "compute cosine similarity of v1 to v2: (v1 dot v2)/(||v1||*||v2||)"
    sumxx, sumxy, sumyy = 0, 0, 0
    for i in range(len(v1)):
        x = v1[i]; y = v2[i]
        sumxx += x*x
        sumyy += y*y
        sumxy += x*y
    return sumxy/math.sqrt(sumxx*sumyy)

v1, v2 = [3, 45, 7, 2], [2, 54, 13, 15]
print(v1, v2, cosine_similarity(v1, v2))

Output: [3, 45, 7, 2] [2, 54, 13, 15] 0.972284251712
</code></pre>
<p>That goes through the C-like noise of extracting elements one at a time, but it does no bulk array copying, gets everything important done in a single for loop, and uses a single square root.</p>
<p>ETA: Updated the print call to be a function. (The original was Python 2.7, not 3.3. The current version runs under Python 2.7 with a <code>from __future__ import print_function</code> statement.) The output is the same either way.</p>
<p>CPython 2.7.3 on a 3.0GHz Core 2 Duo:</p>
<pre><code>&gt;&gt;&gt; import timeit
&gt;&gt;&gt; timeit.timeit("cosine_similarity(v1,v2)",setup="from __main__ import cosine_similarity, v1, v2")
2.4261788514654654
&gt;&gt;&gt; timeit.timeit("cosine_measure(v1,v2)",setup="from __main__ import cosine_measure, v1, v2")
8.794677709375264
</code></pre>
<p>So, the unpythonic way is about 3.6 times faster in this case.</p>
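<p>To rerun the comparison on another interpreter, something like the following sketch could be used. The <code>cosine_measure</code> function timed above is not shown in this answer, so <code>cosine_zip</code> below is only a hypothetical zip-based stand-in, not that implementation, and <code>time_call</code> is an illustrative helper. On Python 3, zip() is lazy and no longer builds an intermediate list, so the ratio may differ from the 2.7 figures.</p>

```python
import math
import timeit


def cosine_similarity(v1, v2):
    """Single-loop version from the answer: (v1 dot v2)/(||v1||*||v2||)."""
    sumxx, sumxy, sumyy = 0, 0, 0
    for i in range(len(v1)):
        x = v1[i]
        y = v2[i]
        sumxx += x * x
        sumyy += y * y
        sumxy += x * y
    return sumxy / math.sqrt(sumxx * sumyy)


def cosine_zip(v1, v2):
    """Hypothetical zip-based variant, written only for comparison."""
    dot = sum(x * y for x, y in zip(v1, v2))
    norm_x = sum(x * x for x in v1)
    norm_y = sum(y * y for y in v2)
    return dot / math.sqrt(norm_x * norm_y)


v1, v2 = [3, 45, 7, 2], [2, 54, 13, 15]


def time_call(func, number=100000):
    """Return the total seconds taken by `number` calls of func(v1, v2)."""
    return timeit.timeit(lambda: func(v1, v2), number=number)


if __name__ == "__main__":
    print("loop:", time_call(cosine_similarity))
    print("zip: ", time_call(cosine_zip))
```

<p>Both functions should agree to within floating-point rounding; only the timings are expected to vary by interpreter and machine.</p>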