
<p>This is not actually a "MapReduce" function, but it should give you a significant speedup without all of the hassle.</p> <p>I would use NumPy to "vectorize" the operation and make your life easier. You then only need to loop through the dictionary once, comparing each item against all of the following ones with a single vectorized call.</p> <pre><code>import numpy as np

def cosSim(User, OUsers):
    """Determines the cosine similarity between one user and all others.

    Returns an array the size of OUsers with the similarity measures.
    User is a single array of the items purchased by a user.
    OUsers is a LIST of such arrays for the other users.
    """
    OUsers = np.asarray(OUsers, dtype=float)
    User = np.asarray(User, dtype=float)
    # Dot product between this user and every other user at once
    num = OUsers.dot(User)
    # Product of the vector magnitudes for each pair
    denom = np.linalg.norm(OUsers, axis=1) * np.linalg.norm(User)
    return num / denom

bnb_items = list(bnb.values())
for i in range(len(bnb_items) - 1):
    sims = cosSim(bnb_items[i], bnb_items[i + 1:])
</code></pre> <p>I haven't tested this code, so there may be some silly errors, but the idea should get you 90% of the way.</p> <p>This should give a significant speedup. If you still need more speed, there is a wonderful blog post which implements a "Slope One" recommendation system <a href="http://www.serpentine.com/blog/2006/12/12/collaborative-filtering-made-easy/" rel="nofollow noreferrer">here</a>.</p> <p>Hope that helps, Will</p>
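<p>As a quick sanity check of the vectorized idea, here is a self-contained sketch on toy purchase vectors (made-up data, not the question's <code>bnb</code> dictionary), comparing one user against a stacked matrix of others in a single call:</p> <pre><code>import numpy as np

def cos_sim(user, others):
    """Cosine similarity between one user vector and each row of `others`."""
    user = np.asarray(user, dtype=float)
    others = np.asarray(others, dtype=float)
    num = others.dot(user)  # dot product with every row at once
    denom = np.linalg.norm(others, axis=1) * np.linalg.norm(user)
    return num / denom

# Identical, orthogonal, and partially overlapping users.
sims = cos_sim([1, 0], [[1, 0], [0, 1], [1, 1]])
print(sims)  # [1.0, 0.0, ~0.707]
</code></pre> <p>Replacing the per-pair Python loop with one matrix-vector product is where the speedup comes from.</p>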
 


 