Apache Mahout Performance Issues
<p>I have been working with Mahout for the past few days, trying to build a recommendation engine. The project I'm working on has the following data:</p>
<ul>
<li>12M users</li>
<li>2M items</li>
<li>18M user-item boolean recommendations</li>
</ul>
<p>I am now experimenting with 1/3 of the full set (i.e. 6M out of 18M recommendations). With every configuration I tried, Mahout gave quite disappointing results: some recommendations took 1.5 seconds while others took over a minute. I think a reasonable time for a recommendation should be around 100 ms.</p>
<p><strong>Why is Mahout so slow?</strong><br> I'm running the application on Tomcat with the following JVM arguments (adding them didn't make much of a difference):</p>
<pre><code>-Xms4096M -Xmx4096M -da -dsa -XX:NewRatio=9 -XX:+UseParallelGC -XX:+UseParallelOldGC
</code></pre>
<p>Below are code snippets for my experiments:</p>
<p><strong>User similarity 1:</strong></p>
<pre><code>DataModel model = new FileDataModel(new File(dataFile));
UserSimilarity similarity = new CachingUserSimilarity(new LogLikelihoodSimilarity(model), model);
UserNeighborhood neighborhood =
    new NearestNUserNeighborhood(10, Double.NEGATIVE_INFINITY, similarity, model, 0.5);
recommender = new GenericBooleanPrefUserBasedRecommender(model, neighborhood, similarity);
</code></pre>
<p><strong>User similarity 2:</strong></p>
<pre><code>DataModel model = new FileDataModel(new File(dataFile));
UserSimilarity similarity = new CachingUserSimilarity(new LogLikelihoodSimilarity(model), model);
UserNeighborhood neighborhood =
    new CachingUserNeighborhood(new NearestNUserNeighborhood(10, similarity, model), model);
recommender = new GenericBooleanPrefUserBasedRecommender(model, neighborhood, similarity);
</code></pre>
<p><strong>Item similarity 1:</strong></p>
<pre><code>DataModel dataModel = new FileDataModel(new File(dataFile));
ItemSimilarity itemSimilarity = new LogLikelihoodSimilarity(dataModel);
recommender = new GenericItemBasedRecommender(dataModel, itemSimilarity);
</code></pre>
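To put numbers on the spread described above (1.5 seconds to over a minute, against a target of roughly 100 ms), it may help to time each call separately and look at percentiles rather than a single run. The following is a minimal sketch and not part of the original experiments: the class name `RecommendationTimer`, the sample user IDs, and the stand-in workload are all hypothetical. With Mahout, the callback would presumably wrap `recommender.recommend(userID, howMany)` and handle `TasteException`.

```java
import java.util.Arrays;
import java.util.function.LongConsumer;

public class RecommendationTimer {

    /** Times one call per user ID; returns latencies in milliseconds, sorted ascending. */
    public static double[] timeCalls(long[] userIds, LongConsumer recommendCall) {
        double[] millis = new double[userIds.length];
        for (int i = 0; i < userIds.length; i++) {
            long start = System.nanoTime();
            recommendCall.accept(userIds[i]);
            millis[i] = (System.nanoTime() - start) / 1_000_000.0;
        }
        Arrays.sort(millis);
        return millis;
    }

    /** Nearest-rank percentile (0 < p <= 100) of an ascending-sorted array. */
    public static double percentile(double[] sorted, double p) {
        if (sorted.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        long[] sampleUsers = {1L, 2L, 3L, 4L, 5L}; // hypothetical user IDs

        // Stand-in workload (assumption). With Mahout this would be something like
        // id -> recommender.recommend(id, 10), wrapped in a try/catch for TasteException.
        LongConsumer fakeRecommend = id -> {
            long sum = 0;
            for (long k = 0; k < id * 100_000; k++) {
                sum += k;
            }
            if (sum < 0) {
                System.out.println(sum); // keeps the loop from being optimized away
            }
        };

        double[] latencies = timeCalls(sampleUsers, fakeRecommend);
        System.out.printf("median=%.3f ms, p95=%.3f ms, max=%.3f ms%n",
                percentile(latencies, 50), percentile(latencies, 95),
                latencies[latencies.length - 1]);
    }
}
```

Running a few warm-up calls before measuring is probably worthwhile, since the caching wrappers in the snippets above (`CachingUserSimilarity`, `CachingUserNeighborhood`) and JIT compilation can make the first requests much slower than later ones.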
 
