<p>The given code is misleading. The code</p>
<pre><code>    Cluster cluster = new Cluster(vec, i, new EuclideanDistanceMeasure());
    writer.append(new Text(cluster.getIdentifier()), cluster);
}
writer.close();

KMeansDriver.run(conf, new Path("testdata/points"), new Path("testdata/clusters"),
        new Path("output"), new EuclideanDistanceMeasure(), 0.001, 10, true, false);

SequenceFile.Reader reader = new SequenceFile.Reader(fs,
        new Path("output/" + Cluster.CLUSTERED_POINTS_DIR + "/part-m-00000"), conf);
</code></pre>
<p>should be replaced by</p>
<pre><code>    Kluster cluster = new Kluster(vec, i, new EuclideanDistanceMeasure());
    writer.append(new Text(cluster.getIdentifier()), cluster);
}
writer.close();

KMeansDriver.run(conf, new Path("testdata/points"), new Path("testdata/clusters"),
        new Path("output"), new EuclideanDistanceMeasure(), 0.001, 10, true, false);

SequenceFile.Reader reader = new SequenceFile.Reader(fs,
        new Path("output/" + Kluster.CLUSTERED_POINTS_DIR + "/part-m-00000"), conf);
</code></pre>
<p><code>Cluster</code> is an interface, whereas <a href="https://builds.apache.org/job/Mahout-Quality/javadoc/org/apache/mahout/clustering/kmeans/Kluster.html" rel="nofollow">Kluster</a> is a class, so only <code>Kluster</code> can be instantiated with <code>new</code>. Please check the <a href="https://builds.apache.org/job/Mahout-Quality/javadoc/overview-summary.html" rel="nofollow">Mahout API Javadoc</a> for more information.</p>
<p>To run k-means on a CSV file, you first have to create a SequenceFile to pass as an argument to <code>KMeansDriver</code>. The following code reads each line of the CSV file "points.csv", converts it into a vector, and writes it to the SequenceFile "points.seq":</p>
<pre><code>try (
    BufferedReader reader = new BufferedReader(new FileReader("testdata2/points.csv"));
    SequenceFile.Writer writer = new SequenceFile.Writer(fs, conf,
            new Path("testdata2/points.seq"), LongWritable.class, VectorWritable.class)
) {
    String line;
    long counter = 0;
    while ((line = reader.readLine()) != null) {
        String[] c = line.split(",");
        // Skip blank or single-column lines.
        if (c.length &gt; 1) {
            double[] d = new double[c.length];
            for (int i = 0; i &lt; c.length; i++) {
                d[i] = Double.parseDouble(c[i]);
            }
            Vector vec = new RandomAccessSparseVector(c.length);
            vec.assign(d);
            VectorWritable writable = new VectorWritable();
            writable.set(vec);
            // The key is a running line counter; the value is the point vector.
            writer.append(new LongWritable(counter++), writable);
        }
    }
    writer.close();
}
</code></pre>
<p>Hope it helps!</p>
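<p>If you want to check the CSV-parsing step on its own before involving Hadoop or Mahout, a minimal plain-Java sketch could look like the following. The class name <code>CsvRows</code> and its <code>parse</code> method are hypothetical helpers invented for illustration, not part of Mahout; the sketch assumes comma-separated numeric fields with no quoting and applies the same "more than one column" rule as the code above.</p>

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not a Mahout class): turns CSV text into rows of doubles.
class CsvRows {

    // Reads every line and keeps those that have more than one column.
    static List<double[]> parse(BufferedReader reader) throws IOException {
        List<double[]> rows = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) {
            String[] cells = line.split(",");
            if (cells.length <= 1) {
                continue; // blank or single-column line: skipped
            }
            double[] values = new double[cells.length];
            for (int i = 0; i < cells.length; i++) {
                // Throws NumberFormatException on a non-numeric field.
                values[i] = Double.parseDouble(cells[i].trim());
            }
            rows.add(values);
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        // In-memory sample: two valid rows, one blank line, one single-column line.
        String csv = "1.0,2.0\n\n3\n4.5, 6\n";
        List<double[]> rows = parse(new BufferedReader(new StringReader(csv)));
        System.out.println(rows.size() + " rows parsed");
    }
}
```

<p>Each <code>double[]</code> returned here corresponds to the array that is assigned to the vector before it is appended to the SequenceFile.</p>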
 
