Note that there are some explanatory texts on larger screens.

plurals
  1. PORegarding the example in Oracle's java online tutorial that uses HashMap to store anagram
    text
    copied!<p>I was reading the example in Oracle's online java tutorial that uses HashMap to store anagrams:</p> <pre><code> // Read words from file and put into a simulated multimap Map&lt;String, List&lt;String&gt;&gt; m = new HashMap&lt;String, List&lt;String&gt;&gt;(); try { Scanner s = new Scanner(new File(args[0])); while (s.hasNext()) { String word = s.next(); String alpha = alphabetize(word); List&lt;String&gt; l = m.get(alpha); if (l == null) m.put(alpha, l=new ArrayList&lt;String&gt;()); l.add(word); } } catch (IOException e) { System.err.println(e); System.exit(1); } // Print all permutation groups above size threshold for (List&lt;String&gt; l : m.values()) if (l.size() &gt;= minGroupSize) System.out.println(l.size() + ": " + l); } private static String alphabetize(String s) { char[] a = s.toCharArray(); Arrays.sort(a); return new String(a); } </code></pre> <p>}</p> <p>Since HashMap is implemented with Hash Table, I think that each sorted alphabetized string should have a unique hash code after compression(otherwise the linked list that stores the values in the HashMap will store a value that's not a anagram of the sorted alphabetized string).</p> <p>I am not exactly sure how Java's implementation of HashMap can satisfy this - I assume that they use the hash code of a string (a1*31^n-1 + a2*31^n-2 + ... + an). This might guarantee the uniqueness of the hash code if we are talking about strings with only lower case chars. However, one also has to compress the hash code before putting the value of the key to the bucket in the hash table (otherwise you would have a hugggggge hash table that can't be handled in memory, just thinking of how big 31^10 is). Among this compression, I would think that there will be collision. In other words, two different strings that are not true anagram will end up being stored in the same bucket (which should be only used to store a list of true anagrams)...</p> <p>Could anyone help me to understand what I might be missing? Or if the online tutorial is missing something? </p> <p>thanks!</p> <p>Jason</p>
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload