<p>I am on Lucene core 3.0.3, but I expect the API will be very similar. This method totals the term frequencies into a single map for a given set of document numbers and a list of fields of interest, ignoring stop words.</p>
<pre><code>import java.io.IOException;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// StringUtils is assumed to come from Apache Commons Lang
import org.apache.commons.lang.StringUtils;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.TermFreqVector;

/**
 * Sums the term frequency vector of each document into a single term frequency map
 * @param indexReader the index reader, the document numbers are specific to this reader
 * @param docNumbers document numbers to retrieve frequency vectors from
 * @param fieldNames field names to retrieve frequency vectors from
 * @param stopWords terms to ignore
 * @return a map of each term to its frequency
 * @throws IOException if reading from the index fails
 */
private Map&lt;String,Integer&gt; getTermFrequencyMap(IndexReader indexReader, List&lt;Integer&gt; docNumbers, String[] fieldNames, Set&lt;String&gt; stopWords)
        throws IOException {
    Map&lt;String,Integer&gt; totalTfv = new HashMap&lt;String,Integer&gt;(1024);

    for (Integer docNum : docNumbers) {
        for (String fieldName : fieldNames) {
            TermFreqVector tfv = indexReader.getTermFreqVector(docNum, fieldName);
            if (tfv == null) {
                // ignore empty fields
                continue;
            }

            // terms and freqs are parallel arrays: freqs[t] is the count for terms[t]
            String[] terms = tfv.getTerms();
            int termCount = terms.length;
            int[] freqs = tfv.getTermFrequencies();

            for (int t = 0; t &lt; termCount; t++) {
                String term = terms[t];
                int freq = freqs[t];

                // filter out single-letter words and stop words
                if (StringUtils.length(term) &lt; 2 || stopWords.contains(term)) {
                    continue;
                }

                // add this document's count to the running total for the term
                Integer totalFreq = totalTfv.get(term);
                totalFreq = (totalFreq == null) ? freq : freq + totalFreq;
                totalTfv.put(term, totalFreq);
            }
        }
    }

    return totalTfv;
}
</code></pre>
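<p>To see the accumulation step on its own, here is a minimal sketch that runs without Lucene. It assumes each document's term vector is given as two parallel arrays, like the ones returned by <code>getTerms()</code> and <code>getTermFrequencies()</code>. The class name <code>TermFrequencySketch</code>, the helper <code>addTermVector</code>, and the sample data are all made up for illustration, and <code>Map.merge</code> needs Java 8 or later, so it is not a drop-in for older setups.</p>

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class TermFrequencySketch {

    /**
     * Adds one document's term counts to the running totals.
     * terms and freqs are parallel arrays: freqs[t] is the count for terms[t].
     * (Hypothetical helper, not part of Lucene.)
     */
    public static void addTermVector(Map<String, Integer> totals,
                                     String[] terms,
                                     int[] freqs,
                                     Set<String> stopWords) {
        for (int t = 0; t < terms.length; t++) {
            String term = terms[t];
            // skip null terms, single-letter words, and stop words
            if (term == null || term.length() < 2 || stopWords.contains(term)) {
                continue;
            }
            // insert the count, or add it to the existing total for this term
            totals.merge(term, freqs[t], Integer::sum);
        }
    }

    public static void main(String[] args) {
        Set<String> stopWords = Collections.singleton("the");
        Map<String, Integer> totals = new HashMap<>();

        // made-up term vectors for two documents
        addTermVector(totals, new String[] {"a", "lucene", "the", "index"},
                new int[] {4, 2, 7, 1}, stopWords);
        addTermVector(totals, new String[] {"lucene", "search"},
                new int[] {3, 1}, stopWords);

        // TreeMap only to print the terms in a stable, sorted order
        System.out.println(new TreeMap<>(totals));
        // prints {index=1, lucene=5, search=1}
    }
}
```

<p>The <code>merge</code> call does the same job as the <code>get</code> / null check / <code>put</code> sequence in the method above; the explicit version is what you need on a Java version older than 8.</p>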
 
