StackOverflow2013


plurals

POHow to get Document ids for Document Term Vector in Lucene
primarykey
Id
8938960
data
AcceptedAnswerId
8947826
AnswerCount
1
ClosedDate
CommentCount
2
CommunityOwnedDate
CreationDate
2012-01-20T09:07:29.090
FavoriteCount
0
LastActivityDate
2012-01-20T21:03:16.163
LastEditDate
2017-05-23T10:24:23.800
LastEditorUserId
-1
OwnerUserId
1160250
ParentId
0
PostTypeId
1
Score
0
ViewCount
3132
LastEditorDisplayName
text
Body
I am new to the Lucene world and don't have much working knowledge of the subject. I need to extract a document term vector, and I found the following code online: <a href="https://stackoverflow.com/questions/8776794/how-to-extract-document-term-vector-in-lucene-3-5-0/8927749#8927749">How to extract Document Term Vector in Lucene 3.5.0</a>.
<pre><code>/**
 * Sums the term frequency vector of each document into a single term frequency map
 * @param indexReader the index reader, the document numbers are specific to this reader
 * @param docNumbers document numbers to retrieve frequency vectors from
 * @param fieldNames field names to retrieve frequency vectors from
 * @param stopWords terms to ignore
 * @return a map of each term to its frequency
 * @throws IOException
 */
private Map<String,Integer> getTermFrequencyMap(IndexReader indexReader, List<Integer> docNumbers, String[] fieldNames, Set<String> stopWords)
throws IOException {
    Map<String,Integer> totalTfv = new HashMap<String,Integer>(1024);

    for (Integer docNum : docNumbers) {
        for (String fieldName : fieldNames) {
            TermFreqVector tfv = indexReader.getTermFreqVector(docNum, fieldName);
            if (tfv == null) {
                // ignore empty fields
                continue;
            }

            String terms[] = tfv.getTerms();
            int termCount = terms.length;
            int freqs[] = tfv.getTermFrequencies();

            for (int t=0; t < termCount; t++) {
                String term = terms[t];
                int freq = freqs[t];

                // filter out single-letter words and stop words
                if (StringUtils.length(term) < 2 ||
                    stopWords.contains(term)) {
                    continue; // stop
                }

                Integer totalFreq = totalTfv.get(term);
                totalFreq = (totalFreq == null) ? freq : freq + totalFreq;
                totalTfv.put(term, totalFreq);
            }
        }
    }

    return totalTfv;
}
</code></pre>
I have created the index, which resides in the following directory.
<pre><code>String indexDir = "C:\\Lucene\\Output\\";
Directory dir = FSDirectory.open(new File(indexDir));
IndexReader reader = IndexReader.open(dir);
</code></pre>
My problem is that I do not know how to get the document ids (the docNumbers list) that the function above requires. I have tried a couple of approaches, such as
<pre><code>TermDocs docs = reader.termDocs();
</code></pre>
but they did not work.
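One possible way to build the docNumbers list, offered as a sketch and not as the confirmed solution: in Lucene 3.x, document numbers run from 0 up to IndexReader.maxDoc() - 1, and IndexReader.isDeleted(int) reports which of those have been deleted. To keep the example runnable without the Lucene jar, the code below uses a hypothetical DocSource interface that stands in for those two IndexReader methods. The class name DocNumbers and the helper liveDocNumbers are likewise illustrative names, not part of Lucene.

```java
import java.util.ArrayList;
import java.util.List;

public class DocNumbers {

    /**
     * Hypothetical stand-in for the two IndexReader methods used here.
     * With a real Lucene 3.x reader, call reader.maxDoc() and
     * reader.isDeleted(n) directly instead.
     */
    public interface DocSource {
        int maxDoc();             // one greater than the largest document number
        boolean isDeleted(int n); // true if document n has been deleted
    }

    /** Collects every document number that is not marked as deleted. */
    public static List<Integer> liveDocNumbers(DocSource source) {
        List<Integer> docNumbers = new ArrayList<Integer>();
        for (int i = 0; i < source.maxDoc(); i++) {
            if (!source.isDeleted(i)) {
                docNumbers.add(i);
            }
        }
        return docNumbers;
    }

    public static void main(String[] args) {
        // Toy source: four documents, of which document 1 is deleted.
        DocSource toy = new DocSource() {
            public int maxDoc() { return 4; }
            public boolean isDeleted(int n) { return n == 1; }
        };
        System.out.println(liveDocNumbers(toy)); // prints [0, 2, 3]
    }
}
```

With the reader opened above, the same loop over reader.maxDoc() and reader.isDeleted(i) would produce a list that can be passed as docNumbers to getTermFrequencyMap. To restrict the list to documents matching a query, the ids would come from search hits instead.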
Tags
<lucene>
Title
How to get Document ids for Document Term Vector in Lucene
singulars
PostAcceptedAnswerId
1. PO
 singulars
 PostTypePostTypeId
 PTAnswer
PostParentId
1. This table or related slice is empty.
PostTypePostTypeId
1. PTQuestion
UserLastEditorUserId
1. USCommunity
UserOwnerUserId
1. USAhmad
plurals
PostLinksPostIdRelatedPostId
1. PL
 singulars
 LinkTypeLinkTypeId
 LTLinked
PostLinksRelatedPostIdPostId
1. This table or related slice is empty.
PostsAcceptedAnswerId
1. This table or related slice is empty.
PostsParentIdCreationDate
1. PO
 singulars
 PostTypePostTypeId
 PTAnswer
VotesPostIdCreationDate
1. VO
 singulars
 PostPostId
 POHow to get Document ids for Document Term Vector in Lucene
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTApproveEditSuggestion
CommentsPostId
1. CObtw, how come you know what's a term frequency vector and you don't know anything about lucene document ids?
 singulars
 PostPostId
 POHow to get Document ids for Document Term Vector in Lucene
 UserUserId
 USmilan
2. CO@milan I read that the Lucene does it automatically but the above code was a bit confusing as the "docNumbers" was passed as an argument.
 singulars
 PostPostId
 POHow to get Document ids for Document Term Vector in Lucene
 UserUserId
 USAhmad


Guidance

A row detail

Detail views are divided into sections. All the information in the data section comes from columns in the selected row. The other sections display data from other, related rows.

Related rows are linked in either a to-one or a to-many fashion. The caption of each to-many relation links to a list view of the related table, filtered to the matching rows.

Try browsing until you find a non-empty to-many entry, then click its caption to open that list view. You can return to the root at any time by clicking the database name in the header.