StackOverflow2013

<blockquote> <ol> <li>If I understand Lucene's indexing scheme correctly, when the same long string is indexed as a field in many documents, this doesn't really bulk out the index compared to if it were indexed just once. Correct?</li> <li>If I create a single Term object, make it stored, and then add it to many documents, does the full string data get duplicated for each document in the index? If this is the case, am I just best off putting the actual storage of the tags/attributes into sql?</li> <li>As far as I can tell, the only info that comes back in query results is the documents themselves ordered by score. To determine which fields satisfied the query for a matched document, must I do separate queries on the fields for each document, or what?</li> </ol> </blockquote> <ol> <li>Correct. Lucene stores a dictionary mapping strings to numerical identifiers, so indexing the same string in many documents costs only the storage of its identifier for each occurrence, not another copy of the string.</li> <li>I think you are safe storing the tags and attributes in Lucene.</li> <li>You do not need separate queries. Once you hold a Document object, you can use e.g. <a href="http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/document/Document.html#getField(java.lang.String)" rel="nofollow noreferrer">getField()</a> to get the relevant field information. Since you are concerned about Lucene performance, I suggest you read <a href="http://www.lucidimagination.com/Community/Hear-from-the-Experts/Articles/Scaling-Lucene-and-Solr" rel="nofollow noreferrer">Scaling Lucene and Solr</a>, which covers many performance tips.</li> </ol>
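To illustrate the first point, here is a minimal toy sketch of a term dictionary that maps each distinct string to a numeric identifier and records only document ids per term. It is not Lucene's actual implementation or on-disk format; the class `TermDictionarySketch` and its methods are hypothetical names chosen for this example, and it uses only the Java standard library.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: NOT Lucene's real data structures.
final class TermDictionarySketch {
    // Each distinct term string is stored once and mapped to a small integer id.
    private final Map<String, Integer> termIds = new HashMap<>();
    private final List<String> terms = new ArrayList<>();
    // Per term id, the list of document ids that contain the term.
    private final Map<Integer, List<Integer>> postings = new HashMap<>();

    // Returns the id of the term, assigning a new id on first sight.
    private int intern(String term) {
        Integer id = termIds.get(term);
        if (id == null) {
            id = terms.size();
            termIds.put(term, id);
            terms.add(term);
        }
        return id;
    }

    // Records that the document contains the term; the string itself is not copied again.
    void index(int docId, String term) {
        int termId = intern(term);
        postings.computeIfAbsent(termId, k -> new ArrayList<>()).add(docId);
    }

    // Number of distinct term strings held by the dictionary.
    int distinctTerms() {
        return terms.size();
    }

    // Document ids recorded for the term, or an empty list if the term is unknown.
    List<Integer> docsFor(String term) {
        Integer id = termIds.get(term);
        return id == null ? Collections.<Integer>emptyList() : postings.get(id);
    }

    public static void main(String[] args) {
        TermDictionarySketch dict = new TermDictionarySketch();
        String longTag = "a-very-long-tag-string-shared-by-many-documents";
        for (int docId = 0; docId < 3; docId++) {
            dict.index(docId, longTag);
        }
        System.out.println(dict.distinctTerms());   // prints 1
        System.out.println(dict.docsFor(longTag));  // prints [0, 1, 2]
    }
}
```

In this model, adding the same long string to more documents grows only the list of document ids, while the string is kept once in the dictionary.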
