Can Solr highlighting also indicate the position or offset of the returned fragments within the original field?
<p><strong>Background</strong></p>
<p>I'm using Solr 4.0.0. I've indexed the text of a set of sample documents and enabled term vectors so I can use Fast Vector Highlighting.</p>
<pre><code>&lt;field name="raw_text" type="text_en" indexed="true" stored="true" termVectors="true" termPositions="true" termOffsets="true" /&gt; </code></pre>
<p>For highlighting I'm using the Break Iterator Boundary Scanner with SENTENCE boundaries.</p>
<pre><code>&lt;boundaryScanner name="breakIterator" class="solr.highlight.BreakIteratorBoundaryScanner"&gt; &lt;lst name="defaults"&gt; &lt;!-- type should be one of CHARACTER, WORD(default), LINE and SENTENCE --&gt; &lt;str name="hl.bs.type"&gt;SENTENCE&lt;/str&gt; &lt;/lst&gt; &lt;/boundaryScanner&gt; </code></pre>
<p>I run a simple query:</p>
<pre><code>http://localhost:8983/solr/documents/select?q=raw_text%3AArtibonite&amp;wt=xml&amp;hl=true&amp;hl.fl=raw_text&amp;hl.useFastVectorHighlighter=true&amp;hl.snippets=100&amp;hl.boundaryScanner=breakIterator </code></pre>
<p>Highlighting is working fairly well:</p>
<pre><code>&lt;response&gt; ... &lt;result name="response" numFound="5" start="0"&gt; &lt;doc&gt; &lt;str name="id"&gt;-1071691270&lt;/str&gt; &lt;str name="raw_text"&gt; Final Report of the Independent Panel of Experts on the Cholera Outbreak in Haiti Dr. Alejando Cravioto (Chair) International Center for Diarrhoeal Disease Research, Dhaka, Bangladesh Dr. Claudio F. Lanata Instituto de Investigación Nutricional, and The US Navy Medical Research Unit 6, Lima, Peru Engr. Daniele S. Lantagne Harvard University... ~SNIP~ &lt;/str&gt; &lt;/doc&gt; &lt;/result&gt; &lt;lst name="highlighting"&gt; &lt;lst name="-1071691270"&gt; &lt;arr name="raw_text"&gt; ... &lt;str&gt; The timeline suggests that the outbreak spread along the &lt;em&gt;Artibonite&lt;/em&gt; River. After establishing that the cases began in the upper reaches of the Artibonite River, potential sources of contamination that could have initiated the outbreak were investigated. 
&lt;/str&gt; ... &lt;/arr&gt; &lt;/lst&gt; &lt;/lst&gt; </code></pre>
<p><strong>Problem</strong></p>
<p>I want to send the resulting sentences on for further processing (entity extraction, etc.), but I also need to track the start/end offsets of each highlighted sentence within the original (long) text field. <em>Is there a straightforward way to do this?</em></p>
<p>Would it be better to set hl.fragsize to return the entire field and then process/extract the sentences of interest that way?</p>
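One client-side option, sketched here rather than taken from anything in Solr itself, is to strip the highlight tags from each returned fragment and search for the plain sentence in the stored `raw_text` value. The helper name `fragment_offsets` is hypothetical, the sample text is shortened and partly made up for illustration, and the sketch assumes the fragment is a verbatim substring of the stored field once the `<em>` tags are removed (it would not hold if an HTML encoder or custom pre/post tags are configured).

```python
import re

# Assumption: highlights are wrapped in <em>...</em>, as in the response above.
# Different hl.tag.pre / hl.tag.post settings would need a different pattern.
HIGHLIGHT_TAG = re.compile(r"</?em>")


def fragment_offsets(original_text, fragment):
    """Return every (start, end) span where the un-highlighted fragment occurs.

    Hypothetical helper: this is a plain substring search done on the client,
    not something the highlighter reports.
    """
    plain = HIGHLIGHT_TAG.sub("", fragment).strip()
    if not plain:
        return []
    spans = []
    start = original_text.find(plain)
    while start != -1:
        spans.append((start, start + len(plain)))
        # Keep searching, since the same sentence may appear more than once.
        start = original_text.find(plain, start + 1)
    return spans


if __name__ == "__main__":
    # Shortened sample; the first and last sentences are invented filler.
    raw_text = (
        "Cases were first reported in October. "
        "The timeline suggests that the outbreak spread along the Artibonite River. "
        "Further sampling followed."
    )
    snippet = (
        " The timeline suggests that the outbreak spread along the "
        "<em>Artibonite</em> River. "
    )
    for start, end in fragment_offsets(raw_text, snippet):
        print(start, end, repr(raw_text[start:end]))
```

If a sentence occurs several times in the field, the function returns all candidate spans and the caller has to pick between them, which is the main limitation of matching by text instead of getting offsets from the highlighter.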