How to use Sphinx BuildExcerpts
<p>So, I've set up a Sphinx configuration file. I have a very simple schema with two fields, title and body, where the title is the name of a novel and the body is the complete novel itself. To keep things simple, I've only added one novel. The indexer worked just fine and the Python API made querying searchd a breeze. I'm really impressed so far: this seems to be the easiest full-text search engine to set up that I've investigated (much easier than Lucene or Solr, and faster than Whoosh).</p>
<p>I have skipped the DB backend entirely. I have my novels in plain .txt format, and I've added the sample one with this simple XML (through xmlpipe):</p>
<pre><code>&lt;?xml version="1.0" encoding="utf-8"?&gt;
&lt;sphinx:docset&gt;
&lt;sphinx:document id="1"&gt;
&lt;title&gt;&lt;![CDATA[Dan Simmons - I Canti di Hyperion 3 - Endymion]]&gt;&lt;/title&gt;
&lt;body&gt;&lt;![CDATA[ * ALL THE NOVEL HERE * ]]&gt;&lt;/body&gt;
&lt;/sphinx:document&gt;
&lt;/sphinx:docset&gt;
</code></pre>
<p>I searched the archive for "tartaruga", which is Italian for "turtle", and I'm sure the word is in the file. In fact, it occurs three times, and I guess that's what Sphinx is reporting to me (<code>'hits': 3</code>). This is the complete result:</p>
<pre><code>{'attrs': [],
 'error': '',
 'fields': ['title', 'body'],
 'matches': [{'attrs': {}, 'id': 1, 'weight': 1}],
 'status': 0,
 'time': '0.392',
 'total': 1,
 'total_found': 1,
 'warning': '',
 'words': [{'docs': 1, 'hits': 3, 'word': 'tartaruga'}]}
</code></pre>
<p>What I want to have, in the end, is something like this:</p>
<pre><code>[
  {
    'title': 'Dan Simmons - I Canti di Hyperion 3 - Endymion',
    'body': 'il vecchio mostrò quel suo sorriso a becco di tartaruga. — non bisogna dimenticare il palazzo dello shrike, né il nostro vecchio amico shrike, giusto? non ce ne sono altre?'
  },
  {
    'title': 'Dan Simmons - I Canti di Hyperion 3 - Endymion',
    'body': '— vieni più vicino, raul endymion. — la voce pareva il rumore di una lama spuntata che sfregasse su pergamena. le labbra si muovevano come il becco d\'una tartaruga.'
  },
  {
    'title': 'Dan Simmons - I Canti di Hyperion 3 - Endymion',
    'body': 'il becco di tartaruga ebbe una contrazione, la grossa testa si mosse in un cenno d\'assenso. notai ora che il viso del vecchio, malgrado i danni provocati dai secoli, aveva ancora tratti netti e spigolosi... un\'aria da satiro.'
  },
]
</code></pre>
<p>I mean an array of occurrences, each with the book the excerpt is taken from and the matched word within its context (I've chosen sentences, but <em>n</em> words before or after the match would work too). I think I have to use BuildExcerpts, but how?</p>
<p>Also, if I want to match both <em>tartaruga</em> (turtle) and <em>tartarughe</em> (turtles), I'd like to issue a query like <code>tartarug*</code>. How do I do this in Sphinx? Thanks in advance.</p>
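To make the desired result shape concrete, here is a minimal pure-Python sketch. It does not call Sphinx or BuildExcerpts at all: it assumes the novel's text is already available as a string, splits it naively into sentences, and keeps those containing a word that starts with the given prefix (a stand-in for the `tartarug*` query). The function name `find_excerpts` and its parameters are hypothetical, chosen only for this illustration.

```python
import re


def find_excerpts(title, body, prefix):
    """Return one {'title', 'body'} dict per sentence of `body`
    that contains a word starting with `prefix` (case-insensitive).

    Hypothetical helper: this is plain Python, not the Sphinx API.
    """
    # Whole words beginning with the prefix, emulating a `prefix*` query.
    word_re = re.compile(r"\b" + re.escape(prefix) + r"\w*", re.IGNORECASE)
    # Naive sentence split: break on whitespace that follows ., ! or ?
    sentences = re.split(r"(?<=[.!?])\s+", body)
    return [
        {"title": title, "body": sentence.strip()}
        for sentence in sentences
        if word_re.search(sentence)
    ]
```

Calling `find_excerpts(title, novel_text, "tartarug")` would return a list in the same shape as the one above, with one entry per matching sentence. The sentence splitter is deliberately crude (it will break on abbreviations and ellipses), so it is only meant to show the target data structure, not to replace a real excerpt builder.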