<p>Wow, I just wrote a big post and SO choked and hung on it, and when I hit the back button to resubmit, the markup editor was empty. Aaargh.</p>
<p>So here I go again...</p>
<p>Regarding Stack Overflow, it turns out that they use <a href="https://stackoverflow.com/questions/115831/what-is-the-search-engine-behind-stack-overflow">SQL Server 2005 full-text search</a>.</p>
<p>Regarding the OS projects recommended by @Grant:</p>
<ul>
<li><strong>DotNetKicks</strong> uses the DB for tagging and Lucene for full-text search. There appears to be no way to combine a full-text search with a tag search.</li>
<li><strong>Kigg</strong> uses Linq-to-SQL for both search and tag queries. Both queries join Stories-&gt;StoryTags-&gt;Tags.</li>
<li>Both projects use the 3-table approach to tagging that everyone generally seems to recommend.</li>
</ul>
<p>I also found some other questions on SO that I'd missed before:</p>
<ul>
<li><a href="https://stackoverflow.com/questions/20856/how-do-you-recommend-implementing-tags-or-tagging">How Do You Recommend Implementing Tags or Tagging?</a></li>
<li><a href="https://stackoverflow.com/questions/185597/how-to-structure-data-for-searchability">How to structure data for searchability?</a></li>
<li><a href="https://stackoverflow.com/questions/48475/database-design-for-tagging">Database Design for Tagging</a></li>
</ul>
<p>What I'm currently doing for each of the items I mentioned:</p>
<ol>
<li>In the DB, 3 tables: Entity, Tag, Entity_Tag. I use the DB to:
<ul>
<li>Build site-wide tag clouds</li>
<li>Browse by tag (e.g. URLs like SO's <em>/questions/tagged/ASP.NET</em>)</li>
</ul></li>
<li>For search I use Lucene + NHibernate.Search:
<ul>
<li>Tags are concatenated into a TagString that is indexed by Lucene
<ul>
<li>So I have the full power of the Lucene query engine (AND / OR / NOT queries)</li>
<li>I can search for text <em>and</em> filter by tags at the same time</li>
<li>The Lucene analyzer stems words, which makes for better tag searches (e.g. a tag search for "test" will also find stuff tagged "testing")</li>
</ul></li>
<li>Lucene returns a potentially enormous result set, which I paginate to 20 results</li>
<li>Then NHibernate loads the result Entities by Id, either from the DB or the Entity cache</li>
<li>So it's entirely possible for a search to make 0 hits to the DB</li>
</ul></li>
<li>Not doing this yet, but I will probably try to find a way to build the tag cloud from the TagString in Lucene, rather than take another DB hit.</li>
<li>Haven't done this yet either, but I will probably store the TagString in the DB so that I can show an Entity's tag list without having to make 2 more joins.</li>
</ol>
<p>This means that whenever an Entity's tags are modified, I have to:</p>
<ul>
<li>Insert any new Tags that do not already exist</li>
<li>Insert/delete rows in the Entity_Tag table</li>
<li>Update Entity.TagString</li>
<li>Update the Lucene index for the Entity</li>
</ul>
<p>Given that the ratio of reads to writes is very high in my application, I think I'm OK with this. The only really time-consuming part is Lucene indexing: Lucene can only <em>insert</em> into and <em>delete</em> from its index, so I have to re-index the entire Entity in order to update the TagString. I'm not excited about that, but I think that if I do it in a background thread, it will be fine.</p>
<p>Time will tell...</p>
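<p>For illustration only, here is a minimal sketch of the 3-table layout and the tag-update steps listed above. It is not the code from my application: it uses Python's built-in <code>sqlite3</code> instead of NHibernate, the column names and the helper names (<code>set_tags</code>, <code>tag_cloud</code>, <code>browse_by_tag</code>) are made up for the example, and the Lucene step is reduced to a hypothetical <code>reindex</code> callback. It also replaces all of an Entity's Entity_Tag rows on every update, which is simpler than working out the exact inserts and deletes.</p>

```python
import sqlite3

# Assumed schema: only the three table names and TagString come from the post;
# every column name here is an illustrative guess.
SCHEMA = """
CREATE TABLE Entity (
    Id        INTEGER PRIMARY KEY,
    Title     TEXT NOT NULL,
    TagString TEXT NOT NULL DEFAULT ''
);
CREATE TABLE Tag (
    Id   INTEGER PRIMARY KEY,
    Name TEXT NOT NULL UNIQUE
);
CREATE TABLE Entity_Tag (
    EntityId INTEGER NOT NULL REFERENCES Entity(Id),
    TagId    INTEGER NOT NULL REFERENCES Tag(Id),
    PRIMARY KEY (EntityId, TagId)
);
"""


def set_tags(conn, entity_id, tag_names, reindex=lambda entity_id, tag_string: None):
    """Replace an entity's tags; `reindex` is a hypothetical search-index hook."""
    # Assumption: tags are trimmed, de-duplicated and stored in sorted order.
    names = sorted({name.strip() for name in tag_names if name.strip()})
    with conn:  # one transaction for all DB changes
        # 1. Insert any new Tags that do not already exist.
        conn.executemany(
            "INSERT OR IGNORE INTO Tag (Name) VALUES (?)",
            [(name,) for name in names],
        )
        # 2. Rewrite the Entity_Tag rows (delete all, then insert the new set).
        conn.execute("DELETE FROM Entity_Tag WHERE EntityId = ?", (entity_id,))
        conn.executemany(
            "INSERT INTO Entity_Tag (EntityId, TagId) "
            "SELECT ?, Id FROM Tag WHERE Name = ?",
            [(entity_id, name) for name in names],
        )
        # 3. Update the denormalized Entity.TagString.
        tag_string = " ".join(names)
        conn.execute(
            "UPDATE Entity SET TagString = ? WHERE Id = ?",
            (tag_string, entity_id),
        )
    # 4. Hand the entity to the search index (could be queued to a background thread).
    reindex(entity_id, tag_string)
    return tag_string


def tag_cloud(conn):
    """Return (tag name, usage count) pairs, most used first."""
    return conn.execute(
        "SELECT t.Name, COUNT(*) FROM Tag t "
        "JOIN Entity_Tag et ON et.TagId = t.Id "
        "GROUP BY t.Name ORDER BY COUNT(*) DESC, t.Name"
    ).fetchall()


def browse_by_tag(conn, tag_name):
    """Return the ids of all entities carrying the given tag."""
    rows = conn.execute(
        "SELECT e.Id FROM Entity e "
        "JOIN Entity_Tag et ON et.EntityId = e.Id "
        "JOIN Tag t ON t.Id = et.TagId "
        "WHERE t.Name = ? ORDER BY e.Id",
        (tag_name,),
    ).fetchall()
    return [row[0] for row in rows]
```

<p>Keeping steps 1 to 3 in one transaction means TagString cannot drift from the Entity_Tag rows if one of the writes fails. The index update runs after the commit, so a slow re-index does not hold the transaction open.</p>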
 
