    <p>My understanding is that the only way to get on Google or any other indexing engine is to have the robot actually crawl your site and generate results. Obviously, Google can crawl dynamic sites:</p> <ul> <li><a href="http://googlewebmastercentral.blogspot.com/2008/09/dynamic-urls-vs-static-urls.html" rel="nofollow noreferrer">http://googlewebmastercentral.blogspot.com/2008/09/dynamic-urls-vs-static-urls.html</a></li> </ul> <p>However, I find this to be an evolutionary rather than revolutionary change with regard to your question.</p> <p>What I think is happening behind the scenes is a combination of these things:</p> <ul> <li>Content index</li> <li>Prepared index</li> <li>User submitted content</li> <li>Referrer search updates</li> </ul> <p>I'll try to explain each of these on a fictional site that sells music - you have plenty of examples to compare the experience. It will, of course, be on the example.com domain.</p> <h1>Content index</h1> <p>Obviously, as a site that wants to offer something, you actually have some content. Usually, you group this content somehow. Let's assume our music site can group content by different categories:</p> <ul> <li>Author</li> <li>Music genre</li> <li>User submitted</li> <li>Content ratings</li> </ul> <p>Each of these can be represented abstractly as a tag. For example, our site could choose to have example.com/tags/eagles to represent Eagles or example.com/tags/rock to represent all rock bands. Google would be able to index these, so any potential search could yield a link to our site.</p> <h1>Prepared index</h1> <p>A prepared index is similar, but is a generic index instead of real content. It can be prepared in several ways, such as:</p> <ul> <li>Take a dictionary and add all words</li> <li>Crawl a few million pages from the Web (possibly using links provided by search engines!) 
and extract frequently repeated phrases from them</li> <li>Grab content from free forums</li> <li>Use <a href="http://www.wikipedia.org/" rel="nofollow noreferrer">Wikipedia</a></li> <li>Get text from freely available books, such as those from <a href="http://www.gutenberg.org/" rel="nofollow noreferrer">Project Gutenberg</a></li> </ul> <p>Our site would, for example, take any words from texts that are related to music in any way and make tags similar to the previous ones. E.g. just by crawling the <a href="http://en.wikipedia.org/wiki/Rock_music" rel="nofollow noreferrer">Rock music</a> page on Wikipedia, you can get a lot of tags.</p> <h1>User submitted content</h1> <p>This is something that usually comes after your site is up and running. Let's say that we put a search box on our site and then users come in and type "rock music". Doh, we already knew that, so nothing good from that search. However, let's say we go through our Web server logs and see some searches for <a href="http://en.wikipedia.org/wiki/Langeleik" rel="nofollow noreferrer">langeleik</a>. Now, that would be something we might not have indexed before. Cool, we just generated another tag on our site. </p> <p>Obviously, Google doesn't know that - so we create an entry in our <a href="http://support.google.com/webmasters/bin/answer.py?hl=en&amp;answer=156184" rel="nofollow noreferrer">sitemap</a> and it's there after another Googlebot crawl. When a user searches on Google for "langeleik", one of the links might be a link to example.com/tags/langeleik.</p> <p>There are other and possibly far more valuable forms of user input - comments, forum posts, etc. That is why there are many generic sites that have no purpose other than hosting forums. It's a great data source and you get new content for free.</p> <p>In the end, all of this should go into your site's sitemap. 
You can have huge sitemaps, see this:</p> <ul> <li><a href="https://webmasters.stackexchange.com/questions/26964/google-sitemap-for-dynamic-url-structure">https://webmasters.stackexchange.com/questions/26964/google-sitemap-for-dynamic-url-structure</a></li> </ul> <h1>Referrals</h1> <p>The last thing is referrals. Again, after your site is up and running, some of the Google searches will come directly to you. That's when you can take advantage of the HTTP Referer header (yes, it's a misspelling - check it out on <a href="http://en.wikipedia.org/wiki/HTTP_referer" rel="nofollow noreferrer">Wikipedia</a>), see this:</p> <ul> <li><a href="https://stackoverflow.com/questions/941469/is-it-possible-to-capture-search-term-from-google-search">Is it possible to capture search term from Google search?</a></li> </ul> <p>Note that Google search is both:</p> <ul> <li>Incomplete</li> <li>Fuzzy</li> </ul> <p>Thus, you can search for "langeleik" above, but some of the links have a title such as "Langeleik and Harpe". Nothing unusual, but note also the reverse - if you search for "langeleik and harpe", it will not only find all pages with <strong>both</strong> terms, but also pages with one or the other. If our site knows about harpe, but not about langeleik, and somebody searches for "langeleik and harpe", we will get, through the HTTP Referer header, a <code>q</code> parameter such as <code>q=langeleik+harpe</code>. Cool - just got another word to add to our sitemap, if we want.</p> <p>As for fuzziness, note that when you search for "eagles", you can get everything from birds through NFL teams to a rock band. Thus, even though we are a music site, we might expand our horizon (if desired) to the latest NFL news - something totally unrelated and very useful for some sites.</p> <h1>Conclusion - it's an illusion</h1> <p>I consider the combination of all these a very rich sitemap building source. You can very easily generate millions of unique tags using the above techniques. 
Thus, "anything" you type will be found on example.com/tags. </p> <p>However, you have to note that this is just an <strong><em>illusion</em></strong>. For example, if you search for "ertfghedctgb" (easily typed on a regular QWERTY keyboard - ert + fgh + edc + tgb), you will most likely not get anything from Google (I currently do not). It just was not common enough for anybody to put this in their sitemaps (or not common enough for search engines to index it).</p>
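<p>To make the referral and sitemap steps described above more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not how any particular site actually does it: the function names (<code>terms_from_referer</code>, <code>tag_url</code>, <code>sitemap_xml</code>) are made up for this example, the tag URL layout is the fictional example.com/tags one used above, and it assumes the Referer header still carries a <code>q</code> parameter, which search engines are not guaranteed to send.</p>

```python
from urllib.parse import urlsplit, parse_qs, quote
from xml.sax.saxutils import escape


def terms_from_referer(referer):
    """Return the lowercase search terms found in the q parameter of a Referer URL.

    Returns an empty list when the URL has no q parameter.
    """
    query = urlsplit(referer).query
    # parse_qs decodes '+' and %xx escapes, so "langeleik+harpe" becomes "langeleik harpe".
    values = parse_qs(query).get("q", [])
    terms = []
    for value in values:
        terms.extend(word.lower() for word in value.split())
    return terms


def tag_url(term, base="http://example.com/tags/"):
    """Build a tag page URL for a term (hypothetical URL layout)."""
    return base + quote(term, safe="")


def sitemap_xml(urls):
    """Render a list of URLs as a minimal sitemaps.org <urlset> document."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    for url in urls:
        # escape() protects characters such as '&' that are special in XML.
        lines.append("  <url><loc>%s</loc></url>" % escape(url))
    lines.append("</urlset>")
    return "\n".join(lines)


if __name__ == "__main__":
    referer = "http://www.google.com/search?q=langeleik+harpe"
    known_tags = {"harpe"}  # tags the site already has (assumed for the example)
    new_terms = [t for t in terms_from_referer(referer) if t not in known_tags]
    print(sitemap_xml(tag_url(t) for t in new_terms))
```

<p>With the Referer from the example above, <code>terms_from_referer</code> returns <code>['langeleik', 'harpe']</code>, and only the term the site does not know yet ends up as a new sitemap entry. A real site would also want to filter out noise and stop words before creating tag pages.</p>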