How to guess the REAL title of an HTML document?
    <p>A lot of people put extremely useless and annoying stuff in their <code>&lt;title&gt;</code> tags, and I'm trying to come up with JavaScript code that extracts only the interesting part.</p>
    <p>For example, on a Google search you get this document title:<br> <a href="http://www.google.com/search?q=some+random+search" rel="nofollow">some random search - Google Search</a><br> The "Google Search" part is redundant, because you already have this information in the domain name (and the favicon). In this example I <strong>only</strong> want this part:<br> <code>some random search</code></p>
    <p>Most site authors probably use the "dash notation", which looks like this:<br> <code>Site name - Title</code> or<br> <code>Title - Site name</code><br> But if it were that easy I wouldn't be asking here. ;)</p>
    <p>There are also some really annoying cases where the title isn't present in the <code>&lt;title&gt;</code> tag at all. (Oh the irony!) Just have a look at this page from the NY Times: <a href="http://www.nytimes.com/2011/02/16/technology/16internet.html" rel="nofollow">Egypt’s Autocrats Exploited Internet’s Weaknesses - NYTimes.com</a>. The actual headline of the article, however, is: <code>Egypt Leaders Found ‘Off’ Switch for Internet</code>. What the f***, New York Times?</p>
    <p>What's the most reliable approach to extract this information, assuming that we have access to the page's DOM? I think a good starting point would be the <code>&lt;h1&gt;</code> tag, but it isn't reliable on its own. I imagine that there are a lot of authors who don't use it at all or use it multiple times.</p>
    <p><strong>Update: The combination of the <code>&lt;title&gt;</code> and <code>&lt;h1&gt;</code> content seems reasonable to me. Thanks to all of you who have suggested it. But what if there is no <code>&lt;h1&gt;</code> tag? I think some (admittedly, bad) authors don't use them and instead just specify the font-size of a <code>&lt;div&gt;</code> or <code>&lt;span&gt;</code>.</strong></p>
    <p>I'm currently creating my very first browser extension. (Isn't that nice?) It has a feature that lets you save the current tab, so it should work generally and for as many pages as possible.</p>
    <p>Thanks to all of you! :)</p>
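To make the <code>&lt;title&gt;</code> plus <code>&lt;h1&gt;</code> idea from the update concrete, here is a minimal sketch of one possible heuristic. It is not a definitive solution. The function name `guessTitle`, the helper `normalize`, the separator list, and the hostname check are all assumptions made for illustration. The sketch splits the title on common separators, keeps a segment that an `<h1>` confirms, otherwise drops segments that look like the site name, and finally falls back to the longest remaining segment.

```javascript
// Heuristic sketch only: all names and the separator list are illustrative assumptions.
// Separators must be surrounded by whitespace, e.g. "Title - Site" or "Site | Title".
const SEPARATORS = /\s+[-|:\u2013\u2014\u00bb\u00b7]\s+/;

// Lowercase and collapse whitespace so comparisons ignore trivial differences.
function normalize(text) {
  return text.toLowerCase().replace(/\s+/g, " ").trim();
}

// docTitle: the raw <title> text
// headings: text content of the page's <h1> elements (may be empty)
// hostname: e.g. "www.google.com" (may be empty)
function guessTitle(docTitle, headings = [], hostname = "") {
  const segments = docTitle
    .split(SEPARATORS)
    .map((s) => s.trim())
    .filter(Boolean);
  if (segments.length <= 1) return segments[0] || "";

  // 1. Prefer a title segment that an <h1> repeats verbatim.
  const headingSet = new Set(headings.map(normalize));
  const confirmed = segments.find((s) => headingSet.has(normalize(s)));
  if (confirmed) return confirmed;

  // 2. Drop segments that contain a hostname label (likely the site name).
  //    The last label (the TLD), "www", and very short labels are ignored.
  const labels = hostname
    .toLowerCase()
    .split(".")
    .slice(0, -1)
    .filter((label) => label !== "www" && label.length >= 3);
  const remaining = segments.filter((s) => {
    const squashed = normalize(s).replace(/[^a-z0-9]/g, "");
    return !labels.some((label) => squashed.includes(label));
  });

  // 3. Fall back to the longest candidate.
  const candidates = remaining.length > 0 ? remaining : segments;
  return candidates.reduce((best, s) => (s.length > best.length ? s : best));
}

// In a browser extension, the inputs could be gathered like this:
//   const headings = [...document.querySelectorAll("h1")].map((h) => h.textContent);
//   const title = guessTitle(document.title, headings, location.hostname);
```

For the Google example this returns `some random search`. For the NY Times page it only strips `NYTimes.com` from the title; it does not decide that the `<h1>` headline is the "real" title, and pages without any `<h1>` simply skip step 1. Whether to prefer the heading over the title in such cases is left open here.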