Parsing HTML to fix microtypography & glyph issues
<p>I'm interested in <a href="http://en.wikipedia.org/wiki/Microtypography" rel="nofollow noreferrer">microtypography</a> issues on the web.</p>
<p>I want a tool to fix:</p>
<ul>
<li>Quotes <ul> <li>“ (&amp;#8220;) opening quote (instead of ")</li> <li>” (&amp;#8221;) closing quote (instead of ")</li> </ul></li>
<li>Apostrophe <ul> <li>’ (&amp;#8217;) apostrophe (instead of ')</li> </ul></li>
<li>Dashes and hyphens <ul> <li>– (&amp;#8211; or &amp;ndash;) en dash, used for ranges, e.g. “13–15 November” (instead of -)</li> <li>— (&amp;#8212; or &amp;mdash;) em dash, used for a change of thought, e.g. “Star Wars is—as everyone knows—amazing.” (instead of - or --)</li> </ul></li>
<li>Ellipsis <ul> <li>… (&amp;#8230; or &amp;hellip;) horizontal ellipsis, used to indicate an omission or a pause (instead of ...)</li> </ul></li>
<li>And more \o/</li>
</ul>
<p>All of these fixes depend on the content language. In French, for example, we must add a non-breaking (<em>insécable</em>) space before every two-part punctuation mark (<code>:</code>, <code>;</code>, <code>…</code>, <code>?</code>, <code>!</code>, ...), and our quotes are « like this ».</p>
<p>Such a tool has many constraints:</p>
<ul>
<li>it must not edit anything inside protected tags (<code>pre</code>, <code>code</code>...)</li>
<li>it must be fast (it runs on CMS output)</li>
<li>it must not break the HTML</li>
<li>and so on.</li>
</ul>
<p>Some tools already exist:</p>
<ul>
<li><a href="http://michelf.ca/projects/php-smartypants/typographer/" rel="nofollow noreferrer">http://michelf.ca/projects/php-smartypants/typographer/</a></li>
<li><a href="http://kingdesk.com/projects/php-typography/" rel="nofollow noreferrer">http://kingdesk.com/projects/php-typography/</a></li>
<li><a href="http://code.google.com/p/typogrify/" rel="nofollow noreferrer">http://code.google.com/p/typogrify/</a></li>
</ul>
<p>They are all more or less based on SmartyPants, a 2005 library that is untested, undocumented, parses HTML by hand, and handles no rules other than English. Hell no.</p>
<p>So my questions are:</p>
<ul>
<li>Do you know of any decent tool like this?</li>
<li>How can I do it? I already have a proof of concept using <a href="http://symfony.com/doc/current/components/dom_crawler.html" rel="nofollow noreferrer">DomCrawler</a>, but I'm not convinced. What's the best way to parse and edit HTML in PHP?</li>
</ul>
<hr>
<p><strong>Edit July 2013</strong>: I have developed <a href="https://github.com/jolicode/JoliTypo" rel="nofollow noreferrer">JoliTypo</a> from the tests and expertise I gained with this issue. No existing library did what I wanted.</p>
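<p>To make the constraints above concrete, here is a minimal sketch of the general approach: tokenize the HTML with a real parser, re-emit every tag untouched, and rewrite only the text that sits outside protected tags. It is written in Python with the standard-library <code>html.parser</code> purely for illustration; it is not the code of JoliTypo or of any library listed above. The names <code>PROTECTED</code>, <code>fix_text</code>, <code>TypoFixer</code> and <code>fix_html</code> are hypothetical, and the rules cover English only and are deliberately naive.</p>

```python
import re
from html.parser import HTMLParser

# Assumption: tags whose content must be left alone (hypothetical list).
PROTECTED = {"pre", "code", "script", "style", "kbd", "samp", "textarea"}


def fix_text(text):
    """Apply a few naive English-only typographic rules to plain text."""
    text = text.replace("...", "\u2026")              # horizontal ellipsis
    text = text.replace("--", "\u2014")               # em dash
    text = re.sub(r"(?<=\d)-(?=\d)", "\u2013", text)  # en dash in ranges (also hits dates)
    text = re.sub(r"(?<=\w)'(?=\w)", "\u2019", text)  # apostrophe inside a word
    text = re.sub(r'"(?=\w)', "\u201c", text)         # opening quote before a word
    text = text.replace('"', "\u201d")                # remaining quotes are closing
    return text


class TypoFixer(HTMLParser):
    """Re-emit markup verbatim; rewrite text only outside protected tags."""

    def __init__(self):
        # convert_charrefs=False keeps entities such as &amp; as separate tokens.
        super().__init__(convert_charrefs=False)
        self.out = []
        self.protected_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in PROTECTED:
            self.protected_depth += 1
        self.out.append(self.get_starttag_text())     # original tag text, attributes intact

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in PROTECTED and self.protected_depth > 0:
            self.protected_depth -= 1
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data if self.protected_depth else fix_text(data))

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")

    def handle_pi(self, data):
        self.out.append(f"<?{data}>")


def fix_html(html):
    parser = TypoFixer()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


if __name__ == "__main__":
    print(fix_html('<p>He said "wait..." -- it\'s 13-15 May.</p><pre>a -- "b"...</pre>'))
```

<p>Because each text node is processed on its own, a quotation that spans several tags (for example a quote mark directly followed by <code>&lt;em&gt;</code>) can be misclassified. Handling that, and language-specific rules such as French non-breaking spaces, would need state shared across text nodes and a per-locale rule set.</p>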