<p>You are using Python 2.x, which tries to convert between <code>unicode</code> and plain <code>str</code> values automatically. The implicit conversion assumes ASCII, so it often fails on non-ASCII characters.</p>
<p>You shouldn't mix <code>unicode</code> and <code>str</code> values. You can either stick to <code>unicode</code>:</p>
<pre><code>ignorelist = (u'!', u'-', u'_', u'(', u')', u',', u'.', u':', u';', u'"', u'\'',
              u'?', u'#', u'@', u'$', u'^', u'&amp;', u'*', u'+', u'=', u'{', u'}',
              u'[', u']', u'\\', u'|', u'&lt;', u'&gt;', u'/', u'—')

if not isinstance(token, unicode):
    token = token.decode('utf-8')  # assumes you are using UTF-8

for punc in ignorelist:
    token = token.replace(punc, u' ')
</code></pre>
<p>or use only plain <code>str</code> values (note the last item):</p>
<pre><code>ignorelist = ('!', '-', '_', '(', ')', ',', '.', ':', ';', '"', '\'',
              '?', '#', '@', '$', '^', '&amp;', '*', '+', '=', '{', '}',
              '[', ']', '\\', '|', '&lt;', '&gt;', '/', u'—'.encode('utf-8'))
# the other parts do not need to change
</code></pre>
<p>Because you encode <code>u'—'</code> into a <code>str</code> yourself, Python no longer has to attempt the conversion on its own.</p>
<p>I suggest you use <code>unicode</code> throughout your program to avoid this kind of error. If that would be too much work, you can use the second method. Either way, take care when you call functions in the standard library or in third-party modules, since they may return or expect the other type.</p>
<p><code># -*- coding: utf-8 -*-</code> only tells Python that your source file is written in UTF-8 (without it, non-ASCII characters in the source raise a <code>SyntaxError</code>).</p>
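<p>As a rough sketch of the "unicode everywhere" approach, the decoding and the replacement loop can be wrapped in one helper that decodes byte strings once at the boundary. The names <code>strip_punctuation</code> and <code>PUNCTUATION</code> are hypothetical, and UTF-8 input is an assumption, as above. The helper checks for <code>bytes</code> (an alias of <code>str</code> on Python 2) so the same code also runs on Python 3:</p>

```python
# -*- coding: utf-8 -*-
# Hypothetical helper: names and the UTF-8 default are assumptions.

# All punctuation to blank out, kept as one text (unicode) string.
# \u2014 is the em dash from the original ignorelist.
PUNCTUATION = u'!-_(),.:;"\'?#@$^&*+={}[]\\|<>/\u2014'


def strip_punctuation(token, encoding='utf-8'):
    """Return token as text with each punctuation character replaced by a space."""
    # Decode byte strings exactly once, so that only text is handled below.
    # On Python 2, bytes is str; on Python 3, it is the bytes type.
    if isinstance(token, bytes):
        token = token.decode(encoding)
    for punc in PUNCTUATION:
        token = token.replace(punc, u' ')
    return token
```

<p>For example, calling <code>strip_punctuation</code> on the UTF-8 bytes of <code>a—b</code> gives the text <code>a b</code>, with no implicit ASCII conversion involved.</p>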
 
