<p>After the operations you performed, you'll see:</p>
<pre><code>&gt;&gt;&gt; req.headers['content-type']
'text/html; charset=windows-1251'
</code></pre>
<p>and so:</p>
<pre><code>&gt;&gt;&gt; encoding = req.headers['content-type'].split('charset=')[-1]
&gt;&gt;&gt; ucontent = unicode(content, encoding)
</code></pre>
<p><code>ucontent</code> is now a Unicode string (of 140655 characters), so for example, to display part of it when your terminal is UTF-8:</p>
<pre><code>&gt;&gt;&gt; print ucontent[76:110].encode('utf-8')
&lt;title&gt;Lenta.ru: Главное: &lt;/title&gt;
</code></pre>
<p>and you can search, slice, and so on.</p>
<p>Edit: Unicode I/O is usually tricky (this may be what's holding up the original asker), but I'm going to bypass the difficult problem of inputting Unicode strings into an interactive Python interpreter, which is unrelated to the original question. Once a Unicode string IS correctly input (I'm doing it by codepoints -- goofy but not tricky;-), searching is a no-brainer, so hopefully the original question has been thoroughly answered. Again assuming a UTF-8 terminal:</p>
<pre><code>&gt;&gt;&gt; x = u'\u0413\u043b\u0430\u0432\u043d\u043e\u0435'
&gt;&gt;&gt; print x.encode('utf-8')
Главное
&gt;&gt;&gt; x in ucontent
True
&gt;&gt;&gt; ucontent.find(x)
93
</code></pre>
<p><strong>Note</strong>: Keep in mind that this method may not work for all sites, since some sites specify the character encoding only inside the served document (using http-equiv meta tags, for example) rather than in the HTTP <code>Content-Type</code> header.</p>
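<p>For the case in the note above, where the header carries no charset, one possible fallback is to look for a charset declaration in the first bytes of the document. The following is only a rough sketch under stated assumptions: it targets Python 3 (the session above is Python 2), the helper names <code>charset_from_header</code>, <code>charset_from_meta</code> and <code>decode_body</code> are hypothetical and not part of the original answer, and the regular expression is a simplification rather than a full HTML encoding sniffer.</p>

```python
import re
from email.message import Message

# Hypothetical helpers for illustration; none of these names come from the
# original answer. The pattern is a simplification: it finds the first
# charset=... inside a <meta ...> tag and does not parse HTML properly.
META_CHARSET_RE = re.compile(
    rb'<meta[^>]+charset\s*=\s*["\']?\s*([A-Za-z0-9_.:-]+)', re.IGNORECASE)


def charset_from_header(content_type):
    """Return the charset parameter of a Content-Type value, or None."""
    if not content_type:
        return None
    msg = Message()
    msg['Content-Type'] = content_type
    return msg.get_content_charset()


def charset_from_meta(raw_bytes, scan_limit=4096):
    """Look for a charset declaration in a <meta> tag near the top of the page."""
    match = META_CHARSET_RE.search(raw_bytes[:scan_limit])
    return match.group(1).decode('ascii').lower() if match else None


def decode_body(raw_bytes, content_type, default='utf-8'):
    """Decode a response body: header charset first, then <meta>, then a default."""
    encoding = (charset_from_header(content_type)
                or charset_from_meta(raw_bytes)
                or default)
    try:
        return raw_bytes.decode(encoding, errors='replace')
    except LookupError:
        # The declared encoding name is unknown to Python; fall back.
        return raw_bytes.decode(default, errors='replace')
```

<p>Unlike splitting the header on <code>'charset='</code>, this sketch returns no header charset at all when the parameter is missing, so the fallback can take over instead of passing <code>'text/html'</code> to the decoder as if it were an encoding name. The default of UTF-8 is an assumption; a site-specific default may be a better choice.</p>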
 
