StackOverflow2013


plurals

PO
primarykey
Id
13923698
data
AcceptedAnswerId
0
AnswerCount
0
ClosedDate
CommentCount
5
CommunityOwnedDate
CreationDate
2012-12-17T23:00:22.650
FavoriteCount
0
LastActivityDate
2012-12-18T01:02:47.643
LastEditDate
2012-12-18T01:02:47.643
LastEditorUserId
908494
OwnerUserId
908494
ParentId
13923355
PostTypeId
2
Score
5
ViewCount
0
LastEditorDisplayName
text
Body
First, <code>ISO-8859-1</code> isn't a valid coding declaration. You want <code>iso-8859-1</code>. If you look at <a href="http://docs.python.org/2/library/codecs.html" rel="nofollow">the docs</a>, you can call this <code>latin_1</code>, <code>iso-8859-1</code>, <code>iso8859-1</code>, <code>8859</code>, <code>cp819</code>, <code>latin</code>, <code>latin1</code>, or <code>L1</code>, but not <code>ISO-8859-1</code>.
It looks like <code>codecs.lookup</code> bends over backward to accept bad input, including doing case-insensitive lookups. If you trace <a href="http://hg.python.org/cpython/file/2.7/Lib/codecs.py" rel="nofollow"><code>codecs.lookup</code></a> through <a href="http://hg.python.org/cpython/file/2.7/Modules/_codecsmodule.c" rel="nofollow"><code>_codecs.lookup</code></a> to <a href="http://hg.python.org/cpython/file/2.7/Python/codecs.c" rel="nofollow"><code>_PyCodec_Lookup</code></a>, you can see this comment:
<pre><code>/* Convert the encoding to a normalized Python string: all characters are converted to lower case, spaces and hyphens are replaced with underscores. */
</code></pre>
But source file decoding doesn't go through the same codec lookup process. Because it happens at compile time rather than runtime, there's no reason for it to do so. (At any rate, saying "It seems to work, even though the docs say it's wrong… so why doesn't it quite work right?" is kind of silly in the first place.)
To demonstrate, if I create two Latin-1 files:
badcode.py:
<pre><code># -*- coding: ISO-8859-1 -*-
print u"Vérifier l'affichage de cette chaîne"
</code></pre>
goodcode.py:
<pre><code># -*- coding: iso-8859-1 -*-
print u"Vérifier l'affichage de cette chaîne"
</code></pre>
The first one fails, the second succeeds.
Now, why does it "work" when it's going to console but raise an exception when piped? Well, when you print to a Windows console, or a Unix TTY, Python has some code to try to guess the right encoding to use. (I'm not sure what happens under the covers on Windows; it might even be using UTF-16 output, for all I know.) When you're not printing to a console/TTY, it can't do this, so you have to specify the encoding explicitly.
You can see some of what's going on by looking at <code>sys.stdout.isatty()</code>, <code>sys.stdout.encoding</code>, and <code>sys.getdefaultencoding()</code>. Here's what I see on a Mac in different cases (those three values in that order, followed by what the print produced):
<ul>
<li>Python 2, no redirect: <code>True, UTF-8, ascii, Vérifier</code></li>
<li>Python 3, no redirect: <code>True, UTF-8, utf-8, Vérifier</code></li>
<li>Python 2, redirect: <code>False, None, ascii, UnicodeEncodeError</code></li>
<li>Python 3, redirect: <code>False, UTF-8, utf-8, Vérifier</code></li>
</ul>
If <code>isatty()</code>, <code>encoding</code> will be an appropriate encoding for the TTY; otherwise, <code>encoding</code> will be the default value, which is <code>None</code> (meaning <code>ascii</code>) in 2.x, and (I think; I'd have to check the code) something based on <code>getdefaultencoding()</code> in 3.x. That means that if you try to print Unicode while <code>stdout</code> is not a TTY in 2.x, it will try to encode it as <code>ascii</code>, <code>strict</code>, which will fail if you've got non-ASCII characters.
If you somehow know what codec you want to use, you can deal with this manually by checking <code>isatty()</code> and encoding to that codec (or even to <code>ascii</code>, <code>ignore</code> instead of <code>strict</code>, if you prefer) whenever you print, instead of trying to print Unicode. (If you know what codec you want, you may want to do this even in 3.x, since defaulting to UTF-8 isn't too helpful if you're trying to generate, say, Windows-1252 files…)
The difference there actually has nothing to do with Latin-1. Try this out:
nocode.py:
<pre><code>print u"V\xe9rifier l'affichage de cette cha\xeene"
print u"V\u00e9rifier l'affichage de cette cha\u00eene"
</code></pre>
I get the Unicode strings encoded to UTF-8 for my Mac terminal, and (apparently) Windows-1252 to my Windows cmd window, but an exception redirecting to a file.
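The manual approach mentioned above (check <code>isatty()</code>, then encode to a codec you picked yourself) might look roughly like the sketch below. The helper names <code>pick_encoding</code> and <code>encode_for</code> and the <code>wanted</code> parameter are made up for illustration, and the UTF-8 default is only a placeholder for whatever codec you actually know you need.

```python
import sys


def pick_encoding(stream, wanted="utf-8"):
    """Choose an encoding for text written to stream.

    Hypothetical helper: trusts the stream's own encoding only when the
    stream is a TTY, and otherwise uses wanted, the codec you decided on.
    """
    is_tty = getattr(stream, "isatty", lambda: False)()
    tty_encoding = getattr(stream, "encoding", None)
    if is_tty and tty_encoding:
        return tty_encoding
    return wanted


def encode_for(stream, text, wanted="utf-8", errors="strict"):
    """Encode text to bytes with the encoding chosen by pick_encoding."""
    return text.encode(pick_encoding(stream, wanted), errors)


message = u"V\xe9rifier l'affichage de cette cha\xeene"

# Show which encoding would be chosen for the real stdout and the bytes
# that would be written; errors="replace" keeps the demo from raising on
# a terminal whose encoding cannot represent the accented characters.
print(pick_encoding(sys.stdout))
print(repr(encode_for(sys.stdout, message, errors="replace")))
```

Passing <code>errors="ignore"</code> gives the <code>ascii</code>, <code>ignore</code> behavior mentioned above. To actually emit the bytes you would write them to <code>sys.stdout</code> in 2.x, or to <code>sys.stdout.buffer</code> in 3.x.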
Tags
Title
singulars
PostAcceptedAnswerId
1. This table or related slice is empty.
PostParentId
1. PO Why python does not behave the same when printing unicode strings in console and pipes?
 singulars
 PostTypePostTypeId
 PT Question
PostTypePostTypeId
1. PT Answer
UserLastEditorUserId
1. US abarnert
UserOwnerUserId
1. US abarnert
plurals
PostLinksPostIdRelatedPostId
1. This table or related slice is empty.
PostLinksRelatedPostIdPostId
1. This table or related slice is empty.
PostsAcceptedAnswerId
1. This table or related slice is empty.
PostsParentIdCreationDate
1. This table or related slice is empty.
VotesPostIdCreationDate
1. VO
 singulars
 PostPostId
 PO
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VT UpMod
2. VO
 singulars
 PostPostId
 PO
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VT UpMod
3. VO
 singulars
 PostPostId
 PO
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VT UpMod
CommentsPostId
