StackOverflow2013


plurals

POWhat encoding scheme should be used in a web project?
primarykey
Id
3607459
data
AcceptedAnswerId
3609414
AnswerCount
2
ClosedDate
CommentCount
2
CommunityOwnedDate
CreationDate
2010-08-31T08:53:26.283
FavoriteCount
2
LastActivityDate
2010-08-31T13:46:44.097
LastEditDate
LastEditorUserId
0
OwnerUserId
165629
ParentId
0
PostTypeId
1
Score
8
ViewCount
1271
LastEditorDisplayName
text
Body
We are building a (Java) web project with Eclipse. By default, Eclipse uses <code>Cp1252</code> encoding on Windows machines (which we use). As we also have developers in China (in addition to Europe), I started to wonder whether that is really the encoding to use. My initial thought was to convert to <code>UTF-8</code>, because "it supports all the character sets". However, is this really wise? Should we pick some other encoding instead? I see a couple of issues:
1) How do web browsers interpret the files by default? Does it depend on which language version one is using? What I am after here is whether we should explicitly declare the encoding schemes used: <ul> <li>XHTML files can set the encoding explicitly using <code><?xml version='1.0' encoding='UTF-8' ?></code> declarations.</li> <li>CSS files can set this with <code>@CHARSET "UTF-8";</code>.</li> <li>JavaScript files do not have in-file declarations, but one can globally define <code><meta http-equiv="Content-Script-Type" content="text/javascript; charset=utf-8"></code> or use <code><script type="text/javascript" charset="utf-8"></code> for specific scripts.</li> </ul> What if we leave a CSS file without the <code>@CHARSET "UTF-8";</code> declaration? How does the browser decide how it is encoded?
2) Is it wise to use UTF-8, given that it is so flexible? By locking our code into <code>Cp1252</code> (or maybe <code>ISO-8859-1</code>), I can ensure that foreign developers don't introduce special characters into files. This effectively prevents them from inserting Chinese comments, for example (we should use 100% English). Also, allowing UTF-8 can let developers accidentally introduce strange characters that are difficult or impossible to perceive with the human eye. This occurs when people, for example, copy-paste text or accidentally press some weird keyboard combination. It would seem that allowing UTF-8 in the project just brings problems...
3) For internationalization, I initially considered UTF-8 a good thing ("how can you add translations if the file encoding doesn't support the characters one needs?"). However, as it turned out, Java Resource Bundles (.properties files) must be encoded with ISO-8859-1, because otherwise they might break. Instead, the international characters are converted into <code>\uXXXX</code> notation, for example <code>\u0009</code>, and the files are encoded with <code>ISO-8859-1</code>. So... we are not even able to use UTF-8 for this. For binary files... well, the encoding scheme doesn't really matter (I suppose one can say it doesn't even exist). How should we approach these issues?
Tags
<utf-8><character-encoding><special-characters>
Title
What encoding scheme should be used in a web project?
singulars
PostAcceptedAnswerId
1. PO
 singulars
 PostTypePostTypeId
 PTAnswer
PostParentId
1. This table or related slice is empty.
PostTypePostTypeId
1. PTQuestion
UserLastEditorUserId
1. This table or related slice is empty.
UserOwnerUserId
1. USTuukka Mustonen
plurals
PostLinksPostIdRelatedPostId
1. PL
 singulars
 LinkTypeLinkTypeId
 LTLinked
PostLinksRelatedPostIdPostId
1. This table or related slice is empty.
PostsAcceptedAnswerId
1. This table or related slice is empty.
PostsParentIdCreationDate
1. PO
 singulars
 PostTypePostTypeId
 PTAnswer
2. PO
 singulars
 PostTypePostTypeId
 PTAnswer
VotesPostIdCreationDate
1. VO
 singulars
 PostPostId
 POWhat encoding scheme should be used in a web project?
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
2. VO
 singulars
 PostPostId
 POWhat encoding scheme should be used in a web project?
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
3. VO
 singulars
 PostPostId
 POWhat encoding scheme should be used in a web project?
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
CommentsPostId
1. CODid you read this? http://www.joelonsoftware.com/articles/Unicode.html
 singulars
 PostPostId
 POWhat encoding scheme should be used in a web project?
 UserUserId
 USGustyWind
2. CO@GustyWind: I haven't read that specifically. I will check it out, thanks. @Kwebble: Wikipedia states that "A resource bundle is a Java .properties file that contains locale-specific data" and that "The encoding of a .properties file is ISO-8859-1, also known as Latin-1". Is there a conflict here? I didn't know about the XML format for properties; that's nice to know, although XML is such a verbose syntax :/
 singulars
 PostPostId
 POWhat encoding scheme should be used in a web project?
 UserUserId
 USTuukka Mustonen
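
The <code>\uXXXX</code> convention for .properties files mentioned in the question and in the comment above can be illustrated with a minimal Java sketch. The class name <code>PropertiesEscapeDemo</code>, the helper methods, and the <code>greeting</code> key are hypothetical and exist only for this illustration. The escaping helper is a simplification: it does not handle backslashes or other characters that are special in the .properties syntax.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical demo class; not part of any project discussed in the question.
class PropertiesEscapeDemo {

    // Replaces every character outside printable ASCII with its \uXXXX escape.
    // Simplified: backslashes and other .properties metacharacters are left as-is.
    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (c < 0x20 || c > 0x7E) {
                out.append(String.format("\\u%04X", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Builds a one-line .properties file in memory, encodes it as ISO-8859-1,
    // and reads the value back through Properties.load(InputStream).
    static String loadGreeting(String escapedValue) {
        String fileContent = "greeting=" + escapedValue + "\n";
        byte[] bytes = fileContent.getBytes(StandardCharsets.ISO_8859_1);
        Properties props = new Properties();
        try {
            props.load(new ByteArrayInputStream(bytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty("greeting");
    }

    public static void main(String[] args) {
        // Two Chinese characters, written as escapes so this source file stays pure ASCII.
        String original = "\u4F60\u597D";
        String escaped = escapeNonAscii(original);
        System.out.println(escaped);
        System.out.println(loadGreeting(escaped).equals(original));
    }
}
```

The escaped text contains only ASCII characters, so it survives an ISO-8859-1 file unchanged, and <code>Properties.load(InputStream)</code> turns the escapes back into the original characters. If a different file encoding is wanted, <code>Properties.load(Reader)</code> (available since Java 6) accepts a reader created with an explicit charset.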
