StackOverflow2013


plurals

PO
primarykey
Id
1447720
data
AcceptedAnswerId
0
AnswerCount
0
ClosedDate
CommentCount
10
CommunityOwnedDate
CreationDate
2009-09-19T04:18:00.080
FavoriteCount
0
LastActivityDate
2009-09-22T12:34:50.353
LastEditDate
2017-05-23T12:26:00.663
LastEditorUserId
-1
OwnerUserId
149808
ParentId
887148
PostTypeId
2
Score
29
ViewCount
0
LastEditorDisplayName
text
Body
I asked the same question: <a href="https://stackoverflow.com/questions/1233076/handling-character-encoding-in-uri-on-tomcat">Handling Character Encoding in URI on Tomcat</a>. I recently found a solution that works pretty well for me. You might want to give it a try. Here is what you need to do:
<ol>
<li>Leave your URI encoding as Latin-1. On Tomcat, add URIEncoding="ISO-8859-1" to the Connector in server.xml.</li>
<li>If you have to URL-decode manually, use Latin-1 as the charset as well.</li>
<li>Use the fixEncoding() function to fix up encodings.</li>
</ol>
For example, to get a parameter from the query string:
<pre><code>
String name = fixEncoding(request.getParameter("name"));
</code></pre>
You can always do this. A string that was already decoded correctly is left unchanged. The code is attached. Good luck!
<pre><code>
import java.io.UnsupportedEncodingException;

public static String fixEncoding(String latin1) {
    if (latin1 == null) {
        return null; // getParameter() returns null for a missing parameter
    }
    try {
        byte[] bytes = latin1.getBytes("ISO-8859-1");
        if (!validUTF8(bytes))
            return latin1;
        return new String(bytes, "UTF-8");
    } catch (UnsupportedEncodingException e) {
        // Impossible, throw unchecked
        throw new IllegalStateException("No Latin1 or UTF-8: " + e.getMessage());
    }
}

public static boolean validUTF8(byte[] input) {
    int i = 0;
    // Check for BOM
    if (input.length >= 3 && (input[0] & 0xFF) == 0xEF
            && (input[1] & 0xFF) == 0xBB && (input[2] & 0xFF) == 0xBF) {
        i = 3;
    }

    int end;
    for (int j = input.length; i < j; ++i) {
        int octet = input[i];
        if ((octet & 0x80) == 0) {
            continue; // ASCII
        }

        // Check for UTF-8 leading byte
        if ((octet & 0xE0) == 0xC0) {
            end = i + 1;
        } else if ((octet & 0xF0) == 0xE0) {
            end = i + 2;
        } else if ((octet & 0xF8) == 0xF0) {
            end = i + 3;
        } else {
            // Not a leading byte; a UTF-8 sequence has at most 3 trailing bytes
            return false;
        }

        if (end >= j) {
            // Truncated sequence: the trailing bytes would run past the input
            return false;
        }

        while (i < end) {
            i++;
            octet = input[i];
            if ((octet & 0xC0) != 0x80) {
                // Not a valid trailing byte
                return false;
            }
        }
    }
    return true;
}
</code></pre>
EDIT: Your approach doesn't work for various reasons. When there are encoding errors, you can't count on what you are getting from Tomcat. Sometimes you get � or ?.
Other times you get nothing at all: getParameter() returns null. Even if you could check for "?", what happens when your query string contains a valid "?"? Besides, you shouldn't reject any request. This is not your user's fault. As I mentioned in my original question, the browser may encode the URL in either UTF-8 or Latin-1. The user has no control over this. You need to accept both. Changing your servlet to Latin-1 will preserve all the characters, even if they are wrong, which gives us a chance to fix them up or throw them away. The solution I posted here is not perfect, but it's the best one we have found so far.
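The round trip described above can be sketched as a small self-contained program. This is only an illustration under stated assumptions, not the code from the answer: the class name FixEncodingSketch is hypothetical, and the hand-written validator is swapped for the JDK's strict CharsetDecoder, which throws on malformed input rather than substituting a replacement character. The strict decoder is also pickier than the validator in the answer (for example, it rejects overlong sequences), so the two may disagree on unusual input.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical class name, used only for this sketch.
class FixEncodingSketch {

    /** Re-decodes a Latin-1-decoded string as UTF-8 when its bytes form valid UTF-8. */
    static String fixEncoding(String latin1) {
        if (latin1 == null) {
            return null; // a missing parameter stays missing
        }
        // Latin-1 maps each char 0..255 to one byte, so this recovers the raw bytes.
        byte[] bytes = latin1.getBytes(StandardCharsets.ISO_8859_1);
        try {
            // REPORT makes the decoder throw instead of substituting U+FFFD.
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes))
                    .toString();
        } catch (CharacterCodingException e) {
            return latin1; // not valid UTF-8, so keep the Latin-1 reading
        }
    }

    public static void main(String[] args) {
        String expected = "caf\u00e9"; // "cafe" with an acute accent on the e

        // Simulate a browser that sent UTF-8 while the server decoded as Latin-1.
        String utf8SeenAsLatin1 = new String(
                expected.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
        System.out.println(fixEncoding(utf8SeenAsLatin1).equals(expected)); // prints "true"

        // A string the browser really sent as Latin-1 is returned unchanged.
        System.out.println(fixEncoding(expected).equals(expected)); // prints "true"
    }
}
```

Using the decoder keeps the validation and the conversion in a single step, at the cost of relying on an exception for the Latin-1 case.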
Tags
Title
singulars
PostAcceptedAnswerId
1. This table or related slice is empty.
PostParentId
1. POHow to determine if a String contains invalid encoded characters
 singulars
 PostTypePostTypeId
 PTQuestion
PostTypePostTypeId
1. PTAnswer
UserLastEditorUserId
1. USCommunity
UserOwnerUserId
1. USZZ Coder
plurals
PostLinksPostIdRelatedPostId
1. This table or related slice is empty.
PostLinksRelatedPostIdPostId
1. This table or related slice is empty.
PostsAcceptedAnswerId
1. POHow to determine if a String contains invalid encoded characters
 singulars
 PostTypePostTypeId
 PTQuestion
PostsParentIdCreationDate
1. This table or related slice is empty.
VotesPostIdCreationDate
1. VO
 singulars
 PostPostId
 PO
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
2. VO
 singulars
 PostPostId
 PO
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
3. VO
 singulars
 PostPostId
 PO
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
CommentsPostId

Querying!

Guidance

A row detail

Detail views are divided into sections. All the information in the data section comes from columns in the selected row. The other sections display data from other, related rows.

Related data is linked in either a to-one or a to-many fashion. The caption of each to-many relation links to a list view that shows the related table filtered to the matching rows.

Try moving around until you find a non-empty to-many entry and click on the label to get to one. You can move back to the root by clicking on the database name in the header.