StackOverflow2013

plurals

POHow to determine if a String contains invalid encoded characters
primarykey
Id
887148
data
AcceptedAnswerId
1447720
AnswerCount
10
ClosedDate
CommentCount
0
CommunityOwnedDate
CreationDate
2009-05-20T10:11:34.030
FavoriteCount
16
LastActivityDate
2015-12-02T05:21:53.270
LastEditDate
2009-09-23T05:44:03.647
LastEditorUserId
16193
OwnerUserId
16193
ParentId
0
PostTypeId
1
Score
30
ViewCount
83721
LastEditorDisplayName
text
Body
Usage scenario

We have implemented a web service that our web frontend developers use internally (via a PHP API) to display product data. On the website the user enters something (e.g. a query string). Internally the website calls the service via the API.

Note: We use Restlet, not Tomcat.

Original Problem

Firefox 3.0.10 seems to respect the encoding selected in the browser and encodes a URL accordingly. This results in different query strings for ISO-8859-1 and UTF-8.

Our website forwards the user's input without converting it (which it should do), so it may call the service via the API with a query string that contains German umlauts.

For example, for a query part looking like

<pre><code>...v=abcädef
</code></pre>

if "ISO-8859-1" is selected, the sent query part looks like

<pre><code>...v=abc%E4def
</code></pre>

but if "UTF-8" is selected, the sent query part looks like

<pre><code>...v=abc%C3%A4def
</code></pre>

Desired Solution

Since we implemented the service and therefore control it, we want to check on the server side whether the call contains non-UTF-8 characters and, if so, respond with a 4xx HTTP status.

Current Solution In Detail

Check for each character ( == string.substring(i,i+1) )

<ol>
<li>if character.getBytes()[0] equals 63 for '?'</li>
<li>if Character.getType(character.charAt(0)) returns OTHER_SYMBOL</li>
</ol>

Code

<pre><code>protected List< String > getNonUnicodeCharacters( String s ) {
  final List< String > result = new ArrayList< String >();
  for ( int i = 0 , n = s.length() ; i < n ; i++ ) {
    final String character = s.substring( i , i + 1 );
    final boolean isOtherSymbol =
      ( int ) Character.OTHER_SYMBOL
       == Character.getType( character.charAt( 0 ) );
    final boolean isNonUnicode = isOtherSymbol
      && character.getBytes()[ 0 ] == ( byte ) 63;
    if ( isNonUnicode )
      result.add( character );
  }
  return result;
}
</code></pre>

Question

Will this catch all invalid (non-UTF-8-encoded) characters? Does any of you have a better (easier) solution?

Note: I checked URLDecoder with the following code

<pre><code>final String[] test = new String[]{
  "v=abc%E4def",
  "v=abc%C3%A4def"
};
for ( int i = 0 , n = test.length ; i < n ; i++ ) {
    System.out.println( java.net.URLDecoder.decode(test[i],"UTF-8") );
    System.out.println( java.net.URLDecoder.decode(test[i],"ISO-8859-1") );
}
</code></pre>

This prints:

<pre><code>v=abc?def
v=abcädef
v=abcädef
v=abcÃ¤def
</code></pre>

and it does not throw an IllegalArgumentException. *sigh*
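
For illustration only, here is a minimal sketch of one way the desired server-side check could look. It is not taken from the post. It assumes the raw, still percent-encoded query part is available as an ASCII string. The class name `Utf8QueryCheck` and its methods are hypothetical. The idea is to percent-decode into bytes without choosing a charset, then run those bytes through a strict `CharsetDecoder` that reports malformed input instead of silently replacing it, which is what `URLDecoder` does.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the original post.
final class Utf8QueryCheck {

    /** Percent-decodes an ASCII query part into raw bytes without interpreting any charset. */
    static byte[] percentDecodeToBytes(String encoded) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int len = encoded.length();
        for (int i = 0; i < len; i++) {
            char c = encoded.charAt(i);
            if (c == '%') {
                if (i + 2 >= len) {
                    throw new IllegalArgumentException("Truncated escape at index " + i);
                }
                int hi = Character.digit(encoded.charAt(i + 1), 16);
                int lo = Character.digit(encoded.charAt(i + 2), 16);
                if (hi < 0 || lo < 0) {
                    throw new IllegalArgumentException("Bad hex digits at index " + i);
                }
                out.write((hi << 4) | lo);
                i += 2; // skip the two hex digits
            } else if (c > 0x7F) {
                // Assumption: a correctly encoded query part contains only ASCII.
                throw new IllegalArgumentException("Unescaped non-ASCII character at index " + i);
            } else {
                out.write(c);
            }
        }
        return out.toByteArray();
    }

    /** Returns true if the bytes form a well-formed UTF-8 sequence. */
    static boolean isValidUtf8(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    /** Convenience wrapper: percent-decode, then validate as UTF-8. */
    static boolean isValidUtf8Query(String encoded) {
        return isValidUtf8(percentDecodeToBytes(encoded));
    }

    public static void main(String[] args) {
        System.out.println(isValidUtf8Query("v=abc%E4def"));    // false: 0xE4 is not followed by continuation bytes
        System.out.println(isValidUtf8Query("v=abc%C3%A4def")); // true: 0xC3 0xA4 is the UTF-8 encoding of "ä"
    }
}
```

With a check like this, the service could answer a `false` result with a 4xx status. Note that it only detects byte sequences that are not well-formed UTF-8. ISO-8859-1 input that happens to form valid UTF-8 would still pass.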
Tags
<java><string><unicode><encoding>
Title
How to determine if a String contains invalid encoded characters
singulars
PostAcceptedAnswerId
1. PO
 singulars
 PostTypePostTypeId
 PTAnswer
PostParentId
1. This table or related slice is empty.
PostTypePostTypeId
1. PTQuestion
UserLastEditorUserId
1. USDaniel Hiller
UserOwnerUserId
1. USDaniel Hiller
plurals
PostLinksPostIdRelatedPostId
1. PL
 singulars
 LinkTypeLinkTypeId
 LTLinked
PostLinksRelatedPostIdPostId
1. PL
 singulars
 LinkTypeLinkTypeId
 LTLinked
2. PL
 singulars
 LinkTypeLinkTypeId
 LTLinked
3. PL
 singulars
 LinkTypeLinkTypeId
 LTLinked
PostsAcceptedAnswerId
1. This table or related slice is empty.
PostsParentIdCreationDate
1. PO
 singulars
 PostTypePostTypeId
 PTAnswer
2. PO
 singulars
 PostTypePostTypeId
 PTAnswer
3. PO
 singulars
 PostTypePostTypeId
 PTAnswer
VotesPostIdCreationDate
1. VO
 singulars
 PostPostId
 POHow to determine if a String contains invalid encoded characters
 UserUserId
 USSwapnonil Mukherjee
 VoteTypeVoteTypeId
 VTFavorite
2. VO
 singulars
 PostPostId
 POHow to determine if a String contains invalid encoded characters
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
3. VO
 singulars
 PostPostId
 POHow to determine if a String contains invalid encoded characters
 UserUserId
 USe70
 VoteTypeVoteTypeId
 VTFavorite
CommentsPostId
1. This table or related slice is empty.