StackOverflow2013


plurals

PO
primarykey
Id
4610397
data
AcceptedAnswerId
0
AnswerCount
0
ClosedDate
CommentCount
2
CommunityOwnedDate
CreationDate
2011-01-05T23:41:50.147
FavoriteCount
0
LastActivityDate
2011-01-05T23:41:50.147
LastEditDate
LastEditorUserId
0
OwnerUserId
295802
ParentId
4514433
PostTypeId
2
Score
2
ViewCount
0
LastEditorDisplayName
text
Body
Found the issue: there was another place where UTF-8 had to be specified. In the HTTP Request, to the right of the Method, you also have to set Content Encoding to UTF-8.

Yes, in hindsight this seems obvious, but there were a number of reasons I didn't think it was needed. Some of my incorrect assumptions might be helpful for others who are debugging, so here goes. I would have thought that:

1: Once text has made it into Java as Unicode, it stays as Unicode, and goes in and out as UTF-8. Obviously not in this case.
2: I sort of thought HTTP defaulted to UTF-8 unless you say otherwise, but maybe I'm just used to XML. It's probably not good practice to assume that: maybe HTTP defaults to ISO-Latin1 or something, or even if there's a spec, maybe folks don't follow it.
3: And if I don't specify it, I'd think the "do no harm" approach would be to pass the characters on and let the receiver on the other end deal with it. Wrong again! (OK, so points 1, 2 and 3 overlap a bit.)
4: Even though my HTTP Request was a POST, I did still try the Encode checkbox. I certainly thought that would have encoded it, but all I got was the repeating % hex for question marks, so it seemed to me that the data was already corrupted at that point. Wrong again. I suspect that WITHIN the HTTP phase there are TWO character transitions: first from Unicode to whatever encoding it thinks you have, and THEN a second encoding into the % signs, and my data was mis-encoded at the first step.
5: And I would have thought JMeter would say something or warn, but from my reading, apparently it's not helpful in that respect. You can do logging or whatever. And the "?" is Java's way of reporting a problem BY default; this started in the Java 1.4x timeframe. In my Java code I prefer to set encoding errors to report as an exception, but again, that's not the default, and not what JMeter does. So I learned my lesson.
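The two-step transition suspected in point 4 can be reproduced outside JMeter. The following is only a sketch using the JDK's own `java.net.URLEncoder` (the `Charset` overload needs Java 10 or later); it is not JMeter's code, and the class and method names are made up for illustration. It shows that when the first step (characters to bytes) uses a charset that cannot represent Japanese, the percent-encoding step only ever sees question marks.

```java
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of JMeter.
class PercentEncodingDemo {

    // Step 1 (chars -> bytes in the given charset) and step 2 (bytes -> %XX)
    // both happen inside URLEncoder.encode.
    static String encode(String text, Charset charset) {
        return URLEncoder.encode(text, charset);
    }

    public static void main(String[] args) {
        // "Nihongo" in Japanese, written with escapes so the source file's
        // own encoding cannot interfere with the demonstration.
        String japanese = "\u65E5\u672C\u8A9E";

        // Correct charset: three bytes per character survive into the output.
        System.out.println(encode(japanese, StandardCharsets.UTF_8));
        // prints %E6%97%A5%E6%9C%AC%E8%AA%9E

        // Wrong charset: each character is replaced by '?' (0x3F) before
        // percent-encoding, so the output is the "repeating % hex".
        System.out.println(encode(japanese, StandardCharsets.ISO_8859_1));
        // prints %3F%3F%3F
    }
}
```

Seeing `%3F` repeated once per input character is therefore consistent with the text being valid inside Java and being damaged only at the charset step.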
The HINT that the Unicode was at least starting out OK was that the number of question marks equaled the number of Japanese characters, instead of having 2 or 3 times as many question marks. If the length of "???" matches your Japanese (or Chinese) string, then Java DID see actual Unicode characters at some point along the journey. Whereas if you see 3 times as many ?'s as input text, then Java always saw them as bytes or ints or whatever, and NEVER as valid codepoints.
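The counting hint can be checked with a small experiment. This is a sketch under assumptions: the class and method names are hypothetical, ISO-8859-1 stands in for "whatever encoding it thinks you have", and US-ASCII stands in for a reader that never interprets the bytes as UTF-8. Neither path is claimed to be what JMeter does internally.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class illustrating the two ways question marks arise.
class QuestionMarkCount {

    // Case A: Java holds real characters, then encodes them with a charset
    // that cannot represent them. Each character becomes a single '?'.
    static String encodedTooNarrow(String text) {
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1, StandardCharsets.ISO_8859_1);
    }

    // Case B: the UTF-8 bytes are never decoded as UTF-8. Every byte is
    // treated on its own, becomes U+FFFD, and later turns into '?'.
    static String decodedWrong(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        String garbled = new String(utf8, StandardCharsets.US_ASCII);
        byte[] out = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(out, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String japanese = "\u65E5\u672C\u8A9E"; // three Japanese characters

        System.out.println(encodedTooNarrow(japanese)); // prints ???
        System.out.println(decodedWrong(japanese));     // prints ?????????
    }
}
```

Three characters give three question marks in case A and nine in case B, because each of these characters occupies three bytes in UTF-8.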
Tags
Title
singulars
PostAcceptedAnswerId
1. This table or related slice is empty.
PostParentId
1. POJMeter CSV Data Set is corrupting Japanese strings stored as proper UTF-8, I get Question Marks instead
 singulars
 PostTypePostTypeId
 PTQuestion
PostTypePostTypeId
1. PTAnswer
UserLastEditorUserId
1. This table or related slice is empty.
UserOwnerUserId
1. USMark Bennett
plurals
PostLinksPostIdRelatedPostId
1. This table or related slice is empty.
PostLinksRelatedPostIdPostId
1. This table or related slice is empty.
PostsAcceptedAnswerId
1. This table or related slice is empty.
PostsParentIdCreationDate
1. This table or related slice is empty.
VotesPostIdCreationDate
1. This table or related slice is empty.
CommentsPostId
1. COAs of Java 1.4x you can tell Java to THROW EXCEPTIONS when it hits encoding errors, instead of just silently replacing them with question marks. This may be too strict for production applications, but for debugging and TESTING I find it helpful. The trick is the Charset object that you use: Charset charset = Charset.forName(charsetName); CharsetDecoder dec = charset.newDecoder(); dec.onMalformedInput(CodingErrorAction.REPORT); Then be prepared to handle exceptions of java.nio.charset.CharacterCodingException or its subclasses, MalformedInputException and UnmappableCharacterException. Very strict!
 singulars
 PostPostId
 PO
 UserUserId
 USMark Bennett
2. COAND I had also done -Dfile.encoding=UTF-8 so I thought Java would default to that if it was ever unsure. But this was also incorrect, at least for JMeter's HTTP pipeline stage.
 singulars
 PostPostId
 PO
 UserUserId
 USMark Bennett
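The strict decoding described in the first comment can be written out as a complete program. This is a minimal sketch: the class `StrictDecode` and its methods are hypothetical names, and setting `onUnmappableCharacter` alongside `onMalformedInput` is an addition here so that both exception subclasses mentioned in the comment are reported.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: decode bytes and fail loudly instead of emitting '?'.
class StrictDecode {

    static String decodeStrict(byte[] bytes, Charset charset)
            throws CharacterCodingException {
        CharsetDecoder dec = charset.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        // decode(ByteBuffer) treats the buffer as the complete input, so a
        // truncated multi-byte sequence at the end is reported as well.
        return dec.decode(ByteBuffer.wrap(bytes)).toString();
    }

    // Convenience wrapper for tests: true if the bytes decode cleanly.
    static boolean isValid(byte[] bytes, Charset charset) {
        try {
            decodeStrict(bytes, charset);
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] utf8 = "\u65E5\u672C\u8A9E".getBytes(StandardCharsets.UTF_8);

        System.out.println(isValid(utf8, StandardCharsets.UTF_8));    // prints true
        System.out.println(isValid(utf8, StandardCharsets.US_ASCII)); // prints false
    }
}
```

In a test harness, a check like `isValid` makes a mismatched charset fail immediately, rather than surfacing later as question marks in the response.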
