<p>As Peter said, the Content-Type header is just a "hint" of what the content is expected to be. On the server side you can set any content type and then send any byte sequence, including an invalid one.</p>
<p>I had exactly the same issue dealing with incorrect UTF-8 data that included ISO-8859-1 (Latin-1) characters (French accents).</p>
<p><a href="http://en.wikipedia.org/wiki/UTF-8" rel="noreferrer">Wikipedia about UTF-8</a> is worth reading to understand this issue and how to handle encoding errors.</p>
<p>The fact is that <code>NSString initWithData:encoding:</code> is strict: it simply returns nil when a decoding error occurs (unlike Java, for instance, which substitutes a replacement character).</p>
<p>Peter's solution of decoding mostly-UTF-8 data as Latin-1 did not satisfy me: every valid UTF-8 character comes out wrong for the sake of a single stray Latin-1 character.</p>
<p>The best option would be a fix on the server side, sure, but I'm not responsible for that side...</p>
<p>So I looked deeper and found a solution using the GNU libiconv C library (available on OS X and iOS). The principle is to use iconv to remove the bytes that are not valid UTF-8 (i.e. "prété" will become "prt").</p>
<p>Here is some sample code, equivalent to the command line <code>iconv -c -f UTF-8 -t UTF-8 invalid.txt &gt; cleaned.txt</code></p>
<pre><code>#include "iconv.h"

- (NSData *)cleanUTF8:(NSData *)data {
  iconv_t cd = iconv_open("UTF-8", "UTF-8"); // convert to UTF-8 from UTF-8
  if (cd == (iconv_t)-1) {
    return nil; // the conversion descriptor could not be created
  }
  int one = 1;
  iconvctl(cd, ICONV_SET_DISCARD_ILSEQ, &amp;one); // discard invalid characters

  size_t inbytesleft, outbytesleft;
  inbytesleft = outbytesleft = data.length;
  char *inbuf  = (char *)data.bytes;
  char *outbuf = malloc(sizeof(char) * data.length);
  if (outbuf == NULL) {
    iconv_close(cd);
    return nil;
  }
  char *outptr = outbuf;
  NSData *result = nil;
  if (iconv(cd, &amp;inbuf, &amp;inbytesleft, &amp;outptr, &amp;outbytesleft) == (size_t)-1) {
    NSLog(@"this should not happen, seriously");
  } else {
    result = [NSData dataWithBytes:outbuf length:data.length - outbytesleft];
  }
  // release the descriptor and the buffer on both the success and error paths
  iconv_close(cd);
  free(outbuf);
  return result;
}
</code></pre>
<p>The resulting <code>NSData</code> can then be safely decoded using <code>NSUTF8StringEncoding</code>.</p>
<p>Note that recent versions of libiconv also allow fallback handlers, by using:</p>
<pre><code>iconvctl(cd, ICONV_SET_FALLBACKS, &amp;fallbacks);
</code></pre>
<p>With a fallback on Unicode errors you can emit a replacement character or, better, try another encoding. In my case I managed to fall back to Latin-1 wherever UTF-8 failed, which resulted in 99% successful conversions. Look at the iconv source code to understand how it works.</p>
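<p>To illustrate the "fall back to Latin-1 where UTF-8 failed" idea, here is a minimal sketch in plain C (callable from Objective-C). It does not use the iconv fallback mechanism; it is a hand-written stand-in that validates each UTF-8 sequence itself and re-encodes any byte that fails validation as the Latin-1 character with the same value. The names <code>utf8_sequence_length</code> and <code>repair_utf8_with_latin1</code> are hypothetical, and treating every invalid byte as Latin-1 is an assumption about the data (bytes 0x80 to 0x9F become C1 control characters, not Windows-1252 punctuation).</p>

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns the length (1 to 4) of the well-formed UTF-8
 * sequence starting at s, or 0 if the bytes there are not valid UTF-8.
 * Rejects overlong forms, surrogates, and code points above U+10FFFF. */
static size_t utf8_sequence_length(const unsigned char *s, size_t remaining)
{
    unsigned char c = s[0];
    unsigned char lo = 0x80, hi = 0xBF; /* allowed range of the second byte */
    size_t len, i;

    if (c < 0x80) {
        return 1; /* plain ASCII */
    } else if (c >= 0xC2 && c <= 0xDF) {
        len = 2;
    } else if (c >= 0xE0 && c <= 0xEF) {
        len = 3;
        if (c == 0xE0) lo = 0xA0; /* no overlong 3-byte forms */
        if (c == 0xED) hi = 0x9F; /* no UTF-16 surrogates */
    } else if (c >= 0xF0 && c <= 0xF4) {
        len = 4;
        if (c == 0xF0) lo = 0x90; /* no overlong 4-byte forms */
        if (c == 0xF4) hi = 0x8F; /* nothing above U+10FFFF */
    } else {
        return 0; /* stray continuation byte or invalid lead byte */
    }

    if (remaining < len) return 0; /* sequence truncated by end of input */
    if (s[1] < lo || s[1] > hi) return 0;
    for (i = 2; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80) return 0; /* not a continuation byte */
    }
    return len;
}

/* Hypothetical function: copies valid UTF-8 through unchanged and re-encodes
 * every invalid byte as the 2-byte UTF-8 form of the same Latin-1 code point.
 * Returns a malloc'ed, NUL-terminated buffer (caller frees) or NULL. */
unsigned char *repair_utf8_with_latin1(const unsigned char *in, size_t in_len,
                                       size_t *out_len)
{
    /* Worst case: every input byte expands to two output bytes. */
    unsigned char *out = malloc(in_len * 2 + 1);
    size_t i = 0, o = 0;

    if (out == NULL) return NULL;

    while (i < in_len) {
        size_t n = utf8_sequence_length(in + i, in_len - i);
        if (n > 0) {
            memcpy(out + o, in + i, n); /* valid sequence: keep as-is */
            o += n;
            i += n;
        } else {
            unsigned char b = in[i++];  /* invalid byte, always >= 0x80 */
            out[o++] = (unsigned char)(0xC0 | (b >> 6));
            out[o++] = (unsigned char)(0x80 | (b & 0x3F));
        }
    }
    out[o] = '\0';
    *out_len = o;
    return out;
}
```

<p>With this sketch, the Latin-1 bytes for "prété" (<code>70 72 E9 74 E9</code>) come out as valid UTF-8 (<code>70 72 C3 A9 74 C3 A9</code>) instead of being discarded, while input that is already valid UTF-8 is returned byte for byte. The result can be wrapped with <code>dataWithBytes:length:</code> and decoded using <code>NSUTF8StringEncoding</code>.</p>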