<p>Current MS Access versions (2000 and later) store string values as Unicode, using UTF-16 internally rather than UTF-8. Older ones simply followed the code page of the machine on which the text was entered.</p>
<p>The Unicode encodings can indeed start with marker bytes (a byte order mark, or BOM) that indicate the encoding of what follows; legacy code pages have no such marker. Whether or not you have the benefit of one really depends on the legacy app. If it simply followed a single encoding, or relied on the machine's code page, then you'll have to do some clever recognizing yourself.</p>
<p><strong>Quick checks</strong></p>
<p><em>UTF-8</em></p>
<p>If there is a marker, it would be <code>$EFBBBF</code>. If there isn't, you can make an educated guess that it is UTF-8 when runs of ASCII (0-127) characters are visible in the string and every byte above 127 is part of a valid multi-byte sequence (a lead byte in <code>$C2</code>-<code>$F4</code> followed by continuation bytes in <code>$80</code>-<code>$BF</code>).</p>
<p><em>UTF-16</em></p>
<p>Comes in two flavours: Little Endian (LE) and Big Endian (BE). For characters within the Basic Multilingual Plane, both use two bytes per character. The difference between the two is that for ASCII characters, UTF-16BE starts with a zero byte and UTF-16LE ends with it.</p>
<p>If there is a marker, UTF-16LE is designated by the bytes <code>$FFFE</code> and UTF-16BE by <code>$FEFF</code>. If neither marker is present, alternating zero and non-zero bytes in the memo field are a fair indication, at least wherever the text contains ASCII characters. Your first bet should be UTF-16LE, as that is the Windows standard and UTF-16BE is used a lot less.</p>
<p><em>Other</em></p>
<p>If you can exclude UTF-8 and UTF-16, you could try to figure out whether one of the other UTF encodings was used. I wouldn't spend the time though; chances are that the program simply relied on the machine's code page. Seeing as you are dealing with a lot of "Asian-looking" characters, your best bet would be to check the MBCS (Multi-Byte Character Set) code pages. See MSDN for more details. As I have never dealt with them myself, I'm afraid I can't be of more help here.</p>
<p><strong>Trying encodings</strong></p>
<p>If you do have to start trying out every encoding there is, you may want to have a look at the DIConverters library. It's pretty good at converting between encodings. IIRC it can also recognize encodings, but otherwise it should help get you started with your own detection. It can be found at <a href="http://www.yunqa.de/delphi/doku.php/products/converters/index" rel="nofollow">http://www.yunqa.de/delphi/doku.php/products/converters/index</a></p>
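<p>For illustration, the quick checks described in this answer could be sketched roughly as follows. This is a minimal Python sketch, not the DIConverters API: the function name <code>guess_encoding</code>, the zero-byte thresholds, and the list of fallback code pages are all assumptions made for the example. A successful decode only shows that the bytes are valid in that encoding, not that the guess is right, so the result still needs a visual check.</p>

```python
import codecs
from typing import Optional

# Byte order marks for the encodings covered by the quick checks.
BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),      # EF BB BF
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # FF FE
    (codecs.BOM_UTF16_BE, "utf-16-be"),  # FE FF
]

# Assumed candidates for the "machine code page" case. Order matters:
# cp1252 accepts almost any byte string, so it is tried last.
FALLBACK_CODE_PAGES = ["cp932", "cp936", "cp949", "cp950", "cp1252"]


def guess_encoding(data: bytes) -> Optional[str]:
    """Return a best-guess codec name for raw memo-field bytes (hypothetical helper)."""
    # 1. An explicit marker settles it.
    for bom, name in BOMS:
        if data.startswith(bom):
            return name

    # 2. Alternating zero / non-zero bytes suggest UTF-16.
    #    ASCII in UTF-16LE has the zero byte second, in UTF-16BE first.
    #    The 1/2 and 1/10 thresholds are arbitrary assumptions.
    pairs = len(data) // 2
    even_zeros = data[0::2].count(0)
    odd_zeros = data[1::2].count(0)
    if pairs:
        if odd_zeros > pairs / 2 and even_zeros < pairs / 10:
            return "utf-16-le"
        if even_zeros > pairs / 2 and odd_zeros < pairs / 10:
            return "utf-16-be"

    # 3. A strict UTF-8 decode fails on most text in other encodings.
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        pass
    else:
        return "ascii" if all(b < 0x80 for b in data) else "utf-8"

    # 4. Fall back to trying code pages; the first one that decodes wins.
    for name in FALLBACK_CODE_PAGES:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue
        return name
    return None
```

<p>Calling <code>guess_encoding(raw_bytes)</code> on the raw bytes of one memo field gives a codec name to try with <code>raw_bytes.decode(name)</code>, or <code>None</code> if nothing fits. The UTF-16 check runs before the UTF-8 check because ASCII text stored as UTF-16 is also valid UTF-8 (zero bytes are legal there).</p>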