  1. How to correct files with mixed encodings?
    <p>Given a corrupted file with mixed encodings (e.g. utf-8 and latin-1), how do I configure Emacs to "project" all its symbols to a single encoding (e.g. utf-8) when saving the file?</p>
    <p>I wrote the following function to automate some of the cleaning, but I would guess that the information needed to map the symbol "é" in one encoding to "é" in utf-8 is available somewhere and could be used to improve this function (or that somebody has already written such a function).</p>
<pre><code>(defun jyby/cleanToUTF ()
  "Cleaning to UTF"
  (interactive)
  (progn
    ;; Delete each problematic symbol from point to the end of the buffer;
    ;; `save-excursion' restores point after every pass.
    (save-excursion (replace-regexp "अ" ""))
    (save-excursion (replace-regexp "आ" ""))
    (save-excursion (replace-regexp "ॆ" ""))))

;; Bind the cleaning command to F11.
(global-unset-key [f11])
(global-set-key [f11] 'jyby/cleanToUTF)
</code></pre>
    <hr>
    <p>I have many files "corrupted" with mixed encodings (due to copy-pasting from a browser with a faulty font configuration), which generate the error below. I sometimes clean them by hand, searching for each problematic symbol and replacing it with either "" or the appropriate character, or, more quickly, by specifying "utf-8-unix" as the encoding (which prompts the same message the next time I edit and save the file). This has become an issue because, in any such corrupted file, every accented character is replaced by a sequence that <em>doubles</em> in size at each save, which ends up doubling the size of the file. I am using GNU Emacs 24.2.1.</p>
<pre><code>These default coding systems were tried to encode text
in the buffer `test_accents.org':
  (utf-8-unix (30 . 4194182) (33 . 4194182) (34 . 4194182) (37 . 4194182)
  (40 . 4194181) (41 . 4194182) (42 . 4194182) (45 . 4194182)
  (48 . 4194182) (49 . 4194182) (52 . 4194182))
However, each of them encountered characters it couldn't encode:
  utf-8-unix cannot encode these: ...

Click on a character (or switch to this window by `C-x o'
and select the characters by RET) to jump to the place it appears,
where `C-u C-x =' will give information about it.

Select one of the safe coding systems listed below,
or cancel the writing with C-g and edit the buffer
   to remove or modify the problematic characters,
or specify any other coding system (and risk losing
   the problematic characters).

  raw-text emacs-mule no-conversion
</code></pre>