Why is a 7-bit ASCII string literal encoded as UTF-8 in Ruby?
<p>I'm reading "The Ruby Programming Language". In section 3.2.6.1, "Multibyte characters in Ruby 1.9", the book describes an optimization for Ruby string literals:</p>
<blockquote> <p>If a string literal contains only 7-bit ASCII characters, then its encoding method will return ASCII, even if the source encoding is UTF-8</p> </blockquote>
<p>I tried the following simple script on Ruby 1.9.1-p431, 1.9.2, and 1.9.3-p125. All three report UTF-8 as the encoding of a literal that contains only 7-bit ASCII characters.</p>
<pre><code># coding: utf-8
s = 'hello'
p s.encoding # result is #&lt;Encoding:UTF-8&gt;
</code></pre>
<p>My guess is that this behavior changed during the development of Ruby 1.9. I searched Ruby 1.9's changelog, and the <a href="http://svn.ruby-lang.org/repos/ruby/tags/v1_9_1_0/ChangeLog" rel="nofollow">1.9.1 changelog</a> confirms this behavior. I also cloned Ruby's git repository, but I can't find a commit that mentions changing this behavior.</p>
<p><strong>Update:</strong></p>
<p>Looking at Ruby's source code repository, I guess the behavior the book describes is that of Ruby 1.9.0, which was released in January 2008. (It failed to compile on Debian 6, so I can't confirm this exactly.) <strong>Though "The Ruby Programming Language" is an excellent book, it was originally published in 2008. It's very likely that some descriptions in the book are outdated.</strong></p>
<p>Another outdated description concerns the behavior of the <code>Encoding.list</code> method. So watch out for outdated descriptions if you are also reading this book.</p>
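<p>For anyone who wants to tell "tagged as UTF-8" apart from "contains only 7-bit characters", here is a minimal sketch, assuming Ruby 1.9.1 or later. The variable names are only illustrative. It relies on <code>String#ascii_only?</code> and <code>String#force_encoding</code>, which exist in Ruby 1.9 and later; it is not meant to reproduce whatever optimization the book describes.</p>

```ruby
# coding: utf-8
# Sketch only: the literal's encoding tag follows the source encoding,
# but the content can still be checked for 7-bit-only characters.

s = 'hello'

# The literal takes the source encoding (UTF-8 here).
literal_encoding = s.encoding

# ascii_only? looks at the content, not at the encoding tag.
seven_bit = s.ascii_only?

# force_encoding retags the copy; the underlying bytes are unchanged.
ascii_copy = s.dup.force_encoding(Encoding::US_ASCII)

p literal_encoding
p seven_bit
p ascii_copy.encoding
p ascii_copy == s
```

<p>The retagged copy still compares equal to the original, because both strings hold the same 7-bit bytes and their encodings are ASCII-compatible.</p>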
 
