ActiveSupport::Inflector::camelize - help in understanding regex
    <h2>Short version:</h2>
    <p>I am having a hard time understanding two rather complex regular expressions in the <code>ActiveSupport::Inflector::camelize</code> method.</p>
    <p>This is the definition of the <code>camelize</code> method:</p>
    <pre><code>def camelize(term, uppercase_first_letter = true)
  string = term.to_s
  if uppercase_first_letter
    string = string.sub(/^[a-z\d]*/) { inflections.acronyms[$&amp;] || $&amp;.capitalize }
  else
    string = string.sub(/^(?:#{inflections.acronym_regex}(?=\b|[A-Z_])|\w)/) { $&amp;.downcase }
  end
  string.gsub(/(?:_|(\/))([a-z\d]*)/i) { "#{$1}#{inflections.acronyms[$2] || $2.capitalize}" }.gsub('/', '::')
end
</code></pre>
    <p>I have some difficulty understanding:</p>
    <pre><code>string = string.sub(/^(?:#{inflections.acronym_regex}(?=\b|[A-Z_])|\w)/) { $&amp;.downcase }
</code></pre>
    <p>and:</p>
    <pre><code>string.gsub(/(?:_|(\/))([a-z\d]*)/i) { "#{$1}#{inflections.acronyms[$2] || $2.capitalize}" }.gsub('/', '::')
</code></pre>
    <p>Please explain to me what they mean. Thank you.</p>
    <h1>Long version</h1>
    <p>This shows how I tried to understand the regexes and what I interpret them to mean. It would be very helpful if you could go through this and correct my mistakes.</p>
    <h2>For the first regex</h2>
    <pre><code>string = string.sub(/^(?:#{inflections.acronym_regex}(?=\b|[A-Z_])|\w)/) { $&amp;.downcase }
</code></pre>
    <p>From what I can see, <code>inflections.acronym_regex</code> comes from the <code>Inflections</code> class in the <code>ActiveSupport::Inflector</code> module, and in the <code>initialize</code> method of the <code>Inflections</code> class,</p>
    <pre><code>def initialize
  @plurals, @singulars, @uncountables, @humans, @acronyms, @acronym_regex = [], [], [], [], {}, /(?=a)b/
end
</code></pre>
    <p><code>acronym_regex</code> is assigned <code>/(?=a)b/</code>.
    From what I understand from <a href="http://www.ruby-doc.org/core-2.0.0/Regexp.html#class-Regexp-label-Anchors" rel="nofollow">http://www.ruby-doc.org/core-2.0.0/Regexp.html#class-Regexp-label-Anchors</a>,</p>
    <pre><code>(?=pat) - Positive lookahead assertion: ensures that the following characters match pat, but doesn't include those characters in the matched text </code></pre>
    <p>So <code>/(?=a)b/</code> ensures that the character <code>a</code> is in the text, but we don't include the character <code>a</code> in the matched text, and what immediately follows the character <code>a</code> must be the character <code>b</code>. In other words, <code>"abc"</code> would match this regex, but <code>"bbc"</code> would not, and the matched text for <code>"abc"</code> would be <code>"b"</code> (instead of <code>"ab"</code>).</p>
    <p>When the value of <code>inflections.acronym_regex</code> is interpolated into the regex <code>/^(?:#{inflections.acronym_regex}(?=\b|[A-Z_])|\w)/</code>, I do not know which of the following two regexes results:</p>
    <p>A. <code>/^(?:/(?=a)b/(?=\b|[A-Z_])|\w)/</code></p>
    <p>B. <code>/^(?:(?=a)b(?=\b|[A-Z_])|\w)/</code></p>
    <p>although I think it is B. From what I understand, <code>(?:</code> provides grouping without capturing, <code>(?=</code> means positive lookahead assertion, and <code>\b</code> matches word boundaries when outside brackets and matches backspace when inside brackets. So in English terms, regex B, when matched against a text, will find a string that begins with an <code>a</code> character, followed by a <code>b</code> character, and one of (1. backspace [whatever that may mean] 2. any uppercase character or underscore 3. 
    any English alphabetic character, digit, or underscore).</p>
    <p>However, I find it strange that passing <code>uppercase_first_letter = false</code> to the <code>camelize</code> function should cause it to match a string starting with the characters <code>ab</code>, given that this does not seem to be how the <code>camelize</code> function behaves.</p>
    <h2>For the second regex</h2>
    <pre><code>string.gsub(/(?:_|(\/))([a-z\d]*)/i) { "#{$1}#{inflections.acronyms[$2] || $2.capitalize}" }.gsub('/', '::')
</code></pre>
    <p>The regex is:</p>
    <pre><code>/(?:_|(\/))([a-z\d]*)/i
</code></pre>
    <p>I am guessing that this regex will match a substring that starts with either an <code>_</code> or a <code>/</code>, followed by 0 or more uppercase or lowercase English alphabetic characters or digits. Furthermore, for the first group <code>(?:_|(\/))</code>, whether we match the <code>_</code> or the <code>/</code>, the <code>([a-z\d]*)</code> capturing group will always be regarded as the second group. I do understand the part where the block tries to look up <code>inflections.acronyms[$2]</code> and, on failure, calls <code>$2.capitalize</code>.</p>
    <p>Since <code>(?:</code> means grouping without capturing, what is the value of <code>$1</code> when we match <code>_</code>? Is it still <code>_</code>? And for the <code>.gsub('/', '::')</code> portion, I am guessing that it gets applied to each match in the initial <code>gsub</code>, instead of being applied to the overall string after the outer <code>gsub</code> call is done?</p>
    <p>Apologies for the really long post. Please point out my errors in understanding the two regular expressions, or explain them in a better way if you can.</p>
    <p>Thank you.</p>
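<p>The points raised above can be checked in plain Ruby without loading ActiveSupport. The following is only a minimal sketch under stated assumptions: the names <code>acronym_regex</code>, <code>never_matches</code>, <code>first_regex</code>, <code>lowered</code>, <code>second_regex</code>, <code>captures</code> and <code>step</code> are made up for illustration, <code>acronym_regex</code> is hard-coded to the initial <code>/(?=a)b/</code> value quoted in the question, and the <code>inflections.acronyms</code> lookup is left out.</p>

```ruby
# Stand-in for inflections.acronym_regex (assumption: the default value quoted above).
acronym_regex = /(?=a)b/

# The lookahead and the literal "b" both test the same position,
# so this pattern cannot match any string.
never_matches = ["abc", "bbc", "ab"].none? { |s| s =~ acronym_regex }
puts never_matches

# Interpolating a Regexp inserts its to_s form, a (?-mix:...) group,
# not the surrounding slashes.
first_regex = /^(?:#{acronym_regex}(?=\b|[A-Z_])|\w)/
puts first_regex.source

# With this acronym_regex only the \w branch can succeed,
# so only the first character is downcased.
lowered = "Active_model".sub(first_regex) { $&.downcase }
puts lowered

# Second regex: record $1 and $2 for every match (acronym lookup omitted).
second_regex = /(?:_|(\/))([a-z\d]*)/i
captures = []
step = "active_model/errors".gsub(second_regex) do
  captures << [$1, $2]
  "#{$1}#{$2.capitalize}"
end
p captures
puts step

# The chained gsub runs once, on the string returned by the first gsub.
puts step.gsub('/', '::')
```

<p>In this sketch <code>$1</code> is <code>nil</code> when the <code>_</code> alternative matches, because the only capturing parentheses in that group wrap the <code>\/</code> alternative, and <code>"#{nil}"</code> interpolates as an empty string, which is how the underscore gets dropped.</p>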