Note that there are some explanatory texts on larger screens.

plurals
  1. PO
    primarykey
    data
    text
    <p>Okay, there are 26 possibilities for a 1-character string, 26<sup>2</sup> for a 2-character string, and so on up to 26<sup>26</sup> possibilities for a 26-character string.</p> <p>That means there are 26 times as many possibilities for an (N)-character string than there are for an (N-1)-character string. You can use that fact to select your length:</p> <pre><code>def getlen(maxlen): sz = maxlen while sz != 1: if rnd(27) != 1: return sz sz--; return 1 </code></pre> <p>I use 27 in the above code since the total sample space for selecting strings from "ab" is the 26 1-character possibilities and the 26<sup>2</sup> 2-character possibilities. In other words, the ratio is 1:26 so 1-character has a probability of 1/27 (rather than 1/26 as I first answered).</p> <p>This solution isn't <em>perfect</em> since you're calling <code>rnd</code> multiple times and it would be better to call it once with an possible range of 26<sup>N</sup>+26<sup>N-1</sup>+26<sup>1</sup> and select the length based on where your returned number falls within there but it may be difficult to find a random number generator that'll work on numbers that large (10 characters gives you a possible range of 26<sup>10</sup>+...+26<sup>1</sup> which, unless I've done the math wrong, is 146,813,779,479,510).</p> <p>If you can limit the maximum size so that your <code>rnd</code> function will work in the range, something like this should be workable:</p> <pre><code>def getlen(chars,maxlen): assert maxlen &gt;= 1 range = chars sampspace = 0 for i in 1 .. maxlen: sampspace = sampspace + range range = range * chars range = range / chars val = rnd(sampspace) sz = maxlen while val &lt; sampspace - range: sampspace = sampspace - range range = range / chars sz = sz - 1 return sz </code></pre> <p>Once you have the length, I would then use your current algorithm to choose the actual characters to populate the string.</p> <hr> <p>Explaining it further:</p> <p>Let's say our alphabet only consists of "ab". The possible sets up to length 3 are <code>[ab]</code> (2), <code>[ab][ab]</code> (4) and <code>[ab][ab][ab]</code> (8). So there is a 8/14 chance of getting a length of 3, 4/14 of length 2 and 2/14 of length 1.</p> <p>The 14 is the magic figure: it's the sum of all 2<sup>n</sup> for n = 1 to the maximum length. So, testing that pseudo-code above with <code>chars = 2</code> and <code>maxlen = 3</code>:</p> <pre><code> assert maxlen &gt;= 1 [okay] range = chars [2] sampspace = 0 for i in 1 .. 3: i = 1: sampspace = sampspace + range [0 + 2 = 2] range = range * chars [2 * 2 = 4] i = 2: sampspace = sampspace + range [2 + 4 = 6] range = range * chars [4 * 2 = 8] i = 3: sampspace = sampspace + range [6 + 8 = 14] range = range * chars [8 * 2 = 16] range = range / chars [16 / 2 = 8] val = rnd(sampspace) [number from 0 to 13 inclusive] sz = maxlen [3] while val &lt; sampspace - range: [see below] sampspace = sampspace - range range = range / chars sz = sz - 1 return sz </code></pre> <p>So, from that code, the first iteration of the final loop will exit with <code>sz = 3</code> if <code>val</code> is greater than or equal to <code>sampspace - range [14 - 8 = 6]</code>. In other words, for the values 6 through 13 inclusive, 8 of the 14 possibilities.</p> <p>Otherwise, <code>sampspace</code> becomes <code>sampspace - range [14 - 8 = 6]</code> and <code>range</code> becomes <code>range / chars [8 / 2 = 4]</code>.</p> <p>Then the second iteration of the final loop will exit with <code>sz = 2</code> if <code>val</code> is greater than or equal to <code>sampspace - range [6 - 4 = 2]</code>. In other words, for the values 2 through 5 inclusive, 4 of the 14 possibilities.</p> <p>Otherwise, <code>sampspace</code> becomes <code>sampspace - range [6 - 4 = 2]</code> and <code>range</code> becomes <code>range / chars [4 / 2 = 2]</code>.</p> <p>Then the third iteration of the final loop will exit with <code>sz = 1</code> if <code>val</code> is greater than or equal to <code>sampspace - range [2 - 2 = 0]</code>. In other words, for the values 0 through 1 inclusive, 2 of the 14 possibilities (this iteration will <em>always</em> exit since the value must be greater than or equal to zero.</p> <hr> <p>In retrospect, that second solution is a bit of a nightmare. In my personal opinion, I'd go for the first solution for its simplicity and to avoid the possibility of rather large numbers.</p>
    singulars
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. VO
      singulars
      1. This table or related slice is empty.
    2. VO
      singulars
      1. This table or related slice is empty.
    3. VO
      singulars
      1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload