StackOverflow2013

Note that there are some explanatory texts on larger screens.

plurals

PO
text
Body
copied!<p>You seem to be having a problem with Winston Ewert's answer. I'm making this a separate answer for reasons of length and formatting, but what Winston is suggesting is a reasonable approach.</p> <p>You ask about the "array of characters" and what your parents and offspring are, how you'd do crossover, mutation, etc. I think maybe you're confused by that array of characters bit. Don't think of it as an array of characters -- it's an array of words. This is important, because you don't want to do crossover between word boundaries or change letters inside a word via mutation. You want to shuffle the words around. So instead of dealing with the words, let's take a slightly different (but equivalent) approach.</p> <p>Let's number each of your 2n words from 1 to 2n. To generate a candidate solution, we just pick n random numbers between 1 and 2n (without replacement). The randomly selected words become the rows to your matrix. So if the words are <code>AGE,AGO,BEG,CAB,CAD,DOG</code>, we pick three at random and end up with, for instance, 2, 3, and 5, (giving a chromosome of <code>235</code>) or <code>AGO,BEG,CAD</code>.</p> <p>This yields a matrix of </p> <pre><code>AGO BEG CAD </code></pre> <p>Now we count how many of the 2n total input words appear in the matrix. In this case, it's only three, as none of the columns make valid words. Your fitness for the input <code>235</code> is 3. The input <code>416</code> works though -- its fitness is 6.</p> <p>To generate a population, you just generate N random sets of three numbers (without repetition) between 1 and 2n. With a population size of 4, you might get</p> <pre><code>142 354 624 241 </code></pre> <p>These give you four different potential solutions:</p> <pre><code>142 = AGE CAB AGO 354 = BEG CAD CAB 624 = DOG AGO CAB 241 = AGO CAB AGE </code></pre> <p>To do crossover, you'd need to design a method that avoided duplicated numbers in the offspring. I've in the past adapted the Cycle Crossover method designed for permutations to handle situations similar to this, but you can design any method that seems reasonable. For instance, you could do a simple uniform crossover and then just fix any duplicate values by changing one of them to something else.</p> <p>Mutation is simply changing one number into another or swapping the positions of two numbers in the encoding.</p> <p>You say you must use a GA, so you could stop here and try something like that. It's not likely to work extremely well, for a couple of reasons. First, there are going to be a lot of plateaus in the fitness function. Almost every possible answer is equally wrong, so there's not much of a gradient for the GA to follow. It's a needle in a haystack search problem. You could try to modify the way you score the fitness values to alleviate this somewhat. For instance, I know that only one word starts with <code>D</code> in your example set, so any solution with <code>DOG</code> in the first row is automatically wrong. I could award points to solutions that gave more "plausible" wrong answers. However, in general, this is going to be a tough problem for a GA to solve efficiently. </p> <p>Second, constraint satisfaction problems like this one can be much more efficiently solved with specialized techniques, no matter how well you build your GA. You could easily solve this with backtracking.</p>

Querying!

Guidance

An individual column

Larger individual text columns get their own page to allow for proper reading.

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload