I've been thinking about this for quite some time now, and I realized that you need at least two intelligent "agents" to make this work properly. The basic idea is that you have two types of intelligent activity here:

1. Subconscious Motor Control (SMC).
2. Conscious Decision Making (CDM).

Training for the SMC could be done online. If you really think about it, defining success within motor control comes down to this: you provide a signal to your robot, the robot evaluates that signal, and it either accepts it or rejects it. If your robot accepts a signal and it results in a "failure", then your robot goes "offline" and can't accept any more signals. Defining "failure" and "offline" could be tricky, but one option is to call it a failure when, for example, a sensor on the robot indicates that the robot is immobile (lying on the ground).

So your fitness function for the SMC might be something of the sort: `numAcceptedSignals/numGivenSignals - numFailures`

The CDM is another AI agent that generates signals, and its fitness function could be: `(numSignalsAccepted/numSignalsGenerated) * (numWinGoals/numLossGoals)`

So what you do is run the CDM and feed all of its output to the SMC; at the end of a game you run your fitness functions. Alternatively, you can combine the SMC and the CDM into a single agent and build a composite fitness function from the other two. I don't know how else you could do it...

Finally, you have to determine what constitutes a learning session: half a game, a full game, just a few moves, etc. If a game lasts 1 minute and you have a total of 8 players on the field, the training process could be VERY slow!

## Update

Here is a quick reference to a paper that used genetic programming to create "softbots" that play soccer: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.36.136&rep=rep1&type=pdf

With regards to your comments: I was thinking that for the subconscious motor control (SMC), the signals would come from the conscious decision maker (CDM). This way you're evolving your SMC agent to properly handle the CDM agent's commands (signals). You want to maximize the up-time of the SMC agent regardless of what the CDM agent says.

The SMC agent receives an input, for example a force vector on a joint, and runs it through its processing unit to determine whether to execute that input or reject it. The SMC should only execute inputs that it "thinks" it can recover from, and it should reject inputs that it "thinks" would lead to a "catastrophic failure".

Now the SMC agent has an output: accept or reject a signal (1 or 0). The CDM can use that signal for its own training: the CDM wants to maximize the number of signals that the SMC accepts, and it also wants to satisfy a goal: a high score for its own team and a low score for the opposing team. So the CDM has its own processing unit that is being evolved to satisfy both of those needs. Your reference provided a 3-layer design, while mine is only 2-layer; I think mine was a step in the right direction towards the 3-layer design.
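To make the signal flow concrete, here's a minimal Python sketch of the two fitness functions and a single learning session. Everything in it is a hypothetical placeholder I made up for illustration (`cdm.generate_signal`, `smc.accepts`, `robot.is_immobile`, and so on); it's not an existing API, and the failure penalty being subtractive is my assumption about how you'd want failures to count.

```python
# Sketch of the SMC/CDM fitness functions and signal flow described above.
# All names here are hypothetical placeholders, not an existing library.

def smc_fitness(num_accepted: int, num_given: int, num_failures: int) -> float:
    """SMC fitness: fraction of signals accepted, minus a failure penalty."""
    if num_given == 0:
        return 0.0
    return num_accepted / num_given - num_failures

def cdm_fitness(num_accepted: int, num_generated: int,
                win_goals: int, loss_goals: int) -> float:
    """CDM fitness: acceptance rate scaled by the team's score ratio."""
    if num_generated == 0:
        return 0.0
    return (num_accepted / num_generated) * (win_goals / max(loss_goals, 1))

def run_learning_session(cdm, smc, robot, max_steps: int):
    """One learning session: the CDM's output feeds the SMC, which gates it."""
    generated = accepted = failures = 0
    for _ in range(max_steps):
        signal = cdm.generate_signal(robot.sensors())  # CDM proposes an action
        generated += 1
        if smc.accepts(signal):                        # SMC output: accept/reject (1 or 0)
            accepted += 1
            robot.execute(signal)
            if robot.is_immobile():                    # failure: robot lying on the ground
                failures += 1
                break                                  # robot goes "offline"
    return generated, accepted, failures
```

At the end of a game you'd feed these counters, together with the score, into the two fitness functions and let your genetic algorithm select on them.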
One more thing to note here: is *falling* really a *"catastrophic failure"*? What if your robot falls, but the CDM makes it stand up again? I think that would be valid behavior, so you shouldn't penalize the robot for falling. Perhaps a better approach is to penalize it for the amount of time it takes to perform a goal (not necessarily a soccer goal).
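If you go the time-penalty route, the fitness term might look something like the sketch below; `time_to_goal` and `max_time` are assumed measurements I'm introducing for illustration, not quantities defined anywhere above.

```python
def time_penalized_fitness(goal_completed: bool, time_to_goal: float,
                           max_time: float) -> float:
    """Reward completing the goal quickly: a fall only costs the time spent
    getting back up, rather than counting as an outright failure."""
    if not goal_completed:
        return 0.0
    return max(0.0, 1.0 - time_to_goal / max_time)
```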