Abstracting away the algorithm loop: how to keep algorithms DRY (don't repeat yourself)?
<p>I am writing a toolbox for (PO)MDPs and am seeing a bad pattern emerge. Especially when implementing reinforcement learning algorithms, I tend to repeat myself. See the following pseudo-algorithm:</p>
<pre><code>arguments: epsilon
v &lt;- initial V values
c &lt;- initial C values
while not good-enough
    delta &lt;- 0.0
    if in-place
        v_old &lt;- copy(v)
    else
        v_old &lt;- reference to v
    for s in ss
        a = some_value(s,old_v)
        old_v &lt;- v_old[s]
        v[s] = c*a*v_old[s]
        delta = max(delta,old_v-v[s])
    if delta &lt; epsilon
        good-enough &lt;- true
return v
</code></pre>
<p>Now see this nearly identical algorithm:</p>
<pre><code>arguments: epsilon,gamma
v &lt;- initial V values
c &lt;- initial C values
while not good-enough
    delta &lt;- 0.0
    if in-place
        v_old &lt;- copy(v)
    else
        v_old &lt;- reference to v
    for s in ss
        a,o = get_a_and_o(s)
        old_v &lt;- v_old[s]
        v[s] = c*v_old[s]*exp(o-a)
        delta = max(delta,old_v-v[s])
    if delta &lt; epsilon/(1-gamma)
        good-enough &lt;- true
return v
</code></pre>
<p>The two algorithms differ only in a few small ways (the extra argument, the per-state update, and the convergence threshold), yet I am repeating myself quite a bit. My question is: <strong>how do you abstract away the common parts between these two example algorithms</strong> in a way that also applies to real algorithms?</p>
<p>I have looked at one approach (in Python), where you give the algorithm a pre, a post and a loop function, which are called before, after and for each iteration respectively, and which are passed an algorithm state dictionary to hold variables. But this approach did not seem very nice. Any suggestions?</p>
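For concreteness, here is a minimal sketch of what the pre/loop/post approach mentioned above might look like. Everything in it is an assumption made for illustration: the names `run_algorithm`, `State` and `Hook`, the `"done"` and `"iterations"` keys, the `max_iterations` guard, and the toy numbers (a constant `a = 0.9` stands in for `some_value`). It also uses an absolute difference for `delta`, which the pseudocode above does not.

```python
from typing import Any, Callable, Dict

State = Dict[str, Any]
Hook = Callable[[State], None]


def run_algorithm(pre: Hook, loop: Hook, post: Hook,
                  max_iterations: int = 1000) -> State:
    """Drive an iterative algorithm through three hooks sharing one state dict.

    `pre` runs once before iterating, `loop` runs once per iteration and sets
    state["done"] = True to stop, and `post` runs once after the last iteration.
    """
    state: State = {"done": False, "iterations": 0}
    pre(state)
    while not state["done"] and state["iterations"] < max_iterations:
        loop(state)
        state["iterations"] += 1
    post(state)
    return state


# Hypothetical toy problem: each sweep scales every value by c * a.
def pre(state: State) -> None:
    state["epsilon"] = 1e-6
    state["c"] = 0.5
    state["v"] = {"s0": 1.0, "s1": 2.0}


def loop(state: State) -> None:
    v = state["v"]
    v_old = dict(v)  # snapshot of the previous sweep
    delta = 0.0
    for s in v:
        a = 0.9  # stand-in for some_value(s, ...)
        v[s] = state["c"] * a * v_old[s]
        # Assumption: absolute change, unlike the signed difference above.
        delta = max(delta, abs(v_old[s] - v[s]))
    state["done"] = delta < state["epsilon"]


def post(state: State) -> None:
    state["result"] = dict(state["v"])


final = run_algorithm(pre, loop, post)
```

The sketch also shows why this feels awkward: every variable lives behind a string key in `state`, and the sweep over `ss` plus the `delta` bookkeeping still has to be rewritten inside each `loop` function, so the duplication has moved rather than disappeared.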