Note that there are some explanatory texts on larger screens.

plurals
  1. PO
    text
    copied!<p>The implementation below provides a Python-like generator.</p> <p>Notice that there's a function called <code>_yield</code> in the code below, because <code>yield</code> is already a keyword in Scala, which by the way, does not have anything to do with <code>yield</code> you know from Python.</p> <pre><code>import scala.annotation.tailrec import scala.collection.immutable.Stream import scala.util.continuations._ object Generators { sealed trait Trampoline[+T] case object Done extends Trampoline[Nothing] case class Continue[T](result: T, next: Unit =&gt; Trampoline[T]) extends Trampoline[T] class Generator[T](var cont: Unit =&gt; Trampoline[T]) extends Iterator[T] { def next: T = { cont() match { case Continue(r, nextCont) =&gt; cont = nextCont; r case _ =&gt; sys.error("Generator exhausted") } } def hasNext = cont() != Done } type Gen[T] = cps[Trampoline[T]] def generator[T](body: =&gt; Unit @Gen[T]): Generator[T] = { new Generator((Unit) =&gt; reset { body; Done }) } def _yield[T](t: T): Unit @Gen[T] = shift { (cont: Unit =&gt; Trampoline[T]) =&gt; Continue(t, cont) } } object TestCase { import Generators._ def sectors = generator { def tailrec(seq: Seq[String]): Unit @Gen[String] = { if (!seq.isEmpty) { _yield(seq.head) tailrec(seq.tail) } } val list: Seq[String] = List("Financials", "Materials", "Technology", "Utilities") tailrec(list) } def main(args: Array[String]): Unit = { for (s &lt;- sectors) { println(s) } } } </code></pre> <p>It works pretty well, including for the typical usage of for loops.</p> <p>Caveat: we need to remember that Python and Scala differ in the way continuations are implemented. Below we see how generators are typically used in Python and compare to the way we have to use them in Scala. Then, we will see why it needs to be like so in Scala.</p> <p>If you are used to writing code in Python, you've probably used generators like this:</p> <pre><code>// This is Scala code that does not compile :( // This code naively tries to mimic the way generators are used in Python def myGenerator = generator { val list: Seq[String] = List("Financials", "Materials", "Technology", "Utilities") list foreach {s =&gt; _yield(s)} } </code></pre> <p>This code above does not compile. Skipping all convoluted theoretical aspects, the explanation is: it fails to compile because <em>"the type of the for loop"</em> does not match the type involved as part of the continuation. I'm afraid this explanation is a complete failure. Let me try again:</p> <p>If you had coded something like shown below, it would compile fine:</p> <pre><code>def myGenerator = generator { _yield("Financials") _yield("Materials") _yield("Technology") _yield("Utilities") } </code></pre> <p>This code compiles because the generator can be <em>decomposed</em> in a sequence of <code>yield</code>s and, in this case, a <code>yield</code> matches the type involved in the continuation. To be more precise, the code can be decomposed onto chained blocks, where each block ends with a <code>yield</code>. Just for the sake of clarification, we can think that the sequence of <code>yield</code>s could be expressed like this:</p> <pre><code>{ some code here; _yield("Financials") { some other code here; _yield("Materials") { eventually even some more code here; _yield("Technology") { ok, fine, youve got the idea, right?; _yield("Utilities") }}}} </code></pre> <p>Again, without going deep into convoluted theory, the point is that, after a <code>yield</code> you need to provide another block that ends with a <code>yield</code>, or close the chain otherwise. This is what we are doing in the pseudo-code above: after the <code>yield</code> we are opening another block which in turn ends with a <code>yield</code> followed by another <code>yield</code> which in turn ends with another <code>yield</code>, and so on. Obviously this thing must end at some point. Then the only thing we are allowed to do is closing the entire chain.</p> <p>OK. But... how we can <code>yield</code> multiple pieces of information? The answer is a little obscure but makes a lot of sense after you know the answer: we need to employ tail recursion, and the the last statement of a block must be a <code>yield</code>.</p> <pre><code> def myGenerator = generator { def tailrec(seq: Seq[String]): Unit @Gen[String] = { if (!seq.isEmpty) { _yield(seq.head) tailrec(seq.tail) } } val list = List("Financials", "Materials", "Technology", "Utilities") tailrec(list) } </code></pre> <p>Let's analyze what's going on here:</p> <ol> <li><p>Our generator function <code>myGenerator</code> contains some logic that obtains that generates information. In this example, we simply use a sequence of strings.</p></li> <li><p>Our generator function <code>myGenerator</code> calls a recursive function which is responsible for <code>yield</code>-ing multiple pieces of information, obtained from our sequence of strings.</p></li> <li><p>The recursive function <strong>must be declared before use</strong>, otherwise the compiler crashes.</p></li> <li><p>The recursive function <code>tailrec</code> provides the tail recursion we need.</p></li> </ol> <p>The rule of thumb here is simple: substitute a for loop with a recursive function, as demonstrated above.</p> <p>Notice that <code>tailrec</code> is just a convenient name we found, for the sake of clarification. In particular, <code>tailrec</code> does not need to be the last statement of our generator function; not necessarily. The only restriction is that you have to provide a sequence of blocks which match the type of an <code>yield</code>, like shown below:</p> <pre><code> def myGenerator = generator { def tailrec(seq: Seq[String]): Unit @Gen[String] = { if (!seq.isEmpty) { _yield(seq.head) tailrec(seq.tail) } } _yield("Before the first call") _yield("OK... not yet...") _yield("Ready... steady... go") val list = List("Financials", "Materials", "Technology", "Utilities") tailrec(list) _yield("done") _yield("long life and prosperity") } </code></pre> <p>One step further, you must be imagining how real life applications look like, in particular if you are employing several generators. It would be a good idea if you find a way to <em>standardize</em> your generators around a single pattern that demonstrates to be convenient for most circumstances.</p> <p>Let's examine the example below. We have three generators: <code>sectors</code>, <code>industries</code> and <code>companies</code>. For brevity, only <code>sectors</code> is completely shown. This generator employs a <code>tailrec</code> function as demonstrated already above. The trick here is that the same <code>tailrec</code> function is also employed by other generators. All we have to do is supply a different <code>body</code> function.</p> <pre><code>type GenP = (NodeSeq, NodeSeq, NodeSeq) type GenR = immutable.Map[String, String] def tailrec(p: GenP)(body: GenP =&gt; GenR): Unit @Gen[GenR] = { val (stats, rows, header) = p if (!stats.isEmpty &amp;&amp; !rows.isEmpty) { val heads: GenP = (stats.head, rows.head, header) val tails: GenP = (stats.tail, rows.tail, header) _yield(body(heads)) // tail recursion tailrec(tails)(body) } } def sectors = generator[GenR] { def body(p: GenP): GenR = { // unpack arguments val stat, row, header = p // obtain name and url val name = (row \ "a").text val url = (row \ "a" \ "@href").text // create map and populate fields: name and url var m = new scala.collection.mutable.HashMap[String, String] m.put("name", name) m.put("url", url) // populate other fields (header, stat).zipped.foreach { (k, v) =&gt; m.put(k.text, v.text) } // returns a map m } val root : scala.xml.NodeSeq = cache.loadHTML5(urlSectors) // obtain entire page val header: scala.xml.NodeSeq = ... // code is omitted val stats : scala.xml.NodeSeq = ... // code is omitted val rows : scala.xml.NodeSeq = ... // code is omitted // tail recursion tailrec((stats, rows, header))(body) } def industries(sector: String) = generator[GenR] { def body(p: GenP): GenR = { //++ similar to 'body' demonstrated in "sectors" // returns a map m } //++ obtain NodeSeq variables, like demonstrated in "sectors" // tail recursion tailrec((stats, rows, header))(body) } def companies(sector: String) = generator[GenR] { def body(p: GenP): GenR = { //++ similar to 'body' demonstrated in "sectors" // returns a map m } //++ obtain NodeSeq variables, like demonstrated in "sectors" // tail recursion tailrec((stats, rows, header))(body) } </code></pre> <ul> <li><p>Credits to Rich Dougherty and huynhjl.<br/> See this SO thread: <a href="https://stackoverflow.com/questions/2201882/implementing-yield-yield-return-using-scala-continuations">Implementing yield (yield return) using Scala continuations</a>*</p></li> <li><p>Credits to Miles Sabin, for putting some of the code above together<br/> <a href="http://github.com/milessabin/scala-cont-jvm-coro-talk/blob/master/src/continuations/Generators.scala" rel="nofollow noreferrer">http://github.com/milessabin/scala-cont-jvm-coro-talk/blob/master/src/continuations/Generators.scala</a></p></li> </ul>
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload