    <h1>Foreword</h1> <p>There's a <a href="http://docs.scala-lang.org/overviews/collections/introduction.html" rel="noreferrer">2.8 collection walk-through</a> by Martin Odersky which should probably be your first reference. It has been supplemented as well with <a href="http://docs.scala-lang.org/overviews/core/architecture-of-scala-collections.html" rel="noreferrer">architectural notes</a>, which will be of particular interest to those who want to design their own collections. </p> <p>The rest of this answer was written way before any such thing existed (in fact, before 2.8.0 itself was released).</p> <p>You can find a paper about it as <a href="http://www.scala-lang.org/sid/3" rel="noreferrer">Scala SID #3</a>. Other papers in that area should be interesting as well to people interested in the differences between Scala 2.7 and 2.8.</p> <p>I'll quote from the paper, selectively, and complement with some thoughts of mine. There are also some images, generated by Matthias at decodified.com, and the original SVG files can be found <a href="https://github.com/sirthias/scala-collections-charts/downloads" rel="noreferrer">here</a>.</p> <h1>The collection classes/traits themselves</h1> <p>There are actually three hierarchies of traits for the collections: one for mutable collections, one for immutable collections, and one which doesn't make any assumptions about the collections.</p> <p>There's also a distinction between parallel, serial and maybe-parallel collections, which was introduced with Scala 2.9. I'll talk about them in the next section. The hierarchy described in this section refers <em>exclusively to non-parallel collections</em>.</p> <p>The following image shows the non-specific hierarchy introduced with Scala 2.8: <img src="https://i.stack.imgur.com/bSVyA.png" alt="General collection hierarchy"></p> <p>All elements shown are traits. 
In the other two hierarchies there are also classes directly inheriting the traits as well as classes which can be <em>viewed as</em> belonging in that hierarchy through implicit conversion to wrapper classes. The legend for these graphs can be found after them.</p> <p>Graph for immutable hierarchy: <img src="https://i.stack.imgur.com/2fjoA.png" alt="Immutable collection hierarchy"></p> <p>Graph for mutable hierarchy: <img src="https://i.stack.imgur.com/Dsptl.png" alt="Mutable collection hierarchy"></p> <p>Legend:</p> <p><img src="https://i.stack.imgur.com/szWUr.png" alt="Graph legend"></p> <p>Here's an abbreviated ASCII depiction of the collection hierarchy, for those who can't see the images.</p> <pre><code>                    Traversable
                         |
                         |
                      Iterable
                         |
      +------------------+--------------------+
     Map                Set                  Seq
      |                  |                    |
      |             +----+----+         +-----+------+
  SortedMap     SortedSet   BitSet   Buffer Vector LinearSeq
</code></pre> <h1>Parallel Collections</h1> <p>When Scala 2.9 introduced parallel collections, one of the design goals was to make their use as seamless as possible. In the simplest terms, one can replace a non-parallel (serial) collection with a parallel one, and instantly reap the benefits.</p> <p>However, since all collections until then were serial, many algorithms using them assumed and depended on the fact that they <em>were</em> serial. Parallel collections fed to the methods with such assumptions would fail. For this reason, the entire hierarchy described in the previous section <em>mandates serial processing</em>.</p> <p>Two new hierarchies were created to support the parallel collections.</p> <p>The parallel collections hierarchy has the same names for traits, but preceded with <code>Par</code>: <code>ParIterable</code>, <code>ParSeq</code>, <code>ParMap</code> and <code>ParSet</code>. Note that there is no <code>ParTraversable</code>, since any collection supporting parallel access is capable of supporting the stronger <code>ParIterable</code> trait.
It doesn't have some of the more specialized traits present in the serial hierarchy either. This whole hierarchy is found under the package <code>scala.collection.parallel</code>.</p> <p>The classes implementing parallel collections also differ, with <code>ParHashMap</code> and <code>ParHashSet</code> for both mutable and immutable parallel collections, plus <code>ParRange</code> and <code>ParVector</code> implementing <code>immutable.ParSeq</code> and <code>ParArray</code> implementing <code>mutable.ParSeq</code>.</p> <p>Another hierarchy also exists that mirrors the traits of serial and parallel collections, but with a prefix <code>Gen</code>: <code>GenTraversable</code>, <code>GenIterable</code>, <code>GenSeq</code>, <code>GenMap</code> and <code>GenSet</code>. These traits are <em>parents</em> to both parallel and serial collections. This means that a method taking a <code>Seq</code> cannot receive a parallel collection, but a method taking a <code>GenSeq</code> is expected to work with both serial and parallel collections.</p> <p>Given the way these hierarchies were structured, code written for Scala 2.8 was fully compatible with Scala 2.9, and demanded serial behavior. Without being rewritten, it cannot take advantage of parallel collections, but the changes required are very small.</p> <h2>Using Parallel Collections</h2> <p>Any collection can be converted into a parallel one by calling the method <code>par</code> on it. Likewise, any collection can be converted into a serial one by calling the method <code>seq</code> on it.</p> <p>If the collection was already of the type requested (parallel or serial), no conversion will take place.
If one calls <code>seq</code> on a parallel collection or <code>par</code> on a serial collection, however, a new collection with the requested characteristic will be generated.</p> <p>Do not confuse <code>seq</code>, which turns a collection into a non-parallel collection, with <code>toSeq</code>, which returns a <code>Seq</code> created from the elements of the collection. Calling <code>toSeq</code> on a parallel collection will return a <code>ParSeq</code>, not a serial collection.</p> <h1>The Main Traits</h1> <p>While there are many implementing classes and subtraits, there are some basic traits in the hierarchy, each of which provides more methods or more specific guarantees, but reduces the number of classes that could implement it.</p> <p>In the following subsections, I'll give a brief description of the main traits and the idea behind them.</p> <h2>Trait TraversableOnce</h2> <p>This trait is pretty much like trait <code>Traversable</code> described below, but with the limitation that you can only use it <em>once</em>. That is, any methods called on a <code>TraversableOnce</code> <em>may</em> render it unusable.</p> <p>This limitation makes it possible for the same methods to be shared between the collections and <code>Iterator</code>. A method that works with an <code>Iterator</code> but uses no <code>Iterator</code>-specific methods can therefore work with any collection at all, plus iterators, if it is rewritten to accept <code>TraversableOnce</code>.</p> <p>Because <code>TraversableOnce</code> unifies collections and iterators, it does not appear in the previous graphs, which concern themselves only with collections.</p> <h2>Trait Traversable</h2> <p>At the top of the <em>collection</em> hierarchy is trait <code>Traversable</code>.
Its only abstract operation is</p> <pre><code>def foreach[U](f: Elem =&gt; U) </code></pre> <p>The operation is meant to traverse all elements of the collection, and apply the given operation f to each element. The application is done for its side effect only; in fact any function result of f is discarded by foreach.</p> <p>Traversable objects can be finite or infinite. An example of an infinite traversable object is the stream of natural numbers <code>Stream.from(0)</code>. The method <code>hasDefiniteSize</code> indicates whether a collection is possibly infinite. If <code>hasDefiniteSize</code> returns true, the collection is certainly finite. If it returns false, the collection has not been fully elaborated yet, so it might be infinite or finite.</p> <p>This trait defines methods which can be efficiently implemented in terms of <code>foreach</code> (over 40 of them).</p> <h2>Trait Iterable</h2> <p>This trait declares an abstract method <code>iterator</code> that returns an iterator that yields all the collection’s elements one by one. The <code>foreach</code> method in <code>Iterable</code> is implemented in terms of <code>iterator</code>. Subclasses of <code>Iterable</code> often override <code>foreach</code> with a direct implementation for efficiency.</p> <p>Trait <code>Iterable</code> also adds some less-often used methods to <code>Traversable</code>, which can be implemented efficiently only if an <code>iterator</code> is available. They are summarized below.</p> <pre><code>xs.iterator          An iterator that yields every element in xs, in the same order as foreach traverses elements.
xs takeRight n       A collection consisting of the last n elements of xs (or, some arbitrary n elements, if no order is defined).
xs dropRight n       The rest of the collection except xs takeRight n.
xs sameElements ys   A test whether xs and ys contain the same elements in the same order.
</code></pre> <h2>Other Traits</h2> <p>After <code>Iterable</code> there come three base traits which inherit from it: <code>Seq</code>, <code>Set</code>, and <code>Map</code>. All three have an <code>apply</code> method, and <code>Seq</code> and <code>Map</code> also implement the <code>PartialFunction</code> trait (<code>Set</code> is a plain function from element to <code>Boolean</code>), but the meaning of <code>apply</code> is different in each case.</p> <p>I trust the meaning of <code>Seq</code>, <code>Set</code> and <code>Map</code> is intuitive. After them, the classes break up into specific implementations that offer particular performance guarantees, and the methods they make available as a result. Also available are some traits with further refinements, such as <code>LinearSeq</code>, <code>IndexedSeq</code> and <code>SortedSet</code>.</p> <p><strong>The listing below may be improved. Leave a comment with suggestions and I'll fix it.</strong></p> <h2>Base Classes and Traits</h2> <ul> <li><code>Traversable</code> -- Basic collection class. Can be implemented just with <code>foreach</code>. <ul> <li><code>TraversableProxy</code> -- Proxy for a <code>Traversable</code>.
Just point <code>self</code> to the real collection.</li> <li><code>TraversableView</code> -- A Traversable with some non-strict methods.</li> <li><code>TraversableForwarder</code> -- Forwards most methods to <code>underlying</code>, except <code>toString</code>, <code>hashCode</code>, <code>equals</code>, <code>stringPrefix</code>, <code>newBuilder</code>, <code>view</code> and all calls creating a new iterable object of the same kind.</li> <li><code>mutable.Traversable</code> and <code>immutable.Traversable</code> -- same thing as <code>Traversable</code>, but restricting the collection type.</li> <li>Other special-case <code>Iterable</code> classes, such as <code>MetaData</code>, exist.</li> <li><code>Iterable</code> -- A collection for which an <code>Iterator</code> can be created (through <code>iterator</code>). <ul> <li><code>IterableProxy</code>, <code>IterableView</code>, <code>mutable</code> and <code>immutable.Iterable</code>.</li> </ul></li> </ul></li> <li><code>Iterator</code> -- A trait which is not a descendant of <code>Traversable</code>. Defines <code>next</code> and <code>hasNext</code>. <ul> <li><code>CountedIterator</code> -- An <code>Iterator</code> defining <code>count</code>, which returns the number of elements seen so far.</li> <li><code>BufferedIterator</code> -- Defines <code>head</code>, which returns the next element without consuming it.</li> <li>Other special-case <code>Iterator</code> classes, such as <code>Source</code>, exist.</li> </ul></li> </ul> <h2>The Maps</h2> <ul> <li><code>Map</code> -- An <code>Iterable</code> of <code>Tuple2</code>, which also provides methods for retrieving a value (the second element of the tuple) given a key (the first element of the tuple). Extends <code>PartialFunction</code> as well.
<ul> <li><code>MapProxy</code> -- A <code>Proxy</code> for a <code>Map</code>.</li> <li><code>DefaultMap</code> -- A trait implementing some of <code>Map</code>'s abstract methods.</li> <li><code>SortedMap</code> -- A <code>Map</code> whose keys are sorted. <ul> <li><code>immutable.SortedMap</code> <ul> <li><code>immutable.TreeMap</code> -- A class implementing <code>immutable.SortedMap</code>.</li> </ul></li> </ul></li> <li><code>immutable.Map</code> <ul> <li><code>immutable.MapProxy</code></li> <li><code>immutable.HashMap</code> -- A class implementing <code>immutable.Map</code> through key hashing.</li> <li><code>immutable.IntMap</code> -- A class implementing <code>immutable.Map</code> specialized for <code>Int</code> keys. Uses a tree based on the binary digits of the keys.</li> <li><code>immutable.ListMap</code> -- A class implementing <code>immutable.Map</code> through lists.</li> <li><code>immutable.LongMap</code> -- A class implementing <code>immutable.Map</code> specialized for <code>Long</code> keys.
See <code>IntMap</code>.</li> <li>There are additional classes optimized for a specific number of elements.</li> </ul></li> <li><code>mutable.Map</code> <ul> <li><code>mutable.HashMap</code> -- A class implementing <code>mutable.Map</code> through key hashing.</li> <li><code>mutable.ImmutableMapAdaptor</code> -- A class implementing a <code>mutable.Map</code> from an existing <code>immutable.Map</code>.</li> <li><code>mutable.LinkedHashMap</code> -- ?</li> <li><code>mutable.ListMap</code> -- A class implementing <code>mutable.Map</code> through lists.</li> <li><code>mutable.MultiMap</code> -- A class accepting more than one distinct value for each key.</li> <li><code>mutable.ObservableMap</code> -- A <em>mixin</em> which, when mixed with a <code>Map</code>, publishes events to observers through a <code>Publisher</code> interface.</li> <li><code>mutable.OpenHashMap</code> -- A class based on an open hashing algorithm.</li> <li><code>mutable.SynchronizedMap</code> -- A <em>mixin</em> which should be mixed with a <code>Map</code> to provide a version of it with synchronized methods.</li> <li><code>mutable.MapProxy</code>.</li> </ul></li> </ul></li> </ul> <h2>The Sequences</h2> <ul> <li><code>Seq</code> -- A sequence of elements. One assumes a well-defined size and element repetition. Extends <code>PartialFunction</code> as well. <ul> <li><code>IndexedSeq</code> -- Sequences that support O(1) element access and O(1) length computation. <ul> <li><code>IndexedSeqView</code></li> <li><code>immutable.PagedSeq</code> -- An implementation of <code>IndexedSeq</code> where the elements are produced on-demand by a function passed through the constructor.</li> <li><code>immutable.IndexedSeq</code> <ul> <li><code>immutable.Range</code> -- A delimited sequence of integers, closed on the lower end, open on the high end, and with a step.
<ul> <li><code>immutable.Range.Inclusive</code> -- A <code>Range</code> closed on the high end as well.</li> <li><code>immutable.Range.ByOne</code> -- A <code>Range</code> whose step is 1.</li> </ul></li> <li><code>immutable.NumericRange</code> -- A more generic version of <code>Range</code> which works with any <code>Integral</code>. <ul> <li><code>immutable.NumericRange.Inclusive</code>, <code>immutable.NumericRange.Exclusive</code>.</li> <li><code>immutable.WrappedString</code>, <code>immutable.RichString</code> -- Wrappers which enable seeing a <code>String</code> as a <code>Seq[Char]</code>, while still preserving the <code>String</code> methods. I'm not sure what the difference between them is.</li> </ul></li> </ul></li> <li><code>mutable.IndexedSeq</code> <ul> <li><code>mutable.GenericArray</code> -- A <code>Seq</code>-based array-like structure. Note that the "class" <code>Array</code> is Java's <code>Array</code>, which is more of a memory storage method than a class.</li> <li><code>mutable.ResizableArray</code> -- Internal class used by classes based on resizable arrays.</li> <li><code>mutable.PriorityQueue</code>, <code>mutable.SynchronizedPriorityQueue</code> -- Classes implementing prioritized queues -- queues where the elements are dequeued according to an <code>Ordering</code> first, and order of queueing last.</li> <li><code>mutable.PriorityQueueProxy</code> -- an abstract <code>Proxy</code> for a <code>PriorityQueue</code>.</li> </ul></li> </ul></li> <li><code>LinearSeq</code> -- A trait for linear sequences, with efficient time for <code>isEmpty</code>, <code>head</code> and <code>tail</code>. <ul> <li><code>immutable.LinearSeq</code> <ul> <li><code>immutable.List</code> -- An immutable, singly-linked list implementation.</li> <li><code>immutable.Stream</code> -- A lazy list. Its elements are only computed on-demand, but memoized (kept in memory) afterwards.
It can be theoretically infinite.</li> </ul></li> <li><code>mutable.LinearSeq</code> <ul> <li><code>mutable.DoublyLinkedList</code> -- A list with mutable <code>prev</code>, <code>head</code> (<code>elem</code>) and <code>tail</code> (<code>next</code>).</li> <li><code>mutable.LinkedList</code> -- A list with mutable <code>head</code> (<code>elem</code>) and <code>tail</code> (<code>next</code>).</li> <li><code>mutable.MutableList</code> -- A class used internally to implement classes based on mutable lists. <ul> <li><code>mutable.Queue</code> -- A data structure optimized for FIFO (First-In, First-Out) operations.</li> <li><code>mutable.QueueProxy</code> -- A <code>Proxy</code> for a <code>mutable.Queue</code>.</li> </ul></li> </ul></li> </ul></li> <li><code>SeqProxy</code>, <code>SeqView</code>, <code>SeqForwarder</code></li> <li><code>immutable.Seq</code> <ul> <li><code>immutable.Queue</code> -- A class implementing a FIFO-optimized (First-In, First-Out) data structure. There is no common superclass of both <code>mutable</code> and <code>immutable</code> queues.</li> <li><code>immutable.Stack</code> -- A class implementing a LIFO-optimized (Last-In, First-Out) data structure. There is no common superclass of both <code>mutable</code> and <code>immutable</code> stacks.</li> <li><code>immutable.Vector</code> -- ?</li> <li><code>scala.xml.NodeSeq</code> -- A specialized XML class which extends <code>immutable.Seq</code>.</li> <li><code>immutable.IndexedSeq</code> -- As seen above.</li> <li><code>immutable.LinearSeq</code> -- As seen above.</li> </ul></li> <li><code>mutable.ArrayStack</code> -- A class implementing a LIFO-optimized data structure using arrays.
Supposedly significantly faster than a normal stack.</li> <li><code>mutable.Stack</code>, <code>mutable.SynchronizedStack</code> -- Classes implementing a LIFO-optimized data structure.</li> <li><code>mutable.StackProxy</code> -- A <code>Proxy</code> for a <code>mutable.Stack</code>.</li> <li><code>mutable.Seq</code> <ul> <li><code>mutable.Buffer</code> -- Sequence of elements which can be changed by appending, prepending or inserting new members. <ul> <li><code>mutable.ArrayBuffer</code> -- An implementation of the <code>mutable.Buffer</code> class, with constant amortized time for the append, update and random access operations. It has some specialized subclasses, such as <code>NodeBuffer</code>.</li> <li><code>mutable.BufferProxy</code>, <code>mutable.SynchronizedBuffer</code>.</li> <li><code>mutable.ListBuffer</code> -- A buffer backed by a list. It provides constant time append and prepend, with most other operations being linear.</li> <li><code>mutable.ObservableBuffer</code> -- A <em>mixin</em> trait which, when mixed with a <code>Buffer</code>, provides notification events through a <code>Publisher</code> interface.</li> <li><code>mutable.IndexedSeq</code> -- As seen above.</li> <li><code>mutable.LinearSeq</code> -- As seen above.</li> </ul></li> </ul></li> </ul></li> </ul> <h2>The Sets</h2> <ul> <li><code>Set</code> -- A set is a collection that includes at most one of any object. <ul> <li><code>BitSet</code> -- A set of integers stored as a bitset. <ul> <li><code>immutable.BitSet</code></li> <li><code>mutable.BitSet</code></li> </ul></li> <li><code>SortedSet</code> -- A set whose elements are ordered.
<ul> <li><code>immutable.SortedSet</code> <ul> <li><code>immutable.TreeSet</code> -- An implementation of a <code>SortedSet</code> based on a tree.</li> </ul></li> </ul></li> <li><code>SetProxy</code> -- A <code>Proxy</code> for a <code>Set</code>.</li> <li><code>immutable.Set</code> <ul> <li><code>immutable.HashSet</code> -- An implementation of <code>Set</code> based on element hashing.</li> <li><code>immutable.ListSet</code> -- An implementation of <code>Set</code> based on lists.</li> <li>Additional set classes exist to provide optimized implementations for sets from 0 to 4 elements.</li> <li><code>immutable.SetProxy</code> -- A <code>Proxy</code> for an immutable <code>Set</code>.</li> </ul></li> <li><code>mutable.Set</code> <ul> <li><code>mutable.HashSet</code> -- An implementation of <code>Set</code> based on element hashing.</li> <li><code>mutable.ImmutableSetAdaptor</code> -- A class implementing a mutable <code>Set</code> from an immutable <code>Set</code>.</li> <li><code>LinkedHashSet</code> -- An implementation of <code>Set</code> based on lists.</li> <li><code>ObservableSet</code> -- A <em>mixin</em> trait which, when mixed with a <code>Set</code>, provides notification events through a <code>Publisher</code> interface.</li> <li><code>SetProxy</code> -- A <code>Proxy</code> for a <code>Set</code>.</li> <li><code>SynchronizedSet</code> -- A <em>mixin</em> trait which should be mixed with a <code>Set</code> to provide a version of it with synchronized methods.</li> </ul></li> </ul></li> </ul> <hr> <ul> <li>Why the Like classes exist (e.g. TraversableLike)</li> </ul> <p>This was done to achieve maximum code reuse. The concrete <em>generic</em> implementation for classes with a certain structure (a traversable, a map, etc) is done in the Like classes. The classes intended for general consumption, then, override selected methods that can be optimized.</p> <ul> <li>What the companion methods are for (e.g.
List.companion)</li> </ul> <p>The builder for the classes, i.e., the object which knows how to create instances of that class in a way that can be used by methods like <code>map</code>, is created by a method in the companion object. So, in order to build an object of type X, I need to get that builder from the companion object of X. Unfortunately, there is no way, in Scala, to get from class X to object X. Because of that, there is a method defined in each instance of X, <code>companion</code>, which returns the companion object of class X.</p> <p>While there might be some use for such a method in normal programs, its target is enabling code reuse in the collection library.</p> <ul> <li>How I know what implicit objects are in scope at a given point</li> </ul> <p>You aren't supposed to care about that. They are implicit precisely so that you don't need to figure out how to make it work.</p> <p>These implicits exist to enable the methods on the collections to be defined on parent classes but still return a collection of the same type. For example, the <code>map</code> method is defined on <code>TraversableLike</code>, but if you use it on a <code>List</code> you'll get a <code>List</code> back.</p>
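<p>A minimal sketch of that "same type back" behaviour, for illustration only: the object name <code>MapKeepsCollectionType</code> and its field names are made up here, and the type annotations are there just so the compiler checks each static result type.</p>
<pre><code>object MapKeepsCollectionType {
  // map on a List gives a List, on a Set a Set, on a Map (with pair results) a Map.
  val incremented: List[Int] = List(1, 2, 3).map(_ + 1)
  val parities: Set[Int] = Set(1, 2, 3).map(_ % 2)
  val swapped: Map[String, Int] = Map((1, "a"), (2, "b")).map(_.swap)

  // When the results are not pairs they cannot form a Map,
  // so a more general collection type comes back instead.
  val values: Iterable[String] = Map((1, "a"), (2, "b")).map(_._2)

  def main(args: Array[String]): Unit = {
    println(incremented)
    println(parities)
    println(swapped)
    println(values)
  }
}
</code></pre>
<p>Note that the <code>Set</code> result has fewer elements than the input, because duplicates collapse once the mapped values land in a set again.</p>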