Note that there are some explanatory texts on larger screens.

plurals
  1. PO
    primarykey
    data
    text
    <p><strong>Edit:</strong> I was wrong, it's not a question of value type vs reference type. The performance problem was related to the hash function, as explained in other comments. I keep my answer here because there's an interessant discussion. My code partially fixed the performance issue, but this is not the clean and recommended solution.</p> <p>--</p> <p>On my computer, I made your sample run <em>twice as fast</em> by replacing the tuple with a struct. This means, the equivalent F# code should run faster than your Python code. I don't agree with the comments saying that .NET hashtables are slow, I believe there's no significant difference with Python or other languages implementations. Also, I don't agree with the "You can't 1-to-1 translate code expect it to be faster": F# code will generally be faster than Python for most tasks (static typing is very helpful to the compiler). In your sample, most of the time is spent doing hashtable lookups, so it's fair to imagine that both languages <em>should</em> be almost as fast.</p> <p>I think the performance issue is related to gabage collection (but I haven't checked with a profiler). The reason why using tuples can be slower here than structures has been discussed in a SO question ( <a href="https://stackoverflow.com/questions/2410710/why-is-the-new-tuple-type-in-net-4-0-a-reference-type-class-and-not-a-value-ty">Why is the new Tuple type in .Net 4.0 a reference type (class) and not a value type (struct)</a>) and a MSDN page (<a href="http://msdn.microsoft.com/en-us/magazine/dd942829.aspx#id0400060" rel="nofollow noreferrer">Building tuples</a>):</p> <blockquote> <p>If they are reference types, this means there can be lots of garbage generated if you are changing elements in a tuple in a tight loop. [...] F# tuples were reference types, but there was a feeling from the team that they could realize a performance improvement if two, and perhaps three, element tuples were value types instead. Some teams that had created internal tuples had used value instead of reference types, because their scenarios were very sensitive to creating lots of managed objects.</p> </blockquote> <p>Of course, as Jon said in another comment, the obvious optimization in your example is to replace hashtables with arrays. Arrays are obviously much faster (integer index, no hashing, no collision handling, no reallocation, more compact), but this is very specific to your problem, and it doesn't explain the performance difference with Python (as far as I know, Python code is using hashtables, not arrays).</p> <p>To reproduce my 50% speedup, here is the full code: <a href="http://pastebin.com/nbYrEi5d" rel="nofollow noreferrer">http://pastebin.com/nbYrEi5d</a></p> <p>In short, I replaced the tuple with this type:</p> <pre><code>type Tup = {x: int; y: int} </code></pre> <p>Also, it seems like a detail, but you should move the <code>List.mapi (fun i x -&gt; (i,x)) fileSizes</code> out of the enclosing loop. I believe Python <code>enumerate</code> does not actually allocate a list (so it's fair to allocate the list only once in F#, or use <code>Seq</code> module, or use a mutable counter).</p>
    singulars
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. VO
      singulars
      1. This table or related slice is empty.
    2. VO
      singulars
      1. This table or related slice is empty.
    3. VO
      singulars
      1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload