StackOverflow2013


plurals

POWhy does this parallel code in D scale so badly?
primarykey
Id
17955902
data
AcceptedAnswerId
0
AnswerCount
2
ClosedDate
CommentCount
11
CommunityOwnedDate
CreationDate
2013-07-30T19:39:28.257
FavoriteCount
3
LastActivityDate
2013-07-31T19:47:18.227
LastEditDate
2013-07-30T23:37:19.910
LastEditorUserId
626537
OwnerUserId
626537
ParentId
0
PostTypeId
1
Score
10
ViewCount
513
LastEditorDisplayName
text
Body
Here is an experiment I performed to compare parallelism in C++ and D. I implemented an algorithm (a parallel label propagation scheme for community detection in networks) in both languages, using the same design: a parallel iterator takes a handler function (normally a closure) and applies it to every node in the graph.
Here is the iterator in D, implemented using <code>taskPool</code> from <code>std.parallelism</code>:
<pre><code>/**
 * Iterate in parallel over all nodes of the graph and call handler (lambda closure).
 */
void parallelForNodes(F)(F handle) {
    foreach (node v; taskPool.parallel(std.range.iota(z))) {
        // call here
        handle(v);
    }
}
</code></pre>
And this is the handler function that is passed:
<pre><code>auto propagateLabels = (node v) {
    if (active[v] && (G.degree(v) > 0)) {
        integer[label] labelCounts;

        G.forNeighborsOf(v, (node w) {
            label lw = labels[w];
            labelCounts[lw] += 1; // add weight of edge {v, w}
        });

        // get dominant label
        label dominant;
        integer lcmax = 0;
        foreach (label l, integer lc; labelCounts) {
            if (lc > lcmax) {
                dominant = l;
                lcmax = lc;
            }
        }

        if (labels[v] != dominant) { // UPDATE
            labels[v] = dominant;
            nUpdated += 1; // TODO: atomic update?
            G.forNeighborsOf(v, (node u) {
                active[u] = 1;
            });
        } else {
            active[v] = 0;
        }
    }
};
</code></pre>
The C++11 implementation is almost identical, but uses OpenMP for parallelization. So what do the scaling experiments show?
<img src="https://i.stack.imgur.com/2LloS.png" alt="scaling">
Here I examine weak scaling: I double the input graph size while also doubling the number of threads, and measure the running time. Ideally the running time would stay constant (a flat line), but of course there is some overhead for parallelism. I use <code>defaultPoolThreads(nThreads)</code> in my main function to set the number of threads for the D program. The curve for C++ looks good, but the curve for D looks surprisingly bad. Am I doing something wrong w.r.t. D parallelism, or does this reflect badly on the scalability of parallel D programs?
P.S. Compiler flags for D: <code>rdmd -release -O -inline -noboundscheck</code>; for C++: <code>-std=c++11 -fopenmp -O3 -DNDEBUG</code>
P.P.S. Something must be really wrong, because the D implementation is slower in parallel than sequentially:
<img src="https://i.stack.imgur.com/rtUih.png" alt="enter image description here">
P.P.P.S. For the curious, here are the Mercurial clone URLs for both implementations:
<ul>
<li><a href="http://algohub.iti.kit.edu/parco/Prototypes/PLPd" rel="nofollow noreferrer">http://algohub.iti.kit.edu/parco/Prototypes/PLPd</a></li>
<li><a href="http://algohub.iti.kit.edu/parco/Prototypes/PLPcpp" rel="nofollow noreferrer">http://algohub.iti.kit.edu/parco/Prototypes/PLPcpp</a></li>
</ul>
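To make the weak-scaling measurement concrete, here is a minimal sketch in C++ of how the efficiency of each run could be computed from measured running times. It is not taken from either repository: the function name <code>weakScalingEfficiency</code> and the layout of the <code>timings</code> vector are assumptions made for illustration. It assumes that run <code>i</code> uses 2^i threads on an input 2^i times the base size, so the work per thread stays constant.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of PLPd or PLPcpp).
// timings[i] is the wall-clock time of the run with 2^i threads on an input
// that is 2^i times the base size, so the work per thread is held constant.
// Returns timings[0] / timings[i] for every run: 1.0 is ideal weak scaling,
// values below 1.0 mean the larger run took longer than the baseline.
std::vector<double> weakScalingEfficiency(const std::vector<double>& timings) {
    assert(!timings.empty() && timings.front() > 0.0);
    const double baseline = timings.front();

    std::vector<double> efficiency;
    efficiency.reserve(timings.size());
    for (double t : timings) {
        assert(t > 0.0);  // a running time must be positive
        efficiency.push_back(baseline / t);
    }
    return efficiency;
}
```

With made-up timings of 2.0, 2.5 and 4.0 seconds for 1, 2 and 4 threads, this yields efficiencies of 1.0, 0.8 and 0.5. An efficiency that keeps dropping as threads are added is what a rising weak-scaling curve looks like in numbers.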
Tags
<c++><performance><parallel-processing><d>
Title
Why does this parallel code in D scale so badly?
singulars
PostAcceptedAnswerId
1. This table or related slice is empty.
PostParentId
1. This table or related slice is empty.
PostTypePostTypeId
1. PTQuestion
UserLastEditorUserId
1. USclstaudt
UserOwnerUserId
1. USclstaudt
plurals
PostLinksPostIdRelatedPostId
1. This table or related slice is empty.
PostLinksRelatedPostIdPostId
1. This table or related slice is empty.
PostsAcceptedAnswerId
1. This table or related slice is empty.
PostsParentIdCreationDate
1. PO
 singulars
 PostTypePostTypeId
 PTAnswer
2. PO
 singulars
 PostTypePostTypeId
 PTAnswer
VotesPostIdCreationDate
1. VO
 singulars
 PostPostId
 POWhy does this parallel code in D scale so badly?
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
2. VO
 singulars
 PostPostId
 POWhy does this parallel code in D scale so badly?
 UserUserId
 USKyle
 VoteTypeVoteTypeId
 VTFavorite
3. VO
 singulars
 PostPostId
 POWhy does this parallel code in D scale so badly?
 UserUserId
 USclstaudt
 VoteTypeVoteTypeId
 VTFavorite
CommentsPostId
