StackOverflow2013


plurals

POWhy is mclapply slower than apply in this case?
primarykey
Id
18123125
data
AcceptedAnswerId
18127147
AnswerCount
1
ClosedDate
CommentCount
2
CommunityOwnedDate
CreationDate
2013-08-08T10:01:55.440
FavoriteCount
0
LastActivityDate
2013-08-08T13:39:33.277
LastEditDate
LastEditorUserId
0
OwnerUserId
1991825
ParentId
0
PostTypeId
1
Score
1
ViewCount
529
LastEditorDisplayName
text
Body
I'm pretty confused. I want to speed up my algorithm with mclapply from the parallel package, but when I compare timings, apply still wins. I'm smoothing log2ratio data with rq.fit.fnb from the quantreg package, which is called by my function quantsm, and I wrap my data into a matrix or a list for use with apply or lapply/mclapply.

I prepare my data like this:

<pre><code>q = matrix(data, ncol=N)        # wrapping into matrix (using N = 2, 4, 6 or 8)
ql = as.list(as.data.frame(q))  # making list
</code></pre>

And the timing comparison:

<pre><code>apply=system.time(apply(q, 1, FUN=quantsm, 0.50, 2))
lapply=system.time(lapply(ql, FUN=quantsm, 0.50, 2))
mc2lapply=system.time(mclapply(ql, FUN=quantsm, 0.50, 2, mc.cores=2))
mc4lapply=system.time(mclapply(ql, FUN=quantsm, 0.50, 2, mc.cores=4))
mc6lapply=system.time(mclapply(ql, FUN=quantsm, 0.50, 2, mc.cores=6))
mc8lapply=system.time(mclapply(ql, FUN=quantsm, 0.50, 2, mc.cores=8))
timing=rbind(apply,lapply,mc2lapply,mc4lapply,mc6lapply,mc8lapply)
</code></pre>

Function quantsm:

<pre><code>quantsm <- function (y, p = 0.5, lambda) {
  # Quantile smoothing
  # Input: response y, quantile level p (0<p<1), smoothing parameter lambda
  # Result: quantile curve

  # Augment the data for the difference penalty
  m <- length(y)
  E <- diag(m)
  Dmat <- diff(E)
  X <- rbind(E, lambda * Dmat)
  u <- c(y, rep(0, m - 1))

  # Call quantile regression
  q <- rq.fit.fnb(X, u, tau = p)
  q
}
</code></pre>

Function rq.fit.fnb (quantreg library):

<pre><code>rq.fit.fnb <- function (x, y, tau = 0.5, beta = 0.99995, eps = 1e-06)
{
    n <- length(y)
    p <- ncol(x)
    if (n != nrow(x))
        stop("x and y don't match n")
    if (tau < eps || tau > 1 - eps)
        stop("No parametric Frisch-Newton method. Set tau in (0,1)")
    rhs <- (1 - tau) * apply(x, 2, sum)
    d <- rep(1, n)
    u <- rep(1, n)
    wn <- rep(0, 10 * n)
    wn[1:n] <- (1 - tau)
    z <- .Fortran("rqfnb", as.integer(n), as.integer(p),
        a = as.double(t(as.matrix(x))), c = as.double(-y),
        rhs = as.double(rhs), d = as.double(d), as.double(u),
        beta = as.double(beta), eps = as.double(eps),
        wn = as.double(wn), wp = double((p + 3) * p),
        it.count = integer(3), info = integer(1),
        PACKAGE = "quantreg")
    coefficients <- -z$wp[1:p]
    names(coefficients) <- dimnames(x)[[2]]
    residuals <- y - x %*% coefficients
    list(coefficients = coefficients, tau = tau, residuals = residuals)
}
</code></pre>

For a data vector of length 2000 I get the following (values are elapsed time in seconds; columns correspond to the number of columns of the smoothed matrix/list):

<pre><code>            2cols   4cols   6cols   8cols
apply       0.178   0.096   0.069   0.056
lapply     16.555   4.299   1.785   0.972
mc2lapply  11.192   2.089   0.927   0.545
mc4lapply  10.649   1.326   0.694   0.396
mc6lapply  11.271   1.384   0.528   0.320
mc8lapply  10.133   1.390   0.560   0.260
</code></pre>

For data of length 4000 I get:

<pre><code>             2cols   4cols   6cols   8cols
apply        0.351   0.187   0.137   0.110
lapply     189.339  32.654  14.544   8.674
mc2lapply  186.047  20.791   7.261   4.231
mc4lapply  185.382  30.286   5.767   2.397
mc6lapply  184.048  30.170   8.059   2.865
mc8lapply  182.611  37.617   7.408   2.842
</code></pre>

Why is apply so much more efficient than mclapply? Maybe I'm just making a common beginner's mistake. Thank you for your responses.
Tags
<r><parallel-processing><apply><smoothing><mclapply>
Title
Why is mclapply slower than apply in this case?
singulars
PostAcceptedAnswerId
1. PO
 singulars
 PostTypePostTypeId
 PTAnswer
PostParentId
1. This table or related slice is empty.
PostTypePostTypeId
1. PTQuestion
UserLastEditorUserId
1. This table or related slice is empty.
UserOwnerUserId
1. USuser1991825
plurals
PostLinksPostIdRelatedPostId
1. This table or related slice is empty.
PostLinksRelatedPostIdPostId
1. This table or related slice is empty.
PostsAcceptedAnswerId
1. This table or related slice is empty.
PostsParentIdCreationDate
1. PO
 singulars
 PostTypePostTypeId
 PTAnswer
VotesPostIdCreationDate
1. This table or related slice is empty.
CommentsPostId
1. COShouldn't your `apply` call use `MARGIN = 2`, not `1`?
 singulars
 PostPostId
 POWhy is mclapply slower than apply in this case?
 UserUserId
 USflodel
2. COOf course! Thank you both (@flodel and @SteveWeston). I really overlooked that.
 singulars
 PostPostId
 POWhy is mclapply slower than apply in this case?
 UserUserId
 USuser1991825
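
The point raised in the first comment can be illustrated with a small base R sketch. It assumes random toy data and uses a hypothetical stand-in, `problem_size`, in place of `quantsm`, so that it runs without quantreg; the stand-in only reports the length of the vector it receives and the dimensions of the augmented matrix `X` that `quantsm` would build from it. With `MARGIN = 1`, `apply` walks over rows (many tiny problems), whereas `lapply` and `mclapply` over `ql` walk over columns (a few large problems), so the two timings would not be measuring the same work.

```r
# Toy data: 2000 values wrapped into N = 2 columns, as in the question.
set.seed(1)
data <- rnorm(2000)
N <- 2
q <- matrix(data, ncol = N)
ql <- as.list(as.data.frame(q))

# Hypothetical stand-in for quantsm (not the original function):
# returns the input length m and the dimensions of the augmented
# matrix X = rbind(E, lambda * Dmat), which is (2m - 1) x m.
problem_size <- function(y, p = 0.5, lambda = 2) {
  m <- length(y)
  c(m = m, X_rows = 2 * m - 1, X_cols = m)
}

by_row  <- apply(q, 1, FUN = problem_size, 0.50, 2)  # MARGIN = 1: one call per row
by_col  <- apply(q, 2, FUN = problem_size, 0.50, 2)  # MARGIN = 2: one call per column
by_list <- sapply(ql, FUN = problem_size, 0.50, 2)   # what lapply/mclapply iterate over

ncol(by_row)   # 1000 calls, each on a vector of length 2 (X is 3 x 2)
ncol(by_col)   # 2 calls, each on a vector of length 1000 (X is 1999 x 1000)
by_row[, 1]
by_col[, 1]
all(by_col == by_list)  # column-wise apply matches the list-based calls
```

Under these assumptions, a like-for-like comparison would time `apply(q, 2, ...)` against `lapply(ql, ...)` and `mclapply(ql, ...)`.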
