StackOverflow2013

plurals

POR Example - ddply, ave, and merge
primarykey
Id
19892125
data
AcceptedAnswerId
19991291
AnswerCount
2
ClosedDate
CommentCount
4
CommunityOwnedDate
CreationDate
2013-11-10T16:30:28.503
FavoriteCount
2
LastActivityDate
2013-11-15T00:32:08.433
LastEditDate
2013-11-11T04:49:25.863
LastEditorUserId
243796
OwnerUserId
243796
ParentId
0
PostTypeId
1
Score
3
ViewCount
1037
LastEditorDisplayName
text
Body
I have written some code, and it would be great if you could suggest a better way of doing what I am trying to do. The data frame dt looks like this:
<pre><code>   SIC FYEAR AU       AT
1    1  2003  6  212.748
2    1  2003  5 3987.884
3    1  2003  4  100.835
4    1  2003  4 1706.719
5    1  2003  5    9.159
6    1  2003  7   60.069
7    1  2003  5  100.696
8    1  2003  4  113.865
9    1  2003  6  431.552
10   1  2003  7  309.109
...
</code></pre>
My task is to create a new column: for a given SIC and FYEAR, the AU with the highest share of AT gets the value 1 if its share exceeds the second-highest share by more than 10%; every other AU gets 0. Here is my attempt.
<pre><code>a <- ddply(dt,.(SIC,FYEAR),function(x){ddply(x,.(AU),function(x) sum(x$AT))});
   SIC FYEAR AU        V1
1    1  2003  4  3412.619
2    1  2003  5 13626.241
3    1  2003  6   644.300
4    1  2003  7  1478.633
5    1  2003  9     0.003
6    1  2004  4  3976.242
7    1  2004  5  9383.516
8    1  2004  6   457.023
9    1  2004  7   456.167
10   1  2004  9   238.282
</code></pre>
where V1 represents the sum of AT over all rows for a given AU within a given SIC and FYEAR. Next I do:
<pre><code>a$V1 <- ave(a$V1, a$SIC, a$FYEAR, FUN = function(x) x/sum(x));
   SIC FYEAR AU           V1
1    1  2003  4 1.780949e-01
2    1  2003  5 7.111150e-01
3    1  2003  6 3.362420e-02
4    1  2003  7 7.716568e-02
5    1  2003  9 1.565615e-07
6    1  2004  4 2.740114e-01
7    1  2004  5 6.466382e-01
8    1  2004  6 3.149444e-02
9    1  2004  7 3.143545e-02
10   1  2004  9 1.642052e-02
</code></pre>
The column V1 now holds each AU's share of the total AT for a given SIC and FYEAR.
Next,
<pre><code>a$V2 <- ave(a$V1, a$SIC, a$FYEAR, FUN = function(x) {t<-((sort(x, TRUE))[2]); ifelse((x-t)> 0.1,1,0)});
   SIC FYEAR AU           V1 V2
1    1  2003  4 1.780949e-01  0
2    1  2003  5 7.111150e-01  1
3    1  2003  6 3.362420e-02  0
4    1  2003  7 7.716568e-02  0
5    1  2003  9 1.565615e-07  0
6    1  2004  4 2.740114e-01  0
7    1  2004  5 6.466382e-01  1
8    1  2004  6 3.149444e-02  0
9    1  2004  7 3.143545e-02  0
10   1  2004  9 1.642052e-02  0
</code></pre>
For a given SIC and FYEAR, the AU with the highest share of AT gets 1 if the difference from the second-highest share is greater than 10%; otherwise it gets 0. Then I merge the result with the original data dt.
<pre><code>dt <- merge(dt,a,by=c("SIC","FYEAR","AU"));
   SIC FYEAR AU       AT           V1 V2
1    1  2003  4 1706.719 1.780949e-01  0
2    1  2003  4  100.835 1.780949e-01  0
3    1  2003  4  113.865 1.780949e-01  0
4    1  2003  4 1491.200 1.780949e-01  0
5    1  2003  5 3987.884 7.111150e-01  1
6    1  2003  5  100.696 7.111150e-01  1
7    1  2003  5   67.502 7.111150e-01  1
8    1  2003  5 9461.000 7.111150e-01  1
9    1  2003  5    9.159 7.111150e-01  1
10   1  2003  6  212.748 3.362420e-02  0
</code></pre>
What I did is very cumbersome. Is there a better way to do the same thing? Thanks.
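For reference, the four steps described above (sum per AU, convert to shares, flag the leader, merge back) can be sketched in base R without the nested ddply calls. This is only an illustrative sketch, not the asker's code or an accepted answer. The toy values in dt are made up, the helper name flag_leader and its gap parameter are hypothetical, and returning 0 for a group with a single AU is an assumption (the original expression would produce NA there).

```r
# Toy data in the same shape as the question's dt (values are made up).
dt <- data.frame(
  SIC   = c(1, 1, 1, 1, 1, 1),
  FYEAR = c(2003, 2003, 2003, 2003, 2004, 2004),
  AU    = c(4, 5, 5, 6, 4, 5),
  AT    = c(100, 300, 400, 200, 500, 450)
)

# Step 1: total AT per (SIC, FYEAR, AU); one row per group.
a <- aggregate(AT ~ SIC + FYEAR + AU, data = dt, FUN = sum)
names(a)[names(a) == "AT"] <- "V1"

# Step 2: turn the totals into each AU's share within its (SIC, FYEAR) cell.
a$V1 <- ave(a$V1, a$SIC, a$FYEAR, FUN = function(x) x / sum(x))

# Step 3: flag an AU whose share exceeds the second-highest share by more than gap.
# flag_leader and gap are hypothetical names used only for this sketch.
flag_leader <- function(x, gap = 0.1) {
  # Assumption: with fewer than two AUs there is no runner-up, so flag nothing.
  if (length(x) < 2) return(rep(0, length(x)))
  second <- sort(x, decreasing = TRUE)[2]
  as.numeric(x - second > gap)
}
a$V2 <- ave(a$V1, a$SIC, a$FYEAR, FUN = flag_leader)

# Step 4: attach the share and the flag to every original row.
res <- merge(dt, a, by = c("SIC", "FYEAR", "AU"))
print(res)
```

In the toy data, AU 5 leads 2003 by a wide margin and is flagged, while in 2004 the two AUs are within 10 percentage points of each other, so neither is flagged.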
Tags
<r><dataframe><plyr>
Title
R Example - ddply, ave, and merge
singulars
PostAcceptedAnswerId
1. PO
 singulars
 PostTypePostTypeId
 PTAnswer
PostParentId
1. This table or related slice is empty.
PostTypePostTypeId
1. PTQuestion
UserLastEditorUserId
1. USSumit
UserOwnerUserId
1. USSumit
plurals
PostLinksPostIdRelatedPostId
1. This table or related slice is empty.
PostLinksRelatedPostIdPostId
1. PL
 singulars
 LinkTypeLinkTypeId
 LTLinked
PostsAcceptedAnswerId
1. This table or related slice is empty.
PostsParentIdCreationDate
1. PO
 singulars
 PostTypePostTypeId
 PTAnswer
2. PO
 singulars
 PostTypePostTypeId
 PTAnswer
VotesPostIdCreationDate
1. VO
 singulars
 PostPostId
 POR Example - ddply, ave, and merge
 UserUserId
 USSumit
 VoteTypeVoteTypeId
 VTBountyStart
2. VO
 singulars
 PostPostId
 POR Example - ddply, ave, and merge
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
3. VO
 singulars
 PostPostId
 POR Example - ddply, ave, and merge
 UserUserId
 USSESman
 VoteTypeVoteTypeId
 VTFavorite
CommentsPostId
