StackOverflow2013

Note that there are some explanatory texts on larger screens.

plurals

POHow obtain the true residual deviance and degrees of freedom in R of a glm model when a set of parameters gets pasted() as a vector
primarykey
Id
14509495
data
AcceptedAnswerId
14700837
AnswerCount
1
ClosedDate
CommentCount
1
CommunityOwnedDate
CreationDate
2013-01-24T19:47:47.037
FavoriteCount
0
LastActivityDate
2013-02-09T13:50:41.573
LastEditDate
2013-02-09T13:50:41.573
LastEditorUserId
559784
OwnerUserId
1009215
ParentId
0
PostTypeId
1
Score
6
ViewCount
1663
LastEditorDisplayName
text
Body
I'm writing a script (in python, with the R parts in pypeR) such that I need to use a function in R that compares two models with an F-ratio test. The models are like this: Model 1: <code>Response ~ Predictor A + Predictor B + Predictor C.... + Predictor n</code> Model 2: <code>Response ~ Predictor 1</code> Together predictors <code>A+B+...n</code> make up <code>Predictor 1</code>, so there's no problem with nesting here (trust me). When I pass <code>Predictor A + Predictor B + Predictor C.... + Predictor n</code> to the function I've created, I think it's treating them as one variable (because the degrees of freedom is the same as that of <code>Model 2</code>). Perhaps this is because I'm using <code>paste()</code>? Anyway, the actual number of predictors in model 1 will be changing across runs (which is why I need it as a function), so I'm not sure how else to accommodate this besides using <code>paste()</code>. Bear in mind that paste may not actually be the problem here; I just wanted to let people know that I thought the problem might be. Are there any suggestions for how I might obtain to true residual deviance and degrees of freedom for <code>model 1</code>? It can be a hack. For instance, I was simply subtracting <code>length(vector of predictors) - 1</code> to obtain the degrees of freedom. I have no idea what a similar hack for residual deviance would be. Here's the function and an example instantiation: <pre><code>make_and_compare_models <- function(fitness_trait_name, data_frame_name, vector_for_multiple_regression, predictor_for_single_regression, fam){ fit1<-glm(formula=as.formula(paste(fitness_trait_name,"~", paste(vector_for_multiple_regression, sep="+"))), family=fam, data=data_frame_name) #print ('length of vector of predictors') additional.degrees.of.freedom.fit1<-length(vector_for_multiple_regression)-1 ##the paste above prevents R from recognizing all of the vectors as separate predictors. This -1 gives you the difference in parameter number between the two models. print ("summary fit 1") print(summary(fit1)) dev1<-(fit1$deviance) print ('residual deviance of fit1') print (dev1) print(fit1$df.residual) ##this is how I'd correct for degrees of freedom #df1=fit1$df.residual-additional.degrees.of.freedom.fit1 #fit1$df.residual=df1 ##if the old way df1=fit1$df.residual print(fit1$df.residual) print ('df1') print (df1) fit2<- glm(data=data_frame_name, formula=as.formula(paste(fitness_trait_name,"~",predictor_for_single_regression)), family=fam) print("summary fit 2") print(summary(fit2)) print ("deviance of fit2") dev2<-(fit2$deviance) print(dev2) df2=fit2$df.residual print ('df2') print (df2) F.ratio<-((dev2-dev1)/(df2-df1))/(dev1/df1) print('F.ratio') print(F.ratio) new.p<-1-pf(F.ratio,abs(df1-df2),max(df2,df1)) print('new.p') print(new.p) } data <- structure(list(ID = c(1L, 2L, 4L, 7L, 9L, 10L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 20L, 21L, 22L, 23L, 24L, 25L, 27L, 28L, 29L, 31L, 34L, 37L, 38L, 39L, 40L, 41L, 43L, 44L, 45L, 46L, 47L, 48L, 49L, 52L, 55L, 56L, 59L, 60L, 61L, 62L, 63L, 65L, 66L, 67L, 68L, 69L, 71L), QnWeight_initial = c(158L, 165L, 137L, 150L, 153L, 137L, 158L, 163L, 159L, 151L, 145L, 144L, 157L, 144L, 133L, 148L, 151L, 151L, 147L, 158L, 178L, 164L, 134L, 151L, 148L, 142L, 127L, 179L, 162L, 150L, 151L, 153L, 163L, 155L, 163L, 170L, 149L, 165L, 128L, 134L, 145L, 147L, 148L, 160L, 131L, 155L, 169L, 143L, 123L, 151L), Survived_eclosion = c(0L, 1L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 1L, 1L, 1L, 1L, 0L, 1L, 1L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), Days_wrkr_eclosion_minus20 = c(NA, 1L, NA, 3L, 0L, 2L, 0L, 1L, 0L, 0L, 0L, 1L, NA, 0L, 7L, 1L, 0L, 1L, 0L, 1L, 2L, 2L, NA, 2L, 3L, 2L, 2L, NA, 0L, 1L, NA, NA, 0L, 0L, 0L, 0L, 3L, 3L, 3L, 1L, 0L, 2L, NA, 1L, 0L, 1L, 1L, 3L, 1L, 2L), MLH = c(0.5, 0.666666667, 0.555555556, 0.25, 1, 0.5, 0.333333333, 0.7, 0.5, 0.7, 0.5, 0.666666667, 0.375, 0.4, 0.5, 0.333333333, 0.4, 0.375, 0.3, 0.5, 0.3, 0.2, 0.4, 0.875, 0.6, 0.4, 0.222222222, 0.222222222, 0.6, 0.6, 0.3, 0.4, 0.714285714, 0.4, 0.3, 0.6, 0.4, 0.7, 0.625, 0.555555556, 0.25, 0.5, 0.5, 0.6, 0.25, 0.428571429, 0.3, 0.25, 0.375, 0.555555556), Acon5 = c(0.35387674, 0.35387674, 0.35387674, 0.35387674, 0.35387674, 0.35387674, 0.35387674, 0, 0, 1, 0, 1, 0.35387674, 0, 0, 0.35387674, 1, 1, 0, 0, 0, 1, 0, 0.35387674, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0.35387674), Baez = c(1, 1, 1, 0.467836257, 1, 1, 0, 0, 1, 1, 0, 0.467836257, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0.467836257, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1), C294 = c(0, 1, 0, 0, 1, 0.582542694, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0.582542694, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1), C316 = c(1, 1, 0, 0, 0.519685039, 0.519685039, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0.519685039, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0.519685039, 1, 0, 1, 1, 0, 0.519685039, 1, 0.519685039, 1, 1, 1, 0.519685039, 0.519685039, 0, 0.519685039, 0.519685039, 0), i_120_PigTail = c(1, 1, 0, 1, 0.631236443, 0.631236443, 1, 1, 1, 1, 1, 0, 0.631236443, 1, 1, 1, 0, 0.631236443, 1, 1, 1, 0, 0, 1, 1, 1, 0.631236443, 0, 1, 1, 0, 1, 0.631236443, 1, 0, 1, 0, 0, 1, 0.631236443, 0.631236443, 0, 1, 0, 0.631236443, 0.631236443, 1, 0.631236443, 0.631236443, 1), i129 = c(0L, 1L, 1L, 0L, 1L, 0L, 1L, 1L, 0L, 1L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), Jackstraw_PigTail = c(0L, 1L, 1L, 0L, 1L, 1L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 1L, 1L, 1L, 1L, 0L, 1L, 1L, 1L, 1L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 1L, 0L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), Neil_Young = c(0.529636711, 0, 1, 0, 0.529636711, 0.529636711, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0, 1, 1), Ramble = c(0, 0, 0, 0, 0.215163934, 0.215163934, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0.215163934, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 0.215163934, 0, 0, 0, 0), Sol_18 = c(1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 0.404669261, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1)), .Names = c("ID", "QnWeight_initial", "Survived_eclosion", "Days_wrkr_eclosion_minus20", "MLH", "Acon5", "Baez", "C294", "C316", "i_120_PigTail", "i129", "Jackstraw_PigTail", "Neil_Young", "Ramble", "Sol_18"), class = "data.frame", row.names = c(NA, -50L)) make_and_compare_models("QnWeight_initial", data, c("Acon5","Baez","C294","C316","i_120_PigTail","i129","Jackstraw_PigTail","Neil_Young","Ramble","Sol_18"), "MLH", "gaussian") </code></pre>
Tags
<r><regression><glm><anova>
Title
How obtain the true residual deviance and degrees of freedom in R of a glm model when a set of parameters gets pasted() as a vector
singulars
PostAcceptedAnswerId
1. PO
 singulars
 PostTypePostTypeId
 PTAnswer
PostParentId
1. This table or related slice is empty.
PostTypePostTypeId
1. PTQuestion
UserLastEditorUserId
1. USArun
UserOwnerUserId
1. USAtticus29
plurals
PostLinksPostIdRelatedPostId
1. This table or related slice is empty.
PostLinksRelatedPostIdPostId
1. This table or related slice is empty.
PostsAcceptedAnswerId
1. This table or related slice is empty.
PostsParentIdCreationDate
1. PO
 singulars
 PostTypePostTypeId
 PTAnswer
VotesPostIdCreationDate
1. VO
 singulars
 PostPostId
 POHow obtain the true residual deviance and degrees of freedom in R of a glm model when a set of parameters gets pasted() as a vector
 UserUserId
 USAtticus29
 VoteTypeVoteTypeId
 VTBountyStart
2. VO
 singulars
 PostPostId
 POHow obtain the true residual deviance and degrees of freedom in R of a glm model when a set of parameters gets pasted() as a vector
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
3. VO
 singulars
 PostPostId
 POHow obtain the true residual deviance and degrees of freedom in R of a glm model when a set of parameters gets pasted() as a vector
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
CommentsPostId
1. COWhat is wrong with the `anova` function?
 singulars
 PostPostId
 POHow obtain the true residual deviance and degrees of freedom in R of a glm model when a set of parameters gets pasted() as a vector
 UserUserId
 USmnel

Querying!

Guidance

A row detail

Detail views are divided into sections. All the information in the data section comes from columns in the selected row. The other sections display data from other, related rows.

Related data can be related in a to-one or a to-many fashion. Captions of data related in a to-many fashion link to a list view showing a filtered view of the table.

Try moving around until you find a non-empty to-many entry and click on the label to get to one. You can move back to the root by clicking on the database name in the header.