Note that there are some explanatory texts on larger screens.

plurals
  1. PO
    primarykey
    data
    text
    <p>Yes, you can sum the log-likelihoods for the two groups (if they were calculated separately). Just like you would sum the log-likelihoods for a vector of observations, where each observation has different generative parameters.</p> <p>I prefer to think in terms of one large vector (ie, of the shape parameter) which contains values that vary according to the structure of the covariates (ie, group membership). In a linear model context, this vector could equal the linear predictor (once appropriately transformed by link function): the dot product of the design matrix and the vector of regression coefficients.</p> <p>Here is a (non-functionalized) example:</p> <pre><code>## setup true values nobs = 50 ## number of observations a1 = 1 ## shape for first group b1 = 2 ## scale parameter for both groups beta = c(a1, a1 * 1.5) ## vector of linear coefficients (group shapes) ## model matrix for full, null models mm_full = cbind(grp1 = rep(c(1,0), each = nobs), grp2 = rep(c(0,1), each = nobs)) mm_null = cbind(grp1 = rep(1, nobs*2)) ## shape parameter vector for the full, null models shapes_full = mm_full %*% beta ## different shape parameters by group (full model) shapes_null = mm_null %*% beta[1] ## same shape parameter for all obs scales = rep(b1, length(shapes_full)) ## scale parameters the same for both groups ## simulate response from full model response = rweibull(length(shapes_full), shapes_full, scales) ## the log likelihood for the full, null models: LL_full = sum(dweibull(response, shapes_full, scales, log = T)) LL_null = sum(dweibull(response, shapes_null, scales, log = T)) ## likelihood ratio test LR_test = function(LL_null, LL_full, df) { LR = -2 * (LL_null - LL_full) ## test statistic pchisq(LR, df = df, ncp = 0, lower = F) ## probability of test statistic under central chi-sq distribution } LR_test(LL_null, LL_full, 1) ## 1 degrees freedom (1 parameter added) </code></pre> <p>To write a log-likelihood function to find the MLE of a Weibull model where the shape parameter(s) are some linear function of covariates, you could use the same approach:</p> <pre><code>## (negative) log-likelihood function LL_weibull = function(par, data, mm, inv_link_fun = function(.) .){ P = ncol(mm) ## number of regression coefficients N = nrow(mm) ## number of observations shapes = inv_link_fun(mm %*% par[1:P]) ## shape vector (possibly transformed) scales = rep(par[P+1], N) ## scale vector -sum(dweibull(data, shape = shapes, scale = scales, log = T)) ## negative log likelihood } </code></pre> <p>Then your power simulation might look like this:</p> <pre><code>## function to simulate data, perform LRT weibull_sim = function(true_shapes, true_scales, mm_full, mm_null){ ## simulate response response = rweibull(length(true_shapes), true_shapes, true_scales) ## find MLE mle_full = optim(par = rep(1, ncol(mm_full)+1), fn = LL_weibull, data = response, mm = mm_full) mle_null = optim(par = rep(1, ncol(mm_null)+1), fn = LL_weibull, data = response, mm = mm_null) ## likelihood ratio test df = ncol(mm_full) - ncol(mm_null) return(LR_test(-mle_null$value, -mle_full$value, df)) } ## run simulations nsim = 1000 pvals = sapply(1:nsim, function(.) weibull_sim(shapes_full, scales, mm_full, mm_null) ) ## calculate power alpha = 0.05 power = sum(pvals &lt; alpha) / nsim </code></pre> <p>An identity link works fine in the above example, but for more complex models some sort of transformation might be required. </p> <p>And you don't have to use linear algebra in the log-likelihood function -- obviously, you can construct the vector of shapes in any way you see fit (as long as you explicitly index the appropriate generative parameters in the vector <code>par</code>).</p> <p><strong>Interval-censored data</strong></p> <p>The cumulative distribution function <em>F(T)</em> of the Weibull distribution (<code>pweibull</code> in R) gives the probability of failure before time <em>T</em>. So, if an observation is interval censored between times <em>T[0]</em> and <em>T[1]</em>, the probability that the object fails between <em>T[0]</em> and <em>T[1]</em> is <em>F(T[1]) - F(T[0])</em> : the probability that the object failed before <em>T[1]</em> minus the probability that it failed before <em>T[0]</em> (the integral of the PDF between <em>T[0]</em> and <em>T[1]</em>). Because the Weibull CDF is already implemented in R it is trivial to modify the likelihood function above:</p> <pre><code>LL_ic_weibull &lt;- function(par, data, mm){ ## 'data' has two columns, left and right times of censoring interval P = ncol(mm) ## number of regression coefficients shapes = mm %*% par[1:P] scales = par[P+1] -sum(log(pweibull(data[,2], shape = shapes, scale = scales) - pweibull(data[,1], shape = shapes, scale = scales))) } </code></pre> <p>Or if you don't want to use a model matrix, etc., and just restrict yourself to indexing the shape parameter vector by groups, you could do something like:</p> <pre><code>LL_ic_weibull2 &lt;- function(par, data, nobs){ ## 'data' has two columns, left and right times of censoring interval ## 'nobs' is a vector that contains the num. observations for each group (grp1, grp2, ...) P = length(nobs) ## number of regression coefficients shapes = rep(par[1:P], nobs) scales = par[P+1] -sum(log(pweibull(data[,2], shape = shapes, scale = scales) - pweibull(data[,1], shape = shapes, scale = scales))) } </code></pre> <p>Test that both functions give the same solution:</p> <pre><code>## generate intervals from simulated response (above) left = ifelse(response - 0.2 &lt; 0, 0, response - 0.2) right = response + 0.2 response_ic = cbind(left, right) ## find MLE w/ first LL function (model matrix) mle_ic_full = optim(par = c(1,1,3), fn = LL_ic_weibull, data = response_ic, mm = mm_full) mle_ic_null = optim(par = c(1,3), fn = LL_ic_weibull, data = response_ic, mm = mm_null) ## find MLE w/ second LL function (groups only) nobs_per_group = apply(mm_full, 2, sum) ## just contains number of observations per group nobs_one_group = nrow(mm_null) ## one group so only one value mle_ic_full2 = optim(par = c(1,1,3), fn = LL_ic_weibull2, data = response_ic, nobs = nobs_per_group) mle_ic_null2 = optim(par = c(1,3), fn = LL_ic_weibull2, data = response_ic, nobs = nobs_one_group) </code></pre>
    singulars
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. VO
      singulars
      1. This table or related slice is empty.
    2. VO
      singulars
      1. This table or related slice is empty.
    3. VO
      singulars
      1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload