Model-based (Semi-)Parametric Bootstrap for Mixed Models

Perform model-based (Semi-)parametric bootstrap for mixed models.

Usage

bootMer(x, FUN, nsim = 1, seed = NULL, use.u = FALSE, re.form=NA,
  type = c("parametric", "semiparametric"),
  verbose = FALSE, .progress = "none", PBargs = list(),
  parallel = c("no", "multicore", "snow"),
  ncpus = getOption("boot.ncpus", 1L), cl = NULL)

Arguments

x: a fitted merMod object: see lmer, glmer, etc.
FUN: a function taking a fitted merMod object as input and returning the statistic of interest, which must be a (possibly named) numeric vector.
nsim: number of simulations, positive integer; the bootstrap \(B\) (or \(R\)).
seed: optional argument to set.seed.
use.u: logical, indicating whether the spherical random effects should be simulated / bootstrapped as well. If TRUE, they are not changed, and all inference is conditional on these values. If FALSE, new normal deviates are drawn (see Details).
re.form: formula, NA (equivalent to use.u=FALSE), or NULL (equivalent to use.u=TRUE): alternative to use.u for specifying which random effects to incorporate. See simulate.merMod for details.
type: character string specifying the type of bootstrap, "parametric" or "semiparametric"; partial matching is allowed.
verbose: logical indicating if progress should print output
.progress: character string - type of progress bar to display. Default is "none"; the function will look for a relevant *ProgressBar function, so "txt" will work in general; "tk" is available if the tcltk package is loaded; or "win" on Windows systems. Progress bars are disabled (with a message) for parallel operation.
PBargs: a list of additional arguments to the progress bar function (the package authors like list(style=3)).
parallel: The type of parallel operation to be used (if any). If missing, the default is taken from the option "boot.parallel" (and if that is not set, "no").
ncpus: integer: number of processes to be used in parallel operation: typically one would choose this to be the number of available CPUs.
cl: An optional parallel or snow cluster for use if parallel = "snow". If not supplied, a cluster on the local machine is created for the duration of the boot call.

Value

an object of S3 class "boot", compatible with boot package's boot() result. (See Details for information on how to retrieve information about errors during bootstrapping.)

Note

If you are using parallel="snow", you will need to run clusterEvalQ(cl,library("lme4")) before calling bootMer to make sure that the lme4 package is loaded on all of the workers; you may additionally need to use clusterExport if you are using a summary function that calls any objects from the environment.

Details

The semi-parametric variant is only partially implemented, and we only provide a method for lmer and glmer results.

Information about warning and error messages incurred during the bootstrap returns can be retrieved via the attributes

bootFail: number of failures (errors)
boot.fail.msgs: error messages
boot.all.msgs: messages, warnings, and error messages

e.g. attr("boot.fail.msgs") to retrieve error messages

The working name for bootMer() was “simulestimate()”, as it is an extension of simulate (see simulate.merMod), but we want to emphasize its potential for valid inference.

If use.u is FALSE and type is "parametric", each simulation generates new values of both the “spherical” random effects \(u\) and the i.i.d. errors \(\epsilon\), using rnorm() with parameters corresponding to the fitted model x.
If use.u is TRUE and type=="parametric", only the i.i.d. errors (or, for GLMMs, response values drawn from the appropriate distributions) are resampled, with the values of \(u\) staying fixed at their estimated values.
If use.u is TRUE and type=="semiparametric", the i.i.d. errors are sampled from the distribution of (response) residuals. (For GLMMs, the resulting sample will no longer have the same properties as the original sample, and the method may not make sense; a warning is generated.) The semiparametric bootstrap is currently an experimental feature, and therefore may not be stable.
The case where use.u is FALSE and type=="semiparametric" is not implemented; Morris (2002) suggests that resampling from the estimated values of \(u\) is not good practice.

References

Davison, A.C. and Hinkley, D.V. (1997) Bootstrap Methods and Their Application. Cambridge University Press.

Morris, J. S. (2002). The BLUPs Are Not ‘best’ When It Comes to Bootstrapping. Statistics & Probability Letters 56(4): 425–430. doi:10.1016/S0167-7152(02)00041-X.

Examples

if (interactive()) {
fm01ML <- lmer(Yield ~ 1|Batch, Dyestuff, REML = FALSE)
## see ?"profile-methods"
mySumm <- function(.) { s <- sigma(.)
    c(beta =getME(., "beta"), sigma = s, sig01 = unname(s * getME(., "theta"))) }
(t0 <- mySumm(fm01ML)) # just three parameters
## alternatively:
mySumm2 <- function(.) {
    c(beta=fixef(.),sigma=sigma(.), sig01=sqrt(unlist(VarCorr(.))))
}

set.seed(101)
## 3.8s (on a 5600 MIPS 64bit fast(year 2009) desktop "AMD Phenom(tm) II X4 925"):
system.time( boo01 <- bootMer(fm01ML, mySumm, nsim = 100) )

## to "look" at it
if (requireNamespace("boot")) {
    boo01
    ## note large estimated bias for sig01
    ## (~30% low, decreases _slightly_ for nsim = 1000)

    ## extract the bootstrapped values as a data frame ...
    head(as.data.frame(boo01))

    ## ------ Bootstrap-based confidence intervals ------------

    ## warnings about "Some ... intervals may be unstable" go away
    ##   for larger bootstrap samples, e.g. nsim=500

    ## intercept
    (bCI.1 <- boot::boot.ci(boo01, index=1, type=c("norm", "basic", "perc")))# beta

    ## Residual standard deviation - original scale:
    (bCI.2  <- boot::boot.ci(boo01, index=2, type=c("norm", "basic", "perc")))
    ## Residual SD - transform to log scale:
    (bCI.2L <- boot::boot.ci(boo01, index=2, type=c("norm", "basic", "perc"),
                       h = log, hdot = function(.) 1/., hinv = exp))

    ## Among-batch variance:
    (bCI.3 <- boot::boot.ci(boo01, index=3, type=c("norm", "basic", "perc"))) # sig01

    
    confint(boo01)
    confint(boo01,type="norm")
    confint(boo01,type="basic")

    ## Graphical examination:
    plot(boo01,index=3)

    ## Check stored values from a longer (1000-replicate) run:
    (load(system.file("testdata","boo01L.RData", package="lme4")))# "boo01L"
    plot(boo01L, index=3)
    mean(boo01L$t[,"sig01"]==0) ## note point mass at zero!
} 
}