Mixed effects models

# Tip

Mixed effects models are a way of doing repeated measures regression. You use random effects to separate out between-subject variance you're not interested in, so you can focus, e.g., on differences within subjects, between conditions. You can also easily remove (subject-invariant) variance in the degree to which items induce an effect, which is particularly useful if you have a load of different stimuli (sentences, faces) which superficially look different but which you hope (with respect to your experimental hypothesis) have the same underlying form.

More concretely, here's linear regression with one predictor:

(1)
\begin{align} y_i = b_0 + b_1 x_{i1} + \epsilon_i \end{align}

where the $i$'s index observations and $\epsilon_i \sim N(0,\sigma_\epsilon^2)$.
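Model (1) is just ordinary regression; a sketch in base R with made-up data (the true $b_0 = 2$ and $b_1 = 3$ are invented for illustration):

```r
set.seed(1)
x <- rnorm(50)
y <- 2 + 3 * x + rnorm(50)   # b0 = 2, b1 = 3, plus Gaussian noise
fit <- lm(y ~ x)
coef(fit)                    # estimates of b0 and b1
```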

Here's the extension with a random intercept, $c_j$, per subject $j$. The $i$'s still index observations.

(2)
\begin{align} y_{ij} = b_0 + b_1 x_{ij1} + c_j + \epsilon_{ij} \end{align}

Note how the residual term is at the level of individual observations whereas the random intercept is at the level of people (hence the term "multilevel model" that is often used). If you've got a heap of between-subject variation in intercept then that disappears into $c_j$, thus reducing $\epsilon_{ij}$ and possibly improving fit. The fixed effects are the $b$'s. They're fixed in the sense that they're group means.
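A minimal sketch of model (2) in R, using lmer from the lme4 package and the sleepstudy data that ships with it:

```r
library(lme4)

# Random intercept per Subject; Days is the single fixed predictor
m_int <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)
fixef(m_int)     # the b's: group-level intercept and slope
VarCorr(m_int)   # variance of the random intercepts c_j, and of the residuals
```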

With an additional random slope, $c_{1j}$, per subject $j$ (the random intercept is now written $c_{0j}$):

(3)
\begin{align} y_{ij} = b_0 + (b_1 + c_{1j}) x_{ij1} + c_{0j} + \epsilon_{ij} \end{align}

You'd use this if there is between-subject variation in effect size and you're mainly interested in whether there is an effect at all. Funny things can happen here if some folk have a positive slope and some a negative one.

One cool thing about mixed effects models is that you can extract estimates of these random effects, analogously to how you obtain residuals for standard linear regression.
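For example, with a model fitted via the lme4 package (using its sleepstudy data):

```r
library(lme4)
fm <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)

fixef(fm)    # group-level b's
ranef(fm)    # per-subject deviations (the c's), a.k.a. BLUPs
coef(fm)     # fixed plus random effects, one row per subject
resid(fm)    # and you still get residuals, as in lm
```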

# Which function to use

lmer in lme4.

Example from the helpfile (type ?lmer):

require(lme4)

(fm1 <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy))
(fm2 <- lmer(Reaction ~ Days + (1|Subject) + (0+Days|Subject), sleepstudy))
anova(fm1, fm2)


Model fm1 regresses Reaction on Days, with a random intercept and a random slope for Days per Subject (allowed to correlate). So you get the group mean slope for Days, the group mean intercept, plus subject deviations from both. Model fm2 is the same except the random intercept and slope are forced to be uncorrelated, so anova(fm1, fm2) tests the intercept-slope correlation.

Note: (1|A/B) expands to (1|A) + (1|A:B), i.e., a random intercept for A plus a random intercept for B nested within A.
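You can convince yourself of the expansion by fitting both forms to some made-up nested data (the grouping factors A and B and the simulated y here are invented for illustration):

```r
library(lme4)
set.seed(1)
d <- data.frame(A = gl(6, 20),                 # 6 outer groups
                B = gl(4, 5, length = 120))    # 4 inner groups within each A
d$y <- rnorm(120) + rep(rnorm(6), each = 20)   # some between-A variation

m1 <- lmer(y ~ 1 + (1 | A/B), data = d)
m2 <- lmer(y ~ 1 + (1 | A) + (1 | A:B), data = d)
logLik(m1) - logLik(m2)   # effectively zero: the same model, written two ways
```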

See the lme4 documentation for further hints on specifying random effects.

Models with multinomial outcome variables may be fitted using the drm package.

# Assessing fit

• Compare nested models using the log-likelihood ratio test (confusingly, via the anova function). This feels new at first, but recall that in linear regression the overall test you usually do compares the null model (with only an intercept) against your model of interest; SPSS just hides this from you. Sometimes you see $\Delta F$'s reported in SPSS-land; the associated p-values test the effects of individual predictors.
• Look at the slopes. Are they significantly different from zero?

# Things that will annoy you

## P-values

lmer doesn't give you p-values for the t-statistics in Gaussian GLMMs, as the statisticians aren't sure how many degrees of freedom to use and Doug Bates refuses to copy what SAS does. One option is the pvals.fnc function from the languageR package, which uses Markov chain Monte Carlo (MCMC) sampling (note it relies on lme4's mcmcsamp, which recent versions of lme4 have dropped).

Or you can use the log-likelihood ratio test: compare the models with and without the predictor of interest, both fitted by maximum likelihood (not REML).
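A sketch with lme4's sleepstudy data, testing whether Days matters (REML = FALSE gives ML fits, so the log-likelihoods are comparable across different fixed effects):

```r
library(lme4)

full <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy, REML = FALSE)
null <- lmer(Reaction ~ 1    + (Days | Subject), sleepstudy, REML = FALSE)
anova(null, full)   # chi-squared test on 2 * the difference in log-likelihood
```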

## Categorical predictors with more than 2 levels

By default they're given dummy (treatment) coding. Use relevel to change the base category, or supply your own coding via contrasts. See a good text on ANOVA for the properties the coding should ideally have.
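A sketch with a made-up three-level factor:

```r
g <- factor(rep(c("low", "mid", "high"), each = 4),
            levels = c("low", "mid", "high"))
contrasts(g)                   # default dummy (treatment) coding; base = "low"

g2 <- relevel(g, ref = "mid")  # make "mid" the base category instead
contrasts(g2)

contrasts(g) <- contr.sum(3)   # or supply your own coding, e.g. sum-to-zero
contrasts(g)
```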

# Whether to include a random effect

• The supposed structure of the data can be used to decide whether to include a random effect. (By analogy, one would use a repeated measures ANOVA for repeated measures designs even if the between-subject variation was minimal.)
• If you want to make a test, then use the log-likelihood ratio test (LRT), not MCMC-based methods, as none of the MCMC samples will have zero variance.
• A significant p from the LRT means you should definitely leave the random effect in.
• A non-significant p might mean you can leave it out.
• Though you can't really trust the LRT statistic to have a $\chi^2$ distribution here (the null value, zero variance, sits on the boundary of the parameter space); you'd be better off using a bootstrapping approach.

## Bootstrapping approach

An example from Faraway (2006):

require(faraway)
require(lme4)

# ML (not REML) fits, so the log-likelihoods are comparable
smod    <- lmer(bright ~ 1 + (1|operator), pulp, REML = FALSE)
nullmod <- lm(bright ~ 1, pulp)

# The observed likelihood ratio statistic
lr <- as.numeric(2 * (logLik(smod) - logLik(nullmod)))

lrstat <- numeric(1000)
for (i in 1:1000) {
  y <- unlist(simulate(nullmod))   # simulate data under the null model
  bnull <- lm(y ~ 1)
  balt  <- lmer(y ~ 1 + (1|operator), pulp, REML = FALSE)
  lrstat[i] <- as.numeric(2 * (logLik(balt) - logLik(bnull)))
}

# p-value for the test:

mean(lrstat > lr)


If you're using anything other than a Gaussian GLM (e.g., binomial) then you may be snookered: simulate won't work for you :(

# Other resources

page revision: 40, last edited: 30 Jun 2008 11:57