Generalized Linear Models (GLMs)

[FIXME: needs pointers to documentation + additional hints from here and there]

Assessing fit

Useful discussion of hypothesis testing over here on the R-list.

Proportion of deviance explained (logistic regression)

Takes a binomial glm fit object.

d2 <- function(model) { round(1-(model$deviance/model$null.deviance),4) }

($D^2$, analogue of $R^2$. I found it in Rossiter and Loza (2008).)

[FIXME: add the other pseudo-$R^2$ measures.]

Classification accuracy of binomial models

If you run a logistic regression in SPSS, one of the outputs is a "Classification Table" showing how accurately (binary) outcomes are predicted by the model. This is not particularly useful statistically, but for the record, here's how to do something similar in R. The code shows how to muck about with object internals (which you can look at using str(object)). It will also work for lmer logistic regressions.

Function definition:

binomial.accuracy <- function(model,criterion=0.5) {
  if (class(model)[1] == "glm") {
    if (model$family$family != "binomial") {
      stop ("argument must be a binomial glm")
    }
# here we have a binomial glm
    depvar <- as.character(names(model$model)[1])
    df <- as.data.frame(model$data)

  } else {
    if (class(model)[1] != "glmer") {
      stop ("model must be glm or lmer")
    }
    if (model@family$family != "binomial") {
      stop ("argument must be a binomial lmer")
    }
# but here we have a binomial lmer
    depvar <- as.character(formula(attr(model@frame, "terms")))[2]
    df <- as.data.frame(model@frame)
# in the lmer case, the fitted() function gives beta, so convert probability
# to beta
    criterion=log(criterion/(1-criterion))

  }

# everything else is the same, whatever type of model
# score '1' if value is greater than criterion value, zero otherwise
  predicted <- as.numeric(fitted(model) > criterion)

# things are in sequence, so it's OK to remove the NA's from the depvar
  observed <- (df[,depvar])[!(is.na(df[,depvar]))]

# the observed score may be a factor, so convert to zeroes and ones
  observed <- as.numeric(as.factor(observed))-1

# then score 1 for each time predictions match
  matches <- as.numeric(predicted == observed)

# and report matching rate as percentage  
  round((mean(matches))*100,2)
}

Example usage:

# data frame with likely relation between predictor and (binary) outcome
testme <- data.frame(subject = gl(10,5,100), predictor=c(rnorm(50,25,15),rnorm(50,35,15)),outcome=c(rbinom(50,1,.2),rbinom(50,1,.8)))

# for glm
glm.bin <- glm(outcome~predictor,data=testme,family="binomial")
binomial.accuracy(glm.bin)

# for lmer
lmer.bin <- lmer(outcome~predictor+(1|subject),data=testme,family="binomial")
binomial.accuracy(lmer.bin)

Useful resources

Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License