WAIC

The Watanabe-Akaike information criterion (WAIC), also known as the widely applicable information criterion, does not tell us much in isolation but can be used to compare different models, in much the same way that we use AIC/BIC for frequentist models. Smaller values of WAIC indicate better models.

WAIC can be readily calculated from the same output that we generate when calculating our Bayesian p-values. One nice feature of the R package “loo” is that it also provides uncertainty estimates for WAIC. This is useful because, even if WAIC is smaller for one model than for another, the difference may not be meaningful once that uncertainty is taken into account.
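To make the calculation concrete, here is a minimal sketch of what “loo” computes under the hood, using a simulated S-by-n log-likelihood matrix (rows = posterior draws, columns = observations). In our case this matrix would come from the monitored log-densities (e.g., Tobs); the matrix below is simulated purely for illustration.

```r
# Simulated stand-in for a log-likelihood matrix (illustration only)
set.seed(1)
S <- 300  # number of posterior draws
n <- 50   # number of observations
loglik <- matrix(rnorm(S * n, mean = -2, sd = 0.1), nrow = S)

# WAIC "by hand"
lppd <- sum(log(colMeans(exp(loglik))))  # log pointwise predictive density
p_waic <- sum(apply(loglik, 2, var))     # effective number of parameters
waic_val <- -2 * (lppd - p_waic)         # smaller is better
```

This mirrors the estimate reported by loo::waic(); the package additionally supplies standard errors for each quantity.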

WAIC works well for comparing Bayesian models, but there are other approaches. You can read more about them in (Hooten and Hobbs 2015).

Testing WAIC

To demonstrate the utility of WAIC, I will also fit a Negative Binomial regression model, in addition to the Poisson regression model that we fitted to demonstrate the Bayesian p-values.

Recall that the Negative Binomial (NB) regression model approximates a Poisson regression model when the dispersion parameter “r” is very large. As a result, I expect WAIC to be similar for both models when applied to the data simulated following a Poisson regression model. However, I expect WAIC to be smaller for the NB model than for the Poisson model when these models are fitted to the data simulated following an NB regression model.
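This convergence is easy to check numerically. A minimal sketch (values chosen only for illustration) comparing the NB probability mass function against the Poisson one as the size (“r”) parameter grows:

```r
# NB pmf approaches the Poisson pmf as the size ("r") parameter grows
mu <- 5
x <- 0:20

pois <- dpois(x, lambda = mu)
nb_overdispersed <- dnbinom(x, size = 2, mu = mu)  # small r: heavy overdispersion
nb_large_r <- dnbinom(x, size = 1e6, mu = mu)      # very large r

max(abs(nb_large_r - pois))        # essentially zero
max(abs(nb_overdispersed - pois))  # clearly non-zero
```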

JAGS code for NB regression model

Here is my code to fit an NB regression model.

model{
  #likelihood
  for (i in 1:nobs){
    S[i]~dnegbin(p[i],r)
    mu[i]=exp(b0+b1*logA[i])
    p[i]=r/(r+mu[i])
    
    #calculate deviance for real data
    Tobs[i]=logdensity.negbin(S[i],p[i],r)
    
    #simulate data
    Ssim[i]~dnegbin(p[i],r)
    
    #calculate deviance for simulated data
    Tsim[i]=logdensity.negbin(Ssim[i],p[i],r)
  }
  
  #priors
  r~dunif(0,1000)
  b0~dnorm(0,0.1)
  b1~dnorm(0,0.1)
}

Fitting my model

Here is how I fit my NB regression model.

rm(list=ls(all=TRUE)) 
library(jagsUI)
set.seed(1)

#get data
setwd('U:\\uf\\courses\\bayesian course\\rmarkdown\\waic and Bayes pval')
dat=read.csv('sim data Poisson.csv')
#dat=read.csv('sim data NB.csv')
nobs=nrow(dat)
#plot(S~logA,data=dat)

# MCMC settings 
ni <- 1500 #number of iterations
nt <- 10    #interval to thin 
nb <- 500  #number of iterations to discard as burn-in
nc <- 3     #number of chains

#set parameters to monitor
params=c("b0","b1",'Tsim','Tobs','r')

#create function that sets the initial values
inits=function(){
  list(b0=rnorm(1,-1,1), 
       b1=rnorm(1,-1,1),
       r=runif(1,0,100))
}

#prepare data for JAGS
dat1=list(S=dat$S,logA=dat$logA,
          nobs=nobs)

#run NB model
setwd('U:\\uf\\courses\\bayesian course\\rmarkdown\\waic and Bayes pval')
mod=jags(model.file="NB regression.R", inits=inits,
         parameters.to.save=params,
         data=dat1,n.chains=nc,
         n.burnin=nb,n.iter=ni,n.thin=nt,DIC=TRUE)

Calculating WAIC

Notice that the output from the “loo” package provides standard errors (SE) as a measure of uncertainty.

#calculate waic with package
library(loo)
res=waic(mod$sims.list$Tobs)
res
## 
## Computed from 300 by 1000 log-likelihood matrix.
## 
##           Estimate   SE
## elpd_waic  -1891.2 21.6
## p_waic         2.3  0.1
## waic        3782.3 43.2

Now we can compare the WAIC for different models fitted to the different datasets. Here is what we find:

  • Poisson data, Poisson model: 3782 (SE=44)
  • Poisson data, NB model: 3788 (SE=43)
  • NB data, Poisson model: 5844 (SE=160)
  • NB data, NB model: 4471 (SE=64)
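When comparing two models directly, the “loo” package can also report the standard error of the difference in expected predictive accuracy via loo_compare(), which is a more reliable gauge than eyeballing the two SEs. A hedged sketch, assuming the two fitted model objects were saved under the hypothetical names mod_pois and mod_nb:

```r
# Hypothetical object names (mod_pois, mod_nb) for the two fitted models
library(loo)
# res_pois <- waic(mod_pois$sims.list$Tobs)
# res_nb   <- waic(mod_nb$sims.list$Tobs)
# loo_compare(res_pois, res_nb)  # reports elpd_diff and its SE
```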

As expected, there isn’t much difference between the Poisson and NB models when applied to the data simulated following a Poisson regression. However, the NB model is much better than the Poisson model when both are fitted to the data simulated following an NB regression.



Comments?

Send me an email at

References

Hooten, M. B., and N. T. Hobbs. 2015. “A Guide to Bayesian Model Selection for Ecologists.” Ecological Monographs 85: 3–28.