Fitting mixed models with lmer and BRMS

Fitting the radon mixed model within a frequentist framework using lmer

For completeness, here we show how mixed models can be fitted in a straightforward fashion with several R packages. For example, we can fit the radon data within a frequentist framework using the “lmer” package.

rm(list=ls(all=T))
library('lme4')
library('brms')

#import data
setwd('U:\\uf\\courses\\bayesian course\\group activities\\8 example radon')
radon=read.csv('radon data gelman_hill.csv',as.is=T)

#fit model using lmer
mod.lmer=lmer(log.radon~floor+log.u+(1|county),data=radon)
summary(mod.lmer)

## Linear mixed model fit by REML ['lmerMod']
## Formula: log.radon ~ floor + log.u + (1 | county)
##    Data: radon
## 
## REML criterion at convergence: 2134.2
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.9673 -0.6117  0.0274  0.6555  3.3848 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  county   (Intercept) 0.02446  0.1564  
##  Residual             0.57523  0.7584  
## Number of obs: 919, groups:  county, 85
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  1.46576    0.03794  38.633
## floor       -0.66824    0.06880  -9.713
## log.u        0.72027    0.09176   7.849
## 
## Correlation of Fixed Effects:
##       (Intr) floor 
## floor -0.357       
## log.u  0.145 -0.009

The syntax “(1|county)” indicates that we are assuming that the intercept varies from county to county and that it will be modeled as a random effect. On the other hand, the variables “floor” and “log.u” are modeled as fixed effects.

The results from lmer suggest that both “floor” and “log.u” variables are statistically discernible given that they have large t-values. Furthermore, in the “Random effects” section of the output, we see that the standard deviation of the county-level intercepts (equal to 0.16) is relatively small compared to the residual standard deviation (equal to 0.76). More formally, we can calculate the intraclass correlation, given by \(\frac{\sigma^2_{county}}{\sigma^2_{county}+\sigma^2_{residual}}\). Notice that this expression relies on variances, not on standard deviations. In this case, the intraclass correlation is equal to 0.04, suggesting that, differently from what we antecipated, there isn’t much of a grouping effect (i.e., log radon concentration in houses in the same county do not tend to be more similar than concentration in houses in distinct counties).

#intraclass correlation
0.02446/(0.02446+0.57523)

## [1] 0.04078774

Fitting the radon mixed model within a Bayesian framework using BRMS

We can also fit the same model within a Bayesian framework using the BRMS package. Notice that the syntax is very similar to the one we used for lmer.

#fit model using brms
mod.brms=brm(log.radon~floor+log.u+(1|county), data=radon,family=gaussian())

summary(mod.brms)

##  Family: gaussian 
##   Links: mu = identity 
## Formula: log.radon ~ floor + log.u + (1 | county) 
##    Data: radon (Number of observations: 919) 
##   Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
##          total post-warmup draws = 4000
## 
## Multilevel Hyperparameters:
## ~county (Number of levels: 85) 
##               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept)     0.16      0.05     0.06     0.26 1.00     1074     1356
## 
## Regression Coefficients:
##           Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept     1.46      0.04     1.39     1.54 1.00     4715     3192
## floor        -0.67      0.07    -0.80    -0.54 1.00     7618     2938
## log.u         0.72      0.09     0.53     0.89 1.00     4283     3058
## 
## Further Distributional Parameters:
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma     0.76      0.02     0.72     0.80 1.00     5627     2866
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).

Notice that the overall intercept and slope estimates from brms are very similar to those obtained from lmer. Also, the brms output confirms that both “floor” and “log.u” variables are statistically discernible from zero given that their 95% credible intervals do not overlap, and are far from, zero. Furthermore, we also see that the standard deviation of the intercept (denoted as “sd(Intercept)”, equal to 0.16) is small relative to the residual standard deviation (denoted as “sigma”, equal to 0.76). Finally, we can compare the random effect estimates from both packages to show that we get similar results.

#extract random effects
re.lmer=ranef(mod.lmer)$county
re.brms=as.data.frame(ranef(mod.brms)$county)
re.brms1=re.brms$Estimate.Intercept

#compare random effect estimates
rango=range(c(re.lmer$'(Intercept)',re.brms1))
plot(re.lmer$'(Intercept)',re.brms1,xlim=rango,ylim=rango,
     xlab='random effects lmer',ylab='random effects brms')
lines(rango,rango,col='red') #1:1 line

Comments?

Send me an email at