Fitting the radon mixed model within a frequentist framework using lmer
For completeness, here we show how mixed models can be fitted in a straightforward fashion with several R packages. For example, we can fit the radon data within a frequentist framework using the “lmer” package.
rm(list=ls(all=T))
library('lme4')
library('brms')
#import data
setwd('U:\\uf\\courses\\bayesian course\\group activities\\8 example radon')
radon=read.csv('radon data gelman_hill.csv',as.is=T)
#fit model using lmer
mod.lmer=lmer(log.radon~floor+log.u+(1|county),data=radon)
summary(mod.lmer)## Linear mixed model fit by REML ['lmerMod']
## Formula: log.radon ~ floor + log.u + (1 | county)
## Data: radon
##
## REML criterion at convergence: 2134.2
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -4.9673 -0.6117 0.0274 0.6555 3.3848
##
## Random effects:
## Groups Name Variance Std.Dev.
## county (Intercept) 0.02446 0.1564
## Residual 0.57523 0.7584
## Number of obs: 919, groups: county, 85
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 1.46576 0.03794 38.633
## floor -0.66824 0.06880 -9.713
## log.u 0.72027 0.09176 7.849
##
## Correlation of Fixed Effects:
## (Intr) floor
## floor -0.357
## log.u 0.145 -0.009
The syntax “(1|county)” indicates that we are assuming that the intercept varies from county to county and that it will be modeled as a random effect. On the other hand, the variables “floor” and “log.u” are modeled as fixed effects.
The results from lmer suggest that both “floor” and “log.u” variables are statistically discernible given that they have large t-values. Furthermore, in the “Random effects” section of the output, we see that the standard deviation of the county-level intercepts (equal to 0.16) is relatively small compared to the residual standard deviation (equal to 0.76). More formally, we can calculate the intraclass correlation, given by \(\frac{\sigma^2_{county}}{\sigma^2_{county}+\sigma^2_{residual}}\). Notice that this expression relies on variances, not on standard deviations. In this case, the intraclass correlation is equal to 0.04, suggesting that, differently from what we antecipated, there isn’t much of a grouping effect (i.e., log radon concentration in houses in the same county do not tend to be more similar than concentration in houses in distinct counties).
## [1] 0.04078774
Fitting the radon mixed model within a Bayesian framework using BRMS
We can also fit the same model within a Bayesian framework using the BRMS package. Notice that the syntax is very similar to the one we used for lmer.
## Family: gaussian
## Links: mu = identity
## Formula: log.radon ~ floor + log.u + (1 | county)
## Data: radon (Number of observations: 919)
## Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
## total post-warmup draws = 4000
##
## Multilevel Hyperparameters:
## ~county (Number of levels: 85)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.16 0.05 0.06 0.26 1.00 1074 1356
##
## Regression Coefficients:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept 1.46 0.04 1.39 1.54 1.00 4715 3192
## floor -0.67 0.07 -0.80 -0.54 1.00 7618 2938
## log.u 0.72 0.09 0.53 0.89 1.00 4283 3058
##
## Further Distributional Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 0.76 0.02 0.72 0.80 1.00 5627 2866
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
Notice that the overall intercept and slope estimates from brms are very similar to those obtained from lmer. Also, the brms output confirms that both “floor” and “log.u” variables are statistically discernible from zero given that their 95% credible intervals do not overlap, and are far from, zero. Furthermore, we also see that the standard deviation of the intercept (denoted as “sd(Intercept)”, equal to 0.16) is small relative to the residual standard deviation (denoted as “sigma”, equal to 0.76). Finally, we can compare the random effect estimates from both packages to show that we get similar results.
#extract random effects
re.lmer=ranef(mod.lmer)$county
re.brms=as.data.frame(ranef(mod.brms)$county)
re.brms1=re.brms$Estimate.Intercept
#compare random effect estimates
rango=range(c(re.lmer$'(Intercept)',re.brms1))
plot(re.lmer$'(Intercept)',re.brms1,xlim=rango,ylim=rango,
xlab='random effects lmer',ylab='random effects brms')
lines(rango,rango,col='red') #1:1 line
Comments?
Send me an email at