Model
Recall that radon concentration was measured in multiple houses, either in the basement (\(x_{ik1}=0\)) or at the first floor level (\(x_{ik1}=1\)). Furthermore, we also have data on the uranium in the soil for each county (\(x_{k2}\)). Uranium is important because it is the first element in a long series of decay that eventually produces radon. In this model, the log-radon concentration for the i-th house in the k-th county is denoted by \(y_{ik}\). Here is how this model looks like:
\[y_{ik} \sim N(\beta_{0k}+\beta_1 x_{ik1} + \beta_2 x_{k2},\sigma^2) \]
Here are some of the priors for this model:
\[\frac{1}{\sigma^2} \sim Gamma(a,b)\] \[\beta_1 \sim N(0,10)\] \[\beta_2 \sim N(0,10)\]
Data
The radon data are available in the file “radon data gelman_hill.csv”. Here is how these data look like:
rm(list=ls(all=T))
#import data
setwd('U:\\uf\\courses\\bayesian course\\group activities\\8 example radon')
radon=read.csv('radon data gelman_hill.csv',as.is=T)
head(radon)
## log.radon floor log.u county
## 1 0.7884574 1 -0.6890476 1
## 2 0.7884574 0 -0.6890476 1
## 3 1.0647107 0 -0.6890476 1
## 4 0.0000000 0 -0.6890476 1
## 5 1.1314021 0 -0.8473129 2
## 6 0.9162907 0 -0.8473129 2
Model comparison
To understand the difference between a fixed versus a random effect formulation, we will create two distinct JAGS models:
- Fixed effects model: to make \(\beta_{0k}\) into a fixed effect, I will avoid estimating \(\mu\) and \(\tau^2\) by setting these to 0 and 10, respectively. In other words, we are using our standard vague prior by assuming that:
\[\beta_{0k} \sim N(0,10)\]
- Mixed-effects model: to make \(\beta_{0k}\) a random effect, I will also have to estimate \(\mu\) and \(\tau^2\). As a result, I will need to specify priors for these prior-parameters. More specifically, I will assume that:
\[\beta_{0k} \sim N(\mu,\tau^2)\] \[\mu \sim N(0,10)\] \[\frac{1}{\tau^2} \sim Gamma(a,b)\]
Analysis steps
Create model 1 in JAGS and estimate all the parameters.
Create model 2 in JAGS and estimate all the parameters.
Compare your estimates of \(\beta_{0k}\) from model 1 and from model 2. How similar or distinct are these estimates? How is the similarity between these parameter estimates influenced by the number of observations for the corresponding county?
How would you predict mean radon concentration for houses within a county present in our dataset?
How would you predict mean radon concentration for houses in a new county (i.e., a county not present in our dataset) with log(uranium in the soil) of 0.3?
How would you change your JAGS code to model the slope parameter for “floor” as a random-effect? You just have to provide the JAGS code (i.e., don’t worry about running this model).
How would you change your JAGS code to represent the two regressions structure, one at the house level and the other at the county level? This model is given below:
\[y_{ik} \sim N(\beta_{0k}+\beta_1 x_{ik1},\sigma^2) \] \[\beta_{0k} \sim N(\gamma +\beta_2 x_{k2},\tau^2)\] You just have to provide the JAGS code (i.e., don’t worry about running this model).
Obs.: to create these models in JAGS, we will need to rely on the following piece of code “b0k[county[i]]”
Comments?
Send me an email at