Diagnosing the problem
As we have seen, the main problem is that our algorithm is generating samples that are highly auto-correlated. As a result, we might have to run the algorithm for a very long time to get a decent effective sample size.
To fix this problem, we have to first figure out why it occurs. Notice that, in this particular example, \(\beta_0\) and \(\beta_1\) are highly correlated, as shown below:
#get data
setwd('U:\\uf\\courses\\bayesian course\\rmarkdown\\improving MCMC')
=read.csv('poisreg results no centering.csv',as.is=T)
param
=nrow(param)
ngibbs=500
nburn=param$b0[nburn:ngibbs]
b0=param$b1[nburn:ngibbs]
b1plot(b0,b1)
How can we uncorrelate \(\beta_0\) and \(\beta_1\)?
We start by noticing that minor changes in \(\beta_1\) will generally result in large changes in \(\beta_0\) if:
- our observations are far from zero in the x-axis; and
- we would like our linear regression to go through the cloud of observations.
You can see a simulated example in the figure below:
rm(list=ls(all=TRUE))
set.seed(1)
#simulate data
=100
n=runif(n,min=100,max=150)
x=rnorm(n,mean=x,sd=5)
yplot(x,y,xlim=c(-10,max(x)),ylim=c(-10,max(y)))
abline(v=0,col='grey')
abline(h=0,col='grey')
#explore different scenarios
=function(b0){
optim.func=b0+b1*x
mean1-sum(dnorm(y,mean1,sd=5,log=T))
}
#example 1
=1
b1=optimize(optim.func,interval=c(-100,100))
res=res$minimum
b0=seq(from=-10,to=max(x),length.out=1000)
seq1=b0+b1*seq1
media1lines(seq1,media1)
text(0,150,paste('b0 =',round(b0,2),', b1 =',b1),pos=4)
points(0,b0,pch=19)
#example 2
=0.9
b1=optimize(optim.func,interval=c(-100,100))
res=res$minimum
b0=b0+b1*seq1
media1lines(seq1,media1,col='red')
text(0,140,paste('b0 =',round(b0,2),', b1 =',b1),pos=4,col='red')
points(0,b0,pch=19,col='red')
#example 3
=1.1
b1=optimize(optim.func,interval=c(-100,100))
res=res$minimum
b0=b0+b1*seq1
media1lines(seq1,media1,col='blue')
text(0,130,paste('b0 =',round(b0,2),', b1 =',b1),pos=4,col='blue')
points(0,b0,pch=19,col='blue')
It is precisely because changes in \(\beta_1\) force changes in \(\beta_0\) that there is a strong correlation between the intercept and the slope parameters. Perhaps if we could move this cloud of observations close to the y-axis, we would be able to break the association between \(\beta_0\) and \(\beta_1\). We can do this by centering the covariate (i.e., substracting the mean). I will illustrate this using the simulated example:
rm(list=ls(all=TRUE))
set.seed(1)
#simulate data
=100
n=runif(n,min=100,max=150)
x=rnorm(n,mean=x,sd=5)
y=x-mean(x) #CENTERING OF X
xplot(x,y,xlim=range(x),ylim=c(0,max(y)))
abline(v=0,col='grey')
abline(h=0,col='grey')
#explore different scenarios
=function(b0){
optim.func=b0+b1*x
mean1-sum(dnorm(y,mean1,sd=5,log=T))
}
#example 1
=1
b1=optimize(optim.func,interval=c(-500,500))
res=res$minimum
b0=seq(from=min(x),to=max(x),length.out=1000)
seq1=b0+b1*seq1
media1lines(seq1,media1)
text(min(x),150,paste('b0 =',round(b0,2),', b1 =',b1),pos=4)
points(0,b0,cex=2,pch=19)
#example 2
=0.9
b1=optimize(optim.func,interval=c(-500,500))
res=res$minimum
b0=b0+b1*seq1
media1lines(seq1,media1,col='red')
text(min(x),140,paste('b0 =',round(b0,2),', b1 =',b1),pos=4,col='red')
points(0,b0,col='red',pch=19,cex=1.5)
#example 3
=1.1
b1=optimize(optim.func,interval=c(-500,500))
res=res$minimum
b0=b0+b1*seq1
media1lines(seq1,media1,col='blue')
text(min(x),130,paste('b0 =',round(b0,2),', b1 =',b1),pos=4,col='blue')
points(0,b0,col='blue',cex=1,pch=19)
Comments?
Send me an email at