Diagnosing the problem

As we have seen, the main problem is that our algorithm is generating samples that are highly auto-correlated. As a result, we might have to run the algorithm for a very long time to get a decent effective sample size.

To fix this problem, we have to first figure out why it occurs. Notice that, in this particular example, \(\beta_0\) and \(\beta_1\) are highly correlated, as shown below:

#get data
setwd('U:\\uf\\courses\\bayesian course\\rmarkdown\\improving MCMC')
param=read.csv('poisreg results no centering.csv',as.is=T)

ngibbs=nrow(param)
nburn=500
b0=param$b0[nburn:ngibbs]
b1=param$b1[nburn:ngibbs]
plot(b0,b1)

How can we uncorrelate \(\beta_0\) and \(\beta_1\)?

We start by noticing that minor changes in \(\beta_1\) will generally result in large changes in \(\beta_0\) if:

  • our observations are far from zero in the x-axis; and
  • we would like our linear regression to go through the cloud of observations.

You can see a simulated example in the figure below:

rm(list=ls(all=TRUE)) 
set.seed(1)

#simulate data
n=100
x=runif(n,min=100,max=150)
y=rnorm(n,mean=x,sd=5)
plot(x,y,xlim=c(-10,max(x)),ylim=c(-10,max(y)))
abline(v=0,col='grey')
abline(h=0,col='grey')

#explore different scenarios
optim.func=function(b0){
  mean1=b0+b1*x
  -sum(dnorm(y,mean1,sd=5,log=T))
}

#example 1
b1=1
res=optimize(optim.func,interval=c(-100,100))
b0=res$minimum
seq1=seq(from=-10,to=max(x),length.out=1000)
media1=b0+b1*seq1
lines(seq1,media1)
text(0,150,paste('b0 =',round(b0,2),', b1 =',b1),pos=4)
points(0,b0,pch=19)

#example 2
b1=0.9
res=optimize(optim.func,interval=c(-100,100))
b0=res$minimum
media1=b0+b1*seq1
lines(seq1,media1,col='red')
text(0,140,paste('b0 =',round(b0,2),', b1 =',b1),pos=4,col='red')
points(0,b0,pch=19,col='red')

#example 3
b1=1.1
res=optimize(optim.func,interval=c(-100,100))
b0=res$minimum
media1=b0+b1*seq1
lines(seq1,media1,col='blue')
text(0,130,paste('b0 =',round(b0,2),', b1 =',b1),pos=4,col='blue')
points(0,b0,pch=19,col='blue')

It is precisely because changes in \(\beta_1\) force changes in \(\beta_0\) that there is a strong correlation between the intercept and the slope parameters. Perhaps if we could move this cloud of observations close to the y-axis, we would be able to break the association between \(\beta_0\) and \(\beta_1\). We can do this by centering the covariate (i.e., substracting the mean). I will illustrate this using the simulated example:

rm(list=ls(all=TRUE)) 
set.seed(1)

#simulate data
n=100
x=runif(n,min=100,max=150)
y=rnorm(n,mean=x,sd=5)
x=x-mean(x) #CENTERING OF X
plot(x,y,xlim=range(x),ylim=c(0,max(y)))
abline(v=0,col='grey')
abline(h=0,col='grey')

#explore different scenarios
optim.func=function(b0){
  mean1=b0+b1*x
  -sum(dnorm(y,mean1,sd=5,log=T))
}

#example 1
b1=1
res=optimize(optim.func,interval=c(-500,500))
b0=res$minimum
seq1=seq(from=min(x),to=max(x),length.out=1000)
media1=b0+b1*seq1
lines(seq1,media1)
text(min(x),150,paste('b0 =',round(b0,2),', b1 =',b1),pos=4)
points(0,b0,cex=2,pch=19)

#example 2
b1=0.9
res=optimize(optim.func,interval=c(-500,500))
b0=res$minimum
media1=b0+b1*seq1
lines(seq1,media1,col='red')
text(min(x),140,paste('b0 =',round(b0,2),', b1 =',b1),pos=4,col='red')
points(0,b0,col='red',pch=19,cex=1.5)

#example 3
b1=1.1
res=optimize(optim.func,interval=c(-500,500))
b0=res$minimum
media1=b0+b1*seq1
lines(seq1,media1,col='blue')
text(min(x),130,paste('b0 =',round(b0,2),', b1 =',b1),pos=4,col='blue')
points(0,b0,col='blue',cex=1,pch=19)



Back to main menu

Comments?

Send me an email at