Inner product

The inner product multiplies two vectors of the same length element by element and sums the results, returning a single number.

For example, say that we perform the inner product between the vectors \(x=[a,b,c]\) and \(\beta=[1,2,3]\). We denote this inner product as \(x^T \beta\) (“T” stands for transpose) and the result of this calculation is \(a\times 1+b \times 2+ c \times 3\).
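We can verify this in R (the numeric values standing in for \(a\), \(b\), and \(c\) below are made up for illustration):

```r
# inner product of x = [a, b, c] and beta = [1, 2, 3]
x <- c(2, 5, -1)     # plays the role of [a, b, c]
beta <- c(1, 2, 3)
sum(x * beta)        # a*1 + b*2 + c*3 = 2 + 10 - 3 = 9
t(x) %*% beta        # same result via matrix multiplication (as a 1x1 matrix)
```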

What does the inner product have to do with statistics?

In statistics, we often rely on regression models where the mean is some transformation of \(\beta_0+\beta_1 x_{i1}+\beta_2 x_{i2}+\beta_3 x_{i3}+\cdots\).

We can write this expression as the inner product of two vectors: \(\boldsymbol{x}_i^T=[1,x_{i1},x_{i2},x_{i3},\dots]\) and \(\boldsymbol{\beta}=[\beta_0,\beta_1,\beta_2,\beta_3,\dots]\). In other words:

\[\boldsymbol{x}_i^T \boldsymbol{\beta}=\beta_0+\beta_1 x_{i1}+\beta_2 x_{i2}+\beta_3 x_{i3}+\cdots\]

Notice that the first vector contains all the covariates for observation \(i\), preceded by a leading 1. This leading 1 is what multiplies the intercept \(\beta_0\). The second vector contains all the regression coefficients.
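In R, this leading 1 corresponds to a column of ones in the design matrix. A small sketch with made-up covariate values and coefficients:

```r
set.seed(1)
n <- 4
x1 <- runif(n); x2 <- runif(n)   # made-up covariates
xmat <- cbind(1, x1, x2)         # leading column of ones for the intercept
betas <- c(0.5, -1, 2)           # beta0, beta1, beta2
# inner product for observation i = 1:
sum(xmat[1, ] * betas)           # equals beta0 + beta1*x1[1] + beta2*x2[1]
```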

The inner product within JAGS

Knowing about inner products is particularly useful because it saves us from writing out all the terms \(\beta_0+\beta_1 x_{i1}+\beta_2 x_{i2}+\beta_3 x_{i3}+\cdots\) explicitly within JAGS. Imagine how much you would have to type if you had 10 or 20 covariates, particularly if these covariates had more informative names! Instead, in JAGS we can just write

inprod(x[i,],betas)

Here is an example. The model specification within JAGS is given in the file “JAGS model.R”, which is reproduced below:

model{
  for (i in 1:N){
    #likelihood: JAGS parameterizes dnorm with mean and precision (1/variance)
    y[i]~dnorm(inprod(x[i,],betas),prec)
  }
  
  #vague priors for the regression coefficients
  for (j in 1:nparam){
    betas[j] ~ dnorm(0,1/100)
  }
  #vague prior for the precision
  prec ~ dgamma(0.1,0.1)
}
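If you would rather keep everything in a single script, the same model specification can be written to "JAGS model.R" from within R before fitting (a sketch; the file name matches the one used below):

```r
# write the JAGS model specification to a file from within R
model.string <- "
model{
  for (i in 1:N){
    y[i] ~ dnorm(inprod(x[i,], betas), prec)
  }
  for (j in 1:nparam){
    betas[j] ~ dnorm(0, 1/100)
  }
  prec ~ dgamma(0.1, 0.1)
}"
writeLines(model.string, con = 'JAGS model.R')
```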

Now I simulate some data, fit the JAGS model, and compare the resulting parameter estimates to the true parameter values. Note that I avoid using additional linear algebra to simulate the data, to keep these notes easy to follow even for people who have never taken linear algebra.

rm(list=ls(all=T))
library(jagsUI)
set.seed(1)

#simulate data 
N=1000
nparam=15
xmat=matrix(runif(nparam*N,min=-1,max=1),N,nparam)
xmat[,1]=1 #first column set to 1 for the intercept
betas.true=sample(seq(from=-2,to=2,by=0.5),size=nparam,replace=T)

#calculate mean
media=rep(0,N)
for (i in 1:nparam){
  media=media+xmat[,i]*betas.true[i]
}
var1=0.1

#generate response variable
y=rnorm(N,mean=media,sd=sqrt(var1))
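As an aside, the loop used to calculate the mean computes exactly the matrix-vector product that linear algebra notation would give us. A quick sanity check, not part of the original script:

```r
set.seed(1)
N <- 1000; nparam <- 15
xmat <- matrix(runif(nparam * N, min = -1, max = 1), N, nparam)
xmat[, 1] <- 1
betas.true <- sample(seq(from = -2, to = 2, by = 0.5), size = nparam, replace = TRUE)

# loop version (as in the script above)
media <- rep(0, N)
for (i in 1:nparam) media <- media + xmat[, i] * betas.true[i]

# matrix-vector product version
media2 <- as.numeric(xmat %*% betas.true)
max(abs(media - media2))   # essentially zero (floating-point noise)
```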
#-----------------------------------------------
# MCMC settings 
ni <- 1000 #number of iterations
nt <- 50    #interval to thin 
nb <- 500  #number of iterations to discard as burn-in
nc <- 3     #number of chains

#set parameters to monitor
params=c("betas","prec")

#run model
setwd('U:\\uf\\courses\\bayesian course\\rmarkdown\\inner product')
mod=jags(model.file="JAGS model.R",
               parameters.to.save=params,
               data=list(N=N,x=xmat,y=y,nparam=nparam),
               n.chains=nc,
               n.burnin=nb,n.iter=ni,n.thin=nt,DIC=TRUE)
## 
## Processing function input....... 
## 
## Done. 
##  
## Compiling model graph
##    Resolving undeclared variables
##    Allocating nodes
## Graph information:
##    Observed stochastic nodes: 1000
##    Unobserved stochastic nodes: 16
##    Total graph size: 18024
## 
## Initializing model
## 
## Adaptive phase..... 
## Adaptive phase complete 
##  
## 
##  Burn-in phase, 500 iterations x 3 chains 
##  
## 
## Sampling from joint posterior, 500 iterations x 3 chains 
##  
## 
## Calculating statistics....... 
## 
## Done.
#compare estimated to true
betas.mean.estim=apply(mod$sims.list$betas,2,mean)
rango=range(c(betas.mean.estim,betas.true))
plot(betas.true,betas.mean.estim,ylim=rango,xlim=rango,
     xlab='true coefficients',ylab='posterior mean estimates')
lines(rango,rango,col='red') #45-degree line: points close to it indicate good recovery