How do we integrate/marginalize parameters out?
Recall our probability rules. We can go from the joint probability \(p(A,B)\) to the marginal probability \(p(A)\) using the following expression:
\[p(A)=\sum_{j=1}^J p(A,B=j) = \sum_{j=1}^J p(A \mid B=j)\,p(B=j)\] In this expression, we assume that \(B\) is discrete and that its \(J\) possible outcomes are mutually exclusive and exhaustive.
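To make the summation concrete, here is a minimal Python sketch. The probabilities are made-up numbers chosen only for illustration, and the names `p_B`, `p_A_given_B`, and `marginal_A` are hypothetical.

```python
# Hypothetical example: B takes the values 1, 2, 3 and A takes the values 0, 1.
# p_B[j] plays the role of p(B=j); p_A_given_B[j][a] plays the role of p(A=a | B=j).
p_B = {1: 0.5, 2: 0.3, 3: 0.2}
p_A_given_B = {
    1: {0: 0.9, 1: 0.1},
    2: {0: 0.6, 1: 0.4},
    3: {0: 0.2, 1: 0.8},
}

def marginal_A(a):
    """Return p(A=a) by summing p(A=a | B=j) * p(B=j) over every outcome j of B."""
    return sum(p_A_given_B[j][a] * p_B[j] for j in p_B)

print(marginal_A(0), marginal_A(1))
```

With these illustrative numbers, \(p(A=1) = 0.1 \times 0.5 + 0.4 \times 0.3 + 0.8 \times 0.2 = 0.33\), and the two marginal probabilities sum to one, as they should.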
The same idea also works when we want to remove parameters from our models. However, because parameters are typically continuous, we will need to rely on integration instead of summation. For example:
\[p(y_i)=\int p(y_i|\theta)p(\theta) d\theta\]
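As a sketch of what this integral does in practice, suppose (purely for illustration) that \(y\) counts the successes in \(n\) trials with success probability \(\theta\), and that \(\theta\) has a Uniform(0, 1) prior. In that case the marginal \(p(y)\) is known to equal \(\frac{1}{n+1}\) for every value of \(y\), so we can check a simple numerical approximation against it. The function names below are hypothetical, and the midpoint rule is just one convenient choice of numerical integrator.

```python
import math

def binomial_pmf(k, n, theta):
    """p(y=k | theta) for k successes in n trials."""
    return math.comb(n, k) * theta**k * (1.0 - theta) ** (n - k)

def marginal_likelihood(k, n, num_points=10_000):
    """Approximate p(y=k) = integral of p(y=k | theta) p(theta) d theta on (0, 1).

    Assumes a Uniform(0, 1) prior, so p(theta) = 1 everywhere on the grid.
    """
    h = 1.0 / num_points
    total = 0.0
    for m in range(num_points):
        theta = (m + 0.5) * h          # midpoint of the m-th sub-interval
        prior_density = 1.0            # Uniform(0, 1) prior (an assumption)
        total += binomial_pmf(k, n, theta) * prior_density * h
    return total

n = 3
for k in range(n + 1):
    print(k, marginal_likelihood(k, n), 1.0 / (n + 1))
```

The parameter \(\theta\) no longer appears in the result: each value of \(\theta\) contributes its likelihood, weighted by how plausible that value is under the prior.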
Going back to the robust regression model, we want to integrate out \(\tau_i\). Here is how we do this:
\[p(y_i \mid \beta_0,\beta_1,x_i,\sigma^2,\nu) = \int N\left(y_i \mid \beta_0 +\beta_1 x_i,\frac{\sigma^2}{\tau_i}\right) \mathrm{Gamma}\left(\tau_i \mid \frac{\nu}{2},\frac{\nu}{2}\right)d\tau_i\]
Let \(\mu_i = \beta_0 +\beta_1 x_i\). We are interested in the distribution of the quantity \(\frac{y_i-\mu_i}{\sigma}\), so we can drop any multiplicative constant that does not depend on \(y_i\), \(\mu_i\), or \(\sigma\). Writing out both densities, we have that:
\[p(y_i \mid \beta_0,\beta_1,x_i,\sigma^2,\nu) \propto \int \frac{1}{\sqrt{2\pi \frac{\sigma^2}{\tau_i}}}\exp\left(-\frac{1}{2\frac{\sigma^2}{\tau_i}} (y_i-\mu_i)^2\right) \tau_i ^{\frac{\nu}{2}-1}\exp\left(-\frac{\nu}{2}\tau_i\right)d\tau_i\]
\[\propto \frac{1}{\sqrt{\sigma^2}} \int \tau_i^{\frac{1}{2}} \exp\left(-\tau_i \frac{(y_i -\mu_i)^2}{2\sigma^2}\right) \tau_i ^{\frac{\nu}{2}-1}\exp\left(-\frac{\nu}{2}\tau_i\right)d\tau_i\]
\[\propto \frac{1}{\sqrt{\sigma^2}} \int \tau_i^{\frac{\nu+1}{2}-1} \exp\left(-\tau_i \left[\frac{(y_i -\mu_i)^2}{2\sigma^2}+\frac{\nu}{2}\right]\right)d\tau_i\]
You should recognize the integrand as the kernel of a Gamma distribution with shape \(a=\frac{\nu+1}{2}\) and rate \(b=\frac{(y_i -\mu_i)^2}{2\sigma^2}+\frac{\nu}{2}\). Furthermore, recall that \(\int_0^\infty \frac{b^a}{\Gamma(a)} x^{a-1}\exp(-bx)dx = 1\). This implies that \(\int_0^\infty x^{a-1}\exp(-bx)dx = \frac{\Gamma(a)}{b^a}\). Therefore, we can solve the above integral and obtain:
\[\propto \frac{1}{\sqrt{\sigma^2}}\frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\left[\frac{(y_i-\mu_i)^2}{2\sigma^2}+\frac{\nu}{2}\right]^{\frac{\nu+1}{2}}}\]
\[\propto \frac{1}{\sqrt{\sigma^2}}\left[\frac{(y_i-\mu_i)^2}{2\sigma^2}+\frac{\nu}{2}\right]^{-\frac{\nu+1}{2}}\]
If we multiply the term in brackets by \(\frac{\nu}{2} \times \frac{2}{\nu} = 1\), we get:
\[\propto \frac{1}{\sqrt{\sigma^2}}\left[\frac{\nu}{2} \times \frac{2}{\nu}\left[\frac{(y_i-\mu_i)^2}{2\sigma^2}+\frac{\nu}{2}\right]\right]^{-\frac{\nu+1}{2}}\]
\[\propto \frac{1}{\sqrt{\sigma^2}}\left[\frac{\nu}{2} \left[\frac{1}{\nu}\frac{(y_i-\mu_i)^2}{\sigma^2}+1\right]\right]^{-\frac{\nu+1}{2}}\]
\[\propto \frac{1}{\sqrt{\sigma^2}} \left[1+\frac{1}{\nu}\frac{(y_i-\mu_i)^2}{\sigma^2}\right]^{-\frac{\nu+1}{2}}\]
In the last step we dropped the constant factor \(\left(\frac{\nu}{2}\right)^{-\frac{\nu+1}{2}}\), which does not depend on \(y_i\), \(\mu_i\), or \(\sigma\). The result is the kernel of a non-standardized Student t-distribution with \(\nu\) degrees of freedom, location \(\mu_i\), and scale \(\sigma\) (see Wikipedia).
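We can sanity-check this derivation numerically. The sketch below uses only Python's standard library. It first checks the Gamma-kernel identity \(\int_0^\infty x^{a-1}\exp(-bx)dx = \frac{\Gamma(a)}{b^a}\), then integrates the Normal-Gamma mixture over \(\tau_i\) and compares it with the fully normalized Student t density with \(\nu\) degrees of freedom, location \(\mu_i\), and scale \(\sigma\). The function names, the parameter values, the midpoint rule, and the truncation of the integral at a finite upper limit are all assumptions made for illustration.

```python
import math

def midpoint_integral(f, lower, upper, num_points=50_000):
    """Approximate the integral of f over [lower, upper] with the midpoint rule."""
    h = (upper - lower) / num_points
    return h * sum(f(lower + (m + 0.5) * h) for m in range(num_points))

def gamma_kernel_integral(a, b, upper=60.0):
    """Integral of x^(a-1) exp(-b x) on (0, upper); upper stands in for infinity."""
    return midpoint_integral(lambda x: x ** (a - 1.0) * math.exp(-b * x), 0.0, upper)

def normal_pdf(y, mean, var):
    return math.exp(-0.5 * (y - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def gamma_pdf(x, shape, rate):
    return rate**shape / math.gamma(shape) * x ** (shape - 1.0) * math.exp(-rate * x)

def mixture_density(y, mu, sigma, nu, upper=60.0):
    """Integrate N(y | mu, sigma^2 / tau) * Gamma(tau | nu/2, nu/2) over tau."""
    def integrand(tau):
        return normal_pdf(y, mu, sigma**2 / tau) * gamma_pdf(tau, nu / 2.0, nu / 2.0)
    return midpoint_integral(integrand, 0.0, upper)

def student_t_pdf(y, mu, sigma, nu):
    """Non-standardized Student t density, including its normalizing constant."""
    z2 = ((y - mu) / sigma) ** 2
    const = math.gamma((nu + 1.0) / 2.0) / (
        math.gamma(nu / 2.0) * math.sqrt(nu * math.pi) * sigma
    )
    return const * (1.0 + z2 / nu) ** (-(nu + 1.0) / 2.0)

# Gamma-kernel identity for one illustrative choice of a and b.
print(gamma_kernel_integral(2.5, 2.0), math.gamma(2.5) / 2.0**2.5)

# Mixture versus closed form, for illustrative parameter values.
for y in (-3.0, 0.0, 1.5, 6.0):
    print(y, mixture_density(y, 1.0, 2.0, 4.0), student_t_pdf(y, 1.0, 2.0, 4.0))
```

The two columns should agree up to the error of the numerical integration. Integrating \(\tau_i\) out therefore leaves a heavier-tailed likelihood for \(y_i\) without changing the location \(\mu_i\) or the scale \(\sigma\).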