Rationale

We often use logistic regression to model binary response variables. Similarly, we often use Poisson regression to model count data. We fit these generalized linear models (GLMs) in R with the black-box function “glm.”

To truly understand these models, it is useful to know their likelihoods. In this assignment, we will explicitly define these likelihoods and then use maximum likelihood estimation (MLE) to estimate the model parameters.

Logistic regression model

Here is the simulated data for the logistic regression model:

set.seed(12)
n=1000
x=runif(n,min=-10,max=10)

#parameters
b0=0.3
b1=1

#simulate response variable
tmp=exp(b0+b1*x)
pi=tmp/(1+tmp)
y=rbinom(n,size=1,prob=pi)
dat.logistic=data.frame(x=x,y=y)
plot(jitter(y)~x,data=dat.logistic)

The likelihood for the logistic regression model is given by:

\[p(y_1,\ldots,y_n|x_1,\ldots,x_n,\beta_0,\beta_1)\propto \prod_{i=1}^{n} \text{Bernoulli}(y_i|\pi_i) \] where

\[\pi_i=\frac{\exp(\beta_0 + \beta_1 x_i)}{1+\exp(\beta_0 + \beta_1 x_i)}\]
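As a sketch of how this likelihood translates into code, here is one way to write the minus log-likelihood as an R function. The name `nll.logistic` and the convention of packing the parameters into a single vector `par = c(b0, b1)` are my own choices (a single parameter vector is the form that “optim” expects later):

```r
#minus log-likelihood for the logistic regression model
#par = c(b0, b1); dat must have columns x and y
nll.logistic <- function(par, dat) {
  eta <- par[1] + par[2] * dat$x      #linear predictor
  pi <- exp(eta) / (1 + exp(eta))     #inverse logit; plogis(eta) is the numerically stabler equivalent
  #negative sum of Bernoulli log-densities
  -sum(dbinom(dat$y, size = 1, prob = pi, log = TRUE))
}
```

For example, `nll.logistic(c(0.3, 1), dat.logistic)` evaluates the minus log-likelihood at the true simulation parameters.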

Poisson regression model

Here is the simulated data for the Poisson regression model:

set.seed(12)
n=1000
x=runif(n,min=-10,max=10)

#parameters
b0=0.3
b1=0.1

#simulate response variable
lambda=exp(b0+b1*x)
y=rpois(n,lambda=lambda)
dat.poisson=data.frame(x=x,y=y)
plot(jitter(y)~x,data=dat.poisson)

The likelihood for the Poisson regression model is given by:

\[p(y_1,\ldots,y_n|x_1,\ldots,x_n,\beta_0,\beta_1)\propto \prod_{i=1}^{n} \text{Poisson}(y_i|\lambda_i) \] where

\[\lambda_i=\exp(\beta_0 + \beta_1 x_i)\]
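A sketch of this likelihood as an R minus log-likelihood function (again, the function name `nll.poisson` and the `par = c(b0, b1)` convention are my own choices):

```r
#minus log-likelihood for the Poisson regression model
#par = c(b0, b1); dat must have columns x and y
nll.poisson <- function(par, dat) {
  lambda <- exp(par[1] + par[2] * dat$x)  #log link: lambda_i = exp(b0 + b1*x_i)
  #negative sum of Poisson log-densities
  -sum(dpois(dat$y, lambda = lambda, log = TRUE))
}
```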

Assignment

  1. Create functions that calculate the minus log-likelihood for each dataset.

  2. Using the optimization function “optim,” find the maximum likelihood estimates for the regression parameters.

  3. Fit these data using the “glm” function. Do we get similar parameter estimates?
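To illustrate the full workflow, here is one possible sketch for the logistic case (the Poisson case is analogous). The data simulation from above is repeated so the snippet runs on its own; the function name `nll` and the starting values `c(0, 0)` are my own choices:

```r
#recreate the simulated logistic data from above
set.seed(12)
n <- 1000
x <- runif(n, min = -10, max = 10)
pi <- plogis(0.3 + 1 * x)               #plogis is the inverse logit
y <- rbinom(n, size = 1, prob = pi)
dat.logistic <- data.frame(x = x, y = y)

#1. minus log-likelihood with par = c(b0, b1)
nll <- function(par, dat) {
  p <- plogis(par[1] + par[2] * dat$x)
  -sum(dbinom(dat$y, size = 1, prob = p, log = TRUE))
}

#2. minimize the minus log-likelihood (Nelder-Mead by default)
fit.optim <- optim(par = c(0, 0), fn = nll, dat = dat.logistic)

#3. compare with glm
fit.glm <- glm(y ~ x, family = binomial, data = dat.logistic)
fit.optim$par  #MLEs of (b0, b1) from optim
coef(fit.glm)  #estimates from glm
```

If everything is set up correctly, the two sets of estimates should agree closely, since `glm` maximizes the same likelihood (via iteratively reweighted least squares) that `optim` is minimizing the negative of.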


