Educational example

Recall the educational data that we analyzed previously. In this data set, each row corresponded to an individual student \(s_1,\ldots,s_{100}\) and each column to an individual question \(q_1,\ldots,q_{40}\) in a test. The content of each cell was either:

  • 0: the student got this question wrong,
  • 1: the student got this question right, or
  • NA: the student was not presented with this question.

This type of data often arises from assessments in which the questions presented to each student are randomly drawn from a pool of questions.

In the model we saw, we were interested in determining the ability \(\theta_i\) of each student and the difficulty \(\phi_j\) of each question. To do so, we relied on the following assumptions:

\[y_{ij} \sim \text{Bernoulli}(\pi_{ij})\]

\[\pi_{ij}=\frac{\exp(\theta_i - \phi_j)}{1+\exp(\theta_i - \phi_j)}\]

where

\[\theta_i \sim N(0,1)\]

\[\phi_j \sim N(0,1)\]
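To make the generative assumptions of this baseline model concrete, here is a minimal simulation sketch in Python (standard library only). It covers only the original model, not the extensions asked for below. The function names, the seed, and in particular `p_presented` (the probability that a question is shown to a student) are assumptions introduced for illustration. The source only says that questions are randomly drawn from a pool, so independent draws with probability 0.5 are just one possible mechanism.

```python
import math
import random


def inv_logit(x):
    """Logistic function: exp(x) / (1 + exp(x))."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate_responses(n_students=100, n_questions=40, p_presented=0.5, seed=1):
    """Simulate a student-by-question response table from the baseline model.

    p_presented is a hypothetical parameter: each question is shown to each
    student independently with this probability (an assumption, not from the text).
    Cells for questions that were not presented hold None (playing the role of NA).
    """
    rng = random.Random(seed)
    theta = [rng.gauss(0.0, 1.0) for _ in range(n_students)]   # abilities, N(0, 1)
    phi = [rng.gauss(0.0, 1.0) for _ in range(n_questions)]    # difficulties, N(0, 1)

    y = []
    for i in range(n_students):
        row = []
        for j in range(n_questions):
            if rng.random() >= p_presented:
                row.append(None)                       # question not presented
            else:
                pi_ij = inv_logit(theta[i] - phi[j])   # success probability
                row.append(1 if rng.random() < pi_ij else 0)  # Bernoulli(pi_ij) draw
        y.append(row)
    return theta, phi, y


theta, phi, y = simulate_responses()
```

Each row of `y` is one student and each column is one question, matching the layout of the data set described above.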

Steps

  1. Extend the model described above by answering the following questions:

     a. Suppose that the students came from two different schools. How can we modify the equations of this model to test whether students from school A are better prepared than students from school B?

     b. We have the impression that this particular exam is harder than the usual exam. As a result, a mean of zero for the question difficulty does not seem appropriate. How can we modify the equations of this model to test whether the difficulties of the questions in this exam have a mean that is different from zero?

  2. Simulate data following the extended model that you specified in step 1.

  3. Create a corresponding JAGS model and fit it to the simulated data. Does the extended model that you proposed run into identifiability issues? If it does, explain why this is happening.
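When thinking about identifiability, it may help to notice that the success probability depends on \(\theta_i\) and \(\phi_j\) only through their difference. The short Python sketch below checks this numerically for the likelihood of the original model. The specific values and the helper names are arbitrary choices for illustration. Whether this invariance actually causes a problem depends on the priors and on how you parameterize your extension.

```python
import math


def inv_logit(x):
    """Logistic function: exp(x) / (1 + exp(x))."""
    return 1.0 / (1.0 + math.exp(-x))


def success_prob(theta_i, phi_j):
    """pi_ij from the model: logistic of ability minus difficulty."""
    return inv_logit(theta_i - phi_j)


# Arbitrary illustrative values (not taken from any data set).
theta_i, phi_j, shift = 0.3, -0.8, 2.5

p_original = success_prob(theta_i, phi_j)
# Adding the same constant to ability and difficulty leaves the difference unchanged.
p_shifted = success_prob(theta_i + shift, phi_j + shift)

print(p_original, p_shifted)
```

The two printed probabilities agree up to floating-point rounding, so the data alone cannot separate a common shift in abilities from the same shift in difficulties.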
