Solved – dmulti in jags when sum(p[i]) < 1

I have a problem with dmulti. Principially, this would be my model:

... for (i in 1:sites) {     for (j in 1:visits) {         pi[i, j] <- ....     }     Y[i, ] ~ dmulti(pi[i, ], N[i]) } 

where Y is data, pi and N are unknown unobserved parameters of the model. The problem here is that sum(Y[i,]) != N[i] and of course sum(pi[i,]) != 1. Unfortunatelly, JAGS requires this, and will normalize the pi to sum up to 1, which I don't want!

How to cope with this problem?

I tried several solutions:

1) add new value to the last column of Y (Y[,visits+1]) in the model:

... for (i in 1:sites) {     for (j in 1:visits) {         pi[i, j] <- ....     }     pi[i, visits + 1] <- 1 - sum(pi[i, 1:visits])     Y[i, visits + 1] <- N[i] - sum(Y[i, 1:visits])     Y[i, ] ~ dmulti(pi[i, ], N[i]) } 

In this case, JAGS will report an error "Index out of range for node Y", probably because the input data had only columns 1:visits and now I am introducing a new one.

2) I tried to add the last column (Y[,visits+1]) to the data – as NA – and then run the model from 1). It ended with "Y[1,1:13] has missing values".

3) I tried to create new array Y2 from Y and N in the model, but this will end with another error because you both assign to Y2 and then you do Y2 ~ dmulti which is some kind of ambiguity error in jags.

The problem is not that sum(p[]) < 1 but that you are modelling an incompletely observed multinomial distribution. As you have found, one of the limitations of JAGS is that multivariate nodes cannot be incompletely observed.

There is one possibility to rescue your model that depends on a particular choice of prior distribution for N[i]. If this has a Poisson prior

N[i] ~ dpois(mu[i]) 

then you can rewrite the model as

for (i in 1:sites) {     for (j in 1:visits) {         pi[i, j] <- ....         Y[i, j] ~ dpois(mu[i] * p[i, j])     }     Z[i] ~ dpois(mu[i] * (1 - sum(p[i,]))     N[i] <- sum(Y[i,]) + Z[i] } 

Here Z[i] represents the missing observation from your multinomial distribution. Note that here N[i] is not modelled directly, but is a function of Y[i,] and Z[i].

A more general solution is to model Y[i,j] as a sequence of binomial observations

for (i in 1:sites) {     N.bin[i,1] <- N[i]     pi.bin[i,1] <- pi[i, j]     for (j in 2:visits) {         N.bin[i,j] <- N.bin[i, j-1] - Y[i,j-1]         pi.bin[i,j] <- pi[i,j]/(sum(pi[i,j:visits] + piz)     }     for (j in 1:visits) {         pi[i, j] <- ...         Y[i, j] ~ dbinom(pi.bin[i,j], N.bin[i,j])     }     piz[i] <- 1 - sum(p[i,1:visits])     Z[i] <- N[i] - sum(Y[i,1:visits]) } 

Similar Posts:

Rate this post

Leave a Comment