Solved – Expectation maximization on Bayesian networks with latent variables

I am trying to determine parameters in a bayesian network with two latent variables (in blue).

DAG (not enough reputation to post an image)

Every variable is discrete with 2-4 categories. The latent variables have 3 categories each.

I am trying to find an R package that will let me define the DAG and then learn the conditional probability parameters from data:

    B C D E F G H I J     2 1 ? 3 2 ? 1 4 4     2 2 ? 3 2 ? 1 3 2     2 1 ? 3 2 ? 1 4 2     3 3 ? 3 2 ? 1 4 2     3 1 ? 3 2 ? 1 1 4     3 3 ? 3 2 ? 1 4 1     ... 

I have tried working with:

  1. bnlearn: as far as I can tell the parameter estimation doesn't support latent variables
  2. catnet: I can't really tell what is going on here, but the results haven't worked out the way I expected them to when using the parameter estimation tools in this package.
  3. gRain: I tried working with this a lot, but couldn't sort out how to generate the CPT values when one or more of the variables in my DAG was latent.

I also tried working out how to use rJags for a long time but ended up giving up. The tutorials I found for parameter estimation all assumed either:

  1. There are no latent variables
  2. We were trying to do learning on the structure, rather than just determine conditional probabilities given a structure.

I'd really like it if I could find a solid R package for tackling this problem, though a java/scala solution would work as well. A tutorial/example problem would be even better.

I have a vague impression that expectation maximization techniques can be used and even started trying to write code for the problem myself by extending the cluster graph stuff written here bayes-scala but quickly realized that this was a bigger project than I initially thought.

I worked with Bayes scala some more and found a video from a coursera class defining cluster graphs http //

I'll walk through here quickly what I think I learned. First off, the full DAG I am dealing with is (http // (still not enough reputation for images or more links)

Most of the language dealing with cluster graphs refers to factors in factor graphs, so I tried to translate this DAG into a factor graph (http // I'm not certain how this works, but I think this factor graph is a valid transformation from the DAG.

Then a simple cluster graph I could make is called a bethe cluster graph (http // The bethe cluster graph is guaranteed to satisfy the constraints of cluster graphs. Namely:

1. Family Preservation: For each factor $Phi_k$ there must exist a cluster $C_i$ s.t $scope(Phi_k) subseteq C_i$

2. Running Intersection Property: for each pair of clusters $C_i,C_j$ and variable $X_i in C_i cap C_j$ there exists a unique path between $C_i,C_j$ such that all clusters along the path contain $X_i$

However, I think the bethe cluster graph loses the covariability conferred by the original DAG. So I tried to construct what I believe is a valid cluster graph that still maintains information about covariance between variables:
http //

I think its possible that I messed something up in this cluster graph, but I could use bayes scala to write this cluster graph (http // After I do so the log likelihood's progress as one would imagine:

EM progress(iterNum, logLikelihood): 1, -2163.880428036244 EM progress(iterNum, logLikelihood): 2, -1817.7287344711604 

and the marginal probabilities can be calculated. Unfortunately the marginal probabilities for the hidden nodes turn out to be flat after using GenericEMLearn:

marginal D: WrappedArray(0.25, 0.25, 0.25000000000000006, 0.25000000000000006) marginal J: WrappedArray(0.25000000000000006, 0.25, 0.25, 0.25) 

Additionally an odd error occurs where if I set evidence in a LoopyBP more than once I start getting NA values back:

 val loopyBP = LoopyBP(skinGraph)     loopyBP.calibrate()     (1 to 44).foreach((sidx) => {       val s1 = dataSet.samples(sidx)       (1 to 10).foreach((x) => {               loopyBP.setEvidence(x,s1(x))       })       loopyBP.marginal(11).getValues.foreach((x) => print(x + " "))       println()     }) 
actual value was 0 0.0 0.0 0.0 1.0  actual value was 0 NaN NaN NaN NaN  

This problem actually occurs for the sprinkler example as well, so I must be using setEvidence incorrectly somehow.

I'm the creator of bayes-scala toolbox you are referring to. Last year I implemented the EM in discrete bayesian network for learning from incomplete data (including latent variables), that looks like the use case you are asking about.

Some tutorial for a sprinkler bayesian network is here.

And, "Learning Dynamic Bayesian Networks with latent variables" is given here.

Similar Posts:

Rate this post

Leave a Comment