Solved – Expected number of successes from $N$ Bernoulli trials with different $p$

Suppose I have N probabilities $(p_1, p_2,…,p_N)$ that represent the chance that each that a corresponding test was passed. How do I apply the Bernoulli distribution to determine the expected number of passes?

If you add up $n$ Bernouilli random variables with a different success probabilities then the sum has a Binomial Distribution of Poisson.

If the success probabilities are $p_i$ then the mean of the Binomial of Poisson is $n bar{p}$ where $bar{p}=frac{sum_i p_i}{n}$ and the variance of the Binomial distribution of Poisson is $n bar{p} (1-bar{p}) – n times var(p)$ (Note: for $var(p)$ there must be $n$ in the denominator!).

Note that if $var(p)$ is relatively small then the variance reduces to a binomial variance, which was expected because a small $var(p)$ means that the success probabilities of the Bernouillis are more or less equal.

I have some code to simulate this:

# Function that simulates Poisson Binomial random variable #  #   parameter 'ps' contains the success probabilities of the Bernouilli's to add up #   parameter n.sim is the number of simulations # #   The return value is a list containing #      - the simulated mean #      - the simulated variance #      - the 'true' mean of Poisson Binomial namely  n x average(ps) #      - the true variance of Poisson Binomial namely n x average(ps) x (1-average(ps)) - n var(ps)  simulate.Poisson_Binomial<-function(ps, n.sim=100000) {    sum.all<-vector(mode="numeric", length=n.sim)   for ( i in 1:n.sim ) {     # generate the random outcome for each Bernouilli     random.outcome<-( runif(n=length(ps)) <= ps )      # count the number of successes     sum.all[i]<-sum(random.outcome)   }    ret<-list(sim.mean=mean(sum.all),              sim.var=var(sum.all),             PoisBin.mean=length(ps)*mean(ps),              PoisBin.var=length(ps)*mean(ps)*(1-mean(ps))-(length(ps)-1)*var(ps))    return(ret) }   # Generate 50 Bernouilli success probabilities set.seed(1) N<-50 ps<-runif(n=N)  # do the simulation simulate.Poisson_Binomial(ps=ps, n.sim=5e5) 

In the return of the R-function there is (length(ps)-1)*var(ps) this is because the R-function var() has (n-1) in the denominator. so $-n times var(p)$ in the formula above should be 'translated' to -length(ps) * ( var(ps) * (length(ps)-1)/length(ps) which becomes - var(ps) * (length(ps)-1)

See also this Intuitive explanation for dividing by $n-1$ when calculating standard deviation?

Similar Posts:

Rate this post

Leave a Comment