Solved – Non-homogeneous Poisson process with simple rates

I am trying to simulate the number of claims in the next 12 months using a non-homogeneous Poisson process. The rates are: 11.02 per day during March, April and May; 11.68 per day during June, July and August; 26.41 per day during September, October and November; 20.83 per day during December, January and February. I came across … Read more
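With piecewise-constant rates like these, one way to simulate the process is to treat each season as an ordinary homogeneous Poisson process and draw exponential gaps between claims. The sketch below is only an illustration under stated assumptions: a non-leap year (so the seasons have 92, 92, 91 and 90 days), and the names `SEASONS`, `simulate_season` and `simulate_year` are hypothetical, not taken from the original question.

```python
import random

# (label, claims per day, days in season); day counts assume a non-leap year
SEASONS = [
    ("Mar-May", 11.02, 92),
    ("Jun-Aug", 11.68, 92),
    ("Sep-Nov", 26.41, 91),
    ("Dec-Feb", 20.83, 90),
]

def simulate_season(rate, days, rng):
    """Count arrivals of a homogeneous Poisson process with `rate` per day over `days` days."""
    count = 0
    t = rng.expovariate(rate)  # time of the first claim
    while t < days:
        count += 1
        t += rng.expovariate(rate)  # exponential gap to the next claim
    return count

def simulate_year(rng):
    """Simulate one year of claims, season by season."""
    return {name: simulate_season(rate, days, rng) for name, rate, days in SEASONS}

# Theoretical mean number of claims per year: sum of rate * days
expected_total = sum(rate * days for _, rate, days in SEASONS)

rng = random.Random(0)  # fixed seed so the run is reproducible
claims = simulate_year(rng)
total = sum(claims.values())
print(claims, total, round(expected_total, 2))
```

Because the rate is constant within a season, the count for each season is Poisson with mean rate × days, so repeating `simulate_year` many times gives an empirical distribution of the annual total.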

Solved – How to Calculate Probable Defective Rate with Confidence Interval Sampling from Population

I took some stats in college, and this is a tip-of-the-tongue problem that I just can't think how to search for. If there is an answer to this already, please point me in the right direction. My problem fits well in an analogy of auto manufacturing: There are hundreds of populations (different components) with varying population … Read more

Solved – Using PROC QLIM OUTPUT to get predicted values from a two-step Tobit model

I am fitting a two-step Tobit model through PROC QLIM in SAS. The first step of the model is a probit model for whether someone "responds" (e.g. makes a donation). The second step of the model is linear for the amount (e.g. amount of the donation, given that someone made a donation). I am using … Read more

Solved – Rayleigh Distribution Quartiles

The Rayleigh distribution has PDF $f(x) = x e^{-\frac{x^2}{2}}$, $x > 0$. Let $X$ have the Rayleigh distribution. (a) Find $P(1 < X < 3)$. (b) Find the first quartile, median, and third quartile of $X$. Alright, so the first part is quite easy: it's just the integral from 1 to 3 of $f(x)$, but the second part is … Read more
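For this PDF the CDF is $F(x) = 1 - e^{-x^2/2}$, so a quantile can be found by solving $F(x) = p$, which gives $x_p = \sqrt{-2\ln(1-p)}$. The following is a small sketch of both parts; the function names `rayleigh_cdf` and `rayleigh_quantile` are hypothetical helpers, not part of the original question.

```python
import math

def rayleigh_cdf(x):
    """CDF of the Rayleigh distribution with PDF f(x) = x * exp(-x^2 / 2), x > 0."""
    return 1.0 - math.exp(-x * x / 2.0) if x > 0 else 0.0

def rayleigh_quantile(p):
    """Invert the CDF: solve 1 - exp(-x^2 / 2) = p for x, with 0 <= p < 1."""
    return math.sqrt(-2.0 * math.log(1.0 - p))

# (a) P(1 < X < 3) = F(3) - F(1)
prob_1_to_3 = rayleigh_cdf(3.0) - rayleigh_cdf(1.0)

# (b) first quartile, median, third quartile
q1, median, q3 = (rayleigh_quantile(p) for p in (0.25, 0.5, 0.75))

print(prob_1_to_3, q1, median, q3)
```

Evaluated by hand, these come out to roughly 0.5954 for the probability and 0.7585, 1.1774 and 1.6651 for the three quartiles.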

Solved – Most common method for deciding when to stop training a neural net on a batch

I have created my own neural net which is using batch gradient descent. In other words, it trains on batches of examples all at once. My issue is trying to figure out when to stop the training of the batch. I'll try to make things as understandable as possible since there are so many options, … Read more

Solved – Noise covariance matrix in Kalman filter

In developing a Kalman filter, I hit roadblocks when trying to estimate the noise covariance matrices of both the state process and the measurement process. In this post, the author mentioned "tuning" these covariances. Also, the Wikipedia page says: In most real-time applications, the covariance matrices that are used in designing the … Read more

Solved – QQ-plot doesn’t correspond with histogram

I made a histogram and QQ-plot, using this code: hist(ang$Pkt, ylim=c(0,0.05), freq=FALSE, breaks=10); curve(dnorm(x, mean=mean(ang$Pkt), sd=sd(ang$Pkt)), add=TRUE, col="red", lty="dotted", xaxt="n"); qqnorm(ang$Pkt); qqline(ang$Pkt, col="red"). I got two images. According to what I found, it means that the values should be concentrated in the centre. But according to the histogram, they are … Read more

Solved – K-means++ like initialization for K-medoids

Does it make sense to use a K-means++-style initialization in K-medoids? To be precise – is it good to select the "farthest" points as initial medoids? (Farthest in the sense that points that are further from each other have a greater probability of being selected as initial medoids.) I think that it makes sense, … Read more
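The idea in the question can be sketched as follows: pick the first medoid uniformly at random, then pick each further medoid with probability proportional to its squared distance from the nearest medoid already chosen. This is only an illustration of K-means++-style seeding restricted to data points; the function `kmeanspp_medoid_seeds`, the `manhattan` distance and the sample points are hypothetical, and squaring the distance is an assumption carried over from K-means++.

```python
import random

def kmeanspp_medoid_seeds(points, k, dist, rng):
    """Pick up to k initial medoid indices using K-means++-style D^2 sampling."""
    n = len(points)
    seeds = [rng.randrange(n)]  # first medoid: uniform at random
    # d2[i] = squared distance from point i to its nearest chosen medoid
    d2 = [dist(points[i], points[seeds[0]]) ** 2 for i in range(n)]
    while len(seeds) < k:
        if sum(d2) == 0:
            break  # every remaining point coincides with a chosen medoid
        # Far-away points get larger weights; chosen medoids have weight 0
        chosen = rng.choices(range(n), weights=d2, k=1)[0]
        seeds.append(chosen)
        d2 = [min(d2[i], dist(points[i], points[chosen]) ** 2) for i in range(n)]
    return seeds

def manhattan(a, b):
    """Any dissimilarity works here, since K-medoids does not need Euclidean distance."""
    return sum(abs(x - y) for x, y in zip(a, b))

points = [(0, 0), (0, 1), (10, 10), (10, 11), (20, 0), (21, 0)]
seeds = kmeanspp_medoid_seeds(points, 3, manhattan, random.Random(1))
print([points[i] for i in seeds])
```

Because the seeds are always existing data points and only pairwise distances are used, the same routine applies to any dissimilarity measure that K-medoids accepts.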

Solved – Getting negative variance

I'm having a problem when calculating the variance of the following estimator: $\hat\theta=\frac{1}{N}\sum_{n=1}^{N}D_n$ with $D_1, \ldots, D_N$ independent random variables. To calculate the variance of this estimator, I use the following procedure: $\sigma_{\hat\theta}^{2}=E[(\hat\theta - E[\hat\theta])^2]=E[\hat\theta^2]+E[E[\hat\theta]^2]-2E[\hat\theta]E[E[\hat\theta]]$ Knowing that $f(D)=\frac{1}{\theta}e^{-\frac{1}{\theta}D}$ and therefore, $E[D]= E[\hat\theta]=\theta$. From this point, I get that $\sigma_{\hat\theta}^{2} = E[\hat\theta^2] +\theta^2-2\theta^2$ So, in order … Read more
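As a check on the expression above, here is a sketch of how the remaining term $E[\hat\theta^2]$ can be worked out, assuming the $D_n$ are independent and all follow the exponential density $f(D)$ given in the question, so that $E[D_n]=\theta$ and $E[D_n^2]=2\theta^2$. The key step is to keep the cross terms $E[D_n D_m]$ for $n \neq m$:

$$
E[\hat\theta^2]
= \frac{1}{N^2}\sum_{n=1}^{N}\sum_{m=1}^{N}E[D_n D_m]
= \frac{1}{N^2}\left(N \cdot 2\theta^2 + N(N-1)\,\theta^2\right)
= \frac{N+1}{N}\,\theta^2
$$

$$
\sigma_{\hat\theta}^{2} = E[\hat\theta^2] + \theta^2 - 2\theta^2
= \frac{N+1}{N}\,\theta^2 - \theta^2
= \frac{\theta^2}{N} \ge 0
$$

Under these assumptions the variance is non-negative, so a negative result usually points to an error in evaluating $E[\hat\theta^2]$, for example dropping the cross terms.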

Solved – Testing heteroscedasticity in seasonal ARIMA model

I have estimated a seasonal ARIMA(1,2,1)x(0,0,2) model and then, to test for heteroscedasticity, converted it to an lm object with x <- lm(residuals(m) ~ 1), where m = auto.arima(ts.loggdpq,stepwise=FALSE) and ts.loggdpq is quarterly, logged GDP data. Testing for heteroscedasticity with bptest(x) resulted in: studentized Breusch-Pagan test data: x BP = 3.429e-30, df = 0, p-value < 2.2e-16 meaning … Read more