I was horrified to find recently that Matlab returns $0$ for the *sample* variance of a scalar input:

    >> var(randn(1),0) %the '0' here tells var to give sample variance
    ans =
         0
    >> var(randn(1),1) %the '1' here tells var to give population variance
    ans =
         0

Somehow, the sample variance is not dividing by $0 = n-1$ in this case. R returns a NaN for a scalar:

    > var(rnorm(1,1))
    [1] NA
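For concreteness, here is a sketch of the two estimators in question, writing $x_1, \dots, x_n$ for the data and $\bar{x}$ for their mean (the notation is mine, not Matlab's or R's):

$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2, \qquad \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^2.$$

With $n = 1$ we have $\bar{x} = x_1$, so the sum of squares is $0$ in both cases: the population form gives $0/1 = 0$, while the sample form gives $0/0$, which is undefined.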

What do you think is a sensible way to define the ~~population~~ sample variance for a scalar? What consequences might there be of returning a zero instead of a NaN?

**edit**: from the help for Matlab's `var`:

    VAR normalizes Y by N-1 if N>1, where N is the sample size. This is
    an unbiased estimator of the variance of the population from which X is
    drawn, as long as X consists of independent, identically distributed
    samples. For N=1, Y is normalized by N.
    Y = VAR(X,1) normalizes by N and produces the second moment of the
    sample about its mean. VAR(X,0) is the same as VAR(X).

A cryptic comment in the m-code for `var` states:

    if w == 0 && n > 1
        % The unbiased estimator: divide by (n-1). Can't do this
        % when n == 0 or 1.
        denom = n - 1;
    else
        % The biased estimator: divide by n.
        denom = n; % n==0 => return NaNs, n==1 => return zeros
    end

*i.e.* they explicitly choose not to return a `NaN` even when the user requests a sample variance on a scalar. My question is why they should choose to do this, not how.
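To make the choice concrete, here is a minimal Python sketch (not Matlab's actual source) of the denominator logic described in the comment quoted above, next to a stricter variant that returns NaN when a sample variance is requested for a single point. The function names `var_matlab_style` and `var_strict` are hypothetical, chosen only for illustration.

```python
import math

def _sum_sq_dev(x):
    """Sum of squared deviations from the mean (assumes x is non-empty)."""
    mean = sum(x) / len(x)
    return sum((xi - mean) ** 2 for xi in x)

def var_matlab_style(x, w=0):
    """Mimic the quoted logic: fall back to dividing by n when n <= 1."""
    n = len(x)
    if n == 0:
        return math.nan  # n == 0 => NaN, as the quoted comment says
    denom = n - 1 if (w == 0 and n > 1) else n
    return _sum_sq_dev(x) / denom

def var_strict(x, w=0):
    """Alternative: refuse to estimate a sample variance from one point."""
    n = len(x)
    denom = n - 1 if w == 0 else n
    if denom <= 0:
        return math.nan  # 0/0 is undefined, so report NaN
    return _sum_sq_dev(x) / denom

print(var_matlab_style([0.7]))   # 0.0
print(var_strict([0.7]))         # nan
print(var_strict([0.7], w=1))    # 0.0
```

The two functions differ only in the `n == 1`, `w == 0` case: the first silently returns `0.0`, the second returns NaN.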

**edit**: I see that I had erroneously asked how one should define the population variance of a scalar (see the struck-through word above). This probably caused a lot of confusion.


#### Best Answer

Scalars can't 'have' a population variance, although they can be single samples from a population that has a (population) variance. If you want to estimate that variance, you need at least one of: more than one data point in the sample, another sample from the same distribution, or some prior information about the population variance by way of a model.

btw, R has returned missing (`NA`), not `NaN`:

    is.nan(var(rnorm(1,1)))
    [1] FALSE
