Solved – Using Std.Dev and Mean to generate hypothetical/additional data points

Let's say I want to make a football simulator based on real-life data.

Say I have a player who averages 5.3 yards per carry with a SD of 1.7 yards.

I'd like to generate random values that simulate the next few plays, e.g., 5.7, 4.9, 5.3, etc.

What stats terms do I need to look up to pursue this idea? Density function? The normal curve estimates what boundaries the data generally fall within, but how do I translate that into a simulation of subsequent data points?

Thanks for any guidance!

Of course you can use rnorm() in R, but it may be easier to understand how drawing from a pdf works by using the probability integral transform.
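For reference, the direct route could look like the sketch below, which plugs the mean and SD from the question into rnorm(). It assumes yards per carry are roughly normal, which the question implies but real rushing data may not satisfy. The seed and the variable names are only illustrative.

```r
# Sketch: simulate the next few carries directly with rnorm().
# Assumption: yards per carry ~ Normal(mean = 5.3, sd = 1.7), as in the question.
set.seed(42)   # arbitrary seed, only for reproducibility
n_plays <- 5   # hypothetical number of plays to simulate
carries <- rnorm(n_plays, mean = 5.3, sd = 1.7)
round(carries, 1)
```

Note that a normal model can occasionally produce values that make little sense on the field, such as very large losses, so the draws may need to be checked against what is plausible.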

The idea is this: once we specify the pdf, we can turn it into a cdf (numerically, so we never need a closed-form expression for it). Because the cdf increases monotonically from 0 to 1, every number between 0 and 1 corresponds to exactly one value of x. We can therefore back-calculate a draw from the original pdf by drawing a uniform random number between 0 and 1 and finding the x whose cdf value matches it.
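The reason this matching works can be written out in one line. As a sketch, let F be the cdf of the target distribution (assumed continuous and strictly increasing, as for the normal) and let U be uniform on (0, 1). The symbols F, U and X are notation introduced here, not taken from the original answer.

```latex
X = F^{-1}(U), \quad U \sim \mathrm{Uniform}(0,1)
\;\Longrightarrow\;
P(X \le x) = P\bigl(F^{-1}(U) \le x\bigr) = P\bigl(U \le F(x)\bigr) = F(x)
```

So X has cdf F, which means it behaves like a draw from the original pdf.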

This way, you only need a uniform RNG on (0, 1) and the pdf itself, and you're set. Here is the R code:

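    # Grid of x values and the normal pdf (standard normal by default)
    x <- seq(-4, 4, length.out = 1000)
    f <- function(x, mu = 0, sigma = 1) {
      1 / sqrt(2 * pi * sigma^2) * exp(-(x - mu)^2 / (2 * sigma^2))
    }

    # Numerical cdf: cumulative sum of the pdf over the grid, scaled to end at 1
    x.ecdf <- cumsum(f(x)) / sum(f(x))

    # For each uniform draw, find the grid point whose cdf value is closest
    y <- runif(100)
    out <- integer(length(y))
    for (i in seq_along(y)) {
      out[i] <- which.min(abs(y[i] - x.ecdf))
    }

    # Plot the cdf next to a histogram of the simulated draws
    par(mfrow = c(1, 2))
    plot(x, x.ecdf)
    hist(x[out], breaks = 20)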

[Figure: the numerical cdf (left) and a histogram of the simulated draws (right): http://probabilitynotes.files.wordpress.com/2010/08/rnormish.png]
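To connect this back to the question, the same grid-based idea can be pointed at a mean of 5.3 and an SD of 1.7. The sketch below is one way to do that. The helper name simulate_carries, the grid width of four SDs, and the grid size are assumptions made for illustration, not part of the original answer.

```r
# Sketch: grid-based inverse transform sampling for a normal distribution.
# simulate_carries() is a hypothetical helper, not a base R function.
simulate_carries <- function(n, mu = 5.3, sigma = 1.7, grid_size = 1000) {
  # Grid covering mu +/- 4 SDs (assumed wide enough to hold nearly all the mass)
  x <- seq(mu - 4 * sigma, mu + 4 * sigma, length.out = grid_size)
  pdf <- dnorm(x, mean = mu, sd = sigma)
  # Approximate cdf on the grid, scaled to end at 1
  cdf <- cumsum(pdf) / sum(pdf)
  # Match each uniform draw to the first grid point whose cdf exceeds it
  u <- runif(n)
  idx <- findInterval(u, cdf) + 1
  x[pmin(idx, grid_size)]
}

set.seed(1)  # arbitrary seed, only for reproducibility
sims <- simulate_carries(10000)
c(mean = mean(sims), sd = sd(sims))  # should land near 5.3 and 1.7
```

If using the built-in inverse cdf is acceptable, qnorm(runif(n), mean = 5.3, sd = 1.7) performs the same matching step without a grid.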
