I might be overthinking this. I generated the output in R and 5 of my 10 samples were successful, so that's 50%. Given that, if I am to estimate the probability of two or more people in a group of 30 sharing a birthday, what is my total sample? Should I be using combinations?


#### Best Answer

How are you generating your birthdays? To generate 23 birthdays:

`dates = sample(1:365, 23, replace = TRUE) `

To see if 2 or more share the same birthday:

`length(dates) != length(unique(dates)) # TRUE if there are duplicates `

How often is the above TRUE?

```r
dupe_count = 0
runs = 1000000

for (i in 1:runs) {
  # Draw 23 birthdays and count the run if any date repeats
  dates = sample(1:365, 23, replace = TRUE)
  if (length(dates) != length(unique(dates))) {
    dupe_count = dupe_count + 1
  }
}

print(dupe_count / runs)
# [1] 0.508158
```
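For the group of 30 in the question, the same simulation can be wrapped in a function that takes the group size as an argument. This is only a sketch: `simulate_shared_birthday` is a hypothetical helper name, and it assumes 365 equally likely birthdays, as the code above does. The "total sample" is simply the number of simulated groups (`runs`), so no combinations are needed for the estimate.

```r
# Hypothetical helper (not from the original answer): estimates the
# probability that at least two of n people share a birthday.
simulate_shared_birthday <- function(n, runs = 100000) {
  # Each replicate draws n birthdays; anyDuplicated() returns 0 when all differ.
  hits <- replicate(runs, anyDuplicated(sample(1:365, n, replace = TRUE)) > 0)
  mean(hits)  # proportion of simulated groups with a shared birthday
}

set.seed(42)  # arbitrary seed, only so repeated runs give the same numbers
simulate_shared_birthday(23)  # should land near 0.51
simulate_shared_birthday(30)  # should land near 0.71
```

With only 10 simulated groups, as in the question, the estimate is very noisy; increasing `runs` tightens it.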

This closely matches the theoretical value of about 50.7% for 23 people given on the Wikipedia page for the birthday problem.
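The theoretical value can also be computed directly, as one minus the probability that all birthdays are distinct. The following is a minimal sketch under the same assumptions (365 equally likely days, no leap years); `exact_shared_birthday` is a hypothetical name used for illustration.

```r
# Hypothetical helper: exact probability that at least two of n people
# share a birthday, assuming 365 equally likely birthdays.
exact_shared_birthday <- function(n) {
  # P(all distinct) = (365/365) * (364/365) * ... * ((365 - n + 1)/365)
  p_distinct <- prod((365 - 0:(n - 1)) / 365)
  1 - p_distinct
}

exact_shared_birthday(23)  # about 0.507
exact_shared_birthday(30)  # about 0.706
```

Base R's `pbirthday(23)` from the `stats` package should give the same value, which is a handy cross-check on the simulation.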