Birthday paradox: How to estimate the probability of two or more people in a group of 30 sharing a birthday

I might be overthinking this. I generated the output in R, and 5 of my 10 samples were successful (at least two people shared a birthday), so that's 50%. Given that, if I want to estimate the probability of two or more people in a group of 30 sharing a birthday, what is my total sample? Should I be using combinations?

How are you generating your birthdays? To generate 23 birthdays:

dates = sample(1:365, 23, replace = TRUE) 

To see if 2 or more share the same birthday:

length(dates) != length(unique(dates)) # TRUE if there are duplicates 

How often is the above TRUE?

dupe_count = 0
runs = 1000000
for (i in 1:runs) {
  dates = sample(1:365, 23, replace = TRUE)
  if (length(dates) != length(unique(dates))) {
    dupe_count = dupe_count + 1
  }
}
print(dupe_count / runs)

[1] 0.508158
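The same idea carries over to the group of 30 in the question. The sketch below is one way to write it, not the only one: the function names `has_shared_birthday` and `estimate_shared` and the seed are illustrative assumptions, and it uses `anyDuplicated()` in place of the `length(unique(...))` comparison. It also reports a rough standard error for the estimate, which suggests why 10 samples are too few to rely on: the "total sample" is simply the number of simulated groups, and the estimate is the fraction of those groups that contain a duplicate.

```r
# Sketch: estimate P(at least one shared birthday) in a group of n people.
# Function names and the seed are illustrative, not from the original post.
set.seed(1)  # arbitrary seed so repeated runs give the same numbers

has_shared_birthday <- function(n) {
  dates <- sample(1:365, n, replace = TRUE)
  anyDuplicated(dates) > 0  # TRUE if any birthday appears more than once
}

estimate_shared <- function(n, runs) {
  hits <- replicate(runs, has_shared_birthday(n))
  p_hat <- mean(hits)                        # successes / total simulated groups
  se <- sqrt(p_hat * (1 - p_hat) / runs)     # approximate binomial standard error
  c(estimate = p_hat, std_error = se)
}

print(estimate_shared(30, 10))      # very noisy with only 10 groups
print(estimate_shared(30, 100000))  # much tighter estimate
```

With only 10 groups the standard error is on the order of 0.1 or more, so an observed 50% is not surprising even if the true probability is noticeably higher.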

This closely matches the theoretical value of 50.7% for a group of 23 given on the Wikipedia page.
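The theoretical value can also be computed directly, which gives something to check a simulation against. A minimal sketch, assuming 365 equally likely birthdays and ignoring leap years: the probability that all n birthdays are distinct is (365/365) × (364/365) × ... × ((365 - n + 1)/365), and the probability of at least one match is one minus that. The name `p_shared_exact` is illustrative; `pbirthday()` is the built-in function from R's stats package.

```r
# Sketch: exact probability of at least one shared birthday among n people,
# assuming `days` equally likely birthdays (p_shared_exact is an illustrative name).
p_shared_exact <- function(n, days = 365) {
  # P(all distinct) = product over k = 1..n of (days - k + 1) / days
  1 - prod((days - seq_len(n) + 1) / days)
}

print(p_shared_exact(23))  # about 0.507
print(p_shared_exact(30))  # about 0.706

# Built-in equivalent from the stats package
print(pbirthday(30))
```

So for the group of 30 in the question, a long simulation should settle near 70.6%, not 50%.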
