# Solved – Comparing two lists to measure similarity

The problem I am trying to solve is estimating the probability that two people are actually the same person by cross-referencing their associates. For example, suppose Person A is associated with the following people:

• Jeff
• Rick
• Jessica
• Mary

Person B is associated with the following people:

• Ryan
• Mary
• Dennis
• Scott
• Jeff
• Sharon
• Rick
• Larry
• James

So these two people have the following people in common:

• Mary
• Jeff
• Rick

How would I go about figuring out the likelihood that Person A and Person B are the same person, based on their common relationship with the three people above? There are three factors I can see right now, but I don't know how to weight any of them:

1. Ratio of common associates (counted twice, once from each side) over the combined number of associates of both people
2. Ratio of common associates over the number of associates for Person A
3. Ratio of common associates over the number of associates for Person B
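
To make the three ratios concrete, here is a minimal Python sketch that computes them for the example lists above. The function name `overlap_ratios` and the dictionary keys are only illustrative, and it assumes names are unique, so each list can be treated as a set. The first ratio, as I've defined it, appears to match what is usually called the Sørensen-Dice coefficient. None of these numbers is a probability by itself; they are just the raw overlap measures.

```python
# Associates from the example above, treated as sets (assumes names are unique).
person_a = {"Jeff", "Rick", "Jessica", "Mary"}
person_b = {"Ryan", "Mary", "Dennis", "Scott", "Jeff",
            "Sharon", "Rick", "Larry", "James"}


def overlap_ratios(a, b):
    """Return the common associates and the three ratios from the list above."""
    common = a & b  # set intersection: associates shared by both people
    n = len(common)
    return {
        "common": common,
        # Ratio 1: common associates counted from both sides, over the combined count.
        "both_sides": 2 * n / (len(a) + len(b)) if (a or b) else 0.0,
        # Ratio 2: share of Person A's associates that are common.
        "share_of_a": n / len(a) if a else 0.0,
        # Ratio 3: share of Person B's associates that are common.
        "share_of_b": n / len(b) if b else 0.0,
    }


ratios = overlap_ratios(person_a, person_b)
print(sorted(ratios["common"]))          # ['Jeff', 'Mary', 'Rick']
print(round(ratios["both_sides"], 3))    # 0.462  (2 * 3 / (4 + 9))
print(round(ratios["share_of_a"], 3))    # 0.75   (3 / 4)
print(round(ratios["share_of_b"], 3))    # 0.333  (3 / 9)
```

For this example the three ratios disagree quite a bit (0.462, 0.75, and 0.333), which is exactly why I'm unsure how to weight them.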

I'm not a statistician, so I don't know if what I've presented is the correct way to solve the problem. Can anyone provide some guidance?
