Imagine I have a network, and in this network I have $n$ nodes of interest.
I calculated the average shortest path (asp) between these nodes.
I want to see if they are statistically significantly close to each other.
Therefore I select $n$ nodes from the network at random and work out their average shortest path length. I repeat this 1000 times, so I have the asp for my nodes of interest and a vector of 1000 asps for randomly selected nodes.
Question: How can I compare the two? Should I just count how many times the asp of my nodes of interest is less than the asp of the randomly selected nodes, and then divide by 1000 to get a p-value?
Or should I compare the asp of the nodes of interest with the vector of 1000 random asps using e.g. t.test or Wilcoxon rank sum?
I think you should build an interval from your 1000 random draws (loosely, a bootstrap-style interval). Sort your 1000 randomly sampled asps. The central 95% interval runs from the 25th smallest value to the 25th largest value. If the asp for your $n$ nodes of interest lies outside that interval, then it is significantly different (at roughly the 5% level, two-sided) from what you would expect for a random selection of $n$ nodes.
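To make the procedure concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the graph is a toy 30-node ring stored as an adjacency dict, the function names (`bfs_distances`, `average_shortest_path`, `null_asps`, `central_interval`) are made up, and the graph is assumed to be connected and unweighted. It is not the asker's network or code.

```python
import random
from collections import deque
from itertools import combinations


def bfs_distances(adj, source):
    """Hop distances from source to every reachable node (unweighted graph)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def average_shortest_path(adj, nodes):
    """Mean shortest-path length over all pairs in `nodes` (assumes a connected graph)."""
    nodes = list(nodes)
    dist = {u: bfs_distances(adj, u) for u in nodes}
    pairs = list(combinations(nodes, 2))
    return sum(dist[u][v] for u, v in pairs) / len(pairs)


def null_asps(adj, n, reps=1000, seed=0):
    """asp for `reps` random node sets of size n, drawn without replacement."""
    rng = random.Random(seed)
    population = sorted(adj)
    return [average_shortest_path(adj, rng.sample(population, n)) for _ in range(reps)]


def central_interval(null, tail=25):
    """Interval from the tail-th smallest to the tail-th largest value."""
    ranked = sorted(null)
    return ranked[tail - 1], ranked[-tail]


# Toy example (hypothetical data): a ring of 30 nodes, with four adjacent nodes of interest.
ring = {i: [(i - 1) % 30, (i + 1) % 30] for i in range(30)}
observed = average_shortest_path(ring, [0, 1, 2, 3])
null = null_asps(ring, n=4, reps=1000, seed=0)
low, high = central_interval(null)
print(observed, low, high, observed < low or observed > high)
```

With 1000 draws, `tail=25` leaves 2.5% in each tail. For a real network you would swap the toy ring for your own adjacency structure, or use a graph library's shortest-path routine instead of the hand-written BFS.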
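Alternatively, if you want a rough p-value, rank your asp of interest among the 1000 randomly sampled asps, with rank 1 being the smallest. If it is below the median, divide its rank by 500. If it is above the median, subtract its rank from 1000 and divide the result by 500. The resulting number is your two-sided p-value. Since you specifically want to know whether your nodes are closer than random, the one-sided version is simply the rank divided by 1000.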
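A rank-based p-value of this kind can be sketched as a simple count of how many random asps are at least as extreme as the observed one. The function name `empirical_p_values` and the toy numbers below are assumptions for illustration. The sketch also adds 1 to both numerator and denominator, a common convention for Monte Carlo tests that keeps the p-value from being exactly zero; that correction is my addition, not part of the recipe above.

```python
def empirical_p_values(observed, null):
    """One-sided and two-sided Monte Carlo p-values for `observed` against `null`."""
    b = len(null)
    n_low = sum(1 for x in null if x <= observed)   # random asps as small or smaller
    n_high = sum(1 for x in null if x >= observed)  # random asps as large or larger
    p_lower = (n_low + 1) / (b + 1)    # evidence that the nodes are closer than random
    p_upper = (n_high + 1) / (b + 1)   # evidence that the nodes are farther than random
    p_two_sided = min(1.0, 2 * min(p_lower, p_upper))
    return p_lower, p_upper, p_two_sided


# Toy example (made-up values standing in for 1000 random asps).
toy_null = list(range(1, 1001))
p_lower, p_upper, p_two_sided = empirical_p_values(10, toy_null)
print(p_lower, p_upper, p_two_sided)
```

For the question as asked (are the nodes of interest closer than random?), `p_lower` is the relevant quantity. A t-test or Wilcoxon test is not needed here, because the 1000 random asps already serve as the reference distribution for a single observed value.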