I have two random variables (say x1 and x2) defined by empirical probability distributions, and would like to calculate the median of their sum.
Under what circumstances (in terms of the distributions of x1 and x2) can I assume the median of the sum is equal to the sum of the medians i.e.
median(x1) + median(x2). (1)
The alternative approach I've used is to randomly generate large samples of x1 and x2 and then calculate the median as
median(sample of x1 + sample of x2). (2)
Approach (1) is quicker and I need to do this calculation many times. Under what circumstances is approach (1) approximately correct? Are there alternatives to approach (2)?
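For concreteness, here is a minimal sketch of the two approaches. The data are hypothetical stand-ins (`x1_obs`, `x2_obs` and `n_draws` are names made up for illustration), and it assumes each empirical distribution is a vector of equally weighted observations and that x1 and x2 are independent. Under those same assumptions, one possible alternative to approach (2) is to take the median of all pairwise sums, which avoids random sampling altogether.

```r
set.seed(123)  # arbitrary seed, for reproducibility only

# Hypothetical empirical data standing in for the observed values of x1 and x2
x1_obs <- rexp(200, rate = 1)
x2_obs <- rlnorm(200, meanlog = 0, sdlog = 0.5)

# Approach (1): sum of the medians
med_1 <- median(x1_obs) + median(x2_obs)

# Approach (2): resample each empirical distribution independently,
# then take the median of the sums
n_draws <- 1e5
med_2 <- median(sample(x1_obs, n_draws, replace = TRUE) +
                sample(x2_obs, n_draws, replace = TRUE))

# Possible alternative (assumes independence): every pairwise sum is
# equally likely, so the median of all n1 * n2 sums needs no sampling
med_exact <- median(as.vector(outer(x1_obs, x2_obs, "+")))

c(approach_1 = med_1, approach_2 = med_2, pairwise = med_exact)
```

The pairwise version needs memory proportional to the product of the two sample sizes, so it is only practical when both empirical distributions are fairly small.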
I've seen this Q&A: "What does it mean if the median or average of sums is greater than sum of those of addends?"
Additional information after reading the comments:
If we have two normally distributed random variables, then the median of the sum is approximately the sum of the medians:
    N1 <- rnorm(10000, mean = 1, sd = 0.1)
    N2 <- rnorm(10000, mean = 0)
    # We expect an answer of 1 and get close
    median(N1) + median(N2)
    #[1] 0.9918688
    median(N1 + N2)
    #[1] 0.9962555
This doesn't work for exponential variables:

    set.seed(2002)
    e1 <- rexp(100000, 1)
    e2 <- rexp(100000, 1)
    median(e1) + median(e2) # expect 2 * log(2) = 1.386 and get 1.374
    median(e1 + e2)         # expect 1.678 and get 1.668
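The "expect" values in the comments can be checked without simulation. This is a small sketch, assuming the two variables are independent with rate 1: in that case their sum follows a Gamma distribution with shape 2 and rate 1, so both theoretical medians come straight from R's quantile functions. The names `sum_of_medians` and `median_of_sum` are mine.

```r
# Theoretical medians behind the "expect" comments above.
# Assumption: e1 and e2 are independent Exponential(rate = 1) variables.
sum_of_medians <- 2 * qexp(0.5, rate = 1)           # 2 * log(2), about 1.386
median_of_sum  <- qgamma(0.5, shape = 2, rate = 1)  # Gamma(2, 1) median, about 1.678

# The gap does not shrink with sample size: it is a property of the
# distributions themselves, not sampling noise
median_of_sum - sum_of_medians
```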
So, looking at @glen_b's comment, is symmetry a sufficient condition for assuming that the median of the sum is the sum of the medians?
Best Answer
Actually, my comment is not entirely correct; allow me to clear it up.
The median of a series of numbers $X$ is calculated by ordering all the numbers from smallest to largest and then taking the number in the middle. When you add $Y$ to the numbers in $X$, you generally also change their ordering, so a different observation ends up in the middle and the median changes. Therefore, in general, you should assume that:

$$ \text{MED}(X + Y) \neq \text{MED}(X) + \text{MED}(Y) $$

However, there is at least one exception: whenever $X$ and $Y$ are ordered the same way, so that adding $Y$ to $X$ does not change the ordering of $X$, the middle observation stays the same and the two sides are equal. For instance, this happens when $Y$ is a copy of $X$; see this example (written in R):
    set.seed(42)
    n <- 100
    x <- rnorm(n)
    c <- x
    y <- rnorm(n)
    median(x+y)            # 0.0767433
    median(x) + median(y)  # 0.02050838
    median(x + c)          # 0.1795935
    median(x) + median(c)  # 0.1795935
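The ordering argument can be checked on a second vector that is not a copy of `x` but is sorted the same way. This is a sketch under my own setup (the seed, the odd `n`, and the names `y_same` and `y_shuf` are illustrative choices, not part of the original example): with both vectors sorted, the middle element of the sum is the sum of the two middle elements, and shuffling one of them generally breaks the equality.

```r
set.seed(1)   # arbitrary seed, for reproducibility only
n <- 101      # odd, so the median is a single observed value

x      <- sort(rnorm(n))
y_same <- sort(rexp(n))    # different values, but same ordering as x
y_shuf <- sample(y_same)   # same values, ordering shuffled relative to x

# Same ordering: the middle element of x + y_same is the sum of the middle elements
same_order_gap <- median(x + y_same) - (median(x) + median(y_same))

# Shuffled ordering: a different observation usually lands in the middle
shuffled_gap <- median(x + y_shuf) - (median(x) + median(y_shuf))

c(same_order = same_order_gap, shuffled = shuffled_gap)
```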