I am planning to run a 2x2x3 repeated measures ANOVA on reaction time data collected pre and post treatment. However, my data are skewed and I want to apply a log transformation, but I have a few questions.

1. Can I apply the log transformation to the means that I am performing the ANOVA on?
2. Or should I perform the log transformation on the raw data, then compute means for each participant, and then run the ANOVA on the means of the log-transformed data?
3. Or should I log-transform the raw data, compute the mean, back-transform that mean, and run the ANOVA on that?

And for the graphs, can I use the back-transformed means, depending on which option is best?

Thank you all,


#### Best Answer

- Should I perform the log transformation on the raw data, then compute means for each participant, and then run the ANOVA on the means of the log-transformed data?

This is the one you should do.
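As a rough sketch of that order of operations (log each raw trial first, then average within each participant and cell), something like the following could work. The tuple layout, factor names, and reaction times are all made up for illustration; your own data structure will differ.

```python
import math
from collections import defaultdict
from statistics import mean

# Hypothetical trial-level data: (participant, time, condition, rt_ms).
# Factor names and values are invented for illustration only.
trials = [
    ("p1", "pre", "A", 412.0),
    ("p1", "pre", "A", 530.0),
    ("p1", "post", "A", 388.0),
    ("p1", "post", "A", 451.0),
    ("p2", "pre", "A", 620.0),
    ("p2", "pre", "A", 980.0),
    ("p2", "post", "A", 540.0),
    ("p2", "post", "A", 615.0),
]


def log_cell_means(rows):
    """Log-transform each raw RT, then average within participant x cell."""
    logs_by_cell = defaultdict(list)
    for participant, time, condition, rt_ms in rows:
        logs_by_cell[(participant, time, condition)].append(math.log(rt_ms))
    return {cell: mean(values) for cell, values in logs_by_cell.items()}


cell_means = log_cell_means(trials)
for cell, value in sorted(cell_means.items()):
    print(cell, round(value, 4))
```

The resulting per-participant cell means (in log milliseconds) are what would go into the repeated measures ANOVA.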

Choices 1 and 3 are not useful. The log of a mean is not the mean of the logs, so doing either of them even approximately correctly would require some mathematical derivation.
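To see that the three choices really produce different numbers, here is a small illustration on one participant's trials in one cell. The reaction times are hypothetical.

```python
import math
from statistics import mean

# Hypothetical raw reaction times (ms) for one participant in one cell.
rts = [300.0, 320.0, 350.0, 400.0, 520.0, 900.0]

# Choice 1: average the raw data, then take the log of that mean.
log_of_mean = math.log(mean(rts))

# Choice 2: take the log of each raw value, then average.
mean_of_logs = mean([math.log(x) for x in rts])

# Choice 3: back-transform the mean of the logs (this is the geometric
# mean, so the value is back on the millisecond scale).
back_transformed = math.exp(mean_of_logs)

print("log of mean:     ", round(log_of_mean, 4))
print("mean of logs:    ", round(mean_of_logs, 4))
print("back-transformed:", round(back_transformed, 1), "ms")
print("arithmetic mean: ", round(mean(rts), 1), "ms")
```

For any data that are not all identical, the mean of the logs is smaller than the log of the mean (Jensen's inequality), and the back-transformed value is smaller than the arithmetic mean.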

What you are doing with the log transformation is attempting to rescale the data in a way that the mathematical and probabilistic model behind the analysis of variance is at least plausible.

The primary issue is that the mathematical model assumes independent errors with equal variances across the various cell combinations. Since you have repeated measures, the independence assumption is relaxed in some way, and the correlation among measurements from the same participant is taken into account.

What you hope is that, when you model the transformed data, your residuals look reasonably as though they came from similar normal distributions for each factor combination.
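A crude way to get a feel for this is to compare the spread and skewness of cell-wise residuals on the raw and log scales. The sketch below uses two hypothetical cells and simple deviations from each cell mean; it ignores participant effects, so it is only a stand-in for the residuals of your actual repeated measures model.

```python
import math
from statistics import mean, stdev

# Hypothetical reaction times (ms) for two cells of the design.
cells = {
    "A": [300.0, 320.0, 350.0, 400.0, 520.0, 900.0],
    "B": [420.0, 450.0, 480.0, 560.0, 700.0, 1300.0],
}


def residuals(values):
    """Deviations of each observation from its cell mean."""
    m = mean(values)
    return [v - m for v in values]


def skewness(values):
    """Moment-based sample skewness (third moment / second moment ** 1.5)."""
    m = mean(values)
    m2 = mean([(v - m) ** 2 for v in values])
    m3 = mean([(v - m) ** 3 for v in values])
    return m3 / m2 ** 1.5


def summarize(data):
    """Ratio of largest to smallest cell SD, and skewness of pooled residuals."""
    per_cell = {name: residuals(vals) for name, vals in data.items()}
    sds = [stdev(res) for res in per_cell.values()]
    pooled = [r for res in per_cell.values() for r in res]
    return {"sd_ratio": max(sds) / min(sds), "skewness": skewness(pooled)}


raw = summarize(cells)
logged = summarize({k: [math.log(v) for v in vals] for k, vals in cells.items()})
print("raw scale:", raw)
print("log scale:", logged)
```

An SD ratio closer to 1 and a skewness closer to 0 on the log scale would suggest the transformation is doing its job. Plots of the residuals (histograms, Q-Q plots) are usually more informative than these two numbers.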

- And for the graphs, can I use the back-transformed means, depending on which option is best?

You can, but you should label things very carefully. Even then, be prepared for people to find fault (e.g., because your graphs don't match the raw-scale means, or because your confidence intervals are asymmetric).
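The asymmetry comes from exponentiating an interval that is symmetric on the log scale. A minimal sketch, assuming hypothetical per-participant means of log RT for a single cell and a normal approximation for the interval (a t quantile would suit a small sample better, and a within-subject interval would need more care):

```python
import math
from statistics import NormalDist, mean, stdev

# Hypothetical per-participant means of log RT (log ms) for one cell.
log_means = [5.95, 6.02, 6.10, 6.21, 6.35, 6.48]

n = len(log_means)
center = mean(log_means)
se = stdev(log_means) / math.sqrt(n)

# Normal approximation for a 95% interval; this is a simplification.
z = NormalDist().inv_cdf(0.975)

# The interval is symmetric on the log scale...
log_lower, log_upper = center - z * se, center + z * se

# ...but not after back-transforming to milliseconds.
point = math.exp(center)
lower = math.exp(log_lower)
upper = math.exp(log_upper)

print("back-transformed mean:", round(point, 1), "ms")
print("95% interval:", round(lower, 1), "to", round(upper, 1), "ms")
print("distance below:", round(point - lower, 1), "distance above:", round(upper - point, 1))
```

The upper arm of the back-transformed interval is always longer than the lower arm, which is worth stating in the figure caption.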

Your other choice is to do the analysis correctly on the transformed data but provide graphs with summary statistics calculated on the raw data. Again, you should label things very carefully and be prepared for people to find fault (e.g., because your graphs don't match the analysis).