While I find statistics interesting, I don't understand enough yet to argue my case. I think this is because I need a 'real world' example to help me understand: heights and weights of various breeds of dog or whatever just aren't helping me reach that eureka moment.
Imagine you and I were to meet and I presented you with the graph below. Assuming you understand statistics, I would like to give you the statistical information you would expect, but in such a way that things would be as clear as possible even if you didn't understand a great deal about statistics.
Some relevant information:
r = 0.8692
t = 39.467
Data points = 506 pairs.
Perhaps it is better if I state the following about the graph:
My hypothesis is that there will be no similarity between two sets of variables for two given dates.
The time series shown in the chart are similar. A little less than 87% of the time there is a correlation between them. This is unlikely to be a coincidental correlation because the t-value is 39.5 ‘Somethings or WhatEvers (maybe deviations)’ away from standard deviation.
I can therefore reject my hypothesis and can say there is a similarity between the variables.
NB – I have calculated r and t and I’m sure my figures are accurate based on dropping my raw data into various online calculators that produce the same values as I do.
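For readers who want to check the figures, the t value quoted above is consistent with the usual statistic for testing a zero correlation, t = r·sqrt(n − 2)/sqrt(1 − r²). The post does not say which formula was used, so this is a minimal sketch under that assumption; the function name `t_from_r` is made up for illustration. The formula also assumes independent observations.

```python
import math

def t_from_r(r: float, n: int) -> float:
    """t statistic for H0: correlation = 0, given sample correlation r and n pairs.

    Assumes the textbook formula t = r * sqrt(n - 2) / sqrt(1 - r^2),
    which in turn assumes the n pairs are independent observations.
    """
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

# Values quoted in the post
r = 0.8692
n = 506

t = t_from_r(r, n)
print(f"t = {t:.3f} on {n - 2} degrees of freedom")
```

With the rounded r from the post this gives roughly 39.5, matching the quoted t to within rounding.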
My specific questions are
If my goal is to argue whether the lines are similar or not, what is my hypothesis?
Is it ‘The red and blue lines are unlikely to be similar’ and then ‘prove’ this is / is not the case?
My y axis is marked as ‘z-Score’. I have normalized my data and that is what is displayed in the chart. Is ‘z-Score’ appropriate as a y axis label, or should I mark it as ‘Normalized’ / ‘Standardized’? I want to point out / make it clear that the data has been transformed.
Regarding the r value. Can I express this value as a percentage i.e. round it to 87% … is it appropriate to do this?
I think I should also calculate the p value, but have avoided it so far because I’m unsure whether to do a one-tailed or two-tailed test. Is it ‘safer’ to calculate a two-tailed test as a default?
What would a p-value actually indicate to me? How would I express that in plain English?
The questions you have asked cover a large fraction of any introductory course in statistics, so it's unlikely either that anyone will write you the lengthy personal tutorial you need or that everything in any answer will make sense to you.
To cover some of your points:
1. In statistical jargon, you have two variables, one a time series for one day and another a time series for another day. You do not have "two sets of variables".
2. Your calculation of a correlation between the series gives you a measurement of the strength of a linear relation between the two variables. It is not a measurement of agreement. By calculating z-scores you washed out differences in means and SDs and so made a measurement of agreement impossible. Whether that is a problem for you I can't say. The word "similarity" is ambiguous here, as between a linear relation $y = a + bx$ and agreement $y = x$.
3. Interpreting a correlation as the percent of the time there is a relation between variables is utterly nonsensical: there is no sense in which a relationship flips on and off between the variables for this kind of data. (Also, negative correlations alone would make this interpretation nonsensical.)
4. Independently of the previous point, it is not conventional at all to present correlations as percents. They are conventionally presented on a scale from -1 to +1. (The existence of negative correlations is one excellent reason for this convention.)
5. Strictly speaking, any P-value calculation is compromised here because the two variables are themselves time series, so successive observations are not independent. (It wouldn't be surprising if this wasn't clear to you; it can't be easily explained at a very introductory level. But the warning is: do not take the P-values at all literally.)
6. Given #5 above, I don't know how you would test a hypothesis here without an appropriate time series model. (The same parenthetical comment applies.)
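Two of these points can be checked numerically: that correlation measures linear relation rather than agreement, and that P-values for correlations between time series should not be taken literally. The following is a rough sketch on simulated data, not on the data in the question. The seed, the series length of 506, the number of trials and the helper names are all arbitrary choices made for illustration, and 1.96 is used as an approximate two-sided 5% critical value.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only

# --- Correlation versus agreement ---
x = rng.normal(size=506)
y = 10.0 + 3.0 * x                       # exact linear relation y = a + bx, far from y = x
r_linear = np.corrcoef(x, y)[0, 1]       # 1 up to floating-point error
mean_abs_gap = np.mean(np.abs(y - x))    # large: the series do not agree

def zscore(v):
    """Standardize to mean 0 and SD 1."""
    return (v - v.mean()) / v.std()

# After standardizing, offset and scale are gone and the curves coincide.
max_gap_after_z = np.max(np.abs(zscore(x) - zscore(y)))

# --- Nominal 5% test applied to unrelated series ---
def critical_r(n, t_crit=1.96):
    """|r| at which t = r*sqrt(n-2)/sqrt(1-r^2) reaches t_crit."""
    return t_crit / np.sqrt(n - 2 + t_crit**2)

def rejection_rate(make_series, n=506, trials=2000):
    """Share of independent series pairs declared 'significantly' correlated."""
    r_crit = critical_r(n)
    hits = 0
    for _ in range(trials):
        a, b = make_series(n), make_series(n)
        if abs(np.corrcoef(a, b)[0, 1]) > r_crit:
            hits += 1
    return hits / trials

def white_noise(n):
    return rng.normal(size=n)             # independent observations

def random_walk(n):
    return np.cumsum(rng.normal(size=n))  # strongly autocorrelated series

rate_noise = rejection_rate(white_noise)
rate_walk = rejection_rate(random_walk)

print(f"r for y = 10 + 3x: {r_linear:.4f}, mean |y - x|: {mean_abs_gap:.2f}")
print(f"max gap after z-scoring: {max_gap_after_z:.2e}")
print(f"false-positive rate, white noise: {rate_noise:.3f}")
print(f"false-positive rate, random walks: {rate_walk:.3f}")
```

For independent white noise the test rejects at close to the nominal 5%. For pairs of unrelated random walks it rejects far more often, which is why the t and P-values from the simple formula cannot be read literally when the observations within each series depend on one another. Random walks are an extreme case chosen to make the effect visible; how badly the test misbehaves for a given data set depends on how strongly it is autocorrelated.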
More broadly, wanting everything in plain English is understandable, but ultimately impossible. There wouldn't be a need for statistics if it could be conducted in plain English (or French, German, Mandarin, Klingon, …).