I have a question about logistic regression and 2-by-2 (or 3-by-2, or n-by-2) contingency tables; an example table can be found here:
http://en.wikipedia.org/wiki/Contingency_table
My question is when I have a table like this:
            Right-handed   Left-handed   Total
   Male          43              9          52
   Female        44              4          48
   Total         87             13         100
If I want to know whether gender is associated with being right-handed or left-handed, I can calculate the odds ratio as an indication of whether males are more likely to be left-handed:
(9/52) / (4/48) = 2.07
Then I can also use the Pearson chi-square test to see whether there is a significant association (i.e., p-value < 0.05), right?
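For example, I think the table test could be run straight from the cell counts; a rough sketch in Stata (using its immediate tabulate command, with rows male/female and columns right-/left-handed):

    // Pearson chi-square test directly from the 2x2 cell counts
    // rows: male, female; columns: right-handed, left-handed
    tabi 43 9 \ 44 4, chi2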
But I could also create a data set with 100 people (52 male, 48 female), code an indicator variable as 1 = left-handed and 0 = right-handed, and then get the odds ratio from the output of a logistic regression, as sketched below.
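A rough sketch of what I have in mind, again in Stata (the variable names male, lefthand, and n are placeholders I made up):

    // build the 100-person data set from the cell counts, then fit the logistic regression
    clear
    input male lefthand n
    1 1  9
    1 0 43
    0 1  4
    0 0 44
    end
    expand n                  // one row per person: 9 + 43 + 4 + 44 = 100 observations
    logit lefthand male, or   // report the coefficient on male as an odds ratio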
So my question is: which is the right way to find out which group is more likely to be left-handed?
Also, could I get some opinions on the differences between the two approaches? Are they answering the same question?
1) What is the difference between the two approaches here? (What kind of question is each approach designed to answer?)
2) What is the difference between the p-value given by the test on the 2-by-2 contingency table and the p-value given by the significance test of the parameter estimate for gender (M/F) in the logistic regression?
Could someone kindly explain?
Best Answer
The odds ratio in that table is:
(9/43)/(4/44) = 2.30
What John_w computed was a risk ratio.
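To see the two quantities side by side, here is a quick check in Stata using only the numbers already in the table:

    // odds ratio: odds of left-handedness among males over odds among females
    di (9/43)/(4/44)    // 2.3023256
    // risk ratio: proportion left-handed among males over proportion among females
    di (9/52)/(4/48)    // 2.0769231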
Now you can see that the manually computed odds ratio is exactly the same as the one produced by logistic regression. Both tests you suggested test the null hypothesis that this odds ratio is equal to 1, and the p-values should be close enough that it does not matter which you use. For example, doing this in Stata gives the following output:
. // prepare the data
. clear
. input female right freq

          female      right       freq
  1. 0 1 43
  2. 0 0 9
  3. 1 1 44
  4. 1 0 4
  5. end

. label define female 0 "male" 1 "female"
. label value female female
. label variable female "respondent's sex"
.
. label define right 0 "left-handed" 1 "right-handed"
. label value right right
. label variable right "respondent's handeness"
.
. // tabulate
. tab female right [fw=freq], lr chi2

           |       respondent's
respondent |         handeness
    's sex | left-hand  right-han |     Total
-----------+----------------------+----------
      male |         9         43 |        52
    female |         4         44 |        48
-----------+----------------------+----------
     Total |        13         87 |       100

          Pearson chi2(1) =   1.7774   Pr = 0.182
 likelihood-ratio chi2(1) =   1.8250   Pr = 0.177

.
. // the odds ratio:
. di (9/43)/(4/44)
2.3023256

.
. // logistic regression
. logit right female [fw=freq], or nolog

Logistic regression                               Number of obs   =        100
                                                  LR chi2(1)      =       1.82
                                                  Prob > chi2     =     0.1767
Log likelihood = -37.726174                       Pseudo R2       =     0.0236

------------------------------------------------------------------------------
       right | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      female |   2.302326   1.468974     1.31   0.191     .6592751    8.040199
       _cons |   4.777778   1.751347     4.27   0.000      2.32921    9.800386
------------------------------------------------------------------------------
For fun I also asked for the likelihood-ratio chi-square statistic after tab, to show that that test is exactly the same as the one reported by logistic regression (labeled LR chi2(1) and Prob > chi2 in the output of logit).
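For completeness, a further check one could run after the logit above (a sketch, not shown in the output) is the Wald test of the female coefficient: its chi-square is just the square of the z statistic (1.31^2 ≈ 1.72), close to the Pearson 1.78 and likelihood-ratio 1.83 statistics.

    // Wald chi-square test of the female coefficient (the square of its z statistic)
    quietly logit right female [fw=freq]
    test female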