I am trying to assess a 20-item multiple-choice test. I want to perform an item analysis such as can be found in this example. So for each question I want the P-value, the correlation with the total, and the distribution of the options selected.
I don't know anything about the various statistical software packages out there, but I'd like to use R as I'm comfortable with programming and R is open source. The pseudo-workflow I envision is:
1. prepare data in Excel and export to CSV
2. load data in R
3. load a package that does what I need
4. execute that package's commands
5. export and report.
I am confident with 1 and 2 but am having trouble with 3, probably because I don't have the statistical vocabulary to compare the packages I browsed on CRAN. ltm looks like it could be the right package, but I can't tell. Whatever package is used, what would the commands be?
Side question: in the linked example, what do you suppose MC and MI stand for?
Best Answer
I can suggest at least two packages that can perform these tasks: psych (score.items) and ltm (descript). The CTT package also seems to handle MCQs, but I have no experience with it. More information can be found on W. Revelle's website, The Personality Project, especially the page dedicated to psychometrics with R, which provides step-by-step instructions for importing, analyzing, and reporting data. Also, the CRAN Task View on Psychometrics includes many additional resources.
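Whichever package is used, the lettered responses first have to be scored against an answer key. What follows is a minimal base-R sketch of that step, not the API of psych or ltm; the toy data, the `key` vector, and every object name are made up for illustration.

```r
# Hypothetical data: 6 examinees, 3 items, options A-D; NA marks an omitted answer.
# In practice this would come from something like read.csv("responses.csv")
# (the file name is an assumption).
resp <- data.frame(
  q1 = c("A", "B", "A", "A", "C", NA),
  q2 = c("D", "D", "B", "D", "D", "A"),
  q3 = c("C", "A", "C", "B", "C", "C"),
  stringsAsFactors = FALSE
)
key <- c(q1 = "A", q2 = "D", q3 = "C")   # assumed answer key

# Score each item: 1 = correct, 0 = wrong or omitted.
scored <- sapply(names(key), function(item) {
  as.integer(!is.na(resp[[item]]) & resp[[item]] == key[[item]])
})

# Item difficulty (P-value): proportion of examinees answering correctly.
p_value <- colMeans(scored)

# Distribution of the selected options per item, plus the number of omits.
options <- c("A", "B", "C", "D")
opt_counts <- t(sapply(resp, function(x) table(factor(x, levels = options))))
omits <- colSums(is.na(resp))

print(cbind(P = round(p_value, 2), OMIT = omits, opt_counts))
```

Omitted answers are counted as incorrect here; that is a choice made for this sketch, and treating them as missing instead would change the P-values.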
As described in your link, MC stands for "Mean total raw score of the persons who answered the item with the correct response", and MI for "Mean total score of the persons who did not answer the item with the correct response". The point-biserial correlation (R(IT)) is also available in the ltm package (biserial.cor). It is basically an indicator of the discrimination power of the item (since it is the correlation between item and total score), and is related to the discrimination parameter of a 2-PL IRT model or to the factor loading in factor analysis.
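To make the MC, MI, and R(IT) definitions concrete, here is a small base-R sketch on made-up 0/1 scored data. It uses plain cor() rather than ltm's biserial.cor, and all object names are illustrative only. It relies on the fact that the point-biserial correlation is the Pearson correlation between the dichotomous item score and the total score.

```r
# Hypothetical scored responses: rows = examinees, columns = items (1 = correct).
scored <- rbind(
  c(1, 1, 0, 1),
  c(0, 1, 0, 0),
  c(1, 1, 1, 1),
  c(0, 0, 0, 1),
  c(1, 0, 1, 1),
  c(0, 1, 0, 0)
)
colnames(scored) <- paste0("q", 1:4)
total <- rowSums(scored)   # total raw score per examinee

item_stats <- t(sapply(colnames(scored), function(item) {
  x <- scored[, item]
  c(P  = mean(x),                # item difficulty
    R  = cor(x, total),          # point-biserial item-total correlation
    MC = mean(total[x == 1]),    # mean total score, item answered correctly
    MI = mean(total[x == 0]))    # mean total score, item answered incorrectly
}))
print(round(item_stats, 3))

# Check the identity r_pb = (MC - MI) / s * sqrt(p * (1 - p)),
# where s is the population SD (divisor n) of the total score.
s_pop <- sqrt(mean((total - mean(total))^2))
r_formula <- (item_stats[, "MC"] - item_stats[, "MI"]) / s_pop *
  sqrt(item_stats[, "P"] * (1 - item_stats[, "P"]))
```

Note that the total score in this sketch includes the item itself; a corrected item-total correlation would use `total - x` instead.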
If you really want to reproduce the table you show, I guess you will have to wrap some of this code with custom code, at least to output the same kind of table. I've made a quick-and-dirty example that reproduces your table:
    dat <- replicate(10, sample(LETTERS[1:4], 100, rep=TRUE))
    dat[3,2] <- dat[67,5] <- NA
    itan(dat)

             P      R    MC    MI NC OMIT  A  B  C  D
     [1,] 0.23 -0.222 2.870 2.169 23    0 23 22 32 23
     [2,] 0.32 -0.378 3.062 1.985 32    1 32 20 14 33
     [3,] 0.18 -0.197 2.889 2.207 18    0 18 33 22 27
     [4,] 0.33 -0.467 3.212 1.896 33    0 33 18 29 20
     [5,] 0.27 -0.355 3.111 2.056 27    1 27 23 23 26
     [6,] 0.17 -0.269 3.118 2.169 17    0 17 25 25 33
     [7,] 0.21 -0.260 3.000 2.152 21    0 21 24 25 30
     [8,] 0.24 -0.337 3.125 2.079 24    0 24 32 22 22
     [9,] 0.13 -0.218 3.077 2.218 13    0 13 29 33 25
    [10,] 0.25 -0.379 3.200 2.040 25    0 25 25 31 19
As these are random responses, biserial correlation and item difficulty are not very meaningful (except to check that data are truly random :). Also, it is worth checking for possible errors, since I drafted the R function in 10'…