Designing and validating a Likert-type scale with odd or even response categories

As part of an advisory team at my educational institution for undergraduate nurses, I (among others) have been assigned to design a questionnaire assessing students' and academic staff's perceptions of whether the final-year exams are an adequate tool for evaluating students' knowledge and understanding, competence and skills, and judgment and approach (in other words, for judging whether they are ready to work safely in the real world).
The final questionnaire comprises 21 questions on the three basic concepts mentioned above. I proposed that the answers be given on a balanced 5-point Likert-type scale (strongly disagree to strongly agree, with a neutral midpoint). However, the rest of the team favours an unbalanced 4-point scale (disagree / agree a little / agree / strongly agree).

Question 1: Which scale would be more appropriate?

I also believe that the proposed tool needs to be validated and checked for reliability before it is administered to the target population. Most of my colleagues disagree, however, because the questions are derived from the corresponding 21 statements of our national regulatory board, according to which:

  1. "Nurses must demonstrate knowledge of planning of health care measures",
  2. "Nurses must demonstrate the ability to draw care plans",
  3. "Nurses must demonstrate the ability to apply their knowledge so as to deal with different situations" and so on.

The majority of the team is in favour of converting these statutory statements into a questionnaire as follows:

I believe that the test: Q1 "demonstrated knowledge of planning of health care measures", Q2 "demonstrated the ability to draw care plans",

and so on up to question 21, without checking for validity and reliability first.

Question 2: Shouldn't this tool first be tested for reliability and validity (convergent, discriminant and criterion validity, factor analysis, etc.)?

Q1. 4- or 5-point scale (strongly disagree to strongly agree, with or without a neutral midpoint)

A1. I do not think the choice between an even and an odd number of scale points has a definitive answer; there are arguments on both sides. Since you ultimately want a yes-or-no answer, the 4-point scale may suit your purpose better than a scale with a neutral midpoint.

Parenthetically, I would suggest that a convincing evaluation would also include a detailed review of the actual content of each final exam question, asking, for example: "Do you think a minimally competent nurse would answer this question incorrectly?" (This is the type of approach developed by Angoff at ETS many years ago. See this secondary source, for example.) Global opinions are open to halo effects and other types of bias.
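To make the item-by-item idea concrete, here is a minimal Python sketch of how Angoff-style judgments are commonly combined into a cut score. It assumes the usual formulation, in which each judge estimates the probability that a minimally competent candidate answers each item correctly (the mirror image of the "incorrectly" wording above). The function name `angoff_cut_score` and the example ratings are hypothetical, not data from this survey.

```python
from statistics import mean


def angoff_cut_score(ratings):
    """Combine Angoff-style judgments into a cut score.

    ratings[j][i] is judge j's estimated probability (0 to 1) that a
    minimally competent candidate answers item i correctly.
    Returns (cut_score, per_item_means).
    """
    if not ratings or not ratings[0]:
        raise ValueError("need at least one judge and one item")
    n_items = len(ratings[0])
    if any(len(row) != n_items for row in ratings):
        raise ValueError("every judge must rate every item")
    if any(not 0.0 <= p <= 1.0 for row in ratings for p in row):
        raise ValueError("ratings must be probabilities between 0 and 1")

    # Average across judges for each item, then sum across items.
    item_means = [mean(row[i] for row in ratings) for i in range(n_items)]
    return sum(item_means), item_means


# Hypothetical example: 2 judges rating 3 exam questions.
example_ratings = [
    [0.8, 0.6, 0.9],  # judge A
    [0.7, 0.5, 1.0],  # judge B
]
cut, per_item = angoff_cut_score(example_ratings)
print(f"expected score of a minimally competent candidate: {cut:.2f} of {len(per_item)}")
```

Items whose mean rating is very low are the ones a minimally competent nurse is expected to miss, so they are natural candidates for closer content review.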

Q2a. Should the reliability and convergent and discriminant validity be evaluated before the scale is used?

A2a. Inter-rater agreement can be evaluated after the data are collected. If agreement is low, the reasons can be probed, and if the survey has to be repeated as a result, that is not a costly undertaking. Convergent and discriminant validity are evaluated from correlations. In your case, however, those correlations may be driven more by how hard individual nursing students studied, how bright they are, etc. than by the design of the survey.
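As one possible way to quantify inter-rater agreement once responses are in, the sketch below computes Fleiss' kappa in plain Python. This is an illustrative choice rather than a recommendation specific to this survey: it assumes every statement is rated by the same number of respondents, and it treats the response categories as nominal, so it ignores the ordering of the scale points. The function name and the toy data are hypothetical.

```python
from collections import Counter


def fleiss_kappa(responses, categories):
    """Fleiss' kappa for agreement among several raters.

    responses[i] is the list of category labels that the raters gave
    to statement i; every statement needs the same number of raters.
    categories lists all allowed labels, e.g. [1, 2, 3, 4].
    """
    if not responses:
        raise ValueError("need at least one rated statement")
    n = len(responses[0])  # raters per statement
    if n < 2 or any(len(r) != n for r in responses):
        raise ValueError("each statement needs the same number (>= 2) of raters")
    allowed = set(categories)
    if any(label not in allowed for r in responses for label in r):
        raise ValueError("response label not listed in categories")

    # counts[i][j] = number of raters who put statement i in category j
    tallies = [Counter(r) for r in responses]
    counts = [[t[c] for c in categories] for t in tallies]
    n_statements = len(counts)

    # Observed agreement per statement, then averaged.
    p_i = [(sum(c * c for c in row) - n) / (n * (n - 1)) for row in counts]
    p_bar = sum(p_i) / n_statements

    # Chance agreement from the overall category proportions.
    p_j = [sum(row[j] for row in counts) / (n_statements * n)
           for j in range(len(categories))]
    p_e = sum(p * p for p in p_j)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return (p_bar - p_e) / (1.0 - p_e)


# Hypothetical toy data: 3 statements, 5 respondents, 4-point scale.
toy = [
    [4, 4, 4, 3, 4],
    [2, 2, 1, 2, 2],
    [3, 3, 3, 3, 4],
]
print(round(fleiss_kappa(toy, [1, 2, 3, 4]), 3))
```

Values near 1 indicate agreement well above chance and values near 0 indicate chance-level agreement. For ordered response categories, a weighted or ordinal agreement measure may be more appropriate than this nominal one.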

Q2b. Should the criterion validity of the scale be evaluated before the scale is used?

A2b. Criterion validity requires a criterion, and you do not have one (otherwise you would not be undertaking this survey-type evaluation). The best that can usually be done in this kind of undertaking is to establish content validity. (See, for example, the AERA/APA Standards, soon to be revised.)

Q3. Designing the questionnaire around the 21 statements of your national regulatory board.

A3. This is a great idea since it builds on an accepted statement of standards. Do you have alternatives to suggest?
