Thursday, February 28, 2013

Week 4

The part of Chapter 5 that I found most useful was the paragraph outlining how to increase the reliability of test scores. The most obvious approach described in the text is increasing the number of test items: for example, if the reliability of a 10-item test is .30, we can increase reliability to .68 by expanding the test from 10 to 50 items (a quick check of that arithmetic appears at the end of this post). Other ways of improving reliability include writing test items in clear language, using multiple-choice questions rather than essay questions, making sure items are neither too easy nor too difficult, having clear scoring procedures, and making sure individuals are trained before administering or scoring exams. I think these are guidelines we can follow when administering exams to students, and information we can share with other teachers who may ask us for advice in designing and administering effective exams. When choosing assessments, reliability is a factor that we will definitely want to consider, since the "higher the coefficient, the more reliable the test scores [will be]" (Drummond & Jones, 2010, p. 97).

Somewhat related to reliability is validity, which is "the degree to which all the accumulated evidence supports the intended interpretation of test scores for the proposed purpose of the test" (p. 100). While reliability concerns the consistency of the scores a test produces, validity concerns the interpretation of test results and how those interpretations are then used to make decisions about students. Evidence for validity is organized around five areas (test content, response processes, internal structure, relations to other variables, and consequences of testing), but Chapter 6 focuses on three: content, criterion-related, and construct validity evidence.

In attempting to understand these terms, I found the examples the authors provide very helpful. For instance, the sample table of specifications showing the content areas measured for sales performance demonstrated content validity evidence, the criterion measure examples included academic achievement and job performance, and the construct validity examples included group and age differentiation studies. The authors also define these terms nicely in the chapter summary, describing content validity as, of course, "focus[ing] on the content of the test," criterion-related validity as the "relationship between test results and external variables," and construct validity as the "appropriateness of inferences drawn from test scores as they relate to a particular construct" (p. 115). A small computational illustration of a criterion-related validity coefficient also follows at the end of this post.

Drummond, R. J., & Jones, K. (2010). Assessment Procedures for Counselors and Helping Professionals. Upper Saddle River, NJ: Pearson Education.
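
As a postscript, the jump from .30 to .68 is exactly what the Spearman-Brown prophecy formula predicts when a test is lengthened by a factor of five with comparable items. The passage I summarized doesn't name the formula, so attributing the example to it is my assumption, but the arithmetic checks out:

```latex
% Spearman-Brown prophecy formula: predicted reliability r_k when a test
% is lengthened by a factor k. Assumed (not stated in the text) to be the
% formula behind the 10-item (r = .30) to 50-item (r = .68) example, k = 5.
\[
r_k = \frac{k\,r}{1 + (k - 1)\,r}
    = \frac{5 \times 0.30}{1 + 4 \times 0.30}
    = \frac{1.5}{2.2} \approx 0.68
\]
```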

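Similarly, since criterion-related validity evidence is just the "relationship between test results and external variables," the coefficient itself can be computed as an ordinary Pearson correlation between test scores and the criterion measure. Here is a minimal sketch in Python; the exam scores and GPA values are hypothetical, invented purely to illustrate the calculation, not data from the textbook:

```python
# Criterion-related validity as a Pearson correlation between test scores
# and an external criterion (first-year GPA, standing in for the book's
# academic achievement example). All numbers here are hypothetical.

from statistics import correlation  # requires Python 3.10+

test_scores = [72, 85, 90, 64, 78, 88, 95, 70]             # hypothetical exam scores
first_year_gpa = [2.8, 3.4, 3.6, 2.5, 3.0, 3.5, 3.9, 2.9]  # hypothetical criterion values

validity_coefficient = correlation(test_scores, first_year_gpa)
print(f"criterion-related validity coefficient: {validity_coefficient:.2f}")
```

As with reliability, the closer the coefficient is to 1.0, the stronger the evidence that scores on the test relate to the criterion.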