Thursday, February 28, 2013
Week 4
The part of Chapter 5 that I found most useful was the paragraph outlining how to increase the reliability of test scores. The most obvious example outlined in the text involved increasing the number of test items. For example, if the reliability for a 10-item test is .30, we can increase reliability to .68 by increasing the number of items from 10 to 50. Other ways of improving reliability involve writing test items in clear language, using multiple-choice questions rather than essay questions, making sure items are neither too easy nor too difficult, having clear scoring procedures, and making sure individuals are trained before administering or scoring exams. I think these are guidelines we can follow when administering exams for students, and information we can share with other teachers who may ask us for advice in designing and administering effective exams. When choosing assessments, the reliability coefficient is a factor that we will definitely want to consider, as the "higher the coefficient, the more reliable the test scores" [will be] (Drummond & Jones, 2010, p. 97).
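The jump from .30 to .68 is what the Spearman-Brown prophecy formula predicts when a test is made five times as long. The paragraph above does not name that formula, so linking it to the textbook's numbers is my assumption, and the function name below is made up for illustration. A minimal sketch:

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability after lengthening a test by length_factor.

    Assumes the added items are comparable to the original ones
    (the standard assumption behind the Spearman-Brown formula).
    """
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# A 10-item test with reliability .30, lengthened to 50 items (5 times as long).
predicted = spearman_brown(0.30, 50 / 10)
print(round(predicted, 2))  # 0.68
```

The same function also shows the diminishing returns: each additional batch of items raises the predicted reliability by less than the batch before it.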
Somewhat related to reliability is validity, which is "the degree to which all the accumulated evidence supports the intended interpretation of test scores for the proposed purpose of the test" (p. 100). While reliability concerns the consistency of test scores, validity concerns the interpretation of test results and how they are then used to make decisions about students. Evidence for validity is organized around five areas (test content, response processes, internal structure, relations to other variables, and consequences of testing), but Chapter 6 focuses on three: content, criterion-related, and construct validity evidence. In attempting to understand these terms, I found the examples the authors provide very helpful. For instance, the sample table of specifications showing the content areas measured for sales performance demonstrated content validity, while criterion measure examples included academic achievement and job performance, and construct validity examples included group and age differentiation studies. The authors also define these terms nicely in the chapter summary, describing content validity as, of course, "focus[ing] on the content of the test," criterion-related validity as the "relationship between test results and external variables," and construct validity as the "appropriateness of inferences drawn from test scores as they relate to a particular construct" (p. 115).
Drummond, R. J. and Jones, K. (2010). Assessment Procedures for Counselors and Helping Professionals. Upper Saddle River, New Jersey: Pearson Education, Inc.
Reliability and Validity
This week I would first like to address our class activity
last week that was done in the computer lab.
The problems that were given are very familiar to me and actually bring
about some excitement. During both of my
previous statistics classes, I loved all the parts that related to math and
being able to perform calculations.
After my second class I was able to take it even a step further and
could then use those numbers to interpret data and my results. The more comfortable I become with this
information the better I feel about doing research and using data as a school
counselor. In class when I was able to
help my classmates with working Excel and successfully finishing all of the
problems, I had a great sense of pride and accomplishment. This was a very motivating feeling and
something that I have been looking for this year in the program. It also made me realize that if I had taken
Appraisal before Guidance Program Development things may have fit together
better for me. However, I do have the
advantage of understanding a greater picture the way I have completed the
classes. Appraisal fits into the overall
picture I have already created and it seems to be filling some of the holes. I am excited to see how much more confidence I can gain by the end of the semester and this class.
Both chapters for this week were again somewhat of a review
from my previous classes; however, they took a new approach that I was not as
familiar with. In the Reliability
chapter, the sources of measurement error were organized by time-sampling error,
content-sampling error, and interrater differences. This type of organization gave me a greater
understanding of reliability and the methods used in order to estimate
reliability. The Validity chapter also
organized the information in a more modern way than I had previously been
taught. I have always learned and
understood validity to be broken up into content, criterion, and construct
validity. Although the chapter did
include sections on all three, it described all three as falling under
construct validity. Construct validity
is used as an umbrella term which is broken down into five sources (Drummond
and Jones, 2010). It became clearer that
the purpose is to establish a relationship between assessment scores and the
other variables. We are trying to determine if the claims and decisions that
are made on the basis of a particular assessment are meaningful and useful for
what they are supposed to be accomplishing (Drummond and Jones, 2010). Another aspect that I appreciated from our
text was the brief discussion on the fairness of certain assessments. “Validity also refers to the adequacy and
appropriateness of the uses of assessment results” (Drummond and Jones, 2010,
p. 100). My recent work with multicultural
students and counseling has started to interest me in how fair certain parts of
the educational system are for their success.
The book points out that a lack of fairness is a lack of validity and
this would also show a lack of reliability.
This tells us that we should not be using this assessment to make
educated decisions about any student and in particular students from unique
backgrounds.
Drummond, R. J. and Jones, K. (2010). Assessment Procedures for Counselors and Helping Professionals. Upper Saddle River, New Jersey: Pearson Education, Inc.
Reliability and Validity
Chapters 5&6
When I was
reading the chapter on reliability I kept thinking, like most other people in
the class, about the most common form of reliability that we have all
experienced…the GRE/SAT. During both I was skeptical, no I was downright
confident that they would not yield results that were consistent with my actual
body of knowledge. I do understand that there
is legitimacy in the predictive validity of some students with their scores and
how their future GPAs will reflect their intelligence. I, however, am an
outlier for the reliability of both the SAT and GRE. For both I scored in the
lower-middle of the road area and should probably have only had a GPA that was
around the C+/B- range. I can proudly say that both in undergrad and here in
grad school this is not true. My GPA is near perfect and I work hard to keep
that. For me, there is no predictive validity between standardized tests and
how I actually perform.
Knowing that not
everyone is a good test taker, or will be able to adequately express their body
of knowledge on a standardized test will be part of our battle as school
counselors. It will be important to be able to help those students who are not
good test takers show their potential in different ways. Not to negate
standardized tests, but most people are just not wired to be extraordinarily
good at them. The fact that our education system places so much emphasis on
them and their “predictive quality” disturbs me to the core.
The more I self-reflect about this class and
my life, the more I draw conclusions similar to the ones I draw about validity and reliability:
it is all relative and can be affected by extraneous variables. No one in
the psychology field, or in many others, would ever argue that outside variables
never affect results. It would be silly to even consider this, so why then do we
place such emphasis on schools receiving funding based on standardized test
scores? Obviously there must be some measure to assess how schools are
achieving academic success, and while I have no suggestions for a better
system, I seriously disagree with what we are currently using.
I keep realizing more and more how much I have to learn. I feel so excited when I reflect on what I have learned so far, yet I am intimidated by all I feel I need to learn before I'll actually be doing this job. Hopefully Assessment helps me on my way!
Drummond,
R. J. and Jones, K. (2010). Assessment procedures for counselors and
helping professionals. (7th ed.). Upper Saddle River, NJ: Pearson.
Blog 4: Chapters 5 & 6
I
found the chapters on reliability and validity to be vital when learning about
the assessment process. Before reading
these chapters I was under the impression that reliability and validity went
hand in hand. Obviously it would be
ideal if you found a test that was both consistent as well as accurate;
however, Drummond and Jones (2010) state, “Reliability is a necessary but
insufficient condition for validity. A
measure that produces totally inconsistent results cannot possibly provide valid
reliable score interpretations. However,
no matter how reliable assessment results are, it is not a guarantee of validity”
(p.102). After learning that, I believe
that it is important that you look at all components of a test to ensure that
it fits the individual and/or group of students you are testing. This is imperative because the results affect
the person being tested. As educators,
we guide our teaching based on the results, and they direct us to which
services to provide to which students. Furthermore,
as future counselors, the results will help guide our program and ensure that
students are being provided with the support they need.
In
terms of reliability, it was stated to be the most important characteristic since
there are many decisions based on the results. I learned that there are many factors that may
affect the reliability and it is a good idea to keep these components in
mind. Two of the factors that stood out
to me were the content-sampling error and interrater differences. With content-sampling, I was able to relate
to this because I am working with a small group of teachers in order to create
a math screener. In order to ensure that
we administer this consistently, since there may be numerous teachers facilitating
the assessment, we need to make sure to adequately represent the content domain. We need to make sure that we are careful when
choosing the questions that will be tested for each grade level. Next, with interrater differences, I agree
that people view things differently.
When you have multiple people completing observations you have to be
clear about the expectations and both be on the same page.
Finally,
in terms of validity, Drummond and Jones (2010) state, “The Standards assert that validity is “the
most fundamental consideration in developing and evaluating tests”” (p.
102). When looking at tests, you want to make
certain that they neither underrepresent the content by including too little nor
introduce irrelevant material by being too broad. In
conclusion, it is important to make sure the assessment that is given is
appropriate for the audience. As a final
point, I believe that reliability and validity are terms that are important to
be familiar with because of the effects they have on
testing.
Drummond, R. J. & Jones, K.
(2010). Assessment Procedures for
Counselors and Helping Professionals (7th ed.). Upper Saddle
River, NJ: Pearson.
Blog #4
While I was reading about the chapter on reliability I thought about my junior year in high school and when I took the SATs. My parents made me sign up to take the test two times within a period of two months. Like most other high school students, I was incredibly nervous about taking the test. I was already a horrible test taker and even worse at taking standardized tests. Even with all of the nervousness and anxiety I was able to get through the test without completely melting down. I ended up doing exactly as well as I thought I would do, and I received a very average score.
My family kept telling me that I would do better the second time around because I knew how the test was structured, how the time intervals work, what I was most comfortable with, etc…I even spent those two months in between studying and preparing as much as I could. As a result, when I went to take the SATs the second time around I was so much more confident and ready to take the test. My family and my focus on preparation made most of my anxiety go away, and I was so much more relaxed. I took the test and even walked out that day feeling more confident than I ever did after taking a standardized test. A few weeks later I got my score back. It was 10 points lower than the first time that I took it. Obviously, I was incredibly surprised that I did better the first time around. To be honest, I was pretty pissed off about the fact that I spent all that time studying and preparing and it did not make one bit of difference. This is probably why I didn’t prepare as much as I should have for the GRE.
Some people could say that it was an absolute fluke that my scores on the two SAT attempts were within 10 points of each other. When I look back on it now, I really believe that I got around the best score that I possibly could have. If I had taken it five more times, I really believe that my scores would have been very similar. I know that people have many different experiences with taking the SATs, and some probably question the reliability and validity of the test. As much as I hate to admit it, in my experience, the test couldn’t have been any more reliable. I still don’t believe that one test should be a measure of where or if you go to college, but I will save that for another blog.
Drummond, R.J. & Jones, K. (2010). Assessment procedures for counselors and helping professionals (7th ed.). Upper Saddle River, New Jersey: Pearson Education Inc.
Wednesday, February 27, 2013
Reliability and Validity
Reliability and validity are both concepts that are not new
to me. However, after reading this
week’s chapters, I realized how much I really did not know about them! I also realized how relevant they are to my job at this point.
When reading about true ability vs. observed score in the
Reliability chapter, I immediately thought about a 7th grade student
that I have. He is very emotionally immature and has low confidence in his
abilities. He does not like school at all and states this often. From working with him, one would think
that he has low ability based on the way he performs in class both in
motivation/task completion and his grades. However, he was evaluated at the
beginning of this school year, and his full-scale IQ was a 95. His scores in all areas of reading,
written expression, and math were average on the KTEA-II. This seems like a great example of true
ability vs. an observed score, with the observed score being his performance in
class. The two are like polar
opposites for this student.
When reading about test-retest as a method of estimating
reliability, I wondered if the test that is given is the exact same test, or if
the test is just a similar one.
When reading about simultaneous administration on page 89 (Drummond
& Jones, 2010), I gather that the test-retest measure uses the same assessment. I wonder, then, how long they have
testers wait before re-taking the test?
Some people have extraordinary memories, and a curiosity that might
lead them to seek answers to particular questions they were asked on the
test. If they sought out information or practice on certain questions or skills
themselves, can the test-retest measure really show reliability for some groups
of people? Related to re-test
measures, the book stated that a limitation is that many
tests do not have alternate forms. When reading about this, I thought about my school’s re-take policy, which was implemented last
year. We
are required to allow students to re-take tests when they go through a
particular process. We were told
as teachers to have an alternate form of each test so that students who re-take
will take the alternate. I agree
with the book: this is extremely time-consuming! It is so time-consuming that I wonder just how reliable some of
the teacher-made alternate tests are.
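One common way to put a number on test-retest (or alternate-form) reliability is to correlate the scores from the two sittings. Here is a rough sketch of that idea; the scores are made up for five imaginary students, not data from the book or from my school, and `pearson_r` is just a helper name I chose.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for the same five students on two sittings.
first_sitting = [70, 82, 65, 90, 78]
second_sitting = [72, 80, 68, 91, 75]

print(pearson_r(first_sitting, second_sitting))
```

A coefficient close to 1 would mean students kept roughly the same rank order across sittings. It would not tell us whether memory or practice between sittings inflated that consistency, which is the concern raised above.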
When reading about predictive validity and the example of
SAT scores as a predictor of college success, I was thinking, “Yeah, but SAT
scores have nothing to do with personal choices about whether to focus more on
classes or partying.” I was not surprised to then read that while it can predict academic success, it’s a poor predictor
of morality (Drummond & Jones, 2010).
With anything there could be confounding variables that could skew the
correlation between two items.
Also related to confounding variables was the thought that, in group
differentiation studies, there is an expectation that children with ADHD would
score lower than children in the standardization sample. The special education
teacher in me wonders if a child’s environment is modified to limit distractions when taking this test. Are they given extended time to
complete the test? Do they have
ADHD paired with a learning disability?
These are all variables that I think would need to be considered. Lastly, relating to confounding
variables were my thoughts on criterion measures as uncontaminated. It makes sense that a measure should not
be influenced by any external factors.
I am curious about how achievement or IQ testing could be affected by
the fact that a student might be an English Language Learner. All students that I have had who have
undergone evaluations have been fluent in English. I guess I have never asked a school psychologist how they
would evaluate a student who speaks minimal English, especially if it is a more
obscure language that they speak.
Is the test interpreted? Are
there forms of tests in students’ native languages and psychologists who
specialize in administering such assessments? This could certainly affect not only the level of
“contamination,” but also the reliability of the scores.
The more I learn in this class, the more questions I have,
and the less I realize I know. I
am left wondering: How many of the concepts in this textbook will be concepts
that we will need to understand and use on a regular basis as school
counselors?
Drummond,
R. J. and Jones, K. (2010). Assessment procedures for counselors and
helping
professionals. (7th ed.). Upper Saddle River,
NJ: Pearson.
Week 4 - Chs. 5 & 6
I came to a greater respect for the importance of reliable and valid assessments this week, but not just because of the reading from Drummond & Jones. For the past couple of days, I’ve been battling a sinus infection. On Monday I felt like I was sneezing and blowing my nose almost non-stop. Monday night, the chills set in. I know that getting the chills is often a sign of a fever, so I decided to get out my handy-dandy thermometer to take my temperature. While the brand of my thermometer is called Reli On, I found that it was not very reliable (or valid). I took my temperature 3 times in a row on Tuesday morning, getting a slightly different reading each time. I realize that doctors will tell you that taking your temperature via the armpit is probably the least accurate (valid) measurement of body temperature; however, it also happens to be the cheapest type of thermometer to come by on the market. Even if my thermometer had been consistent (reliable), that doesn’t mean that it would have actually been giving me an accurate (valid) measurement on which to base my decisions as to whether or not I should go to work or go to the doctor’s office. This aligns with what Drummond and Jones (2010, p. 102) state: “no matter how reliable assessment results are, it is not a guarantee of validity…assessment results may be highly reliable, but may be measuring the wrong thing or may be used in inappropriate ways.”
My understanding of reliability and validity has been expanded from the knowledge that I first gained in my statistics class last year. Last year, we primarily talked about reliability and validity in regard to “treatment.” In my mind, “treatment” is what would normally follow an assessment. For instance, when I went to visit the doctor yesterday, she asked me a number of questions in order to assess what my symptoms were most likely caused by. After her assessment, she proceeded to prescribe me some antibiotics in order to treat my illness. If the treatment does what it is supposed to do (i.e. kill the infection wreaking havoc on my sinuses), it will be an effective, valid treatment. In assessment, validity refers not just to whether or not a test “measures what it was supposed to measure,” but also to the appropriateness of the interpretation and use made of assessment results (Drummond & Jones, 2010, p. 99). This leaves the issue of validity in assessment a bit more in the hands of the test administrator and a bit less in the hands of the test creator. In a few short weeks, I will be administering a couple of assessments to a high school student. That means that, with the help of whatever scoring mechanism the test creators have designed, I will also be responsible for making an appropriate interpretation of the results. This is a large responsibility. The interpretation of the assessments that I give could have a lasting impact on this student’s life! And that is exactly why counselors need to be on the lookout for threats to reliability and validity in assessment.
References
Drummond, R.J. & Jones, K. (2010). Assessment procedures for counselors and helping professionals (7th ed.). Upper Saddle River, NJ: Pearson.
Tuesday, February 26, 2013
Week 4 - Reliability & Validity
I must admit that my head is spinning a bit after reading this week's chapters on reliability and validity. While my suspicions that I most likely will never be interested in creating new instruments (on a large scale) have been confirmed, I am appreciative of the following ideas that were spurred on by this week's reading. One is that I think most of us take for granted the reliability and validity of many things in our lives. I think of the PSSA tests my 4th grader will be taking soon. I really don't know much about the nitty gritty details of this standardized test that already is making her nervous. I took the GRE years ago and really didn't know much about the instrument except for the power it held over my future plans. I think back to when I was a teenager and taking the written exam to get my driver's license. Did I care if the exam was using appropriate content? No! I just wanted my shiny new license.
I now have a much greater appreciation for all that goes into creating, administering and interpreting a test. It is overwhelming in many ways as I am now aware of many of the issues and variables that go into a test from its creation through to the interpretation of results. Drummond and Jones write, "It is the responsibility of test users to carefully read the validity information in the test manual and evaluate the suitability of the test for their specific purposes" (2010, p. 115). As individuals, it is easy to gloss over reliability and validity issues in our own lives. We can take the "cross your fingers" approach and just assume that someone else has checked up on the reliability and validity factors. As school counselors, though, we are responsible for our students and have a responsibility to make the wisest and most informed choices for them, whether it be a referral to another professional or a decision about which assessment tool will be the most beneficial for them to use and gain insight into their world.
Drummond, R. J. and Jones, K. (2010). Assessment procedures for counselors and helping
professionals (7th ed.). Upper Saddle River, NJ: Pearson.
Sunday, February 24, 2013
Week 4, Ch 5 and 6
When
considering validity, I was particularly drawn to one of the threats to
validity: construct-irrelevant variance.
As I understand it, the concept involves using an instrument for a
particular purpose, but having other aspects inadvertently confuse the results
and therefore impede the proper interpretation and usage of an assessment. I witnessed this first hand when I helped
with the settlement of Burmese refugees in Lancaster. The students were being assessed for their
academic achievement in order to be placed in the correct grade and sections in
elementary, middle, and high school. I accompanied
them when an employee from the school district attempted to give them an
academic assessment. Although these
children had been in school in refugee camps in Thailand, their very limited
English created a situation where they appeared to know less than they
did. In this way the assessment
experience had very little validity. The
intent was to use the tests as a tool to discern their academic level so that
they could place them in the grade and section appropriate to their academic
level. The tests in English were not
able to assess anything but the refugees’ lack of English language skills. As I continued to work with these children and
teens, I continually wondered what their academic skill level and capabilities
were. But neither the school nor I had a
tool or set of tools that could even begin to make an accurate assessment. If they had spoken Spanish, or some
other language that was more common, perhaps an adapted assessment tool would
have had some validity. However, their native Karen language made it impossible
to find any tool to make an adequate assessment. Even the way they wrote numbers was different
from the Arabic numerals used in this country.
I remember coming home from the school district assessment, feeling as
though it had been a somewhat useless exercise.
It was district policy to conduct this assessment in either English or
Spanish for all new students, but this may be a case where asking whether
a particular assessment scenario is apt to have any validity at all would have
been useful before subjecting the refugees to this testing experience. They were very nervous going to the test,
knowing they would not be able to answer anything. And they left the experience feeling that
they had represented themselves poorly. Additionally they were fearful as to what
their failure would mean for them.
The
refugees were placed in grades and sections for school, because the district
had to place them somewhere. But, as
this example points out, without assessments and interpretations that fit the
intended usage and needs, schools are in a sense flying blind with students. Hopefully,
one day, these students will develop English skills enough to get a more
accurate and useful assessment of their academic needs and abilities. In the meantime they were exposed needlessly
to yet another intimidating and fearful experience.
References
Drummond, R. J.
and Jones, K. (2010). Assessment
Procedures for Counselors and Helping Professionals. Upper Saddle River,
New Jersey: Pearson Education, Inc.
Friday, February 22, 2013
Blog 3: Chapter 4
I found this chapter to be difficult,
as I did last week's. Statistics is not
something that I am familiar with or have had exposure to. As educators, we are not required to
analyze the assessments the students take in this manner. When looking over the assessments we look at
growth or decrease in the scores. If
there is a decrease or no movement then we need to look further at what is going
on with the particular child. Also, we
look at the particular questions the students got wrong and guide our teaching
based on the results. With that in mind, I was
able to relate to the section on percentages.
The chart that distinguished percentile ranks
from percentages helped me understand that our testing is not
analyzed by percentile ranks, although at first I thought it fell under that
category as well.
In addition to that section, there
were other points that Drummond and Jones (2010) stated that stood out and that I
felt were important to reflect on. The
first point was the importance of having a clear understanding of the scoring, since
it is a reflection of the individual’s performance. You want to be accurate in determining how an
individual performs so that the score has a meaning and you can explain what
a score of “60” means. Another important
point is that the testing groups in the particular sample should be
current. It stated that the testing
instruments should be revised every ten years, but is that even enough? Times change so often, and I believe that all
tests should be up to date to provide you with the most accurate information. For example, there is a much bigger push for
testing compared to when I was in school.
Finally, another part that stood out to me was the norm groups. It made me question how reliable the
interpretation is. It states that it is
vital to make sure that the results are relevant if you want them to be
meaningful. The text brought up the
example that when looking at 6th grade mathematics you would look at
the grades across the country. My
question regarding that is do they take account of suburban versus urban
settings. All students perform
differently and it would be important to include all economic status kids. In
conclusion, statistics is something that everyone should have exposure to in
order to be able to explain results of test and I look forward to becoming more
familiar with basic statistical concepts.
Drummond, R. J. & Jones, K. (2010). Assessment Procedures for Counselors and Helping Professionals (7th ed.). Upper Saddle River, NJ: Pearson.
Thursday, February 21, 2013
blogs 1-3
Blog 3
While I was reading the article, I was disappointed that we, as school counselors, seem to be on a hamster wheel of sorts, dealing with the same types of technical issues that prevent us from doing our jobs better. I feel that instrumentation, much like various learning difficulties, must be approached in student-specific ways. By this I mean that no single instrument will accommodate, or even assess, the learning needs that differ from one student to another. It is impossible to create 15 or even 100 instruments that will assess broadly. Each learning difficulty has specific deficit areas and will require something that is sensitive to that fact. I feel that our job is limited because, like most things, learning difficulties are forced into compact boxes. Like most things in life, this will not work.
It was encouraging that the article directly challenged school counselors, both future and current, to be more aware of what we need to do as competent professionals, and also raised awareness that we need to “cheerlead” harder than we have been. The field of school psychology, as well as educational standards in general, is always growing and expanding. Just as we as future counselors are always asked to continue to grow, change, and analyze these changes, our field and the instruments used must have the same scrutiny placed upon them.
I am curious to see how many of the issues we are discovering within instrumentation are due to increased awareness that instruments are not as student-centered as they should be, and how many arise because our education systems are continuously being challenged to be more streamlined and effective for testing/funding purposes. I really think it is more from the testing/funding perspective, which is upsetting, but regardless, the first step towards change is raised awareness. I hope to gain clarity during class about action-based assessments. I was glad most of the article talked about action-based assessments, since these are geared towards implementing something, checking that the results are what you are looking for, and then revamping anything that is not getting you the end result you desire. I want to learn more about retrospective assessments because, when I was reading about them, they sounded a lot like pre/posttest assessments, and I would also be curious to see what their validity/reliability issues are and whether they are the same as those of pre/posttest assessments.
-------------------------------------------------------------------------------------------------------------
Blog entry 2:
I found this article to be interesting and vastly different from the experiences I had with my own school counselor in high school. Mrs. Ghettle was predominantly utilized for career decisions post high school. I never saw or heard of her doing any type of assessment or even running group counseling sessions. When I was assessed or had 504 meetings with my mother, it was only with the School Psychologist. I did not attend most meetings, so maybe she was present in some of the meetings that I did not attend. Unfortunately, Mrs. Ghettle’s position now seems to have been a waste. I know Annville did not utilize her in the best way to aid students. I’m not sure she would have had the competency that I will walk out of this program with, but I’m sure she could rely somewhat on her almost 20 years of experience.
I felt relieved to know that my experience was most likely different from that of others in my position, as the article cited research from 1999 in which counselors were more effective and more utilized in assessments than teachers and secondary school principals. While reading this chapter (which I loved, as I just had stats last year), I was reflecting a lot on my blog from last week. I know it was basically seething with hatred for assessments and the mental/emotional strain they place on clients. I guess, in retrospect, most of the assessment is about how it is given and the perception of the client going into the assessment. Obviously, because of my experiences, I will be paying special attention to my clients’ pre-assessments, etc.
Chapter 3 was a very easy-to-understand overview of the statistical work that we will be expected to know and utilize during assessments. I was a little worried upon signing up for this class that the statistical portion would be presented (via the book) in a more difficult manner. I found their instruction to be presented more in layman’s terms.
I was pleased that chapter 5 (sorry, I got a little ahead) did such a good job covering internal consistency reliability, as I think this is one of the most important parts of an assessment, next to making sure the person who is doing the assessment is qualified and understands the instrumentation. Effective administration, scoring, and internal reliability will be the most crucial parts of performing an assessment that will actually benefit the client.
--------------------------------------------------------------------------------------------------------------
Baker Blog 1: Chapters 1,2,17
In writing this first blog, parts of what I talk about will be repeated from what I shared in class. I mentally struggle with thinking about assessments being part of my job because I firmly believe that as a counselor it is my job to not put people in boxes. Even though I can personally speak to the benefits of being assessed and having accommodations, I have always been discontented that my style of learning is placed into a box. I didn’t realize that the reason the term has changed from testing to assessment is that assessment, while the terms are used interchangeably, encompasses a lot more than just administering tests. As a future counselor, I will strive to make sure that all students to whom I give assessments have better experiences than I had. I enjoyed finding out that assessments also include collecting information from various sources, not just what the applicant can reproduce under pressure.
I found the pre-screening process to be one of the most invasive parts of the assessment. I know that this is a crucial part to an assessment because it allows you to gather information from the client and it also gives the client some power in expressing (if able) their concerns with difficulties in learning/understanding. I just can’t get over the feelings I remember about how the screening for possible depression and ADD just made me feel more defective. I didn’t find them to be helpful. I can recognize that the woman who was doing my re-evaluation was not the most personable and obviously her personality and general disposition could be affecting my feelings of this overall process. I was too young during my first evaluation to really remember feeling stupid or upset. I remember feeling frustrated but I knew the man who did my assessment for years previously because he started meeting with me for ADD. He had been meeting with my brother for years before so we had a pre-established relationship.
When reading the part in chapter 2 about computer-based testing, I found myself feeling concerned about what the test truly can tell us about the individual. Obviously, as I am a testing outlier, most of my concerns about tests in general are geared towards individuals who might not perform as well on a computer due to lack of instruction or difficulty understanding the directions given. I know that as a future counselor I will be appreciative of computer-based tests because they will provide ease of access, scoring, and interpretation. I thought I remembered reading that comparisons of computer-based and pencil-and-paper tests showed that they could not be used interchangeably? I could be misremembering, but regardless, I know that I struggle more with computer-based tests. Obviously, again, my experience has skewed my view of assessments, so I can only hope that during this semester I will become more educated about the benefits of assessment to help outweigh my clinging bitterness.
Understanding Assessment Scores
When I began reading Chapter 4 this week, I was a little confused about the content of the chapter and how I could relate it to my career. Chapter 3 had focused more on statistical calculations and data, which I have learned a great deal about in the past. However, interpreting test scores is an area I am not as familiar with, specifically norm-referenced and criterion-referenced scores. After finishing Chapter 4, I have begun to comprehend what these two types of interpretation mean and how beneficial it can be to understand them as a school counselor.
Many types of standard scores are recognizable to me from other classes and from conversation within schools. Drummond and Jones (2010) were able to give a deeper description of how these scores are applied and used to understand a student's capability and where they stand in comparison to other students their age, grade or the larger population. The grade and age equivalents in particular stuck out to me. It was noted that it is important that these not be used alone as primary scores because they have many limitations. I found this to be important. It seems simple that these equivalents would provide a great deal of information about a student in comparison to their peers. The grade equivalents are specific to their grade level and subtest. A fifth grade student may receive an above average score in math such as 7.4, but this does not mean she is at a 7th grade level. This number only tells us that she is above her peers within 5th grade and performing at a much higher level for 5th grade math. The age equivalents are also difficult to rely on because the rate of growth for most behavior varies year by year and person by person. This makes it hard to compare students to others at their age. I believe having a stronger understanding of these concepts will help me when I am working with students in schools. I will be able to break down a student's ability level based on this score while still recognizing the limitations those scores have.
Now that we have gained this deeper comprehension of assessment and interpreting scores, I am looking forward to figuring out how to apply this to real life examples. I am hoping to start to learn more about the tests and assessments that are providing these scores and how we can use these scores to help our students.
I also wanted to reflect on last week's class, when we discussed the ethical issues. I truly believe that ethical issues will always be one of the most difficult aspects of our career. There is no black and white but instead a lot of gray areas that we have to be cautious of. Even after going over each case study it was clear that everyone views each situation differently. This also made me aware that different codes would come into play depending on who is handling that case. Something that made me feel a little more confident about making ethical decisions was learning more about the standard systems and rules that are laid out in individual districts. It seems expected that we follow the rules that we have been provided but that we also have our codes of ethics, our peers, and other professionals that we can turn to if we need assistance. Although this area seems to be the most difficult, there are many resources, and knowing and understanding this provides some sense of security.
Drummond, R. J. and Jones, K. (2010). Assessment Procedures for Counselors and Helping Professionals. Upper Saddle River, New Jersey: Pearson Education, Inc.
Week 3
I skipped ahead in my blog post last week, so I am now returning to comment on Ekstrom's article. Although it was a mistake, I am happy that I now have the benefit of our small group discussion to reference when describing the article because we had a number of interesting conversations last week. Although I knew that assessment is an important role for school counselors, I was surprised to read that among the school administrators, teachers, and counselors surveyed, counselors were determined to have the strongest background in assessment and therefore had to serve in an advisory role for many staff members. This surprised me because of the strong emphasis on the use of standardized tests in recent years, which has forced teachers and administrators to focus on data and demonstrate student achievement in quantitative terms. I thought that teachers were required to have a number of classes in assessment, but one of my group members shared that this was not the case, so it would make sense, then, that the school counselor would become the go-to person in the school.
As I shared in class last week, I'm not yet comfortable enough with my assessment knowledge to be the "school expert," but I hope that I will feel much more capable by the completion of this course. The discussions we had surrounding ethics were also helpful, as we determined that the school counselor is not alone in administering assessments and interpreting results, but should often defer to the school psychologist or even other school officials when appropriate. I think that I may often err on the side of being overly cautious when I first begin working in a school because of my concern about my own competence. I will have to balance this sense of caution with my desire to help and be as effective as possible in my new role. If time is on my side, I don't see any problems with consulting others before making judgements that could potentially send a student down an entirely different academic or career path. If time is not on my side, then I will have to make the most ethical decision I can make (and the same one I would want a school staff member to make if they were working with my child).
Ekstrom, R.B., Elmore, P.B., Schaefer, W.D., Trotter, T.V., & Webster, B. (2004). A survey of assessment and evaluation activities of school counselors. Professional School Counseling, 8 (1), 24-30.
Blog #3
I have to say that I felt good about the first half of class last week, and was even curious to start learning about some of the different statistical concepts. After leaving class I was a bit overwhelmed, but some of the concepts started to come back to me. When I first started to read chapter 3, I had to break it up as much as I could. When I take too many mathematical terms and concepts in all at the same time, I just get confused, which is how I kind of felt during the second half of class last week. After I got home from school that evening, I just decided to let it sink in and to look at it again the next morning. At that point, I realized that I am starting to pick things up and things are starting to come back to me.
Another extremely important step in the assessment process is the understanding of the scores. Chapter 4 gives a good summary of how this process works, and some of the topics that were introduced really interested me. For example, I could see how choosing a norm group to measure testing results could be complicated. If you take into account all of the different characteristics you need to consider in order to choose a norm, in some cases, I don’t even see how it is possible to create a consistent and fair testing or scoring process. It just seems to me that there would always be some sort of bias or lack of sensitivity to many of the groups or people involved in the testing process.
Drummond & Jones (2010) also state that “it is reasonable to expect that instruments be revised at least every 10 years and that new norms will accompany the revisions (Thorndike, 2005).” I bet an argument could be made that 10 years is too long to wait to change the norms and testing instruments. I am also sure that some of the testing instruments out there probably haven’t been changed in 20 years or more. All of this information throughout chapter 4 makes me believe that we should be treading very lightly and be extremely picky about how we choose to test or measure results for any future client or student. It also shows that we should be careful not to make assumptions or decisions about the client or student based on one set of results from an assessment.
Drummond, R.J. & Jones, K. (2010). Assessment procedures for counselors and helping professionals (7th ed.). Upper Saddle River, New Jersey: Pearson Education Inc.
Wednesday, February 20, 2013
Week 3, Chapter 4
As I read chapter 4, “light bulbs” started to turn on for me. There were several terms in this chapter that I have seen but never truly understood. I appreciate the way the authors of this book explain concepts and give examples in simple terms. The figures on pages 70 and 71 are incredibly helpful for me, as I am a very visual person. Just seeing how various types of scores compare to the normal curve really helped to solidify these concepts for me. Also, while I understood what norm-referenced and criterion-referenced tests were, until I read the section about when to use each test, I did not fully understand the difference between them. The example about a depression inventory made perfect sense; if using a criterion-referenced test for such an assessment, one would only understand whether or not the person has “mastered the knowledge and skills to be depressed” (Drummond & Jones, 2010). This, of course, would not make sense, so a norm-referenced test would be used.
When reading about norm groups, I wondered how frequently norm-referenced tests are updated based on the norm group. Page 67 of Drummond & Jones (2010) states, “It is reasonable to expect that instruments will be revised at least every 10 years.” With the rapidly changing diversity of the population locally, state- and nation-wide, I wonder how relevant the norm group is for a given test in the months or years before it is updated, as it relates to ethnicity, race, and SES.
I still have some unanswered questions. First, I have heard of stanine scores (mostly from reading evaluation reports) but I am not familiar with sten scores. I wondered what exactly sten scores are used for. Concerning stanine scores, while I understand that they are a simplified way of describing scores, they still do not make sense to me when just looking at the score. It seems that including a percentile rank or a qualitative description is essential to help solidify understanding of a variety of the types of scores, sten and stanine included. Perhaps the part of this chapter that I feel completely in the dark about is the GSV, or Growth Scale Value, which is included in figures on pages 77 and 78. I did not see a description of this, and am completely confused about what it means and what it is used for, especially after viewing the figure on page 78.
While I feel more comfortable with the concepts of understanding assessment scores, I still cannot help but wonder how all of the data used to determine norm groups, percentile ranks, etc. are organized. It is just so much data to keep straight! I would be interested in learning the process of such large-scale data collection.
Drummond, R. J. and Jones, K. (2010). Assessment procedures for counselors and helping professionals (7th ed.). Upper Saddle River, NJ: Pearson.
Week 3 (chp. 4)
While reading chapter four in our textbook this week, I found myself alternating between having to read the terms and explanations several times and having some "Ah ha! Now I understand this a bit better!" moments. I liked learning the differences between criterion-referenced and norm-referenced interpretations. This made sense to me and I started thinking about the different tests I have taken in my life. I pulled out my old GRE scores and read through the information about interpreting your scores. It made much more sense now! I looked at a table of general test mean scores showing results from a three-year period. The number of examinees was 1.4 million (quite a sizable group!) and it listed their mean and standard deviation for the three parts of the test. The numbers made sense to me now, whereas a few weeks ago they didn't. I started thinking then about the height/weight charts at my daughter's pediatrician's office and how they always show, at the yearly physical, where your child is on a growth curve. I admit that I am not a "numbers" person in many ways. We used to make jokes about my younger brother, who did his high school baseball team's statistics for fun. Really? Well, now he has his MBA and is the CFO of a very large non-profit organization in Ohio. He always loved numbers and it has translated into a vocation he is passionate about.
While I may not want to do statistics for fun, I do enjoy reading the finished product and I always am curious of how I am doing compared to others. I guess that is why I enjoy the D2L feature which allows you to see your grade compared to your classmates. I am gaining a much greater appreciation for what goes into that "finished product" now. This chapter was very helpful in breaking down different methods of scoring and helping me understand the material so I, in turn, can more easily interpret the scores to the clients I will be serving in the future.
Drummond, R. J. and Jones, K. (2010). Assessment procedures for counselors and helping
professionals (7th ed.). Upper Saddle River, NJ: Pearson.
Tuesday, February 19, 2013
Week 3, Chapter 4
Last spring around this time, I was learning statistics in
PSYC 612. While initially nervous about
the math part, I found that it actually came easier than the concepts did. I enjoyed having formulas to plug numbers
into, but became frustrated when I forgot a step along the way and wound up
with the wrong answer. Most of our
discussion and use of statistics for that class seemed to revolve around
research and interpreting the results of research. Because research is not too high on my
priority scale or bucket list, I figured that I’d probably never have to
actually use statistics in my daily work.
After reading for the past couple of weeks, I’ve determined that I
thought wrong. As Drummond & Jones
(2010) state in the introduction to this week’s chapter, “Counselors are often
called upon to interpret the results of tests, rating scales, structured
interviews, and various other instruments used in the assessment process” (p.
63). Ekstrom et al. (2004) found that of 161
school counselors surveyed, 29% were responsible for selecting tests, 63% for
administering tests, and 71% for interpreting tests. Unless another career change is in my near
future (and I sincerely hope that it is not) it looks like I most certainly
have not escaped the daily use of statistics.
However, Drummond & Jones’ explanations of the interpretations of
different types of assessment scores in this week’s reading made the concept of
using them quite a bit less overwhelming and scary.
A couple of new concepts for me included those of stanines
and sten scores. I had heard these words
dropped in other conversations (possibly in my statistics class last year), but
never really knew what they were.
According to Drummond & Jones, stanines are widely used in
education, so it’s probably a good thing that I finally know what they are (a
standard score that converts raw scores into values ranging from 1 to 9). Sten scores are similar, but range from 1 to
10 instead of 1 to 9. I was surprised to
learn that grade equivalents ARE NOT an estimate of the grade at which the
student should be placed. A better way
to understand grade equivalents is as a method of comparing the scores of
typical students at various grade levels on the same test. I found the example of Katy, a 5th-grade
student who received a grade equivalent of 7.4, to be very helpful in
understanding what grade equivalents do and do not tell the
interpreter(s). I also found Drummond & Jones’ explanations
of why it is better to choose to use either a criterion-referenced
interpretation OR a norm-referenced interpretation to be helpful.
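To make the 1-to-9 and 1-to-10 ranges more concrete, here is a minimal sketch of how a raw score could be turned into a stanine or a sten. It assumes the widely used linear approximations (stanine ≈ 2z + 5, sten ≈ 2z + 5.5, each rounded and then capped to its range) and a roughly normal norm group; the function names and the sample mean and standard deviation are hypothetical. Published tests typically assign these scores from their own norm tables, so this illustrates the idea rather than how any particular instrument does it.

```python
import math


def z_score(raw, mean, sd):
    """Standardize a raw score against the norm group's mean and SD."""
    return (raw - mean) / sd


def _round_half_up(x):
    """Round to the nearest integer, with .5 always rounding up."""
    return math.floor(x + 0.5)


def stanine(z):
    """Approximate stanine (mean 5, SD 2), capped to the range 1-9."""
    return max(1, min(9, _round_half_up(2 * z + 5)))


def sten(z):
    """Approximate sten (mean 5.5, SD 2), capped to the range 1-10."""
    return max(1, min(10, _round_half_up(2 * z + 5.5)))


# Hypothetical example: raw score 60 on a test with norm mean 50 and SD 10.
z = z_score(60, 50, 10)
print(stanine(z), sten(z))
```

A student one standard deviation above the mean lands in stanine 7 and sten 8 under these assumptions, which shows why the two scales read similarly but are not interchangeable.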
References
Drummond, R.J. & Jones, K. (2010). Assessment procedures for counselors and helping professionals (7th ed.). Upper Saddle River, NJ: Pearson.
Sunday, February 17, 2013
Week 3
I was very
interested in the concept of norm-referenced tests and the importance of
ascertaining the norm group for a particular test. I had no idea that the norm group for tests
like the KBIT was based on the census data so that the norm group would be a
reflection of the current population of the United States (Drummond and Jones,
2010). I would think it would be very complicated to get a norm group that
would be reflective of the various demographics listed such as parental
education, ethnicity, socio-economic status, region of the country, etc. When it comes to testing and finding a
stratified sample, there are so many variables and how would one determine
which ones are integral to a student’s test performance? Still it is comforting to know that students
are not norm-referenced against a norm group of children all of whose parents
have advanced degrees, or all of whom are English language learners. That is, of course, unless I am interested in
knowing how students compare to other similar students as opposed to knowing
how they fare according to the general populace.
It would seem that the norm group should be chosen based on the intent
of the assessment. Therefore, in accordance
with RUST (2003) guidelines, the purpose of any test should be very clear when
choosing an assessment as well as when choosing the interpretation of the
results. Perhaps that is why the PSSA
results that are often reported in the paper have scores for all students and
then there are subset average scores for ELL students and Low SES
students. In this way districts and
parents can get results that help them learn about different aspects of their
educational progress and can get results that speak to specific assessment
purposes.
References
American Counseling Association, & Association for Assessment in Counseling. (2003). Responsibilities of users of standardized tests (RUST). Alexandria, VA: Author.
Drummond, R. J. and Jones, K. (2010). Assessment Procedures for Counselors and Helping Professionals. Upper Saddle River, New Jersey: Pearson Education, Inc.