SCCN631Spring2013: February 2013

Thursday, February 28, 2013

Week 4

The part of Chapter 5 that I found most useful was the paragraph outlining how to increase the reliability of test scores. The most obvious example outlined in the text involved increasing the number of test items. For example, if the reliability for a 10-item test is .30, we can increase reliability to .68 by increasing the number of items from 10 to 50. Other ways of improving reliability involve writing test items in clear language, using multiple-choice questions rather than essay questions, making sure items are neither too easy nor too difficult, having clear scoring procedures, and making sure individuals are trained before administering or scoring exams. I think these are guidelines we can follow when administering exams for students and information we can share with other teachers who may ask us for advice in designing and administering effective exams. When choosing assessments, the reliability coefficient is a factor that we will definitely want to consider, as the "higher the coefficient, the more reliable the test scores" [will be] (Drummond & Jones, 2010, p. 97).

Somewhat related to reliability is validity, which is "the degree to which all the accumulated evidence supports the intended interpretation of test scores for the proposed purpose of the test" (p. 100). While reliability concerns the consistency of the test scores themselves, validity concerns the interpretation of test results and how they are then used to make decisions about students. Evidence for validity is organized around five areas (test content, response processes, internal structure, relations to other variables, and consequences of testing), but Chapter 6 focuses on three: content, criterion-related, and construct validity evidence. In attempting to understand these terms, I found the examples the authors provide very helpful. For instance, the sample table of specifications showing the content areas measured for sales performance demonstrated content validity, while criterion measure examples included academic achievement and job performance, and construct validity examples included group and age differentiation studies. The authors also define these terms nicely in the chapter summary, describing content validity as, of course, "focus[ing] on the content of the test," criterion-related validity as the "relationship between test results and external variables," and construct validity as the "appropriateness of inferences drawn from test scores as they relate to a particular construct" (p. 115).

Drummond, R. J. and Jones, K. (2010). Assessment Procedures for Counselors and Helping Professionals. Upper Saddle River, New Jersey: Pearson Education, Inc.
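The jump from .30 to .68 in the 10-item example above is what the Spearman-Brown prophecy formula predicts when a test is made five times longer. The post does not name the formula, so treating it as the source of that figure is an assumption, and the function name below is only illustrative. A minimal sketch:

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability after changing test length by length_factor.

    Spearman-Brown prophecy formula: r_new = k * r / (1 + (k - 1) * r),
    where r is the current reliability and k is new length / old length.
    Assumes the added items are comparable in quality to the existing ones.
    """
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# Values taken from the post: a 10-item test with reliability .30,
# lengthened to 50 items (a length factor of 5).
original_items = 10
new_items = 50
predicted = spearman_brown(0.30, new_items / original_items)
print(round(predicted, 2))  # 0.68
```

The same function also shows diminishing returns: each further increase in length adds less reliability than the last, which is one reason simply adding items is not the only guideline worth following.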

Reliability and Validity

This week I would first like to address the class activity that we did in the computer lab last week. The problems that were given are very familiar to me and actually bring about some excitement. During both of my previous statistics classes, I loved all the parts that related to math and being able to perform calculations. After my second class I was able to take it a step further and use those numbers to interpret data and my results. The more comfortable I become with this information, the better I feel about doing research and using data as a school counselor. In class, when I was able to help my classmates with working in Excel and successfully finishing all of the problems, I had a great sense of pride and accomplishment. This was a very motivating feeling and something that I have been looking for this year in the program. It also made me realize that if I had taken Appraisal before Guidance Program Development, things might have fit together better for me. However, the order in which I have completed the classes does give me the advantage of understanding the bigger picture. Appraisal fits into the overall picture I have already created, and it seems to be filling some of the holes. I am excited to see how much more confidence I can gain by the end of the semester and this class.

Both chapters for this week were again somewhat of a review from my previous classes; however, they took a new approach that I was not as familiar with. In the Reliability chapter, the sources of measurement error were organized by time-sampling error, content-sampling error, and interrater differences. This type of organization gave me a greater understanding of reliability and the methods used to estimate it. The Validity chapter also organized the information in a more modern way than I had previously been taught. I have always learned and understood validity to be broken up into content, criterion, and construct validity. Although the chapter did include sections on all three, it described all three as falling under construct validity. Construct validity is used as an umbrella term which is broken down into five sources (Drummond and Jones, 2010). It became clearer that the purpose is to establish a relationship between assessment scores and other variables. We are trying to determine whether the claims and decisions that are made on the basis of a particular assessment are meaningful and useful for what they are supposed to accomplish (Drummond and Jones, 2010). Another aspect that I appreciated from our text was the brief discussion on the fairness of certain assessments. “Validity also refers to the adequacy and appropriateness of the uses of assessment results” (Drummond and Jones, 2010, p. 100). My recent work with multicultural students and counseling has started to interest me in how fair certain parts of the educational system are for their success. The book points out that a lack of fairness is a lack of validity, and this would also show a lack of reliability. This tells us that we should not be using such an assessment to make educated decisions about any student, and in particular about students from unique backgrounds.

Drummond, R. J. and Jones, K. (2010). Assessment Procedures for Counselors and Helping Professionals. Upper Saddle River, New Jersey: Pearson Education, Inc.

Reliability and Validity

Chapters 5&6

When I was reading the chapter on reliability I kept thinking, like most other people in the class, about the most common form of reliability that we have all experienced…the GRE/SAT. During both I was skeptical; no, I was downright confident that they would not yield results that were consistent with my actual body of knowledge. I do understand that there is legitimacy in the predictive validity of some students’ scores and how their future GPAs will reflect their intelligence. I, however, am an outlier for the reliability of both the SAT and GRE. For both I scored in the lower-middle of the road and should probably have had a GPA only around the C+/B- range. I can proudly say that both in undergrad and here in grad school this is not true. My GPA is near perfect and I work hard to keep it that way. For me, there is no predictive validity between standardized tests and how I actually perform.

Knowing that not everyone is a good test taker, or will be able to adequately express their body of knowledge on a standardized test will be part of our battle as school counselors. It will be important to be able to help those students who are not good test takers show their potential in different ways. Not to negate standardized tests, but most people are just not wired to be extraordinarily good at them. The fact that our education system places so much emphasis on them and their “predictive quality” disturbs me to the core.

The more I reflect on this class and my life, the more I draw conclusions similar to the ones I draw about validity and reliability: it is all relative and can be affected by extraneous variables. No one in psychology or in many other fields would ever argue that outside variables can never affect a result. It would be silly to even consider this, so why then do we place such emphasis on schools receiving funding based on standardized test scores? Obviously there must be some measure to assess how schools are achieving academic success, and while I have no suggestions for a better system, I seriously disagree with what we are currently using.

I keep realizing more and more how much I have to learn. I feel so excited when I reflect on what I have learned so far, yet I am intimidated by all I feel I need to learn before I'll actually be doing this job. Hopefully Assessment helps me on my way!

Drummond, R. J. and Jones, K. (2010). Assessment procedures for counselors and helping professionals. (7th ed.). Upper Saddle River, NJ: Pearson.

Blog 4: Chapters 5 & 6

I found the chapters on reliability and validity to be vital when learning about the assessment process. Before reading these chapters I was under the impression that reliability and validity went hand in hand. Obviously it would be ideal if you found a test that was both consistent and accurate; however, Drummond and Jones (2010) state, “Reliability is a necessary but insufficient condition for validity. A measure that produces totally inconsistent results cannot possibly provide valid reliable score interpretations. However, no matter how reliable assessment results are, it is not a guarantee of validity” (p. 102). After learning that, I believe that it is important to look at all components of a test to ensure that it fits the individual and/or group of students you are testing. This is imperative because the results affect the person being tested. As educators, we guide our teaching based on the results, and they direct us to which services to provide to which students. Furthermore, as future counselors, the results will help guide our program as well as ensure that students are being provided with the support they need.

In terms of reliability, it was stated to be the most important characteristic since there are many decisions based on the results. I learned that there are many factors that may affect reliability, and it is a good idea to keep these components in mind. Two of the factors that stood out to me were content-sampling error and interrater differences. I was able to relate to content-sampling error because I am working with a small group of teachers to create a math screener. To ensure that we administer this consistently, since there may be numerous teachers facilitating the assessment, we need to make sure to adequately represent the content domain. We need to be careful when choosing the questions that will be tested for each grade level. Next, with interrater differences, I agree that people view things differently. When you have multiple people completing observations, you have to be clear about the expectations and all be on the same page.

Finally, in terms of validity, Drummond and Jones (2010) state, “The Standards assert that validity is ‘the most fundamental consideration in developing and evaluating tests’” (p. 102). When looking at tests, you want to make certain that they neither underrepresent the content by including too little nor bring in irrelevant material by being too broad. In conclusion, it is important to make sure the assessment that is given is appropriate for the audience. As a final point, I believe that reliability and validity are important terms to be familiar with because of the effects they have on testing.

Drummond, R. J. & Jones, K. (2010). Assessment Procedures for Counselors and Helping Professionals (7th ed.). Upper Saddle River, NJ: Pearson.

Blog #4

While I was reading about the chapter on reliability I thought about my junior year in high school and when I took the SATs. My parents made me sign up to take the test two times within a period of two months. Like most other high school students, I was incredibly nervous about taking the test. I was already a horrible test taker and even worse at taking standardized tests. Even with all of the nervousness and anxiety I was able to get through the test without completely melting down. I ended up doing exactly as well as I thought I would do, and I received a very average score.

My family kept telling me that I would do better the second time around because I knew how the test was structured, how the time intervals worked, what I was most comfortable with, etc. I even spent those two months in between studying and preparing as much as I could. As a result, when I went to take the SATs the second time around I was so much more confident and ready to take the test. My family and my focus on preparation made most of my anxiety go away, and I was so much more relaxed. I took the test and even walked out that day feeling more confident than I ever had after taking a standardized test. A few weeks later I got my score back. It was 10 points lower than the first time that I took it. Obviously, I was incredibly surprised that I did better the first time around. To be honest, I was pretty pissed off about the fact that I spent all that time studying and preparing and it did not make one bit of difference. This is probably why I didn’t prepare as much as I should have for the GRE.

Some people could say that it was an absolute fluke that my scores were within 10 points of each other the two times that I took the SATs. When I look back on it now, I really believe that I got around the best score that I possibly could have. If I had taken it five more times, I really believe that my scores would have been very similar. I know that people have many different experiences with taking the SATs, and some probably question the reliability and validity of the test. As much as I hate to admit it, in my experience, the test couldn’t have been any more reliable. I still don’t believe that one test should be a measure of where or if you go to college, but I will save that for another blog.

Drummond, R.J. & Jones, K. (2010). Assessment procedures for counselors and helping professionals (7th ed.). Upper Saddle River, New Jersey: Pearson Education Inc.

Wednesday, February 27, 2013

Reliability and Validity

Reliability and validity are both concepts that are not new to me. However, after reading this week’s chapters, I realized how much I really did not know about them! I also realized how relevant they are to my job at this point.

When reading about true ability vs. observed score in the Reliability chapter, I immediately thought about a 7th-grade student that I have. He is very emotionally immature and has low confidence in his abilities. He does not like school at all and states this often. From working with him, one would think that he has low ability based on the way he performs in class both in motivation/task completion and his grades. However, he was evaluated at the beginning of this school year, and his full-scale IQ was a 95. His scores in all areas of reading, written expression, and math were average on the KTEA-II. This seems like a great example of true ability vs. an observed score, with the observed score being his performance in class. The two are like polar opposites for this student.

When reading about test-retest as a method of estimating reliability, I wondered if the test that is given is the exact same test, or if the test is just a similar one. When reading about simultaneous administration on page 89 (Drummond & Jones, 2010), I gathered that the test-retest measure uses the same assessment. I wonder, then, how long test takers are made to wait before re-taking the test. Some people have extraordinary memories, and a curiosity that might lead them to seek answers to particular questions they were asked on the test. If they sought out information or practice on certain questions or skills themselves, can the test-retest measure really show reliability for some groups of people? Related to re-test measures, the book stated that a limitation is that many tests do not have alternate forms. When reading about this, I thought about my school’s re-take policy, which was implemented last year. We are required to allow students to re-take tests when they go through a particular process. We were told as teachers to have an alternate form of each test so that students who re-take will take the alternate. I agree with the book: this is extremely time-consuming! So time-consuming that I wonder just how reliable some of the teacher-made alternate tests are.

When reading about predictive validity and the example of SAT scores as a predictor of college success, I was thinking, “Yeah, but SAT scores have nothing to do with personal choices about whether to focus more on classes or partying.” I was not surprised to then read that while it can predict academic success, it’s a poor predictor of morality (Drummond & Jones, 2010). With anything there could be confounding variables that could skew the correlation between two items. Also related to confounding variables was the thought that, in group differentiation studies, there is an expectation that children with ADHD would score lower than children in the standardization sample. The special education teacher in me wonders if a child’s environment is modified to limit distractions when taking this test. Are they given extended time to complete the test? Do they have ADHD paired with a learning disability? These are all variables that I think would need to be considered. Lastly, relating to confounding variables were my thoughts on criterion measures as uncontaminated. It makes sense that a measure should not be influenced by any external factors. I am curious about how achievement or IQ testing could be affected by the fact that a student might be an English Language Learner. All students that I have had who have undergone evaluations have been fluent in English. I guess I have never asked a school psychologist how they would evaluate a student who speaks minimal English, especially if it is a more obscure language that they speak. Is the test interpreted? Are there forms of tests in students’ native languages and psychologists who specialize in administering such assessments? This could certainly affect not only the level of “contamination,” but also the reliability of the scores.

The more I learn in this class, the more questions I have, and the less I realize I know. I am left wondering: How many of the concepts in this textbook will be concepts that we will need to understand and use on a regular basis as school counselors?

Drummond, R. J. and Jones, K. (2010). Assessment procedures for counselors and helping professionals. (7th ed.). Upper Saddle River, NJ: Pearson.

Week 4 - Chs. 5 & 6

I came to a greater respect for the importance of reliable and valid assessments this week, but not just because of the reading from Drummond & Jones. For the past couple of days, I’ve been battling a sinus infection. On Monday I felt like I was sneezing and blowing my nose almost non-stop. Monday night, the chills set in. I know that getting the chills is often a sign of a fever, so I decided to get out my handy-dandy thermometer to take my temperature. While the brand of my thermometer is called Reli On, I found that it was not very reliable (or valid). I took my temperature 3 times in a row on Tuesday morning, getting a slightly different reading each time. I realize that doctors will tell you that taking your temperature via the armpit is probably the least accurate (valid) measurement of body temperature; however, it also happens to be the cheapest type of thermometer to come by on the market. Even if my thermometer had been consistent (reliable), that doesn’t mean that it would have actually been giving me an accurate (valid) measurement on which to base my decisions as to whether or not I should go to work or go to the doctor’s office. This aligns with what Drummond and Jones (2010, p. 102) state: “no matter how reliable assessment results are, it is not a guarantee of validity…assessment results may be highly reliable, but may be measuring the wrong thing or may be used in inappropriate ways.”

My understanding of reliability and validity has been expanded from the knowledge that I first gained in my statistics class last year. Last year, we primarily talked about reliability and validity in regard to “treatment.” In my mind, “treatment” is what would normally follow an assessment. For instance, when I went to visit the doctor yesterday, she asked me a number of questions in order to assess what my symptoms were most likely caused by. After her assessment, she proceeded to prescribe me some antibiotics in order to treat my illness. If the treatment does what it is supposed to do (i.e., kill the infection wreaking havoc on my sinuses), it will be an effective, valid treatment. In assessment, validity refers not just to whether or not a test “measures what it was supposed to measure,” but also to the appropriateness of the interpretation and use made of assessment results (Drummond & Jones, 2010, p. 99). This leaves the issue of validity in assessment a bit more in the hands of the test administrator and a bit less in the hands of the test creator. In a few short weeks, I will be administering a couple of assessments to a high school student. That means that, with the help of whatever scoring mechanism the test creators have designed, I will also be responsible for making an appropriate interpretation of the results. This is a large responsibility. The interpretation of the assessments that I give could have a lasting impact on this student’s life! And that is exactly why counselors need to be on the lookout for threats to reliability and validity in assessment.

References
Drummond, R.J. & Jones, K. (2010). Assessment procedures for counselors and helping professionals (7th ed.). Upper Saddle River, NJ: Pearson.

Tuesday, February 26, 2013

Week 4 - Reliability & Validity

I must admit that my head is spinning a bit after reading this week's chapters on reliability and validity. While my suspicions that I most likely will never be interested in creating new instruments (on a large scale) have been confirmed, I am appreciative of the following ideas that were spurred on by this week's reading. One is that I think most of us take for granted the reliability and validity of many things in our lives. I think of the PSSA tests my 4th grader will be taking soon. I really don't know much about the nitty gritty details of this standardized test that already is making her nervous. I took the GRE years ago and really didn't know much about the instrument except for the power it held over my future plans. I think back to when I was a teenager and taking the written exam to get my driver's license. Did I care if the exam was using appropriate content? No! I just wanted my shiny new license.

I now have a much greater appreciation for all that goes into creating, administering and interpreting a test. It is overwhelming in many ways as I am now aware of many of the issues and variables that go into a test from its creation through to the interpretation of results. Drummond and Jones write, "It is the responsibility of test users to carefully read the validity information in the test manual and evaluate the suitability of the test for their specific purposes" (2010, p. 115). As individuals, it is easy to gloss over reliability and validity issues in our own lives. We can take the "cross your fingers" approach and just assume that someone else has checked up on the reliability and validity factors. As school counselors, though, we are responsible for our students and have a responsibility to make the wisest and most informed choices for them, whether it be a referral to another professional or a decision about which assessment tool will be the most beneficial for them to use and gain insight into their world.

Drummond, R. J. and Jones, K. (2010). Assessment procedures for counselors and helping
professionals (7th ed.). Upper Saddle River, NJ: Pearson.

Sunday, February 24, 2013

Week 4, Ch 5 and 6

When considering validity, I was particularly drawn to one of the threats to validity: construct-irrelevant variance. As I understand it, the concept involves using an instrument for a particular purpose, but having other aspects inadvertently confuse the results and therefore impede the proper interpretation and usage of an assessment. I witnessed this firsthand when I helped with the settlement of Burmese refugees in Lancaster. The students were being assessed for their academic achievement in order to be placed in the correct grade and sections in elementary, middle, and high school. I accompanied them when an employee from the school district attempted to give them an academic assessment. Although these children had been in school in refugee camps in Thailand, their very limited English created a situation where they appeared to know less than they did. In this way the assessment experience had very little validity. The intent was to use the tests as a tool to discern their academic level so that the district could place them in the appropriate grade and section. The tests in English were not able to assess anything but the refugees’ lack of English language skills. As I continued to work with these children and teens, I continually wondered what their academic skill level and capabilities were. But neither the school nor I had a tool or set of tools that could even begin to make an accurate assessment. If they had spoken Spanish, or some other language that was more common, perhaps an adapted assessment tool would have had some validity. However, their native Karen language made it impossible to find any tool to make an adequate assessment. Even the way they wrote numbers was different from the Arabic numerals used in this country. I remember coming home from the school district assessment, feeling as though it had been a somewhat useless exercise.
It was district policy to conduct this assessment in either English or Spanish for all new students, but this may be a case where asking whether a particular assessment scenario is apt to have any validity at all would have been useful before subjecting the refugees to this testing experience. They were very nervous going to the test, knowing they would not be able to answer anything. And they left the experience feeling that they had represented themselves poorly. Additionally, they were fearful as to what their failure would mean for them.

The refugees were placed in grades and sections for school, because the district had to place them somewhere. But, as this example points out, without assessments and interpretations that fit the intended usage and needs, schools are in a sense flying blind with students. Hopefully, one day, these students will develop English skills enough to get a more accurate and useful assessment of their academic needs and abilities. In the meantime they were exposed needlessly to yet another intimidating and fearful experience.

References

Drummond, R. J. and Jones, K. (2010). Assessment Procedures for Counselors and Helping Professionals. Upper Saddle River, New Jersey: Pearson Education, Inc.

Friday, February 22, 2013

Blog 3: Chapter 4

I found this chapter difficult, just as I did last week's. Statistics is not something that I am familiar with or have had exposure to. As educators, we are not required to analyze the assessments the students take in this manner. When looking over the assessments, we look at growth or decrease in the scores. If there is a decrease or no movement, then we need to look further at what is going on with the particular child. Also, we look at the particular questions the students got wrong and guide our teaching based on the results. With that in mind, I was able to relate to the section on percentages. The chart that distinguished percentile ranks from percentages helped me understand that our testing is not analyzed by percentile ranks, although at first I thought it fell under that category as well.

In addition to that section, there were other points that Drummond and Jones (2010) made that stood out and that I felt were important to reflect on. The first point was the importance of having a clear understanding of the scoring, since it is a reflection of the individual’s performance. You want to be accurate in determining how an individual performs so that the score has meaning and you can explain what a score of “60” means. Another important component is that the testing groups in the particular sample should be current. The text stated that testing instruments should be revised every ten years, but is that even enough? Times change so often, and I believe that all tests should be up to date to provide you with the most accurate information. For example, there is a much bigger push for testing compared to when I was in school. Finally, another part that stood out to me was the norm groups. It made me question how reliable the interpretation is. The text states that it is vital to make sure that the results are relevant if you want them to be meaningful. It brought up the example that when looking at 6th grade mathematics you would look at the grades across the country. My question regarding that is whether they take account of suburban versus urban settings. All students perform differently, and it would be important to include students of all economic statuses. In conclusion, statistics is something that everyone should have exposure to in order to be able to explain test results, and I look forward to becoming more familiar with basic statistical concepts.

Drummond, R. J. & Jones, K. (2010). Assessment Procedures for Counselors and Helping Professionals (7th ed.). Upper Saddle River, NJ: Pearson.

Thursday, February 21, 2013

blogs 1-3

Blog 3
                While I was reading the article I was disappointed that we, as school counselors, seem to be on a hamster wheel of sorts, dealing with the same types of technical issues that prevent us from doing our jobs better. I feel that instrumentation, much like various learning difficulties, must be approached in student-specific ways. By this I mean that not every instrument will accommodate or even assess the different learning needs of one student compared to another. It is impossible to create 15 or even 100 instruments that will assess broadly. Each learning difficulty has specific deficit areas and will require something that is sensitive to that fact. I feel that our job is limited because, like most things, learning difficulties are forced into compact boxes. Like most things in life, this will not work.
                It was encouraging that the article directly challenged school counselors, both future and current, to be more aware of what we need to do as competent professionals, and also raised awareness that we need to “cheerlead” harder than we have been. The field of school psychology, as well as educational standards in general, is always growing and expanding. Much like we as future counselors are always asked to continue to grow, change, and analyze these changes, our field and the instruments used must have the same scrutiny placed upon them.
                I am curious to see how many of the issues we are discovering within instrumentation are due to increased awareness that instruments are not as student-centered as they should be, and how many arise because our education systems are continuously being challenged to be more streamlined and effective for testing/funding purposes. I really think it is more from the testing/funding perspective, which is upsetting, but regardless, the first step toward change is raising awareness. I hope to gain clarity during class about more action-based assessments. I was glad most of the article talked about action-based assessments, since these are geared towards implementing something, checking to make sure the results are what you are looking for, and then revamping anything that is not getting you the end result you desire. I want to learn more about retrospective assessments because when I was reading about them they sounded a lot like pre/posttest assessments, and I would also be curious to see what their validity/reliability issues are and whether they are the same as those of pre/posttest assessments.

-------------------------------------------------------------------------------------------------------------
Blog entry 2:
                I found this article to be interesting and vastly different from the experiences I had with my own school counselor in high school. Mrs. Ghettle was predominantly utilized for career decisions post high school. I never saw or heard of her doing any type of assessment or even running group counseling sessions. When I was assessed or had 504 meetings with my mother, it was only with the School Psychologist. I did not attend most meetings, so maybe she was present in some of the meetings that I did not attend. Unfortunately, Mrs. Ghettle's position now seems to have been a waste. I know Annville did not utilize her in the best way to aid students. I'm not sure she would have had the competency that I will walk out of this program with, but I'm sure she could rely somewhat on her almost 20 years of experience.
                I felt relieved to know that my experience was most likely different from that of others in my position, as the article cited research from 1999 in which counselors were more effective and more utilized in assessments than teachers and secondary school principals. Through reading this chapter (which I loved, as I just had stats last year) I was reflecting a lot on my blog from last week. I know it was basically seething with hatred for assessments and the mental/emotional strain they place on clients. I guess, in retrospect, most of the assessment is about how it is given and the perception of the client going into the assessment. Obviously, because of my experiences, I will be paying special attention to my clients' pre-assessments etc.
                Chapter 3 was a very easy-to-understand overview of the statistical work that we will be expected to know and utilize during assessments. I was a little worried upon signing up for this class that the statistical portion would be presented (via the book) in a more difficult manner. I found the authors' instruction to be presented more in layman's terms.
                I was pleased that chapter 5 (sorry, I got a little ahead) did such a good job covering internal consistency reliability, as I think this is one of the most important parts of an assessment, next to making sure the person who is doing the assessment is qualified and understands the instrumentation. Effective administration, scoring, and internal reliability will be the most crucial parts of performing an assessment that will actually benefit the client.

--------------------------------------------------------------------------------------------------------------

Baker Blog 1: Chapters 1,2,17
                In writing this first blog, parts of what I talk about will be repeated from what I shared in class. I mentally struggle with thinking about assessments being part of my job because I firmly believe that as a counselor it is my job to not put people in boxes. Even though I can personally speak to the benefits of being assessed and having accommodations, I have still always been discontented that my style of learning is placed into a box. I didn't realize that the reason the term has changed from "testing" to "assessment" is that assessment, although the terms are used interchangeably, encompasses a lot more than just administering tests. As a future counselor, I will strive to make sure that all students to whom I give assessments have better experiences than I had. I enjoyed finding out that assessments also include collecting information from various sources, not just what the applicant can reproduce under pressure.
                I found the pre-screening process to be one of the most invasive parts of the assessment. I know that this is a crucial part of an assessment because it allows you to gather information from the client, and it also gives the client some power in expressing (if able) their concerns about difficulties in learning/understanding. I just can't get over the feelings I remember about how the screening for possible depression and ADD made me feel more defective. I didn't find the screenings to be helpful. I can recognize that the woman who did my re-evaluation was not the most personable, and obviously her personality and general disposition could be affecting my feelings about the overall process. I was too young during my first evaluation to really remember feeling stupid or upset. I remember feeling frustrated, but I had known the man who did my assessment for years because he started meeting with me for ADD. He had been meeting with my brother for years before that, so we had a pre-established relationship.
                When reading the part in chapter 2 about computer-based testing, I found myself feeling concerned about what the test truly can tell us about the individual. Obviously, as I am a testing outlier, most of my concerns about tests in general are geared towards individuals who might not perform as well on a computer due to a lack of instruction or not understanding the directions that are given. I know that as a future counselor I will be appreciative of computer-based tests because they will provide ease of access, scoring, and interpretation. I thought I remembered reading that comparisons of computer-based and pencil-and-paper tests showed that they could not be used interchangeably. I could be misremembering, but regardless, I know that I struggle more with computer-based tests. Obviously, again, my history has skewed my view of assessments, so I can only hope that during this semester I will become more educated about the benefits of assessment to help outweigh my clinging bitterness.

Understanding Assessment Scores

When I began reading Chapter 4 this week, I was a little confused about the content of the chapter and how I could relate it to my career. Chapter 3 had focused more on statistical calculations and data, which I have learned a great deal about in the past. However, interpreting test scores is an area I am not as familiar with, specifically norm-referenced and criterion-referenced scores. After finishing Chapter 4, I have begun to comprehend what these two types of interpretation mean and how beneficial it can be to understand them as a school counselor.

Many types of standard scores are recognizable to me from other classes and from conversation within schools. Drummond and Jones (2010) were able to give a deeper description of how these scores are applied and used to understand a student's capability and where they stand in comparison to other students of their age or grade, or to the larger population. The grade and age equivalents in particular stuck out to me. It was noted that it is important that these not be used alone as primary scores because they have many limitations. I found this to be important. It seems simple to assume that these equivalents would provide a great deal of information about a student in comparison to their peers, but the grade equivalents are specific to the student's grade level and subtest. A fifth grade student may receive an above average score in math such as 7.4, but this does not mean she is at a 7th grade level. This number only tells us that she is above her peers within 5th grade and performing at a much higher level for 5th grade math. The age equivalents are also difficult to rely on because the rate of growth for most behavior varies year by year and person by person. This makes it hard to compare students to others at their age. I believe having a stronger understanding of these concepts will help me when I am working with students in schools. I will be able to break down a student's ability level based on this score while still recognizing the limitations those scores have.

Now that we have gained this deeper comprehension of assessment and interpreting scores, I am looking forward to figuring out how to apply this to real life examples. I am hoping to start to learn more about the tests and assessments that are providing these scores and how we can use these scores to help our students.

I also wanted to reflect on last week's class, when we discussed ethical issues as a class. I truly believe that ethical issues will always be one of the most difficult aspects of our career. There is no black and white but instead a lot of gray areas that we have to be cautious of. Even after going over each case study, it was clear that everyone views each situation differently. This also made me aware that different codes would come into play depending on who is handling the case. Something that made me feel a little more confident about making ethical decisions was learning more about the standard systems and rules that are laid out in individual districts. It seems expected that we follow the rules that we have been provided, but we also have our codes of ethics, our peers, and other professionals that we can turn to if we need assistance. Although this area seems to be the most difficult, there are many resources, and knowing and understanding this provides some sense of security.

Drummond, R. J. and Jones, K. (2010). Assessment Procedures for Counselors and Helping Professionals. Upper Saddle River, New Jersey: Pearson Education, Inc.

Week 3

I skipped ahead in my blog post last week, so I am now returning to comment on Ekstrom's article. Although it was a mistake, I am happy that I now have the benefit of our small group discussion to reference when describing the article because we had a number of interesting conversations last week. Although I knew that assessment is an important role for school counselors, I was surprised to read that among school administrators, teachers, and counselors surveyed, counselors were determined to have the strongest background in assessment and therefore have to serve in an advisory role for many staff members. This surprised me because of the strong emphasis on the use of standardized tests in recent years, which has forced teachers and administrators to focus on data and demonstrate student achievement in quantitative terms. I thought that teachers were required to have a number of classes in assessment, but one of my group members shared that this was not the case, so it would make sense, then, that the school counselor would become the go-to person in the school. As I shared in class last week, I'm not yet comfortable enough with my assessment knowledge to be the "school expert," but I hope that I will feel much more capable by the completion of this course. The discussions we had surrounding ethics were also helpful, as we determined that the school counselor is not alone in administering assessments and interpreting results, but should often defer to the school psychologist or even other school officials when appropriate. I think that I may often err on the side of being overly cautious when I first begin working in a school because of my concern about my own competence. I will have to balance this sense of caution with my desire to help and be as effective as possible in my new role. If time is on my side, I don't see any problems with consulting others before making judgements that could potentially send a student down an entirely different academic or career path.
If time is not on my side, then I will have to make the most ethical decision I can make (and the same one I would want a school staff member to make if they were working with my child).

Ekstrom, R.B., Elmore, P.B., Schaefer, W.D., Trotter, T.V., & Webster, B. (2004). A survey of assessment and evaluation activities of school counselors. Professional School Counseling, 8 (1), 24-30.

Blog #3

I have to say that I felt good about the first half of class last week, and was even curious to start learning about some of the different statistical concepts. After leaving class I was a bit overwhelmed, but some of the concepts started to come back to me. When I first started to read chapter 3, I had to break it up as much as I could. When I take too many mathematical terms and concepts in all at the same time, I just get confused, which is how I kind of felt during the second half of class last week. After I got home from school that evening, I just decided to let it sink in and to look at it again the next morning. At that point, I realized that I am starting to pick things up and things are starting to come back to me.

Another extremely important step in the assessment process is the understanding of the scores. Chapter 4 gives a good summary of how this process works, and some of the topics that were introduced really interested me. For example, I could see how choosing a norm group to measure testing results could be complicated. If you take into account all of the different characteristics you need to consider in order to choose a norm, in some cases, I don’t even see how it is possible to create a consistent and fair testing or scoring process. It just seems to me that there would always be some sort of bias or lack of sensitivity to many of the groups or people involved in the testing process.

Drummond & Jones (2010) also state that "it is reasonable to expect that instruments be revised at least every 10 years and that new norms will accompany the revisions (Thorndike, 2005)." I bet an argument could be made that 10 years is too long to wait to change the norms and testing instruments. I am also sure that some of the testing instruments out there probably haven't been changed in 20 years or more. All of the information in chapter 4 makes me believe that we should tread very lightly and be extremely picky about how we choose to test or measure results for any future client or student. It also shows that we should be careful not to make assumptions or decisions about a client or student based on one set of results from an assessment.

Drummond, R.J. & Jones, K. (2010). Assessment procedures for counselors and helping professionals (7th ed.). Upper Saddle River, New Jersey: Pearson Education Inc.

Wednesday, February 20, 2013

Week 3, Chapter 4

As I read chapter 4, “light bulbs” started to turn on for me. There were several terms in this chapter that I had seen but never truly understood. I appreciate the way the authors of this book explain concepts and give examples in simple terms. The figures on pages 70 and 71 are incredibly helpful for me, as I am a very visual person. Just seeing how various types of scores compare to the normal curve really helped to solidify these concepts for me. Also, while I understood what norm-referenced and criterion-referenced tests were, I did not fully understand the difference between them until I read the section about when to use each test. The example about a depression inventory made perfect sense; if using a criterion-referenced test for such an assessment, one would only understand whether or not the person has “mastered the knowledge and skills to be depressed” (Drummond & Jones, 2010). This, of course, would not make sense, so a norm-referenced test would be used.

When reading about norm groups, I wondered how frequently norm-referenced tests are updated based on the norm group. Page 67 of Drummond & Jones (2010) states, “It is reasonable to expect that instruments will be revised at least every 10 years.” With the rapidly changing diversity of the population locally, state- and nation-wide, I wonder how relevant the norm group is for a given test in the months or years before it is updated, as it relates to ethnicity, race, and SES.

I still have some unanswered questions. First, I have heard of stanine scores (mostly from reading evaluation reports) but I am not familiar with sten scores. I wondered what exactly sten scores are used for. Concerning stanine scores, while I understand that they are a simplified way of describing scores, they still do not make sense to me when just looking at the score. It seems that including a percentile rank or a qualitative description is essential to help solidify understanding of a variety of the types of scores, sten and stanine included. Perhaps the part of this chapter that I feel completely in the dark about is the GSV, or Growth Scale Value, which is included in figures on pages 77 and 78. I did not see a description of this, and am completely confused about what it means and what it is used for, especially after viewing the figure on page 78.
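One way to make a bare stanine easier to read is to pair it with the percentile band it covers. The sketch below is only an illustration, not something taken from Drummond & Jones: it assumes the conventional stanine distribution (4%, 7%, 12%, 17%, 20%, 17%, 12%, 7%, and 4% of a normally distributed group), and the function name `stanine_percentile_band` is made up for this example.

```python
from itertools import accumulate

# Conventional share of a normal distribution falling in stanines 1..9.
# This is an assumption of the sketch, not a value quoted from the textbook.
STANINE_PERCENTS = [4, 7, 12, 17, 20, 17, 12, 7, 4]


def stanine_percentile_band(stanine: int) -> tuple[int, int]:
    """Return the (lower, upper) percentile bounds covered by a stanine."""
    if not 1 <= stanine <= 9:
        raise ValueError("stanine must be between 1 and 9")
    # Running totals give the upper bound of each stanine: 4, 11, 23, ..., 100.
    upper_bounds = list(accumulate(STANINE_PERCENTS))
    lower = 0 if stanine == 1 else upper_bounds[stanine - 2]
    return lower, upper_bounds[stanine - 1]


# Print the band for every stanine so a score can be read alongside it.
for s in range(1, 10):
    low, high = stanine_percentile_band(s)
    print(f"stanine {s}: percentiles {low}-{high}")
```

Read this way, a stanine of 5 covers roughly the middle fifth of the norm group (about the 40th to the 60th percentile), which is the kind of companion information a report would need to include for the score to mean something on its own.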

While I feel more comfortable with the concepts of understanding assessment scores, I still cannot help but wonder how all of the data used to determine norm groups, percentile ranks, etc. are organized. It is just so much data to keep straight! I would be interested in learning the process of such large-scale data collection.

Drummond, R. J. and Jones, K. (2010). Assessment procedures for counselors and helping professionals (7th ed.). Upper Saddle River, NJ: Pearson.

Week 3 (chp. 4)

While reading chapter four in our textbook this week, I found myself alternating between having to read the terms and explanations several times and having some "Ah ha! Now I understand this a bit better!" moments. I liked learning the differences between criterion-referenced and norm-referenced interpretations. This made sense to me, and I started thinking about the different tests I have taken in my life. I pulled out my old GRE scores and read through the information about interpreting your scores. It made much more sense now! I looked at a table of general test mean scores showing results from a three-year period. The number of examinees was 1.4 million (quite a sizable group!) and it listed their mean and standard deviation for the three parts of the test. The numbers made sense to me now, whereas a few weeks ago they didn't. I then started thinking about the height/weight charts at my daughter's pediatrician's office and how they always show, at the yearly physical, where your child is on a growth curve. I admit that I am not a "numbers" person in many ways. We used to make jokes about my younger brother, who did his high school baseball team's statistics for fun. Really? Well, now he has his MBA and is the CFO of a very large non-profit organization in Ohio. He always loved numbers, and it has translated into a vocation he is passionate about.

While I may not want to do statistics for fun, I do enjoy reading the finished product, and I am always curious about how I am doing compared to others. I guess that is why I enjoy the D2L feature which allows you to see your grade compared to your classmates. I am gaining a much greater appreciation for what goes into that "finished product" now. This chapter was very helpful in breaking down different methods of scoring and helping me understand the material so I, in turn, can more easily interpret the scores for the clients I will be serving in the future.

Drummond, R. J. and Jones, K. (2010). Assessment procedures for counselors and helping professionals (7th ed.). Upper Saddle River, NJ: Pearson.

Tuesday, February 19, 2013

Week 3, Chapter 4

Last spring around this time, I was learning statistics in PSYC 612. While initially nervous about the math part, I found that it actually came easier than the concepts did. I enjoyed having formulas to plug numbers into, but became frustrated when I forgot a step along the way and wound up with the wrong answer. Most of our discussion and use of statistics for that class seemed to revolve around research and interpreting the results of research. Because research is not too high on my priority scale or bucket list, I figured that I’d probably never have to actually use statistics in my daily work. After reading for the past couple of weeks, I’ve determined that I thought wrong. As Drummond & Jones (2010) state in the introduction to this week’s chapter, “Counselors are often called upon to interpret the results of tests, rating scales, structured interviews, and various other instruments used in the assessment process” (p. 63). Ekstrom et al. (2004) found that of 161 school counselors surveyed, 29% were responsible for selecting tests, 63% for administering tests, and 71% for interpreting tests. Unless another career change is in my near future (and I sincerely hope that it is not), it looks like I most certainly have not escaped the daily use of statistics. However, Drummond & Jones’ explanations of the interpretations of different types of assessment scores in this week’s reading made the idea of using them quite a bit less overwhelming and scary.

A couple of new concepts for me included those of stanines and sten scores. I had heard these words dropped in other conversations (possibly in my statistics class last year), but never really knew what they were. According to Drummond & Jones, stanines are widely used in education, so it’s probably a good thing that I finally know what they are (a standard score that converts raw scores into values ranging from 1 to 9). Sten scores are similar, but range from 1 to 10 instead of 1 to 9. I was surprised to learn that grade equivalents ARE NOT an estimate of the grade at which the student should be placed. A better way to understand grade equivalents is as a method of comparing the scores of typical students at various grade levels on the same test. I found the example of Katy, a 5th-grade student who received a grade equivalent of 7.4, to be very helpful in understanding what grade equivalents do and do not tell the interpreter(s). I also found Drummond & Jones’ explanations of why it is better to choose to use either a criterion-referenced interpretation OR a norm-referenced interpretation to be helpful.
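To see how a raw score could turn into a stanine or a sten, here is a minimal sketch. It rests on assumptions that are not spelled out in the post: that scores are roughly normally distributed, that stanines are treated as a scale with a mean of 5 and a standard deviation of 2, and that stens are treated as a scale with a mean of 5.5 and a standard deviation of 2. Some test publishers assign these scores from percentile bands instead, so this is one common convention rather than the method any particular test uses. The function names and the sample numbers are hypothetical.

```python
import math


def z_score(raw: float, mean: float, sd: float) -> float:
    """Express a raw score as standard deviations from the norm-group mean."""
    return (raw - mean) / sd


def _round_half_up(x: float) -> int:
    # Explicit half-up rounding; Python's built-in round() rounds halves to even.
    return math.floor(x + 0.5)


def stanine(z: float) -> int:
    """Linear conversion (assumed mean 5, SD 2), limited to the range 1..9."""
    return min(9, max(1, _round_half_up(2 * z + 5)))


def sten(z: float) -> int:
    """Linear conversion (assumed mean 5.5, SD 2), limited to the range 1..10."""
    return min(10, max(1, _round_half_up(2 * z + 5.5)))


# Hypothetical example: a raw score of 62 on a test whose norm group
# has a mean of 50 and a standard deviation of 10.
z = z_score(62, mean=50, sd=10)
print(stanine(z), sten(z))  # prints: 7 8
```

The two scales differ mainly in where the middle falls: an exactly average score lands in stanine 5, while the sten scale has no single middle value, so scores just below average fall in sten 5 and scores just above fall in sten 6.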

References

Drummond, R.J. & Jones, K. (2010). Assessment procedures for counselors and helping professionals (7th ed.). Upper Saddle River, NJ: Pearson.

Ekstrom, R.B., Elmore, P.B., Schaefer, W.D., Trotter, T.V., & Webster, B. (2004). A survey of assessment and evaluation activities of school counselors. Professional School Counseling, 8 (1), 24-30.

Sunday, February 17, 2013

Week 3

I was very interested in the concept of norm-referenced tests and the importance of ascertaining the norm group for a particular test. I had no idea that the norm group for tests like the KBIT was based on census data so that it would reflect the current population of the United States (Drummond and Jones, 2010). I would think it would be very complicated to get a norm group that reflects the various demographics listed, such as parental education, ethnicity, socio-economic status, region of the country, etc. When it comes to testing and finding a stratified sample, there are so many variables; how would one determine which ones are integral to a student’s test performance? Still, it is comforting to know that students are not norm-referenced against a norm group of children all of whose parents have advanced degrees, or all of whom are English language learners. That is, of course, unless I am interested in knowing how students compare to other similar students as opposed to knowing how they fare against the general populace. It would seem that the norm group should be chosen based on the intent of the assessment. Therefore, in accordance with RUST (2003) guidelines, the purpose of any test should be very clear when choosing an assessment as well as when choosing the interpretation of the results. Perhaps that is why the PSSA results that are often reported in the paper have scores for all students and then subset average scores for ELL students and low-SES students. In this way districts and parents can get results that help them learn about different aspects of students' educational progress and that speak to specific assessment purposes.

References

American Counseling Association, & Association for Assessment in Counseling. (2003). Responsibilities of users of standardized tests (RUST). Alexandria, VA: Author.

Drummond, R. J. and Jones, K. (2010). Assessment Procedures for Counselors and Helping Professionals. Upper Saddle River, New Jersey: Pearson Education, Inc.