Language Test Reliability
Teacher: Dr. Golshan
Prepared by: Tahere Bakhshi
November 2015
In the name of God
A test should have:
Reliability: (the same result under the same conditions)
Validity: (a scale meant to measure head size should measure head size, not something else)
Usability or Practicality: (not too difficult, practical to use)
• The problem of measuring mental traits, language proficiency, motivation and … !
• Tests should measure consistently!
Variance
Variance: variance measures how far a set of numbers is spread out.
Variance of zero: identical values
Small variance: values close to the mean
High variance: values spread out, far from the mean
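The three cases above can be checked with a short calculation. This is a minimal sketch using Python's standard `statistics` module; the three score lists are invented for illustration and do not come from the slides.

```python
from statistics import mean, pvariance

# Hypothetical score sets, invented only to illustrate the three cases.
identical = [70, 70, 70, 70, 70]      # every value is the same
clustered = [68, 69, 70, 71, 72]      # values close to the mean
spread_out = [40, 55, 70, 85, 100]    # values far from the mean

for name, scores in [
    ("identical", identical),
    ("clustered", clustered),
    ("spread out", spread_out),
]:
    # pvariance treats the list as the whole population:
    # the mean of the squared deviations from the mean.
    print(f"{name:>10}: mean = {mean(scores)}, variance = {pvariance(scores)}")
```

All three lists have the same mean (70), but their variances are 0, 2 and 450, so the mean alone says nothing about how spread out the scores are.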
Sources of Variance
• Meaningful Variance
Those sources of variance that are related to the purpose of the test.
To achieve this, the items should be related to the purpose of the test and to the students’ knowledge of the topic. This is a test validity issue (see Table 8.1, p. 170).
Other factors unrelated to the aim of the test:
• Measurement error or Error Variance
Those sources of variance that are related to other, extraneous variables.
• Types of issues related to error variance
1. Variance due to the environment: (noise inside and outside the classroom, classroom temperature, distractions, amount of space per person, lighting, ventilation, or other environmental factors).
2. Variance due to the administration procedure: (test directions, quality of equipment, and timing (cassette or teacher)). See Table 2.5, p. 35.
3. Variance due to examinees: (condition of students: their fatigue, health, hearing, or vision); (psychological factors: motivation, memory, concentration, forgetfulness, impulsiveness, carelessness and …); (students’ testwiseness and strategies).
4. Variance due to the scoring procedure: errors in scoring; the subjective nature of the scoring procedure.
5. Variance due to the test and test items: (printing, familiarity with the answer sheet, number of items, item selection, quality of test items, test security).
The sources of measurement error mentioned above should be minimized so that they add no variance to students’ scores.
(Dependable or trustworthy)
A test is considered reliable if it would give us the same result over and over again.
How is reliability measured?
By comparing two sets of scores for a single assessment (such as two raters’ scores for the same person). Once we have two sets of scores for a group of students, we can determine how similar they are by computing a statistic known as the reliability coefficient.
Reliability Coefficient:
A numerical index of reliability, ranging from 0 to 1.
A number closer to 1 = high reliability. A low reliability coefficient indicates more error in the assessment results.
Reliability is considered good or acceptable if the reliability coefficient is .80 or above.
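As an illustration of comparing two sets of scores, the sketch below computes a Pearson correlation between two raters' scores and checks it against the .80 guideline. The slides do not say which correlation statistic is meant, so Pearson's r is an assumption here, and the rater scores are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two score lists of equal length (at least 2 scores)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of the deviations from each mean.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

# Hypothetical essay scores given by two raters to the same six students.
rater_a = [12, 15, 9, 18, 14, 11]
rater_b = [13, 14, 10, 17, 15, 10]

r = pearson_r(rater_a, rater_b)
print(f"reliability coefficient = {r:.2f}")
print("acceptable (.80 or above)" if r >= 0.80 else "below the .80 guideline")
```

The same function applies to any of the two-score-set designs in this deck (two raters, two administrations, or two forms); only the source of the second list changes. It does not handle the case where one list has no variation at all, since the correlation is undefined there.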
• Reliability of NRTs (norm-referenced tests)
Testing in language programs (chapter 8)
Three Basic Strategies to Estimate the Reliability of a Test:
1. Test-Retest Reliability:
Situation: The same people take two administrations of the same test.
Procedure: Correlate the scores on the two administrations, which yields the coefficient of stability.
Meaning: The extent to which scores on a test can be generalized over different occasions (temporal stability).
Appropriate use: Information about the stability of the trait over time.
Disadvantages: Requires two testing sessions; learning may take place between sessions; test (practice) effect.
2. Parallel / Equivalent-Forms Reliability:
• Situation: Testing the same people on different but comparable forms of the test (Forms A & B).
• Procedure: Correlate the scores from the two forms, which yields a coefficient of equivalence.
• Meaning: The consistency of response to different item samples (where testing is immediate) and across occasions (where testing is delayed).
• Appropriate use: To provide information about the equivalence of forms.
Ali usually ………… late at night. a. study b. studies c. studying
Reza often ………… the shopping in the afternoon. a. do b. does c. doing
3. Internal Consistency Reliability:
• Situation: A single administration of one test form. All items in an internally consistent scale assess the same construct.
• Procedure: Divide the test into comparable halves and correlate the scores from both halves.
– Split-Half with Spearman-Brown adjustment
– Kuder-Richardson #20 and #21
– Cronbach’s Alpha
• Meaning: Consistency across the parts of a measuring instrument (“parts” = individual items or subgroups of items).
• Appropriate use: Where the focus is on the degree to which the same characteristic is being measured. A measure of test homogeneity.
Internal Consistency Strategies
All items in the test should be homogeneous, and there should be a relationship among them.
– Split-Half Reliability
– Cronbach Alpha
– Kuder-Richardson Formulas
Split-Half Reliability
In split-half reliability we randomly divide all items that purport to measure the same construct into two sets. We administer the entire instrument to a sample of people and calculate the total score for each randomly divided half. The split-half reliability estimate, as shown in the figure, is simply the correlation between these two total scores. In the example it is .87.
Instead of a random split, odd- and even-numbered items can be used, so that easy and difficult items are equally distributed between the two halves.
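The split-half procedure can be sketched in a few lines. The example below uses an odd/even split and a Pearson correlation between the two half totals; the response matrix is hypothetical and is not the data behind the .87 quoted above.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of half-test totals."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

# Hypothetical responses: rows = students, columns = items 1..8 (1 = correct, 0 = wrong).
responses = [
    [1, 1, 1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0, 0],
    [1, 1, 0, 0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 1, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
]

# Odd/even split: items 1, 3, 5, 7 form one half and items 2, 4, 6, 8 the other.
odd_totals = [sum(row[0::2]) for row in responses]
even_totals = [sum(row[1::2]) for row in responses]

r_half = pearson_r(odd_totals, even_totals)
print("odd-item totals: ", odd_totals)
print("even-item totals:", even_totals)
print(f"split-half correlation = {r_half:.2f}")
```

Note that this correlation describes a test only half as long as the one actually administered, which is why a length adjustment is normally applied afterwards.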
Spearman-Brown Prophecy Formula
r_{kk} = \frac{k \, r_{11}}{1 + (k - 1) \, r_{11}}
k = the number of items I WANT to estimate the reliability for, divided by the number of items I HAVE
r_{11} = the reliability of the test I have; r_{kk} = the estimated reliability of a test k times as long
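A short worked example may help with the Spearman-Brown formula, r_kk = k·r_11 / (1 + (k − 1)·r_11). The function name and the 20-item test below are hypothetical; the .87 is the split-half correlation quoted earlier in the deck, stepped up to full length with k = 2.

```python
def spearman_brown(r_11, k):
    """Estimated reliability r_kk of a test k times as long as the current one.

    r_11: reliability of the test (or half-test) we have.
    k:    items wanted divided by items we have.
    """
    return (k * r_11) / (1 + (k - 1) * r_11)

# Split-half case: each half has half the items, so the full test is k = 2 times as long.
full_test = spearman_brown(0.87, 2)

# Hypothetical lengthening: a 20-item test with reliability .70 extended to 40 items.
lengthened = spearman_brown(0.70, 40 / 20)

print(f"full-test reliability from a .87 split-half correlation: {full_test:.3f}")
print(f"predicted reliability after doubling a .70 test: {lengthened:.3f}")
```

With k = 1 the formula returns the original reliability unchanged, and with k below 1 it estimates the reliability of a shortened test.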
Cronbach Alpha
Cronbach’s coefficient alpha is used when item scores are other than 0 and 1 (such as a Likert scale); for items scored only 0 or 1 it gives the same result as KR-20. It is advisable for essay items, problem solving, and items scored on a 5-point scale. It is based on two or more parts of the test and requires only one administration of the test.
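The slides do not give the formula for alpha, so the sketch below assumes the standard one: alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores), where k is the number of items. The Likert responses are hypothetical.

```python
from statistics import pvariance

def cronbach_alpha(rows):
    """Cronbach's alpha for a score matrix: rows = respondents, columns = items."""
    k = len(rows[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Variance of each item across respondents.
    item_variances = [pvariance([row[i] for row in rows]) for i in range(k)]
    # Variance of the respondents' total scores.
    total_variance = pvariance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

# Hypothetical responses of five people to four 5-point Likert items.
likert = [
    [4, 5, 4, 4],
    [3, 3, 2, 3],
    [5, 5, 5, 4],
    [2, 2, 3, 2],
    [4, 4, 4, 5],
]

alpha = cronbach_alpha(likert)
print(f"Cronbach's alpha = {alpha:.2f}")
```

Population variance is used for both the items and the totals; using sample variance for both would give the same alpha, because only their ratio enters the formula.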
Kuder-Richardson Formulas
Kuder and Richardson assumed that all items in a test are designed to measure a single trait.
KR-21 is the most practical, frequently used, and convenient method of estimating reliability.
KR-20 = most advisable if the p values (item difficulties) vary a lot
KR-21 = most advisable if the items do not vary much in difficulty, i.e., the p values are more or less similar.
The KR-21 formula is a simplified version of KR-20.
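To make the difference between the two formulas concrete, here is a minimal sketch assuming their standard forms, which the slides do not spell out: KR-20 = k/(k − 1) × (1 − Σpq / s²) and KR-21 = k/(k − 1) × (1 − M(k − M) / (k·s²)), with k items, p the proportion answering an item correctly, q = 1 − p, M the mean total score and s² the variance of total scores. The response matrix is hypothetical.

```python
from statistics import mean, pvariance

def kr20(rows):
    """KR-20 for dichotomous (0/1) items: uses each item's own p value."""
    k = len(rows[0])
    n = len(rows)
    total_variance = pvariance([sum(row) for row in rows])
    # p = proportion correct on the item, q = 1 - p.
    sum_pq = 0.0
    for i in range(k):
        p = sum(row[i] for row in rows) / n
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / total_variance)

def kr21(rows):
    """KR-21: the simplification that treats all items as equally difficult."""
    k = len(rows[0])
    totals = [sum(row) for row in rows]
    m = mean(totals)
    total_variance = pvariance(totals)
    return (k / (k - 1)) * (1 - (m * (k - m)) / (k * total_variance))

# Hypothetical responses: rows = students, columns = items (1 = correct, 0 = wrong).
responses = [
    [1, 1, 1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0, 0],
    [1, 1, 0, 0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 1, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
]

print(f"KR-20 = {kr20(responses):.3f}")
print(f"KR-21 = {kr21(responses):.3f}")
```

In this made-up data the p values range from about .17 to .83, so KR-21 comes out lower than KR-20; the two would agree if every item had the same p value.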
Inter-rater Reliability
A sample of test papers (essays) is scored independently by two examiners.
Inter-rater reliability is a measure of reliability used to assess the degree to which different judges or raters agree in their assessment decisions. Inter-rater reliability is useful because human observers will not necessarily interpret answers the same way; raters may disagree as to how well certain responses or material demonstrate knowledge of the construct or skill being assessed.
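One way to quantify agreement between two examiners is sketched below. The slides do not name a statistic, so both measures here are assumptions: simple percent agreement, and Cohen's kappa, which corrects that agreement for what two raters would reach by chance. The band scores are hypothetical.

```python
from collections import Counter

def percent_agreement(a, b):
    """Share of papers on which the two examiners gave the same band."""
    if len(a) != len(b) or not a:
        raise ValueError("need two non-empty rating lists of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohen_kappa(a, b):
    """Cohen's kappa: agreement corrected for agreement expected by chance."""
    n = len(a)
    p_observed = percent_agreement(a, b)
    counts_a, counts_b = Counter(a), Counter(b)
    # Chance agreement: product of the two raters' proportions for each band.
    p_expected = sum(counts_a[c] * counts_b[c] for c in set(a) | set(b)) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical band scores given independently by two examiners to ten essays.
examiner_1 = ["A", "B", "B", "C", "A", "C", "B", "A", "C", "B"]
examiner_2 = ["A", "B", "C", "C", "A", "C", "B", "B", "C", "B"]

print(f"percent agreement = {percent_agreement(examiner_1, examiner_2):.2f}")
print(f"Cohen's kappa     = {cohen_kappa(examiner_1, examiner_2):.2f}")
```

For numeric essay scores rather than bands, correlating the two examiners' scores is the more usual choice.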
Intra-rater Reliability
The degree of stability observed when a measurement is repeated under identical conditions by the same rater.
• Note: Intra-rater reliability makes it possible to determine the degree to which the results obtained by a measurement procedure can be replicated.
Standard Error of Measurement
• All test scores contain some error.
• For any test, the higher the reliability estimate, the lower the error.
• The standard error of measurement is the average standard deviation of the error variance over the number of people in the sample.
• It can be used to estimate a range within which a true score would likely fall.
• We never know the true score.
• By knowing the S.E.M. and by understanding the normal curve, we can assess the likelihood of the true score being within certain limits.
• The higher the reliability, the lower the standard error of measurement, and hence the more confidence we can place in the accuracy of a person’s test score.
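The points above can be illustrated with the usual classical-test-theory formula, SEM = SD × √(1 − reliability), which the slides do not state and is therefore an assumption of this sketch. The standard deviation, reliability and observed score are hypothetical.

```python
from math import sqrt

def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability), with reliability between 0 and 1."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sd * sqrt(1.0 - reliability)

def score_band(observed, sem, z=1.0):
    """Range observed - z*SEM .. observed + z*SEM.

    Under a normal curve, z = 1 covers about 68% and z = 1.96 about 95%.
    """
    return observed - z * sem, observed + z * sem

# Hypothetical test: standard deviation 10, reliability .84, one observed score of 75.
sd, reliability, observed = 10.0, 0.84, 75.0

sem = standard_error_of_measurement(sd, reliability)
low68, high68 = score_band(observed, sem)
low95, high95 = score_band(observed, sem, z=1.96)

print(f"SEM = {sem:.2f}")
print(f"about 68% band: {low68:.2f} to {high68:.2f}")
print(f"about 95% band: {low95:.2f} to {high95:.2f}")
```

Raising the reliability in this example from .84 to .96 would halve the SEM from 4 to 2, which is the sense in which higher reliability allows more confidence in an individual score.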
Factors That Affect the Reliability Coefficient
• Test length
• Range of scores
• Item similarity
Questions?
Letter to Secretary Linda McMahon from U.S. SenatorsLetter to Secretary Linda McMahon from U.S. Senators
Letter to Secretary Linda McMahon from U.S. Senators
Mebane Rash
 
Dastur_ul_Amal under Jahangir Key Features.pptx
Dastur_ul_Amal under Jahangir Key Features.pptxDastur_ul_Amal under Jahangir Key Features.pptx
Dastur_ul_Amal under Jahangir Key Features.pptx
omorfaruqkazi
 
20250515 Ntegra San Francisco 20250515 v15.pptx
20250515 Ntegra San Francisco 20250515 v15.pptx20250515 Ntegra San Francisco 20250515 v15.pptx
20250515 Ntegra San Francisco 20250515 v15.pptx
home
 
Peer Assesment- Libby.docx..............
Peer Assesment- Libby.docx..............Peer Assesment- Libby.docx..............
Peer Assesment- Libby.docx..............
19lburrell
 
Final Evaluation.docx...........................
Final Evaluation.docx...........................Final Evaluation.docx...........................
Final Evaluation.docx...........................
l1bbyburrell
 

Testing in language programs (chapter 8)

  • 1. Language Test Reliability. Teacher: Dr. Golshan. Prepared by: Tahere Bakhshi. November 2015. In the name of God
  • 2. A test should have: Reliability (the same result under the same conditions). Validity (it measures what it is meant to measure: a scale for the size of the head measures the size of the head, not something else). Usability or Practicality (not too difficult, practical to use). The problem: measuring mental traits such as language proficiency, motivation and so on! Tests should measure consistently!
  • 3. Variance. Variance measures how far a set of numbers is spread out. Variance of zero: identical values. Small variance: values close to the mean (the expected value). High variance: values spread out, far from the mean.
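The three cases on this slide can be illustrated with a minimal sketch. The score lists below are invented for illustration, and population variance (`statistics.pvariance`) is an assumption, since the slides do not say whether a population or sample formula is meant.

```python
from statistics import pvariance

# Hypothetical score sets, made up for illustration only.
identical = [70, 70, 70, 70]    # every value the same
clustered = [68, 70, 71, 71]    # values close to the mean
spread = [40, 55, 85, 100]      # values far from the mean

for label, scores in [("identical", identical),
                      ("clustered", clustered),
                      ("spread", spread)]:
    # pvariance = mean of the squared deviations from the mean
    print(label, pvariance(scores))
```

The ordering of the three results, not their exact size, is the point: zero for identical values, small for clustered values, large for spread-out values.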
  • 4. Sources of Variance. Meaningful variance: those sources of variance that are related to the purpose of the test. To achieve this, items should be related to the purpose of the test and to students’ knowledge of the topic; this is a test validity issue (see Table 8.1, p. 170). Other factors, unrelated to the aim of the test: measurement error or error variance, meaning those sources of variance that are related to other, extraneous variables.
  • 5. Types of issues related to error variance. 1. Variance due to the environment (noise, classroom temperature, outside noises, distractions, amount of space per person, lighting, ventilation, or other environmental factors). 2. Variance due to the administration procedure (test directions, quality of equipment, and timing (cassette or teachers)). Table 2.5, p. 35. 3. Variance due to examinees (condition of students: their fatigue, health, hearing or vision; psychological factors: motivation, memory, concentration, forgetfulness, impulsiveness, carelessness and so on; students’ test-wiseness and strategies). 4. Variance due to the scoring procedure: errors in scoring; the subjective nature of the scoring procedure. 5. Variance due to the test and test items (printing, familiarity with the answer sheet, number of items, item selection, quality of test items, test security). The mentioned sources of measurement error should be minimized so that there is no variance in students’
  • 6. Reliability (dependable or trustworthy). A test is considered reliable if it would give us the same result over and over again. How is reliability measured? By comparing two sets of scores for a single assessment (such as two raters’ scores for the same person). After obtaining two sets of scores for a group of students, we can determine how similar they are by computing a statistic known as the reliability coefficient. Reliability coefficient: a numerical index of reliability, ranging from 0 to 1. A number closer to 1 means high reliability; a low reliability coefficient indicates more error in the assessment results. Reliability is considered good or acceptable if the reliability coefficient is .80 or above. Reliability of NRTs
  • 8. Three Basic Strategies to Estimate the Reliability of a Test. 1. Test-Retest Reliability. Situation: the same people take two administrations of the same test. Procedure: correlate the scores on the two administrations, which yields the coefficient of stability. Meaning: the extent to which scores on a test can be generalized over different occasions (temporal stability). Appropriate use: information about the stability of the trait over time. Disadvantages: requires two testing sessions; learning; test effect.
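The test-retest procedure amounts to correlating two score lists for the same people. The sketch below assumes the Pearson product-moment correlation as the correlation in question (the slides only say "correlate"), and the scores are invented for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of the same length (at least 2 scores)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical scores for six students on two administrations of one test.
first_administration = [55, 62, 70, 48, 81, 66]
second_administration = [58, 60, 73, 50, 79, 69]

stability = pearson(first_administration, second_administration)
print(f"coefficient of stability: {stability:.2f}")
```

The same function would serve for parallel forms: correlating Form A with Form B scores gives the coefficient of equivalence instead.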
  • 9. 2. Parallel / Equivalent-Forms Reliability. Situation: testing the same people on different but comparable forms of the test (Forms A and B). Procedure: correlate the scores from the two forms, which yields a coefficient of equivalence. Meaning: the consistency of response to different item samples (where testing is immediate) and across occasions (where testing is delayed). Appropriate use: to provide information about the equivalence of forms. Example items: Ali usually ………… late at night. a. study b. studies c. studying. Reza often ………… the shopping in the afternoon. a. do b. does c. doing
  • 10. 3. Internal Consistency Reliability. Situation: a single administration of one test form. All items in an internally consistent scale assess the same construct. Procedure: divide the test into comparable halves and correlate the scores from both halves (split-half with Spearman-Brown adjustment; Kuder-Richardson #20 and #21; Cronbach’s alpha). Meaning: consistency across the parts of a measuring instrument (“parts” = individual items or subgroups of items). Appropriate use: where the focus is on the degree to which the same characteristic is being measured. A measure of test homogeneity.
  • 11. Internal Consistency Strategies. All items in the test should be homogeneous, and there should be a relationship among them. Strategies: Split-Half Reliability; Cronbach Alpha; Kuder-Richardson Formulas.
  • 12. Split-Half Reliability. In split-half reliability we randomly divide all items that purport to measure the same construct into two sets. We administer the entire instrument to a sample of people and calculate the total score for each randomly divided half. The split-half reliability estimate, as shown in the figure, is simply the correlation between these two total scores. In the example it is .87. A common split is odd versus even items, with easy and difficult items equally distributed between the halves.
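A minimal sketch of the split-half idea, using the odd/even split mentioned on the slide rather than a random one. The 0/1 response matrix is invented for illustration, and Pearson correlation is assumed as the correlation used.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def split_half(responses):
    """Correlate each person's total on odd-numbered items with the
    total on even-numbered items (items are numbered from 1)."""
    odd_totals = [sum(row[0::2]) for row in responses]   # items 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in responses]  # items 2, 4, 6, ...
    return pearson(odd_totals, even_totals)

# Hypothetical data: 5 students x 6 items, 1 = correct, 0 = incorrect.
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

half_test_r = split_half(responses)
print(f"split-half correlation: {half_test_r:.2f}")
```

This value describes a test half as long as the real one, which is why the internal consistency slide pairs split-half with a Spearman-Brown adjustment.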
  • 13. Spearman-Brown Prophecy Formula: $r_{kk} = \frac{k \, r_{11}}{1 + (k - 1) \, r_{11}}$, where $r_{11}$ is the reliability of the test as it stands, $r_{kk}$ is the estimated reliability of the test at its new length, and k = the number of items I WANT to estimate the reliability for, divided by the number of items I HAVE.
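As a rough sketch, the prophecy formula is a one-line function. The function name is hypothetical; the .87 input is the split-half correlation quoted on the previous slide, and k = 2 reflects that the full test has twice as many items as either half.

```python
def spearman_brown(r11, k):
    """Estimated reliability of a test k times as long as the one
    whose reliability is r11 (k = items wanted / items available)."""
    return (k * r11) / (1 + (k - 1) * r11)

# Adjusting the split-half correlation of .87 up to full test length.
full_length = spearman_brown(0.87, 2)
print(f"full-length reliability estimate: {full_length:.2f}")

# k below 1 estimates the reliability of a shortened test.
shortened = spearman_brown(0.87, 0.5)
print(f"half-length reliability estimate: {shortened:.2f}")
```

With k = 1 the function returns its input unchanged, which is a quick sanity check on the formula.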
  • 14. Cronbach Alpha. Cronbach’s coefficient alpha is typically used when item scores are other than 0 and 1 (such as Likert scales); with 0/1 items it gives the same value as KR-20. It is advisable for essay items, problem-solving items and 5-point scaled items. It is based on two or more parts of the test and requires only one administration of the test.
  • 15. Kuder-Richardson Formulas. Kuder and Richardson believed that all items in a test are designed to measure a single trait. KR-21 is the most practical, frequently used and convenient method of estimating reliability. KR-20 is most advisable if the p values vary a lot. KR-21 is most advisable if the items do not vary much in difficulty, i.e., the p values are more or less similar. The KR-21 formula is a simplified version of KR-20.
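The slides name the two formulas without stating them. The sketch below uses their standard textbook forms (an addition to the slides): KR-20 uses each item's proportion correct p, while KR-21 needs only the number of items, the mean and the variance of total scores. The 0/1 data are invented, and population variance is assumed.

```python
from statistics import mean, pvariance

def kr20(rows):
    """KR-20 for a people x items matrix of 0/1 scores."""
    k = len(rows[0])
    n = len(rows)
    total_variance = pvariance([sum(row) for row in rows])
    # p = proportion answering the item correctly, q = 1 - p
    sum_pq = 0.0
    for i in range(k):
        p = sum(row[i] for row in rows) / n
        sum_pq += p * (1 - p)
    return k / (k - 1) * (1 - sum_pq / total_variance)

def kr21(rows):
    """KR-21: the simplified form that treats all items as equally difficult."""
    k = len(rows[0])
    totals = [sum(row) for row in rows]
    m, total_variance = mean(totals), pvariance(totals)
    return k / (k - 1) * (1 - m * (k - m) / (k * total_variance))

# Hypothetical data: 5 students x 6 items, 1 = correct, 0 = incorrect.
answers = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

kr20_value, kr21_value = kr20(answers), kr21(answers)
print(f"KR-20: {kr20_value:.3f}  KR-21: {kr21_value:.3f}")
```

When item difficulties differ, KR-21 comes out no higher than KR-20, which is consistent with the advice to prefer KR-20 when the p values vary a lot.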
  • 16. Inter-rater Reliability. Having a sample of test papers (essays) scored independently by two examiners. Inter-rater reliability is a measure of reliability used to assess the degree to which different judges or raters agree in their assessment decisions. Inter-rater reliability is useful because human observers will not necessarily interpret answers the same way; raters may disagree as to how well certain responses or material demonstrate knowledge of the construct or skill being assessed.
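The slide does not name a statistic for rater agreement. As one possible illustration, the sketch below computes simple percent agreement and Cohen's kappa, a chance-corrected agreement index; kappa is brought in here as an assumption and is not mentioned in the slides. The pass/fail ratings are invented.

```python
from collections import Counter

def percent_agreement(ratings_a, ratings_b):
    """Proportion of papers on which the two raters gave the same rating."""
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return matches / len(ratings_a)

def cohens_kappa(ratings_a, ratings_b):
    """Agreement corrected for the agreement expected by chance."""
    n = len(ratings_a)
    observed = percent_agreement(ratings_a, ratings_b)
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    # Chance agreement: product of each rater's marginal proportions,
    # summed over the rating categories.
    expected = sum((counts_a[c] / n) * (counts_b[c] / n)
                   for c in set(counts_a) | set(counts_b))
    return (observed - expected) / (1 - expected)

# Hypothetical pass/fail decisions by two raters on the same 8 essays.
rater_a = ["pass", "pass", "fail", "pass", "fail", "pass", "fail", "pass"]
rater_b = ["pass", "pass", "fail", "fail", "fail", "pass", "pass", "pass"]

agreement = percent_agreement(rater_a, rater_b)
kappa = cohens_kappa(rater_a, rater_b)
print(f"agreement: {agreement:.2f}  kappa: {kappa:.2f}")
```

For numeric essay scores rather than categories, correlating the two raters' scores, as described on the reliability coefficient slide, would be the more direct choice.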
  • 17. Intra-rater Reliability. The degree of stability observed when a measurement is repeated under identical conditions by the same rater. Note: intra-rater reliability makes it possible to determine the degree to which the results obtained by a measurement procedure can be replicated.
  • 18. Standard Error of Measurement. All test scores contain some error. For any test, the higher the reliability estimate, the lower the error. The standard error of measurement is the average standard deviation of the error variance over the number of people in the sample. It can be used to estimate a range within which a true score would likely fall. We never know the true score. By knowing the S.E.M. and by understanding the normal curve, we can assess the likelihood of the true score being within certain limits. The higher the reliability, the lower the standard error of measurement, and hence the more confidence we can place in the accuracy of a person’s test score.
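The slide gives no formula for the S.E.M. The sketch below uses the standard textbook one, SEM = SD × √(1 − reliability), which is an addition to the slides. The standard deviation, reliability and observed score are invented, and the band widths rely on the usual normal-curve assumption.

```python
from math import sqrt

def standard_error_of_measurement(sd, reliability):
    """SEM from the test's standard deviation and reliability estimate."""
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * sqrt(1 - reliability)

def score_band(observed, sem, z=1.0):
    """Range of observed score plus or minus z SEMs.
    z = 1 covers roughly 68% and z = 1.96 roughly 95% under a normal curve."""
    return observed - z * sem, observed + z * sem

# Hypothetical test: standard deviation 10, reliability .84.
sem = standard_error_of_measurement(10, 0.84)
low, high = score_band(70, sem)
print(f"SEM = {sem:.1f}; a score of 70 suggests a true score between "
      f"{low:.0f} and {high:.0f} about 68% of the time")
```

Raising the reliability in the call shrinks the SEM and narrows the band, which is the relationship the slide describes.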
  • 19. Factors That Affect the Reliability Coefficient: test length; range of scores; item similarity.