Test usefulness: Qualities of language tests
Introduction

The most important consideration in designing and developing a language test is [...]. Although usefulness is of unquestioned importance, it has not been defined precisely enough to provide a basis for either designing and developing a test or determining its usefulness after it has been developed.
We propose a model of test usefulness that includes six test qualities: reliability, construct validity, authenticity, interactiveness, impact, and practicality. [...] operationalizing our model of usefulness [...] use of language tests. [...] each of the six qualities of test usefulness, providing examples to illustrate [...] process for evaluating these qualities.
Test usefulness
ev lo
In t h ap ot e q ua ua lili ti ti es es -r -r el el ia ia bi bi lili ty ty , i mp mp ac ac t a n p ra ra ct ct ic ic al al itit y
i n t h d ev ev el el op op me me n
d ur ur in in g t h t es es t d es es ig ig n a n d ev ev el el op op me me n ie fo spec specif ific ic test testin in situ situat atio ions ns
ra ti al
a s e ct ct s
to
ib
te
l itit i
as
ee
T es t
Conceptual bases of test development
[...] maximize them all. This has led some language testers to [...] an untenable position, that maximizing one quality leads to the virtual loss of others. Language testers have been told that the qualities of reliability and validity are essentially in conflict (for example, Underhill [...]), or [...] authentic and at the same time reliable (for example, Morrow 1979, 1986). A [...] reasonable position, as expressed by Hughes (1989), is that [...].
[...] emphasizing the tension among the different qualities, test developers need to recognize their complementarity. [...]

Usefulness = Reliability + Construct validity + Authenticity + Interactiveness + Impact + Practicality

Figure 2.1: Usefulness

[...] usefulness in terms of the six test qualities, and outline general considerations and procedures for assessing these. We cannot, however, offer general prescriptions about either what the appropriate balance among the different qualities [...]. [...] evaluating the overall usefulness of a given test is essentially subjective. [...] This is because what constitutes an appropriate balance can be determined only by considering the different qualities in combination as they affect the overall usefulness of a particular test.
[...] usefulness [...] as a function of several different qualities, all of which contribute in unique but interrelated ways to the overall usefulness of a given test. We believe [...] that [...] basis for operationalizing this view of usefulness in the development [...]:

Principle 1  It is the overall usefulness of the test that is to be maximized, rather than the individual qualities that affect usefulness.
Principle 2  The individual test qualities cannot be evaluated independently, but must be evaluated in terms of their combined effect on the overall usefulness of the test.
Principle 3  Test usefulness and the appropriate balance among the different qualities cannot be prescribed in general, but must be determined for each specific testing situation.
[...] test must be developed with a specific purpose, a particular group [...] in mind. [...] In a large-scale test that will be used for making important decisions about large numbers of individuals, for example, the test developer may want to design the test and test tasks so as to achieve the highest possible levels of reliability and validity. [...] authenticity, interactiveness, and impact [...].
Test qualities

In considering the specific qualities that determine the overall usefulness of a given test, we believe it is essential to take a systemic view, considering tests as part of a larger societal or educational context. [...] we will focus on the use of tests in educational programs, which will include [...]. [...] The qualities that we will discuss with respect to tests are shared by other components of learning [...]. [...] a particular learning task, the impact of a given learning activity, or the practicality of a particular teaching approach. [...] qualities, reliability and validity, are [...] critical for tests, and are sometimes referred to as essential measurement qualities. This is because [...].
Reliability

Reliability is often defined as consistency of measurement. A reliable test score will be consistent across different characteristics of the testing situation. [...] If we think of test tasks as sets of task characteristics, reliability can be considered to be a function of consistencies across different sets of test task characteristics. (Test task characteristics are discussed in [...].)
[Figure 2.2: Reliability. A double-headed arrow labelled 'Reliability' links 'Scores on test tasks with characteristics A' and 'Scores on test tasks with characteristics A′'.]
[...] Thus, in a test intended to rank individuals from highest to lowest, if the scores obtained on the different forms do not rank individuals in essentially the same order, then these scores are not very consistent, and would be considered to be unreliable indicators of the ability we want to measure. Similarly, in a test designed to distinguish individuals who are at or above a particular mastery level of ability from [...].

Another example would be where we used several different raters to rate [...] compositions. In this case, a given composition should receive the same score irrespective of which particular rater scored it. If some raters rate more severely than others, then the ratings of different raters are not consistent, and the scores obtained could not be considered to be reliable.

Reliability is clearly an essential quality of test scores, for unless test scores are relatively consistent, they cannot provide us with any information at all about the ability we want to measure. At the same time, we need to recognize that it is not possible to eliminate inconsistencies entirely. What we can do, however, is try to minimize the effects of those potential sources of inconsistency that are under our control, through test design. [...] the characteristics of the test tasks [...] that do not correspond to variations in TLU tasks. In Chapter [...] we discuss ways in which we can take reliability into consideration as we design the test and evaluate its potential usefulness.
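The idea of scores on two forms ranking individuals in essentially the same order can be made concrete with a small calculation. The sketch below is an illustration only, not a procedure given in this chapter: it assumes that consistency across two forms is summarized by a Spearman rank correlation (a choice of statistic made for this example), and the function names and score lists are hypothetical.

```python
from typing import List, Sequence


def average_ranks(scores: Sequence[float]) -> List[float]:
    """Rank scores from highest (rank 1) to lowest; tied scores share the mean rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the next score ties with the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1  # ranks are 1-based
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def rank_consistency(form_a: Sequence[float], form_b: Sequence[float]) -> float:
    """Spearman rank correlation between scores on two forms (assumed statistic)."""
    if len(form_a) != len(form_b) or len(form_a) < 2:
        raise ValueError("need paired scores for at least two test takers")
    ra, rb = average_ranks(form_a), average_ranks(form_b)
    mean_a, mean_b = sum(ra) / len(ra), sum(rb) / len(rb)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(ra, rb))
    var_a = sum((x - mean_a) ** 2 for x in ra)
    var_b = sum((y - mean_b) ** 2 for y in rb)
    if var_a == 0 or var_b == 0:
        raise ValueError("all scores on one form are tied; ranking is undefined")
    return cov / (var_a * var_b) ** 0.5


# Hypothetical scores for five test takers on two forms of the same test.
form_a = [78, 65, 90, 52, 71]
form_b = [80, 60, 88, 55, 70]
print(rank_consistency(form_a, form_b))
```

A value near 1 suggests that the two forms rank test takers in nearly the same order; values well below 1 point to the kind of inconsistency that would make the scores unreliable indicators.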
In addition to using test design to minimize variations in test task characteristics, we need to estimate their effects on test scores to determine how successful we have been. (Procedures for investigating and demonstrating reliability are discussed in the Suggested Readings at the end of this chapter.)

In Figure 2.2, the double-headed arrow is used to indicate a correspondence between two sets of task characteristics (A and A′) which differ only in incidental ways. For example, if the same test were to be administered to the same individuals [...].

Construct validity

Construct validity pertains to the meaningfulness and appropriateness of the interpretations that we make on the basis of test scores. When we interpret scores from language tests as indicators of test takers' language ability, a crucial question is 'To what extent can we justify these interpretations?' [...] we must be able to provide adequate justification for any interpretation [...]. That is, we need to demonstrate, or justify, the validity of the interpretations we make of test scores, and not simply assert or argue that they are valid. In order to justify a particular score interpretation, we need to provide evidence that the test score reflects the area(s) of language ability we want to measure, and very little else. In order to provide such evidence, we must define the construct we want to measure. For our purposes, we can consider a construct to be the specific definition of an ability that provides the basis [...].

The term construct validity [...] indicator of the ability(ies), or construct(s), we want to measure. Construct validity also has to do with the domain of generalization to which our score interpretations generalize. [...] we want our interpretations about language ability to generalize beyond the testing situation itself to a particular TLU domain. These two aspects of the construct validity of score interpretations are represented visually in Figure 2.3.

[...] to be interpreted appropriately [...] with respect to a specific domain of generalization. When we consider the construct validity of score interpretations, we need to consider both the construct definition and the characteristics of the test task [...].
[Figure 2.3: Construct validity of score interpretations. Elements: score interpretation (inferences about language ability, that is, the construct definition), domain of generalization, language ability, interactiveness, authenticity.]

[...] 'authenticity'. [...] second reason is to determine the degree to which the test [...] (discussed below under 'interactiveness').

Construct validation is the on-going process of demonstrating that a particular interpretation of test scores is justified, and involves, essentially, building a logical case in support of a particular interpretation and providing evidence justifying that interpretation. Several types of evidence (for example, content relevance and coverage, concurrent criterion relatedness, predictive utility) can be provided in support of a particular interpretation as part of the validation process [...].

It is important for test developers and users to realize that test validation is an on-going process and that the interpretations we make of test scores can never be considered absolutely valid. [...] justifying the interpretations we make [...] test scores [...] evidence [...] intended interpretations. [...] avoid giving the impression that [...] is 'valid' or has been 'validated'.

Summary

[...] a measure that we can interpret as an indicator of an individual's language ability. The measurement qualities, reliability and construct validity, are thus essential to the usefulness of any language test. Reliability is a necessary condition for construct validity, and hence for usefulness. However, reliability is not a sufficient condition for either construct validity or usefulness. Suppose, for example, that we needed a test for placing individuals into different levels in an academic writing course. A multiple-choice test of grammatical knowledge might yield very consistent or reliable scores, but this would not be sufficient to justify using this test as a placement test for a writing course. This is because grammatical knowledge is only one aspect of the ability [...]. [...] defining the construct to include only one area of language knowledge is inappropriately narrow, since the construct involved in the TLU domain (the ability to perform academic writing tasks) involves other areas of language knowledge as well as [...] knowledge and affective responses as well.
Authenticity

[...] One aspect of demonstrating this pertains to the correspondence between the characteristics of TLU tasks and those of the test task. It is this correspondence that is at the heart of authenticity, and we would describe a test task whose characteristics correspond to those of TLU tasks as relatively authentic. We define authenticity as the degree of correspondence of the characteristics of a given language test task to the features of a TLU task. [...]

[Figure 2.4: Authenticity. Relates the characteristics of the TLU task to the characteristics of the test task.]

Authenticity as a critical quality of language tests has not generally been [...].
[...] to generalize. Authenticity thus provides a means for investigating the extent to which score interpretations generalize beyond performance on the test [...]. [...] generalizability of score interpretations [...] validation.
We believe that most language test developers implicitly consider authenticity in designing language tests. In developing a reading test, for example, [...]. [...] authenticity in terms of task characteristics is not a radical departure from [...] practice. We believe our approach provides a precise way of building considerations of authenticity into the design and development of language tests. In attempting to design an authentic test task, we first identify the critical [...] framework of task characteristics such as that described in the next chapter. We then [...] test tasks that have these critical features. In Chapters [...] we discuss ways [...] identifying critical features of test tasks so as to permit us to assess their relative authenticity. Because of the way in which we have defined TLU domains, our definition of authenticity can apply to a wide variety of domains, including language classrooms in which the teaching is 'communicative' or 'task-based'.

[...] your reaction [...] to a test consisting of a written passage describing the type of merchandise [...]. [...] you might perceive this particular aspect of language use to be largely irrelevant [...] ability to carry out an interactive conversation with customers [...]. It is this relevance, as perceived by the test taker, that we believe helps promote a positive affective response to [...] perform at their best.
[...] classroom teaching and learning activities that are themselves related to [...]. [...] perception of authenticity. Different test takers may have different perceptions [...].

Interactiveness

We define interactiveness as the extent and type of involvement of the test taker's individual characteristics in accomplishing a test task. The individual characteristics that are most relevant for language testing are the test taker's language ability (language knowledge and strategic competence, or metacognitive strategies), topical knowledge, and affective schemata. (These are described on pages 65-6 [...].) [...] in which the test taker's areas of language knowledge, metacognitive strategies, topical knowledge, and affective schemata are engaged by the test task. For example, a test task that requires a test taker to relate the topical [...] relatively more interactive than one that does not. We can represent interactiveness as in Figure 2.5.

In this figure the double-headed arrows are intended to represent interaction between language ability, topical knowledge and affective schemata, and the characteristics of the test task. Unlike authenticity, which pertains [...] consider the characteristics of both kinds of tasks, interactiveness resides [...].
[Figure 2.5: Interactiveness. Double-headed arrows link language ability (language knowledge, metacognitive strategies), topical knowledge, and affective schemata with the characteristics of the test task.]

Illustrative examples of authenticity and interactiveness in test tasks

In order to achieve a better understanding of what we mean by authenticity and interactiveness, we will present four examples of test tasks that differ in terms of these qualities. We believe this will help you to understand how actual test tasks differ in terms of their authenticity and interactiveness.

(A) [...] typists [...] in an institution abroad [...] some of the typists do not understand English very well but have nevertheless [...]. [...] language use in English or to produce written texts in English on their own. [...] typescripts, even from handwritten documents, which is the only task required for their job. A screening test for new typists in this situation might involve [...].
Both TLU tasks and test tasks can potentially vary in their interactiveness. It is for this reason that we distinguish interactiveness from authenticity. [...] language ability includes areas of language knowledge and strategic competence, or metacognitive strategies. Thus, in [...] language ability, [...]
t as k
s uc h a s t ho s
i nv ol vi n
a th em at ic a
c al cu la ti o
o r r e p on di n
ve nt ac io th te t. o we v le th in ac io ir th la ag ed ld ot er es ab an ab it a si s o f te ke er a nc e F o e xa mp le , g eo me tr y t es t t a t ha t r eq ui re s t h t es t t ak e t o u ti li z
or
nc th hu c r i ca l
diSCUSS
ac
nt
ay in
i n a ti o la ag es
ua it hi ig
te
an
ev lu te
gu as
bi it in
er io ts nt al
is
a ct i al ty
er ct eu ln
is le
[...] This example illustrates a test task which would be evaluated as highly authentic but low in terms of interactiveness. [...] where this example test falls in terms of authenticity and interactiveness [...]. [...] that these same applicants were capable of carrying on 'small talk' conversations [...].
es
in rv
in
em
li
topics
involved in non-test conversation If ec vi ls le
involv th same type of interactions he co fr te ew
[Figure 2.6: Authenticity and interactiveness. A grid crossing authenticity (high, low) with interactiveness (high, low).]
am le
is at bl
to
es
ha tt oc en to a ut he nt ic it y a n i nt er ac ti ve ne ss ?
te hi
t as k
ld bl te la iv te ie at lo th Il1 e le ct in g t op ic s a n i nf lu en ci n am in ig in th xa le A me ri ca n u ni ve rs it y e r g iv e
ly in in er ti en ti la ly te ak ea le am co t h s tr uc tu r o f t h i nt er ac ti on . T h ' B a te s th am le al in a ti o tu ts te in t es t o f E ng li s v oc ab ul ar y i n w hi c t he y
c ol um n S co re s f ro m t hi s t es t w ou l
b e u se d t o p ro vi d
th
hypothetical
customer in
t hi s t es t t a
ki
at t;
co id
et
churacrer
ce n o a nt ic ip at e
am
te
as
if
ay
te
in
[...] must be considered essential to language tests [...] language learning and language teaching. At the same time, however, the minimum acceptable levels for authenticity and interactiveness must be balanced with those for the other test qualities.

Authenticity, interactiveness, and construct validity

Authenticity, interactiveness, and construct validity all depend upon how we define the construct 'language ability' for a given test situation. Authenticity is related to [...] construct validity [...] interpretations to generalize, and hence for investigating this aspect of construct validity. The relationship between interactiveness and construct validity is a function of the relative involvement of areas of language knowledge, strategic competence, or metacognitive strategies, and topical knowledge. That is, the extent to which high interactiveness corresponds to construct validity will [...] a valid measure of a given construct, even though it is relatively interactive [...].
How authentic and interactive is this test task? It is relatively high in authenticity and interactiveness. It is high in authenticity because of the correspondence between characteristics of the TLU domain and characteristics of the test task. It is high in interactiveness because of the involvement of assessment, goal-setting, and planning strategies, as well as the high level of involvement of all of the areas of language knowledge and the test taker's [...].

Some things to remember about authenticity and interactiveness

[...] argued that these qualities [...] the following implications for how [...].
er ma
d ia gn os ti c i nf or ma -
t h t ic i an ac iv es is as an id th ti it ec us th ry gu in ic iv ti th in is or of ta al el ti el lo in in er ti en e ca u hi e st r t e minimal i nv ol ve me n o f t h lv an dg nd thre metacognitiv strategies Th 'C' i n t h ia in re w he r t hi s e xa mp l f al ls . in ex ay hi ec iv a le s e r as ir tt to ro a y might i nv ol v f ac e- to -f ac e o ra l c on ve r a ti o w it h a n i nt er lo cu to r h o p la y t h to
as
Qualitie
[...] even though they are low in either authenticity or interactiveness. [...] estimates of authenticity and interactiveness are only guesses. We can do our best to design test tasks [...] authenticity and interactiveness [...].
e d to
th
th
[...] 'relatively more' or 'relatively less' authentic or interactive, rather than 'authentic' and 'inauthentic,' or 'interactive' and 'non-interactive'. [...] design, develop, and use language tests [...].

[...] metacognitive strategies and topical knowledge. However, if it requires very little involvement of areas of language knowledge, it may not be a valid measure of language knowledge.
Impact

Another quality of tests is their impact on society and educational systems and upon the individuals within those systems. The impact of test use operates [...]
[Figure 2.7: Impact. Test taking and use of test scores have an impact at the macro level (society, education system) and at the micro level (individuals).]

[...] As [...] (1990) points out, 'tests are not developed and used in a value-free psychometric test-tube; they are virtually always [...]'. [...] the consequences for society, the educational system, and the individuals [...] criterion such as seniority or personal connections. Thus, whenever we use [...] will have specific consequences for, or impact on, both the individuals and the system involved.

Washback

[...] research that test developers and test users cannot simply assume that tests will impact on teaching and learning, but must actually investigate the specific areas (such as content of teaching, teaching methodology, ways of assessing achievement), direction (positive, negative), and extent of the presumed impact. Their work also makes it clear that washback has potential for affecting not only individuals but the educational system as well, which implies that language testers need to investigate this aspect of washback also. Thus, in investigating washback one must be prepared to find [...].
em
teaching.
on individuals
Impact ld h ol de r
ak th es t ha t a r d ir ec tl y a ff ec te d i nc lu d
takers future classmates or co-workers ca io a tt e p ti n
te
e,
ca
th re
ar cu it ti et ak er s a n t h t es t u se rs ,
future employers) will be indirectly ar
at
ir al
ev
er
t o d is cu s t h g en er a s ys te mi c e ff ec t o f t es t u s o r t h p ot en ti a
ac th takers an teachers
id ls
ct
ec ed
te
es
Impact on tes takers
ch
ct ti er ak
el
t h p er sp ec ti v as ac i mp ac t o f t es ti n
ed at
al
is la in ci ta
ed
as
ar
ac
te ve
nd ct
i l p re se n
h er e. ! la ge ti ge ec o n i nd iv id ua ls , a n i t i s w id el y a ss u e d t o e xi st . H ug he s
l ea r
an er a t e st i ul ef ac in ea ng as ac ly te e du ca ti on a p ra ct ic e a n b el ie f (1994: er d, ar co ci
it er en es al
en th
ld te
c ia l ct af (1993),
ir ca
er
ec ed
th
as ec
e st i
ce
[...] scores. [...] affecting those characteristics of test takers that are discussed in our model [...]. [...] high-stakes tests [...] test takers may spend several weeks preparing individually for the test. In some countries where high-stakes nation-wide public examinations are used for selection and placement into higher levels of the school system [...] test for up to several years before the actual test, and the techniques needed [...]
ta er he te
[...] the test taker [...] affected by the test, particularly in cases where the TLU domain may be unfamiliar (for example, an international student planning to enroll in a program of study at an American college or university). We thus need to ask whether [...]. [...] test takers' language knowledge may be affected [...] confirmation of their own perception of their language ability, and may affect their areas of language knowledge. For example, if something is presented as grammatically correct in the input, but is actually ungrammatical, this can be misleading. Conversely, the test taker may improve her language knowledge either while taking the test or from feedback received. Finally, the test taker's [...] strategies [...] affected by the characteristics of the test task [...]. [...] as well as collecting information from them about their perception of the test and test tasks. If test takers are involved in this way, we would hypothesize that the test tasks are likely to be perceived as authentic and interactive [...].
ot ee ar likely to a f c t feedback as relevant
a te d a n p r ly rm be te ac te ak e ce i ab th ir es e m i r c tl y th e e to c o i d complete an meaningful to th test take
er
an ak as possible
[...] decision procedures and criteria are applied uniformly to all groups of test takers. Fair test use also has to do with the relevance and appropriateness of the test score to the decision. We need to consider the various kinds of information, including scores from the test, [...]. [...] to make a life-affecting decision solely on [...]
a se d i ma r et er the decision procedure
er ce
ig
ue io
rn
Impact on teachers ec an te
gr in an
in id ls i r c tl y f f uc io am es rs af ac er ed ab im of instruction.ias i mp le me nt e b y c la ss ro o t ea ch er s of in en ac er ey in th in to te lm at
am
e st i ca a ll y er th to oi bl
no on of te ch hi
te te te at d i c tl y ac am h a b ee n r ef er re d t o
th ir ru io te ct ea er in te al ci ci ed te th in te ch e ac h to th test implie
to th
e r e ct i e , i f t ea c e r
es
r e a te d el th
at
a u e n i ci t ey
ac
to
de
ip on
lp
te
et te
es as
el as
ba
ip io
[...] rich verbal descriptions, especially if given in a personal debriefing with the appropriate test administrator, can be very effective in developing a positive affective response [...]. [...] acceptance or non-employment [...] test takers [...] made. Fair decisions [...]
ar
th ca er a l d ec i i on s t ha t c a h av e s er io u c on se qu en ce s ee id th ai es of ci th th ar al pp a te , eg le
te
a ut he nt ic it y
i n w hi c
[...] may have harmful washback, or negative impact on instruction. [...] to minimize the potential for negative impact on instruction is to change the way we test so that the characteristics of the test and test tasks correspond more closely to the characteristics of the instructional program. [...]
th te
to
s it ua ti o
[...] the hypothesis is that we should be able to bring about improvement [...] compatible with what we believe to be principles of effective teaching and learning. However, we would reinforce Wall and Alderson's (1993) point that we cannot simply assume that imposing an 'enlightened' test will automatically have an effect on instructional practice. Indeed, as their research has demonstrated, the impact of testing on instruction varies [...] positive and negative.
an
e du ca ti on a
on
culture to
co
ie
an
v al u
sy te
ti io
cy
t ha t i nf or m o u t e
co
u se . T h
ig
ed
ti li
c on si de ra ti o
lu co id
to
matter of consideration. ls
co id
co
en
ct
i nd iv id ua l s ta k h ol de rs , a s d is cu s e d a bo ve , b u a ls o f o t h em ci ty is ic ar v id ua ls . C on si de r f o e xa mp le , t h i n p ra ct ic e a n l an gu ag e p ro gr a ty te as as ti er ie ly n ee d to c o i d en al in la e st s ic ar
We thus need to [...] of our using tests for a particular purpose. We can organize our assessment of potential consequences as follows: [...] test in these ways; rank these possible outcomes in terms of the desirability or undesirability of their occurring; [...] outcomes [...]. [...] complemented by consideration of the consequences of using alternatives to testing to achieve the same purpose.
at on
th it
Q ua li ti e
li in
at
e du ca ti on a e s e st s
p ot en ti a i mp ac t o n t h l an gu ag e t ea ch i n g iv e c ou nt r o f u si n p ar ti cu la r ece it ec ic es at al le l. ar im ct ie al es an ls es am e,
s ch oo l c hi ld re n i nt o d if fe re n i n t ru ct io na l p ro gr am s o r f o c re en in g i nd i al ap ti ar li en in professionals? If ment of society, to te ig th ex iv at in ce al al et e?
[...] place, and according to the potential consequences of such use. In assessing the impact of test use, we must consider the characteristics of the particular testing situation (purpose, TLU domain, test takers, construct definition) in terms of the values and goals of the individuals affected and of the educational system and society [...]. [...] discuss specific considerations for assessing the potential impact [...] in test design and development [...] presented in Part Three.
Practicality
[...] is different in nature from the other five qualities. While those qualities pertain to the uses that are made of test scores, practicality pertains primarily to [...] whether it will be developed and used at all. That is, for any given situation, [...] available, the test will be impractical, and will not be used unless resources [...] allocated. Although the consideration of practicality logically follows the [...]
[...] every stage along the way, and may lead us to reconsider and perhaps revise some of our earlier specifications. In [...] test, [...] try to achieve the optimum balance among the qualities of reliability, construct validity, authenticity, interactiveness, and impact for our particular testing situation. In addition, we must determine the resources required to achieve this balance, in relationship to the resources that are available. Thus, determining the practicality [...] chapter we will simply define practicality and introduce the primary aspects of practicality that need to be considered. We can define practicality as the relationship between the resources that
administration in in
at te
av la ig
es
Availabl
racticality Ifpracticalit If practicalit
ac iv ie
at
an
time fo specific task (for example, designing, writing, an in co er if ty ls en al e st i a t t h if ta co ts
es d if fe re n
ce
a i a bl e
t yp e o f r es ou rc e
resources
an
If practicality ≥ 1, the test development and use is practical. If practicality < 1, the test development and use is not practical.
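The relationship shown in Figure 2.8 can be restated as a single ratio. The sketch below assumes that available and required resources can be expressed in one comparable unit (for instance, total cost or person-hours). The symbols $P$, $R_{\text{available}}$, and $R_{\text{required}}$ are introduced here only for illustration and do not appear in the original figure.

```latex
P = \frac{R_{\text{available}}}{R_{\text{required}}},
\qquad
\begin{cases}
P \ge 1 & \text{test development and use is practical} \\
P < 1   & \text{test development and use is not practical}
\end{cases}
```

As a purely hypothetical illustration, 120 available person-hours against 150 required person-hours would give $P = 120/150 = 0.8 < 1$, so the test as specified would not be practical unless the specifications were changed or more resources were allocated.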
es
yp
or
r e o ur ce s t ha n a r
ee
ai ab
we ca
Figure 2.9: [...] or raters, test administrators, clerical support; (time from the beginning of the test development process to the [...]); (e.g. designing, writing, administering, scoring, analyzing)
Types of resources
Different types of resources, and their associated costs, will be required in varying degrees at different stages of test development and use. Furthermore, the specific types and amounts of resources required will vary according [...] determined for
er r es ou rc e ll at
a va il ab le .
es ur es
r es ou rc es , w hi c
t.
Material resource
Time fo specific task
development and test use can proceed. If available resources are exceeded, then the test is not practical and the developer must either modify the specifications to [...] require
ti 2 .9 .
Human resources
Development tim
Practicality
ci ca io
ti ll
l is te d i n i gu r
Materials (e.g. paper, pictures, library resources)
Required resources
Practicality is a matter of the extent to which the demands of the particular test specifications can be met within the limits of existing resources. We [...]
es
ar
Space (e.g. rooms for test development and test administration); Equipment (e.g. [...]
Figure 2.8:
an
(e.g test writers, scorer
ce
[...] Material resources include space (such as rooms for test development and test administration), equipment (such as typewriters, word processors, tape and video recorders, computers), and materials (such as paper, pictures, library resources). Time consists of development time (time from the beginning of the test development [...]
<1
es
Qualitie
th p ec if i t es ti n
al
at
o r a t d if fe re n an
th i tu at io n
ab
ti li t es t p ec if ic at io n t ha t i s p ra c
ev
s ta ge s i n t h t e t in g p ro ce ss . et in er
te d is cu s r es ou rc e
[...] classify resources into three general types: human resources [...]
Conclusion
We believ th contributio
approach to
to
defining test usefulness develope
here,makc
[...] qualities that contribute to usefulness, and enable us to consider how to [...] the specific testing situation. That is, it links considerations of reliability, validity, authenticity, interactiveness, impact, and practicality to the specific
a va il ab il it y a n a ke s t w t he s q ua li ti e a bs tr ac t t he or ie s al ie
a ll oc at io n o f r es ou rc es . eq ir ts es i t r es pe c to a n s ta ti st ic a f or mu la e
h i a pp ro ac h to tes usefulness lo er ir
er th
ex post facto
S ec on d in
in
le
u s ~ on si de r t he s el analyses.
Summary
[...] validity, authenticity, interactiveness, impact, and practicality. These six test qualities all contribute to test usefulness, so that they cannot be evaluated independently of each other. Furthermore, the relative importance [...] Similarly, the appropriate balance of these qualities cannot be prescribed [...] important consideration to [...] at the expense of others. Rather, we need to strive to achieve an appropriate
hu
i n d es ig ni n
an
to generalize, and
potentially, facilitate their test performance. [...] interactiveness to the degree to which the constructs we want to assess are critically involved in accomplishing the test task. Interactiveness is also important because it is at the heart of many current views of language teaching and language learning. [...] the extent [...] the test taker's language ability (language knowledge plus metacognitive strategies), topical knowledge, and affective schemata in accomplishing a test task
[...] 'relatively more' or 'relatively less' authentic and interactive, rather than 'authentic' and 'inauthentic', or 'interactive' and 'non-interactive'. Furthermore, we cannot determine the relative authenticity or interactiveness of [...] Impact
ed af ec
al
e as ur e e nt ; i nc on si st en c th th co ct e, th th ch te cs te as d ev el op in g l an gu ag e t e t s w e t r t o i ni mi z v ar i
[...] Test validation is the ongoing process of demonstrating that a particular interpretation of test scores is justified, and involves providing evidence justifying that interpretation. The justification that we need to provide is evidence of construct validity, or evidence that the test score reflects [...] to measure, and very little else. [...] Authenticity [...]
ly at
ie ev
be measured. Reliability can be defined as consistency [...]
Qualitie
an
th an er
ar
i~ in
lw th
ti
id
SOCI-
c.
of
to
[...] need to estimate their effects on test scores to determine how successful we have been in minimizing them as sources of inconsistency of measurement. Construct validity pertains to the meaningfulness and appropriateness of the interpretations that we make on [...]
[...] micro level, in terms of the individuals who are affected by the particular test use. The notion of washback, or backwash, which has been of considerable concern in language testing, can be viewed in terms of various aspects of impact. Unlike the other five qualities of test scores, practicality [...]
in
ti li
th
a va i a bl e
h ic h p er ta i
el ti
to
ip ct
th ti
ac ic
th Son
of
[...] available. Several types of resources can be identified: human resources, material resources, and time. The specific resources required will vary from one situation to [...] make little sense to say that a given test or test task is more or less practical than another, in general. Considerations of practicality are likely to affect [...]
ay ea
[...] authenticity, interactiveness, and impact for our particular testing situation. In addition, we must determine the resources required to achieve this balance, in relationship to the resources that are available.
Exercises
[...] your conception of test usefulness
ce n o f am il ia r ha t h describe
it
es
av
t h f ra me wo r
ta to in this chapter?
t o r ea di n
cc
ar
ls
b e u se d f o t h p ur po s
t hi s c ha p
ed
o f u se fu ln es s p re se nt e
an ev at fulnes of each proposal P ar t T hr e
p ri o
i ti e
te
th
d e c ri be d i n P ro je c
u se f
t ia l in in Part Three.
ef ct ag es to er er u se fu ln es s A r t he r a n a sp ec t o f i t a pp ar en t u se fu ln es s t ha t c an no t ca tu th ix al ti e sc r e d i n i s c h t e If might you a dd it io na l q ua li ti e y o c am e u p w it h
ity,
[...] are widely accepted by the measurement profession, as well [...] Bachman (1990) discusses reliability and validity in Chapters [...] and [...] respectively. Messick (1989) provides an extensive theoretical framework [...] performance in assessment [...] (1992), Baker, O'Neil and Linn (1993), Linn (1994), Messick (1994), and Wiggins (1994). [...] materials and exercises. [...] authenticity in language tests. [...] to test takers. Swain (1985) [...] level of ability. [...] (1993) discuss [...] recent research into washback. Hughes (1989), Weir (1990), and Cohen (1994) provide general discussions of washback. [...]
Notes
Many of these qualities are discussed in Standards for Educational and Psychological Testing (APA 1985). Current measurement theorists generally include, as considerations in [...] the decisions made on the basis of test scores (see, for example, [...] to consider consequences of decisions separately, under the quality of impact.
Suggested reading
[...] aspect of validity. This involves demonstrating that the consequences of making particular decisions about individuals and of using test scores
im ac
an
Psychologica
Testin
(A
1 98 5 p ro -
im ct
rest
j ec t e s
[orEducational
i n t hi s c ha pt er .
reduced?
Standard
ti it
it ti cu th te ct cn
ie al
a ti o ua it
in at
es ti it es
el
[...] authenticity and interactiveness are related to construct validity and [...] believe that these qualities are important enough to the development [...] discuss them as separate qualities. [...] Bachman's (1990) 'interactional/ability' approach to defining authenticity. The definitions of authenticity and interactiveness that [...] present here thus represent what Bachman (1990) has characterized as the 'real-life' [...] recognize the value and usefulness of each in characterizing test tasks [...] has largely been rejected by both language testers and specialists in [...] authenticity and interactiveness, and thus do not consider 'face validity' [...] (1990, pages 285-9) for an extensive discussion of the problems with the term 'face validity'.
Hughes 1989, Weir 1990). [...] more prevalent in both language testing and applied linguistics, and so [...] Considering washback within the scope of impact appears to be consistent with current research in language testing, such as that by [...] research into washback. [...] (1988) discusses the need to 'humanize' the experience of test [...]