近畿大学学術情報リポジトリ


Item Analyses of a Multiple-choice Achievement Test: Increasing its Usefulness

Kaori Nitta

近畿大学語学教育部紀要, Vol. 7, No. 1 (July 2007)

1. Introduction

Like many other universities, Kinki University has tried to help its students get higher scores on the TOEIC Test. We changed our syllabus to realize a communicative-oriented course, wrote a textbook focusing on preparing for the TOEIC Test, and began to implement an achievement test that would also be used as a screening test for the real TOEIC Test, which is a proficiency test.

In this paper, I will first argue that it is possible to use a test for multiple purposes, such as using an achievement test as a proficiency test. Next, I will suggest that we use two kinds of item analyses to make our test more useful. Finally, I will suggest that, for a multiple-choice test, we use two distractors instead of three.

2. An achievement test and a proficiency test

Is it appropriate to use one test for multiple purposes? First of all, what is an 'achievement test'? What is the difference between an 'achievement test' and a 'proficiency test' like TOEIC or TOEFL? Alderson, Clapham and Wall (1995: 286) say that achievement tests are "similar to progress tests, but they are given at the end of the course. The content ... [are] generally based on the course syllabus or the course textbook." On the other hand, they claim that proficiency tests are "not based on a particular language programme..." and "show whether students have sufficient ability to be able to use a language in some specific area ..." (293). In other words, achievement tests are used to judge the past, namely, whether students have achieved a certain level at the end of the course. Proficiency tests, on the other hand, are used to predict the future, that is, whether students can survive successfully in their future discourse community, such as the graduate schools and companies they will belong to. Thus, the two types of tests apparently have different purposes.

However, according to Brindley (1986), the boundary between the two types of tests is blurred, and they are hard to distinguish in a communicatively-orientated language program. Hughes (2003) also states that both kinds of tests are essentially the same. Moreover, McNamara (1996) claims that if the course syllabus reflects the real world, one test can be used for different purposes. So a test can have multiple purposes under certain circumstances, such as in a communicatively oriented foreign language course.

At Kinki University, our English course for first-year students is intended to help them prepare for the TOEIC Test. This test is based on business communication in the real world, that is, the communicative English skills and knowledge that companies expect their employees to have in order to make their business work well in the global society. So our English course is also connected to the real world outside the campus.

Our goal is to help our students improve their communicative skills, which eventually enables them to get higher scores on the TOEIC Test. However, the questions of real TOEIC Tests are too difficult for most of our students, who need step-by-step exercises before challenging the real TOEIC questions. Since we could not find any suitable textbook including such exercises, we wrote a new textbook for the course, focusing on the basic communicative skills.

By analyzing the questions of the TOEIC Test, we can decide which functions, which grammatical points, and what kinds of genres for reading our students should acquire. We included in the textbook what we thought was the most important, in order of priority, and so the textbook is itself concerned with the TOEIC Test, that is, the real business-oriented domain of the outside world.

Our achievement test for this course is used to check how much our students have acquired through the textbook, as Nunan (1999: 301) claims that an achievement test "attempts to measure what students have learned from a particular course or set of materials." At the same time, it is also possible to predict students' performances on the real TOEIC Test, because our course and the TOEIC Test include the same functions, the same grammatical points and the same genres for reading. Thus, since the course syllabus reflects the real world, our test can be used for different purposes: as an achievement test and as a screening test for the real TOEIC Test.

3. Item analyses

3.1 Why is statistical analysis needed

It is, of course, impossible to develop a perfect test, in particular one which has multiple purposes. However, we can make efforts to make our tests more useful, or as useful as possible. In order to justify the test, we need to have some evidence. Bachman and Palmer (1996: 17) state that "the most important quality of a test is its usefulness. ... We thus regard a model of test usefulness as the essential basis for quality control throughout the entire test development process. ... we propose a model of test usefulness that includes six test qualities - reliability, construct validity, authenticity, interactiveness, impact, and practicality."

We use a multiple-choice test, as the TOEIC Test does. Since we want to help students get higher scores on the TOEIC Test, giving multiple-choice questions is authentic in a sense. Moreover, we use as many authentic materials as possible.

It goes without saying that multiple-choice tests are very practical, that is, easy to implement and rate. There is no inconsistency among raters. They, however, tend to lack interactiveness because those tests only give students limited tasks.

We may give a positive washback (impact) to all the classes by giving the same test to the students of these classes in order to evaluate their class performances. Of course, we need to be very careful not to make our students too test-wise.

As I said above, our achievement test is supposed to measure the communicative English skills and knowledge that companies expect their employees to have. We interpret test takers' performances on the test, that is, to what extent test takers have obtained these skills and knowledge. Investigating the scores test takers obtain necessarily involves quantitative statistical procedures, which are related to two of the qualities of measurement, reliability and construct validity.

According to Bachman and Palmer (1996: 19), reliability is "defined as consistency of measurement." And construct validity is "the meaningfulness and appropriateness of the interpretations that we make on the basis of test scores." (21)

In order to justify the interpretations, Bachman and Palmer (1996: 21) claim that "we need to provide evidence that the test score reflects the area(s) of language ability we want to measure." Bachman (2004: 3) points out that "we need to be able to demonstrate that scores we obtain from language tests are reliable, and that the ways in which we interpret and use language test scores are valid. ... An important kind of evidence ... is that which we derive from quantitative data - scores from test tasks and tests as a whole - and the appropriate statistical analyses of these data."

In the next section, I will focus on how to increase the reliability of our test by using two kinds of item analyses: classical item analyses and IRT (item response theory). IRT has been developed to solve the problems of classical item analyses, but of course it has its own problems or limitations, so we should know which is more suitable for analyzing test items. Increasing the reliability of our test will lead us to better validity, and reliability is one of the tools to make a test more useful.

3.2 Analyses of our achievement test

3.2.1 What is the achievement test like

The test which is analyzed here is the achievement test administered in November, 2004. The purposes of the test are:
1) to check how much the students have mastered what they have been exposed to
2) to select the upper 40% of the students, who can take TOEIC for free.

The details are as follows:

◆ The participants: 236 first-year students who are non-English majors.
◆ The number of the items: 75 items (Listening section: 30, Reading section: 45).
◆ The time duration: 60 minutes.
◆ The content:
   1) Listening section: 30 items.
      (1) photos: 8 items (4 options).
      (2) quick responses: 10 items (3 options).
      (3) conversations: 7 items (4 options).
      (4) announcements: 5 items (4 options).
   2) Reading section: 45 items.
      (5) incomplete sentences: 20 items (4 options).
      (6) error recognition: 10 items (4 options).
      (7) reading comprehension: 15 items (4 options).

As in the case of the TOEIC Test, our achievement test has seven sections as above, but the numbers of items in each section are different from the TOEIC Test because we chose more suitable questions for first-year students depending on the difficulty of each section.

3.2.2 Procedures of analyzing the achievement test

Before we implemented the achievement test, we had given a pilot test to about 100 second-year students, and modified items which turned out to be too difficult.

In order to examine the reliability of a test, we use statistical tools such as classical item analyses and IRT. As a basic classical item analysis, facility values and discrimination indices are calculated. Alderson et al. (1995: 80-81) state that "an item's facility value is the percentage of students to answer it correctly." And a discrimination index can give us information on "how well it [an item] distinguishes between students at different levels of ability." (81) If an item shows a higher discrimination index, it discriminates better.

Since we use multiple-choice items for our achievement test, another classical item analysis, 'analysis of distractors,' should be carried out to check how each distractor works. Some distractors attract more test takers than the correct answers, and these items do not contribute to raising the reliability of the test. Thus, we need to modify or delete these items.

Now we have three kinds of classical item analyses. Are these satisfactory enough? In order to investigate the test more thoroughly, I carried out a further item analysis using IRT. I will show the results of each analysis and also the limitations of each analysis. I found out that we would need diversified analyses to decide which items should be modified or deleted. Furthermore, from the result of the distractor analysis, I suggest that we can reduce the number of distractors.

3.2.3 The results of item analyses

Graph 1 below shows the histogram of the achievement test, from which we get the mean (average score) of 44.5 (out of 75), which corresponds to 59.3 out of 100, and the standard deviation of 8.89.
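The two classical statistics defined in 3.2.2 can be sketched in a few lines of Python. This is only an illustration of the definitions, not the Excel and SPSS procedure used for this paper: the function name `item_statistics`, the `group_fraction` parameter and the six-student response matrix are all hypothetical.

```python
from statistics import mean


def item_statistics(responses, group_fraction):
    """Compute facility values and discrimination indices.

    responses: one list per student, holding 1 (correct) or 0 (wrong)
    for each item. group_fraction: share of students placed in the top
    group and in the bottom group (an assumption of this sketch).
    """
    ranked = sorted(responses, key=sum, reverse=True)  # best total first
    size = max(1, round(len(ranked) * group_fraction))
    top, bottom = ranked[:size], ranked[-size:]

    stats = []
    for i in range(len(responses[0])):
        fv_top = mean(student[i] for student in top)
        fv_bottom = mean(student[i] for student in bottom)
        stats.append({
            "item": i + 1,
            # Facility value: proportion of all students answering correctly.
            "fv_all": mean(student[i] for student in responses),
            "fv_top": fv_top,
            "fv_bottom": fv_bottom,
            # Discrimination index: top-group minus bottom-group facility.
            "di": fv_top - fv_bottom,
        })
    return stats


# Hypothetical data: 6 students x 3 items (not taken from the test).
sample = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]
for row in item_statistics(sample, group_fraction=0.33):
    print(row)
```

With real data, `responses` would hold one row of 75 scores for each of the 236 test takers.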

[Graph 1: Histogram of the achievement test. Horizontal axis: total score (TOTAL); vertical axis: number of the students. Mean = 44.5, Std. Dev = 8.89, N = 236.]

[1] Classical facility value and discrimination index

I used Excel and SPSS for this analysis. In order to calculate discrimination indices, we need to decide who belongs to the top group and who to the bottom group. I added 8.89 (the standard deviation) to the mean of 44.5 and chose the upper group of 38 students. Then I subtracted 8.89 from the mean and got the lower group of 38 students. About 68% of the students fall between the scores 35.61 and 53.39, so the number 38 represents the top 16% and the bottom 16% of the students, respectively, out of the total of 236 students.

44.5 + 8.89 = 53.39
44.5 - 8.89 = 35.61
236 × 0.16 = 37.76
(The top group and the bottom group have 38 students each.)

Table 1 shows the facility values (the average percentage of correct answers) of the two groups, top and bottom; the discrimination indices, which are the differences between the facility values of the top group and those of the bottom group; and also the facility values of all the students for each item. F.V. stands for facility values, and D.I. for discrimination indices.

Alderson claimed in his lecture in 2005 that the interpretations depend on the types and purposes of your tests. Since this test is an achievement test, it is preferable for the students to get 60% in the facility value of the total. I set the acceptable range of facility values as higher than 45%. Although a facility value higher than 90% is considered too high, and such an item should normally be modified or deleted, I included the items which had high facility values. Because this is an achievement test, I prefer to include easy items to increase our students' confidence. Easy items should be the first ones on the test. If lower-level students can answer the questions with confidence, they can move on and challenge the following questions.

As for discrimination indices, I decided to give this achievement test the range 0.2-0.6, while in the case of a proficiency test, a range higher than 0.3 would be chosen. From Table 1, I can assume we have 17 problematic items out of 75.

Item           1     2     3     4     5     6     7     8     9    10    11    12    13    14    15
F.V. of T   0.95   0.9  0.95   0.9  0.74  0.87  0.53  0.92  0.71  0.95  0.79  0.95  0.76  0.87  0.68
F.V. of B   0.63  0.53  0.61  0.47  0.18  0.53  0.32  0.63  0.71  0.45  0.37  0.61  0.42  0.34  0.34
D.I.        0.32  0.37  0.34  0.42  0.55  0.34  0.21  0.29     0   0.5  0.42  0.34  0.34  0.52  0.34

Item          16    17    18    19    20    21    22    23    24    25    26    27    28    29    30
F.V. of T   0.70  0.47  0.55  0.68  0.76  0.42  0.32   0.9  0.97  0.74  0.92  0.82  0.74  0.82  0.89
F.V. of B   0.34  0.29  0.32  0.34  0.34  0.18  0.03  0.53  0.68  0.21  0.66  0.29  0.24  0.55   0.4
D.I.          ..    ..  0.24  0.34  0.42  0.24  0.29  0.37  0.29  0.53  0.26  0.53   0.5  0.26  0.47

Item          31    32    33    34    35    36    37    38    39    40    41    42    43    44    45
F.V. of T   0.79  0.84  0.95  0.68  0.76  0.68   0.5  0.68   0.9   0.4  0.97  0.84  0.66  0.92  0.55
F.V. of B   0.47  0.58  0.61  0.16   0.5  0.29  0.24  0.37  0.42  0.16  0.66  0.29  0.24  0.47  0.34
D.I.        0.32  0.26  0.34  0.53  0.26   0.4  0.26  0.32  0.47  0.24  0.32  0.55  0.42  0.45  0.21

Item          46    47    48    49    50    51    52    53    54    55    56    57    58    59    60
F.V. of T   0.55  0.68  0.84  0.95  0.95  0.74  0.63  0.58  0.68  0.82  0.76  0.68  0.74  0.53  0.82
F.V. of B   0.55  0.24   0.4  0.47  0.37  0.32  0.34  0.26  0.58  0.37  0.45  0.34  0.32  0.26  0.45
D.I.           0  0.45  0.45  0.47  0.58  0.42  0.29  0.32  0.11  0.45  0.32  0.34  0.42  0.26  0.37

Item          61    62    63    64    65    66    67    68    69    70    71    72    73    74    75
F.V. of T      1   0.9  0.97  0.61  0.61  0.92  0.76     1  0.71  0.92  0.45  0.45   0.9  0.55  0.79
F.V. of B   0.79  0.42  0.76  0.29  0.24  0.24  0.29  0.58  0.37  0.45  0.26  0.16  0.34  0.37  0.24
D.I.        0.21  0.47  0.21  0.32  0.37  0.68  0.47  0.42  0.34  0.47    ..  0.29  0.55    ..  0.55

[Cells marked .. are shaded in the original table and not legible in the source; the D.I. values for items 9, 46 and 54 are those quoted in the text below. The row "F.V. of all" is only partly legible in the source and is not reproduced.]

Table 1: Facility values and discrimination indices of the achievement test

The items which are problematic in facility values are items 7, 17, 18, 21, 22, 34, 37, 40, 52, 65, 67, 71, 72, and 74. The items which are problematic in discrimination indices are items 9, 17, 46, 54, 71, and 74. Here we can say we absolutely need to modify or delete items 17, 71, and 74 because these are problematic in both types of analyses.

Next we need to delete items 9, 46, and 54 since these have too low discrimination indices, 0, 0, and 0.11, respectively. Then we may modify or delete items 7, 18, 21, 22, 34, 37, 40, 52, 65, 67 and 72 since these have very low facility values.
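The arithmetic behind the group size and the discrimination check can be retraced with a short Python sketch. It uses the top-group and bottom-group facility values printed in Table 1 for four items; the helper name `discrimination_index` and the decision to test only the lower bound (0.2) of the chosen range are assumptions made for illustration.

```python
def discrimination_index(fv_top, fv_bottom):
    """D.I. = facility value of the top group minus that of the bottom group."""
    return round(fv_top - fv_bottom, 2)


# (F.V. of T, F.V. of B) as printed in Table 1 for four items.
table1 = {9: (0.71, 0.71), 46: (0.55, 0.55), 54: (0.68, 0.58), 61: (1.00, 0.79)}

DI_MIN = 0.2  # lower bound of the range chosen for this achievement test

# Items whose discrimination index falls below the lower bound.
low_di = [item for item, (top, bottom) in table1.items()
          if discrimination_index(top, bottom) < DI_MIN]

# Size of the top and bottom groups: 16% of 236 students, in whole students.
group_size = round(236 * 0.16)

print(low_di, group_size)
```

Recomputing from the rounded table values gives 0.10 for item 54 rather than the reported 0.11; the reported figure was presumably calculated from unrounded facility values.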

Items 61 and 63 have very high facility values, 0.94 and 0.92, respectively, and do not discriminate well, but we may keep them to make the students feel confident, since this is an achievement test.

[2] Classical distractor analysis

[Table 2: for each of the 75 items, the percentage of test takers who chose each option (A-D), with the correct answer shaded. The cell values are not recoverable from the source.]

Table 2: The percentage of correct answers and distractors

From Table 2, I chose 7 items which should be checked. The 7 items (17, 21, 22, 34, 40, 67, and 72) seem too difficult, because their correct answers attract fewer students than their strong distractors.

[3] IRT [item difficulty and unexpected responses]

Finally, I used IRT (item-response theory) to confirm the results of the classical item analyses, and also to check the item difficulty and unexpected responses. Weir (2005: 26) states that "item-response theory models have become increasingly popular measurement tools in the past 35 years. These models use responses to items on a test or survey questionnaire to simultaneously locate both the items and the respondents on the same latent continuum." Figure 1 indicates the distribution of test takers on the left and that of items on the right. By looking at the distribution, we can easily grasp the levels of the test takers and the items.

[Figure 1: person-item map produced by Winsteps. Test takers (with the number of students at each level) are plotted on the left and item numbers on the right of a common logit scale running from 4 at the top (<more> | <rare>) to -3 at the bottom (<less> | <frequ>); M, S and T are marked on both sides of the scale.]

Figure 1: Persons map of items [Each '.' shows one person and each '#', 3 persons.]

Shown above is the figure of the difficulty of items and the levels of test takers. I used the software Winsteps to get this map. Among the results obtained with Winsteps, this map is the most useful. We can see which items are too difficult or too easy, and can compare the difficulty of items and the levels of test takers.

Logit 0 shows the M (mean) of the items; that is, items 14, 19, 31, 55, and 58 are average items which have the same difficulty. The facility values are 0.61, 0.61, 0.62, 0.62, and 0.60, respectively. S shows one logit higher or lower, while T shows 2 logits higher or lower than M. The items which are higher than T are too difficult and the items lower than T are too easy.

Here we can confirm the result of the classical item analyses. Those 'too easy or too difficult' items are supposed to be modified or deleted if this test is a proficiency test. However, in the case of an achievement test, we can keep too easy items to make test takers feel confident, as stated above.
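To read the logit scale of the map, it may help to recall the Rasch model, the one-parameter model that Winsteps fits: the probability of a correct answer depends only on the difference between the test taker's ability and the item's difficulty, both expressed in logits. The following Python sketch shows that relationship; the function name `rasch_probability` and the example values are chosen here for illustration and are not output from Winsteps.

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of a correct response under the Rasch model.

    Both arguments are in logits; only their difference matters.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# A test taker at logit 0 facing items of different difficulty.
for difficulty in (-2.0, 0.0, 2.0):
    p = rasch_probability(0.0, difficulty)
    print(f"item at {difficulty:+.1f} logits: P(correct) = {p:.2f}")
```

When ability and difficulty coincide the probability is exactly 0.5, and a gap of two logits moves it to roughly 0.88 or 0.12, which is why items far above or below the bulk of the test takers on the map are candidates for review.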

The number of the test takers above logit 0 is 192 persons out of 236. Logit 0 shows there is a 50% probability that test takers of the same level can choose the correct answers of average items. This suggests that the test is easy enough. But if we want an easier test as an achievement test, we need to check every item above logit 0.

Besides the difficulty of the items, Winsteps enables us to know the most unexpected responses, which means that lower-level test takers get difficult items right while higher-level test takers get easy items wrong. Table 3 shows the most unexpected responses in order in listening and reading.

Most unexpected responses
Listening items: 8, 24, 1, 3, 23, 9, 4, 6, 2, 10, 12, 29, 30, 26
Reading items: 61, 63, 33, 41, 72, 50, 70, 60, 39, 49, 66, 31, 58, 68

Table 3: Unexpected responses in order

Here we can get information about some of the most unexpected responses, which we can take into consideration when deciding whether to eliminate the items or keep them.

[Table 4: items arranged along the difficulty scale from 'too difficult' (above T) to 'too easy' and 'easiest' (below T), with the problematic items shaded gray. The cell contents are not recoverable from the source.]

Table 4: 12 problematic items based on IRT (gray)

Judging from the data of unexpected items and the item difficulty map, I chose 12 items to be checked. Item 22 was too difficult and an outlier, which means it was above T (logit 2), and items 61 and 63 were too easy and also outliers below T (logit -2). Item 72 turned out to be very difficult and also got more unexpected responses. Items 17, 40, 21 and 34 were rather difficult, and their correct answers did not draw the most test takers. Items 1, 24, 8, 68, 41, 33, 26 and 3 were rather easy and got more unexpected responses.

Table 5 below shows the problematic items which were found through the 3 item analyses.

Problematic listening items: 1, 7, 8, 9, 17, 18, 21, 22, 24, 26
Problematic reading items: 33, 34, 37, 40, 41, 46, 52, 54, 61, 63, 65, 67, 68, 71, 72, 74
Number of items marked: Analysis [1]: 16, Analysis [2]: 7, Analysis [3]: 10
[The marks showing which analysis identified each item are not legible in the source.]

Table 5: Items to be checked after 3 item analyses

Analysis [1], the analyses of facility values and discrimination indices, has 16 problematic items, and analysis [2], the distractor analysis, has 7, while analysis [3] has 10. In total we have 26 problematic items out of 75, which implies this test is weak.

Now how do we choose the final problematic items? Do we have to choose items which are considered problematic in at least two of the three analyses? That is not the case, because apparently we have to modify or delete the items whose correct answers attracted fewer test takers than strong distractors, that is, items 17, 21, 22, 34, 40, 67, and 72. Only the distractor analysis could find the problematic item 34. The items which did not succeed in discriminating well are items 9, 46, 54, 71 and 74, and these could not be found by the other two analyses, the distractor analysis and IRT. Only IRT can find unexpected responses. Most of the items from the data of analysis [3] are the ones which show unexpected responses. So in this case we can exclude these unexpected responses, and instead check extremely difficult items. We can keep the easiest and easier items because this test functions as an achievement test.

IRT can also show the difficulty of the items, which is very useful when we have a few items which have the same difficulty. In other words, if we have two problematic items which have the same difficulty, we can modify or delete one of them. Moreover, if we want to keep two or three items which have the same difficulty, it is easy to judge by using IRT.

From the analyses above, it is desirable to use classical analyses and IRT to ascertain which items should be modified or deleted. Each analysis can give different evaluations, so we need to be careful before we decide.

4. Discussion and a final decision

According to the results of the 3 item analyses, I conclude that none of the analyses is satisfactory on its own and that we should use both classical item analysis and IRT in order to decide the items to be modified or deleted, because IRT cannot find all the items including the strong distractors which attract more students than the correct answers. Items 17, 21, 22, 34, 40, 67, and 72 should be modified or deleted although IRT misses items 21, 34, and 67.

The 9 items which are so easy that each of the three distractors attracted less than 10% of test takers might need a slight modification, or can be kept as they are if we put those items at the beginning of each section to make test takers feel confident.

Let us see the problematic items of each section. In our course, we focused on (1) the photo section, (2) the quick-response section and (4) the grammar section, and spent less time on the other sections. In addition to the statistical data, we should take the content of the course into consideration.

(1) photo section: 3 problematic items (1, 7 and 8) out of 8
We can keep all of these items 1, 7 and 8, because we focused on the photo section, and students can feel confident about what they have learned and can get correct answers in this section fairly easily.

(2) response section: 3 items (9, 17, 18) out of 10
This section was done fairly well and is the only section which has 3 options, while the other sections have 4 options. Item 17 should be modified because it has a strong distractor.

(3) conversation/announcement section: 4 items (21, 22, 24 and 26) out of 12
The most difficult item, 22, should be deleted, and item 21 needs a slight modification. Item 24 can remain but needs to be moved to the beginning of the section because it is a rather easy item. Item 26 seems slightly easy, but it is acceptable enough for an achievement test.

(4) grammar section: 8 items (33, 34, 37, 40, 41, 46, 52 and 54) out of 30
We can keep item 33, which is rather easy, to make the students feel better. Items 34 and 40 should be modified because of strong distractors. We keep item 37 without any change, because it has good distractors although it is rather difficult.

(5) reading comprehension section: 8 items (61, 63, 65, 67, 68, 71, 72, and 74) out of 15
As a whole, the items in the reading comprehension section are either too difficult or too easy. The result shows how difficult it is to develop good items. We can make items 61 and 63 more difficult or leave both of them as they are. We have to take out items 71 and 74 because of strong distractors.

Judging from the results and the discussion, I will modify or delete at least 17 items out of 26. In items 17, 21, 22, 34, 40, 67, and 72, the distractors performed too well. Items 9, 46, 54, 71 and 74 are problematic in discrimination indices, and cannot discriminate appropriately. Rather difficult items such as 7, 18, 37, 52 and 65 also should be simplified. The other items, such as 1, 8, 24, 26, 33, 41, 61, 63 and 68, can be kept as they are, because these easier items can encourage lower-level students.
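The two distractor criteria used in this decision, a strong distractor that attracts more test takers than the correct answer and a weak distractor chosen by less than 10% of test takers, can be expressed as a small check. The Python sketch below is illustrative only: the function name `distractor_report` and the two sets of option percentages are hypothetical and are not values from Table 2.

```python
def distractor_report(option_shares, key, weak_threshold=10.0):
    """Classify the distractors of one multiple-choice item.

    option_shares: percentage of test takers choosing each option.
    key: label of the correct option.
    """
    correct_share = option_shares[key]
    distractors = {opt: share for opt, share in option_shares.items()
                   if opt != key}
    return {
        # Distractors that attract more test takers than the correct answer.
        "strong": [opt for opt, share in distractors.items()
                   if share > correct_share],
        # Distractors that attract fewer than weak_threshold percent.
        "weak": [opt for opt, share in distractors.items()
                 if share < weak_threshold],
    }


# Hypothetical items: one with strong distractors, one that is very easy.
hard_item = {"A": 39.0, "B": 25.8, "C": 30.1, "D": 5.1}
easy_item = {"A": 3.0, "B": 88.1, "C": 5.9, "D": 3.0}

print(distractor_report(hard_item, key="B"))
print(distractor_report(easy_item, key="B"))
```

An item with a non-empty "strong" list corresponds to the items marked for modification or deletion above, while an item whose distractors are all "weak" corresponds to the very easy items that may be kept at the beginning of a section.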

5. Conclusions and suggestions to make tests more useful

As I quoted above, Bachman & Palmer (1996) state that usefulness is the most important quality of a test. And Alderson, in his lecture on language testing in 2005, claimed clearly that (construct) validity is the most important and that reliability is useful to validate a test. In this paper, I focused on how we can increase the reliability of items through two kinds of item analyses.

Through the item analyses, I came to two conclusions. First, we should use both classical item analyses and IRT. Second, we can reduce the number of distractors from three to two. I will justify these two suggestions in what follows.

Classical item analyses and IRT are complementary, and neither of them is satisfactory enough to make a final decision about the items. Hughes (2003) emphasizes that "... both classical analysis and Rasch analysis [IRT] have contributions to make to the development of better tests."

To use one test for multiple purposes, we need to consider the acceptable range of facility values and discrimination indices, the reliability of the items and of the total, item difficulty and unexpected responses. As an achievement test, we accept rather high facility values, and at the same time, as a proficiency test, we need to check discrimination indices and the standard deviation so that we can judge that each item discriminates well and that the scores do not cluster.

In spite of the implementation of the pilot test, we still have 26 items, out of 75, which need to be deleted or modified. This fact shows that our test is rather weak. By using different kinds of item analyses, classical and IRT, we could find these 26 problematic items. After we decide which should be deleted or modified, we need to retest the modified items until we get acceptable evidence. This procedure increases the reliability.

The second conclusion is related to the reliability and also the practicality of tests. To make our test more reliable and more valid, in addition to the use of two kinds of item analyses, we need to have clearer test specifications from the initial design process. Weir (2005: 14) states that a test should "always be constructed on an explicit specification, which addresses both the cognitive and linguistic abilities." However, these cannot be considered 'satisfactory.'

As Hughes (2003: 63) says, to write successful items "is extremely difficult" under the condition that we have limitations of time and staff. In the case of our test, out of 225 distractors, 88 distractors (39%) attracted less than 10% of test takers. This shows that it is highly difficult to make plausible distractors, so reducing the number of distractors is desirable for other reasons as well: we have limited time and a lack of staff members to make a test.

In his article titled "Three Options Are Optimal for Multiple-Choice Items: A Meta-Analysis of 80 Years of Research," Rodriguez (2005) showed empirically that reducing the number of distractors has no influence on the reliability of tests.

Rodriguez recommends using three options, which include one correct answer and two distractors, because:

1. Less time is needed to prepare two plausible distractors than three or four distractors.
2. More 3-option items can be administered per unit of time than 4- or 5-option items, potentially improving content coverage.
3. The inclusion of additional high-quality items per unit of time should improve test score reliability, providing additional validity-related evidence regarding consistency of scores and score meaningfulness and usability. (2005: 11)

Moreover, Shizuka et al. (2006) also reported that the reduction of distractors (three to two) had no influence on the reliability of their entrance examinations. They state that "using three options instead of four did not significantly change the mean item facility or the mean item discrimination. Distractor analyses revealed that whether four or three options were provided, the actual test-takers' responses spread, on the average, over about 2.6 options per item, that the mean number of functioning distractors was much lower than 2, and that reducing the least popular option had only a minimal effect on the performance of the remaining options. These results suggested that three-option items performed nearly as well as their four-option counterparts." (2006: 35)

With respect to practicality, reducing the number of distractors will decrease the workload which occurs when we develop test items. We always have trouble producing 3 plausible distractors, and checking them also takes time, because we usually check items four or five times. Moreover, we are always troubled by third distractors, which are usually strange or not attractive. Therefore, by adopting two distractors instead of three, we can reduce mistakes and make the testing procedure (developing, implementing and evaluating) more effective and practical.

It is extremely difficult to make useful tests, but having clear purposes for tests, using two kinds of item analyses, and reducing the number of distractors will help us develop more reliable, more valid, and more useful tests.

References

Alderson, J. C., Clapham, C., & Wall, D. (1995). Language Test Construction and Evaluation. Cambridge: Cambridge University Press.
Bachman, L. F. (2004). Statistical Analyses for Language Assessment. Cambridge: Cambridge University Press.
Bachman, L. F. & Palmer, A. S. (1996). Language Testing in Practice. Oxford: Oxford University Press.
Brindley, G. (1986). The Assessment of Second Language Proficiency: Issues and Approaches. Adelaide: National Curriculum Resource Centre, Adult Migrant Education Program Australia.
Hughes, A. (2003). Testing for Language Teachers. Second Edition. Cambridge: Cambridge University Press.
McNamara, T. F. (1996). Measuring Second Language Performance. London: Pearson Education.
Nunan, D. (1999). Second Language Teaching & Learning. Boston: Heinle & Heinle Publishers.
Rodriguez, M. C. (2005). Three Options Are Optimal for Multiple-Choice Items: A Meta-Analysis of 80 Years of Research. Educational Measurement: Issues and Practice. Summer 2005: pp. 3-13.
Shizuka, T., Takeuchi, O., Yashima, T., & Yoshizawa, K. (2006). A comparison of three- and four-option English tests for university entrance selection purposes in Japan. Language Testing, 2006; vol. 23: pp. 35-57.
Weir, C. J. (2005). Language Testing and Validation. New York: Palgrave Macmillan.
