Grup Değişmezliği Özelliğinin Farklı Eşitleme Yöntemlerinde İncelenmesi

Bu çalışmada grup değişmezlik özelliğinin klasik test kuramına dayalı gözlenen puan eşitleme yöntemlerinden Tucker lineer eşitleme, Levine lineer eşitleme, Braun-Holland lineer eşitleme ve eşit yüzdelikli eşitleme yöntemleri üzerindeki etkisi araştırılmıştır. Bu doğrultuda FVAT (Fast and Valid Aptitude Test) testinin toplam 3031 kişiye uygulanan iki alt formu eşitleme çalışmasında kullanılmıştır. İki alt formun 13 ortak maddesi vardır ve eşitleme deseni olarak ortak maddeli eşit olmayan gruplar deseni seçilmiştir. Grup değişmezlik özelliğinin etkisini incelemek için tüm gruptan elde edilen veriler cinsiyet alt gruplarına bölünmüştür. Araştırmanın sonucunda Tucker ve Braun-Holland lineer eşitleme yöntemlerinin kabul edilen hata sınırının altında sonuçlar ürettiği ve grup değişmezlik özelliği varsayımına karşı daha dirençli olduğu; Levine lineer eşitleme ve eşit yüzdelikli eşitleme yöntemlerinin ise kabul edilen hata sınırının üstünde hata ile eşitleme yaptığı bulunmuştur.

The Investigation of the Group Invariance Property on Diverse Equating Methods

In this study, we investigated the effect of the group invariance property on four observed-score equating methods based on classical test theory: Tucker linear, Levine linear, Braun-Holland linear, and equipercentile equating. Two subforms of the FVAT (Fast and Valid Aptitude Test), administered to a total of 3031 examinees, were used in the equating. The two subforms share 13 common items, and the common-item non-equivalent groups design was used. To examine the effect of group invariance, the data from the whole group were divided into gender subgroups. The results showed that the Tucker and Braun-Holland linear equating methods produced results below the acceptable limit of error and were more resistant with respect to the group invariance assumption, whereas the Levine linear and equipercentile equating methods equated with error above the acceptable limit.
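To make the procedure concrete, the following is a minimal Python sketch of one of the four methods, Tucker linear equating under the common-item non-equivalent groups design, together with a simple check of how far subgroup equating functions drift from the whole-group function. It is an illustration under stated assumptions, not the analysis code used in the study: the function names `tucker_linear` and `rms_difference` are hypothetical, the synthetic-population weights default to proportional group sizes, and the weighted root-mean-square difference is one common way to summarize subgroup dependence; the abstract does not say which index or error limit was applied.

```python
from math import sqrt
from statistics import fmean


def _cov(a, b):
    """Population covariance of two equal-length score lists."""
    ma, mb = fmean(a), fmean(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / len(a)


def tucker_linear(x, vx, y, vy, w1=None):
    """Return (slope, intercept) of a Tucker linear function mapping form X onto form Y.

    x, vx: total and anchor scores of the group that took form X
    y, vy: total and anchor scores of the group that took form Y
    w1: synthetic-population weight of the form X group
        (assumption: proportional to group size when omitted)
    """
    if w1 is None:
        w1 = len(x) / (len(x) + len(y))
    w2 = 1.0 - w1
    g1 = _cov(x, vx) / _cov(vx, vx)      # regression slope of X on V in group 1
    g2 = _cov(y, vy) / _cov(vy, vy)      # regression slope of Y on V in group 2
    dmu = fmean(vx) - fmean(vy)          # anchor mean difference between groups
    dvar = _cov(vx, vx) - _cov(vy, vy)   # anchor variance difference between groups
    # Synthetic-population means and variances of X and Y
    mu_x = fmean(x) - w2 * g1 * dmu
    mu_y = fmean(y) + w1 * g2 * dmu
    var_x = _cov(x, x) - w2 * g1 ** 2 * dvar + w1 * w2 * g1 ** 2 * dmu ** 2
    var_y = _cov(y, y) + w1 * g2 ** 2 * dvar + w1 * w2 * g2 ** 2 * dmu ** 2
    slope = sqrt(var_y / var_x)
    return slope, mu_y - slope * mu_x


def rms_difference(whole, subgroups, weights, score_points):
    """Weighted root-mean-square gap, at each score point, between the
    subgroup equating functions and the whole-group function.

    whole and each element of subgroups are (slope, intercept) pairs;
    weights are subgroup proportions that sum to 1.
    """
    result = []
    for s in score_points:
        e_all = whole[0] * s + whole[1]
        sq = sum(w * ((a * s + b) - e_all) ** 2
                 for w, (a, b) in zip(weights, subgroups))
        result.append(sqrt(sq))
    return result
```

In use, `tucker_linear` would be called once on the whole group and once per gender subgroup, and the values returned by `rms_difference` would then be compared, score point by score point, with whatever error limit the analyst adopts.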
