Original Article

Efficacy of Effect Size Measures in Logistic Regression

An Application for Detecting DIF

Juana Gómez-Benito

University of Barcelona, Spain

M. Dolores Hidalgo

University of Murcia, Spain

José-Luis Padilla

University of Granada, Spain

Published Online: January 28, 2009

https://doi.org/10.1027/1614-2241.5.1.18

Abstract

Statistical techniques based on logistic regression (LR) are adequate for detecting differential item functioning (DIF) in dichotomous items. Nevertheless, they return more false positives (FPs) than other DIF detection techniques do. This paper compares the efficacy of DIF detection using the LR significance test with that of the effect size estimate these procedures provide, Nagelkerke's R². The manipulated variables were sample size, the sample size ratio between focal and reference groups, amount of DIF, test length, and percentage of test items with DIF. In addition, examinee responses were generated to simulate both uniform and nonuniform DIF (symmetric and asymmetric). In all cases, dichotomous response tests were used. The results show that using R² as a strategy for detecting DIF yielded lower correct detection percentages than the significance test did. Moreover, the LR significance test showed adequate control of FP rates, close to the nominal 5%, although the rate was slightly above that level when the sample size was smaller. However, when the effect size measure was used to detect DIF, the FP rates were lower, falling below 1% in a wide range of conditions. In addition, sample size had a statistically significant main effect: FP percentages were higher when the sample size was small (100/100). The results indicate that using R² as a measure of effect size together with the statistical significance test reduces the FP rate.
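The combined decision rule summarized in the abstract can be sketched as follows. This is a minimal illustration, not the authors' software. It assumes the usual LR DIF setup, in which a baseline model (item response predicted from the matching score) is compared with a full model that adds group and score-by-group terms, so the likelihood-ratio statistic has 2 degrees of freedom. The function names, the example log-likelihoods, and the 0.035 cutoff for the change in Nagelkerke's R² (a value commonly attributed to Jodoin and Gierl, 2001) are assumptions; the abstract does not state which cutoff was used.

```python
import math


def nagelkerke_r2(ll_null, ll_model, n):
    """Nagelkerke's R² from the null and fitted log-likelihoods (n = sample size)."""
    # Cox-Snell R²: 1 - (L0 / Lm)^(2/n), written with log-likelihoods.
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    # Largest value Cox-Snell R² can reach for this data set.
    max_r2 = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell / max_r2


def dif_decision(ll_null, ll_base, ll_full, n, alpha=0.05, r2_cutoff=0.035):
    """Flag an item for DIF by the LR test alone and by the test plus effect size.

    ll_null: log-likelihood of the intercept-only model
    ll_base: log-likelihood with the matching score only
    ll_full: log-likelihood with score, group, and score-by-group terms
    r2_cutoff: assumed minimum change in Nagelkerke's R² (hypothetical default)
    """
    # Likelihood-ratio statistic for the two added terms (2 degrees of freedom).
    chi2 = max(2.0 * (ll_full - ll_base), 0.0)
    # For 2 degrees of freedom the chi-square survival function is exp(-x / 2).
    p_value = math.exp(-chi2 / 2.0)
    delta_r2 = nagelkerke_r2(ll_null, ll_full, n) - nagelkerke_r2(ll_null, ll_base, n)
    significant = p_value < alpha
    return {
        "chi2": chi2,
        "p_value": p_value,
        "delta_r2": delta_r2,
        "flag_test_only": significant,
        "flag_test_and_effect_size": significant and delta_r2 >= r2_cutoff,
    }


# Hypothetical log-likelihoods for two items with the same LR statistic.
small_sample_item = dif_decision(ll_null=-138.0, ll_base=-110.0, ll_full=-105.0, n=200)
large_sample_item = dif_decision(ll_null=-1386.0, ll_base=-1100.0, ll_full=-1095.0, n=2000)
print(small_sample_item)
print(large_sample_item)
```

In this sketch the two hypothetical items share the same likelihood-ratio statistic, but only the first exceeds the assumed ΔR² cutoff, so the combined rule flags only that item. This mirrors the abstract's point that adding an effect size criterion to the significance test makes flagging more conservative.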

References

American Psychological Association. (2001). Publication manual of the American Psychological Association (5th ed.). Washington, DC: Author.
Gómez-Benito, J., & Navas-Ara, M. J. (2000). A comparison of χ², RFA and IRT based procedures in the detection of DIF. Quality & Quantity, 34, 17–31.
Gómez-Benito, J., Hidalgo, M. D., Padilla, J. L., & González, A. (2005). Desarrollo informático para la utilización de la regresión logística como técnica de detección del DIF [Software for the use of logistic regression as DIF detection procedure]. Paper presented at the IX Congreso de Metodología de las Ciencias Sociales y de la Salud, September, Granada, Spain.
Hambleton, R. K., & Cook, L. (1983). Robustness of item response models and effects of test length and sample size on the precision of ability estimates. In D. J. Weiss (Ed.), New horizons in testing: Latent trait test theory and computerized adaptive testing (pp. 31–49). New York: Academic Press.
Hidalgo, M. D., & Gómez-Benito, J. (2003). Test purification and the evaluation of differential item functioning with multinomial logistic regression. European Journal of Psychological Assessment, 19, 1–11.
Hidalgo, M. D., & Gómez-Benito, J. (2006). Nonuniform DIF detection using discriminant logistic analysis and multinomial logistic regression: A comparison for polytomous items. Quality & Quantity, 40, 805–823.
Hidalgo, M. D., & López-Pina, J. A. (2004). DIF detection and effect size: A comparison between logistic regression and Mantel-Haenszel variation. Educational and Psychological Measurement, 64, 903–915.
Hidalgo, M. D., Gómez-Benito, J., & Padilla, J. L. (2005). Regresión logística: alternativas de análisis en la detección del funcionamiento diferencial del ítem [Logistic regression: Analysis alternatives in the detection of differential item functioning]. Psicothema, 17, 509–515.
Jodoin, M. G., & Gierl, M. J. (2001). Evaluating Type I error and power rates using an effect size measure with logistic regression procedure for DIF detection. Applied Measurement in Education, 14, 329–349.
Millsap, R. E., & Everson, H. T. (1993). Methodology review: Statistical approaches for assessing measurement bias. Applied Psychological Measurement, 17(4), 297–334.
Nagelkerke, N. J. D. (1991). A note on a general definition of the coefficient of determination. Biometrika, 78(3), 691–692.
Narayanan, P., & Swaminathan, H. (1996). Identification of items that show nonuniform DIF. Applied Psychological Measurement, 20, 257–274.
Navas-Ara, M. J., & Gómez-Benito, J. (2002). Effects of ability scale purification on the identification of DIF. European Journal of Psychological Assessment, 18(1), 9–15.
Oshima, T. C., Raju, N. S., & Nanda, A. O. (2006). A new method for assessing the statistical significance in the differential functioning of item and tests (DFIT) framework. Journal of Educational Measurement, 43(1), 1–17.
Penfield, R. D., & Lam, T. C. M. (2000). Assessing differential item functioning in performance assessment: Review and recommendations. Educational Measurement: Issues and Practice, 19, 5–15.
Potenza, M. T., & Dorans, N. J. (1995). DIF assessment for polytomously scored items: A framework for classification and evaluation. Applied Psychological Measurement, 19, 23–37.
Raju, N. S. (1988). The area between two item characteristic curves. Psychometrika, 53, 492–502.
Rogers, H. J., & Swaminathan, H. (1993). A comparison of logistic regression and Mantel-Haenszel procedures for detecting differential item functioning. Applied Psychological Measurement, 17, 105–116.
Swaminathan, H., & Rogers, H. J. (1990). Detecting differential item functioning using logistic regression procedures. Journal of Educational Measurement, 27, 361–370.
Wilkinson, L., & The Task Force on Statistical Inference. (1999). Statistical methods in psychology journals: Guidelines and explanations. American Psychologist, 54, 594–604.
Zieky, M. (1993). Practical questions in the use of DIF statistics in test development. In P. W. Holland & H. Wainer (Eds.), Differential item functioning (pp. 337–347). Hillsdale, NJ: Lawrence Erlbaum Associates, Publishers.
Zumbo, B. D. (1999). A handbook on the theory and methods of differential item functioning (DIF): Logistic regression modeling as a unitary framework for binary and Likert-type (ordinal) item scores. Ottawa, Canada: Directorate of Human Resources Research and Evaluation, Department of National Defense.
Zumbo, B. D., & Thomas, D. R. (1997). A measure of effect size for a model-based approach for studying DIF. Prince George, Canada: University of Northern British Columbia, Edgeworth Laboratory for Quantitative Behavioral Science.

Volume 5, Issue 1, January 2009

ISSN: 1614-1881, eISSN: 1614-2241

Acknowledgments:

This study was partially supported by the following grants: 2005SGR00365 from the “Departament d'Universitats, Recerca i Societat de la Informació de la Generalitat de Catalunya”; SEJ565 from the Andalusian Regional Government; SEJ2005-09144-C02-02 from Spain’s “Ministerio de Ciencia y Tecnología”, under the European Regional Development Fund (ERDF); and 05725/PHCS/07 from the “Programa de Generación del Conocimiento Científico de Excelencia de la Fundación Séneca, Agencia de Ciencia y Tecnología de la Región de Murcia”.
