Key Challenges and Some Guidance on Using Strong Quantitative Methodology in Education Research
Keywords: doctoral training, educational research, effect sizes, evidence-based practice, quantitative methods
This article reviews several common areas of focus in quantitative methods to give Journal of Urban Mathematics Education (JUME) readers and researchers some guidance on conducting and reporting quantitative analyses. After background for the discussion, the methodological nature of recent JUME articles is reviewed, followed by commentary on key challenges and recommendations for strong practice in quantitative methodology. The review addresses causal inferences, measurement issues, missing data, testing of assumptions, nested data, and evidence for outcomes. Enhanced quantitative training and resources for doctoral students, authors, reviewers, and editors are recommended.
The copyright for articles in JUME is held by the individual. By virtue of their appearance in this open access journal, articles are free to use with proper attribution in educational and other non-commercial settings.