Research Article

Assessing AI’s problem solving in physics: Analyzing reasoning, false positives and negatives through the force concept inventory

Salima Aldazharova 1, Gulnara Issayeva 1, Samat Maxutov 2, Nuri Balta 2*
1 Abai Kazakh National Pedagogical University, Almaty, KAZAKHSTAN
2 SDU University, Almaty, KAZAKHSTAN
* Corresponding Author
Contemporary Educational Technology, 16(4), October 2024, ep538, https://doi.org/10.30935/cedtech/15592
Published: 07 November 2024
OPEN ACCESS

ABSTRACT

This study investigates the performance of GPT-4, an advanced AI model developed by OpenAI, on the force concept inventory (FCI) to evaluate its accuracy, reasoning patterns, and the occurrence of false positives and false negatives. GPT-4 was tasked with answering the FCI questions across multiple sessions. GPT-4 proved proficient on several FCI items, particularly those related to Newton’s third law, and achieved perfect scores on many items. However, it struggled significantly with questions involving the interpretation of figures and spatial reasoning, which produced a higher occurrence of false negatives, that is, responses in which the reasoning was correct but the selected answer was incorrect. Additionally, GPT-4 displayed several conceptual errors, such as misunderstanding the effect of friction and retaining the outdated impetus theory of motion. These findings emphasize the importance of refining AI-driven tools to make them more effective in educational settings. Addressing both AI limitations and common misconceptions in physics can lead to improved educational outcomes.
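
The response categories named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' coding procedure. The abstract defines a false negative as correct reasoning paired with an incorrect answer; the sketch assumes a false positive is the mirror case (correct answer, incorrect reasoning), and the labels `true_positive` and `true_negative`, the `Response` record, and the sample data are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Response:
    """One answer to one FCI item in one session (hypothetical record layout)."""
    item: int
    answer_correct: bool
    reasoning_correct: bool


def classify(r: Response) -> str:
    """Label a response by crossing answer correctness with reasoning correctness."""
    if r.answer_correct and r.reasoning_correct:
        return "true_positive"    # assumed label
    if r.answer_correct and not r.reasoning_correct:
        return "false_positive"   # assumed definition, not stated in the abstract
    if not r.answer_correct and r.reasoning_correct:
        return "false_negative"   # definition given in the abstract
    return "true_negative"        # assumed label


def tally(responses):
    """Count how many responses fall into each category."""
    return Counter(classify(r) for r in responses)


# Made-up illustrative data, not results from the study.
sample = [
    Response(item=1, answer_correct=True, reasoning_correct=True),
    Response(item=2, answer_correct=False, reasoning_correct=True),
    Response(item=3, answer_correct=True, reasoning_correct=False),
    Response(item=4, answer_correct=False, reasoning_correct=False),
    Response(item=2, answer_correct=False, reasoning_correct=True),
]
counts = tally(sample)
print(dict(counts))
```

Scoring the answer and the reasoning separately is what lets the two mismatch categories appear at all; an answer-only score would merge false positives with fully correct responses.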

CITATION (APA)

Aldazharova, S., Issayeva, G., Maxutov, S., & Balta, N. (2024). Assessing AI’s problem solving in physics: Analyzing reasoning, false positives and negatives through the force concept inventory. Contemporary Educational Technology, 16(4), ep538. https://doi.org/10.30935/cedtech/15592
