Research Article
Assessing AI’s problem solving in physics: Analyzing reasoning, false positives and negatives through the force concept inventory
1 Abai Kazakh National Pedagogical University, Almaty, KAZAKHSTAN
2 SDU University, Almaty, KAZAKHSTAN
* Corresponding Author
Contemporary Educational Technology, 16(4), October 2024, ep538, https://doi.org/10.30935/cedtech/15592
Published: 07 November 2024
OPEN ACCESS
ABSTRACT
This study investigates the performance of GPT-4, an advanced AI model developed by OpenAI, on the force concept inventory (FCI) to evaluate its accuracy, reasoning patterns, and the occurrence of false positives and false negatives. GPT-4 was tasked with answering the FCI questions across multiple sessions. Key findings include GPT-4’s proficiency in several FCI items, particularly those related to Newton’s third law, achieving perfect scores on many items. However, it struggled significantly with questions involving the interpretation of figures and spatial reasoning, resulting in a higher occurrence of false negatives, cases in which the reasoning was correct but the chosen answer was incorrect. Additionally, GPT-4 displayed several conceptual errors, such as misunderstanding the effect of friction and retaining the outdated impetus theory of motion. The study’s findings emphasize the importance of refining AI-driven tools to make them more effective in educational settings. Addressing both AI limitations and common misconceptions in physics can lead to improved educational outcomes.
CITATION (APA)
Aldazharova, S., Issayeva, G., Maxutov, S., & Balta, N. (2024). Assessing AI’s problem solving in physics: Analyzing reasoning, false positives and negatives through the force concept inventory. Contemporary Educational Technology, 16(4), ep538. https://doi.org/10.30935/cedtech/15592