Elucidating STEM Concepts through Generative AI: A Multi-modal Exploration of Analogical Reasoning

Chen Cao¹,² Zijian Ding³,⁴ Gyeong-Geon Lee⁵ Jiajun Jiao⁶ Jionghao Lin⁷ Xiaoming Zhai⁵
¹Carnegie Learning
²University of Sheffield
³Microsoft Research
⁴University of Maryland, College Park
⁵University of Georgia
⁶New York University
⁷Carnegie Mellon University
ccao5@sheffield.ac.uk, ding@umd.edu, crusaderlee@snu.ac.kr, jj3100@nyu.edu, jionghao@cmu.edu, xiaoming.zhai@uga.edu

Abstract

This study explores the integration of generative artificial intelligence (AI), specifically large language models, with multi-modal analogical reasoning as an innovative approach to enhance science, technology, engineering, and mathematics (STEM) education. We have developed a novel system that uses the capabilities of generative AI to transform intricate principles in mathematics, physics, and programming into comprehensible metaphors. To further augment the educational experience, these metaphors are subsequently converted into visual form. Our study aims to enhance learners’ understanding of STEM concepts and their learning engagement by using these visual metaphors. We examine the efficacy of our system via a randomized A/B/C test, assessing learning gains and motivation shifts among the learners. Our study demonstrates the potential of applying large language models to educational practice in STEM subjects. The results will shed light on the design of educational systems that harness AI’s potential to empower educational stakeholders.

1 Introduction

The teaching and learning of science, technology, engineering, and mathematics (STEM) is an essential yet challenging aspect of education. Concepts, the bedrock of STEM, are frequently abstract and complex, requiring learners to exhibit higher-order thinking skills such as logical reasoning and problem-solving. Consequently, comprehending these concepts often poses significant hurdles for learners, particularly for those new to the field Lee and Ko (2011).

Current pedagogical practices primarily rely on textual explanations and code examples to convey these concepts, which, although useful, can be limiting. Traditional methods might fail to demystify abstract concepts adequately, relying heavily on learners’ abilities to translate theoretical knowledge into functional understanding Kolikant (2010). Moreover, the inherently linear nature of text-based teaching methods may not cater well to the non-linear and dynamic nature of STEM concepts Sorva et al. (2013).

Educators in the STEM fields face their own set of challenges. Lesson planning, especially for complex topics like mathematical theorems, physical laws, and computational algorithms, can be a daunting task, demanding not just extensive subject knowledge but also creativity in crafting engaging content Xu and Ouyang (2022). Resource constraints, such as limited time and funding, add to these challenges. Technological limitations also come into play, as the tools to create interactive, multimodal learning content are often expensive or difficult to use Lee et al. (2014).

Recognizing these challenges, this paper introduces an innovative, AI-driven, multimodal approach to teaching STEM algorithms. The proposed approach harnesses the capabilities of a state-of-the-art large language model to generate intuitive analogies, and then transforms them into engaging visual storyboards with a text-to-image generative model. The aim is to enhance pedagogical efficiency and, most importantly, to improve learners’ comprehension of complex STEM algorithms.

2 Related Work

2.1 AI for Analogical Reasoning

Analogical reasoning, a cognitive process that involves transferring knowledge from a source domain to a target domain based on similarities and differences, has long been recognized as a powerful tool for problem-solving and learning Gentner and Markman (1997); Holyoak and Thagard (1996). Analogies can help learners make connections between familiar and unfamiliar concepts, facilitating understanding and promoting knowledge transfer Richland et al. (2007). Recent advancements in AI, particularly the development of large language models like GPT-4, have unlocked new possibilities for generating analogies and facilitating their integration into educational contexts. Previous research suggests that Large Language Models (LLMs) might have the ability to create analogies that are similar to those made by humans, including longer natural-language analogies Team (2020), explain analogical mappings Webb et al. (2022); Bhavya et al. (2022), or invent analogy-inspired concepts for creative problem-solving Zhu and Luo (2022); Zhu et al. (2023); Webb et al. (2022); Ding et al. (2023); Ding and Chan (2023).

2.2 Analogical Reasoning to Support Education

Previous work highlighted how AI can generate personalized learning pathways for students, leading to enhanced learning outcomes Tetzlaff et al. (2021), and underscored the potential of AI in predicting and improving student engagement and performance Maghsudi et al. (2021).

One strength of AI is the ability to simplify complex concepts. For STEM topics, it can generate relevant analogies or stories that draw on more familiar situations or narratives. For example, variables in programming can be thought of as boxes where you store things (values), which you label so you know what’s inside Tetzlaff et al. (2021). Additionally, for visual learners, storyboards or visual analogies can be created to help explain these concepts. Adaptive storytelling refers to tailoring the narrative to the individual’s understanding or learning preferences, which can be gauged through their interactions or explicit feedback Bietti et al. (2019). This ability can be employed to generate study materials, reducing educators’ workload and enhancing the learning experience.

2.3 STEM Education

Learning to program is increasingly recognized as a critical skill. However, acquiring this skill can be a daunting task due to the complexity and abstractness of STEM algorithms Selby (2015); Marín et al. (2018). Despite numerous pedagogical strategies, such as pair programming, code tracing, and the use of visual aids, many students still find it challenging to comprehend and apply algorithmic concepts Kalelioglu and Gülbahar (2014).

A common pedagogical strategy for teaching algorithms is through textual explanations and examples Lopez et al. (2008). However, this approach can be insufficient, particularly for novices, as it requires the learner to translate the abstract, textual information into a more concrete understanding Kolikant (2010). This translation process is often fraught with challenges and can result in misconceptions.

2.4 Multimodal Learning and Cognitive Theory of Multimedia

Given the challenges of textual explanations, researchers have begun exploring multimodal learning approaches, which combine text with other modalities, such as images or videos. Mayer’s (2002) cognitive theory of multimedia learning posits that people learn better from words and pictures together than from words alone. His research has sparked a growing interest in multimodal learning in various fields, including computer science education Bétrancourt (2005).

Aligned with Mayer’s theory, Paivio’s (1991) dual coding theory proposes that verbal and non-verbal systems are used for cognitive processing. He suggests that information can be remembered better if it is presented in both verbal and visual formats. This theory has been used to inform many multimodal learning approaches, which attempt to exploit both the visual and verbal cognitive subsystems.

2.5 Synthesis and Research Gap

The reviewed literature shows the potential of AI in education and the need for innovative approaches to teaching STEM algorithms. There is also theoretical support for using multi-modal approaches to enhance learning. However, there is a clear gap in research that combines these elements—AI, multi-modal learning, and STEM education. This study aims to fill this gap by exploring an AI-driven, multi-modal approach to teaching STEM algorithms.

In this paper, we present an innovative approach that harnesses the power of analogical reasoning, multimodal learning, and AI technologies to demystify STEM concepts. By combining AI-generated analogies with visual storyboards and animated videos, we aim to enhance learners’ comprehension and knowledge retention in the field of STEM. We believe that this multimodal approach has the potential to revolutionize STEM education, inspire creative instructional content creation, and shed light on the interplay between analogical reasoning, abstraction, and cognitive processes.

This innovative educational approach sits at the intersection of AI and education, an increasingly researched area that holds immense potential. Leveraging AI to aid teaching and learning can provide personalized learning experiences, making education more effective and engaging. Therefore, an exploration into the role of AI, especially in generating analogies and visual aids, contributes valuable insights into this burgeoning field.

In addition to its theoretical implications, this research has practical significance. If successful, our approach could be a valuable tool for educators, aiding them in planning engaging lessons and providing students with a unique, effective way to understand abstract STEM concepts. In the long run, it could significantly transform teaching and learning processes in computer science and beyond.

The potential of AI in education has been well-documented, with significant strides in personalized learning and content generation. However, the application of AI for generating intuitive analogies and visual aids in STEM education is an under-explored domain. Our approach aims to fill this gap, exploiting the power of AI to create an engaging learning experience.

3 Theoretical Framework

This study is underpinned by three major theoretical perspectives: Mayer’s cognitive theory of multimedia learning, Paivio’s dual coding theory, and Piaget’s theory of cognitive development.

Mayer (2002) and Paivio (1991) provide the rationale for employing a multimodal approach, suggesting that learners process information more effectively when it is presented using both verbal and visual means. Piaget’s theory of cognitive development Piaget (1970) further supports our approach. Piaget postulates that knowledge acquisition is an active, constructive process where learners constantly build and adjust their understanding by integrating new information with their existing knowledge base. This theory justifies our approach’s interactive nature, which prompts learners to actively engage with the content, facilitating the integration of new information.

Our methodology is also heavily informed by best practices in UX/UI design. Following Norman’s principles Norman (1988), the interface is designed to be simple and intuitive, minimizing the cognitive load on the users. Clear instructions and feedback mechanisms are incorporated to guide the users through the process of selecting an analogy, generating a storyboard, and creating the animated video. The design also allows flexibility for the users to edit the generated content, providing a sense of control and customization, which further enhances the user experience.

These theories collectively inform the design of our AI-driven, multimodal learning solution that leverages analogies and visual storyboards. We hypothesize that this approach can enhance the comprehension of complex STEM algorithms by stimulating multiple cognitive subsystems and fostering an active learning environment.

4 Methodology

We developed a prototype with a live deployment (https://analogen.netlify.app/). The prototype is backed by GPT-4, one of the most advanced language models from OpenAI as of July 2023, to generate a set of analogies based on specific STEM concepts. Users can provide a STEM concept (e.g., “Newton’s first law”) as input, and the GPT-4 model generates three distinct analogies (e.g., skating on ice, pushing a stalled car, and the stationary soccer ball), as shown in Appendix A. These analogies are designed to be intuitive and engaging, catering to a range of learner preferences.

Following the analogy selection, the system uses preset prompts to create a narrative based on the chosen analogy. This narrative serves as the foundation for generating a storyboard. The storyboard comprises a sequence of four images, each accompanied by a description; together they provide a step-by-step visualization of the STEM concept using the selected analogy.

Finally, the images and descriptions from the storyboard are transformed into an animated video. The video serves as an engaging and easily digestible medium to convey the STEM concept, building upon the principles of the multimedia learning and dual coding theories.
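The flow described above (concept, three candidate analogies, a narrative, and a four-panel storyboard) can be sketched as follows. This is a minimal illustration and not the prototype's actual code: the prompt wording, the `Panel` type, the function names, and the `fake_llm` stub are all assumptions introduced here, and the text model is passed in as a plain callable so that no specific API is implied.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical stand-in for a text-generation call (for example, a request
# to GPT-4): it takes a prompt string and returns the model's text response.
TextModel = Callable[[str], str]


@dataclass
class Panel:
    description: str   # caption shown alongside the image
    image_prompt: str  # prompt that would be handed to a text-to-image model


def propose_analogies(concept: str, llm: TextModel, n: int = 3) -> List[str]:
    """Ask the model for n analogies, one per line, and parse the reply."""
    prompt = f"Give {n} distinct everyday analogies for the concept: {concept}. One per line."
    lines = [line.strip("-• ").strip() for line in llm(prompt).splitlines()]
    return [line for line in lines if line][:n]


def build_storyboard(concept: str, analogy: str, llm: TextModel, panels: int = 4) -> List[Panel]:
    """Turn the chosen analogy into a fixed number of described panels."""
    prompt = (
        f"Explain '{concept}' using the analogy '{analogy}' "
        f"as a story in {panels} steps, one step per line."
    )
    steps = [s.strip() for s in llm(prompt).splitlines() if s.strip()][:panels]
    return [Panel(description=s, image_prompt=f"Illustration: {s}") for s in steps]


def fake_llm(prompt: str) -> str:
    """Canned responses so the sketch runs without any model access."""
    if prompt.startswith("Give"):
        return "skating on ice\npushing a stalled car\na stationary soccer ball"
    return "step 1\nstep 2\nstep 3\nstep 4"


analogies = propose_analogies("Newton's first law", fake_llm)
storyboard = build_storyboard("Newton's first law", analogies[0], fake_llm)
print(analogies)
print(len(storyboard), "panels")
```

Passing the model in as a callable keeps the three stages testable in isolation; the video step is omitted because the text does not specify how the animation is produced.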

5 Preliminary Results

We examined the content generation performance of our tool through a sequence of incremental steps, each substantiated by specific cases. These steps include:

• Transforming STEM concepts into text-based analogies.
• Converting text-based analogies into static visual analogies.
• Evolving static visual analogies into dynamic visual analogies.

In the initial phase, our trials revealed that our tool was proficient in generating informative text-based analogies from STEM concepts. For instance, in the context of Object-Oriented Programming (OOP), objects were analogized as Lego bricks, and classes were likened to structures of Lego.

In the second phase, our tool was capable of producing visually appealing images for text-based analogies. However, we encountered a challenge in that STEM concepts often encompass multiple components that must be depicted in an analogy. This complexity is evident in analogies for OOP (objects, classes) and the analogy between water pressure/current and electric voltage/current in physics. Despite our emphasis on differentiating these components in the image generation prompts, the generative model frequently fell short in representing each part comprehensively. For example, consider the prompt for generating the analogy between water pressure/current and electric voltage/current:

“Water flows through a narrow tube connected between two water tanks, one tank having significantly more water than the other.”

Figure 1: Visual analogy for water pressure/current and electric voltage/current with prompt “Water flows through a narrow tube connected between two water tanks, one tank having significantly more water than the other.” Only the image in the top right corner (the second one) displayed two water tanks, with one tank (right) containing more water than the other. However, it was missing a connecting tube, an important feature needed to symbolize the relationship between water current and electric current.

Of the generated images shown in Figure 1, only the second one (in the top right corner) featured two water tanks, with one having more water than the other, but it lacked a connecting tube, a critical element for representing the mapping between water current and electric current.
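One way to make this kind of shortfall explicit is to compare a checklist of the elements the prompt asks for against what a reviewer actually sees in each image. The sketch below is illustrative only: the element labels, the `missing_elements` helper, and the hand-written annotation of the top-right image are assumptions that follow the description in the text, not part of the tool.

```python
from typing import Iterable, List

# Elements requested by the prompt quoted above (labels chosen for this sketch).
REQUIRED = ["two water tanks", "unequal water levels", "connecting tube"]


def missing_elements(observed: Iterable[str], required: List[str] = REQUIRED) -> List[str]:
    """Return the required elements that do not appear in the annotation."""
    seen = {o.strip().lower() for o in observed}
    return [r for r in required if r.lower() not in seen]


# Hypothetical human annotation of the top-right image in Figure 1.
top_right = ["two water tanks", "unequal water levels"]
print(missing_elements(top_right))  # ['connecting tube']
```

A non-empty result flags an image as an incomplete rendering of the analogy, which matches the observation that the tube needed to represent current was absent.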

In the third phase, the dynamic visual analogies were visually appealing, but devising suitable transitions or motions to articulate the analogy dynamically proved challenging. Taking the analogy between water pressure/current and electric voltage/current as an example, we found it difficult to generate a visual representation that effectively conveyed the concept of “water flowing from the tank with more water to the tank with less water.”

6 Discussion

While the preliminary results indicate promising outcomes in terms of content generation and potential educational effectiveness, it is essential to recognize the need for an empirical study with learner participants in a real-world setting.

Our next step is to conduct a between-subjects study with different types of analogies to gain insight into actual learning outcomes, learner engagement, and potential areas for improvement. Additionally, expanding the study to different STEM concepts and considering diverse learner profiles could help assess the generalizability and effectiveness of the approach in various educational settings.
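For a between-subjects design with three arms, participants are typically assigned at random while keeping group sizes balanced. The following is a minimal sketch of such an assignment; the condition labels "A", "B", and "C", the participant IDs, and the fixed seed are placeholders assumed for illustration, since the study conditions are not specified here.

```python
import random
from collections import Counter
from typing import Dict, Sequence


def assign_conditions(
    participants: Sequence[str],
    conditions: Sequence[str] = ("A", "B", "C"),
    seed: int = 0,
) -> Dict[str, str]:
    """Shuffle participants, then deal them round-robin into conditions."""
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    shuffled = list(participants)
    rng.shuffle(shuffled)
    return {p: conditions[i % len(conditions)] for i, p in enumerate(shuffled)}


ids = [f"P{i:02d}" for i in range(1, 13)]  # twelve hypothetical participant IDs
assignment = assign_conditions(ids)
print(Counter(assignment.values()))  # each condition receives four participants
```

Shuffling before the round-robin deal keeps the groups within one participant of each other in size, while leaving each individual's condition random.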

7 Conclusion

This paper introduces a generative-AI-facilitated pedagogical tool for computer science education. Our results reveal the potential of expanding this multimodal approach to other subjects (e.g., physics, chemistry, and mathematics) in future research. By revolutionizing the creation of educational content, we aim to break down the barriers to understanding complex learning concepts and foster an environment conducive to both teaching and learning.

References

Bétrancourt (2005) Mireille Bétrancourt. 2005. The animation and interactivity principles in multimedia learning. The Cambridge handbook of multimedia learning (2005), 287–296.
Bhavya et al. (2022) Bhavya Bhavya, Jinjun Xiong, and Chengxiang Zhai. 2022. Analogy Generation by Prompting Large Language Models: A Case Study of InstructGPT. http://arxiv.org/abs/2210.04186 arXiv:2210.04186 [cs].
Bietti et al. (2019) Lucas M Bietti, Ottilie Tilston, and Adrian Bangerter. 2019. Storytelling as adaptive collective sensemaking. Topics in cognitive science 11, 4 (2019), 710–732.
Ding and Chan (2023) Zijian Ding and Joel Chan. 2023. Mapping the Design Space of Interactions in Human-AI Text Co-creation Tasks. ArXiv abs/2303.06430 (2023).
Ding et al. (2023) Zijian Ding, Arvind Srinivasan, Stephen MacNeil, and Joel Chan. 2023. Fluid Transformers and Creative Analogies: Exploring Large Language Models’ Capacity for Augmenting Cross-Domain Analogical Creativity. In Proceedings of the 15th Conference on Creativity and Cognition. 489–505.
Gentner and Markman (1997) Dedre Gentner and Arthur B Markman. 1997. Structure mapping in analogy and similarity. American psychologist 52, 1 (1997), 45.
Holyoak and Thagard (1996) Keith J Holyoak and Paul Thagard. 1996. Mental leaps: Analogy in creative thought. MIT press.
Kalelioglu and Gülbahar (2014) Filiz Kalelioglu and Yasemin Gülbahar. 2014. The Effects of Teaching Programming via Scratch on Problem Solving Skills: A Discussion from Learners’ Perspective. Informatics in education 13, 1 (2014), 33–50.
Kolikant (2010) Yifat Ben-David Kolikant. 2010. Digital natives, better learners? Students’ beliefs about how the Internet influenced their ability to learn. Computers in Human Behavior 26, 6 (2010), 1384–1391.
Lee et al. (2014) Michael J Lee, Faezeh Bahmani, Irwin Kwan, Jilian LaFerte, Polina Charters, Amber Horvath, Fanny Luor, Jill Cao, Catherine Law, Michael Beswetherick, et al. 2014. Principles of a debugging-first puzzle game for computing education. In 2014 IEEE symposium on visual languages and human-centric computing (VL/HCC). IEEE, 57–64.
Lee and Ko (2011) Michael J Lee and Amy J Ko. 2011. Personifying programming tool feedback improves novice programmers’ learning. In Proceedings of the seventh international workshop on Computing education research. 109–116.
Lopez et al. (2008) Mike Lopez, Jacqueline Whalley, Phil Robbins, and Raymond Lister. 2008. Relationships between reading, tracing and writing skills in introductory programming. In Proceedings of the fourth international workshop on computing education research. 101–112.
Maghsudi et al. (2021) Setareh Maghsudi, Andrew Lan, Jie Xu, and Mihaela van Der Schaar. 2021. Personalized education in the artificial intelligence era: what to expect next. IEEE Signal Processing Magazine 38, 3 (2021), 37–50.
Marín et al. (2018) Beatriz Marín, Jonathan Frez, J Cruz-Lemus, and Marcela Genero. 2018. An empirical investigation on the benefits of gamification in programming courses. ACM Transactions on Computing Education (TOCE) 19, 1 (2018), 1–22.
Mayer (2002) Richard E Mayer. 2002. Multimedia learning. In Psychology of learning and motivation. Vol. 41. Elsevier, 85–139.
Norman (1988) Donald A Norman. 1988. The psychology of everyday things. Basic books.
Paivio (1991) Allan Paivio. 1991. Dual coding theory: Retrospect and current status. Canadian Journal of Psychology/Revue canadienne de psychologie 45, 3 (1991), 255.
Piaget (1970) Jean Piaget. 1970. Science of education and the psychology of the child. Trans. D. Coltman. (1970).
Richland et al. (2007) Lindsey E Richland, Osnat Zur, and Keith J Holyoak. 2007. Cognitive supports for analogies in the mathematics classroom. Science 316, 5828 (2007), 1128–1129.
Selby (2015) Cynthia C Selby. 2015. Relationships: computational thinking, pedagogy of programming, and Bloom’s Taxonomy. In Proceedings of the workshop in primary and secondary computing education. 80–87.
Sorva et al. (2013) Juha Sorva, Ville Karavirta, and Lauri Malmi. 2013. A review of generic program visualization systems for introductory programming education. ACM Transactions on Computing Education (TOCE) 13, 4 (2013), 1–64.
Team (2020) Latitude Team. 2020. World Creation by Analogy. https://aidungeon.medium.com/world-creation-by-analogy-f26e3791d35f
Tetzlaff et al. (2021) Leonard Tetzlaff, Florian Schmiedek, and Garvin Brod. 2021. Developing personalized education: A dynamic framework. Educational Psychology Review 33 (2021), 863–882.
Webb et al. (2022) Taylor Webb, Keith J. Holyoak, and Hongjing Lu. 2022. Emergent Analogical Reasoning in Large Language Models. http://arxiv.org/abs/2212.09196 arXiv:2212.09196 [cs].
Xu and Ouyang (2022) Weiqi Xu and Fan Ouyang. 2022. The application of AI technologies in STEM education: a systematic review from 2011 to 2021. International Journal of STEM Education 9 (2022), 1–20. https://api.semanticscholar.org/CorpusID:252370184
Zhu and Luo (2022) Q. Zhu and J. Luo. 2022. Generative Pre-Trained Transformer for Design Concept Generation: An Exploration. Proceedings of the Design Society 2 (May 2022), 1825–1834. https://doi.org/10.1017/pds.2022.185 Publisher: Cambridge University Press.
Zhu et al. (2023) Qihao Zhu, Xinyu Zhang, and Jianxi Luo. 2023. Biologically Inspired Design Concept Generation Using Generative Pre-Trained Transformers. Journal of Mechanical Design 145, 4 (Jan. 2023). https://doi.org/10.1115/1.4056598

APPENDIX