Designing Knowledge Tools: How Students Transition from Using to Creating Generative AI in STEAM classroom

Authors: Qian Huang, Nachamma Sockalingam, Thijs Willems, King Wang Poon

Abstract: This study explores how graduate students in an urban planning program transitioned from passive users of generative AI to active creators of custom GPT-based knowledge tools. Drawing on Self-Determination Theory (SDT), which emphasizes the psychological needs of autonomy, competence, and relatedness as foundations for intrinsic motivation, the research investigates how the act of designing AI tools influences students’ learning experiences, identity formation, and engagement with knowledge. The study is situated within a two-term curriculum, where students first used instructor-created GPTs to support qualitative research tasks and later redesigned these tools to create their own custom applications, including the Interview Companion GPT. Using qualitative thematic analysis of student slide presentations and focus group interviews, the findings highlight a marked transformation in students’ roles and mindsets. Students reported feeling more autonomous as they chose the functionality, design, and purpose of their tools, more competent through the acquisition of AI-related skills such as prompt engineering and iterative testing, and more connected to peers through team collaboration and a shared sense of purpose. The study contributes to a growing body of evidence that student agency can be powerfully activated when learners are invited to co-design the very technologies they use. The shift from AI tool users to AI tool designers reconfigures students’ relationships with technology and knowledge, transforming them from consumers into co-creators in an evolving educational landscape.

Link: https://arxiv.org/abs/2510.19405

Contrastive Decoding Mitigates Score Range Bias in LLM-as-a-Judge

Authors: Yoshinari Fujinuma

Abstract: Large Language Models (LLMs) are commonly used as evaluators in various applications, but the reliability of the outcomes remains a challenge. One such challenge is using LLMs-as-judges for direct assessment, i.e., assigning scores from a specified range without any references. We first show that this challenge stems from LLM judge outputs being associated with score range bias, i.e., LLM judge outputs are highly sensitive to pre-defined score ranges, preventing the search for optimal score ranges. We also show that similar biases exist among models from the same family. We then mitigate this bias through contrastive decoding, achieving up to 11.3% relative improvement on average in Spearman correlation with human judgments across different score ranges.
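
Illustrative note: the abstract does not spell out the decoding rule, so the following is only a minimal sketch under stated assumptions. It uses a common formulation of contrastive decoding, in which each candidate score token is ranked by the log-probability an "expert" model assigns to it minus a weighted log-probability from an "amateur" model, and it then measures Spearman correlation with human judgments. The function names, the `alpha` weight, and the toy logits are hypothetical and are not taken from the paper.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def contrastive_score(expert_logits, amateur_logits, scores, alpha=1.0):
    """Return the score maximizing expert log-prob minus alpha * amateur log-prob.

    Assumption: both logit lists are restricted to the candidate score tokens.
    """
    expert_lp = log_softmax(expert_logits)
    amateur_lp = log_softmax(amateur_logits)
    contrast = [e - alpha * a for e, a in zip(expert_lp, amateur_lp)]
    best = max(range(len(scores)), key=lambda i: contrast[i])
    return scores[best]


def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = ranks(xs), ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Toy logits over the score range 1..5 (made-up numbers for illustration).
scores = [1, 2, 3, 4, 5]
expert = [1.0, 2.0, 3.0, 2.5, 1.0]
amateur = [1.0, 1.0, 3.5, 1.0, 1.0]
greedy = scores[max(range(len(scores)), key=lambda i: expert[i])]
contrasted = contrastive_score(expert, amateur, scores)
```

In this toy case the amateur model also favors the middle of the range, so subtracting its log-probabilities moves the selected score away from the middle score that plain greedy selection would pick. Computing `spearman` between judge scores and human scores for several score ranges would then show how sensitive a given judge is to the range.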

Link: https://arxiv.org/abs/2510.18196

Discovering the curriculum with AI: A proof-of-concept demonstration with an intelligent tutoring system for teaching project selection

Authors: Lovis Heindrich, Falk Lieder

Abstract: The decisions of individuals and organizations are often suboptimal because fully rational decision-making is too demanding in the real world. Recent work suggests that some errors can be prevented by leveraging artificial intelligence to discover and teach clever heuristics. So far, this line of research has been limited to simplified, artificial decision-making tasks. This article is the first to extend this approach to a real-world decision problem, namely, executives deciding which project their organization should launch next. We develop a computational method (MGPS) that automatically discovers project selection strategies that are optimized for real people, and we develop an intelligent tutor that teaches the discovered project selection procedures. We evaluated MGPS on a computational benchmark and tested the intelligent tutor in a training experiment with two control conditions. MGPS outperformed a state-of-the-art method and was more computationally efficient. Moreover, people who practiced with our intelligent tutor learned significantly better project selection strategies than the control groups. These findings suggest that AI could be used to automate the process of discovering and formalizing the cognitive strategies taught by intelligent tutoring systems.

Link: https://arxiv.org/abs/2406.04082

Large Language Models in Architecture Studio: A Framework for Learning Outcomes

Authors: Juan David Salazar Rodriguez, Sam Conrad Joyce, Nachamma Sockalingam, Julfendi

Abstract: The study explores the role of large language models (LLMs) in the context of the architectural design studio, understood as the pedagogical core of architectural education. Traditionally, the studio has functioned as an experiential learning space where students tackle design problems through reflective practice, peer critique, and faculty guidance. However, the integration of artificial intelligence (AI) in this environment has been largely focused on form generation, automation, and representational efficiency, neglecting its potential as a pedagogical tool to strengthen student autonomy, collaboration, and self-reflection. The objectives of this research were: (1) to identify pedagogical challenges in self-directed, peer-to-peer, and teacher-guided learning processes in architecture studies; (2) to propose AI interventions, particularly through LLMs, that contribute to overcoming these challenges; and (3) to align these interventions with measurable learning outcomes using Bloom’s taxonomy. The findings show that the main challenges include managing student autonomy, tensions in peer feedback, and the difficulty of balancing the transmission of technical knowledge with the stimulation of creativity in teaching. In response, LLMs are emerging as complementary agents capable of generating personalized feedback, organizing collaborative interactions, and offering adaptive cognitive scaffolding. Furthermore, their implementation can be linked to the cognitive levels of Bloom’s taxonomy: facilitating the recall and understanding of architectural concepts, supporting application and analysis through interactive case studies, and encouraging synthesis and evaluation through hypothetical design scenarios.

Link: https://arxiv.org/abs/2510.15936

Human or AI? Comparing Design Thinking Assessments by Teaching Assistants and Bots

Authors: Sumbul Khan, Wei Ting Liow, Lay Kee Ang

Abstract: As design thinking education grows in secondary and tertiary contexts, educators face the challenge of evaluating creative artefacts that combine visual and textual elements. Traditional rubric-based assessment is laborious, time-consuming, and inconsistent due to reliance on Teaching Assistants (TA) in large, multi-section cohorts. This paper presents an exploratory study investigating the reliability and perceived accuracy of AI-assisted assessment compared to TA-assisted assessment in evaluating student posters in design thinking education. Two activities were conducted with 33 Ministry of Education (MOE) Singapore school teachers to (1) compare AI-generated scores with TA grading across three key dimensions: empathy and user understanding, identification of pain points and opportunities, and visual communication, and (2) examine teacher preferences for AI-assigned, TA-assigned, and hybrid scores. Results showed low statistical agreement between instructor and AI scores for empathy and pain points, with slightly higher alignment for visual communication. Teachers preferred TA-assigned scores in six of ten samples. Qualitative feedback highlighted the potential of AI for formative feedback, consistency, and student self-reflection, but raised concerns about its limitations in capturing contextual nuance and creative insight. The study underscores the need for hybrid assessment models that integrate computational efficiency with human insights. This research contributes to the evolving conversation on responsible AI adoption in creative disciplines, emphasizing the balance between automation and human judgment for scalable and pedagogically sound assessment.

Link: https://arxiv.org/abs/2510.16069
