Search Results for author: Andrew Lan

Found 48 papers, 29 papers with code

SMART: Simulated Students Aligned with Item Response Theory for Question Difficulty Prediction

no code implementations · 7 Jul 2025 · Alexander Scarlatos, Nigel Fernandez, Christopher Ormerod, Susan Lottridge, Andrew Lan

Item (question) difficulties play a crucial role in educational assessments, enabling accurate and efficient assessment of student abilities and personalization to maximize learning outcomes.

LookAlike: Consistent Distractor Generation in Math MCQs

no code implementations · 3 May 2025 · Nisarg Parikh, Nigel Fernandez, Alexander Scarlatos, Simon Woodhead, Andrew Lan

Large language models (LLMs) are increasingly used to generate distractors for multiple-choice questions (MCQs), especially in domains like math education.

Distractor Generation, Math, +1

The StudyChat Dataset: Student Dialogues With ChatGPT in an Artificial Intelligence Course

no code implementations · 11 Mar 2025 · Hunter McNichols, Andrew Lan

StudyChat provides a rich resource for the learning sciences and AI in education communities, enabling further research into the evolving role of LLMs in education.

Chatbot

From Text to Visuals: Using LLMs to Generate Math Diagrams with Vector Graphics

no code implementations · 10 Mar 2025 · Jaewook Lee, Jeongah Lee, Wanyong Feng, Andrew Lan

We address three research questions: (1) how to automatically generate math diagrams in problem-solving hints and evaluate their quality, (2) whether SVG is an effective intermediate representation for math diagrams, and (3) what prompting strategies and formats are required for LLMs to generate accurate SVG-based diagrams.

Math, Question Answering, +2

Training LLM-based Tutors to Improve Student Learning Outcomes in Dialogues

1 code implementation · 9 Mar 2025 · Alexander Scarlatos, Naiming Liu, Jaewook Lee, Richard Baraniuk, Andrew Lan

Specifically, we generate a set of candidate tutor utterances and score them using (1) an LLM-based student model to predict the chance of correct student responses and (2) a pedagogical rubric evaluated by GPT-4o.

Learning Code-Edit Embedding to Model Student Debugging Behavior

1 code implementation · 26 Feb 2025 · Hasnain Heickal, Andrew Lan

Providing effective feedback for programming assignments in computer science education can be challenging: students solve problems by iteratively submitting code, executing it, and using limited feedback from the compiler or the auto-grader to debug.

Decoder

Automated Knowledge Component Generation and Knowledge Tracing for Coding Problems

1 code implementation · 25 Feb 2025 · Zhangqi Duan, Nigel Fernandez, Arun Balajiee Lekshmi Narayanan, Mohammad Hassany, Rafaella Sampaio de Alencar, Peter Brusilovsky, Bita Akram, Andrew Lan

Knowledge components (KCs) mapped to problems help model student learning, tracking their mastery levels on fine-grained skills thereby facilitating personalized learning and feedback in online learning platforms.

Knowledge Tracing

Whose story is it? Personalizing story generation by inferring author styles

1 code implementation · 18 Feb 2025 · Nischal Ashok Kumar, Chau Minh Pham, Mohit Iyyer, Andrew Lan

Human evaluation highlights the high quality of our Author Writing Sheet and provides valuable insights into the personalized story generation task.

Story Generation

Evaluating GPT-4 at Grading Handwritten Solutions in Math Exams

no code implementations · 7 Nov 2024 · Adriana Caraeni, Alexander Scarlatos, Andrew Lan

Recent advances in generative artificial intelligence (AI) have shown promise in accurately grading open-ended student responses.

Math

Test Case-Informed Knowledge Tracing for Open-ended Coding Tasks

1 code implementation · 28 Sep 2024 · Zhangqi Duan, Nigel Fernandez, Alexander Hicks, Andrew Lan

In this paper, we introduce Test case-Informed Knowledge Tracing for Open-ended Coding (TIKTOC), a framework to simultaneously analyze and predict both open-ended student code and whether the code passes each test case.

Knowledge Tracing, Language Modeling, +3

Exploring Knowledge Tracing in Tutor-Student Dialogues using LLMs

1 code implementation · 24 Sep 2024 · Alexander Scarlatos, Ryan S. Baker, Andrew Lan

Recent advances in large language models (LLMs) have led to the development of artificial intelligence (AI)-powered tutoring chatbots, showing promise in providing broad access to high-quality personalized education.

Knowledge Tracing, Misconceptions

Exploring Automated Keyword Mnemonics Generation with Large Language Models via Overgenerate-and-Rank

no code implementations · 21 Sep 2024 · Jaewook Lee, Hunter McNichols, Andrew Lan

In this paper, we study an under-explored area of language and vocabulary learning: keyword mnemonics, a technique for memorizing vocabulary through memorable associations with a target word via a verbal cue.

Diversity

DiVERT: Distractor Generation with Variational Errors Represented as Text for Math Multiple-choice Questions

1 code implementation · 27 Jun 2024 · Nigel Fernandez, Alexander Scarlatos, Wanyong Feng, Simon Woodhead, Andrew Lan

High-quality distractors are crucial to both the assessment and pedagogical value of multiple-choice questions (MCQs), where manually crafting ones that anticipate knowledge deficiencies or misconceptions among real students is difficult.

Distractor Generation, Math, +2

Interpreting Latent Student Knowledge Representations in Programming Assignments

1 code implementation · 13 May 2024 · Nigel Fernandez, Andrew Lan

Recent advances in artificial intelligence for education leverage generative large language models, including using them to predict open-ended student responses rather than their correctness only.

Can Large Language Models Replicate ITS Feedback on Open-Ended Math Questions?

1 code implementation · 10 May 2024 · Hunter McNichols, Jaewook Lee, Stephen Fancsali, Steve Ritter, Andrew Lan

We fine-tune both open-source and proprietary LLMs on real student responses and corresponding ITS-provided feedback.

Math, text similarity

Generating Feedback-Ladders for Logical Errors in Programming using Large Language Models

no code implementations · 1 May 2024 · Hasnain Heickal, Andrew Lan

These methods ask the LLM to generate feedback given the problem statement and a student's (buggy) submission.

Language Modelling, Large Language Model

Math Multiple Choice Question Generation via Human-Large Language Model Collaboration

no code implementations · 1 May 2024 · Jaewook Lee, Digory Smith, Simon Woodhead, Andrew Lan

We conduct a pilot study involving math educators to investigate how the tool can help them simplify the process of crafting high-quality math MCQs.

Language Modeling, Language Modelling, +6

Improving Automated Distractor Generation for Math Multiple-choice Questions with Overgenerate-and-rank

no code implementations · 19 Apr 2024 · Alexander Scarlatos, Wanyong Feng, Digory Smith, Simon Woodhead, Andrew Lan

Multiple-choice questions (MCQs) are commonly used across all levels of math education since they can be deployed and graded at a large scale.

Distractor Generation, Math, +2

Exploring Automated Distractor Generation for Math Multiple-choice Questions via Large Language Models

1 code implementation · 2 Apr 2024 · Wanyong Feng, Jaewook Lee, Hunter McNichols, Alexander Scarlatos, Digory Smith, Simon Woodhead, Nancy Otero Ornelas, Andrew Lan

Multiple-choice questions (MCQs) are ubiquitous in almost all levels of education since they are easy to administer, grade, and are a reliable format in assessments and practices.

Distractor Generation, In-Context Learning, +7

SyllabusQA: A Course Logistics Question Answering Dataset

1 code implementation · 3 Mar 2024 · Nigel Fernandez, Alexander Scarlatos, Andrew Lan

Automated teaching assistants and chatbots have significant potential to reduce the workload of human instructors, especially for logistics-related question answering, which is important to students yet repetitive for instructors.

Language Modeling, Language Modelling, +4

Improving the Validity of Automatically Generated Feedback via Reinforcement Learning

1 code implementation · 2 Mar 2024 · Alexander Scarlatos, Digory Smith, Simon Woodhead, Andrew Lan

Second, we propose a framework for feedback generation that optimizes both correctness and alignment using reinforcement learning (RL).

Math, Misconceptions, +3

Improving Socratic Question Generation using Data Augmentation and Preference Optimization

1 code implementation · 1 Mar 2024 · Nischal Ashok Kumar, Andrew Lan

The Socratic method is a way of guiding students toward solving a problem independently without directly revealing the solution to the problem.

Data Augmentation, Question Generation, +1

Using Large Language Models for Student-Code Guided Test Case Generation in Computer Science Education

1 code implementation · 11 Feb 2024 · Nischal Ashok Kumar, Andrew Lan

The goal of our work is to propose a fully automated approach for test case generation that can accurately measure student knowledge, which is important for two reasons.

Language Modeling, Language Modelling, +1

Automated Distractor and Feedback Generation for Math Multiple-choice Questions via In-context Learning

1 code implementation · 7 Aug 2023 · Hunter McNichols, Wanyong Feng, Jaewook Lee, Alexander Scarlatos, Digory Smith, Simon Woodhead, Andrew Lan

Multiple-choice questions (MCQs) are ubiquitous in almost all levels of education since they are easy to administer, grade, and are a reliable form of assessment.

In-Context Learning, Math, +2

Improving Reading Comprehension Question Generation with Data Augmentation and Overgenerate-and-rank

1 code implementation · 15 Jun 2023 · Nischal Ashok Kumar, Nigel Fernandez, Zichao Wang, Andrew Lan

Reading comprehension is a crucial skill in many aspects of education, including language learning, cognitive development, and fostering early literacy skills in children.

Data Augmentation, Question Generation, +2

Interpretable Math Word Problem Solution Generation Via Step-by-step Planning

no code implementations · 1 Jun 2023 · Mengxue Zhang, Zichao Wang, Zhichao Yang, Weiqi Feng, Andrew Lan

We propose a step-by-step planning approach for intermediate solution generation, which strategically plans the generation of the next solution step based on the MWP and the previous solution steps.

GSM8K, Language Modeling, +2

Modeling and Analyzing Scorer Preferences in Short-Answer Math Questions

no code implementations · 1 Jun 2023 · Mengxue Zhang, Neil Heffernan, Andrew Lan

In this paper, we investigate a collection of models that account for the individual preferences and tendencies of each human scorer in the automated scoring task.

Math

RetICL: Sequential Retrieval of In-Context Examples with Reinforcement Learning

2 code implementations · 23 May 2023 · Alexander Scarlatos, Andrew Lan

Recent developments in large pre-trained language models have enabled unprecedented performance on a variety of downstream tasks.

In-Context Learning, Language Modelling, +7

A Conceptual Model for End-to-End Causal Discovery in Knowledge Tracing

1 code implementation · 11 May 2023 · Nischal Ashok Kumar, Wanyong Feng, Jaewook Lee, Hunter McNichols, Aritra Ghosh, Andrew Lan

In this paper, we take a preliminary step towards solving the problem of causal discovery in knowledge tracing, i.e., finding the underlying causal relationship among different skills from real-world student response data.

Causal Discovery, Knowledge Tracing

SmartPhone: Exploring Keyword Mnemonic with Auto-generated Verbal and Visual Cues

no code implementations · 11 May 2023 · Jaewook Lee, Andrew Lan

Our approach, an end-to-end pipeline for auto-generating verbal and visual cues, can automatically generate highly memorable cues.

Retrieval, Scheduling

Algebra Error Classification with Large Language Models

1 code implementation · 8 May 2023 · Hunter McNichols, Mengxue Zhang, Andrew Lan

Existing data-driven methods avoid these limitations but specifically require mathematical expressions in student responses to be parsed into syntax trees.

Classification, Math, +1

Tree-Based Representation and Generation of Natural and Mathematical Language

1 code implementation · 15 Feb 2023 · Alexander Scarlatos, Andrew Lan

In this paper, we propose a series of modifications to existing language models to jointly represent and generate text and math: representing mathematical expressions as sequences of node tokens in their operator tree format, using math symbol and tree position embeddings to preserve the semantic and structural properties of mathematical expressions, and using a constrained decoding method to generate mathematically valid expressions.

Math, Mathematical Reasoning, +1

Multi-Layer Personalized Federated Learning for Mitigating Biases in Student Predictive Analytics

no code implementations · 5 Dec 2022 · Yun-Wei Chu, Seyyedali Hosseinalipour, Elizabeth Tenorio, Laura Cruz, Kerrie Douglas, Andrew Lan, Christopher Brinton

Conventional methods for student modeling, which involve predicting grades based on measured activities, struggle to provide accurate results for minority/underrepresented student groups due to data availability biases.

Knowledge Tracing, Personalized Federated Learning

Mitigating Biases in Student Performance Prediction via Attention-Based Personalized Federated Learning

no code implementations · 2 Aug 2022 · Yun-Wei Chu, Seyyedali Hosseinalipour, Elizabeth Tenorio, Laura Cruz, Kerrie Douglas, Andrew Lan, Christopher Brinton

To learn better representations of student activity, we augment our approach with a self-supervised behavioral pretraining methodology that leverages multiple modalities of student behavior (e.g., visits to lecture videos and participation on forums), and include a neural network attention mechanism in the model aggregation stage.

Personalized Federated Learning

Automatic Short Math Answer Grading via In-context Meta-learning

1 code implementation · 30 May 2022 · Mengxue Zhang, Sami Baral, Neil Heffernan, Andrew Lan

In this paper, we study the problem of automatic short answer grading for students' responses to math questions and propose a novel framework for this task.

automatic short answer grading, In-Context Learning, +4

Automated Scoring for Reading Comprehension via In-context BERT Tuning

1 code implementation · 19 May 2022 · Nigel Fernandez, Aritra Ghosh, Naiming Liu, Zichao Wang, Benoît Choffin, Richard Baraniuk, Andrew Lan

Our approach, in-context BERT fine-tuning, produces a single shared scoring model for all items with a carefully-designed input structure to provide contextual information on each item.

Reading Comprehension

Process-BERT: A Framework for Representation Learning on Educational Process Data

1 code implementation · 28 Apr 2022 · Alexander Scarlatos, Christopher Brinton, Andrew Lan

One can use process data for many downstream tasks such as learning outcome prediction and automatically delivering personalized intervention.

Representation Learning

GPT-based Open-Ended Knowledge Tracing

1 code implementation · 21 Feb 2022 · Naiming Liu, Zichao Wang, Richard G. Baraniuk, Andrew Lan

In education applications, knowledge tracing refers to the problem of estimating students' time-varying concept/skill mastery level from their past responses to questions and predicting their future performance.

Code Generation, Knowledge Tracing, +3

DiPS: Differentiable Policy for Sketching in Recommender Systems

no code implementations · 8 Dec 2021 · Aritra Ghosh, Saayan Mitra, Andrew Lan

In sequential recommender system applications, it is important to develop models that can capture users' evolving interest over time to successfully recommend future items that they are likely to interact with.

Sequential Recommendation

BOBCAT: Bilevel Optimization-Based Computerized Adaptive Testing

2 code implementations · 17 Aug 2021 · Aritra Ghosh, Andrew Lan

Computerized adaptive testing (CAT) refers to a form of tests that are personalized to every student/test taker.

Bilevel Optimization, Question Selection

Math Operation Embeddings for Open-ended Solution Analysis and Feedback

no code implementations · 25 Apr 2021 · Mengxue Zhang, Zichao Wang, Richard Baraniuk, Andrew Lan

Feedback on student answers and even during intermediate steps in their solutions to open-ended questions is an important element in math education.

Math

Do We Really Need Gold Samples for Sample Weighting Under Label Noise?

2 code implementations · 19 Apr 2021 · Aritra Ghosh, Andrew Lan

Consequently, several recently proposed methods, such as Meta-Weight-Net (MW-Net), use a small number of unbiased, clean samples to learn a weighting function that downweights samples that are likely to have corrupted labels under the meta-learning framework.

Meta-Learning

Option Tracing: Beyond Correctness Analysis in Knowledge Tracing

2 code implementations · 19 Apr 2021 · Aritra Ghosh, Jay Raspat, Andrew Lan

Knowledge tracing refers to a family of methods that estimate each student's knowledge component/skill mastery level from their past responses to questions.

Knowledge Tracing, Multiple-choice, +1

Contrastive Learning Improves Model Robustness Under Label Noise

1 code implementation · 19 Apr 2021 · Aritra Ghosh, Andrew Lan

One common type of method that can mitigate the impact of label noise can be viewed as supervised robust methods; one can simply replace the CCE loss with a loss that is robust to label noise, or re-weight training samples and down-weight those with higher loss values.

Contrastive Learning, image-classification, +2

Personalized Education in the AI Era: What to Expect Next?

no code implementations · 19 Jan 2021 · Setareh Maghsudi, Andrew Lan, Jie Xu, Mihaela van der Schaar

The objective of personalized learning is to design an effective knowledge acquisition track that matches the learner's strengths and bypasses her weaknesses to ultimately meet her desired goal.

Learning Student Interest Trajectory for MOOCThread Recommendation

no code implementations · 10 Jan 2021 · Shalini Pandey, Andrew Lan, George Karypis, Jaideep Srivastava

The projection operation learns to estimate future embedding of students and threads.

VarFA: A Variational Factor Analysis Framework For Efficient Bayesian Learning Analytics

no code implementations · 27 May 2020 · Zichao Wang, Yi Gu, Andrew Lan, Richard Baraniuk

We propose VarFA, a variational inference factor analysis framework that extends existing factor analysis models for educational data mining to efficiently output uncertainty estimation in the model's estimated factors.

Bayesian Inference, Variational Inference
