Search Results for author: Yuxuan Lu

Found 18 papers, 7 papers with code

UXAgent: A System for Simulating Usability Testing of Web Design with LLM Agents

no code implementations13 Apr 2025 Yuxuan Lu, Bingsheng Yao, Hansu Gu, Jing Huang, Jessie Wang, Yang Li, Jiri Gesi, Qi He, Toby Jia-Jun Li, Dakuo Wang

Usability testing is a fundamental research method that user experience (UX) researchers use to evaluate and iterate a web design, but how can we evaluate and iterate the usability testing study design itself?

Language Modeling +1

Beyond Believability: Accurate Human Behavior Simulation with Fine-Tuned LLMs

no code implementations26 Mar 2025 Yuxuan Lu, Jing Huang, Yan Han, Bennet Bei, Yaochen Xie, Dakuo Wang, Jessie Wang, Qi He

In this work, we focus on evaluating and improving LLMs' objective "accuracy" rather than the subjective "believability" in the web action generation task, leveraging a large-scale, real-world dataset of human actions collected from online shopping.

Action Generation

UXAgent: An LLM Agent-Based Usability Testing Framework for Web Design

1 code implementation18 Feb 2025 Yuxuan Lu, Bingsheng Yao, Hansu Gu, Jing Huang, Jessie Wang, Yang Li, Jiri Gesi, Qi He, Toby Jia-Jun Li, Dakuo Wang

Usability testing is a fundamental yet challenging research method for user experience (UX) researchers to evaluate a web design (e.g., it is inflexible to iterate on study design flaws and hard to recruit study participants).

Language Modeling +1

RECOVER: Designing a Large Language Model-based Remote Patient Monitoring System for Postoperative Gastrointestinal Cancer Care

no code implementations9 Feb 2025 Ziqi Yang, Yuxuan Lu, Jennifer Bagdasarian, Vedant Das Swain, Ritu Agarwal, Collin Campbell, Waddah Al-Refaire, Jehan El-Bayoumi, Guodong Gao, Dakuo Wang, Bingsheng Yao, Nawar Shara

Cancer surgery is a key treatment for gastrointestinal (GI) cancers, a group of cancers that account for more than 35% of cancer-related deaths worldwide, but postoperative complications are unpredictable and can be life-threatening.

Language Modeling +1

Benchmarking LLMs' Judgments with No Gold Standard

1 code implementation11 Nov 2024 Shengwei Xu, Yuxuan Lu, Grant Schoenebeck, Yuqing Kong

We also present GRE-bench (Generating Review Evaluation Benchmark), which evaluates LLMs based on how well they can generate high-quality peer reviews for academic research papers.

Benchmarking Machine Translation +1

VoxelTrack: Exploring Voxel Representation for 3D Point Cloud Object Tracking

no code implementations5 Aug 2024 Yuxuan Lu, Jiahao Nie, Zhiwei He, Hongjie Gu, Xudong Lv

Current LiDAR point cloud-based 3D single object tracking (SOT) methods typically rely on point-based representation networks.

3D Single Object Tracking Object Tracking +1

Eliciting Informative Text Evaluations with Large Language Models

1 code implementation23 May 2024 Yuxuan Lu, Shengwei Xu, Yichi Zhang, Yuqing Kong, Grant Schoenebeck

We highlight that, on the ICLR dataset, our mechanisms can differentiate three quality levels of reviews (human-written, GPT-4-generated, and GPT-3.5-generated) in terms of expected scores.

Multiple-choice Prediction

Professional Network Matters: Connections Empower Person-Job Fit

no code implementations19 Dec 2023 Hao Chen, Lun Du, Yuxuan Lu, Qiang Fu, Xu Chen, Shi Han, Yanbin Kang, Guangming Lu, Zi Li

Online recruitment platforms typically employ Person-Job Fit models as a core service to automatically match suitable job seekers with appropriate job positions.

Graph Neural Network

More Samples or More Prompts? Exploring Effective In-Context Sampling for LLM Few-Shot Prompt Engineering

no code implementations16 Nov 2023 Bingsheng Yao, Guiming Chen, Ruishi Zou, Yuxuan Lu, Jiachen Li, Shao Zhang, Yisi Sang, Sijia Liu, James Hendler, Dakuo Wang

While most existing works on LLM prompting techniques focus only on how to select a better set of data samples within a single prompt input (In-Context Learning, or ICL), why not design and leverage multiple prompts together to further improve the LLM's performance?

In-Context Learning Prompt Engineering

Human Still Wins over LLM: An Empirical Study of Active Learning on Domain-Specific Annotation Tasks

no code implementations16 Nov 2023 Yuxuan Lu, Bingsheng Yao, Shao Zhang, Yun Wang, Peng Zhang, Tun Lu, Toby Jia-Jun Li, Dakuo Wang

Large Language Models (LLMs) have demonstrated considerable advances, and several claims have been made that they exceed human performance.

Active Learning

StorySparkQA: Expert-Annotated QA Pairs with Real-World Knowledge for Children's Story-Based Learning

1 code implementation16 Nov 2023 Jiaju Chen, Yuxuan Lu, Shao Zhang, Bingsheng Yao, Yuanzhe Dong, Ying Xu, Yunyao Li, Qianwen Wang, Dakuo Wang, Yuling Sun

Interactive story reading is a common parent-child activity, where parents expect to teach both language skills and real-world knowledge beyond the story.

Question Answering World Knowledge

Beyond Labels: Empowering Human Annotators with Natural Language Explanations through a Novel Active-Learning Architecture

1 code implementation22 May 2023 Bingsheng Yao, Ishan Jindal, Lucian Popa, Yannis Katsis, Sayan Ghosh, Lihong He, Yuxuan Lu, Shashank Srivastava, Yunyao Li, James Hendler, Dakuo Wang

Our AL architecture leverages an explanation-generation model that produces explanations guided by human explanations, a prediction model that faithfully uses the generated explanations for prediction, and a novel data-diversity-based AL sampling strategy that benefits from the explanation annotations.

Active Learning Decision Making +3

A Framework of Transaction Packaging in High-throughput Blockchains

no code implementations26 Jan 2023 Yuxuan Lu, Qian Qi, Xi Chen

We develop a model of coordination and allocation for decentralized multi-sided markets; our theoretical analysis shows promise for optimizing the decentralized transaction packaging process on high-throughput blockchains and Web 3.0 platforms.

Vocal Bursts Intensity Prediction

Equal Affection or Random Selection: the Quality of Subjective Feedback from a Group Perspective

no code implementations24 Feb 2021 Jiale Chen, Yuqing Kong, Yuxuan Lu

With this assumption, we propose a new definition of uninformative feedback and correspondingly design a family of evaluation metrics for group-level feedback, called f-variety, which 1) distinguishes informative from uninformative feedback (separation) even when their statistics are both uniform, and 2) decreases as the ratio of uninformative respondents increases (monotonicity).

Computer Science and Game Theory
