Search Results for author: Yuxiao Chen

Found 35 papers, 15 papers with code

Sign Language Video Anonymization

no code implementations SignLang (LREC) 2022 Zhaoyang Xia, Yuxiao Chen, Qilong Zhangli, Matt Huenerfauth, Carol Neidle, Dimitri Metaxas

We modify a motion-based image animation model to generate high-resolution videos with the signer identity changed, but with the preservation of linguistically significant motions and facial expressions.

Decoder Image Animation +1

LED: LLM Enhanced Open-Vocabulary Object Detection without Human Curated Data Generation

no code implementations 18 Mar 2025 Yang Zhou, Shiyu Zhao, Yuxiao Chen, Zhenting Wang, Dimitris N. Metaxas

Large foundation models trained on large-scale visual-text data can significantly enhance Open Vocabulary Object Detection (OVD) through data generation.

object-detection Open-vocabulary object detection +3

STRIDE: Automating Reward Design, Deep Reinforcement Learning Training and Feedback Optimization in Humanoid Robotics Locomotion

no code implementations 7 Feb 2025 Zhenwei Wu, Jinxiong Lu, Yuxiao Chen, Yunxin Liu, Yueting Zhuang, Luhui Hu

Humanoid robotics presents significant challenges in artificial intelligence, requiring precise coordination and control of high-degree-of-freedom systems.

Deep Reinforcement Learning

The Hidden Life of Tokens: Reducing Hallucination of Large Vision-Language Models via Visual Information Steering

1 code implementation 5 Feb 2025 Zhuowei Li, Haizhou Shi, Yunhe Gao, Di Liu, Zhenting Wang, Yuxiao Chen, Ting Liu, Long Zhao, Hao Wang, Dimitris N. Metaxas

Extensive experiments show that VISTA on average reduces hallucination by about 40% on the evaluated open-ended generation task, and it consistently outperforms existing methods on four benchmarks across four architectures under three decoding strategies.

Hallucination

DreamDrive: Generative 4D Scene Modeling from Street View Images

no code implementations 31 Dec 2024 Jiageng Mao, Boyi Li, Boris Ivanovic, Yuxiao Chen, Yan Wang, Yurong You, Chaowei Xiao, Danfei Xu, Marco Pavone, Yue Wang

In this paper, we present DreamDrive, a 4D spatial-temporal scene generation approach that combines the merits of generation and reconstruction, to synthesize generalizable 4D driving scenes and dynamic driving videos with 3D consistency.

Autonomous Driving Neural Rendering +2

Closed-Loop Supervised Fine-Tuning of Tokenized Traffic Models

1 code implementation 5 Dec 2024 Zhejun Zhang, Peter Karkus, Maximilian Igl, Wenhao Ding, Yuxiao Chen, Boris Ivanovic, Marco Pavone

Traffic simulation aims to learn a policy for traffic agents that, when unrolled in closed-loop, faithfully recovers the joint distribution of trajectories observed in the real world.

Exploiting VLM Localizability and Semantics for Open Vocabulary Action Detection

1 code implementation 17 Nov 2024 Wentao Bao, Kai Li, Yuxiao Chen, Deep Patel, Martin Renqiang Min, Yu Kong

Existing approaches focus on the closed-set setting where an action detector is trained and tested on videos from a fixed set of action categories.

Action Detection Open Vocabulary Action Detection

Optimal Defenses Against Gradient Reconstruction Attacks

1 code implementation 6 Nov 2024 Yuxiao Chen, Gamze Gürsoy, Qi Lei

Federated Learning (FL) is designed to prevent data leakage through collaborative model training without centralized data storage.

Federated Learning

Learning to Localize Actions in Instructional Videos with LLM-Based Multi-Pathway Text-Video Alignment

no code implementations 22 Sep 2024 Yuxiao Chen, Kai Li, Wentao Bao, Deep Patel, Yu Kong, Martin Renqiang Min, Dimitris N. Metaxas

Learning to localize temporal boundaries of procedure steps in instructional videos is challenging due to the limited availability of annotated large-scale training videos.

Contrastive Learning cross-modal alignment +4

Data Reconstruction Attacks and Defenses: A Systematic Evaluation

no code implementations 13 Feb 2024 Sheng Liu, Zihan Wang, Yuxiao Chen, Qi Lei

Reconstruction attacks and defenses are essential in understanding the data leakage problem in machine learning.

Federated Learning Reconstruction Attack

Categorical Traffic Transformer: Interpretable and Diverse Behavior Prediction with Tokenized Latent

no code implementations 30 Nov 2023 Yuxiao Chen, Sander Tonkens, Marco Pavone

Adept traffic models are critical to both planning and closed-loop simulation for autonomous vehicles (AV), and key design objectives include accuracy, diverse multimodal behaviors, interpretability, and downstream compatibility.

Autonomous Vehicles Common Sense Reasoning +1

Interactive Joint Planning for Autonomous Vehicles

no code implementations 27 Oct 2023 Yuxiao Chen, Sushant Veer, Peter Karkus, Marco Pavone

In particular, IJP jointly optimizes over the behavior of the ego and the surrounding agents and leverages deep-learned prediction models as prediction priors that the joint trajectory optimization tries to stay close to.

Autonomous Vehicles Model Predictive Control +3

Language-Guided Traffic Simulation via Scene-Level Diffusion

no code implementations 10 Jun 2023 Ziyuan Zhong, Davis Rempe, Yuxiao Chen, Boris Ivanovic, Yulong Cao, Danfei Xu, Marco Pavone, Baishakhi Ray

Realistic and controllable traffic simulation is a core capability that is necessary to accelerate autonomous vehicle (AV) development.

Language Modeling Language Modelling +1

Improving Tuning-Free Real Image Editing with Proximal Guidance

1 code implementation 8 Jun 2023 Ligong Han, Song Wen, Qi Chen, Zhixing Zhang, Kunpeng Song, Mengwei Ren, Ruijiang Gao, Anastasis Stathopoulos, Xiaoxiao He, Yuxiao Chen, Di Liu, Qilong Zhangli, Jindong Jiang, Zhaoyang Xia, Akash Srivastava, Dimitris Metaxas

Null-text inversion (NTI) optimizes null embeddings to align the reconstruction and inversion trajectories with larger CFG scales, enabling real image editing with cross-attention control.

HiCLIP: Contrastive Language-Image Pretraining with Hierarchy-aware Attention

1 code implementation 6 Mar 2023 Shijie Geng, Jianbo Yuan, Yu Tian, Yuxiao Chen, Yongfeng Zhang

The success of large-scale contrastive vision-language pretraining (CLIP) has benefited both visual recognition and multimodal content understanding.

cross-modal alignment

BITS: Bi-level Imitation for Traffic Simulation

1 code implementation 26 Aug 2022 Danfei Xu, Yuxiao Chen, Boris Ivanovic, Marco Pavone

We empirically validate our method, named Bi-level Imitation for Traffic Simulation (BITS), with scenarios from two large-scale driving datasets and show that BITS achieves balanced traffic simulation performance in realism, diversity, and long-horizon stability.

Autonomous Vehicles Diversity

Hierarchically Self-Supervised Transformer for Human Skeleton Representation Learning

1 code implementation 20 Jul 2022 Yuxiao Chen, Long Zhao, Jianbo Yuan, Yu Tian, Zhaoyang Xia, Shijie Geng, Ligong Han, Dimitris N. Metaxas

Despite the success of fully-supervised human skeleton sequence modeling, utilizing self-supervised pre-training for skeleton sequence representation learning has been an active field because acquiring task-specific skeleton annotations at large scales is difficult.

Action Detection Action Recognition +3

ScePT: Scene-consistent, Policy-based Trajectory Predictions for Planning

1 code implementation CVPR 2022 Yuxiao Chen, Boris Ivanovic, Marco Pavone

In this work, we present ScePT, a policy planning-based trajectory prediction model that generates accurate, scene-consistent trajectory predictions suitable for autonomous system motion planning.

Motion Planning Prediction +1

Onboard Safety Guarantees for Racing Drones: High-speed Geofencing with Control Barrier Functions

no code implementations 12 Jan 2022 Andrew Singletary, Aiden Swann, Yuxiao Chen, Aaron D. Ames

This paper details the theory and implementation behind practically ensuring safety of remotely piloted racing drones.

Interactive multi-modal motion planning with Branch Model Predictive Control

1 code implementation 10 Sep 2021 Yuxiao Chen, Ugo Rosolia, Wyatt Ubellacker, Noel Csomay-Shanklin, Aaron D. Ames

Motion planning for autonomous robots and vehicles in the presence of uncontrolled agents remains a challenging problem, as the reactive behaviors of the uncontrolled agents must be considered.

Autonomous Vehicles Model Predictive Control +1

Density Constrained Reinforcement Learning

no code implementations 24 Jun 2021 Zengyi Qin, Yuxiao Chen, Chuchu Fan

We study constrained reinforcement learning (CRL) from a novel perspective by setting constraints directly on state density functions, rather than the value functions considered by previous works.

reinforcement-learning Reinforcement Learning +1

More Than Just Attention: Improving Cross-Modal Attentions with Contrastive Constraints for Image-Text Matching

no code implementations 20 May 2021 Yuxiao Chen, Jianbo Yuan, Long Zhao, Tianlang Chen, Rui Luo, Larry Davis, Dimitris N. Metaxas

Cross-modal attention mechanisms have been widely applied to the image-text matching task and have achieved remarkable improvements thanks to their capability of learning fine-grained relevance across different modalities.

Contrastive Learning Image Captioning +4

Backup Control Barrier Functions: Formulation and Comparative Study

no code implementations 22 Apr 2021 Yuxiao Chen, Mrdjan Jankovic, Mario Santillo, Aaron D. Ames

The backup control barrier function (CBF) was recently proposed as a tractable formulation that guarantees the feasibility of the CBF quadratic programming (QP) via an implicitly defined control invariant set.

Math

Learning Safe Multi-Agent Control with Decentralized Neural Barrier Certificates

1 code implementation ICLR 2021 Zengyi Qin, Kaiqing Zhang, Yuxiao Chen, Jingkai Chen, Chuchu Fan

We propose a novel joint-learning framework that can be implemented in a decentralized fashion, with generalization guarantees for certain function classes.

Reactive motion planning with probabilistic safety guarantees

no code implementations 6 Nov 2020 Yuxiao Chen, Ugo Rosolia, Chuchu Fan, Aaron D. Ames, Richard Murray

Motion planning in environments with multiple agents is critical to many important autonomous applications such as autonomous vehicles and assistive robots.

Autonomous Vehicles Model Predictive Control +1

Knowledge as Priors: Cross-Modal Knowledge Generalization for Datasets without Superior Knowledge

no code implementations CVPR 2020 Long Zhao, Xi Peng, Yuxiao Chen, Mubbasir Kapadia, Dimitris N. Metaxas

Our key idea is to generalize the distilled cross-modal knowledge learned from a Source dataset, which contains paired examples from both modalities, to the Target dataset by modeling knowledge as priors on parameters of the Student.

3D Hand Pose Estimation Knowledge Distillation

Counter-example Guided Learning of Bounds on Environment Behavior

1 code implementation 20 Jan 2020 Yuxiao Chen, Sumanth Dathathri, Tung Phan-Minh, Richard M. Murray

There is a growing interest in building autonomous systems that interact with complex environments.

Twitter Sentiment Analysis via Bi-sense Emoji Embedding and Attention-based LSTM

no code implementations 20 Jul 2018 Yuxiao Chen, Jianbo Yuan, Quanzeng You, Jiebo Luo

Sentiment analysis on large-scale social media data is important for bridging the gap between social media content and real-world activities, including political election prediction and the monitoring and analysis of individual and public emotional status.

Twitter Sentiment Analysis
