no code implementations • SignLang (LREC) 2022 • Zhaoyang Xia, Yuxiao Chen, Qilong Zhangli, Matt Huenerfauth, Carol Neidle, Dimitri Metaxas
We modify a motion-based image animation model to generate high-resolution videos with the signer identity changed, but with the preservation of linguistically significant motions and facial expressions.
no code implementations • 18 Mar 2025 • Yang Zhou, Shiyu Zhao, Yuxiao Chen, Zhenting Wang, Dimitris N. Metaxas
Large foundation models trained on large-scale visual-text data can significantly enhance Open Vocabulary Object Detection (OVD) through data generation.
no code implementations • 7 Feb 2025 • Zhenwei Wu, Jinxiong Lu, Yuxiao Chen, Yunxin Liu, Yueting Zhuang, Luhui Hu
Humanoid robotics presents significant challenges in artificial intelligence, requiring precise coordination and control of high-degree-of-freedom systems.
1 code implementation • 5 Feb 2025 • Zhuowei Li, Haizhou Shi, Yunhe Gao, Di Liu, Zhenting Wang, Yuxiao Chen, Ting Liu, Long Zhao, Hao Wang, Dimitris N. Metaxas
Extensive experiments show that VISTA on average reduces hallucination by abount 40% on evaluated open-ended generation task, and it consistently outperforms existing methods on four benchmarks across four architectures under three decoding strategies.
no code implementations • 31 Dec 2024 • Jiawei Yang, Jiahui Huang, Yuxiao Chen, Yan Wang, Boyi Li, Yurong You, Apoorva Sharma, Maximilian Igl, Peter Karkus, Danfei Xu, Boris Ivanovic, Yue Wang, Marco Pavone
We present STORM, a spatio-temporal reconstruction model designed for reconstructing dynamic outdoor scenes from sparse observations.
no code implementations • 31 Dec 2024 • Jiageng Mao, Boyi Li, Boris Ivanovic, Yuxiao Chen, Yan Wang, Yurong You, Chaowei Xiao, Danfei Xu, Marco Pavone, Yue Wang
In this paper, we present DreamDrive, a 4D spatial-temporal scene generation approach that combines the merits of generation and reconstruction, to synthesize generalizable 4D driving scenes and dynamic driving videos with 3D consistency.
1 code implementation • 5 Dec 2024 • Zhejun Zhang, Peter Karkus, Maximilian Igl, Wenhao Ding, Yuxiao Chen, Boris Ivanovic, Marco Pavone
Traffic simulation aims to learn a policy for traffic agents that, when unrolled in closed-loop, faithfully recovers the joint distribution of trajectories observed in the real world.
1 code implementation • 17 Nov 2024 • Wentao Bao, Kai Li, Yuxiao Chen, Deep Patel, Martin Renqiang Min, Yu Kong
Existing approaches focus on the closed-set setting where an action detector is trained and tested on videos from a fixed set of action categories.
1 code implementation • 6 Nov 2024 • Yuxiao Chen, Gamze Gürsoy, Qi Lei
Federated Learning (FL) is designed to prevent data leakage through collaborative model training without centralized data storage.
no code implementations • 22 Sep 2024 • Yuxiao Chen, Kai Li, Wentao Bao, Deep Patel, Yu Kong, Martin Renqiang Min, Dimitris N. Metaxas
Learning to localize temporal boundaries of procedure steps in instructional videos is challenging due to the limited availability of annotated large-scale training videos.
no code implementations • 9 Sep 2024 • Shuhan Tan, Boris Ivanovic, Yuxiao Chen, Boyi Li, Xinshuo Weng, Yulong Cao, Philipp Krähenbühl, Marco Pavone
Simulation stands as a cornerstone for safe and efficient autonomous driving development.
no code implementations • 26 Jul 2024 • Boyi Li, Ligeng Zhu, Ran Tian, Shuhan Tan, Yuxiao Chen, Yao Lu, Yin Cui, Sushant Veer, Max Ehrlich, Jonah Philion, Xinshuo Weng, Fuzhao Xue, Andrew Tao, Ming-Yu Liu, Sanja Fidler, Boris Ivanovic, Trevor Darrell, Jitendra Malik, Song Han, Marco Pavone
Finally, we establish a benchmark for video captioning and introduce a leaderboard, aiming to accelerate advancements in video understanding, captioning, and data alignment.
no code implementations • 1 Jul 2024 • Ran Tian, Boyi Li, Xinshuo Weng, Yuxiao Chen, Edward Schmerling, Yue Wang, Boris Ivanovic, Marco Pavone
The autonomous driving industry is increasingly adopting end-to-end learning from sensory inputs to minimize human biases in system design.
no code implementations • 13 Feb 2024 • Sheng Liu, Zihan Wang, Yuxiao Chen, Qi Lei
Reconstruction attacks and defenses are essential in understanding the data leakage problem in machine learning.
no code implementations • 30 Nov 2023 • Yuxiao Chen, Sander Tonkens, Marco Pavone
Adept traffic models are critical to both planning and closed-loop simulation for autonomous vehicles (AV), and key design objectives include accuracy, diverse multimodal behaviors, interpretability, and downstream compatibility.
no code implementations • 27 Oct 2023 • Yuxiao Chen, Sushant Veer, Peter Karkus, Marco Pavone
In particular, IJP jointly optimizes over the behavior of the ego and the surrounding agents and leverages deep-learned prediction models as prediction priors that the join trajectory optimization tries to stay close to.
no code implementations • 10 Jun 2023 • Ziyuan Zhong, Davis Rempe, Yuxiao Chen, Boris Ivanovic, Yulong Cao, Danfei Xu, Marco Pavone, Baishakhi Ray
Realistic and controllable traffic simulation is a core capability that is necessary to accelerate autonomous vehicle (AV) development.
1 code implementation • 8 Jun 2023 • Ligong Han, Song Wen, Qi Chen, Zhixing Zhang, Kunpeng Song, Mengwei Ren, Ruijiang Gao, Anastasis Stathopoulos, Xiaoxiao He, Yuxiao Chen, Di Liu, Qilong Zhangli, Jindong Jiang, Zhaoyang Xia, Akash Srivastava, Dimitris Metaxas
Null-text inversion (NTI) optimizes null embeddings to align the reconstruction and inversion trajectories with larger CFG scales, enabling real image editing with cross-attention control.
1 code implementation • CVPR 2023 • Yuxiao Chen, Jianbo Yuan, Yu Tian, Shijie Geng, Xinyu Li, Ding Zhou, Dimitris N. Metaxas, Hongxia Yang
However, direct aligning cross-modal information using such representations is challenging, as visual patches and text tokens differ in semantic levels and granularities.
1 code implementation • 6 Mar 2023 • Shijie Geng, Jianbo Yuan, Yu Tian, Yuxiao Chen, Yongfeng Zhang
The success of large-scale contrastive vision-language pretraining (CLIP) has benefited both visual recognition and multimodal content understanding.
1 code implementation • 31 Oct 2022 • Ziyuan Zhong, Davis Rempe, Danfei Xu, Yuxiao Chen, Sushant Veer, Tong Che, Baishakhi Ray, Marco Pavone
Controllable and realistic traffic simulation is critical for developing and verifying autonomous vehicles.
1 code implementation • 26 Aug 2022 • Danfei Xu, Yuxiao Chen, Boris Ivanovic, Marco Pavone
We empirically validate our method, named Bi-level Imitation for Traffic Simulation (BITS), with scenarios from two large-scale driving datasets and show that BITS achieves balanced traffic simulation performance in realism, diversity, and long-horizon stability.
1 code implementation • 20 Jul 2022 • Yuxiao Chen, Long Zhao, Jianbo Yuan, Yu Tian, Zhaoyang Xia, Shijie Geng, Ligong Han, Dimitris N. Metaxas
Despite the success of fully-supervised human skeleton sequence modeling, utilizing self-supervised pre-training for skeleton sequence representation learning has been an active field because acquiring task-specific skeleton annotations at large scales is difficult.
1 code implementation • CVPR 2022 • Yuxiao Chen, Boris Ivanovic, Marco Pavone
In this work, we present ScePT, a policy planning-based trajectory prediction model that generates accurate, scene-consistent trajectory predictions suitable for autonomous system motion planning.
no code implementations • 12 Jan 2022 • Andrew Singletary, Aiden Swann, Yuxiao Chen, Aaron D. Ames
This paper details the theory and implementation behind practically ensuring safety of remotely piloted racing drones.
1 code implementation • 10 Sep 2021 • Yuxiao Chen, Ugo Rosolia, Wyatt Ubellacker, Noel Csomay-Shanklin, Aaron D. Ames
Motion planning for autonomous robots and vehicles in presence of uncontrolled agents remains a challenging problem as the reactive behaviors of the uncontrolled agents must be considered.
no code implementations • 24 Jun 2021 • Zengyi Qin, Yuxiao Chen, Chuchu Fan
We study constrained reinforcement learning (CRL) from a novel perspective by setting constraints directly on state density functions, rather than the value functions considered by previous works.
no code implementations • 20 May 2021 • Yuxiao Chen, Jianbo Yuan, Long Zhao, Tianlang Chen, Rui Luo, Larry Davis, Dimitris N. Metaxas
Cross-modal attention mechanisms have been widely applied to the image-text matching task and have achieved remarkable improvements thanks to its capability of learning fine-grained relevance across different modalities.
no code implementations • 22 Apr 2021 • Yuxiao Chen, Mrdjan Jankovic, Mario Santillo, Aaron D. Ames
The backup control barrier function (CBF) was recently proposed as a tractable formulation that guarantees the feasibility of the CBF quadratic programming (QP) via an implicitly defined control invariant set.
1 code implementation • ICLR 2021 • Zengyi Qin, Kaiqing Zhang, Yuxiao Chen, Jingkai Chen, Chuchu Fan
We propose a novel joint-learning framework that can be implemented in a decentralized fashion, with generalization guarantees for certain function classes.
no code implementations • 6 Nov 2020 • Yuxiao Chen, Ugo Rosolia, Chuchu Fan, Aaron D. Ames, Richard Murray
Motion planning in environments with multiple agents is critical to many important autonomous applications such as autonomous vehicles and assistive robots.
no code implementations • CVPR 2020 • Long Zhao, Xi Peng, Yuxiao Chen, Mubbasir Kapadia, Dimitris N. Metaxas
Our key idea is to generalize the distilled cross-modal knowledge learned from a Source dataset, which contains paired examples from both modalities, to the Target dataset by modeling knowledge as priors on parameters of the Student.
1 code implementation • 20 Jan 2020 • Yuxiao Chen, Sumanth Dathathri, Tung Phan-Minh, Richard M. Murray
There is a growing interest in building autonomous systems that interact with complex environments.
1 code implementation • 20 Jul 2019 • Yuxiao Chen, Long Zhao, Xi Peng, Jianbo Yuan, Dimitris N. Metaxas
We propose a Dynamic Graph-Based Spatial-Temporal Attention (DG-STA) method for hand gesture recognition.
Ranked #4 on
Hand Gesture Recognition
on SHREC 2017
no code implementations • 20 Jul 2018 • Yuxiao Chen, Jianbo Yuan, Quanzeng You, Jiebo Luo
Sentiment analysis on large-scale social media data is important to bridge the gaps between social media contents and real world activities including political election prediction, individual and public emotional status monitoring and analysis, and so on.