1 code implementation • 2 Nov 2023 • Kai Yan, Alexander G. Schwing, Yu-Xiong Wang
To address this problem, we propose Primal Wasserstein DICE (PW-DICE), which minimizes the primal Wasserstein distance between the expert and learner state occupancies with a pessimistic regularizer and leverages a contrastively learned distance as the underlying metric for the Wasserstein distance.
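As a reference point, the standard primal (Kantorovich) form of the Wasserstein distance between the expert and learner state occupancies ρ_E and ρ_L, under a ground metric c, is shown below; this is generic textbook notation, not necessarily the paper's exact objective:

```latex
% Primal Wasserstein distance between state occupancies \rho_E and \rho_L,
% under a (here, contrastively learned) ground metric c(\cdot,\cdot):
\[
W_c(\rho_E, \rho_L) \;=\; \min_{\Pi \in \mathcal{C}(\rho_E, \rho_L)}
\sum_{s,\, s'} \Pi(s, s')\, c(s, s'),
\]
% where \mathcal{C}(\rho_E, \rho_L) is the set of joint distributions (couplings)
% whose marginals are \rho_E and \rho_L.
```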
1 code implementation • 19 Oct 2023 • Ziqi Pang, Ziyang Xie, Yunze Man, Yu-Xiong Wang
This paper reveals that large language models (LLMs), despite being trained solely on textual data, are surprisingly strong encoders for purely visual tasks in the absence of language.
1 code implementation • 6 Oct 2023 • Andy Zhou, Kai Yan, Michal Shlapentokh-Rothman, Haohan Wang, Yu-Xiong Wang
While large language models (LLMs) have demonstrated impressive performance on a range of decision-making tasks, they rely on simple acting processes and fall short of broad deployment as autonomous agents.
Ranked #1 on Code Generation on MBPP
1 code implementation • 2 Oct 2023 • Ziqi Pang, Deva Ramanan, Mengtian Li, Yu-Xiong Wang
Our benchmark inherently captures the disappearance and re-appearance of agents, presenting the emergent challenge of forecasting for occluded agents, which is a safety-critical problem yet overlooked by snapshot-based benchmarks.
1 code implementation • ICCV 2023 • Shuhong Zheng, Zhipeng Bao, Martial Hebert, Yu-Xiong Wang
To tackle the MTVS problem, we propose MuvieNeRF, a framework that incorporates both multi-task and cross-view knowledge to simultaneously synthesize multiple scene properties.
1 code implementation • ICCV 2023 • Yuanyi Zhong, Anand Bhattad, Yu-Xiong Wang, David Forsyth
Dense depth and surface normal predictors should be equivariant to cropping-and-resizing -- cropping the input image should result in the output being cropped in the same way.
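As an illustration only, such an equivariance property can be turned into a consistency loss by comparing the model's prediction on a crop against the crop of its full-image prediction; the sketch below is an assumed formulation, not the paper's training objective:

```python
import torch.nn.functional as F

def crop_equivariance_loss(model, image, box):
    """Penalize disagreement between predicting-then-cropping and cropping-then-predicting."""
    top, left, h, w = box
    full_pred = model(image)                                   # (B, C, H, W) dense prediction
    crop_of_pred = full_pred[..., top:top + h, left:left + w]  # crop the output
    pred_of_crop = model(image[..., top:top + h, left:left + w])
    # Match spatial sizes in case the model resizes its output.
    pred_of_crop = F.interpolate(pred_of_crop, size=crop_of_pred.shape[-2:],
                                 mode="bilinear", align_corners=False)
    return F.l1_loss(pred_of_crop, crop_of_pred)
```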
no code implementations • 25 Sep 2023 • Zhiqing Sun, Sheng Shen, Shengcao Cao, Haotian Liu, Chunyuan Li, Yikang Shen, Chuang Gan, Liang-Yan Gui, Yu-Xiong Wang, Yiming Yang, Kurt Keutzer, Trevor Darrell
Large Multimodal Models (LMMs) are built across modalities, and misalignment between the two modalities can result in "hallucination" -- generating textual outputs that are not grounded by the multimodal information in context.
1 code implementation • ICCV 2023 • Sirui Xu, Zhengyuan Li, Yu-Xiong Wang, Liang-Yan Gui
This paper addresses a novel task of anticipating 3D human-object interactions (HOIs).
no code implementations • 8 Aug 2023 • Neehar Peri, Mengtian Li, Benjamin Wilson, Yu-Xiong Wang, James Hays, Deva Ramanan
LiDAR-based 3D detection plays a vital role in autonomous navigation.
2 code implementations • 3 Aug 2023 • Xinglong Sun, Jean Ponce, Yu-Xiong Wang
Our study reveals that, different from prior work, deformable convolution needs to be applied on an estimated depth map with a relatively high density for better performance.
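A minimal sketch of what "deformable convolution on an estimated depth map" could look like, using torchvision's DeformConv2d; the module name and wiring here are assumptions for illustration, not the paper's architecture:

```python
import torch.nn as nn
from torchvision.ops import DeformConv2d

class DepthGuidedDeformBlock(nn.Module):
    """Apply deformable convolution to features of a (densified) depth estimate."""
    def __init__(self, in_ch: int, out_ch: int, k: int = 3):
        super().__init__()
        # Offsets are predicted from the input itself: 2 (x, y) values per kernel tap.
        self.offset = nn.Conv2d(in_ch, 2 * k * k, kernel_size=k, padding=k // 2)
        self.deform = DeformConv2d(in_ch, out_ch, kernel_size=k, padding=k // 2)

    def forward(self, depth_feat):
        return self.deform(depth_feat, self.offset(depth_feat))
```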
no code implementations • 28 Jul 2023 • Garvita Allabadi, Ana Lucic, Peter Pao-Huang, Yu-Xiong Wang, Vikram Adve
Existing approaches for semi-supervised object detection assume a fixed set of classes present in training and unlabeled datasets, i.e., in-distribution (ID) data.
no code implementations • 24 Jun 2023 • Brando Miranda, Patrick Yu, Saumya Goyal, Yu-Xiong Wang, Sanmi Koyejo
Using this analysis, we demonstrate the following: 1. when the formal diversity of a data set is low, PT beats MAML on average and 2. when the formal diversity is high, MAML beats PT on average.
1 code implementation • 8 Jun 2023 • Sirui Xu, Yu-Xiong Wang, Liang-Yan Gui
This paper aims to deal with the ignored real-world complexities in prior work on human motion forecasting, emphasizing the social properties of multi-person motion, the diversity of motion and social interactions, and the complexity of articulated motion.
1 code implementation • 23 May 2023 • Saba Ghaffari, Ehsan Saleh, Alexander G. Schwing, Yu-Xiong Wang, Martin D. Burke, Saurabh Sinha
Protein design, a grand challenge of the day, involves optimization on a fitness landscape, and leading methods adopt a model-based approach where a model is trained on a training set (protein sequences and fitness) and proposes candidates to explore next.
1 code implementation • ICCV 2023 • Ziyang Xie, Ziqi Pang, Yu-Xiong Wang
To further enhance multi-view consistency, we augment the uncertainty network with the global 3D structure optimized by a voxelized neural radiance field (Voxel-NeRF).
no code implementations • 5 May 2023 • Yunze Man, Liang-Yan Gui, Yu-Xiong Wang
In this work, we propose DualCross, a cross-modality cross-domain adaptation framework to facilitate the learning of a more robust monocular bird's-eye-view (BEV) perception model, which transfers the point cloud knowledge from a LiDAR sensor in one domain during the training phase to the camera-only testing scenario in a different domain.
1 code implementation • CVPR 2023 • Shengcao Cao, Dhiraj Joshi, Liang-Yan Gui, Yu-Xiong Wang
Object detectors often suffer from the domain gap between training (source domain) and real-world applications (target domain).
no code implementations • CVPR 2023 • Jun-Kun Chen, Jipeng Lyu, Yu-Xiong Wang
Our key insight is to exploit the explicit point cloud representation as the underlying structure to construct NeRFs, inspired by the intuitive interpretation of NeRF rendering as a process that projects or "plots" the associated 3D point cloud to a 2D image plane.
2 code implementations • CVPR 2023 • Zhipeng Bao, Pavel Tokmakov, Yu-Xiong Wang, Adrien Gaidon, Martial Hebert
Object discovery -- separating objects from the background without manual labels -- is a fundamental open challenge in computer vision.
1 code implementation • 9 Feb 2023 • Sirui Xu, Yu-Xiong Wang, Liang-Yan Gui
Predicting diverse human motions given a sequence of historical poses has received increasing attention.
Ranked #1 on Human Pose Forecasting on Human3.6M (MMADE metric)
1 code implementation • CVPR 2023 • Ziqi Pang, Jie Li, Pavel Tokmakov, Dian Chen, Sergey Zagoruyko, Yu-Xiong Wang
It emphasizes spatio-temporal continuity and integrates both past and future reasoning for tracked objects.
1 code implementation • ICCV 2023 • Jiangwei Yu, Xiang Li, Xinran Zhao, Hongming Zhang, Yu-Xiong Wang
Learning about object state changes in Video Object Segmentation (VOS) is crucial for understanding and interacting with the visual world.
no code implementations • ICCV 2023 • Yuanyi Zhong, Haoran Tang, Jun-Kun Chen, Yu-Xiong Wang
Though self-supervised contrastive learning (CL) has shown its potential to achieve state-of-the-art accuracy without any supervision, its behavior still remains under-investigated.
no code implementations • CVPR 2023 • Yunze Man, Liang-Yan Gui, Yu-Xiong Wang
We design a BEV-guided multi-sensor attention block to take queries from BEV embeddings and learn the BEV representation from sensor-specific features.
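A minimal sketch of that idea in PyTorch: BEV embeddings supply the queries, and flattened sensor-specific features supply keys and values; names and shapes are assumptions, not the paper's code:

```python
import torch.nn as nn

class BEVGuidedAttention(nn.Module):
    """Cross-attention in which BEV embeddings query sensor-specific features."""
    def __init__(self, dim: int, num_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, bev_queries, sensor_feats):
        # bev_queries: (B, N_bev, D); sensor_feats: (B, N_tokens, D), e.g. flattened
        # camera and LiDAR feature maps concatenated along the token dimension.
        out, _ = self.attn(query=bev_queries, key=sensor_feats, value=sensor_feats)
        return out
```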
1 code implementation • 27 Oct 2022 • Kuan-Ying Lee, Yuanyi Zhong, Yu-Xiong Wang
Existing work on continual learning (CL) is primarily devoted to developing algorithms for models trained from scratch.
1 code implementation • 18 Oct 2022 • Kai Yan, Alexander G. Schwing, Yu-Xiong Wang
To better benefit from available demonstrations, we develop a method to Combine Explicit and Implicit Priors (CEIP).
no code implementations • 10 Oct 2022 • Zhiqiu Lin, Deepak Pathak, Yu-Xiong Wang, Deva Ramanan, Shu Kong
LECO requires learning classifiers in distinct time periods (TPs); each TP introduces a new ontology of "fine" labels that refines old ontologies of "coarse" labels (e.g., dog breeds that refine the previous "dog" label).
1 code implementation • 11 Aug 2022 • Jun-Kun Chen, Yu-Xiong Wang
Being able to learn an effective semantic representation directly on raw point clouds has become a central topic in 3D understanding.
no code implementations • 2 Aug 2022 • Brando Miranda, Patrick Yu, Yu-Xiong Wang, Sanmi Koyejo
This novel insight contextualizes claims that transfer learning solutions are better than meta-learned solutions in the regime of low diversity under a fair comparison.
no code implementations • 10 Jun 2022 • Yuanyi Zhong, Haoran Tang, Jun-Kun Chen, Jian Peng, Yu-Xiong Wang
Our insight has implications in improving the downstream robustness of supervised learning.
no code implementations • 9 Jun 2022 • Mingtong Zhang, Shuhong Zheng, Zhipeng Bao, Martial Hebert, Yu-Xiong Wang
Comprehensive 3D scene understanding, both geometrically and semantically, is important for real-world applications such as robot perception.
1 code implementation • CVPR 2022 • Shaden Alshammari, Yu-Xiong Wang, Deva Ramanan, Shu Kong
In contrast, weight decay penalizes larger weights more heavily and so learns small balanced weights; the MaxNorm constraint encourages growing small weights within a norm ball but caps all weight norms at the radius.
Ranked #8 on Long-tail Learning on CIFAR-100-LT (ρ=10)
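A minimal sketch of a MaxNorm projection as described above, typically applied to the final classifier layer after each optimizer step; the helper name and default radius are assumptions:

```python
import torch

@torch.no_grad()
def maxnorm_project(weight: torch.Tensor, radius: float = 1.0) -> None:
    """Project each per-class weight vector onto an L2 ball of the given radius, in place."""
    norms = weight.norm(dim=1, keepdim=True)   # per-class weight norms
    scale = (radius / norms).clamp(max=1.0)    # shrink only vectors whose norm exceeds radius
    weight.mul_(scale)
```

Note that small weights can keep growing freely inside the ball; only vectors that hit the radius are capped, matching the behavior the entry describes.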
1 code implementation • CVPR 2022 • Zhipeng Bao, Pavel Tokmakov, Allan Jabri, Yu-Xiong Wang, Adrien Gaidon, Martial Hebert
Our experiments demonstrate that, despite only capturing a small subset of the objects that move, this signal is enough to generalize to segment both moving and static instances of dynamic objects.
no code implementations • 24 Dec 2021 • Brando Miranda, Yu-Xiong Wang, Sanmi Koyejo
We hypothesize that the diversity coefficient of the few-shot learning benchmark is predictive of whether meta-learning solutions will succeed or not.
1 code implementation • 24 Dec 2021 • Brando Miranda, Yu-Xiong Wang, Sanmi Koyejo
Recent work has suggested that a good embedding is all we need to solve many few-shot learning benchmarks.
2 code implementations • CVPR 2022 • Lue Fan, Ziqi Pang, Tianyuan Zhang, Yu-Xiong Wang, Hang Zhao, Feng Wang, Naiyan Wang, Zhaoxiang Zhang
In LiDAR-based 3D object detection for autonomous driving, the ratio of the object size to input scene size is significantly smaller compared to 2D detection cases.
Ranked #3 on
3D Object Detection
on waymo cyclist
2 code implementations • ICLR 2022 • Saba Ghaffari, Ehsan Saleh, David Forsyth, Yu-Xiong Wang
In this work, we demonstrate the effectiveness of Firth bias reduction in few-shot classification.
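For context, classical Firth bias reduction (Firth, 1993) maximizes a penalized log-likelihood; the paper applies this idea to few-shot classifiers, and its instantiation may simplify the general form shown here:

```latex
% Firth-penalized log-likelihood, where I(\beta) is the Fisher information matrix:
\[
\ell_{\mathrm{Firth}}(\beta) \;=\; \ell(\beta) \;+\; \tfrac{1}{2}\,\log \det I(\beta).
\]
```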
no code implementations • 29 Sep 2021 • Zhipeng Bao, Yu-Xiong Wang, Martial Hebert
Generative modeling has recently shown great promise in computer vision, but it has mostly focused on synthesizing visually realistic images.
1 code implementation • ICCV 2021 • Rajshekhar Das, Yu-Xiong Wang, José M. F. Moura
An effective approach to few-shot classification involves a prior model trained on a large-sample base domain, which is then finetuned over the novel few-shot task to yield generalizable representations.
no code implementations • ICCV 2021 • Yuanyi Zhong, Bodi Yuan, Hong Wu, Zhiqiang Yuan, Jian Peng, Yu-Xiong Wang
We leverage the pixel-level L2 loss and the pixel contrastive loss for the two purposes respectively.
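A minimal sketch of a pixel-level InfoNCE contrastive loss of the kind mentioned above; the tensor layout, sampling of negatives, and temperature are assumptions, not the paper's exact loss:

```python
import torch
import torch.nn.functional as F

def pixel_info_nce(anchor, positive, negatives, tau: float = 0.1):
    """InfoNCE over pixel embeddings: pull (anchor, positive) pairs together,
    push anchors away from sampled negatives."""
    anchor = F.normalize(anchor, dim=-1)        # (N, D) sampled pixel embeddings
    positive = F.normalize(positive, dim=-1)    # (N, D) matching pixels from another view
    negatives = F.normalize(negatives, dim=-1)  # (N, K, D) sampled negative pixels
    pos = (anchor * positive).sum(-1, keepdim=True) / tau      # (N, 1)
    neg = torch.einsum("nd,nkd->nk", anchor, negatives) / tau  # (N, K)
    logits = torch.cat([pos, neg], dim=1)
    labels = torch.zeros(len(anchor), dtype=torch.long, device=anchor.device)
    return F.cross_entropy(logits, labels)      # the positive sits at index 0
```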
no code implementations • 25 Jun 2021 • Zhipeng Bao, Martial Hebert, Yu-Xiong Wang
Generative modeling has recently shown great promise in computer vision, but it has mostly focused on synthesizing visually realistic images.
no code implementations • CVPR 2021 • Weilin Zhang, Yu-Xiong Wang
One critical factor in improving few-shot detection is to address the lack of variation in training data.
1 code implementation • 12 Apr 2021 • Nadine Chang, Zhiding Yu, Yu-Xiong Wang, Anima Anandkumar, Sanja Fidler, Jose M. Alvarez
As a result, image resampling alone is not enough to yield a sufficiently balanced distribution at the object level.
1 code implementation • CVPR 2021 • Yuanyi Zhong, JianFeng Wang, Lijuan Wang, Jian Peng, Yu-Xiong Wang, Lei Zhang
This paper presents a detection-aware pre-training (DAP) approach, which leverages only weakly-labeled classification-style datasets (e.g., ImageNet) for pre-training, but is specifically tailored to benefit object detection tasks.
no code implementations • ICCV 2021 • Liangke Gui, Adrien Bardes, Ruslan Salakhutdinov, Alexander Hauptmann, Martial Hebert, Yu-Xiong Wang
Learning to hallucinate additional examples has recently been shown as a promising direction to address few-shot learning tasks.
no code implementations • 19 Nov 2020 • Weilin Zhang, Yu-Xiong Wang, David A. Forsyth
Learning to detect an object in an image from very few training examples - few-shot object detection - is challenging, because the classifier that sees proposal boxes has very little training data.
no code implementations • 22 Aug 2020 • Vivek Roy, Yan Xu, Yu-Xiong Wang, Kris Kitani, Ruslan Salakhutdinov, Martial Hebert
Recent works have proposed to solve this task by augmenting the training data of the few-shot classes using generative models with the few-shot training samples as the seeds.
1 code implementation • 17 Aug 2020 • Nadine Chang, Jayanth Koushik, Aarti Singh, Martial Hebert, Yu-Xiong Wang, Michael J. Tarr
Methods in long-tail learning focus on improving performance for data-poor (rare) classes; however, performance for such classes remains much lower than performance for more data-rich (frequent) classes.
1 code implementation • ICLR 2021 • Zhipeng Bao, Yu-Xiong Wang, Martial Hebert
We propose a novel task of joint few-shot recognition and novel-view synthesis: given only one or few images of a novel object from arbitrary views with only category annotation, we aim to simultaneously learn an object classifier and generate images of that type of object from new viewpoints.
1 code implementation • ECCV 2020 • Mengtian Li, Yu-Xiong Wang, Deva Ramanan
While past work has studied the algorithmic trade-off between latency and accuracy, there has not been a clear metric to compare different methods along the Pareto optimal latency-accuracy curve.
Ranked #2 on Real-Time Object Detection on Argoverse-HD (Detection-Only, Val) (using extra training data)
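The streaming-style evaluation behind such a metric pairs each ground-truth timestamp with the latest prediction the system had finished by that time; a minimal sketch of that matching step (hypothetical helper, not the benchmark's code):

```python
import bisect

def latest_available(pred_finish_times, query_t):
    """Index of the most recent prediction completed at or before query_t, else None."""
    # pred_finish_times must be sorted in ascending order.
    i = bisect.bisect_right(pred_finish_times, query_t) - 1
    return i if i >= 0 else None
```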
1 code implementation • 29 Nov 2019 • Ziqi Pang, Zhiyuan Hu, Pavel Tokmakov, Yu-Xiong Wang, Martial Hebert
Indeed, even the majority of few-shot learning methods rely on a large set of "base classes" for pretraining.
no code implementations • ICCV 2019 • Yu-Xiong Wang, Deva Ramanan, Martial Hebert
Few-shot learning, i.e., learning novel concepts from few examples, is fundamental to practical visual recognition systems.
Ranked #20 on Few-Shot Object Detection on MS-COCO (30-shot)
no code implementations • 25 Sep 2019 • Yu-Xiong Wang, Yuki Uchiyama, Martial Hebert, Karteek Alahari
Learning to hallucinate additional examples has recently been shown as a promising direction to address few-shot learning tasks, which aim to learn novel concepts from very few examples.
no code implementations • CVPR 2017 • Yu-Xiong Wang, Deva Ramanan, Martial Hebert
One of the remarkable properties of CNNs is the ability to transfer knowledge from a large source dataset to a (typically smaller) target dataset.
1 code implementation • CVPR 2019 • Zitian Chen, Yanwei Fu, Yu-Xiong Wang, Lin Ma, Wei Liu, Martial Hebert
Humans can robustly learn novel visual concepts even when images undergo various deformations and lose certain information.
no code implementations • ICCV 2019 • Pavel Tokmakov, Yu-Xiong Wang, Martial Hebert
One of the key limitations of modern deep learning approaches lies in the amount of data required to train them.
no code implementations • ECCV 2018 • Liang-Yan Gui, Yu-Xiong Wang, Xiaodan Liang, Jose M. F. Moura
We explore an approach to forecasting human motion over the next few milliseconds, given an input 3D skeleton sequence, based on a recurrent encoder-decoder framework.
no code implementations • ECCV 2018 • Liang-Yan Gui, Yu-Xiong Wang, Deva Ramanan, Jose M. F. Moura
This paper addresses the problem of few-shot human motion prediction, in the spirit of the recent progress on few-shot learning and meta-learning.
1 code implementation • CVPR 2018 • Yu-Xiong Wang, Ross Girshick, Martial Hebert, Bharath Hariharan
Humans can quickly learn new visual concepts, perhaps because they can easily visualize or imagine what novel objects look like from different views.
no code implementations • NeurIPS 2017 • Yu-Xiong Wang, Deva Ramanan, Martial Hebert
We cast this problem as transfer learning, where knowledge from the data-rich classes in the head of the distribution is transferred to the data-poor classes in the tail.
no code implementations • NeurIPS 2016 • Yu-Xiong Wang, Martial Hebert
Inspired by the transferability properties of CNNs, we introduce an additional unsupervised meta-training stage that exposes multiple top layer units to a large amount of unlabeled real-world images.
no code implementations • CVPR 2015 • Yu-Xiong Wang, Martial Hebert
In this paper, we explore an approach to generating detectors that is radically different from the conventional way of learning a detector from a large corpus of annotated positive and negative data samples.