Search Results for author: Dilin Wang

Found 32 papers, 15 papers with code

MVDiffusion++: A Dense High-resolution Multi-view Diffusion Model for Single or Sparse-view 3D Object Reconstruction

no code implementations 20 Feb 2024 Shitao Tang, Jiacheng Chen, Dilin Wang, Chengzhou Tang, Fuyang Zhang, Yuchen Fan, Vikas Chandra, Yasutaka Furukawa, Rakesh Ranjan

MVDiffusion++ achieves superior flexibility and scalability with two surprisingly simple ideas: 1) a "pose-free architecture" where standard self-attention among 2D latent features learns 3D consistency across an arbitrary number of conditional and generation views without explicitly using camera pose information; and 2) a "view dropout strategy" that discards a substantial number of output views during training, which reduces the training-time memory footprint and enables dense and high-resolution view synthesis at test time.

3D Object Reconstruction 3D Reconstruction +2

Taming Mode Collapse in Score Distillation for Text-to-3D Generation

no code implementations CVPR 2024 Peihao Wang, Dejia Xu, Zhiwen Fan, Dilin Wang, Sreyas Mohan, Forrest Iandola, Rakesh Ranjan, Yilei Li, Qiang Liu, Zhangyang Wang, Vikas Chandra

In this paper, we reveal that the existing score distillation-based text-to-3D generation frameworks degenerate to maximal likelihood seeking on each view independently and thus suffer from the mode collapse problem, manifesting as the Janus artifact in practice.

3D Generation Prompt Engineering +1

EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything

1 code implementation CVPR 2024 Yunyang Xiong, Bala Varadarajan, Lemeng Wu, Xiaoyu Xiang, Fanyi Xiao, Chenchen Zhu, Xiaoliang Dai, Dilin Wang, Fei Sun, Forrest Iandola, Raghuraman Krishnamoorthi, Vikas Chandra

On segment anything tasks such as zero-shot instance segmentation, our EfficientSAMs with SAMI-pretrained lightweight image encoders perform favorably with a significant gain (e.g., ~4 AP on COCO/LVIS) over other fast SAM models.

Decoder Image Classification +6

Pose-Free Generalizable Rendering Transformer

no code implementations 5 Oct 2023 Zhiwen Fan, Panwang Pan, Peihao Wang, Yifan Jiang, Hanwen Jiang, Dejia Xu, Zehao Zhu, Dilin Wang, Zhangyang Wang

To address this challenge, we introduce PF-GRT, a new Pose-Free framework for Generalizable Rendering Transformer, eliminating the need for pre-computed camera poses and instead leveraging feature-matching learned directly from data.

Generalizable Novel View Synthesis Novel View Synthesis

TODM: Train Once Deploy Many Efficient Supernet-Based RNN-T Compression For On-device ASR Models

no code implementations 5 Sep 2023 Yuan Shangguan, Haichuan Yang, Danni Li, Chunyang Wu, Yassir Fathullah, Dilin Wang, Ayushi Dalmia, Raghuraman Krishnamoorthi, Ozlem Kalinli, Junteng Jia, Jay Mahadeokar, Xin Lei, Mike Seltzer, Vikas Chandra

Results demonstrate that our TODM Supernet either matches or surpasses the performance of manually tuned models, with up to a 3% relative improvement in word error rate (WER), while efficiently keeping the cost of training many models at a small constant.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

PathFusion: Path-consistent Lidar-Camera Deep Feature Fusion

no code implementations 12 Dec 2022 Lemeng Wu, Dilin Wang, Meng Li, Yunyang Xiong, Raghuraman Krishnamoorthi, Qiang Liu, Vikas Chandra

Fusing 3D LiDAR features with 2D camera features is a promising technique for enhancing the accuracy of 3D detection, thanks to their complementary physical properties.

Fast Point Cloud Generation with Straight Flows

1 code implementation CVPR 2023 Lemeng Wu, Dilin Wang, Chengyue Gong, Xingchao Liu, Yunyang Xiong, Rakesh Ranjan, Raghuraman Krishnamoorthi, Vikas Chandra, Qiang Liu

We perform evaluations on multiple 3D tasks and find that our PSF performs comparably to the standard diffusion model, outperforming other efficient 3D point cloud generation methods.

Point Cloud Completion

Temporally Consistent Online Depth Estimation in Dynamic Scenes

no code implementations 17 Nov 2021 Zhaoshuo Li, Wei Ye, Dilin Wang, Francis X. Creighton, Russell H. Taylor, Ganesh Venkatesh, Mathias Unberath

We present a framework named Consistent Online Dynamic Depth (CODD) to produce temporally consistent depth estimates in dynamic scenes in an online setting.

Stereo Depth Estimation

Multi-Scale High-Resolution Vision Transformer for Semantic Segmentation

1 code implementation CVPR 2022 Jiaqi Gu, Hyoukjun Kwon, Dilin Wang, Wei Ye, Meng Li, Yu-Hsin Chen, Liangzhen Lai, Vikas Chandra, David Z. Pan

Therefore, we propose HRViT, which enhances ViTs to learn semantically-rich and spatially-precise multi-scale representations by integrating high-resolution multi-branch architectures with ViTs.

Image Classification Representation Learning +3

NASViT: Neural Architecture Search for Efficient Vision Transformers with Gradient Conflict aware Supernet Training

1 code implementation ICLR 2022 Chengyue Gong, Dilin Wang, Meng Li, Xinlei Chen, Zhicheng Yan, Yuandong Tian, Qiang Liu, Vikas Chandra

In this work, we observe that the poor performance is due to a gradient conflict issue: the gradients of different sub-networks conflict with that of the supernet more severely in ViTs than CNNs, which leads to early saturation in training and inferior convergence.

Data Augmentation Image Classification +2

Vision Transformers with Patch Diversification

1 code implementation 26 Apr 2021 Chengyue Gong, Dilin Wang, Meng Li, Vikas Chandra, Qiang Liu

To alleviate this problem, in this work, we introduce novel loss functions in vision transformer training to explicitly encourage diversity across patch representations for more discriminative feature extraction.

Image Classification Semantic Segmentation

AlphaNet: Improved Training of Supernets with Alpha-Divergence

2 code implementations 16 Feb 2021 Dilin Wang, Chengyue Gong, Meng Li, Qiang Liu, Vikas Chandra

Weight-sharing NAS builds a supernet that assembles all the architectures as its sub-networks and jointly trains the supernet with the sub-networks.

Image Classification Neural Architecture Search

AlphaMatch: Improving Consistency for Semi-supervised Learning with Alpha-divergence

no code implementations CVPR 2021 Chengyue Gong, Dilin Wang, Qiang Liu

Semi-supervised learning (SSL) is a key approach toward more data-efficient machine learning by jointly leveraging both labeled and unlabeled data.

AttentiveNAS: Improving Neural Architecture Search via Attentive Sampling

2 code implementations CVPR 2021 Dilin Wang, Meng Li, Chengyue Gong, Vikas Chandra

Our discovered model family, AttentiveNAS models, achieves top-1 accuracy from 77.3% to 80.7% on ImageNet, and outperforms SOTA models, including BigNAS and Once-for-All networks.

Neural Architecture Search

Stein Variational Gradient Descent With Matrix-Valued Kernels

1 code implementation NeurIPS 2019 Dilin Wang, Ziyang Tang, Chandrajit Bajaj, Qiang Liu

Stein variational gradient descent (SVGD) is a particle-based inference algorithm that leverages gradient information for efficient approximate inference.

Bayesian Inference

Energy-Aware Neural Architecture Optimization with Fast Splitting Steepest Descent

1 code implementation ICLR 2020 Dilin Wang, Meng Li, Lemeng Wu, Vikas Chandra, Qiang Liu

Designing energy-efficient networks is of critical importance for enabling state-of-the-art deep learning in mobile and edge settings where the computation and energy budgets are highly limited.

Splitting Steepest Descent for Growing Neural Architectures

1 code implementation NeurIPS 2019 Qiang Liu, Lemeng Wu, Dilin Wang

We develop a progressive training approach for neural networks which adaptively grows the network structure by splitting existing neurons into multiple offspring.

Improving Neural Language Modeling via Adversarial Training

1 code implementation 10 Jun 2019 Dilin Wang, Chengyue Gong, Qiang Liu

Theoretically, we show that our adversarial mechanism effectively encourages the diversity of the embedding vectors, helping to increase the robustness of models.

Language Modelling Machine Translation +1

Variational Inference with Tail-adaptive f-Divergence

1 code implementation NeurIPS 2018 Dilin Wang, Hao Liu, Qiang Liu

Variational inference with α-divergences has been widely used in modern probabilistic machine learning.

Variational Inference

Stein Variational Gradient Descent as Moment Matching

no code implementations NeurIPS 2018 Qiang Liu, Dilin Wang

Stein variational gradient descent (SVGD) is a non-parametric inference algorithm that evolves a set of particles to fit a given distribution of interest.

Stein Variational Message Passing for Continuous Graphical Models

no code implementations ICML 2018 Dilin Wang, Zhe Zeng, Qiang Liu

We propose a novel distributed inference algorithm for continuous graphical models, by extending Stein variational gradient descent (SVGD) to leverage the Markov dependency structure of the distribution of interest.

Learning to Draw Samples with Amortized Stein Variational Gradient Descent

no code implementations 20 Jul 2017 Yihao Feng, Dilin Wang, Qiang Liu

We propose a simple algorithm to train stochastic neural networks to draw samples from given target distributions for probabilistic inference.

Bayesian Inference

Learning Deep Energy Models: Contrastive Divergence vs. Amortized MLE

no code implementations 4 Jul 2017 Qiang Liu, Dilin Wang

We propose a number of new algorithms for learning deep energy models and demonstrate their properties.

Learning to Draw Samples: With Application to Amortized MLE for Generative Adversarial Learning

1 code implementation 6 Nov 2016 Dilin Wang, Qiang Liu

We propose a simple algorithm to train stochastic neural networks to draw samples from given target distributions for probabilistic inference.

Ranked #19 on Conditional Image Generation on CIFAR-10 (Inception score metric)

Conditional Image Generation

Stein Variational Gradient Descent: A General Purpose Bayesian Inference Algorithm

13 code implementations NeurIPS 2016 Qiang Liu, Dilin Wang

We propose a general purpose variational inference algorithm that forms a natural counterpart of gradient descent for optimization.

Bayesian Inference Variational Inference
