Search Results for author: Yunyang Xiong

Found 18 papers, 12 papers with code

EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything

1 code implementation • 1 Dec 2023 • Yunyang Xiong, Bala Varadarajan, Lemeng Wu, Xiaoyu Xiang, Fanyi Xiao, Chenchen Zhu, Xiaoliang Dai, Dilin Wang, Fei Sun, Forrest Iandola, Raghuraman Krishnamoorthi, Vikas Chandra

On segment anything tasks such as zero-shot instance segmentation, our EfficientSAMs with SAMI-pretrained lightweight image encoders perform favorably, with a significant gain (e.g., ~4 AP on COCO/LVIS) over other fast SAM models.

Image Classification, Instance Segmentation, +5 more

MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning

1 code implementation • 14 Oct 2023 • Jun Chen, Deyao Zhu, Xiaoqian Shen, Xiang Li, Zechun Liu, Pengchuan Zhang, Raghuraman Krishnamoorthi, Vikas Chandra, Yunyang Xiong, Mohamed Elhoseiny

Motivated by this, we aim to build a unified interface for completing many vision-language tasks, including image description, visual question answering, and visual grounding, among others.

Language Modelling, Large Language Model, +4 more

Self-positioning Point-based Transformer for Point Cloud Understanding

1 code implementation • CVPR 2023 • Jinyoung Park, Sanghyeok Lee, Sihyeon Kim, Yunyang Xiong, Hyunwoo J. Kim

In this paper, we present a Self-Positioning point-based Transformer (SPoTr), which is designed to capture both local and global shape contexts with reduced complexity.

3D Part Segmentation, 3D Point Cloud Classification, +1 more

PathFusion: Path-consistent Lidar-Camera Deep Feature Fusion

no code implementations • 12 Dec 2022 • Lemeng Wu, Dilin Wang, Meng Li, Yunyang Xiong, Raghuraman Krishnamoorthi, Qiang Liu, Vikas Chandra

Fusing 3D LiDAR features with 2D camera features is a promising technique for enhancing the accuracy of 3D detection, thanks to their complementary physical properties.

Fast Point Cloud Generation with Straight Flows

1 code implementation • CVPR 2023 • Lemeng Wu, Dilin Wang, Chengyue Gong, Xingchao Liu, Yunyang Xiong, Rakesh Ranjan, Raghuraman Krishnamoorthi, Vikas Chandra, Qiang Liu

We perform evaluations on multiple 3D tasks and find that our PSF performs comparably to the standard diffusion model, outperforming other efficient 3D point cloud generation methods.

Point Cloud Completion

Castling-ViT: Compressing Self-Attention via Switching Towards Linear-Angular Attention at Vision Transformer Inference

1 code implementation • CVPR 2023 • Haoran You, Yunyang Xiong, Xiaoliang Dai, Bichen Wu, Peizhao Zhang, Haoqi Fan, Peter Vajda, Yingyan Lin

Vision Transformers (ViTs) have shown impressive performance but still require a high computation cost compared to convolutional neural networks (CNNs). One reason is that ViTs' attention measures global similarities and thus has quadratic complexity in the number of input tokens.

Efficient ViTs

SageMix: Saliency-Guided Mixup for Point Clouds

1 code implementation • 13 Oct 2022 • Sanghyeok Lee, Minkyu Jeon, Injae Kim, Yunyang Xiong, Hyunwoo J. Kim

Mixup is a simple and widely-used data augmentation technique that has proven effective in alleviating the problems of overfitting and data scarcity.

3D Part Segmentation, 3D Point Cloud Classification, +3 more

You Only Sample (Almost) Once: Linear Cost Self-Attention Via Bernoulli Sampling

1 code implementation • 18 Nov 2021 • Zhanpeng Zeng, Yunyang Xiong, Sathya N. Ravi, Shailesh Acharya, Glenn Fung, Vikas Singh

In this paper, we show that a Bernoulli sampling attention mechanism based on Locality Sensitive Hashing (LSH) decreases the quadratic complexity of such models to linear.

MobileDets: Searching for Object Detection Architectures for Mobile Accelerators

4 code implementations • CVPR 2021 • Yunyang Xiong, Hanxiao Liu, Suyog Gupta, Berkin Akin, Gabriel Bender, Yongzhe Wang, Pieter-Jan Kindermans, Mingxing Tan, Vikas Singh, Bo Chen

By incorporating regular convolutions in the search space and directly optimizing the network architectures for object detection, we obtain a family of object detection models, MobileDets, that achieve state-of-the-art results across mobile accelerators.

Neural Architecture Search, Object, +2 more

Mixed Effects Neural Networks (MeNets) With Applications to Gaze Estimation

1 code implementation • CVPR 2019 • Yunyang Xiong, Hyunwoo J. Kim, Vikas Singh

The nature of this data suggests that better estimation may be possible if the model explicitly made use of such "repeated measurements" from each user, as is commonly done in classical statistical analysis using so-called mixed effects models.

Gaze Estimation

Resource Constrained Neural Network Architecture Search: Will a Submodularity Assumption Help?

1 code implementation • ICCV 2019 • Yunyang Xiong, Ronak Mehta, Vikas Singh

In the latter case, the optimization is often non-differentiable and also not very amenable to derivative-free optimization methods.

Neural Architecture Search

ANTNets: Mobile Convolutional Neural Networks for Resource Efficient Image Classification

no code implementations • 7 Apr 2019 • Yunyang Xiong, Hyunwoo J. Kim, Varsha Hedau

It boosts the representational power by modeling, in a high dimensional space, interdependency of channels between a depthwise convolution layer and a projection layer in the ANTBlocks.

Classification, General Classification, +1 more

Building Bayesian Neural Networks with Blocks: On Structure, Interpretability and Uncertainty

no code implementations • 10 Jun 2018 • Hao Henry Zhou, Yunyang Xiong, Vikas Singh

We provide simple schemes to build Bayesian Neural Networks (BNNs), block by block, inspired by a recent idea of computation skeletons.

Gaussian Processes, Variational Inference

Filter Flow Made Practical: Massively Parallel and Lock-Free

1 code implementation • CVPR 2017 • Sathya N. Ravi, Yunyang Xiong, Lopamudra Mukherjee, Vikas Singh

This paper is inspired by relatively recent work by Seitz and Baker, which introduced the so-called Filter Flow model.

Optical Flow Estimation
