Search Results for author: Hengduo Li

Found 17 papers, 8 papers with code

Multi-Glimpse LSTM with Color-Depth Feature Fusion for Human Detection

no code implementations 3 Nov 2017 Hengduo Li, Jun Liu, Guyue Zhang, Yuan Gao, Yirui Wu

In this paper, we propose a new Multi-Glimpse LSTM (MG-LSTM) network, in which multi-scale contextual information is sequentially integrated to promote the human detection performance.

Human Detection

R-FCN-3000 at 30fps: Decoupling Detection and Classification

2 code implementations CVPR 2018 Bharat Singh, Hengduo Li, Abhishek Sharma, Larry S. Davis

Our approach is a modification of the R-FCN architecture in which position-sensitive filters are shared across different object classes for performing localization.

Classification General Classification +2

An Analysis of Pre-Training on Object Detection

no code implementations 11 Apr 2019 Hengduo Li, Bharat Singh, Mahyar Najibi, Zuxuan Wu, Larry S. Davis

We analyze how well their features generalize to tasks like image classification, semantic segmentation and object detection on small datasets like PASCAL-VOC, Caltech-256, SUN-397, Flowers-102 etc.

Avg Classification +6

Transferable Clean-Label Poisoning Attacks on Deep Neural Nets

1 code implementation 15 May 2019 Chen Zhu, W. Ronny Huang, Ali Shafahi, Hengduo Li, Gavin Taylor, Christoph Studer, Tom Goldstein

Clean-label poisoning attacks inject innocuous looking (and "correctly" labeled) poison images into training data, causing a model to misclassify a targeted image after being trained on this data.

Transfer Learning

Improved Training of Certifiably Robust Models

no code implementations 25 Sep 2019 Chen Zhu, Renkun Ni, Ping-Yeh Chiang, Hengduo Li, Furong Huang, Tom Goldstein

Convex relaxations are effective for training and certifying neural networks against norm-bounded adversarial attacks, but they leave a large gap between certifiable and empirical (PGD) robustness.

Learning from Noisy Anchors for One-stage Object Detection

1 code implementation CVPR 2020 Hengduo Li, Zuxuan Wu, Chen Zhu, Caiming Xiong, Richard Socher, Larry S. Davis

State-of-the-art object detectors rely on regressing and classifying an extensive list of possible anchors, which are divided into positive and negative samples based on their intersection-over-union (IoU) with corresponding groundtruth objects.

Classification General Classification +3

Improving the Tightness of Convex Relaxation Bounds for Training Certifiably Robust Classifiers

no code implementations 22 Feb 2020 Chen Zhu, Renkun Ni, Ping-Yeh Chiang, Hengduo Li, Furong Huang, Tom Goldstein

Convex relaxations are effective for training and certifying neural networks against norm-bounded adversarial attacks, but they leave a large gap between certifiable and empirical robustness.

Rethinking Pseudo Labels for Semi-Supervised Object Detection

no code implementations 1 Jun 2021 Hengduo Li, Zuxuan Wu, Abhinav Shrivastava, Larry S. Davis

In this paper, we introduce certainty-aware pseudo labels tailored for object detection, which can effectively estimate the classification and localization quality of derived pseudo labels.

Classification Image Classification +4

Efficient Video Transformers with Spatial-Temporal Token Selection

1 code implementation 23 Nov 2021 Junke Wang, Xitong Yang, Hengduo Li, Li Liu, Zuxuan Wu, Yu-Gang Jiang

Video transformers have achieved impressive results on major video recognition benchmarks, but they suffer from high computational cost.

Video Recognition

AdaViT: Adaptive Vision Transformers for Efficient Image Recognition

no code implementations CVPR 2022 Lingchen Meng, Hengduo Li, Bor-Chun Chen, Shiyi Lan, Zuxuan Wu, Yu-Gang Jiang, Ser-Nam Lim

To this end, we introduce AdaViT, an adaptive computation framework that learns to derive usage policies on which patches, self-attention heads and transformer blocks to use throughout the backbone on a per-input basis, aiming to improve inference efficiency of vision transformers with a minimal drop of accuracy for image recognition.

Semi-Supervised Single-View 3D Reconstruction via Prototype Shape Priors

1 code implementation 30 Sep 2022 Zhen Xing, Hengduo Li, Zuxuan Wu, Yu-Gang Jiang

In particular, we introduce an attention-guided prototype shape prior module for guiding realistic object reconstruction.

3D Reconstruction Object Reconstruction +2

BMB: Balanced Memory Bank for Imbalanced Semi-supervised Learning

no code implementations 22 May 2023 Wujian Peng, Zejia Weng, Hengduo Li, Zuxuan Wu

Exploring a substantial amount of unlabeled data, semi-supervised learning (SSL) boosts the recognition performance when only a limited number of labels are provided.

SEGIC: Unleashing the Emergent Correspondence for In-Context Segmentation

1 code implementation 24 Nov 2023 Lingchen Meng, Shiyi Lan, Hengduo Li, Jose M. Alvarez, Zuxuan Wu, Yu-Gang Jiang

In-context segmentation aims to segment novel images using a few labeled example images, termed "in-context examples", by exploring content similarities between the examples and the target.

Meta-Learning One-Shot Segmentation +3

MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding

1 code implementation 8 Apr 2024 Bo He, Hengduo Li, Young Kyun Jang, Menglin Jia, Xuefei Cao, Ashish Shah, Abhinav Shrivastava, Ser-Nam Lim

However, existing LLM-based large multimodal models (e.g., Video-LLaMA, VideoChat) can only take in a limited number of frames for short video understanding.

Question Answering Video Captioning +4
