Search Results for author: Fangyun Wei

Found 45 papers, 33 papers with code

GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond

9 code implementations • 25 Apr 2019 Yue Cao, Jiarui Xu, Stephen Lin, Fangyun Wei, Han Hu

In this paper, we take advantage of this finding to create a simplified network based on a query-independent formulation, which maintains the accuracy of NLNet but with significantly less computation.

Instance Segmentation Object Detection +1

Design and Interpretation of Universal Adversarial Patches in Face Detection

no code implementations ECCV 2020 Xiao Yang, Fangyun Wei, Hongyang Zhang, Jun Zhu

We consider universal adversarial patches for faces -- small visual elements whose addition to a face image reliably destroys the performance of face detectors.

Face Detection

Point-Set Anchors for Object Detection, Instance Segmentation and Pose Estimation

1 code implementation ECCV 2020 Fangyun Wei, Xiao Sun, Hongyang Li, Jingdong Wang, Stephen Lin

A recent approach for object detection and human pose estimation is to regress bounding boxes or human keypoints from a central point on the object or person.

Instance Segmentation Object +5

Restoring Negative Information in Few-Shot Object Detection

1 code implementation NeurIPS 2020 Yukuan Yang, Fangyun Wei, Miaojing Shi, Guoqi Li

In this paper, we restore the negative information in few-shot object detection by introducing a new negative- and positive-representative based metric learning framework and a new inference scheme with negative and positive representatives.

Few-Shot Learning Few-Shot Object Detection +4

Global Context Networks

3 code implementations • 24 Dec 2020 Yue Cao, Jiarui Xu, Stephen Lin, Fangyun Wei, Han Hu

The Non-Local Network (NLNet) presents a pioneering approach for capturing long-range dependencies within an image, via aggregating query-specific global context to each query position.

Instance Segmentation Object Detection

High-Fidelity and Arbitrary Face Editing

no code implementations CVPR 2021 Yue Gao, Fangyun Wei, Jianmin Bao, Shuyang Gu, Dong Chen, Fang Wen, Zhouhui Lian

However, we observe that the generator tends to find a tricky way to hide information from the original image to satisfy the constraint of cycle consistency, making it impossible to maintain the rich details (e.g., wrinkles and moles) of non-editing areas.

Attribute Vocal Bursts Intensity Prediction

Aligning Pretraining for Detection via Object-Level Contrastive Learning

1 code implementation NeurIPS 2021 Fangyun Wei, Yue Gao, Zhirong Wu, Han Hu, Stephen Lin

Image-level contrastive representation learning has proven to be highly effective as a generic model for transfer learning.

Contrastive Learning Object +6

End-to-End Semi-Supervised Object Detection with Soft Teacher

8 code implementations ICCV 2021 Mengde Xu, Zheng Zhang, Han Hu, JianFeng Wang, Lijuan Wang, Fangyun Wei, Xiang Bai, Zicheng Liu

This paper presents an end-to-end semi-supervised object detection approach, in contrast to previous more complex multi-stage methods.

Instance Segmentation object-detection +4

Dual Path Learning for Domain Adaptation of Semantic Segmentation

1 code implementation ICCV 2021 Yiting Cheng, Fangyun Wei, Jianmin Bao, Dong Chen, Fang Wen, Wenqiang Zhang

In this paper, based on the observation that domain adaptation frameworks performed in the source and target domain are almost complementary in terms of image translation and SSL, we propose a novel dual path learning (DPL) framework to alleviate visual inconsistency.

Domain Adaptation Segmentation +4

ADNet: Leveraging Error-Bias Towards Normal Direction in Face Alignment

1 code implementation ICCV 2021 Yangyu Huang, Hao Yang, Chong Li, Jongyoo Kim, Fangyun Wei

On the other hand, AAM is an attention module that produces an anisotropic attention mask focusing on the region around a point and the local edge connecting it to adjacent points; it responds more strongly in the tangent direction than in the normal direction, which means relaxed constraints along the tangent.

Face Alignment

Self-supervised Discovery of Human Actons from Long Kinematic Videos

no code implementations • 29 Sep 2021 Kenneth Li, Xiao Sun, Zhirong Wu, Fangyun Wei, Stephen Lin

However, methods for understanding short semantic actions cannot be directly translated to long kinematic sequences such as dancing, where it becomes challenging even to semantically label the human movements.

Action Understanding Sentence

Particle Based Stochastic Policy Optimization

no code implementations • 29 Sep 2021 Qiwei Ye, Yuxuan Song, Chang Liu, Fangyun Wei, Tao Qin, Tie-Yan Liu

Stochastic policies have been widely applied for their good properties in exploration and uncertainty quantification.

MuJoCo Games Offline RL +2

Semi-Supervised Semantic Segmentation via Adaptive Equalization Learning

1 code implementation NeurIPS 2021 Hanzhe Hu, Fangyun Wei, Han Hu, Qiwei Ye, Jinshi Cui, LiWei Wang

The confidence bank is leveraged as an indicator to tilt training towards under-performing categories, instantiated in three strategies: 1) adaptive Copy-Paste and CutMix data augmentation approaches which give more chance for under-performing categories to be copied or cut; 2) an adaptive data sampling approach to encourage pixels from under-performing categories to be sampled; 3) a simple yet effective re-weighting method to alleviate the training noise introduced by pseudo-labeling.

Data Augmentation Semi-Supervised Semantic Segmentation
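The adaptive data-sampling strategy above can be sketched as confidence-weighted sampling: classes the model is less confident about get proportionally more chances to be sampled. This is a minimal illustration; the function name, data layout, and weighting are assumptions, not the paper's implementation.

```python
import random

def adaptive_sample(pixels_by_class, confidence, k, rng=None):
    """Sample k pixels, favoring classes with low model confidence.

    pixels_by_class: dict mapping class id -> list of pixel indices
    confidence: dict mapping class id -> running confidence in [0, 1]
    """
    rng = rng or random.Random(0)
    # Weight each class by (1 - confidence): under-performing
    # categories get proportionally more chances to be sampled.
    classes = list(pixels_by_class)
    weights = [1.0 - confidence[c] for c in classes]
    sampled = []
    for _ in range(k):
        c = rng.choices(classes, weights=weights)[0]
        sampled.append((c, rng.choice(pixels_by_class[c])))
    return sampled
```

With a high-confidence class (0.9) and a low-confidence one (0.1), roughly nine out of ten draws come from the low-confidence class, tilting training toward it.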

Bootstrap Your Object Detector via Mixed Training

1 code implementation NeurIPS 2021 Mengde Xu, Zheng Zhang, Fangyun Wei, Yutong Lin, Yue Cao, Stephen Lin, Han Hu, Xiang Bai

We introduce MixTraining, a new training paradigm for object detection that can improve the performance of existing detectors for free.

Data Augmentation Missing Labels +3

Towards Tokenized Human Dynamics Representation

1 code implementation • 22 Nov 2021 Kenneth Li, Xiao Sun, Zhirong Wu, Fangyun Wei, Stephen Lin

For human action understanding, a popular research direction is to analyze short video clips with unambiguous semantic content, such as jumping and drinking.

Action Segmentation Action Understanding +3

Cross-Model Pseudo-Labeling for Semi-Supervised Action Recognition

no code implementations CVPR 2022 Yinghao Xu, Fangyun Wei, Xiao Sun, Ceyuan Yang, Yujun Shen, Bo Dai, Bolei Zhou, Stephen Lin

Typically in recent work, the pseudo-labels are obtained by training a model on the labeled data, and then using confident predictions from the model to teach itself.

Action Recognition
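The self-teaching scheme described above (train on labeled data, then keep only confident predictions as pseudo-labels) can be sketched as a simple confidence-thresholded filter; `predict` and the threshold value are illustrative placeholders, not the paper's API.

```python
def pseudo_label(predict, unlabeled, threshold=0.95):
    """Keep only confident predictions as pseudo-labels.

    predict: callable returning (label, confidence) for a sample.
    unlabeled: iterable of unlabeled samples.
    """
    labeled = []
    for x in unlabeled:
        label, conf = predict(x)
        if conf >= threshold:  # teach only where the model is confident
            labeled.append((x, label))
    return labeled
```

The filtered pairs are then mixed with the original labeled set for the next round of training.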

A Simple Baseline for Open-Vocabulary Semantic Segmentation with Pre-trained Vision-language Model

2 code implementations • 29 Dec 2021 Mengde Xu, Zheng Zhang, Fangyun Wei, Yutong Lin, Yue Cao, Han Hu, Xiang Bai

However, semantic segmentation and the CLIP model operate at different visual granularities: semantic segmentation makes predictions at the pixel level, while CLIP operates on whole images.

Image Classification Language Modelling +8

A Simple Multi-Modality Transfer Learning Baseline for Sign Language Translation

4 code implementations CVPR 2022 Yutong Chen, Fangyun Wei, Xiao Sun, Zhirong Wu, Stephen Lin

Concretely, we pretrain the sign-to-gloss visual network on the general domain of human actions and the within-domain of a sign-to-gloss dataset, and pretrain the gloss-to-text translation network on the general domain of a multilingual corpus and the within-domain of a gloss-to-text corpus.

Sign Language Recognition Sign Language Translation +2

Frame-wise Action Representations for Long Videos via Sequence Contrastive Learning

1 code implementation CVPR 2022 Minghao Chen, Fangyun Wei, Chong Li, Deng Cai

In this paper, we introduce a novel contrastive action representation learning (CARL) framework to learn frame-wise action representations, especially for long videos, in a self-supervised manner.

Action Classification Contrastive Learning +4

Learning to Prompt for Open-Vocabulary Object Detection with Vision-Language Model

1 code implementation CVPR 2022 Yu Du, Fangyun Wei, Zihe Zhang, Miaojing Shi, Yue Gao, Guoqi Li

In this paper, we introduce a novel method, detection prompt (DetPro), to learn continuous prompt representations for open-vocabulary object detection based on the pre-trained vision-language model.

Image Classification Language Modelling +5

Unsupervised Prompt Learning for Vision-Language Models

1 code implementation • 7 Apr 2022 Tony Huang, Jack Chu, Fangyun Wei

In this paper, we explore a different scenario, in which the labels of the target datasets are unprovided, and we present an unsupervised prompt learning (UPL) approach to avoid prompt engineering while simultaneously improving transfer performance of CLIP-like vision-language models.

Prompt Engineering Transfer Learning

Boosting Zero-shot Learning via Contrastive Optimization of Attribute Representations

1 code implementation • 8 Jul 2022 Yu Du, Miaojing Shi, Fangyun Wei, Guoqi Li

In this paper, we propose a new framework to boost ZSL by explicitly learning attribute prototypes beyond images and contrastively optimizing them with attribute-level features within images.

Attribute Zero-Shot Learning

Conditional DETR V2: Efficient Detection Transformer with Box Queries

no code implementations • 18 Jul 2022 Xiaokang Chen, Fangyun Wei, Gang Zeng, Jingdong Wang

Inspired by Conditional DETR, an improved DETR with fast training convergence that introduced box queries (originally called spatial queries) for the internal decoder layers, we reformulate the object query as a box query: a composition of the embedding of the reference point and the transformation of the box with respect to that reference point.

Object object-detection +1

AniFaceGAN: Animatable 3D-Aware Face Image Generation for Video Avatars

1 code implementation • 12 Oct 2022 Yue Wu, Yu Deng, Jiaolong Yang, Fangyun Wei, Qifeng Chen, Xin Tong

To achieve meaningful control over facial expressions via deformation, we propose a 3D-level imitative learning scheme between the generator and a parametric 3D face model during adversarial training of the 3D-aware GAN.

Disentanglement Face Model +1

Two-Stream Network for Sign Language Recognition and Translation

1 code implementation • 2 Nov 2022 Yutong Chen, Ronglai Zuo, Fangyun Wei, Yu Wu, Shujie Liu, Brian Mak

RGB videos, however, are raw signals with substantial visual redundancy, leading the encoder to overlook the key information for sign language understanding.

Sign Language Recognition Sign Language Translation +2

Attentive Mask CLIP

1 code implementation ICCV 2023 Yifan Yang, Weiquan Huang, Yixuan Wei, Houwen Peng, Xinyang Jiang, Huiqiang Jiang, Fangyun Wei, Yin Wang, Han Hu, Lili Qiu, Yuqing Yang

To address this issue, we propose an attentive token removal approach for CLIP training, which retains tokens with a high semantic correlation to the text description.

Contrastive Learning Retrieval +1

Iterative Proposal Refinement for Weakly-Supervised Video Grounding

no code implementations CVPR 2023 Meng Cao, Fangyun Wei, Can Xu, Xiubo Geng, Long Chen, Can Zhang, Yuexian Zou, Tao Shen, Daxin Jiang

Weakly-Supervised Video Grounding (WSVG) aims to localize events of interest in untrimmed videos with only video-level annotations.

Sentence Video Grounding

TinyMIM: An Empirical Study of Distilling MIM Pre-trained Models

2 code implementations CVPR 2023 Sucheng Ren, Fangyun Wei, Zheng Zhang, Han Hu

Our TinyMIM model of tiny size achieves 79.6% top-1 accuracy on ImageNet-1K image classification, which sets a new record for small vision models of the same size and computation budget.

Image Classification Semantic Segmentation

Side Adapter Network for Open-Vocabulary Semantic Segmentation

3 code implementations CVPR 2023 Mengde Xu, Zheng Zhang, Fangyun Wei, Han Hu, Xiang Bai

A side network is attached to a frozen CLIP model with two branches: one for predicting mask proposals, and the other for predicting attention bias which is applied in the CLIP model to recognize the class of masks.

Language Modelling Open Vocabulary Semantic Segmentation +3

DeepMIM: Deep Supervision for Masked Image Modeling

1 code implementation • 15 Mar 2023 Sucheng Ren, Fangyun Wei, Samuel Albanie, Zheng Zhang, Han Hu

Deep supervision, which adds extra supervision to the intermediate features of a neural network, was widely used for image classification in the early deep-learning era, since it significantly reduces training difficulty and eases optimization, e.g., by mitigating vanishing gradients relative to vanilla training.

Image Classification object-detection +2

Two-shot Video Object Segmentation

1 code implementation CVPR 2023 Kun Yan, Xiao Li, Fangyun Wei, Jinglu Wang, Chenbin Zhang, Ping Wang, Yan Lu

The underlying idea is to generate pseudo labels for unlabeled frames during training and to optimize the model on the combination of labeled and pseudo-labeled data.

Object Pseudo Label +5

Natural Language-Assisted Sign Language Recognition

1 code implementation CVPR 2023 Ronglai Zuo, Fangyun Wei, Brian Mak

Sign languages are visual languages which convey information by signers' handshape, facial expression, body movement, and so forth.

Sign Language Recognition

CiCo: Domain-Aware Sign Language Retrieval via Cross-Lingual Contrastive Learning

1 code implementation CVPR 2023 Yiting Cheng, Fangyun Wei, Jianmin Bao, Dong Chen, Wenqiang Zhang

Our framework, termed as domain-aware sign language retrieval via Cross-lingual Contrastive learning or CiCo for short, outperforms the pioneering method by large margins on various datasets, e.g., +22.4 T2V and +28.0 V2T R@1 improvements on the How2Sign dataset, and +13.7 T2V and +17.1 V2T R@1 improvements on the PHOENIX-2014T dataset.

Contrastive Learning Retrieval +5

Improving Continuous Sign Language Recognition with Cross-Lingual Signs

no code implementations ICCV 2023 Fangyun Wei, Yutong Chen

Experimentally, our approach achieves state-of-the-art performance on two widely-used CSLR datasets: Phoenix-2014 and Phoenix-2014T.

Sign Language Recognition speech-recognition +1

AniPortraitGAN: Animatable 3D Portrait Generation from 2D Image Collections

no code implementations • 5 Sep 2023 Yue Wu, Sicheng Xu, Jianfeng Xiang, Fangyun Wei, Qifeng Chen, Jiaolong Yang, Xin Tong

For the new task, we base our method on the generative radiance manifold representation and equip it with learnable facial and head-shoulder deformations.

Exploring Non-additive Randomness on ViT against Query-Based Black-Box Attacks

no code implementations • 12 Sep 2023 Jindong Gu, Fangyun Wei, Philip Torr, Han Hu

In this work, we first taxonomize the stochastic defense strategies against QBBA.

RAIN: Your Language Models Can Align Themselves without Finetuning

1 code implementation • 13 Sep 2023 Yuhui Li, Fangyun Wei, Jinjing Zhao, Chao Zhang, Hongyang Zhang

We discover that by integrating self-evaluation and rewind mechanisms, unaligned LLMs can directly produce responses consistent with human preferences via self-boosting.

Adversarial Attack
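A loose sketch of a generate, self-evaluate, and rewind loop in the spirit of the idea above; this is greatly simplified from the paper's actual token-level search, and `propose`, `evaluate`, and the acceptance bar are all hypothetical names and values.

```python
def rain_style_generate(propose, evaluate, max_tries=8):
    """Self-evaluate candidates and rewind (regenerate) on low scores.

    propose: callable returning a candidate response string.
    evaluate: callable scoring a response in [0, 1] (self-evaluation).
    Keeps the best-scoring candidate seen if none clears the bar.
    """
    best, best_score = None, float("-inf")
    for _ in range(max_tries):
        candidate = propose()
        score = evaluate(candidate)
        if score > best_score:
            best, best_score = candidate, score
        if score >= 0.9:  # accept a sufficiently aligned response
            return best
    return best  # rewound through all tries; return best seen
```

The key point is that alignment here comes from inference-time self-evaluation and backtracking rather than any finetuning of the model weights.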

A Simple Baseline for Spoken Language to Sign Language Translation with 3D Avatars

1 code implementation • 9 Jan 2024 Ronglai Zuo, Fangyun Wei, Zenggui Chen, Brian Mak, Jiaolong Yang, Xin Tong

The objective of this paper is to develop a functional system for translating spoken languages into sign languages, referred to as Spoken2Sign translation.

Sign Language Translation Translation

Towards Online Sign Language Recognition and Translation

1 code implementation • 10 Jan 2024 Ronglai Zuo, Fangyun Wei, Brian Mak

Our approach comprises three phases: 1) developing a sign language dictionary encompassing all glosses present in a target sign language dataset; 2) training an isolated sign language recognition model on augmented signs using both conventional classification loss and our novel saliency loss; 3) employing a sliding window approach on the input sign sequence and feeding each sign clip to the well-optimized model for online recognition.

Sign Language Recognition speech-recognition +2
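Phase 3's sliding-window inference can be sketched as below, where `classify` stands in for the trained isolated-sign model, the window and stride values are arbitrary, and collapsing consecutive duplicate predictions is one simple way to turn per-clip outputs into an online gloss stream.

```python
def sliding_windows(frames, window, stride):
    """Yield overlapping clips over a frame sequence."""
    for start in range(0, max(len(frames) - window, 0) + 1, stride):
        yield frames[start:start + window]

def online_recognize(frames, classify, window=16, stride=8):
    # Feed each clip to a (hypothetical) isolated-sign classifier and
    # collapse consecutive duplicate predictions into one gloss stream.
    glosses = []
    for clip in sliding_windows(frames, window, stride):
        g = classify(clip)
        if not glosses or glosses[-1] != g:
            glosses.append(g)
    return glosses
```

Because each window is classified independently as it arrives, recognition can run online without waiting for the full sentence.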

AnyTool: Self-Reflective, Hierarchical Agents for Large-Scale API Calls

1 code implementation • 6 Feb 2024 Yu Du, Fangyun Wei, Hongyang Zhang

We also revisit the evaluation protocol introduced by previous works and identify a limitation in this protocol that leads to an artificially high pass rate.

Language Modelling Large Language Model

Rethinking Generative Large Language Model Evaluation for Semantic Comprehension

no code implementations • 12 Mar 2024 Fangyun Wei, Xi Chen, Lin Luo

Through a comprehensive evaluation of 24 models across 11 benchmarks, we highlight several potential drawbacks of MCQA, for instance, the inconsistency between the MCQA evaluation and the generation of open-ended responses in practical scenarios.

Language Modelling Large Language Model +2

Beyond Text: Frozen Large Language Models in Visual Signal Comprehension

1 code implementation • 12 Mar 2024 Lei Zhu, Fangyun Wei, Yanye Lu

To achieve this, we present the Vision-to-Language Tokenizer, abbreviated as V2T Tokenizer, which transforms an image into a "foreign language" with the combined aid of an encoder-decoder, the LLM vocabulary, and a CLIP model.

Deblurring Image Captioning +5
