Search Results for author: Xiaoming Wei

Found 22 papers, 14 papers with code

BEM: Balanced and Entropy-based Mix for Long-Tailed Semi-Supervised Learning

no code implementations • 1 Apr 2024 • Hongwei Zheng, Linyuan Zhou, Han Li, Jinming Su, Xiaoming Wei, Xiaoming Xu

To this end, this paper introduces the Balanced and Entropy-based Mix (BEM), a pioneering mixing approach to re-balance the class distribution of both data quantity and uncertainty.

ODM: A Text-Image Further Alignment Pre-training Approach for Scene Text Detection and Spotting

no code implementations • 1 Mar 2024 • Chen Duan, Pei Fu, Shan Guo, Qianyi Jiang, Xiaoming Wei

With ODM, we achieve better alignment between text and OCR-Text and enable pre-trained models to adapt to the complex and diverse styles of scene text detection and spotting tasks.

Optical Character Recognition Optical Character Recognition (OCR) +2

Enriching Phrases with Coupled Pixel and Object Contexts for Panoptic Narrative Grounding

no code implementations • 2 Nov 2023 • Tianrui Hui, Zihan Ding, Junshi Huang, Xiaoming Wei, Xiaolin Wei, Jiao Dai, Jizhong Han, Si Liu

Panoptic narrative grounding (PNG) aims to segment things and stuff objects in an image described by noun phrases of a narrative caption.

Object

Orthogonal Temporal Interpolation for Zero-Shot Video Recognition

1 code implementation • 14 Aug 2023 • Yan Zhu, Junbao Zhuo, Bin Ma, Jiajia Geng, Xiaoming Wei, Xiaolin Wei, Shuhui Wang

We propose a model called OTI for ZSVR by employing orthogonal temporal interpolation and the matching loss based on VLMs.

Ranked #1 on Zero-Shot Action Recognition on UCF101

Video Recognition Zero-Shot Action Recognition +2

EfficientRep: An Efficient Repvgg-style ConvNets with Hardware-aware Neural Network Design

1 code implementation • 1 Feb 2023 • Kaiheng Weng, Xiangxiang Chu, Xiaoming Xu, Junshi Huang, Xiaoming Wei

Thus, how to design a neural network to efficiently use the computing ability and memory bandwidth of hardware is a critical problem.

object-detection Object Detection

5,530 GitHub stars

Elastic Aggregation for Federated Optimization

1 code implementation • CVPR 2023 • Dengsheng Chen, Jie Hu, Vince Junkai Tan, Xiaoming Wei, Enhua Wu

Federated learning enables the privacy-preserving training of neural network models using real-world data across distributed clients.

Federated Learning Privacy Preserving

386 GitHub stars

Bridging Search Region Interaction With Template for RGB-T Tracking

1 code implementation • CVPR 2023 • Tianrui Hui, Zizheng Xun, Fengguang Peng, Junshi Huang, Xiaoming Wei, Xiaolin Wei, Jiao Dai, Jizhong Han, Si Liu

To alleviate these limitations, we propose a novel Template-Bridged Search region Interaction (TBSI) module which exploits templates as the medium to bridge the cross-modal interaction between RGB and TIR search regions by gathering and distributing target-relevant object and environment contexts.

Ranked #2 on Rgb-T Tracking on RGBT210

Rgb-T Tracking Template Matching

Masked Auto-Encoders Meet Generative Adversarial Networks and Beyond

1 code implementation • CVPR 2023 • Zhengcong Fei, Mingyuan Fan, Li Zhu, Junshi Huang, Xiaoming Wei, Xiaolin Wei

In this paper, we introduce a novel Generative Adversarial Networks alike framework, referred to as GAN-MAE, where a generator is used to generate the masked patches according to the remaining visible patches, and a discriminator is employed to predict whether the patch is synthesized by the generator.

Representation Learning

Uncertainty-Aware Image Captioning

no code implementations • 30 Nov 2022 • Zhengcong Fei, Mingyuan Fan, Li Zhu, Junshi Huang, Xiaoming Wei, Xiaolin Wei

It is well believed that the higher uncertainty in a word of the caption, the more inter-correlated context information is required to determine it.

Caption Generation Image Captioning +1

Meta-Ensemble Parameter Learning

no code implementations • 5 Oct 2022 • Zhengcong Fei, Shuman Tian, Junshi Huang, Xiaoming Wei, Xiaolin Wei

Knowledge distillation is an approach that allows a single model to efficiently capture the approximate performance of an ensemble while showing poor scalability as demand for re-training when introducing new teacher models.

Knowledge Distillation Meta-Learning

Rethinking skip connection model as a learnable Markov chain

1 code implementation • 30 Sep 2022 • Dengsheng Chen, Jie Hu, Wenwen Qiang, Xiaoming Wei, Enhua Wu

In this work, we deep dive into the model's behaviors with skip connections which can be formulated as a learnable Markov chain.

YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications

7 code implementations • 7 Sep 2022 • Chuyi Li, Lulu Li, Hongliang Jiang, Kaiheng Weng, Yifei Geng, Liang Li, Zaidan Ke, Qingyuan Li, Meng Cheng, Weiqiang Nie, Yiduo Li, Bo Zhang, Yufei Liang, Linyuan Zhou, Xiaoming Xu, Xiangxiang Chu, Xiaoming Wei, Xiaolin Wei

The YOLO community has prospered overwhelmingly to enrich its use in a multitude of hardware platforms and abundant scenarios.

Ranked #14 on Object Detection on COCO-O

Object Detection Quantization

12,041 GitHub stars

PPMN: Pixel-Phrase Matching Network for One-Stage Panoptic Narrative Grounding

1 code implementation • 11 Aug 2022 • Zihan Ding, Zi-han Ding, Tianrui Hui, Junshi Huang, Xiaoming Wei, Xiaolin Wei, Si Liu

To alleviate these drawbacks, we propose a one-stage end-to-end Pixel-Phrase Matching Network (PPMN), which directly matches each phrase to its corresponding pixels instead of region proposals and outputs panoptic segmentation by simple combination.

Panoptic Segmentation Segmentation +1

Efficient Modeling of Future Context for Image Captioning

1 code implementation • 22 Jul 2022 • Zhengcong Fei, Junshi Huang, Xiaoming Wei, Xiaolin Wei

Existing approaches to image captioning usually generate the sentence word-by-word from left to right, with the constraint of conditioned on local context including the given image and history generated words.

Image Captioning Sentence +1

Language-Bridged Spatial-Temporal Interaction for Referring Video Object Segmentation

1 code implementation • CVPR 2022 • Zihan Ding, Tianrui Hui, Junshi Huang, Xiaoming Wei, Jizhong Han, Si Liu

Referring video object segmentation aims to predict foreground labels for objects referred by natural language expressions in videos.

Ranked #6 on Referring Video Object Segmentation on MeViS

Denoising Referring Video Object Segmentation +2

Embedded Discriminative Attention Mechanism for Weakly Supervised Semantic Segmentation

1 code implementation • CVPR 2021 • Tong Wu, Junshi Huang, Guangyu Gao, Xiaoming Wei, Xiaolin Wei, Xuan Luo, Chi Harold Liu

In inference, we directly use the activation masks from the DA layer as pseudo-labels for segmentation.

Segmentation Weakly supervised Semantic Segmentation +1

Structure Guided Lane Detection

1 code implementation • 12 May 2021 • Jinming Su, Chao Chen, Ke Zhang, Junfeng Luo, Xiaoming Wei, Xiaolin Wei

Next, multi-level structural constraints are used to improve the perception of lanes.

Ranked #29 on Lane Detection on CULane

Autonomous Driving Lane Detection

Rethinking BiSeNet For Real-time Semantic Segmentation

6 code implementations • CVPR 2021 • Mingyuan Fan, Shenqi Lai, Junshi Huang, Xiaoming Wei, Zhenhua Chai, Junfeng Luo, Xiaolin Wei

BiSeNet has been proved to be a popular two-stream network for real-time segmentation.

Ranked #8 on Real-Time Semantic Segmentation on Cityscapes test

Dichotomous Image Segmentation Image Classification +3

8,238 GitHub stars

Large Scale Visual Food Recognition

no code implementations • 30 Mar 2021 • Weiqing Min, Zhiling Wang, Yuxin Liu, Mengjiang Luo, Liping Kang, Xiaoming Wei, Xiaolin Wei, Shuqiang Jiang

Food2K can be further explored to benefit more food-relevant tasks including emerging and more complex ones (e.g., nutritional understanding of food), and the trained models on Food2K can be expected as backbones to improve the performance of more food-relevant tasks.

Fine-Grained Visual Recognition Food Recognition +3

Rethinking the Optimization of Average Precision: Only Penalizing Negative Instances before Positive Ones is Enough

2 code implementations • 9 Feb 2021 • Zhuo Li, Weiqing Min, Jiajun Song, Yaohui Zhu, Liping Kang, Xiaoming Wei, Xiaolin Wei, Shuqiang Jiang

Limited by the definition of AP, such methods consider both negative and positive instances ranking before each positive instance.

Ranked #3 on Vehicle Re-Identification on VehicleID Large

Image Retrieval Retrieval +1

ISIA Food-500: A Dataset for Large-Scale Food Recognition via Stacked Global-Local Attention Network

no code implementations • 13 Aug 2020 • Weiqing Min, Linhu Liu, Zhiling Wang, Zhengdong Luo, Xiaoming Wei, Xiaolin Wei, Shuqiang Jiang

To encourage further progress in food recognition, we introduce the dataset ISIA Food-500 with 500 categories from the list in the Wikipedia and 399,726 images, a more comprehensive food dataset that surpasses existing popular benchmark datasets by category coverage and data volume.

Food Recognition Management

Grand Challenge of 106-Point Facial Landmark Localization

no code implementations • 9 May 2019 • Yinglu Liu, Hao Shen, Yue Si, Xiaobo Wang, Xiangyu Zhu, Hailin Shi, Zhibin Hong, Hanqi Guo, Ziyuan Guo, Yanqin Chen, Bi Li, Teng Xi, Jun Yu, Haonian Xie, Guochen Xie, Mengyan Li, Qing Lu, Zengfu Wang, Shenqi Lai, Zhenhua Chai, Xiaoming Wei

However, previous competitions on facial landmark localization (i.e., the 300-W, 300-VW and Menpo challenges) aim to predict 68-point landmarks, which are incompetent to depict the structure of facial components.

Face Alignment Face Recognition +2
