Search Results for author: Wayne Zhang

Found 35 papers, 31 papers with code

Data-Driven Neuron Allocation for Scale Aggregation Networks

1 code implementation CVPR 2019 Yi Li, Zhanghui Kuang, Yimin Chen, Wayne Zhang

The most informative output neurons in each block are preserved while others are discarded, and thus neurons for multiple scales are competitively and adaptively allocated.

Image Classification object-detection +1

Fashion Retrieval via Graph Reasoning Networks on a Similarity Pyramid

no code implementations ICCV 2019 Zhanghui Kuang, Yiming Gao, Guanbin Li, Ping Luo, Yimin Chen, Liang Lin, Wayne Zhang

To address this issue, we propose a novel Graph Reasoning Network (GRNet) on a Similarity Pyramid, which learns similarities between a query and a gallery cloth by using both global and local representations in multiple scales.

Image Retrieval Retrieval

Recovery of Future Data via Convolution Nuclear Norm Minimization

1 code implementation 6 Sep 2019 Guangcan Liu, Wayne Zhang

This paper studies the problem of time series forecasting (TSF) from the perspective of compressed sensing.

Time Series Time Series Forecasting

Gradual Network for Single Image De-raining

no code implementations 20 Sep 2019 Zhe Huang, Weijiang Yu, Wayne Zhang, Litong Feng, Nong Xiao

Taking the residual result (the coarse de-rained result) between the rainy image sample (i.e., the input data) and the output of the coarse stage (i.e., the learnt rain mask) as input, the fine stage continues to de-rain by removing the fine-grained rain streaks (e.g., light rain streaks and water mist) to get a rain-free and well-reconstructed output image via a unified contextual merging sub-network with dense blocks and a merging block.

Rain Removal

Object Instance Mining for Weakly Supervised Object Detection

1 code implementation 4 Feb 2020 Chenhao Lin, Siwen Wang, Dongqi Xu, Yu Lu, Wayne Zhang

Weakly supervised object detection (WSOD) using only image-level annotations has attracted growing attention over the past few years.

Multiple Instance Learning Object +2

Scale-Equalizing Pyramid Convolution for Object Detection

2 code implementations CVPR 2020 Xinjiang Wang, Shilong Zhang, Zhuoran Yu, Litong Feng, Wayne Zhang

Inspired by this, a convolution across the pyramid level is proposed in this study, which is termed pyramid convolution and is a modified 3-D convolution.

Object object-detection +1

Maximum-and-Concatenation Networks

1 code implementation ICML 2020 Xingyu Xie, Hao Kong, Jianlong Wu, Wayne Zhang, Guangcan Liu, Zhouchen Lin

While successful in many fields, deep neural networks (DNNs) still suffer from some open problems such as bad local minima and unsatisfactory generalization performance.

RobustScanner: Dynamically Enhancing Positional Clues for Robust Text Recognition

4 code implementations ECCV 2020 Xiaoyu Yue, Zhanghui Kuang, Chenhao Lin, Hongbin Sun, Wayne Zhang

Theoretically, our proposed method, dubbed RobustScanner, decodes individual characters with a dynamic ratio between context and positional clues, and utilizes more positional ones when decoding sequences with scarce context, and thus is robust and practical.

Irregular Text Recognition Position +1

Context-Aware RCNN: A Baseline for Action Detection in Videos

3 code implementations ECCV 2020 Jianchao Wu, Zhanghui Kuang, Li-Min Wang, Wayne Zhang, Gangshan Wu

In this work, we first empirically find the recognition accuracy is highly correlated with the bounding box size of an actor, and thus higher resolution of actors contributes to better performance.

Action Detection Action Recognition

Webly Supervised Image Classification with Metadata: Automatic Noisy Label Correction via Visual-Semantic Graph

1 code implementation 12 Oct 2020 Jingkang Yang, Weirong Chen, Litong Feng, Xiaopeng Yan, Huabin Zheng, Wayne Zhang

VSGraph-LC starts from anchor selection referring to the semantic similarity between metadata and correct label concepts, and then propagates correct labels from anchors on a visual graph using a graph neural network (GNN).

General Classification Image Classification +2

Spatial Dual-Modality Graph Reasoning for Key Information Extraction

2 code implementations 26 Mar 2021 Hongbin Sun, Zhanghui Kuang, Xiaoyu Yue, Chenhao Lin, Wayne Zhang

To evaluate our proposed method comprehensively and to support future research, we release a new dataset named WildReceipt, which is collected and annotated for evaluating key information extraction from document images of unseen templates in the wild.

Key Information Extraction Template Matching

Fourier Contour Embedding for Arbitrary-Shaped Text Detection

8 code implementations CVPR 2021 Yiqin Zhu, Jianyong Chen, Lingyu Liang, Zhanghui Kuang, Lianwen Jin, Wayne Zhang

One of the main challenges for arbitrary-shaped text detection is to design a good text instance representation that allows networks to learn diverse text geometry variances.

Scene Text Detection Text Detection

WSSOD: A New Pipeline for Weakly- and Semi-Supervised Object Detection

no code implementations 21 May 2021 Shijie Fang, Yuhang Cao, Xinjiang Wang, Kai Chen, Dahua Lin, Wayne Zhang

The performance of object detection, to a great extent, depends on the availability of large annotated datasets.

object-detection Object Detection +2

Vision Transformer with Progressive Sampling

1 code implementation ICCV 2021 Xiaoyu Yue, Shuyang Sun, Zhanghui Kuang, Meng Wei, Philip Torr, Wayne Zhang, Dahua Lin

As a typical example, the Vision Transformer (ViT) directly applies a pure transformer architecture on image classification, by simply splitting images into tokens with a fixed length, and employing transformers to learn relations between these tokens.

Image Classification

Progressive Representative Labeling for Deep Semi-Supervised Learning

no code implementations 13 Aug 2021 Xiaopeng Yan, Riquan Chen, Litong Feng, Jingkang Yang, Huabin Zheng, Wayne Zhang

In this paper, we propose to label only the most representative samples to expand the labeled set.

MMOCR: A Comprehensive Toolbox for Text Detection, Recognition and Understanding

2 code implementations 14 Aug 2021 Zhanghui Kuang, Hongbin Sun, Zhizhong Li, Xiaoyu Yue, Tsui Hin Lin, Jianyong Chen, Huaqiang Wei, Yiqin Zhu, Tong Gao, Wenwei Zhang, Kai Chen, Wayne Zhang, Dahua Lin

We present MMOCR, an open-source toolbox that provides a comprehensive pipeline for text detection and recognition, as well as their downstream tasks such as named entity recognition and key information extraction.

Key Information Extraction named-entity-recognition +4

Semantically Coherent Out-of-Distribution Detection

2 code implementations ICCV 2021 Jingkang Yang, Haoqi Wang, Litong Feng, Xiaopeng Yan, Huabin Zheng, Wayne Zhang, Ziwei Liu

The proposed UDG can not only enrich the semantic knowledge of the model by exploiting unlabeled data in an unsupervised manner, but also distinguish ID/OOD samples to enhance ID classification and OOD detection tasks simultaneously.

Out-of-Distribution Detection Out of Distribution (OOD) Detection

Pseudo-mask Matters in Weakly-supervised Semantic Segmentation

2 code implementations ICCV 2021 Yi Li, Zhanghui Kuang, Liyang Liu, Yimin Chen, Wayne Zhang

To address these matters, we propose the following designs to push the performance to a new state of the art: (i) Coefficient of Variation Smoothing to smooth the CAMs adaptively; (ii) Proportional Pseudo-mask Generation to project the expanded CAMs to pseudo-mask based on a new metric indicating the importance of each class on each location, instead of the scores trained from binary classifiers.

Segmentation Weakly supervised Semantic Segmentation +1

ViM: Out-Of-Distribution with Virtual-logit Matching

2 code implementations CVPR 2022 Haoqi Wang, Zhizhong Li, Litong Feng, Wayne Zhang

Most of the existing Out-Of-Distribution (OOD) detection algorithms depend on a single input source: the feature, the logit, or the softmax probability.

Out-of-Distribution Detection

Panoptic Scene Graph Generation

1 code implementation 22 Jul 2022 Jingkang Yang, Yi Zhe Ang, Zujin Guo, Kaiyang Zhou, Wayne Zhang, Ziwei Liu

Existing research addresses scene graph generation (SGG), a critical technology for scene understanding in images, from a detection perspective, i.e., objects are detected using bounding boxes followed by prediction of their pairwise relationships.

Benchmarking Panoptic Scene Graph Generation +1

Revisiting Weak-to-Strong Consistency in Semi-Supervised Semantic Segmentation

1 code implementation CVPR 2023 Lihe Yang, Lei Qi, Litong Feng, Wayne Zhang, Yinghuan Shi

In this work, we revisit the weak-to-strong consistency framework, popularized by FixMatch from semi-supervised classification, where the prediction of a weakly perturbed image serves as supervision for its strongly perturbed version.

Semi-supervised Change Detection Semi-supervised Medical Image Segmentation +1

OpenOOD: Benchmarking Generalized Out-of-Distribution Detection

3 code implementations 13 Oct 2022 Jingkang Yang, Pengyun Wang, Dejian Zou, Zitang Zhou, Kunyuan Ding, Wenxuan Peng, Haoqi Wang, Guangyao Chen, Bo Li, Yiyou Sun, Xuefeng Du, Kaiyang Zhou, Wayne Zhang, Dan Hendrycks, Yixuan Li, Ziwei Liu

Out-of-distribution (OOD) detection is vital to safety-critical machine learning applications and has thus been extensively studied, with a plethora of methods developed in the literature.

Anomaly Detection Benchmarking +3

Get the Best of Both Worlds: Improving Accuracy and Transferability by Grassmann Class Representation

1 code implementation ICCV 2023 Haoqi Wang, Zhizhong Li, Wayne Zhang

We generalize the class vectors found in neural networks to linear subspaces (i.e., points in the Grassmann manifold) and show that the Grassmann Class Representation (GCR) enables the simultaneous improvement in accuracy and feature transferability.

Diverse Cotraining Makes Strong Semi-Supervised Segmentor

1 code implementation ICCV 2023 Yijiang Li, Xinjiang Wang, Lihe Yang, Litong Feng, Wayne Zhang, Ying Gao

Deep co-training has been introduced to semi-supervised segmentation and achieves impressive results, yet few studies have explored the working mechanism behind it.

Panoptic Video Scene Graph Generation

3 code implementations CVPR 2023 Jingkang Yang, Wenxuan Peng, Xiangtai Li, Zujin Guo, Liangyu Chen, Bo Li, Zheng Ma, Kaiyang Zhou, Wayne Zhang, Chen Change Loy, Ziwei Liu

PVSG relates to the existing video scene graph generation (VidSGG) problem, which focuses on temporal interactions between humans and objects grounded with bounding boxes in videos.

Graph Generation Panoptic Scene Graph Generation +5

RelayAttention for Efficient Large Language Model Serving with Long System Prompts

1 code implementation 22 Feb 2024 Lei Zhu, Xinjiang Wang, Wayne Zhang, Rynson W. H. Lau

Practical large language model (LLM) services may involve a long system prompt, which specifies the instructions, examples, and knowledge documents of the task and is reused across numerous requests.

Language Modelling Large Language Model
