Search Results for author: Changhu Wang

Found 48 papers, 18 papers with code

Objects in Semantic Topology

no code implementations 6 Oct 2021 Shuo Yang, Peize Sun, Yi Jiang, Xiaobo Xia, Ruiheng Zhang, Zehuan Yuan, Changhu Wang, Ping Luo, Min Xu

A more realistic object detection paradigm, Open-World Object Detection, has attracted increasing research interest in the community recently.

Incremental Learning Language Modelling +1

Meta Navigator: Search for a Good Adaptation Policy for Few-shot Learning

no code implementations ICCV 2021 Chi Zhang, Henghui Ding, Guosheng Lin, Ruibo Li, Changhu Wang, Chunhua Shen

Inspired by the recent success in the Automated Machine Learning (AutoML) literature, in this paper we present Meta Navigator, a framework that attempts to solve the aforementioned limitation in few-shot learning by seeking a higher-level strategy and proposing to automate the selection from various few-shot learning designs.

AutoML Few-Shot Learning

Memory Based Video Scene Parsing

no code implementations 1 Sep 2021 Zhenchao Jin, Dongdong Yu, Kai Su, Zehuan Yuan, Changhu Wang

Video scene parsing is a long-standing challenging task in computer vision, aiming to assign pre-defined semantic labels to pixels of all frames in a given video.

Scene Parsing Semantic Segmentation

Mining Contextual Information Beyond Image for Semantic Segmentation

1 code implementation ICCV 2021 Zhenchao Jin, Tao Gong, Dongdong Yu, Qi Chu, Jian Wang, Changhu Wang, Jie Shao

To address this, this paper proposes to mine the contextual information beyond individual images to further augment the pixel representations.

Semantic Segmentation

MT-ORL: Multi-Task Occlusion Relationship Learning

1 code implementation ICCV 2021 Panhe Feng, Qi She, Lei Zhu, Jiaxin Li, Lin Zhang, Zijian Feng, Changhu Wang, Chunpeng Li, Xuejing Kang, Anlong Ming

Retrieving occlusion relations among objects in a single image is challenging due to the sparsity of boundaries in the image.

Unifying Nonlocal Blocks for Neural Networks

1 code implementation ICCV 2021 Lei Zhu, Qi She, Duo Li, Yanye Lu, Xuejing Kang, Jie Hu, Changhu Wang

The nonlocal-based blocks are designed for capturing long-range spatial-temporal dependencies in computer vision tasks.

Action Recognition Image Classification +2

Recovering the Unbiased Scene Graphs from the Biased Ones

1 code implementation 5 Jul 2021 Meng-Jiun Chiou, Henghui Ding, Hanshu Yan, Changhu Wang, Roger Zimmermann, Jiashi Feng

Given input images, scene graph generation (SGG) aims to produce comprehensive, graphical representations describing visual relationships among salient objects.

Data Augmentation Graph Generation +2

Rethinking Re-Sampling in Imbalanced Semi-Supervised Learning

1 code implementation 1 Jun 2021 Ju He, Adam Kortylewski, Shaokang Yang, Shuai Liu, Cheng Yang, Changhu Wang, Alan Yuille

In particular, we decouple the training of the representation and the classifier, and systematically investigate the effects of different data re-sampling techniques when training the whole network including a classifier as well as fine-tuning the feature extractor only.

I2C2W: Image-to-Character-to-Word Transformers for Accurate Scene Text Recognition

no code implementations 18 May 2021 Chuhui Xue, Shijian Lu, Song Bai, Wenqing Zhang, Changhu Wang

Leveraging the advances of natural language processing, most recent scene text recognizers adopt an encoder-decoder architecture where text images are first converted to representative features and then a sequence of characters via 'direct decoding'.

Scene Text Scene Text Recognition

Center Prediction Loss for Re-identification

no code implementations 30 Apr 2021 Lu Yang, Yunlong Wang, Lingqiao Liu, Peng Wang, Lu Chi, Zehuan Yuan, Changhu Wang, Yanning Zhang

In this paper, we propose a new loss based on center predictivity, that is, a sample must be positioned in a location of the feature space such that from it we can roughly predict the location of the center of same-class samples.

ConTNet: Why not use convolution and transformer at the same time?

1 code implementation 27 Apr 2021 Haotian Yan, Zhe Li, Weijian Li, Changhu Wang, Ming Wu, Chuang Zhang

It is also worth pointing out that, given identical strong data augmentations, the performance improvement of ConTNet is more remarkable than that of ResNet.

Image Classification Object Detection

Conditional Meta-Network for Blind Super-Resolution with Multiple Degradations

no code implementations 8 Apr 2021 Guanghao Yin, Wei Wang, Zehuan Yuan, Dongdong Yu, Shouqian Sun, Changhu Wang

We extract a degradation prior at the task level with the proposed ConditionNet, which is then used to adapt the parameters of the basic SR network (BaseNet).

Image Super-Resolution

MINE: Towards Continuous Depth MPI with NeRF for Novel View Synthesis

1 code implementation ICCV 2021 Jiaxin Li, Zijian Feng, Qi She, Henghui Ding, Changhu Wang, Gim Hee Lee

In this paper, we propose MINE to perform novel view synthesis and depth estimation via dense 3D reconstruction from a single image.

3D Reconstruction Depth Estimation +1

Re-rank Coarse Classification with Local Region Enhanced Features for Fine-Grained Image Recognition

no code implementations 19 Feb 2021 Shaokang Yang, Shuai Liu, Cheng Yang, Changhu Wang

In this paper, a retrieval-based coarse-to-fine framework is proposed, where we re-rank the TopN classification results by using the local region enhanced embedding features to improve the Top1 accuracy (based on the observation that the correct category usually resides in TopN results).

Fine-Grained Image Classification Fine-Grained Image Recognition +1
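
The coarse-to-fine idea in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `rerank_top_n`, the use of cosine similarity, and the per-class prototype embeddings are all hypothetical choices made for the example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rerank_top_n(class_scores, query_embedding, class_prototypes, n=3):
    """Keep the n best classes by classifier score, then reorder them
    by similarity between the query embedding and each class prototype.

    class_scores and class_prototypes are hypothetical inputs: a dict of
    classifier scores and a dict of one embedding per class.
    """
    # Coarse step: the correct class is assumed to be among the top n.
    top_n = sorted(class_scores, key=class_scores.get, reverse=True)[:n]
    # Fine step: re-rank only those candidates with the embedding.
    return sorted(
        top_n,
        key=lambda c: cosine(query_embedding, class_prototypes[c]),
        reverse=True,
    )


# Toy data: the classifier prefers "sparrow", the embedding prefers "finch".
scores = {"sparrow": 0.40, "finch": 0.35, "wren": 0.20, "crow": 0.05}
prototypes = {
    "sparrow": [1.0, 0.0],
    "finch": [0.0, 1.0],
    "wren": [0.7, 0.7],
    "crow": [-1.0, 0.0],
}
reranked = rerank_top_n(scores, [0.1, 0.9], prototypes, n=3)
print(reranked)
```

In this toy case the Top1 prediction changes after re-ranking, while classes outside the TopN candidates are never considered.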

Incorporating Vision Bias into Click Models for Image-oriented Search Engine

no code implementations 7 Jan 2021 Ningxin Xu, Cheng Yang, Yixin Zhu, Xiaowei Hu, Changhu Wang

Most typical click models, such as PBM and UBM, assume that the probability of a document being examined by users depends only on position.
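
The position-only examination assumption mentioned in the abstract above can be sketched with the standard position-based model (PBM), in which the click probability is the product of a rank-dependent examination probability and a document attractiveness. This is only an illustration of that baseline assumption, not of the vision-bias model the paper proposes; the function name and all numeric values are hypothetical.

```python
def pbm_click_probability(attractiveness, examination_by_rank, rank):
    """P(click) = P(examined at this rank) * P(attractive | document).

    rank is 0-indexed; examination_by_rank holds one probability per rank.
    """
    return examination_by_rank[rank] * attractiveness


# Hypothetical examination probabilities that decay with rank.
examination_by_rank = [1.0, 0.6, 0.3]

# The same document (attractiveness 0.5) shown at the top and at rank 2.
p_top = pbm_click_probability(0.5, examination_by_rank, 0)
p_low = pbm_click_probability(0.5, examination_by_rank, 2)
print(p_top, p_low)
```

Under this assumption, an identical document receives fewer clicks lower on the page purely because of its position, regardless of how it looks.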

A Unified Framework to Analyze and Design the Nonlocal Blocks for Neural Networks

no code implementations 1 Jan 2021 Lei Zhu, Qi She, Changhu Wang

When choosing the Chebyshev graph filter, a generalized formulation can be derived for explaining the existing nonlocal-based blocks (e.g., nonlocal block, nonlocal stage, double attention block) and used to analyze their irrationality.

Action Recognition Fine-Grained Image Classification

Domain-Invariant Disentangled Network for Generalizable Object Detection

no code implementations ICCV 2021 Chuang Lin, Zehuan Yuan, Sicheng Zhao, Peize Sun, Changhu Wang, Jianfei Cai

By disentangling representations on both image and instance levels, DIDN is able to learn domain-invariant representations that are suitable for generalized object detection.

Domain Generalization Image Classification +1

Unsupervised Real-World Super-Resolution: A Domain Adaptation Perspective

no code implementations ICCV 2021 Wei Wang, Haochen Zhang, Zehuan Yuan, Changhu Wang

A popular attempt at this challenge is unpaired generative adversarial networks, which generate "real" LR counterparts from real HR images using image-to-image translation and then perform super-resolution from "real" LR->SR.

Domain Adaptation Image-to-Image Translation +1

TransTrack: Multiple Object Tracking with Transformer

3 code implementations 31 Dec 2020 Peize Sun, Jinkun Cao, Yi Jiang, Rufeng Zhang, Enze Xie, Zehuan Yuan, Changhu Wang, Ping Luo

In this work, we propose TransTrack, a simple but efficient scheme to solve the multiple object tracking problem.

Multiple Object Tracking Object Detection

What Makes for End-to-End Object Detection?

1 code implementation 10 Dec 2020 Peize Sun, Yi Jiang, Enze Xie, Wenqi Shao, Zehuan Yuan, Changhu Wang, Ping Luo

We identify that classification cost in matching cost is the main ingredient: (1) previous detectors only consider location cost, (2) by additionally introducing classification cost, previous detectors immediately produce one-to-one prediction during inference.

Classification General Classification +1
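
The role of classification cost in the matching cost, as described in the abstract above, can be sketched with a toy one-to-one matcher. This is a minimal sketch under stated assumptions, not the paper's formulation: the L1 box distance, the `1 - probability` class term, the weights `w_cls` and `w_loc`, and the brute-force search are all illustrative choices, and the helper names are hypothetical.

```python
from itertools import permutations


def match_cost(pred, gt, w_cls=1.0, w_loc=1.0):
    """Cost of assigning one prediction to one ground-truth object."""
    cls_cost = 1.0 - pred["probs"][gt["label"]]  # low when the class is confident
    loc_cost = sum(abs(p - g) for p, g in zip(pred["box"], gt["box"]))  # L1 on boxes
    return w_cls * cls_cost + w_loc * loc_cost


def one_to_one_match(preds, gts, w_cls=1.0, w_loc=1.0):
    """Return, for each ground truth, the index of its matched prediction.

    Brute force over assignments; only suitable for tiny examples.
    """
    best, best_cost = None, float("inf")
    for perm in permutations(range(len(preds)), len(gts)):
        cost = sum(
            match_cost(preds[p], gts[g], w_cls, w_loc) for g, p in enumerate(perm)
        )
        if cost < best_cost:
            best, best_cost = perm, cost
    # No valid assignment exists when there are fewer predictions than objects.
    return list(best) if best is not None else []


# Two near-duplicate boxes on one object; only the second is confident in "cat".
preds = [
    {"box": [0.10, 0.10, 0.50, 0.50], "probs": {"cat": 0.2, "dog": 0.8}},
    {"box": [0.11, 0.10, 0.50, 0.50], "probs": {"cat": 0.9, "dog": 0.1}},
]
gts = [{"box": [0.10, 0.10, 0.50, 0.50], "label": "cat"}]

loc_only = one_to_one_match(preds, gts, w_cls=0.0)  # location cost only
with_cls = one_to_one_match(preds, gts)             # location + classification
print(loc_only, with_cls)
```

With location cost alone, the positive sample is the box that overlaps best even though its class score is poor; adding the classification term moves the assignment to the prediction that is both well placed and confident.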

Slimmable Generative Adversarial Networks

1 code implementation 10 Dec 2020 Liang Hou, Zehuan Yuan, Lei Huang, Huawei Shen, Xueqi Cheng, Changhu Wang

In particular, for real-time generation tasks, different devices require generators of different sizes due to varying computing power.

F2Net: Learning to Focus on the Foreground for Unsupervised Video Object Segmentation

no code implementations 4 Dec 2020 Daizong Liu, Dongdong Yu, Changhu Wang, Pan Zhou

Specifically, our proposed network consists of three main parts: Siamese Encoder Module, Center Guiding Appearance Diffusion Module, and Dynamic Information Fusion Module.

Semantic Segmentation Unsupervised Video Object Segmentation +1

Is normalization indispensable for training deep neural network?

1 code implementation NeurIPS 2020 Jie Shao, Kai Hu, Changhu Wang, Xiangyang Xue, Bhiksha Raj

In this paper, we study what would happen when normalization layers are removed from the network, and show how to train deep neural networks without normalization layers and without performance degradation.

General Classification Image Classification +4

Sparse R-CNN: End-to-End Object Detection with Learnable Proposals

4 code implementations CVPR 2021 Peize Sun, Rufeng Zhang, Yi Jiang, Tao Kong, Chenfeng Xu, Wei Zhan, Masayoshi Tomizuka, Lei Li, Zehuan Yuan, Changhu Wang, Ping Luo

In our method, however, a fixed sparse set of learned object proposals, of total length $N$, is provided to the object recognition head to perform classification and location.

Object Detection Object Recognition

Learning the Best Pooling Strategy for Visual Semantic Embedding

no code implementations CVPR 2021 Jiacheng Chen, Hexiang Hu, Hao Wu, Yuning Jiang, Changhu Wang

Visual Semantic Embedding (VSE) is a dominant approach for vision-language retrieval, which aims at learning a deep embedding space such that visual data are embedded close to their semantic text labels or descriptions.

Video-Text Retrieval

Towards Good Practices for Multi-Person Pose Estimation

no code implementations 28 Oct 2019 Dongdong Yu, Kai Su, Changhu Wang

Multi-Person Pose Estimation is an interesting yet challenging task in computer vision.

Multi-Person Pose Estimation

Deformable Tube Network for Action Detection in Videos

no code implementations 3 Jul 2019 Wei Li, Zehuan Yuan, Dashan Guo, Lei Huang, Xiangzhong Fang, Changhu Wang

To perform action detection, we design a 3D convolution network with skip connections for tube classification and regression.

Action Detection Action Recognition

A Context-and-Spatial Aware Network for Multi-Person Pose Estimation

no code implementations 14 May 2019 Dongdong Yu, Kai Su, Xin Geng, Changhu Wang

In this paper, a novel Context-and-Spatial Aware Network (CSANet), which integrates both a Context Aware Path and Spatial Aware Path, is proposed to obtain effective features involving both context information and spatial information.

Multi-Person Pose Estimation

Generative Dual Adversarial Network for Generalized Zero-shot Learning

1 code implementation CVPR 2019 He Huang, Changhu Wang, Philip S. Yu, Chang-Dong Wang

Most previous models try to learn a fixed one-directional mapping between visual and semantic space, while some recently proposed generative methods try to generate image features for unseen classes so that the zero-shot learning problem becomes a traditional fully-supervised classification problem.

Generalized Zero-Shot Learning Metric Learning

Mask Propagation Network for Video Object Segmentation

no code implementations 24 Oct 2018 Jia Sun, Dongdong Yu, Yinghong Li, Changhu Wang

In this work, we propose a mask propagation network that treats the video segmentation problem as a form of guided instance segmentation.

Instance Segmentation Optical Flow Estimation +4

Knowing Where to Look? Analysis on Attention of Visual Question Answering System

no code implementations 9 Oct 2018 Wei Li, Zehuan Yuan, Xiangzhong Fang, Changhu Wang

Attention mechanisms have been widely used in Visual Question Answering (VQA) solutions due to their capacity to model deep cross-domain interactions.

Question Answering Visual Question Answering

Towards Good Practices for Multi-modal Fusion in Large-scale Video Classification

no code implementations 16 Sep 2018 Jinlai Liu, Zehuan Yuan, Changhu Wang

Leveraging both visual frames and audio has been experimentally proven effective to improve large-scale video classification.

Classification General Classification +1

An Introduction to Image Synthesis with Generative Adversarial Nets

no code implementations 12 Mar 2018 He Huang, Philip S. Yu, Changhu Wang

There has been a drastic growth of research in Generative Adversarial Nets (GANs) in the past few years.

Image-to-Image Translation Translation

Network Iterative Learning for Dynamic Deep Neural Networks via Morphism

no code implementations ICLR 2018 Tao Wei, Changhu Wang, Chang Wen Chen

In this research, we present a novel learning scheme called network iterative learning for deep neural networks.

MAT: A Multimodal Attentive Translator for Image Captioning

no code implementations 18 Feb 2017 Chang Liu, Fuchun Sun, Changhu Wang, Feng Wang, Alan Yuille

In this way, the sequential representation of an image can be naturally translated to a sequence of words, as the target sequence of the RNN model.

Image Captioning Machine Translation +1

Modularized Morphing of Neural Networks

no code implementations 12 Jan 2017 Tao Wei, Changhu Wang, Chang Wen Chen

Different from existing work where basic morphing types on the layer level were addressed, we target the central problem of network morphism at a higher level, i.e., how a convolutional layer can be morphed into an arbitrary module of a neural network.

Surveillance Video Parsing with Single Frame Supervision

no code implementations CVPR 2017 Si Liu, Changhu Wang, Ruihe Qian, Han Yu, Renda Bao

In this paper, we develop a Single frame Video Parsing (SVP) method which requires only one labeled frame per video in the training stage.

Optical Flow Estimation

Network Morphism

no code implementations 5 Mar 2016 Tao Wei, Changhu Wang, Yong Rui, Chang Wen Chen

The second requirement for this network morphism is its ability to deal with non-linearity in a network.

Understanding Image Structure via Hierarchical Shape Parsing

no code implementations CVPR 2015 Xian-Ming Liu, Rongrong Ji, Changhu Wang, Wei Liu, Bineng Zhong, Thomas S. Huang

A hierarchical shape parsing strategy is proposed to partition and organize image components into a hierarchical structure in the scale space.

Hierarchical structure
