Search Results for author: Song Bai

Found 46 papers, 27 papers with code

CAP-Net: Correspondence-Aware Point-view Fusion Network for 3D Shape Analysis

no code implementations 3 Sep 2021 Xinwei He, Silin Cheng, Song Bai, Xiang Bai

The core element of CAP-Net is a module named Correspondence-Aware Fusion (CAF) which integrates the local features of the two modalities based on their correspondence scores.

3D Object Classification Object Classification

PlaneTR: Structure-Guided Transformers for 3D Plane Recovery

no code implementations 27 Jul 2021 Bin Tan, Nan Xue, Song Bai, Tianfu Wu, Gui-Song Xia

This paper presents a neural network built upon Transformers, namely PlaneTR, to simultaneously detect and reconstruct planes from a single image.

Visual Parser: Representing Part-whole Hierarchies with Transformers

2 code implementations 13 Jul 2021 Shuyang Sun*, Xiaoyu Yue*, Song Bai, Philip Torr

To model the representations of the two levels, we first encode the information from the whole into part vectors through an attention mechanism, then decode the global information within the part vectors back into the whole representation.

Image Classification Instance Segmentation +2

End-to-end Temporal Action Detection with Transformer

1 code implementation 18 Jun 2021 Xiaolong Liu, Qimeng Wang, Yao Hu, Xu Tang, Song Bai, Xiang Bai

Temporal action detection (TAD) aims to determine the semantic label and the boundaries of every action instance in an untrimmed video.

Action Detection Video Understanding

I2C2W: Image-to-Character-to-Word Transformers for Accurate Scene Text Recognition

no code implementations 18 May 2021 Chuhui Xue, Shijian Lu, Song Bai, Wenqing Zhang, Changhu Wang

Leveraging the advances of natural language processing, most recent scene text recognizers adopt an encoder-decoder architecture where text images are first converted to representative features and then a sequence of characters via 'direct decoding'.

Scene Text Scene Text Recognition

Location-Sensitive Visual Recognition with Cross-IOU Loss

1 code implementation 11 Apr 2021 Kaiwen Duan, Lingxi Xie, Honggang Qi, Song Bai, Qingming Huang, Qi Tian

Object detection, instance segmentation, and pose estimation are popular visual recognition tasks which require localizing the object by internal or boundary landmarks.

2D Object Detection Instance Segmentation +2

Anchor-Free Person Search

1 code implementation CVPR 2021 Yichao Yan, Jinpeng Li, Jie Qin, Song Bai, Shengcai Liao, Li Liu, Fan Zhu, Ling Shao

Person search aims to simultaneously localize and identify a query person from realistic, uncropped images, which can be regarded as the unified task of pedestrian detection and person re-identification (re-id).

Pedestrian Detection Person Re-Identification +1

SwiftNet: Real-time Video Object Segmentation

1 code implementation CVPR 2021 Haochen Wang, XiaoLong Jiang, Haibing Ren, Yao Hu, Song Bai

In this work we present SwiftNet for real-time semi-supervised video object segmentation (one-shot VOS), which reports 77.8% J&F and 70 FPS on the DAVIS 2017 validation dataset, leading all present solutions in overall accuracy and speed performance.

Semantic Segmentation Semi-Supervised Video Object Segmentation +1

Multi-shot Temporal Event Localization: a Benchmark

1 code implementation CVPR 2021 Xiaolong Liu, Yao Hu, Song Bai, Fei Ding, Xiang Bai, Philip H. S. Torr

Current developments in temporal event or action localization usually target actions captured by a single camera.

Ranked #2 on Temporal Action Localization on THUMOS’14 (using extra training data)

Temporal Action Localization

Dual Attention GANs for Semantic Image Synthesis

1 code implementation 29 Aug 2020 Hao Tang, Song Bai, Nicu Sebe

We also propose two novel modules, i.e., position-wise Spatial Attention Module (SAM) and scale-wise Channel Attention Module (CAM), to capture semantic structure attention in spatial and channel dimensions, respectively.

Image Generation

Bipartite Graph Reasoning GANs for Person Image Generation

1 code implementation 10 Aug 2020 Hao Tang, Song Bai, Philip H. S. Torr, Nicu Sebe

We present a novel Bipartite Graph Reasoning GAN (BiGraphGAN) for the challenging person image generation task.

Pose Transfer

Corner Proposal Network for Anchor-free, Two-stage Object Detection

1 code implementation ECCV 2020 Kaiwen Duan, Lingxi Xie, Honggang Qi, Song Bai, Qingming Huang, Qi Tian

On the MS-COCO dataset, CPN achieves an AP of 49.2%, which is competitive among state-of-the-art object detection methods.

Object Detection

FedOCR: Communication-Efficient Federated Learning for Scene Text Recognition

no code implementations 22 Jul 2020 Wenqing Zhang, Yang Qiu, Song Bai, Rui Zhang, Xiaolin Wei, Xiang Bai

In this paper, we study how to make use of decentralized datasets for training a robust scene text recognizer while keeping them on local devices.

Federated Learning Scene Text +1

XingGAN for Person Image Generation

2 code implementations ECCV 2020 Hao Tang, Song Bai, Li Zhang, Philip H. S. Torr, Nicu Sebe

We propose a novel Generative Adversarial Network (XingGAN or CrossingGAN) for person image generation tasks, i.e., translating the pose of a given person to a desired one.

Pose Transfer

Neural Architecture Search for Lightweight Non-Local Networks

2 code implementations CVPR 2020 Yingwei Li, Xiaojie Jin, Jieru Mei, Xiaochen Lian, Linjie Yang, Cihang Xie, Qihang Yu, Yuyin Zhou, Song Bai, Alan Yuille

However, embedding NL blocks in mobile neural networks has rarely been explored, mainly due to the following challenges: 1) NL blocks generally have a heavy computation cost, which makes them difficult to apply where computational resources are limited, and 2) it is an open problem to discover an optimal configuration for embedding NL blocks into mobile neural networks.

Image Classification Neural Architecture Search

Holistically-Attracted Wireframe Parsing

1 code implementation CVPR 2020 Nan Xue, Tianfu Wu, Song Bai, Fu-Dong Wang, Gui-Song Xia, Liangpei Zhang, Philip H. S. Torr

For computing line segment proposals, a novel exact dual representation is proposed which exploits a parsimonious geometric reparameterization for line segments and forms a holistic 4-dimensional attraction field map for an input image.

Line Segment Detection

AutoScale: Learning to Scale for Crowd Counting and Localization

2 code implementations 20 Dec 2019 Chenfeng Xu, Dingkang Liang, Yongchao Xu, Song Bai, Wei Zhan, Xiang Bai, Masayoshi Tomizuka

A major issue is that the density map on dense regions usually accumulates density values from a number of nearby Gaussian blobs, yielding different large density values on a small set of pixels.

Crowd Counting

Learning Regional Attraction for Line Segment Detection

no code implementations 18 Dec 2019 Nan Xue, Song Bai, Fu-Dong Wang, Gui-Song Xia, Tianfu Wu, Liangpei Zhang, Philip H. S. Torr

Given a line segment map, the proposed regional attraction first establishes the relationship between line segments and regions in the image lattice.

Line Segment Detection

Asymmetric Non-local Neural Networks for Semantic Segmentation

5 code implementations ICCV 2019 Zhen Zhu, Mengde Xu, Song Bai, Tengteng Huang, Xiang Bai

The non-local module works as a particularly useful technique for semantic segmentation, but it is criticized for its prohibitive computation and GPU memory occupation.

Semantic Segmentation

View N-gram Network for 3D Object Retrieval

no code implementations ICCV 2019 Xinwei He, Tengteng Huang, Song Bai, Xiang Bai

By doing so, spatial information across multiple views is captured, which helps to learn a discriminative global embedding for each 3D object.

3D Object Retrieval 3D Shape Classification +1

Symmetry-constrained Rectification Network for Scene Text Recognition

no code implementations ICCV 2019 MingKun Yang, Yushuo Guan, Minghui Liao, Xin He, Kaigui Bian, Song Bai, Cong Yao, Xiang Bai

Reading text in the wild is a very challenging task due to the diversity of text instances and the complexity of natural scenes.

Rectification Scene Text +1

Learn to Scale: Generating Multipolar Normalized Density Maps for Crowd Counting

1 code implementation ICCV 2019 Chenfeng Xu, Kai Qiu, Jianlong Fu, Song Bai, Yongchao Xu, Xiang Bai

Dense crowd counting aims to predict thousands of human instances from an image, by calculating integrals of a density map over image pixels.

Crowd Counting Density Estimation
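
The counting-by-integration idea in the entry above can be illustrated with a minimal sketch. It assumes the common construction in which each annotated head contributes one normalized Gaussian blob to the density map, so that summing the map recovers the count. The function names, kernel size, and sigma are illustrative choices, not taken from the paper.

```python
import numpy as np

def gaussian_kernel(size: int, sigma: float) -> np.ndarray:
    """Return a (size x size) Gaussian kernel normalized to sum to 1."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def density_map(shape, points, size: int = 15, sigma: float = 3.0) -> np.ndarray:
    """Place one unit-mass Gaussian blob per annotated (row, col) point.

    The map is padded by size // 2 on every side (an assumption of this
    sketch) so blobs near the border keep their full mass.
    `size` must be odd.
    """
    h, w = shape
    r = size // 2
    padded = np.zeros((h + 2 * r, w + 2 * r))
    k = gaussian_kernel(size, sigma)
    for row, col in points:
        # In padded coordinates the blob is centered at (row + r, col + r).
        padded[row:row + size, col:col + size] += k
    return padded

def estimated_count(dm: np.ndarray) -> float:
    """Discrete integral of the density map over all pixels."""
    return float(dm.sum())

heads = [(5, 5), (5, 7), (40, 60)]   # hypothetical head annotations
dm = density_map((64, 64), heads)
print(round(estimated_count(dm), 6))
```

Because the two nearby heads overlap, the peak value of the map exceeds the peak of a single blob, while the integral still equals the number of annotated points.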

Re-Ranking via Metric Fusion for Object Retrieval and Person Re-Identification

no code implementations CVPR 2019 Song Bai, Peng Tang, Philip H.S. Torr, Longin Jan Latecki

This work studies the unsupervised re-ranking procedure for object retrieval and person re-identification with a specific concentration on an ensemble of multiple metrics (or similarities).

3D Shape Classification 3D Shape Retrieval +3

CenterNet: Keypoint Triplets for Object Detection

10 code implementations ICCV 2019 Kaiwen Duan, Song Bai, Lingxi Xie, Honggang Qi, Qingming Huang, Qi Tian

In object detection, keypoint-based approaches often suffer from a large number of incorrect object bounding boxes, arguably due to the lack of an additional look into the cropped regions.

Object Detection

Regional Homogeneity: Towards Learning Transferable Universal Adversarial Perturbations Against Defenses

1 code implementation ECCV 2020 Yingwei Li, Song Bai, Cihang Xie, Zhenyu Liao, Xiaohui Shen, Alan L. Yuille

We observe the property of regional homogeneity in adversarial perturbations and suggest that the defenses are less robust to regionally homogeneous perturbations.

Object Detection Semantic Segmentation

Adversarial Metric Attack and Defense for Person Re-identification

1 code implementation 30 Jan 2019 Song Bai, Yingwei Li, Yuyin Zhou, Qizhu Li, Philip H. S. Torr

However, our work observes the extreme vulnerability of existing distance metrics to adversarial examples, generated by simply adding human-imperceptible perturbations to person images.

Adversarial Attack General Classification +1

Hypergraph Convolution and Hypergraph Attention

1 code implementation 23 Jan 2019 Song Bai, Feihu Zhang, Philip H. S. Torr

To efficiently learn deep embeddings on the high-order graph-structured data, we introduce two end-to-end trainable operators to the family of graph neural networks, i.e., hypergraph convolution and hypergraph attention.

Node Classification Representation Learning

Learn to Interpret Atari Agents

1 code implementation 29 Dec 2018 Zhao Yang, Song Bai, Li Zhang, Philip H. S. Torr

In contrast to previous a-posteriori methods of visualizing DeepRL policies, we propose an end-to-end trainable framework based on Rainbow, a representative Deep Q-Network (DQN) agent.

Decision Making

Learning Transferable Adversarial Examples via Ghost Networks

1 code implementation 9 Dec 2018 Yingwei Li, Song Bai, Yuyin Zhou, Cihang Xie, Zhishuai Zhang, Alan Yuille

The critical principle of ghost networks is to apply feature-level perturbations to an existing model to potentially create a huge set of diverse models.

Adversarial Attack

Learning Attraction Field Representation for Robust Line Segment Detection

1 code implementation CVPR 2019 Nan Xue, Song Bai, Fu-Dong Wang, Gui-Song Xia, Tianfu Wu, Liangpei Zhang

In experiments, our method is tested on the WireFrame dataset and the YorkUrban dataset with state-of-the-art performance obtained.

Ranked #4 on Line Segment Detection on York Urban Dataset (using extra training data)

Line Segment Detection Semantic Segmentation

Hard-Aware Point-to-Set Deep Metric for Person Re-identification

1 code implementation ECCV 2018 Rui Yu, Zhiyong Dou, Song Bai, Zhao-Xiang Zhang, Yongchao Xu, Xiang Bai

Person re-identification (re-ID) is a highly challenging task due to large variations of pose, viewpoint, illumination, and occlusion.

Metric Learning Person Re-Identification

PCL: Proposal Cluster Learning for Weakly Supervised Object Detection

3 code implementations 9 Jul 2018 Peng Tang, Xinggang Wang, Song Bai, Wei Shen, Xiang Bai, Wenyu Liu, Alan Yuille

The iterative instance classifier refinement is implemented online using multiple streams in convolutional neural networks, where the first is an MIL network and the others are for instance classifier refinement supervised by the preceding one.

Multiple Instance Learning Object Recognition +1

Semi-Supervised Multi-Organ Segmentation via Deep Multi-Planar Co-Training

no code implementations 7 Apr 2018 Yuyin Zhou, Yan Wang, Peng Tang, Song Bai, Wei Shen, Elliot K. Fishman, Alan L. Yuille

In multi-organ segmentation of abdominal CT scans, most existing fully supervised deep learning algorithms require lots of voxel-wise annotations, which are usually difficult, expensive, and slow to obtain.

Semantic Segmentation

Improving Transferability of Adversarial Examples with Input Diversity

1 code implementation CVPR 2019 Cihang Xie, Zhishuai Zhang, Yuyin Zhou, Song Bai, Jian-Yu Wang, Zhou Ren, Alan Yuille

We hope that our proposed attack strategy can serve as a strong benchmark baseline for evaluating the robustness of networks to adversaries and the effectiveness of different defense methods in the future.

Adversarial Attack Image Classification

Triplet-Center Loss for Multi-View 3D Object Retrieval

1 code implementation CVPR 2018 Xinwei He, Yang Zhou, Zhichao Zhou, Song Bai, Xiang Bai

Most existing 3D object recognition algorithms focus on leveraging the strong discriminative power of deep learning models with softmax loss for the classification of 3D data, while learning discriminative features with deep metric learning for 3D object retrieval is more or less neglected.

3D Object Recognition 3D Object Retrieval +4

Ensemble Diffusion for Retrieval

no code implementations ICCV 2017 Song Bai, Zhichao Zhou, Jingdong Wang, Xiang Bai, Longin Jan Latecki, Qi Tian

This stimulates great research interest in considering similarity fusion in the framework of the diffusion process (i.e., fusion with diffusion) for robust retrieval.

3D Shape Classification 3D Shape Retrieval +1

Scalable Person Re-identification on Supervised Smoothed Manifold

no code implementations CVPR 2017 Song Bai, Xiang Bai, Qi Tian

Most existing person re-identification algorithms either extract robust visual features or learn discriminative metrics for person images.

Person Re-Identification

Multidimensional Scaling on Multiple Input Distance Matrices

no code implementations 1 May 2016 Song Bai, Xiang Bai, Longin Jan Latecki, Qi Tian

To the best of our knowledge, how to do multidimensional scaling on multiple input distance matrices is still unsolved.

Deep Learning Representation using Autoencoder for 3D Shape Retrieval

no code implementations 25 Sep 2014 Zhuotun Zhu, Xinggang Wang, Song Bai, Cong Yao, Xiang Bai

By combining the global deep learning representation and the local descriptor representation, our method can obtain state-of-the-art performance on 3D shape retrieval benchmarks.

3D Shape Classification 3D Shape Recognition +3
