no code implementations • ICLR 2019 • Yutong Bai, Lingxi Xie
Reinforcement learning (RL) is a metaheuristic that aims to teach an agent to interact with an environment and maximize the reward in a complex task.
no code implementations • 10 May 2023 • Bruce X. B. Yu, Jianlong Chang, Haixin Wang, Lingbo Liu, Shijie Wang, Zhiyu Wang, Junfan Lin, Lingxi Xie, Haojie Li, Zhouchen Lin, Qi Tian, Chang Wen Chen
With the surprising development of pre-trained visual foundation models, visual tuning jumped out of the standard modus operandi that fine-tunes the whole pre-trained model or just the fully connected layer.
no code implementations • 24 Apr 2023 • Jiazhong Cen, Zanwei Zhou, Jiemin Fang, Chen Yang, Wei Shen, Lingxi Xie, Dongsheng Jiang, Xiaopeng Zhang, Qi Tian
Recently, the Segment Anything Model (SAM) emerged as a powerful vision foundation model that is capable of segmenting anything in 2D images.
no code implementations • 22 Apr 2023 • Xin Chen, Hengheng Zhang, Xiaotao Gu, Kaifeng Bi, Lingxi Xie, Qi Tian
The Mixture of Experts (MoE) model has become an important choice for large language models because of its scalability with sublinear computational complexity for training and inference.
no code implementations • 16 Mar 2023 • Xinyue Huo, Lingxi Xie, Wengang Zhou, Houqiang Li, Qi Tian
Currently, a popular UDA framework lies in self-training which endows the model with two-fold abilities: (i) learning reliable semantics from the labeled images in the source domain, and (ii) adapting to the target domain via generating pseudo labels on the unlabeled images.
no code implementations • 14 Mar 2023 • Zelin Peng, Guanchun Wang, Lingxi Xie, Dongsheng Jiang, Wei Shen, Qi Tian
Seed area generation is usually the starting point of weakly supervised semantic segmentation (WSSS).
Multi-Label Classification
Weakly supervised Semantic Segmentation
1 code implementation • CVPR 2023 • Yunjie Tian, Lingxi Xie, Zhaozhi Wang, Longhui Wei, Xiaopeng Zhang, Jianbin Jiao, YaoWei Wang, Qi Tian, Qixiang Ye
In this paper, we present an integral pre-training framework based on masked image modeling (MIM).
1 code implementation • 4 Nov 2022 • Chengcheng Ma, Yang Liu, Jiankang Deng, Lingxi Xie, WeiMing Dong, Changsheng Xu
Pretrained vision-language models (VLMs) such as CLIP have shown impressive generalization capability in downstream vision tasks with appropriate text prompts.
no code implementations • 3 Nov 2022 • Kaifeng Bi, Lingxi Xie, Hengheng Zhang, Xin Chen, Xiaotao Gu, Qi Tian
In this paper, we present Pangu-Weather, a deep learning based system for fast and accurate global weather forecast.
no code implementations • 1 Oct 2022 • Binghao Liu, Boyu Yang, Lingxi Xie, Ren Wang, Qi Tian, Qixiang Ye
LDC is built upon a parameterized calibration unit (PCU), which initializes biased distributions for all classes based on classifier vectors (memory-free) and a single covariance matrix.
Class-Incremental Learning
Few-Shot Class-Incremental Learning
1 code implementation • 4 Aug 2022 • Juncheng Li, Xin He, Longhui Wei, Long Qian, Linchao Zhu, Lingxi Xie, Yueting Zhuang, Qi Tian, Siliang Tang
Large-scale vision-language pre-training has shown impressive advances in a wide range of downstream tasks.
1 code implementation • 31 Jul 2022 • Maosen Li, Siheng Chen, Zijing Zhang, Lingxi Xie, Qi Tian, Ya Zhang
To address the first issue, we propose adaptive graph scattering, which leverages multiple trainable band-pass graph filters to decompose pose features into richer graph spectrum bands.
1 code implementation • CVPR 2023 • Chufeng Tang, Lingxi Xie, Xiaopeng Zhang, Xiaolin Hu, Qi Tian
Humans have the ability to recognize visual semantics at unlimited granularity, but existing visual recognition algorithms cannot achieve this goal.
1 code implementation • 23 Jul 2022 • Chufeng Tang, Lingxi Xie, Gang Zhang, Xiaopeng Zhang, Qi Tian, Xiaolin Hu
In this paper, we present an economic active learning setting, named active pointly-supervised instance segmentation (APIS), which starts with box-level annotations and iteratively samples a point within the box and asks if it falls on the object.
1 code implementation • 4 Jul 2022 • Wei Shen, Zelin Peng, Xuehui Wang, Huayu Wang, Jiazhong Cen, Dongsheng Jiang, Lingxi Xie, Xiaokang Yang, Qi Tian
Next, we summarize the existing label-efficient image segmentation methods from a unified perspective that discusses an important question: how to bridge the gap between weak supervision and dense prediction -- the current methods are mostly based on heuristic priors, such as cross-pixel similarity, cross-label constraint, cross-view consistency, and cross-image relation.
1 code implementation • 30 May 2022 • Jiemin Fang, Taoran Yi, Xinggang Wang, Lingxi Xie, Xiaopeng Zhang, Wenyu Liu, Matthias Nießner, Qi Tian
A multi-distance interpolation method is proposed and applied on voxel features to model both small and large motions.
no code implementations • 30 May 2022 • Xiaosong Zhang, Yunjie Tian, Wei Huang, Qixiang Ye, Qi Dai, Lingxi Xie, Qi Tian
A key idea of efficient implementation is to discard the masked image patches (or tokens) throughout the target network (encoder), which requires the encoder to be a plain vision transformer (e.g., ViT), although hierarchical vision transformers (e.g., Swin Transformer) have potentially better properties in formulating vision inputs.
1 code implementation • 18 Apr 2022 • Kaiwen Duan, Song Bai, Lingxi Xie, Honggang Qi, Qingming Huang, Qi Tian
Our approach, named CenterNet, detects each object as a triplet of keypoints (the top-left and bottom-right corners and the center keypoint).
Ranked #32 on Object Detection on COCO test-dev
no code implementations • CVPR 2022 • Xinyue Huo, Lingxi Xie, Hengtong Hu, Wengang Zhou, Houqiang Li, Qi Tian
Unsupervised domain adaptation (UDA) is an important topic in the computer vision community.
1 code implementation • 27 Mar 2022 • Yunjie Tian, Lingxi Xie, Jiemin Fang, Mengnan Shi, Junran Peng, Xiaopeng Zhang, Jianbin Jiao, Qi Tian, Qixiang Ye
The past year has witnessed a rapid development of masked image modeling (MIM).
no code implementations • 11 Mar 2022 • Lin Liu, Lingxi Xie, Xiaopeng Zhang, Shanxin Yuan, Xiangyu Chen, Wengang Zhou, Houqiang Li, Qi Tian
In this paper, we propose a novel approach that embeds a task-agnostic prior into a transformer.
no code implementations • 10 Mar 2022 • Longhui Wei, Lingxi Xie, Wengang Zhou, Houqiang Li, Qi Tian
Recently, masked image modeling (MIM) has become a promising direction for visual pre-training.
no code implementations • CVPR 2022 • Yuhang Zhang, Xiaopeng Zhang, Lingxi Xie, Jie Li, Robert C. Qiu, Hengtong Hu, Qi Tian
The Yes query is treated as positive pairs of the queried category for contrastive pulling, while the No query is treated as hard negative pairs for contrastive repelling.
no code implementations • 5 Dec 2021 • Yunjie Tian, Lingxi Xie, Jiemin Fang, Jianbin Jiao, Qixiang Ye, Qi Tian
In this paper, we build the search algorithm upon a complicated search space with long-distance connections, and show that existing weight-sharing search algorithms mostly fail due to the existence of interleaved connections.
1 code implementation • 30 Nov 2021 • Jiemin Fang, Lingxi Xie, Xinggang Wang, Xiaopeng Zhang, Wenyu Liu, Qi Tian
Neural radiance fields (NeRF) have shown great potential in representing 3D scenes and synthesizing novel views, but the computational overhead of NeRF at the inference stage is still heavy.
1 code implementation • 25 Nov 2021 • Yunjie Tian, Lingxi Xie, Xiaopeng Zhang, Jiemin Fang, Haohang Xu, Wei Huang, Jianbin Jiao, Qi Tian, Qixiang Ye
In this paper, we propose a self-supervised visual representation learning approach which involves both generative and discriminative proxies, where we focus on the former part by requiring the target network to recover the original image based on the mid-level features.
Ranked #58 on Semantic Segmentation on Cityscapes test
no code implementations • 24 Nov 2021 • Jiazhong Cen, Zenkun Jiang, Lingxi Xie, Qi Tian, Xiaokang Yang, Wei Shen
Anomaly segmentation is a crucial task for safety-critical applications, such as autonomous driving in urban scenes, where the goal is to detect out-of-distribution (OOD) objects with categories which are unseen during training.
Ranked #6 on Anomaly Detection on Fishyscapes L&F
1 code implementation • 19 Oct 2021 • Peng Zhou, Lingxi Xie, Bingbing Ni, Qi Tian
The style-based GAN (StyleGAN) architecture achieved state-of-the-art results for generating high-quality images, but it lacks explicit and precise control over camera poses.
Ranked #1 on 3D-Aware Image Synthesis on FFHQ 256 x 256
no code implementations • 29 Sep 2021 • Hengtong Hu, Lingxi Xie, Yinquan Wang, Richang Hong, Meng Wang, Qi Tian
We investigate the problem of estimating uncertainty for training data, so that deep neural networks can make use of the results for learning from limited supervision.
no code implementations • 29 Sep 2021 • Mengbiao Zhao, Shixiong Xu, Jianlong Chang, Lingxi Xie, Jie Chen, Qi Tian
Having consumed huge amounts of training data and computational resource, large-scale pre-trained models are often considered key assets of AI service providers.
1 code implementation • NeurIPS 2021 • Xu Luo, Longhui Wei, Liangjian Wen, Jinrong Yang, Lingxi Xie, Zenglin Xu, Qi Tian
The category gap between training and evaluation has been characterised as one of the main obstacles to the success of Few-Shot Learning (FSL).
1 code implementation • ICLR 2022 • Haohang Xu, Jiemin Fang, Xiaopeng Zhang, Lingxi Xie, Xinggang Wang, Wenrui Dai, Hongkai Xiong, Qi Tian
Here, a bag of instances denotes a set of similar samples that are constructed by the teacher and grouped within a bag, and the goal of distillation is to aggregate compact representations over the student with respect to the instances in a bag.
no code implementations • CVPR 2021 • Xinyue Huo, Lingxi Xie, Jianzhong He, Zijie Yang, Wengang Zhou, Houqiang Li, Qi Tian
Semi-supervised learning is a useful tool for image segmentation, mainly due to its ability to extract knowledge from unlabeled data to assist learning from labeled data.
no code implementations • 1 Jun 2021 • Longhui Wei, Lingxi Xie, Wengang Zhou, Houqiang Li, Qi Tian
By simply pulling the different augmented views of each image together or other novel mechanisms, they can learn much unsupervised knowledge and significantly improve the transfer performance of pre-training models.
3 code implementations • CVPR 2022 • Jiemin Fang, Lingxi Xie, Xinggang Wang, Xiaopeng Zhang, Wenyu Liu, Qi Tian
Transformers have offered a new methodology of designing neural networks for visual recognition.
no code implementations • 28 May 2021 • Lingxi Xie, Xiaopeng Zhang, Longhui Wei, Jianlong Chang, Qi Tian
This is an opinion paper.
4 code implementations • ICCV 2021 • Zhiliang Peng, Wei Huang, Shanzhi Gu, Lingxi Xie, YaoWei Wang, Jianbin Jiao, Qixiang Ye
Within a Convolutional Neural Network (CNN), the convolution operations are good at extracting local features but have difficulty capturing global representations.
Ranked #292 on Image Classification on ImageNet
5 code implementations • ICCV 2021 • Zhengsu Chen, Lingxi Xie, Jianwei Niu, Xuefeng Liu, Longhui Wei, Qi Tian
The past year has witnessed the rapid development of applying the Transformer module to vision problems.
Ranked #454 on Image Classification on ImageNet
1 code implementation • 11 Apr 2021 • Kaiwen Duan, Lingxi Xie, Honggang Qi, Song Bai, Qingming Huang, Qi Tian
Object detection, instance segmentation, and pose estimation are popular visual recognition tasks which require localizing the object by internal or boundary landmarks.
Ranked #54 on Object Detection on COCO test-dev
no code implementations • 30 Mar 2021 • Tianyu Zhang, Longhui Wei, Lingxi Xie, Zijie Zhuang, Yongfei Zhang, Bo Li, Qi Tian
Recently, the Transformer module has been transplanted from natural language processing to computer vision.
no code implementations • CVPR 2021 • Zhikai Chen, Lingxi Xie, Shanmin Pang, Yong He, Bo Zhang
This paper presents MagDR, a mask-guided detection and reconstruction pipeline for defending deepfakes from adversarial attacks.
1 code implementation • 10 Dec 2020 • Rui Yan, Lingxi Xie, Xiangbo Shu, Jinhui Tang
To understand a complex action, multiple sources of information, including appearance, positional, and semantic features, need to be integrated.
1 code implementation • CVPR 2021 • Tianyu Zhang, Lingxi Xie, Longhui Wei, Zijie Zhuang, Yongfei Zhang, Bo Li, Qi Tian
The main difficulty of person re-identification (ReID) lies in collecting annotated data and transferring the model across different domains.
no code implementations • 4 Dec 2020 • Haohang Xu, Xiaopeng Zhang, Hao Li, Lingxi Xie, Hongkai Xiong, Qi Tian
In this paper, we propose a hierarchical semantic alignment strategy that expands the views generated by a single image to cross-sample and multi-level representations, and models the invariance to semantically similar images in a hierarchical way.
1 code implementation • 28 Nov 2020 • Yuhui Xu, Lingxi Xie, Cihang Xie, Jieru Mei, Siyuan Qiao, Wei Shen, Hongkai Xiong, Alan Yuille
Batch normalization (BN) is a fundamental unit in modern deep networks, in which a linear transformation module was designed for improving BN's flexibility of fitting complex data distributions.
2 code implementations • ICCV 2021 • Peng Zhou, Lingxi Xie, Bingbing Ni, Cong Geng, Qi Tian
The conditional generative adversarial network (cGAN) is a powerful tool for generating high-quality images, but existing approaches mostly suffer from unsatisfying performance or the risk of mode collapse.
Ranked #5 on Conditional Image Generation on ImageNet 128x128
no code implementations • 19 Nov 2020 • Xinyue Huo, Lingxi Xie, Longhui Wei, Xiaopeng Zhang, Hao Li, Zijie Yang, Wengang Zhou, Houqiang Li, Qi Tian
Contrastive learning has achieved great success in self-supervised visual representation learning, but existing approaches mostly ignored spatial information which is often crucial for visual representation.
no code implementations • 18 Nov 2020 • Peisen Zhao, Lingxi Xie, Ya Zhang, Yanfeng Wang, Qi Tian
Knowledge distillation is employed to transfer the privileged information from the offline teacher to the online student.
Ranked #5 on Online Action Detection on TVSeries
no code implementations • 17 Nov 2020 • Longhui Wei, Lingxi Xie, Jianzhong He, Jianlong Chang, Xiaopeng Zhang, Wengang Zhou, Houqiang Li, Qi Tian
Recently, contrastive learning has largely advanced the progress of unsupervised visual representation learning.
1 code implementation • NeurIPS 2020 • Hengtong Hu, Lingxi Xie, Zewei Du, Richang Hong, Qi Tian
Instead of training a model upon the accurate label of each sample, our setting requires the model to query with a predicted label of each sample and learn from the answer whether the guess is correct.
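The query-and-answer setting described in this excerpt can be illustrated with a small bookkeeping sketch. It is not the authors' training procedure, which the excerpt does not specify. It only shows what a yes/no answer reveals about a sample: a "yes" fixes the label, while a "no" removes one candidate. All function names and the `state` dictionary layout are hypothetical.

```python
def one_bit_query(predicted_label, true_label):
    """Hypothetical oracle: answers only whether the guess is correct."""
    return predicted_label == true_label


def update_label_state(state, predicted_label, answer):
    """Record the answer to one query.

    state is a dict with:
      'positive'  -> the confirmed label, or None if not yet confirmed
      'negatives' -> set of labels ruled out by 'no' answers
    """
    if answer:
        state["positive"] = predicted_label
    else:
        state["negatives"].add(predicted_label)
    return state


def candidate_labels(state, num_classes):
    """Labels still consistent with all answers received so far."""
    if state["positive"] is not None:
        return [state["positive"]]
    return [c for c in range(num_classes) if c not in state["negatives"]]


# Usage: a sample whose (hidden) true label is 5, in a 6-class problem.
state = {"positive": None, "negatives": set()}
guess = 2
state = update_label_state(state, guess, one_bit_query(guess, 5))
remaining = candidate_labels(state, num_classes=6)
```

A wrong guess still carries information, since it shrinks the candidate set by one. How that negative information enters the loss is left open here.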
no code implementations • ECCV 2020 • Lijie Liu, Chufan Wu, Jiwen Lu, Lingxi Xie, Jie Zhou, Qi Tian
Monocular 3D object detection aims to extract the 3D position and properties of objects from a 2D input image.
Ranked #16 on Vehicle Pose Estimation on KITTI Cars Hard
no code implementations • 4 Aug 2020 • Lingxi Xie, Xin Chen, Kaifeng Bi, Longhui Wei, Yuhui Xu, Zhengsu Chen, Lanfei Wang, An Xiao, Jianlong Chang, Xiaopeng Zhang, Qi Tian
Neural architecture search (NAS) has attracted increasing attention in both academia and industry.
1 code implementation • ECCV 2020 • Kaiwen Duan, Lingxi Xie, Honggang Qi, Song Bai, Qingming Huang, Qi Tian
On the MS-COCO dataset, CPN achieves an AP of 49.2%, which is competitive among state-of-the-art object detection methods.
Ranked #96 on Object Detection on COCO test-dev
no code implementations • 20 Jul 2020 • Ke Ning, Lingxi Xie, Fei Wu, Qi Tian
In this paper, we propose a novel Polar Relative Positional Encoding (PRPE) mechanism that represents spatial relations in a "linguistic" way, i.e., in terms of direction and range.
Ranked #10 on Referring Expression Segmentation on J-HMDB
no code implementations • ECCV 2020 • Rui Yan, Lingxi Xie, Jinhui Tang, Xiangbo Shu, Qi Tian
This paper presents a new task named weakly-supervised group activity recognition (GAR) which differs from conventional GAR tasks in that only video-level labels are available, yet the important persons within each frame are not provided even in the training data.
no code implementations • 13 Jul 2020 • Peisen Zhao, Lingxi Xie, Ya Zhang, Qi Tian
The U2S framework is composed of three subnetworks: a universal network, a category-specific network, and a mask network.
1 code implementation • 7 Jul 2020 • Yunjie Tian, Chang Liu, Lingxi Xie, Jianbin Jiao, Qixiang Ye
The search cost of neural architecture search (NAS) has been largely reduced by weight-sharing methods.
1 code implementation • 7 Jul 2020 • Kaifeng Bi, Lingxi Xie, Xin Chen, Longhui Wei, Qi Tian
There is a large body of literature on neural architecture search, but most existing work makes use of heuristic rules that largely constrain the search flexibility.
1 code implementation • 25 Jun 2020 • Peng Zhou, Lingxi Xie, Xiaopeng Zhang, Bingbing Ni, Qi Tian
To learn the sampling policy, a Markov decision process is embedded into the search algorithm and a moving average is applied for better stability.
no code implementations • 24 Jun 2020 • Xinyue Huo, Lingxi Xie, Jianzhong He, Zijie Yang, Qi Tian
This paper focuses on a popular pipeline known as self learning, and points out a weakness named lazy learning that refers to the difficulty for a model to learn from the pseudo labels generated by itself.
no code implementations • 17 Apr 2020 • Xin Chen, Lingxi Xie, Jun Wu, Longhui Wei, Yuhui Xu, Qi Tian
We alleviate this issue by training a graph convolutional network to fit the performance of sampled sub-networks so that the impact of random errors becomes minimal.
1 code implementation • CVPR 2020 • Yutian Lin, Lingxi Xie, Yu Wu, Chenggang Yan, Qi Tian
Person re-identification (re-ID) is an important topic in computer vision.
1 code implementation • CVPR 2020 • Zhengsu Chen, Jianwei Niu, Lingxi Xie, Xuefeng Liu, Longhui Wei, Qi Tian
Automatically designing computationally efficient neural networks has received much attention in recent years.
1 code implementation • CVPR 2020 • Hengtong Hu, Lingxi Xie, Richang Hong, Qi Tian
In recent years, cross-modal hashing (CMH) has attracted increasing attention, mainly because of its potential ability to map contents from different modalities, especially vision and language, into the same space, so that cross-modal data retrieval becomes efficient.
1 code implementation • ECCV 2020 • Longhui Wei, An Xiao, Lingxi Xie, Xin Chen, Xiaopeng Zhang, Qi Tian
AutoAugment has been a powerful algorithm that improves the accuracy of many vision tasks, yet it is sensitive to the operator space as well as hyper-parameters, and an improper setting may degenerate network optimization.
Ranked #170 on Image Classification on ImageNet
1 code implementation • ECCV 2020 • Peisen Zhao, Lingxi Xie, Chen Ju, Ya Zhang, Yan-Feng Wang, Qi Tian
To alleviate this problem, we introduce two regularization terms to mutually regularize the learning procedure: the Intra-phase Consistency (IntraC) regularization is proposed to make the predictions verified inside each phase; and the Inter-phase Consistency (InterC) regularization is proposed to keep consistency between these phases.
1 code implementation • ECCV 2020 • Zijie Zhuang, Longhui Wei, Lingxi Xie, Tianyu Zhang, Hengheng Zhang, Haozhe Wu, Haizhou Ai, Qi Tian
The fundamental difficulty in person re-identification (ReID) lies in learning the correspondence among individual cameras.
Ranked #15 on Unsupervised Domain Adaptation on Duke to Market
Direct Transfer Person Re-identification
Domain Adaptive Person Re-Identification
1 code implementation • 17 Jan 2020 • Yuhui Xu, Lingxi Xie, Xiaopeng Zhang, Xin Chen, Bowen Shi, Qi Tian, Hongkai Xiong
However, these methods suffer from difficulty in optimizing the network, so that the searched network is often unfriendly to hardware.
no code implementations • ICLR 2020 • Peng Zhou, Bingbing Ni, Lingxi Xie, Xiaopeng Zhang, Hang Wang, Cong Geng, Qi Tian
In the field of Generative Adversarial Networks (GANs), how to design a stable training strategy remains an open problem.
no code implementations • 31 Dec 2019 • Lanfei Wang, Lingxi Xie, Tianyi Zhang, Jun Guo, Qi Tian
Neural Architecture Search (NAS) is an emerging topic in machine learning and computer vision.
1 code implementation • 23 Dec 2019 • Xin Chen, Lingxi Xie, Jun Wu, Qi Tian
With the rapid development of neural architecture search (NAS), researchers found powerful network architectures for a wide range of vision tasks.
no code implementations • 10 Dec 2019 • Zhikai Chen, Lingxi Xie, Shanmin Pang, Yong He, Qi Tian
There have been many efforts in attacking image classification models with adversarial perturbations, but the same topic on video classification has not yet been thoroughly studied.
1 code implementation • 25 Oct 2019 • Kaifeng Bi, Changping Hu, Lingxi Xie, Xin Chen, Longhui Wei, Qi Tian
Our approach bridges the gap from two aspects, namely, amending the estimation on the architectural gradients, and unifying the hyper-parameter settings in the search and re-training stages.
1 code implementation • MM '19: Proceedings of the 27th ACM International Conference on Multimedia 2019 • Lu Chi, Guiyu Tian, Yadong Mu, Lingxi Xie, Qi Tian
We show its equivalence to conducting residual learning in some spectral domain and carefully re-formulate a variety of neural layers into their spectral forms, such as ReLU or convolutions.
1 code implementation • 27 Sep 2019 • Yulong Wang, Xiaolu Zhang, Lingxi Xie, Jun Zhou, Hang Su, Bo Zhang, Xiaolin Hu
Network pruning is an important research field aiming at reducing computational costs of neural networks.
1 code implementation • 24 Sep 2019 • Tianyu Zhang, Lingxi Xie, Longhui Wei, Yongfei Zhang, Bo Li, Qi Tian
Differently, this paper investigates ReID in an unexplored single-camera-training (SCT) setting, where each person in the training set appears in only one camera.
no code implementations • 19 Sep 2019 • Zhuoxun He, Lingxi Xie, Xin Chen, Ya Zhang, Yan-Feng Wang, Qi Tian
Data augmentation has been widely applied as an effective methodology to improve generalization in particular when training deep neural networks.
6 code implementations • ICLR 2020 • Yuhui Xu, Lingxi Xie, Xiaopeng Zhang, Xin Chen, Guo-Jun Qi, Qi Tian, Hongkai Xiong
Differentiable architecture search (DARTS) provided a fast solution in finding effective network architectures, but suffered from large memory and computing overheads in jointly training a super-network and searching for an optimal architecture.
Ranked #20 on Neural Architecture Search on CIFAR-10
no code implementations • 26 Jun 2019 • Yifeng Li, Lingxi Xie, Ya Zhang, Rui Zhang, Yanfeng Wang, Qi Tian
Generating and eliminating adversarial examples has been an intriguing topic in the field of deep learning.
4 code implementations • ICCV 2019 • Xin Chen, Lingxi Xie, Jun Wu, Qi Tian
Recently, differentiable search methods have made major progress in reducing the computational costs of neural architecture search.
11 code implementations • ICCV 2019 • Kaiwen Duan, Song Bai, Lingxi Xie, Honggang Qi, Qingming Huang, Qi Tian
In object detection, keypoint-based approaches often suffer from a large number of incorrect object bounding boxes, arguably due to the lack of an additional look into the cropped regions.
Ranked #118 on Object Detection on COCO test-dev
1 code implementation • ICCV 2019 • Qing Liu, Lingxi Xie, Huiyu Wang, Alan Yuille
Sketch-based image retrieval (SBIR) is widely recognized as an important vision problem which implies a wide range of real-world applications.
no code implementations • 2 Apr 2019 • Qihang Yu, Yingda Xia, Lingxi Xie, Elliot K. Fishman, Alan L. Yuille
With this design, we achieve a higher performance while maintaining a lower inference latency on a few abdominal organs from CT scans, in particular when the organ has a peculiar 3D shape and thus strongly requires contextual information, demonstrating our method's effectiveness and ability in capturing 3D information.
1 code implementation • 2 Jan 2019 • Caijing Miao, Lingxi Xie, Fang Wan, Chi Su, Hongye Liu, Jianbin Jiao, Qixiang Ye
In particular, the advantage of CHR is more significant in the scenarios with fewer positive training samples, which demonstrates its potential application in real-world security inspection.
no code implementations • 11 Dec 2018 • Yanwei Li, Xingang Wang, Shilei Zhang, Lingxi Xie, Wenqi Wu, Hongyuan Yu, Zheng Zhu
Facial expression recognition is a challenging task, arguably because of large intra-class variations and high inter-class similarities.
no code implementations • CVPR 2019 • Yanwei Li, Xinze Chen, Zheng Zhu, Lingxi Xie, Guan Huang, Dalong Du, Xingang Wang
This paper studies panoptic segmentation, a recently proposed task which segments foreground (FG) objects at the instance level as well as background (BG) contents at the semantic level.
Ranked #23 on Panoptic Segmentation on COCO test-dev
2 code implementations • CVPR 2019 • Tianwei Ni, Lingxi Xie, Huangjie Zheng, Elliot K. Fishman, Alan L. Yuille
The key observation is that, although the object is a 3D volume, what we really need in segmentation is to find its boundary which is a 2D surface.
1 code implementation • CVPR 2019 • Yiming Zuo, Weichao Qiu, Lingxi Xie, Fangwei Zhong, Yizhou Wang, Alan L. Yuille
We also construct a vision-based control system for task accomplishment, for which we train a reinforcement learning agent in a virtual environment and apply it to the real world.
1 code implementation • CVPR 2019 • Chen Wei, Lingxi Xie, Xutong Ren, Yingda Xia, Chi Su, Jiaying Liu, Qi Tian, Alan L. Yuille
We consider spatial contexts, for which we solve so-called jigsaw puzzles, i.e., each image is cut into grids and then disordered, and the goal is to recover the correct configuration.
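The jigsaw setup in this excerpt (cut into a grid, disorder, recover the configuration) can be sketched on a toy image as follows. This is only a data-preparation illustration under assumed conventions, not the paper's puzzle solver: the image is a plain list of rows, and `make_jigsaw`, `restore`, `assemble` and the 3x3 default grid are hypothetical names and choices.

```python
import random


def make_jigsaw(image, grid=3, seed=None):
    """Cut a 2D image (list of rows) into grid x grid tiles and shuffle them.

    Returns (shuffled_tiles, perm), where shuffled_tiles[i] is the original
    tile with row-major index perm[i]. Assumes the image height and width
    are divisible by grid.
    """
    height, width = len(image), len(image[0])
    tile_h, tile_w = height // grid, width // grid
    tiles = []
    for r in range(grid):
        for c in range(grid):
            rows = image[r * tile_h:(r + 1) * tile_h]
            tiles.append([row[c * tile_w:(c + 1) * tile_w] for row in rows])
    perm = list(range(grid * grid))
    random.Random(seed).shuffle(perm)
    return [tiles[p] for p in perm], perm


def restore(shuffled_tiles, perm):
    """Undo the shuffle given the permutation (the target to be recovered)."""
    ordered = [None] * len(perm)
    for position, original_index in enumerate(perm):
        ordered[original_index] = shuffled_tiles[position]
    return ordered


def assemble(tiles, grid):
    """Stitch row-major tiles back into a single 2D image."""
    rows = []
    for r in range(grid):
        band = tiles[r * grid:(r + 1) * grid]
        for i in range(len(band[0])):
            rows.append([value for tile in band for value in tile[i]])
    return rows


# Usage: a 6x6 toy image whose pixel values encode their own position.
image = [[r * 6 + c for c in range(6)] for r in range(6)]
shuffled, perm = make_jigsaw(image, grid=3, seed=0)
recovered = assemble(restore(shuffled, perm), grid=3)
```

In a learning setting, `perm` plays the role of the label: a model sees only the shuffled tiles and is trained to predict the arrangement that `restore` applies here directly.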
no code implementations • CVPR 2019 • Chenglin Yang, Lingxi Xie, Chi Su, Alan L. Yuille
Optimizing a deep neural network is a fundamental task in computer vision, yet direct training methods often suffer from over-fitting.
no code implementations • 29 Nov 2018 • Xutong Ren, Lingxi Xie, Chen Wei, Siyuan Qiao, Chi Su, Jiaying Liu, Qi Tian, Elliot K. Fishman, Alan L. Yuille
Computer vision is difficult, partly because the desired mathematical function connecting input and output data is often complex, fuzzy and thus hard to learn.
no code implementations • 28 Nov 2018 • Huangjie Zheng, Lingxi Xie, Tianwei Ni, Ya Zhang, Yan-Feng Wang, Qi Tian, Elliot K. Fishman, Alan L. Yuille
However, in medical image analysis, fusing predictions from two phases is often difficult, because (i) there is a domain gap between the two phases, and (ii) the semantic labels do not correspond pixel-wise even for images scanned from the same patient.
1 code implementation • ICCV 2019 • Yutong Bai, Qing Liu, Lingxi Xie, Weichao Qiu, Yan Zheng, Alan Yuille
In particular, this enables images in the training dataset to be matched to a virtual 3D model of the object (for simplicity, we assume that the object viewpoint can be estimated by standard techniques).
no code implementations • 7 Sep 2018 • Junran Peng, Lingxi Xie, Zhao-Xiang Zhang, Tieniu Tan, Jingdong Wang
This paper presents an efficient module named spatial bottleneck for accelerating the convolutional layers in deep neural networks.
no code implementations • 7 Sep 2018 • Xiaolu Zhang, Shiwan Zhao, Lingxi Xie
This paper considers WCE-based gastric ulcer detection, in which the major challenge is to detect the lesions in a local region.
no code implementations • 1 Aug 2018 • Yingying Zhu, Jiong Wang, Lingxi Xie, Liang Zheng
Visual place recognition is challenging in the urban environment and is usually viewed as a large scale image retrieval task.
no code implementations • 9 Jul 2018 • Zhuotun Zhu, Yingda Xia, Lingxi Xie, Elliot K. Fishman, Alan L. Yuille
We propose an intuitive approach of detecting pancreatic ductal adenocarcinoma (PDAC), the most common type of pancreatic cancer, by checking abdominal CT scans.
no code implementations • 30 Jun 2018 • Bingzhe Wu, Xiaolu Zhang, Shiwan Zhao, Lingxi Xie, Caihong Zeng, Zhihong Liu, Guangyu Sun
Given an input image from a specified stain, several generators are first applied to estimate its appearances in other staining methods, and a classifier follows to combine visual cues from different stains for prediction (whether it is pathological, or which type of pathology it has).
no code implementations • 15 May 2018 • Chenglin Yang, Lingxi Xie, Siyuan Qiao, Alan Yuille
We focus on the problem of training a deep neural network in generations.
no code implementations • 27 Apr 2018 • Fengze Liu, Lingxi Xie, Yingda Xia, Elliot K. Fishman, Alan L. Yuille
Shape representation and classification are performed in a joint manner, both to exploit the knowledge that PDAC often changes the shape of the pancreas and to prevent over-fitting.
no code implementations • ECCV 2018 • Yan Wang, Lingxi Xie, Siyuan Qiao, Ya Zhang, Wenjun Zhang, Alan L. Yuille
Convolution is spatially-symmetric, i.e., the visual features are independent of their position in the image, which limits its ability to utilize contextual cues for visual recognition.
no code implementations • 2 Apr 2018 • Yingda Xia, Lingxi Xie, Fengze Liu, Zhuotun Zhu, Elliot K. Fishman, Alan L. Yuille
There has been a debate on whether to use 2D or 3D deep neural networks for volumetric organ segmentation.
no code implementations • 1 Apr 2018 • Qi Chen, Weichao Qiu, Yi Zhang, Lingxi Xie, Alan Yuille
But this raises an important problem in active vision: given an infinite data space, how to effectively sample a finite subset to train a visual classifier?
no code implementations • CVPR 2019 • Xiaohui Zeng, Chenxi Liu, Yu-Siang Wang, Weichao Qiu, Lingxi Xie, Yu-Wing Tai, Chi Keung Tang, Alan L. Yuille
Though image-space adversaries can be interpreted as per-pixel albedo change, we verify that they cannot be well explained along these physically meaningful dimensions, which often have a non-local effect.
no code implementations • 13 Nov 2017 • Jianyu Wang, Zhishuai Zhang, Cihang Xie, Yuyin Zhou, Vittal Premachandran, Jun Zhu, Lingxi Xie, Alan Yuille
We use clustering algorithms to study the population activities of the features and extract a set of visual concepts which we show are visually tight and correspond to semantic parts of vehicles.
no code implementations • CVPR 2018 • Zhishuai Zhang, Cihang Xie, Jian-Yu Wang, Lingxi Xie, Alan L. Yuille
The first layer extracts the evidence of local visual cues, and the second layer performs a voting mechanism by utilizing the spatial relationship between visual cues and semantic parts.
2 code implementations • CVPR 2018 • Qihang Yu, Lingxi Xie, Yan Wang, Yuyin Zhou, Elliot K. Fishman, Alan L. Yuille
The key innovation is a saliency transformation module, which repeatedly converts the segmentation probability map from the previous iteration as spatial weights and applies these weights to the current iteration.
Ranked #1 on Pancreas Segmentation on TCIA Pancreas-CT Dataset
no code implementations • 25 Jul 2017 • Jianyu Wang, Cihang Xie, Zhishuai Zhang, Jun Zhu, Lingxi Xie, Alan Yuille
Our approach detects semantic parts by accumulating the confidence of local visual cues.
no code implementations • 22 Jun 2017 • Yuyin Zhou, Lingxi Xie, Elliot K. Fishman, Alan L. Yuille
Inspired by the high relevance between the location of a pancreas and its cystic region, we introduce extra deep supervision into the segmentation network, so that cyst segmentation can be improved with the help of relatively easier pancreas segmentation.
2 code implementations • ICCV 2017 • Cihang Xie, Jian-Yu Wang, Zhishuai Zhang, Yuyin Zhou, Lingxi Xie, Alan Yuille
Our observation is that both segmentation and detection are based on classifying multiple targets on an image (e.g., the basic target is a pixel or a receptive field in segmentation, and an object proposal in detection), which inspires us to optimize a loss function over a set of pixels/proposals for generating adversarial perturbations.
no code implementations • ICCV 2017 • Yan Wang, Lingxi Xie, Chenxi Liu, Ya Zhang, Wenjun Zhang, Alan Yuille
In this paper, we reveal the importance and benefits of introducing second-order operations into deep neural networks.
1 code implementation • ICCV 2017 • Lingxi Xie, Alan Yuille
The deep Convolutional Neural Network (CNN) is the state-of-the-art solution for large-scale visual recognition.
no code implementations • 3 Mar 2017 • Yan Wang, Lingxi Xie, Ya Zhang, Wenjun Zhang, Alan Yuille
We formulate the function of a convolutional layer as learning a large visual vocabulary, and propose an alternative way, namely Deep Collaborative Learning (DCL), to reduce the computational complexity.
3 code implementations • 25 Dec 2016 • Yuyin Zhou, Lingxi Xie, Wei Shen, Yan Wang, Elliot K. Fishman, Alan L. Yuille
Deep neural networks have been widely adopted for automatic organ segmentation from abdominal CT scans.
1 code implementation • 20 Nov 2016 • Zhuotun Zhu, Lingxi Xie, Alan L. Yuille
While recent deep neural networks have achieved a promising performance on object recognition, they rely implicitly on the visual contents of the whole image.
no code implementations • 21 Jul 2016 • Lingxi Xie, Qi Tian, John Flynn, Jingdong Wang, Alan Yuille
For this, we consider the neurons in the hidden layer as neural words, and construct a set of geometric neural phrases on top of them.
2 code implementations • CVPR 2016 • Lingxi Xie, Jingdong Wang, Zhen Wei, Meng Wang, Qi Tian
For a long time, we have been combating over-fitting in the CNN training process with model regularization, including weight decay, model averaging, data augmentation, etc.
no code implementations • CVPR 2016 • Lingxi Xie, Liang Zheng, Jingdong Wang, Alan Yuille, Qi Tian
An increasing number of computer vision tasks can be tackled with deep features, which are the intermediate outputs of a pre-trained Convolutional Neural Network.
no code implementations • ICCV 2015 • Lingxi Xie, Jingdong Wang, Weiyao Lin, Bo Zhang, Qi Tian
In many fine-grained object recognition datasets, image orientation (left/right) might vary from sample to sample.
no code implementations • 21 Nov 2015 • Xuan Dong, Yu Zhu, Weixin Li, Lingxi Xie, Alex Wong, Alan Yuille
In this paper, we propose to use both fidelity (the difference from the original images) and naturalness (human visual perception of super-resolved images) for evaluation.
no code implementations • CVPR 2014 • Lingxi Xie, Jingdong Wang, Baining Guo, Bo Zhang, Qi Tian
The novelty lies in that OPM uses the 3D orientations to form the pyramid and produce the pooling regions, which is unlike SPM that uses the spatial positions to form the pyramid.