Search Results for author: Yue Cao

Found 89 papers, 53 papers with code

Revisiting Pivot-Based Paraphrase Generation: Language Is Not the Only Optional Pivot

no code implementations EMNLP 2021 Yitao Cai, Yue Cao, Xiaojun Wan

Concretely, we transform a sentence into a variety of semantic or syntactic representations (including AMR, UD, and a latent semantic representation), and then decode the sentence back from these representations.

Paraphrase Generation Sentence

Ax-to-Grind Urdu: Benchmark Dataset for Urdu Fake News Detection

1 code implementation 20 Mar 2024 Sheetal Harris, Jinshuo Liu, Hassan Jalil Hadi, Yue Cao

In this paper, we curate and contribute Ax-to-Grind Urdu, the first large-scale publicly available dataset for Urdu FND, to bridge the gaps and limitations of existing Urdu datasets in the literature.

Fact Checking Fake News Detection +1

Correlation-Embedded Transformer Tracking: A Single-Branch Framework

1 code implementation 23 Jan 2024 Fei Xie, Wankou Yang, Chunyu Wang, Lei Chu, Yue Cao, Chao Ma, Wenjun Zeng

Thus, we reformulate the two-branch Siamese tracking as a conceptually simple, fully transformer-based Single-Branch Tracking pipeline, dubbed SBT.

Feature Correlation Visual Object Tracking

CapsFusion: Rethinking Image-Text Data at Scale

1 code implementation CVPR 2024 Qiying Yu, Quan Sun, Xiaosong Zhang, Yufeng Cui, Fan Zhang, Yue Cao, Xinlong Wang, Jingjing Liu

To provide higher-quality and more scalable multimodal pretraining data, we propose CapsFusion, an advanced framework that leverages large language models to consolidate and refine information from both web-based image-text pairs and synthetic captions.

World Knowledge

IRAD: Implicit Representation-driven Image Resampling against Adversarial Attacks

1 code implementation 18 Oct 2023 Yue Cao, Tianlin Li, Xiaofeng Cao, Ivor Tsang, Yang Liu, Qing Guo

The underlying rationale behind our idea is that image resampling can alleviate the influence of adversarial perturbations while preserving essential semantic information, thereby conferring an inherent advantage in defending against adversarial attacks.

Adversarial Robustness

Ground Manipulator Primitive Tasks to Executable Actions using Large Language Models

no code implementations 13 Aug 2023 Yue Cao, C. S. George Lee

In order to tackle this challenge, we propose a novel approach to ground the manipulator primitive tasks to robot low-level actions using large language models (LLMs).


Continual Learners are Incremental Model Generalizers

no code implementations 21 Jun 2023 Jaehong Yoon, Sung Ju Hwang, Yue Cao

We believe this paper breaks the barriers between pre-training and fine-tuning steps and leads to a sustainable learning framework in which the continual learner incrementally improves model generalization, yielding better transfer to unseen tasks.

Continual Learning

Multimodal Object Detection by Channel Switching and Spatial Attention

no code implementations Conference on Computer Vision and Pattern Recognition (CVPR) 2023 Yue Cao, Junchi Bin, Jozsef Hamari, Erik Blasch, Zheng Liu

Multimodal object detection has attracted great attention in recent years since the information specific to different modalities can complement each other and effectively improve the accuracy and stability of the detection model.

Multispectral Object Detection object-detection +2

On the Robustness of Segment Anything

no code implementations 25 May 2023 Yihao Huang, Yue Cao, Tianlin Li, Felix Juefei-Xu, Di Lin, Ivor W. Tsang, Yang Liu, Qing Guo

Second, we extend representative adversarial attacks against SAM and study the influence of different prompts on robustness.

Autonomous Vehicles valid

A robust design of time-varying internal model principle-based control for ultra-precision tracking in a direct-drive servo stage

no code implementations 13 Apr 2023 Yue Cao, Zhen Zhang

By means of the ESO feedback, the plant model is kept as nominal, and hence the structural robustness is achieved for the time-varying internal model.

Robust Design

SegGPT: Segmenting Everything In Context

1 code implementation 6 Apr 2023 Xinlong Wang, Xiaosong Zhang, Yue Cao, Wen Wang, Chunhua Shen, Tiejun Huang

We unify various segmentation tasks into a generalist in-context learning framework that accommodates different kinds of segmentation data by transforming them into the same format of images.
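The "same format of images" idea can be sketched in a few lines: a label mask becomes an ordinary image once class ids are mapped to colors (the random palette and toy sizes below are illustrative, not the paper's exact scheme):

```python
import numpy as np

rng = np.random.default_rng(0)
mask = rng.integers(0, 3, size=(8, 8))   # toy segmentation mask with 3 classes
palette = rng.uniform(size=(3, 3))       # one random RGB color per class
mask_as_image = palette[mask]            # (8, 8, 3): the label mask is now an image

# Prediction then becomes ordinary image generation; predicted colors map back
# to class ids via nearest palette entry.
print(mask_as_image.shape)
```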

Ranked #1 on Few-Shot Semantic Segmentation on PASCAL-5i (5-Shot) (using extra training data)

Few-Shot Semantic Segmentation In-Context Learning +5

Zero-Shot Video Editing Using Off-The-Shelf Image Diffusion Models

1 code implementation 30 Mar 2023 Wen Wang, Yan Jiang, Kangyang Xie, Zide Liu, Hao Chen, Yue Cao, Xinlong Wang, Chunhua Shen

Our vid2vid-zero leverages off-the-shelf image diffusion models, and doesn't require training on any video.

Image Generation Video Alignment +1

EVA-CLIP: Improved Training Techniques for CLIP at Scale

4 code implementations 27 Mar 2023 Quan Sun, Yuxin Fang, Ledell Wu, Xinlong Wang, Yue Cao

Our approach incorporates new techniques for representation learning, optimization, and augmentation, enabling EVA-CLIP to achieve superior performance compared to previous CLIP models with the same number of parameters but significantly smaller training costs.

Image Classification Representation Learning +2

EVA-02: A Visual Representation for Neon Genesis

6 code implementations 20 Mar 2023 Yuxin Fang, Quan Sun, Xinggang Wang, Tiejun Huang, Xinlong Wang, Yue Cao

We launch EVA-02, a next-generation Transformer-based visual representation pre-trained to reconstruct strong and robust language-aligned vision features via masked image modeling.

One Transformer Fits All Distributions in Multi-Modal Diffusion at Scale

3 code implementations 12 Mar 2023 Fan Bao, Shen Nie, Kaiwen Xue, Chongxuan Li, Shi Pu, Yaole Wang, Gang Yue, Yue Cao, Hang Su, Jun Zhu

Inspired by the unified view, UniDiffuser learns all distributions simultaneously with a minimal modification to the original diffusion model -- perturbs data in all modalities instead of a single modality, inputs individual timesteps in different modalities, and predicts the noise of all modalities instead of a single modality.
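A minimal sketch of the perturbation scheme, assuming standard DDPM forward noising and toy 1-D latents (the schedule constants, shapes, and names are illustrative, not UniDiffuser's actual configuration):

```python
import numpy as np

rng = np.random.default_rng(0)
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alpha_bar = np.cumprod(1.0 - betas)       # standard DDPM noise schedule

def perturb(x, t):
    """Forward-noise x to timestep t; returns the noisy sample and the noise."""
    eps = rng.normal(size=x.shape)
    return np.sqrt(alpha_bar[t]) * x + np.sqrt(1.0 - alpha_bar[t]) * eps, eps

x_img, x_txt = rng.normal(size=16), rng.normal(size=8)   # toy image / text latents

# Each modality gets its own timestep. Special cases recover familiar tasks:
# (t_img, 0) is text-conditioned image generation, (0, t_txt) is captioning.
t_img, t_txt = rng.integers(0, T), rng.integers(0, T)
z_img, eps_img = perturb(x_img, t_img)
z_txt, eps_txt = perturb(x_txt, t_txt)
# A single network takes (z_img, z_txt, t_img, t_txt) and predicts (eps_img, eps_txt).
print(z_img.shape, z_txt.shape)
```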

Text-to-Image Generation

Robot Behavior-Tree-Based Task Generation with Large Language Models

no code implementations 24 Feb 2023 Yue Cao, C. S. George Lee

To cope with this issue, we propose a novel behavior-tree-based task generation approach that utilizes state-of-the-art large language models.

Revisiting Discriminative vs. Generative Classifiers: Theory and Implications

1 code implementation 5 Feb 2023 Chenyu Zheng, Guoqiang Wu, Fan Bao, Yue Cao, Chongxuan Li, Jun Zhu

Theoretically, the paper considers the surrogate loss instead of the zero-one loss in analyses and generalizes the classical results from binary cases to multiclass ones.

Few-Shot Learning Image Classification +2

SegGPT: Towards Segmenting Everything in Context

no code implementations ICCV 2023 Xinlong Wang, Xiaosong Zhang, Yue Cao, Wen Wang, Chunhua Shen, Tiejun Huang

We unify various segmentation tasks into a generalist in-context learning framework that accommodates different kinds of segmentation data by transforming them into the same format of images.

Few-Shot Semantic Segmentation In-Context Learning +4

Improving CLIP Fine-tuning Performance

1 code implementation ICCV 2023 Yixuan Wei, Han Hu, Zhenda Xie, Ze Liu, Zheng Zhang, Yue Cao, Jianmin Bao, Dong Chen, Baining Guo

Experiments suggest that the feature map distillation approach significantly boosts the fine-tuning performance of CLIP models on several typical downstream vision tasks.

object-detection Object Detection +1

iCLIP: Bridging Image Classification and Contrastive Language-Image Pre-Training for Visual Recognition

no code implementations CVPR 2023 Yixuan Wei, Yue Cao, Zheng Zhang, Houwen Peng, Zhuliang Yao, Zhenda Xie, Han Hu, Baining Guo

This paper presents a method that effectively combines two prevalent visual recognition methods, i.e., image classification and contrastive language-image pre-training, dubbed iCLIP.

Classification Image Classification +2

Physics-Guided ISO-Dependent Sensor Noise Modeling for Extreme Low-Light Photography

1 code implementation CVPR 2023 Yue Cao, Ming Liu, Shuai Liu, Xiaotao Wang, Lei Lei, WangMeng Zuo

Although deep neural networks have achieved astonishing performance in many vision tasks, existing learning-based methods are far inferior to the physical model-based solutions in extreme low-light sensor noise modeling.

Image Denoising

Deep Incubation: Training Large Models by Divide-and-Conquering

3 code implementations ICCV 2023 Zanlin Ni, Yulin Wang, Jiangwei Yu, Haojun Jiang, Yue Cao, Gao Huang

In this paper, we present Deep Incubation, a novel approach that enables the efficient and effective training of large models by dividing them into smaller sub-modules that can be trained separately and assembled seamlessly.

Image Segmentation object-detection +2

Images Speak in Images: A Generalist Painter for In-Context Visual Learning

1 code implementation CVPR 2023 Xinlong Wang, Wen Wang, Yue Cao, Chunhua Shen, Tiejun Huang

In this work, we present Painter, a generalist model which addresses these obstacles with an "image"-centric solution, that is, to redefine the output of core vision tasks as images, and specify task prompts as also images.

In-Context Learning Keypoint Detection +2

Could Giant Pretrained Image Models Extract Universal Representations?

no code implementations 3 Nov 2022 Yutong Lin, Ze Liu, Zheng Zhang, Han Hu, Nanning Zheng, Stephen Lin, Yue Cao

In this paper, we present a study of frozen pretrained models when applied to diverse and representative computer vision tasks, including object detection, semantic segmentation and video action recognition.

Action Recognition In Videos Instance Segmentation +5

All are Worth Words: A ViT Backbone for Diffusion Models

3 code implementations CVPR 2023 Fan Bao, Shen Nie, Kaiwen Xue, Yue Cao, Chongxuan Li, Hang Su, Jun Zhu

We evaluate U-ViT in unconditional and class-conditional image generation, as well as text-to-image generation tasks, where U-ViT is comparable if not superior to a CNN-based U-Net of a similar size.

Conditional Image Generation Text-to-Image Generation

Geo-Spatio-Temporal Information Based 3D Cooperative Positioning in LOS/NLOS Mixed Environments

no code implementations 2 Sep 2022 Yue Cao, Shaoshi Yang, Zhiyong Feng

We propose a geographic and spatio-temporal information based distributed cooperative positioning (GSTICP) algorithm for wireless networks that require three-dimensional (3D) coordinates and operate in line-of-sight (LOS) and non-line-of-sight (NLOS) mixed environments.

Distributed Spatio-Temporal Information Based Cooperative 3D Positioning in GNSS-Denied Environments

no code implementations 25 Aug 2022 Yue Cao, Shaoshi Yang, Zhiyong Feng, Lihua Wang, Lajos Hanzo

A distributed spatio-temporal information based cooperative positioning (STICP) algorithm is proposed for wireless networks that require three-dimensional (3D) coordinates and operate in the global navigation satellite system (GNSS) denied environments.

Contrastive Learning Rivals Masked Image Modeling in Fine-tuning via Feature Distillation

1 code implementation 27 May 2022 Yixuan Wei, Han Hu, Zhenda Xie, Zheng Zhang, Yue Cao, Jianmin Bao, Dong Chen, Baining Guo

These properties, which we aggregately refer to as optimization friendliness, are identified and analyzed by a set of attention- and optimization-related diagnosis tools.

Ranked #2 on Instance Segmentation on COCO test-dev (using extra training data)

Contrastive Learning Image Classification +5

Revealing the Dark Secrets of Masked Image Modeling

1 code implementation CVPR 2023 Zhenda Xie, Zigang Geng, Jingcheng Hu, Zheng Zhang, Han Hu, Yue Cao

In this paper, we compare MIM with the long-dominant supervised pre-trained models from two perspectives, the visualizations and the experiments, to uncover their key representational differences.

Diversity Inductive Bias +4

Sampling Is All You Need on Modeling Long-Term User Behaviors for CTR Prediction

1 code implementation 20 May 2022 Yue Cao, Xiaojiang Zhou, Jiaqi Feng, Peihao Huang, Yao Xiao, Dayao Chen, Sheng Chen

However, the retrieval-based methods are sub-optimal and inevitably cause information loss, and it is difficult to balance the effectiveness and efficiency of the retrieval algorithm.

Click-Through Rate Prediction Retrieval

iCAR: Bridging Image Classification and Image-text Alignment for Visual Recognition

no code implementations 22 Apr 2022 Yixuan Wei, Yue Cao, Zheng Zhang, Zhuliang Yao, Zhenda Xie, Han Hu, Baining Guo

Second, we convert the image classification problem from learning parametric category classifier weights to learning a text encoder as a meta network to generate category classifier weights.

Action Recognition Classification +7

Incorporating Semi-Supervised and Positive-Unlabeled Learning for Boosting Full Reference Image Quality Assessment

no code implementations CVPR 2022 Yue Cao, Zhaolin Wan, Dongwei Ren, Zifei Yan, WangMeng Zuo

Particularly, by treating all labeled data as positive samples, PU learning is leveraged to identify negative samples (i.e., outliers) from unlabeled data.

Image Quality Assessment

Enhanced Contour Tracking: a Time-Varying Internal Model Principle-Based Approach

no code implementations 23 Mar 2022 Yue Cao, Zhen Zhang

The proposed TV-IMCC is twofold, including an extended position domain framework with master-slave structures for contour regulation, and a time-varying internal model principle-based controller for each axial tracking precision improvement.


Decipher soil organic carbon dynamics and driving forces across China using machine learning

no code implementations Global Change Biology 2022 Huiwen Li, Yiping Wu, Shuguang Liu, Jingfeng Xiao, Wenzhi Zhao, Ji Chen, Georgii Alexandrov, Yue Cao

Additionally, the national cropland topsoil organic carbon has increased at a rate of 23.6 ± 7.6 g C m−2 yr−1 since the 1980s, and the widely applied nitrogenous fertilizer was a key stimulus.

Correlation-Aware Deep Tracking

1 code implementation CVPR 2022 Fei Xie, Chunyu Wang, Guangting Wang, Yue Cao, Wankou Yang, Wenjun Zeng

In contrast to the Siamese-like feature extraction, our network deeply embeds cross-image feature correlation in multiple layers of the feature network.

Feature Correlation Visual Object Tracking

Self-supervised Learning from 100 Million Medical Images

no code implementations 4 Jan 2022 Florin C. Ghesu, Bogdan Georgescu, Awais Mansoor, Youngjin Yoo, Dominik Neumann, Pragneshkumar Patel, R. S. Vishwanath, James M. Balter, Yue Cao, Sasa Grbic, Dorin Comaniciu

Building accurate and robust artificial intelligence systems for medical image assessment requires not only the research and design of advanced deep learning models but also the creation of large and curated sets of annotated training examples.

Computed Tomography (CT) Contrastive Learning +1

A Simple Baseline for Open-Vocabulary Semantic Segmentation with Pre-trained Vision-language Model

2 code implementations 29 Dec 2021 Mengde Xu, Zheng Zhang, Fangyun Wei, Yutong Lin, Yue Cao, Han Hu, Xiang Bai

However, semantic segmentation and the CLIP model perform on different visual granularity, that semantic segmentation processes on pixels while CLIP performs on images.

Image Classification Language Modelling +8

SimMIM: A Simple Framework for Masked Image Modeling

4 code implementations CVPR 2022 Zhenda Xie, Zheng Zhang, Yue Cao, Yutong Lin, Jianmin Bao, Zhuliang Yao, Qi Dai, Han Hu

We also leverage this approach to facilitate the training of a 3B model (SwinV2-G) that, using $40\times$ less data than in previous practice, achieves the state-of-the-art on four representative vision benchmarks.
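The masked-image-modeling recipe behind SimMIM — mask random patches and regress the raw pixels of the masked region with an L1 loss — can be sketched as follows (patch size, mask ratio, and the zero "prediction" are illustrative stand-ins for the real settings and network):

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.uniform(size=(32, 32))          # toy single-channel "image"
patch = 8
nh, nw = img.shape[0] // patch, img.shape[1] // patch

mask = rng.uniform(size=(nh, nw)) < 0.6   # mask a large fraction of patches
masked = img.copy()
for i in range(nh):
    for j in range(nw):
        if mask[i, j]:
            masked[i*patch:(i+1)*patch, j*patch:(j+1)*patch] = 0.0

# A real encoder + prediction head would map `masked` to pixel predictions;
# the training loss is L1 on the masked region only. Zeros stand in here.
pred = np.zeros_like(img)
mask_px = np.kron(mask, np.ones((patch, patch))).astype(bool)
l1_loss = np.abs(pred - img)[mask_px].mean()
print(round(float(l1_loss), 4))
```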

Representation Learning Self-Supervised Image Classification +1

Swin Transformer V2: Scaling Up Capacity and Resolution

19 code implementations CVPR 2022 Ze Liu, Han Hu, Yutong Lin, Zhuliang Yao, Zhenda Xie, Yixuan Wei, Jia Ning, Yue Cao, Zheng Zhang, Li Dong, Furu Wei, Baining Guo

Three main techniques are proposed: 1) a residual-post-norm method combined with cosine attention to improve training stability; 2) a log-spaced continuous position bias method to effectively transfer models pre-trained using low-resolution images to downstream tasks with high-resolution inputs; 3) a self-supervised pre-training method, SimMIM, to reduce the need for vast amounts of labeled images.
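The scaled cosine attention in 1) can be sketched in a few lines of numpy; the fixed temperature `tau` stands in for the paper's learnable per-head scalar, and all shapes and names are illustrative:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cosine_attention(q, k, v, tau=0.1):
    """Attention logits are cosine similarities divided by a temperature tau,
    so they are bounded by 1/tau regardless of feature magnitude."""
    qn = q / np.linalg.norm(q, axis=-1, keepdims=True)
    kn = k / np.linalg.norm(k, axis=-1, keepdims=True)
    return softmax(qn @ kn.T / tau) @ v

rng = np.random.default_rng(0)
q, k, v = rng.normal(size=(4, 8)), rng.normal(size=(6, 8)), rng.normal(size=(6, 8))
out = cosine_attention(q, k, v)
print(out.shape)  # (4, 8)
```

Bounding the logits this way avoids the extremely large attention values that dot-product attention can produce in very large models.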

Ranked #4 on Image Classification on ImageNet V2 (using extra training data)

Action Classification Image Classification +3

Bootstrap Your Object Detector via Mixed Training

1 code implementation NeurIPS 2021 Mengde Xu, Zheng Zhang, Fangyun Wei, Yutong Lin, Yue Cao, Stephen Lin, Han Hu, Xiang Bai

We introduce MixTraining, a new training paradigm for object detection that can improve the performance of existing detectors for free.

Data Augmentation Missing Labels +3

Deep Reinforcement Learning-Based Long-Range Autonomous Valet Parking for Smart Cities

no code implementations 23 Sep 2021 Muhammad Khalid, Liang Wang, Kezhi Wang, Cunhua Pan, Nauman Aslam, Yue Cao

In this paper, to reduce the congestion rate at the city center and increase the quality of experience (QoE) of each user, the framework of long-range autonomous valet parking (LAVP) is presented, in which an Autonomous Vehicle (AV) deployed in the city picks up and drops off users at their required spots, and then drives to a car park outside the city center autonomously.

reinforcement-learning Reinforcement Learning (RL)

Video Swin Transformer

14 code implementations CVPR 2022 Ze Liu, Jia Ning, Yue Cao, Yixuan Wei, Zheng Zhang, Stephen Lin, Han Hu

The vision community is witnessing a modeling shift from CNNs to Transformers, where pure Transformer architectures have attained top accuracy on the major video recognition benchmarks.

Ranked #28 on Action Classification on Kinetics-600 (using extra training data)

Action Classification Action Recognition +5

Fold2Seq: A Joint Sequence(1D)-Fold(3D) Embedding-based Generative Model for Protein Design

1 code implementation 24 Jun 2021 Yue Cao, Payel Das, Vijil Chenthamarakshan, Pin-Yu Chen, Igor Melnyk, Yang Shen

Designing novel protein sequences for a desired 3D topological fold is a fundamental yet non-trivial task in protein engineering.

Protein Design

Continual Learning for Neural Machine Translation

no code implementations NAACL 2021 Yue Cao, Hao-Ran Wei, Boxing Chen, Xiaojun Wan

In practical applications, NMT models are usually trained on a general domain corpus and then fine-tuned by continuing training on the in-domain corpus.

Continual Learning Knowledge Distillation +3

Prediction of Prognosis and Survival of Patients with Gastric Cancer by Weighted Improved Random Forest Model

no code implementations Archives of Medical Science 2021 Cheng Xu, Jing Wang, TianLong Zheng, Yue Cao, Fan Ye

Among the 10 public datasets, the accuracy-weighted Random Forest has the best performance on 6 datasets, with an average increase of 1.44% in accuracy and an average increase of 1.2% in AUC.


Group-Free 3D Object Detection via Transformers

4 code implementations ICCV 2021 Ze Liu, Zheng Zhang, Yue Cao, Han Hu, Xin Tong

Instead of grouping local points to each object candidate, our method computes the feature of an object from all the points in the point cloud with the help of an attention mechanism in the Transformer, where the contribution of each point is automatically learned in the network training.

3D Object Detection Object +1

Pseudo-ISP: Learning Pseudo In-camera Signal Processing Pipeline from A Color Image Denoiser

1 code implementation 18 Mar 2021 Yue Cao, Xiaohe Wu, Shuran Qi, Xiao Liu, Zhongqin Wu, WangMeng Zuo

To begin with, the pre-trained denoiser is used to generate the pseudo clean images for the test images.


ParaSCI: A Large Scientific Paraphrase Dataset for Longer Paraphrase Generation

1 code implementation EACL 2021 Qingxiu Dong, Xiaojun Wan, Yue Cao

We propose ParaSCI, the first large-scale paraphrase dataset in the scientific field, including 33,981 paraphrase pairs from ACL (ParaSCI-ACL) and 316,063 pairs from arXiv (ParaSCI-arXiv).

Diversity Paraphrase Generation

Leveraging Batch Normalization for Vision Transformers

no code implementations ICCVW 2021 Zhuliang Yao, Yue Cao, Yutong Lin, Ze Liu, Zheng Zhang, Han Hu

Transformer-based vision architectures have attracted great attention because of the strong performance over the convolutional neural networks (CNNs).

Bayesian Learning to Optimize: Quantifying the Optimizer Uncertainty

no code implementations 1 Jan 2021 Yue Cao, Tianlong Chen, Zhangyang Wang, Yang Shen

Optimizing an objective function with uncertainty awareness is well-known to improve the accuracy and confidence of optimization solutions.

Image Classification Uncertainty Quantification +1

Global Context Networks

3 code implementations 24 Dec 2020 Yue Cao, Jiarui Xu, Stephen Lin, Fangyun Wei, Han Hu

The Non-Local Network (NLNet) presents a pioneering approach for capturing long-range dependencies within an image, via aggregating query-specific global context to each query position.

Instance Segmentation Object Detection

DR 21 South Filament: a Parsec-sized Dense Gas Accretion Flow onto the DR 21 Massive Young Cluster

no code implementations 4 Dec 2020 Bo Hu, Keping Qiu, Yue Cao, Junhao Liu, Yuwei Wang, Guangxing Li, Zhiqiang Shen, Juan Li, Junzhi Wang, Bin Li, Jian Dong

DR21 south filament (DR21SF) is a unique component of the giant network of filamentary molecular clouds in the north region of Cygnus X complex.

Astrophysics of Galaxies

Propagate Yourself: Exploring Pixel-Level Consistency for Unsupervised Visual Representation Learning

7 code implementations CVPR 2021 Zhenda Xie, Yutong Lin, Zheng Zhang, Yue Cao, Stephen Lin, Han Hu

We argue that the power of contrastive learning has yet to be fully unleashed, as current methods are trained only on instance-level pretext tasks, leading to representations that may be sub-optimal for downstream tasks requiring dense pixel predictions.

Contrastive Learning object-detection +3

Progressive Training of Multi-level Wavelet Residual Networks for Image Denoising

2 code implementations 23 Oct 2020 Yali Peng, Yue Cao, Shigang Liu, Jian Yang, WangMeng Zuo

To cope with this issue, this paper presents a multi-level wavelet residual network (MWRN) architecture as well as a progressive training (PTMWRN) scheme to improve image denoising performance.

Image Denoising

Unpaired Learning of Deep Image Denoising

2 code implementations ECCV 2020 Xiaohe Wu, Ming Liu, Yue Cao, Dongwei Ren, WangMeng Zuo

As for knowledge distillation, we first apply the learned noise models to clean images to synthesize a paired set of training images, and use the real noisy images and the corresponding denoising results in the first stage to form another paired set.
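The first-stage synthesis step — apply a learned noise model to clean images to obtain paired training data — might look like the following sketch, with a hypothetical Poisson-Gaussian-style noise model standing in for the learned one:

```python
import numpy as np

rng = np.random.default_rng(0)
clean = rng.uniform(size=(4, 16, 16))    # a small batch of "clean" images in [0, 1]

def apply_noise_model(img, sigma_read=0.02, gain=0.01):
    """Hypothetical signal-dependent noise model: variance grows with pixel
    intensity on top of a constant read-noise floor."""
    std = np.sqrt(gain * img + sigma_read ** 2)
    return np.clip(img + rng.normal(size=img.shape) * std, 0.0, 1.0)

noisy = apply_noise_model(clean)
pairs = list(zip(noisy, clean))          # synthetic (noisy, clean) pairs for distillation
print(len(pairs), noisy.shape)
```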

Image Denoising Knowledge Distillation +1

RepPoints V2: Verification Meets Regression for Object Detection

1 code implementation NeurIPS 2020 Yihong Chen, Zheng Zhang, Yue Cao, Li-Wei Wang, Stephen Lin, Han Hu

Though RepPoints provides high performance, we find that its heavy reliance on regression for object localization leaves room for improvement.

Instance Segmentation Object +6

Quantifying and Leveraging Predictive Uncertainty for Medical Image Assessment

no code implementations 8 Jul 2020 Florin C. Ghesu, Bogdan Georgescu, Awais Mansoor, Youngjin Yoo, Eli Gibson, R. S. Vishwanath, Abishek Balachandran, James M. Balter, Yue Cao, Ramandeep Singh, Subba R. Digumarthy, Mannudeep K. Kalra, Sasa Grbic, Dorin Comaniciu

In our experiments we demonstrate that sample rejection based on the predicted uncertainty can significantly improve the ROC-AUC for various tasks, e.g., by 8% to 0.91 with an expected rejection rate of under 25% for the classification of different abnormalities in chest radiographs.
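A toy illustration of uncertainty-based sample rejection (synthetic scores and an oracle uncertainty; the 8% / 0.91 figures come from the paper's chest-radiograph experiments, not from this sketch — here the retained subset typically scores higher simply because the noisiest samples are dropped):

```python
import numpy as np

def roc_auc(y, s):
    """Rank-based (Mann-Whitney) AUC, assuming no tied scores."""
    order = np.argsort(s)
    ranks = np.empty(len(s))
    ranks[order] = np.arange(1, len(s) + 1)
    n_pos = int((y == 1).sum())
    n_neg = len(y) - n_pos
    return (ranks[y == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

rng = np.random.default_rng(0)
n = 4000
y = rng.integers(0, 2, n)                      # binary labels
noise_scale = rng.uniform(0.1, 3.0, n)         # per-sample difficulty
scores = y + rng.normal(0.0, noise_scale)      # noisier samples are harder to classify
uncertainty = noise_scale                      # oracle: uncertainty tracks difficulty

keep = uncertainty <= np.quantile(uncertainty, 0.75)   # reject the 25% most uncertain
auc_all = roc_auc(y, scores)
auc_keep = roc_auc(y[keep], scores[keep])
print(round(auc_all, 3), round(auc_keep, 3))
```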

Anatomy Classification +1

A Closer Look at Local Aggregation Operators in Point Cloud Analysis

1 code implementation ECCV 2020 Ze Liu, Han Hu, Yue Cao, Zheng Zhang, Xin Tong

Our investigation reveals that despite the different designs of these operators, all of these operators make surprisingly similar contributions to the network performance under the same network input and feature numbers and result in the state-of-the-art accuracy on standard benchmarks.

3D Semantic Segmentation

Jointly Learning to Align and Summarize for Neural Cross-Lingual Summarization

no code implementations ACL 2020 Yue Cao, Hui Liu, Xiaojun Wan

However, it is a big challenge for the model to directly learn cross-lingual summarization as it requires learning to understand different languages and learning how to summarize at the same time.

Cross-Lingual Transfer

Disentangled Non-Local Neural Networks

5 code implementations ECCV 2020 Minghao Yin, Zhuliang Yao, Yue Cao, Xiu Li, Zheng Zhang, Stephen Lin, Han Hu

This paper first studies the non-local block in depth, where we find that its attention computation can be split into two terms, a whitened pairwise term accounting for the relationship between two pixels and a unary term representing the saliency of every pixel.
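The decomposition is exact under softmax, because the leftover cross terms are constant within each attention row; a small numpy check with toy shapes (names are illustrative):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
q = rng.normal(size=(5, 4))                 # query embeddings at 5 positions
k = rng.normal(size=(5, 4))                 # key embeddings at the same positions
mu_q, mu_k = q.mean(axis=0), k.mean(axis=0)

pairwise = (q - mu_q) @ (k - mu_k).T        # whitened pairwise term: pixel-pair relation
unary = np.broadcast_to(mu_q @ k.T, pairwise.shape)  # unary term: per-key saliency

# The remaining cross terms of q @ k.T are constant within each row, so they
# cancel in the row-wise softmax: the split reproduces dot-product attention.
standard = softmax(q @ k.T)
decomposed = softmax(pairwise + unary)
print(np.allclose(standard, decomposed))    # True
```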

Ranked #20 on Semantic Segmentation on Cityscapes test (using extra training data)

Action Recognition object-detection +2

Memory Enhanced Global-Local Aggregation for Video Object Detection

2 code implementations CVPR 2020 Yihong Chen, Yue Cao, Han Hu, Li-Wei Wang

We argue that there are two important cues for humans to recognize objects in videos: the global semantic information and the local localization information.

Object object-detection +1

Cross-Iteration Batch Normalization

2 code implementations CVPR 2021 Zhuliang Yao, Yue Cao, Shuxin Zheng, Gao Huang, Stephen Lin

We thus compensate for the network weight changes via a proposed technique based on Taylor polynomials, so that the statistics can be accurately estimated and batch normalization can be effectively applied.

Image Classification object-detection +1

Energy-based Graph Convolutional Networks for Scoring Protein Docking Models

no code implementations 28 Dec 2019 Yue Cao, Yang Shen

Moreover, estimating model quality, also known as the quality assessment problem, is rarely addressed in protein docking.

Learning to Optimize in Swarms

1 code implementation NeurIPS 2019 Yue Cao, Tianlong Chen, Zhangyang Wang, Yang Shen

Learning to optimize has emerged as a powerful framework for various optimization and machine learning tasks.

Spatial-Temporal Relation Networks for Multi-Object Tracking

no code implementations ICCV 2019 Jiarui Xu, Yue Cao, Zheng Zhang, Han Hu

Recent progress in multiple object tracking (MOT) has shown that a robust similarity score is key to the success of trackers.

Multi-Object Tracking Multiple Object Tracking +2

GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond

9 code implementations 25 Apr 2019 Yue Cao, Jiarui Xu, Stephen Lin, Fangyun Wei, Han Hu

In this paper, we take advantage of this finding to create a simplified network based on a query-independent formulation, which maintains the accuracy of NLNet but with significantly less computation.
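A minimal numpy sketch of a query-independent global context block in the spirit of GCNet: one attention map is shared by all positions, so the context is computed once and broadcast-added everywhere (the bottleneck here uses only ReLU; GCNet's LayerNorm and exact layout are omitted, and all names are illustrative):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def global_context_block(x, wk, wv1, wv2):
    """x: (N, C) features at N spatial positions."""
    attn = softmax(x @ wk)                 # (N,) single, query-independent attention map
    context = attn @ x                     # (C,) aggregated global context
    transformed = np.maximum(context @ wv1, 0.0) @ wv2   # bottleneck transform
    return x + transformed                 # broadcast-add the same context everywhere

rng = np.random.default_rng(0)
N, C, Cb = 16, 8, 4
x = rng.normal(size=(N, C))
out = global_context_block(x, rng.normal(size=C),
                           rng.normal(size=(C, Cb)), rng.normal(size=(Cb, C)))
print(out.shape)  # (16, 8)
```

Computing the attention map once instead of per query is what removes most of NLNet's computation.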

Instance Segmentation Object Detection +1

Jointly Learning Explainable Rules for Recommendation with Knowledge Graph

1 code implementation 9 Mar 2019 Weizhi Ma, Min Zhang, Yue Cao, Woojeong Jin, Chenyang Wang, Yiqun Liu, Shaoping Ma, Xiang Ren

The framework encourages two modules to complement each other in generating effective and explainable recommendation: 1) inductive rules, mined from item-centric knowledge graphs, summarize common multi-hop relational patterns for inferring different item associations and provide human-readable explanation for model prediction; 2) recommendation module can be augmented by induced rules and thus have better generalization ability dealing with the cold-start issue.

Explainable Recommendation Knowledge Graphs +1

Deep Triplet Quantization

1 code implementation 1 Feb 2019 Bin Liu, Yue Cao, Mingsheng Long, Jian-Min Wang, Jingdong Wang

We propose Deep Triplet Quantization (DTQ), a novel approach to learning deep quantization models from the similarity triplets.

Deep Hashing Image Retrieval +1

Bayesian active learning for optimization and uncertainty quantification in protein docking

1 code implementation 31 Jan 2019 Yue Cao, Yang Shen

To the best of our knowledge, this study represents the first uncertainty quantification solution for protein docking, with theoretical rigor and comprehensive assessment.

Active Learning Binary Classification +1

HashGAN: Deep Learning to Hash With Pair Conditional Wasserstein GAN

no code implementations CVPR 2018 Yue Cao, Bin Liu, Mingsheng Long, Jian-Min Wang

The main idea is to augment the training data with nearly real images synthesized from a new Pair Conditional Wasserstein GAN (PC-WGAN) conditioned on the pairwise similarity information.

Image Retrieval Representation Learning +1

Deep Cauchy Hashing for Hamming Space Retrieval

no code implementations CVPR 2018 Yue Cao, Mingsheng Long, Bin Liu, Jian-Min Wang

Due to its computation efficiency and retrieval quality, hashing has been widely applied to approximate nearest neighbor search for large-scale image retrieval, while deep hashing further improves the retrieval quality by end-to-end representation learning and hash coding.

Deep Hashing Image Retrieval +1

Deep Visual-Semantic Quantization for Efficient Image Retrieval

no code implementations CVPR 2017 Yue Cao, Mingsheng Long, Jian-Min Wang, Shichen Liu

This paper presents a compact coding solution with a focus on the deep learning to quantization approach, which improves retrieval quality by end-to-end representation learning and compact encoding and has already shown the superior performance over the hashing solutions for similarity retrieval.

Image Retrieval Quantization +2

Correlation Hashing Network for Efficient Cross-Modal Retrieval

no code implementations 22 Feb 2016 Yue Cao, Mingsheng Long, Jian-Min Wang, Philip S. Yu

This paper presents a Correlation Hashing Network (CHN) approach to cross-modal hashing, which jointly learns good data representation tailored to hash coding and formally controls the quantization error.

Cross-Modal Retrieval Quantization +1

Learning Transferable Features with Deep Adaptation Networks

5 code implementations 10 Feb 2015 Mingsheng Long, Yue Cao, Jian-Min Wang, Michael I. Jordan

Recent studies reveal that a deep neural network can learn transferable features which generalize well to novel tasks for domain adaptation.

Domain Adaptation Image Classification +1
