Search Results for author: Jing Zhang

Found 221 papers, 112 papers with code

SRCB at SemEval-2022 Task 5: Pretraining Based Image to Text Late Sequential Fusion System for Multimodal Misogynous Meme Identification

no code implementations SemEval (NAACL) 2022 Jing Zhang, Yujin Wang

Online misogyny meme detection is an image/text multimodal classification task; the complicated relation between image and text challenges the intelligent system’s modality fusion learning capability.

Long-range Sequence Modeling with Predictable Sparse Attention

no code implementations ACL 2022 Yimeng Zhuang, Jing Zhang, Mei Tu

(2) A sparse attention matrix estimation module, which predicts dominant elements of an attention matrix based on the output of the previous hidden state cross module.

P-INT: A Path-based Interaction Model for Few-shot Knowledge Graph Completion

no code implementations Findings (EMNLP) 2021 Jingwen Xu, Jing Zhang, Xirui Ke, Yuxiao Dong, Hong Chen, Cuiping Li, Yongbin Liu

Its general process is to first encode the implicit relation of an entity pair and then match the relation of a query entity pair with the relations of the reference entity pairs.

Knowledge Graph Completion

HOSMEL: A Hot-Swappable Modularized Entity Linking Toolkit for Chinese

1 code implementation ACL 2022 Daniel Zhang-li, Jing Zhang, Jifan Yu, Xiaokang Zhang, Peng Zhang, Jie Tang, Juanzi Li

We investigate the usage of entity linking (EL) in downstream tasks and present the first modularized EL toolkit for easy task adaptation.

Entity Linking Question Answering

Audio-Visual Segmentation with Semantics

1 code implementation 30 Jan 2023 Jinxing Zhou, Xuyang Shen, Jianyuan Wang, Jiayi Zhang, Weixuan Sun, Jing Zhang, Stan Birchfield, Dan Guo, Lingpeng Kong, Meng Wang, Yiran Zhong

To deal with these problems, we propose a new baseline method that uses a temporal pixel-wise audio-visual interaction module to inject audio semantics as guidance for the visual segmentation process.

Semantic Segmentation Video Semantic Segmentation

APAC: Authorized Probability-controlled Actor-Critic For Offline Reinforcement Learning

no code implementations 28 Jan 2023 Jing Zhang, Chi Zhang, Wenjia Wang, Bing-Yi Jing

Due to the inability to interact with the environment, offline reinforcement learning (RL) methods face the challenge of estimating the Out-of-Distribution (OOD) points.

reinforcement-learning reinforcement Learning

Joint Spatio-Temporal Modeling for the Semantic Change Detection in Remote Sensing Images

1 code implementation 10 Dec 2022 Lei Ding, Jing Zhang, Kai Zhang, Haitao Guo, Bing Liu, Lorenzo Bruzzone

Semantic Change Detection (SCD) refers to the task of simultaneously extracting the changed areas and the semantic categories (before and after the changes) in Remote Sensing Images (RSIs).

Change Detection

ViTPose+: Vision Transformer Foundation Model for Generic Body Pose Estimation

1 code implementation 7 Dec 2022 Yufei Xu, Jing Zhang, Qiming Zhang, DaCheng Tao

In this paper, we show the surprisingly good properties of plain vision transformers for body pose estimation from various aspects, namely simplicity in model structure, scalability in model size, flexibility in training paradigm, and transferability of knowledge between models, through a simple baseline model dubbed ViTPose.

Keypoint Detection

Learning to Learn Better for Video Object Segmentation

no code implementations 5 Dec 2022 Meng Lan, Jing Zhang, Lefei Zhang, DaCheng Tao

Recently, the joint learning framework (JOINT) integrates matching based transductive reasoning and online inductive learning to achieve accurate and robust semi-supervised video object segmentation (SVOS).

Semantic Segmentation Semi-Supervised Video Object Segmentation +1

1st Workshop on Maritime Computer Vision (MaCVi) 2023: Challenge Results

no code implementations 24 Nov 2022 Benjamin Kiefer, Matej Kristan, Janez Perš, Lojze Žust, Fabio Poiesi, Fabio Augusto de Alcantara Andrade, Alexandre Bernardino, Matthew Dawkins, Jenni Raitoharju, Yitong Quan, Adem Atmaca, Timon Höfer, Qiming Zhang, Yufei Xu, Jing Zhang, DaCheng Tao, Lars Sommer, Raphael Spraul, Hangyue Zhao, Hongpu Zhang, Yanyun Zhao, Jan Lukas Augustin, Eui-ik Jeon, Impyeong Lee, Luca Zedda, Andrea Loddo, Cecilia Di Ruberto, Sagar Verma, Siddharth Gupta, Shishir Muralidhara, Niharika Hegde, Daitao Xing, Nikolaos Evangeliou, Anthony Tzes, Vojtěch Bartl, Jakub Špaňhel, Adam Herout, Neelanjan Bhowmik, Toby P. Breckon, Shivanand Kundargi, Tejas Anvekar, Chaitra Desai, Ramesh Ashok Tabib, Uma Mudengudi, Arpita Vats, Yang song, Delong Liu, Yonglin Li, Shuman Li, Chenhao Tan, Long Lan, Vladimir Somers, Christophe De Vleeschouwer, Alexandre Alahi, Hsiang-Wei Huang, Cheng-Yen Yang, Jenq-Neng Hwang, Pyong-Kun Kim, Kwangju Kim, Kyoungoh Lee, Shuai Jiang, Haiwen Li, Zheng Ziqiang, Tuan-Anh Vu, Hai Nguyen-Truong, Sai-Kit Yeung, Zhuang Jia, Sophia Yang, Chih-Chung Hsu, Xiu-Yu Hou, Yu-An Jhang, Simon Yang, Mau-Tsuen Yang

The 1$^{\text{st}}$ Workshop on Maritime Computer Vision (MaCVi) 2023 focused on maritime computer vision for Unmanned Aerial Vehicles (UAV) and Unmanned Surface Vehicle (USV), and organized several subchallenges in this domain: (i) UAV-based Maritime Object Detection, (ii) UAV-based Maritime Object Tracking, (iii) USV-based Maritime Obstacle Segmentation and (iv) USV-based Maritime Obstacle Detection.

object-detection Object Detection +1

GLT-T: Global-Local Transformer Voting for 3D Single Object Tracking in Point Clouds

no code implementations 20 Nov 2022 Jiahao Nie, Zhiwei He, Yuxiang Yang, Mingyu Gao, Jing Zhang

Technically, a global-local transformer (GLT) module is employed to integrate object- and patch-aware prior into seed point features to effectively form strong feature representation for geometric positions of the seed points, thus providing more robust and accurate cues for offset learning.

Object Tracking Region Proposal

DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting

1 code implementation 19 Nov 2022 Maoyuan Ye, Jing Zhang, Shanshan Zhao, Juhua Liu, Tongliang Liu, Bo Du, DaCheng Tao

In this paper, we present DeepSolo, a simple detection transformer baseline that lets a single Decoder with Explicit Points Solo for text detection and recognition simultaneously.

Scene Text Detection Text Matching +1

Energy-Based Residual Latent Transport for Unsupervised Point Cloud Completion

1 code implementation 13 Nov 2022 Ruikai Cui, Shi Qiu, Saeed Anwar, Jing Zhang, Nick Barnes

Unsupervised point cloud completion aims to infer the whole geometry of a partial object observation without requiring partial-complete correspondence.

Point Cloud Completion

Unifying Flow, Stereo and Depth Estimation

1 code implementation 10 Nov 2022 Haofei Xu, Jing Zhang, Jianfei Cai, Hamid Rezatofighi, Fisher Yu, DaCheng Tao, Andreas Geiger

We present a unified formulation and model for three motion and 3D perception tasks: optical flow, rectified stereo matching and unrectified stereo depth estimation from posed images.

Optical Flow Estimation Stereo Depth Estimation +1

Rethinking Hierarchies in Pre-trained Plain Vision Transformer

no code implementations 3 Nov 2022 Yufei Xu, Jing Zhang, Qiming Zhang, DaCheng Tao

Self-supervised pre-training vision transformer (ViT) via masked image modeling (MIM) has been proven very effective.

Watermarking for Out-of-distribution Detection

1 code implementation 27 Oct 2022 Qizhou Wang, Feng Liu, Yonggang Zhang, Jing Zhang, Chen Gong, Tongliang Liu, Bo Han

Out-of-distribution (OOD) detection aims to identify OOD data based on representations extracted from well-trained deep models.

OOD Detection Out-of-Distribution Detection

Adaptive Test-Time Defense with the Manifold Hypothesis

no code implementations 26 Oct 2022 Zhaoyuan Yang, Zhiwei Xu, Jing Zhang, Richard Hartley, Peter Tu

In this work, we formulate a novel framework of adversarial robustness using the manifold hypothesis.

Adversarial Robustness Variational Inference

Oscillatory cooperation prevalence emerges from misperception

no code implementations 17 Oct 2022 Jing Zhang, Zhao Li, Jiqiang Zhang, Lin Ma, Guozhong Zheng, Li Chen

Here we show that oscillatory behaviors naturally emerge if incomplete information is incorporated into the cooperation evolution of a non-Markov model.

On Robust Cross-View Consistency in Self-Supervised Monocular Depth Estimation

1 code implementation 19 Sep 2022 Haimei Zhao, Jing Zhang, Zhuo Chen, Bo Yuan, DaCheng Tao

Compared with the photometric consistency loss as well as the rigid point cloud alignment loss, the proposed DFA and VDA losses are more robust owing to the strong representation power of deep features as well as the high tolerance of voxel density to the aforementioned challenges.

Monocular Depth Estimation

Improving RGB-D Point Cloud Registration by Learning Multi-scale Local Linear Transformation

1 code implementation 31 Aug 2022 ZiMing Wang, Xiaoliang Huo, Zhenghao Chen, Jing Zhang, Lu Sheng, Dong Xu

In addition to previous methods that seek correspondences by hand-crafted or learnt geometric features, recent point cloud registration methods have tried to apply RGB-D data to achieve more accurate correspondence.

Point Cloud Registration

Grounded Affordance from Exocentric View

2 code implementations 28 Aug 2022 Hongchen Luo, Wei Zhai, Jing Zhang, Yang Cao, DaCheng Tao

Due to the diversity of interactive affordance, the uniqueness of different individuals leads to diverse interactions, which makes it difficult to establish an explicit link between object parts and affordance labels.

Human-Object Interaction Detection Transfer Learning

Robust control problems of BSDEs coupled with value functions

no code implementations 23 Aug 2022 Zhou Yang, Jing Zhang, Chao Zhou

A robust control problem is considered in this paper, where the controlled stochastic differential equations (SDEs) include ambiguity parameters and their coefficients satisfy non-Lipschitz continuous and non-linear growth conditions; the objective function is expressed as a backward stochastic differential equation (BSDE) with the generator depending on the value function.

Generalised Co-Salient Object Detection

no code implementations 20 Aug 2022 Jiawei Liu, Jing Zhang, Ruikai Cui, Kaihao Zhang, Nick Barnes

To evaluate the performance of CoSOD models under the GCoSOD setting, we propose two new testing datasets, namely CoCA-Common and CoCA-Zero, where a common salient object is partially present in the former and completely absent in the latter.

Co-Salient Object Detection object-detection +2

Transformer Networks for Predictive Group Elevator Control

no code implementations 15 Aug 2022 Jing Zhang, Athanasios Tsiligkaridis, Hiroshi Taguchi, Arvind Raghunathan, Daniel Nikovski

We propose a Predictive Group Elevator Scheduler that uses predictive information on passenger arrivals from a Transformer-based destination predictor and a linear regression model that predicts the remaining time to destinations.

regression

Advancing Plain Vision Transformer Towards Remote Sensing Foundation Model

2 code implementations 8 Aug 2022 Di Wang, Qiming Zhang, Yufei Xu, Jing Zhang, Bo Du, DaCheng Tao, Liangpei Zhang

Large-scale vision foundation models have made significant progress in visual tasks on natural images, with vision transformers being the primary choice due to their good scalability and representation ability.

Aerial Scene Classification Few-Shot Learning +2

Subtype-Former: a deep learning approach for cancer subtype discovery with multi-omics data

no code implementations 28 Jul 2022 Hai Yang, Yuhang Sheng, Yi Jiang, Xiaoyang Fang, Dongdong Li, Jing Zhang, Zhe Wang

In addition, Subtype-Former also achieved outstanding results in pan-cancer subtyping, which can help analyze the commonalities and differences across various cancer types at the molecular level.

Survival Analysis

MeshMAE: Masked Autoencoders for 3D Mesh Data Analysis

no code implementations 20 Jul 2022 Yaqian Liang, Shanshan Zhao, Baosheng Yu, Jing Zhang, Fazhi He

We first randomly mask some patches of the mesh and feed the corrupted mesh into Mesh Transformers.

FakeCLR: Exploring Contrastive Learning for Solving Latent Discontinuity in Data-Efficient GANs

1 code implementation 18 Jul 2022 Ziqiang Li, Chaoyue Wang, Heliang Zheng, Jing Zhang, Bin Li

Since data augmentation strategies have largely alleviated the training instability, how to further improve the generative performance of DE-GANs becomes a hotspot.

Contrastive Learning Data Augmentation

JPerceiver: Joint Perception Network for Depth, Pose and Layout Estimation in Driving Scenes

1 code implementation 16 Jul 2022 Haimei Zhao, Jing Zhang, Sen Zhang, DaCheng Tao

A naive way is to accomplish them independently in a sequential or parallel manner, but there are many drawbacks, i.e., 1) the depth and VO results suffer from the inherent scale ambiguity issue; 2) the BEV layout is directly predicted from the front-view image without using any depth-related information, although the depth map contains useful geometry clues for inferring scene layouts.

Autonomous Driving Depth Estimation +3

Transformer-based Context Condensation for Boosting Feature Pyramids in Object Detection

no code implementations 14 Jul 2022 Zhe Chen, Jing Zhang, Yufei Xu, DaCheng Tao

Current object detectors typically have a feature pyramid (FP) module for multi-level feature fusion (MFF) which aims to mitigate the gap between features from different levels and form a comprehensive object representation to achieve better detection performance.

object-detection Object Detection

Audio-Visual Segmentation

1 code implementation 11 Jul 2022 Jinxing Zhou, Jianyuan Wang, Jiayi Zhang, Weixuan Sun, Jing Zhang, Stan Birchfield, Dan Guo, Lingpeng Kong, Meng Wang, Yiran Zhong

To deal with the AVS problem, we propose a novel method that uses a temporal pixel-wise audio-visual interaction module to inject audio semantics as guidance for the visual segmentation process.

DPText-DETR: Towards Better Scene Text Detection with Dynamic Points in Transformer

1 code implementation 10 Jul 2022 Maoyuan Ye, Jing Zhang, Shanshan Zhao, Juhua Liu, Bo Du, DaCheng Tao

However, these methods built upon detection transformer framework might achieve sub-optimal training efficiency and performance due to coarse positional query modeling. In addition, the point label form exploited in previous works implies the reading order of humans, which impedes the detection robustness from our observation.

Inductive Bias Scene Text Detection

A State Transition Model for Mobile Notifications via Survival Analysis

no code implementations 7 Jul 2022 Yiping Yuan, Jing Zhang, Shaunak Chatterjee, Shipeng Yu, Romer Rosales

In particular, we provide an online use case on notification delivery time optimization to show how we make better decisions, drive more user engagement, and provide more value to users.

Decision Making Survival Analysis

Re-weighting Negative Samples for Model-Agnostic Matching

no code implementations 6 Jul 2022 Jiazhen Lou, Hong Wen, Fuyu Lv, Jing Zhang, Tengfei Yuan, Zhao Li

Recommender Systems (RS), as an efficient tool for discovering items of interest to users from a very large corpus, have attracted more and more attention from academia and industry.

Multi-Task Learning Recommendation Systems

CLAMP: Prompt-based Contrastive Learning for Connecting Language and Animal Pose

no code implementations 23 Jun 2022 Xu Zhang, Wen Wang, Zhe Chen, Yufei Xu, Jing Zhang, DaCheng Tao

Motivated by the progress of visual-language research, we propose that pre-trained language models (e.g., CLIP) can facilitate animal pose estimation by providing rich prior knowledge for describing animal keypoints in text.

Animal Pose Estimation Contrastive Learning

Knowledge Learning with Crowdsourcing: A Brief Review and Systematic Perspective

no code implementations 19 Jun 2022 Jing Zhang

Big data have the characteristics of enormous volume, high velocity, diversity, value-sparsity, and uncertainty, which make learning knowledge from them full of challenges.

APT-36K: A Large-scale Benchmark for Animal Pose Estimation and Tracking

4 code implementations 12 Jun 2022 Yuxiang Yang, Junjie Yang, Yufei Xu, Jing Zhang, Long Lan, DaCheng Tao

Based on APT-36K, we benchmark several representative models on the following three tracks: (1) supervised animal pose estimation on a single frame under intra- and inter-domain transfer learning settings, (2) inter-species domain generalization test for unseen animals, and (3) animal pose estimation with animal tracking.

Animal Pose Estimation Domain Generalization +1

Toward Real-world Single Image Deraining: A New Benchmark and Beyond

1 code implementation 11 Jun 2022 Wei Li, Qiming Zhang, Jing Zhang, Zhen Huang, Xinmei Tian, DaCheng Tao

To address these issues, we establish a new high-quality dataset named RealRain-1k, consisting of $1,120$ high-resolution paired clean and rainy images with low- and high-density rain streaks, respectively.

Domain Generalization Image Restoration +2

Referring Image Matting

1 code implementation 10 Jun 2022 Jizhizi Li, Jing Zhang, DaCheng Tao

RIM aims to extract the meticulous alpha matte of the specific object that best matches the given natural language description, thus enabling a more natural and simpler instruction for image matting.

Domain Generalization Image Matting +4

Multi-Task Learning with Multi-Query Transformer for Dense Prediction

no code implementations 28 May 2022 Yangyang Xu, Xiangtai Li, Haobo Yuan, Yibo Yang, Jing Zhang, Yunhai Tong, Lefei Zhang, DaCheng Tao

Secondly, we design a cross task attention module to reason the dependencies among multiple tasks and feature scales from two perspectives including different tasks of the same scale and different scales of the same task.

Multi-Task Learning

Towards Deeper Understanding of Camouflaged Object Detection

1 code implementation 23 May 2022 Yunqiu Lv, Jing Zhang, Yuchao Dai, Aixuan Li, Nick Barnes, Deng-Ping Fan

With the above understanding about camouflaged objects, we present the first triple-task learning framework to simultaneously localize, segment, and rank camouflaged objects, indicating the conspicuousness level of camouflage.

object-detection Object Detection

Salient Object Detection via Bounding-box Supervision

no code implementations 11 May 2022 Mengqi He, Jing Zhang, Wenxin Yu

However, as a large amount of background is excluded, the foreground bounding box region contains a less complex background, making it possible to perform handcrafted features-based saliency detection with only the cropped foreground region.

object-detection Object Detection +2

From Heavy Rain Removal to Detail Restoration: A Faster and Better Network

1 code implementation 7 May 2022 Tao Gao, Yuanbo Wen, Jing Zhang, Kaihao Zhang, Ting Chen

Firstly, a dilated dense residual block (DDRB) within the rain streaks removal network is presented to aggregate high/low level features of heavy rain.

Rain Removal

DearKD: Data-Efficient Early Knowledge Distillation for Vision Transformers

no code implementations CVPR 2022 Xianing Chen, Qiong Cao, Yujie Zhong, Jing Zhang, Shenghua Gao, DaCheng Tao

Our DearKD is a two-stage framework that first distills the inductive biases from the early intermediate layers of a CNN and then gives the transformer full play by training without distillation.

Knowledge Distillation

ViTPose: Simple Vision Transformer Baselines for Human Pose Estimation

2 code implementations 26 Apr 2022 Yufei Xu, Jing Zhang, Qiming Zhang, DaCheng Tao

In this paper, we show the surprisingly good capabilities of plain vision transformers for pose estimation from various aspects, namely simplicity in model structure, scalability in model size, flexibility in training paradigm, and transferability of knowledge between models, through a simple baseline model called ViTPose.

Ranked #1 on Pose Estimation on MPII Human Pose (using extra training data)

Keypoint Detection

An Energy-Based Prior for Generative Saliency

1 code implementation 19 Apr 2022 Jing Zhang, Jianwen Xie, Nick Barnes, Ping Li

With the generative saliency model, we can obtain a pixel-wise uncertainty map from an image, indicating model confidence in the saliency prediction.

object-detection RGB-D Salient Object Detection +2

A Comprehensive Survey on Data-Efficient GANs in Image Generation

no code implementations 18 Apr 2022 Ziqiang Li, Beihao Xia, Jing Zhang, Chaoyue Wang, Bin Li

Generative Adversarial Networks (GANs) have achieved remarkable results in image synthesis.

Image Generation

VSA: Learning Varied-Size Window Attention in Vision Transformers

1 code implementation 18 Apr 2022 Qiming Zhang, Yufei Xu, Jing Zhang, DaCheng Tao

Attention within windows has been widely explored in vision transformers to balance the performance, computation complexity, and memory footprint.

Instance Segmentation Object Detection +1

An Empirical Study of Remote Sensing Pretraining

1 code implementation 6 Apr 2022 Di Wang, Jing Zhang, Bo Du, Gui-Song Xia, DaCheng Tao

To this end, we train different networks from scratch with the help of the largest RS scene recognition dataset up to now -- MillionAID, to obtain a series of RS pretrained backbones, including both convolutional neural networks (CNN) and vision transformers such as Swin and ViTAE, which have shown promising performance on computer vision tasks.

Aerial Scene Classification Building change detection for remote sensing images +5

BMD: A General Class-balanced Multicentric Dynamic Prototype Strategy for Source-free Domain Adaptation

1 code implementation 6 Apr 2022 Sanqing Qu, Guang Chen, Jing Zhang, Zhijun Li, wei he, DaCheng Tao

Source-free Domain Adaptation (SFDA) aims to adapt a pre-trained source model to the unlabeled target domain without accessing the well-labeled source data, which is a much more practical setting due to the data privacy, security, and transmission issues.

Pseudo Label Source-Free Domain Adaptation

Dynamic Focus-aware Positional Queries for Semantic Segmentation

1 code implementation 4 Apr 2022 Haoyu He, Jianfei Cai, Zizheng Pan, Jing Liu, Jing Zhang, DaCheng Tao, Bohan Zhuang

In this paper, we propose a simple yet effective query design for semantic segmentation termed Dynamic Focus-aware Positional Queries (DFPQ), which dynamically generates positional queries conditioned on the cross-attention scores from the preceding decoder block and the positional encodings for the corresponding image features, simultaneously.

Ranked #18 on Semantic Segmentation on ADE20K (using extra training data)

Semantic Segmentation

Rethinking Portrait Matting with Privacy Preserving

1 code implementation 31 Mar 2022 Sihan Ma, Jizhizi Li, Jing Zhang, He Zhang, DaCheng Tao

We systematically evaluate both trimap-free and trimap-based matting methods on P3M-10k and find that existing matting methods show different generalization abilities under the privacy preserving training setting, i.e., training only on face-blurred images while testing on arbitrary images.

Image Matting Privacy Preserving

Learning Affordance Grounding from Exocentric Images

2 code implementations CVPR 2022 Hongchen Luo, Wei Zhai, Jing Zhang, Yang Cao, DaCheng Tao

To empower an agent with such ability, this paper proposes a task of affordance grounding from exocentric view, i.e., given exocentric human-object interaction and egocentric object images, learning the affordance knowledge of the object and transferring it to the egocentric image using only the affordance label as supervision.

Human-Object Interaction Detection Transfer Learning

AlignTransformer: Hierarchical Alignment of Visual Regions and Disease Tags for Medical Report Generation

no code implementations 18 Mar 2022 Di You, Fenglin Liu, Shen Ge, Xiaoxia Xie, Jing Zhang, Xian Wu

The acquired disease-grounded visual features can better represent the abnormal regions of the input image, which could alleviate data bias problem; 2) MGT module effectively uses the multi-grained features and Transformer framework to generate the long medical report.

Image Captioning Medical Report Generation

Towards Data-Efficient Detection Transformers

2 code implementations 17 Mar 2022 Wen Wang, Jing Zhang, Yang Cao, Yongliang Shen, DaCheng Tao

Besides, we introduce a simple yet effective label augmentation method to provide richer supervision and improve data efficiency.

Towards Scale Consistent Monocular Visual Odometry by Learning from the Virtual World

no code implementations 11 Mar 2022 Sen Zhang, Jing Zhang, DaCheng Tao

In this work, we propose VRVO, a novel framework for retrieving the absolute scale from virtual data that can be easily obtained from modern simulation environments, whereas in the real domain no stereo or ground-truth data are required in either the training or inference phases.

Monocular Visual Odometry

Information-Theoretic Odometry Learning

no code implementations 11 Mar 2022 Sen Zhang, Jing Zhang, DaCheng Tao

In this paper, we propose a unified information theoretic framework for learning-motivated methods aimed at odometry estimation, a crucial component of many robotics and vision tasks such as navigation and virtual reality where relative camera poses are required in real time.

ViTAEv2: Vision Transformer Advanced by Exploring Inductive Bias for Image Recognition and Beyond

3 code implementations 21 Feb 2022 Qiming Zhang, Yufei Xu, Jing Zhang, DaCheng Tao

Vision transformers have shown great potential in various computer vision tasks owing to their strong capability to model long-range dependency using the self-attention mechanism.

Image Classification Inductive Bias

Deep Interest Highlight Network for Click-Through Rate Prediction in Trigger-Induced Recommendation

1 code implementation 5 Feb 2022 Qijie Shen, Hong Wen, Wanjie Tao, Jing Zhang, Fuyu Lv, Zulong Chen, Zhao Li

In many classical e-commerce platforms, personalized recommendation has been proven to be of great business value, which can improve user satisfaction and increase the revenue of platforms.

Click-Through Rate Prediction

SASA: Semantics-Augmented Set Abstraction for Point-based 3D Object Detection

1 code implementation 6 Jan 2022 Chen Chen, Zhe Chen, Jing Zhang, DaCheng Tao

We observe that the prevailing set abstraction design for down-sampling points may maintain too much unimportant background information that can affect feature learning for detecting objects.

3D Object Detection object-detection

Exemplar-free Class Incremental Learning via Discriminative and Comparable One-class Classifiers

no code implementations 5 Jan 2022 Wenju Sun, Qingyong Li, Jing Zhang, Danyu Wang, Wen Wang, Yangli-ao Geng

DisCOIL follows the basic principle of POC, but it adopts variational auto-encoders (VAE) instead of other well-established one-class classifiers (e.g., deep SVDD), because a trained VAE can not only identify the probability of an input sample belonging to a class but also generate pseudo samples of the class to assist in learning new tasks.

class-incremental learning Incremental Learning +1

3DJCG: A Unified Framework for Joint Dense Captioning and Visual Grounding on 3D Point Clouds

no code implementations CVPR 2022 Daigang Cai, Lichen Zhao, Jing Zhang, Lu Sheng, Dong Xu

Observing that the 3D captioning task and the 3D grounding task contain both shared and complementary information in nature, in this work, we propose a unified framework to jointly solve these two distinct but closely related tasks in a synergistic fashion, which consists of both shared task-agnostic modules and lightweight task-specific modules.

Dense Captioning Visual Grounding

ISNet: Shape Matters for Infrared Small Target Detection

1 code implementation CVPR 2022 Mingjin Zhang, Rui Zhang, Yuxiang Yang, Haichen Bai, Jing Zhang, Jie Guo

TOAA block calculates the low-level information with attention mechanism in both row and column directions and fuses it with the high-level information to capture the shape characteristic of targets and suppress noises.

Management

Siamese Network with Interactive Transformer for Video Object Segmentation

no code implementations 28 Dec 2021 Meng Lan, Jing Zhang, Fengxiang He, Lefei Zhang

Semi-supervised video object segmentation (VOS) refers to segmenting the target object in remaining frames given its annotation in the first frame, which has been actively studied in recent years.

Semantic Segmentation Semi-Supervised Video Object Segmentation +1

Semi-supervised Salient Object Detection with Effective Confidence Estimation

no code implementations 28 Dec 2021 Jiawei Liu, Jing Zhang, Nick Barnes

The success of existing salient object detection models relies on a large pixel-wise labeled training dataset.

object-detection Object Detection +1

MetaCVR: Conversion Rate Prediction via Meta Learning in Small-Scale Recommendation Scenarios

no code implementations 27 Dec 2021 Xiaofeng Pan, Ming Li, Jing Zhang, Keren Yu, Luping Wang, Hong Wen, Chengjun Mao, Bo Cao

At last, we develop an Ensemble Prediction Network (EPN) which incorporates the output of FRN and DMN to make the final CVR prediction.

Meta-Learning

Learning Generative Vision Transformer with Energy-Based Latent Space for Saliency Prediction

no code implementations NeurIPS 2021 Jing Zhang, Jianwen Xie, Nick Barnes, Ping Li

In this paper, we take a step further by proposing a novel generative vision transformer with latent variables following an informative energy-based prior for salient object detection.

object-detection RGB-D Salient Object Detection +2

Visual Semantics Allow for Textual Reasoning Better in Scene Text Recognition

1 code implementation AAAI 2022 2021 Yue He, Chen Chen, Jing Zhang, Juhua Liu, Fengxiang He, Chaoyue Wang, Bo Du

Technically, given the character segmentation maps predicted by a VR model, we construct a subgraph for each instance, where nodes represent the pixels in it and edges are added between nodes based on their spatial similarity.

Ranked #2 on Scene Text Recognition on SVTP (using extra training data)

Language Modelling Scene Text Recognition

Injecting Numerical Reasoning Skills into Knowledge Base Question Answering Models

1 code implementation 12 Dec 2021 Yu Feng, Jing Zhang, Xiaokang Zhang, Lemao Liu, Cuiping Li, Hong Chen

Embedding-based methods are popular for Knowledge Base Question Answering (KBQA), but few current models have numerical reasoning skills and thus struggle to answer ordinal constrained questions.

Data Augmentation Knowledge Base Question Answering

Recurrent Glimpse-based Decoder for Detection with Transformer

1 code implementation CVPR 2022 Zhe Chen, Jing Zhang, DaCheng Tao

Then, a glimpse-based decoder is introduced to provide refined detection results based on both the glimpse features and the attention modeling outputs of the previous stage.

Object Detection

NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation

2 code implementations 6 Dec 2021 Kaustubh D. Dhole, Varun Gangal, Sebastian Gehrmann, Aadesh Gupta, Zhenhao Li, Saad Mahamood, Abinaya Mahendiran, Simon Mille, Ashish Shrivastava, Samson Tan, Tongshuang Wu, Jascha Sohl-Dickstein, Jinho D. Choi, Eduard Hovy, Ondrej Dusek, Sebastian Ruder, Sajant Anand, Nagender Aneja, Rabin Banjade, Lisa Barthe, Hanna Behnke, Ian Berlot-Attwell, Connor Boyle, Caroline Brun, Marco Antonio Sobrevilla Cabezudo, Samuel Cahyawijaya, Emile Chapuis, Wanxiang Che, Mukund Choudhary, Christian Clauss, Pierre Colombo, Filip Cornell, Gautier Dagan, Mayukh Das, Tanay Dixit, Thomas Dopierre, Paul-Alexis Dray, Suchitra Dubey, Tatiana Ekeinhor, Marco Di Giovanni, Tanya Goyal, Rishabh Gupta, Louanes Hamla, Sang Han, Fabrice Harel-Canada, Antoine Honore, Ishan Jindal, Przemyslaw K. Joniak, Denis Kleyko, Venelin Kovatchev, Kalpesh Krishna, Ashutosh Kumar, Stefan Langer, Seungjae Ryan Lee, Corey James Levinson, Hualou Liang, Kaizhao Liang, Zhexiong Liu, Andrey Lukyanenko, Vukosi Marivate, Gerard de Melo, Simon Meoni, Maxime Meyer, Afnan Mir, Nafise Sadat Moosavi, Niklas Muennighoff, Timothy Sum Hon Mun, Kenton Murray, Marcin Namysl, Maria Obedkova, Priti Oli, Nivranshu Pasricha, Jan Pfister, Richard Plant, Vinay Prabhu, Vasile Pais, Libo Qin, Shahab Raji, Pawan Kumar Rajpoot, Vikas Raunak, Roy Rinberg, Nicolas Roberts, Juan Diego Rodriguez, Claude Roux, Vasconcellos P. H. S., Ananya B. Sai, Robin M. Schmidt, Thomas Scialom, Tshephisho Sefara, Saqib N. Shamsi, Xudong Shen, Haoyue Shi, Yiwen Shi, Anna Shvets, Nick Siegel, Damien Sileo, Jamie Simon, Chandan Singh, Roman Sitelew, Priyank Soni, Taylor Sorensen, William Soto, Aman Srivastava, KV Aditya Srivatsa, Tony Sun, Mukund Varma T, A Tabassum, Fiona Anting Tan, Ryan Teehan, Mo Tiwari, Marie Tolkiehn, Athena Wang, Zijian Wang, Gloria Wang, Zijie J. Wang, Fuxuan Wei, Bryan Wilie, Genta Indra Winata, Xinyi Wu, Witold Wydmański, Tianbao Xie, Usama Yaseen, Michael A. Yee, Jing Zhang, Yue Zhang

Data augmentation is an important component in the robustness evaluation of models in natural language processing (NLP) and in enhancing the diversity of the data they are trained on.

Data Augmentation

A Multi-Strategy based Pre-Training Method for Cold-Start Recommendation

no code implementations4 Dec 2021 Bowen Hao, Hongzhi Yin, Jing Zhang, Cuiping Li, Hong Chen

In terms of the pretext task, in addition to considering the intra-correlations of users and items via the embedding reconstruction task, we add an embedding contrastive learning task to capture inter-correlations of users and items.

Contrastive Learning Meta-Learning +1

FIBA: Frequency-Injection based Backdoor Attack in Medical Image Analysis

1 code implementation CVPR 2022 Yu Feng, Benteng Ma, Jing Zhang, Shanshan Zhao, Yong Xia, DaCheng Tao

However, designing a unified BA method that can be applied to various MIA systems is challenging due to the diversity of imaging modalities (e.g., X-Ray, CT, and MRI) and analysis tasks (e.g., classification, detection, and segmentation).

Backdoor Attack Classification +4

GMFlow: Learning Optical Flow via Global Matching

2 code implementations CVPR 2022 Haofei Xu, Jing Zhang, Jianfei Cai, Hamid Rezatofighi, DaCheng Tao

Learning-based optical flow estimation has been dominated by the pipeline of cost volume with convolutions for flow regression, which is inherently limited to local correlations and thus struggles with the long-standing challenge of large displacements.

Optical Flow Estimation regression

RegionCL: Can Simple Region Swapping Contribute to Contrastive Learning?

2 code implementations24 Nov 2021 Yufei Xu, Qiming Zhang, Jing Zhang, DaCheng Tao

In this paper, we make the first attempt to demonstrate the importance of both regions in cropping from a complete perspective and propose a simple yet effective pretext task called Region Contrastive Learning (RegionCL).

Contrastive Learning

A General Divergence Modeling Strategy for Salient Object Detection

no code implementations23 Nov 2021 Xinyu Tian, Jing Zhang, Yuchao Dai

Given multiple saliency annotations, we introduce a general divergence modeling strategy via random sampling, and apply our strategy to an ensemble based framework and three latent variable model based solutions to explore the subjective nature of saliency.

object-detection Object Detection +1

Dense Uncertainty Estimation via an Ensemble-based Conditional Latent Variable Model

no code implementations22 Nov 2021 Jing Zhang, Yuchao Dai, Mehrtash Harandi, Yiran Zhong, Nick Barnes, Richard Hartley

Uncertainty estimation has been extensively studied in recent literature, and uncertainty can usually be classified as aleatoric or epistemic.

object-detection Object Detection

Inferring the Class Conditional Response Map for Weakly Supervised Semantic Segmentation

1 code implementation27 Oct 2021 Weixuan Sun, Jing Zhang, Nick Barnes

To solve this, most existing approaches follow a multi-training pipeline to refine CAMs for better pseudo-labels, which includes: 1) re-training the classification model to generate CAMs; 2) post-processing CAMs to obtain pseudo labels; and 3) training a semantic segmentation model with the obtained pseudo labels.

Weakly supervised Semantic Segmentation Weakly-Supervised Semantic Segmentation

Capsule Graph Neural Networks with EM Routing

no code implementations18 Oct 2021 Yu Lei, Jing Zhang

To effectively classify graph instances, graph neural networks need to have the capability to capture the part-whole relationship existing in a graph.

Graph Classification

Dense Uncertainty Estimation

1 code implementation13 Oct 2021 Jing Zhang, Yuchao Dai, Mochu Xiang, Deng-Ping Fan, Peyman Moghadam, Mingyi He, Christian Walder, Kaihao Zhang, Mehrtash Harandi, Nick Barnes

Deep neural networks can be roughly divided into deterministic neural networks and stochastic neural networks. The former is usually trained to achieve a mapping from input space to output space via maximum likelihood estimation for the weights, which leads to deterministic predictions during testing.

Decision Making

FP-DETR: Detection Transformer Advanced by Fully Pre-training

no code implementations ICLR 2022 Wen Wang, Yang Cao, Jing Zhang, DaCheng Tao

To this end, we propose the task adapter, which leverages self-attention to model the contextual relation between object query embeddings.

object-detection Object Detection +1

Modeling Variable Space with Residual Tensor Networks for Multivariate Time Series

no code implementations29 Sep 2021 Jing Zhang, Peng Zhang, Yupeng He, Siwei Rao, Jun Wang, Guangjian Tian

In this framework, we derive the mathematical representation of the variable space, and then use a tensor network based on the idea of low-rank approximation to model the variable space.

Multivariate Time Series Forecasting Tensor Networks +1

RGB-D Saliency Detection via Cascaded Mutual Information Minimization

1 code implementation ICCV 2021 Jing Zhang, Deng-Ping Fan, Yuchao Dai, Xin Yu, Yiran Zhong, Nick Barnes, Ling Shao

In this paper, we introduce a novel multi-stage cascaded learning framework via mutual information minimization to "explicitly" model the multi-modal information between RGB image and depth data.

Saliency Detection

Stagewise Unsupervised Domain Adaptation with Adversarial Self-Training for Road Segmentation of Remote Sensing Images

1 code implementation28 Aug 2021 Lefei Zhang, Meng Lan, Jing Zhang, DaCheng Tao

In this paper, we propose a novel stagewise domain adaptation model called RoadDA to address the domain shift (DS) issue in this field.

Unsupervised Domain Adaptation

AP-10K: A Benchmark for Animal Pose Estimation in the Wild

6 code implementations28 Aug 2021 Hang Yu, Yufei Xu, Jing Zhang, Wei Zhao, Ziyu Guan, DaCheng Tao

The experimental results provide sound empirical evidence on the superiority of learning from diverse animal species in terms of both accuracy and generalization ability.

Animal Pose Estimation Domain Generalization +1

Out-of-boundary View Synthesis Towards Full-Frame Video Stabilization

1 code implementation ICCV 2021 Yufei Xu, Jing Zhang, DaCheng Tao

However, since the view outside the boundary is not available during warping, the resulting holes around the boundary of the stabilized frame must be discarded (i.e., cropping) to maintain visual consistency, and thus leads to a tradeoff between stability and cropping ratio.

Video Stabilization

Graph Contrastive Learning for Anomaly Detection

2 code implementations17 Aug 2021 Bo Chen, Jing Zhang, Xiaokang Zhang, Yuxiao Dong, Jian Song, Peng Zhang, Kaibo Xu, Evgeny Kharlamov, Jie Tang

To achieve the contrastive objective, we design a graph neural network encoder that can infer and further remove suspicious links during message passing, as well as learn the global context of the input graph.

Anomaly Detection Contrastive Learning +1

Bi-Temporal Semantic Reasoning for the Semantic Change Detection in HR Remote Sensing Images

1 code implementation13 Aug 2021 Lei Ding, Haitao Guo, Sicong Liu, Lichao Mou, Jing Zhang, Lorenzo Bruzzone

Recent studies indicate that semantic change detection (SCD) can be modeled through a triple-branch Convolutional Neural Network (CNN), which contains two temporal branches and a change branch.

Change Detection

Learning Visual Affordance Grounding from Demonstration Videos

no code implementations12 Aug 2021 Hongchen Luo, Wei Zhai, Jing Zhang, Yang Cao, DaCheng Tao

For the object branch, we introduce a semantic enhancement module (SEM) to make the network focus on different parts of the object according to the action classes and utilize a distillation loss to align the output features of the object branch with that of the video branch and transfer the knowledge in the video branch to the object branch.

Action Recognition

One-Shot Object Affordance Detection in the Wild

1 code implementation8 Aug 2021 Wei Zhai, Hongchen Luo, Jing Zhang, Yang Cao, DaCheng Tao

To empower robots with this ability in unseen scenarios, we first study the challenging one-shot affordance detection problem in this paper, i.e., given a support image that depicts the action purpose, all objects in a scene with the common affordance should be detected.

Action Recognition Affordance Detection +2

I3CL: Intra- and Inter-Instance Collaborative Learning for Arbitrary-shaped Scene Text Detection

1 code implementation3 Aug 2021 Bo Du, Jian Ye, Jing Zhang, Juhua Liu, DaCheng Tao

Existing methods for arbitrary-shaped text detection in natural scenes face two critical issues, i.e., 1) fracture detections at the gaps in a text instance; and 2) inaccurate detections of arbitrary-shaped text instances with diverse background context.

Scene Text Detection

TA-MAMC at SemEval-2021 Task 4: Task-adaptive Pretraining and Multi-head Attention for Abstract Meaning Reading Comprehension

no code implementations SEMEVAL 2021 Jing Zhang, Yimeng Zhuang, Yinpei Su

This paper describes our system used in SemEval-2021 Task 4, Reading Comprehension of Abstract Meaning, which achieved 1st place for subtask 1 and 2nd place for subtask 2 on the leaderboard.

Contrastive Learning Multiple-choice +1

DSP: Dual Soft-Paste for Unsupervised Domain Adaptive Semantic Segmentation

1 code implementation20 Jul 2021 Li Gao, Jing Zhang, Lefei Zhang, DaCheng Tao

In addition, feature-level alignment is carried out by aligning the feature maps of the source and target images from the student network using a weighted maximum mean discrepancy loss.

Semantic Segmentation Synthetic-to-Real Translation +1

Deep Automatic Natural Image Matting

1 code implementation15 Jul 2021 Jizhizi Li, Jing Zhang, DaCheng Tao

To address the problem, a novel end-to-end matting network is proposed, which can predict a generalized trimap for any image of the above types as a unified semantic representation.

Image Matting

One-Shot Affordance Detection

3 code implementations28 Jun 2021 Hongchen Luo, Wei Zhai, Jing Zhang, Yang Cao, DaCheng Tao

To empower robots with this ability in unseen scenarios, we consider the challenging one-shot affordance detection problem in this paper, i.e., given a support image that depicts the action purpose, all objects in a scene with the common affordance should be detected.

Affordance Detection

Few-Shot Domain Expansion for Face Anti-Spoofing

no code implementations27 Jun 2021 Bowen Yang, Jing Zhang, Zhenfei Yin, Jing Shao

In practice, given a handful of labeled samples from a new deployment scenario (target domain) and abundant labeled face images in the existing source domain, the FAS system is expected to perform well in the new scenario without sacrificing the performance on the original domain.

Face Anti-Spoofing Face Recognition +1

Energy-Based Generative Cooperative Saliency Prediction

1 code implementation25 Jun 2021 Jing Zhang, Jianwen Xie, Zilong Zheng, Nick Barnes

In this paper, to model the uncertainty of visual saliency, we study the saliency prediction problem from the perspective of generative models by learning a conditional probability distribution over the saliency map given an input image, and treating the saliency prediction as a sampling process from the learned distribution.

Saliency Prediction

Exploring Depth Contribution for Camouflaged Object Detection

no code implementations24 Jun 2021 Mochu Xiang, Jing Zhang, Yunqiu Lv, Aixuan Li, Yiran Zhong, Yuchao Dai

In this paper, we study the depth contribution for camouflaged object detection, where the depth maps are generated with existing monocular depth estimation (MDE) methods.

Monocular Depth Estimation object-detection +3

Confidence-Aware Learning for Camouflaged Object Detection

1 code implementation22 Jun 2021 Jiawei Liu, Jing Zhang, Nick Barnes

Then, we concatenate it with the input image and feed it to the confidence estimation network to produce a one-channel confidence map. We generate dynamic supervision for the confidence estimation network, representing the agreement of camouflage prediction with the ground truth camouflage map.

object-detection Object Detection

Invertible Attention

1 code implementation16 Jun 2021 Jiajun Zha, Yiran Zhong, Jing Zhang, Richard Hartley, Liang Zheng

Attention has been proved to be an efficient mechanism to capture long-range dependencies.

Image Reconstruction

ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias

2 code implementations NeurIPS 2021 Yufei Xu, Qiming Zhang, Jing Zhang, DaCheng Tao

Nevertheless, vision transformers treat an image as a 1D sequence of visual tokens, lacking an intrinsic inductive bias (IB) in modeling local visual structures and dealing with scale variance.

Image Classification Inductive Bias +2

A Comprehensive Survey and Taxonomy on Single Image Dehazing Based on Deep Learning

1 code implementation7 Jun 2021 Jie Gui, Xiaofeng Cong, Yuan Cao, Wenqi Ren, Jun Zhang, Jing Zhang, Jiuxin Cao, DaCheng Tao

With the development of convolutional neural networks, hundreds of deep learning based dehazing methods have been proposed.

Image Dehazing Single Image Dehazing

Salient Objects in Clutter

2 code implementations7 May 2021 Deng-Ping Fan, Jing Zhang, Gang Xu, Ming-Ming Cheng, Ling Shao

This design bias has led to a saturation in performance for state-of-the-art SOD models when evaluated on existing datasets.

Image Augmentation object-detection +3

End-to-end One-shot Human Parsing

1 code implementation4 May 2021 Haoyu He, Bohan Zhuang, Jing Zhang, Jianfei Cai, DaCheng Tao

To address three main challenges in OSHP, i.e., small sizes, testing bias, and similar parts, we devise an End-to-end One-shot human Parsing Network (EOP-Net).

Human Parsing Metric Learning +1

Privacy-Preserving Portrait Matting

2 code implementations29 Apr 2021 Jizhizi Li, Sihan Ma, Jing Zhang, DaCheng Tao

We systematically evaluate both trimap-free and trimap-based matting methods on P3M-10k and find that existing matting methods show different generalization capabilities when following the Privacy-Preserving Training (PPT) setting, i.e., training on face-blurred images and testing on arbitrary images.

Image Matting Privacy Preserving

Hierarchically Modeling Micro and Macro Behaviors via Multi-Task Learning for Conversion Rate Prediction

no code implementations20 Apr 2021 Hong Wen, Jing Zhang, Fuyu Lv, Wentian Bao, Tianyi Wang, Zulong Chen

Motivated by this observation, we propose a novel CVR prediction method by Hierarchically Modeling both Micro and Macro behaviors ($HM^3$).

Multi-Task Learning Selection bias

Generative Transformer for Accurate and Reliable Salient Object Detection

2 code implementations20 Apr 2021 Yuxin Mao, Jing Zhang, Zhexiong Wan, Yuchao Dai, Aixuan Li, Yunqiu Lv, Xinyu Tian, Deng-Ping Fan, Nick Barnes

For the former, we apply a transformer to a deterministic model, and explain that its effective structure modeling and global context modeling abilities lead to its superior performance compared with CNN-based frameworks.

Camouflaged Object Segmentation Machine Translation +5

Learning structure-aware semantic segmentation with image-level supervision

1 code implementation15 Apr 2021 Jiawei Liu, Jing Zhang, Yicong Hong, Nick Barnes

Within this pipeline, the class activation map (CAM) is obtained and further processed to serve as a pseudo label to train the semantic segmentation model in a fully-supervised manner.

Boundary Detection Common Sense Reasoning +2

Bootstrapping Your Own Positive Sample: Contrastive Learning With Electronic Health Record Data

no code implementations7 Apr 2021 Tingyi Wanyan, Jing Zhang, Ying Ding, Ariful Azad, Zhangyang Wang, Benjamin S Glicksberg

Electronic Health Record (EHR) data has been of tremendous utility in Artificial Intelligence (AI) for healthcare such as predicting future clinical events.

Contrastive Learning Data Augmentation

Weakly Supervised Video Salient Object Detection

1 code implementation CVPR 2021 Wangbo Zhao, Jing Zhang, Long Li, Nick Barnes, Nian Liu, Junwei Han

Significant performance improvement has been achieved for fully-supervised video salient object detection with the pixel-wise labeled training datasets, which are time-consuming and expensive to obtain.

object-detection Pseudo Label +3

Uncertainty-aware Joint Salient Object and Camouflaged Object Detection

1 code implementation CVPR 2021 Aixuan Li, Jing Zhang, Yunqiu Lv, Bowen Liu, Tong Zhang, Yuchao Dai

Visual salient object detection (SOD) aims at finding the salient object(s) that attract human attention, while camouflaged object detection (COD), on the contrary, intends to discover the camouflaged object(s) that are hidden in their surroundings.

object-detection Object Detection +1

VDM-DA: Virtual Domain Modeling for Source Data-free Domain Adaptation

no code implementations26 Mar 2021 Jiayi Tian, Jing Zhang, Wen Li, Dong Xu

On the other hand, we also design an effective distribution alignment method to reduce the distribution divergence between the virtual domain and the target domain by gradually improving the compactness of the target domain distribution through model learning.

Object Recognition Unsupervised Domain Adaptation

Forest Fire Clustering for Single-cell Sequencing with Iterative Label Propagation and Parallelized Monte Carlo Simulation

1 code implementation22 Mar 2021 Zhanlin Chen, Jeremy Goldwasser, Philip Tuckman, Jason Liu, Jing Zhang, Mark Gerstein

Here, we introduce Forest Fire Clustering, an efficient and interpretable method for cell-type discovery from single-cell data.

Simultaneously Localize, Segment and Rank the Camouflaged Objects

1 code implementation CVPR 2021 Yunqiu Lv, Jing Zhang, Yuchao Dai, Aixuan Li, Bowen Liu, Nick Barnes, Deng-Ping Fan

With the above understanding of camouflaged objects, we present the first ranking-based COD network (Rank-Net) to simultaneously localize, segment and rank camouflaged objects.

object-detection Object Detection

Understanding WeChat User Preferences and "Wow" Diffusion

1 code implementation4 Mar 2021 Fanjin Zhang, Jie Tang, Xueyi Liu, Zhenyu Hou, Yuxiao Dong, Jing Zhang, Xiao Liu, Ruobing Xie, Kai Zhuang, Xu Zhang, Leyu Lin, Philip S. Yu

"Top Stories" is a novel friend-enhanced recommendation engine in WeChat, in which users can read articles based on both their own preferences and those of their friends.

Graph Representation Learning Social and Information Networks

Koopmans' theorem as the mechanism of nearly gapless surface states in self-doped magnetic topological insulators

no code implementations24 Feb 2021 Weizhao Chen, Yufei Zhao, Qiushi Yao, Jing Zhang, Qihang Liu

The magnetization-induced gap at the surface state is widely regarded as the kernel of magnetic topological insulators (MTIs) because of its relevance to various topological phenomena, such as the quantum anomalous Hall effect and the axion insulator phase.

Materials Science

Multi-Stage Transmission Line Flow Control Using Centralized and Decentralized Reinforcement Learning Agents

no code implementations16 Feb 2021 Xiumin Shang, Jinping Yang, Bingquan Zhu, Lin Ye, Jing Zhang, Jianping Xu, Qin Lyu, Ruisheng Diao

At stage one, a centralized soft actor-critic (SAC) agent is trained to adjust generator active power outputs over a wide area so as to keep transmission line flows within specified security limits.

reinforcement-learning reinforcement Learning

TextTN: Probabilistic Encoding of Language on Tensor Network

no code implementations1 Jan 2021 Peng Zhang, Jing Zhang, Xindian Ma, Siwei Rao, Guangjian Tian, Jun Wang

As a novel model that bridges machine learning and quantum theory, tensor network (TN) has recently gained increasing attention and successful applications for processing natural images.

General Classification Sentiment Analysis +2

Recommending Courses in MOOCs for Jobs: An Auto Weak Supervision Approach

1 code implementation28 Dec 2020 Bowen Hao, Jing Zhang, Cuiping Li, Hong Chen, Hongzhi Yin

On the one hand, the framework enables training multiple supervised ranking models upon the pseudo labels produced by multiple unsupervised ranking models.

Memory-Gated Recurrent Networks

1 code implementation24 Dec 2020 Yaquan Zhang, Qi Wu, Nanbo Peng, Min Dai, Jing Zhang, Hu Wang

The essence of multivariate sequential learning is all about how to extract dependencies in data.

Time Series

Residual Matrix Product State for Machine Learning

no code implementations22 Dec 2020 Ye-Ming Meng, Jing Zhang, Peng Zhang, Chao GAO, Shi-Ju Ran

Tensor network, which originates from quantum physics, is emerging as an efficient tool for classical and quantum machine learning.

BIG-bench Machine Learning Quantum Machine Learning +1

Progressive One-shot Human Parsing

1 code implementation22 Dec 2020 Haoyu He, Jing Zhang, Bhavani Thuraisingham, DaCheng Tao

In this paper, we devise a novel Progressive One-shot Parsing network (POPNet) to address two critical challenges, i.e., testing bias and small sizes.

Human Parsing Metric Learning +1

6 GHz hyperfast rotation of an optically levitated nanoparticle in vacuum

no code implementations17 Dec 2020 Yuanbin Jin, Jiangwei Yan, Shah Jee Rahman, Jie Li, Xudong Yu, Jing Zhang

We measure a highest rotation frequency of about 4.3 GHz for the trapped nanoparticle without feedback cooling and a 6 GHz rotation with feedback cooling, which is the fastest mechanical rotation reported to date.

Optics Mesoscale and Nanoscale Physics Quantum Physics

CODE: Contrastive Pre-training with Adversarial Fine-tuning for Zero-shot Expert Linking

2 code implementations14 Dec 2020 Bo Chen, Jing Zhang, Xiaokang Zhang, Xiaobin Tang, Lingfan Cai, Hong Chen, Cuiping Li, Peng Zhang, Jie Tang

In this paper, we propose CODE, which first pre-trains an expert linking model by contrastive learning on AMiner so that it can capture the representation and matching patterns of experts without supervised signals, and is then fine-tuned between AMiner and external sources in an adversarial manner to enhance the model's transferability.

Active Learning Contrastive Learning +2

Uncertainty-Aware Deep Calibrated Salient Object Detection

no code implementations10 Dec 2020 Jing Zhang, Yuchao Dai, Xin Yu, Mehrtash Harandi, Nick Barnes, Richard Hartley

Existing deep neural network based salient object detection (SOD) methods mainly focus on pursuing high network accuracy.

object-detection Object Detection +1

Auto Learning Attention

1 code implementation NeurIPS 2020 Benteng Ma, Jing Zhang, Yong Xia, DaCheng Tao

Attention modules have been demonstrated to be effective in strengthening the representation ability of a neural network via reweighting spatial or channel features or stacking both operations sequentially.

Image Classification Keypoint Detection +2

3D Guided Weakly Supervised Semantic Segmentation

no code implementations1 Dec 2020 Weixuan Sun, Jing Zhang, Nick Barnes

In this paper, we propose a weakly supervised 2D semantic segmentation model by incorporating sparse bounding box labels with available 3D information, which is much easier to obtain with advanced sensors.

2D Semantic Segmentation Weakly supervised Semantic Segmentation +1

Inter-layer Transition in Neural Architecture Search

1 code implementation30 Nov 2020 Benteng Ma, Jing Zhang, Yong Xia, DaCheng Tao

Differentiable Neural Architecture Search (NAS) methods represent the network architecture as a repetitive proxy directed acyclic graph (DAG) and optimize the network weights and architecture weights alternately in a differentiable manner.

Neural Architecture Search

SIR: Self-supervised Image Rectification via Seeing the Same Scene from Multiple Different Lenses

no code implementations30 Nov 2020 Jinlong Fan, Jing Zhang, DaCheng Tao

However, the model may overfit the synthetic images and not generalize well to real-world fisheye images due to the limited universality of a specific distortion model and the lack of explicit modeling of the distortion and rectification process.

Self-Supervised Learning

DUT: Learning Video Stabilization by Simply Watching Unstable Videos

2 code implementations30 Nov 2020 Yufei Xu, Jing Zhang, Stephen J. Maybank, DaCheng Tao

In this paper, we attempt to tackle the video stabilization problem in a deep unsupervised learning manner, which borrows the divide-and-conquer idea from traditional stabilizers while leveraging the representation power of DNNs to handle the challenges in real-world scenarios.

Association Homography Estimation +1