Search Results for author: Xu Yang

Found 75 papers, 34 papers with code

Learning Progressive Joint Propagation for Human Motion Prediction

no code implementations • ECCV 2020 • Yujun Cai, Lin Huang, Yiwei Wang, Tat-Jen Cham, Jianfei Cai, Junsong Yuan, Jun Liu, Xu Yang, Yiheng Zhu, Xiaohui Shen, Ding Liu, Jing Liu, Nadia Magnenat Thalmann

Last, in order to incorporate a general motion space for high-quality prediction, we build a memory-based dictionary, which aims to preserve the global motion patterns in training data to guide the predictions.

Human motion prediction motion prediction

FedMPQ: Secure and Communication-Efficient Federated Learning with Multi-codebook Product Quantization

no code implementations • 21 Apr 2024 • Xu Yang, Jiapeng Zhang, Qifeng Zhang, Zhuo Tang

In federated learning, particularly in cross-device scenarios, secure aggregation has recently gained popularity as it effectively defends against inference attacks by malicious aggregators.

Federated Learning Quantization

RD2Bench: Toward Data-Centric Automatic R&D

no code implementations • 17 Apr 2024 • Haotian Chen, Xinjie Shen, Zeqi Ye, Xiao Yang, Xu Yang, Weiqing Liu, Jiang Bian

The progress of humanity is driven by those successful discoveries accompanied by countless failed experiments.

Language Modelling Large Language Model +1

Texture-Preserving Diffusion Models for High-Fidelity Virtual Try-On

1 code implementation • 1 Apr 2024 • Xu Yang, Changxing Ding, Zhibin Hong, Junhao Huang, Jin Tao, Xiangmin Xu

Second, we propose a novel diffusion-based method that predicts a precise inpainting mask based on the person and reference garment images, further enhancing the reliability of the try-on results.

Denoising Image Generation +1

DA-PFL: Dynamic Affinity Aggregation for Personalized Federated Learning

no code implementations • 14 Mar 2024 • Xu Yang, Jiyuan Feng, Songyue Guo, Ye Wang, Ye Ding, Binxing Fang, Qing Liao

In this paper, we propose a novel Dynamic Affinity-based Personalized Federated Learning model (DA-PFL) to alleviate the class imbalanced problem during federated learning.

Personalized Federated Learning

FedHCDR: Federated Cross-Domain Recommendation with Hypergraph Signal Decoupling

1 code implementation • 5 Mar 2024 • Hongyu Zhang, Dongyi Zheng, Lin Zhong, Xu Yang, Jiyuan Feng, Yunqing Feng, Qing Liao

Specifically, to address the data heterogeneity across domains, we introduce an approach called hypergraph signal decoupling (HSD) to decouple the user features into domain-exclusive and domain-shared features.

Contrastive Learning Data Augmentation +6

MemoNav: Working Memory Model for Visual Navigation

1 code implementation • 29 Feb 2024 • Hongxin Li, Zeyu Wang, Xu Yang, Yuran Yang, Shuqi Mei, Zhaoxiang Zhang

Subsequently, a graph attention module encodes the retained STM and the LTM to generate working memory (WM) which contains the scene features essential for efficient navigation.

Decision Making Graph Attention +2

Analysis of the Two-Step Heterogeneous Transfer Learning for Laryngeal Blood Vessel Classification: Issue and Improvement

no code implementations • 29 Feb 2024 • Xinyi Fang, Xu Yang, Chak Fong Chong, Kei Long Wong, Yapeng Wang, Tiankui Zhang, Sio-Kei Im

Accurate classification of laryngeal vascular as benign or malignant is crucial for early detection of laryngeal cancer.

Lesion Classification Transfer Learning

A Lightweight Inception Boosted U-Net Neural Network for Routability Prediction

1 code implementation • 7 Feb 2024 • Hailiang Li, Yan Huo, Yan Wang, Xu Yang, Miaohui Hao, Xiao Wang

As the modern CPU, GPU, and NPU chip design complexity and transistor counts keep increasing, and with the relentless shrinking of semiconductor technology nodes to nearly 1 nanometer, the placement and routing have gradually become the two most pivotal processes in modern very-large-scale-integrated (VLSI) circuit back-end design.

Avg SSIM

Do You Guys Want to Dance: Zero-Shot Compositional Human Dance Generation with Multiple Persons

no code implementations • 24 Jan 2024 • Zhe Xu, Kun Wei, Xu Yang, Cheng Deng

Human dance generation (HDG) aims to synthesize realistic videos from images and sequences of driving poses.

ICD-LM: Configuring Vision-Language In-Context Demonstrations by Language Modeling

1 code implementation • 15 Dec 2023 • Yingzhe Peng, Xu Yang, Haoxuan Ma, Shuo Xu, Chi Zhang, Yucheng Han, Hanwang Zhang

Moreover, during data construction, we use the LVLM intended for ICL implementation to validate the strength of each ICD sequence, resulting in a model-specific dataset and the ICD-LM trained by this dataset is also model-specific.

Image Captioning In-Context Learning +4

Building Variable-sized Models via Learngene Pool

no code implementations • 10 Dec 2023 • Boyu Shi, Shiyu Xia, Xu Yang, Haokun Chen, Zhiqiang Kou, Xin Geng

To overcome these challenges, motivated by the recently proposed Learngene framework, we propose a novel method called Learngene Pool.

Transformer as Linear Expansion of Learngene

1 code implementation • 9 Dec 2023 • Shiyu Xia, Miaosen Zhang, Xu Yang, Ruiming Chen, Haokun Chen, Xin Geng

Under the situation where we need to produce models of varying depths adapting for different resource constraints, TLEG achieves comparable results while reducing around 19x parameters stored to initialize these models and around 5x pre-training costs, in contrast to the pre-training and fine-tuning approach.

How to Configure Good In-Context Sequence for Visual Question Answering

1 code implementation • 4 Dec 2023 • Li Li, Jiawei Peng, Huiyi Chen, Chongyang Gao, Xu Yang

Inspired by the success of Large Language Models in dealing with new tasks via In-Context Learning (ICL) in NLP, researchers have also developed Large Vision-Language Models (LVLMs) with ICL capabilities.

In-Context Learning Question Answering +2

Manipulating the Label Space for In-Context Classification

no code implementations • 1 Dec 2023 • Haokun Chen, Xu Yang, Yuhang Huang, Zihan Wu, Jing Wang, Xin Geng

Specifically, using our approach on ImageNet, we increase accuracy from 74.70% in a 4-shot setting to 76.21% with just 2 shots.

Classification Contrastive Learning +2

Category-Wise Fine-Tuning for Image Multi-label Classification with Partial Labels

2 code implementations • International Conference on Neural Information Processing 2023 • Chak Fong Chong, Xu Yang, Tenglong Wang, Wei Ke, Yapeng Wang

A single model submitted to the competition server for the official evaluation achieves mAUC 91.82% on the test set, which is the highest single model score in the leaderboard and literature.

Ranked #1 on Multi-Label Classification on CheXpert

Binary Classification Multi-Label Classification

ChartLlama: A Multimodal LLM for Chart Understanding and Generation

no code implementations • 27 Nov 2023 • Yucheng Han, Chi Zhang, Xin Chen, Xu Yang, Zhibin Wang, Gang Yu, Bin Fu, Hanwang Zhang

Next, we introduce ChartLlama, a multi-modal large language model that we've trained using our created dataset.

Language Modelling Large Language Model

Rethinking Residual Connection in Training Large-Scale Spiking Neural Networks

no code implementations • 9 Nov 2023 • Yudong Li, Yunlin Lei, Xu Yang

Spiking Neural Network (SNN) is known as the most famous brain-inspired model, but the non-differentiable spiking mechanism makes it hard to train large-scale SNNs.

Rethinking Evaluation Metrics of Open-Vocabulary Segmentaion

1 code implementation • 6 Nov 2023 • Hao Zhou, Tiancheng Shen, Xu Yang, Hai Huang, Xiangtai Li, Lu Qi, Ming-Hsuan Yang

We benchmarked the proposed evaluation metrics on 12 open-vocabulary methods of three segmentation tasks.

Segmentation

Leveraging Large Language Model for Automatic Evolving of Industrial Data-Centric R&D Cycle

no code implementations • 17 Oct 2023 • Xu Yang, Xiao Yang, Weiqing Liu, Jinhui Li, Peng Yu, Zeqi Ye, Jiang Bian

In the wake of relentless digital transformation, data-driven solutions are emerging as powerful tools to address multifarious industrial tasks such as forecasting, anomaly detection, planning, and even complex decision-making.

Anomaly Detection Decision Making +2

SeisT: A foundational deep learning model for earthquake monitoring tasks

1 code implementation • 2 Oct 2023 • Sen Li, Xu Yang, Anye Cao, Changbin Wang, Yaoqi Liu, Yapeng Liu, Qiang Niu

The most significant improvements, in comparison to existing models, are observed in phase-P picking, phase-S picking, and magnitude estimation, with gains of 1.7%, 9.5%, and 8.0%, respectively.

Out-of-Distribution Generalization

FedDCSR: Federated Cross-domain Sequential Recommendation via Disentangled Representation Learning

1 code implementation • 15 Sep 2023 • Hongyu Zhang, Dongyi Zheng, Xu Yang, Jiyuan Feng, Qing Liao

Nonetheless, the sequence feature heterogeneity across different domains significantly impacts the overall performance of FL.

Data Augmentation Disentanglement +3

Temporal Difference Learning for High-Dimensional PIDEs with Jumps

no code implementations • 6 Jul 2023 • Liwei Lu, Hailong Guo, Xu Yang, Yi Zhu

In this paper, we propose a deep learning framework for solving high-dimensional partial integro-differential equations (PIDEs) based on the temporal difference learning.

Genes in Intelligent Agents

1 code implementation • 17 Jun 2023 • Fu Feng, Jing Wang, Xu Yang, Xin Geng

Inspired by biological intelligence, artificial intelligence (AI) has been devoted to building machine intelligence.

reinforcement-learning Reinforcement Learning (RL)

Exploring Diverse In-Context Configurations for Image Captioning

1 code implementation • NeurIPS 2023 • Xu Yang, Yongliang Wu, Mingzhuo Yang, Haokun Chen, Xin Geng

After discovering that Language Models (LMs) can be good in-context few-shot learners, numerous strategies have been proposed to optimize in-context sequence configurations.

Image Captioning In-Context Learning

Transforming Visual Scene Graphs to Image Captions

1 code implementation • 3 May 2023 • Xu Yang, Jiawei Peng, Zihua Wang, Haiyang Xu, Qinghao Ye, Chenliang Li, Songfang Huang, Fei Huang, Zhangzikang Li, Yu Zhang

In TSG, we apply multi-head attention (MHA) to design the Graph Neural Network (GNN) for embedding scene graphs.

Attribute Descriptive +1

Learngene: Inheriting Condensed Knowledge from the Ancestry Model to Descendant Models

no code implementations • 3 May 2023 • Qiufeng Wang, Xu Yang, Shuxia Lin, Jing Wang, Xin Geng

(i) Accumulating: the knowledge is accumulated during the continuous learning of an ancestry model.

SC-ML: Self-supervised Counterfactual Metric Learning for Debiased Visual Question Answering

no code implementations • 4 Apr 2023 • Xinyao Shu, ShiYang Yan, Xu Yang, Ziheng Wu, Zhongfeng Chen, Zhenyu Lu

Unfortunately, language bias is a common problem in VQA, which refers to the model generating answers only by associating with the questions while ignoring the visual content, resulting in biased results.

counterfactual Metric Learning +2

Spatial Attention and Syntax Rule Enhanced Tree Decoder for Offine Handwritten Mathematical Expression Recognition

no code implementations • 13 Mar 2023 • Zihao Lin, Jinrong Li, Fan Yang, Shuangping Huang, Xu Yang, Jianmin Lin, Ming Yang

In this paper, we propose a novel model called Spatial Attention and Syntax Rule Enhanced Tree Decoder (SS-TD), which is equipped with spatial attention mechanism to alleviate the prediction error of tree structure and use syntax masks (obtained from the transformation of syntax rules) to constrain the occurrence of ungrammatical mathematical expression.

Learning Trajectory-Word Alignments for Video-Language Tasks

no code implementations • ICCV 2023 • Xu Yang, Zhangzikang Li, Haiyang Xu, Hanwang Zhang, Qinghao Ye, Chenliang Li, Ming Yan, Yu Zhang, Fei Huang, Songfang Huang

To amend this, we propose a novel TW-BERT to learn Trajectory-Word alignment by a newly designed trajectory-to-word (T2W) attention for solving video-language tasks.

Question Answering Retrieval +4

Adaptively Clustering Neighbor Elements for Image Captioning

no code implementations • 5 Jan 2023 • Zihua Wang, Xu Yang, Haiyang Xu, Hanwang Zhang, Qinghao Ye, Chenliang Li, Weiwei Sun, Ming Yan, Songfang Huang, Fei Huang, Yu Zhang

We design a novel global-local Transformer named Ada-ClustFormer (ACF) to generate captions.

Clustering Image Captioning

Spikeformer: A Novel Architecture for Training High-Performance Low-Latency Spiking Neural Network

1 code implementation • 19 Nov 2022 • Yudong Li, Yunlin Lei, Xu Yang

Spiking neural networks (SNNs) have made great progress on both performance and efficiency over the last few years, but their unique working pattern makes it hard to train a high-performance low-latency SNN. Thus the development of SNNs still lags behind traditional artificial neural networks (ANNs). To compensate this gap, many extraordinary works have been proposed. Nevertheless, these works are mainly based on the same kind of network structure (i.e. CNN) and their performance is worse than their ANN counterparts, which limits the applications of SNNs. To this end, we propose a novel Transformer-based SNN, termed "Spikeformer", which outperforms its ANN counterpart on both static dataset and neuromorphic dataset and may be an alternative architecture to CNN for training high-performance SNNs. First, to deal with the problem of "data hungry" and the unstable training period exhibited in the vanilla model, we design the Convolutional Tokenizer (CT) module, which improves the accuracy of the original model on DVS-Gesture by more than 16%. Besides, in order to better incorporate the attention mechanism inside Transformer and the spatio-temporal information inherent to SNN, we adopt spatio-temporal attention (STA) instead of spatial-wise or temporal-wise attention. With our proposed method, we achieve competitive or state-of-the-art (SOTA) SNN performance on DVS-CIFAR10, DVS-Gesture, and ImageNet datasets with the least simulation time steps (i.e. low latency). Remarkably, our Spikeformer outperforms other SNNs on ImageNet by a large margin (i.e. more than 5%) and even outperforms its ANN counterpart by 3.1% and 2.2% on DVS-Gesture and ImageNet respectively, indicating that Spikeformer is a promising architecture for training large-scale SNNs and may be more suitable for SNNs compared to CNN. We believe that this work shall keep the development of SNNs in step with ANNs as much as possible. Code will be available.

Image Projective Transformation Rectification with Synthetic Data for Smartphone-captured Chest X-ray Photos Classification

1 code implementation • 12 Oct 2022 • Chak Fong Chong, Yapeng Wang, Benjamin Ng, Wuman Luo, Xu Yang

To the best of our knowledge, it is the first work to predict the projective transformation matrix as the learning goal for photo rectification.

Ranked #1 on Medical Image Classification on CheXphoto

Image Classification Medical Image Classification

Learning to Collocate Visual-Linguistic Neural Modules for Image Captioning

1 code implementation • 4 Oct 2022 • Xu Yang, Hanwang Zhang, Chongyang Gao, Jianfei Cai

This is because the language is only partially observable, for which we need to dynamically collocate the modules during the process of image captioning.

Image Captioning Sentence +2

MemoNav: Selecting Informative Memories for Visual Navigation

no code implementations • 20 Aug 2022 • Hongxin Li, Xu Yang, Yuran Yang, Shuqi Mei, Zhaoxiang Zhang

To address this limitation, we present the MemoNav, a novel memory mechanism for image-goal navigation, which retains the agent's informative short-term memory and long-term memory to improve the navigation performance on a multi-goal task.

Action Generation Graph Attention +2

Automatically Discovering Novel Visual Categories with Self-supervised Prototype Learning

1 code implementation • 1 Aug 2022 • Lu Zhang, Lu Qi, Xu Yang, Hong Qiao, Ming-Hsuan Yang, Zhiyong Liu

In the first stage, we obtain a robust feature extractor, which could serve for all images with base and novel categories.

Representation Learning Self-Supervised Learning

Siamese Contrastive Embedding Network for Compositional Zero-Shot Learning

1 code implementation • CVPR 2022 • Xiangyu Li, Xu Yang, Kun Wei, Cheng Deng, Muli Yang

Some methods recognize state and object with two trained classifiers, ignoring the impact of the interaction between object and state; the other methods try to learn the joint representation of the state-object compositions, leading to the domain gap between seen and unseen composition sets.

Compositional Zero-Shot Learning Object

iExam: A Novel Online Exam Monitoring and Analysis System Based on Face Detection and Recognition

1 code implementation • 27 Jun 2022 • Xu Yang, Daoyuan Wu, Xiao Yi, Jimmy H. M. Lee, Tan Lee

In this paper, we propose iExam, an intelligent online exam monitoring and analysis system that can not only use face detection to assist invigilators in real-time student identification, but also be able to detect common abnormal behaviors (including face disappearing, rotating faces, and replacing with a different person during the exams) via a face recognition-based post-exam video analysis.

Face Detection Face Recognition +2

Unseen Object Instance Segmentation with Fully Test-time RGB-D Embeddings Adaptation

no code implementations • 21 Apr 2022 • Lu Zhang, Siqi Zhang, Xu Yang, Hong Qiao, Zhiyong Liu

In this paper, we emphasize the adaptation process across sim2real domains and model it as a learning problem on the BatchNorm parameters of a simulation-trained model.

Knowledge Distillation Segmentation +4

Weakly Aligned Feature Fusion for Multimodal Object Detection

no code implementations • 21 Apr 2022 • Lu Zhang, Zhiyong Liu, Xiangyu Zhu, Zhan Song, Xu Yang, Zhen Lei, Hong Qiao

In this article, we propose a general multimodal detector named aligned region CNN (AR-CNN) to tackle the position shift problem.

Object object-detection +2

Not Just Selection, but Exploration: Online Class-Incremental Continual Learning via Dual View Consistency

1 code implementation • CVPR 2022 • Yanan Gu, Xu Yang, Kun Wei, Cheng Deng

Unfortunately, these methods only focus on selecting samples from the memory bank for replay and ignore the adequate exploration of semantic information in the single-pass data stream, leading to poor classification accuracy.

Continual Learning

Show, Deconfound and Tell: Image Captioning With Causal Inference

1 code implementation • CVPR 2022 • Bing Liu, Dong Wang, Xu Yang, Yong Zhou, Rui Yao, Zhiwen Shao, Jiaqi Zhao

In the encoding stage, the IOD is able to disentangle the region-based visual features by deconfounding the visual confounder.

Causal Inference Image Captioning

Towards End-to-End Image Compression and Analysis with Transformers

1 code implementation • 17 Dec 2021 • Yuanchao Bai, Xu Yang, Xianming Liu, Junjun Jiang, YaoWei Wang, Xiangyang Ji, Wen Gao

Meanwhile, we propose a feature aggregation module to fuse the compressed features with the selected intermediate features of the Transformer, and feed the aggregated features to a deconvolutional neural network for image reconstruction.

Classification Image Classification +3

Auto-Encoding Score Distribution Regression for Action Quality Assessment

2 code implementations • 22 Nov 2021 • Boyu Zhang, Jiayuan Chen, Yinfei Xu, HUI ZHANG, Xu Yang, Xin Geng

Traditionally, AQA is treated as a regression problem to learn the underlying mappings between videos and action scores.

Ranked #1 on Action Quality Assessment on JIGSAWS

Action Quality Assessment regression

EMScore: Evaluating Video Captioning via Coarse-Grained and Fine-Grained Embedding Matching

1 code implementation • CVPR 2022 • Yaya Shi, Xu Yang, Haiyang Xu, Chunfeng Yuan, Bing Li, Weiming Hu, Zheng-Jun Zha

The datasets will be released to facilitate the development of video captioning metrics.

Language Modelling Video Captioning

Sliding Sequential CVAE with Time Variant Socially-aware Rethinking for Trajectory Prediction

no code implementations • 28 Oct 2021 • Hao Zhou, Dongchun Ren, Xu Yang, Mingyu Fan, Hai Huang

First, with the continuation of time, the prediction error at each time step increases significantly, causing the final displacement error to be impossible to ignore.

Autonomous Driving Pedestrian Trajectory Prediction +3

Can AI detect pain and express pain empathy? A review from emotion recognition and a human-centered AI perspective

no code implementations • 8 Oct 2021 • Siqi Cao, Di Fu, Xu Yang, Stefan Wermter, Xun Liu, Haiyan Wu

Furthermore, we discuss challenges for responsible evaluation of cognitive methods and computational techniques and show approaches to future work to contribute to affective assistants capable of empathy.

Emotion Recognition

Text-Driven Image Manipulation via Semantic-Aware Knowledge Transfer

no code implementations • 29 Sep 2021 • Ziqi Zhang, Cheng Deng, Kun Wei, Xu Yang

And on this basis, a novel attribute transfer method, named semantic directional decomposition network (SDD-Net), is proposed to achieve semantic-level facial attribute transfer by latent semantic direction decomposition, improving the interpretability and editability of our method.

Attribute Image Manipulation +1

Open Set Domain Adaptation with Zero-shot Learning on Graph

no code implementations • 29 Sep 2021 • Xinyue Zhang, Xu Yang, Zhi-Yong Liu

Thus the classification ability of the source domain is transferred to the target domain and the model can distinguish the unknown classes with prior knowledge.

Domain Adaptation Zero-Shot Learning

Auto-Parsing Network for Image Captioning and Visual Question Answering

no code implementations • ICCV 2021 • Xu Yang, Chongyang Gao, Hanwang Zhang, Jianfei Cai

We propose an Auto-Parsing Network (APN) to discover and exploit the input data's hidden tree structures for improving the effectiveness of the Transformer-based vision-language systems.

Image Captioning Question Answering +1

Towards Unbiased Visual Emotion Recognition via Causal Intervention

1 code implementation • 26 Jul 2021 • Yuedong Chen, Xu Yang, Tat-Jen Cham, Jianfei Cai

In this work, we scrutinize this problem from the perspective of causal inference, where such dataset characteristic is termed as a confounder which misleads the system to learn the spurious correlation.

Causal Inference Emotion Recognition

Nearest Neighbor Matching for Deep Clustering

1 code implementation • CVPR 2021 • Zhiyuan Dang, Cheng Deng, Xu Yang, Kun Wei, Heng Huang

Specifically, for the local level, we match the nearest neighbors based on batch embedded features, as for the global one, we match neighbors from overall embedded features.

Clustering Deep Clustering

SelfSAGCN: Self-Supervised Semantic Alignment for Graph Convolution Network

1 code implementation • CVPR 2021 • Xu Yang, Cheng Deng, Zhiyuan Dang, Kun Wei, Junchi Yan

Specifically, the Identity Aggregation is applied to extract semantic features from labeled nodes, the Semantic Alignment is utilized to align node features obtained from different aspects using the class central similarity.

Representation Learning

Doubly Contrastive Deep Clustering

1 code implementation • 9 Mar 2021 • Zhiyuan Dang, Cheng Deng, Xu Yang, Heng Huang

In this paper, we present a novel Doubly Contrastive Deep Clustering (DCDC) framework, which constructs contrastive loss over both sample and class views to obtain more discriminative features and competitive results.

Clustering Contrastive Learning +2

Causal Attention for Vision-Language Tasks

no code implementations • CVPR 2021 • Xu Yang, Hanwang Zhang, GuoJun Qi, Jianfei Cai

Specifically, CATT is implemented as a combination of 1) In-Sample Attention (IS-ATT) and 2) Cross-Sample Attention (CS-ATT), where the latter forcibly brings other samples into every IS-ATT, mimicking the causal intervention.

A Distributed Implementation of Steady-State Kalman Filter

no code implementations • 26 Jan 2021 • Jiaqi Yan, Xu Yang, Yilin Mo, Keyou You

This paper studies the distributed state estimation in sensor network, where $m$ sensors are deployed to infer the $n$-dimensional state of a linear time-invariant (LTI) Gaussian system.

Incremental Embedding Learning via Zero-Shot Translation

1 code implementation • 31 Dec 2020 • Kun Wei, Cheng Deng, Xu Yang, Maosen Li

Different from traditional incremental classification networks, the semantic gap between the embedding spaces of two adjacent tasks is the main challenge for embedding networks under incremental learning setting.

Face Recognition Image Retrieval +4

Adversarial Learning for Robust Deep Clustering

1 code implementation • NeurIPS 2020 • Xu Yang, Cheng Deng, Kun Wei, Junchi Yan, Wei Liu

Meanwhile, we devise an adversarial attack strategy to explore samples that easily fool the clustering layers but do not impact the performance of the deep embedding.

Adversarial Attack Clustering +1

Cloud Cover and Aurora Contamination at Dome A in 2017 from KLCAM

no code implementations • 7 Oct 2020 • Xu Yang, Zhaohui Shang, Keliang Hu, Yi Hu, Bin Ma, Yongjiang Wang, Zihuang Cao, Michael C. B. Ashley, Wei Wang

Dome A in Antarctica has many characteristics that make it an excellent site for astronomical observations, from the optical to the terahertz.

Instrumentation and Methods for Astrophysics

Finding It at Another Side: A Viewpoint-Adapted Matching Encoder for Change Captioning

no code implementations • ECCV 2020 • Xiangxi Shi, Xu Yang, Jiuxiang Gu, Shafiq Joty, Jianfei Cai

In this paper, we propose a novel visual encoder to explicitly distinguish viewpoint changes from semantic changes in the change captioning task.

Reinforcement Learning (RL)

Learning to Scan: A Deep Reinforcement Learning Approach for Personalized Scanning in CT Imaging

no code implementations • 3 Jun 2020 • Ziju Shen, YuFei Wang, Dufan Wu, Xu Yang, Bin Dong

It is more desirable to design a personalized scanning strategy for each subject to obtain better reconstruction result.

Computed Tomography (CT) Reinforcement Learning (RL)

Deconfounded Image Captioning: A Causal Retrospect

no code implementations • 9 Mar 2020 • Xu Yang, Hanwang Zhang, Jianfei Cai

Dataset bias in vision-language tasks is becoming one of the main problems which hinders the progress of our community.

Causal Inference Image Captioning

Classical limit for the varying-mass Schrödinger equation with random inhomogeneities

no code implementations • 12 Feb 2020 • Shi Chen, Qin Li, Xu Yang

The varying-mass Schrödinger equation (VMSE) has been successfully applied to model electronic properties of semiconductor hetero-structures, for example, quantum dots and quantum wells.

Numerical Analysis

Automated Pavement Crack Segmentation Using U-Net-based Convolutional Neural Network

no code implementations • 7 Jan 2020 • Stephen L. H. Lau, Edwin K. P. Chong, Xu Yang, Xin Wang

In this paper, we propose a deep learning technique based on a convolutional neural network to perform segmentation tasks on pavement crack images.

Crack Segmentation Feature Engineering +2

TBC-Net: A real-time detector for infrared small target detection using semantic constraint

no code implementations • 27 Dec 2019 • Mingxin Zhao, Li Cheng, Xu Yang, Peng Feng, Liyuan Liu, Nanjian Wu

Meanwhile, we propose a joint loss function and a training method.

mu-Forcing: Training Variational Recurrent Autoencoders for Text Generation

2 code implementations • 24 May 2019 • Dayiheng Liu, Xu Yang, Feng He, YuanYuan Chen, Jiancheng Lv

It has been previously observed that training Variational Recurrent Autoencoders (VRAE) for text generation suffers from serious uninformative latent variables problem.

Language Modelling Text Generation

Deep Spectral Clustering using Dual Autoencoder Network

no code implementations • CVPR 2019 • Xu Yang, Cheng Deng, Feng Zheng, Junchi Yan, Wei Liu

In this paper, we propose a joint learning framework for discriminative embedding and spectral clustering.

Clustering Deep Clustering +1

Learning to Collocate Neural Modules for Image Captioning

no code implementations • ICCV 2019 • Xu Yang, Hanwang Zhang, Jianfei Cai

To this end, we make the following technical contributions for CNM training: 1) compact module design: one for function words and three for visual content words (e.g., noun, adjective, and verb), 2) soft module fusion and multi-step module execution, robustifying the visual reasoning in partial observation, 3) a linguistic loss for module controller being faithful to part-of-speech collocations (e.g., adjective is before noun).

Image Captioning Sentence +2

Unpaired Image Captioning via Scene Graph Alignments

no code implementations • ICCV 2019 • Jiuxiang Gu, Shafiq Joty, Jianfei Cai, Handong Zhao, Xu Yang, Gang Wang

Most of current image captioning models heavily rely on paired image-caption datasets.

Image Captioning Sentence

Weakly Aligned Cross-Modal Learning for Multispectral Pedestrian Detection

no code implementations • ICCV 2019 • Lu Zhang, Xiangyu Zhu, Xiangyu Chen, Xu Yang, Zhen Lei, Zhi-Yong Liu

In this paper, we propose a novel Aligned Region CNN (AR-CNN) to handle the weakly aligned multispectral data in an end-to-end way.

Position

Auto-Encoding Scene Graphs for Image Captioning

2 code implementations • CVPR 2019 • Xu Yang, Kaihua Tang, Hanwang Zhang, Jianfei Cai

We propose Scene Graph Auto-Encoder (SGAE) that incorporates the language inductive bias into the encoder-decoder image captioning framework for more human-like captions.

Image Captioning Inductive Bias +1