no code implementations • ECCV 2020 • Yujun Cai, Lin Huang, Yiwei Wang, Tat-Jen Cham, Jianfei Cai, Junsong Yuan, Jun Liu, Xu Yang, Yiheng Zhu, Xiaohui Shen, Ding Liu, Jing Liu, Nadia Magnenat Thalmann
Lastly, to incorporate a general motion space for high-quality prediction, we build a memory-based dictionary that preserves the global motion patterns in the training data to guide the predictions.
no code implementations • 21 Apr 2024 • Xu Yang, Jiapeng Zhang, Qifeng Zhang, Zhuo Tang
In federated learning, particularly in cross-device scenarios, secure aggregation has recently gained popularity as it effectively defends against inference attacks by malicious aggregators.
no code implementations • 17 Apr 2024 • Haotian Chen, Xinjie Shen, Zeqi Ye, Xiao Yang, Xu Yang, Weiqing Liu, Jiang Bian
The progress of humanity is driven by those successful discoveries accompanied by countless failed experiments.
1 code implementation • 1 Apr 2024 • Xu Yang, Changxing Ding, Zhibin Hong, Junhao Huang, Jin Tao, Xiangmin Xu
Second, we propose a novel diffusion-based method that predicts a precise inpainting mask based on the person and reference garment images, further enhancing the reliability of the try-on results.
no code implementations • 14 Mar 2024 • Xu Yang, Jiyuan Feng, Songyue Guo, Ye Wang, Ye Ding, Binxing Fang, Qing Liao
In this paper, we propose a novel Dynamic Affinity-based Personalized Federated Learning model (DA-PFL) to alleviate the class imbalanced problem during federated learning.
1 code implementation • 5 Mar 2024 • Hongyu Zhang, Dongyi Zheng, Lin Zhong, Xu Yang, Jiyuan Feng, Yunqing Feng, Qing Liao
Specifically, to address the data heterogeneity across domains, we introduce an approach called hypergraph signal decoupling (HSD) to decouple the user features into domain-exclusive and domain-shared features.
1 code implementation • 29 Feb 2024 • Hongxin Li, Zeyu Wang, Xu Yang, Yuran Yang, Shuqi Mei, Zhaoxiang Zhang
Subsequently, a graph attention module encodes the retained STM and the LTM to generate working memory (WM) which contains the scene features essential for efficient navigation.
no code implementations • 29 Feb 2024 • Xinyi Fang, Xu Yang, Chak Fong Chong, Kei Long Wong, Yapeng Wang, Tiankui Zhang, Sio-Kei Im
Accurate classification of laryngeal vascular lesions as benign or malignant is crucial for the early detection of laryngeal cancer.
1 code implementation • 7 Feb 2024 • Hailiang Li, Yan Huo, Yan Wang, Xu Yang, Miaohui Hao, Xiao Wang
As the design complexity and transistor counts of modern CPU, GPU, and NPU chips keep increasing, and with semiconductor technology nodes relentlessly shrinking toward 1 nanometer, placement and routing have gradually become the two most pivotal processes in modern very-large-scale integration (VLSI) back-end circuit design.
no code implementations • 24 Jan 2024 • Zhe Xu, Kun Wei, Xu Yang, Cheng Deng
Human dance generation (HDG) aims to synthesize realistic videos from images and sequences of driving poses.
1 code implementation • 15 Dec 2023 • Yingzhe Peng, Xu Yang, Haoxuan Ma, Shuo Xu, Chi Zhang, Yucheng Han, Hanwang Zhang
Moreover, during data construction, we use the LVLM intended for ICL implementation to validate the strength of each ICD sequence, resulting in a model-specific dataset; the ICD-LM trained on this dataset is therefore also model-specific.
no code implementations • 10 Dec 2023 • Boyu Shi, Shiyu Xia, Xu Yang, Haokun Chen, Zhiqiang Kou, Xin Geng
To overcome these challenges, motivated by the recently proposed Learngene framework, we propose a novel method called Learngene Pool.
1 code implementation • 9 Dec 2023 • Shiyu Xia, Miaosen Zhang, Xu Yang, Ruiming Chen, Haokun Chen, Xin Geng
When we need to produce models of varying depths to adapt to different resource constraints, TLEG achieves comparable results while requiring around 19x fewer parameters stored to initialize these models and around 5x lower pre-training costs, in contrast to the pre-training and fine-tuning approach.
1 code implementation • 4 Dec 2023 • Li Li, Jiawei Peng, Huiyi Chen, Chongyang Gao, Xu Yang
Inspired by the success of Large Language Models in dealing with new tasks via In-Context Learning (ICL) in NLP, researchers have also developed Large Vision-Language Models (LVLMs) with ICL capabilities.
no code implementations • 1 Dec 2023 • Haokun Chen, Xu Yang, Yuhang Huang, Zihan Wu, Jing Wang, Xin Geng
Specifically, using our approach on ImageNet, we increase accuracy from 74.70% in a 4-shot setting to 76.21% with just 2 shots.
2 code implementations • International Conference on Neural Information Processing 2023 • Chak Fong Chong, Xu Yang, Tenglong Wang, Wei Ke, Yapeng Wang
A single model submitted to the competition server for the official evaluation achieves an mAUC of 91.82% on the test set, which is the highest single-model score on the leaderboard and in the literature.
Ranked #1 on Multi-Label Classification on CheXpert
no code implementations • 27 Nov 2023 • Yucheng Han, Chi Zhang, Xin Chen, Xu Yang, Zhibin Wang, Gang Yu, Bin Fu, Hanwang Zhang
Next, we introduce ChartLlama, a multi-modal large language model that we've trained using our created dataset.
no code implementations • 9 Nov 2023 • Yudong Li, Yunlin Lei, Xu Yang
Spiking Neural Networks (SNNs) are among the best-known brain-inspired models, but their non-differentiable spiking mechanism makes it hard to train large-scale SNNs.
1 code implementation • 6 Nov 2023 • Hao Zhou, Tiancheng Shen, Xu Yang, Hai Huang, Xiangtai Li, Lu Qi, Ming-Hsuan Yang
We benchmarked the proposed evaluation metrics on 12 open-vocabulary methods of three segmentation tasks.
no code implementations • 17 Oct 2023 • Xu Yang, Xiao Yang, Weiqing Liu, Jinhui Li, Peng Yu, Zeqi Ye, Jiang Bian
In the wake of relentless digital transformation, data-driven solutions are emerging as powerful tools to address multifarious industrial tasks such as forecasting, anomaly detection, planning, and even complex decision-making.
1 code implementation • 2 Oct 2023 • Sen Li, Xu Yang, Anye Cao, Changbin Wang, Yaoqi Liu, Yapeng Liu, Qiang Niu
The most significant improvements, in comparison to existing models, are observed in phase-P picking, phase-S picking, and magnitude estimation, with gains of 1.7%, 9.5%, and 8.0%, respectively.
1 code implementation • 15 Sep 2023 • Hongyu Zhang, Dongyi Zheng, Xu Yang, Jiyuan Feng, Qing Liao
Nonetheless, the sequence feature heterogeneity across different domains significantly impacts the overall performance of FL.
no code implementations • 6 Jul 2023 • Liwei Lu, Hailong Guo, Xu Yang, Yi Zhu
In this paper, we propose a deep learning framework for solving high-dimensional partial integro-differential equations (PIDEs) based on the temporal difference learning.
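Temporal-difference learning is usually introduced in the discrete setting. As a minimal, self-contained illustration of the TD(0) update such frameworks build on (a toy random walk, not the PIDE solver itself), this sketch estimates state values from sampled transitions:

```python
import numpy as np

def td0_chain(n_states=5, episodes=200, alpha=0.1, gamma=1.0, seed=0):
    """TD(0) value estimation on a symmetric random walk.

    States 0..n_states-1; each episode starts in the middle and ends
    when the walk exits either side, with reward 1 only for exiting
    on the right.
    """
    rng = np.random.default_rng(seed)
    V = np.zeros(n_states)
    for _ in range(episodes):
        s = n_states // 2
        while True:
            s_next = s + (1 if rng.random() < 0.5 else -1)
            if s_next < 0 or s_next >= n_states:
                r = 1.0 if s_next >= n_states else 0.0
                V[s] += alpha * (r - V[s])              # terminal TD update
                break
            V[s] += alpha * (gamma * V[s_next] - V[s])  # bootstrap on next state
            s = s_next
    return V
```

The deep PIDE framework replaces the tabular `V` with a neural network and the random walk with sampled trajectories of the underlying jump process; the TD error structure is the shared ingredient.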
1 code implementation • 17 Jun 2023 • Fu Feng, Jing Wang, Xu Yang, Xin Geng
Inspired by biological intelligence, artificial intelligence (AI) has been devoted to building machine intelligence.
1 code implementation • NeurIPS 2023 • Xu Yang, Yongliang Wu, Mingzhuo Yang, Haokun Chen, Xin Geng
After discovering that Language Models (LMs) can be good in-context few-shot learners, numerous strategies have been proposed to optimize in-context sequence configurations.
1 code implementation • 3 May 2023 • Xu Yang, Jiawei Peng, Zihua Wang, Haiyang Xu, Qinghao Ye, Chenliang Li, Songfang Huang, Fei Huang, Zhangzikang Li, Yu Zhang
In TSG, we apply multi-head attention (MHA) to design the Graph Neural Network (GNN) for embedding scene graphs.
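As a rough sketch of the idea, multi-head dot-product attention restricted to graph edges can serve as the aggregation step of a GNN over a scene graph. All names and shapes below are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def mha_graph_layer(node_feats, adj, num_heads=4):
    """Aggregate neighbor features with edge-masked multi-head attention.

    node_feats: (N, D) node embeddings; adj: (N, N) 0/1 adjacency with
    self-loops included (so every row has at least one edge).
    Returns updated (N, D) embeddings.
    """
    n, d = node_feats.shape
    assert d % num_heads == 0
    dh = d // num_heads
    out = np.zeros_like(node_feats)
    for h in range(num_heads):
        x = node_feats[:, h * dh:(h + 1) * dh]           # per-head slice
        scores = x @ x.T / np.sqrt(dh)                   # dot-product attention
        scores = np.where(adj > 0, scores, -np.inf)      # mask non-edges
        w = np.exp(scores - scores.max(axis=1, keepdims=True))
        w = w / w.sum(axis=1, keepdims=True)             # softmax over neighbors
        out[:, h * dh:(h + 1) * dh] = w @ x              # weighted aggregation
    return out
```

Restricting the softmax to graph edges is what turns generic MHA into a message-passing step over the scene graph.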
no code implementations • 3 May 2023 • Qiufeng Wang, Xu Yang, Shuxia Lin, Jing Wang, Xin Geng
(i) Accumulating: the knowledge is accumulated during the continuous learning of an ancestry model.
no code implementations • 4 Apr 2023 • Xinyao Shu, ShiYang Yan, Xu Yang, Ziheng Wu, Zhongfeng Chen, Zhenyu Lu
Unfortunately, language bias is a common problem in VQA: the model generates answers only by association with the questions while ignoring the visual content, resulting in biased predictions.
no code implementations • 13 Mar 2023 • Zihao Lin, Jinrong Li, Fan Yang, Shuangping Huang, Xu Yang, Jianmin Lin, Ming Yang
In this paper, we propose a novel model called Spatial Attention and Syntax Rule Enhanced Tree Decoder (SS-TD), which is equipped with spatial attention mechanism to alleviate the prediction error of tree structure and use syntax masks (obtained from the transformation of syntax rules) to constrain the occurrence of ungrammatical mathematical expression.
no code implementations • ICCV 2023 • Xu Yang, Zhangzikang Li, Haiyang Xu, Hanwang Zhang, Qinghao Ye, Chenliang Li, Ming Yan, Yu Zhang, Fei Huang, Songfang Huang
To amend this, we propose a novel TW-BERT to learn Trajectory-Word alignment by a newly designed trajectory-to-word (T2W) attention for solving video-language tasks.
no code implementations • 5 Jan 2023 • Zihua Wang, Xu Yang, Haiyang Xu, Hanwang Zhang, Qinghao Ye, Chenliang Li, Weiwei Sun, Ming Yan, Songfang Huang, Fei Huang, Yu Zhang
We design a novel global-local Transformer named Ada-ClustFormer (ACF) to generate captions.
1 code implementation • 19 Nov 2022 • Yudong Li, Yunlin Lei, Xu Yang
Spiking neural networks (SNNs) have made great progress in both performance and efficiency over the last few years, but their unique working pattern makes it hard to train a high-performance, low-latency SNN, so the development of SNNs still lags behind that of traditional artificial neural networks (ANNs). To bridge this gap, many extraordinary works have been proposed. Nevertheless, these works are mainly based on the same kind of network structure (i.e., CNN) and their performance is worse than their ANN counterparts, which limits the applications of SNNs. To this end, we propose a novel Transformer-based SNN, termed "Spikeformer", which outperforms its ANN counterpart on both static and neuromorphic datasets and may be an alternative architecture to CNNs for training high-performance SNNs. First, to deal with the data-hungry problem and the unstable training period exhibited in the vanilla model, we design the Convolutional Tokenizer (CT) module, which improves the accuracy of the original model on DVS-Gesture by more than 16%. Besides, to better incorporate the attention mechanism inside the Transformer and the spatio-temporal information inherent to SNNs, we adopt spatio-temporal attention (STA) instead of spatial-wise or temporal-wise attention. With our proposed method, we achieve competitive or state-of-the-art (SOTA) SNN performance on the DVS-CIFAR10, DVS-Gesture, and ImageNet datasets with the fewest simulation time steps (i.e., low latency). Remarkably, our Spikeformer outperforms other SNNs on ImageNet by a large margin (i.e., more than 5%) and even outperforms its ANN counterpart by 3.1% and 2.2% on DVS-Gesture and ImageNet respectively, indicating that Spikeformer is a promising architecture for training large-scale SNNs and may be more suitable for SNNs than CNN. We believe this work will keep the development of SNNs in step with ANNs as much as possible. Code will be available.
1 code implementation • 12 Oct 2022 • Chak Fong Chong, Yapeng Wang, Benjamin Ng, Wuman Luo, Xu Yang
To the best of our knowledge, it is the first work to predict the projective transformation matrix as the learning goal for photo rectification.
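A projective transformation is a 3x3 matrix acting on homogeneous coordinates, so once the matrix is predicted, rectification amounts to applying its inverse. A minimal sketch of that geometry (the matrix below is a toy example, not a learned prediction):

```python
import numpy as np

def apply_homography(H, pts):
    """Map (N, 2) points through a 3x3 projective matrix H."""
    ones = np.ones((pts.shape[0], 1))
    homog = np.hstack([pts, ones]) @ H.T       # lift to homogeneous coordinates
    return homog[:, :2] / homog[:, 2:3]        # perspective divide

# A toy projective distortion and the rectification via its inverse.
H = np.array([[1.0,  0.1,  5.0],
              [0.0,  1.2, -3.0],
              [0.01, 0.0,  1.0]])
corners = np.array([[0.0, 0.0], [100.0, 0.0], [100.0, 100.0], [0.0, 100.0]])
warped = apply_homography(H, corners)
restored = apply_homography(np.linalg.inv(H), warped)
```

Because homogeneous coordinates are defined up to scale, applying `inv(H)` exactly undoes the warp, which is why the matrix itself is a natural learning target.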
Ranked #1 on Medical Image Classification on CheXphoto
1 code implementation • 4 Oct 2022 • Xu Yang, Hanwang Zhang, Chongyang Gao, Jianfei Cai
This is because the language is only partially observable, for which we need to dynamically collocate the modules during the process of image captioning.
no code implementations • 20 Aug 2022 • Hongxin Li, Xu Yang, Yuran Yang, Shuqi Mei, Zhaoxiang Zhang
To address this limitation, we present the MemoNav, a novel memory mechanism for image-goal navigation, which retains the agent's informative short-term memory and long-term memory to improve the navigation performance on a multi-goal task.
1 code implementation • 1 Aug 2022 • Lu Zhang, Lu Qi, Xu Yang, Hong Qiao, Ming-Hsuan Yang, Zhiyong Liu
In the first stage, we obtain a robust feature extractor, which could serve for all images with base and novel categories.
1 code implementation • CVPR 2022 • Xiangyu Li, Xu Yang, Kun Wei, Cheng Deng, Muli Yang
Some methods recognize states and objects with two separately trained classifiers, ignoring the interaction between object and state; other methods try to learn a joint representation of the state-object compositions, which leads to a domain gap between the seen and unseen composition sets.
1 code implementation • 27 Jun 2022 • Xu Yang, Daoyuan Wu, Xiao Yi, Jimmy H. M. Lee, Tan Lee
In this paper, we propose iExam, an intelligent online exam monitoring and analysis system that can not only use face detection to assist invigilators in real-time student identification, but also be able to detect common abnormal behaviors (including face disappearing, rotating faces, and replacing with a different person during the exams) via a face recognition-based post-exam video analysis.
no code implementations • 21 Apr 2022 • Lu Zhang, Siqi Zhang, Xu Yang, Hong Qiao, Zhiyong Liu
In this paper, we emphasize the adaptation process across sim2real domains and model it as a learning problem on the BatchNorm parameters of a simulation-trained model.
no code implementations • 21 Apr 2022 • Lu Zhang, Zhiyong Liu, Xiangyu Zhu, Zhan Song, Xu Yang, Zhen Lei, Hong Qiao
In this article, we propose a general multimodal detector named aligned region CNN (AR-CNN) to tackle the position shift problem.
1 code implementation • CVPR 2022 • Yanan Gu, Xu Yang, Kun Wei, Cheng Deng
Unfortunately, these methods only focus on selecting samples from the memory bank for replay and ignore the adequate exploration of semantic information in the single-pass data stream, leading to poor classification accuracy.
1 code implementation • CVPR 2022 • Bing Liu, Dong Wang, Xu Yang, Yong Zhou, Rui Yao, Zhiwen Shao, Jiaqi Zhao
In the encoding stage, the IOD is able to disentangle the region-based visual features by deconfounding the visual confounder.
1 code implementation • 17 Dec 2021 • Yuanchao Bai, Xu Yang, Xianming Liu, Junjun Jiang, YaoWei Wang, Xiangyang Ji, Wen Gao
Meanwhile, we propose a feature aggregation module to fuse the compressed features with the selected intermediate features of the Transformer, and feed the aggregated features to a deconvolutional neural network for image reconstruction.
2 code implementations • 22 Nov 2021 • Boyu Zhang, Jiayuan Chen, Yinfei Xu, Hui Zhang, Xu Yang, Xin Geng
Traditionally, AQA is treated as a regression problem to learn the underlying mappings between videos and action scores.
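A minimal baseline in this regression formulation maps one pooled feature vector per video to a scalar score, e.g. with ridge regression. The features below are synthetic stand-ins, not real video descriptors:

```python
import numpy as np

def fit_score_regressor(features, scores, reg=1e-2):
    """Ridge regression from pooled video features to action scores.

    features: (N, D), one pooled feature vector per video;
    scores: (N,) ground-truth quality scores.
    """
    n, d = features.shape
    X = np.hstack([features, np.ones((n, 1))])  # append a bias column
    w = np.linalg.solve(X.T @ X + reg * np.eye(d + 1), X.T @ scores)
    return w

def predict_score(w, features):
    X = np.hstack([features, np.ones((features.shape[0], 1))])
    return X @ w
```

More recent AQA work replaces this linear map with deep video backbones, but the underlying "features to score" regression target is the same.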
Ranked #1 on Action Quality Assessment on JIGSAWS
1 code implementation • CVPR 2022 • Yaya Shi, Xu Yang, Haiyang Xu, Chunfeng Yuan, Bing Li, Weiming Hu, Zheng-Jun Zha
The datasets will be released to facilitate the development of video captioning metrics.
no code implementations • 28 Oct 2021 • Hao Zhou, Dongchun Ren, Xu Yang, Mingyu Fan, Hai Huang
First, as time goes on, the prediction error at each time step grows significantly, so the final displacement error becomes impossible to ignore.
no code implementations • 8 Oct 2021 • Siqi Cao, Di Fu, Xu Yang, Stefan Wermter, Xun Liu, Haiyan Wu
Furthermore, we discuss challenges for responsible evaluation of cognitive methods and computational techniques and show approaches to future work to contribute to affective assistants capable of empathy.
no code implementations • 29 Sep 2021 • Ziqi Zhang, Cheng Deng, Kun Wei, Xu Yang
On this basis, a novel attribute transfer method, named the semantic directional decomposition network (SDD-Net), is proposed to achieve semantic-level facial attribute transfer via latent semantic direction decomposition, improving the interpretability and editability of our method.
no code implementations • 29 Sep 2021 • Xinyue Zhang, Xu Yang, Zhi-Yong Liu
Thus the classification ability of the source domain is transferred to the target domain and the model can distinguish the unknown classes with prior knowledge.
no code implementations • ICCV 2021 • Xu Yang, Chongyang Gao, Hanwang Zhang, Jianfei Cai
We propose an Auto-Parsing Network (APN) to discover and exploit the input data's hidden tree structures for improving the effectiveness of the Transformer-based vision-language systems.
1 code implementation • 26 Jul 2021 • Yuedong Chen, Xu Yang, Tat-Jen Cham, Jianfei Cai
In this work, we scrutinize this problem from the perspective of causal inference, where such a dataset characteristic is termed a confounder that misleads the system into learning the spurious correlation.
1 code implementation • CVPR 2021 • Zhiyuan Dang, Cheng Deng, Xu Yang, Kun Wei, Heng Huang
Specifically, at the local level, we match nearest neighbors based on batch-embedded features, while at the global level, we match neighbors from the overall embedded features.
1 code implementation • CVPR 2021 • Xu Yang, Cheng Deng, Zhiyuan Dang, Kun Wei, Junchi Yan
Specifically, the Identity Aggregation is applied to extract semantic features from labeled nodes, the Semantic Alignment is utilized to align node features obtained from different aspects using the class central similarity.
1 code implementation • 9 Mar 2021 • Zhiyuan Dang, Cheng Deng, Xu Yang, Heng Huang
In this paper, we present a novel Doubly Contrastive Deep Clustering (DCDC) framework, which constructs contrastive loss over both sample and class views to obtain more discriminative features and competitive results.
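One way to sketch the "both views" idea: given soft cluster assignments for two augmented views of a batch, apply an InfoNCE-style loss over the row (sample) similarities and again over the column (class) similarities. This is an illustrative approximation, not the paper's exact loss:

```python
import numpy as np

def info_nce(sim, temperature=0.5):
    """InfoNCE where each row's positive is its diagonal entry."""
    logits = sim / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))

def doubly_contrastive_loss(p1, p2, temperature=0.5):
    """p1, p2: (N, K) soft cluster assignments of two augmented views.

    Sample view contrasts rows (each sample vs. its augmented copy);
    class view contrasts columns (each cluster vs. its copy).
    """
    sample_sim = p1 @ p2.T   # (N, N) sample-level similarities
    class_sim = p1.T @ p2    # (K, K) class-level similarities
    return info_nce(sample_sim, temperature) + info_nce(class_sim, temperature)
```

The class-view term is what pushes cluster assignments apart, giving the "more discriminative features" the abstract refers to.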
no code implementations • CVPR 2021 • Xu Yang, Hanwang Zhang, GuoJun Qi, Jianfei Cai
Specifically, CATT is implemented as a combination of 1) In-Sample Attention (IS-ATT) and 2) Cross-Sample Attention (CS-ATT), where the latter forcibly brings other samples into every IS-ATT, mimicking the causal intervention.
no code implementations • 26 Jan 2021 • Jiaqi Yan, Xu Yang, Yilin Mo, Keyou You
This paper studies distributed state estimation in a sensor network, where $m$ sensors are deployed to infer the $n$-dimensional state of a linear time-invariant (LTI) Gaussian system.
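For reference, the centralized baseline that distributed schemes approximate is a single Kalman filter run on the stacked measurements of all $m$ sensors. A minimal sketch (matrices below are toy assumptions):

```python
import numpy as np

def kalman_step(x, P, z, A, C, Q, R):
    """One predict/update step of a Kalman filter for an LTI Gaussian system.

    x: (n,) state estimate, P: (n, n) estimate covariance,
    z: (m,) stacked measurements from all sensors,
    A: dynamics, C: stacked observation matrix, Q/R: noise covariances.
    """
    # Predict through the dynamics.
    x_pred = A @ x
    P_pred = A @ P @ A.T + Q
    # Update with the stacked measurement.
    S = C @ P_pred @ C.T + R                  # innovation covariance
    K = P_pred @ C.T @ np.linalg.inv(S)       # Kalman gain
    x_new = x_pred + K @ (z - C @ x_pred)
    P_new = (np.eye(len(x)) - K @ C) @ P_pred
    return x_new, P_new
```

Distributed estimators replace the stacked update with local filters plus inter-sensor communication, trading optimality for scalability.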
1 code implementation • 31 Dec 2020 • Kun Wei, Cheng Deng, Xu Yang, Maosen Li
Different from traditional incremental classification networks, the semantic gap between the embedding spaces of two adjacent tasks is the main challenge for embedding networks under incremental learning setting.
1 code implementation • NeurIPS 2020 • Xu Yang, Cheng Deng, Kun Wei, Junchi Yan, Wei Liu
Meanwhile, we devise an adversarial attack strategy to explore samples that easily fool the clustering layers but do not impact the performance of the deep embedding.
no code implementations • 7 Oct 2020 • Xu Yang, Zhaohui Shang, Keliang Hu, Yi Hu, Bin Ma, Yongjiang Wang, Zihuang Cao, Michael C. B. Ashley, Wei Wang
Dome A in Antarctica has many characteristics that make it an excellent site for astronomical observations, from the optical to the terahertz.
Instrumentation and Methods for Astrophysics
no code implementations • ECCV 2020 • Xiangxi Shi, Xu Yang, Jiuxiang Gu, Shafiq Joty, Jianfei Cai
In this paper, we propose a novel visual encoder to explicitly distinguish viewpoint changes from semantic changes in the change captioning task.
no code implementations • 3 Jun 2020 • Ziju Shen, YuFei Wang, Dufan Wu, Xu Yang, Bin Dong
It is more desirable to design a personalized scanning strategy for each subject to obtain a better reconstruction result.
no code implementations • 9 Mar 2020 • Xu Yang, Hanwang Zhang, Jianfei Cai
Dataset bias in vision-language tasks is becoming one of the main problems which hinders the progress of our community.
no code implementations • 12 Feb 2020 • Shi Chen, Qin Li, Xu Yang
The varying-mass Schrödinger equation (VMSE) has been successfully applied to model electronic properties of semiconductor heterostructures, for example, quantum dots and quantum wells.
Numerical Analysis
no code implementations • 7 Jan 2020 • Stephen L. H. Lau, Edwin K. P. Chong, Xu Yang, Xin Wang
In this paper, we propose a deep learning technique based on a convolutional neural network to perform segmentation tasks on pavement crack images.
no code implementations • 27 Dec 2019 • Mingxin Zhao, Li Cheng, Xu Yang, Peng Feng, Liyuan Liu, Nanjian Wu
Meanwhile, we propose a joint loss function and a training method.
2 code implementations • 24 May 2019 • Dayiheng Liu, Xu Yang, Feng He, YuanYuan Chen, Jiancheng Lv
It has been previously observed that training Variational Recurrent Autoencoders (VRAEs) for text generation suffers from a serious uninformative-latent-variable problem.
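A common remedy for this problem (often called posterior collapse) is to anneal the weight of the KL term during training. The schedule helper below shows that generic technique, not this paper's specific method:

```python
import math

def kl_weight(step, warmup_steps=10000, schedule="linear"):
    """Anneal the KL term's weight from 0 toward 1 over training steps.

    Starting near 0 lets the decoder first learn to use the latent code
    before the KL penalty pushes the posterior toward the prior.
    """
    if schedule == "linear":
        return min(1.0, step / warmup_steps)
    if schedule == "sigmoid":
        # Smooth ramp centered at warmup_steps / 2.
        return 1.0 / (1.0 + math.exp(-(step - warmup_steps / 2) / (warmup_steps / 10)))
    raise ValueError(f"unknown schedule: {schedule}")
```

The training loss then becomes reconstruction + `kl_weight(step) * KL`, instead of weighting the KL term by a constant 1 from the start.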
no code implementations • CVPR 2019 • Xu Yang, Cheng Deng, Feng Zheng, Junchi Yan, Wei Liu
In this paper, we propose a joint learning framework for discriminative embedding and spectral clustering.
no code implementations • ICCV 2019 • Xu Yang, Hanwang Zhang, Jianfei Cai
To this end, we make the following technical contributions for CNM training: 1) compact module design --- one for function words and three for visual content words (e.g., noun, adjective, and verb), 2) soft module fusion and multi-step module execution, robustifying the visual reasoning in partial observation, 3) a linguistic loss for the module controller to be faithful to part-of-speech collocations (e.g., adjective before noun).
no code implementations • ICCV 2019 • Jiuxiang Gu, Shafiq Joty, Jianfei Cai, Handong Zhao, Xu Yang, Gang Wang
Most of current image captioning models heavily rely on paired image-caption datasets.
no code implementations • ICCV 2019 • Lu Zhang, Xiangyu Zhu, Xiangyu Chen, Xu Yang, Zhen Lei, Zhi-Yong Liu
In this paper, we propose a novel Aligned Region CNN (AR-CNN) to handle the weakly aligned multispectral data in an end-to-end way.
2 code implementations • CVPR 2019 • Xu Yang, Kaihua Tang, Hanwang Zhang, Jianfei Cai
We propose Scene Graph Auto-Encoder (SGAE) that incorporates the language inductive bias into the encoder-decoder image captioning framework for more human-like captions.
1 code implementation • ECCV 2018 • Xu Yang, Hanwang Zhang, Jianfei Cai
By "agnostic", we mean that the feature is less likely biased to the classes of paired objects.
no code implementations • 19 Sep 2015 • Xu Yang
In order to get a smoother sketch, we propose a new method to reduce such jagged parts and mottled points.
no code implementations • 4 Nov 2014 • Xu Yang, Hong Qiao, Zhi-Yong Liu
We propose a weighted common subgraph (WCS) matching algorithm to find the most similar subgraphs in two labeled weighted graphs.