no code implementations • 2014 IEEE International Conference on Data Mining 2015 • Ya Zhang, Yi Wei, Jianbiao Ren
With the ever-enhanced capability to track advertisements and users' interactions with them, data-driven multi-touch attribution models, which attempt to infer the contribution from user interaction data, have become an important research direction.
no code implementations • ICCV 2015 • Zhe Xu, Shaoli Huang, Ya zhang, DaCheng Tao
We propose a new method for fine-grained object recognition that employs part-level annotations and deep convolutional neural networks (CNNs) in a unified framework.
no code implementations • CVPR 2016 • Shaoli Huang, Zhe Xu, DaCheng Tao, Ya zhang
In the context of fine-grained visual categorization, the ability to interpret models as human-understandable visual manuals is sometimes as important as achieving high classification accuracy.
Ranked #62 on Fine-Grained Image Classification on CUB-200-2011
no code implementations • 28 Feb 2017 • Shanshan Huang, Yichao Xiong, Ya zhang, Jia Wang
Considering the difficulty of obtaining labeled datasets for large-scale image retrieval tasks, we propose a novel CNN-based unsupervised hashing method, namely Unsupervised Triplet Hashing (UTH).
no code implementations • 3 Mar 2017 • Yan Wang, Lingxi Xie, Ya zhang, Wenjun Zhang, Alan Yuille
We formulate the function of a convolutional layer as learning a large visual vocabulary, and propose an alternative way, namely Deep Collaborative Learning (DCL), to reduce the computational complexity.
no code implementations • ICCV 2017 • Yan Wang, Lingxi Xie, Chenxi Liu, Ya zhang, Wenjun Zhang, Alan Yuille
In this paper, we reveal the importance and benefits of introducing second-order operations into deep neural networks.
no code implementations • 31 Oct 2017 • Yuefu Zhou, Shanshan Huang, Ya zhang, Yan-Feng Wang
While minimizing the quantization loss guarantees that quantization has minimal effect on retrieval accuracy, it unfortunately significantly reduces the expressiveness of features even before the quantization.
no code implementations • 31 Oct 2017 • Zhonghao Wang, Yujun Gu, Ya zhang, Jun Zhou, Xiao Gu
The VAM is further connected to a global network to form an end-to-end network structure through an Impdrop connection, which randomly applies dropout to the feature maps with the probabilities given by the attention map.
no code implementations • 1 Nov 2017 • Zhuoxiang Chen, Zhe Xu, Ya zhang, Xiao Gu
We model this problem as a new type of image retrieval task in which the target image resides only in the user's mind (called "mental image retrieval" hereafter).
no code implementations • 2 Nov 2017 • Jiangchao Yao, Jiajie Wang, Ivor Tsang, Ya zhang, Jun Sun, Chengqi Zhang, Rui Zhang
However, the label noise among the datasets severely degenerates the performance of deep learning approaches.
1 code implementation • CVPR 2018 • Yexun Zhang, Ya zhang, Wenbin Cai, Jie Chang
We here attempt to separate the representations for styles and contents, and propose a generalized style transfer network consisting of style encoder, content encoder, mixer and decoder.
no code implementations • 17 Nov 2017 • Jie Chang, Yujun Gu, Ya zhang
Inspired by the recent advancement in Generative Adversarial Networks (GANs), we propose a Hierarchical Adversarial Network (HAN) for typeface transformation.
no code implementations • 10 Feb 2018 • Jiajie Wang, Jiangchao Yao, Ya zhang, Rui Zhang
For object detection, taking WSDDN-like architecture as weakly supervised detector sub-network and Faster-RCNN-like architecture as strongly supervised detector sub-network, we propose an end-to-end Weakly Supervised Collaborative Detection Network.
no code implementations • 19 Feb 2018 • Huangjie Zheng, Jiangchao Yao, Ya zhang, Ivor W. Tsang
While enormous progress has been made on the Variational Autoencoder (VAE) in recent years, similar to other deep networks, VAE with deep networks suffers from the problem of degeneration, which seriously weakens the correlation between the input and the corresponding latent codes, deviating from the goal of representation learning.
no code implementations • ECCV 2018 • Yan Wang, Lingxi Xie, Siyuan Qiao, Ya zhang, Wenjun Zhang, Alan L. Yuille
Convolution is spatially symmetric, i.e., the visual features are independent of their position in the image, which limits its ability to utilize contextual cues for visual recognition.
no code implementations • 12 Apr 2018 • Jiangchao Yao, Ivor Tsang, Ya zhang
Learning in latent variable models is challenging in the presence of complex data structures or intractable latent variables.
no code implementations • 27 Apr 2018 • Yujun Gu, Jie Chang, Ya zhang, Yan-Feng Wang
Understanding human visual attention is important for multimedia applications.
2 code implementations • NeurIPS 2018 • Bo Han, Jiangchao Yao, Gang Niu, Mingyuan Zhou, Ivor Tsang, Ya zhang, Masashi Sugiyama
It is important to learn various types of classifiers given training data with noisy labels.
Ranked #41 on Image Classification on Clothing1M (using extra training data)
no code implementations • 29 May 2018 • Yu Li, Ya zhang
Web page saliency prediction is a challenging problem in image transformation and computer vision.
1 code implementation • 13 Jun 2018 • Yexun Zhang, Ya zhang, Wenbin Cai
The encoders are expected to capture the underlying features for different styles and contents, which are generalizable to new styles and contents.
no code implementations • 10 Jul 2018 • Huangjie Zheng, Jiangchao Yao, Ya zhang, Ivor W. Tsang, Jia Wang
In information theory, Fisher information and Shannon information (entropy) are respectively used to quantify the uncertainty associated with the distribution modeling and the uncertainty in specifying the outcome of given variables.
1 code implementation • 11 Aug 2018 • Kan Ren, Yuchen Fang, Wei-Nan Zhang, Shuhao Liu, Jiajun Li, Ya zhang, Yong Yu, Jun Wang
To achieve this, we utilize sequence-to-sequence prediction for user clicks, and combine both post-view and post-click attribution patterns together for the final conversion estimation.
no code implementations • 22 Sep 2018 • Kenan Cui, Xu Chen, Jiangchao Yao, Ya zhang
Conventional CF-based methods use the user-item interaction data as the sole information source to recommend items to users.
no code implementations • 28 Nov 2018 • Huangjie Zheng, Lingxi Xie, Tianwei Ni, Ya zhang, Yan-Feng Wang, Qi Tian, Elliot K. Fishman, Alan L. Yuille
However, in medical image analysis, fusing predictions from two phases is often difficult, because (i) there is a domain gap between the two phases, and (ii) the semantic labels do not correspond pixel-wise even for images scanned from the same patient.
no code implementations • 30 Nov 2018 • Yexun Zhang, Ya zhang, Yan-Feng Wang, Qi Tian
Unsupervised domain adaption aims to learn a powerful classifier for the target domain given a labeled source data set and an unlabeled target data set.
no code implementations • ICCV 2019 • Yuefu Zhou, Ya zhang, Yan-Feng Wang, Qi Tian
A new dropout-based measurement of redundancy, which facilitates the computation of the posterior assuming inter-layer dependency, is introduced.
1 code implementation • 6 Mar 2019 • Jiangchao Yao, Ya zhang, Ivor W. Tsang, Jun Sun
We further generalize LCCN for open-set noisy labels and the semi-supervised setting.
Ranked #34 on Image Classification on Clothing1M (using extra training data)
1 code implementation • CVPR 2019 • Maosen Li, Siheng Chen, Xu Chen, Ya zhang, Yan-Feng Wang, Qi Tian
We validate AS-GCN in action recognition using two skeleton data sets, NTU-RGB+D and Kinetics.
no code implementations • 30 Apr 2019 • Chuan Wen, Jie Chang, Ya zhang, Siheng Chen, Yan-Feng Wang, Mei Han, Qi Tian
Automatic character generation is an appealing solution for new typeface design, especially for Chinese typefaces including over 3700 most commonly-used characters.
no code implementations • 26 Jun 2019 • Yifeng Li, Lingxi Xie, Ya zhang, Rui Zhang, Yanfeng Wang, Qi Tian
Generating and eliminating adversarial examples has been an intriguing topic in the field of deep learning.
3 code implementations • 23 Jul 2019 • Xu Chen, Siheng Chen, Huangjie Zheng, Jiangchao Yao, Kenan Cui, Ya zhang, Ivor W. Tsang
NANG learns a unifying latent representation which is shared by both node attributes and graph structures and can be translated to different modalities.
no code implementations • 19 Sep 2019 • Zhuoxun He, Lingxi Xie, Xin Chen, Ya zhang, Yan-Feng Wang, Qi Tian
Data augmentation has been widely applied as an effective methodology to improve generalization in particular when training deep neural networks.
no code implementations • 5 Oct 2019 • Maosen Li, Siheng Chen, Xu Chen, Ya zhang, Yan-Feng Wang, Qi Tian
For the backbone, we propose multi-branch multi-scale graph convolution networks to extract spatial and temporal features.
Ranked #39 on Skeleton Based Action Recognition on NTU RGB+D
no code implementations • 17 Oct 2019 • Xu Chen, Kenan Cui, Ya zhang, Yan-Feng Wang
Recently, recommendation according to sequential user behaviors has shown promising results in many application scenarios.
no code implementations • CVPR 2020 • Xuan Liao, Wenhao Li, Qisen Xu, Xiangfeng Wang, Bo Jin, Xiaoyun Zhang, Ya zhang, Yan-Feng Wang
We here propose to model the dynamic process of iterative interactive image segmentation as a Markov decision process (MDP) and solve it with reinforcement learning (RL).
1 code implementation • 25 Nov 2019 • Chaoqin Huang, Fei Ye, Jinkun Cao, Maosen Li, Ya zhang, Cewu Lu
We here propose to break this equivalence by erasing selected attributes from the original data and reformulate it as a restoration task, where the normal and the anomalous data are expected to be distinguishable based on restoration errors.
Ranked #21 on Anomaly Detection on One-class CIFAR-10
1 code implementation • ECCV 2020 • Peisen Zhao, Lingxi Xie, Chen Ju, Ya zhang, Yan-Feng Wang, Qi Tian
To alleviate this problem, we introduce two regularization terms to mutually regularize the learning procedure: the Intra-phase Consistency (IntraC) regularization is proposed to make the predictions verified inside each phase; and the Inter-phase Consistency (InterC) regularization is proposed to keep consistency between these phases.
1 code implementation • CVPR 2020 • Yue Hu, Siheng Chen, Ya zhang, Xiao Gu
Motion prediction is essential and challenging for autonomous vehicles and social robots.
1 code implementation • 17 Mar 2020 • Maosen Li, Siheng Chen, Yangheng Zhao, Ya zhang, Yan-Feng Wang, Qi Tian
The core idea of DMGNN is to use a multiscale graph to comprehensively model the internal relations of a human body for motion feature learning.
no code implementations • 11 Apr 2020 • Kunyuan Du, Ya zhang, Haibing Guan
This paper proposes Quantizable DNNs, a special type of DNN that can flexibly quantize its bit-width (denoted as 'bit modes' hereafter) during execution without further re-training.
no code implementations • 13 Jul 2020 • Peisen Zhao, Lingxi Xie, Ya zhang, Qi Tian
The U2S framework is composed of three subnetworks: a universal network, a category-specific network, and a mask network.
no code implementations • 16 Jul 2020 • Jingchao Su, Xu Chen, Ya zhang, Siheng Chen, Dan Lv, Chenyang Li
The two-level alignment acts as two different constraints on different relations of the shared entities and facilitates better knowledge transfer for relational learning on multiple bipartite graphs.
no code implementations • 16 Jul 2020 • Chenyang Li, Xu Chen, Ya zhang, Siheng Chen, Dan Lv, Yan-Feng Wang
Most existing methods focus on preserving the first-order proximity between entities in the KG.
1 code implementation • 26 Aug 2020 • Xu Chen, Yuangang Pan, Ivor Tsang, Ya zhang
In this paper, we study how to learn node representations against perturbations in GNN.
1 code implementation • 28 Aug 2020 • Xu Chen, Jiangchao Yao, Maosen Li, Ya zhang, Yan-Feng Wang
Comprehensive results on both link sign prediction and node recommendation task demonstrate the effectiveness of DVE.
1 code implementation • 15 Sep 2020 • Xu Chen, Ya zhang, Ivor Tsang, Yuangang Pan, Jingchao Su
In this paper, we attempt to learn both features of user preferences in a more principled way.
no code implementations • 17 Sep 2020 • Ya Zhang, Mingming Lu, Haifeng Li
Traffic forecasting is an important prerequisite for the application of intelligent transportation systems in urban traffic networks.
2 code implementations • NeurIPS 2020 • Maosen Li, Siheng Chen, Ya zhang, Ivor W. Tsang
Based on trainable hierarchical representations of a graph, GXN enables the interchange of intermediate features across scales to promote information flow.
no code implementations • 13 Oct 2020 • Xiaoman Zhang, Shixiang Feng, YuHang Zhou, Ya zhang, Yanfeng Wang
We demonstrate the effectiveness of our methods on two downstream tasks: i) Brain tumor segmentation, ii) Pancreas tumor segmentation.
no code implementations • 13 Oct 2020 • Shixiang Feng, Beibei Liu, Ya zhang, Xiaoyun Zhang, Yuehua Li
In this paper, we explore modeling VCF diagnosis as a three-class classification problem, i.e., normal vertebrae, benign VCFs, and malignant VCFs.
no code implementations • 3 Nov 2020 • Siheng Chen, Maosen Li, Ya zhang
Compared to previous analytical sampling and recovery, the proposed methods are able to flexibly learn a variety of graph signal models from data by leveraging the learning ability of neural networks; compared to previous neural-network-based sampling and recovery, the proposed methods are designed through exploiting specific graph properties and provide interpretability.
3 code implementations • 3 Nov 2020 • Xu Chen, Siheng Chen, Jiangchao Yao, Huangjie Zheng, Ya zhang, Ivor W Tsang
Thereby, designing a new GNN for these graphs is a burning issue to the graph learning community.
no code implementations • 18 Nov 2020 • Peisen Zhao, Lingxi Xie, Ya zhang, Yanfeng Wang, Qi Tian
Knowledge distillation is employed to transfer the privileged information from the offline teacher to the online student.
Ranked #11 on Online Action Detection on TVSeries
no code implementations • 9 Dec 2020 • Fei Ye, Huangjie Zheng, Chaoqin Huang, Ya zhang
Based on this objective function, we introduce a novel information-theoretic framework for unsupervised image anomaly detection.
Ranked #8 on Anomaly Detection on One-class CIFAR-100
no code implementations • 9 Dec 2020 • Chaoqin Huang, Fei Ye, Peisen Zhao, Ya zhang, Yan-Feng Wang, Qi Tian
This paper explores semi-supervised anomaly detection, a more practical setting for anomaly detection where a small additional set of labeled samples is provided.
Ranked #25 on Anomaly Detection on One-class CIFAR-10 (using extra training data)
no code implementations • 15 Dec 2020 • Chen Ju, Peisen Zhao, Ya zhang, Yanfeng Wang, Qi Tian
Point-Level temporal action localization (PTAL) aims to localize actions in untrimmed videos with only one timestamp annotation for each action instance.
Ranked #3 on Weakly Supervised Action Localization on BEOID
1 code implementation • 17 Dec 2020 • Chenxin Xu, Siheng Chen, Maosen Li, Ya zhang
To handle the decomposition ambiguity in the teacher network, we propose a cycle-consistent architecture promoting a 3D rotation-invariant property to train the teacher network.
no code implementations • ICCV 2021 • Chen Ju, Peisen Zhao, Siheng Chen, Ya zhang, Yanfeng Wang, Qi Tian
Single-frame temporal action localization (STAL) aims to localize actions in untrimmed videos with only one timestamp annotation for each action instance.
no code implementations • 9 Mar 2021 • Jieneng Chen, Ke Yan, Yu-Dong Zhang, YouBao Tang, Xun Xu, Shuwen Sun, Qiuping Liu, Lingyun Huang, Jing Xiao, Alan L. Yuille, Ya zhang, Le Lu
(2) The sampled deep vertex features with positional embedding are mapped into a sequential space and decoded by a multilayer perceptron (MLP) for semantic classification.
no code implementations • 9 Mar 2021 • YuHang Zhou, Xiaoman Zhang, Shixiang Feng, Ya zhang, Yanfeng
Specifically, given a pretrained $K$ organ segmentation model and a new single-organ dataset, we train a unified $K+1$ organ segmentation model without accessing any data belonging to the previous training stages.
no code implementations • 23 Mar 2021 • Mingming Lu, Ya zhang
Graph Neural Networks (GNNs) have attracted increasing attention due to their successful applications to various graph-structured data.
no code implementations • 31 Mar 2021 • Hao Wu, Jiangchao Yao, Jiajie Wang, Yinru Chen, Ya zhang, Yanfeng Wang
Deep neural networks (DNNs) have the capacity to fit extremely noisy labels; nonetheless, they tend to learn data with clean labels first and then memorize those with noisy labels.
no code implementations • 6 Apr 2021 • Chen Ju, Peisen Zhao, Siheng Chen, Ya zhang, Xiaoyun Zhang, Qi Tian
To solve this issue, we introduce an adaptive mutual supervision framework (AMS) with two branches, where the base branch adopts CAS to localize the most discriminative action regions, while the supplementary branch localizes the less discriminative action regions through a novel adaptive sampler.
Ranked #7 on Weakly Supervised Action Localization on THUMOS14
no code implementations • 15 Apr 2021 • Zhenfeng Shao, Yong Li, Xiao Huang, Bowen Cai, Lin Ding, Wenkang Pan, Ya zhang
Ecosystem valuation is a method of assigning a monetary value to an ecosystem with its goods and services, often referred to as ecosystem service value (ESV).
1 code implementation • 8 May 2021 • Huangjie Zheng, Xu Chen, Jiangchao Yao, Hongxia Yang, Chunyuan Li, Ya zhang, Hao Zhang, Ivor Tsang, Jingren Zhou, Mingyuan Zhou
We realize this strategy with contrastive attraction and contrastive repulsion (CACR), which makes the query not only exert a greater force to attract more distant positive samples but also do so to repel closer negative samples.
1 code implementation • CVPR 2021 • Qinwei Xu, Ruipeng Zhang, Ya zhang, Yanfeng Wang, Qi Tian
Modern deep neural networks suffer from performance degradation when evaluated on testing data under different distributions from training data.
no code implementations • 17 Jun 2021 • Minhao Hu, Matthis Maillard, Ya zhang, Tommaso Ciceri, Giammarco La Barbera, Isabelle Bloch, Pietro Gori
In this paper, we propose KD-Net, a framework to transfer knowledge from a trained multi-modal network (teacher) to a mono-modal one (student).
no code implementations • 2 Jul 2021 • Maosen Li, Siheng Chen, Yanning Shen, Genjia Liu, Ivor W. Tsang, Ya zhang
This paper considers predicting future statuses of multiple agents in an online fashion by exploiting dynamic interactions in the system.
no code implementations • 16 Jul 2021 • Zida Cheng, Siheng Chen, Ya zhang
Experiments are conducted on FPHA and HO-3D datasets.
no code implementations • 5 Aug 2021 • Shixiang Feng, YuHang Zhou, Xiaoman Zhang, Ya zhang, Yanfeng Wang
A novel Multi-teacher Single-student Knowledge Distillation (MS-KD) framework is proposed, where the teacher models are pre-trained single-organ segmentation networks, and the student model is a multi-organ segmentation network.
no code implementations • 11 Aug 2021 • Hao Wu, Jiangchao Yao, Ya zhang, Yanfeng Wang
Learning with noisy labels has gained enormous interest in the robust deep learning area.
no code implementations • ICCV 2021 • Tianyue Cao, Lianyu Du, Xiaoyun Zhang, Siheng Chen, Ya zhang, Yan-Feng Wang
To handle overlapping category transfer, we propose a double-supervision mean teacher to gather common category information and bridge the domain gap between two datasets.
no code implementations • 25 Aug 2021 • Maosen Li, Siheng Chen, Yangheng Zhao, Ya zhang, Yanfeng Wang, Qi Tian
The core of MST-GNN is a multiscale spatio-temporal graph that explicitly models the relations in motions at various spatial and temporal scales.
no code implementations • 7 Sep 2021 • Xiaoman Zhang, Weidi Xie, Chaoqin Huang, Yanfeng Wang, Ya zhang, Xin Chen, Qi Tian
In this paper, we target self-supervised representation learning for zero-shot tumor segmentation.
no code implementations • 24 Sep 2021 • Jinxiang Liu, Yangheng Zhao, Siheng Chen, Ya zhang
To leverage the human body shape prior, LPNet exploits the topological information of the body mesh to learn an expressive visual representation for the target person in the 3D mesh space.
no code implementations • 27 Sep 2021 • Yujie Pan, Jiangchao Yao, Bo Han, Kunyang Jia, Ya zhang, Hongxia Yang
Click-through rate (CTR) prediction becomes indispensable in ubiquitous web recommendation applications.
no code implementations • 23 Oct 2021 • Zida Cheng, Siheng Chen, Ya zhang
Spatio-temporal graph signal analysis has a significant impact on a wide range of applications, including hand/body pose action recognition.
no code implementations • NeurIPS 2021 • Bohan Tang, Yiqi Zhong, Ulrich Neumann, Gang Wang, Ya zhang, Siheng Chen
2) The results of trajectory forecasting benchmarks demonstrate that the CU-based framework steadily helps SOTA systems improve their performances.
1 code implementation • 8 Dec 2021 • Chen Ju, Tengda Han, Kunhao Zheng, Ya zhang, Weidi Xie
Image-based visual-language (I-VL) pre-training has shown great success for learning joint visual-textual representations from large-scale web data, revealing remarkable ability for zero-shot generalisation.
Ranked #5 on Zero-Shot Action Detection on ActivityNet-1.3
1 code implementation • CVPR 2022 • Baisong Guo, Xiaoyun Zhang, HaoNing Wu, Yu Wang, Ya zhang, Yan-Feng Wang
Previous super-resolution (SR) approaches often formulate SR as a regression problem with pixel-wise restoration, which leads to blurry and unrealistic SR output.
no code implementations • CVPR 2022 • Yixuan Huang, Xiaoyun Zhang, Yu Fu, Siheng Chen, Ya zhang, Yan-Feng Wang, Dazhi He
Those methods conduct the super-resolution task of the input low-resolution (LR) image and the texture transfer task from the reference image together in one module, easily introducing interference between LR and reference features.
1 code implementation • CVPR 2022 • Chenxin Xu, Maosen Li, Zhenyang Ni, Ya zhang, Siheng Chen
From the aspect of interaction capturing, we propose a trainable multiscale hypergraph to capture both pair-wise and group-wise interactions at multiple group sizes.
no code implementations • 13 May 2022 • Chaoqin Huang, Qinwei Xu, Yanfeng Wang, Yu Wang, Ya zhang
To extend the reconstruction-based anomaly detection architecture to the localized anomalies, we propose a self-supervised learning approach through random masking and then restoring, named Self-Supervised Masking (SSM) for unsupervised anomaly detection and localization.
1 code implementation • 25 May 2022 • Zhihan Zhou, Jiangchao Yao, Yanfeng Wang, Bo Han, Ya zhang
Different from previous works, we explore this direction from an alternative perspective, i.e., the data perspective, and propose a novel Boosted Contrastive Learning (BCL) method.
1 code implementation • 14 Jun 2022 • Ziheng Zhao, Tianjiao Zhang, Weidi Xie, Yanfeng Wang, Ya zhang
This paper considers the problem of undersampled MRI reconstruction.
no code implementations • 26 Jun 2022 • Jinxiang Liu, Chen Ju, Weidi Xie, Ya zhang
We present a simple yet effective self-supervised framework for audio-visual representation learning, to localize the sound source in videos.
1 code implementation • 27 Jun 2022 • Chenxin Xu, Yuxi Wei, Bohan Tang, Sheng Yin, Ya zhang, Siheng Chen
Demystifying the interactions among multiple agents from their past trajectories is fundamental to precise and interpretable trajectory prediction.
no code implementations • 11 Jul 2022 • Bohan Tang, Yiqi Zhong, Chenxin Xu, Wei-Tao Wu, Ulrich Neumann, Yanfeng Wang, Ya zhang, Siheng Chen
Further, we apply the proposed framework to current SOTA multi-agent multi-modal forecasting systems as a plugin module, which enables the SOTA systems to 1) estimate the uncertainty in the multi-agent multi-modal trajectory forecasting task; 2) rank the multiple predictions and select the optimal one based on the estimated uncertainty.
1 code implementation • 15 Jul 2022 • Chaoqin Huang, Haoyan Guan, Aofan Jiang, Ya zhang, Michael Spratling, Yan-Feng Wang
Inspired by how humans detect anomalies, i.e., comparing an image in question to normal images, we here leverage registration, an image alignment task that is inherently generalizable across categories, as the proxy task, to train a category-agnostic anomaly detection model.
Ranked #68 on Anomaly Detection on MVTec AD
1 code implementation • 31 Jul 2022 • Maosen Li, Siheng Chen, Zijing Zhang, Lingxi Xie, Qi Tian, Ya zhang
To address the first issue, we propose adaptive graph scattering, which leverages multiple trainable band-pass graph filters to decompose pose features into richer graph spectrum bands.
1 code implementation • 8 Aug 2022 • Yue Hu, Siheng Chen, Xu Chen, Ya zhang, Xiao Gu
Visual relationship detection aims to detect the interactions between objects in an image; however, this task suffers from combinatorial explosion due to the variety of objects and interactions.
no code implementations • 20 Aug 2022 • Wentao Liu, Chaofan Ma, Yuhuan Yang, Weidi Xie, Ya zhang
The goal of this paper is to interactively refine the automatic segmentation on challenging structures that fall behind human performance, either due to the scarcity of available annotations or the difficult nature of the problem itself, for example, on segmenting cancer or small organs.
no code implementations • 7 Oct 2022 • Qinye Zhou, Ziyi Li, Weidi Xie, Xiaoyun Zhang, Ya zhang, Yanfeng Wang
Existing super-resolution models are often specialized for one scale, fundamentally limiting their use in practical scenarios.
1 code implementation • 27 Oct 2022 • Chaofan Ma, Yuhuan Yang, Yanfeng Wang, Ya zhang, Weidi Xie
When trained at a sufficient scale, self-supervised learning has exhibited a notable ability to solve a wide range of visual or language understanding tasks.
1 code implementation • 14 Dec 2022 • Ziqing Fan, Yanfeng Wang, Jiangchao Yao, Lingjuan Lyu, Ya zhang, Qi Tian
However, in addition to previous explorations for improvement in federated averaging, our analysis shows that another critical bottleneck is the poorer optima of client models in more heterogeneous conditions.
no code implementations • CVPR 2023 • Chen Ju, Kunhao Zheng, Jinxiang Liu, Peisen Zhao, Ya zhang, Jianlong Chang, Yanfeng Wang, Qi Tian
And as a result, the dual-branch complementarity is effectively fused to promote a strong alliance.
no code implementations • 29 Dec 2022 • Shengchao Hu, Li Shen, Ya zhang, Yixin Chen, DaCheng Tao
The Transformer, originally devised for natural language processing, has also achieved significant success in computer vision.
1 code implementation • CVPR 2023 • Ruipeng Zhang, Qinwei Xu, Jiangchao Yao, Ya zhang, Qi Tian, Yanfeng Wang
Federated Domain Generalization (FedDG) attempts to learn a global model in a privacy-preserving manner that generalizes well to new clients possibly with domain shift.
no code implementations • ICCV 2023 • Chaoyi Wu, Xiaoman Zhang, Ya zhang, Yanfeng Wang, Weidi Xie
In this paper, we consider enhancing medical visual-language pre-training (VLP) with domain-specific knowledge, by exploiting the paired image-text reports from the radiological daily practice.
no code implementations • 5 Jan 2023 • Chaoyi Wu, Xiaoman Zhang, Ya zhang, Yanfeng Wang, Weidi Xie
In this paper, we consider enhancing medical visual-language pre-training (VLP) with domain-specific knowledge, by exploiting the paired image-text reports from the radiological daily practice.
no code implementations • 9 Jan 2023 • Chaoyi Wu, Feng Chang, Xiao Su, Zhihan Wu, Yanfeng Wang, Ling Zhu, Ya zhang
The branch aims to solve a closely related task on the LN station level, i.e., classifying whether an LN station contains metastatic LN or not, so as to learn representations for LN stations.
1 code implementation • ICCV 2023 • Ziyi Li, Qinye Zhou, Xiaoyun Zhang, Ya zhang, Yanfeng Wang, Weidi Xie
The goal of this paper is to extract the visual-language correspondence from a pre-trained text-to-image diffusion model, in the form of a segmentation map, i.e., simultaneously generating images and segmentation masks for the corresponding visual entities described in the text prompt.
1 code implementation • 10 Feb 2023 • Feng Hong, Jiangchao Yao, Zhihan Zhou, Ya zhang, Yanfeng Wang
The straightforward combination of LT and PLL, i.e., LT-PLL, suffers from a fundamental dilemma: LT methods build upon a given class distribution that is unavailable in PLL, and the performance of PLL is severely influenced in the long-tailed context.
1 code implementation • 19 Feb 2023 • Jiangchao Yao, Bo Han, Zhihan Zhou, Ya zhang, Ivor W. Tsang
We solve this problem by introducing a Latent Class-Conditional Noise model (LCCN) to parameterize the noise transition under a Bayesian framework.
no code implementations • 20 Feb 2023 • Chen Ju, Haicheng Wang, Jinxiang Liu, Chaofan Ma, Ya zhang, Peisen Zhao, Jianlong Chang, Qi Tian
Temporal sentence grounding aims to detect the event timestamps described by the natural language query from given untrimmed videos.
no code implementations • 22 Feb 2023 • Chaoyi Wu, Xiaoman Zhang, Yanfeng Wang, Ya zhang, Weidi Xie
In this paper, we consider the problem of disease diagnosis.
1 code implementation • 27 Feb 2023 • Xiaoman Zhang, Chaoyi Wu, Ya zhang, Yanfeng Wang, Weidi Xie
While multi-modal foundation models pre-trained on large-scale data have been successful in natural language understanding and vision recognition, their use in medical domains is still limited due to the fine-grained nature of medical tasks and the high demand for domain knowledge.
no code implementations • 7 Mar 2023 • Shengchao Hu, Li Shen, Ya zhang, DaCheng Tao
Offline reinforcement learning (RL) is a challenging task, whose objective is to learn policies from static trajectory data without interacting with the environment.
1 code implementation • 13 Mar 2023 • Weixiong Lin, Ziheng Zhao, Xiaoman Zhang, Chaoyi Wu, Ya zhang, Yanfeng Wang, Weidi Xie
Foundation models trained on large-scale datasets have seen a recent surge in CV and NLP.
Ranked #3 on Medical Visual Question Answering on PMC-VQA
1 code implementation • CVPR 2023 • Zhixin Wang, Xiaoyun Zhang, Ziying Zhang, Huangjie Zheng, Mingyuan Zhou, Ya zhang, Yanfeng Wang
However, it is expensive and infeasible to include every type of degradation to cover real-world cases in the training data.
no code implementations • CVPR 2023 • Zhaoyang Lyu, Jinyi Wang, Yuwei An, Ya zhang, Dahua Lin, Bo Dai
In this work, we design a novel sparse latent point diffusion model for mesh generation.
no code implementations • 17 Mar 2023 • Chaofan Ma, Yuhuan Yang, Chen Ju, Fei Zhang, Jinxiang Liu, Yu Wang, Ya zhang, Yanfeng Wang
However, the challenges exist as there is one structural difference between generative and discriminative models, which limits the direct use.
no code implementations • 19 Mar 2023 • Chaofan Ma, Qisen Xu, Xiangfeng Wang, Bo Jin, Xiaoyun Zhang, Yanfeng Wang, Ya zhang
Interactive segmentation has recently been explored to effectively and efficiently harvest high-quality segmentation masks by iteratively incorporating user hints.
no code implementations • 21 Mar 2023 • Chen Ju, Zeqian Li, Peisen Zhao, Ya zhang, Xiaopeng Zhang, Qi Tian, Yanfeng Wang, Weidi Xie
In this paper, we consider the problem of temporal action localization under the low-shot (zero-shot & few-shot) scenario, with the goal of detecting and classifying action instances from arbitrary categories within untrimmed videos, even those not seen at training time.
1 code implementation • 27 Apr 2023 • Chaoyi Wu, Weixiong Lin, Xiaoman Zhang, Ya zhang, Yanfeng Wang, Weidi Xie
Our contributions are threefold: (i) we systematically investigate the process of adapting a general-purpose foundation language model towards the medical domain; this involves data-centric knowledge injection through the integration of 4.8M biomedical academic papers and 30K medical textbooks, as well as comprehensive fine-tuning for alignment with domain-specific instructions; (ii) we contribute a large-scale, comprehensive dataset for instruction tuning.
1 code implementation • CVPR 2023 • Yiming Qin, Huangjie Zheng, Jiangchao Yao, Mingyuan Zhou, Ya zhang
To tackle this problem, we start from the hypothesis that the data distribution is not class-balanced, and propose Class-Balancing Diffusion Models (CBDM) that are trained with a distribution adjustment regularizer as a solution.
no code implementations • 16 May 2023 • Shengchao Hu, Li Shen, Ya zhang, DaCheng Tao
Our work contributes to the advancement of prompt-tuning approaches in RL, providing a promising direction for optimizing large RL agents for specific preference tasks.
2 code implementations • 17 May 2023 • Xiaoman Zhang, Chaoyi Wu, Ziheng Zhao, Weixiong Lin, Ya zhang, Yanfeng Wang, Weidi Xie
In this paper, we focus on the problem of Medical Visual Question Answering (MedVQA), which is crucial in efficiently interpreting medical images with vital clinic-relevant information.
Ranked #1 on Medical Visual Question Answering on PMC-VQA
no code implementations • 18 May 2023 • Jinxiang Liu, Yu Wang, Chen Ju, Chaofan Ma, Ya zhang, Weidi Xie
The objective of Audio-Visual Segmentation (AVS) is to localise the sounding objects within visual scenes by accurately predicting pixel-wise segmentation masks.
1 code implementation • 12 Jun 2023 • Yikun Liu, Jiangchao Yao, Ya zhang, Yanfeng Wang, Weidi Xie
In this paper, we consider the problem of composed image retrieval (CIR), which aims to train a model that can fuse multi-modal information, e.g., text and images, to accurately retrieve images that match the query, extending the user's expression ability.
Ranked #1 on Zero-Shot Composed Image Retrieval (ZS-CIR) on CIRR
no code implementations • CVPR 2023 • Mengxi Chen, Linyu Xing, Yu Wang, Ya zhang
This paper explores the tasks of leveraging auxiliary modalities which are only available at training to enhance multimodal representation learning through cross-modal Knowledge Distillation (KD).
no code implementations • 15 Jun 2023 • Chuyun Shen, Wenhao Li, Ya zhang, Xiangfeng Wang
The Segment Anything Model (SAM) has recently emerged as a foundation model for addressing image segmentation.
1 code implementation • 24 Jun 2023 • HaoNing Wu, Xiaoyun Zhang, Weidi Xie, Ya zhang, Yanfeng Wang
Video frame interpolation (VFI) is a challenging task that aims to generate intermediate frames between two consecutive frames in a video.
no code implementations • 5 Jul 2023 • Yuhuan Yang, Chaofan Ma, Chen Ju, Ya zhang, Yanfeng Wang
In this paper, we define a unified setting termed as open-set semantic segmentation (O3S), which aims to learn seen and unseen semantics from both visual examples and textual names.
no code implementations • 25 Jul 2023 • Jinxiang Liu, Chen Ju, Chaofan Ma, Yanfeng Wang, Yu Wang, Ya zhang
The goal of the audio-visual segmentation (AVS) task is to segment the sounding objects in the video frames using audio cues.
1 code implementation • 3 Aug 2023 • YuHang Zhou, Jiangchao Yao, Feng Hong, Ya zhang, Yanfeng Wang
By dynamically manipulating the gradient during training based on these factors, BDR can effectively alleviate knowledge destruction and improve knowledge reconstruction.
1 code implementation • 3 Aug 2023 • Aofan Jiang, Chaoqin Huang, Qing Cao, Shuang Wu, Zi Zeng, Kang Chen, Ya zhang, Yanfeng Wang
To address this challenge, this paper introduces a novel multi-scale cross-restoration framework for ECG anomaly detection and localization that considers both local and global ECG characteristics.
1 code implementation • 4 Aug 2023 • Chaoyi Wu, Xiaoman Zhang, Ya zhang, Yanfeng Wang, Weidi Xie
In this study, we aim to initiate the development of Radiology Foundation Model, termed as RadFM.
1 code implementation • ICCV 2023 • Qingyao Xu, Weibo Mao, Jingze Gong, Chenxin Xu, Siheng Chen, Weidi Xie, Ya zhang, Yanfeng Wang
Multi-person motion prediction is a challenging problem due to the dependency of motion on both individual past movements and interactions with other people.
no code implementations • 9 Aug 2023 • Chaoqin Huang, Aofan Jiang, Ya zhang, Yanfeng Wang
Anomaly detection has gained considerable attention due to its broad range of applications, particularly in industrial defect detection.
no code implementations • 17 Aug 2023 • Feng Hong, Tianjie Dai, Jiangchao Yao, Ya zhang, Yanfeng Wang
Clinical classification of chest radiography is particularly challenging for standard machine learning algorithms due to its inherent long-tailed and multi-label nature.
no code implementations • NeurIPS 2023 • Chaofan Ma, Yuhuan Yang, Chen Ju, Fei Zhang, Ya zhang, Yanfeng Wang
The results show the superior performance of attribute decomposition-aggregation.
1 code implementation • 13 Sep 2023 • Jiayu Lei, Lisong Dai, Haoyun Jiang, Chaoyi Wu, Xiaoman Zhang, Yao Zhang, Jiangchao Yao, Weidi Xie, Yanyong Zhang, Yuehua Li, Ya zhang, Yanfeng Wang
Magnetic resonance imaging (MRI) has played a crucial role in brain disease diagnosis, with which a range of computer-aided artificial intelligence methods have been proposed.
1 code implementation • 15 Oct 2023 • Chaoyi Wu, Jiayu Lei, Qiaoyu Zheng, Weike Zhao, Weixiong Lin, Xiaoman Zhang, Xiao Zhou, Ziheng Zhao, Ya zhang, Yanfeng Wang, Weidi Xie
Driven by the large foundation models, the development of artificial intelligence has witnessed tremendous progress lately, leading to a surge of general interest from the public.
1 code implementation • NeurIPS 2023 • Zhihan Zhou, Jiangchao Yao, Feng Hong, Ya zhang, Bo Han, Yanfeng Wang
Self-supervised learning (SSL) as an effective paradigm of representation learning has achieved tremendous success on various curated datasets in diverse scenarios.
1 code implementation • 18 Dec 2023 • Tianjie Dai, Ruipeng Zhang, Feng Hong, Jiangchao Yao, Ya zhang, Yanfeng Wang
Vision-Language Pre-training (VLP) that utilizes the multi-modal information to promote the training efficiency and effectiveness, has achieved great success in vision recognition of natural domains and shown promise in medical imaging diagnosis for the Chest X-Rays (CXRs).
no code implementations • 20 Dec 2023 • Yan Cai, LinLin Wang, Ye Wang, Gerard de Melo, Ya zhang, Yanfeng Wang, Liang He
The emergence of various medical large language models (LLMs) in the medical domain has highlighted the need for unified evaluation standards, as manual evaluation of LLMs proves to be time-consuming and labor-intensive.
no code implementations • 21 Dec 2023 • Zeqian Li, Qirui Chen, Tengda Han, Ya zhang, Yanfeng Wang, Weidi Xie
In this paper, we consider the problem of temporally aligning the video and texts from instructional videos, specifically, given a long-term video, and associated text sentences, our goal is to determine their corresponding timestamps in the video.
1 code implementation • 26 Dec 2023 • Qiaoyu Zheng, Weike Zhao, Chaoyi Wu, Xiaoman Zhang, Ya zhang, Yanfeng Wang, Weidi Xie
In this study, we aim to investigate the problem of large-scale, large-vocabulary disease classification for radiologic images, which can be formulated as a multi-modal, multi-anatomy, multi-label, long-tailed classification.
no code implementations • 28 Dec 2023 • Ziheng Zhao, Yao Zhang, Chaoyi Wu, Xiaoman Zhang, Ya zhang, Yanfeng Wang, Weidi Xie
Our main contributions are threefold: (i) on data construction, we combine multiple knowledge sources to construct a multi-modal medical knowledge tree; then we build a large-scale segmentation dataset for training, by collecting over 11K 3D medical image scans from 31 segmentation datasets with careful standardization on both visual scans and label space; (ii) on model training, we formulate a universal segmentation model that can be prompted by inputting medical terminologies in text form.
no code implementations • 14 Feb 2024 • Shiqi Peng, Bolin Lai, Guangyu Yao, Xiaoyun Zhang, Ya zhang, Yan-Feng Wang, Hui Zhao
In this paper, we explore a learning-based automatic bone quality classification method for spinal metastasis based on CT images.
no code implementations • 14 Feb 2024 • Shiqi Peng, Bolin Lai, Guangyu Yao, Xiaoyun Zhang, Ya zhang, Yan-Feng Wang, Hui Zhao
In this paper, we propose a Weakly supervised Iterative Spinal Segmentation (WISS) method leveraging only four corner landmark weak labels on a single sagittal slice to achieve automatic volumetric segmentation from CT images for VBs.
no code implementations • 18 Feb 2024 • YiQiu Guo, Yuchen Yang, Ya zhang, Yu Wang, Yanfeng Wang
Structured data offers a sophisticated mechanism for the organization of information.
1 code implementation • 21 Feb 2024 • Pengcheng Qiu, Chaoyi Wu, Xiaoman Zhang, Weixiong Lin, Haicheng Wang, Ya zhang, Yanfeng Wang, Weidi Xie
In this paper, we aim to develop an open-source, multilingual language model for medicine that benefits a wider, linguistically diverse audience from different regions.
1 code implementation • 1 Mar 2024 • Jinyan Hou, Shan Liu, Ya zhang, Haotong Qin
To tackle these challenges, this paper introduces a novel graph construction method tailored to free-floating traffic mode.
no code implementations • 17 Mar 2024 • Jinxiang Liu, Yikun Liu, Fei Zhang, Chen Ju, Ya zhang, Yanfeng Wang
NFs, temporally adjacent to the labeled frame, often contain rich motion information that assists in the accurate localization of sounding objects.
no code implementations • 18 Mar 2024 • Zhaoyang Lyu, Ben Fei, Jinyi Wang, Xudong Xu, Ya zhang, Weidong Yang, Bo Dai
Mesh is a fundamental representation of 3D assets in various industrial applications, and is widely supported by professional software.
1 code implementation • 19 Mar 2024 • Chaoqin Huang, Aofan Jiang, Jinghao Feng, Ya zhang, Xinchao Wang, Yanfeng Wang
Recent advancements in large-scale visual-language pre-trained models have led to significant progress in zero-/few-shot anomaly detection within natural image domains.
no code implementations • 26 Mar 2024 • Yuhuan Yang, Chaofan Ma, Jiangchao Yao, Zhun Zhong, Ya zhang, Yanfeng Wang
Referring Image Segmentation (RIS) leveraging transformers has achieved great success on the interpretation of complex visual-language tasks.
no code implementations • ECCV 2020 • Kunyuan Du, Ya zhang, Haibing Guan, Qi Tian, Shenggan Cheng, James Lin
Compared with low-bit models trained directly, the proposed framework brings 0.5% to 3.4% accuracy gains to three different quantization schemes.