Search Results for author: Ya zhang

Found 169 papers, 68 papers with code

FTL: A universal framework for training low-bit DNNs via Feature Transfer

no code implementations ECCV 2020 Kunyuan Du, Ya zhang, Haibing Guan, Qi Tian, Shenggan Cheng, James Lin

Compared with low-bit models trained directly, the proposed framework brings 0.5% to 3.4% accuracy gains to three different quantization schemes.

Quantization Transfer Learning

Reprogramming Distillation for Medical Foundation Models

no code implementations 9 Jul 2024 YuHang Zhou, Siyuan Du, Haolin Li, Jiangchao Yao, Ya zhang, Yanfeng Wang

However, due to the gap between pre-training tasks (or modalities) and downstream tasks (or modalities), as well as real-world computation and speed constraints, it might not be straightforward to apply medical foundation models in downstream scenarios.

Knowledge Distillation Transfer Learning

RaTEScore: A Metric for Radiology Report Generation

1 code implementation 24 Jun 2024 Weike Zhao, Chaoyi Wu, Xiaoman Zhang, Ya zhang, Yanfeng Wang, Weidi Xie

This paper introduces a novel, entity-aware metric, termed Radiological Report (Text) Evaluation (RaTEScore), to assess the quality of medical reports generated by AI models.

Entity Embeddings Language Modelling +2

Exploring Training on Heterogeneous Data with Mixture of Low-rank Adapters

no code implementations 14 Jun 2024 YuHang Zhou, Zihua Zhao, Haolin Li, Siyuan Du, Jiangchao Yao, Ya zhang, Yanfeng Wang

Training a unified model to take multiple targets into account is a trend towards artificial general intelligence.

Few-Shot Anomaly Detection via Category-Agnostic Registration Learning

1 code implementation 13 Jun 2024 Chaoqin Huang, Haoyan Guan, Aofan Jiang, Yanfeng Wang, Michael Spratling, Xinchao Wang, Ya zhang

Inspired by how humans detect anomalies, by comparing a query image to known normal ones, this paper proposes a novel few-shot anomaly detection (FSAD) framework.

Anomaly Detection Representation Learning

Diversified Batch Selection for Training Acceleration

1 code implementation 7 Jun 2024 Feng Hong, Yueming Lyu, Jiangchao Yao, Ya zhang, Ivor W. Tsang, Yanfeng Wang

The remarkable success of modern machine learning models on large datasets often demands extensive training time and resource consumption.

Diversity

TAIA: Large Language Models are Out-of-Distribution Data Learners

1 code implementation 30 May 2024 Shuyang Jiang, Yusheng Liao, Ya zhang, Yu Wang, Yanfeng Wang

Fine-tuning on task-specific question-answer pairs is a predominant method for enhancing the performance of instruction-tuned large language models (LLMs) on downstream tasks.

Math

Locally Estimated Global Perturbations are Better than Local Perturbations for Federated Sharpness-aware Minimization

1 code implementation 29 May 2024 Ziqing Fan, Shengchao Hu, Jiangchao Yao, Gang Niu, Ya zhang, Masashi Sugiyama, Yanfeng Wang

However, the local loss landscapes may not accurately reflect the flatness of global loss landscape in heterogeneous environments; as a result, minimizing local sharpness and calculating perturbations on client data might not align the efficacy of SAM in FL with centralized training.

Federated Learning

Federated Learning with Bilateral Curation for Partially Class-Disjoint Data

1 code implementation NeurIPS 2023 Ziqing Fan, Ruipeng Zhang, Jiangchao Yao, Bo Han, Ya zhang, Yanfeng Wang

Partially class-disjoint data (PCDD), a common yet under-explored data formation where each client contributes a part of classes (instead of all classes) of samples, severely challenges the performance of federated algorithms.

Federated Learning

Federated Learning under Partially Class-Disjoint Data via Manifold Reshaping

1 code implementation 29 May 2024 Ziqing Fan, Jiangchao Yao, Ruipeng Zhang, Lingjuan Lyu, Ya zhang, Yanfeng Wang

Statistical heterogeneity severely limits the performance of federated learning (FL), motivating several explorations, e.g., FedProx, MOON and FedDyn, to alleviate this problem.

Federated Learning

Domain-Inspired Sharpness-Aware Minimization Under Domain Shifts

1 code implementation 29 May 2024 Ruipeng Zhang, Ziqing Fan, Jiangchao Yao, Ya zhang, Yanfeng Wang

This paper presents a Domain-Inspired Sharpness-Aware Minimization (DISAM) algorithm for optimization under domain shifts.

Domain Generalization

HarmoDT: Harmony Multi-Task Decision Transformer for Offline Reinforcement Learning

1 code implementation 28 May 2024 Shengchao Hu, Ziqing Fan, Li Shen, Ya zhang, Yanfeng Wang, DaCheng Tao

However, variations in task content and complexity pose significant challenges in policy formulation, necessitating judicious parameter sharing and management of conflicting gradients for optimal policy performance.

Management Meta-Learning +1

Mitigating Noisy Correspondence by Geometrical Structure Consistency Learning

no code implementations CVPR 2024 Zihua Zhao, Mengxi Chen, Tianjie Dai, Jiangchao Yao, Bo Han, Ya zhang, Yanfeng Wang

Prior approaches to leverage such data mainly consider the application of uni-modal noisy label learning without amending the impact on both cross-modal and intra-modal geometrical structures in multimodal learning.

Cross-modal retrieval with noisy correspondence

Q-value Regularized Transformer for Offline Reinforcement Learning

1 code implementation 27 May 2024 Shengchao Hu, Ziqing Fan, Chaoqin Huang, Li Shen, Ya zhang, Yanfeng Wang, DaCheng Tao

Recent advancements in offline reinforcement learning (RL) have underscored the capabilities of Conditional Sequence Modeling (CSM), a paradigm that learns the action distribution based on history trajectory and target returns for each state.

D4RL Offline RL +3

JointRF: End-to-End Joint Optimization for Dynamic Neural Radiance Field Representation and Compression

no code implementations 23 May 2024 Zihan Zheng, Houqiang Zhong, Qiang Hu, Xiaoyun Zhang, Li Song, Ya zhang, Yanfeng Wang

Neural Radiance Field (NeRF) excels in photo-realistic static scenes, inspiring numerous efforts to facilitate volumetric videos.

Feature Compression

Learning Multi-Agent Communication from Graph Modeling Perspective

1 code implementation 14 May 2024 Shengchao Hu, Li Shen, Ya zhang, DaCheng Tao

In numerous artificial intelligence applications, the collaborative efforts of multiple intelligent agents are imperative for the successful attainment of target objectives.

Low-Rank Knowledge Decomposition for Medical Foundation Models

no code implementations CVPR 2024 YuHang Zhou, Haolin Li, Siyuan Du, Jiangchao Yao, Ya zhang, Yanfeng Wang

The popularity of large-scale pre-training has promoted the development of medical foundation models.

RadGenome-Chest CT: A Grounded Vision-Language Dataset for Chest CT Analysis

no code implementations 25 Apr 2024 Xiaoman Zhang, Chaoyi Wu, Ziheng Zhao, Jiayu Lei, Ya zhang, Yanfeng Wang, Weidi Xie

We believe that RadGenome-Chest CT can significantly advance the development of multimodal medical foundation models, by training to generate texts based on given segmentation regions, which is unattainable with previous relevant datasets.

Segmentation Sentence +2

Knowledge-enhanced Visual-Language Pretraining for Computational Pathology

1 code implementation 15 Apr 2024 Xiao Zhou, Xiaoman Zhang, Chaoyi Wu, Ya zhang, Weidi Xie, Yanfeng Wang

In this paper, we consider the problem of visual representation learning for computational pathology, by exploiting large-scale image-text pairs gathered from public resources, along with the domain specific knowledge in pathology.

Cross-Modal Retrieval Language Modelling +4

Anomaly Detection in Electrocardiograms: Advancing Clinical Diagnosis Through Self-Supervised Learning

no code implementations 7 Apr 2024 Aofan Jiang, Chaoqin Huang, Qing Cao, Yuchen Xu, Zi Zeng, Kang Chen, Ya zhang, Yanfeng Wang

We introduce a novel self-supervised learning framework for ECG AD, utilizing a vast dataset of normal ECGs to autonomously detect and localize cardiac anomalies.

Self-Supervised Anomaly Detection Self-Supervised Learning +2

ReMamber: Referring Image Segmentation with Mamba Twister

1 code implementation 26 Mar 2024 Yuhuan Yang, Chaofan Ma, Jiangchao Yao, Zhun Zhong, Ya zhang, Yanfeng Wang

Referring Image Segmentation (RIS) leveraging transformers has achieved great success on the interpretation of complex visual-language tasks.

Image Segmentation Semantic Segmentation

Adapting Visual-Language Models for Generalizable Anomaly Detection in Medical Images

1 code implementation CVPR 2024 Chaoqin Huang, Aofan Jiang, Jinghao Feng, Ya zhang, Xinchao Wang, Yanfeng Wang

Recent advancements in large-scale visual-language pre-trained models have led to significant progress in zero-/few-shot anomaly detection within natural image domains.

Anomaly Classification Anomaly Detection

GetMesh: A Controllable Model for High-quality Mesh Generation and Manipulation

no code implementations 18 Mar 2024 Zhaoyang Lyu, Ben Fei, Jinyi Wang, Xudong Xu, Ya zhang, Weidong Yang, Bo Dai

Mesh is a fundamental representation of 3D assets in various industrial applications, and is widely supported by professional software.

Audio-Visual Segmentation via Unlabeled Frame Exploitation

no code implementations CVPR 2024 Jinxiang Liu, Yikun Liu, Fei Zhang, Chen Ju, Ya zhang, Yanfeng Wang

Neighboring frames (NFs), temporally adjacent to the labeled frame, often contain rich motion information that assists in the accurate localization of sounding objects.

Diversity valid

Graph Construction with Flexible Nodes for Traffic Demand Prediction

1 code implementation 1 Mar 2024 Jinyan Hou, Shan Liu, Ya zhang, Haotong Qin

To tackle these challenges, this paper introduces a novel graph construction method tailored to free-floating traffic mode.

Clustering Computational Efficiency +1

Towards Building Multilingual Language Model for Medicine

1 code implementation 21 Feb 2024 Pengcheng Qiu, Chaoyi Wu, Xiaoman Zhang, Weixiong Lin, Haicheng Wang, Ya zhang, Yanfeng Wang, Weidi Xie

The development of open-source, multilingual medical language models can benefit a wide, linguistically diverse audience from different regions.

Domain Adaptation Language Modelling +1

Learning-based Bone Quality Classification Method for Spinal Metastasis

no code implementations 14 Feb 2024 Shiqi Peng, Bolin Lai, Guangyu Yao, Xiaoyun Zhang, Ya zhang, Yan-Feng Wang, Hui Zhao

In this paper, we explore a learning-based automatic bone quality classification method for spinal metastasis based on CT images.

Binary Classification Classification +3

Weakly Supervised Segmentation of Vertebral Bodies with Iterative Slice-propagation

no code implementations 14 Feb 2024 Shiqi Peng, Bolin Lai, Guangyu Yao, Xiaoyun Zhang, Ya zhang, Yan-Feng Wang, Hui Zhao

In this paper, we propose a Weakly supervised Iterative Spinal Segmentation (WISS) method leveraging only four corner landmark weak labels on a single sagittal slice to achieve automatic volumetric segmentation from CT images for VBs.

Segmentation Weakly supervised segmentation

One Model to Rule them All: Towards Universal Segmentation for Medical Images with Text Prompts

1 code implementation 28 Dec 2023 Ziheng Zhao, Yao Zhang, Chaoyi Wu, Xiaoman Zhang, Ya zhang, Yanfeng Wang, Weidi Xie

Our main contributions are threefold: (i) for dataset construction, we construct the first multi-modal knowledge tree on human anatomy, including 6502 anatomical terminologies; we then build the largest and most comprehensive segmentation dataset for training, by collecting over 22K 3D medical image scans from 72 segmentation datasets, across 497 classes, with careful standardization of both image scans and label space; (ii) for architecture design, we propose to inject medical knowledge into a text encoder via contrastive learning, and then formulate a universal segmentation model that can be prompted by feeding in medical terminologies in text form; (iii) as a result, we have trained SAT-Nano (110M parameters) and SAT-Pro (447M parameters), demonstrating performance comparable to 72 specialist nnU-Nets trained on each dataset/subset.

Anatomy Contrastive Learning +5

A Strong Baseline for Temporal Video-Text Alignment

no code implementations 21 Dec 2023 Zeqian Li, Qirui Chen, Tengda Han, Ya zhang, Yanfeng Wang, Weidi Xie

In this paper, we consider the problem of temporally aligning the video and texts from instructional videos. Specifically, given a long-term video and associated text sentences, our goal is to determine their corresponding timestamps in the video.

Descriptive Language Modelling +3

MedBench: A Large-Scale Chinese Benchmark for Evaluating Medical Large Language Models

no code implementations 20 Dec 2023 Yan Cai, LinLin Wang, Ye Wang, Gerard de Melo, Ya zhang, Yanfeng Wang, Liang He

The emergence of various medical large language models (LLMs) in the medical domain has highlighted the need for unified evaluation standards, as manual evaluation of LLMs proves to be time-consuming and labor-intensive.

Clinical Knowledge

UniChest: Conquer-and-Divide Pre-training for Multi-Source Chest X-Ray Classification

1 code implementation 18 Dec 2023 Tianjie Dai, Ruipeng Zhang, Feng Hong, Jiangchao Yao, Ya zhang, Yanfeng Wang

Vision-Language Pre-training (VLP), which utilizes multi-modal information to promote training efficiency and effectiveness, has achieved great success in vision recognition of natural domains and shown promise in medical imaging diagnosis for Chest X-Rays (CXRs).

Combating Representation Learning Disparity with Geometric Harmonization

1 code implementation NeurIPS 2023 Zhihan Zhou, Jiangchao Yao, Feng Hong, Ya zhang, Bo Han, Yanfeng Wang

Self-supervised learning (SSL) as an effective paradigm of representation learning has achieved tremendous success on various curated datasets in diverse scenarios.

Representation Learning Self-Supervised Learning

Can GPT-4V(ision) Serve Medical Applications? Case Studies on GPT-4V for Multimodal Medical Diagnosis

1 code implementation 15 Oct 2023 Chaoyi Wu, Jiayu Lei, Qiaoyu Zheng, Weike Zhao, Weixiong Lin, Xiaoman Zhang, Xiao Zhou, Ziheng Zhao, Ya zhang, Yanfeng Wang, Weidi Xie

Driven by the large foundation models, the development of artificial intelligence has witnessed tremendous progress lately, leading to a surge of general interest from the public.

Anatomy Computed Tomography (CT) +2

UniBrain: Universal Brain MRI Diagnosis with Hierarchical Knowledge-enhanced Pre-training

1 code implementation 13 Sep 2023 Jiayu Lei, Lisong Dai, Haoyun Jiang, Chaoyi Wu, Xiaoman Zhang, Yao Zhang, Jiangchao Yao, Weidi Xie, Yanyong Zhang, Yuehua Li, Ya zhang, Yanfeng Wang

Magnetic resonance imaging (MRI) has played a crucial role in brain disease diagnosis, with which a range of computer-aided artificial intelligence methods have been proposed.

Bag of Tricks for Long-Tailed Multi-Label Classification on Chest X-Rays

no code implementations 17 Aug 2023 Feng Hong, Tianjie Dai, Jiangchao Yao, Ya zhang, Yanfeng Wang

Clinical classification of chest radiography is particularly challenging for standard machine learning algorithms due to its inherent long-tailed and multi-label nature.

Data Augmentation Multi-Label Classification

Joint-Relation Transformer for Multi-Person Motion Prediction

1 code implementation ICCV 2023 Qingyao Xu, Weibo Mao, Jingze Gong, Chenxin Xu, Siheng Chen, Weidi Xie, Ya zhang, Yanfeng Wang

Multi-person motion prediction is a challenging problem due to the dependency of motion on both individual past movements and interactions with other people.

motion prediction Relation

Multi-Scale Memory Comparison for Zero-/Few-Shot Anomaly Detection

no code implementations 9 Aug 2023 Chaoqin Huang, Aofan Jiang, Ya zhang, Yanfeng Wang

Anomaly detection has gained considerable attention due to its broad range of applications, particularly in industrial defect detection.

Anomaly Detection Defect Detection +1

Balanced Destruction-Reconstruction Dynamics for Memory-replay Class Incremental Learning

1 code implementation 3 Aug 2023 YuHang Zhou, Jiangchao Yao, Feng Hong, Ya zhang, Yanfeng Wang

By dynamically manipulating the gradient during training based on these factors, BDR can effectively alleviate knowledge destruction and improve knowledge reconstruction.

Class Incremental Learning Incremental Learning

Multi-scale Cross-restoration Framework for Electrocardiogram Anomaly Detection

1 code implementation 3 Aug 2023 Aofan Jiang, Chaoqin Huang, Qing Cao, Shuang Wu, Zi Zeng, Kang Chen, Ya zhang, Yanfeng Wang

To address this challenge, this paper introduces a novel multi-scale cross-restoration framework for ECG anomaly detection and localization that considers both local and global ECG characteristics.

Anomaly Detection

Audio-aware Query-enhanced Transformer for Audio-Visual Segmentation

no code implementations 25 Jul 2023 Jinxiang Liu, Chen Ju, Chaofan Ma, Yanfeng Wang, Yu Wang, Ya zhang

The goal of the audio-visual segmentation (AVS) task is to segment the sounding objects in the video frames using audio cues.

Decoder Segmentation

Multi-Modal Prototypes for Open-World Semantic Segmentation

no code implementations 5 Jul 2023 Yuhuan Yang, Chaofan Ma, Chen Ju, Fei Zhang, Jiangchao Yao, Ya zhang, Yanfeng Wang

To be specific, unlike the straightforward combination of bi-modal clues, we decompose the high-level language information as multi-aspect prototypes and aggregate the low-level visual information as more semantic prototypes, on the basis of which a fine-grained complementary fusion makes the multi-modal prototypes more powerful and accurate to promote the prediction.

Segmentation Semantic Segmentation

Boost Video Frame Interpolation via Motion Adaptation

1 code implementation 24 Jun 2023 HaoNing Wu, Xiaoyun Zhang, Weidi Xie, Ya zhang, Yanfeng Wang

Video frame interpolation (VFI) is a challenging task that aims to generate intermediate frames between two consecutive frames in a video.

Motion Estimation Video Frame Interpolation

Enhanced Multimodal Representation Learning with Cross-modal KD

no code implementations CVPR 2023 Mengxi Chen, Linyu Xing, Yu Wang, Ya zhang

This paper explores the tasks of leveraging auxiliary modalities which are only available at training to enhance multimodal representation learning through cross-modal Knowledge Distillation (KD).

Contrastive Learning Emotion Classification +5

Zero-shot Composed Text-Image Retrieval

1 code implementation 12 Jun 2023 Yikun Liu, Jiangchao Yao, Ya zhang, Yanfeng Wang, Weidi Xie

In this paper, we consider the problem of composed image retrieval (CIR), which aims to train a model that can fuse multi-modal information, e.g., text and images, to accurately retrieve images that match the query, extending the user's expression ability.

Image Retrieval Retrieval +1

Annotation-free Audio-Visual Segmentation

no code implementations 18 May 2023 Jinxiang Liu, Yu Wang, Chen Ju, Chaofan Ma, Ya zhang, Weidi Xie

The objective of Audio-Visual Segmentation (AVS) is to localise the sounding objects within visual scenes by accurately predicting pixel-wise segmentation masks.

Image Segmentation Segmentation +1

PMC-VQA: Visual Instruction Tuning for Medical Visual Question Answering

2 code implementations 17 May 2023 Xiaoman Zhang, Chaoyi Wu, Ziheng Zhao, Weixiong Lin, Ya zhang, Yanfeng Wang, Weidi Xie

In this paper, we focus on the problem of Medical Visual Question Answering (MedVQA), which is crucial in efficiently interpreting medical images with vital clinic-relevant information.

Generative Visual Question Answering Language Modelling +4

Prompt-Tuning Decision Transformer with Preference Ranking

no code implementations 16 May 2023 Shengchao Hu, Li Shen, Ya zhang, DaCheng Tao

Our work contributes to the advancement of prompt-tuning approaches in RL, providing a promising direction for optimizing large RL agents for specific preference tasks.

Class-Balancing Diffusion Models

1 code implementation CVPR 2023 Yiming Qin, Huangjie Zheng, Jiangchao Yao, Mingyuan Zhou, Ya zhang

To tackle this problem, we start from the hypothesis that the data distribution is not class-balanced, and propose Class-Balancing Diffusion Models (CBDM) that are trained with a distribution adjustment regularizer as a solution.

Diversity

PMC-LLaMA: Towards Building Open-source Language Models for Medicine

1 code implementation 27 Apr 2023 Chaoyi Wu, Weixiong Lin, Xiaoman Zhang, Ya zhang, Yanfeng Wang, Weidi Xie

Our contributions are threefold: (i) we systematically investigate the process of adapting a general-purpose foundation language model towards the medical domain; this involves data-centric knowledge injection through the integration of 4.8M biomedical academic papers and 30K medical textbooks, as well as comprehensive fine-tuning for alignment with domain-specific instructions; (ii) we contribute a large-scale, comprehensive dataset for instruction tuning.

Language Modelling Natural Language Understanding +1

Multi-modal Prompting for Low-Shot Temporal Action Localization

no code implementations 21 Mar 2023 Chen Ju, Zeqian Li, Peisen Zhao, Ya zhang, Xiaopeng Zhang, Qi Tian, Yanfeng Wang, Weidi Xie

In this paper, we consider the problem of temporal action localization under low-shot (zero-shot & few-shot) scenario, with the goal of detecting and classifying the action instances from arbitrary categories within some untrimmed videos, even not seen at training time.

Action Classification Temporal Action Localization

Boundary-aware Supervoxel-level Iteratively Refined Interactive 3D Image Segmentation with Multi-agent Reinforcement Learning

no code implementations 19 Mar 2023 Chaofan Ma, Qisen Xu, Xiangfeng Wang, Bo Jin, Xiaoyun Zhang, Yanfeng Wang, Ya zhang

Interactive segmentation has recently been explored to effectively and efficiently harvest high-quality segmentation masks by iteratively incorporating user hints.

Image Segmentation Interactive Segmentation +5

DiffusionSeg: Adapting Diffusion Towards Unsupervised Object Discovery

no code implementations 17 Mar 2023 Chaofan Ma, Yuhuan Yang, Chen Ju, Fei Zhang, Jinxiang Liu, Yu Wang, Ya zhang, Yanfeng Wang

However, the challenges exist as there is one structural difference between generative and discriminative models, which limits the direct use.

Object Object Discovery +1

Graph Decision Transformer

no code implementations 7 Mar 2023 Shengchao Hu, Li Shen, Ya zhang, DaCheng Tao

Offline reinforcement learning (RL) is a challenging task, whose objective is to learn policies from static trajectory data without interacting with the environment.

Offline RL OpenAI Gym +1

Knowledge-enhanced Visual-Language Pre-training on Chest Radiology Images

1 code implementation 27 Feb 2023 Xiaoman Zhang, Chaoyi Wu, Ya zhang, Yanfeng Wang, Weidi Xie

While multi-modal foundation models pre-trained on large-scale data have been successful in natural language understanding and vision recognition, their use in medical domains is still limited due to the fine-grained nature of medical tasks and the high demand for domain knowledge.

Natural Language Understanding Representation Learning

Constraint and Union for Partially-Supervised Temporal Sentence Grounding

no code implementations 20 Feb 2023 Chen Ju, Haicheng Wang, Jinxiang Liu, Chaofan Ma, Ya zhang, Peisen Zhao, Jianlong Chang, Qi Tian

Temporal sentence grounding aims to detect the event timestamps described by the natural language query from given untrimmed videos.

Sentence Temporal Sentence Grounding

Latent Class-Conditional Noise Model

1 code implementation 19 Feb 2023 Jiangchao Yao, Bo Han, Zhihan Zhou, Ya zhang, Ivor W. Tsang

We solve this problem by introducing a Latent Class-Conditional Noise model (LCCN) to parameterize the noise transition under a Bayesian framework.

Learning with noisy labels

Long-Tailed Partial Label Learning via Dynamic Rebalancing

1 code implementation 10 Feb 2023 Feng Hong, Jiangchao Yao, Zhihan Zhou, Ya zhang, Yanfeng Wang

The straightforward combination of LT and PLL, i.e., LT-PLL, suffers from a fundamental dilemma: LT methods build upon a given class distribution that is unavailable in PLL, and the performance of PLL is severely influenced in the long-tailed context.

Partial Label Learning

Open-vocabulary Object Segmentation with Diffusion Models

1 code implementation ICCV 2023 Ziyi Li, Qinye Zhou, Xiaoyun Zhang, Ya zhang, Yanfeng Wang, Weidi Xie

The goal of this paper is to extract the visual-language correspondence from a pre-trained text-to-image diffusion model, in the form of a segmentation map, i.e., simultaneously generating images and segmentation masks for the corresponding visual entities described in the text prompt.

Image Segmentation Object +3

Integrating features from lymph node stations for metastatic lymph node detection

no code implementations 9 Jan 2023 Chaoyi Wu, Feng Chang, Xiao Su, Zhihan Wu, Yanfeng Wang, Ling Zhu, Ya zhang

The branch aims to solve a closely related task on the LN station level, i.e., classifying whether an LN station contains metastatic LN or not, so as to learn representations for LN stations.

Computed Tomography (CT)

MedKLIP: Medical Knowledge Enhanced Language-Image Pre-Training in Radiology

no code implementations 5 Jan 2023 Chaoyi Wu, Xiaoman Zhang, Ya zhang, Yanfeng Wang, Weidi Xie

In this paper, we consider enhancing medical visual-language pre-training (VLP) with domain-specific knowledge, by exploiting the paired image-text reports from the radiological daily practice.

Medical Diagnosis Self-Supervised Learning

Federated Domain Generalization With Generalization Adjustment

1 code implementation CVPR 2023 Ruipeng Zhang, Qinwei Xu, Jiangchao Yao, Ya zhang, Qi Tian, Yanfeng Wang

Federated Domain Generalization (FedDG) attempts to learn a global model in a privacy-preserving manner that generalizes well to new clients possibly with domain shift.

Domain Generalization Fairness +1

MedKLIP: Medical Knowledge Enhanced Language-Image Pre-Training for X-ray Diagnosis

no code implementations ICCV 2023 Chaoyi Wu, Xiaoman Zhang, Ya zhang, Yanfeng Wang, Weidi Xie

In this paper, we consider enhancing medical visual-language pre-training (VLP) with domain-specific knowledge, by exploiting the paired image-text reports from the radiological daily practice.

Medical Diagnosis

On Transforming Reinforcement Learning by Transformer: The Development Trajectory

no code implementations 29 Dec 2022 Shengchao Hu, Li Shen, Ya zhang, Yixin Chen, DaCheng Tao

Transformer, originally devised for natural language processing, has also attested significant success in computer vision.

Autonomous Driving reinforcement-learning +2

FedSkip: Combatting Statistical Heterogeneity with Federated Skip Aggregation

1 code implementation 14 Dec 2022 Ziqing Fan, Yanfeng Wang, Jiangchao Yao, Lingjuan Lyu, Ya zhang, Qi Tian

However, in addition to previous explorations for improvement in federated averaging, our analysis shows that another critical bottleneck is the poorer optima of client models in more heterogeneous conditions.

Federated Learning

Open-vocabulary Semantic Segmentation with Frozen Vision-Language Models

1 code implementation 27 Oct 2022 Chaofan Ma, Yuhuan Yang, Yanfeng Wang, Ya zhang, Weidi Xie

When trained at a sufficient scale, self-supervised learning has exhibited a notable ability to solve a wide range of visual or language understanding tasks.

Image Segmentation Language Modelling +4

A Simple Plugin for Transforming Images to Arbitrary Scales

no code implementations 7 Oct 2022 Qinye Zhou, Ziyi Li, Weidi Xie, Xiaoyun Zhang, Ya zhang, Yanfeng Wang

Existing super-resolution models are often specialized for one scale, fundamentally limiting their use in practical scenarios.

Super-Resolution

Transforming the Interactive Segmentation for Medical Imaging

no code implementations 20 Aug 2022 Wentao Liu, Chaofan Ma, Yuhuan Yang, Weidi Xie, Ya zhang

The goal of this paper is to interactively refine the automatic segmentation on challenging structures that fall behind human performance, either due to the scarcity of available annotations or the difficult nature of the problem itself, for example, on segmenting cancer or small organs.

Decoder Interactive Segmentation +1

Neural Message Passing for Visual Relationship Detection

1 code implementation 8 Aug 2022 Yue Hu, Siheng Chen, Xu Chen, Ya zhang, Xiao Gu

Visual relationship detection aims to detect the interactions between objects in an image; however, this task suffers from combinatorial explosion due to the variety of objects and interactions.

Relationship Detection Visual Relationship Detection

Skeleton-Parted Graph Scattering Networks for 3D Human Motion Prediction

1 code implementation 31 Jul 2022 Maosen Li, Siheng Chen, Zijing Zhang, Lingxi Xie, Qi Tian, Ya zhang

To address the first issue, we propose adaptive graph scattering, which leverages multiple trainable band-pass graph filters to decompose pose features into richer graph spectrum bands.

Human motion prediction motion prediction

Registration based Few-Shot Anomaly Detection

1 code implementation 15 Jul 2022 Chaoqin Huang, Haoyan Guan, Aofan Jiang, Ya zhang, Michael Spratling, Yan-Feng Wang

Inspired by how humans detect anomalies, i.e., comparing an image in question to normal images, we here leverage registration, an image alignment task that is inherently generalizable across categories, as the proxy task, to train a category-agnostic anomaly detection model.

Anomaly Detection

Collaborative Uncertainty Benefits Multi-Agent Multi-Modal Trajectory Forecasting

no code implementations 11 Jul 2022 Bohan Tang, Yiqi Zhong, Chenxin Xu, Wei-Tao Wu, Ulrich Neumann, Yanfeng Wang, Ya zhang, Siheng Chen

Further, we apply the proposed framework to current SOTA multi-agent multi-modal forecasting systems as a plugin module, which enables the SOTA systems to 1) estimate the uncertainty in the multi-agent multi-modal trajectory forecasting task; 2) rank the multiple predictions and select the optimal one based on the estimated uncertainty.

regression Task 2 +1

Dynamic-Group-Aware Networks for Multi-Agent Trajectory Prediction with Relational Reasoning

1 code implementation 27 Jun 2022 Chenxin Xu, Yuxi Wei, Bohan Tang, Sheng Yin, Ya zhang, Siheng Chen

Demystifying the interactions among multiple agents from their past trajectories is fundamental to precise and interpretable trajectory prediction.

Diversity Relational Reasoning +1

Exploiting Transformation Invariance and Equivariance for Self-supervised Sound Localisation

no code implementations 26 Jun 2022 Jinxiang Liu, Chen Ju, Weidi Xie, Ya zhang

We present a simple yet effective self-supervised framework for audio-visual representation learning, to localize the sound source in videos.

Cross-Modal Retrieval Representation Learning +1

Contrastive Learning with Boosted Memorization

1 code implementation 25 May 2022 Zhihan Zhou, Jiangchao Yao, Yanfeng Wang, Bo Han, Ya zhang

Different from previous works, we explore this direction from an alternative perspective, i.e., the data perspective, and propose a novel Boosted Contrastive Learning (BCL) method.

Contrastive Learning Memorization +2

Self-Supervised Masking for Unsupervised Anomaly Detection and Localization

no code implementations 13 May 2022 Chaoqin Huang, Qinwei Xu, Yanfeng Wang, Yu Wang, Ya zhang

To extend the reconstruction-based anomaly detection architecture to the localized anomalies, we propose a self-supervised learning approach through random masking and then restoring, named Self-Supervised Masking (SSM) for unsupervised anomaly detection and localization.

Defect Detection Medical Diagnosis +2

GroupNet: Multiscale Hypergraph Neural Networks for Trajectory Prediction with Relational Reasoning

1 code implementation CVPR 2022 Chenxin Xu, Maosen Li, Zhenyang Ni, Ya zhang, Siheng Chen

From the aspect of interaction capturing, we propose a trainable multiscale hypergraph to capture both pair-wise and group-wise interactions at multiple group sizes.

Relational Reasoning Representation Learning +1

Task Decoupled Framework for Reference-Based Super-Resolution

no code implementations CVPR 2022 Yixuan Huang, Xiaoyun Zhang, Yu Fu, Siheng Chen, Ya zhang, Yan-Feng Wang, Dazhi He

Those methods conduct the super-resolution task of the input low-resolution (LR) image and the texture transfer task from the reference image together in one module, easily introducing the interference between LR and reference features.

Image Super-Resolution Reference-based Super-Resolution

LAR-SR: A Local Autoregressive Model for Image Super-Resolution

1 code implementation CVPR 2022 Baisong Guo, Xiaoyun Zhang, HaoNing Wu, Yu Wang, Ya zhang, Yan-Feng Wang

Previous super-resolution (SR) approaches often formulate SR as a regression problem and pixel-wise restoration, which leads to a blurry and unrealistic SR output.

Image Super-Resolution

Prompting Visual-Language Models for Efficient Video Understanding

1 code implementation8 Dec 2021 Chen Ju, Tengda Han, Kunhao Zheng, Ya zhang, Weidi Xie

Image-based visual-language (I-VL) pre-training has shown great success for learning joint visual-textual representations from large-scale web data, revealing remarkable ability for zero-shot generalisation.

Action Recognition Language Modelling +4

Collaborative Uncertainty in Multi-Agent Trajectory Forecasting

no code implementations NeurIPS 2021 Bohan Tang, Yiqi Zhong, Ulrich Neumann, Gang Wang, Ya zhang, Siheng Chen

2) The results of trajectory forecasting benchmarks demonstrate that the CU-based framework steadily helps SOTA systems improve their performances.

Trajectory Forecasting

Spatio-Temporal Graph Complementary Scattering Networks

no code implementations23 Oct 2021 Zida Cheng, Siheng Chen, Ya zhang

Spatio-temporal graph signal analysis has a significant impact on a wide range of applications, including hand/body pose action recognition.

Action Recognition

A 3D Mesh-based Lifting-and-Projection Network for Human Pose Transfer

no code implementations24 Sep 2021 Jinxiang Liu, Yangheng Zhao, Siheng Chen, Ya zhang

To leverage the human body shape prior, LPNet exploits the topological information of the body mesh to learn an expressive visual representation for the target person in the 3D mesh space.

Image-to-Image Translation Pose Transfer +1

Multiscale Spatio-Temporal Graph Neural Networks for 3D Skeleton-Based Motion Prediction

no code implementations25 Aug 2021 Maosen Li, Siheng Chen, Yangheng Zhao, Ya zhang, Yanfeng Wang, Qi Tian

The core of MST-GNN is a multiscale spatio-temporal graph that explicitly models the relations in motions at various spatial and temporal scales.

Decoder Graph Neural Network +1

CaT: Weakly Supervised Object Detection with Category Transfer

no code implementations ICCV 2021 Tianyue Cao, Lianyu Du, Xiaoyun Zhang, Siheng Chen, Ya zhang, Yan-Feng Wang

To handle overlapping category transfer, we propose a double-supervision mean teacher to gather common category information and bridge the domain gap between two datasets.

Object object-detection +2

Cooperative Learning for Noisy Supervision

no code implementations11 Aug 2021 Hao Wu, Jiangchao Yao, Ya zhang, Yanfeng Wang

Learning with noisy labels has gained enormous interest in the robust deep learning area.

Learning with noisy labels

MS-KD: Multi-Organ Segmentation with Multiple Binary-Labeled Datasets

no code implementations5 Aug 2021 Shixiang Feng, YuHang Zhou, Xiaoman Zhang, Ya zhang, Yanfeng Wang

A novel Multi-teacher Single-student Knowledge Distillation (MS-KD) framework is proposed, where the teacher models are pre-trained single-organ segmentation networks, and the student model is a multi-organ segmentation network.

Knowledge Distillation Organ Segmentation +1

Online Multi-Agent Forecasting with Interpretable Collaborative Graph Neural Network

no code implementations2 Jul 2021 Maosen Li, Siheng Chen, Yanning Shen, Genjia Liu, Ivor W. Tsang, Ya zhang

This paper considers predicting future statuses of multiple agents in an online fashion by exploiting dynamic interactions in the system.

Graph Neural Network Human motion prediction +1

Knowledge distillation from multi-modal to mono-modal segmentation networks

no code implementations17 Jun 2021 Minhao Hu, Matthis Maillard, Ya zhang, Tommaso Ciceri, Giammarco La Barbera, Isabelle Bloch, Pietro Gori

In this paper, we propose KD-Net, a framework to transfer knowledge from a trained multi-modal network (teacher) to a mono-modal one (student).

Brain Tumor Segmentation Image Segmentation +3

A Fourier-based Framework for Domain Generalization

1 code implementation CVPR 2021 Qinwei Xu, Ruipeng Zhang, Ya zhang, Yanfeng Wang, Qi Tian

Modern deep neural networks suffer from performance degradation when evaluated on testing data under different distributions from training data.

Data Augmentation Domain Generalization

Contrastive Attraction and Contrastive Repulsion for Representation Learning

1 code implementation8 May 2021 Huangjie Zheng, Xu Chen, Jiangchao Yao, Hongxia Yang, Chunyuan Li, Ya zhang, Hao Zhang, Ivor Tsang, Jingren Zhou, Mingyuan Zhou

We realize this strategy with contrastive attraction and contrastive repulsion (CACR), which makes the query not only exert a greater force to attract more distant positive samples but also do so to repel closer negative samples.

Contrastive Learning Representation Learning

Monitoring urban ecosystem service value using dynamic multi-level grids

no code implementations15 Apr 2021 Zhenfeng Shao, Yong Li, Xiao Huang, Bowen Cai, Lin Ding, Wenkang Pan, Ya zhang

Ecosystem valuation is a method of assigning a monetary value to an ecosystem with its goods and services, often referred to as ecosystem service value (ESV).

valid

Adaptive Mutual Supervision for Weakly-Supervised Temporal Action Localization

no code implementations6 Apr 2021 Chen Ju, Peisen Zhao, Siheng Chen, Ya zhang, Xiaoyun Zhang, Qi Tian

To solve this issue, we introduce an adaptive mutual supervision framework (AMS) with two branches, where the base branch adopts CAS to localize the most discriminative action regions, while the supplementary branch localizes the less discriminative action regions through a novel adaptive sampler.

Weakly Supervised Action Localization Weakly-supervised Temporal Action Localization +1

Collaborative Label Correction via Entropy Thresholding

no code implementations31 Mar 2021 Hao Wu, Jiangchao Yao, Jiajie Wang, Yinru Chen, Ya zhang, Yanfeng Wang

Deep neural networks (DNNs) have the capacity to fit extremely noisy labels; nonetheless, they tend to learn data with clean labels first and then memorize those with noisy labels.

Spatio-Temporal Sparsification for General Robust Graph Convolution Networks

no code implementations23 Mar 2021 Mingming Lu, Ya zhang

Graph Neural Networks (GNNs) have attracted increasing attention due to their successful applications on various graph-structured data.

Sequential Learning on Liver Tumor Boundary Semantics and Prognostic Biomarker Mining

no code implementations9 Mar 2021 Jieneng Chen, Ke Yan, Yu-Dong Zhang, YouBao Tang, Xun Xu, Shuwen Sun, Qiuping Liu, Lingyun Huang, Jing Xiao, Alan L. Yuille, Ya zhang, Le Lu

(2) The sampled deep vertex features with positional embedding are mapped into a sequential space and decoded by a multilayer perceptron (MLP) for semantic classification.

valid

Uncertainty-aware Incremental Learning for Multi-organ Segmentation

no code implementations9 Mar 2021 YuHang Zhou, Xiaoman Zhang, Shixiang Feng, Ya zhang, Yanfeng

Specifically, given a pretrained $K$ organ segmentation model and a new single-organ dataset, we train a unified $K+1$ organ segmentation model without accessing any data belonging to the previous training stages.

Ethics Incremental Learning +3

Divide and Conquer for Single-Frame Temporal Action Localization

no code implementations ICCV 2021 Chen Ju, Peisen Zhao, Siheng Chen, Ya zhang, Yanfeng Wang, Qi Tian

Single-frame temporal action localization (STAL) aims to localize actions in untrimmed videos with only one timestamp annotation for each action instance.

Temporal Action Localization

Invariant Teacher and Equivariant Student for Unsupervised 3D Human Pose Estimation

1 code implementation17 Dec 2020 Chenxin Xu, Siheng Chen, Maosen Li, Ya zhang

To handle the decomposition ambiguity in the teacher network, we propose a cycle-consistent architecture promoting a 3D rotation-invariant property to train the teacher network.

3D Human Pose Estimation Knowledge Distillation +1

Point-Level Temporal Action Localization: Bridging Fully-supervised Proposals to Weakly-supervised Losses

no code implementations15 Dec 2020 Chen Ju, Peisen Zhao, Ya zhang, Yanfeng Wang, Qi Tian

Point-Level temporal action localization (PTAL) aims to localize actions in untrimmed videos with only one timestamp annotation for each action instance.

Weakly Supervised Action Localization

Deep Unsupervised Image Anomaly Detection: An Information Theoretic Framework

no code implementations9 Dec 2020 Fei Ye, Huangjie Zheng, Chaoqin Huang, Ya zhang

Based on this objective function, we introduce a novel information theoretic framework for unsupervised image anomaly detection.

Anomaly Detection

ESAD: End-to-end Deep Semi-supervised Anomaly Detection

no code implementations9 Dec 2020 Chaoqin Huang, Fei Ye, Peisen Zhao, Ya zhang, Yan-Feng Wang, Qi Tian

This paper explores semi-supervised anomaly detection, a more practical setting for anomaly detection where a small additional set of labeled samples are provided.

Ranked #25 on Anomaly Detection on One-class CIFAR-10 (using extra training data)

Decoder Medical Diagnosis +2

Privileged Knowledge Distillation for Online Action Detection

no code implementations18 Nov 2020 Peisen Zhao, Lingxi Xie, Ya zhang, Yanfeng Wang, Qi Tian

Knowledge distillation is employed to transfer the privileged information from the offline teacher to the online student.

Knowledge Distillation Online Action Detection

Sampling and Recovery of Graph Signals based on Graph Neural Networks

no code implementations3 Nov 2020 Siheng Chen, Maosen Li, Ya zhang

Compared to previous analytical sampling and recovery, the proposed methods are able to flexibly learn a variety of graph signal models from data by leveraging the learning ability of neural networks; compared to previous neural-network-based sampling and recovery, the proposed methods are designed through exploiting specific graph properties and provide interpretability.

Graph Classification Graph Neural Network +1

Learning on Attribute-Missing Graphs

3 code implementations3 Nov 2020 Xu Chen, Siheng Chen, Jiangchao Yao, Huangjie Zheng, Ya zhang, Ivor W Tsang

Thereby, designing a new GNN for these graphs is a burning issue to the graph learning community.

Attribute Graph Learning +1

SAR: Scale-Aware Restoration Learning for 3D Tumor Segmentation

no code implementations13 Oct 2020 Xiaoman Zhang, Shixiang Feng, YuHang Zhou, Ya zhang, Yanfeng Wang

We demonstrate the effectiveness of our methods on two downstream tasks: i) Brain tumor segmentation, ii) Pancreas tumor segmentation.

Brain Tumor Segmentation Segmentation +3

Two-Stream Compare and Contrast Network for Vertebral Compression Fracture Diagnosis

no code implementations13 Oct 2020 Shixiang Feng, Beibei Liu, Ya zhang, Xiaoyun Zhang, Yuehua Li

In this paper, we explore modeling VCF diagnosis as a three-class classification problem, i.e., normal vertebrae, benign VCFs, and malignant VCFs.

Classification General Classification +2

Graph Cross Networks with Vertex Infomax Pooling

2 code implementations NeurIPS 2020 Maosen Li, Siheng Chen, Ya zhang, Ivor W. Tsang

Based on trainable hierarchical representations of a graph, GXN enables the interchange of intermediate features across scales to promote information flow.

General Classification Graph Classification

Urban Traffic Flow Forecast Based on FastGCRNN

no code implementations17 Sep 2020 Ya Zhang, Mingming Lu, Haifeng Li

Traffic forecasting is an important prerequisite for the application of intelligent transportation systems in urban traffic networks.

Decoder

Decoupled Variational Embedding for Signed Directed Networks

1 code implementation28 Aug 2020 Xu Chen, Jiangchao Yao, Maosen Li, Ya zhang, Yan-Feng Wang

Comprehensive results on both link sign prediction and node recommendation task demonstrate the effectiveness of DVE.

Link Sign Prediction Node Classification +1

Learning Node Representations against Perturbations

1 code implementation26 Aug 2020 Xu Chen, Yuangang Pan, Ivor Tsang, Ya zhang

In this paper, we study how to learn node representations against perturbations in GNN.

Contrastive Learning Node Classification +1

Collaborative Adversarial Learning for Relational Learning on Multiple Bipartite Graphs

no code implementations16 Jul 2020 Jingchao Su, Xu Chen, Ya zhang, Siheng Chen, Dan Lv, Chenyang Li

The two-level alignment acts as two different constraints on different relations of the shared entities and facilitates better knowledge transfer for relational learning on multiple bipartite graphs.

Relational Reasoning Transfer Learning

Universal-to-Specific Framework for Complex Action Recognition

no code implementations13 Jul 2020 Peisen Zhao, Lingxi Xie, Ya zhang, Qi Tian

The U2S framework is composed of three subnetworks: a universal network, a category-specific network, and a mask network.

Action Recognition Decision Making

From Quantized DNNs to Quantizable DNNs

no code implementations11 Apr 2020 Kunyuan Du, Ya zhang, Haibing Guan

This paper proposes Quantizable DNNs, a special type of DNNs that can flexibly quantize its bit-width (denoted as 'bit modes' hereafter) during execution without further re-training.

Dynamic Multiscale Graph Neural Networks for 3D Skeleton-Based Human Motion Prediction

1 code implementation17 Mar 2020 Maosen Li, Siheng Chen, Yangheng Zhao, Ya zhang, Yan-Feng Wang, Qi Tian

The core idea of DMGNN is to use a multiscale graph to comprehensively model the internal relations of a human body for motion feature learning.

3D Human Pose Estimation 3D Pose Estimation +3

Bottom-Up Temporal Action Localization with Mutual Regularization

1 code implementation ECCV 2020 Peisen Zhao, Lingxi Xie, Chen Ju, Ya zhang, Yan-Feng Wang, Qi Tian

To alleviate this problem, we introduce two regularization terms to mutually regularize the learning procedure: the Intra-phase Consistency (IntraC) regularization is proposed to make the predictions verified inside each phase; and the Inter-phase Consistency (InterC) regularization is proposed to keep consistency between these phases.

Temporal Action Localization

Attribute Restoration Framework for Anomaly Detection

1 code implementation25 Nov 2019 Chaoqin Huang, Fei Ye, Jinkun Cao, Maosen Li, Ya zhang, Cewu Lu

We here propose to break this equivalence by erasing selected attributes from the original data and reformulate it as a restoration task, where the normal and the anomalous data are expected to be distinguishable based on restoration errors.

Anomaly Detection Attribute +1

Iteratively-Refined Interactive 3D Medical Image Segmentation with Multi-Agent Reinforcement Learning

no code implementations CVPR 2020 Xuan Liao, Wenhao Li, Qisen Xu, Xiangfeng Wang, Bo Jin, Xiaoyun Zhang, Ya zhang, Yan-Feng Wang

We here propose to model the dynamic process of iterative interactive image segmentation as a Markov decision process (MDP) and solve it with reinforcement learning (RL).

Image Segmentation Medical Image Segmentation +5

Cascading: Association Augmented Sequential Recommendation

no code implementations17 Oct 2019 Xu Chen, Kenan Cui, Ya zhang, Yan-Feng Wang

Recently, recommendation according to sequential user behaviors has shown promising results in many application scenarios.

Graph Embedding Sequential Recommendation

Data Augmentation Revisited: Rethinking the Distribution Gap between Clean and Augmented Data

no code implementations19 Sep 2019 Zhuoxun He, Lingxi Xie, Xin Chen, Ya zhang, Yan-Feng Wang, Qi Tian

Data augmentation has been widely applied as an effective methodology to improve generalization, particularly when training deep neural networks.

Data Augmentation Image Classification +2

Node Attribute Generation on Graphs

3 code implementations23 Jul 2019 Xu Chen, Siheng Chen, Huangjie Zheng, Jiangchao Yao, Kenan Cui, Ya zhang, Ivor W. Tsang

NANG learns a unifying latent representation which is shared by both node attributes and graph structures and can be translated to different modalities.

Attribute Data Augmentation +3

Defending Adversarial Attacks by Correcting logits

no code implementations26 Jun 2019 Yifeng Li, Lingxi Xie, Ya zhang, Rui Zhang, Yanfeng Wang, Qi Tian

Generating and eliminating adversarial examples has been an intriguing topic in the field of deep learning.

Handwritten Chinese Font Generation with Collaborative Stroke Refinement

no code implementations30 Apr 2019 Chuan Wen, Jie Chang, Ya zhang, Siheng Chen, Yan-Feng Wang, Mei Han, Qi Tian

Automatic character generation is an appealing solution for new typeface design, especially for Chinese typefaces including over 3700 most commonly-used characters.

Font Generation

Safeguarded Dynamic Label Regression for Generalized Noisy Supervision

1 code implementation6 Mar 2019 Jiangchao Yao, Ya zhang, Ivor W. Tsang, Jun Sun

We further generalize LCCN for open-set noisy labels and the semi-supervised setting.

Ranked #35 on Image Classification on Clothing1M (using extra training data)

Learning with noisy labels regression

Accelerate CNN via Recursive Bayesian Pruning

no code implementations ICCV 2019 Yuefu Zhou, Ya zhang, Yan-Feng Wang, Qi Tian

A new dropout-based measurement of redundancy, which facilitates the computation of posterior assuming inter-layer dependency, is introduced.

Domain-Invariant Adversarial Learning for Unsupervised Domain Adaption

no code implementations30 Nov 2018 Yexun Zhang, Ya zhang, Yan-Feng Wang, Qi Tian

Unsupervised domain adaption aims to learn a powerful classifier for the target domain given a labeled source data set and an unlabeled target data set.

Domain Adaptation Generative Adversarial Network

Phase Collaborative Network for Two-Phase Medical Image Segmentation

no code implementations28 Nov 2018 Huangjie Zheng, Lingxi Xie, Tianwei Ni, Ya zhang, Yan-Feng Wang, Qi Tian, Elliot K. Fishman, Alan L. Yuille

However, in medical image analysis, fusing prediction from two phases is often difficult, because (i) there is a domain gap between two phases, and (ii) the semantic labels are not pixel-wise corresponded even for images scanned from the same patient.

Image Segmentation Medical Image Segmentation +3

Variational Collaborative Learning for User Probabilistic Representation

no code implementations22 Sep 2018 Kenan Cui, Xu Chen, Jiangchao Yao, Ya zhang

Conventional CF-based methods use the user-item interaction data as the sole information source to recommend items to users.

Collaborative Filtering Recommendation Systems

Learning Multi-touch Conversion Attribution with Dual-attention Mechanisms for Online Advertising

1 code implementation11 Aug 2018 Kan Ren, Yuchen Fang, Wei-Nan Zhang, Shuhao Liu, Jiajun Li, Ya zhang, Yong Yu, Jun Wang

To achieve this, we utilize sequence-to-sequence prediction for user clicks, and combine both post-view and post-click attribution patterns together for the final conversion estimation.

Understanding VAEs in Fisher-Shannon Plane

no code implementations10 Jul 2018 Huangjie Zheng, Jiangchao Yao, Ya zhang, Ivor W. Tsang, Jia Wang

In information theory, Fisher information and Shannon information (entropy) are respectively used to quantify the uncertainty associated with the distribution modeling and the uncertainty in specifying the outcome of given variables.

Decoder Representation Learning

A Unified Framework for Generalizable Style Transfer: Style and Content Separation

1 code implementation13 Jun 2018 Yexun Zhang, Ya zhang, Wenbin Cai

The encoders are expected to capture the underlying features for different styles and contents, which are generalizable to new styles and contents.

Decoder Multi-Task Learning +1

Masking: A New Perspective of Noisy Supervision

2 code implementations NeurIPS 2018 Bo Han, Jiangchao Yao, Gang Niu, Mingyuan Zhou, Ivor Tsang, Ya zhang, Masashi Sugiyama

It is important to learn various types of classifiers given training data with noisy labels.

Ranked #42 on Image Classification on Clothing1M (using extra training data)

Image Classification

Variational Composite Autoencoders

no code implementations12 Apr 2018 Jiangchao Yao, Ivor Tsang, Ya zhang

Learning in the latent variable model is challenging in the presence of the complex data structure or the intractable latent variable.

Decoder

Multi-Scale Spatially-Asymmetric Recalibration for Image Classification

no code implementations ECCV 2018 Yan Wang, Lingxi Xie, Siyuan Qiao, Ya zhang, Wenjun Zhang, Alan L. Yuille

Convolution is spatially-symmetric, i.e., the visual features are independent of their position in the image, which limits its ability to utilize contextual cues for visual recognition.