no code implementations • IWSLT (EMNLP) 2018 • Yuguang Wang, Liangliang Shi, Linyu Wei, Weifeng Zhu, Jinkun Chen, Zhichao Wang, Shixue Wen, Wei Chen, Yanfeng Wang, Jia Jia
Our final average result on speech translation is 31.02 BLEU.
Automatic Speech Recognition (ASR)
no code implementations • 26 Apr 2024 • YuHang Zhou, Haolin Li, Siyuan Du, Jiangchao Yao, Ya zhang, Yanfeng Wang
The popularity of large-scale pre-training has promoted the development of medical foundation models.
no code implementations • 25 Apr 2024 • Xiaoman Zhang, Chaoyi Wu, Ziheng Zhao, Jiayu Lei, Ya zhang, Yanfeng Wang, Weidi Xie
We believe that RadGenome-Chest CT can significantly advance the development of multimodal medical foundation models, by training to generate texts based on given segmentation regions, which is unattainable with previous relevant datasets.
no code implementations • 23 Apr 2024 • Haozhe Cheng, Cheng Ju, Haicheng Wang, Jinxiang Liu, Mengting Chen, Qiang Hu, Xiaoyun Zhang, Yanfeng Wang
The denoised text classes help OVAR models classify visual samples more accurately; in return, classified visual samples help better denoising.
no code implementations • 18 Apr 2024 • Yuzhu Cai, Sheng Yin, Yuxi Wei, Chenxin Xu, Weibo Mao, Felix Juefei-Xu, Siheng Chen, Yanfeng Wang
The burgeoning landscape of text-to-image models, exemplified by innovations such as Midjourney and DALLE 3, has revolutionized content creation across diverse sectors.
1 code implementation • 15 Apr 2024 • Xiao Zhou, Xiaoman Zhang, Chaoyi Wu, Ya zhang, Weidi Xie, Yanfeng Wang
In this paper, we consider the problem of visual representation learning for computational pathology, by exploiting large-scale image-text pairs gathered from public resources, along with the domain specific knowledge in pathology.
2 code implementations • 13 Apr 2024 • Yusheng Liao, Shuyang Jiang, Yu Wang, Yanfeng Wang
Large language models like ChatGPT have shown substantial progress in natural language understanding and generation, proving valuable across various disciplines, including the medical field.
no code implementations • 7 Apr 2024 • Aofan Jiang, Chaoqin Huang, Qing Cao, Yuchen Xu, Zi Zeng, Kang Chen, Ya zhang, Yanfeng Wang
We introduce a novel self-supervised learning framework for ECG AD, utilizing a vast dataset of normal ECGs to autonomously detect and localize cardiac anomalies.
Self-Supervised Anomaly Detection, Self-Supervised Learning
no code implementations • 26 Mar 2024 • Yuhuan Yang, Chaofan Ma, Jiangchao Yao, Zhun Zhong, Ya zhang, Yanfeng Wang
Referring Image Segmentation (RIS) leveraging transformers has achieved great success on the interpretation of complex visual-language tasks.
no code implementations • 21 Mar 2024 • Zhe Chen, Heyang Liu, Wenyi Yu, Guangzhi Sun, Hongcheng Liu, Ji Wu, Chao Zhang, Yu Wang, Yanfeng Wang
Although multiple academic video datasets have been constructed and released, few of them support both multimodal content recognition and understanding tasks, which is partially due to the lack of high-quality human annotations.
1 code implementation • 19 Mar 2024 • Chaoqin Huang, Aofan Jiang, Jinghao Feng, Ya zhang, Xinchao Wang, Yanfeng Wang
Recent advancements in large-scale visual-language pre-trained models have led to significant progress in zero-/few-shot anomaly detection within natural image domains.
no code implementations • 17 Mar 2024 • Jinxiang Liu, Yikun Liu, Fei Zhang, Chen Ju, Ya zhang, Yanfeng Wang
NFs, temporally adjacent to the labeled frame, often contain rich motion information that assists in the accurate localization of sounding objects.
2 code implementations • 13 Mar 2024 • Yusheng Liao, Yutong Meng, Yuhao Wang, Hongcheng Liu, Yanfeng Wang, Yu Wang
Large Language Models (LLMs) have demonstrated remarkable proficiency in human interactions, yet their application within the medical field remains insufficiently explored.
no code implementations • 11 Mar 2024 • Shuo Tang, Rui Ye, Chenxin Xu, Xiaowen Dong, Siheng Chen, Yanfeng Wang
In this paper, we propose DeLAMA, a decentralized multi-agent lifelong collaborative learning algorithm with dynamic collaboration graphs.
no code implementations • 7 Mar 2024 • Wanru Zhao, Yaxin Du, Nicholas Donald Lane, Siheng Chen, Yanfeng Wang
In the current landscape of foundation model training, there is a significant reliance on public domain data, which is nearing exhaustion according to recent research.
no code implementations • 1 Mar 2024 • Heyang Liu, Yu Wang, Yanfeng Wang
The end-to-end (E2E) approach is gradually replacing hybrid models for automatic speech recognition (ASR) tasks.
Automatic Speech Recognition (ASR)
no code implementations • 28 Feb 2024 • Yusheng Liao, Yanfeng Wang, Yu Wang
Autoregressive (AR) and Non-autoregressive (NAR) models are two types of generative models for Neural Machine Translation (NMT).
1 code implementation • 21 Feb 2024 • Pengcheng Qiu, Chaoyi Wu, Xiaoman Zhang, Weixiong Lin, Haicheng Wang, Ya zhang, Yanfeng Wang, Weidi Xie
In this paper, we aim to develop an open-source, multilingual language model for medicine that benefits a wider, linguistically diverse audience from different regions.
no code implementations • 19 Feb 2024 • Hongcheng Liu, Pingjie Wang, Yu Wang, Yanfeng Wang
Video-grounded dialogue generation (VDG) requires the system to generate a fluent and accurate answer based on multimodal knowledge.
no code implementations • 18 Feb 2024 • YiQiu Guo, Yuchen Yang, Ya zhang, Yu Wang, Yanfeng Wang
Structured data offers a sophisticated mechanism for the organization of information.
1 code implementation • 10 Feb 2024 • Rui Ye, Wenhao Wang, Jingyi Chai, Dihan Li, Zexi Li, Yinda Xu, Yaxin Du, Yanfeng Wang, Siheng Chen
Trained on massive publicly available data, large language models (LLMs) have demonstrated tremendous success across various fields.
no code implementations • 8 Feb 2024 • Xianghe Pang, Shuo Tang, Rui Ye, Yuxin Xiong, Bolun Zhang, Yanfeng Wang, Siheng Chen
Aligning large language models (LLMs) with human values is imperative to mitigate potential adverse effects resulting from their misuse.
1 code implementation • 8 Feb 2024 • Yuxi Wei, Zi Wang, Yifan Lu, Chenxin Xu, Changxing Liu, Hao Zhao, Siheng Chen, Yanfeng Wang
Furthermore, to unleash the potential of extensive high-quality digital assets, ChatSim employs a novel multi-camera lighting estimation method to achieve scene-consistent assets' rendering.
1 code implementation • 25 Jan 2024 • Yifan Lu, Yue Hu, Yiqi Zhong, Dequan Wang, Yanfeng Wang, Siheng Chen
In this paper, we introduce a new open heterogeneous problem: how to accommodate continually emerging new heterogeneous agent types into collaborative perception, while ensuring high perception performance and low integration cost?
no code implementations • 23 Jan 2024 • Shaoheng Fang, Rui Ye, Wenhao Wang, Zuhong Liu, Yuxiao Wang, Yafei Wang, Siheng Chen, Yanfeng Wang
In this paper, we introduce FedRSU, an innovative federated learning framework for self-supervised scene flow estimation.
1 code implementation • 15 Jan 2024 • Yuhao Wang, Yusheng Liao, Heyang Liu, Hongcheng Liu, Yu Wang, Yanfeng Wang
We believe that these hallucinations are partially due to the models' struggle with understanding what they can and cannot perceive from images, a capability we refer to as self-awareness in perception.
no code implementations • 28 Dec 2023 • Ziheng Zhao, Yao Zhang, Chaoyi Wu, Xiaoman Zhang, Ya zhang, Yanfeng Wang, Weidi Xie
Our main contributions are threefold: (i) on data construction, we combine multiple knowledge sources to construct a multi-modal medical knowledge tree; then we build up a large-scale segmentation dataset for training, by collecting over 11K 3D medical image scans from 31 segmentation datasets with careful standardization on both visual scans and label space; (ii) on model training, we formulate a universal segmentation model that can be prompted by inputting medical terminologies in text form.
1 code implementation • 26 Dec 2023 • Qiaoyu Zheng, Weike Zhao, Chaoyi Wu, Xiaoman Zhang, Ya zhang, Yanfeng Wang, Weidi Xie
In this study, we aim to investigate the problem of large-scale, large-vocabulary disease classification for radiologic images, which can be formulated as a multi-modal, multi-anatomy, multi-label, long-tailed classification.
no code implementations • 21 Dec 2023 • Zeqian Li, Qirui Chen, Tengda Han, Ya zhang, Yanfeng Wang, Weidi Xie
In this paper, we consider the problem of temporally aligning the video and texts from instructional videos. Specifically, given a long-term video and associated text sentences, our goal is to determine their corresponding timestamps in the video.
no code implementations • 20 Dec 2023 • Yan Cai, LinLin Wang, Ye Wang, Gerard de Melo, Ya zhang, Yanfeng Wang, Liang He
The emergence of various medical large language models (LLMs) in the medical domain has highlighted the need for unified evaluation standards, as manual evaluation of LLMs proves to be time-consuming and labor-intensive.
1 code implementation • 18 Dec 2023 • Tianjie Dai, Ruipeng Zhang, Feng Hong, Jiangchao Yao, Ya zhang, Yanfeng Wang
Vision-Language Pre-training (VLP), which utilizes multi-modal information to promote training efficiency and effectiveness, has achieved great success in vision recognition of natural domains and shown promise in medical imaging diagnosis for Chest X-Rays (CXRs).
1 code implementation • 18 Dec 2023 • Zexi Liu, Bohan Tang, Ziyuan Ye, Xiaowen Dong, Siheng Chen, Yanfeng Wang
Hypergraphs play a pivotal role in the modelling of data featuring higher-order relations involving more than two entities.
no code implementations • 10 Dec 2023 • Rui Ye, Xinyu Zhu, Jingyi Chai, Siheng Chen, Yanfeng Wang
In this paper, we propose a novel FL framework termed FedGC, designed to mitigate data heterogeneity issues by diversifying private data with generative content.
no code implementations • 10 Dec 2023 • Rui Ye, Yaxin Du, Zhenyang Ni, Siheng Chen, Yanfeng Wang
FedCOG consists of two key components at the client side: complementary data generation, which generates data extracted from the shared global model to complement the original dataset, and knowledge-distillation-based model training, which distills knowledge from the global model to the local model based on the generated data to mitigate over-fitting to the original heterogeneous dataset.
1 code implementation • NeurIPS 2023 • Zhihan Zhou, Jiangchao Yao, Feng Hong, Ya zhang, Bo Han, Yanfeng Wang
Self-supervised learning (SSL) as an effective paradigm of representation learning has achieved tremendous success on various curated datasets in diverse scenarios.
1 code implementation • 15 Oct 2023 • Chaoyi Wu, Jiayu Lei, Qiaoyu Zheng, Weike Zhao, Weixiong Lin, Xiaoman Zhang, Xiao Zhou, Ziheng Zhao, Ya zhang, Yanfeng Wang, Weidi Xie
Driven by the large foundation models, the development of artificial intelligence has witnessed tremendous progress lately, leading to a surge of general interest from the public.
no code implementations • 7 Oct 2023 • Yuchen Yang, Houqiang Li, Yanfeng Wang, Yu Wang
In this study, we introduce an uncertainty-aware in-context learning framework to empower the model to enhance or reject its output in response to uncertainty.
no code implementations • 26 Sep 2023 • Hongcheng Liu, Zhe Chen, Hui Li, Pingjie Wang, Yanfeng Wang, Yu Wang
Generating dialogue grounded in videos requires a high level of understanding and reasoning about the visual scenes in the videos.
1 code implementation • 13 Sep 2023 • Jiayu Lei, Lisong Dai, Haoyun Jiang, Chaoyi Wu, Xiaoman Zhang, Yao Zhang, Jiangchao Yao, Weidi Xie, Yanyong Zhang, Yuehua Li, Ya zhang, Yanfeng Wang
Magnetic resonance imaging (MRI) has played a crucial role in brain disease diagnosis, for which a range of computer-aided artificial intelligence methods have been proposed.
no code implementations • 5 Sep 2023 • Yusheng Liao, Yutong Meng, Hongcheng Liu, Yanfeng Wang, Yu Wang
A medical consultation training set is further constructed to improve the consultation ability of LLMs.
no code implementations • NeurIPS 2023 • Chaofan Ma, Yuhuan Yang, Chen Ju, Fei Zhang, Ya zhang, Yanfeng Wang
The results show the superior performance of attribute decomposition-aggregation.
1 code implementation • 20 Aug 2023 • Zihan Zhao, Yiyang Jiang, Heyang Liu, Yanfeng Wang, Yu Wang
While Large Language Models (LLMs) have demonstrated commendable performance across a myriad of domains and tasks, existing LLMs still exhibit a palpable deficit in handling multimodal functionalities, especially for the Spoken Question Answering (SQA) task which necessitates precise alignment and deep interaction between speech and text features.
no code implementations • 17 Aug 2023 • Feng Hong, Tianjie Dai, Jiangchao Yao, Ya zhang, Yanfeng Wang
Clinical classification of chest radiography is particularly challenging for standard machine learning algorithms due to its inherent long-tailed and multi-label nature.
1 code implementation • ICCV 2023 • Chenxin Xu, Robby T. Tan, Yuhong Tan, Siheng Chen, Xinchao Wang, Yanfeng Wang
To work with auxiliary tasks, we propose a novel auxiliary-adapted transformer, which can handle incomplete, corrupted motion data and achieve coordinate recovery via capturing spatial-temporal dependencies.
no code implementations • 9 Aug 2023 • Chaoqin Huang, Aofan Jiang, Ya zhang, Yanfeng Wang
Anomaly detection has gained considerable attention due to its broad range of applications, particularly in industrial defect detection.
1 code implementation • ICCV 2023 • Qingyao Xu, Weibo Mao, Jingze Gong, Chenxin Xu, Siheng Chen, Weidi Xie, Ya zhang, Yanfeng Wang
Multi-person motion prediction is a challenging problem due to the dependency of motion on both individual past movements and interactions with other people.
1 code implementation • 4 Aug 2023 • Chaoyi Wu, Xiaoman Zhang, Ya zhang, Yanfeng Wang, Weidi Xie
In this study, we aim to initiate the development of a Radiology Foundation Model, termed RadFM.
1 code implementation • 3 Aug 2023 • Aofan Jiang, Chaoqin Huang, Qing Cao, Shuang Wu, Zi Zeng, Kang Chen, Ya zhang, Yanfeng Wang
To address this challenge, this paper introduces a novel multi-scale cross-restoration framework for ECG anomaly detection and localization that considers both local and global ECG characteristics.
1 code implementation • 3 Aug 2023 • YuHang Zhou, Jiangchao Yao, Feng Hong, Ya zhang, Yanfeng Wang
By dynamically manipulating the gradient during training based on these factors, BDR can effectively alleviate knowledge destruction and improve knowledge reconstruction.
no code implementations • 25 Jul 2023 • Jinxiang Liu, Chen Ju, Chaofan Ma, Yanfeng Wang, Yu Wang, Ya zhang
The goal of the audio-visual segmentation (AVS) task is to segment the sounding objects in the video frames using audio cues.
no code implementations • 7 Jul 2023 • Chunhui Zhang, Xin Sun, Li Liu, Yiqian Yang, Qiong Liu, Xi Zhou, Yanfeng Wang
This approach achieves feature integration in a unified backbone, removing the need for carefully-designed fusion modules and resulting in a more effective and efficient VL tracking framework.
no code implementations • 5 Jul 2023 • Yuhuan Yang, Chaofan Ma, Chen Ju, Ya zhang, Yanfeng Wang
In this paper, we define a unified setting termed open-set semantic segmentation (O3S), which aims to learn seen and unseen semantics from both visual examples and textual names.
1 code implementation • 24 Jun 2023 • HaoNing Wu, Xiaoyun Zhang, Weidi Xie, Ya zhang, Yanfeng Wang
Video frame interpolation (VFI) is a challenging task that aims to generate intermediate frames between two consecutive frames in a video.
1 code implementation • 12 Jun 2023 • Yikun Liu, Jiangchao Yao, Ya zhang, Yanfeng Wang, Weidi Xie
In this paper, we consider the problem of composed image retrieval (CIR), which aims to train a model that can fuse multi-modal information, e.g., text and images, to accurately retrieve images that match the query, extending the user's expression ability.
Ranked #1 on Zero-Shot Composed Image Retrieval (ZS-CIR) on CIRR
no code implementations • 9 Jun 2023 • Lin Liu, Mingming Zhao, Shanxin Yuan, Wenlong Lyu, Wengang Zhou, Houqiang Li, Yanfeng Wang, Qi Tian
Specifically, Cube Mask Sampling Module (CMSM) is proposed to apply both spatial and channel mask sampling modeling to image compression in the pre-training stage.
1 code implementation • 1 Jun 2023 • Chang Liu, HaoNing Wu, Yujie Zhong, Xiaoyun Zhang, Yanfeng Wang, Weidi Xie
Generative models have recently exhibited exceptional capabilities in text-to-image generation, but still struggle to generate image sequences coherently.
1 code implementation • 30 May 2023 • Rui Ye, Mingkai Xu, Jianyu Wang, Chenxin Xu, Siheng Chen, Yanfeng Wang
However, based on our empirical observations and theoretical analysis, we find that the dataset size is not optimal and the discrepancy between local and global category distributions could be a beneficial and complementary indicator for determining aggregation weights.
2 code implementations • 17 May 2023 • Xiaoman Zhang, Chaoyi Wu, Ziheng Zhao, Weixiong Lin, Ya zhang, Yanfeng Wang, Weidi Xie
In this paper, we focus on the problem of Medical Visual Question Answering (MedVQA), which is crucial in efficiently interpreting medical images with vital clinic-relevant information.
Ranked #1 on Medical Visual Question Answering on PMC-VQA
1 code implementation • 27 Apr 2023 • Chaoyi Wu, Weixiong Lin, Xiaoman Zhang, Ya zhang, Yanfeng Wang, Weidi Xie
Our contributions are threefold: (i) we systematically investigate the process of adapting a general-purpose foundation language model towards the medical domain; this involves data-centric knowledge injection through the integration of 4.8M biomedical academic papers and 30K medical textbooks, as well as comprehensive fine-tuning for alignment with domain-specific instructions; (ii) we contribute a large-scale, comprehensive dataset for instruction tuning.
1 code implementation • CVPR 2023 • Yue Hu, Yifan Lu, Runsheng Xu, Weidi Xie, Siheng Chen, Yanfeng Wang
Camera-only 3D detection provides an economical solution with a simple configuration for localizing objects in 3D space compared to LiDAR-based detection systems.
no code implementations • 21 Mar 2023 • Chen Ju, Zeqian Li, Peisen Zhao, Ya zhang, Xiaopeng Zhang, Qi Tian, Yanfeng Wang, Weidi Xie
In this paper, we consider the problem of temporal action localization under low-shot (zero-shot and few-shot) scenarios, with the goal of detecting and classifying action instances from arbitrary categories within untrimmed videos, even those not seen at training time.
1 code implementation • CVPR 2023 • Chenxin Xu, Robby T. Tan, Yuhong Tan, Siheng Chen, Yu Guang Wang, Xinchao Wang, Yanfeng Wang
In motion prediction tasks, maintaining motion equivariance under Euclidean geometric transformations and invariance of agent interaction is a critical and fundamental principle.
Ranked #1 on Human Pose Forecasting on Human3.6M
1 code implementation • CVPR 2023 • Weibo Mao, Chenxin Xu, Qi Zhu, Siheng Chen, Yanfeng Wang
The core of the proposed LED is to leverage a trainable leapfrog initializer to directly learn an expressive multi-modal distribution of future trajectories, which skips a large number of denoising steps, significantly accelerating inference speed.
no code implementations • 19 Mar 2023 • Chaofan Ma, Qisen Xu, Xiangfeng Wang, Bo Jin, Xiaoyun Zhang, Yanfeng Wang, Ya zhang
Interactive segmentation has recently been explored to effectively and efficiently harvest high-quality segmentation masks by iteratively incorporating user hints.
no code implementations • 17 Mar 2023 • Chaofan Ma, Yuhuan Yang, Chen Ju, Fei Zhang, Jinxiang Liu, Yu Wang, Ya zhang, Yanfeng Wang
However, the challenges exist as there is one structural difference between generative and discriminative models, which limits the direct use.
no code implementations • CVPR 2023 • Shaoheng Fang, Zi Wang, Yiqi Zhong, Junhao Ge, Siheng Chen, Yanfeng Wang
Second, a spatial-temporal pyramid transformer is introduced to comprehensively extract multi-scale BEV features and predict future BEV states with the support of spatial-temporal priors.
Ranked #2 on Bird's-Eye View Semantic Segmentation on nuScenes (IoU ped - 224x480 - Vis filter. - 100x100 at 0.5 metric)
1 code implementation • CVPR 2023 • Zhixin Wang, Xiaoyun Zhang, Ziying Zhang, Huangjie Zheng, Mingyuan Zhou, Ya zhang, Yanfeng Wang
However, it is expensive and infeasible to include every type of degradation to cover real-world cases in the training data.
1 code implementation • 13 Mar 2023 • Weixiong Lin, Ziheng Zhao, Xiaoman Zhang, Chaoyi Wu, Ya zhang, Yanfeng Wang, Weidi Xie
Foundation models trained on large-scale datasets have seen a recent surge in CV and NLP.
Ranked #3 on Medical Visual Question Answering on PMC-VQA
1 code implementation • 27 Feb 2023 • Xiaoman Zhang, Chaoyi Wu, Ya zhang, Yanfeng Wang, Weidi Xie
While multi-modal foundation models pre-trained on large-scale data have been successful in natural language understanding and vision recognition, their use in medical domains is still limited due to the fine-grained nature of medical tasks and the high demand for domain knowledge.
no code implementations • 22 Feb 2023 • Chaoyi Wu, Xiaoman Zhang, Yanfeng Wang, Ya zhang, Weidi Xie
In this paper, we consider the problem of disease diagnosis.
no code implementations • 20 Feb 2023 • Zihan Zhao, Yu Wang, Yanfeng Wang
Multimodal emotion recognition is a challenging research area that aims to fuse different modalities to predict human emotion.
1 code implementation • 10 Feb 2023 • Feng Hong, Jiangchao Yao, Zhihan Zhou, Ya zhang, Yanfeng Wang
The straightforward combination of LT and PLL, i.e., LT-PLL, suffers from a fundamental dilemma: LT methods build upon a given class distribution that is unavailable in PLL, and the performance of PLL is severely influenced in a long-tailed context.
1 code implementation • ICCV 2023 • Ziyi Li, Qinye Zhou, Xiaoyun Zhang, Ya zhang, Yanfeng Wang, Weidi Xie
The goal of this paper is to extract the visual-language correspondence from a pre-trained text-to-image diffusion model, in the form of a segmentation map, i.e., simultaneously generating images and segmentation masks for the corresponding visual entities described in the text prompt.
no code implementations • 9 Jan 2023 • Chaoyi Wu, Feng Chang, Xiao Su, Zhihan Wu, Yanfeng Wang, Ling Zhu, Ya zhang
The branch targets to solve a closely related task on the LN station level, i. e., classifying whether an LN station contains metastatic LN or not, so as to learn representations for LN stations.
no code implementations • 5 Jan 2023 • Chaoyi Wu, Xiaoman Zhang, Ya zhang, Yanfeng Wang, Weidi Xie
In this paper, we consider enhancing medical visual-language pre-training (VLP) with domain-specific knowledge, by exploiting the paired image-text reports from the radiological daily practice.
1 code implementation • CVPR 2023 • Ruipeng Zhang, Qinwei Xu, Jiangchao Yao, Ya zhang, Qi Tian, Yanfeng Wang
Federated Domain Generalization (FedDG) attempts to learn a global model in a privacy-preserving manner that generalizes well to new clients possibly with domain shift.
no code implementations • ICCV 2023 • Chaoyi Wu, Xiaoman Zhang, Ya zhang, Yanfeng Wang, Weidi Xie
In this paper, we consider enhancing medical visual-language pre-training (VLP) with domain-specific knowledge, by exploiting the paired image-text reports from the radiological daily practice.
no code implementations • CVPR 2023 • Chen Ju, Kunhao Zheng, Jinxiang Liu, Peisen Zhao, Ya zhang, Jianlong Chang, Yanfeng Wang, Qi Tian
And as a result, the dual-branch complementarity is effectively fused to promote a strong alliance.
Weakly Supervised Temporal Action Localization
1 code implementation • 14 Dec 2022 • Ziqing Fan, Yanfeng Wang, Jiangchao Yao, Lingjuan Lyu, Ya zhang, Qi Tian
However, in addition to previous explorations for improvement in federated averaging, our analysis shows that another critical bottleneck is the poorer optima of client models in more heterogeneous conditions.
1 code implementation • 14 Nov 2022 • Yifan Lu, Quanhao Li, Baoan Liu, Mehrdad Dianati, Chen Feng, Siheng Chen, Yanfeng Wang
Collaborative 3D object detection exploits information exchange among multiple agents to enhance the accuracy of object detection in the presence of sensor impairments such as occlusion.
no code implementations • 31 Oct 2022 • Enpei Zhang, Shuo Tang, Xiaowen Dong, Siheng Chen, Yanfeng Wang
To fill this gap, we propose a distributed multi-agent learning model inspired by human collaboration, in which the agents can autonomously detect suitable collaborators and refer to collaborators' models for better performance.
1 code implementation • 27 Oct 2022 • Chaofan Ma, Yuhuan Yang, Yanfeng Wang, Ya zhang, Weidi Xie
When trained at a sufficient scale, self-supervised learning has exhibited a notable ability to solve a wide range of visual or language understanding tasks.
no code implementations • 18 Oct 2022 • Yangheng Zhao, Jun Wang, Xiaolong Li, Yue Hu, Ce Zhang, Yanfeng Wang, Siheng Chen
Instead of learning a single prototype for each class, in this paper, we propose to use an adaptive number of prototypes to dynamically describe the different point patterns within a semantic class.
Ranked #17 on 3D Semantic Segmentation on SemanticKITTI
no code implementations • 7 Oct 2022 • Qinye Zhou, Ziyi Li, Weidi Xie, Xiaoyun Zhang, Ya zhang, Yanfeng Wang
Existing super-resolution models are often specialized for one scale, fundamentally limiting their use in practical scenarios.
no code implementations • 23 Aug 2022 • Lin Liu, Junfeng An, Jianzhuang Liu, Shanxin Yuan, Xiangyu Chen, Wengang Zhou, Houqiang Li, Yanfeng Wang, Qi Tian
Low-light video enhancement (LLVE) is an important yet challenging task with many applications such as photographing and autonomous driving.
no code implementations • 11 Jul 2022 • Zihan Zhao, Yanfeng Wang, Yu Wang
The research and applications of multimodal emotion recognition have become increasingly popular recently.
no code implementations • 11 Jul 2022 • Bohan Tang, Yiqi Zhong, Chenxin Xu, Wei-Tao Wu, Ulrich Neumann, Yanfeng Wang, Ya zhang, Siheng Chen
Further, we apply the proposed framework to current SOTA multi-agent multi-modal forecasting systems as a plugin module, which enables the SOTA systems to 1) estimate the uncertainty in the multi-agent multi-modal trajectory forecasting task; 2) rank the multiple predictions and select the optimal one based on the estimated uncertainty.
1 code implementation • 29 Jun 2022 • Yongjun Jiang, Jian Yu, Wenwen Yang, Bihong Zhang, Yanfeng Wang
To the best of our knowledge, the proposed Nextformer model achieves SOTA results on AISHELL-1 (CER 4.06%) and WenetSpeech (CER 7.56%/11.29%).
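For readers unfamiliar with the metric quoted above, character error rate (CER) is conventionally the character-level edit distance between hypothesis and reference, divided by the reference length. The following is a minimal sketch of that standard definition only; the function names `edit_distance` and `cer` are illustrative, and the scoring scripts actually used for AISHELL-1 or WenetSpeech may apply text normalization that is not shown here.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two character sequences."""
    # prev[j] holds the distance between the processed prefix of ref
    # and the first j characters of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference character
                curr[j - 1] + 1,     # insertion of a hypothesis character
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance normalized by reference length."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return edit_distance(ref, hyp) / len(ref)
```

Multiplying the returned value by 100 gives a percentage comparable in form to the figures reported above; corpus-level CER is usually computed by summing edit distances and reference lengths over all utterances before dividing, rather than averaging per-utterance rates.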
1 code implementation • 14 Jun 2022 • Ziheng Zhao, Tianjiao Zhang, Weidi Xie, Yanfeng Wang, Ya zhang
This paper considers the problem of undersampled MRI reconstruction.
1 code implementation • 25 May 2022 • Zhihan Zhou, Jiangchao Yao, Yanfeng Wang, Bo Han, Ya zhang
Different from previous works, we explore this direction from an alternative perspective, i.e., the data perspective, and propose a novel Boosted Contrastive Learning (BCL) method.
no code implementations • 13 May 2022 • Chaoqin Huang, Qinwei Xu, Yanfeng Wang, Yu Wang, Ya zhang
To extend the reconstruction-based anomaly detection architecture to the localized anomalies, we propose a self-supervised learning approach through random masking and then restoring, named Self-Supervised Masking (SSM) for unsupervised anomaly detection and localization.
1 code implementation • 7 Dec 2021 • Xiaohang Bian, Bo Qin, Xiaozhe Xin, Jianwu Li, Xuefeng Su, Yanfeng Wang
Handwritten mathematical expression recognition aims to automatically generate LaTeX sequences from given images.
no code implementations • 7 Sep 2021 • Xiaoman Zhang, Weidi Xie, Chaoqin Huang, Yanfeng Wang, Ya zhang, Xin Chen, Qi Tian
In this paper, we target self-supervised representation learning for zero-shot tumor segmentation.
no code implementations • 25 Aug 2021 • Maosen Li, Siheng Chen, Yangheng Zhao, Ya zhang, Yanfeng Wang, Qi Tian
The core of MST-GNN is a multiscale spatio-temporal graph that explicitly models the relations in motions at various spatial and temporal scales.
no code implementations • 11 Aug 2021 • Hao Wu, Jiangchao Yao, Ya zhang, Yanfeng Wang
Learning with noisy labels has gained enormous interest in the robust deep learning area.
no code implementations • 5 Aug 2021 • Shixiang Feng, YuHang Zhou, Xiaoman Zhang, Ya zhang, Yanfeng Wang
A novel Multi-teacher Single-student Knowledge Distillation (MS-KD) framework is proposed, where the teacher models are pre-trained single-organ segmentation networks, and the student model is a multi-organ segmentation network.
1 code implementation • CVPR 2021 • Qinwei Xu, Ruipeng Zhang, Ya zhang, Yanfeng Wang, Qi Tian
Modern deep neural networks suffer from performance degradation when evaluated on testing data under different distributions from training data.
no code implementations • ICCV 2021 • Ruolin Ye, Wenqiang Xu, Zhendong Xue, Tutian Tang, Yanfeng Wang, Cewu Lu
Besides, we also report the hand and object pose errors with existing baselines and show that the dataset can serve as the video demonstrations for robot imitation learning on the handover task.
no code implementations • 31 Mar 2021 • Hao Wu, Jiangchao Yao, Jiajie Wang, Yinru Chen, Ya zhang, Yanfeng Wang
Deep neural networks (DNNs) have the capacity to fit extremely noisy labels; nonetheless, they tend to learn data with clean labels first and then memorize those with noisy labels.
no code implementations • ICCV 2021 • Chen Ju, Peisen Zhao, Siheng Chen, Ya zhang, Yanfeng Wang, Qi Tian
Single-frame temporal action localization (STAL) aims to localize actions in untrimmed videos with only one timestamp annotation for each action instance.
1 code implementation • LREC 2022 • Wenhao Zhu, ShuJian Huang, Tong Pu, Pingxuan Huang, Xu Zhang, Jian Yu, Wei Chen, Yanfeng Wang, Jiajun Chen
Previous research for adapting a general neural machine translation (NMT) model into a specific domain usually neglects the diversity in translation within the same domain, which is a core problem for domain adaptation in real-world scenarios.
no code implementations • 15 Dec 2020 • Chen Ju, Peisen Zhao, Ya zhang, Yanfeng Wang, Qi Tian
Point-Level temporal action localization (PTAL) aims to localize actions in untrimmed videos with only one timestamp annotation for each action instance.
Ranked #3 on Weakly Supervised Action Localization on BEOID
no code implementations • 18 Nov 2020 • Peisen Zhao, Lingxi Xie, Ya zhang, Yanfeng Wang, Qi Tian
Knowledge distillation is employed to transfer the privileged information from the offline teacher to the online student.
Ranked #11 on Online Action Detection on TVSeries
no code implementations • 13 Oct 2020 • Xiaoman Zhang, Shixiang Feng, YuHang Zhou, Ya zhang, Yanfeng Wang
We demonstrate the effectiveness of our methods on two downstream tasks: i) Brain tumor segmentation, ii) Pancreas tumor segmentation.
no code implementations • 26 Jun 2019 • Yifeng Li, Lingxi Xie, Ya zhang, Rui Zhang, Yanfeng Wang, Qi Tian
Generating and eliminating adversarial examples has been an intriguing topic in the field of deep learning.