Search Results for author: Chen Chen

Found 374 papers, 170 papers with code

Rule Based Event Extraction for Artificial Social Intelligence

no code implementations PANDL (COLING) 2022 Remo Nitschke, Yuwei Wang, Chen Chen, Adarsh Pyarelal, Rebecca Sharp

Natural language (as opposed to structured communication modes such as Morse code) is by far the most common mode of communication between humans, and can thus provide significant insight into both individual mental states and interpersonal dynamics.

Event Extraction

ProSparse: Introducing and Enhancing Intrinsic Activation Sparsity within Large Language Models

1 code implementation 21 Feb 2024 Chenyang Song, Xu Han, Zhengyan Zhang, Shengding Hu, Xiyu Shi, Kuai Li, Chen Chen, Zhiyuan Liu, Guangli Li, Tao Yang, Maosong Sun

Some recent efforts have explored introducing ReLU or its variants as the substitutive activation function to help LLMs achieve activation sparsity and inference acceleration, but few can simultaneously obtain high sparsity and comparable model performance.

CodaMal: Contrastive Domain Adaptation for Malaria Detection in Low-Cost Microscopes

1 code implementation 16 Feb 2024 Ishan Rajendrakumar Dave, Tristan de Blegiers, Chen Chen, Mubarak Shah

Annotating images from LCM significantly increases the burden on medical experts compared to annotating images from high-cost microscopes (HCM).

Domain Adaptation object-detection +1

GenTranslate: Large Language Models are Generative Multilingual Speech and Machine Translators

1 code implementation 10 Feb 2024 Yuchen Hu, Chen Chen, Chao-Han Huck Yang, Ruizhe Li, Dong Zhang, Zhehuai Chen, Eng Siong Chng

Leveraging the rich linguistic knowledge and strong reasoning abilities of LLMs, our new paradigm can integrate the rich information in N-best candidates to generate a higher-quality translation result.

Machine Translation Translation

It's Never Too Late: Fusing Acoustic Information into Large Language Models for Automatic Speech Recognition

no code implementations 8 Feb 2024 Chen Chen, Ruizhe Li, Yuchen Hu, Sabato Marco Siniscalchi, Pin-Yu Chen, EnSiong Chng, Chao-Han Huck Yang

Recent studies have shown that large language models (LLMs) can be successfully used for generative error correction (GER) on top of the automatic speech recognition (ASR) output.

Audio-Visual Speech Recognition Automatic Speech Recognition +3

Tensor Completion via Integer Optimization

no code implementations 6 Feb 2024 Xin Chen, Sukanya Kudva, Yongzheng Dai, Anil Aswani, Chen Chen

The main challenge with the tensor completion problem is a fundamental tension between computation power and the information-theoretic sample complexity rate.

Learning Semantic Proxies from Visual Prompts for Parameter-Efficient Fine-Tuning in Deep Metric Learning

1 code implementation 4 Feb 2024 Li Ren, Chen Chen, Liqiang Wang, Kien Hua

As a result of the success of recent pre-trained models trained from larger-scale datasets, it is challenging to adapt the model to the DML tasks in the local data domain while retaining the previously gained knowledge.

Metric Learning

Learning Mutual Excitation for Hand-to-Hand and Human-to-Human Interaction Recognition

no code implementations 4 Feb 2024 Mengyuan Liu, Chen Chen, Songtao Wu, Fanyang Meng, Hong Liu

Recognizing interactive actions, including hand-to-hand interaction and human-to-human interaction, has attracted increasing attention for various applications in the field of video analysis and human-robot interaction.

Action Recognition Human Interaction Recognition

Dream360: Diverse and Immersive Outdoor Virtual Scene Creation via Transformer-Based 360 Image Outpainting

no code implementations 19 Jan 2024 Hao Ai, Zidong Cao, Haonan Lu, Chen Chen, Jian Ma, Pengyuan Zhou, Tae-Kyun Kim, Pan Hui, Lin Wang

To this end, we propose a transformer-based 360 image outpainting framework called Dream360, which can generate diverse, high-fidelity, and high-resolution panoramas from user-selected viewports, considering the spherical properties of 360 images.

Image Outpainting

Large Language Models are Efficient Learners of Noise-Robust Speech Recognition

1 code implementation 19 Jan 2024 Yuchen Hu, Chen Chen, Chao-Han Huck Yang, Ruizhe Li, Chao Zhang, Pin-Yu Chen, EnSiong Chng

To this end, we propose to extract a language-space noise embedding from the N-best list to represent the noise conditions of source speech, which can promote the denoising process in GER.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +6

Enhanced Few-Shot Class-Incremental Learning via Ensemble Models

no code implementations 14 Jan 2024 Mingli Zhu, Zihao Zhu, Sihong Chen, Chen Chen, Baoyuan Wu

To tackle the overfitting challenge, we design a new ensemble model framework combined with data augmentation to boost generalization.

Data Augmentation Few-Shot Class-Incremental Learning +2

Sports-QA: A Large-Scale Video Question Answering Benchmark for Complex and Professional Sports

1 code implementation 3 Jan 2024 Haopeng Li, Andong Deng, Qiuhong Ke, Jun Liu, Hossein Rahmani, Yulan Guo, Bernt Schiele, Chen Chen

Reasoning over sports videos for question answering is an important task with numerous applications, such as player training and information retrieval.

Action Understanding counterfactual +4

NID-SLAM: Neural Implicit Representation-based RGB-D SLAM in dynamic environments

no code implementations 2 Jan 2024 Ziheng Xu, Jianwei Niu, Qingfeng Li, Tao Ren, Chen Chen

In this paper we present NID-SLAM, which significantly improves the performance of neural SLAM in dynamic environments.

Towards Improved Proxy-based Deep Metric Learning via Data-Augmented Domain Adaptation

1 code implementation 1 Jan 2024 Li Ren, Chen Chen, Liqiang Wang, Kien Hua

Our experiments on benchmarks, including the popular CUB-200-2011, CARS196, Stanford Online Products, and In-Shop Clothes Retrieval, show that our learning algorithm significantly improves the existing proxy losses and achieves superior results compared to the existing methods.

Domain Adaptation Metric Learning +1

Adaptive FSS: A Novel Few-Shot Segmentation Framework via Prototype Enhancement

2 code implementations 25 Dec 2023 Jing Wang, Jinagyun Li, Chen Chen, Yisi Zhang, Haoran Shen, Tianxiang Zhang

In this paper, we propose a novel framework based on the adapter mechanism, namely Adaptive FSS, which can efficiently adapt the existing FSS model to the novel classes.

Meta-Learning

Free-Editor: Zero-shot Text-driven 3D Scene Editing

no code implementations 21 Dec 2023 Nazmul Karim, Umar Khalid, Hasan Iqbal, Jing Hua, Chen Chen

To date, editing 3D scenes requires either re-training the model to adapt to various 3D edited scenes or designing specific methods for each special editing type.

3D scene Editing Style Transfer +1

GCNext: Towards the Unity of Graph Convolutions for Human Motion Prediction

1 code implementation 19 Dec 2023 Xinshun Wang, Qiongjie Cui, Chen Chen, Mengyuan Liu

The past few years have witnessed the dominance of Graph Convolutional Networks (GCNs) over human motion prediction. Various styles of graph convolutions have been proposed, with each one meticulously designed and incorporated into a carefully-crafted network architecture.

Human motion prediction motion prediction +1

LatentEditor: Text Driven Local Editing of 3D Scenes

1 code implementation 14 Dec 2023 Umar Khalid, Hasan Iqbal, Nazmul Karim, Jing Hua, Chen Chen

Our approach achieves faster editing speeds and superior output quality compared to existing 3D editing models, bridging the gap between textual instructions and high-quality 3D scene editing in latent space.

3D scene Editing Denoising

IL-NeRF: Incremental Learning for Neural Radiance Fields with Camera Pose Alignment

no code implementations 10 Dec 2023 Letian Zhang, Ming Li, Chen Chen, Jie Xu

This poses a paradox as the necessary camera pose must be estimated from the entire dataset, even though the data arrives sequentially and future chunks are inaccessible.

Incremental Learning Knowledge Distillation

Skeleton-in-Context: Unified Skeleton Sequence Modeling with In-Context Learning

1 code implementation 6 Dec 2023 Xinshun Wang, Zhongbin Fang, Xia Li, Xiangtai Li, Chen Chen, Mengyuan Liu

Under this setting, the model can perceive tasks from prompts and accomplish them without any extra task-specific head predictions or model fine-tuning.

In-Context Learning motion prediction +1

MMM: Generative Masked Motion Model

1 code implementation 6 Dec 2023 Ekkasit Pinyoanuntapong, Pu Wang, Minwoo Lee, Chen Chen

MMM consists of two key components: (1) a motion tokenizer that transforms 3D human motion into a sequence of discrete tokens in latent space, and (2) a conditional masked motion transformer that learns to predict randomly masked motion tokens, conditioned on the pre-computed text tokens.

Motion Synthesis

OST: Refining Text Knowledge with Optimal Spatio-Temporal Descriptor for General Video Recognition

1 code implementation 30 Nov 2023 Tongjia Chen, Hongshan Yu, Zhengeng Yang, Zechuan Li, Wei Sun, Chen Chen

Due to the resource-intensive nature of training vision-language models on expansive video data, a majority of studies have centered on adapting pre-trained image-language models to the video domain.

Descriptive Language Modelling +5

LucidDreaming: Controllable Object-Centric 3D Generation

no code implementations 30 Nov 2023 Zhaoning Wang, Ming Li, Chen Chen

Nonetheless, achieving precise control over 3D generation continues to be an arduous task, as using text to control often leads to missing objects and imprecise locations.

Benchmarking Language Modelling +3

Decouple Content and Motion for Conditional Image-to-Video Generation

no code implementations 24 Nov 2023 Cuifeng Shen, Yulu Gan, Chen Chen, Xiongwei Zhu, Lele Cheng, Tingting Gao, Jinzhi Wang

The goal of conditional image-to-video (cI2V) generation is to create a believable new video by beginning with the condition, i.e., one image and text. The previous cI2V generation methods conventionally perform in RGB pixel space, with limitations in modeling motion consistency and visual continuity.

Image to Video Generation

FBChain: A Blockchain-based Federated Learning Model with Efficiency and Secure Communication

no code implementations 21 Nov 2023 Yang Li, Chunhe Xia, Wei Liu, Weidong Zhou, Chen Chen, Tianbo Wang

This article proposes a Blockchain-based Federated Learning (FBChain) model for federated learning parameter communication to overcome the above two problems.

Federated Learning

Supported Trust Region Optimization for Offline Reinforcement Learning

no code implementations 15 Nov 2023 Yixiu Mao, Hongchang Zhang, Chen Chen, Yi Xu, Xiangyang Ji

Offline reinforcement learning suffers from the out-of-distribution issue and extrapolation error.

reinforcement-learning

MCAD: Multi-teacher Cross-modal Alignment Distillation for efficient image-text retrieval

no code implementations 30 Oct 2023 Youbo Lei, Feifei He, Chen Chen, Yingbin Mo, Si Jia Li, Defeng Xie, Haonan Lu

With the success of large-scale visual-language pretraining models and the wide application of image-text retrieval in industry areas, reducing the model size and streamlining their terminal-device deployment have become urgently necessary.

Retrieval Text Retrieval

Med-DANet V2: A Flexible Dynamic Architecture for Efficient Medical Volumetric Segmentation

no code implementations 28 Oct 2023 Haoran Shen, Yifu Zhang, Wenxuan Wang, Chen Chen, Jing Liu, Shanshan Song, Jiangyun Li

As a pioneering work, a dynamic architecture network for medical volumetric segmentation (i.e., Med-DANet) has achieved a favorable accuracy and efficiency trade-off by dynamically selecting a suitable 2D candidate model from the pre-defined model bank for different slices.

Computational Efficiency MRI segmentation +2

Knowledge Editing for Large Language Models: A Survey

no code implementations 24 Oct 2023 Song Wang, Yaochen Zhu, Haochen Liu, Zaiyi Zheng, Chen Chen, Jundong Li

Afterward, we provide an innovative taxonomy of KME techniques based on how the new knowledge is introduced into pre-trained LLMs, and investigate existing KME strategies while analyzing key insights, advantages, and limitations of methods from each category.

knowledge editing

Adversarial Attacks on Fairness of Graph Neural Networks

1 code implementation 20 Oct 2023 Binchi Zhang, Yushun Dong, Chen Chen, Yada Zhu, Minnan Luo, Jundong Li

Fairness-aware graph neural networks (GNNs) have gained a surge of attention as they can reduce the bias of predictions on any demographic group (e.g., female) in graph-based applications.

Fairness

Lifelong Sequence Generation with Dynamic Module Expansion and Adaptation

no code implementations 15 Oct 2023 Chengwei Qin, Chen Chen, Shafiq Joty

Inspired by the learning paradigm of humans, we propose Dynamic Module Expansion and Adaptation (DMEA), which enables the model to dynamically determine the architecture for acquiring new knowledge based on task correlation and select the most similar previous tasks to facilitate adaptation to new tasks.

Continual Learning Transfer Learning

Beyond Sharing Weights in Decoupling Feature Learning Network for UAV RGB-Infrared Vehicle Re-Identification

no code implementations 12 Oct 2023 Xingyue Liu, Jiahao Qi, Chen Chen, Kangcheng Bin, Ping Zhong

Moreover, to meet cross-modality discrepancy and orientation discrepancy challenges, we present a hybrid weights decoupling network (HWDNet) to learn the shared discriminative orientation-invariant features.

Vehicle Re-Identification

Adaptive Quantization for Key Generation in Low-Power Wide-Area Networks

no code implementations 11 Oct 2023 Chen Chen, Junqing Zhang, Yingying Chen

Physical layer key generation based on reciprocal and random wireless channels has been an attractive solution for securing resource-constrained low-power wide-area networks (LPWANs).

Quantization

STAG: Enabling Low Latency and Low Staleness of GNN-based Services with Dynamic Graphs

no code implementations 27 Sep 2023 Jiawen Wang, Quan Chen, Deze Zeng, Zhuo Song, Chen Chen, Minyi Guo

With the collaborative serving mechanism, only a subset of node representations is updated during the update phase, and the final representations are calculated in the inference phase.

HyPoradise: An Open Baseline for Generative Speech Recognition with Large Language Models

1 code implementation NeurIPS 2023 Chen Chen, Yuchen Hu, Chao-Han Huck Yang, Sabato Marco Siniscalchi, Pin-Yu Chen, Eng Siong Chng

We make our results publicly accessible for reproducible pipelines with released pre-trained models, thus providing a new evaluation paradigm for ASR error correction with LLMs.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

ConPET: Continual Parameter-Efficient Tuning for Large Language Models

1 code implementation 26 Sep 2023 Chenyang Song, Xu Han, Zheni Zeng, Kuai Li, Chen Chen, Zhiyuan Liu, Maosong Sun, Tao Yang

First, Static ConPET can adapt former continual learning methods originally designed for relatively smaller models to LLMs through PET and a dynamic replay strategy, which largely reduces the tuning costs and alleviates the over-fitting and forgetting issue.

Continual Learning

Regress Before Construct: Regress Autoencoder for Point Cloud Self-supervised Learning

1 code implementation 25 Sep 2023 Yang Liu, Chen Chen, Can Wang, Xulin King, Mengyuan Liu

The proposed method decouples functions between the decoder and the encoder by introducing a mask regressor, which predicts the masked patch representation from the visible patch representation encoded by the encoder, and the decoder reconstructs the target from the predicted masked patch representation.

Few-Shot 3D Point Cloud Classification Representation Learning +1

Towards Surveillance Video-and-Language Understanding: New Dataset, Baselines, and Challenges

no code implementations 25 Sep 2023 Tongtong Yuan, Xuange Zhang, Kun Liu, Bo Liu, Chen Chen, Jian Jin, Zhenzhen Jiao

Furthermore, we benchmark SOTA models for four multimodal tasks on this newly created dataset, which serve as new baselines for surveillance video-and-language understanding.

Anomaly Detection Dense Video Captioning +1

Counterfactual Conservative Q Learning for Offline Multi-agent Reinforcement Learning

1 code implementation NeurIPS 2023 Jianzhun Shao, Yun Qu, Chen Chen, Hongchang Zhang, Xiangyang Ji

Offline multi-agent reinforcement learning is challenging due to the coupling of the distribution shift issue common in the offline setting and the high-dimensionality issue common in the multi-agent setting, which makes the action out-of-distribution (OOD) and value overestimation phenomena excessively severe.

counterfactual Multi-agent Reinforcement Learning +3

RenderIH: A Large-scale Synthetic Dataset for 3D Interacting Hand Pose Estimation

1 code implementation ICCV 2023 Lijun Li, Linrui Tian, Xindi Zhang, Qi Wang, Bang Zhang, Mengyuan Liu, Chen Chen

The current interacting hand (IH) datasets are relatively simplistic in terms of background and texture, with hand joints being annotated by a machine annotator, which may result in inaccuracies, and the diversity of pose distribution is limited.

3D Interacting Hand Pose Estimation Hand Pose Estimation

SoccerNet 2023 Challenges Results

2 code implementations 12 Sep 2023 Anthony Cioppa, Silvio Giancola, Vladimir Somers, Floriane Magera, Xin Zhou, Hassan Mkhallati, Adrien Deliège, Jan Held, Carlos Hinojosa, Amir M. Mansourian, Pierre Miralles, Olivier Barnich, Christophe De Vleeschouwer, Alexandre Alahi, Bernard Ghanem, Marc Van Droogenbroeck, Abdullah Kamal, Adrien Maglo, Albert Clapés, Amr Abdelaziz, Artur Xarles, Astrid Orcesi, Atom Scott, Bin Liu, Byoungkwon Lim, Chen Chen, Fabian Deuser, Feng Yan, Fufu Yu, Gal Shitrit, Guanshuo Wang, Gyusik Choi, Hankyul Kim, Hao Guo, Hasby Fahrudin, Hidenari Koguchi, Håkan Ardö, Ibrahim Salah, Ido Yerushalmy, Iftikar Muhammad, Ikuma Uchida, Ishay Be'ery, Jaonary Rabarisoa, Jeongae Lee, Jiajun Fu, Jianqin Yin, Jinghang Xu, Jongho Nang, Julien Denize, Junjie Li, Junpei Zhang, Juntae Kim, Kamil Synowiec, Kenji Kobayashi, Kexin Zhang, Konrad Habel, Kota Nakajima, Licheng Jiao, Lin Ma, Lizhi Wang, Luping Wang, Menglong Li, Mengying Zhou, Mohamed Nasr, Mohamed Abdelwahed, Mykola Liashuha, Nikolay Falaleev, Norbert Oswald, Qiong Jia, Quoc-Cuong Pham, Ran Song, Romain Hérault, Rui Peng, Ruilong Chen, Ruixuan Liu, Ruslan Baikulov, Ryuto Fukushima, Sergio Escalera, Seungcheon Lee, Shimin Chen, Shouhong Ding, Taiga Someya, Thomas B. Moeslund, Tianjiao Li, Wei Shen, Wei zhang, Wei Li, Wei Dai, Weixin Luo, Wending Zhao, Wenjie Zhang, Xinquan Yang, Yanbiao Ma, Yeeun Joo, Yingsen Zeng, Yiyang Gan, Yongqiang Zhu, Yujie Zhong, Zheng Ruan, Zhiheng Li, Zhijian Huang, Ziyu Meng

More information on the tasks, challenges, and leaderboards is available at https://www.soccer-net.org.

Action Spotting Camera Calibration +3

MoEController: Instruction-based Arbitrary Image Manipulation with Mixture-of-Expert Controllers

no code implementations 8 Sep 2023 Sijia Li, Chen Chen, Haonan Lu

In this work, we propose a method with mixture-of-expert (MOE) controllers to align the text-guided capacity of diffusion models with different kinds of human instructions, enabling our model to handle various open-domain image manipulation tasks with natural language instructions.

Image Generation Image Manipulation

Uncertainty Aware Training to Improve Deep Learning Model Calibration for Classification of Cardiac MR Images

no code implementations 29 Aug 2023 Tareen Dawood, Chen Chen, Baldeep S. Sidhua, Bram Ruijsink, Justin Goulda, Bradley Porter, Mark K. Elliott, Vishal Mehta, Christopher A. Rinaldi, Esther Puyol-Anton, Reza Razavi, Andrew P. King

The best-performing model in terms of both classification accuracy and the most common calibration measure, expected calibration error (ECE), was the Confidence Weight method, a novel approach that weights the loss of samples to explicitly penalise confident incorrect predictions.

GeoDTR+: Toward generic cross-view geolocalization via geometric disentanglement

no code implementations 18 Aug 2023 Xiaohan Zhang, Xingyu Li, Waqas Sultani, Chen Chen, Safwan Wshah

We attribute this deficiency to the lack of ability to extract the geometric layout of visual features and models' overfitting to low-level details.

Attribute Disentanglement

FedPerfix: Towards Partial Model Personalization of Vision Transformers in Federated Learning

1 code implementation ICCV 2023 Guangyu Sun, Matias Mendieta, Jun Luo, Shandong Wu, Chen Chen

Personalized Federated Learning (PFL) represents a promising solution for decentralized learning in heterogeneous data environments.

Personalized Federated Learning

Pseudo-label Alignment for Semi-supervised Instance Segmentation

1 code implementation ICCV 2023 Jie Hu, Chen Chen, Liujuan Cao, Shengchuan Zhang, Annan Shu, Guannan Jiang, Rongrong Ji

Through extensive experiments conducted on the COCO and Cityscapes datasets, we demonstrate that PAIS is a promising framework for semi-supervised instance segmentation, particularly in cases where labeled data is severely limited.

Instance Segmentation Pseudo Label +3

A Safe DRL Method for Fast Solution of Real-Time Optimal Power Flow

no code implementations 7 Aug 2023 Pengfei Wu, Chen Chen, Dexiang Lai, Jian Zhong

Instead of integrating the constraint violation penalty with the reward function, its actor gradients are estimated by a Lagrange advantage function which is derived from two critic systems based on economic reward and violation cost.

Source-free Domain Adaptive Human Pose Estimation

1 code implementation ICCV 2023 Qucheng Peng, Ce Zheng, Chen Chen

To this end, we propose a new task, named source-free domain adaptive HPE, which aims to address the challenges of cross-domain learning of HPE without access to source data during the adaptation process.

Contrastive Learning Domain Adaptation +1

Learning Snippet-to-Motion Progression for Skeleton-based Human Motion Prediction

no code implementations 26 Jul 2023 Xinshun Wang, Qiongjie Cui, Chen Chen, Shen Zhao, Mengyuan Liu

Existing Graph Convolutional Networks for human motion prediction largely adopt a one-step scheme, which outputs the prediction straight from the history input, failing to exploit human motion patterns.

Human motion prediction motion prediction +1

Mystique: Deconstructing SVG Charts for Layout Reuse

no code implementations 25 Jul 2023 Chen Chen, Bongshin Lee, Yunhai Wang, Yunjeong Chang, Zhicheng Liu

To facilitate the reuse of existing charts, previous research has examined how to obtain a semantic understanding of a chart by deconstructing its visual representation into reusable components, such as encodings.

Subject-Diffusion: Open Domain Personalized Text-to-Image Generation without Test-time Fine-tuning

1 code implementation 21 Jul 2023 Jian Ma, Junhao Liang, Chen Chen, Haonan Lu

In this paper, we propose Subject-Diffusion, a novel open-domain personalized image generation model that, in addition to not requiring test-time fine-tuning, also only requires a single reference image to support personalized generation of single- or multi-subject in any domain.

Diffusion Personalization Tuning Free Test +1

AlignDet: Aligning Pre-training and Fine-tuning in Object Detection

1 code implementation ICCV 2023 Ming Li, Jie Wu, Xionghui Wang, Chen Chen, Jie Qin, Xuefeng Xiao, Rui Wang, Min Zheng, Xin Pan

To this end, we propose AlignDet, a unified pre-training framework that can be adapted to various existing detectors to alleviate the discrepancies.

object-detection Object Detection

M-FLAG: Medical Vision-Language Pre-training with Frozen Language Models and Latent Space Geometry Optimization

1 code implementation 17 Jul 2023 Che Liu, Sibo Cheng, Chen Chen, Mengyun Qiao, Weitong Zhang, Anand Shah, Wenjia Bai, Rossella Arcucci

The proposed method, named Medical vision-language pre-training with Frozen language models and Latent spAce Geometry optimization (M-FLAG), leverages a frozen language model for training stability and efficiency and introduces a novel orthogonality loss to harmonize the latent space geometry.

Image Classification Language Modelling +3

Noise-aware Speech Enhancement using Diffusion Probabilistic Model

1 code implementation 16 Jul 2023 Yuchen Hu, Chen Chen, Ruizhe Li, Qiushi Zhu, Eng Siong Chng

Specifically, we design a noise classification (NC) model to produce acoustic embedding as a noise conditioner for guiding the reverse denoising process.

Denoising Multi-Task Learning +2

Separate-and-Aggregate: A Transformer-based Patch Refinement Model for Knowledge Graph Completion

no code implementations 11 Jul 2023 Chen Chen, YuFei Wang, Yang Zhang, Quan Z. Sheng, Kwok-Yan Lam

Previous KGC methods typically represent knowledge graph entities and relations as trainable continuous embeddings and fuse the embeddings of the entity $h$ (or $t$) and relation $r$ into hidden representations of query $(h, r, ?

Inductive Bias Relation

DIAGNOSIS: Detecting Unauthorized Data Usages in Text-to-image Diffusion Models

1 code implementation 6 Jul 2023 Zhenting Wang, Chen Chen, Lingjuan Lyu, Dimitris N. Metaxas, Shiqing Ma

To address this issue, we propose a method for detecting such unauthorized data usage by planting the injected memorization into the text-to-image diffusion models trained on the protected dataset.

Memorization

Pay Attention to the Atlas: Atlas-Guided Test-Time Adaptation Method for Robust 3D Medical Image Segmentation

no code implementations 2 Jul 2023 Jingjie Guo, Weitong Zhang, Matthew Sinclair, Daniel Rueckert, Chen Chen

In addition, different from most existing TTA methods which restrict the adaptation to batch normalization blocks in the segmentation network only, we further exploit the use of channel and spatial attention blocks for improved adaptability at test time.

Image Segmentation Medical Image Segmentation +4

When Foundation Model Meets Federated Learning: Motivations, Challenges, and Future Directions

no code implementations 27 Jun 2023 Weiming Zhuang, Chen Chen, Lingjuan Lyu

The intersection of the Foundation Model (FM) and Federated Learning (FL) provides mutual benefits, presents a unique opportunity to unlock new possibilities in AI research, and addresses critical challenges in AI and real-world applications.

Federated Learning Privacy Preserving

First Place Solution to the CVPR'2023 AQTC Challenge: A Function-Interaction Centric Approach with Spatiotemporal Visual-Language Alignment

1 code implementation 23 Jun 2023 Tom Tongjia Chen, Hongshan Yu, Zhengeng Yang, Ming Li, Zechuan Li, Jingwen Wang, Wei Miao, Wei Sun, Chen Chen

Affordance-Centric Question-driven Task Completion (AQTC) has been proposed to acquire knowledge from videos to furnish users with comprehensive and systematic instructions.

Human-Object Interaction Detection

Hearing Lips in Noise: Universal Viseme-Phoneme Mapping and Transfer for Robust Audio-Visual Speech Recognition

1 code implementation 18 Jun 2023 Yuchen Hu, Ruizhe Li, Chen Chen, Chengwei Qin, Qiushi Zhu, Eng Siong Chng

In this work, we investigate the noise-invariant visual modality to strengthen the robustness of AVSR, which can adapt to any testing noise without depending on noisy training data, a.k.a. unsupervised noise adaptation.

Audio-Visual Speech Recognition speech-recognition +1

Federated Few-shot Learning

1 code implementation 17 Jun 2023 Song Wang, Xingbo Fu, Kaize Ding, Chen Chen, Huiyuan Chen, Jundong Li

In this way, the server can exploit the computational power of all clients and train the model on a larger set of data samples among all clients.

Federated Learning Few-Shot Learning

MOFI: Learning Image Representations from Noisy Entity Annotated Images

no code implementations 13 Jun 2023 Wentao Wu, Aleksei Timofeev, Chen Chen, BoWen Zhang, Kun Duan, Shuangning Liu, Yantao Zheng, Jon Shlens, Xianzhi Du, Zhe Gan, Yinfei Yang

Further experiments on zero-shot and linear probe image classification also show that MOFI outperforms a CLIP model trained on the original image-text data, demonstrating the effectiveness of the I2E dataset in learning strong image representations.

Image Classification Image Retrieval +3

Unsupervised Anomaly Detection in Medical Images Using Masked Diffusion Model

1 code implementation 31 May 2023 Hasan Iqbal, Umar Khalid, Jing Hua, Chen Chen

It can be challenging to identify brain MRI anomalies using supervised deep-learning techniques due to anatomical heterogeneity and the requirement for pixel-level labeling.

Anatomy Unsupervised Anomaly Detection

SAVE: Spectral-Shift-Aware Adaptation of Image Diffusion Models for Text-driven Video Editing

1 code implementation 30 May 2023 Nazmul Karim, Umar Khalid, Mohsen Joneidi, Chen Chen, Nazanin Rahnavard

Text-to-Image (T2I) diffusion models have achieved remarkable success in synthesizing high-quality images conditioned on text prompts.

Style Transfer Video Editing

Alteration-free and Model-agnostic Origin Attribution of Generated Images

no code implementations 29 May 2023 Zhenting Wang, Chen Chen, Yi Zeng, Lingjuan Lyu, Shiqing Ma

To overcome this problem, we first develop an alteration-free and model-agnostic origin attribution method via input reverse-engineering on image generation models, i.e., inverting the input of a particular model for a specific image.

Image Generation

A Neural State-Space Model Approach to Efficient Speech Separation

1 code implementation 26 May 2023 Chen Chen, Chao-Han Huck Yang, Kai Li, Yuchen Hu, Pin-Jui Ku, Eng Siong Chng

In this work, we introduce S4M, a new efficient speech separation framework based on neural state-space models (SSM).

Representation Learning Speech Separation

CN-Celeb-AV: A Multi-Genre Audio-Visual Dataset for Person Recognition

no code implementations 25 May 2023 Lantian Li, Xiaolou Li, Haoyu Jiang, Chen Chen, Ruihai Hou, Dong Wang

A comprehensive study was conducted to compare CN-Celeb-AV with two popular public AVPR benchmark datasets, and the results demonstrated that CN-Celeb-AV is more in line with real-world scenarios and can be regarded as a new benchmark dataset for AVPR research.

Person Recognition

RaSa: Relation and Sensitivity Aware Representation Learning for Text-based Person Search

1 code implementation 23 May 2023 Yang Bai, Min Cao, Daming Gao, Ziqiang Cao, Chen Chen, Zhenfeng Fan, Liqiang Nie, Min Zhang

RA offsets the overfitting risk by introducing a novel positive relation detection task (i.e., learning to distinguish strong and weak positive pairs).

Person Search Relation +2

DiffHand: End-to-End Hand Mesh Reconstruction via Diffusion Models

no code implementations 23 May 2023 Lijun Li, Li'an Zhuo, Bang Zhang, Liefeng Bo, Chen Chen

Hand mesh reconstruction from a monocular image is a challenging task due to depth ambiguity and severe occlusion; there remains a non-unique mapping between the monocular image and the hand mesh.

Denoising Test

Compositional Text-to-Image Synthesis with Attention Map Control of Diffusion Models

1 code implementation 23 May 2023 Ruichen Wang, Zekang Chen, Chen Chen, Jian Ma, Haonan Lu, Xiaodong Lin

Our approach produces a more semantically accurate synthesis by constraining the attention regions of each token in the prompt to the image.

Attribute Image Generation

Text-based Person Search without Parallel Image-Text Data

no code implementations 22 May 2023 Yang Bai, Jingyao Wang, Min Cao, Chen Chen, Ziqiang Cao, Liqiang Nie, Min Zhang

Text-based person search (TBPS) aims to retrieve the images of the target person from a large image gallery based on a given natural language description.

Image Captioning Language Modelling +4

CM-MaskSD: Cross-Modality Masked Self-Distillation for Referring Image Segmentation

no code implementations 19 May 2023 Wenxuan Wang, Jing Liu, Xingjian He, Yisi Zhang, Chen Chen, Jiachen Shen, Yan Zhang, Jiangyun Li

Referring image segmentation (RIS) is a fundamental vision-language task that intends to segment a desired object from an image based on a given natural language expression.

Image Segmentation Segmentation +1

Cross-Modal Global Interaction and Local Alignment for Audio-Visual Speech Recognition

1 code implementation 16 May 2023 Yuchen Hu, Ruizhe Li, Chen Chen, Heqing Zou, Qiushi Zhu, Eng Siong Chng

However, most existing AVSR approaches simply fuse the audio and visual features by concatenation, without explicit interactions to capture the deep correlations between them, which results in sub-optimal multimodal representations for the downstream speech recognition task.

Audio-Visual Speech Recognition Automatic Speech Recognition +3

UniS-MMC: Multimodal Classification via Unimodality-supervised Multimodal Contrastive Learning

1 code implementation 16 May 2023 Heqing Zou, Meng Shen, Chen Chen, Yuchen Hu, Deepu Rajan, Eng Siong Chng

Multimodal learning aims to imitate human beings to acquire complementary information from multiple modalities for various downstream tasks.

Contrastive Learning Image-text Classification +2

Less is More: Removing Text-regions Improves CLIP Training Efficiency and Robustness

1 code implementation 8 May 2023 Liangliang Cao, BoWen Zhang, Chen Chen, Yinfei Yang, Xianzhi Du, Wencong Zhang, Zhiyun Lu, Yantao Zheng

In this paper, we discuss two effective approaches to improve the efficiency and robustness of CLIP training: (1) augmenting the training dataset while maintaining the same number of optimization steps, and (2) filtering out samples that contain text regions in the image.

Adversarial Text Retrieval

Spatial-Temporal Networks for Antibiogram Pattern Prediction

no code implementations 2 May 2023 Xingbo Fu, Chen Chen, Yushun Dong, Anil Vullikanti, Eili Klein, Gregory Madden, Jundong Li

In this paper, we propose a novel problem of antibiogram pattern prediction that aims to predict which patterns will appear in the future.

Part Aware Contrastive Learning for Self-Supervised Action Recognition

1 code implementation 1 May 2023 Yilei Hua, Wenhan Wu, Ce Zheng, Aidong Lu, Mengyuan Liu, Chen Chen, Shiqian Wu

This paper proposes an attention-based contrastive learning framework for skeleton representation learning, called SkeAttnCLR, which integrates local similarity and global features for skeleton-based action representations.

Contrastive Learning Data Augmentation +3

Secret Key Generation for IRS-Assisted Multi-Antenna Systems: A Machine Learning-Based Approach

no code implementations 28 Apr 2023 Chen Chen, Junqing Zhang, Tianyu Lu, Magnus Sandell, Liquan Chen

Different from most previous works that adopt iterative optimisation to solve the problem, the proposed DNN-based algorithm directly obtains the BS precoding and IRS phase shifts as the output of the DNN.

Edit Everything: A Text-Guided Generative System for Images Editing

1 code implementation 27 Apr 2023 Defeng Xie, Ruichen Wang, Jian Ma, Chen Chen, Haonan Lu, Dong Yang, Fobo Shi, Xiaodong Lin

We introduce a new generative system called Edit Everything, which can take image and text inputs and produce image outputs.

Med-Tuning: Parameter-Efficient Transfer Learning with Fine-Grained Feature Enhancement for Medical Volumetric Segmentation

no code implementations 21 Apr 2023 Wenxuan Wang, Jiachen Shen, Chen Chen, Jianbo Jiao, Jing Liu, Yan Zhang, Shanshan Song, Jiangyun Li

In this paper, we present a study on parameter-efficient transfer learning for medical volumetric segmentation and propose a new framework named Med-Tuning based on intra-stage feature enhancement and inter-stage feature interaction.

Segmentation Transfer Learning

FreMIM: Fourier Transform Meets Masked Image Modeling for Medical Image Segmentation

1 code implementation 21 Apr 2023 Wenxuan Wang, Jing Wang, Chen Chen, Jianbo Jiao, Yuanxiu Cai, Shanshan Song, Jiangyun Li

The research community has witnessed the powerful potential of self-supervised Masked Image Modeling (MIM), which enables models to learn visual representations from unlabeled data.

Image Segmentation Medical Image Segmentation +2

Wav2code: Restore Clean Speech Representations via Codebook Lookup for Noise-Robust ASR

no code implementations 11 Apr 2023 Yuchen Hu, Chen Chen, Qiushi Zhu, Eng Siong Chng

Second, during finetuning we propose a Transformer-based code predictor to accurately predict clean codes by modeling the global dependency of input noisy representations, which enables discovery and restoration of high-quality clean representations without distortions.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Graph-Guided MLP-Mixer for Skeleton-Based Human Motion Prediction

no code implementations 7 Apr 2023 Xinshun Wang, Qiongjie Cui, Chen Chen, Shen Zhao, Mengyuan Liu

In recent years, Graph Convolutional Networks (GCNs) have been widely used in human motion prediction, but their performance remains unsatisfactory.

Human motion prediction Human Pose Forecasting +1

$R^{2}$Former: Unified $R$etrieval and $R$eranking Transformer for Place Recognition

no code implementations 6 Apr 2023 Sijie Zhu, Linjie Yang, Chen Chen, Mubarak Shah, Xiaohui Shen, Heng Wang

Visual Place Recognition (VPR) estimates the location of query images by matching them with images in a reference database.

Feature Correlation Retrieval +1

TopNet: Transformer-based Object Placement Network for Image Compositing

no code implementations CVPR 2023 Sijie Zhu, Zhe Lin, Scott Cohen, Jason Kuen, Zhifei Zhang, Chen Chen

Given a background image and a segmented object, the goal is to train a model to predict plausible placements (location and scale) of the object for compositing.

Object

Towards Adversarially Robust Continual Learning

no code implementations 31 Mar 2023 Tao Bai, Chen Chen, Lingjuan Lyu, Jun Zhao, Bihan Wen

Recent studies show that models trained by continual learning can achieve performance comparable to standard supervised learning, and the learning flexibility of continual learning models enables their wide application in the real world.

Adversarial Robustness Continual Learning

GlyphDraw: Seamlessly Rendering Text with Intricate Spatial Structures in Text-to-Image Generation

3 code implementations 31 Mar 2023 Jian Ma, Mingjun Zhao, Chen Chen, Ruichen Wang, Di Niu, Haonan Lu, Xiaodong Lin

Recent breakthroughs in the field of language-guided image generation have yielded impressive achievements, enabling the creation of high-quality and diverse images based on user instructions. Although the synthesis performance is fascinating, one significant limitation of current image generation models is their insufficient ability to generate text coherently within images, particularly for complex glyph structures like Chinese characters.

Optical Character Recognition (OCR) Text-to-Image Generation

PoseFormerV2: Exploring Frequency Domain for Efficient and Robust 3D Human Pose Estimation

2 code implementations CVPR 2023 Qitao Zhao, Ce Zheng, Mengyuan Liu, Pichao Wang, Chen Chen

However, in real scenarios, the performance of PoseFormer and its follow-ups is limited by two factors: (a) The length of the input joint sequence; (b) The quality of 2D joint detection.

3D Human Pose Estimation Human Dynamics

A Large-scale Study of Spatiotemporal Representation Learning with a New Benchmark on Action Recognition

1 code implementation ICCV 2023 Andong Deng, Taojiannan Yang, Chen Chen

The goal of building a benchmark (suite of datasets) is to provide a unified protocol for fair evaluation and thus facilitate the evolution of a specific area.

Action Recognition Representation Learning +3

DiffMesh: A Motion-aware Diffusion-like Framework for Human Mesh Recovery from Videos

no code implementations 23 Mar 2023 Ce Zheng, Xianpeng Liu, Mengyuan Liu, Tianfu Wu, Guo-Jun Qi, Chen Chen

While image-based HMR methods have achieved impressive results, they often struggle to recover humans in dynamic scenarios, leading to temporal inconsistencies and non-smooth 3D motion predictions due to the absence of human motion.

3D Human Pose Estimation Human Mesh Recovery

POTTER: Pooling Attention Transformer for Efficient Human Mesh Recovery

1 code implementation CVPR 2023 Ce Zheng, Xianpeng Liu, Guo-Jun Qi, Chen Chen

In this paper, we propose a pure transformer architecture named POoling aTtention TransformER (POTTER) for the HMR task from single images.

3D Human Pose Estimation Human Mesh Recovery

MSINet: Twins Contrastive Search of Multi-Scale Interaction for Object ReID

1 code implementation CVPR 2023 Jianyang Gu, Kai Wang, Hao Luo, Chen Chen, Wei Jiang, Yuqiang Fang, Shanghang Zhang, Yang You, Jian Zhao

Neural Architecture Search (NAS) has become increasingly appealing to the object Re-Identification (ReID) community, because task-specific architectures significantly improve retrieval performance.

Image Classification Neural Architecture Search +3

TARGET: Federated Class-Continual Learning via Exemplar-Free Distillation

1 code implementation ICCV 2023 Jie Zhang, Chen Chen, Weiming Zhuang, LingJuan Lv

This paper focuses on an under-explored yet important problem: Federated Class-Continual Learning (FCCL), where new classes are dynamically added in federated learning.

Continual Learning Federated Learning

A Pathway Towards Responsible AI Generated Content

no code implementations 2 Mar 2023 Chen Chen, Jie Fu, Lingjuan Lyu

AI Generated Content (AIGC) has received tremendous attention within the past few years, with content generated in the format of image, text, audio, video, etc.

Misinformation

Metric-oriented Speech Enhancement using Diffusion Probabilistic Model

no code implementations 23 Feb 2023 Chen Chen, Yuchen Hu, Weiwei Weng, Eng Siong Chng

Deep neural network based speech enhancement techniques focus on learning a noisy-to-clean transformation supervised by paired training data.

Speech Enhancement

Unsupervised Noise adaptation using Data Simulation

no code implementations 23 Feb 2023 Chen Chen, Yuchen Hu, Heqing Zou, Linhui Sun, Eng Siong Chng

Deep neural network based speech enhancement approaches aim to learn a noisy-to-clean transformation using a supervised learning paradigm.

Domain Adaptation Generative Adversarial Network +1

Unifying Speech Enhancement and Separation with Gradient Modulation for End-to-End Noise-Robust Speech Separation

1 code implementation 22 Feb 2023 Yuchen Hu, Chen Chen, Heqing Zou, Xionghu Zhong, Eng Siong Chng

To alleviate this problem, we propose a novel network to unify speech enhancement and separation with gradient modulation to improve noise-robustness.

Multi-Task Learning Speech Enhancement +2

Delving into Identify-Emphasize Paradigm for Combating Unknown Bias

no code implementations 22 Feb 2023 Bowen Zhao, Chen Chen, Qian-Wei Wang, Anfeng He, Shu-Tao Xia

For challenge B, we point out that the gradient contribution statistics can be a reliable indicator to inspect whether the optimization is dominated by bias-aligned samples.

Gradient Remedy for Multi-Task Learning in End-to-End Noise-Robust Speech Recognition

1 code implementation 22 Feb 2023 Yuchen Hu, Chen Chen, Ruizhe Li, Qiushi Zhu, Eng Siong Chng

In this paper, we propose a simple yet effective approach called gradient remedy (GR) to solve interference between task gradients in noise-robust speech recognition, from perspectives of both angle and magnitude.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

Delving into the Adversarial Robustness of Federated Learning

no code implementations 19 Feb 2023 Jie Zhang, Bo Li, Chen Chen, Lingjuan Lyu, Shuang Wu, Shouhong Ding, Chao Wu

In this work, we propose a novel algorithm called Decision Boundary based Federated Adversarial Training (DBFAT), which consists of two components (local re-weighting and global regularization) to improve both accuracy and robustness of FL systems.

Adversarial Robustness Federated Learning

Byzantine-Robust Learning on Heterogeneous Data via Gradient Splitting

1 code implementation 13 Feb 2023 Yuchen Liu, Chen Chen, Lingjuan Lyu, Fangzhao Wu, Sai Wu, Gang Chen

In order to address this issue, we propose GAS, an approach that can successfully adapt existing robust AGRs to non-IID settings.

Federated Learning

Towards Geospatial Foundation Models via Continual Pretraining

1 code implementation ICCV 2023 Matias Mendieta, Boran Han, Xingjian Shi, Yi Zhu, Chen Chen

Geospatial technologies are becoming increasingly essential in our world for a wide range of applications, including agriculture, urban planning, and disaster response.

Change Detection Continual Pretraining +4

AIM: Adapting Image Models for Efficient Video Action Recognition

1 code implementation 6 Feb 2023 Taojiannan Yang, Yi Zhu, Yusheng Xie, Aston Zhang, Chen Chen, Mu Li

Recent vision transformer based video models mostly follow the "image pre-training then finetuning" paradigm and have achieved great success on multiple video benchmarks.

 Ranked #1 on Action Recognition on Diving-48 (using extra training data)

Action Classification Action Recognition +2

Dynamic Ensemble of Low-fidelity Experts: Mitigating NAS "Cold-Start"

1 code implementation 2 Feb 2023 Junbo Zhao, Xuefei Ning, Enshu Liu, Binxin Ru, Zixuan Zhou, Tianchen Zhao, Chen Chen, Jiajin Zhang, Qingmin Liao, Yu Wang

In the first step, we train different sub-predictors on different types of available low-fidelity information to extract beneficial knowledge as low-fidelity experts.

Neural Architecture Search

Filtering Context Mitigates Scarcity and Selection Bias in Political Ideology Prediction

no code implementations 1 Feb 2023 Chen Chen, Dylan Walker, Venkatesh Saligrama

We propose a novel supervised learning approach for political ideology prediction (PIP) that is capable of predicting out-of-distribution inputs.

Selection bias

GaitSADA: Self-Aligned Domain Adaptation for mmWave Gait Recognition

1 code implementation 31 Jan 2023 Ekkasit Pinyoanuntapong, Ayman Ali, Kalvik Jakkala, Pu Wang, Minwoo Lee, Qucheng Peng, Chen Chen, Zhi Sun

mmWave radar-based gait recognition is a novel user identification method that captures human gait biometrics from mmWave radar return signals.

Contrastive Learning Domain Adaptation +1

DELTA: degradation-free fully test-time adaptation

no code implementations 30 Jan 2023 Bowen Zhao, Chen Chen, Shu-Tao Xia

However, we find that two unfavorable defects are concealed in the prevalent adaptation methodologies like test-time batch normalization (BN) and self-learning.

Self-Learning Test

The Exploration of Knowledge-Preserving Prompts for Document Summarisation

no code implementations 27 Jan 2023 Chen Chen, Wei Emma Zhang, Alireza Seyed Shakeri, Makhmoor Fiza

Despite the great development of document summarisation techniques nowadays, factual inconsistencies between the generated summaries and the original texts still occur from time to time.

Document Summarization

Plan To Predict: Learning an Uncertainty-Foreseeing Model for Model-Based Reinforcement Learning

1 code implementation 20 Jan 2023 Zifan Wu, Chao Yu, Chen Chen, Jianye Hao, Hankz Hankui Zhuo

In Model-based Reinforcement Learning (MBRL), model learning is critical since an inaccurate model can bias policy learning via generating misleading samples.

Decision Making Model-based Reinforcement Learning

Machine Learning-Based Secret Key Generation for IRS-assisted Multi-antenna Systems

no code implementations 19 Jan 2023 Chen Chen, Junqing Zhang, Tianyu Lu, Magnus Sandell, Liquan Chen

Different from most previous works that adopt iterative optimization to solve the problem, the proposed DNN-based algorithm directly obtains the BS precoding and IRS phase shifts as the output of the DNN.

Few-shot Node Classification with Extremely Weak Supervision

1 code implementation 6 Jan 2023 Song Wang, Yushun Dong, Kaize Ding, Chen Chen, Jundong Li

Recent few-shot node classification methods typically learn from classes with abundant labeled nodes (i.e., meta-training classes) and then generalize to classes with limited labeled nodes (i.e., meta-test classes).

Classification Meta-Learning +2

Private Image Generation With Dual-Purpose Auxiliary Classifier

no code implementations CVPR 2023 Chen Chen, Daochang Liu, Siqi Ma, Surya Nepal, Chang Xu

However, apart from this standard utility, we identify the "reversed utility" as another crucial aspect, which computes the accuracy on generated data of a classifier trained using real data, dubbed real2gen accuracy (r2g%).

Image Generation Privacy Preserving

Reconciling Object-Level and Global-Level Objectives for Long-Tail Detection

1 code implementation ICCV 2023 Shaoyu Zhang, Chen Chen, Silong Peng

Specifically, complementary to the object-level classification loss for model discrimination, we design a generalized average precision (GAP) loss to explicitly optimize the global-level score ranking across different objects.

Multi-Task Learning Object

Dynamic Graph Learning With Content-Guided Spatial-Frequency Relation Reasoning for Deepfake Detection

no code implementations CVPR 2023 YuAn Wang, Kun Yu, Chen Chen, Xiyuan Hu, Silong Peng

To address this issue, we propose a Spatial-Frequency Dynamic Graph method to exploit the relation-aware features in spatial and frequency domains via dynamic graph learning.

DeepFake Detection Face Generation +3

When Do Curricula Work in Federated Learning?

no code implementations ICCV 2023 Saeed Vahidian, Sreevatsank Kadaveru, Woonjoon Baek, Weijia Wang, Vyacheslav Kungurtsev, Chen Chen, Mubarak Shah, Bill Lin

Specifically, we aim to investigate how ordered learning principles can contribute to alleviating the heterogeneity effects in FL.

Federated Learning

Context Label Learning: Improving Background Class Representations in Semantic Segmentation

1 code implementation 16 Dec 2022 Zeju Li, Konstantinos Kamnitsas, Cheng Ouyang, Chen Chen, Ben Glocker

The results demonstrate that CoLab can guide the segmentation model to map the logits of background samples away from the decision boundary, resulting in significantly improved segmentation accuracy.

Segmentation Semantic Segmentation

TencentPretrain: A Scalable and Flexible Toolkit for Pre-training Models of Different Modalities

3 code implementations 13 Dec 2022 Zhe Zhao, Yudong Li, Cheng Hou, Jing Zhao, Rong Tian, Weijie Liu, Yiren Chen, Ningyuan Sun, Haoyan Liu, Weiquan Mao, Han Guo, Weigang Guo, Taiqiang Wu, Tao Zhu, Wenhang Shi, Chen Chen, Shan Huang, Sihong Chen, Liqun Liu, Feifei Li, Xiaoshuai Chen, Xingwu Sun, Zhanhui Kang, Xiaoyong Du, Linlin Shen, Kimmo Yan

The proposed pre-training models of different modalities are showing a rising trend of homogeneity in their model structures, which brings the opportunity to implement different pre-training models within a uniform framework.

PGFed: Personalize Each Client's Global Objective for Federated Learning

1 code implementation ICCV 2023 Jun Luo, Matias Mendieta, Chen Chen, Shandong Wu

Based on our observation, in this work, we propose Personalized Global Federated Learning (PGFed), a novel personalized FL framework that enables each client to personalize its own global objective by explicitly and adaptively aggregating the empirical risks of itself and other clients.

Personalized Federated Learning Transfer Learning

State-Aware Proximal Pessimistic Algorithms for Offline Reinforcement Learning

no code implementations 28 Nov 2022 Chen Chen, Hongyao Tang, Yi Ma, Chao Wang, Qianli Shen, Dong Li, Jianye Hao

The key idea of SA-PP is leveraging discounted stationary state distribution ratios between the learning policy and the offline dataset to modulate the degree of behavior regularization in a state-wise manner, so that pessimism can be implemented in a more appropriate way.

Offline RL Q-Learning +2

Refined Semantic Enhancement towards Frequency Diffusion for Video Captioning

1 code implementation 28 Nov 2022 Xian Zhong, Zipeng Li, Shuqin Chen, Kui Jiang, Chen Chen, Mang Ye

In this paper, we introduce a novel Refined Semantic enhancement method towards Frequency Diffusion (RSFD), a captioning model that constantly perceives the linguistic representation of the infrequent tokens.

FAD Video Captioning

Accelerated Nonnegative Tensor Completion via Integer Programming

1 code implementation 28 Nov 2022 Wenhao Pan, Anil Aswani, Chen Chen

A recent approach, based on integer programming, resolves this tension for nonnegative tensor completion.

Language-Assisted Deep Learning for Autistic Behaviors Recognition

no code implementations 17 Nov 2022 Andong Deng, Taojiannan Yang, Chen Chen, Qian Chen, Leslie Neely, Sakiko Oyama

In such cases, automatic recognition systems based on computer vision and machine learning (in particular deep learning) technology can alleviate this issue to a large extent.

Action Recognition Multimodal Deep Learning +1

Revisiting Training-free NAS Metrics: An Efficient Training-based Method

1 code implementation 16 Nov 2022 Taojiannan Yang, Linjie Yang, Xiaojie Jin, Chen Chen

In this paper, we revisit these training-free metrics and find that: (1) the number of parameters (#Param), which is the most straightforward training-free metric, is overlooked in previous works but is surprisingly effective, (2) recent training-free metrics largely rely on the #Param information to rank networks.

Neural Architecture Search

GaitMixer: Skeleton-based Gait Representation Learning via Wide-spectrum Multi-axial Mixer

1 code implementation 27 Oct 2022 Ekkasit Pinyoanuntapong, Ayman Ali, Pu Wang, Minwoo Lee, Chen Chen

Most existing gait recognition methods are appearance-based, which rely on the silhouettes extracted from the video data of human walking activities.

Multiview Gait Recognition Representation Learning

Graph Few-shot Learning with Task-specific Structures

1 code implementation 21 Oct 2022 Song Wang, Chen Chen, Jundong Li

Therefore, to adaptively learn node representations across meta-tasks, we propose a novel framework that learns a task-specific structure for each meta-task.

Classification Few-Shot Learning +2

End-to-End Context-Aided Unicity Matching for Person Re-identification

no code implementations 20 Oct 2022 Min Cao, Cong Ding, Chen Chen, Junchi Yan, Qi Tian

Based on a natural assumption that images belonging to the same person identity should not match with images belonging to multiple different person identities across views, called the unicity of person matching on the identity level, we propose an end-to-end person unicity matching architecture for learning and refining the person matching relations.

Graph Matching Person Re-Identification

Enhance Sample Efficiency and Robustness of End-to-end Urban Autonomous Driving via Semantic Masked World Model

no code implementations 8 Oct 2022 Zeyu Gao, Yao Mu, Ruoyan Shen, Chen Chen, Yangang Ren, Jianyu Chen, Shengbo Eben Li, Ping Luo, YanFeng Lu

End-to-end autonomous driving provides a feasible way to automatically maximize overall driving system performance by directly mapping the raw pixels from a front-facing camera to control signals.

Autonomous Driving