Search Results for author: Hang Chen

Found 33 papers, 9 papers with code

A Study of Dropout-Induced Modality Bias on Robustness to Missing Video Frames for Audio-Visual Speech Recognition

1 code implementation • 7 Mar 2024 • Yusheng Dai, Hang Chen, Jun Du, Ruoyu Wang, Shihao Chen, Jiefeng Ma, Haotian Wang, Chin-Hui Lee

In this paper, we investigate this contrasting phenomenon from the perspective of modality bias and reveal that an excessive modality bias on the audio caused by dropout is the underlying reason.

Audio-Visual Speech Recognition Knowledge Distillation +2

Context-based and Diversity-driven Specificity in Compositional Zero-Shot Learning

no code implementations • 27 Feb 2024 • Yun Li, Zhe Liu, Hang Chen, Lina Yao

Our framework evaluates the specificity of attributes by considering the diversity of objects they apply to and their related context.

Attribute Compositional Zero-Shot Learning +1

Towards Causal Relationship in Indefinite Data: Baseline Model and New Datasets

1 code implementation • 16 Jan 2024 • Hang Chen, Xinyu Yang, Keqing Du

These highpoints make the probabilistic model capable of overcoming challenges brought by the coexistence of multi-structure data and multi-value representations and pave the way for the extension of latent confounders.

Causal Discovery Disentanglement

Discrete Messages Improve Communication Efficiency among Isolated Intelligent Agents

no code implementations • 26 Dec 2023 • Hang Chen, Yuchuan Jang, Weijie Zhou, Cristian Meo, Ziwei Chen, Dianbo Liu

Individuals, despite having varied life experiences and learning processes, can communicate effectively through languages.

CASR: Refining Action Segmentation via Marginalizing Frame-level Causal Relationships

no code implementations • 21 Nov 2023 • Keqing Du, Xinyu Yang, Hang Chen

CASR works by reducing the difference between the causal adjacency matrix we constructed and that of the pre-segmentation results of backbone models.

Action Segmentation Causal Discovery +1

A Review and Roadmap of Deep Causal Model from Different Causal Structures and Representations

no code implementations • 2 Nov 2023 • Hang Chen, Keqing Du, Chenguang Li, Xinyu Yang

The fusion of causal models with deep learning, which introduces increasingly intricate data sets such as the causal associations within images or between textual components, has surfaced as a focal research area.

Time Series

SSL Framework for Causal Inconsistency between Structures and Representations

no code implementations • 28 Oct 2023 • Hang Chen, Xinyu Yang, Keqing Du

The cross-pollination of deep learning and causal discovery has catalyzed a burgeoning field of research seeking to elucidate causal relationships within non-statistical data forms like images, videos, and text.

Causal Discovery Philosophy +1

Large Language Models in Finance: A Survey

no code implementations • 28 Sep 2023 • Yinheng Li, Shaofei Wang, Han Ding, Hang Chen

In this paper, we provide a practical survey focused on two key aspects of utilizing LLMs for financial tasks: existing solutions and guidance for adoption.

Few-Shot Learning

The Multimodal Information Based Speech Processing (MISP) 2023 Challenge: Audio-Visual Target Speaker Extraction

no code implementations • 15 Sep 2023 • Shilong Wu, Chenxi Wang, Hang Chen, Yusheng Dai, Chenyue Zhang, Ruoyu Wang, Hongbo Lan, Jun Du, Chin-Hui Lee, Jingdong Chen, Shinji Watanabe, Sabato Marco Siniscalchi, Odette Scharenborg, Zhong-Qiu Wang, Jia Pan, Jianqing Gao

This pioneering effort aims to set the first benchmark for the AVTSE task, offering fresh insights into enhancing the accuracy of back-end speech recognition systems through AVTSE in challenging and real acoustic environments.

Audio-Visual Speech Recognition speech-recognition +2

Hierarchical Audio-Visual Information Fusion with Multi-label Joint Decoding for MER 2023

no code implementations • 11 Sep 2023 • Haotian Wang, Yuxuan Xi, Hang Chen, Jun Du, Yan Song, Qing Wang, Hengshun Zhou, Chenxi Wang, Jiefeng Ma, Pengfei Hu, Ya Jiang, Shi Cheng, Jie Zhang, Yuzhe Weng

Three different structures based on attention-guided feature gathering (AFG) are designed for deep feature fusion.

Emotion Classification Multimodal Emotion Recognition +1

The USTC-NERCSLIP Systems for the CHiME-7 DASR Challenge

no code implementations • 28 Aug 2023 • Ruoyu Wang, Maokui He, Jun Du, Hengshun Zhou, Shutong Niu, Hang Chen, Yanyan Yue, Gaobin Yang, Shilong Wu, Lei Sun, Yanhui Tu, Haitao Tang, Shuangqing Qian, Tian Gao, Mengzhi Wang, Genshun Wan, Jia Pan, Jianqing Gao, Chin-Hui Lee

This technical report details our submission system to the CHiME-7 DASR Challenge, which focuses on speaker diarization and speech recognition under complex multi-speaker scenarios.

speaker-diarization Speaker Diarization +2

Improving Audio-Visual Speech Recognition by Lip-Subword Correlation Based Visual Pre-training and Cross-Modal Fusion Encoder

1 code implementation • 14 Aug 2023 • Yusheng Dai, Hang Chen, Jun Du, Xiaofei Ding, Ning Ding, Feijun Jiang, Chin-Hui Lee

In this paper, we propose two novel techniques to improve audio-visual speech recognition (AVSR) under a pre-training and fine-tuning training framework.

Audio-Visual Speech Recognition Automatic Speech Recognition +2

Semi-supervised multi-channel speaker diarization with cross-channel attention

no code implementations • 17 Jul 2023 • Shilong Wu, Jun Du, Maokui He, Shutong Niu, Hang Chen, Haitao Tang, Chin-Hui Lee

Most neural speaker diarization systems rely on sufficient manual training data labels, which are hard to collect under real-world scenarios.

speaker-diarization Speaker Diarization

Learning a Structural Causal Model for Intuition Reasoning in Conversation

1 code implementation • 28 May 2023 • Hang Chen, Bingyu Liao, Jing Luo, Wenjing Zhu, Xinyu Yang

Reasoning, a crucial aspect of NLP research, has not been adequately addressed by prevailing models, including large language models.

Causal Discovery Language Modelling +2

On the Importance of Backbone to the Adversarial Robustness of Object Detectors

no code implementations • 27 May 2023 • Xiao Li, Hang Chen, Xiaolin Hu

We argue that using adversarially pre-trained backbone networks is essential for enhancing the adversarial robustness of object detectors.

Adversarial Robustness Autonomous Driving +3

Towards Causal Representation Learning and Deconfounding from Indefinite Data

no code implementations • 4 May 2023 • Hang Chen, Xinyu Yang, Qing Yang

We implement the above designs as a dynamic variational inference model, tailored to learn causal representation from indefinite data under latent confounding.

Causal Discovery Disentanglement +1

How to Enhance Causal Discrimination of Utterances: A Case on Affective Reasoning

1 code implementation • 4 May 2023 • Hang Chen, Jing Luo, Xinyu Yang, Wenjing Zhu

noise terms into the conversation process, thereby constructing a structural causal model (SCM).

Causal Discovery Emotion-Cause Pair Extraction +1

An Audio-Visual Speech Separation Model Inspired by Cortico-Thalamo-Cortical Circuits

2 code implementations • 21 Dec 2022 • Kai Li, Fenghua Xie, Hang Chen, Kexin Yuan, Xiaolin Hu

Then, inspired by the large number of connections between cortical regions and the thalamus, the model fuses the auditory and visual information in a thalamic subnetwork through top-down connections.

Ranked #3 on Speech Separation on VoxCeleb2

Speech Separation

Dual adaptive training of photonic neural networks

1 code implementation • 9 Dec 2022 • Ziyang Zheng, Zhengyang Duan, Hang Chen, Rui Yang, Sheng Gao, Haiou Zhang, Hongkai Xiong, Xing Lin

Photonic neural network (PNN) is a remarkable analog artificial intelligence (AI) accelerator that computes with photons instead of electrons to feature low latency, high energy efficiency, and high parallelism.

Image Classification

Progressive Multi-Scale Self-Supervised Learning for Speech Recognition

no code implementations • 7 Dec 2022 • Genshun Wan, Tan Liu, Hang Chen, Jia Pan, Cong Liu, Zhongfu Ye

Self-supervised learning (SSL) models have achieved considerable improvements in automatic speech recognition (ASR).

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Improved Speech Pre-Training with Supervision-Enhanced Acoustic Unit

no code implementations • 7 Dec 2022 • Pengcheng Li, Genshun Wan, Fenglin Ding, Hang Chen, Jianqing Gao, Jia Pan, Cong Liu

Speech pre-training has shown great success in learning useful and general latent representations from large-scale unlabeled data.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Optical multi-task learning using multi-wavelength diffractive deep neural networks

no code implementations • 30 Nov 2022 • Zhengyang Duan, Hang Chen, Xing Lin

By encoding multi-task inputs into multi-wavelength channels, the system can increase the computing throughput and significantly alleviate the competition to perform multiple tasks in parallel with high accuracy.

Multi-Task Learning

Deep Learning Based Audio-Visual Multi-Speaker DOA Estimation Using Permutation-Free Loss Function

no code implementations • 26 Oct 2022 • Qing Wang, Hang Chen, Ya Jiang, Zhe Wang, Yuyang Wang, Jun Du, Chin-Hui Lee

In this paper, we propose a deep learning based multi-speaker direction of arrival (DOA) estimation method with audio and visual signals, using a permutation-free loss function.

Optical Neural Ordinary Differential Equations

no code implementations • 26 Sep 2022 • Yun Zhao, Hang Chen, Min Lin, Haiou Zhang, Tao Yan, Xing Lin, Ruqi Huang, Qionghai Dai

Increasing the number of layers in on-chip photonic neural networks (PNNs) is essential to improving their model performance.

Image Classification Trajectory Prediction

A Review and Roadmap of Deep Learning Causal Discovery in Different Variable Paradigms

no code implementations • 14 Sep 2022 • Hang Chen, Keqing Du, Xinyu Yang, Chenguang Li

Understanding causality helps to structure interventions to achieve specific goals and enables predictions under interventions.

Causal Discovery

Learning a General Clause-to-Clause Relationships for Enhancing Emotion-Cause Pair Extraction

no code implementations • 29 Aug 2022 • Hang Chen, Xinyu Yang, Xiang Li

To learn it applicably, we propose a general clause-level encoding model named EA-GAT comprising E-GAT and Activation Sort.

Emotion-Cause Pair Extraction

Look Closer to Segment Better: Boundary Patch Refinement for Instance Segmentation

2 code implementations • CVPR 2021 • Chufeng Tang, Hang Chen, Xiao Li, Jianmin Li, Zhaoxiang Zhang, Xiaolin Hu

Tremendous efforts have been made on instance segmentation but the mask quality is still not satisfactory.

Instance Segmentation Segmentation +1

Robust Blockchained Federated Learning with Model Validation and Proof-of-Stake Inspired Consensus

2 code implementations • 9 Jan 2021 • Hang Chen, Syed Ali Asif, Jihong Park, Chien-Chung Shen, Mehdi Bennis

Federated learning (FL) is a promising distributed learning solution that only exchanges model parameters without revealing raw data.

Federated Learning

Lip-reading with Hierarchical Pyramidal Convolution and Self-Attention

no code implementations • 28 Dec 2020 • Hang Chen, Jun Du, Yu Hu, Li-Rong Dai, Chin-Hui Lee, Bao-Cai Yin

In this paper, we propose a novel deep learning architecture to improve word-level lip-reading.

Lip Reading

Modified EP MIMO Detection Algorithm with Deep Learning Parameters Selection

no code implementations • 19 Oct 2020 • Hang Chen, Guoqiang Yao, Jianhao Hu

Based on the influence of moment matching and parameter selection on the performance of the EP MIMO detector, we propose a modified EP MIMO detector (MEPD).

Correlating Subword Articulation with Lip Shapes for Embedding Aware Audio-Visual Speech Enhancement

no code implementations • 21 Sep 2020 • Hang Chen, Jun Du, Yu Hu, Li-Rong Dai, Bao-Cai Yin, Chin-Hui Lee

We first extract visual embedding from lip frames using a pre-trained phone or articulation place recognizer for visual-only EASE (VEASE).

Speech Enhancement

What does the language of foods say about us?

no code implementations • WS 2019 • Hoang Van, Ahmad Musa, Hang Chen, Stephen Kobourov, Mihai Surdeanu

Second, we investigate the effect of socioeconomic factors (income, poverty, and education) on predicting state-level T2DM rates.

Assessment of central serous chorioretinopathy (CSC) depicted on color fundus photographs using deep Learning

no code implementations • 14 Jan 2019 • Yi Zhen, Hang Chen, Xu Zhang, Meng Liu, Xin Meng, Jian Zhang, Jiantao Pu

To investigate whether and to what extent central serous chorioretinopathy (CSC) depicted on color fundus photographs can be assessed using deep learning technology.
