Search Results for author: Xin Fang

Found 20 papers, 2 papers with code

MVANet: Multi-Stage Video Attention Network for Sound Event Localization and Detection with Source Distance Estimation

no code implementations • 21 Nov 2024 • Hengyi Hong, Qing Wang, Jun Du, Ruoyu Wei, Mingqi Cai, Xin Fang

We propose a novel output representation that combines the DOA with distance of sound sources by calculating the real Cartesian coordinates to address the newly introduced source distance estimation (SDE) task in the Detection and Classification of Acoustic Scenes and Events (DCASE) 2024 Challenge.

Data Augmentation, Sound Event Localization and Detection

The USTC-NERCSLIP Systems for The ICMC-ASR Challenge

no code implementations • 2 Jul 2024 • Minghui Wu, Luzhen Xu, Jie Zhang, Haitao Tang, Yanyan Yue, Ruizhi Liao, Jintao Zhao, Zhengzhe Zhang, Yichi Wang, Haoyin Yan, Hongliang Yu, Tongle Ma, Jiachen Liu, Chongliang Wu, Yongchao Li, Yanyong Zhang, Xin Fang, Yue Zhang

This report describes the submitted system to the In-Car Multi-Channel Automatic Speech Recognition (ICMC-ASR) challenge, which considers the ASR task with multi-speaker overlapping and Mandarin accent dynamics in the ICMC case.

Automatic Speech Recognition, Pseudo Label, +5

Equity-aware Load Shedding Optimization

no code implementations • 25 Jun 2024 • Xin Fang, Wenbo Wang, Fei Ding

Load shedding is usually the last resort to balance generation and demand to maintain stable operation of the electric grid after major disturbances.

Multitask frame-level learning for few-shot sound event detection

no code implementations • 17 Mar 2024 • Liang Zou, Genwei Yan, Ruoyu Wang, Jun Du, Meng Lei, Tian Gao, Xin Fang

This paper focuses on few-shot Sound Event Detection (SED), which aims to automatically recognize and classify sound events with limited samples.

Data Augmentation, Event Detection, +1

AST-SED: An Effective Sound Event Detection Method Based on Audio Spectrogram Transformer

no code implementations • 7 Mar 2023 • Kang Li, Yan Song, Li-Rong Dai, Ian McLoughlin, Xin Fang, Lin Liu

In this paper, we propose an effective sound event detection (SED) method based on the audio spectrogram transformer (AST) model, pretrained on the large-scale AudioSet for audio tagging (AT) task, termed AST-SED.

Audio Tagging, Decoder, +2

Deep Virtual-to-Real Distillation for Pedestrian Crossing Prediction

no code implementations • 2 Nov 2022 • Jie Bai, Xin Fang, Jianwu Fang, Jianru Xue, Changwei Yuan

To this end, we formulate a deep virtual to real distillation framework by introducing the synthetic data that can be generated conveniently, and borrow the abundant information of pedestrian movement in synthetic videos for the pedestrian crossing prediction in real data with a simple and lightweight implementation.

A Unified Analytical Method to Quantify Three Types of Fast Frequency Response from Inverter-based Resources

no code implementations • 20 Sep 2022 • Shuan Dong, Xin Fang, Jin Tan, Ningchao Gao, Xiaofan Cui, Anderson Hoke

The simulation results in the IEEE 39-bus system with different types of FFR demonstrate that the proposed method provides an accurate and fast prediction of the frequency nadir under various disturbances.

DLMP of Competitive Markets in Active Distribution Networks: Models, Solutions, Applications, and Visions

no code implementations • 27 May 2022 • Xiaofei Wang, Fangxing Li, Linquan Bai, Xin Fang

The DLMP provides a solution that can be essential for competitive market operation in future distribution systems.

A Complementary Joint Training Approach Using Unpaired Speech and Text for Low-Resource Automatic Speech Recognition

no code implementations • 5 Apr 2022 • Ye-Qian Du, Jie Zhang, Qiu-Shi Zhu, Li-Rong Dai, Ming-Hui Wu, Xin Fang, Zhou-Wang Yang

Unpaired data has been shown to be beneficial for low-resource automatic speech recognition (ASR), which can be involved in the design of hybrid models with multi-task training or language model dependent pre-training.

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +3

Learning Contextually Fused Audio-visual Representations for Audio-visual Speech Recognition

no code implementations • 15 Feb 2022 • Zi-Qiang Zhang, Jie Zhang, Jian-Shu Zhang, Ming-Hui Wu, Xin Fang, Li-Rong Dai

The proposed approach explores both the complementarity of audio-visual modalities and long-term context dependency using a transformer-based fusion module and a flexible masking strategy.

Audio-Visual Speech Recognition, Lipreading, +4

Impact of DER Communication Delay in AGC: Cyber-Physical Dynamic Simulation

no code implementations • 7 May 2021 • Wenbo Wang, Xin Fang, Anthony Florita

Distributed energy resource (DER) frequency regulations are promising technologies for future grid operation.

XLST: Cross-lingual Self-training to Learn Multilingual Representation for Low Resource Speech Recognition

no code implementations • 15 Mar 2021 • Zi-Qiang Zhang, Yan Song, Ming-Hui Wu, Xin Fang, Li-Rong Dai

In this paper, we propose a weakly supervised multilingual representation learning framework, called cross-lingual self-training (XLST).

Data Augmentation, Representation Learning, +2

Effective Parallelism for Equation and Jacobian Evaluation in Power Flow Calculation

no code implementations • 24 Nov 2020 • Hantao Cui, Fangxing Li, Xin Fang

This letter investigates parallelism approaches for equation and Jacobian evaluations in large-scale power flow calculation.
