Search Results for author: Huy Phan

Found 57 papers, 17 papers with code

Mind the Domain Gap: a Systematic Analysis on Bioacoustic Sound Event Detection

1 code implementation • 27 Mar 2024 • Jinhua Liang, Ines Nolasco, Burooj Ghani, Huy Phan, Emmanouil Benetos, Dan Stowell

A recent development in the field is the introduction of the task known as few-shot bioacoustic sound event detection, which aims to train a versatile animal sound detector using only a small set of audio samples.

Data Augmentation Domain Adaptation +3

WavCraft: Audio Editing and Generation with Natural Language Prompts

1 code implementation • 14 Mar 2024 • Jinhua Liang, Huan Zhang, Haohe Liu, Yin Cao, Qiuqiang Kong, Xubo Liu, Wenwu Wang, Mark D. Plumbley, Huy Phan, Emmanouil Benetos

We introduce WavCraft, a collective system that leverages large language models (LLMs) to connect diverse task-specific models for audio content creation and editing.

In-Context Learning

Hierarchical Tree-structured Knowledge Graph For Academic Insight Survey

no code implementations • 7 Feb 2024 • Jinghong Li, Huy Phan, Wen Gu, Koichi Ota, Shinobu Hasegawa

To address these issues, this study aims to support research insight surveys for beginner researchers by establishing a hierarchical tree-structured knowledge graph that reflects the inheritance insight of research topics and the relevance insight among the academic papers.

Knowledge Graphs Recommendation Systems +1

DisDet: Exploring Detectability of Backdoor Attack on Diffusion Models

no code implementations • 5 Feb 2024 • Yang Sui, Huy Phan, Jinqi Xiao, Tianfang Zhang, Zijie Tang, Cong Shi, Yan Wang, Yingying Chen, Bo Yuan

In this paper, for the first time, we systematically explore the detectability of the poisoned noise input for the backdoored diffusion models, an important performance metric yet little explored in the existing works.

Backdoor Attack

ELRT: Efficient Low-Rank Training for Compact Convolutional Neural Networks

no code implementations • 18 Jan 2024 • Yang Sui, Miao Yin, Yu Gong, Jinqi Xiao, Huy Phan, Bo Yuan

Low-rank compression, a popular model compression technique that produces compact convolutional neural networks (CNNs) with low rankness, has been well-studied in the literature.

Low-rank compression Model Compression

Improving the Robustness of 3D Human Pose Estimation: A Benchmark and Learning from Noisy Input

no code implementations • 11 Dec 2023 • Trung-Hieu Hoang, Mona Zehni, Huy Phan, Duc Minh Vo, Minh N. Do

We observe the poor generalization of state-of-the-art 3D pose lifters in the presence of corruption and establish two techniques to tackle this issue.

3D Human Pose Estimation Data Augmentation

Acoustic Prompt Tuning: Empowering Large Language Models with Audition Capabilities

1 code implementation • 30 Nov 2023 • Jinhua Liang, Xubo Liu, Wenwu Wang, Mark D. Plumbley, Huy Phan, Emmanouil Benetos

Moreover, we improve the framework of audio language model by using interleaved audio-text embeddings as the input sequence.

Audio Classification Few-Shot Audio Classification +2

ATGNN: Audio Tagging Graph Neural Network

no code implementations • 2 Nov 2023 • Shubhr Singh, Christian J. Steinmetz, Emmanouil Benetos, Huy Phan, Dan Stowell

Deep learning models such as CNNs and Transformers have achieved impressive performance for end-to-end audio tagging.

Audio Tagging

Adapting Language-Audio Models as Few-Shot Audio Learners

no code implementations • 28 May 2023 • Jinhua Liang, Xubo Liu, Haohe Liu, Huy Phan, Emmanouil Benetos, Mark D. Plumbley, Wenwu Wang

We presented the Treff adapter, a training-efficient adapter for CLAP, to boost zero-shot classification performance by making use of a small set of labelled data.

Audio Classification Few-Shot Learning +1

CoRe-Sleep: A Multimodal Fusion Framework for Time Series Robust to Imperfect Modalities

no code implementations • 27 Mar 2023 • Konstantinos Kontras, Christos Chatzichristos, Huy Phan, Johan Suykens, Maarten De Vos

The results indicate that training the model on multimodal data does positively influence performance when tested on unimodal data.

Ranked #1 on Sleep Stage Detection on SHHS

EEG Sleep Staging +1

An Inception-Residual-Based Architecture with Multi-Objective Loss for Detecting Respiratory Anomalies

no code implementations • 7 Mar 2023 • Dat Ngo, Lam Pham, Huy Phan, Minh Tran, Delaram Jarchi, Sefki Kolozali

Notably, we achieved the Top-1 performance in Task 2-1 and Task 2-2 with the highest Score of 74.5% and 53.9%, respectively.

Task 2

Deep Learning of Segment-Level Feature Representation for Speech Emotion Recognition in Conversations

no code implementations • 5 Feb 2023 • Jiachen Luo, Huy Phan, Joshua Reiss

Accurately detecting emotions in conversation is a necessary yet challenging task due to the complexity of emotions and dynamics in dialogues.

Speech Emotion Recognition

Cross-Modal Fusion Techniques for Utterance-Level Emotion Recognition from Text and Speech

no code implementations • 5 Feb 2023 • Jiachen Luo, Huy Phan, Joshua Reiss

Multimodal emotion recognition (MER) is a fundamental complex research problem due to the uncertainty of human emotional expression and the heterogeneity gap between different modalities.

Multimodal Emotion Recognition

L-SeqSleepNet: Whole-cycle Long Sequence Modelling for Automatic Sleep Staging

1 code implementation • 9 Jan 2023 • Huy Phan, Kristian P. Lorenzen, Elisabeth Heremans, Oliver Y. Chén, Minh C. Tran, Philipp Koch, Alfred Mertins, Mathias Baumert, Kaare Mikkelsen, Maarten De Vos

In this work, we show that while encoding the logic of a whole sleep cycle is crucial to improve sleep staging performance, the sequential modelling approach in existing state-of-the-art deep learning models is inefficient for that purpose.

EEG Sleep Staging

Improving trajectory localization accuracy via direction-of-arrival derivative estimation

no code implementations • 7 Dec 2022 • Ruchi Pandey, Shreyas Jaiswal, Huy Phan, Santosh Nannuru

In this paper, we comprehensively analyse the improvement in sound source localization obtained by combining the directions of arrival (DOAs) with their derivatives, which quantify the changes in the positions of sources over time.

Direction of Arrival Estimation

CSTAR: Towards Compact and STructured Deep Neural Networks with Adversarial Robustness

no code implementations • 4 Dec 2022 • Huy Phan, Miao Yin, Yang Sui, Bo Yuan, Saman Zonouz

Considering the co-importance of model compactness and robustness in practical applications, several prior works have explored improving the adversarial robustness of sparse neural networks.

Adversarial Robustness Model Compression

Modelling black-box audio effects with time-varying feature modulation

no code implementations • 1 Nov 2022 • Marco Comunità, Christian J. Steinmetz, Huy Phan, Joshua D. Reiss

Deep learning approaches for black-box modelling of audio effects have shown promise; however, the majority of existing work focuses on nonlinear effects with behaviour on relatively short time-scales, such as guitar amplifiers and distortion.

Personalized Longitudinal Assessment of Multiple Sclerosis Using Smartphones

no code implementations • 20 Sep 2022 • Oliver Y. Chén, Florian Lipsmeier, Huy Phan, Frank Dondelinger, Andrew Creagh, Christian Gossens, Michael Lindemann, Maarten De Vos

The results show that the proposed model is promising to achieve personalized longitudinal MS assessment; they also suggest that features related to gait and balance as well as upper extremity function, remotely collected from sensor-based assessments, may be useful digital markers for predicting MS over time.

Imputation

RIBAC: Towards Robust and Imperceptible Backdoor Attack against Compact DNN

1 code implementation • 22 Aug 2022 • Huy Phan, Cong Shi, Yi Xie, Tianfang Zhang, Zhuohang Li, Tianming Zhao, Jian Liu, Yan Wang, Yingying Chen, Bo Yuan

Recently, backdoor attacks have become an emerging threat to the security of deep neural network (DNN) models.

Backdoor Attack

Polyphonic audio event detection: multi-label or multi-class multi-task classification problem?

no code implementations • 29 Jan 2022 • Huy Phan, Thi Ngoc Tho Nguyen, Philipp Koch, Alfred Mertins

The network is composed of a backbone subnet and multiple task-specific subnets.

Classification Event Detection +2

Feature matching as improved transfer learning technique for wearable EEG

no code implementations • 29 Dec 2021 • Elisabeth R. M. Heremans, Huy Phan, Amir H. Ansari, Pascal Borzée, Bertien Buyse, Dries Testelmans, Maarten De Vos

This method consists of training a model with larger amounts of data from the source modality and few paired samples of source and target modality.

EEG Sleep Staging +1

SALSA-Lite: A Fast and Effective Feature for Polyphonic Sound Event Localization and Detection with Microphone Arrays

4 code implementations • 16 Nov 2021 • Thi Ngoc Tho Nguyen, Douglas L. Jones, Karn N. Watcharasupat, Huy Phan, Woon-Seng Gan

In this work, we introduce SALSA-Lite, a fast and effective feature for polyphonic SELD using microphone array inputs.

Sound Event Localization and Detection

Automatic Sleep Staging of EEG Signals: Recent Development, Challenges, and Future Directions

no code implementations • 3 Nov 2021 • Huy Phan, Kaare Mikkelsen

Modern deep learning holds great potential to transform clinical practice on human sleep.

EEG Sleep Staging

CHIP: CHannel Independence-based Pruning for Compact Neural Networks

1 code implementation • NeurIPS 2021 • Yang Sui, Miao Yin, Yi Xie, Huy Phan, Saman Zonouz, Bo Yuan

Filter pruning has been widely used for neural network compression because of the practical acceleration it enables.

Neural Network Compression

Neural Synthesis of Footsteps Sound Effects with Generative Adversarial Networks

no code implementations • 18 Oct 2021 • Marco Comunità, Huy Phan, Joshua D. Reiss

Footsteps are among the most ubiquitous sound effects in multimedia applications.

Pediatric Automatic Sleep Staging: A comparative study of state-of-the-art deep learning methods

no code implementations • 23 Aug 2021 • Huy Phan, Alfred Mertins, Mathias Baumert

Background: Despite the tremendous progress recently made towards automatic sleep staging in adults, it is currently unknown if the most advanced algorithms generalize to the pediatric population, which displays distinctive characteristics in overnight polysomnography (PSG).

Electroencephalogram (EEG) Sleep Staging

SleepTransformer: Automatic Sleep Staging with Interpretability and Uncertainty Quantification

no code implementations • 23 May 2021 • Huy Phan, Kaare Mikkelsen, Oliver Y. Chén, Philipp Koch, Alfred Mertins, Maarten De Vos

It is based on the transformer backbone and offers interpretability of the model's decisions at both the epoch and sequence level.

EEG Sleep Staging +1

Light-weight sleep monitoring: electrode distance matters more than placement for automatic scoring

no code implementations • 9 Apr 2021 • Kaare B. Mikkelsen, Huy Phan, Mike L. Rank, Martin C. Hemmsen, Maarten De Vos, Preben Kidmose

Modern sleep monitoring development is shifting towards the use of unobtrusive sensors combined with algorithms for automatic sleep scoring.

Position

Multi-view Audio and Music Classification

no code implementations • 3 Mar 2021 • Huy Phan, Huy Le Nguyen, Oliver Y. Chén, Lam Pham, Philipp Koch, Ian McLoughlin, Alfred Mertins

The learned embeddings in the subnetworks are then concatenated to form the multi-view embedding for classification, similar to a simple concatenation network.

Classification General Classification +2

MIN2Net: End-to-End Multi-Task Learning for Subject-Independent Motor Imagery EEG Classification

1 code implementation • 7 Feb 2021 • Phairot Autthasan, Rattanaphon Chaisaen, Thapanun Sudhawiyangkul, Phurin Rangpong, Suktipol Kiatthaveephong, Nat Dilokthanakul, Gun Bhakdisongkhram, Huy Phan, Cuntai Guan, Theerawit Wilaiprasitporn

We integrate deep metric learning into a multi-task autoencoder to learn a compact and discriminative latent representation from EEG and perform classification simultaneously.

Classification EEG +4

Inception-Based Network and Multi-Spectrogram Ensemble Applied For Predicting Respiratory Anomalies and Lung Diseases

no code implementations • 26 Dec 2020 • Lam Pham, Huy Phan, Ross King, Alfred Mertins, Ian McLoughlin

This paper presents an inception-based deep neural network for detecting lung diseases using respiratory sound input.

Self-Attention Generative Adversarial Network for Speech Enhancement

1 code implementation • 18 Oct 2020 • Huy Phan, Huy Le Nguyen, Oliver Y. Chén, Philipp Koch, Ngoc Q. K. Duong, Ian McLoughlin, Alfred Mertins

Existing generative adversarial networks (GANs) for speech enhancement solely rely on the convolution operation, which may obscure temporal dependencies across the sequence input.

Generative Adversarial Network Speech Enhancement

On Multitask Loss Function for Audio Event Detection and Localization

no code implementations • 11 Sep 2020 • Huy Phan, Lam Pham, Philipp Koch, Ngoc Q. K. Duong, Ian McLoughlin, Alfred Mertins

Audio event localization and detection (SELD) have been commonly tackled using multitask models.

Action Detection Activity Detection +3

XSleepNet: Multi-View Sequential Model for Automatic Sleep Staging

1 code implementation • 8 Jul 2020 • Huy Phan, Oliver Y. Chén, Minh C. Tran, Philipp Koch, Alfred Mertins, Maarten De Vos

This work proposes a sequence-to-sequence sleep staging model, XSleepNet, that is capable of learning a joint representation from both raw signals and time-frequency images.

Ranked #1 on Sleep Stage Detection on PhysioNet Challenge 2018

Sleep Staging

Personalized Automatic Sleep Staging with Single-Night Data: a Pilot Study with KL-Divergence Regularization

no code implementations • 23 Apr 2020 • Huy Phan, Kaare Mikkelsen, Oliver Y. Chén, Philipp Koch, Alfred Mertins, Preben Kidmose, Maarten De Vos

We employ the pretrained SeqSleepNet (i.e. the subject-independent model) as a starting point and finetune it with the single-night personalization data to derive the personalized model.

Sleep Staging Specificity +1

MetaSleepLearner: A Pilot Study on Fast Adaptation of Bio-signals-Based Sleep Stage Classifier to New Individual Subject Using Meta-Learning

1 code implementation • 8 Apr 2020 • Nannapas Banluesombatkul, Pichayoot Ouppaphan, Pitshaporn Leelaarporn, Payongkit Lakhan, Busarakum Chaitusaney, Nattapong Jaimchariyatam, Ekapol Chuangsuwanich, Wei Chen, Huy Phan, Nat Dilokthanakul, Theerawit Wilaiprasitporn

This is the first work that investigated a non-conventional pre-training method, MAML, resulting in a possibility for human-machine collaboration in sleep stage classification and easing the burden of the clinicians in labelling the sleep stages through only several epochs rather than an entire recording.

Automatic Sleep Stage Classification Meta-Learning +2

CNN-MoE based framework for classification of respiratory anomalies and lung disease detection

no code implementations • 4 Apr 2020 • Lam Pham, Huy Phan, Ramaswamy Palaniappan, Alfred Mertins, Ian McLoughlin

This paper presents and explores a robust deep learning framework for auscultation analysis.

Data Augmentation General Classification

Robust Deep Learning Framework For Predicting Respiratory Anomalies and Diseases

no code implementations • 21 Jan 2020 • Lam Pham, Ian McLoughlin, Huy Phan, Minh Tran, Truc Nguyen, Ramaswamy Palaniappan

This paper presents a robust deep learning framework developed to detect respiratory diseases from recordings of respiratory sounds.

Improving GANs for Speech Enhancement

2 code implementations • 15 Jan 2020 • Huy Phan, Ian V. McLoughlin, Lam Pham, Oliver Y. Chén, Philipp Koch, Maarten De Vos, Alfred Mertins

The former constrains the generators to learn a common mapping that is iteratively applied at all enhancement stages and results in a small model footprint.

Speech Enhancement

CAG: A Real-time Low-cost Enhanced-robustness High-transferability Content-aware Adversarial Attack Generator

no code implementations • 16 Dec 2019 • Huy Phan, Yi Xie, Siyu Liao, Jie Chen, Bo Yuan

In addition, CAG exhibits high transferability across different DNN classifier models in black-box attack scenario by introducing random dropout in the process of generating perturbations.

Adversarial Attack

Towards More Accurate Automatic Sleep Staging via Deep Transfer Learning

1 code implementation • 30 Jul 2019 • Huy Phan, Oliver Y. Chén, Philipp Koch, Zongqing Lu, Ian McLoughlin, Alfred Mertins, Maarten De Vos

We employ the Montreal Archive of Sleep Studies (MASS) database consisting of 200 subjects as the source domain and study deep transfer learning on three different target domains: the Sleep Cassette subset and the Sleep Telemetry subset of the Sleep-EDF Expanded database, and the Surrey-cEEGrid database.

Ranked #1 on Multimodal Sleep Stage Detection on Surrey-PSG

Automatic Sleep Stage Classification Multimodal Sleep Stage Detection +2

Deep Transfer Learning for Single-Channel Automatic Sleep Staging with Channel Mismatch

no code implementations • 11 Apr 2019 • Huy Phan, Oliver Y. Chén, Philipp Koch, Alfred Mertins, Maarten De Vos

This work presents a deep transfer learning approach to overcome the channel mismatch problem and transfer knowledge from a large dataset to a small cohort to study automatic sleep staging with single-channel input.

Sleep Staging Transfer Learning

Spatio-Temporal Attention Pooling for Audio Scene Classification

no code implementations • 6 Apr 2019 • Huy Phan, Oliver Y. Chén, Lam Pham, Philipp Koch, Maarten De Vos, Ian McLoughlin, Alfred Mertins

Acoustic scenes are rich and redundant in their content.

Acoustic Scene Classification Classification +3

Beyond Equal-Length Snippets: How Long is Sufficient to Recognize an Audio Scene?

no code implementations • 2 Nov 2018 • Huy Phan, Oliver Y. Chén, Philipp Koch, Lam Pham, Ian McLoughlin, Alfred Mertins, Maarten De Vos

Moreover, as model fusion with deep network ensemble is prevalent in audio scene classification, we further study whether, and if so, when model fusion is necessary for this task.

General Classification Scene Classification

Unifying Isolated and Overlapping Audio Event Detection with Multi-Label Multi-Task Convolutional Recurrent Neural Networks

no code implementations • 2 Nov 2018 • Huy Phan, Oliver Y. Chén, Philipp Koch, Lam Pham, Ian McLoughlin, Alfred Mertins, Maarten De Vos

We propose a multi-label multi-task framework based on a convolutional recurrent neural network to unify detection of isolated and overlapping audio events.

Event Detection Multi-Label Classification

SeqSleepNet: End-to-End Hierarchical Recurrent Neural Network for Sequence-to-Sequence Automatic Sleep Staging

2 code implementations • 28 Sep 2018 • Huy Phan, Fernando Andreotti, Navin Cooray, Oliver Y. Chén, Maarten De Vos

At the sequence processing level, a recurrent layer is placed on top of the learned epoch-wise features for long-term modelling of sequential epochs.

General Classification Sleep Staging

Joint Classification and Prediction CNN Framework for Automatic Sleep Stage Classification

1 code implementation • 16 May 2018 • Huy Phan, Fernando Andreotti, Navin Cooray, Oliver Y. Chén, Maarten De Vos

While the proposed framework is orthogonal to the widely adopted classification schemes, which take one or multiple epochs as contextual inputs and produce a single classification decision on the target epoch, we demonstrate its advantages in several ways.

Ranked #2 on Sleep Stage Detection on MASS SS2

Automatic Sleep Stage Classification Classification +2

Enabling Early Audio Event Detection with Neural Networks

no code implementations • 6 Dec 2017 • Huy Phan, Philipp Koch, Ian McLoughlin, Alfred Mertins

The proposed system consists of a novel inference step coupled with dual parallel tailored-loss deep neural networks (DNNs).

Event Detection

DNN and CNN with Weighted and Multi-task Loss Functions for Audio Event Detection

no code implementations • 10 Aug 2017 • Huy Phan, Martin Krawczyk-Becker, Timo Gerkmann, Alfred Mertins

Our proposed systems significantly outperform the challenge baseline, improving F-score from 72.7% to 90.0% and reducing detection error rate from 0.53 to 0.18 on average on the development data.

Event Detection Task 2

Audio Scene Classification with Deep Recurrent Neural Networks

no code implementations • 14 Mar 2017 • Huy Phan, Philipp Koch, Fabrice Katzberg, Marco Maass, Radoslaw Mazur, Alfred Mertins

We introduce in this work an efficient approach for audio scene classification using deep recurrent neural networks.

Classification General Classification +1

Classifying Variable-Length Audio Files with All-Convolutional Networks and Masked Global Pooling

1 code implementation • 11 Jul 2016 • Lars Hertel, Huy Phan, Alfred Mertins

We trained a deep all-convolutional neural network with masked global pooling to perform single-label classification for acoustic scene classification and multi-label classification for domestic audio tagging in the DCASE-2016 contest.

Acoustic Scene Classification Audio Tagging +5

CaR-FOREST: Joint Classification-Regression Decision Forests for Overlapping Audio Event Detection

no code implementations • 8 Jul 2016 • Huy Phan, Lars Hertel, Marco Maass, Philipp Koch, Alfred Mertins

The regression phase is then carried out to let the positive audio segments vote for the event onsets and offsets, and therefore model the temporal structure of audio events.

Event Detection General Classification +1

CNN-LTE: a Class of 1-X Pooling Convolutional Neural Networks on Label Tree Embeddings for Audio Scene Recognition

no code implementations • 8 Jul 2016 • Huy Phan, Lars Hertel, Marco Maass, Philipp Koch, Alfred Mertins

This category taxonomy is then used in the feature extraction step in which an audio scene instance is represented by a label tree embedding image.

Scene Recognition

Label Tree Embeddings for Acoustic Scene Classification

no code implementations • 25 Jun 2016 • Huy Phan, Lars Hertel, Marco Maass, Philipp Koch, Alfred Mertins

We present in this paper an efficient approach for acoustic scene classification by exploring the structure of class labels.

Acoustic Scene Classification Classification +3

Learning Compact Structural Representations for Audio Events Using Regressor Banks

no code implementations • 29 Apr 2016 • Huy Phan, Marco Maass, Lars Hertel, Radoslaw Mazur, Ian McLoughlin, Alfred Mertins

The entries of the descriptor are produced by evaluating a set of regressors on the input signal.

General Classification

Robust Audio Event Recognition with 1-Max Pooling Convolutional Neural Networks

1 code implementation • 21 Apr 2016 • Huy Phan, Lars Hertel, Marco Maass, Alfred Mertins

We present in this paper a simple, yet efficient convolutional neural network (CNN) architecture for robust audio event recognition.

Comparing Time and Frequency Domain for Audio Event Recognition Using Deep Learning

no code implementations • 18 Mar 2016 • Lars Hertel, Huy Phan, Alfred Mertins

Recognizing acoustic events is an intricate problem for a machine and an emerging field of research.
