Search Results for author: Huy Phan

Found 57 papers, 17 papers with code

Mind the Domain Gap: a Systematic Analysis on Bioacoustic Sound Event Detection

1 code implementation • 27 Mar 2024 • Jinhua Liang, Ines Nolasco, Burooj Ghani, Huy Phan, Emmanouil Benetos, Dan Stowell

A recent development in the field is the introduction of the task known as few-shot bioacoustic sound event detection, which aims to train a versatile animal sound detector using only a small set of audio samples.

Data Augmentation Domain Adaptation +3

WavCraft: Audio Editing and Generation with Natural Language Prompts

1 code implementation • 14 Mar 2024 • Jinhua Liang, Huan Zhang, Haohe Liu, Yin Cao, Qiuqiang Kong, Xubo Liu, Wenwu Wang, Mark D. Plumbley, Huy Phan, Emmanouil Benetos

We introduce WavCraft, a collective system that leverages large language models (LLMs) to connect diverse task-specific models for audio content creation and editing.

In-Context Learning

Hierarchical Tree-structured Knowledge Graph For Academic Insight Survey

no code implementations • 7 Feb 2024 • Jinghong Li, Huy Phan, Wen Gu, Koichi Ota, Shinobu Hasegawa

To address these issues, this study aims to support research insight surveys for beginner researchers by establishing a hierarchical tree-structured knowledge graph that reflects the inheritance insight of research topics and the relevance insight among the academic papers.

Knowledge Graphs Recommendation Systems +1

DisDet: Exploring Detectability of Backdoor Attack on Diffusion Models

no code implementations • 5 Feb 2024 • Yang Sui, Huy Phan, Jinqi Xiao, Tianfang Zhang, Zijie Tang, Cong Shi, Yan Wang, Yingying Chen, Bo Yuan

In this paper, for the first time, we systematically explore the detectability of the poisoned noise input for the backdoored diffusion models, an important performance metric yet little explored in the existing works.

Backdoor Attack

ELRT: Efficient Low-Rank Training for Compact Convolutional Neural Networks

no code implementations • 18 Jan 2024 • Yang Sui, Miao Yin, Yu Gong, Jinqi Xiao, Huy Phan, Bo Yuan

Low-rank compression, a popular model compression technique that produces compact convolutional neural networks (CNNs) with low rankness, has been well-studied in the literature.

Low-rank compression Model Compression

Improving the Robustness of 3D Human Pose Estimation: A Benchmark and Learning from Noisy Input

no code implementations • 11 Dec 2023 • Trung-Hieu Hoang, Mona Zehni, Huy Phan, Duc Minh Vo, Minh N. Do

We observe the poor generalization of state-of-the-art 3D pose lifters in the presence of corruption and establish two techniques to tackle this issue.

3D Human Pose Estimation Data Augmentation

ATGNN: Audio Tagging Graph Neural Network

no code implementations • 2 Nov 2023 • Shubhr Singh, Christian J. Steinmetz, Emmanouil Benetos, Huy Phan, Dan Stowell

Deep learning models such as CNNs and Transformers have achieved impressive performance for end-to-end audio tagging.

Audio Tagging

Adapting Language-Audio Models as Few-Shot Audio Learners

no code implementations • 28 May 2023 • Jinhua Liang, Xubo Liu, Haohe Liu, Huy Phan, Emmanouil Benetos, Mark D. Plumbley, Wenwu Wang

We presented the Treff adapter, a training-efficient adapter for CLAP, to boost zero-shot classification performance by making use of a small set of labelled data.

Audio Classification Few-Shot Learning +1

An Inception-Residual-Based Architecture with Multi-Objective Loss for Detecting Respiratory Anomalies

no code implementations • 7 Mar 2023 • Dat Ngo, Lam Pham, Huy Phan, Minh Tran, Delaram Jarchi, Sefki Kolozali

Notably, we achieved the Top-1 performance in Task 2-1 and Task 2-2 with the highest Score of 74.5% and 53.9%, respectively.

Task 2

Deep Learning of Segment-level Feature Representation for Speech Emotion Recognition in Conversations

no code implementations • 5 Feb 2023 • Jiachen Luo, Huy Phan, Joshua Reiss

Accurately detecting emotions in conversation is a necessary yet challenging task due to the complexity of emotions and dynamics in dialogues.

Speech Emotion Recognition

Cross-modal Fusion Techniques for Utterance-level Emotion Recognition from Text and Speech

no code implementations • 5 Feb 2023 • Jiachen Luo, Huy Phan, Joshua Reiss

Multimodal emotion recognition (MER) is a fundamental, complex research problem due to the uncertainty of human emotional expression and the heterogeneity gap between different modalities.

Multimodal Emotion Recognition

L-SeqSleepNet: Whole-cycle Long Sequence Modelling for Automatic Sleep Staging

1 code implementation • 9 Jan 2023 • Huy Phan, Kristian P. Lorenzen, Elisabeth Heremans, Oliver Y. Chén, Minh C. Tran, Philipp Koch, Alfred Mertins, Mathias Baumert, Kaare Mikkelsen, Maarten De Vos

In this work, we show that while encoding the logic of a whole sleep cycle is crucial to improve sleep staging performance, the sequential modelling approach in existing state-of-the-art deep learning models is inefficient for that purpose.

EEG Sleep Staging

Improving trajectory localization accuracy via direction-of-arrival derivative estimation

no code implementations • 7 Dec 2022 • Ruchi Pandey, Shreyas Jaiswal, Huy Phan, Santosh Nannuru

In this paper, we comprehensively analyse the improvement in sound source localization obtained by combining the directions of arrival (DOAs) with their derivatives, which quantify the changes in the positions of sources over time.

Direction of Arrival Estimation

CSTAR: Towards Compact and STructured Deep Neural Networks with Adversarial Robustness

no code implementations • 4 Dec 2022 • Huy Phan, Miao Yin, Yang Sui, Bo Yuan, Saman Zonouz

Considering the co-importance of model compactness and robustness in practical applications, several prior works have explored improving the adversarial robustness of sparse neural networks.

Adversarial Robustness Model Compression

Modelling black-box audio effects with time-varying feature modulation

no code implementations • 1 Nov 2022 • Marco Comunità, Christian J. Steinmetz, Huy Phan, Joshua D. Reiss

Deep learning approaches for black-box modelling of audio effects have shown promise; however, the majority of existing work focuses on nonlinear effects with behaviour on relatively short time-scales, such as guitar amplifiers and distortion.

Personalized Longitudinal Assessment of Multiple Sclerosis Using Smartphones

no code implementations • 20 Sep 2022 • Oliver Y. Chén, Florian Lipsmeier, Huy Phan, Frank Dondelinger, Andrew Creagh, Christian Gossens, Michael Lindemann, Maarten De Vos

The results show that the proposed model is promising to achieve personalized longitudinal MS assessment; they also suggest that features related to gait and balance as well as upper extremity function, remotely collected from sensor-based assessments, may be useful digital markers for predicting MS over time.

Imputation

Feature matching as improved transfer learning technique for wearable EEG

no code implementations • 29 Dec 2021 • Elisabeth R. M. Heremans, Huy Phan, Amir H. Ansari, Pascal Borzée, Bertien Buyse, Dries Testelmans, Maarten De Vos

This method consists of training a model with larger amounts of data from the source modality and few paired samples of source and target modality.

EEG Sleep Staging +1

CHIP: CHannel Independence-based Pruning for Compact Neural Networks

1 code implementation • NeurIPS 2021 • Yang Sui, Miao Yin, Yi Xie, Huy Phan, Saman Zonouz, Bo Yuan

Filter pruning has been widely used for neural network compression because it enables practical acceleration.

Neural Network Compression

Neural Synthesis of Footsteps Sound Effects with Generative Adversarial Networks

no code implementations • 18 Oct 2021 • Marco Comunità, Huy Phan, Joshua D. Reiss

Footsteps are among the most ubiquitous sound effects in multimedia applications.

Pediatric Automatic Sleep Staging: A comparative study of state-of-the-art deep learning methods

no code implementations • 23 Aug 2021 • Huy Phan, Alfred Mertins, Mathias Baumert

Background: Despite the tremendous progress recently made towards automatic sleep staging in adults, it is currently unknown if the most advanced algorithms generalize to the pediatric population, which displays distinctive characteristics in overnight polysomnography (PSG).

Electroencephalogram (EEG) Sleep Staging

SleepTransformer: Automatic Sleep Staging with Interpretability and Uncertainty Quantification

no code implementations • 23 May 2021 • Huy Phan, Kaare Mikkelsen, Oliver Y. Chén, Philipp Koch, Alfred Mertins, Maarten De Vos

It is based on the transformer backbone and offers interpretability of the model's decisions at both the epoch and sequence level.

EEG Sleep Staging +1

Light-weight sleep monitoring: electrode distance matters more than placement for automatic scoring

no code implementations • 9 Apr 2021 • Kaare B. Mikkelsen, Huy Phan, Mike L. Rank, Martin C. Hemmsen, Maarten De Vos, Preben Kidmose

Modern sleep monitoring development is shifting towards the use of unobtrusive sensors combined with algorithms for automatic sleep scoring.

Position

Multi-view Audio and Music Classification

no code implementations • 3 Mar 2021 • Huy Phan, Huy Le Nguyen, Oliver Y. Chén, Lam Pham, Philipp Koch, Ian McLoughlin, Alfred Mertins

The learned embeddings in the subnetworks are then concatenated to form the multi-view embedding for classification, similar to a simple concatenation network.

Classification General Classification +2

Inception-Based Network and Multi-Spectrogram Ensemble Applied For Predicting Respiratory Anomalies and Lung Diseases

no code implementations • 26 Dec 2020 • Lam Pham, Huy Phan, Ross King, Alfred Mertins, Ian McLoughlin

This paper presents an inception-based deep neural network for detecting lung diseases using respiratory sound input.

Self-Attention Generative Adversarial Network for Speech Enhancement

1 code implementation • 18 Oct 2020 • Huy Phan, Huy Le Nguyen, Oliver Y. Chén, Philipp Koch, Ngoc Q. K. Duong, Ian McLoughlin, Alfred Mertins

Existing generative adversarial networks (GANs) for speech enhancement solely rely on the convolution operation, which may obscure temporal dependencies across the sequence input.

Generative Adversarial Network Speech Enhancement

XSleepNet: Multi-View Sequential Model for Automatic Sleep Staging

1 code implementation • 8 Jul 2020 • Huy Phan, Oliver Y. Chén, Minh C. Tran, Philipp Koch, Alfred Mertins, Maarten De Vos

This work proposes a sequence-to-sequence sleep staging model, XSleepNet, that is capable of learning a joint representation from both raw signals and time-frequency images.

Sleep Staging

Personalized Automatic Sleep Staging with Single-Night Data: a Pilot Study with KL-Divergence Regularization

no code implementations • 23 Apr 2020 • Huy Phan, Kaare Mikkelsen, Oliver Y. Chén, Philipp Koch, Alfred Mertins, Preben Kidmose, Maarten De Vos

We employ the pretrained SeqSleepNet (i.e. the subject-independent model) as a starting point and finetune it with the single-night personalization data to derive the personalized model.

Sleep Staging Specificity +1

MetaSleepLearner: A Pilot Study on Fast Adaptation of Bio-signals-Based Sleep Stage Classifier to New Individual Subject Using Meta-Learning

1 code implementation • 8 Apr 2020 • Nannapas Banluesombatkul, Pichayoot Ouppaphan, Pitshaporn Leelaarporn, Payongkit Lakhan, Busarakum Chaitusaney, Nattapong Jaimchariyatam, Ekapol Chuangsuwanich, Wei Chen, Huy Phan, Nat Dilokthanakul, Theerawit Wilaiprasitporn

This is the first work to investigate a non-conventional pre-training method, MAML, which opens a possibility for human-machine collaboration in sleep stage classification and eases the burden on clinicians, who would label only several epochs rather than an entire recording.

Automatic Sleep Stage Classification Meta-Learning +2

Robust Deep Learning Framework For Predicting Respiratory Anomalies and Diseases

no code implementations • 21 Jan 2020 • Lam Pham, Ian McLoughlin, Huy Phan, Minh Tran, Truc Nguyen, Ramaswamy Palaniappan

This paper presents a robust deep learning framework developed to detect respiratory diseases from recordings of respiratory sounds.

Improving GANs for Speech Enhancement

2 code implementations • 15 Jan 2020 • Huy Phan, Ian V. McLoughlin, Lam Pham, Oliver Y. Chén, Philipp Koch, Maarten De Vos, Alfred Mertins

The former constrains the generators to learn a common mapping that is iteratively applied at all enhancement stages and results in a small model footprint.

Speech Enhancement

CAG: A Real-time Low-cost Enhanced-robustness High-transferability Content-aware Adversarial Attack Generator

no code implementations • 16 Dec 2019 • Huy Phan, Yi Xie, Siyu Liao, Jie Chen, Bo Yuan

In addition, CAG exhibits high transferability across different DNN classifier models in the black-box attack scenario by introducing random dropout in the process of generating perturbations.

Adversarial Attack

Towards More Accurate Automatic Sleep Staging via Deep Transfer Learning

1 code implementation • 30 Jul 2019 • Huy Phan, Oliver Y. Chén, Philipp Koch, Zongqing Lu, Ian McLoughlin, Alfred Mertins, Maarten De Vos

We employ the Montreal Archive of Sleep Studies (MASS) database consisting of 200 subjects as the source domain and study deep transfer learning on three different target domains: the Sleep Cassette subset and the Sleep Telemetry subset of the Sleep-EDF Expanded database, and the Surrey-cEEGrid database.

Automatic Sleep Stage Classification Multimodal Sleep Stage Detection +2

Deep Transfer Learning for Single-Channel Automatic Sleep Staging with Channel Mismatch

no code implementations • 11 Apr 2019 • Huy Phan, Oliver Y. Chén, Philipp Koch, Alfred Mertins, Maarten De Vos

This work presents a deep transfer learning approach to overcome the channel mismatch problem and transfer knowledge from a large dataset to a small cohort to study automatic sleep staging with single-channel input.

Sleep Staging Transfer Learning

Beyond Equal-Length Snippets: How Long is Sufficient to Recognize an Audio Scene?

no code implementations • 2 Nov 2018 • Huy Phan, Oliver Y. Chén, Philipp Koch, Lam Pham, Ian McLoughlin, Alfred Mertins, Maarten De Vos

Moreover, as model fusion with deep network ensembles is prevalent in audio scene classification, we further study whether, and if so when, model fusion is necessary for this task.

General Classification Scene Classification

SeqSleepNet: End-to-End Hierarchical Recurrent Neural Network for Sequence-to-Sequence Automatic Sleep Staging

2 code implementations • 28 Sep 2018 • Huy Phan, Fernando Andreotti, Navin Cooray, Oliver Y. Chén, Maarten De Vos

At the sequence processing level, a recurrent layer is placed on top of the learned epoch-wise features for long-term modelling of sequential epochs.

General Classification Sleep Staging

Joint Classification and Prediction CNN Framework for Automatic Sleep Stage Classification

1 code implementation • 16 May 2018 • Huy Phan, Fernando Andreotti, Navin Cooray, Oliver Y. Chén, Maarten De Vos

While the proposed framework is orthogonal to the widely adopted classification schemes, which take one or multiple epochs as contextual inputs and produce a single classification decision on the target epoch, we demonstrate its advantages in several ways.

Automatic Sleep Stage Classification Classification +2

Enabling Early Audio Event Detection with Neural Networks

no code implementations • 6 Dec 2017 • Huy Phan, Philipp Koch, Ian McLoughlin, Alfred Mertins

The proposed system consists of a novel inference step coupled with dual parallel tailored-loss deep neural networks (DNNs).

Event Detection

DNN and CNN with Weighted and Multi-task Loss Functions for Audio Event Detection

no code implementations • 10 Aug 2017 • Huy Phan, Martin Krawczyk-Becker, Timo Gerkmann, Alfred Mertins

Our proposed systems significantly outperform the challenge baseline, improving F-score from 72.7% to 90.0% and reducing detection error rate from 0.53 to 0.18 on average on the development data.

Event Detection Task 2

Audio Scene Classification with Deep Recurrent Neural Networks

no code implementations • 14 Mar 2017 • Huy Phan, Philipp Koch, Fabrice Katzberg, Marco Maass, Radoslaw Mazur, Alfred Mertins

We introduce in this work an efficient approach for audio scene classification using deep recurrent neural networks.

Classification General Classification +1

Classifying Variable-Length Audio Files with All-Convolutional Networks and Masked Global Pooling

1 code implementation • 11 Jul 2016 • Lars Hertel, Huy Phan, Alfred Mertins

We trained a deep all-convolutional neural network with masked global pooling to perform single-label classification for acoustic scene classification and multi-label classification for domestic audio tagging in the DCASE-2016 contest.

Acoustic Scene Classification Audio Tagging +5

CaR-FOREST: Joint Classification-Regression Decision Forests for Overlapping Audio Event Detection

no code implementations • 8 Jul 2016 • Huy Phan, Lars Hertel, Marco Maass, Philipp Koch, Alfred Mertins

The regression phase is then carried out to let the positive audio segments vote for the event onsets and offsets, and therefore model the temporal structure of audio events.

Event Detection General Classification +1

CNN-LTE: a Class of 1-X Pooling Convolutional Neural Networks on Label Tree Embeddings for Audio Scene Recognition

no code implementations • 8 Jul 2016 • Huy Phan, Lars Hertel, Marco Maass, Philipp Koch, Alfred Mertins

This category taxonomy is then used in the feature extraction step in which an audio scene instance is represented by a label tree embedding image.

Scene Recognition

Label Tree Embeddings for Acoustic Scene Classification

no code implementations • 25 Jun 2016 • Huy Phan, Lars Hertel, Marco Maass, Philipp Koch, Alfred Mertins

We present in this paper an efficient approach for acoustic scene classification by exploring the structure of class labels.

Acoustic Scene Classification Classification +3

Robust Audio Event Recognition with 1-Max Pooling Convolutional Neural Networks

1 code implementation • 21 Apr 2016 • Huy Phan, Lars Hertel, Marco Maass, Alfred Mertins

We present in this paper a simple, yet efficient convolutional neural network (CNN) architecture for robust audio event recognition.

Comparing Time and Frequency Domain for Audio Event Recognition Using Deep Learning

no code implementations • 18 Mar 2016 • Lars Hertel, Huy Phan, Alfred Mertins

Recognizing acoustic events is an intricate problem for a machine and an emerging field of research.
