Search Results for author: Minh Tran

Found 28 papers, 11 papers with code

CarcassFormer: An End-to-end Transformer-based Framework for Simultaneous Localization, Segmentation and Classification of Poultry Carcass Defect

no code implementations • 17 Apr 2024 • Minh Tran, Sang Truong, Arthur F. A. Fernandes, Michael T. Kidd, Ngan Le

This study proposes an effective approach for automating the assessment of carcass quality without requiring skilled labor or inspector involvement.

Defect Detection

ShapeFormer: Shape Prior Visible-to-Amodal Transformer-based Amodal Instance Segmentation

no code implementations • 18 Mar 2024 • Minh Tran, Winston Bounsavy, Khoa Vo, Anh Nguyen, Tri Nguyen, Ngan Le

Consequently, this compromised quality of visible features during the subsequent visible-to-amodal transition.

Amodal Instance Segmentation Semantic Segmentation

Dyadic Interaction Modeling for Social Behavior Generation

no code implementations • 14 Mar 2024 • Minh Tran, Di Chang, Maksim Siniukov, Mohammad Soleymani

Hence, an effective model for generating listener nonverbal behaviors requires understanding the dyadic context and interaction.

Contrastive Learning

3FM: Multi-modal Meta-learning for Federated Tasks

1 code implementation • 15 Dec 2023 • Minh Tran, Roochi Shah, Zejun Gong

We present a novel approach in the domain of federated learning (FL), particularly focusing on addressing the challenges posed by modality heterogeneity, variability in modality availability across clients, and the prevalent issue of missing data.

Federated Learning Meta-Learning

SolarFormer: Multi-scale Transformer for Solar PV Profiling

no code implementations • 30 Oct 2023 • Adrian de Luis, Minh Tran, Taisei Hanyu, Anh Tran, Liao Haitao, Roy McCann, Alan Mantooth, Ying Huang, Ngan Le

Accurate mapping of PV installations is crucial for understanding their adoption and informing energy policy.

Privacy-preserving Representation Learning for Speech Understanding

no code implementations • 26 Oct 2023 • Minh Tran, Mohammad Soleymani

In this paper, we present a novel framework to anonymize utterance-level speech embeddings generated by pre-trained encoders and show its effectiveness for a range of speech classification tasks.

Classification Emotion Recognition +5

Open-Fusion: Real-time Open-Vocabulary 3D Mapping and Queryable Scene Representation

1 code implementation • 5 Oct 2023 • Kashu Yamazaki, Taisei Hanyu, Khoa Vo, Thang Pham, Minh Tran, Gianfranco Doretto, Anh Nguyen, Ngan Le

Open-Fusion harnesses the power of a pre-trained vision-language foundation model (VLFM) for open-set semantic comprehension and employs the Truncated Signed Distance Function (TSDF) for swift 3D scene reconstruction.

3D Scene Reconstruction

Personalized Adaptation with Pre-trained Speech Encoders for Continuous Emotion Recognition

no code implementations • 5 Sep 2023 • Minh Tran, Yufeng Yin, Mohammad Soleymani

There are individual differences in expressive behaviors driven by cultural norms and personality.

Speech Emotion Recognition Valence Estimation

LibreFace: An Open-Source Toolkit for Deep Facial Expression Analysis

1 code implementation • 18 Aug 2023 • Di Chang, Yufeng Yin, Zongjian Li, Minh Tran, Mohammad Soleymani

Facial expression analysis is an important tool for human-computer interaction.

Facial Expression Recognition Knowledge Distillation

Representation Learning for Audio Privacy Preservation using Source Separation and Robust Adversarial Learning

no code implementations • 9 Aug 2023 • Diep Luong, Minh Tran, Shayan Gharib, Konstantinos Drossos, Tuomas Virtanen

Privacy preservation has long been a concern in smart acoustic monitoring systems, where speech can be passively recorded along with a target signal in the system's operating environment.

Privacy Preserving Representation Learning

AerialFormer: Multi-resolution Transformer for Aerial Image Segmentation

1 code implementation • 12 Jun 2023 • Kashu Yamazaki, Taisei Hanyu, Minh Tran, Adrian de Luis, Roy McCann, Haitao Liao, Chase Rainwater, Meredith Adkins, Jackson Cothren, Ngan Le

Aerial Image Segmentation is a top-down perspective semantic segmentation and has several challenging characteristics such as strong imbalance in the foreground-background distribution, complex background, intra-class heterogeneity, inter-class homogeneity, and tiny objects.

Ranked #1 on Semantic Segmentation on ISPRS Potsdam

Image Segmentation Segmentation +1

Adversarial Representation Learning for Robust Privacy Preservation in Audio

1 code implementation • 29 Apr 2023 • Shayan Gharib, Minh Tran, Diep Luong, Konstantinos Drossos, Tuomas Virtanen

In this study, we propose a novel adversarial training method for learning representations of audio recordings that effectively prevents the detection of speech activity from the latent features of the recordings.

Event Detection Representation Learning +1

Multi-modal Facial Action Unit Detection with Large Pre-trained Models for the 5th Competition on Affective Behavior Analysis in-the-wild

no code implementations • 19 Mar 2023 • Yufeng Yin, Minh Tran, Di Chang, Xinrui Wang, Mohammad Soleymani

Facial action unit detection has emerged as an important task within facial expression analysis, aimed at detecting specific pre-defined, objective facial expressions, such as lip tightening and cheek raising.

Action Unit Detection Face Alignment +2

An Inception-Residual-Based Architecture with Multi-Objective Loss for Detecting Respiratory Anomalies

no code implementations • 7 Mar 2023 • Dat Ngo, Lam Pham, Huy Phan, Minh Tran, Delaram Jarchi, Sefki Kolozali

Notably, we achieved the Top-1 performance in Task 2-1 and Task 2-2 with the highest Score of 74.5% and 53.9%, respectively.

Task 2

Meta Learning for Few-Shot Medical Text Classification

no code implementations • 3 Dec 2022 • Pankaj Sharma, Imran Qureshi, Minh Tran

We investigate the use of meta-learning and robustness techniques on a broad corpus of benchmark text and medical data.

Meta-Learning text-classification +1

AISFormer: Amodal Instance Segmentation with Transformer

1 code implementation • 12 Oct 2022 • Minh Tran, Khoa Vo, Kashu Yamazaki, Arthur Fernandes, Michael Kidd, Ngan Le

AISFormer explicitly models the complex coherence between occluder, visible, amodal, and invisible masks within an object's regions of interest by treating them as learnable queries.

Amodal Instance Segmentation Segmentation +1

3DConvCaps: 3DUnet with Convolutional Capsule Encoder for Medical Image Segmentation

1 code implementation • 19 May 2022 • Minh Tran, Viet-Khoa Vo-Ho, Ngan T. H. Le

The capsule network is a recent architecture that has achieved better robustness in part-whole representation learning by replacing pooling layers with dynamic routing and convolutional strides, and it has shown promising results on popular tasks such as digit classification and object segmentation.

Hippocampus Image Segmentation +4

Scaling Cross-Domain Content-Based Image Retrieval for E-commerce Snap and Search Application

no code implementations • 13 Apr 2022 • Isaac Kwan Yin Chung, Minh Tran, Eran Nussinovitch

In this industry talk at ECIR 2022, we illustrate how we approach the main challenges from large scale cross-domain content-based image retrieval using a cascade method and a combination of our visual search and classification capabilities.

Content-Based Image Retrieval Retrieval

A Speech Representation Anonymization Framework via Selective Noise Perturbation

1 code implementation • 26 Mar 2022 • Minh Tran, Mohammad Soleymani

Privacy and security are major concerns when communicating speech signals to cloud services such as automatic speech recognition (ASR) and speech emotion recognition (SER).

Automatic Speech Recognition Automatic Speech Recognition (ASR) +7

CapsNet for Medical Image Segmentation

no code implementations • 16 Mar 2022 • Minh Tran, Viet-Khoa Vo-Ho, Kyle Quinn, Hien Nguyen, Khoa Luu, Ngan Le

We then provide recent developments of CapsNet for the task of medical image segmentation.

Image Segmentation Representation Learning +3

A Pre-trained Audio-Visual Transformer for Emotion Recognition

no code implementations • 23 Jan 2022 • Minh Tran, Mohammad Soleymani

In this paper, we introduce a pretrained audio-visual Transformer trained on more than 500k utterances from nearly 4000 celebrities from the VoxCeleb2 dataset for human behavior understanding.

Emotion Classification Emotion Recognition

SS-3DCapsNet: Self-supervised 3D Capsule Networks for Medical Segmentation on Less Labeled Data

no code implementations • 15 Jan 2022 • Minh Tran, Loi Ly, Binh-Son Hua, Ngan Le

The capsule network is a recent deep network architecture that has been applied successfully to medical image segmentation tasks.

Hippocampus Image Segmentation +4

Deep Federated Learning for Autonomous Driving

1 code implementation • 12 Oct 2021 • Anh Nguyen, Tuong Do, Minh Tran, Binh X. Nguyen, Chien Duong, Tu Phan, Erman Tjiputra, Quang D. Tran

We design a new Federated Autonomous Driving network (FADNet) that can improve the model stability, ensure convergence, and handle imbalanced data distribution problems while being trained with federated learning methods.

Autonomous Driving Federated Learning

Modeling Dynamics of Facial Behavior for Mental Health Assessment

1 code implementation • 23 Aug 2021 • Minh Tran, Ellen Bradley, Michelle Matvey, Joshua Woolley, Mohammad Soleymani

Facial action unit (FAU) intensities are popular descriptors for the analysis of facial behavior.

Clustering

Multiple Meta-model Quantifying for Medical Visual Question Answering

2 code implementations • 19 May 2021 • Tuong Do, Binh X. Nguyen, Erman Tjiputra, Minh Tran, Quang D. Tran, Anh Nguyen

However, most of the existing medical VQA methods rely on external data for transfer learning, while the meta-data within the dataset is not fully utilized.

Ranked #5 on Medical Visual Question Answering on PathVQA

Medical Visual Question Answering Meta-Learning +3

Towards A Friendly Online Community: An Unsupervised Style Transfer Framework for Profanity Redaction

no code implementations • COLING 2020 • Minh Tran, YiPeng Zhang, Mohammad Soleymani

Offensive and abusive language is a pressing problem on social media platforms.

Abusive Language Style Transfer

Robust Deep Learning Framework For Predicting Respiratory Anomalies and Diseases

no code implementations • 21 Jan 2020 • Lam Pham, Ian McLoughlin, Huy Phan, Minh Tran, Truc Nguyen, Ramaswamy Palaniappan

This paper presents a robust deep learning framework developed to detect respiratory diseases from recordings of respiratory sounds.

Are you really looking at me? A Feature-Extraction Framework for Estimating Interpersonal Eye Gaze from Conventional Video

no code implementations • 21 Jun 2019 • Minh Tran, Taylan Sen, Kurtis Haut, Mohammad Rafayet Ali, Mohammed Ehsan Hoque

Despite a revolution in the pervasiveness of video cameras in our daily lives, one of the most meaningful forms of nonverbal affective communication, interpersonal eye gaze, i.e., eye gaze relative to a conversation partner, is not available from common video.

Clustering Deception Detection
