Search Results for author: Duc Minh Vo

Found 14 papers, 3 papers with code

Improving the Robustness of 3D Human Pose Estimation: A Benchmark and Learning from Noisy Input

no code implementations • 11 Dec 2023 • Trung-Hieu Hoang, Mona Zehni, Huy Phan, Duc Minh Vo, Minh N. Do

We observe the poor generalization of state-of-the-art 3D pose lifters in the presence of corruption and establish two techniques to tackle this issue.

3D Human Pose Estimation • Data Augmentation

Persistent Test-time Adaptation in Episodic Testing Scenarios

no code implementations • 30 Nov 2023 • Trung-Hieu Hoang, Duc Minh Vo, Minh N. Do

Current test-time adaptation (TTA) approaches aim to adapt to environments that change continuously.

Test-time Adaptation

EVCap: Retrieval-Augmented Image Captioning with External Visual-Name Memory for Open-World Comprehension

no code implementations • 27 Nov 2023 • Jiaxuan Li, Duc Minh Vo, Akihiro Sugimoto, Hideki Nakayama

Instead of relying on large amounts of data and/or scaling up network parameters, we introduce a highly effective retrieval-augmented image captioning method that prompts LLMs with object names retrieved from External Visual-name memory (EVCap).

Image Captioning • Object • +1

Partition-and-Debias: Agnostic Biases Mitigation via A Mixture of Biases-Specific Experts

1 code implementation • ICCV 2023 • Jiaxuan Li, Duc Minh Vo, Hideki Nakayama

However, most of these methods implicitly assume that a given image contains only one type of known or unknown bias, failing to consider the complexities of real-world biases.

Image Classification

Revisiting Latent Space of GAN Inversion for Real Image Editing

no code implementations • 18 Jul 2023 • Kai Katsumata, Duc Minh Vo, Bei Liu, Hideki Nakayama

The exploration of the latent space in StyleGANs and GAN inversion exemplifies impressive real-world image editing, yet the trade-off between reconstruction quality and editing quality remains an open problem.

Soft Curriculum for Learning Conditional GANs with Noisy-Labeled and Uncurated Unlabeled Data

no code implementations • 17 Jul 2023 • Kai Katsumata, Duc Minh Vo, Tatsuya Harada, Hideki Nakayama

Label-noise or curated unlabeled data is used to compensate for the assumption of clean labeled data in training the conditional generative adversarial network; however, satisfying such an extended assumption is occasionally laborious or impractical.

Conditional Image Generation • Generative Adversarial Network

Balancing Reconstruction and Editing Quality of GAN Inversion for Real Image Editing with StyleGAN Prior Latent Space

no code implementations • 31 May 2023 • Kai Katsumata, Duc Minh Vo, Bei Liu, Hideki Nakayama

A-CAP: Anticipation Captioning with Commonsense Knowledge

no code implementations • CVPR 2023 • Duc Minh Vo, Quoc-An Luong, Akihiro Sugimoto, Hideki Nakayama

Humans possess the capacity to reason about the future based on a sparse collection of visual cues acquired over time.

Image Captioning • Language Modelling • +1

StoryER: Automatic Story Evaluation via Ranking, Rating and Reasoning

1 code implementation • 16 Oct 2022 • Hong Chen, Duc Minh Vo, Hiroya Takamura, Yusuke Miyao, Hideki Nakayama

Existing automatic story evaluation methods place a premium on a story's lexical-level coherence, deviating from human preference.

Comment Generation

OSSGAN: Open-Set Semi-Supervised Image Generation

1 code implementation • CVPR 2022 • Kai Katsumata, Duc Minh Vo, Hideki Nakayama

We introduce a challenging training scheme of conditional GANs, called open-set semi-supervised image generation, where the training dataset consists of two parts: (i) labeled data and (ii) unlabeled data with samples belonging to one of the labeled data classes, namely, a closed-set, and samples not belonging to any of the labeled data classes, namely, an open-set.

Image Generation

NOC-REK: Novel Object Captioning with Retrieved Vocabulary from External Knowledge

no code implementations • CVPR 2022 • Duc Minh Vo, Hong Chen, Akihiro Sugimoto, Hideki Nakayama

We propose an end-to-end Novel Object Captioning with Retrieved vocabulary from External Knowledge method (NOC-REK), which simultaneously learns vocabulary retrieval and caption generation, successfully describing novel objects outside of the training dataset.

Caption Generation • Object • +3

PPCD-GAN: Progressive Pruning and Class-Aware Distillation for Large-Scale Conditional GANs Compression

no code implementations • 16 Mar 2022 • Duc Minh Vo, Akihiro Sugimoto, Hideki Nakayama

We push forward neural network compression research by exploiting a novel challenging task of large-scale conditional generative adversarial networks (GANs) compression.

Neural Network Compression

Two-Stream FCNs to Balance Content and Style for Style Transfer

no code implementations • 19 Nov 2019 • Duc Minh Vo, Akihiro Sugimoto

The semantic content feature and the style representation feature are then concatenated adaptively and fed into the decoder to generate style-transferred (stylized) images.

Style Transfer • Vocal Bursts Valence Prediction

Visual-Relation Conscious Image Generation from Structured-Text

no code implementations • ECCV 2020 • Duc Minh Vo, Akihiro Sugimoto

We also use each relation separately to predict, from the initial bounding boxes, relation-units for all the relations in the input text.

Relation • Text-to-Image Generation
