Search Results for author: Duc Minh Vo

Found 14 papers, 3 papers with code

Improving the Robustness of 3D Human Pose Estimation: A Benchmark and Learning from Noisy Input

no code implementations 11 Dec 2023 Trung-Hieu Hoang, Mona Zehni, Huy Phan, Duc Minh Vo, Minh N. Do

We observe the poor generalization of state-of-the-art 3D pose lifters in the presence of corruption and establish two techniques to tackle this issue.

3D Human Pose Estimation Data Augmentation

Persistent Test-time Adaptation in Episodic Testing Scenarios

no code implementations 30 Nov 2023 Trung-Hieu Hoang, Duc Minh Vo, Minh N. Do

Current test-time adaptation (TTA) approaches aim to adapt to environments that change continuously.

Test-time Adaptation

EVCap: Retrieval-Augmented Image Captioning with External Visual-Name Memory for Open-World Comprehension

no code implementations 27 Nov 2023 Jiaxuan Li, Duc Minh Vo, Akihiro Sugimoto, Hideki Nakayama

Instead of relying on large amounts of data and/or scaling up network parameters, we introduce a highly effective retrieval-augmented image captioning method that prompts LLMs with object names retrieved from External Visual-name memory (EVCap).

Image Captioning Object +1

Partition-and-Debias: Agnostic Biases Mitigation via A Mixture of Biases-Specific Experts

1 code implementation ICCV 2023 Jiaxuan Li, Duc Minh Vo, Hideki Nakayama

However, most of these methods implicitly assume that a given image contains only one type of known or unknown bias, failing to consider the complexities of real-world biases.

Image Classification

Revisiting Latent Space of GAN Inversion for Real Image Editing

no code implementations 18 Jul 2023 Kai Katsumata, Duc Minh Vo, Bei Liu, Hideki Nakayama

The exploration of the latent space in StyleGANs and GAN inversion exemplifies impressive real-world image editing, yet the trade-off between reconstruction quality and editing quality remains an open problem.

Soft Curriculum for Learning Conditional GANs with Noisy-Labeled and Uncurated Unlabeled Data

no code implementations 17 Jul 2023 Kai Katsumata, Duc Minh Vo, Tatsuya Harada, Hideki Nakayama

Label-noise or curated unlabeled data is used to compensate for the assumption of clean labeled data in training the conditional generative adversarial network; however, satisfying such an extended assumption is occasionally laborious or impractical.

Conditional Image Generation Generative Adversarial Network

Balancing Reconstruction and Editing Quality of GAN Inversion for Real Image Editing with StyleGAN Prior Latent Space

no code implementations 31 May 2023 Kai Katsumata, Duc Minh Vo, Bei Liu, Hideki Nakayama

The exploration of the latent space in StyleGANs and GAN inversion exemplifies impressive real-world image editing, yet the trade-off between reconstruction quality and editing quality remains an open problem.

StoryER: Automatic Story Evaluation via Ranking, Rating and Reasoning

1 code implementation 16 Oct 2022 Hong Chen, Duc Minh Vo, Hiroya Takamura, Yusuke Miyao, Hideki Nakayama

Existing automatic story evaluation methods place a premium on story lexical level coherence, deviating from human preference.

Comment Generation

OSSGAN: Open-Set Semi-Supervised Image Generation

1 code implementation CVPR 2022 Kai Katsumata, Duc Minh Vo, Hideki Nakayama

We introduce a challenging training scheme of conditional GANs, called open-set semi-supervised image generation, where the training dataset consists of two parts: (i) labeled data and (ii) unlabeled data with samples belonging to one of the labeled data classes, namely, a closed-set, and samples not belonging to any of the labeled data classes, namely, an open-set.

Image Generation

NOC-REK: Novel Object Captioning with Retrieved Vocabulary from External Knowledge

no code implementations CVPR 2022 Duc Minh Vo, Hong Chen, Akihiro Sugimoto, Hideki Nakayama

We propose an end-to-end Novel Object Captioning with Retrieved vocabulary from External Knowledge method (NOC-REK), which simultaneously learns vocabulary retrieval and caption generation, successfully describing novel objects outside of the training dataset.

Caption Generation Object +3

PPCD-GAN: Progressive Pruning and Class-Aware Distillation for Large-Scale Conditional GANs Compression

no code implementations 16 Mar 2022 Duc Minh Vo, Akihiro Sugimoto, Hideki Nakayama

We push forward neural network compression research by exploiting a novel challenging task of large-scale conditional generative adversarial networks (GANs) compression.

Neural Network Compression

Two-Stream FCNs to Balance Content and Style for Style Transfer

no code implementations 19 Nov 2019 Duc Minh Vo, Akihiro Sugimoto

The semantic content feature and the style representation feature are then concatenated adaptively and fed into the decoder to generate style-transferred (stylized) images.

Style Transfer Vocal Bursts Valence Prediction

Visual-Relation Conscious Image Generation from Structured-Text

no code implementations ECCV 2020 Duc Minh Vo, Akihiro Sugimoto

We also use each relation separately to predict, from the initial bounding boxes, relation-units for all the relations in the input text.

Relation Text-to-Image Generation
