no code implementations • 11 Dec 2023 • Trung-Hieu Hoang, Mona Zehni, Huy Phan, Duc Minh Vo, Minh N. Do
We observe the poor generalization of state-of-the-art 3D pose lifters in the presence of corruption and establish two techniques to tackle this issue.
no code implementations • 30 Nov 2023 • Trung-Hieu Hoang, Duc Minh Vo, Minh N. Do
Current test-time adaptation (TTA) approaches aim to adapt to environments that change continuously.
no code implementations • 27 Nov 2023 • Jiaxuan Li, Duc Minh Vo, Akihiro Sugimoto, Hideki Nakayama
Instead of relying on large amounts of data and/or scaling up network parameters, we introduce a highly effective retrieval-augmented image captioning method that prompts LLMs with object names retrieved from External Visual-name memory (EVCap).
1 code implementation • ICCV 2023 • Jiaxuan Li, Duc Minh Vo, Hideki Nakayama
However, most of these methods implicitly assume that a given image contains only one type of known or unknown bias, failing to consider the complexities of real-world biases.
no code implementations • 18 Jul 2023 • Kai Katsumata, Duc Minh Vo, Bei Liu, Hideki Nakayama
The exploration of the latent space in StyleGANs and GAN inversion exemplify impressive real-world image editing, yet the trade-off between reconstruction quality and editing quality remains an open problem.
no code implementations • 17 Jul 2023 • Kai Katsumata, Duc Minh Vo, Tatsuya Harada, Hideki Nakayama
Label-noise or curated unlabeled data is used to compensate for the assumption of clean labeled data in training the conditional generative adversarial network; however, satisfying such an extended assumption is occasionally laborious or impractical.
no code implementations • 31 May 2023 • Kai Katsumata, Duc Minh Vo, Bei Liu, Hideki Nakayama
The exploration of the latent space in StyleGANs and GAN inversion exemplify impressive real-world image editing, yet the trade-off between reconstruction quality and editing quality remains an open problem.
no code implementations • CVPR 2023 • Duc Minh Vo, Quoc-An Luong, Akihiro Sugimoto, Hideki Nakayama
Humans possess the capacity to reason about the future based on a sparse collection of visual cues acquired over time.
1 code implementation • 16 Oct 2022 • Hong Chen, Duc Minh Vo, Hiroya Takamura, Yusuke Miyao, Hideki Nakayama
Existing automatic story evaluation methods place a premium on lexical-level story coherence, deviating from human preference.
1 code implementation • CVPR 2022 • Kai Katsumata, Duc Minh Vo, Hideki Nakayama
We introduce a challenging training scheme of conditional GANs, called open-set semi-supervised image generation, where the training dataset consists of two parts: (i) labeled data and (ii) unlabeled data with samples belonging to one of the labeled data classes, namely, a closed-set, and samples not belonging to any of the labeled data classes, namely, an open-set.
no code implementations • CVPR 2022 • Duc Minh Vo, Hong Chen, Akihiro Sugimoto, Hideki Nakayama
We propose an end-to-end Novel Object Captioning with Retrieved vocabulary from External Knowledge method (NOC-REK), which simultaneously learns vocabulary retrieval and caption generation, successfully describing novel objects outside of the training dataset.
no code implementations • 16 Mar 2022 • Duc Minh Vo, Akihiro Sugimoto, Hideki Nakayama
We push forward neural network compression research by exploiting a novel challenging task of large-scale conditional generative adversarial networks (GANs) compression.
no code implementations • 19 Nov 2019 • Duc Minh Vo, Akihiro Sugimoto
The semantic content feature and the style representation feature are then concatenated adaptively and fed into the decoder to generate style-transferred (stylized) images.
no code implementations • ECCV 2020 • Duc Minh Vo, Akihiro Sugimoto
We also use each relation separately to predict, from the initial bounding boxes, relation-units for all the relations in the input text.