Search Results for author: Duy-Kien Nguyen

Found 7 papers, 4 papers with code

R-MAE: Regions Meet Masked Autoencoders

1 code implementation · 8 Jun 2023 · Duy-Kien Nguyen, Vaibhav Aggarwal, Yanghao Li, Martin R. Oswald, Alexander Kirillov, Cees G. M. Snoek, Xinlei Chen

In this work, we explore regions as a potential visual analogue of words for self-supervised image representation learning.

Tasks: Contrastive Learning, Interactive Segmentation, +4

BoxeR: Box-Attention for 2D and 3D Transformers

1 code implementation · CVPR 2022 · Duy-Kien Nguyen, Jihong Ju, Olaf Booij, Martin R. Oswald, Cees G. M. Snoek

Specifically, we present BoxeR, short for Box Transformer, which attends to a set of boxes by predicting their transformation from a reference window on an input feature map.

Tasks: 3D Object Detection, Instance Segmentation, +2
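The box-attention idea described above — attending to a region by predicting how a reference window should be transformed, rather than attending over all locations — can be sketched in a few lines of numpy. This is an illustrative simplification under assumed shapes, not the paper's actual module: the feature map, the linear prediction head `W_head`, and the shift/log-scale parameterization are all hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

H, W, C = 16, 16, 8                 # feature map size (hypothetical)
feat = rng.normal(size=(H, W, C))   # input feature map
query = rng.normal(size=(C,))       # one object query

# Hypothetical linear head predicting a transformation of the reference
# window: (dx, dy) shift plus (dw, dh) log-scale of width and height.
W_head = rng.normal(size=(C, 4)) * 0.1
ref = np.array([4.0, 4.0, 8.0, 8.0])        # reference window: x, y, w, h

dx, dy, dw, dh = W_head.T @ query
box = np.array([ref[0] + dx, ref[1] + dy,
                ref[2] * np.exp(dw), ref[3] * np.exp(dh)])

# "Attend" to the predicted box by average-pooling the features inside it
# (a crude stand-in for the grid sampling a real implementation would use).
x0, y0 = int(max(box[0], 0)), int(max(box[1], 0))
x1, y1 = int(min(box[0] + box[2], W)), int(min(box[1] + box[3], H))
attended = feat[y0:y1, x0:x1].reshape(-1, C).mean(axis=0)
print(attended.shape)
```

The key design point the abstract hints at is that the attention pattern is parameterized by a box transformation, which gives the mechanism a built-in notion of spatial extent that generalizes to 2D and 3D detection.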

MoVie: Revisiting Modulated Convolutions for Visual Counting and Beyond

1 code implementation · ICLR 2021 · Duy-Kien Nguyen, Vedanuj Goswami, Xinlei Chen

This paper focuses on visual counting, which aims to predict the number of occurrences given a natural image and a query (e.g., a question or a category).

Tasks: Object Counting, Question Answering, +1

UR2KiD: Unifying Retrieval, Keypoint Detection, and Keypoint Description without Local Correspondence Supervision

No code implementations · 20 Jan 2020 · Tsun-Yi Yang, Duy-Kien Nguyen, Huub Heijnen, Vassileios Balntas

In this paper, we explore how three related tasks, namely keypoint detection, description, and image retrieval, can be jointly tackled using a single unified framework, which is trained without the need for training data with point-to-point correspondences.

Tasks: Image Retrieval, Keypoint Detection, +1

Multi-task Learning of Hierarchical Vision-Language Representation

No code implementations · CVPR 2019 · Duy-Kien Nguyen, Takayuki Okatani

The representation is hierarchical, and prediction for each task is computed from the representation at its corresponding level of the hierarchy.

Tasks: Multi-Task Learning, Question Answering, +3

Improved Fusion of Visual and Language Representations by Dense Symmetric Co-Attention for Visual Question Answering

1 code implementation · CVPR 2018 · Duy-Kien Nguyen, Takayuki Okatani

A key to visual question answering (VQA) lies in how to fuse the visual and language features extracted from an input image and question.

Tasks: Visual Question Answering
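The "dense symmetric co-attention" in the title means computing an affinity between every image region and every question word, then letting each modality attend over the other using that same affinity matrix. A minimal numpy sketch of that fusion pattern, with all shapes and the concatenation-based fusion chosen for illustration rather than taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(2)

R, T, D = 6, 5, 8               # regions, words, feature dim (hypothetical)
V = rng.normal(size=(R, D))     # visual region features
Q = rng.normal(size=(T, D))     # question word features

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Dense affinity between every region and every word; the same matrix
# drives attention in both directions, which is the "symmetric" part.
A = V @ Q.T / np.sqrt(D)                 # (R, T)

attended_V = softmax(A, axis=0).T @ V    # each word attends over regions -> (T, D)
attended_Q = softmax(A, axis=1) @ Q      # each region attends over words -> (R, D)

# Fuse each stream with what it attended to in the other modality.
fused_V = np.concatenate([V, attended_Q], axis=-1)   # (R, 2D)
fused_Q = np.concatenate([Q, attended_V], axis=-1)   # (T, 2D)
print(fused_V.shape, fused_Q.shape)
```

Because the affinity is dense (every region-word pair) rather than gated or pooled first, the fusion can pick up fine-grained alignments, which is the property the abstract's framing of VQA as a fusion problem is pointing at.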
