Search Results for author: Duy-Kien Nguyen

Found 7 papers, 4 papers with code

R-MAE: Regions Meet Masked Autoencoders

1 code implementation · 8 Jun 2023 · Duy-Kien Nguyen, Vaibhav Aggarwal, Yanghao Li, Martin R. Oswald, Alexander Kirillov, Cees G. M. Snoek, Xinlei Chen

In this work, we explore regions as a potential visual analogue of words for self-supervised image representation learning.

Tasks: Contrastive Learning, Interactive Segmentation, +4

BoxeR: Box-Attention for 2D and 3D Transformers

1 code implementation · CVPR 2022 · Duy-Kien Nguyen, Jihong Ju, Olaf Booij, Martin R. Oswald, Cees G. M. Snoek

Specifically, we present BoxeR, short for Box Transformer, which attends to a set of boxes by predicting their transformation from a reference window on an input feature map.

Tasks: 3D Object Detection, Instance Segmentation, +2
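The box-attention idea described above — attending to a region by predicting how a reference window should be transformed, rather than attending over all locations — can be sketched in a few lines of numpy. This is an illustrative simplification under assumed shapes, not the paper's actual module: the feature map, the linear prediction head `W_head`, and the shift/log-scale parameterization are all hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

H, W, C = 16, 16, 8                 # feature map size (hypothetical)
feat = rng.normal(size=(H, W, C))   # input feature map
query = rng.normal(size=(C,))       # one object query

# Hypothetical linear head predicting a transformation of the reference
# window: (dx, dy) shift plus (dw, dh) log-scale of width and height.
W_head = rng.normal(size=(C, 4)) * 0.1
ref = np.array([4.0, 4.0, 8.0, 8.0])        # reference window: x, y, w, h

dx, dy, dw, dh = W_head.T @ query
box = np.array([ref[0] + dx, ref[1] + dy,
                ref[2] * np.exp(dw), ref[3] * np.exp(dh)])

# "Attend" to the predicted box by average-pooling the features inside it
# (a crude stand-in for the grid sampling a real implementation would use).
x0, y0 = int(max(box[0], 0)), int(max(box[1], 0))
x1, y1 = int(min(box[0] + box[2], W)), int(min(box[1] + box[3], H))
attended = feat[y0:y1, x0:x1].reshape(-1, C).mean(axis=0)
print(attended.shape)
```

The key design point the abstract hints at is that the attention pattern is parameterized by a box transformation, which gives the mechanism a built-in notion of spatial extent that generalizes to 2D and 3D detection.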

MoVie: Revisiting Modulated Convolutions for Visual Counting and Beyond

1 code implementation · ICLR 2021 · Duy-Kien Nguyen, Vedanuj Goswami, Xinlei Chen

This paper focuses on visual counting, which aims to predict the number of occurrences given a natural image and a query (e.g., a question or a category).

Tasks: Object Counting, Question Answering, +1

UR2KiD: Unifying Retrieval, Keypoint Detection, and Keypoint Description without Local Correspondence Supervision

No code implementations · 20 Jan 2020 · Tsun-Yi Yang, Duy-Kien Nguyen, Huub Heijnen, Vassileios Balntas

In this paper, we explore how three related tasks, namely keypoint detection, description, and image retrieval, can be jointly tackled using a single unified framework, which is trained without the need for training data with point-to-point correspondences.

Tasks: Image Retrieval, Keypoint Detection, +1

Multi-task Learning of Hierarchical Vision-Language Representation

No code implementations · CVPR 2019 · Duy-Kien Nguyen, Takayuki Okatani

The representation is hierarchical, and prediction for each task is computed from the representation at its corresponding level of the hierarchy.

Tasks: Multi-Task Learning, Question Answering, +3

Improved Fusion of Visual and Language Representations by Dense Symmetric Co-Attention for Visual Question Answering

1 code implementation · CVPR 2018 · Duy-Kien Nguyen, Takayuki Okatani

A key to visual question answering (VQA) lies in how to fuse the visual and language features extracted from an input image and question.

Tasks: Visual Question Answering
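The "dense symmetric co-attention" in the title means computing an affinity between every image region and every question word, then letting each modality attend over the other using that same affinity matrix. A minimal numpy sketch of that fusion pattern, with all shapes and the concatenation-based fusion chosen for illustration rather than taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(2)

R, T, D = 6, 5, 8               # regions, words, feature dim (hypothetical)
V = rng.normal(size=(R, D))     # visual region features
Q = rng.normal(size=(T, D))     # question word features

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Dense affinity between every region and every word; the same matrix
# drives attention in both directions, which is the "symmetric" part.
A = V @ Q.T / np.sqrt(D)                 # (R, T)

attended_V = softmax(A, axis=0).T @ V    # each word attends over regions -> (T, D)
attended_Q = softmax(A, axis=1) @ Q      # each region attends over words -> (R, D)

# Fuse each stream with what it attended to in the other modality.
fused_V = np.concatenate([V, attended_Q], axis=-1)   # (R, 2D)
fused_Q = np.concatenate([Q, attended_V], axis=-1)   # (T, 2D)
print(fused_V.shape, fused_Q.shape)
```

Because the affinity is dense (every region-word pair) rather than gated or pooled first, the fusion can pick up fine-grained alignments, which is the property the abstract's framing of VQA as a fusion problem is pointing at.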
