Search Results for author: Dong Huk Park

Found 12 papers, 5 papers with code

Billion-Scale Pretraining with Vision Transformers for Multi-Task Visual Representations

no code implementations • 12 Aug 2021 • Josh Beal, Hao-Yu Wu, Dong Huk Park, Andrew Zhai, Dmitry Kislyuk

Large-scale pretraining of visual representations has led to state-of-the-art performance on a range of benchmark computer vision tasks, yet the benefits of these techniques at extreme scale in complex production systems have been relatively unexplored.

Computer Vision Multi-Task Learning +1

Novelty Detection with Rotated Contrastive Predictive Coding

no code implementations • 1 Jan 2021 • Dong Huk Park, Trevor Darrell

To this end, reconstruction-based learning is often used, in which the normality of an observation is expressed by how well it can be reconstructed.

Contrastive Learning
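
The excerpt above refers to the standard reconstruction-based baseline: a model trained only on normal data reconstructs normal inputs well and novel inputs poorly, so reconstruction error serves as a novelty score. Below is a minimal sketch of that baseline (not the paper's rotated contrastive predictive coding method), assuming a toy PyTorch autoencoder with illustrative layer sizes:

```python
import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    """Toy autoencoder; layer sizes are illustrative placeholders."""
    def __init__(self, dim=784, latent=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(dim, 128), nn.ReLU(), nn.Linear(128, latent))
        self.decoder = nn.Sequential(nn.Linear(latent, 128), nn.ReLU(), nn.Linear(128, dim))

    def forward(self, x):
        return self.decoder(self.encoder(x))

def novelty_score(model, x):
    """Per-sample reconstruction error: higher error suggests a more novel input."""
    with torch.no_grad():
        recon = model(x)
    return ((x - recon) ** 2).mean(dim=1)

scores = novelty_score(AutoEncoder(), torch.randn(8, 784))  # shape (8,) anomaly scores
```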

Toward Transformer-Based Object Detection

no code implementations • 17 Dec 2020 • Josh Beal, Eric Kim, Eric Tzeng, Dong Huk Park, Andrew Zhai, Dmitry Kislyuk

The Vision Transformer was the first major attempt to apply a pure transformer model directly to images as input, demonstrating that, compared to convolutional networks, transformer-based architectures can achieve competitive results on benchmark classification tasks.

Natural Language Processing Object Detection +1
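
The excerpt above states the Vision Transformer's core idea: an image is split into fixed-size patches that are linearly projected into a token sequence for a standard transformer. A minimal sketch of that tokenization step, assuming PyTorch; the dimensions are the common ViT-Base defaults, not necessarily this paper's configuration:

```python
import torch
import torch.nn as nn

class PatchEmbed(nn.Module):
    """Split an image into non-overlapping patches and project each to a token."""
    def __init__(self, img_size=224, patch_size=16, in_chans=3, embed_dim=768):
        super().__init__()
        self.num_patches = (img_size // patch_size) ** 2
        # A strided convolution is equivalent to patch extraction + shared linear projection.
        self.proj = nn.Conv2d(in_chans, embed_dim, kernel_size=patch_size, stride=patch_size)

    def forward(self, x):                    # x: (B, 3, 224, 224)
        x = self.proj(x)                     # (B, 768, 14, 14)
        return x.flatten(2).transpose(1, 2)  # (B, 196, 768) token sequence

tokens = PatchEmbed()(torch.randn(1, 3, 224, 224))
print(tokens.shape)  # torch.Size([1, 196, 768])
```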

Learning a Unified Embedding for Visual Search at Pinterest

no code implementations • 5 Aug 2019 • Andrew Zhai, Hao-Yu Wu, Eric Tzeng, Dong Huk Park, Charles Rosenberg

The solution we present not only allows us to train for multiple application objectives in a single deep neural network architecture, but also takes advantage of correlated information across the combined training data from all applications to generate a unified embedding that outperforms the specialized embeddings previously deployed for each product.

Metric Learning Recommendation Systems
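
The excerpt above describes training one network against several application objectives to produce a single shared embedding. A minimal sketch of that multi-task pattern, assuming PyTorch with a shared trunk and one classification head per task; the backbone, task names, and losses below are illustrative placeholders, not the paper's production architecture:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class UnifiedEmbeddingModel(nn.Module):
    """One shared embedding, with a separate classification head per task."""
    def __init__(self, backbone, embed_dim, task_classes):
        super().__init__()
        self.backbone = backbone  # any trunk that outputs embed_dim features
        self.heads = nn.ModuleDict(
            {task: nn.Linear(embed_dim, n) for task, n in task_classes.items()}
        )

    def forward(self, x, task):
        emb = F.normalize(self.backbone(x), dim=1)  # the unified embedding
        return emb, self.heads[task](emb)           # plus task-specific logits

# Illustrative usage: a toy trunk and hypothetical task names/class counts.
model = UnifiedEmbeddingModel(
    nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128)),
    embed_dim=128, task_classes={"task_a": 10, "task_b": 7},
)
x, y = torch.randn(4, 3, 32, 32), torch.randint(0, 10, (4,))
emb, logits = model(x, "task_a")
loss = F.cross_entropy(logits, y)  # per-task losses would be combined in training
```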

Robust Change Captioning

1 code implementation • ICCV 2019 • Dong Huk Park, Trevor Darrell, Anna Rohrbach

We present a novel Dual Dynamic Attention Model (DUDA) to perform robust Change Captioning.

Natural Language Visual Grounding
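
The excerpt above names the model but not its mechanics. As a loosely related illustration only, here is a hypothetical sketch of attending over paired "before"/"after" feature maps via their difference; every layer and name here is an assumption, not the paper's DUDA architecture:

```python
import torch
import torch.nn as nn

class DualAttention(nn.Module):
    """Hypothetical: attend over 'before'/'after' feature maps using their difference."""
    def __init__(self, feat_dim=256):
        super().__init__()
        self.score = nn.Conv2d(feat_dim, 1, kernel_size=1)  # per-location attention logit

    def forward(self, feat_before, feat_after):  # each: (B, C, H, W)
        diff = feat_after - feat_before          # coarse change signal
        a_bef = torch.softmax(self.score(feat_before + diff).flatten(2), dim=-1)
        a_aft = torch.softmax(self.score(feat_after + diff).flatten(2), dim=-1)

        def pool(feat, attn):                    # attention-weighted pooling -> (B, C)
            return (feat.flatten(2) * attn).sum(-1)

        # One pooled vector per image, e.g. as input to a captioning decoder.
        return pool(feat_before, a_bef), pool(feat_after, a_aft)
```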

Attentive Explanations: Justifying Decisions and Pointing to the Evidence

no code implementations • 14 Dec 2016 • Dong Huk Park, Lisa Anne Hendricks, Zeynep Akata, Bernt Schiele, Trevor Darrell, Marcus Rohrbach

In contrast, humans can justify their decisions with natural language and point to the evidence in the visual world that led to those decisions.

Decision Making Question Answering +1
