no code implementations • 25 Feb 2023 • Wan-Duo Kurt Ma, J. P. Lewis, Avisek Lahiri, Thomas Leung, W. Bastiaan Kleijn
Text-guided diffusion models such as DALLE-2, Imagen, eDiff-I, and Stable Diffusion are able to generate an effectively endless variety of images given only a short text prompt describing the desired image content.
1 code implementation • 26 Jul 2022 • Reuben Tan, Bryan A. Plummer, Kate Saenko, JP Lewis, Avneesh Sud, Thomas Leung
Thus, we explore a novel setting where the goal is to learn a self-supervised visual-language representation that is robust to varying text length and the number of images.
no code implementations • ACL 2021 • Cesar Ilharco, Afsaneh Shirazi, Arjun Gopalan, Arsha Nagrani, Blaz Bratanic, Chris Bregler, Christina Funk, Felipe Ferreira, Gabriel Barcik, Gabriel Ilharco, Georg Osang, Jannis Bulian, Jared Frank, Lucas Smaira, Qin Cao, Ricardo Marino, Roma Patel, Thomas Leung, Vaiva Imbrasaite
How information is created, shared and consumed has changed rapidly in recent decades, in part thanks to new social platforms and technologies on the web.
1 code implementation • 4 Jun 2019 • Grace Chu, Brian Potetz, Weijun Wang, Andrew Howard, Yang song, Fernando Brucher, Thomas Leung, Hartwig Adam
By leveraging geolocation information we improve top-1 accuracy in iNaturalist from 70. 1% to 79. 0% for a strong baseline image-only model.
no code implementations • 1 Aug 2018 • Troy Chinen, Johannes Ballé, Chunhui Gu, Sung Jin Hwang, Sergey Ioffe, Nick Johnston, Thomas Leung, David Minnen, Sean O'Malley, Charles Rosenberg, George Toderici
We present a full reference, perceptual image metric based on VGG-16, an artificial neural network trained on object classification.
1 code implementation • ICML 2018 • Lu Jiang, Zhengyuan Zhou, Thomas Leung, Li-Jia Li, Li Fei-Fei
Recent deep networks are capable of memorizing the entire data even when the labels are completely random.
Ranked #16 on Image Classification on WebVision-1000
no code implementations • CVPR 2016 • Stephan Zheng, Yang song, Thomas Leung, Ian Goodfellow
In this paper we address the issue of output instability of deep neural networks: small perturbations in the visual input can significantly distort the feature embeddings and output of a neural network.
no code implementations • 1 Jul 2015 • Greg Mori, Caroline Pantofaru, Nisarg Kothari, Thomas Leung, George Toderici, Alexander Toshev, Weilong Yang
We present a method for learning an embedding that places images of humans in similar poses nearby.
2 code implementations • CVPR 2015 • Xufeng Han, Thomas Leung, Yangqing Jia, Rahul Sukthankar, Alexander C. Berg
We perform a comprehensive set of experiments on standard datasets to carefully study the contributions of each aspect of MatchNet, with direct comparisons to established methods.
1 code implementation • 2014 IEEE Conference on Computer Vision and Pattern Recognition 2014 • Andrej Karpathy, George Toderici, Sanketh Shetty, Thomas Leung, Rahul Sukthankar, Li Fei-Fei
We further study the generalization performance of our best model by retraining the top layers on the UCF-101 Action Recognition dataset and observe significant performance improvements compared to the UCF-101 baseline model (63. 3% up from 43. 9%).
Ranked #9 on Action Recognition on Sports-1M
6 code implementations • CVPR 2014 • Jiang Wang, Yang song, Thomas Leung, Chuck Rosenberg, Jinbin Wang, James Philbin, Bo Chen, Ying Wu
This paper proposes a deep ranking model that employs deep learning techniques to learn similarity metric directly from images. It has higher learning capability than models based on hand-crafted features.
no code implementations • 17 Dec 2013 • Yunchao Gong, Yangqing Jia, Thomas Leung, Alexander Toshev, Sergey Ioffe
Multilabel image annotation is one of the most important challenges in computer vision with many real-world applications.