no code implementations • 29 Oct 2024 • Zhaochong An, Guolei Sun, Yun Liu, Runjia Li, Min Wu, Ming-Ming Cheng, Ender Konukoglu, Serge Belongie
Few-shot 3D point cloud segmentation (FS-PCS) aims at generalizing models to segment novel categories with minimal annotated support samples.
Few-shot 3D Point Cloud Semantic Segmentation Point Cloud Segmentation +1
no code implementations • 10 Oct 2024 • Ching Lam Choi, Alexandre Duplessis, Serge Belongie
Gradient-based interpretations often require an anchor point of comparison to avoid saturation in computing feature importance.
no code implementations • 9 Oct 2024 • Jona Ruthardt, Gertjan J. Burghouts, Serge Belongie, Yuki M. Asano
To this end, we propose the Visual Text Representation Benchmark (ViTeRB) to isolate key properties that make language models well-aligned with the visual world.
1 code implementation • 27 Jun 2024 • Dustin Wright, Arnav Arora, Nadav Borenstein, Srishti Yadav, Serge Belongie, Isabelle Augenstein
For fine-grained analysis, we propose to identify tropes in the responses: semantically similar phrases that are recurrent and consistent across different prompts, revealing natural patterns in the text that a given LLM is prone to produce.
no code implementations • 7 Jun 2024 • Bingchen Zhao, Nico Lang, Serge Belongie, Oisin Mac Aodha
The labeled data provides guidance during training by indicating what types of visual properties and features are relevant for performing discovery in the unlabeled data.
1 code implementation • 6 Jun 2024 • Sebastian Loeschcke, Dan Wang, Christian Leth-Espensen, Serge Belongie, Michael J. Kastoryano, Sagie Benaim
This has prevented practitioners from deploying the full potential of tensor networks for visual data.
1 code implementation • 26 May 2024 • Sebastian Loeschcke, Mads Toftrup, Michael J. Kastoryano, Serge Belongie, Vésteinn Snæbjarnarson
Despite advances using low-rank adapters and quantization, pretraining of large models on consumer hardware has not been possible without model sharding, offloading during training, or per-layer gradient updates.
2 code implementations • 4 May 2024 • Vishal Nedungadi, Ankit Kariryaa, Stefan Oehmcke, Serge Belongie, Christian Igel, Nico Lang
We find that pretraining with multi-modal pretext tasks notably improves the linear probing performance compared to pretraining on optical satellite images only.
1 code implementation • CVPR 2024 • Zhaochong An, Guolei Sun, Yun Liu, Fayao Liu, Zongwei Wu, Dan Wang, Luc van Gool, Serge Belongie
The former arises from non-uniform point sampling, allowing models to distinguish the density disparities between foreground and background for easier segmentation.
Few-shot 3D Point Cloud Semantic Segmentation Segmentation +1
no code implementations • CVPR 2024 • Nico Lang, Vésteinn Snæbjarnarson, Elijah Cole, Oisin Mac Aodha, Christian Igel, Serge Belongie
Depending on how visually similar a test example is to the training categories, the OSR task can be easy or extremely challenging.
no code implementations • 8 Nov 2023 • Philip Enevoldsen, Christian Gundersen, Nico Lang, Serge Belongie, Christian Igel
Open-set recognition (OSR), the identification of novel categories, can be a critical component when deploying classification models in real-world applications.
no code implementations • 19 Sep 2023 • Peter Ebert Christensen, Srishti Yadav, Serge Belongie
Unsupported and unfalsifiable claims we encounter in our daily lives can influence our view of the world.
1 code implementation • NeurIPS 2023 • Thoranna Bender, Simon Moe Sørensen, Alireza Kashani, K. Eldjarn Hjorleifsson, Grethe Hyldig, Søren Hauberg, Serge Belongie, Frederik Warburg
We demonstrate that this shared concept embedding space improves upon separate embedding spaces for coarse flavor classification (alcohol percentage, country, grape, price, rating) and aligns with the intricate human perception of flavor.
no code implementations • 3 May 2023 • Mengyun Shi, Claire Cardie, Serge Belongie
Do ads from other domains infer their fashion taste as well?
no code implementations • 3 May 2023 • Mengyun Shi, Serge Belongie, Claire Cardie
Existing fashion datasets do not consider the multi-facts that cause a consumer to like or dislike a fashion image.
1 code implementation • ICCV 2023 • Idan Schwartz, Vésteinn Snæbjarnarson, Hila Chefer, Ryan Cotterell, Serge Belongie, Lior Wolf, Sagie Benaim
This approach has two disadvantages: (i) supervised datasets are generally small compared to large-scale scraped text-image datasets on which text-to-image models are trained, affecting the quality and diversity of the generated images, or (ii) the input is a hard-coded label, as opposed to free-form text, limiting the control over the generated images.
1 code implementation • 9 Feb 2023 • Guandao Yang, Sagie Benaim, Varun Jampani, Kyle Genova, Jonathan T. Barron, Thomas Funkhouser, Bharath Hariharan, Serge Belongie
We use this framework to design Fourier PNFs, which match state-of-the-art performance in signal representation tasks that use neural fields.
no code implementations • 20 Dec 2022 • Boyi Li, Rodolfo Corona, Karttikeya Mangalam, Catherine Chen, Daniel Flaherty, Serge Belongie, Kilian Q. Weinberger, Jitendra Malik, Trevor Darrell, Dan Klein
Are multimodal inputs necessary for grammar induction?
2 code implementations • 28 Nov 2022 • Kevin Musgrave, Serge Belongie, Ser-Nam Lim
PyTorch Adapt is a library for domain adaptation, a type of machine learning algorithm that re-purposes existing models to work in new domains.
no code implementations • 17 Nov 2022 • Peter Ebert Christensen, Vésteinn Snæbjarnarson, Andrea Dittadi, Serge Belongie, Sagie Benaim
We demonstrate that APT is capable of a wide range of class-preserving semantic image manipulations that fool a variety of pretrained classifiers.
1 code implementation • 19 Aug 2022 • Peter Ebert Christensen, Frederik Warburg, Menglin Jia, Serge Belongie
In this work, we aim to distill such posts into a small set of narratives that capture the essential claims related to a given topic.
1 code implementation • 15 Aug 2022 • Kevin Musgrave, Serge Belongie, Ser-Nam Lim
In a supervised setting, these validators evaluate checkpoints by computing accuracy on a validation set that has labels.
1 code implementation • 21 Jul 2022 • Grant van Horn, Rui Qian, Kimberly Wilber, Hartwig Adam, Oisin Mac Aodha, Serge Belongie
We thoroughly benchmark audiovisual classification performance and modality fusion experiments through the use of state-of-the-art transformer methods.
1 code implementation • 20 Jul 2022 • Elijah Cole, Kimberly Wilber, Grant van Horn, Xuan Yang, Marco Fornoni, Pietro Perona, Serge Belongie, Andrew Howard, Oisin Mac Aodha
Weakly supervised object localization (WSOL) aims to learn representations that encode object location using only image-level category labels.
no code implementations • 15 Jul 2022 • Rui Qian, Yeqing Li, Zheng Xu, Ming-Hsuan Yang, Serge Belongie, Yin Cui
Utilizing vision and language models (VLMs) pre-trained on large-scale image-text pairs is becoming a promising paradigm for open-vocabulary visual recognition.
Ranked #1 on Zero-Shot Action Recognition on HMDB51
no code implementations • 24 Jun 2022 • Sebastian Loeschcke, Serge Belongie, Sagie Benaim
The first target text prompt describes the global semantics and the second target text prompt describes the local semantics.
no code implementations • 6 Jun 2022 • Sagie Benaim, Frederik Warburg, Peter Ebert Christensen, Serge Belongie
To this end, we propose a volumetric framework for (i) disentangling, or separating, the volumetric representation of a given foreground object from the background, and (ii) semantically manipulating the foreground object, as well as the background.
6 code implementations • 23 Mar 2022 • Menglin Jia, Luming Tang, Bor-Chun Chen, Claire Cardie, Serge Belongie, Bharath Hariharan, Ser-Nam Lim
The current modus operandi in adapting pre-trained models involves updating all the backbone parameters, i.e., full fine-tuning.
Ranked #2 on Prompt Engineering on ImageNet-21k
no code implementations • 8 Feb 2022 • Flora Yu Shen, Katie Luo, Guandao Yang, Harald Haraldsson, Serge Belongie
In this work, we address an important problem of optical see-through (OST) augmented reality: non-negative image synthesis.
1 code implementation • CVPR 2021 • Katie Luo, Guandao Yang, Wenqi Xian, Harald Haraldsson, Bharath Hariharan, Serge Belongie
In applications such as optical see-through and projector augmented reality, producing images amounts to solving non-negative image generation, where one can only add light to an existing image.
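The additive-light constraint described above can be illustrated with a minimal sketch. It assumes pixel intensities in [0, 1] stored as flat lists, and it uses a naive clamp of the target-minus-background difference; this is a simple baseline for illustration, not the method of the paper. The function names are hypothetical.

```python
def additive_residual(background, target):
    """Light to add per pixel; negative differences are clamped to zero
    because an additive display cannot remove light from the scene."""
    return [max(t - b, 0.0) for b, t in zip(background, target)]


def composite(background, residual):
    """What the viewer sees: background plus added light, capped at 1.0."""
    return [min(b + r, 1.0) for b, r in zip(background, residual)]


# Hypothetical intensities for three pixels.
background = [0.2, 0.8, 0.5]
target = [0.6, 0.3, 0.5]

residual = additive_residual(background, target)
seen = composite(background, residual)
```

The second pixel shows the difficulty: the target is darker than the background, so the naive baseline cannot reach it and the pixel stays at the background value.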
1 code implementation • ICLR 2022 • Boyi Li, Kilian Q. Weinberger, Serge Belongie, Vladlen Koltun, René Ranftl
We present LSeg, a novel model for language-driven semantic image segmentation.
Ranked #1 on Few-Shot Semantic Segmentation on FSS-1000
1 code implementation • 15 Dec 2021 • Menglin Jia, Bor-Chun Chen, Zuxuan Wu, Claire Cardie, Serge Belongie, Ser-Nam Lim
In this paper, we investigate k-Nearest-Neighbor (k-NN) classifiers, a classical model-free learning method from the pre-deep learning era, as an augmentation to modern neural network based approaches.
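For readers unfamiliar with the classical method, a minimal k-NN classifier can be sketched as below. It assumes feature vectors are plain lists of floats (for instance, embeddings taken from a network), Euclidean distance, and majority voting; how the paper combines k-NN with a neural network is not shown here, and the function name is hypothetical.

```python
import math
from collections import Counter


def knn_predict(query, features, labels, k=3):
    """Return the majority label among the k stored features nearest to query."""
    # Rank stored examples by Euclidean distance to the query.
    order = sorted(range(len(features)), key=lambda i: math.dist(query, features[i]))
    # Count labels among the k nearest neighbours.
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical 2-D features with two classes.
features = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]]
labels = ["cat", "cat", "dog", "dog"]

near_origin = knn_predict([0.2, 0.3], features, labels, k=3)
near_far_cluster = knn_predict([5.5, 5.0], features, labels, k=3)
```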
no code implementations • 8 Dec 2021 • Rui Qian, Yeqing Li, Liangzhe Yuan, Boqing Gong, Ting Liu, Matthew Brown, Serge Belongie, Ming-Hsuan Yang, Hartwig Adam, Yin Cui
The training objective consists of two parts: a fine-grained temporal learning objective to maximize the similarity between corresponding temporal embeddings in the short clip and the long clip, and a persistent temporal learning objective to pull together global embeddings of the two clips.
1 code implementation • NeurIPS 2021 • Guandao Yang, Serge Belongie, Bharath Hariharan, Vladlen Koltun
Most existing geometry processing algorithms use meshes as the default shape representation.
no code implementations • 30 Nov 2021 • Kevin Musgrave, Serge Belongie, Ser-Nam Lim
Interest in unsupervised domain adaptation (UDA) has surged in recent years, resulting in a plethora of new algorithms.
no code implementations • 15 Nov 2021 • Jiyang Qi, Yan Gao, Yao Hu, Xinggang Wang, Xiaoyu Liu, Xiang Bai, Serge Belongie, Alan Yuille, Philip H. S. Torr, Song Bai
To promote the development of occlusion understanding, we collect a large-scale dataset called OVIS for video instance segmentation in the occluded scenario.
no code implementations • 11 Nov 2021 • Xiu-Shen Wei, Yi-Zhe Song, Oisin Mac Aodha, Jianxin Wu, Yuxin Peng, Jinhui Tang, Jian Yang, Serge Belongie
Fine-grained image analysis (FGIA) is a longstanding and fundamental problem in computer vision and pattern recognition, and underpins a diverse set of real-world applications.
no code implementations • 29 Sep 2021 • Ryan Y Benmalek, Sabhya Chhabria, Pedro O. Pinheiro, Claire Cardie, Serge Belongie
These models outperform both previous work and static models under both static and continual semantic shifts, suggesting that "learning to adapt" is a useful capability for models and agents in a changing world.
no code implementations • ICCV 2021 • Omid Poursaeed, Tianxing Jiang, Harry Yang, Serge Belongie, Ser-Nam Lim
Adversarial training with these examples enables the model to withstand a wide range of attacks by observing a variety of input alterations during training.
1 code implementation • 30 Aug 2021 • Gui-Song Xia, Jian Ding, Ming Qian, Nan Xue, Jiaming Han, Xiang Bai, Michael Ying Yang, Shengyang Li, Serge Belongie, Jiebo Luo, Mihai Datcu, Marcello Pelillo, Liangpei Zhang, Qiang Zhou, Chao-hui Yu, Kaixuan Hu, Yingjia Bu, Wenming Tan, Zhe Yang, Wei Li, Shang Liu, Jiaxuan Zhao, Tianzhi Ma, Zi-han Gao, Lingqi Wang, Yi Zuo, Licheng Jiao, Chang Meng, Hao Wang, Jiahao Wang, Yiming Hui, Zhuojun Dong, Jie Zhang, Qianyue Bao, Zixiao Zhang, Fang Liu
This report summarizes the results of the Learning to Understand Aerial Images (LUAI) 2021 challenge held at ICCV 2021, which focuses on object detection and semantic segmentation in aerial images.
2 code implementations • 25 Jun 2021 • Boyi Li, Yin Cui, Tsung-Yi Lin, Serge Belongie
In this paper, we propose and explore the problem of image translation for data augmentation.
1 code implementation • 28 May 2021 • Riccardo de Lutio, Damon Little, Barbara Ambrose, Serge Belongie
Herbarium sheets present a unique view of the world's botanical history, evolution, and diversity.
no code implementations • CVPR 2022 • Elijah Cole, Xuan Yang, Kimberly Wilber, Oisin Mac Aodha, Serge Belongie
Recent self-supervised representation learning techniques have largely closed the gap between supervised and unsupervised learning on ImageNet classification.
no code implementations • ICCV 2021 • Zekun Hao, Arun Mallya, Serge Belongie, Ming-Yu Liu
We represent the world as a continuous volumetric function and train our model to render view-consistent photorealistic images for a user-controlled camera.
1 code implementation • ICCV 2021 • Menglin Jia, Zuxuan Wu, Austin Reiter, Claire Cardie, Serge Belongie, Ser-Nam Lim
Visual engagement in social media platforms comprises interactions with photo posts including comments, shares, and likes.
1 code implementation • CVPR 2021 • Grant van Horn, Elijah Cole, Sara Beery, Kimberly Wilber, Serge Belongie, Oisin Mac Aodha
In order to facilitate progress in this area we present two new natural world visual classification datasets, iNat2021 and NeWT.
2 code implementations • 24 Feb 2021 • Jian Ding, Nan Xue, Gui-Song Xia, Xiang Bai, Wen Yang, Michael Ying Yang, Serge Belongie, Jiebo Luo, Mihai Datcu, Marcello Pelillo, Liangpei Zhang
In this paper, we present a large-scale Dataset of Object deTection in Aerial images (DOTA) and comprehensive baselines for ODAI.
2 code implementations • 2 Feb 2021 • Jiyang Qi, Yan Gao, Yao Hu, Xinggang Wang, Xiaoyu Liu, Xiang Bai, Serge Belongie, Alan Yuille, Philip H. S. Torr, Song Bai
On the OVIS dataset, the highest AP achieved by state-of-the-art algorithms is only 16.3, which reveals that we are still at a nascent stage for understanding objects, instances, and videos in a real-world scenario.
Ranked #41 on Video Instance Segmentation on OVIS validation
1 code implementation • CVPR 2021 • Menglin Jia, Zuxuan Wu, Austin Reiter, Claire Cardie, Serge Belongie, Ser-Nam Lim
Based on our findings, we conduct further study to quantify the effect of attending to object and context classes as well as textual information in the form of hashtags when training an intent classifier.
1 code implementation • 20 Aug 2020 • Kevin Musgrave, Serge Belongie, Ser-Nam Lim
Deep metric learning algorithms have a wide variety of applications, but implementing these algorithms can be tedious and time consuming.
1 code implementation • ECCV 2020 • Ruojin Cai, Guandao Yang, Hadar Averbuch-Elor, Zekun Hao, Serge Belongie, Noah Snavely, Bharath Hariharan
Point cloud generation thus amounts to moving randomly sampled points to high-density areas.
4 code implementations • CVPR 2021 • Rui Qian, Tianjian Meng, Boqing Gong, Ming-Hsuan Yang, Huisheng Wang, Serge Belongie, Yin Cui
Our representations are learned using a contrastive loss, where two augmented clips from the same short video are pulled together in the embedding space, while clips from different videos are pushed away.
Ranked #1 on Self-Supervised Action Recognition on Kinetics-600
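The pull-together, push-apart behaviour of the contrastive loss described in this entry can be sketched as follows. This is a minimal illustration assuming an InfoNCE-style loss over cosine similarities with a temperature of 0.1; the exact loss form, the temperature, and the function names are assumptions and are not taken from the listing.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """Cross-entropy of picking the positive among positive + negatives."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum - logits[0]


# Hypothetical 2-D clip embeddings.
anchor = [1.0, 0.0]
same_video = [0.9, 0.1]
other_videos = [[0.0, 1.0], [-1.0, 0.0]]

loss_aligned = info_nce(anchor, same_video, other_videos)
# Swapping in an unrelated embedding as the "positive" should raise the loss.
loss_misaligned = info_nce(anchor, [0.0, 1.0], [same_video, [-1.0, 0.0]])
```

Minimising this quantity raises the similarity between the two clips of the same video relative to clips from other videos.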
5 code implementations • ECCV 2020 • Menglin Jia, Mengyun Shi, Mikhail Sirotenko, Yin Cui, Claire Cardie, Bharath Hariharan, Hartwig Adam, Serge Belongie
In this work we explore the task of instance segmentation with attribute localization, which unifies instance segmentation (detect and segment each object instance) and fine-grained visual attribute categorization (recognize one or multiple attributes).
5 code implementations • 24 Apr 2020 • Ranjita Thapa, Noah Snavely, Serge Belongie, Awais Khan
Appropriate and timely deployment of disease management depends on early disease detection.
1 code implementation • CVPR 2020 • Rui Qian, Divyansh Garg, Yan Wang, Yurong You, Serge Belongie, Bharath Hariharan, Mark Campbell, Kilian Q. Weinberger, Wei-Lun Chao
Reliable and accurate 3D object detection is a necessity for safe autonomous driving.
1 code implementation • CVPR 2020 • Zekun Hao, Hadar Averbuch-Elor, Noah Snavely, Serge Belongie
We are seeing a Cambrian explosion of 3D shape representations for use in machine learning.
4 code implementations • ECCV 2020 • Kevin Musgrave, Serge Belongie, Ser-Nam Lim
Deep metric learning papers from the past four years have consistently claimed great advances in accuracy, often more than doubling the performance of decade-old methods.
2 code implementations • ICML 2020 • Aaron Lou, Isay Katsman, Qingxuan Jiang, Serge Belongie, Ser-Nam Lim, Christopher De Sa
Recent advances in deep representation learning on Riemannian manifolds extend classical deep learning operations to better capture the geometry of the manifold.
1 code implementation • CVPR 2021 • Boyi Li, Felix Wu, Ser-Nam Lim, Serge Belongie, Kilian Q. Weinberger
The moments (a.k.a., mean and standard deviation) of latent features are often removed as noise when training image recognition models, to increase stability and reduce training time.
Ranked #32 on Domain Generalization on ImageNet-A
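To make "removing the moments" concrete, the sketch below normalizes a feature vector and returns the mean and standard deviation that a normalization layer would otherwise discard. It assumes a single flat feature vector and a small epsilon for stability; it only illustrates the quantities involved and is not the method proposed in the paper. The function name is hypothetical.

```python
import math


def split_moments(features, eps=1e-5):
    """Return (normalized features, mean, std) for one feature vector."""
    mean = sum(features) / len(features)
    var = sum((x - mean) ** 2 for x in features) / len(features)
    std = math.sqrt(var + eps)  # eps avoids division by zero for constant inputs
    normalized = [(x - mean) / std for x in features]
    return normalized, mean, std


# Hypothetical latent features.
latent = [1.0, 2.0, 3.0, 4.0]
normalized, mean, std = split_moments(latent)

# The original features are recoverable from the normalized ones plus the moments.
restored = [x * std + mean for x in normalized]
```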
1 code implementation • 21 Dec 2019 • Yin Cui, Zeqi Gu, Dhruv Mahajan, Laurens van der Maaten, Serge Belongie, Ser-Nam Lim
We also investigate the interplay between dataset granularity and a variety of factors and find that fine-grained datasets are more difficult to learn from, more difficult to transfer to, more difficult to perform few-shot learning with, and more vulnerable to adversarial attacks.
no code implementations • 20 Nov 2019 • Omid Poursaeed, Tianxing Jiang, Yordanos Goshu, Harry Yang, Serge Belongie, Ser-Nam Lim
We propose a novel approach for generating unrestricted adversarial examples by manipulating fine-grained aspects of image generation.
no code implementations • 4 Oct 2019 • Omid Poursaeed, Vladimir G. Kim, Eli Shechtman, Jun Saito, Serge Belongie
We capture these subtle changes by applying an image translation network to refine the mesh rendering, providing an end-to-end model to generate new animations of a character with high visual quality.
1 code implementation • IJCNLP 2019 • Maxwell Forbes, Christine Kaeser-Chen, Piyush Sharma, Serge Belongie
We introduce the new Birds-to-Words dataset of 41k sentences describing fine-grained differences between photographs of birds.
2 code implementations • ICCV 2019 • Qian Huang, Isay Katsman, Horace He, Zeqi Gu, Serge Belongie, Ser-Nam Lim
We show that we can select a layer of the source model to perturb without any knowledge of the target models while achieving high transferability.
1 code implementation • 14 Jul 2019 • Parneet Kaur, Karan Sikka, Weijun Wang, Serge Belongie, Ajay Divakaran
Food classification is a challenging problem due to the large number of categories, high visual similarity between different foods, as well as the lack of datasets for training state-of-the-art deep models.
2 code implementations • NeurIPS 2019 • Boyi Li, Felix Wu, Kilian Q. Weinberger, Serge Belongie
A popular method to reduce the training time of deep neural networks is to normalize activations at each layer.
12 code implementations • ICCV 2019 • Guandao Yang, Xun Huang, Zekun Hao, Ming-Yu Liu, Serge Belongie, Bharath Hariharan
Specifically, we learn a two-level hierarchy of distributions where the first level is the distribution of shapes and the second level is the distribution of points given a shape.
Ranked #4 on Point Cloud Generation on ShapeNet Car
1 code implementation • 13 Jun 2019 • Sheng Guo, Weilin Huang, Xiao Zhang, Prasanna Srikhanta, Yin Cui, Yuan Li, Matthew R. Scott, Hartwig Adam, Serge Belongie
The dataset is constructed from over one million fashion images with a label space that includes 8 groups of 228 fine-grained attributes in total.
no code implementations • 12 Jun 2019 • Kiat Chuan Tan, Yulong Liu, Barbara Ambrose, Melissa Tulig, Serge Belongie
Herbarium sheets are invaluable for botanical research, and considerable time and effort are spent by experts to label and identify specimens on them.
1 code implementation • 3 Jun 2019 • Chenyang Zhang, Christine Kaeser-Chen, Grace Vesom, Jennie Choi, Maria Kessler, Serge Belongie
Existing computer vision technologies in artwork recognition focus mainly on instance retrieval or coarse-grained attribute classification.
8 code implementations • CVPR 2019 • Yin Cui, Menglin Jia, Tsung-Yi Lin, Yang Song, Serge Belongie
We design a re-weighting scheme that uses the effective number of samples for each class to re-balance the loss, thereby yielding a class-balanced loss.
Ranked #2 on Long-tail Learning on EGTEA
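The re-weighting scheme in this entry can be sketched as below. The listing does not give the formula; the sketch assumes the commonly cited form of the effective number, E_n = (1 - beta^n) / (1 - beta), with per-class weights proportional to 1 / E_n and rescaled to sum to the number of classes. The value beta = 0.999 and the function names are illustrative assumptions.

```python
def effective_number(n, beta):
    """Effective number of samples for a class with n > 0 examples (assumed form)."""
    return (1.0 - beta ** n) / (1.0 - beta)


def class_balanced_weights(counts, beta=0.999):
    """Per-class loss weights, inversely proportional to the effective number."""
    raw = [1.0 / effective_number(n, beta) for n in counts]
    # Rescale so the weights sum to the number of classes.
    scale = len(counts) / sum(raw)
    return [w * scale for w in raw]


# Hypothetical long-tailed class counts: head, middle, tail.
counts = [5000, 500, 50]
weights = class_balanced_weights(counts)
```

Each weight would multiply the loss term of samples from the corresponding class, so rare classes contribute more per sample than frequent ones.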
no code implementations • 4 Dec 2018 • Horace He, Aaron Lou, Qingxuan Jiang, Isay Katsman, Serge Belongie, Ser-Nam Lim
Research has shown that widely used deep neural networks are vulnerable to carefully crafted adversarial perturbations.
no code implementations • 26 Nov 2018 • Xiao Ma, Lina Mezghani, Kimberly Wilber, Hui Hong, Robinson Piramuthu, Mor Naaman, Serge Belongie
In this work, we conducted a large-scale study on the quality of user-generated images in peer-to-peer marketplaces.
no code implementations • 20 Nov 2018 • Qian Huang, Zeqi Gu, Isay Katsman, Horace He, Pian Pawakapan, Zhiqiu Lin, Serge Belongie, Ser-Nam Lim
Neural networks are vulnerable to adversarial examples, malicious inputs crafted to fool trained models.
no code implementations • 3 Oct 2018 • Omid Poursaeed, Guandao Yang, Aditya Prakash, Qiuren Fang, Hanqing Jiang, Bharath Hariharan, Serge Belongie
Estimating fundamental matrices is a classic problem in computer vision.
1 code implementation • ECCV 2018 • Guandao Yang, Yin Cui, Serge Belongie, Bharath Hariharan
It is expensive to label images with 3D structure or precise camera pose.
no code implementations • 2 Jul 2018 • Isay Katsman, Rohun Tripathi, Andreas Veit, Serge Belongie
Semantic segmentation is a challenging vision problem that usually necessitates the collection of large amounts of finely annotated data, which is often quite expensive to obtain.
1 code implementation • CVPR 2018 • Yin Cui, Guandao Yang, Andreas Veit, Xun Huang, Serge Belongie
To address these two challenges, we propose a novel learning based discriminative evaluation metric that is directly trained to distinguish between human and machine-generated captions.
no code implementations • 16 Jun 2018 • Ryan Y. Benmalek, Claire Cardie, Serge Belongie, Xiaodong He, Jianfeng Gao
In this work we combine two research threads from Vision/Graphics and Natural Language Processing to formulate an image generation task conditioned on attributes in a multi-turn setting.
1 code implementation • CVPR 2018 • Yin Cui, Yang Song, Chen Sun, Andrew Howard, Serge Belongie
We propose a measure to estimate domain similarity via Earth Mover's Distance and demonstrate that transfer learning benefits from pre-training on a source domain that is similar to the target domain by this measure.
Ranked #31 on Fine-Grained Image Classification on CUB-200-2011
Fine-Grained Image Classification Fine-Grained Visual Categorization +1
no code implementations • CVPR 2018 • Grant Van Horn, Steve Branson, Scott Loarie, Serge Belongie, Pietro Perona
We introduce a method for efficiently crowdsourcing multiclass annotations in challenging, real world image datasets.
no code implementations • CVPR 2018 • Zekun Hao, Xun Huang, Serge Belongie
Video generation and manipulation is an important yet challenging task in computer vision.
14 code implementations • ECCV 2018 • Xun Huang, Ming-Yu Liu, Serge Belongie, Jan Kautz
To translate an image to another domain, we recombine its content code with a random style code sampled from the style space of the target domain.
Multimodal Unsupervised Image-To-Image Translation Translation +1
no code implementations • 7 Jan 2018 • Benjamin Spector, Serge Belongie
Recent work in deep reinforcement learning has allowed algorithms to learn complex tasks such as Atari 2600 games just from the reward provided by the game, but these algorithms presently require millions of training steps in order to learn, making them approximately five orders of magnitude slower than humans.
1 code implementation • CVPR 2018 • Omid Poursaeed, Isay Katsman, Bicheng Gao, Serge Belongie
In this paper, we propose novel generative models for creating adversarial examples, slightly perturbed images resembling natural images but maliciously crafted to fool pre-trained models.
2 code implementations • ECCV 2018 • Andreas Veit, Serge Belongie
In this work, we propose convolutional networks with adaptive inference graphs (ConvNet-AIG) that adaptively define their network topology conditioned on the input image.
6 code implementations • CVPR 2018 • Gui-Song Xia, Xiang Bai, Jian Ding, Zhen Zhu, Serge Belongie, Jiebo Luo, Mihai Datcu, Marcello Pelillo, Liangpei Zhang
The fully annotated DOTA images contain 188,282 instances, each of which is labeled by an arbitrary (8 d.o.f.)
Ranked #54 on Object Detection In Aerial Images on DOTA (using extra training data)
1 code implementation • CVPR 2018 • Andreas Veit, Maximilian Nickel, Serge Belongie, Laurens van der Maaten
The variety, abundance, and structured nature of hashtags make them an interesting data source for training vision models.
5 code implementations • 31 Aug 2017 • Baoguang Shi, Cong Yao, Minghui Liao, Mingkun Yang, Pei Xu, Linyan Cui, Serge Belongie, Shijian Lu, Xiang Bai
This report introduces RCTW, a new competition that focuses on Chinese text reading.
19 code implementations • CVPR 2018 • Grant Van Horn, Oisin Mac Aodha, Yang Song, Yin Cui, Chen Sun, Alex Shepard, Hartwig Adam, Pietro Perona, Serge Belongie
Existing image classification datasets used in computer vision tend to have a uniform distribution of images across object categories.
Ranked #8 on Image Classification on iNaturalist
no code implementations • 18 Jul 2017 • Omid Poursaeed, Tomas Matera, Serge Belongie
Using deep convolutional neural networks on a large dataset of photos of home interiors and exteriors, we develop a method for estimating the luxury level of real estate photos.
no code implementations • CVPR 2017 • Yin Cui, Feng Zhou, Jiang Wang, Xiao Liu, Yuanqing Lin, Serge Belongie
We demonstrate how to approximate kernels such as Gaussian RBF up to a given order using compact explicit feature maps in a parameter-free manner.
no code implementations • ICLR 2018 • David Rolnick, Andreas Veit, Serge Belongie, Nir Shavit
Deep neural networks trained on large supervised datasets have led to impressive results in image classification and other tasks.
no code implementations • ICCV 2017 • Michael J. Wilber, Chen Fang, Hailin Jin, Aaron Hertzmann, John Collomosse, Serge Belongie
Furthermore, we carry out baseline experiments to show the value of this dataset for artistic style prediction, for improving the generality of existing object classifiers, and for the study of visual domain adaptation.
no code implementations • 4 Apr 2017 • Subarna Tripathi, Maxwell Collins, Matthew Brown, Serge Belongie
In a more realistic environment, without the oracle keypoints, the proposed deep person instance segmentation model conditioned on human pose achieves 3.8% to 10.5% relative improvements compared with its strongest baseline of a deep network trained only for segmentation.
2 code implementations • WWW 2017 • Cheng-Kang Hsieh, Longqi Yang, Yin Cui, Tsung-Yi Lin, Serge Belongie, Deborah Estrin
Metric learning algorithms produce distance metrics that capture the important relationships among data.
Ranked #1 on Recommendation Systems on Million Song Dataset (Recall@100 metric)
28 code implementations • ICCV 2017 • Xun Huang, Serge Belongie
Gatys et al. recently introduced a neural algorithm that renders a content image in the style of another image, achieving so-called style transfer.
6 code implementations • CVPR 2017 • Baoguang Shi, Xiang Bai, Serge Belongie
It achieves an f-measure of 75.0% on the standard ICDAR 2015 Incidental (Challenge 4) benchmark, outperforming the previous best by a large margin.
Ranked #11 on Scene Text Detection on ICDAR 2013
no code implementations • CVPR 2017 • Andreas Veit, Neil Alldrin, Gal Chechik, Ivan Krasin, Abhinav Gupta, Serge Belongie
For the small clean set of annotations we use a quarter of the validation set with ~40k images.
2 code implementations • CVPR 2017 • Xun Huang, Yixuan Li, Omid Poursaeed, John Hopcroft, Serge Belongie
In this paper, we propose a novel generative model named Stacked Generative Adversarial Networks (SGAN), which is trained to invert the hierarchical representations of a bottom-up discriminative network.
Ranked #12 on Conditional Image Generation on CIFAR-10 (Inception score metric)
85 code implementations • CVPR 2017 • Tsung-Yi Lin, Piotr Dollár, Ross Girshick, Kaiming He, Bharath Hariharan, Serge Belongie
Feature pyramids are a basic component in recognition systems for detecting objects at different scales.
Ranked #3 on Pedestrian Detection on TJU-Ped-campus
no code implementations • 15 Jul 2016 • Subarna Tripathi, Zachary C. Lipton, Serge Belongie, Truong Nguyen
Then we train a recurrent neural network that takes as input sequences of pseudo-labeled frames and optimizes an objective that encourages both accuracy on the target frame and consistency across consecutive frames.
no code implementations • CVPR 2016 • Hani Altwaijry, Eduard Trulls, James Hays, Pascal Fua, Serge Belongie
We demonstrate that our models outperform the state-of-the-art on ultra-wide baseline matching and approach human accuracy.
2 code implementations • 25 May 2016 • Longqi Yang, Cheng-Kang Hsieh, Hongjian Yang, Nicola Dell, Serge Belongie, Curtis Cole, Deborah Estrin
We propose Yum-me, a personalized nutrient-based meal recommender system designed to meet individuals' nutritional expectations, dietary restrictions, and fine-grained food preferences.
2 code implementations • NeurIPS 2016 • Andreas Veit, Michael Wilber, Serge Belongie
Moreover, residual networks seem to enable very deep networks by leveraging only the short paths during training.
5 code implementations • CVPR 2017 • Andreas Veit, Serge Belongie, Theofanis Karaletsos
A main reason for this is that contradicting notions of similarities cannot be captured in a single space.
9 code implementations • 14 Feb 2016 • Michael J. Wilber, Vitaly Shmatikov, Serge Belongie
In this setting, is it still possible for privacy-conscientious users to avoid automatic face detection and recognition?
4 code implementations • 26 Jan 2016 • Andreas Veit, Tomas Matera, Lukas Neumann, Jiri Matas, Serge Belongie
The goal of COCO-Text is to advance the state of the art in text detection and recognition in natural images.
no code implementations • 20 Jan 2016 • Subarna Tripathi, Serge Belongie, Youngbae Hwang, Truong Nguyen
We further propose a clustering of VOPs which can efficiently be used for detecting objects in video in a streaming fashion.
no code implementations • CVPR 2016 • Yin Cui, Feng Zhou, Yuanqing Lin, Serge Belongie
To demonstrate the effectiveness of the proposed framework, we bootstrap a fine-grained flower dataset with 620 categories from Instagram images.
no code implementations • ICCV 2015 • Michael J. Wilber, Iljung S. Kwak, David Kriegman, Serge Belongie
This paper presents our work on "SNaCK," a low-dimensional concept embedding algorithm that combines human expertise with automatic machine similarity kernels.
no code implementations • ICCV 2015 • Andreas Veit, Balazs Kovacs, Sean Bell, Julian McAuley, Kavita Bala, Serge Belongie
In this paper, we propose a novel learning framework to help answer these types of questions.
no code implementations • 24 Sep 2015 • Andreas Veit, Michael Wilber, Rajan Vaish, Serge Belongie, James Davis, Vishal Anand, Anshu Aviral, Prithvijit Chakrabarty, Yash Chandak, Sidharth Chaturvedi, Chinmaya Devaraj, Ankit Dhall, Utkarsh Dwivedi, Sanket Gupte, Sharath N. Sridhar, Karthik Paga, Anuj Pahuja, Aditya Raisinghani, Ayush Sharma, Shweta Sharma, Darpana Sinha, Nisarg Thakkar, K. Bala Vignesh, Utkarsh Verma, Kanniganti Abhishek, Amod Agrawal, Arya Aishwarya, Aurgho Bhattacharjee, Sarveshwaran Dhanasekar, Venkata Karthik Gullapalli, Shuchita Gupta, Chandana G, Kinjal Jain, Simran Kapur, Meghana Kasula, Shashi Kumar, Parth Kundaliya, Utkarsh Mathur, Alankrit Mishra, Aayush Mudgal, Aditya Nadimpalli, Munakala Sree Nihit, Akanksha Periwal, Ayush Sagar, Ayush Shah, Vikas Sharma, Yashovardhan Sharma, Faizal Siddiqui, Virender Singh, Abhinav S., Anurag. D. Yadav
When crowdsourcing systems are used in combination with machine inference systems in the real world, they benefit the most when the machine system is deeply integrated with the crowd workers.
1 code implementation • 4 Sep 2015 • Subarna Tripathi, Serge Belongie, Youngbae Hwang, Truong Nguyen
We explore the efficiency of the CRF inference beyond image level semantic segmentation and perform joint inference in video frames.
no code implementations • 1 Jul 2015 • Subarna Tripathi, Serge Belongie, Truong Nguyen
We explore the efficiency of the CRF inference module beyond image level semantic segmentation.
no code implementations • 16 Jun 2015 • Theofanis Karaletsos, Serge Belongie, Gunnar Rätsch
Representation learning systems typically rely on massive amounts of labeled data in order to be trained to high accuracy.
no code implementations • CVPR 2015 • Grant Van Horn, Steve Branson, Ryan Farrell, Scott Haber, Jessie Barry, Panos Ipeirotis, Pietro Perona, Serge Belongie
We worked with bird experts to measure the quality of popular datasets like CUB-200-2011 and ImageNet and found class label error rates of at least 4%.
no code implementations • CVPR 2015 • Tsung-Yi Lin, Yin Cui, Serge Belongie, James Hays
Most approaches predict the location of a query image by matching to ground-level images with known locations (e.g., street-view data).
no code implementations • 11 Jun 2014 • Steve Branson, Grant van Horn, Serge Belongie, Pietro Perona
We perform a detailed investigation of state-of-the-art deep convolutional feature implementations and fine-tuning feature learning for fine-grained classification.
no code implementations • CVPR 2014 • Catherine Wah, Grant van Horn, Steve Branson, Subhransu Maji, Pietro Perona, Serge Belongie
Current human-in-the-loop fine-grained visual categorization systems depend on a predefined vocabulary of attributes and parts, usually determined by experts.
36 code implementations • 1 May 2014 • Tsung-Yi Lin, Michael Maire, Serge Belongie, Lubomir Bourdev, Ross Girshick, James Hays, Pietro Perona, Deva Ramanan, C. Lawrence Zitnick, Piotr Dollár
We present a new dataset with the goal of advancing the state-of-the-art in object recognition by placing the question of object recognition in the context of the broader question of scene understanding.
no code implementations • 14 Feb 2014 • Subarna Tripathi, Youngbae Hwang, Serge Belongie, Truong Nguyen
Despite recent advances in video segmentation, many opportunities remain to improve it using a variety of low- and mid-level visual cues.
no code implementations • CVPR 2013 • Steve Branson, Oscar Beijbom, Serge Belongie
Our method is shown to be 10-50 times faster than SVMstruct for cost-sensitive multiclass classification while being about as fast as the fastest 1-vs-all methods for multiclass classification.
no code implementations • CVPR 2013 • Catherine Wah, Serge Belongie
Recent work in computer vision has addressed zero-shot learning or unseen class detection, which involves categorizing objects without observing any training examples.
no code implementations • CVPR 2013 • Tsung-Yi Lin, Serge Belongie, James Hays
On the other hand, there is no shortage of visual and geographic data that densely covers the Earth (we examine overhead imagery and land cover survey data), but the relationship between this data and ground-level query photographs is complex.