no code implementations • ECCV 2020 • Qing Liu, Orchid Majumder, Alessandro Achille, Avinash Ravichandran, Rahul Bhotika, Stefano Soatto
This process enables incrementally improving the model by processing multiple learning episodes, each representing a different learning task, even with few training examples.
no code implementations • 24 Feb 2025 • Tian Yu Liu, Alessandro Achille, Matthew Trager, Aditya Golatkar, Luca Zancato, Stefano Soatto
A key challenge arises when attempting to leverage information present across multiple contexts, since there is no straightforward way to condition generation on multiple independent states in existing SSMs.
no code implementations • 17 Feb 2025 • Pramuditha Perera, Matthew Trager, Luca Zancato, Alessandro Achille, Stefano Soatto
We operate on CLIP text features and propose to use a combination of a textual inversion loss and a classification loss to ensure that text features of the learned token are aligned with image features of the concept in the CLIP embedding space.
no code implementations • 17 Dec 2024 • Elvis Nunez, Luca Zancato, Benjamin Bowman, Aditya Golatkar, Wei Xia, Stefano Soatto
We describe a method to expand the memory span of the hybrid state by "reserving" a fraction of the Attention context for tokens retrieved from arbitrarily far in the past, thus expanding the eidetic memory span of the overall state.
no code implementations • 16 Dec 2024 • Hao Li, Shamit Lal, Zhiheng Li, Yusheng Xie, Ying Wang, Yang Zou, Orchid Majumder, R. Manmatha, Zhuowen Tu, Stefano Ermon, Stefano Soatto, Ashwin Swaminathan
We empirically study the scaling properties of various Diffusion Transformers (DiTs) for text-to-image generation by performing extensive and rigorous ablations, including training scaled DiTs ranging from 0.3B up to 8B parameters on datasets up to 600M images.
no code implementations • 6 Nov 2024 • Lawrence Stewart, Matthew Trager, Sujan Kumar Gonugondla, Stefano Soatto
Speculative decoding aims to speed up autoregressive generation of a language model by verifying in parallel the tokens generated by a smaller draft model. In this work, we explore the effectiveness of learning-free, negligible-cost draft strategies, namely $N$-grams obtained from the model weights and the context.
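The draft-then-verify loop described above can be sketched in a few lines. This is a minimal illustration, not the authors' method: it drafts only from $N$-grams found in the context (the weight-derived $N$-grams are not modeled), it uses greedy agreement as the acceptance rule, and `target_next` is a hypothetical stand-in for the large model's next-token function. A real system would score all draft positions in one parallel forward pass; the loop here only mimics the outcome.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def build_ngram_table(context: Sequence[int], n: int) -> Dict[Tuple[int, ...], int]:
    """Map every (n-1)-token prefix seen in the context to the token that
    most recently followed it (later occurrences overwrite earlier ones)."""
    if n < 2:
        raise ValueError("n must be at least 2")
    table: Dict[Tuple[int, ...], int] = {}
    for i in range(len(context) - n + 1):
        prefix = tuple(context[i : i + n - 1])
        table[prefix] = context[i + n - 1]
    return table


def draft_tokens(context: Sequence[int], n: int, k: int) -> List[int]:
    """Propose up to k tokens by repeatedly looking up the last n-1 tokens."""
    table = build_ngram_table(context, n)
    seq = list(context)
    draft: List[int] = []
    for _ in range(k):
        prefix = tuple(seq[-(n - 1):])
        if prefix not in table:
            break  # no matching n-gram: stop drafting early
        nxt = table[prefix]
        draft.append(nxt)
        seq.append(nxt)
    return draft


def verify(context: Sequence[int], draft: Sequence[int],
           target_next: Callable[[List[int]], int]) -> List[int]:
    """Keep the longest draft prefix the target agrees with, then emit one
    token from the target itself (a correction or a bonus token)."""
    seq = list(context)
    accepted: List[int] = []
    for tok in draft:
        expected = target_next(seq)
        if expected != tok:
            accepted.append(expected)  # first mismatch: use the target's token
            return accepted
        accepted.append(tok)
        seq.append(tok)
    accepted.append(target_next(seq))  # all drafts accepted: one extra token
    return accepted
```

Each call to `verify` always yields at least one token, so generation never stalls even when the draft is empty or entirely wrong.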
no code implementations • 21 Oct 2024 • Tian Yu Liu, Stefano Soatto
Therefore, we characterize the semantic similarity between two textual expressions simply as the distance between image distributions they induce, or 'conjure.'
no code implementations • 4 Oct 2024 • Sungnyun Kim, Haofu Liao, Srikar Appalaraju, Peng Tang, Zhuowen Tu, Ravi Kumar Satzoda, R. Manmatha, Vijay Mahadevan, Stefano Soatto
Visual document understanding (VDU) is a challenging task that involves understanding documents across various modalities (text and image) and layouts (forms, tables, etc.).
1 code implementation • 3 Oct 2024 • Ziyao Zeng, Yangchao Wu, Hyoungseob Park, Daniel Wang, Fengyu Yang, Stefano Soatto, Dong Lao, Byung-Woo Hong, Alex Wong
Our method, RSA, takes as input a text caption describing objects present in an image and outputs the parameters of a linear transformation which can be applied globally to a relative depth map to yield metric-scaled depth predictions.
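As a rough illustration of what applying a linear transformation globally to a depth map means, the sketch below rescales every pixel with the same two parameters. The function name and the choice to apply the transform to depth directly (rather than, say, inverse depth) are assumptions; the text does not specify either, and how the parameters are predicted from the caption is not modeled.

```python
from typing import List


def apply_global_scale_shift(relative_depth: List[List[float]],
                             scale: float, shift: float) -> List[List[float]]:
    """Apply the same affine map d -> scale * d + shift to every pixel."""
    return [[scale * d + shift for d in row] for row in relative_depth]


# Hypothetical values: a 2x2 relative depth map and made-up parameters.
relative = [[0.0, 0.5], [1.0, 0.25]]
metric = apply_global_scale_shift(relative, scale=4.0, shift=1.0)
```

Because the two parameters are shared by all pixels, the ordering of depths in the relative map is preserved whenever `scale` is positive.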
no code implementations • 18 Aug 2024 • Chaofan Tao, Gukyeong Kwon, Varad Gunjal, Hao Yang, Zhaowei Cai, Yonatan Dukler, Ashwin Swaminathan, R. Manmatha, Colin Jon Taylor, Stefano Soatto
The benchmark is constructed by generating negative texts with incorrect action descriptions for a given video and the model is expected to pair a positive text with its corresponding video.
no code implementations • 12 Jul 2024 • Matthew Trager, Alessandro Achille, Pramuditha Perera, Luca Zancato, Stefano Soatto
Specifically, we introduce a characterization of compositional structures in terms of "interaction decompositions," and we establish necessary and sufficient conditions for the presence of such structures within the representations of a model.
no code implementations • 8 Jul 2024 • Luca Zancato, Arjun Seshadri, Yonatan Dukler, Aditya Golatkar, Yantao Shen, Benjamin Bowman, Matthew Trager, Alessandro Achille, Stefano Soatto
Recent hybrid architectures have combined eidetic and fading memory, but with limitations that do not allow the designer or the learning process to seamlessly modulate the two, nor to extend the eidetic memory span.
no code implementations • 12 Jun 2024 • Benjamin Biggs, Arjun Seshadri, Yang Zou, Achin Jain, Aditya Golatkar, Yusheng Xie, Alessandro Achille, Ashwin Swaminathan, Stefano Soatto
We present Diffusion Soup, a compartmentalization method for Text-to-Image Generation that averages the weights of diffusion models trained on sharded data.
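Weight averaging across models can be sketched as below. This is only an illustration under simplifying assumptions: each model is represented as a plain dictionary from parameter name to a flat list of floats, all models share the same architecture, and the default is a uniform average. The names `average_weights` and `coeffs` are hypothetical and not taken from the paper.

```python
from typing import Dict, List, Optional, Sequence

Weights = Dict[str, List[float]]


def average_weights(models: Sequence[Weights],
                    coeffs: Optional[Sequence[float]] = None) -> Weights:
    """Return the coefficient-weighted average of several weight dictionaries."""
    if not models:
        raise ValueError("need at least one model")
    if coeffs is None:
        coeffs = [1.0 / len(models)] * len(models)  # uniform average
    if len(coeffs) != len(models):
        raise ValueError("one coefficient per model is required")
    keys = models[0].keys()
    for m in models:
        if m.keys() != keys:
            raise ValueError("all models must have the same parameter names")
    merged: Weights = {}
    for name in keys:
        size = len(models[0][name])
        merged[name] = [
            sum(c * m[name][i] for c, m in zip(coeffs, models))
            for i in range(size)
        ]
    return merged
```

Under this scheme, leaving one shard out amounts to recomputing the average without the corresponding model, and no retraining is involved.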
no code implementations • 5 Jun 2024 • Evan Becker, Stefano Soatto
We instead propose a framework for measuring an LLM's uncertainty with respect to the distribution of generated explanations for an answer.
no code implementations • 22 May 2024 • Tian Yu Liu, Stefano Soatto, Matteo Marchi, Pratik Chaudhari, Paulo Tabuada
But if there are 'system prompts' not visible to the user, then the set of indistinguishable trajectories becomes non-trivial, and there can be multiple state trajectories that yield the same verbalized output.
no code implementations • CVPR 2024 • Prannay Kaul, Zhizhong Li, Hao Yang, Yonatan Dukler, Ashwin Swaminathan, C. J. Taylor, Stefano Soatto
By evaluating a large selection of recent LVLMs using public datasets, we show that an improvement in existing metrics does not lead to a reduction in Type I hallucinations, and that established benchmarks for measuring Type I hallucinations are incomplete.
no code implementations • CVPR 2024 • Dong Lao, Congli Wang, Alex Wong, Stefano Soatto
Rather than initializing a latent irradiance ("template") by heuristics to estimate deformation, we select one of the images as a reference, and model the deformation in this image by the aggregation of the optical flow from it to other images, exploiting a prior imposed by the Central Limit Theorem.
no code implementations • 30 Apr 2024 • Benet Oriol Sabat, Alessandro Achille, Matthew Trager, Stefano Soatto
We propose NeRF-Insert, a NeRF editing framework that allows users to make high-quality local edits with a flexible level of control.
no code implementations • 28 Apr 2024 • Xiaolong Li, Jiawei Mo, Ying Wang, Chethan Parameshwara, Xiaohan Fei, Ashwin Swaminathan, Cj Taylor, Zhuowen Tu, Paolo Favaro, Stefano Soatto
In this paper, we propose an effective two-stage approach named Grounded-Dreamer to generate 3D assets that can accurately follow complex, compositional text prompts while achieving high fidelity by using a pre-trained multi-view diffusion model.
no code implementations • 16 Apr 2024 • Hantian Ding, Zijian Wang, Giovanni Paolini, Varun Kumar, Anoop Deoras, Dan Roth, Stefano Soatto
In large language model training, input documents are typically concatenated together and then split into sequences of equal length to avoid padding tokens.
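The concatenate-then-split scheme described above can be sketched as follows. The end-of-document separator `eos_id` and the choice to drop the incomplete tail are assumptions made for this illustration; the text only states that documents are concatenated and cut into equal-length sequences.

```python
from typing import List, Sequence


def pack_documents(docs: Sequence[Sequence[int]], seq_len: int,
                   eos_id: int) -> List[List[int]]:
    """Concatenate tokenized documents (each followed by eos_id) and cut the
    stream into chunks of exactly seq_len tokens, dropping any remainder."""
    stream: List[int] = []
    for doc in docs:
        stream.extend(doc)
        stream.append(eos_id)
    n_full = len(stream) // seq_len
    return [stream[i * seq_len : (i + 1) * seq_len] for i in range(n_full)]


docs = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
chunks = pack_documents(docs, seq_len=4, eos_id=0)
```

In this toy example the third document is split across the second and third chunks, which shows how fixed-length splitting can cut a document at an arbitrary position.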
no code implementations • 6 Apr 2024 • Pei Wang, Zhaowei Cai, Hao Yang, Ashwin Swaminathan, R. Manmatha, Stefano Soatto
Existing unified image segmentation models either employ a unified architecture across multiple tasks but use separate weights tailored to each dataset, or apply a single set of weights to multiple datasets but are limited to a single task.
1 code implementation • CVPR 2024 • Ziyao Zeng, Daniel Wang, Fengyu Yang, Hyoungseob Park, Yangchao Wu, Stefano Soatto, Byung-Woo Hong, Dong Lao, Alex Wong
To test this, we focus on monocular depth estimation, the problem of predicting a dense depth map from a single image, but with an additional text caption describing the scene.
no code implementations • CVPR 2024 • Hao Li, Yang Zou, Ying Wang, Orchid Majumder, Yusheng Xie, R. Manmatha, Ashwin Swaminathan, Zhuowen Tu, Stefano Ermon, Stefano Soatto
On the data scaling side, we show that the quality and diversity of the training set matter more than simply dataset size.
no code implementations • 2 Apr 2024 • Matteo Marchi, Stefano Soatto, Pratik Chaudhari, Paulo Tabuada
The aim of this paper is to provide insights into this process (that we refer to as "generative closed-loop learning") by studying the learning dynamics of generative models that are fed back their own produced content in addition to their original training dataset.
no code implementations • CVPR 2024 • Aditya Golatkar, Alessandro Achille, Luca Zancato, Yu-Xiang Wang, Ashwin Swaminathan, Stefano Soatto
To reduce risks of leaking private information contained in the retrieved set, we introduce Copy-Protected generation with Retrieval (CPR), a new method for RAG with strong copyright protection guarantees in a mixed-private setting for diffusion models. CPR allows conditioning the output of diffusion models on a set of retrieved images, while also guaranteeing that unique identifiable information about those examples is not exposed in the generated outputs.
no code implementations • CVPR 2024 • Alessandro Favero, Luca Zancato, Matthew Trager, Siddharth Choudhary, Pramuditha Perera, Alessandro Achille, Ashwin Swaminathan, Stefano Soatto
In particular, we show that as more tokens are generated, the reliance on the visual prompt decreases, and this behavior strongly correlates with the emergence of hallucinations.
1 code implementation • 16 Mar 2024 • Ziqi Lu, Jianbo Ye, Xiaohan Fei, Xiaolong Li, Jiawei Mo, Ashwin Swaminathan, Stefano Soatto
Neural Radiance Field (NeRF), as an implicit 3D scene representation, lacks the inherent ability to accommodate changes made to the initial static scene.
no code implementations • CVPR 2024 • Yuan Gao, Kunyu Shi, Pengkai Zhu, Edouard Belval, Oren Nuriel, Srikar Appalaraju, Shabnam Ghadar, Vijay Mahadevan, Zhuowen Tu, Stefano Soatto
We propose Strongly Supervised pre-training with ScreenShots (S4) - a novel pre-training paradigm for Vision-Language Models using data from large-scale web screenshot rendering.
1 code implementation • CVPR 2024 • Kunyu Shi, Qi Dong, Luis Goncalves, Zhuowen Tu, Stefano Soatto
Sequence-to-sequence vision-language models are showing promise, but their applicability is limited by their inference latency due to their autoregressive way of generating predictions.
no code implementations • 29 Feb 2024 • Xiaohan Fei, Chethan Parameshwara, Jiawei Mo, Xiaolong Li, Ashwin Swaminathan, Cj Taylor, Paolo Favaro, Stefano Soatto
However, the SDS method is also the source of several artifacts, such as the Janus problem, the misalignment between the text prompt and the generated 3D model, and 3D model inaccuracies.
no code implementations • CVPR 2024 • Alessandro Achille, Greg Ver Steeg, Tian Yu Liu, Matthew Trager, Carson Klingenberg, Stefano Soatto
Quantifying the degree of similarity between images is a key copyright issue for image-based machine learning.
1 code implementation • 23 Oct 2023 • Tian Yu Liu, Matthew Trager, Alessandro Achille, Pramuditha Perera, Luca Zancato, Stefano Soatto
We propose to extract meaning representations from autoregressive language models by considering the distribution of all possible trajectories extending an input text.
no code implementations • 15 Oct 2023 • Yangchao Wu, Tian Yu Liu, Hyoungseob Park, Stefano Soatto, Dong Lao, Alex Wong
The sparse depth modality in depth completion has seen even less use as intensity transformations alter the scale of the 3D scene, and geometric transformations may decimate the sparse points during resampling.
1 code implementation • 6 Oct 2023 • Dong Lao, Yangchao Wu, Tian Yu Liu, Alex Wong, Stefano Soatto
Vision Transformer (ViT) architectures represent images as collections of high-dimensional vectorized tokens, each corresponding to a rectangular non-overlapping patch.
1 code implementation • 23 Aug 2023 • Michael Kleinman, Alessandro Achille, Stefano Soatto
Critical learning periods are periods early in development where temporary sensory deficits can have a permanent effect on behavior and learned representations.
no code implementations • 2 Aug 2023 • Aditya Golatkar, Alessandro Achille, Ashwin Swaminathan, Stefano Soatto
We introduce Compartmentalized Diffusion Models (CDM), a method to train different diffusion models (or prompts) on distinct data sources and arbitrarily compose them at inference time.
1 code implementation • ICCV 2023 • Tian Yu Liu, Stefano Soatto
Component models are composed at inference time via scalar combination, reducing the cost of ensembling to that of a single model.
1 code implementation • 16 Jul 2023 • Tian Yu Liu, Aditya Golatkar, Stefano Soatto
We introduce Tangent Attention Fine-Tuning (TAFT), a method for fine-tuning linearized transformers obtained by computing a First-order Taylor Expansion around a pre-trained initialization.
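For intuition, a first-order Taylor expansion of a model $f(w, x)$ around weights $w_0$ is $f_{\text{lin}}(w, x) = f(w_0, x) + \nabla_w f(w_0, x) \cdot (w - w_0)$. The sketch below builds that linearized function for a scalar-valued toy model, using central finite differences in place of automatic differentiation. It illustrates the expansion only; it is not TAFT, and `linearize` and the toy model are hypothetical names.

```python
from typing import Callable, List, Sequence

Model = Callable[[Sequence[float], float], float]


def linearize(f: Model, w0: Sequence[float], eps: float = 1e-5) -> Model:
    """Return f_lin(w, x) = f(w0, x) + grad_w f(w0, x) . (w - w0),
    with the gradient estimated by central finite differences."""
    w0 = list(w0)

    def f_lin(w: Sequence[float], x: float) -> float:
        out = f(w0, x)
        for i in range(len(w0)):
            plus: List[float] = list(w0)
            minus: List[float] = list(w0)
            plus[i] += eps
            minus[i] -= eps
            grad_i = (f(plus, x) - f(minus, x)) / (2 * eps)
            out += grad_i * (w[i] - w0[i])
        return out

    return f_lin


def toy_model(w: Sequence[float], x: float) -> float:
    """A model that is nonlinear in its single weight."""
    return w[0] ** 2 * x


toy_lin = linearize(toy_model, [2.0])
```

The linearized model matches the original at `w0` and is linear in `w` everywhere, so fine-tuning it with a convex loss is a convex problem in the weights.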
no code implementations • 6 Jun 2023 • Chethan Parameshwara, Alessandro Achille, Xiaolong Li, Jiawei Mo, Matthew Trager, Ashwin Swaminathan, Cj Taylor, Dheera Venkatraman, Xiaohan Fei, Stefano Soatto
We describe a first step towards learning general-purpose visual representations of physical scenes using only image prediction as a training criterion.
no code implementations • 1 Jun 2023 • Pramuditha Perera, Matthew Trager, Luca Zancato, Alessandro Achille, Stefano Soatto
We investigate whether prompts learned independently for different tasks can be later combined through prompt algebra to obtain a model that supports composition of tasks.
no code implementations • 29 May 2023 • Stefano Soatto, Paulo Tabuada, Pratik Chaudhari, Tian Yu Liu
We then characterize the subset of meanings that can be reached by the state of the LLMs for some input prompt, and show that a well-trained bot can reach any meaning albeit with small probability.
no code implementations • CVPR 2024 • Qin Zhang, Dongsheng An, Tianjun Xiao, Tong He, Qingming Tang, Ying Nian Wu, Joseph Tighe, Yifan Xing, Stefano Soatto
In deep metric learning for visual recognition, the calibration of distance thresholds is crucial for achieving desired model performance in the true positive rates (TPR) or true negative rates (TNR).
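To make the role of the distance threshold concrete, the sketch below computes TPR and TNR for a given threshold on a set of labeled pairs. The convention that a pair is predicted "same" when its distance is at most the threshold is an assumption, as are the function and argument names.

```python
from typing import Sequence, Tuple


def tpr_tnr(distances: Sequence[float], is_same: Sequence[bool],
            threshold: float) -> Tuple[float, float]:
    """Return (TPR, TNR) when pairs with distance <= threshold are predicted 'same'."""
    tp = fn = tn = fp = 0
    for d, same in zip(distances, is_same):
        predicted_same = d <= threshold
        if same and predicted_same:
            tp += 1
        elif same:
            fn += 1
        elif predicted_same:
            fp += 1
        else:
            tn += 1
    tpr = tp / (tp + fn) if (tp + fn) else float("nan")
    tnr = tn / (tn + fp) if (tn + fp) else float("nan")
    return tpr, tnr
```

Raising the threshold can only increase TPR and decrease TNR, which is why the threshold has to be calibrated against whichever rate the application targets.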
1 code implementation • 11 May 2023 • Zhaoyang Zhang, Yantao Shen, Kunyu Shi, Zhaowei Cai, Jun Fang, Siqi Deng, Hao Yang, Davide Modolo, Zhuowen Tu, Stefano Soatto
We present a vision-language model whose parameters are jointly trained on all tasks and fully shared among multiple heterogeneous tasks which may interfere with each other, resulting in a single model which we named Musketeer.
no code implementations • ICCV 2023 • Yonatan Dukler, Benjamin Bowman, Alessandro Achille, Aditya Golatkar, Ashwin Swaminathan, Stefano Soatto
We present Synergy Aware Forgetting Ensemble (SAFE), a method to adapt large models on a diverse collection of data while minimizing the expected cost to remove the influence of training samples from the trained model.
no code implementations • NeurIPS 2023 • Marco Fumero, Florian Wenzel, Luca Zancato, Alessandro Achille, Emanuele Rodolà, Stefano Soatto, Bernhard Schölkopf, Francesco Locatello
Recovering the latent factors of variation of high dimensional data has so far focused on simple synthetic settings.
no code implementations • 7 Apr 2023 • Alessandro Achille, Michael Kearns, Carson Klingenberg, Stefano Soatto
One potential fix for training corpus data defects is model disgorgement -- the elimination of not just the improperly used data, but also the effects of improperly used data on any component of an ML model.
no code implementations • 4 Apr 2023 • Dong Lao, Zhengyang Hu, Francesco Locatello, Yanchao Yang, Stefano Soatto
We introduce a method to segment the visual field into independently moving regions, trained with no ground truth or supervision.
no code implementations • CVPR 2023 • Luca Zancato, Alessandro Achille, Tian Yu Liu, Matthew Trager, Pramuditha Perera, Stefano Soatto
Second, we apply ${\rm T^3AR}$ for test-time adaptation and show that exploiting a pool of external images at test-time leads to more robust representations over existing methods on DomainNet-126 and VISDA-C, especially when few adaptation data are available (up to 8%).
no code implementations • 25 Mar 2023 • Stephanie Tsuei, Wenjie Mo, Stefano Soatto
In state estimation algorithms that use feature tracks as input, it is customary to assume that the errors in feature track positions are zero-mean Gaussian.
no code implementations • CVPR 2023 • Achin Jain, Gurumurthy Swaminathan, Paolo Favaro, Hao Yang, Avinash Ravichandran, Hrayr Harutyunyan, Alessandro Achille, Onkar Dabeer, Bernt Schiele, Ashwin Swaminathan, Stefano Soatto
The PPL improves the performance estimation on average by 37% across 16 classification and 33% across 10 detection datasets, compared to the power law.
no code implementations • ICCV 2023 • Matthew Trager, Pramuditha Perera, Luca Zancato, Alessandro Achille, Parminder Bhatia, Stefano Soatto
These vectors can be seen as "ideal words" for generating concepts directly within the embedding space of the model.
no code implementations • 15 Feb 2023 • Benjamin Bowman, Alessandro Achille, Luca Zancato, Matthew Trager, Pramuditha Perera, Giovanni Paolini, Stefano Soatto
During inference, models can be assembled based on arbitrary selections of data sources, which we call "à-la-carte learning".
no code implementations • CVPR 2023 • Hao Li, Charless Fowlkes, Hao Yang, Onkar Dabeer, Zhuowen Tu, Stefano Soatto
With thousands of historical training jobs, a recommendation system can be learned to predict the model selection score given the features of the dataset and the model as input.
no code implementations • CVPR 2023 • Benjamin Bowman, Alessandro Achille, Luca Zancato, Matthew Trager, Pramuditha Perera, Giovanni Paolini, Stefano Soatto
During inference, models can be assembled based on arbitrary selections of data sources, which we call a-la-carte learning.
no code implementations • CVPR 2023 • Akash Deep Singh, Yunhao Ba, Ankur Sarker, Howard Zhang, Achuta Kadambi, Stefano Soatto, Mani Srivastava, Alex Wong
To fuse radar depth with an image, we propose a gated fusion scheme that accounts for the confidence scores of the correspondence so that we selectively combine radar and camera embeddings to yield a dense depth map.
no code implementations • 23 Nov 2022 • Tian Yu Liu, Aditya Golatkar, Stefano Soatto, Alessandro Achille
We propose a lightweight continual learning method which incorporates information from specialized datasets incrementally, by integrating it along the vector field of "generalist" models.
1 code implementation • 14 Nov 2022 • Alexandre Tiard, Alex Wong, David Joon Ho, Yangchao Wu, Eliram Nof, Alvin C. Goh, Stefano Soatto, Saad Nadeem
Our method achieves state-of-the-art performance on several publicly available breast cancer datasets ranging from tumor classification (CAMELYON17) and subtyping (BRACS) to HER2 status classification and treatment response prediction.
1 code implementation • CVPR 2023 • Michael Kleinman, Alessandro Achille, Stefano Soatto
We show that the ability of a neural network to integrate information from diverse sources hinges critically on being exposed to properly correlated signals during the early phases of training.
1 code implementation • 11 Aug 2022 • Zhaowei Cai, Avinash Ravichandran, Paolo Favaro, Manchen Wang, Davide Modolo, Rahul Bhotika, Zhuowen Tu, Stefano Soatto
We study semi-supervised learning (SSL) for vision transformers (ViT), an under-explored topic despite the wide adoption of the ViT architectures to different tasks.
no code implementations • 3 Aug 2022 • Gukyeong Kwon, Zhaowei Cai, Avinash Ravichandran, Erhan Bas, Rahul Bhotika, Stefano Soatto
Instead of developing masked language modeling (MLM) and masked image modeling (MIM) independently, we propose to build joint masked vision and language modeling, where the masked signal of one modality is reconstructed with help from the other modality.
no code implementations • 25 Jul 2022 • Alessandro Achille, Stefano Soatto
We revisit the classic signal-to-symbol barrier in light of the remarkable ability of deep neural networks to generate realistic synthetic data.
no code implementations • 1 Jul 2022 • Mohamad Rida Rammal, Alessandro Achille, Aditya Golatkar, Suhas Diggavi, Stefano Soatto
We derive information theoretic generalization bounds for supervised learning algorithms based on a new measure of leave-one-out conditional mutual information (loo-CMI).
1 code implementation • 22 Jun 2022 • Yunhao Ba, Howard Zhang, Ethan Yang, Akira Suzuki, Arnold Pfahnl, Chethan Chinder Chandrappa, Celso de Melo, Suya You, Stefano Soatto, Alex Wong, Achuta Kadambi
We propose a large-scale dataset of real-world rainy and clean image pairs and a method to remove degradations, induced by rain streaks and rain accumulation, from the image.
1 code implementation • NeurIPS 2023 • Michael Kleinman, Alessandro Achille, Stefano Soatto, Jonathan Kao
We propose a notion of common information that allows one to quantify and separate the information that is shared between two random variables from the information that is unique to each.
1 code implementation • 12 May 2022 • Yue Zhao, Yantao Shen, Yuanjun Xiong, Shuo Yang, Wei Xia, Zhuowen Tu, Bernt Schiele, Stefano Soatto
Based on the observation, we present a method, called Ensemble Logit Difference Inhibition (ELODI), to train a classification system that achieves paragon performance in both error rate and NFR, at the inference cost of a single model.
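The two quantities named above can be computed as follows. The sketch assumes NFR stands for negative flip rate, taken here as the fraction of all samples that the old model classifies correctly and the new model classifies incorrectly; that reading, and the function names, are assumptions not spelled out in the text. The ELODI training procedure itself is not modeled.

```python
from typing import Sequence


def error_rate(labels: Sequence[int], preds: Sequence[int]) -> float:
    """Fraction of samples the model gets wrong."""
    return sum(p != y for p, y in zip(preds, labels)) / len(labels)


def negative_flip_rate(labels: Sequence[int], old_preds: Sequence[int],
                       new_preds: Sequence[int]) -> float:
    """Fraction of samples that the old model got right and the new model gets wrong."""
    flips = sum(
        (o == y) and (n != y)
        for y, o, n in zip(labels, old_preds, new_preds)
    )
    return flips / len(labels)
```

A new model can have a lower error rate than the old one and still have a nonzero negative flip rate, because the two models may err on different samples.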
no code implementations • 7 May 2022 • Shay Deutsch, Stefano Soatto
We introduce the Graph Sylvester Embedding (GSE), an unsupervised graph representation of local similarity, connectivity, and global structure.
no code implementations • 12 Apr 2022 • Zhaowei Cai, Gukyeong Kwon, Avinash Ravichandran, Erhan Bas, Zhuowen Tu, Rahul Bhotika, Stefano Soatto
In this paper, we study the challenging instance-wise vision-language tasks, where the free-form language is required to align with the objects instead of the whole image.
2 code implementations • CVPR 2022 • Tz-Ying Wu, Gurumurthy Swaminathan, Zhizhong Li, Avinash Ravichandran, Nuno Vasconcelos, Rahul Bhotika, Stefano Soatto
We hypothesize that a strong base model can provide a good representation for novel classes and incremental learning can be done with small adaptations.
no code implementations • CVPR 2022 • Jiarui Cai, Mingze Xu, Wei Li, Yuanjun Xiong, Wei Xia, Zhuowen Tu, Stefano Soatto
We propose an online tracking algorithm that performs the object detection and data association under a common framework, capable of linking objects after a long time span.
no code implementations • 30 Mar 2022 • Simone Bombari, Alessandro Achille, Zijian Wang, Yu-Xiang Wang, Yusheng Xie, Kunwar Yashraj Singh, Srikar Appalaraju, Vijay Mahadevan, Stefano Soatto
While bounding general memorization can have detrimental effects on the performance of a trained model, bounding RM does not prevent effective learning.
1 code implementation • CVPR 2022 • Matthew Wallingford, Hao Li, Alessandro Achille, Avinash Ravichandran, Charless Fowlkes, Rahul Bhotika, Stefano Soatto
TAPS solves a joint optimization problem which determines which layers to share with the base model and the value of the task-specific weights.
1 code implementation • CVPR 2022 • Pei Wang, Zhaowei Cai, Hao Yang, Gurumurthy Swaminathan, Nuno Vasconcelos, Bernt Schiele, Stefano Soatto
This is enabled by a unified architecture, Omni-DETR, based on the recent progress on student-teacher framework and end-to-end transformer based object detection.
Ranked #14 on Semi-Supervised Object Detection on COCO 2% labeled data
1 code implementation • 26 Mar 2022 • Dong Lao, Fengyu Yang, Daniel Wang, Hyoungseob Park, Samuel Lu, Alex Wong, Stefano Soatto
We choose monocular depth prediction as the geometric task, and semantic segmentation as the downstream semantic task, and design a collection of empirical tests by exploring different forms of supervision, training pipelines, and data sources for both depth pre-training and semantic fine-tuning.
no code implementations • CVPR 2022 • Aditya Golatkar, Alessandro Achille, Yu-Xiang Wang, Aaron Roth, Michael Kearns, Stefano Soatto
AdaMix incorporates few-shot training, or cross-modal zero-shot learning, on public data prior to private fine-tuning, to improve the trade-off.
no code implementations • 6 Jan 2022 • Pengkai Zhu, Zhaowei Cai, Yuanjun Xiong, Zhuowen Tu, Luis Goncalves, Vijay Mahadevan, Stefano Soatto
We present Contrastive Neighborhood Alignment (CNA), a manifold learning approach to maintain the topology of learned features whereby data points that are mapped to nearby representations by the source (teacher) model are also mapped to neighbors by the target (student) model.
1 code implementation • CVPR 2022 • Zachary Berger, Parth Agrawal, Tian Yu Liu, Stefano Soatto, Alex Wong
We study the effect of adversarial perturbations of images on deep stereo matching networks for the disparity estimation task.
no code implementations • ICLR 2022 • Yonatan Dukler, Alessandro Achille, Giovanni Paolini, Avinash Ravichandran, Marzia Polito, Stefano Soatto
A learning task is a function from a training set to the validation error, which can be represented by a trained deep neural network (DNN).
no code implementations • 29 Sep 2021 • Zhizhong Li, Avinash Ravichandran, Charless Fowlkes, Marzia Polito, Rahul Bhotika, Stefano Soatto
Indeed, we observe experimentally that standard distillation of task-specific teachers, or using these teacher representations directly, **reduces** downstream transferability compared to a task-agnostic generalist model.
no code implementations • 29 Sep 2021 • Luca Zancato, Alessandro Achille, Giovanni Paolini, Alessandro Chiuso, Stefano Soatto
After modeling the signals, we use an anomaly detection system based on the classic CUMSUM algorithm and a variational approximation of the $f$-divergence to detect both isolated point anomalies and change-points in statistics of the signals.
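For readers unfamiliar with cumulative-sum change detection, the sketch below shows the textbook one-sided cumulative-sum (CUSUM) detector for an upward shift in the mean. It assumes a known reference mean `mu`, a slack `k`, and an alarm threshold `h`, all of which are illustrative parameters; the variational $f$-divergence approximation mentioned above is not modeled.

```python
from typing import Optional, Sequence


def cusum_first_alarm(signal: Sequence[float], mu: float,
                      k: float, h: float) -> Optional[int]:
    """Return the index of the first sample at which the one-sided CUSUM
    statistic S_t = max(0, S_{t-1} + x_t - mu - k) exceeds h, or None."""
    s = 0.0
    for t, x in enumerate(signal):
        s = max(0.0, s + (x - mu - k))
        if s > h:
            return t
    return None
```

The slack `k` keeps the statistic near zero while the signal stays around `mu`, so an alarm requires a sustained deviation and not a single outlier below the threshold.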
1 code implementation • 18 Sep 2021 • Alex Wong, Allison Chen, Yangchao Wu, Safa Cicek, Alexandre Tiard, Byung-Woo Hong, Stefano Soatto
We propose a neural network architecture in the form of a standard encoder-decoder where predictions are guided by a spatial expansion embedding network.
1 code implementation • ICCV 2021 • Alex Wong, Stefano Soatto
At inference time, the calibration of the camera, which can be different than the one used for training, is fed as an input to the network along with the sparse point cloud and a single image.
Ranked #2 on Depth Completion on VOID
no code implementations • ICCV 2021 • Tong He, Yuanlu Xu, Shunsuke Saito, Stefano Soatto, Tony Tung
We present ARCH++, an image-based method to reconstruct 3D avatars with arbitrary clothing styles.
Ranked #1 on 3D Object Reconstruction From A Single Image on RenderPeople (using extra training data)
3D Object Reconstruction From A Single Image • Image-to-Image Translation
1 code implementation • 3 Aug 2021 • Alexander Schperberg, Stephanie Tsuei, Stefano Soatto, Dennis Hong
We present an end-to-end online motion planning framework that uses a data-driven approach to navigate a heterogeneous robot team towards a global goal while avoiding obstacles in uncertain environments.
2 code implementations • NeurIPS 2021 • Sébastien M. R. Arnold, Guneet S. Dhillon, Avinash Ravichandran, Stefano Soatto
Episodic training is a core ingredient of few-shot learning to train models on tasks with limited labelled data.
no code implementations • 16 Jul 2021 • Zhizhong Li, Avinash Ravichandran, Charless Fowlkes, Marzia Polito, Rahul Bhotika, Stefano Soatto
Traditionally, distillation has been used to train a student model to emulate the input/output functionality of a teacher.
2 code implementations • NeurIPS 2021 • Mingze Xu, Yuanjun Xiong, Hao Chen, Xinyu Li, Wei Xia, Zhuowen Tu, Stefano Soatto
We present Long Short-term TRansformer (LSTR), a temporal modeling algorithm for online action detection, which employs a long- and short-term memory mechanism to model prolonged sequence data.
Ranked #3 on Online Action Detection on TVSeries
3 code implementations • ICCV 2021 • Yifan Xing, Tong He, Tianjun Xiao, Yongxin Wang, Yuanjun Xiong, Wei Xia, David Wipf, Zheng Zhang, Stefano Soatto
Our hierarchical GNN uses a novel approach to merge connected components predicted at each level of the hierarchy to form a new graph at the next level.
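One way to picture merging connected components into the nodes of a next-level graph is sketched below. It is a generic illustration, not the paper's approach: components are found with union-find over a given list of predicted edges, and each new node's feature is taken as the mean of its members' features, which is an assumption made for this example.

```python
from typing import Dict, List, Sequence, Tuple


def component_labels(num_nodes: int, edges: Sequence[Tuple[int, int]]) -> List[int]:
    """Label each node with the index of its connected component (union-find)."""
    parent = list(range(num_nodes))

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    root_to_label: Dict[int, int] = {}
    labels: List[int] = []
    for v in range(num_nodes):
        labels.append(root_to_label.setdefault(find(v), len(root_to_label)))
    return labels


def coarsen(features: Sequence[Sequence[float]], labels: Sequence[int]) -> List[List[float]]:
    """Build next-level node features as the mean feature of each component."""
    n_groups = max(labels) + 1
    dim = len(features[0])
    sums = [[0.0] * dim for _ in range(n_groups)]
    counts = [0] * n_groups
    for feat, g in zip(features, labels):
        counts[g] += 1
        for j, v in enumerate(feat):
            sums[g][j] += v
    return [[s / counts[g] for s in sums[g]] for g in range(n_groups)]
```

Applying `component_labels` and `coarsen` repeatedly, with new edges predicted at each level, yields a hierarchy in which each level has no more nodes than the one below it.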
no code implementations • 25 Jun 2021 • Stephanie Tsuei, Aditya Golatkar, Stefano Soatto
We propose a method to estimate the uncertainty of the outcome of an image classifier on a given input datum.
no code implementations • 16 Jun 2021 • Lanlan Liu, Yuting Zhang, Jia Deng, Stefano Soatto
Recent work introduced progressive network growing as a promising way to ease the training for large GANs, but the model design and architecture-growing strategy remain under-explored and need manual design for different image data.
no code implementations • 8 Jun 2021 • Siqi Deng, Yuanjun Xiong, Meng Wang, Wei Xia, Stefano Soatto
The common implementation of face recognition systems as a cascade of a detection stage and a recognition or verification stage can cause problems beyond failures of the detector.
1 code implementation • 6 Jun 2021 • Alex Wong, Safa Cicek, Stefano Soatto
We present a method for inferring dense depth maps from images and sparse depth measurements by leveraging synthetic data to learn the association of sparse point clouds with dense natural shapes, and using the image as evidence to validate the predicted depth map.
Ranked #3 on Depth Completion on VOID
1 code implementation • 6 Jun 2021 • Alex Wong, Xiaohan Fei, Byung-Woo Hong, Stefano Soatto
We present a method to infer a dense depth map from a color image and associated sparse depth measurements.
no code implementations • CVPR 2021 • Rahul Duggal, Hao Zhou, Shuo Yang, Yuanjun Xiong, Wei Xia, Zhuowen Tu, Stefano Soatto
Existing systems use the same embedding model to compute representations (embeddings) for the query and gallery images.
no code implementations • ACL 2021 • Yuqing Xie, Yi-An Lai, Yuanjun Xiong, Yi Zhang, Stefano Soatto
Behavior of deep neural networks can be inconsistent between different versions.
no code implementations • ICCV 2021 • Qi Dong, Zhuowen Tu, Haofu Liao, Yuting Zhang, Vijay Mahadevan, Stefano Soatto
Computer vision applications such as visual relationship detection and human object interaction can be formulated as a composite (structured) set detection problem in which both the parts (subject, object, and predicate) and the sum (triplet as a whole) are to be detected in a hierarchical fashion.
no code implementations • CVPR 2021 • Xinzhu Bei, Yanchao Yang, Stefano Soatto
The appearance of the scene is warped from past frames using the predicted motion in co-visible regions; dis-occluded regions are synthesized with content-aware inpainting utilizing the predicted scene layout.
no code implementations • ICLR Workshop Neural_Compression 2021 • Michael Kleinman, Alessandro Achille, Stefano Soatto, Jonathan Kao
We introduce the Redundant Information Neural Estimator (RINE), a method that allows efficient estimation of the component of information about a target variable that is common to a set of sources, previously referred to as the “redundant information.” We show that existing definitions of the redundant information can be recast in terms of an optimization over a family of deterministic or stochastic functions.
no code implementations • 29 Jan 2021 • Aditya Deshpande, Alessandro Achille, Avinash Ravichandran, Hao Li, Luca Zancato, Charless Fowlkes, Rahul Bhotika, Stefano Soatto, Pietro Perona
Since all model selection algorithms in the literature have been tested on different use-cases and never compared directly, we introduce a new comprehensive benchmark for model selection comprising: i) a model zoo of single and multi-domain models, and ii) many target tasks.
no code implementations • 26 Jan 2021 • Orchid Majumder, Avinash Ravichandran, Subhransu Maji, Alessandro Achille, Marzia Polito, Stefano Soatto
In this work we investigate the complementary roles of these two sources of information by combining instance-discriminative contrastive learning and supervised learning in a single framework called Supervised Momentum Contrastive learning (SUPMOCO).
1 code implementation • CVPR 2021 • Zhaowei Cai, Avinash Ravichandran, Subhransu Maji, Charless Fowlkes, Zhuowen Tu, Stefano Soatto
We present a plug-in replacement for batch normalization (BN) called exponential moving average normalization (EMAN), which improves the performance of existing student-teacher based self- and semi-supervised learning techniques.
Self-Supervised Learning
Semi-Supervised Image Classification
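The EMAN entry above names the mechanism but does not spell it out. As a minimal sketch, assuming the teacher's normalization statistics are kept as an exponential moving average of the student's statistics instead of being recomputed per batch, the update could look as follows. The names `ema_update`, `normalize`, and `momentum` are hypothetical and are not taken from the paper's code.

```python
import numpy as np

def ema_update(teacher, student, momentum=0.99):
    """Move every teacher entry toward the matching student entry (in place).

    `teacher` and `student` are dicts of arrays with the same keys, e.g. the
    running mean and variance of one normalization layer (assumed layout).
    """
    for name, student_value in student.items():
        teacher[name] = momentum * teacher[name] + (1.0 - momentum) * student_value
    return teacher

def normalize(x, mean, var, eps=1e-5):
    """Normalize x with fixed statistics, as the teacher would at each step."""
    return (x - mean) / np.sqrt(var + eps)

# Usage: the teacher never looks at batch statistics, only at the averaged ones.
teacher_stats = {"mean": np.zeros(2), "var": np.ones(2)}
student_stats = {"mean": np.ones(2), "var": 3.0 * np.ones(2)}
ema_update(teacher_stats, student_stats, momentum=0.9)
features = normalize(np.array([0.5, 1.5]), teacher_stats["mean"], teacher_stats["var"])
```

A momentum close to 1 makes the teacher statistics change slowly, which is the usual motivation for averaging in student-teacher setups; the value used here is illustrative only.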
1 code implementation • ICLR 2021 • Hrayr Harutyunyan, Alessandro Achille, Giovanni Paolini, Orchid Majumder, Avinash Ravichandran, Rahul Bhotika, Stefano Soatto
We define a notion of information that an individual sample provides to the training of a neural network, and we specialize it to measure both how much a sample informs the final weights and how much it informs the function computed by the weights.
2 code implementations • ICLR 2021 • Giovanni Paolini, Ben Athiwaratkun, Jason Krone, Jie Ma, Alessandro Achille, Rishita Anubhai, Cicero Nogueira dos Santos, Bing Xiang, Stefano Soatto
We propose a new framework, Translation between Augmented Natural Languages (TANL), to solve many structured prediction language tasks including joint entity and relation extraction, nested named entity recognition, relation classification, semantic role labeling, event extraction, coreference resolution, and dialogue state tracking.
Ranked #3 on Relation Classification on TACRED
no code implementations • CVPR 2021 • Aditya Golatkar, Alessandro Achille, Avinash Ravichandran, Marzia Polito, Stefano Soatto
We show that the influence of a subset of the training samples can be removed -- or "forgotten" -- from the weights of a network trained on large-scale image classification tasks, and we provide strong computable bounds on the amount of remaining information after forgetting.
no code implementations • CVPR 2021 • Alessandro Achille, Aditya Golatkar, Avinash Ravichandran, Marzia Polito, Stefano Soatto
Classifiers that are linear in their parameters, and trained by optimizing a convex loss function, have predictable behavior with respect to changes in the training data, initial conditions, and optimization.
no code implementations • CVPR 2021 • Sijie Yan, Yuanjun Xiong, Kaustav Kundu, Shuo Yang, Siqi Deng, Meng Wang, Wei Xia, Stefano Soatto
Reducing inconsistencies in the behavior of different versions of an AI system can be as important in practice as reducing its overall error.
no code implementations • 30 Sep 2020 • Shay Deutsch, Stefano Soatto
We introduce an unsupervised graph embedding that trades off local node similarity and connectivity, and global structure.
1 code implementation • 21 Sep 2020 • Alex Wong, Mukund Mundhra, Stefano Soatto
We study the effect of adversarial perturbations of images on the estimates of disparity by deep learning models trained for stereo.
no code implementations • NeurIPS 2020 • Luca Zancato, Alessandro Achille, Avinash Ravichandran, Rahul Bhotika, Stefano Soatto
We tackle the problem of predicting the number of optimization steps that a pre-trained deep network needs to converge to a given value of the loss function.
no code implementations • CVPR 2021 • Yanchao Yang, Brian Lai, Stefano Soatto
Then, it uses the segments to learn object models that can be used for detection in a static image.
no code implementations • 28 Jul 2020 • Alexander Schperberg, Kenny Chen, Stephanie Tsuei, Michael Jewett, Joshua Hooks, Stefano Soatto, Ankur Mehta, Dennis Hong
In this paper, we propose an online path planning architecture that extends the model predictive control (MPC) formulation to consider future location uncertainties for safer navigation through cluttered environments.
1 code implementation • NeurIPS 2020 • Tong He, John Collomosse, Hailin Jin, Stefano Soatto
We propose Geo-PIFu, a method to recover a 3D mesh from a monocular color image of a clothed person.
1 code implementation • NeurIPS 2020 • Alex Wong, Safa Cicek, Stefano Soatto
We study the effect of adversarial perturbations on the task of monocular depth prediction.
no code implementations • 14 Apr 2020 • Kensuke Nakamura, Stefano Soatto, Byung-Woo Hong
We propose a first-order stochastic optimization algorithm incorporating adaptive regularization, applicable to machine learning problems in a deep learning framework.
1 code implementation • CVPR 2020 • Yanchao Yang, Yutong Chen, Stefano Soatto
We describe a method to train a generative model with latent factors that are (approximately) independent and localized.
3 code implementations • CVPR 2020 • Yanchao Yang, Stefano Soatto
We describe a simple method for unsupervised domain adaptation, whereby the discrepancy between the source and target distributions is reduced by swapping the low-frequency spectrum of one with the other.
Ranked #4 on Domain Adaptation on Panoptic SYNTHIA-to-Mapillary
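The low-frequency swap described in the domain adaptation entry above can be sketched with a 2-D FFT. This is an illustrative sketch under stated assumptions, not the authors' implementation: it swaps only the amplitude and keeps the source phase, it works on single-channel images, and the function name `swap_low_frequency` and the parameter `beta` are hypothetical.

```python
import numpy as np

def swap_low_frequency(source, target, beta=0.1):
    """Give `source` the low-frequency amplitude of `target`.

    Both inputs are 2-D float arrays of the same shape. `beta` sets the
    half-width of the swapped square as a fraction of the smaller side.
    """
    # Shift so that the zero frequency sits at index (h // 2, w // 2).
    src_fft = np.fft.fftshift(np.fft.fft2(source))
    tgt_fft = np.fft.fftshift(np.fft.fft2(target))

    amplitude, phase = np.abs(src_fft), np.angle(src_fft)
    tgt_amplitude = np.abs(tgt_fft)

    h, w = source.shape
    half = int(np.floor(min(h, w) * beta))
    cy, cx = h // 2, w // 2
    rows = slice(cy - half, cy + half + 1)
    cols = slice(cx - half, cx + half + 1)
    amplitude[rows, cols] = tgt_amplitude[rows, cols]

    # Recombine the mixed amplitude with the untouched source phase.
    mixed = amplitude * np.exp(1j * phase)
    return np.real(np.fft.ifft2(np.fft.ifftshift(mixed)))

# Usage: the output keeps the source structure with the target's coarse statistics.
rng = np.random.default_rng(0)
source_image = rng.random((64, 64))
target_image = rng.random((64, 64)) + 1.0
adapted = swap_low_frequency(source_image, target_image, beta=0.05)
```

With `beta=0` only the zero-frequency term is exchanged, so the result is the source image shifted to the target's mean intensity; larger values transfer progressively finer target statistics.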
1 code implementation • CVPR 2020 • Yanchao Yang, Dong Lao, Ganesh Sundaramoorthi, Stefano Soatto
We introduce two criteria to regularize the optimization involved in learning a classifier in a domain where no annotated data are available, leveraging annotated data in a different domain, a problem known as unsupervised domain adaptation.
3 code implementations • CVPR 2020 • Yantao Shen, Yuanjun Xiong, Wei Xia, Stefano Soatto
Backward compatibility is critical to quickly deploy new embedding models that leverage ever-growing large-scale training datasets and improvements in deep learning architectures and training methods.
1 code implementation • ECCV 2020 • Aditya Golatkar, Alessandro Achille, Stefano Soatto
We describe a procedure for removing dependency on a cohort of training data from a trained deep network that improves upon and generalizes previous methods to different readout functions and can be extended to ensure forgetting in the activations of the network.
1 code implementation • ICLR 2020 • Hao Li, Pratik Chaudhari, Hao Yang, Michael Lam, Avinash Ravichandran, Rahul Bhotika, Stefano Soatto
Our findings challenge common practices of fine-tuning and encourage deep learning practitioners to rethink the hyperparameters for fine-tuning.
no code implementations • 13 Feb 2020 • Xialei Liu, Hao Yang, Avinash Ravichandran, Rahul Bhotika, Stefano Soatto
For the difficult cases, where the domain gaps and especially category differences are large, we explore three different exemplar sampling methods and show that the proposed adaptive sampling method is effective at selecting diverse and informative samples from entire datasets, further preventing forgetting.
no code implementations • 11 Feb 2020 • Qing Liu, Orchid Majumder, Alessandro Achille, Avinash Ravichandran, Rahul Bhotika, Stefano Soatto
The majority of modern meta-learning methods for few-shot classification operate in two phases: a meta-training phase, where the meta-learner learns a generic representation by solving multiple few-shot tasks sampled from a large dataset, and a testing phase, where the meta-learner leverages its learned internal representation for a specific few-shot task involving classes that were not seen during meta-training.
1 code implementation • 6 Dec 2019 • Albert Zhao, Tong He, Yitao Liang, Haibin Huang, Guy Van Den Broeck, Stefano Soatto
To learn this representation, we train a squeeze network to drive using annotations for the side task as input.
2 code implementations • CVPR 2020 • Aditya Golatkar, Alessandro Achille, Stefano Soatto
We explore the problem of selectively forgetting a particular subset of the data used for training a deep neural network.
2 code implementations • ICLR 2020 • Rasool Fakoor, Pratik Chaudhari, Stefano Soatto, Alexander J. Smola
This paper introduces Meta-Q-Learning (MQL), a new off-policy algorithm for meta-Reinforcement Learning (meta-RL).
no code implementations • 25 Sep 2019 • Alessandro Achille, Stefano Soatto
We relate this to the Information in the Weights, and use this result to show that models of low (information) complexity not only generalize better, but are bound to learn invariant representations of future inputs.
3 code implementations • ICLR 2020 • Guneet S. Dhillon, Pratik Chaudhari, Avinash Ravichandran, Stefano Soatto
When fine-tuned transductively, this outperforms the current state-of-the-art on standard datasets such as Mini-ImageNet, Tiered-ImageNet, CIFAR-FS and FC-100 with the same hyper-parameters.
no code implementations • 2 Aug 2019 • Cuong V. Nguyen, Alessandro Achille, Michael Lam, Tal Hassner, Vijay Mahadevan, Stefano Soatto
As an application, we apply our procedure to study two properties of a task sequence: (1) total complexity and (2) sequential heterogeneity.
no code implementations • NeurIPS 2019 • Aditya Golatkar, Alessandro Achille, Stefano Soatto
Deep neural networks (DNNs), however, challenge this view: We show that removing regularization after an initial transient period has little effect on generalization, even if the final loss landscape is the same as if there had been no regularization.
no code implementations • 29 May 2019 • Alessandro Achille, Giovanni Paolini, Stefano Soatto
We establish a novel relation between the information in the weights and the effective information in the activations, and use this result to show that models with low (information) complexity not only generalize better, but are bound to learn invariant representations of future inputs.
no code implementations • ICCV 2019 • Safa Cicek, Stefano Soatto
We propose a method for unsupervised domain adaptation that trains a shared embedding to align the joint distributions of inputs (domain) and outputs (classes), making any classifier agnostic to the domain.
2 code implementations • 15 May 2019 • Alex Wong, Xiaohan Fei, Stephanie Tsuei, Stefano Soatto
Our method first constructs a piecewise planar scaffolding of the scene, and then uses it to infer dense depth using the image along with the sparse points.
Ranked #4 on Depth Completion on VOID
no code implementations • ICCV 2019 • Avinash Ravichandran, Rahul Bhotika, Stefano Soatto
We propose a method for learning embeddings for few-shot learning that is suitable for use with any number of ways and any number of shots (shot-free).
no code implementations • ICLR 2019 • Alessandro Achille, Matteo Rovere, Stefano Soatto
Deficits that do not affect low-level statistics, such as vertical flipping of the images, have no lasting effect on performance and can be overcome with further training.
7 code implementations • CVPR 2019 • Kwonjoon Lee, Subhransu Maji, Avinash Ravichandran, Stefano Soatto
We propose to use these predictors as base learners to learn representations for few-shot learning and show they offer better tradeoffs between feature size and performance across a range of few-shot recognition benchmarks.
Ranked #12 on Few-Shot Image Classification on FC100 5-way (1-shot)
no code implementations • 5 Apr 2019 • Alessandro Achille, Giovanni Paolini, Glen Mbeng, Stefano Soatto
Our framework is the first to measure complexity in a way that accounts for the effect of the optimization scheme, which is critical in Deep Learning.
1 code implementation • CVPR 2019 • Alex Wong, Byung-Woo Hong, Stefano Soatto
Supervised learning methods to infer (hypothesize) depth of a scene from a single image require costly per-pixel ground-truth.
no code implementations • 15 Mar 2019 • Shay Deutsch, Andrea Bertozzi, Stefano Soatto
We introduce the isoperimetric loss as a regularization criterion for learning the map from a visual representation to a semantic embedding, to be used to transfer knowledge to unknown classes in a zero-shot learning setting.
1 code implementation • ICCV 2019 • Alessandro Achille, Michael Lam, Rahul Tewari, Avinash Ravichandran, Subhransu Maji, Charless Fowlkes, Stefano Soatto, Pietro Perona
We demonstrate that this embedding is capable of predicting task similarities that match our intuition about semantic and taxonomic relations between different visual tasks (e.g., tasks based on classifying different types of plants are similar). We also demonstrate the practical value of this framework for the meta-task of selecting a pre-trained feature extractor for a new task.
no code implementations • CVPR 2019 • Yanchao Yang, Alex Wong, Stefano Soatto
We present a deep learning system to infer the posterior distribution of a dense depth map associated with an image, by exploiting sparse range measurements, for instance from a lidar.
Ranked #5 on Depth Completion on VOID
no code implementations • 11 Jan 2019 • Tong He, Stefano Soatto
We present a method to infer 3D pose and shape of vehicles from a single image.
1 code implementation • CVPR 2019 • Yanchao Yang, Antonio Loquercio, Davide Scaramuzza, Stefano Soatto
We propose an adversarial contextual model for detecting moving objects in images.
no code implementations • CVPR 2019 • Tong He, Haibin Huang, Li Yi, Yuqian Zhou, Chi-Hao Wu, Jue Wang, Stefano Soatto
Surface-based geodesic topology provides strong cues for object semantic analysis and geometric modeling.
no code implementations • 4 Oct 2018 • Alessandro Achille, Glen Mbeng, Stefano Soatto
We compute the transition probability between two learning tasks, and show that it decomposes into two factors.
2 code implementations • 30 Jul 2018 • Xiaohan Fei, Alex Wong, Stefano Soatto
We propose using global orientation from inertial measurements, and the bias it induces on the shape of objects populating the scene, to inform visual 3D reconstruction.
1 code implementation • ECCV 2018 • Yanchao Yang, Stefano Soatto
On the other hand, fully supervised methods learn the regularity in the annotated data, without explicit regularization and with the risk of overfitting.
no code implementations • ECCV 2018 • Xiaohan Fei, Stefano Soatto
We present a method to populate an unknown environment with models of previously seen objects, placed in a Euclidean reference frame that is inferred causally and on-line using monocular video along with inertial sensors.
no code implementations • CVPR 2018 • Alhussein Fawzi, Seyed-Mohsen Moosavi-Dezfooli, Pascal Frossard, Stefano Soatto
We specifically study the topology of classification regions created by deep networks, as well as their associated decision boundary.
no code implementations • 23 May 2018 • Safa Cicek, Stefano Soatto
We propose regularizing the empirical loss for semi-supervised learning by acting on both the input (data) space, and the weight (parameter) space.
1 code implementation • ECCV 2018 • Safa Cicek, Alhussein Fawzi, Stefano Soatto
We introduce the SaaS Algorithm for semi-supervised learning, which uses learning speed during stochastic gradient descent in a deep neural network to measure the quality of an iterative estimate of the posterior probability of unknown labels.
no code implementations • CVPR 2018 • Simon Korman, Mark Milam, Stefano Soatto
We present a novel approach to template matching that is efficient, can handle partial occlusions, and comes with provable performance guarantees.
no code implementations • 13 Dec 2017 • Rene Vidal, Joan Bruna, Raja Giryes, Stefano Soatto
Recently there has been a dramatic increase in the performance of recognition systems due to the introduction of deep architectures for representation learning and classification.
no code implementations • 26 Nov 2017 • Jameson Merkow, Robert Lufkin, Kim Nguyen, Stefano Soatto, Zhuowen Tu, Andrea Vedaldi
Thus, DeepRadiologyNet enables significant reduction in the workload of human radiologists by automatically filtering studies and reporting on the high-confidence ones at an operating point well below the literal error rate for US Board Certified radiologists, estimated at 0.82%.
1 code implementation • 24 Nov 2017 • Alessandro Achille, Matteo Rovere, Stefano Soatto
Deficits that do not affect low-level statistics, such as vertical flipping of the images, have no lasting effect on performance and can be overcome with further training.
no code implementations • 20 Nov 2017 • Kensuke Nakamura, Stefano Soatto, Byung-Woo Hong
We present a stochastic first-order optimization algorithm, named BCSC, that adds a cyclic constraint to stochastic block-coordinate descent.
no code implementations • 9 Nov 2017 • Alessandro Achille, Stefano Soatto
Again this can be finitely-parametrized using a deep neural network, and already some applications are beginning to emerge.
no code implementations • ICLR 2018 • Pratik Chaudhari, Stefano Soatto
So SGD does perform variational inference, but for a different loss than the one used to compute the gradients.
no code implementations • 3 Jul 2017 • Pratik Chaudhari, Carlo Baldassi, Riccardo Zecchina, Stefano Soatto, Ameet Talwalkar, Adam Oberman
We propose a new algorithm called Parle for parallel training of deep networks that converges 2-4x faster than a data-parallel implementation of SGD, while achieving significantly improved error rates that are nearly state-of-the-art on several benchmarks including CIFAR-10 and CIFAR-100, without introducing any additional hyper-parameters.
no code implementations • CVPR 2017 • Yanchao Yang, Stefano Soatto
We introduce a method to compute optical flow at multiple scales of motion, without resorting to multi-resolution or combinatorial methods.
no code implementations • CVPR 2017 • Shay Deutsch, Soheil Kolouri, Kyungnam Kim, Yuri Owechko, Stefano Soatto
We address zero-shot learning using a new manifold alignment framework based on a localized multi-scale transform on graphs.
no code implementations • CVPR 2017 • Jingming Dong, Xiaohan Fei, Stefano Soatto
We describe a system to detect objects in three-dimensional space using video and inertial sensors (accelerometer and gyrometer), ubiquitous in modern mobile platforms from phones to drones.
no code implementations • 5 Jun 2017 • Alessandro Achille, Stefano Soatto
Using established principles from Statistics and Information Theory, we show that invariance to nuisance factors in a deep neural network is equivalent to information minimality of the learned representation, and that stacking layers and injecting noise during training naturally bias the network towards learning invariant representations.
no code implementations • ECCV 2018 • Nikolaos Karianakis, Zicheng Liu, Yinpeng Chen, Stefano Soatto
We address the problem of person re-identification from commodity depth sensors.
no code implementations • 26 May 2017 • Alhussein Fawzi, Seyed-Mohsen Moosavi-Dezfooli, Pascal Frossard, Stefano Soatto
The goal of this paper is to analyze the geometric properties of deep neural network classifiers in the input space.
no code implementations • ICLR 2018 • Seyed-Mohsen Moosavi-Dezfooli, Alhussein Fawzi, Omar Fawzi, Pascal Frossard, Stefano Soatto
Deep networks have recently been shown to be vulnerable to universal perturbations: there exist very small image-agnostic perturbations that cause most natural images to be misclassified by such classifiers.
no code implementations • 9 May 2017 • Byung-Woo Hong, Ja-Keoung Koo, Martin Burger, Stefano Soatto
We present an adaptive regularization scheme for optimizing composite energy functionals arising in image analysis problems.
no code implementations • 17 Apr 2017 • Pratik Chaudhari, Adam Oberman, Stanley Osher, Stefano Soatto, Guillaume Carlier
In this paper we establish a connection between non-convex optimization methods for training deep neural networks and nonlinear partial differential equations (PDEs).
no code implementations • 27 Feb 2017 • Byung-Woo Hong, Ja-Keoung Koo, Stefano Soatto
We present a variational multi-label segmentation algorithm based on a robust Huber loss for both the data and the regularizer, minimized within a convex optimization framework.
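For readers unfamiliar with the Huber loss mentioned in the entry above, the sketch below shows one common parameterization: quadratic for small residuals and linear for large ones. The threshold name `delta` and the exact scaling are assumptions; the paper may use a different but equivalent form.

```python
import numpy as np

def huber(residual, delta=1.0):
    """Element-wise Huber loss: 0.5 * r^2 near zero, linear growth beyond delta."""
    r = np.asarray(residual, dtype=float)
    abs_r = np.abs(r)
    quadratic = 0.5 * r ** 2
    linear = delta * (abs_r - 0.5 * delta)
    return np.where(abs_r <= delta, quadratic, linear)

# Usage: large residuals (outliers) are penalized far less than under a squared loss.
residuals = np.array([-3.0, -0.5, 0.0, 0.5, 3.0])
losses = huber(residuals, delta=1.0)
```

The two branches agree in value and slope at `|r| = delta`, which keeps the loss continuously differentiable while limiting the influence of outliers.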
2 code implementations • 6 Nov 2016 • Pratik Chaudhari, Anna Choromanska, Stefano Soatto, Yann LeCun, Carlo Baldassi, Christian Borgs, Jennifer Chayes, Levent Sagun, Riccardo Zecchina
This paper proposes a new optimization algorithm called Entropy-SGD for training deep neural networks that is motivated by the local geometry of the energy landscape.
1 code implementation • 4 Nov 2016 • Alessandro Achille, Stefano Soatto
The cross-entropy loss commonly used in deep learning is closely related to the defining properties of optimal representations, but does not enforce some of the key properties.
no code implementations • 7 Aug 2016 • Thomas Goldstein, Paul Hand, Choongbum Lee, Vladislav Voroninski, Stefano Soatto
We introduce a new method for location recovery from pair-wise directions that leverages an efficient convex program that comes with exact recovery guarantees, even in the presence of adversarial outliers.
no code implementations • 13 Jun 2016 • Jingming Dong, Xiaohan Fei, Stefano Soatto
We describe a system to detect objects in three-dimensional space using video and inertial sensors (accelerometer and gyrometer), ubiquitous in modern mobile platforms from phones to drones.
no code implementations • 19 Jan 2016 • Hossein Mobahi, Stefano Soatto
Can it suggest new algorithms with reduced computational complexity or new descriptors with better accuracy for matching?
no code implementations • ICCV 2015 • Yanchao Yang, Ganesh Sundaramoorthi, Stefano Soatto
We propose a method to detect disocclusion in video sequences of three-dimensional scenes and to partition the disoccluded regions into objects, defined by coherent deformation corresponding to surfaces in the scene.
no code implementations • 20 Nov 2015 • Xiaohan Fei, Konstantine Tsotsos, Stefano Soatto
We propose a data structure obtained by hierarchically averaging bag-of-word descriptors during a sequence of views that achieves average speedups in large-scale loop closure applications ranging from 4 to 20 times on benchmark datasets.
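The entry above does not give the construction in detail. As a rough sketch, assuming the hierarchy is built by repeatedly averaging groups of temporally consecutive bag-of-words histograms, a pyramid of descriptors could be formed as follows. The function name `build_hierarchy` and the `branching` parameter are hypothetical, and the actual data structure in the paper may differ.

```python
import numpy as np

def build_hierarchy(descriptors, branching=2):
    """Return a list of levels, from per-view descriptors up to a single root.

    Each level averages `branching` consecutive descriptors of the level below,
    so a coarse node summarizes a contiguous run of views.
    """
    if branching < 2:
        raise ValueError("branching must be at least 2")
    levels = [np.asarray(descriptors, dtype=float)]
    while len(levels[-1]) > 1:
        current = levels[-1]
        groups = [current[i:i + branching].mean(axis=0)
                  for i in range(0, len(current), branching)]
        levels.append(np.stack(groups))
    return levels

# Usage: eight views with a five-word vocabulary give levels of 8, 4, 2 and 1 nodes.
rng = np.random.default_rng(0)
view_histograms = rng.random((8, 5))
pyramid = build_hierarchy(view_histograms, branching=2)
```

A query descriptor can then be compared against the few coarse nodes first and only descend into promising branches, which is one plausible source of the reported speedups.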
no code implementations • 20 Nov 2015 • Pratik Chaudhari, Stefano Soatto
Specifically, we show that a regularization term akin to a magnetic field can be modulated with a single scalar parameter to transition the loss function from a complex, non-convex landscape with exponentially many local minima, to a phase with a polynomial number of minima, all the way down to a trivial landscape with a unique minimum.
no code implementations • CVPR 2015 • Jingming Dong, Nikolaos Karianakis, Damek Davis, Joshua Hernandez, Jonathan Balzer, Stefano Soatto
We frame the problem of local representation of imaging data as the computation of minimal sufficient statistics that are invariant to nuisance variability induced by viewpoint and illumination.
no code implementations • CVPR 2015 • Gottfried Graber, Jonathan Balzer, Stefano Soatto, Thomas Pock
We propose a method for dense three-dimensional surface reconstruction that leverages the strengths of shape-based approaches, by imposing regularization that respects the geometry of the surface, and the strength of depth-map-based stereo, by avoiding costly computation of surface topology.
no code implementations • CVPR 2015 • Georgios Georgiadis, Alessandro Chiuso, Stefano Soatto
In texture synthesis and classification, algorithms require a small texture to be provided as an input, which is assumed to be representative of a larger region to be re-synthesized or categorized.
no code implementations • CVPR 2015 • Brian Taylor, Vasiliy Karasev, Stefano Soatto
Occlusion relations inform the partition of the image domain into "objects" but are difficult to determine from a single image or short-baseline video.
no code implementations • CVPR 2016 • Nikolaos Karianakis, Jingming Dong, Stefano Soatto
We conduct an empirical study to test the ability of Convolutional Neural Networks (CNNs) to reduce the effects of nuisance transformations of the input data, such as location, scale and aspect ratio.
no code implementations • 21 Mar 2015 • Nikolaos Karianakis, Thomas J. Fuchs, Stefano Soatto
Modern detection algorithms like Regions with CNNs (Girshick et al., 2014) rely on Selective Search (Uijlings et al., 2013) to propose regions that, with high probability, represent objects; CNNs are then deployed on these regions for classification.
no code implementations • CVPR 2015 • Jingming Dong, Stefano Soatto
We introduce a simple modification of local image descriptors, such as SIFT, based on pooling gradient orientations across different domain sizes, in addition to spatial locations.
no code implementations • 20 Dec 2014 • Stefano Soatto, Jingming Dong, Nikolaos Karianakis
We study the structure of representations, defined as approximations of minimal sufficient statistics that are maximal invariants to nuisance factors, for visual data subject to scaling and occlusion of line-of-sight.
no code implementations • 27 Nov 2014 • Stefano Soatto, Alessandro Chiuso
Visual representations are defined in terms of minimal sufficient statistics of visual data, for a class of tasks, that are also invariant to nuisance variability.
no code implementations • 14 Sep 2014 • Jonathan Balzer, Daniel Acevedo-Feliz, Stefano Soatto, Sebastian Höfer, Markus Hadwiger, Jürgen Beyerer
We introduce a method based on the deflectometry principle for the reconstruction of specular objects exhibiting significant size and geometric complexity.
no code implementations • CVPR 2014 • Vasiliy Karasev, Avinash Ravichandran, Stefano Soatto
We describe an information-driven active selection approach to determine which detectors to deploy at which location in which frame of a video to minimize semantic class label uncertainty at every pixel, with the smallest computational cost that ensures a given uncertainty bound.
no code implementations • CVPR 2014 • Damek Davis, Jonathan Balzer, Stefano Soatto
We introduce an asymmetric sparse approximate embedding optimized for fast kernel comparison operations arising in large-scale visual search.
no code implementations • CVPR 2014 • Jonathan Balzer, Stefano Soatto
We develop a method for optimization in shape spaces, i.e., sets of surfaces modulo re-parametrization.
no code implementations • 23 Nov 2013 • Jingming Dong, Jonathan Balzer, Damek Davis, Joshua Hernandez, Stefano Soatto
We propose an extension of popular descriptors based on gradient orientation histograms (HOG, computed in a single image) to multiple views.
no code implementations • CVPR 2013 • Yun Zeng, Chaohui Wang, Stefano Soatto, Shing-Tung Yau
This paper introduces an efficient approach to integrating non-local statistics into the higher-order Markov Random Fields (MRFs) framework.
no code implementations • CVPR 2013 • Jonathan Balzer, Stefano Soatto
We describe a method to efficiently generate a model (map) of small-scale objects from video.
no code implementations • NeurIPS 2012 • Vasiliy Karasev, Alessandro Chiuso, Stefano Soatto
We describe the tradeoff between the performance in a visual recognition problem and the control authority that the agent can exercise on the sensing process.
no code implementations • NeurIPS 2011 • Kamil A. Wnuk, Stefano Soatto
We propose a robust filtering approach based on semi-supervised and multiple instance learning (MIL).
no code implementations • 10 Oct 2011 • Stefano Soatto
We describe the concept of Actionable Information, which relates to a notion of information championed by J. Gibson, and a notion of "complete information" that relates to the minimal sufficient statistics of a complete representation.
no code implementations • NeurIPS 2010 • Alper Ayvaci, Michalis Raptis, Stefano Soatto
We tackle the problem of simultaneously detecting occlusions and estimating optical flow.