no code implementations • ECCV 2020 • Yuan-Ting Hu, Heng Wang, Nicolas Ballas, Kristen Grauman, Alexander G. Schwing
Video inpainting is an important technique for a wide variety of applications from video content editing to video restoration.
1 code implementation • 16 Jun 2022 • Changan Chen, Carl Schissler, Sanchit Garg, Philip Kobernik, Alexander Clegg, Paul Calamia, Dhruv Batra, Philip W Robinson, Kristen Grauman
We introduce SoundSpaces 2.0, a platform for on-the-fly geometry-based audio rendering for 3D environments.
no code implementations • 8 Jun 2022 • Sagnik Majumder, Changan Chen, Ziad Al-Halah, Kristen Grauman
Room impulse response (RIR) functions capture how the surrounding physical environment transforms the sounds heard by a listener, with implications for various applications in AR, VR, and robotics.
no code implementations • CVPR 2022 • Changan Chen, Ruohan Gao, Paul Calamia, Kristen Grauman
We introduce the visual acoustic matching task, in which an audio clip is transformed to sound like it was recorded in a target environment.
no code implementations • CVPR 2022 • Ziad Al-Halah, Santhosh K. Ramakrishnan, Kristen Grauman
In reinforcement learning for visual navigation, it is common to develop a model for each new task, and train that model from scratch with task-specific interactions in 3D environments.
no code implementations • 2 Feb 2022 • Sagnik Majumder, Ziad Al-Halah, Kristen Grauman
We explore active audio-visual separation for dynamic sound sources, where an embodied agent moves intelligently in a 3D environment to continuously isolate the time-varying audio stream being emitted by an object of interest.
no code implementations • 1 Feb 2022 • Priyanka Mandikal, Kristen Grauman
Dexterous multi-fingered robotic hands have a formidable action space, yet their morphological similarity to the human hand holds immense potential to accelerate robot learning.
no code implementations • CVPR 2022 • Santhosh Kumar Ramakrishnan, Devendra Singh Chaplot, Ziad Al-Halah, Jitendra Malik, Kristen Grauman
We propose Potential functions for ObjectGoal Navigation with Interaction-free learning (PONI), a modular approach that disentangles the skills of `where to look?' for an object and `how to navigate to (x, y)?'
no code implementations • 21 Nov 2021 • Rishabh Garg, Ruohan Gao, Kristen Grauman
Binaural audio provides human listeners with an immersive spatial sound experience, but most existing videos lack binaural audio recordings.
no code implementations • NeurIPS 2021 • Tushar Nagarajan, Kristen Grauman
For a given object, an activity-context prior represents the set of other compatible objects that are required for activities to succeed (e.g., a knife and cutting board brought together with a tomato are conducive to cutting).
1 code implementation • CVPR 2022 • Kristen Grauman, Andrew Westbury, Eugene Byrne, Zachary Chavis, Antonino Furnari, Rohit Girdhar, Jackson Hamburger, Hao Jiang, Miao Liu, Xingyu Liu, Miguel Martin, Tushar Nagarajan, Ilija Radosavovic, Santhosh Kumar Ramakrishnan, Fiona Ryan, Jayant Sharma, Michael Wray, Mengmeng Xu, Eric Zhongcong Xu, Chen Zhao, Siddhant Bansal, Dhruv Batra, Vincent Cartillier, Sean Crane, Tien Do, Morrie Doulaty, Akshay Erapalli, Christoph Feichtenhofer, Adriano Fragomeni, Qichen Fu, Abrham Gebreselasie, Cristina Gonzalez, James Hillis, Xuhua Huang, Yifei HUANG, Wenqi Jia, Weslie Khoo, Jachym Kolar, Satwik Kottur, Anurag Kumar, Federico Landini, Chao Li, Yanghao Li, Zhenqiang Li, Karttikeya Mangalam, Raghava Modhugu, Jonathan Munro, Tullie Murrell, Takumi Nishiyasu, Will Price, Paola Ruiz Puentes, Merey Ramazanova, Leda Sari, Kiran Somasundaram, Audrey Southerland, Yusuke Sugano, Ruijie Tao, Minh Vo, Yuchen Wang, Xindi Wu, Takuma Yagi, Ziwei Zhao, Yunyi Zhu, Pablo Arbelaez, David Crandall, Dima Damen, Giovanni Maria Farinella, Christian Fuegen, Bernard Ghanem, Vamsi Krishna Ithapu, C. V. Jawahar, Hanbyul Joo, Kris Kitani, Haizhou Li, Richard Newcombe, Aude Oliva, Hyun Soo Park, James M. Rehg, Yoichi Sato, Jianbo Shi, Mike Zheng Shou, Antonio Torralba, Lorenzo Torresani, Mingfei Yan, Jitendra Malik
We introduce Ego4D, a massive-scale egocentric video dataset and benchmark suite.
no code implementations • ICLR 2022 • Santhosh Kumar Ramakrishnan, Tushar Nagarajan, Ziad Al-Halah, Kristen Grauman
We introduce environment predictive coding, a self-supervised approach to learn environment-level representations for embodied agents.
no code implementations • 6 Jul 2021 • Sukjin Han, Eric H. Schulman, Kristen Grauman, Santhosh Ramakrishnan
We study the causal effects of a merger on the merging firm's creative decisions, using the constructed measures in a synthetic control method.
no code implementations • 14 Jun 2021 • Changan Chen, Wei Sun, David Harwath, Kristen Grauman
The visual environment surrounding a human speaker reveals important cues about the room geometry, materials, and speaker location, all of which influence the precise reverberation effects in the audio stream.
1 code implementation • ICCV 2021 • Rohit Girdhar, Kristen Grauman
We propose Anticipative Video Transformer (AVT), an end-to-end attention-based video modeling architecture that attends to the previously observed video in order to anticipate future actions.
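The core mechanism is causal attention: each time step may attend only to frames already observed, so the final position can be trained to predict the upcoming action. Below is a minimal sketch of that masking pattern in PyTorch; the dimensions, layer configuration, and class count are illustrative placeholders, not AVT's actual architecture.

```python
import torch
import torch.nn as nn

frames = torch.randn(1, 16, 256)                   # (batch, observed frames, feat dim)
layer = nn.TransformerEncoderLayer(d_model=256, nhead=4, batch_first=True)

# Causal mask: position t may attend only to positions <= t.
T = frames.size(1)
mask = torch.triu(torch.full((T, T), float("-inf")), diagonal=1)

h = layer(frames, src_mask=mask)                   # anticipation-friendly features
next_action_logits = nn.Linear(256, 10)(h[:, -1])  # predict the *next* action
```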
no code implementations • 20 May 2021 • Miao Liu, Lingni Ma, Kiran Somasundaram, Yin Li, Kristen Grauman, James M. Rehg, Chao Li
Our model takes the inputs of a Hierarchical Volumetric Representation (HVR) of the environment and an egocentric video, infers the 3D action location as a latent variable, and recognizes the action based on the video and contextual cues surrounding its potential locations.
no code implementations • ICCV 2021 • Sagnik Majumder, Ziad Al-Halah, Kristen Grauman
We introduce the active audio-visual source separation problem, where an agent must move intelligently in order to better isolate the sounds coming from an object of interest in its environment.
1 code implementation • CVPR 2021 • Yanghao Li, Tushar Nagarajan, Bo Xiong, Kristen Grauman
We introduce an approach for pre-training egocentric video models using large-scale third-person video datasets.
no code implementations • ICCV 2021 • Bo Xiong, Haoqi Fan, Kristen Grauman, Christoph Feichtenhofer
We present a multiview pseudo-labeling approach to video learning, a novel framework that uses complementary views in the form of appearance and motion information for semi-supervised learning in video.
no code implementations • 3 Feb 2021 • Santhosh K. Ramakrishnan, Tushar Nagarajan, Ziad Al-Halah, Kristen Grauman
We introduce environment predictive coding, a self-supervised approach to learn environment-level representations for embodied agents.
no code implementations • ICCV 2021 • Wei-Lin Hsiao, Kristen Grauman
Fashion is intertwined with external cultural factors, but identifying these links remains a manual process limited to only the most salient phenomena.
no code implementations • CVPR 2021 • Ruohan Gao, Kristen Grauman
Given a video, the goal is to extract the speech associated with a face in spite of simultaneous background sounds and/or other human speakers.
1 code implementation • ICCV 2021 • Senthil Purushwalkam, Sebastian Vicenc Amengual Gari, Vamsi Krishna Ithapu, Carl Schissler, Philip Robinson, Abhinav Gupta, Kristen Grauman
Given only a few glimpses of an environment, how much can we infer about its entire floorplan?
no code implementations • CVPR 2021 • Changan Chen, Ziad Al-Halah, Kristen Grauman
We propose a transformer-based model to tackle this new semantic AudioGoal task, incorporating an inferred goal descriptor that captures both spatial and semantic properties of the target.
no code implementations • 4 Dec 2020 • Utkarsh Mall, Kavita Bala, Tamara Berg, Kristen Grauman
The fashion sense -- meaning the clothing styles people wear -- in a geographical region can reveal information about that region.
no code implementations • 17 Nov 2020 • Ziad Al-Halah, Kristen Grauman
The discovered influence relationships reveal how both cities and brands exert and receive fashion influence for an array of visual styles inferred from the images.
no code implementations • 3 Sep 2020 • Priyanka Mandikal, Kristen Grauman
Our key idea is to embed an object-centric visual affordance model within a deep reinforcement learning loop to learn grasping policies that favor the same object regions favored by people.
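As a hypothetical sketch of that loop (all names and the reward weighting below are illustrative, not the paper's code), an object-centric affordance map can shape the RL reward so that grasps contacting human-preferred regions earn a bonus on top of raw task success:

```python
import numpy as np

def affordance_reward(contact_points, affordance_map, grasp_success, w=0.5):
    """Toy reward combining task success with an affordance prior.

    contact_points: (N, 2) integer pixel coordinates touched by the hand
    affordance_map: (H, W) array in [0, 1] scoring human-preferred regions
    grasp_success:  1.0 if the object was lifted, else 0.0
    """
    scores = affordance_map[contact_points[:, 0], contact_points[:, 1]]
    return grasp_success + w * scores.mean()

# Contacts landing on high-affordance pixels earn a shaped bonus.
amap = np.zeros((64, 64)); amap[20:30, 20:30] = 1.0   # e.g., a mug handle
contacts = np.array([[22, 25], [27, 21]])
print(affordance_reward(contacts, amap, grasp_success=1.0))  # 1.5
```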
1 code implementation • ECCV 2020 • Santhosh K. Ramakrishnan, Ziad Al-Halah, Kristen Grauman
State-of-the-art navigation methods leverage a spatial memory to generalize to new environments, but their occupancy maps are limited to capturing the geometric structures directly observed by the agent.
Ranked #3 on Robot Navigation on Habitat 2020 Point Nav test-std
1 code implementation • ICLR 2021 • Changan Chen, Sagnik Majumder, Ziad Al-Halah, Ruohan Gao, Santhosh Kumar Ramakrishnan, Kristen Grauman
In audio-visual navigation, an agent intelligently travels through a complex, unmapped 3D environment using both sights and sounds to find a sound source (e.g., a phone ringing in another room).
1 code implementation • NeurIPS 2020 • Tushar Nagarajan, Kristen Grauman
We introduce a reinforcement learning approach for exploration for interaction, whereby an embodied agent autonomously discovers the affordance landscape of a new unmapped 3D environment (such as an unfamiliar kitchen).
no code implementations • 29 Jun 2020 • Nicole D. Payntar, Wei-Lin Hsiao, R. Alan Covey, Kristen Grauman
The popularity of media sharing platforms in recent decades has provided an abundance of open source data that remains underutilized by heritage scholars.
no code implementations • ECCV 2020 • Ruohan Gao, Changan Chen, Ziad Al-Halah, Carl Schissler, Kristen Grauman
Several animal species (e.g., bats, dolphins, and whales) and even visually impaired humans have the remarkable ability to perform echolocation: a biological sonar used to perceive spatial layout and locate objects in the world.
1 code implementation • CVPR 2020 • Ziad Al-Halah, Kristen Grauman
The evolution of clothing styles and their migration across the world is intriguing, yet difficult to describe quantitatively.
1 code implementation • 23 Jan 2020 • Fanyi Xiao, Yong Jae Lee, Kristen Grauman, Jitendra Malik, Christoph Feichtenhofer
We present Audiovisual SlowFast Networks, an architecture for integrated audiovisual perception.
1 code implementation • CVPR 2020 • Tushar Nagarajan, Yanghao Li, Christoph Feichtenhofer, Kristen Grauman
We introduce a model for environment affordances that is learned directly from egocentric video.
1 code implementation • CVPR 2020 • Krishna Kumar Singh, Dhruv Mahajan, Kristen Grauman, Yong Jae Lee, Matt Feiszli, Deepti Ghadiyaram
Our key idea is to decorrelate feature representations of a category from its co-occurring context.
1 code implementation • 7 Jan 2020 • Santhosh K. Ramakrishnan, Dinesh Jayaraman, Kristen Grauman
Embodied computer vision considers perception for robots in novel, unstructured environments.
2 code implementations • ECCV 2020 • Changan Chen, Unnat Jain, Carl Schissler, Sebastia Vicenc Amengual Gari, Ziad Al-Halah, Vamsi Krishna Ithapu, Philip Robinson, Kristen Grauman
Moving around in the world is naturally a multisensory experience, but today's embodied agents are deaf -- restricted to solely their visual perception of the environment.
no code implementations • CVPR 2020 • Wei-Lin Hsiao, Kristen Grauman
Body shape plays an important role in determining what garments will best suit a given person, yet today's clothing recommendation methods take a "one shape fits all" approach.
1 code implementation • CVPR 2020 • Ruohan Gao, Tae-Hyun Oh, Kristen Grauman, Lorenzo Torresani
In the face of the video data deluge, today's expensive clip-level classifiers are increasingly impractical.
1 code implementation • Science Robotics 2019 • Santhosh K. Ramakrishnan, Dinesh Jayaraman, Kristen Grauman
Standard computer vision systems assume access to intelligently captured inputs (e.g., photos from a human photographer), yet autonomously capturing good observations is a major challenge in itself.
no code implementations • 3 Jun 2019 • Tushar Nagarajan, Christoph Feichtenhofer, Kristen Grauman
Learning how to interact with objects is an important step towards embodied visual intelligence, but existing techniques suffer from heavy supervision or sensing requirements.
2 code implementations • CVPR 2021 • Hui Wu, Yupeng Gao, Xiaoxiao Guo, Ziad Al-Halah, Steven Rennie, Kristen Grauman, Rogerio Feris
We provide a detailed analysis of the characteristics of the Fashion IQ data, and present a transformer-based user simulator and interactive image retriever that can seamlessly integrate visual attributes with image features, user feedback, and dialog history, leading to improved performance over the state of the art in dialog-based image retrieval.
no code implementations • 30 Apr 2019 • Danna Gurari, Yinan Zhao, Suyog Dutt Jain, Margrit Betke, Kristen Grauman
We propose a resource allocation framework for predicting how best to allocate a fixed budget of human annotation effort in order to collect higher quality segmentations for a given batch of images and automated methods.
1 code implementation • CVPR 2020 • Evonne Ng, Donglai Xiang, Hanbyul Joo, Kristen Grauman
The body pose of a person wearing a camera is of great interest for applications in augmented reality, healthcare, and robotics, yet much of the person's body is out of view for a typical wearable camera.
no code implementations • ICCV 2019 • Wei-Lin Hsiao, Isay Katsman, Chao-yuan Wu, Devi Parikh, Kristen Grauman
We introduce Fashion++, an approach that proposes minimal adjustments to a full-body clothing outfit that will have maximal impact on its fashionability.
2 code implementations • ICCV 2019 • Ruohan Gao, Kristen Grauman
Learning how objects sound from video is challenging, since they often heavily overlap in a single audio channel.
Ranked #1 on Audio Denoising on AV-Bench - Wooden Horse
no code implementations • 10 Apr 2019 • Antonino Furnari, Sebastiano Battiato, Kristen Grauman, Giovanni Maria Farinella
Although First Person Vision systems can sense the environment from the user's perspective, they are generally unable to predict the user's intentions and goals.
no code implementations • CVPR 2019 • Bo Xiong, Yannis Kalantidis, Deepti Ghadiyaram, Kristen Grauman
Highlight detection has the potential to significantly ease video browsing, but existing methods often suffer from expensive supervision requirements, where human viewers must manually identify highlights in training videos.
no code implementations • CVPR 2019 • Aron Yu, Kristen Grauman
Current wisdom suggests more labeled image data is always better, and obtaining labels is the bottleneck.
1 code implementation • CVPR 2019 • Zhenpei Yang, Jeffrey Z. Pan, Linjie Luo, Xiaowei Zhou, Kristen Grauman, Qi-Xing Huang
In particular, instead of only performing scene completion from each individual scan, our approach alternates between relative pose estimation and scene completion.
2 code implementations • CVPR 2019 • Ruohan Gao, Kristen Grauman
We devise a deep convolutional neural network that learns to decode the monaural (single-channel) soundtrack into its binaural counterpart by injecting visual information about object and scene configurations.
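A minimal sketch of the general recipe, assuming a mono = L + R / difference = L - R decomposition: a network conditioned on visual features predicts a mask yielding the difference spectrogram, from which both channels are recovered. The layer shapes and fusion scheme below are placeholders, not the paper's architecture.

```python
import torch
import torch.nn as nn

class Mono2Binaural(nn.Module):
    """Sketch: predict the left/right *difference* spectrogram from the mono
    mixture, conditioned on visual features. All sizes are illustrative."""

    def __init__(self, spec_bins=512, visual_dim=512):
        super().__init__()
        self.audio_enc = nn.Sequential(nn.Linear(spec_bins, 256), nn.ReLU())
        self.fuse = nn.Sequential(nn.Linear(256 + visual_dim, 256), nn.ReLU())
        self.mask_head = nn.Linear(256, spec_bins)   # mask over the mono input

    def forward(self, mono_spec, visual_feat):
        # mono_spec: (B, T, spec_bins) magnitude frames; visual_feat: (B, visual_dim)
        a = self.audio_enc(mono_spec)
        v = visual_feat.unsqueeze(1).expand(-1, a.size(1), -1)
        m = torch.tanh(self.mask_head(self.fuse(torch.cat([a, v], dim=-1))))
        diff = m * mono_spec                         # predicted (L - R) spectrogram
        return (mono_spec + diff) / 2, (mono_spec - diff) / 2  # left, right

left, right = Mono2Binaural()(torch.randn(2, 100, 512), torch.randn(2, 512))
```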
1 code implementation • ICCV 2019 • Tushar Nagarajan, Christoph Feichtenhofer, Kristen Grauman
Learning how to interact with objects is an important step towards embodied visual intelligence, but existing techniques suffer from heavy supervision or sensing requirements.
no code implementations • CVPR 2019 • Yu-Chuan Su, Kristen Grauman
KTNs efficiently transfer convolution kernels from perspective images to the equirectangular projection of 360° images.
3 code implementations • CVPR 2019 • Yunhui Guo, Honghui Shi, Abhishek Kumar, Kristen Grauman, Tajana Rosing, Rogerio Feris
Transfer learning, which allows a source task to affect the inductive bias of the target task, is widely used in computer vision.
no code implementations • ECCV 2018 • Bo Xiong, Kristen Grauman
360° panoramas are a rich medium, yet notoriously difficult to visualize in the 2D image plane.
no code implementations • ECCV 2018 • Ke Zhang, Kristen Grauman, Fei Sha
The key idea is to complement the discriminative losses with another loss which measures if the predicted summary preserves the same information as in the original video.
no code implementations • 11 Aug 2018 • Bo Xiong, Suyog Dutt Jain, Kristen Grauman
We propose an end-to-end learning framework for segmenting generic objects in both images and videos.
no code implementations • ECCV 2018 • Santhosh K. Ramakrishnan, Kristen Grauman
We consider an active visual exploration scenario, where an agent must intelligently select its camera motions to efficiently reconstruct the full environment from only a limited set of narrow field-of-view glimpses.
no code implementations • CVPR 2018 • Yu-Chuan Su, Kristen Grauman
Standard video encoders developed for conventional narrow field-of-view video are widely applied to 360° video as well, with reasonable results.
2 code implementations • ECCV 2018 • Ruohan Gao, Rogerio Feris, Kristen Grauman
Our work is the first to learn audio source separation from large-scale "in the wild" videos containing multiple audio sources per video.
no code implementations • 31 Mar 2018 • Bo Xiong, Kristen Grauman
360° panoramas are a rich medium, yet notoriously difficult to visualize in the 2D image plane.
no code implementations • CVPR 2018 • Steven Chen, Kristen Grauman
We collect instance-level annotations of most noticeable differences, and build a model trained on relative attribute features that predicts prominent differences for unseen pairs.
1 code implementation • ECCV 2018 • Tushar Nagarajan, Kristen Grauman
In addition, we show that not only can our model recognize unseen compositions robustly in an open-world setting, it can also generalize to compositions where objects themselves were unseen during training.
Ranked #5 on Image Retrieval with Multi-Modal Query on MIT-States
no code implementations • CVPR 2018 • Danna Gurari, Qing Li, Abigale J. Stangl, Anhong Guo, Chi Lin, Kristen Grauman, Jiebo Luo, Jeffrey P. Bigham
The study of algorithms to automatically answer visual questions currently is motivated by visual question answering (VQA) datasets constructed in artificial VQA settings.
4 code implementations • CVPR 2018 • Ruohan Gao, Bo Xiong, Kristen Grauman
Second, we show the power of hallucinated flow for recognition, successfully transferring the learned motion into a standard two-stream network for activity recognition.
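A toy sketch of that transfer step (stand-in networks, not the paper's model): a flow predictor hallucinates motion from a single still image, and its output drives the motion stream of a two-stream classifier.

```python
import torch
import torch.nn as nn

def tiny_cnn(in_ch, out_dim=128):
    # Minimal stand-in for a stream backbone.
    return nn.Sequential(
        nn.Conv2d(in_ch, 16, 3, stride=2, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, out_dim))

class HallucinatedFlowTwoStream(nn.Module):
    def __init__(self, n_classes=10):
        super().__init__()
        self.flow_net = nn.Conv2d(3, 2, 3, padding=1)   # stand-in flow predictor
        self.appearance = tiny_cnn(3)
        self.motion = tiny_cnn(2)
        self.classifier = nn.Linear(256, n_classes)

    def forward(self, image):
        flow = self.flow_net(image)                     # motion from a still image
        feats = torch.cat([self.appearance(image), self.motion(flow)], dim=-1)
        return self.classifier(feats)

logits = HallucinatedFlowTwoStream()(torch.randn(2, 3, 64, 64))
```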
no code implementations • 12 Dec 2017 • Yu-Chuan Su, Kristen Grauman
Standard video encoders developed for conventional narrow field-of-view video are widely applied to 360° video as well, with reasonable results.
no code implementations • CVPR 2018 • Wei-Lin Hsiao, Kristen Grauman
To permit efficient subset selection over the space of all outfit combinations, we develop submodular objective functions capturing the key ingredients of visual compatibility, versatility, and user-specific preference.
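Because the objectives are monotone submodular, a simple greedy selector comes with the classic (1 - 1/e) approximation guarantee. Here is a sketch, with a toy style-coverage objective standing in for the paper's compatibility/versatility/preference terms:

```python
def greedy_submodular(candidates, objective, budget):
    """Greedy maximization: near-optimal for monotone submodular objectives."""
    chosen = []
    for _ in range(budget):
        gains = {c: objective(chosen + [c]) - objective(chosen)
                 for c in candidates if c not in chosen}
        best = max(gains, key=gains.get)
        if gains[best] <= 0:
            break
        chosen.append(best)
    return chosen

# Toy objective: covering more style "facets" rewards a versatile wardrobe.
facets = {"tee": {"casual"}, "blazer": {"formal", "work"},
          "jeans": {"casual", "work"}, "gown": {"formal"}}
objective = lambda S: len(set().union(*(facets[g] for g in S)) if S else set())
print(greedy_submodular(list(facets), objective, budget=2))
```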
1 code implementation • CVPR 2018 • Zuxuan Wu, Tushar Nagarajan, Abhishek Kumar, Steven Rennie, Larry S. Davis, Kristen Grauman, Rogerio Feris
Very deep convolutional neural networks offer excellent recognition results, yet their computational expense limits their impact for many real-world applications.
2 code implementations • CVPR 2018 • Dinesh Jayaraman, Kristen Grauman
It is common to implicitly assume access to intelligently captured inputs (e.g., photos from a human photographer), yet autonomously capturing good observations is itself a major challenge.
no code implementations • ECCV 2018 • Dinesh Jayaraman, Ruohan Gao, Kristen Grauman
We introduce an unsupervised feature learning approach that embeds 3D shape information into a single-view image representation.
no code implementations • NeurIPS 2017 • Yu-Chuan Su, Kristen Grauman
While 360° cameras offer tremendous new possibilities in vision, graphics, and augmented reality, the spherical images they produce make core feature extraction non-trivial.
1 code implementation • ICCV 2017 • Wei-Lin Hsiao, Kristen Grauman
Given a collection of unlabeled fashion images, our approach mines for the latent styles, then summarizes outfits by how they mix those styles.
no code implementations • CVPR 2017 • Yu-Chuan Su, Kristen Grauman
360° video requires human viewers to actively control "where" to look while watching the video.
no code implementations • ICCV 2017 • Ziad Al-Halah, Rainer Stiefelhagen, Kristen Grauman
What is the future of fashion?
no code implementations • 30 Apr 2017 • Danna Gurari, Kun He, Bo Xiong, Jianming Zhang, Mehrnoosh Sameki, Suyog Dutt Jain, Stan Sclaroff, Margrit Betke, Kristen Grauman
We propose the ambiguity problem for the foreground object segmentation task and motivate the importance of estimating and accounting for this ambiguity when designing vision systems.
no code implementations • 1 Mar 2017 • Yu-Chuan Su, Kristen Grauman
360° video requires human viewers to actively control "where" to look while watching the video.
no code implementations • CVPR 2017 • Suyog Dutt Jain, Bo Xiong, Kristen Grauman
Our method learns to combine appearance and motion information to produce pixel level segmentation masks for all prominent objects in videos.
Ranked #21 on Unsupervised Video Object Segmentation on DAVIS 2016
no code implementations • 19 Jan 2017 • Suyog Dutt Jain, Bo Xiong, Kristen Grauman
We propose an end-to-end learning framework for generating foreground object segmentations.
no code implementations • ICCV 2017 • Aron Yu, Kristen Grauman
Distinguishing subtle differences in attributes is valuable, yet learning to make visual comparisons remains non-trivial.
no code implementations • 7 Dec 2016 • Yu-Chuan Su, Dinesh Jayaraman, Kristen Grauman
AutoCam leverages NFOV web video to discriminatively identify space-time "glimpses" of interest at each time instant, and then uses dynamic programming to select optimal human-like camera trajectories.
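The dynamic-programming step can be pictured as a Viterbi pass: maximize the summed glimpse scores over time while penalizing abrupt jumps between candidate view directions. A sketch with an illustrative linear transition cost (not the paper's exact formulation):

```python
import numpy as np

def best_trajectory(glimpse_scores, transition_cost=1.0):
    """Viterbi-style DP over candidate views.

    glimpse_scores: (T, K) interest score of each of K candidate views per step.
    Returns the index of the chosen view at each time step.
    """
    T, K = glimpse_scores.shape
    dp = np.full((T, K), -np.inf)
    back = np.zeros((T, K), dtype=int)
    dp[0] = glimpse_scores[0]
    for t in range(1, T):
        for k in range(K):
            # Penalize abrupt camera motion between consecutive views.
            cand = dp[t - 1] - transition_cost * np.abs(np.arange(K) - k)
            back[t, k] = int(np.argmax(cand))
            dp[t, k] = cand[back[t, k]] + glimpse_scores[t, k]
    path = [int(np.argmax(dp[-1]))]
    for t in range(T - 1, 0, -1):
        path.append(back[t, path[-1]])
    return path[::-1]

print(best_trajectory(np.random.rand(5, 8)))
```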
1 code implementation • ICCV 2017 • Ruohan Gao, Kristen Grauman
While machine learning approaches to image restoration offer great promise, current methods risk training models fixated on performing well only for image corruption of a particular level of difficulty -- such as a certain level of noise or blur.
no code implementations • 1 Dec 2016 • Ruohan Gao, Dinesh Jayaraman, Kristen Grauman
Compared to existing temporal coherence methods, our idea has the advantage of lightweight preprocessing of the unlabeled video (no tracking required) while still being able to extract object-level regions from which to learn invariances.
no code implementations • 7 Nov 2016 • Adriana Kovashka, Olga Russakovsky, Li Fei-Fei, Kristen Grauman
Computer vision systems require large amounts of manually annotated data to properly learn challenging visual concepts.
no code implementations • 29 Aug 2016 • Danna Gurari, Kristen Grauman
Visual question answering (VQA) systems are emerging from a desire to empower users to ask any natural language question about visual content and receive a valid answer in response.
no code implementations • 11 Jul 2016 • Chao-Yeh Chen, Kristen Grauman
We show that this detection strategy permits an efficient branch-and-cut solution for the best-scoring -- and possibly non-cubically shaped -- portion of the video for a given activity classifier.
no code implementations • 5 Jul 2016 • Suyog Dutt Jain, Kristen Grauman
We present a novel form of interactive video object segmentation where a few clicks by the user helps the system produce a full spatio-temporal segmentation of the object of interest.
no code implementations • CVPR 2016 • Danna Gurari, Suyog Jain, Margrit Betke, Kristen Grauman
We propose a resource allocation framework for predicting how best to allocate a fixed budget of human annotation effort in order to collect higher quality segmentations for a given batch of images and automated methods.
no code implementations • CVPR 2016 • Suyog Dutt Jain, Kristen Grauman
We propose a semi-automatic method to obtain foreground object masks for a large set of related images.
no code implementations • 26 May 2016 • Ke Zhang, Wei-Lun Chao, Fei Sha, Kristen Grauman
We propose a novel supervised learning technique for summarizing videos by automatically selecting keyframes or key subshots.
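One common instantiation of this recipe (a sketch, not necessarily the paper's exact model) scores every frame's importance with a bidirectional sequence model trained on human annotations, then keeps the top-scoring frames as keyframes:

```python
import torch
import torch.nn as nn

class FrameScorer(nn.Module):
    """Sketch: a BiLSTM regresses per-frame importance; dims are assumed."""

    def __init__(self, feat_dim=1024, hidden=128):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, 1)

    def forward(self, frame_feats):            # (B, T, feat_dim)
        h, _ = self.lstm(frame_feats)
        return self.head(h).squeeze(-1)        # (B, T) importance scores

scores = FrameScorer()(torch.randn(1, 120, 1024))
keyframes = scores.topk(5, dim=1).indices     # keep the 5 top-scoring frames
```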
no code implementations • 30 Apr 2016 • Dinesh Jayaraman, Kristen Grauman
To verify this hypothesis, we attempt to induce this capacity in our active recognition pipeline by simultaneously learning to forecast the effects of the agent's motions on its internal representation of the environment, conditional on all past views.
no code implementations • 17 Apr 2016 • Chao-Yeh Chen, Kristen Grauman
We propose to predict the "interactee" in novel images -- that is, to localize the object of a person's action.
no code implementations • CVPR 2017 • Hao Jiang, Kristen Grauman
In addition, we demonstrate its impact on a proxemics recognition task, which demands a precise representation of "whose body part is where" in crowded images.
no code implementations • 4 Apr 2016 • Yu-Chuan Su, Kristen Grauman
In a wearable camera video, we see what the camera wearer sees.
no code implementations • 1 Apr 2016 • Yu-Chuan Su, Kristen Grauman
Current approaches for activity recognition often ignore constraints on computational resources: 1) they rely on extensive feature computation to obtain rich descriptors on all frames, and 2) they assume batch-mode access to the entire test video at once.
no code implementations • CVPR 2017 • Hao Jiang, Kristen Grauman
We propose to infer the "invisible pose" of a person behind the egocentric camera.
no code implementations • CVPR 2016 • Ke Zhang, Wei-Lun Chao, Fei Sha, Kristen Grauman
Video summarization has unprecedented importance to help us digest, browse, and search today's ever-growing video collections.
no code implementations • ICCV 2015 • Aron Yu, Kristen Grauman
We develop a Bayesian local learning strategy to infer when images are indistinguishable for a given attribute.
no code implementations • CVPR 2016 • Dinesh Jayaraman, Kristen Grauman
While this standard approach captures the fact that high-level visual signals change slowly over time, it fails to capture *how* the visual content changes.
no code implementations • 18 May 2015 • Yong Jae Lee, Kristen Grauman
Our results on two egocentric video datasets show the method's promise relative to existing techniques for saliency and summarization.
no code implementations • 15 May 2015 • Adriana Kovashka, Devi Parikh, Kristen Grauman
We propose a novel mode of feedback for image search, where a user describes which properties of exemplar images should be adjusted in order to more closely match his/her mental model of the image sought.
no code implementations • 15 May 2015 • Adriana Kovashka, Kristen Grauman
We propose to discover shades of attribute meaning.
1 code implementation • ICCV 2015 • Dinesh Jayaraman, Kristen Grauman
Understanding how images of objects and scenes behave in response to specific ego-motions is a crucial aspect of proper visual development, yet existing visual learning methods are conspicuously disconnected from the physical source of their images.
no code implementations • NeurIPS 2014 • Aron Yu, Kristen Grauman
Lazy local learning methods train a classifier "on the fly" at test time, using only a subset of the training instances that are most relevant to the novel test example.
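A bare-bones illustration of the lazy local idea, with plain k-nearest-neighbor selection standing in for the paper's learned instance-selection strategy:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

def lazy_local_predict(X_train, y_train, x_test, k=50):
    """Train a classifier on the fly for one test point, using only its
    k nearest training instances."""
    nn_index = NearestNeighbors(n_neighbors=k).fit(X_train)
    _, idx = nn_index.kneighbors(x_test.reshape(1, -1))
    clf = LogisticRegression().fit(X_train[idx[0]], y_train[idx[0]])
    return clf.predict(x_test.reshape(1, -1))[0]

rng = np.random.default_rng(0)
X, y = rng.normal(size=(500, 10)), rng.integers(0, 2, size=500)
print(lazy_local_predict(X, y, X[0]))
```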
no code implementations • NeurIPS 2014 • Boqing Gong, Wei-Lun Chao, Kristen Grauman, Fei Sha
Video summarization is a challenging problem with great application potential.
no code implementations • NeurIPS 2014 • Dinesh Jayaraman, Kristen Grauman
In principle, zero-shot learning makes it possible to train an object recognition model simply by specifying the category's attributes.
no code implementations • 6 Nov 2014 • Boqing Gong, Wei-Lun Chao, Kristen Grauman, Fei Sha
Extensive empirical studies validate our contributions, including applications on challenging document and video summarization, where flexibility in modeling the kernel matrix and balancing different errors is indispensable.
no code implementations • 15 Sep 2014 • Dinesh Jayaraman, Kristen Grauman
In principle, zero-shot learning makes it possible to train a recognition model simply by specifying the category's attributes.
no code implementations • CVPR 2014 • Chao-Yeh Chen, Kristen Grauman
The appearance of an attribute can vary considerably from class to class (e.g., a "fluffy" dog vs. a "fluffy" towel), making standard class-independent attribute models break down.
no code implementations • CVPR 2014 • Dinesh Jayaraman, Fei Sha, Kristen Grauman
Existing methods to learn visual attributes are prone to learning the wrong thing -- namely, properties that are correlated with the attribute of interest among training samples.
no code implementations • CVPR 2014 • Chao-Yeh Chen, Kristen Grauman
We pose unseen view synthesis as a probabilistic tensor completion problem.
no code implementations • CVPR 2014 • Aron Yu, Kristen Grauman
Given two images, we want to predict which exhibits a particular visual attribute more than the other -- even when the two images are quite similar.
no code implementations • CVPR 2014 • Lucy Liang, Kristen Grauman
It is useful to automatically compare images based on their visual properties -- to predict which image is brighter, more feminine, more blurry, etc.
no code implementations • NeurIPS 2013 • Boqing Gong, Kristen Grauman, Fei Sha
By maximum distinctiveness, we require the underlying distributions of the identified domains to be different from each other; by maximum learnability, we ensure that a strong discriminative model can be learned from the domain.
no code implementations • CVPR 2013 • Chao-Yeh Chen, Kristen Grauman
We propose an approach to learn action categories from static images that leverages prior observations of generic human motion to augment its training process.
no code implementations • CVPR 2013 • Zheng Lu, Kristen Grauman
We present a video summarization approach that discovers the story of an egocentric video.
no code implementations • CVPR 2013 • Jaechul Kim, Ce Liu, Fei Sha, Kristen Grauman
We introduce a fast deformable spatial pyramid (DSP) matching algorithm for computing dense pixel correspondences.
no code implementations • NeurIPS 2012 • Sung Ju Hwang, Kristen Grauman, Fei Sha
When learning features for complex visual recognition problems, labeled image exemplars alone can be insufficient.
no code implementations • NeurIPS 2011 • Kristen Grauman, Fei Sha, Sung Ju Hwang
Given a hierarchical taxonomy that captures semantic similarity between the objects, we learn a corresponding tree of metrics (ToM).
no code implementations • NeurIPS 2010 • Prateek Jain, Sudheendra Vijayanarasimhan, Kristen Grauman
Our first approach maps the data to two-bit binary keys that are locality-sensitive for the angle between the hyperplane normal and a database point.
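A sketch of those two-bit keys (illustrative code, not the paper's implementation): database points hash both random projections directly, while the hyperplane query flips one bit, so collisions become most likely for points nearly perpendicular to the normal -- that is, points close to the hyperplane itself.

```python
import numpy as np

def point_key(x, u, v):
    """Two-bit key for a database point under projections u and v."""
    return (np.dot(u, x) >= 0, np.dot(v, x) >= 0)

def hyperplane_key(w, u, v):
    """Key for a hyperplane query with normal w: flipping one bit makes
    collisions favor points near the hyperplane, not near w itself."""
    return (np.dot(u, w) >= 0, not (np.dot(v, w) >= 0))

rng = np.random.default_rng(1)
d = 32
u, v = rng.normal(size=d), rng.normal(size=d)   # random projection directions
w, x = rng.normal(size=d), rng.normal(size=d)   # query normal, database point
print(point_key(x, u, v) == hyperplane_key(w, u, v))  # candidate match?
```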
no code implementations • NeurIPS 2008 • Sudheendra Vijayanarasimhan, Kristen Grauman
We introduce a framework for actively learning visual categories from a mixture of weakly and strongly labeled image examples.
no code implementations • NeurIPS 2008 • Prateek Jain, Brian Kulis, Inderjit S. Dhillon, Kristen Grauman
Metric learning algorithms can provide useful distance functions for a variety of domains, and recent work has shown good accuracy for problems where the learner can access all distance constraints at once.