1 code implementation • 24 Jul 2024 • Ziwei Zhao, David Leake, Xiaomeng Ye, David Crandall
This short paper presents preliminary research on the Case-Enhanced Vision Transformer (CEViT), a similarity measurement method aimed at improving the explainability of similarity assessments for image data.
no code implementations • 13 Jan 2024 • Mang Ye, Shuoyi Chen, Chenyue Li, Wei-Shi Zheng, David Crandall, Bo Du
Object Re-Identification (Re-ID) aims to identify and retrieve specific objects from varying viewpoints.
2 code implementations • CVPR 2024 • Kristen Grauman, Andrew Westbury, Lorenzo Torresani, Kris Kitani, Jitendra Malik, Triantafyllos Afouras, Kumar Ashutosh, Vijay Baiyya, Siddhant Bansal, Bikram Boote, Eugene Byrne, Zach Chavis, Joya Chen, Feng Cheng, Fu-Jen Chu, Sean Crane, Avijit Dasgupta, Jing Dong, Maria Escobar, Cristhian Forigua, Abrham Gebreselasie, Sanjay Haresh, Jing Huang, Md Mohaiminul Islam, Suyog Jain, Rawal Khirodkar, Devansh Kukreja, Kevin J Liang, Jia-Wei Liu, Sagnik Majumder, Yongsen Mao, Miguel Martin, Effrosyni Mavroudi, Tushar Nagarajan, Francesco Ragusa, Santhosh Kumar Ramakrishnan, Luigi Seminara, Arjun Somayazulu, Yale Song, Shan Su, Zihui Xue, Edward Zhang, Jinxu Zhang, Angela Castillo, Changan Chen, Xinzhu Fu, Ryosuke Furuta, Cristina Gonzalez, Prince Gupta, Jiabo Hu, Yifei HUANG, Yiming Huang, Weslie Khoo, Anush Kumar, Robert Kuo, Sach Lakhavani, Miao Liu, Mi Luo, Zhengyi Luo, Brighid Meredith, Austin Miller, Oluwatumininu Oguntola, Xiaqing Pan, Penny Peng, Shraman Pramanick, Merey Ramazanova, Fiona Ryan, Wei Shan, Kiran Somasundaram, Chenan Song, Audrey Southerland, Masatoshi Tateno, Huiyu Wang, Yuchen Wang, Takuma Yagi, Mingfei Yan, Xitong Yang, Zecheng Yu, Shengxin Cindy Zha, Chen Zhao, Ziwei Zhao, Zhifan Zhu, Jeff Zhuo, Pablo Arbelaez, Gedas Bertasius, David Crandall, Dima Damen, Jakob Engel, Giovanni Maria Farinella, Antonino Furnari, Bernard Ghanem, Judy Hoffman, C. V. Jawahar, Richard Newcombe, Hyun Soo Park, James M. Rehg, Yoichi Sato, Manolis Savva, Jianbo Shi, Mike Zheng Shou, Michael Wray
We present Ego-Exo4D, a diverse, large-scale multimodal multiview video dataset and benchmark challenge.
no code implementations • 30 Jun 2023 • Samuel Goree, David Crandall
In her influential 1988 paper, Situated Knowledges, Donna Haraway uses vision and perspective as a metaphor to discuss scientific knowledge.
no code implementations • 29 Mar 2023 • Zhenhua Chen, David Crandall
Inspired by the ConvNets with structured hidden representations, we propose a Tensor-based Neural Network, TCNN.
no code implementations • 5 Mar 2023 • Zheng Chen, Deepak Duggirala, David Crandall, Lei Jiang, Lantao Liu
Prediction beyond partial observations is crucial for robots to navigate in unknown environments because it can provide extra information regarding the surroundings beyond the current sensing range or resolution.
1 code implementation • CVPR 2024 • Xizi Wang, Feng Cheng, Gedas Bertasius, David Crandall
These two contexts are complementary to each other and can help infer the active speaker.
1 code implementation • CVPR 2023 • Feng Cheng, Xizi Wang, Jie Lei, David Crandall, Mohit Bansal, Gedas Bertasius
Furthermore, our model also obtains state-of-the-art video question-answering results on ActivityNet-QA, MSRVTT-QA, MSRVTT-MC and TVQA.
Ranked #2 on Video Retrieval on Condensed Movies (using extra training data)
no code implementations • 22 Sep 2022 • Samuel Goree, Gabriel Appleby, David Crandall, Norman Su
In this work, we examine the effects of this growth from a media archaeology perspective, through the changes to figures and tables in research papers.
1 code implementation • 15 Aug 2022 • Satoshi Tsutsui, Xizi Wang, Guangyuan Weng, Yayun Zhang, David Crandall, Chen Yu
We set out to identify properties of training data that lead to action recognition models with greater generalization ability.
no code implementations • 26 Jul 2022 • Junbo Yin, Jianbing Shen, Xin Gao, David Crandall, Ruigang Yang
In this paper, we propose to detect 3D objects by exploiting temporal information in multiple frames, i.e., point cloud videos.
no code implementations • 22 Apr 2022 • Satoshi Tsutsui, Yanwei Fu, David Crandall
One-shot fine-grained visual recognition often suffers from the problem of having few training examples for new fine-grained classes.
no code implementations • 19 Dec 2021 • Vibhas Vats, David Crandall
We argue that for a given teacher-student pair, the quality of distillation can be improved by finding the sweet spot between batch size and number of epochs while training the teacher.
no code implementations • 15 Nov 2021 • Jagpreet Chawla, Nikhil Thakurdesai, Anuj Godase, Md Reza, David Crandall, Soon-Heung Jung
To address errors in depth estimation, we introduce a novel Depth Error Detection Network (DEDN) that spatially identifies erroneous depth predictions in monocular depth estimation models.
no code implementations • 29 Oct 2021 • Zheng Chen, Zhengming Ding, David Crandall, Lantao Liu
Detecting navigable space is a fundamental capability for mobile robots navigating in unknown or unmapped environments.
8 code implementations • CVPR 2022 • Kristen Grauman, Andrew Westbury, Eugene Byrne, Zachary Chavis, Antonino Furnari, Rohit Girdhar, Jackson Hamburger, Hao Jiang, Miao Liu, Xingyu Liu, Miguel Martin, Tushar Nagarajan, Ilija Radosavovic, Santhosh Kumar Ramakrishnan, Fiona Ryan, Jayant Sharma, Michael Wray, Mengmeng Xu, Eric Zhongcong Xu, Chen Zhao, Siddhant Bansal, Dhruv Batra, Vincent Cartillier, Sean Crane, Tien Do, Morrie Doulaty, Akshay Erapalli, Christoph Feichtenhofer, Adriano Fragomeni, Qichen Fu, Abrham Gebreselasie, Cristina Gonzalez, James Hillis, Xuhua Huang, Yifei HUANG, Wenqi Jia, Weslie Khoo, Jachym Kolar, Satwik Kottur, Anurag Kumar, Federico Landini, Chao Li, Yanghao Li, Zhenqiang Li, Karttikeya Mangalam, Raghava Modhugu, Jonathan Munro, Tullie Murrell, Takumi Nishiyasu, Will Price, Paola Ruiz Puentes, Merey Ramazanova, Leda Sari, Kiran Somasundaram, Audrey Southerland, Yusuke Sugano, Ruijie Tao, Minh Vo, Yuchen Wang, Xindi Wu, Takuma Yagi, Ziwei Zhao, Yunyi Zhu, Pablo Arbelaez, David Crandall, Dima Damen, Giovanni Maria Farinella, Christian Fuegen, Bernard Ghanem, Vamsi Krishna Ithapu, C. V. Jawahar, Hanbyul Joo, Kris Kitani, Haizhou Li, Richard Newcombe, Aude Oliva, Hyun Soo Park, James M. Rehg, Yoichi Sato, Jianbo Shi, Mike Zheng Shou, Antonio Torralba, Lorenzo Torresani, Mingfei Yan, Jitendra Malik
We introduce Ego4D, a massive-scale egocentric video dataset and benchmark suite.
1 code implementation • 15 Jul 2021 • Xiaomeng Ye, Ziwei Zhao, David Leake, Xizi Wang, David Crandall
Given a pair of cases, the CDH approach attributes the difference in their solutions to the difference in the problems they solve, and generates adaptation rules to adjust solutions accordingly when a retrieved case and new query have similar problem differences.
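The case difference heuristic described above can be illustrated with a minimal sketch. It assumes numeric problem features and a numeric solution, and it picks both the retrieved case and the adaptation rule by squared Euclidean distance. The names `Case`, `AdaptationRule`, `learn_rules`, and `solve` are hypothetical and are not taken from the paper, whose actual approach may differ.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Case:
    problem: Tuple[float, ...]  # numeric problem features (assumption)
    solution: float             # numeric solution (assumption)


@dataclass(frozen=True)
class AdaptationRule:
    problem_diff: Tuple[float, ...]  # difference between two problems
    solution_diff: float             # difference between their solutions


def learn_rules(cases: List[Case]) -> List[AdaptationRule]:
    """Build one rule from every ordered pair of distinct cases."""
    rules = []
    for a in cases:
        for b in cases:
            if a is b:
                continue
            p_diff = tuple(x - y for x, y in zip(a.problem, b.problem))
            rules.append(AdaptationRule(p_diff, a.solution - b.solution))
    return rules


def squared_distance(u, v) -> float:
    return sum((x - y) ** 2 for x, y in zip(u, v))


def solve(query: Tuple[float, ...], cases: List[Case],
          rules: List[AdaptationRule]) -> float:
    """Retrieve the nearest case, then apply the rule whose problem
    difference best matches the query-to-retrieved-case difference."""
    retrieved = min(cases, key=lambda c: squared_distance(c.problem, query))
    q_diff = tuple(x - y for x, y in zip(query, retrieved.problem))
    rule = min(rules, key=lambda r: squared_distance(r.problem_diff, q_diff))
    return retrieved.solution + rule.solution_diff


# Toy case base where the solution is ten times the single feature.
cases = [Case((1.0,), 10.0), Case((2.0,), 20.0), Case((5.0,), 50.0)]
rules = learn_rules(cases)
print(solve((3.0,), cases, rules))
```

In this toy run the query `(3.0,)` retrieves the case at `(2.0,)`; the rule learned from the pair `(2.0,)` and `(1.0,)` has the matching problem difference, so its solution difference is added to the retrieved solution.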
1 code implementation • 2 Jul 2021 • Tianfei Zhou, Fatih Porikli, David Crandall, Luc van Gool, Wenguan Wang
Video segmentation -- partitioning video frames into multiple segments or objects -- plays a critical role in a broad range of practical applications, from enhancing visual effects in movies, to understanding scenes in autonomous driving, to creating virtual backgrounds in video conferencing.
no code implementations • 12 Jun 2021 • Satoshi Tsutsui, David Crandall, Chen Yu
We analyze egocentric views of attended objects from infants.
no code implementations • 6 Apr 2021 • Zhenhua Chen, Xiwen Li, Qian Lou, David Crandall
How to improve the efficiency of routing procedures in CapsNets has been studied extensively.
no code implementations • 23 Nov 2020 • Zehua Zhang, David Crandall
We present a novel technique for self-supervised video representation learning by: (a) decoupling the learning objective into two contrastive subtasks respectively emphasizing spatial and temporal features, and (b) performing it hierarchically to encourage multi-scale understanding.
no code implementations • 17 Nov 2020 • Satoshi Tsutsui, Yanwei Fu, David Crandall
But while one's own face is not frequently visible, their hands are: in fact, hands are among the most common objects in one's own field of view.
no code implementations • 8 Oct 2020 • Yuchen Wang, Mingze Xu, John Paden, Lora Koenig, Geoffrey Fox, David Crandall
Understanding the structure of Earth's polar ice sheets is important for modeling how global warming will impact polar ice and, in turn, the Earth's climate.
1 code implementation • 4 Jun 2020 • Satoshi Tsutsui, Arjun Chandrasekaran, Md. Alimoor Reza, David Crandall, Chen Yu
Human infants have the remarkable ability to learn the associations between object names and visual objects from inherently ambiguous experiences.
3 code implementations • 6 Apr 2020 • Yu Yao, Xizi Wang, Mingze Xu, Zelin Pu, Ella Atkins, David Crandall
A new spatial-temporal area under curve (STAUC) evaluation metric is proposed and used with DoTA.
1 code implementation • CVPR 2020 • Bardia Doosti, Shujon Naha, Majid Mirbagheri, David Crandall
Hand-object pose estimation (HOPE) aims to jointly detect the poses of both a hand and of a held object.
no code implementations • 12 Mar 2020 • Zehua Zhang, Ashish Tawari, Sujitha Martin, David Crandall
A vehicle driving along the road is surrounded by many objects, but only a small subset of them influence the driver's decisions and actions.
1 code implementation • CVPR 2020 • Xiankai Lu, Wenguan Wang, Jianbing Shen, Yu-Wing Tai, David Crandall, Steven C. H. Hoi
We propose a new method for video object segmentation (VOS) that addresses object pattern learning from unlabeled videos, unlike most existing methods which rely heavily on extensive annotated data.
1 code implementation • ICCV 2019 • Wenguan Wang, Xiankai Lu, Jianbing Shen, David Crandall, Ling Shao
Through parametric message passing, AGNN is able to efficiently capture and mine much richer and higher-order relations between video frames, thus enabling a more complete understanding of video content and more accurate foreground estimation.
no code implementations • 18 Dec 2019 • Zhenhua Chen, Xiwen Li, Chuhua Wang, David Crandall
The experiment shows that P-CapsNets achieve better performance than CapsNets with varied routing procedures while using significantly fewer parameters on MNIST and CIFAR10.
1 code implementation • NeurIPS 2019 • Satoshi Tsutsui, Yanwei Fu, David Crandall
One-shot fine-grained visual recognition often suffers from the problem of training data scarcity for new fine-grained classes.
1 code implementation • NeurIPS 2019 • Zehua Zhang, Chen Yu, David Crandall
Due to the foveated nature of the human vision system, people can focus their visual attention on a small region of their visual field at a time, which usually contains only a single object.
no code implementations • 4 Jun 2019 • Satoshi Tsutsui, Dian Zhi, Md. Alimoor Reza, David Crandall, Chen Yu
Inspired by the remarkable ability of the infant visual learning system, a recent study collected first-person images from children to analyze the 'training data' that they receive.
no code implementations • 9 Apr 2019 • Jianwei Yang, Zhile Ren, Mingze Xu, Xinlei Chen, David Crandall, Devi Parikh, Dhruv Batra
Passive visual systems typically fail to recognize objects in the amodal setting where they are heavily occluded.
no code implementations • 2 Dec 2018 • Eman T. Hassan, Xin Chen, David Crandall
The results suggest that self-ensembling is better than simple data augmentation with the newly generated data, and that a single model trained this way can have the best performance across all different transfer tasks.
no code implementations • NeurIPS 2018 • Sven Bambach, David Crandall, Linda Smith, Chen Yu
Real-world learning systems have practical limitations on the quality and quantity of the training datasets that they can collect and consider.
1 code implementation • ICLR 2019 • Zhenhua Chen, David Crandall
To overcome the disadvantages of current routing procedures in CapsNet, we embed the routing procedure into the optimization procedure along with all other parameters of the neural network; that is, we make the coupling coefficients in the routing procedure completely trainable.
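As a rough illustration of what trainable coupling coefficients could look like, the sketch below computes a capsule layer's forward pass from stored coupling logits instead of running iterative routing. It assumes the logits are normalized with a softmax over output capsules and that outputs pass through the squash nonlinearity of the original CapsNet; the paper's exact formulation may differ, and `capsule_forward` is a hypothetical name. Training the logits by gradient descent is left out.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def squash(vec):
    """Squash nonlinearity: keeps direction, maps the norm into [0, 1)."""
    sq = sum(x * x for x in vec)
    if sq == 0.0:
        return [0.0 for _ in vec]
    scale = sq / (1.0 + sq) / math.sqrt(sq)
    return [scale * x for x in vec]


def capsule_forward(votes, coupling_logits):
    """votes[i][j] is the vector input capsule i predicts for output capsule j.
    coupling_logits[i][j] is a stored parameter (assumed learnable), so no
    iterative routing loop is needed at inference time."""
    n_in, n_out, dim = len(votes), len(votes[0]), len(votes[0][0])
    # Assumption: each input capsule distributes its vote across outputs.
    coupling = [softmax(row) for row in coupling_logits]
    outputs = []
    for j in range(n_out):
        s = [0.0] * dim
        for i in range(n_in):
            for d in range(dim):
                s[d] += coupling[i][j] * votes[i][j][d]
        outputs.append(squash(s))
    return outputs


# Two input capsules, two output capsules, 2-D votes, uniform coupling.
votes = [[[3.0, 4.0], [3.0, 4.0]], [[3.0, 4.0], [3.0, 4.0]]]
logits = [[0.0, 0.0], [0.0, 0.0]]
outputs = capsule_forward(votes, logits)
```

Because the logits are ordinary parameters, an autodiff framework could update them together with the rest of the network's weights, which is the idea the sentence above describes.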
no code implementations • 21 Feb 2018 • Zhenhua Chen, David Crandall, Robert Templeman
Detecting small, densely distributed objects is a significant challenge: small objects often contain less distinctive information compared to larger ones, and finer-grained precision of bounding box boundaries is required.
1 code implementation • 20 Jun 2017 • Satoshi Tsutsui, David Crandall
Recent work in computer vision has yielded impressive results in automatically describing images with natural language.
no code implementations • 15 Mar 2017 • Kai Zhen, Mridul Birla, David Crandall, Bingjing Zhang, Judy Qiu
Despite the progress in image recognition with recent data-driven paradigms, it is still expensive to manually label a large training dataset to fit a convolutional neural network (CNN) model.
no code implementations • 15 Mar 2017 • Satoshi Tsutsui, David Crandall
CNNs eliminate the need for manually designing features and separation rules, but require a large amount of annotated training data.
25 code implementations • 7 Oct 2016 • Ashwin K. Vijayakumar, Michael Cogswell, Ramprasath R. Selvaraju, Qing Sun, Stefan Lee, David Crandall, Dhruv Batra
We observe that our method consistently outperforms BS and previously proposed techniques for diverse decoding from neural sequence models.
no code implementations • NeurIPS 2016 • Stefan Lee, Senthil Purushwalkam, Michael Cogswell, Viresh Ranjan, David Crandall, Dhruv Batra
Many practical perception systems exist within larger processes that include interactions with users or additional components capable of evaluating the quality of predicted solutions.
no code implementations • 1 Jan 2016 • Mohammed Korayem, Khalifeh Aljadda, David Crandall
This paper surveys different approaches to building systems for subjective and sentiment analysis for languages other than English.
no code implementations • 19 Nov 2015 • Stefan Lee, Senthil Purushwalkam, Michael Cogswell, David Crandall, Dhruv Batra
Convolutional Neural Networks have achieved state-of-the-art performance on a wide range of tasks.