Search Results for author: David Crandall

Found 45 papers, 18 papers with code

Case-Enhanced Vision Transformer: Improving Explanations of Image Similarity with a ViT-based Similarity Metric

1 code implementation 24 Jul 2024 Ziwei Zhao, David Leake, Xiaomeng Ye, David Crandall

This short paper presents preliminary research on the Case-Enhanced Vision Transformer (CEViT), a similarity measurement method aimed at improving the explainability of similarity assessments for image data.

Classification

Transformer for Object Re-Identification: A Survey

no code implementations 13 Jan 2024 Mang Ye, Shuoyi Chen, Chenyue Li, Wei-Shi Zheng, David Crandall, Bo Du

Object Re-Identification (Re-ID) aims to identify and retrieve specific objects from varying viewpoints.

Object Survey

Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives

2 code implementations CVPR 2024 Kristen Grauman, Andrew Westbury, Lorenzo Torresani, Kris Kitani, Jitendra Malik, Triantafyllos Afouras, Kumar Ashutosh, Vijay Baiyya, Siddhant Bansal, Bikram Boote, Eugene Byrne, Zach Chavis, Joya Chen, Feng Cheng, Fu-Jen Chu, Sean Crane, Avijit Dasgupta, Jing Dong, Maria Escobar, Cristhian Forigua, Abrham Gebreselasie, Sanjay Haresh, Jing Huang, Md Mohaiminul Islam, Suyog Jain, Rawal Khirodkar, Devansh Kukreja, Kevin J Liang, Jia-Wei Liu, Sagnik Majumder, Yongsen Mao, Miguel Martin, Effrosyni Mavroudi, Tushar Nagarajan, Francesco Ragusa, Santhosh Kumar Ramakrishnan, Luigi Seminara, Arjun Somayazulu, Yale Song, Shan Su, Zihui Xue, Edward Zhang, Jinxu Zhang, Angela Castillo, Changan Chen, Xinzhu Fu, Ryosuke Furuta, Cristina Gonzalez, Prince Gupta, Jiabo Hu, Yifei HUANG, Yiming Huang, Weslie Khoo, Anush Kumar, Robert Kuo, Sach Lakhavani, Miao Liu, Mi Luo, Zhengyi Luo, Brighid Meredith, Austin Miller, Oluwatumininu Oguntola, Xiaqing Pan, Penny Peng, Shraman Pramanick, Merey Ramazanova, Fiona Ryan, Wei Shan, Kiran Somasundaram, Chenan Song, Audrey Southerland, Masatoshi Tateno, Huiyu Wang, Yuchen Wang, Takuma Yagi, Mingfei Yan, Xitong Yang, Zecheng Yu, Shengxin Cindy Zha, Chen Zhao, Ziwei Zhao, Zhifan Zhu, Jeff Zhuo, Pablo Arbelaez, Gedas Bertasius, David Crandall, Dima Damen, Jakob Engel, Giovanni Maria Farinella, Antonino Furnari, Bernard Ghanem, Judy Hoffman, C. V. Jawahar, Richard Newcombe, Hyun Soo Park, James M. Rehg, Yoichi Sato, Manolis Savva, Jianbo Shi, Mike Zheng Shou, Michael Wray

We present Ego-Exo4D, a diverse, large-scale multimodal multiview video dataset and benchmark challenge.

Video Understanding

Situated Cameras, Situated Knowledges: Towards an Egocentric Epistemology for Computer Vision

no code implementations 30 Jun 2023 Samuel Goree, David Crandall

In her influential 1988 paper, Situated Knowledges, Donna Haraway uses vision and perspective as a metaphor to discuss scientific knowledge.

Position

A Tensor-based Convolutional Neural Network for Small Dataset Classification

no code implementations 29 Mar 2023 Zhenhua Chen, David Crandall

Inspired by ConvNets with structured hidden representations, we propose a Tensor-based Neural Network, TCNN.

SePaint: Semantic Map Inpainting via Multinomial Diffusion

no code implementations 5 Mar 2023 Zheng Chen, Deepak Duggirala, David Crandall, Lei Jiang, Lantao Liu

Prediction beyond partial observations is crucial for robots to navigate in unknown environments because it can provide extra information regarding the surroundings beyond the current sensing range or resolution.

Navigate

VindLU: A Recipe for Effective Video-and-Language Pretraining

1 code implementation CVPR 2023 Feng Cheng, Xizi Wang, Jie Lei, David Crandall, Mohit Bansal, Gedas Bertasius

Furthermore, our model also obtains state-of-the-art video question-answering results on ActivityNet-QA, MSRVTT-QA, MSRVTT-MC and TVQA.

Ranked #2 on Video Retrieval on Condensed Movies (using extra training data)

Question Answering Retrieval +3

Attention is All They Need: Exploring the Media Archaeology of the Computer Vision Research Paper

no code implementations 22 Sep 2022 Samuel Goree, Gabriel Appleby, David Crandall, Norman Su

In this work, we examine the effects of this growth from a media archaeology perspective, through the changes to figures and tables in research papers.

Action Recognition based on Cross-Situational Action-object Statistics

1 code implementation 15 Aug 2022 Satoshi Tsutsui, Xizi Wang, Guangyuan Weng, Yayun Zhang, David Crandall, Chen Yu

We set out to identify properties of training data that lead to action recognition models with greater generalization ability.

Action Recognition Object +1

Reinforcing Generated Images via Meta-learning for One-Shot Fine-Grained Visual Recognition

no code implementations 22 Apr 2022 Satoshi Tsutsui, Yanwei Fu, David Crandall

One-shot fine-grained visual recognition often suffers from the problem of having few training examples for new fine-grained classes.

Diversity Fine-Grained Image Classification +4

Controlling the Quality of Distillation in Response-Based Network Compression

no code implementations 19 Dec 2021 Vibhas Vats, David Crandall

We argue that for a given teacher-student pair, the quality of distillation can be improved by finding the sweet spot between batch size and number of epochs while training the teacher.

Knowledge Distillation

Error Diagnosis of Deep Monocular Depth Estimation Models

no code implementations 15 Nov 2021 Jagpreet Chawla, Nikhil Thakurdesai, Anuj Godase, Md Reza, David Crandall, Soon-Heung Jung

To address errors in depth estimation, we introduce a novel Depth Error Detection Network (DEDN) that spatially identifies erroneous depth predictions in the monocular depth estimation models.

Depth Prediction Monocular Depth Estimation

Ego4D: Around the World in 3,000 Hours of Egocentric Video

8 code implementations CVPR 2022 Kristen Grauman, Andrew Westbury, Eugene Byrne, Zachary Chavis, Antonino Furnari, Rohit Girdhar, Jackson Hamburger, Hao Jiang, Miao Liu, Xingyu Liu, Miguel Martin, Tushar Nagarajan, Ilija Radosavovic, Santhosh Kumar Ramakrishnan, Fiona Ryan, Jayant Sharma, Michael Wray, Mengmeng Xu, Eric Zhongcong Xu, Chen Zhao, Siddhant Bansal, Dhruv Batra, Vincent Cartillier, Sean Crane, Tien Do, Morrie Doulaty, Akshay Erapalli, Christoph Feichtenhofer, Adriano Fragomeni, Qichen Fu, Abrham Gebreselasie, Cristina Gonzalez, James Hillis, Xuhua Huang, Yifei HUANG, Wenqi Jia, Weslie Khoo, Jachym Kolar, Satwik Kottur, Anurag Kumar, Federico Landini, Chao Li, Yanghao Li, Zhenqiang Li, Karttikeya Mangalam, Raghava Modhugu, Jonathan Munro, Tullie Murrell, Takumi Nishiyasu, Will Price, Paola Ruiz Puentes, Merey Ramazanova, Leda Sari, Kiran Somasundaram, Audrey Southerland, Yusuke Sugano, Ruijie Tao, Minh Vo, Yuchen Wang, Xindi Wu, Takuma Yagi, Ziwei Zhao, Yunyi Zhu, Pablo Arbelaez, David Crandall, Dima Damen, Giovanni Maria Farinella, Christian Fuegen, Bernard Ghanem, Vamsi Krishna Ithapu, C. V. Jawahar, Hanbyul Joo, Kris Kitani, Haizhou Li, Richard Newcombe, Aude Oliva, Hyun Soo Park, James M. Rehg, Yoichi Sato, Jianbo Shi, Mike Zheng Shou, Antonio Torralba, Lorenzo Torresani, Mingfei Yan, Jitendra Malik

We introduce Ego4D, a massive-scale egocentric video dataset and benchmark suite.

De-identification Ethics

Applying the Case Difference Heuristic to Learn Adaptations from Deep Network Features

1 code implementation 15 Jul 2021 Xiaomeng Ye, Ziwei Zhao, David Leake, Xizi Wang, David Crandall

Given a pair of cases, the CDH approach attributes the difference in their solutions to the difference in the problems they solve, and generates adaptation rules to adjust solutions accordingly when a retrieved case and new query have similar problem differences.

A Survey on Deep Learning Technique for Video Segmentation

1 code implementation 2 Jul 2021 Tianfei Zhou, Fatih Porikli, David Crandall, Luc van Gool, Wenguan Wang

Video segmentation -- partitioning video frames into multiple segments or objects -- plays a critical role in a broad range of practical applications, from enhancing visual effects in movies, to understanding scenes in autonomous driving, to creating virtual backgrounds in video conferencing.

Autonomous Driving Segmentation +4

How to Accelerate Capsule Convolutions in Capsule Networks

no code implementations 6 Apr 2021 Zhenhua Chen, Xiwen Li, Qian Lou, David Crandall

Improving the efficiency of routing procedures in CapsNets has been studied extensively.

Hierarchically Decoupled Spatial-Temporal Contrast for Self-supervised Video Representation Learning

no code implementations 23 Nov 2020 Zehua Zhang, David Crandall

We present a novel technique for self-supervised video representation learning by: (a) decoupling the learning objective into two contrastive subtasks respectively emphasizing spatial and temporal features, and (b) performing it hierarchically to encourage multi-scale understanding.

Action Recognition Contrastive Learning +1

Whose hand is this? Person Identification from Egocentric Hand Gestures

no code implementations 17 Nov 2020 Satoshi Tsutsui, Yanwei Fu, David Crandall

But while one's own face is not frequently visible, their hands are: in fact, hands are among the most common objects in one's own field of view.

Gesture Recognition Person Identification

Deep Tiered Image Segmentation For Detecting Internal Ice Layers in Radar Imagery

no code implementations 8 Oct 2020 Yuchen Wang, Mingze Xu, John Paden, Lora Koenig, Geoffrey Fox, David Crandall

Understanding the structure of Earth's polar ice sheets is important for modeling how global warming will impact polar ice and, in turn, the Earth's climate.

Image Segmentation Semantic Segmentation

A Computational Model of Early Word Learning from the Infant's Point of View

1 code implementation 4 Jun 2020 Satoshi Tsutsui, Arjun Chandrasekaran, Md. Alimoor Reza, David Crandall, Chen Yu

Human infants have the remarkable ability to learn the associations between object names and visual objects from inherently ambiguous experiences.

Interaction Graphs for Object Importance Estimation in On-road Driving Videos

no code implementations 12 Mar 2020 Zehua Zhang, Ashish Tawari, Sujitha Martin, David Crandall

A vehicle driving along the road is surrounded by many objects, but only a small subset of them influence the driver's decisions and actions.

Autonomous Driving Decision Making +1

Learning Video Object Segmentation from Unlabeled Videos

1 code implementation CVPR 2020 Xiankai Lu, Wenguan Wang, Jianbing Shen, Yu-Wing Tai, David Crandall, Steven C. H. Hoi

We propose a new method for video object segmentation (VOS) that addresses object pattern learning from unlabeled videos, unlike most existing methods which rely heavily on extensive annotated data.

Object Representation Learning +6

Zero-Shot Video Object Segmentation via Attentive Graph Neural Networks

1 code implementation ICCV 2019 Wenguan Wang, Xiankai Lu, Jianbing Shen, David Crandall, Ling Shao

Through parametric message passing, AGNN is able to efficiently capture and mine much richer and higher-order relations between video frames, thus enabling a more complete understanding of video content and more accurate foreground estimation.

Graph Neural Network Segmentation +5

P-CapsNets: a General Form of Convolutional Neural Networks

no code implementations 18 Dec 2019 Zhenhua Chen, Xiwen Li, Chuhua Wang, David Crandall

The experiment shows that P-CapsNets achieve better performance than CapsNets with varied routing procedures by using significantly fewer parameters on MNIST and CIFAR10.

Adversarial Robustness

A Self Validation Network for Object-Level Human Attention Estimation

1 code implementation NeurIPS 2019 Zehua Zhang, Chen Yu, David Crandall

Due to the foveated nature of the human vision system, people can focus their visual attention on a small region of their visual field at a time, which usually contains only a single object.

Object

Active Object Manipulation Facilitates Visual Object Learning: An Egocentric Vision Study

no code implementations 4 Jun 2019 Satoshi Tsutsui, Dian Zhi, Md. Alimoor Reza, David Crandall, Chen Yu

Inspired by the remarkable ability of the infant visual learning system, a recent study collected first-person images from children to analyze the 'training data' that they receive.

Few-Shot Learning Object

Embodied Visual Recognition

no code implementations 9 Apr 2019 Jianwei Yang, Zhile Ren, Mingze Xu, Xinlei Chen, David Crandall, Devi Parikh, Dhruv Batra

Passive visual systems typically fail to recognize objects in the amodal setting where they are heavily occluded.

Object Object Localization +1

Unsupervised Domain Adaptation using Generative Models and Self-ensembling

no code implementations 2 Dec 2018 Eman T. Hassan, Xin Chen, David Crandall

The results suggest that self-ensembling is better than simple data augmentation with the newly generated data, and that a single model trained this way can have the best performance across all different transfer tasks.

Data Augmentation Style Transfer +1

Toddler-Inspired Visual Object Learning

no code implementations NeurIPS 2018 Sven Bambach, David Crandall, Linda Smith, Chen Yu

Real-world learning systems have practical limitations on the quality and quantity of the training datasets that they can collect and consider.

Diversity Object

Generalized Capsule Networks with Trainable Routing Procedure

1 code implementation ICLR 2019 Zhenhua Chen, David Crandall

To overcome the disadvantages of current routing procedures in CapsNets, we embed the routing procedure into the optimization procedure with all other parameters in the neural network; that is, we make the coupling coefficients in the routing procedure completely trainable.

Detecting Small, Densely Distributed Objects with Filter-Amplifier Networks and Loss Boosting

no code implementations 21 Feb 2018 Zhenhua Chen, David Crandall, Robert Templeman

Detecting small, densely distributed objects is a significant challenge: small objects often contain less distinctive information compared to larger ones, and finer-grained precision of bounding box boundaries is required.

Using Artificial Tokens to Control Languages for Multilingual Image Caption Generation

1 code implementation 20 Jun 2017 Satoshi Tsutsui, David Crandall

Recent work in computer vision has yielded impressive results in automatically describing images with natural language.

Caption Generation

A Hybrid Supervised-unsupervised Method on Image Topic Visualization with Convolutional Neural Network and LDA

no code implementations 15 Mar 2017 Kai Zhen, Mridul Birla, David Crandall, Bingjing Zhang, Judy Qiu

Despite the progress in image recognition with recent data-driven paradigms, it is still expensive to manually label a large training dataset to fit a convolutional neural network (CNN) model.

A Data Driven Approach for Compound Figure Separation Using Convolutional Neural Networks

no code implementations 15 Mar 2017 Satoshi Tsutsui, David Crandall

CNNs eliminate the need for manually designing features and separation rules, but require a large amount of annotated training data.

Transfer Learning

Diverse Beam Search: Decoding Diverse Solutions from Neural Sequence Models

25 code implementations 7 Oct 2016 Ashwin K. Vijayakumar, Michael Cogswell, Ramprasath R. Selvaraju, Qing Sun, Stefan Lee, David Crandall, Dhruv Batra

We observe that our method consistently outperforms BS and previously proposed techniques for diverse decoding from neural sequence models.

Diversity Image Captioning +5

Stochastic Multiple Choice Learning for Training Diverse Deep Ensembles

no code implementations NeurIPS 2016 Stefan Lee, Senthil Purushwalkam, Michael Cogswell, Viresh Ranjan, David Crandall, Dhruv Batra

Many practical perception systems exist within larger processes that include interactions with users or additional components capable of evaluating the quality of predicted solutions.

Multiple-choice

Sentiment/Subjectivity Analysis Survey for Languages other than English

no code implementations 1 Jan 2016 Mohammed Korayem, Khalifeh Aljadda, David Crandall

This paper surveys different approaches to building systems for subjectivity and sentiment analysis in languages other than English.

Arabic Sentiment Analysis Subjectivity Analysis +1
