1 code implementation • ECCV 2020 • Ziqiang Zheng, Yang Wu, Xinran Han, Jianbo Shi
We present ForkGAN, a task-agnostic image translation method that can boost multiple vision tasks in adverse weather conditions.
no code implementations • ECCV 2020 • Lingzhi Zhang, Tarmily Wen, Jie Min, Jiancong Wang, David Han, Jianbo Shi
We study the problem of common sense placement of visual objects in an image.
no code implementations • 26 Jun 2024 • Huzheng Yang, James Gee, Jianbo Shi
We study the intriguing connection between visual data, deep networks, and the brain.
no code implementations • 23 May 2024 • Katherine Xu, Lingzhi Zhang, Jianbo Shi
In this work, we conduct a large-scale scientific study into the impact of random seeds during diffusion inference.
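To make the role of the seed concrete, here is a minimal sketch (not the authors' protocol) of how a random seed enters diffusion inference, using Hugging Face diffusers; the model id and sampler settings are illustrative assumptions.

```python
# Minimal sketch: fixing vs. varying the random seed during diffusion inference.
# The model id and settings below are illustrative assumptions, not the paper's setup.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

prompt = "a photo of a red bicycle leaning against a brick wall"

# Each seed initializes the latent noise; different seeds yield different samples
# for the same prompt, which is the effect a seed study would quantify.
images = []
for seed in [0, 1, 2, 42]:
    generator = torch.Generator(device="cuda").manual_seed(seed)
    image = pipe(prompt, generator=generator, num_inference_steps=30).images[0]
    images.append(image)
```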
1 code implementation • 28 Mar 2024 • Katherine Xu, Lingzhi Zhang, Jianbo Shi
Modern text-to-image (T2I) diffusion models can generate images with remarkable realism and creativity.
no code implementations • CVPR 2024 • Katherine Xu, Lingzhi Zhang, Jianbo Shi
We propose to sidestep many of the difficulties of existing approaches, which typically involve a two-step process of predicting amodal masks and then generating pixels.
1 code implementation • CVPR 2024 • Huzheng Yang, James Gee, Jianbo Shi
This mapping method, FactorTopy, is plug-and-play for any deep network; with it, one can paint a picture of the network onto the brain (literally!).
2 code implementations • CVPR 2024 • Kristen Grauman, Andrew Westbury, Lorenzo Torresani, Kris Kitani, Jitendra Malik, Triantafyllos Afouras, Kumar Ashutosh, Vijay Baiyya, Siddhant Bansal, Bikram Boote, Eugene Byrne, Zach Chavis, Joya Chen, Feng Cheng, Fu-Jen Chu, Sean Crane, Avijit Dasgupta, Jing Dong, Maria Escobar, Cristhian Forigua, Abrham Gebreselasie, Sanjay Haresh, Jing Huang, Md Mohaiminul Islam, Suyog Jain, Rawal Khirodkar, Devansh Kukreja, Kevin J Liang, Jia-Wei Liu, Sagnik Majumder, Yongsen Mao, Miguel Martin, Effrosyni Mavroudi, Tushar Nagarajan, Francesco Ragusa, Santhosh Kumar Ramakrishnan, Luigi Seminara, Arjun Somayazulu, Yale Song, Shan Su, Zihui Xue, Edward Zhang, Jinxu Zhang, Angela Castillo, Changan Chen, Xinzhu Fu, Ryosuke Furuta, Cristina Gonzalez, Prince Gupta, Jiabo Hu, Yifei HUANG, Yiming Huang, Weslie Khoo, Anush Kumar, Robert Kuo, Sach Lakhavani, Miao Liu, Mi Luo, Zhengyi Luo, Brighid Meredith, Austin Miller, Oluwatumininu Oguntola, Xiaqing Pan, Penny Peng, Shraman Pramanick, Merey Ramazanova, Fiona Ryan, Wei Shan, Kiran Somasundaram, Chenan Song, Audrey Southerland, Masatoshi Tateno, Huiyu Wang, Yuchen Wang, Takuma Yagi, Mingfei Yan, Xitong Yang, Zecheng Yu, Shengxin Cindy Zha, Chen Zhao, Ziwei Zhao, Zhifan Zhu, Jeff Zhuo, Pablo Arbelaez, Gedas Bertasius, David Crandall, Dima Damen, Jakob Engel, Giovanni Maria Farinella, Antonino Furnari, Bernard Ghanem, Judy Hoffman, C. V. Jawahar, Richard Newcombe, Hyun Soo Park, James M. Rehg, Yoichi Sato, Manolis Savva, Jianbo Shi, Mike Zheng Shou, Michael Wray
We present Ego-Exo4D, a diverse, large-scale multimodal multiview video dataset and benchmark challenge.
1 code implementation • ICCV 2023 • Lingzhi Zhang, Zhengjie Xu, Connelly Barnes, Yuqian Zhou, Qing Liu, He Zhang, Sohrab Amirghodsi, Zhe Lin, Eli Shechtman, Jianbo Shi
Recent advancements in deep generative models have facilitated the creation of photo-realistic images across various tasks.
1 code implementation • 2 Aug 2023 • Huzheng Yang, James Gee, Jianbo Shi
Our ensemble model without memory input (61.4) also achieves 3rd place.
no code implementations • 26 Jul 2023 • Huzheng Yang, Jianbo Shi, James Gee
We use this diversity to our advantage by introducing the All-for-One training recipe, which divides the challenging one-big-model problem into multiple small models, with the small models aggregating the knowledge while preserving the distinction between the different functional regions.
2 code implementations • 14 Mar 2023 • Renrui Zhang, Liuhui Wang, Ziyu Guo, Yali Wang, Peng Gao, Hongsheng Li, Jianbo Shi
We present a Non-parametric Network for 3D point cloud analysis, Point-NN, which consists of purely non-learnable components: farthest point sampling (FPS), k-nearest neighbors (k-NN), and pooling operations, with trigonometric functions.
Ranked #1 on Training-free 3D Part Segmentation on ShapeNet-Part
Supervised Only 3D Point Cloud Classification, Training-free 3D Part Segmentation, +1
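The Point-NN entry above names its non-learnable building blocks; the sketch below illustrates one such encoding stage in NumPy, under assumed hyperparameters, and is not the released implementation.

```python
# Minimal sketch of a non-learnable point encoder in the spirit of Point-NN:
# farthest point sampling, k-NN grouping, trigonometric embedding, max pooling.
import numpy as np

def farthest_point_sampling(xyz, m):
    n = xyz.shape[0]
    idx = np.zeros(m, dtype=int)
    dist = np.full(n, np.inf)
    idx[0] = np.random.randint(n)
    for i in range(1, m):
        dist = np.minimum(dist, np.linalg.norm(xyz - xyz[idx[i - 1]], axis=1))
        idx[i] = int(dist.argmax())
    return idx

def trig_embed(xyz, num_freqs=5):
    # Sinusoidal embedding of raw coordinates at several frequencies (no learned weights).
    freqs = 2.0 ** np.arange(num_freqs)
    angles = xyz[:, :, None] * freqs[None, None, :]           # (n, 3, num_freqs)
    emb = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
    return emb.reshape(xyz.shape[0], -1)                       # (n, 6 * num_freqs)

def encode(xyz, m=128, k=16):
    centers = farthest_point_sampling(xyz, m)
    feats = trig_embed(xyz)
    out = []
    for c in centers:
        knn = np.argsort(np.linalg.norm(xyz - xyz[c], axis=1))[:k]  # k nearest neighbors
        out.append(feats[knn].max(axis=0))                          # max pooling over the group
    return np.stack(out)                                            # (m, d)

points = np.random.randn(1024, 3).astype(np.float32)
print(encode(points).shape)
```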
no code implementations • 1 Mar 2023 • Renrui Zhang, Liuhui Wang, Ziyu Guo, Jianbo Shi
Performance on standard 3D point cloud benchmarks has plateaued, resulting in oversized models and complex network designs that yield only fractional improvements.
1 code implementation • CVPR 2023 • Renrui Zhang, Liuhui Wang, Yali Wang, Peng Gao, Hongsheng Li, Jianbo Shi
We present a Non-parametric Network for 3D point cloud analysis, Point-NN, which consists of purely non-learnable components: farthest point sampling (FPS), k-nearest neighbors (k-NN), and pooling operations, with trigonometric functions.
1 code implementation • CVPR 2023 • Jiaben Chen, Renrui Zhang, Dongze Lian, Jiaqi Yang, Ziyao Zeng, Jianbo Shi
To generalize to a new instrument or event class, drawing inspiration from the text-prompt design, we insert an additional query as an audio prompt while freezing the attention mechanism.
no code implementations • 29 Nov 2022 • Lukas Zhornyak, Zhengjie Xu, Haoran Tang, Jianbo Shi
We present HashEncoding, a novel autoencoding architecture that leverages a non-parametric multiscale coordinate hash function to facilitate a per-pixel decoder without convolutions.
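As a rough illustration of the idea, the sketch below hashes pixel coordinates at multiple scales into embedding tables and decodes each pixel with a small MLP; the table sizes, hashing primes, and decoder are illustrative assumptions rather than the HashEncoding architecture itself.

```python
# Minimal sketch of a multiscale coordinate hash encoding with a per-pixel MLP
# decoder (no convolutions). All hyperparameters here are illustrative assumptions.
import torch
import torch.nn as nn

class MultiScaleHashEncoding(nn.Module):
    def __init__(self, num_levels=4, table_size=2**14, feat_dim=4, base_res=16):
        super().__init__()
        self.resolutions = [base_res * 2**i for i in range(num_levels)]
        self.tables = nn.ModuleList(
            nn.Embedding(table_size, feat_dim) for _ in range(num_levels)
        )
        self.table_size = table_size

    def forward(self, coords):                                # coords: (N, 2) in [0, 1]
        feats = []
        for res, table in zip(self.resolutions, self.tables):
            grid = (coords * res).long()                      # integer cell index per scale
            # Hash the 2D cell index into the embedding table with two large primes.
            h = (grid[:, 0] * 73856093 ^ grid[:, 1] * 19349663) % self.table_size
            feats.append(table(h))
        return torch.cat(feats, dim=-1)                       # (N, num_levels * feat_dim)

encoder = MultiScaleHashEncoding()
decoder = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 3))  # per-pixel RGB
coords = torch.rand(4096, 2)
rgb = decoder(encoder(coords))
print(rgb.shape)  # torch.Size([4096, 3])
```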
1 code implementation • 7 Aug 2022 • Lingzhi Zhang, Shenghao Zhou, Simon Stent, Jianbo Shi
Egocentric videos offer fine-grained information for high-fidelity modeling of human behaviors.
no code implementations • 6 Aug 2022 • Lingzhi Zhang, Connelly Barnes, Kevin Wampler, Sohrab Amirghodsi, Eli Shechtman, Zhe Lin, Jianbo Shi
Recently, deep models have established SOTA performance for low-resolution image inpainting, but they lack fidelity at the resolutions of modern cameras (4K and above) and for large holes.
1 code implementation • 5 Aug 2022 • Lingzhi Zhang, Yuqian Zhou, Connelly Barnes, Sohrab Amirghodsi, Zhe Lin, Eli Shechtman, Jianbo Shi
Inspired by this workflow, we propose a new learning task of automatic segmentation of inpainting perceptual artifacts, and apply the model for inpainting model evaluation and iterative refinement.
no code implementations • CVPR 2023 • Rohit Jena, Lukas Zhornyak, Nehal Doiphode, Pratik Chaudhari, Vivek Buch, James Gee, Jianbo Shi
Correctness of instance segmentation requires counting the number of objects, correctly localizing all predictions, and classifying each localized prediction.
1 code implementation • 19 Nov 2021 • Renrui Zhang, Ziyao Zeng, Ziyu Guo, Xinben Gao, Kexue Fu, Jianbo Shi
We reverse the conventional design of applying convolution on voxels and attention to points.
Ranked #37 on 3D Part Segmentation on ShapeNet-Part
1 code implementation • 14 Oct 2021 • Peihao Wang, Yuehao Wang, Hua Lin, Jianbo Shi
Graph Convolutional Networks (GCNs) with multi-hop aggregation are more expressive than one-hop GCNs but suffer from higher model complexity.
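The trade-off can be seen in a minimal sketch: a K-hop layer mixes powers of the normalized adjacency, gaining expressiveness at a multiplied propagation cost. The mixing scheme below is a generic choice, not the paper's specific model.

```python
# Minimal sketch of one-hop vs. multi-hop GCN aggregation (generic formulation).
import torch
import torch.nn as nn

def normalize_adj(adj):
    adj = adj + torch.eye(adj.size(0))                 # add self-loops
    deg_inv_sqrt = adj.sum(1).pow(-0.5)
    return deg_inv_sqrt[:, None] * adj * deg_inv_sqrt[None, :]

class MultiHopGCNLayer(nn.Module):
    def __init__(self, in_dim, out_dim, hops=3):
        super().__init__()
        self.hops = hops
        self.linears = nn.ModuleList(nn.Linear(in_dim, out_dim) for _ in range(hops))

    def forward(self, x, adj_norm):
        out, prop = 0.0, x
        for k in range(self.hops):
            prop = adj_norm @ prop                     # k-th power of the adjacency applied to x
            out = out + self.linears[k](prop)          # one weight matrix per hop
        return torch.relu(out)

adj = (torch.rand(100, 100) < 0.05).float()
adj = ((adj + adj.T) > 0).float()                      # symmetric random graph
x = torch.randn(100, 16)
layer = MultiHopGCNLayer(16, 32, hops=3)
print(layer(x, normalize_adj(adj)).shape)              # torch.Size([100, 32])
```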
8 code implementations • CVPR 2022 • Kristen Grauman, Andrew Westbury, Eugene Byrne, Zachary Chavis, Antonino Furnari, Rohit Girdhar, Jackson Hamburger, Hao Jiang, Miao Liu, Xingyu Liu, Miguel Martin, Tushar Nagarajan, Ilija Radosavovic, Santhosh Kumar Ramakrishnan, Fiona Ryan, Jayant Sharma, Michael Wray, Mengmeng Xu, Eric Zhongcong Xu, Chen Zhao, Siddhant Bansal, Dhruv Batra, Vincent Cartillier, Sean Crane, Tien Do, Morrie Doulaty, Akshay Erapalli, Christoph Feichtenhofer, Adriano Fragomeni, Qichen Fu, Abrham Gebreselasie, Cristina Gonzalez, James Hillis, Xuhua Huang, Yifei HUANG, Wenqi Jia, Weslie Khoo, Jachym Kolar, Satwik Kottur, Anurag Kumar, Federico Landini, Chao Li, Yanghao Li, Zhenqiang Li, Karttikeya Mangalam, Raghava Modhugu, Jonathan Munro, Tullie Murrell, Takumi Nishiyasu, Will Price, Paola Ruiz Puentes, Merey Ramazanova, Leda Sari, Kiran Somasundaram, Audrey Southerland, Yusuke Sugano, Ruijie Tao, Minh Vo, Yuchen Wang, Xindi Wu, Takuma Yagi, Ziwei Zhao, Yunyi Zhu, Pablo Arbelaez, David Crandall, Dima Damen, Giovanni Maria Farinella, Christian Fuegen, Bernard Ghanem, Vamsi Krishna Ithapu, C. V. Jawahar, Hanbyul Joo, Kris Kitani, Haizhou Li, Richard Newcombe, Aude Oliva, Hyun Soo Park, James M. Rehg, Yoichi Sato, Jianbo Shi, Mike Zheng Shou, Antonio Torralba, Lorenzo Torresani, Mingfei Yan, Jitendra Malik
We introduce Ego4D, a massive-scale egocentric video dataset and benchmark suite.
no code implementations • 30 Aug 2021 • Yang Wu, Dingheng Wang, Xiaotong Lu, Fan Yang, Guoqi Li, Weisheng Dong, Jianbo Shi
Visual recognition is currently one of the most important and active research areas in computer vision, pattern recognition, and even the general field of artificial intelligence.
no code implementations • 7 Aug 2020 • Haiping Zhu, Hongming Shan, Yuheng Zhang, Lingfu Che, Xiaoyang Xu, Junping Zhang, Jianbo Shi, Fei-Yue Wang
We propose a novel ordinal regression approach, termed Convolutional Ordinal Regression Forest or CORF, for image ordinal estimation, which can integrate ordinal regression and differentiable decision trees with a convolutional neural network for obtaining precise and stable global ordinal relationships.
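For readers unfamiliar with the ordinal part, the sketch below shows the standard formulation of ordinal regression as K-1 cumulative binary decisions on top of a small CNN; it illustrates only the ordinal head, not CORF's differentiable decision forest, and all architecture choices are illustrative assumptions.

```python
# Minimal sketch of ordinal regression via K-1 cumulative binary thresholds.
import torch
import torch.nn as nn

class OrdinalHead(nn.Module):
    def __init__(self, feat_dim, num_ranks):
        super().__init__()
        self.fc = nn.Linear(feat_dim, num_ranks - 1)   # P(y > r) for r = 0..K-2

    def forward(self, feats):
        return torch.sigmoid(self.fc(feats))

def ordinal_targets(labels, num_ranks):
    # Label y becomes a vector of K-1 binary indicators [y > 0, y > 1, ...].
    ranks = torch.arange(num_ranks - 1)
    return (labels[:, None] > ranks[None, :]).float()

backbone = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1), nn.Flatten()
)
head = OrdinalHead(16, num_ranks=8)

images = torch.randn(4, 3, 64, 64)
labels = torch.tensor([0, 3, 5, 7])
probs = head(backbone(images))                         # (4, 7) cumulative probabilities
loss = nn.functional.binary_cross_entropy(probs, ordinal_targets(labels, 8))
pred_rank = (probs > 0.5).sum(dim=1)                   # decode by counting thresholds passed
```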
no code implementations • 3 Jun 2020 • Lingzhi Zhang, Jiancong Wang, Yinshuang Xu, Jie Min, Tarmily Wen, James C. Gee, Jianbo Shi
We propose an image synthesis approach that provides stratified navigation in the latent code space.
no code implementations • ICLR 2020 • Tao Kong, Fuchun Sun, Huaping Liu, Yuning Jiang, Lei LI, Jianbo Shi
While almost all state-of-the-art object detectors utilize predefined anchors to enumerate possible locations, scales, and aspect ratios in the search for objects, their performance and generalization ability are also limited by the design of the anchors.
no code implementations • 18 Nov 2019 • Shan Su, Cheng Peng, Jianbo Shi, Chiho Choi
From the generated potential fields, we further estimate future motion direction and speed, which are modeled as Gaussian distributions to account for the multi-modal nature of the problem.
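A minimal sketch of the Gaussian output modeling is given below: the network predicts a mean and variance for direction and speed and is trained with a Gaussian negative log-likelihood. The network and parameterization are illustrative assumptions, not the paper's architecture.

```python
# Minimal sketch: predict future motion direction and speed as Gaussians.
import torch
import torch.nn as nn

class MotionHead(nn.Module):
    def __init__(self, feat_dim=128):
        super().__init__()
        self.mean = nn.Linear(feat_dim, 2)          # [direction (rad), speed]
        self.log_var = nn.Linear(feat_dim, 2)       # predicted uncertainty per output

    def forward(self, feats):
        return self.mean(feats), self.log_var(feats).exp()

head = MotionHead()
nll = nn.GaussianNLLLoss()

feats = torch.randn(32, 128)                        # e.g. pooled potential-field features
target = torch.stack(
    [torch.rand(32) * 2 * torch.pi, torch.rand(32) * 1.5], dim=1
)                                                   # ground-truth direction and speed
mean, var = head(feats)
loss = nll(mean, target, var)                       # error penalized relative to predicted variance
```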
1 code implementation • 25 Oct 2019 • Lingzhi Zhang, Jiancong Wang, Jianbo Shi
In this paper, we study the problem of generating a set of realistic and diverse backgrounds when given only a small foreground region.
2 code implementations • 25 Oct 2019 • Lingzhi Zhang, Tarmily Wen, Jianbo Shi
In addition, we jointly optimize the proposed Poisson blending loss as well as the style and content loss computed from a deep network, and reconstruct the blending region by iteratively updating the pixels using the L-BFGS solver.
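The sketch below illustrates only the gradient-domain (Poisson) blending term optimized directly over pixels with L-BFGS; the deep style and content terms from the paper are omitted, and the setup is an assumption rather than the released code.

```python
# Minimal sketch of a Poisson (gradient-domain) blending loss minimized with L-BFGS.
import torch

def image_gradients(img):                     # img: (3, H, W)
    dx = img[:, :, 1:] - img[:, :, :-1]
    dy = img[:, 1:, :] - img[:, :-1, :]
    return dx, dy

source = torch.rand(3, 64, 64)                # foreground to paste
target = torch.rand(3, 64, 64)                # background image
mask = torch.zeros(1, 64, 64)
mask[:, 16:48, 16:48] = 1.0                   # blending region

blend = (mask * source + (1 - mask) * target).clone().requires_grad_(True)
optimizer = torch.optim.LBFGS([blend], lr=0.5, max_iter=20)

def closure():
    optimizer.zero_grad()
    bdx, bdy = image_gradients(blend)
    sdx, sdy = image_gradients(source)
    tdx, tdy = image_gradients(target)
    mdx, mdy = mask[:, :, 1:], mask[:, 1:, :]
    # Match source gradients inside the mask and target gradients outside it.
    loss = ((bdx - (mdx * sdx + (1 - mdx) * tdx)) ** 2).mean() \
         + ((bdy - (mdy * sdy + (1 - mdy) * tdy)) ** 2).mean()
    loss.backward()
    return loss

for _ in range(10):
    optimizer.step(closure)
```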
1 code implementation • ICCV 2019 • Jyh-Jing Hwang, Stella X. Yu, Jianbo Shi, Maxwell D. Collins, Tien-Ju Yang, Xiao Zhang, Liang-Chieh Chen
The proposed SegSort further produces an interpretable result, as each choice of label can be easily understood from the retrieved nearest segments.
Ranked #10 on Unsupervised Semantic Segmentation on PASCAL VOC 2012 val (using extra training data)
no code implementations • 8 Aug 2019 • Andong Cao, Ali Dhanaliwala, Jianbo Shi, Terence Gade, Brian Park
The locations of the markers in the CT scan are extracted, and the CT scan is converted into a 3D surface object.
no code implementations • 13 Jul 2019 • Lingzhi Zhang, Andong Cao, Rui Li, Jianbo Shi
In common real-world robotic operations, action and state spaces can be vast and sometimes unknown, and observations are often relatively sparse.
no code implementations • 10 Jul 2019 • Jiancong Wang, Yu-Hua Chen, Yifan Wu, Jianbo Shi, James Gee
Single image super-resolution (SISR) reconstruction for magnetic resonance imaging (MRI) has generated significant interest because of its potential not only to speed up imaging but also to improve quantitative processing and analysis of available image data.
3 code implementations • NeurIPS 2019 • Gedas Bertasius, Christoph Feichtenhofer, Du Tran, Jianbo Shi, Lorenzo Torresani
To reduce the need for dense annotations, we propose a PoseWarper network that leverages training videos with sparse annotations (every k frames) to learn to perform dense temporal pose propagation and estimation.
Ranked #2 on Multi-Person Pose Estimation on PoseTrack2018 (using extra training data)
7 code implementations • 8 Apr 2019 • Tao Kong, Fuchun Sun, Huaping Liu, Yuning Jiang, Lei LI, Jianbo Shi
In FoveaBox, an instance is assigned to adjacent feature levels to make the model more accurate. We demonstrate its effectiveness on standard benchmarks and report extensive experimental analysis.
Ranked #90 on Object Detection on COCO test-dev (APM metric)
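The multi-level assignment mentioned in the FoveaBox entry above can be sketched as matching a box's scale against overlapping per-level ranges, so a medium-sized object is predicted by adjacent pyramid levels. The specific scale ranges below are illustrative assumptions, not FoveaBox's published settings.

```python
# Minimal sketch of assigning a ground-truth box to adjacent feature-pyramid levels
# by its scale. The per-level scale ranges are illustrative assumptions.
import math

# Valid object scale (sqrt of box area, in pixels) for pyramid levels P3..P7.
LEVEL_RANGES = {3: (32, 128), 4: (64, 256), 5: (128, 512), 6: (256, 1024), 7: (512, 2048)}

def assign_levels(box):
    """Return every pyramid level whose scale range contains the box scale."""
    x1, y1, x2, y2 = box
    scale = math.sqrt((x2 - x1) * (y2 - y1))
    return [lvl for lvl, (lo, hi) in LEVEL_RANGES.items() if lo <= scale <= hi]

# A medium-sized box falls into two adjacent levels, so both predict it.
print(assign_levels((100, 100, 300, 260)))   # [4, 5]
```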
1 code implementation • CVPR 2019 • Shaohui Liu, Xiao Zhang, Jianqiao Wangni, Jianbo Shi
We introduce the concept of normalized diversity, which forces the model to preserve the normalized pairwise distance between sparse samples from a latent parametric distribution and their corresponding high-dimensional outputs.
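A minimal, hedged sketch of such a loss is given below: normalized pairwise distances between latent codes should not shrink much more in output space than in latent space. The distance functions and margin formulation are illustrative assumptions, not the paper's exact loss.

```python
# Minimal sketch of a normalized diversification loss between latent codes and outputs.
import torch

def normalized_pairwise_distances(x):
    flat = x.reshape(x.size(0), -1)
    d = torch.cdist(flat, flat, p=2)
    return d / (d.max() + 1e-8)                      # scale-free pairwise distances

def normalized_diversity_loss(z, outputs, alpha=0.8):
    dz = normalized_pairwise_distances(z)
    dy = normalized_pairwise_distances(outputs)
    # Penalize output pairs that collapse closer than the (scaled) latent distance.
    return torch.relu(alpha * dz - dy).mean()

z = torch.randn(8, 64)                               # sparse latent samples
outputs = torch.randn(8, 3, 32, 32)                  # corresponding generated images
print(normalized_diversity_loss(z, outputs))
```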
no code implementations • 24 Jan 2019 • Jianqiao Wangni, Ke Li, Jianbo Shi, Jitendra Malik
Recently, researchers have proposed various low-precision gradient compression schemes for efficient communication in large-scale distributed optimization.
no code implementations • 19 Jan 2019 • Tao Kong, Fuchun Sun, Huaping Liu, Yuning Jiang, Jianbo Shi
We present consistent optimization for single stage object detection.
no code implementations • 29 Dec 2018 • Jianqiao Wangni, Dahua Lin, Ji Liu, Kostas Daniilidis, Jianbo Shi
For recovering 3D object poses from 2D images, a prevalent method is to pre-train an over-complete dictionary $\mathcal{D}=\{B_i\}_{i=1}^{D}$ of 3D basis poses.
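To make the dictionary idea concrete, the sketch below represents a 3D pose as a sparse combination of basis poses and fits the coefficients to 2D keypoints; the orthographic camera, ridge penalty, and solver are simplifying assumptions, not the paper's method.

```python
# Minimal sketch: recover a 3D pose as a combination of basis poses B_i fitted to
# 2D keypoints under a known orthographic projection (simplifying assumptions).
import numpy as np

rng = np.random.default_rng(0)
D, J = 20, 15                                   # dictionary size, number of joints
basis = rng.normal(size=(D, 3, J))              # pre-trained basis poses B_i

# Synthetic ground truth: a sparse combination observed under orthographic projection.
c_true = np.zeros(D)
c_true[[2, 7, 11]] = [1.0, -0.5, 0.8]
R = np.eye(3)[:2]                               # known orthographic camera (first two rows)
W = R @ np.tensordot(c_true, basis, axes=1)     # observed 2D keypoints (2, J)

# Least-squares fit of coefficients c, with a small ridge term standing in for sparsity.
A = np.stack([(R @ B).ravel() for B in basis], axis=1)   # (2J, D) design matrix
c_hat = np.linalg.solve(A.T @ A + 1e-3 * np.eye(D), A.T @ W.ravel())

pose_3d = np.tensordot(c_hat, basis, axes=1)    # reconstructed 3D pose (3, J)
print(np.abs(c_hat - c_true).max())             # small residual on this noiseless example
```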
no code implementations • 11 Dec 2018 • Gedas Bertasius, Christoph Feichtenhofer, Du Tran, Jianbo Shi, Lorenzo Torresani
Our network learns to spatially sample features from Frame B in order to maximize pose detection accuracy in Frame A.
no code implementations • CVPR 2019 • Liangzhe Yuan, Yibo Chen, Hantian Liu, Tao Kong, Jianbo Shi
We propose a lightweight video frame interpolation algorithm.
1 code implementation • CVPR 2019 • Jyh-Jing Hwang, Tsung-Wei Ke, Jianbo Shi, Stella X. Yu
The structure analyzer is trained to maximize the ASM loss, or to emphasize recurring multi-scale hard negative structural mistakes among co-occurring patterns.
no code implementations • ECCV 2018 • Gedas Bertasius, Lorenzo Torresani, Jianbo Shi
We propose a Spatiotemporal Sampling Network (STSN) that uses deformable convolutions across time for object detection in videos.
no code implementations • CVPR 2018 • Gedas Bertasius, Aaron Chan, Jianbo Shi
We present a model that uses a single first-person image to generate an egocentric basketball motion sequence in the form of a 12D camera configuration trajectory, which encodes a player's 3D location and 3D head orientation throughout the sequence.
no code implementations • 5 Sep 2017 • Gedas Bertasius, Jianbo Shi
We present a first-person method for cooperative basketball intention prediction: we predict with whom the camera wearer will cooperate in the near future from unlabeled first-person images.
no code implementations • CVPR 2017 • Shan Su, Jung Pyo Hong, Jianbo Shi, Hyun Soo Park
This paper presents a method to predict the future movements (location and gaze direction) of basketball players as a whole from their first person videos.
no code implementations • 1 Apr 2017 • Shan Su, Jianbo Shi, Hyun Soo Park
Our conjecture is that the spatial arrangement of a first person visual scene is deployed to afford an action, and therefore, the action can be inversely used to synthesize a new scene such that the action is feasible.
no code implementations • 29 Nov 2016 • Shan Su, Jung Pyo Hong, Jianbo Shi, Hyun Soo Park
This paper presents a method to predict the future movements (location and gaze direction) of basketball players as a whole from their first person videos.
1 code implementation • ICCV 2017 • Gedas Bertasius, Hyun Soo Park, Stella X. Yu, Jianbo Shi
In this work, we show that we can detect important objects in first-person images without supervision from the camera wearer or even third-person labelers.
no code implementations • ICCV 2017 • Gedas Bertasius, Hyun Soo Park, Stella X. Yu, Jianbo Shi
Finally, we use this feature to learn a basketball assessment model from pairs of labeled first-person basketball videos, for which a basketball expert indicates which of the two players is better.
no code implementations • CVPR 2016 • Hyun Soo Park, Jyh-Jing Hwang, Yedong Niu, Jianbo Shi
We refine them by minimizing a cost function that describes compatibility between the obstacles in the EgoRetinal map and trajectories.
no code implementations • CVPR 2016 • Hyun Soo Park, Jyh-Jing Hwang, Jianbo Shi
In this paper, we focus on the problem of Force from Motion: decoding the sensation of 1) passive forces such as gravity, 2) the physical scale of the motion (speed) and space, and 3) active forces exerted by the observer, such as pedaling a bike or banking on a ski turn.
no code implementations • CVPR 2017 • Gedas Bertasius, Lorenzo Torresani, Stella X. Yu, Jianbo Shi
It combines these two objectives via a novel random walk layer that enforces consistent spatial grouping in the deep layers of the network.
no code implementations • 24 May 2016 • Gedas Bertasius, Qiang Liu, Lorenzo Torresani, Jianbo Shi
In this work, we present a new Local Perturb-and-MAP (locPMAP) framework that replaces the global optimization with a local optimization by exploiting our observed connection between locPMAP and the pseudolikelihood of the original CRF model.
no code implementations • 15 Mar 2016 • Gedas Bertasius, Hyun Soo Park, Stella X. Yu, Jianbo Shi
Unlike traditional third-person cameras mounted on robots, a first-person camera captures a person's visual sensorimotor object interactions from up close.
no code implementations • CVPR 2016 • Gedas Bertasius, Jianbo Shi, Lorenzo Torresani
To overcome these problems, we introduce a Boundary Neural Field (BNF), which is a global energy model integrating FCN predictions with boundary cues.
no code implementations • 9 Nov 2015 • Gedas Bertasius, Hyun Soo Park, Jianbo Shi
We empirically show that this representation can accurately characterize the egocentric object prior by testing it on an egocentric RGBD dataset for three tasks: the 3D saliency detection, future saliency prediction, and interaction classification.
no code implementations • 7 Sep 2015 • Hyun Soo Park, Yedong Niu, Jianbo Shi
As a byproduct of the predicted ego-motion trajectories, we discover the empty space in the image that is occluded by foreground objects.
no code implementations • CVPR 2015 • Hyun Soo Park, Jianbo Shi
An ensemble classifier is trained to learn the geometric relationship.
no code implementations • ICCV 2015 • Gedas Bertasius, Jianbo Shi, Lorenzo Torresani
We can view this process as a "Low-for-High" scheme, where low-level boundaries aid high-level vision tasks.
no code implementations • CVPR 2015 • Gedas Bertasius, Jianbo Shi, Lorenzo Torresani
This section of the network is applied to four different scales of the image input.
no code implementations • CVPR 2014 • Lubor Ladicky, Jianbo Shi, Marc Pollefeys
The limitations of current state-of-the-art methods for single-view depth estimation and semantic segmentation are closely tied to a property of perspective geometry: the perceived size of an object scales inversely with its distance.
no code implementations • CVPR 2013 • Katerina Fragkiadaki, Han Hu, Jianbo Shi
The pose labeled segments and corresponding articulated joints are used to improve the motion flow fields by proposing kinematically constrained affine displacements on body parts.
1 code implementation • 25 Mar 2013 • Juan Nunez-Iglesias, Ryan Kennedy, Toufiq Parag, Jianbo Shi, Dmitri B. Chklovskii
We aim to improve segmentation through the use of machine learning tools during region agglomeration.
no code implementations • NeurIPS 2008 • Abhinav Gupta, Jianbo Shi, Larry S. Davis
Using an analogous reasoning, we present an approach that combines bag-of-words and spatial models to perform semantic and syntactic analysis for recognition of an object based on its internal appearance and its context.
no code implementations • NeurIPS 2008 • Praveen Srinivasan, Liming Wang, Jianbo Shi
We present a method for further grouping of contours in an image using their relationship to the contours of a second, related image.