Search Results for author: Jianbo Shi

Found 67 papers, 23 papers with code

ForkGAN: Seeing into the Rainy Night

1 code implementation ECCV 2020 Ziqiang Zheng, Yang Wu, Xinran Han, Jianbo Shi

We present a ForkGAN for task-agnostic image translation that can boost multiple vision tasks in adverse weather conditions.

Image Generation Image Segmentation +5

AlignedCut: Visual Concepts Discovery on Brain-Guided Universal Feature Space

no code implementations 26 Jun 2024 Huzheng Yang, James Gee, Jianbo Shi

We study the intriguing connection between visual data, deep networks, and the brain.

Decoder

Good Seed Makes a Good Crop: Discovering Secret Seeds in Text-to-Image Diffusion Models

no code implementations 23 May 2024 Katherine Xu, Lingzhi Zhang, Jianbo Shi

In this work, we conduct a large-scale scientific study into the impact of random seeds during diffusion inference.

Image Generation

Amodal Completion via Progressive Mixed Context Diffusion

no code implementations CVPR 2024 Katherine Xu, Lingzhi Zhang, Jianbo Shi

We propose to sidestep many of the difficulties of existing approaches, which typically involve a two-step process of predicting amodal masks and then generating pixels.

Brain Decodes Deep Nets

1 code implementation CVPR 2024 Huzheng Yang, James Gee, Jianbo Shi

This mapping method, FactorTopy, is plug-and-play for any deep-network; with it, one can paint a picture of the network onto the brain (literally!).

Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives

2 code implementations CVPR 2024 Kristen Grauman, Andrew Westbury, Lorenzo Torresani, Kris Kitani, Jitendra Malik, Triantafyllos Afouras, Kumar Ashutosh, Vijay Baiyya, Siddhant Bansal, Bikram Boote, Eugene Byrne, Zach Chavis, Joya Chen, Feng Cheng, Fu-Jen Chu, Sean Crane, Avijit Dasgupta, Jing Dong, Maria Escobar, Cristhian Forigua, Abrham Gebreselasie, Sanjay Haresh, Jing Huang, Md Mohaiminul Islam, Suyog Jain, Rawal Khirodkar, Devansh Kukreja, Kevin J Liang, Jia-Wei Liu, Sagnik Majumder, Yongsen Mao, Miguel Martin, Effrosyni Mavroudi, Tushar Nagarajan, Francesco Ragusa, Santhosh Kumar Ramakrishnan, Luigi Seminara, Arjun Somayazulu, Yale Song, Shan Su, Zihui Xue, Edward Zhang, Jinxu Zhang, Angela Castillo, Changan Chen, Xinzhu Fu, Ryosuke Furuta, Cristina Gonzalez, Prince Gupta, Jiabo Hu, Yifei HUANG, Yiming Huang, Weslie Khoo, Anush Kumar, Robert Kuo, Sach Lakhavani, Miao Liu, Mi Luo, Zhengyi Luo, Brighid Meredith, Austin Miller, Oluwatumininu Oguntola, Xiaqing Pan, Penny Peng, Shraman Pramanick, Merey Ramazanova, Fiona Ryan, Wei Shan, Kiran Somasundaram, Chenan Song, Audrey Southerland, Masatoshi Tateno, Huiyu Wang, Yuchen Wang, Takuma Yagi, Mingfei Yan, Xitong Yang, Zecheng Yu, Shengxin Cindy Zha, Chen Zhao, Ziwei Zhao, Zhifan Zhu, Jeff Zhuo, Pablo Arbelaez, Gedas Bertasius, David Crandall, Dima Damen, Jakob Engel, Giovanni Maria Farinella, Antonino Furnari, Bernard Ghanem, Judy Hoffman, C. V. Jawahar, Richard Newcombe, Hyun Soo Park, James M. Rehg, Yoichi Sato, Manolis Savva, Jianbo Shi, Mike Zheng Shou, Michael Wray

We present Ego-Exo4D, a diverse, large-scale multimodal multiview video dataset and benchmark challenge.

Video Understanding

Memory Encoding Model

1 code implementation 2 Aug 2023 Huzheng Yang, James Gee, Jianbo Shi

Our ensemble model without memory input (61.4) would also take 3rd place.

Brain Decoding Hippocampus

Retinotopy Inspired Brain Encoding Model and the All-for-One Training Recipe

no code implementations 26 Jul 2023 Huzheng Yang, Jianbo Shi, James Gee

We use this diversity to our advantage by introducing the All-for-One training recipe, which divides the challenging one-big-model problem into multiple small models, with the small models aggregating the knowledge while preserving the distinction between the different functional regions.

Brain Decoding Diversity +1

Parameter is Not All You Need: Starting from Non-Parametric Networks for 3D Point Cloud Analysis

2 code implementations 14 Mar 2023 Renrui Zhang, Liuhui Wang, Ziyu Guo, Yali Wang, Peng Gao, Hongsheng Li, Jianbo Shi

We present a Non-parametric Network for 3D point cloud analysis, Point-NN, which consists of purely non-learnable components: farthest point sampling (FPS), k-nearest neighbors (k-NN), and pooling operations, with trigonometric functions.

Supervised Only 3D Point Cloud Classification Training-free 3D Part Segmentation +1

Nearest Neighbors Meet Deep Neural Networks for Point Cloud Analysis

no code implementations 1 Mar 2023 Renrui Zhang, Liuhui Wang, Ziyu Guo, Jianbo Shi

Performances on standard 3D point cloud benchmarks have plateaued, resulting in oversized models and complex network design to make a fractional improvement.

3D Object Detection object-detection

Starting From Non-Parametric Networks for 3D Point Cloud Analysis

1 code implementation CVPR 2023 Renrui Zhang, Liuhui Wang, Yali Wang, Peng Gao, Hongsheng Li, Jianbo Shi

We present a Non-parametric Network for 3D point cloud analysis, Point-NN, which consists of purely non-learnable components: farthest point sampling (FPS), k-nearest neighbors (k-NN), and pooling operations, with trigonometric functions.

iQuery: Instruments as Queries for Audio-Visual Sound Separation

1 code implementation CVPR 2023 Jiaben Chen, Renrui Zhang, Dongze Lian, Jiaqi Yang, Ziyao Zeng, Jianbo Shi

To generalize to a new instrument or event class, drawing inspiration from the text-prompt design, we insert an additional query as an audio prompt while freezing the attention mechanism.

Decoder Disentanglement

HashEncoding: Autoencoding with Multiscale Coordinate Hashing

no code implementations 29 Nov 2022 Lukas Zhornyak, Zhengjie Xu, Haoran Tang, Jianbo Shi

We present HashEncoding, a novel autoencoding architecture that leverages a non-parametric multiscale coordinate hash function to facilitate a per-pixel decoder without convolutions.

Decoder Optical Flow Estimation

Inpainting at Modern Camera Resolution by Guided PatchMatch with Auto-Curation

no code implementations 6 Aug 2022 Lingzhi Zhang, Connelly Barnes, Kevin Wampler, Sohrab Amirghodsi, Eli Shechtman, Zhe Lin, Jianbo Shi

Recently, deep models have established SOTA performance for low-resolution image inpainting, but they lack fidelity at resolutions associated with modern cameras such as 4K or more, and for large holes.

4k Image Inpainting

Perceptual Artifacts Localization for Inpainting

1 code implementation 5 Aug 2022 Lingzhi Zhang, Yuqian Zhou, Connelly Barnes, Sohrab Amirghodsi, Zhe Lin, Eli Shechtman, Jianbo Shi

Inspired by this workflow, we propose a new learning task of automatic segmentation of inpainting perceptual artifacts, and apply the model for inpainting model evaluation and iterative refinement.

Image Inpainting

Beyond mAP: Towards better evaluation of instance segmentation

no code implementations CVPR 2023 Rohit Jena, Lukas Zhornyak, Nehal Doiphode, Pratik Chaudhari, Vivek Buch, James Gee, Jianbo Shi

Correctness of instance segmentation constitutes counting the number of objects, correctly localizing all predictions and classifying each localized prediction.

Instance Segmentation Segmentation +1

SoGCN: Second-Order Graph Convolutional Networks

1 code implementation 14 Oct 2021 Peihao Wang, Yuehao Wang, Hua Lin, Jianbo Shi

A Graph Convolutional Network (GCN) with multi-hop aggregation is more expressive than a one-hop GCN but suffers from higher model complexity.

Graph Classification Graph Regression +2

Ego4D: Around the World in 3,000 Hours of Egocentric Video

8 code implementations CVPR 2022 Kristen Grauman, Andrew Westbury, Eugene Byrne, Zachary Chavis, Antonino Furnari, Rohit Girdhar, Jackson Hamburger, Hao Jiang, Miao Liu, Xingyu Liu, Miguel Martin, Tushar Nagarajan, Ilija Radosavovic, Santhosh Kumar Ramakrishnan, Fiona Ryan, Jayant Sharma, Michael Wray, Mengmeng Xu, Eric Zhongcong Xu, Chen Zhao, Siddhant Bansal, Dhruv Batra, Vincent Cartillier, Sean Crane, Tien Do, Morrie Doulaty, Akshay Erapalli, Christoph Feichtenhofer, Adriano Fragomeni, Qichen Fu, Abrham Gebreselasie, Cristina Gonzalez, James Hillis, Xuhua Huang, Yifei HUANG, Wenqi Jia, Weslie Khoo, Jachym Kolar, Satwik Kottur, Anurag Kumar, Federico Landini, Chao Li, Yanghao Li, Zhenqiang Li, Karttikeya Mangalam, Raghava Modhugu, Jonathan Munro, Tullie Murrell, Takumi Nishiyasu, Will Price, Paola Ruiz Puentes, Merey Ramazanova, Leda Sari, Kiran Somasundaram, Audrey Southerland, Yusuke Sugano, Ruijie Tao, Minh Vo, Yuchen Wang, Xindi Wu, Takuma Yagi, Ziwei Zhao, Yunyi Zhu, Pablo Arbelaez, David Crandall, Dima Damen, Giovanni Maria Farinella, Christian Fuegen, Bernard Ghanem, Vamsi Krishna Ithapu, C. V. Jawahar, Hanbyul Joo, Kris Kitani, Haizhou Li, Richard Newcombe, Aude Oliva, Hyun Soo Park, James M. Rehg, Yoichi Sato, Jianbo Shi, Mike Zheng Shou, Antonio Torralba, Lorenzo Torresani, Mingfei Yan, Jitendra Malik

We introduce Ego4D, a massive-scale egocentric video dataset and benchmark suite.

De-identification Ethics

Efficient Visual Recognition with Deep Neural Networks: A Survey on Recent Advances and New Directions

no code implementations 30 Aug 2021 Yang Wu, Dingheng Wang, Xiaotong Lu, Fan Yang, Guoqi Li, Weisheng Dong, Jianbo Shi

Visual recognition is currently one of the most important and active research areas in computer vision, pattern recognition, and even the general field of artificial intelligence.

Convolutional Ordinal Regression Forest for Image Ordinal Estimation

no code implementations 7 Aug 2020 Haiping Zhu, Hongming Shan, Yuheng Zhang, Lingfu Che, Xiaoyang Xu, Junping Zhang, Jianbo Shi, Fei-Yue Wang

We propose a novel ordinal regression approach, termed Convolutional Ordinal Regression Forest or CORF, for image ordinal estimation, which can integrate ordinal regression and differentiable decision trees with a convolutional neural network for obtaining precise and stable global ordinal relationships.

Age Estimation Binary Classification +1

FoveaBox: Beyound Anchor-based Object Detection

no code implementations ICLR 2020 Tao Kong, Fuchun Sun, Huaping Liu, Yuning Jiang, Lei LI, Jianbo Shi

While almost all state-of-the-art object detectors utilize predefined anchors to enumerate possible locations, scales and aspect ratios for the search of the objects, their performance and generalization ability are also limited by the design of anchors.

Object object-detection +1

Potential Field: Interpretable and Unified Representation for Trajectory Prediction

no code implementations 18 Nov 2019 Shan Su, Cheng Peng, Jianbo Shi, Chiho Choi

From the generated potential fields, we further estimate future motion direction and speed, which are modeled as Gaussian distributions to account for the multi-modal nature of the problem.

Trajectory Prediction

Multimodal Image Outpainting With Regularized Normalized Diversification

1 code implementation 25 Oct 2019 Lingzhi Zhang, Jiancong Wang, Jianbo Shi

In this paper, we study the problem of generating a set of realistic and diverse backgrounds when given only a small foreground region.

Image Outpainting

Deep Image Blending

2 code implementations 25 Oct 2019 Lingzhi Zhang, Tarmily Wen, Jianbo Shi

In addition, we jointly optimize the proposed Poisson blending loss as well as the style and content loss computed from a deep network, and reconstruct the blending region by iteratively updating the pixels using the L-BFGS solver.

Object

SegSort: Segmentation by Discriminative Sorting of Segments

1 code implementation ICCV 2019 Jyh-Jing Hwang, Stella X. Yu, Jianbo Shi, Maxwell D. Collins, Tien-Ju Yang, Xiao Zhang, Liang-Chieh Chen

The proposed SegSort further produces an interpretable result, as each choice of label can be easily understood from the retrieved nearest segments.

Ranked #10 on Unsupervised Semantic Segmentation on PASCAL VOC 2012 val (using extra training data)

Clustering Metric Learning +2

Neural Embedding for Physical Manipulations

no code implementations 13 Jul 2019 Lingzhi Zhang, Andong Cao, Rui Li, Jianbo Shi

In common real-world robotic operations, action and state spaces can be vast and sometimes unknown, and observations are often relatively sparse.

Enhanced generative adversarial network for 3D brain MRI super-resolution

no code implementations 10 Jul 2019 Jiancong Wang, Yu-Hua Chen, Yifan Wu, Jianbo Shi, James Gee

Single image super-resolution (SISR) reconstruction for magnetic resonance imaging (MRI) has generated significant interest because of its potential to not only speed up imaging but to improve quantitative processing and analysis of available image data.

Generative Adversarial Network Image Super-Resolution +1

Learning Temporal Pose Estimation from Sparsely-Labeled Videos

3 code implementations NeurIPS 2019 Gedas Bertasius, Christoph Feichtenhofer, Du Tran, Jianbo Shi, Lorenzo Torresani

To reduce the need for dense annotations, we propose a PoseWarper network that leverages training videos with sparse annotations (every k frames) to learn to perform dense temporal pose propagation and estimation.

Ranked #2 on Multi-Person Pose Estimation on PoseTrack2018 (using extra training data)

Multi-Person Pose Estimation Optical Flow Estimation

FoveaBox: Beyond Anchor-based Object Detector

7 code implementations 8 Apr 2019 Tao Kong, Fuchun Sun, Huaping Liu, Yuning Jiang, Lei LI, Jianbo Shi

In FoveaBox, an instance is assigned to adjacent feature levels to make the model more accurate. We demonstrate its effectiveness on standard benchmarks and report extensive experimental analysis.

Ranked #90 on Object Detection on COCO test-dev (APM metric)

Object object-detection +1

Normalized Diversification

1 code implementation CVPR 2019 Shaohui Liu, Xiao Zhang, Jianqiao Wangni, Jianbo Shi

We introduce the concept of normalized diversity, which forces the model to preserve the normalized pairwise distance between the sparse samples from a latent parametric distribution and their corresponding high-dimensional outputs.

Conditional Image Generation Diversity +3

Trajectory Normalized Gradients for Distributed Optimization

no code implementations 24 Jan 2019 Jianqiao Wangni, Ke Li, Jianbo Shi, Jitendra Malik

Recently, researchers have proposed various low-precision gradient compression methods for efficient communication in large-scale distributed optimization.

Benchmarking Distributed Optimization

Monocular 3D Pose Recovery via Nonconvex Sparsity with Theoretical Analysis

no code implementations 29 Dec 2018 Jianqiao Wangni, Dahua Lin, Ji Liu, Kostas Daniilidis, Jianbo Shi

For recovering 3D object poses from 2D images, a prevalent method is to pre-train an over-complete dictionary $\mathcal D=\{B_i\}_i^D$ of 3D basis poses.

Adversarial Structure Matching for Structured Prediction Tasks

1 code implementation CVPR 2019 Jyh-Jing Hwang, Tsung-Wei Ke, Jianbo Shi, Stella X. Yu

The structure analyzer is trained to maximize the ASM loss, or to emphasize recurring multi-scale hard negative structural mistakes among co-occurring patterns.

Image Classification Monocular Depth Estimation +2

Object Detection in Video with Spatiotemporal Sampling Networks

no code implementations ECCV 2018 Gedas Bertasius, Lorenzo Torresani, Jianbo Shi

We propose a Spatiotemporal Sampling Network (STSN) that uses deformable convolutions across time for object detection in videos.

Object object-detection +2

Egocentric Basketball Motion Planning from a Single First-Person Image

no code implementations CVPR 2018 Gedas Bertasius, Aaron Chan, Jianbo Shi

We present a model that uses a single first-person image to generate an egocentric basketball motion sequence in the form of a 12D camera configuration trajectory, which encodes a player's 3D location and 3D head orientation throughout the sequence.

Motion Planning

Using Cross-Model EgoSupervision to Learn Cooperative Basketball Intention

no code implementations 5 Sep 2017 Gedas Bertasius, Jianbo Shi

We present a first-person method for cooperative basketball intention prediction: we predict with whom the camera wearer will cooperate in the near future from unlabeled first-person images.

Pose Estimation

Predicting Behaviors of Basketball Players From First Person Videos

no code implementations CVPR 2017 Shan Su, Jung Pyo Hong, Jianbo Shi, Hyun Soo Park

This paper presents a method to predict the future movements (location and gaze direction) of basketball players as a whole from their first person videos.

3D Reconstruction

Customizing First Person Image Through Desired Actions

no code implementations 1 Apr 2017 Shan Su, Jianbo Shi, Hyun Soo Park

Our conjecture is that the spatial arrangement of a first person visual scene is deployed to afford an action, and therefore, the action can be inversely used to synthesize a new scene such that the action is feasible.

Generative Adversarial Network

Social Behavior Prediction from First Person Videos

no code implementations 29 Nov 2016 Shan Su, Jung Pyo Hong, Jianbo Shi, Hyun Soo Park

This paper presents a method to predict the future movements (location and gaze direction) of basketball players as a whole from their first person videos.

3D Reconstruction

Unsupervised Learning of Important Objects from First-Person Videos

1 code implementation ICCV 2017 Gedas Bertasius, Hyun Soo Park, Stella X. Yu, Jianbo Shi

In this work, we show that we can detect important objects in first-person images without supervision from the camera wearer or even third-person labelers.

Object Segmentation +1

Am I a Baller? Basketball Performance Assessment from First-Person Videos

no code implementations ICCV 2017 Gedas Bertasius, Hyun Soo Park, Stella X. Yu, Jianbo Shi

Finally, we use this feature to learn a basketball assessment model from pairs of labeled first-person basketball videos, for which a basketball expert indicates which of the two players is better.

Egocentric Future Localization

no code implementations CVPR 2016 Hyun Soo Park, Jyh-Jing Hwang, Yedong Niu, Jianbo Shi

We refine them by minimizing a cost function that describes compatibility between the obstacles in the EgoRetinal map and trajectories.

Motion Planning

Force From Motion: Decoding Physical Sensation in a First Person Video

no code implementations CVPR 2016 Hyun Soo Park, Jyh-Jing Hwang, Jianbo Shi

In this paper, we focus on the problem of Force from Motion: decoding the sensation of 1) passive forces such as gravity, 2) the physical scale of the motion (speed) and space, and 3) active forces exerted by the observer such as pedaling a bike or banking on a ski turn.

Action Recognition Friction +2

Convolutional Random Walk Networks for Semantic Image Segmentation

no code implementations CVPR 2017 Gedas Bertasius, Lorenzo Torresani, Stella X. Yu, Jianbo Shi

It combines these two objectives via a novel random walk layer that enforces consistent spatial grouping in the deep layers of the network.

Image Segmentation Scene Labeling +2

Local Perturb-and-MAP for Structured Prediction

no code implementations 24 May 2016 Gedas Bertasius, Qiang Liu, Lorenzo Torresani, Jianbo Shi

In this work, we present a new Local Perturb-and-MAP (locPMAP) framework that replaces the global optimization with a local optimization by exploiting our observed connection between locPMAP and the pseudolikelihood of the original CRF model.

Combinatorial Optimization Structured Prediction

First Person Action-Object Detection with EgoNet

no code implementations 15 Mar 2016 Gedas Bertasius, Hyun Soo Park, Stella X. Yu, Jianbo Shi

Unlike traditional third-person cameras mounted on robots, a first-person camera captures a person's visual sensorimotor object interactions from up close.

Human-Object Interaction Detection Object +2

Semantic Segmentation with Boundary Neural Fields

no code implementations CVPR 2016 Gedas Bertasius, Jianbo Shi, Lorenzo Torresani

To overcome these problems, we introduce a Boundary Neural Field (BNF), which is a global energy model integrating FCN predictions with boundary cues.

Boundary Detection Object Localization +2

Exploiting Egocentric Object Prior for 3D Saliency Detection

no code implementations 9 Nov 2015 Gedas Bertasius, Hyun Soo Park, Jianbo Shi

We empirically show that this representation can accurately characterize the egocentric object prior by testing it on an egocentric RGBD dataset for three tasks: the 3D saliency detection, future saliency prediction, and interaction classification.

Object Saliency Prediction

Future Localization from an Egocentric Depth Image

no code implementations 7 Sep 2015 Hyun Soo Park, Yedong Niu, Jianbo Shi

As a byproduct of the predicted trajectories of ego-motion, we discover in the image the empty space occluded by foreground objects.

object-detection Object Detection

Social Saliency Prediction

no code implementations CVPR 2015 Hyun Soo Park, Jianbo Shi

An ensemble classifier is trained to learn the geometric relationship.

Saliency Prediction

Pulling Things out of Perspective

no code implementations CVPR 2014 Lubor Ladicky, Jianbo Shi, Marc Pollefeys

The limitations of current state-of-the-art methods for single-view depth estimation and semantic segmentation are closely tied to the property of perspective geometry that the perceived size of objects scales inversely with distance.

Depth Estimation Semantic Segmentation

Pose from Flow and Flow from Pose

no code implementations CVPR 2013 Katerina Fragkiadaki, Han Hu, Jianbo Shi

The pose labeled segments and corresponding articulated joints are used to improve the motion flow fields by proposing kinematically constrained affine displacements on body parts.

Motion Estimation Segmentation

A "Shape Aware" Model for semi-supervised Learning of Objects and its Context

no code implementations NeurIPS 2008 Abhinav Gupta, Jianbo Shi, Larry S. Davis

Using an analogous reasoning, we present an approach that combines bag-of-words and spatial models to perform semantic and syntactic analysis for recognition of an object based on its internal appearance and its context.

Object Object Recognition

Grouping Contours Via a Related Image

no code implementations NeurIPS 2008 Praveen Srinivasan, Liming Wang, Jianbo Shi

We present a method for further grouping of contours in an image using their relationship to the contours of a second, related image.

Descriptive
