1 code implementation • 28 Nov 2024 • Yiming Zuo, Willow Yang, Zeyu Ma, Jia Deng
To evaluate our model, we establish a new evaluation protocol named Robust-DC for zero-shot testing under various sparse depth patterns.
no code implementations • 18 Nov 2024 • Jinhao Jiang, Zhipeng Chen, Yingqian Min, Jie Chen, Xiaoxue Cheng, Jiapeng Wang, Yiru Tang, Haoxiang Sun, Jia Deng, Wayne Xin Zhao, Zheng Liu, Dong Yan, Jian Xie, Zhongyuan Wang, Ji-Rong Wen
It is primarily constructed around a tree search algorithm, where the policy model navigates a dynamically expanding tree guided by a specially trained reward model.
1 code implementation • 16 Oct 2024 • Jia Deng, Tianyi Tang, Yanbin Yin, Wenhao Yang, Wayne Xin Zhao, Ji-Rong Wen
Second, by leveraging PersonalityBench, we propose an efficient method for identifying personality-related neurons within LLMs by examining the opposite aspects of a given trait.
no code implementations • 14 Oct 2024 • Yiming Zuo, Karhan Kayan, Maggie Wang, Kevin Jeon, Jia Deng, Thomas L. Griffiths
We evaluate state-of-the-art Vision-Language Models (VLMs), specialized models, and human subjects on it.
1 code implementation • 10 Oct 2024 • Xu Wang, Longji Xu, Yiquan Wang, Yuhua Dong, Xiang Li, Jia Deng, Rui He
This paper introduces a novel bionic intelligent optimisation algorithm, the Octopus Inspired Optimization (OIO) algorithm, which is inspired by the neural structure of the octopus, especially its hierarchical and decentralised interaction properties.
no code implementations • 9 Sep 2024 • Hongyu Wen, Erich Liang, Jia Deng
To provide training data for this task, we introduce a large-scale densely-annotated synthetic dataset containing 60k images within 30 scenes tailored for non-Lambertian objects.
1 code implementation • 3 Aug 2024 • Lahav Lipson, Zachary Teed, Jia Deng
Recent work in visual SLAM has shown the effectiveness of using deep network backbones.
1 code implementation • 17 Jun 2024 • Yiming Zuo, Jia Deng
Depth completion is the task of generating a dense depth map given an image and a sparse depth map as inputs.
1 code implementation • CVPR 2024 • Alexander Raistrick, Lingjie Mei, Karhan Kayan, David Yan, Yiming Zuo, Beining Han, Hongyu Wen, Meenal Parakh, Stamatis Alexandropoulos, Lahav Lipson, Zeyu Ma, Jia Deng
We introduce Infinigen Indoors, a Blender-based procedural generator of photorealistic indoor scenes.
no code implementations • 17 Jun 2024 • Bingxiang He, Ning Ding, Cheng Qian, Jia Deng, Ganqu Cui, Lifan Yuan, Huan-ang Gao, Huimin Chen, Zhiyuan Liu, Maosong Sun
For the first time, we show that zero-shot generalization during instruction tuning is a form of similarity-based generalization between training and test data at the instance level.
1 code implementation • 23 May 2024 • Yihan Wang, Lahav Lipson, Jia Deng
In addition, SEA-RAFT obtains the best cross-dataset generalization on KITTI and Spring.
1 code implementation • CVPR 2024 • Lahav Lipson, Jia Deng
The backbone is trained end-to-end using a novel differentiable solver for wide-baseline two-view pose.
1 code implementation • 2 Apr 2024 • Lifan Yuan, Ganqu Cui, Hanbin Wang, Ning Ding, Xingyao Wang, Jia Deng, Boji Shan, Huimin Chen, Ruobing Xie, Yankai Lin, Zhenghao Liu, BoWen Zhou, Hao Peng, Zhiyuan Liu, Maosong Sun
We introduce Eurus, a suite of large language models (LLMs) optimized for reasoning.
1 code implementation • 14 Feb 2024 • Cheng Qian, Bingxiang He, Zhong Zhuang, Jia Deng, Yujia Qin, Xin Cong, Zhong Zhang, Jie Zhou, Yankai Lin, Zhiyuan Liu, Maosong Sun
Current language model-driven agents often lack mechanisms for effective user participation, which is crucial given the vagueness commonly found in user instructions.
1 code implementation • 13 Dec 2023 • Zeyu Ma, Alexander Raistrick, Lahav Lipson, Jia Deng
Procedural synthetic data generation has received increasing attention in computer vision.
4 code implementations • 16 Oct 2023 • Zhangir Azerbayev, Hailey Schoelkopf, Keiran Paster, Marco Dos Santos, Stephen McAleer, Albert Q. Jiang, Jia Deng, Stella Biderman, Sean Welleck
We present Llemma, a large language model for mathematics.
Ranked #7 on Automated Theorem Proving on miniF2F-test
1 code implementation • ICCV 2023 • Alexandre Kirchmeyer, Jia Deng
Specifically, we find that one key ingredient to a high-performing 1D ConvNet is oriented 1D kernels: 1D kernels that are oriented not just horizontally or vertically, but also at other angles.
1 code implementation • CVPR 2023 • Alexander Raistrick, Lahav Lipson, Zeyu Ma, Lingjie Mei, Mingzhe Wang, Yiming Zuo, Karhan Kayan, Hongyu Wen, Beining Han, Yihan Wang, Alejandro Newell, Hei Law, Ankit Goyal, Kaiyu Yang, Jia Deng
We introduce Infinigen, a procedural generator of photorealistic 3D scenes of the natural world.
1 code implementation • NeurIPS 2023 • Zachary Teed, Lahav Lipson, Jia Deng
DPVO disproves this assumption, showing that it is possible to get the best accuracy and efficiency by exploiting the advantages of sparse patch-based matching over dense flow.
1 code implementation • 8 Aug 2022 • Hei Law, Jia Deng
Our "SOLID" approach consists of two main components: (1) generating synthetic images using a collection of unlabelled 3D models with optimized scene arrangement; (2) pretraining an object detector on an "instance detection" task: given a query image depicting an object, detecting all instances of the exact same object in a target image.
1 code implementation • 25 May 2022 • Kaiyu Yang, Jia Deng, Danqi Chen
In this paper, we present a novel stepwise method, NLProofS (Natural Language Proof Search), which learns to generate relevant steps conditioning on the hypothesis.
1 code implementation • 12 May 2022 • Yiming Zuo, Jia Deng
In this work, we propose a new approach that performs view synthesis using point clouds.
1 code implementation • 9 May 2022 • Zeyu Ma, Zachary Teed, Jia Deng
CER-MVS is significantly different from prior work in multiview stereo.
1 code implementation • CVPR 2022 • Lahav Lipson, Zachary Teed, Ankit Goyal, Jia Deng
We propose a new approach to 6D object pose estimation which consists of an end-to-end differentiable architecture that makes use of geometric knowledge.
no code implementations • CVPR 2022 • Ankit Goyal, Arsalan Mousavian, Chris Paxton, Yu-Wei Chao, Brian Okorn, Jia Deng, Dieter Fox
Accurate object rearrangement from vision is a crucial problem for a wide variety of real-world robotics applications in unstructured environments.
2 code implementations • 23 Nov 2021 • Kaiyu Yang, Jia Deng
In this work, we ask how we can build a rule-based system that can reason with natural language input but without the manual construction of rules.
4 code implementations • 14 Oct 2021 • Ankit Goyal, Alexey Bochkovskiy, Jia Deng, Vladlen Koltun
This raises the question: is it possible to build high-performing "non-deep" neural networks?
1 code implementation • 15 Sep 2021 • Lahav Lipson, Zachary Teed, Jia Deng
We introduce RAFT-Stereo, a new deep architecture for rectified stereo based on the optical flow network RAFT.
Ranked #2 on Stereo Disparity Estimation on Middlebury 2014
1 code implementation • NeurIPS 2021 • Zachary Teed, Jia Deng
We introduce DROID-SLAM, a new deep learning based SLAM system.
no code implementations • 16 Jun 2021 • Lanlan Liu, Yuting Zhang, Jia Deng, Stefano Soatto
Recent work introduced progressive network growing as a promising way to ease the training of large GANs, but the model design and architecture-growing strategy remain under-explored and need manual design for different image data.
3 code implementations • 9 Jun 2021 • Ankit Goyal, Hei Law, Bowei Liu, Alejandro Newell, Jia Deng
It also outperforms state-of-the-art methods on ScanObjectNN, a real-world point cloud benchmark, and demonstrates better cross-dataset generalization.
Ranked #18 on Point Cloud Classification on PointCloud-C
1 code implementation • CVPR 2021 • Zachary Teed, Jia Deng
We address the problem of performing backpropagation for computation graphs involving 3D transformation groups SO(3), SE(3), and Sim(3).
1 code implementation • 10 Mar 2021 • Kaiyu Yang, Jacqueline Yau, Li Fei-Fei, Jia Deng, Olga Russakovsky
In this paper, we explore the effects of face obfuscation on the popular ImageNet challenge visual recognition benchmark.
2 code implementations • 1 Jan 2021 • Ankit Goyal, Hei Law, Bowei Liu, Alejandro Newell, Jia Deng
It also outperforms state-of-the-art methods on ScanObjectNN, a real-world point cloud benchmark, and demonstrates better cross-dataset generalization.
Ranked #12 on 3D Point Cloud Classification on ModelNet40-C
no code implementations • 1 Jan 2021 • Emily Walters, Weifeng Chen, Jia Deng
Recent work has proposed the use of human evaluation for image synthesis models, allowing for a reliable method to evaluate the visual quality of generated images.
2 code implementations • NeurIPS 2020 • Ankit Goyal, Kaiyu Yang, Dawei Yang, Jia Deng
The 3D scenes in our dataset come in minimally contrastive pairs: two scenes in a pair are almost identical, but a spatial relation holds in one and fails in the other.
Ranked #1 on Spatial Relation Recognition on Rel3D
1 code implementation • CVPR 2021 • Zachary Teed, Jia Deng
We address the problem of scene flow: given a pair of stereo or RGB-D video frames, estimate pixelwise 3D motion.
Ranked #2 on Scene Flow Estimation on Spring
no code implementations • 3 Nov 2020 • Dhruv Batra, Angel X. Chang, Sonia Chernova, Andrew J. Davison, Jia Deng, Vladlen Koltun, Sergey Levine, Jitendra Malik, Igor Mordatch, Roozbeh Mottaghi, Manolis Savva, Hao Su
In the rearrangement task, the goal is to bring a given physical environment into a specified state.
3 code implementations • NeurIPS 2020 • Kaiyu Yang, Jia Deng
Based on our transition system, we develop a strongly incremental parser.
Ranked #1 on Constituency Parsing on CTB5
no code implementations • 29 Jul 2020 • Jonathan C. Stroud, Zhichao Lu, Chen Sun, Jia Deng, Rahul Sukthankar, Cordelia Schmid, David A. Ross
Based on this observation, we propose to use text as a method for learning video representations.
1 code implementation • ECCV 2020 • Lanlan Liu, Mingzhe Wang, Jia Deng
We introduce UniLoss, a unified framework to generate surrogate losses for training deep networks with gradient descent, reducing the amount of manual design of task-specific surrogate losses.
no code implementations • CVPR 2020 • Weifeng Chen, Shengyi Qian, David Fan, Noriyuki Kojima, Max Hamilton, Jia Deng
Single-view 3D is the task of recovering 3D properties such as depth and surface normals from a single image.
1 code implementation • ICML 2020 • Ankit Goyal, Jia Deng
The ability to jointly understand the geometry of objects and plan actions for manipulating them is crucial for intelligent agents.
Ranked #1 on Robot Task Planning on PackIt
1 code implementation • LREC 2020 • Santiago Castro, Mahmoud Azab, Jonathan Stroud, Cristina Noujaim, Ruoyao Wang, Jia Deng, Rada Mihalcea
We introduce LifeQA, a benchmark dataset for video question answering that focuses on day-to-day real-life situations.
2 code implementations • CVPR 2020 • Alejandro Newell, Jia Deng
We investigate what factors may play a role in the utility of these pretraining methods for practitioners.
15 code implementations • ECCV 2020 • Zachary Teed, Jia Deng
RAFT extracts per-pixel features, builds multi-scale 4D correlation volumes for all pairs of pixels, and iteratively updates a flow field through a recurrent unit that performs lookups on the correlation volumes.
Ranked #6 on Optical Flow Estimation on Spring
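The all-pairs correlation step described in the RAFT summary above can be illustrated with a minimal sketch. It assumes the correlation between two pixels is the plain dot product of their feature vectors and covers a single scale only; the function name `all_pairs_correlation` and the nested-list layout are hypothetical and are not taken from the RAFT code.

```python
from typing import List

Features = List[List[List[float]]]  # H x W x D feature map (assumed layout)


def all_pairs_correlation(f1: Features, f2: Features) -> List[List[List[List[float]]]]:
    """Build a 4D volume corr[i][j][k][l] = <f1[i][j], f2[k][l]>.

    Each entry is the dot product between the feature at pixel (i, j) in the
    first frame and the feature at pixel (k, l) in the second frame.
    """
    height, width = len(f1), len(f1[0])
    return [
        [
            [
                [
                    sum(a * b for a, b in zip(f1[i][j], f2[k][l]))
                    for l in range(width)
                ]
                for k in range(height)
            ]
            for j in range(width)
        ]
        for i in range(height)
    ]


# Tiny 1 x 2 feature maps with 2-dimensional features.
frame1 = [[[1.0, 0.0], [0.0, 1.0]]]
frame2 = [[[1.0, 0.0], [1.0, 1.0]]]
corr = all_pairs_correlation(frame1, frame2)
```

A multi-scale version would pool the last two dimensions of this volume at several resolutions; that step and the recurrent lookup are omitted here.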
2 code implementations • NeurIPS 2020 • Mingzhe Wang, Jia Deng
We consider the task of automated theorem proving, a key AI task.
Ranked #2 on Automated Theorem Proving on Metamath set.mm
no code implementations • 16 Dec 2019 • Kaiyu Yang, Klint Qinami, Li Fei-Fei, Jia Deng, Olga Russakovsky
Computer vision technology is being used by many but remains representative of only a few.
no code implementations • 4 Dec 2019 • Jonathan C. Stroud, Ryan McCaffrey, Rada Mihalcea, Jia Deng, Olga Russakovsky
Temporal grounding entails establishing a correspondence between natural language event descriptions and their visual depictions.
no code implementations • CONLL 2019 • Mahmoud Azab, Noriyuki Kojima, Jia Deng, Rada Mihalcea
We introduce a new embedding model to represent movie characters and their interactions in a dialogue by encoding in the same representation the language used by these characters as well as information about the other participants in the dialogue.
1 code implementation • ICCV 2019 • Lanlan Liu, Michael Muelly, Jia Deng, Tomas Pfister, Li-Jia Li
This paper explores object detection in the small data regime, where only a limited number of annotated bounding boxes are available due to data rarity and annotation expense.
no code implementations • 20 Aug 2019 • Yu-Wei Chao, Jimei Yang, Weifeng Chen, Jia Deng
We experimentally demonstrate the strength of our approach over different non-hierarchical and hierarchical baselines.
no code implementations • ICLR 2020 • Alejandro Newell, Lu Jiang, Chong Wang, Li-Jia Li, Jia Deng
Multi-task learning holds the promise of less data, parameters, and time than training of separate models.
1 code implementation • ICCV 2019 • Kaiyu Yang, Olga Russakovsky, Jia Deng
Understanding the spatial relations between objects in images is a surprisingly challenging task.
no code implementations • 26 Jul 2019 • Noriyuki Kojima, Jia Deng
In this paper we compare learning-based methods and classical methods for navigation in virtual environments.
no code implementations • 29 Jun 2019 • Dawei Yang, Jia Deng
We parametrize the design decisions as a real vector, and combine the approximate gradient and the analytical gradient to obtain the hybrid gradient of the network performance with respect to this vector.
1 code implementation • ACL 2019 • Oana Ignat, Laura Burdick, Jia Deng, Rada Mihalcea
We consider the task of identifying human actions visible in online videos.
1 code implementation • 21 May 2019 • Kaiyu Yang, Jia Deng
Proof assistants offer a formalism that resembles human mathematical reasoning, representing theorems in higher-order logic and proofs as high-level tactics.
Ranked #1 on Automated Theorem Proving on CoqGym
6 code implementations • 18 Apr 2019 • Hei Law, Yun Teng, Olga Russakovsky, Jia Deng
Together these two variants address the two critical use cases in efficient object detection: improving efficiency without sacrificing accuracy, and improving accuracy at real-time efficiency.
Ranked #149 on Object Detection on COCO minival
1 code implementation • 19 Dec 2018 • Jonathan C. Stroud, David A. Ross, Chen Sun, Jia Deng, Rahul Sukthankar
State-of-the-art methods for video action recognition commonly use an ensemble of two networks: the spatial stream, which takes RGB frames as input, and the temporal stream, which takes optical flow as input.
Ranked #11 on Action Recognition on AVA v2.1
1 code implementation • ICLR 2020 • Zachary Teed, Jia Deng
We propose DeepV2D, an end-to-end deep learning architecture for predicting depth from video.
no code implementations • CVPR 2019 • Chaowei Xiao, Dawei Yang, Bo Li, Jia Deng, Mingyan Liu
Highly expressive models such as deep neural networks (DNNs) have been widely applied to various applications.
no code implementations • NAACL 2018 • Mahmoud Azab, Mingzhe Wang, Max Smith, Noriyuki Kojima, Jia Deng, Rada Mihalcea
We propose a new model for speaker naming in movies that leverages visual, textual, and acoustic modalities in a unified optimization framework.
no code implementations • 7 Aug 2018 • Parker Hill, Babak Zamirai, Shengshuo Lu, Yu-Wei Chao, Michael Laurenzano, Mehrzad Samadi, Marios Papaefthymiou, Scott Mahlke, Thomas Wenisch, Jia Deng, Lingjia Tang, Jason Mars
With ever-increasing computational demand for deep learning, it is critical to investigate the implications of the numeric representation and precision of DNN model weights and activations on computational efficiency.
5 code implementations • ECCV 2018 • Hei Law, Jia Deng
We propose CornerNet, a new approach to object detection where we detect an object bounding box as a pair of keypoints, the top-left corner and the bottom-right corner, using a single convolutional neural network.
Ranked #172 on Object Detection on COCO test-dev
no code implementations • CVPR 2019 • Weifeng Chen, Shengyi Qian, Jia Deng
Depth estimation from a single image in the wild remains a challenging problem.
1 code implementation • ACL 2018 • Ankit Goyal, Jian Wang, Jia Deng
In this paper, we study the problem of geometric reasoning in the context of question-answering.
6 code implementations • CVPR 2018 • Lei Huang, Dawei Yang, Bo Lang, Jia Deng
Batch Normalization (BN) is capable of accelerating the training of deep models by centering and scaling activations within mini-batches.
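As a minimal sketch of the centering and scaling that Batch Normalization performs within a mini-batch, the following normalizes each feature dimension to zero mean and roughly unit variance. The function name `batch_norm` and the `eps` default are assumptions made for illustration, and the learned scale and shift parameters of full BN are left out.

```python
import math
from typing import List


def batch_norm(batch: List[List[float]], eps: float = 1e-5) -> List[List[float]]:
    """Center and scale each feature dimension across the mini-batch.

    batch is a list of N examples, each a list of D feature values.
    eps (an assumed default) keeps the division stable when variance is tiny.
    """
    n = len(batch)
    dims = len(batch[0])
    out = [[0.0] * dims for _ in range(n)]
    for d in range(dims):
        column = [row[d] for row in batch]
        mean = sum(column) / n                          # centering statistic
        var = sum((x - mean) ** 2 for x in column) / n  # biased batch variance
        inv_std = 1.0 / math.sqrt(var + eps)            # scaling factor
        for i in range(n):
            out[i][d] = (batch[i][d] - mean) * inv_std
    return out


normalized = batch_norm([[1.0, 10.0], [3.0, 30.0]])
```

Both feature columns end up on the same scale even though the raw values differ by a factor of ten.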
no code implementations • CVPR 2018 • Yu-Wei Chao, Sudheendra Vijayanarasimhan, Bryan Seybold, David A. Ross, Jia Deng, Rahul Sukthankar
We propose TAL-Net, an improved approach to temporal action localization in video that is inspired by the Faster R-CNN object detection framework.
Ranked #29 on Temporal Action Localization on THUMOS’14
no code implementations • CVPR 2018 • Dawei Yang, Jia Deng
The evolution generates better shapes guided by the network training, while the training improves by using the evolved shapes.
1 code implementation • NeurIPS 2017 • Mingzhe Wang, Yihe Tang, Jian Wang, Jia Deng
We propose a deep learning-based approach to the problem of premise selection: selecting mathematical statements relevant for proving a given conjecture.
Ranked #1 on Automated Theorem Proving on HolStep (Unconditional)
no code implementations • 7 Sep 2017 • Timnit Gebru, Jonathan Krause, Jia Deng, Li Fei-Fei
We present a crowdsourcing workflow to collect image annotations for visually similar synthetic categories without requiring experts.
no code implementations • 7 Sep 2017 • Timnit Gebru, Jonathan Krause, Yi-Lun Wang, Duyun Chen, Jia Deng, Li Fei-Fei
In this work, we leverage the ubiquity of Google Street View images and develop a computer vision pipeline to predict income, per capita carbon emission, crime rates and other city attributes from a single source of publicly available visual data.
3 code implementations • NeurIPS 2017 • Alejandro Newell, Jia Deng
Graphs are a useful abstraction of image content.
no code implementations • CVPR 2017 • Zehuan Yuan, Jonathan C. Stroud, Tong Lu, Jia Deng
We pose action localization as a structured prediction over arbitrary-length temporal windows, where each window is scored as the sum of frame-wise classification scores.
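When a window's score is the sum of its frame-wise scores, finding the best arbitrary-length window reduces to a maximum-sum contiguous subarray search. The sketch below uses the standard Kadane scan for that search; it illustrates the scoring idea only and is not presented as the authors' implementation. The name `best_window` is hypothetical.

```python
from typing import List, Tuple


def best_window(frame_scores: List[float]) -> Tuple[int, int, float]:
    """Return (start, end, score) of the contiguous window with the largest sum.

    Both indices are inclusive. Runs in a single pass over the frames.
    """
    if not frame_scores:
        raise ValueError("frame_scores must be non-empty")
    best_start = best_end = 0
    best_sum = frame_scores[0]
    cur_start = 0
    cur_sum = frame_scores[0]
    for i in range(1, len(frame_scores)):
        if cur_sum < 0:
            # A negative running sum can only hurt: restart the window here.
            cur_start = i
            cur_sum = frame_scores[i]
        else:
            cur_sum += frame_scores[i]
        if cur_sum > best_sum:
            best_sum = cur_sum
            best_start, best_end = cur_start, i
    return best_start, best_end, best_sum


# Frame-wise classifier scores for one action class (made-up values).
window = best_window([-1.0, 2.0, 3.0, -4.0, 1.0])
```

Positive scores mark frames that look like the action, so the returned window covers the run of frames whose evidence adds up highest.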
no code implementations • CVPR 2017 • Yu-Wei Chao, Jimei Yang, Brian Price, Scott Cohen, Jia Deng
This paper presents the first study on forecasting human dynamics from static images.
no code implementations • ICCV 2017 • Weifeng Chen, Donglai Xiang, Jia Deng
We study the problem of single-image depth estimation for images in the wild.
no code implementations • 22 Feb 2017 • Timnit Gebru, Jonathan Krause, Yi-Lun Wang, Duyun Chen, Jia Deng, Erez Lieberman Aiden, Li Fei-Fei
The United States spends more than $1B each year on initiatives such as the American Community Survey (ACS), a labor-intensive door-to-door study that measures statistics relating to race, gender, education, occupation, unemployment, and other demographic factors.
no code implementations • 17 Feb 2017 • Yu-Wei Chao, Yunfan Liu, Xieyang Liu, Huayi Zeng, Jia Deng
We study the problem of detecting human-object interactions (HOI) in static images, defined as predicting a human and an object bounding box with an interaction class label that connects them.
no code implementations • 2 Jan 2017 • Lanlan Liu, Jia Deng
We introduce Dynamic Deep Neural Networks (D2NN), a new type of feed-forward deep neural network that allows selective execution.
5 code implementations • NeurIPS 2017 • Alejandro Newell, Zhiao Huang, Jia Deng
We introduce associative embedding, a novel method for supervising convolutional neural networks for the task of detection and grouping.
Ranked #5 on Keypoint Detection on MPII Multi-Person
4 code implementations • NeurIPS 2016 • Weifeng Chen, Zhao Fu, Dawei Yang, Jia Deng
This paper studies single-image depth perception in the wild, i.e., recovering depth from a single image taken in unconstrained settings.
46 code implementations • 22 Mar 2016 • Alejandro Newell, Kaiyu Yang, Jia Deng
This work introduces a novel convolutional network architecture for the task of human pose estimation.
Ranked #1 on Pose Estimation on FLIC Wrists
no code implementations • ICCV 2015 • Yu-Wei Chao, Zhan Wang, Yugeng He, Jiaxuan Wang, Jia Deng
We introduce a new benchmark "Humans Interacting with Common Objects" (HICO) for recognizing human-object interactions (HOI).
no code implementations • CVPR 2015 • Vignesh Ramanathan, Cong-Cong Li, Jia Deng, Wei Han, Zhen Li, Kunlong Gu, Yang Song, Samy Bengio, Charles Rosenberg, Li Fei-Fei
Human actions capture a wide variety of interactions between people and objects.
no code implementations • CVPR 2015 • Yu-Wei Chao, Zhan Wang, Rada Mihalcea, Jia Deng
In this paper we introduce the new problem of mining the knowledge of semantic affordance: given an object, determining whether an action can be performed on it.
no code implementations • ICCV 2015 • Nan Ding, Jia Deng, Kevin Murphy, Hartmut Neven
In this paper, we extend the HEX model to allow for soft or probabilistic relations between labels, which is useful when there is uncertainty about the relationship between two labels (e.g., an antelope is "sort of" furry, but not to the same degree as a grizzly bear).
12 code implementations • 1 Sep 2014 • Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, Alexander C. Berg, Li Fei-Fei
The ImageNet Large Scale Visual Recognition Challenge is a benchmark in object category classification and detection on hundreds of object categories and millions of images.
no code implementations • CVPR 2013 • Jia Deng, Jonathan Krause, Li Fei-Fei
In this work, we include humans in the loop to help computers select discriminative features.
no code implementations • NeurIPS 2011 • Jia Deng, Sanjeev Satheesh, Alexander C. Berg, Fei Li
We present a novel approach to efficiently learn a label tree for large scale classification with many classes.