Search Results for author: Junhua Mao

Found 13 papers, 5 papers with code

Generation and Comprehension of Unambiguous Object Descriptions

1 code implementation CVPR 2016 Junhua Mao, Jonathan Huang, Alexander Toshev, Oana Camburu, Alan Yuille, Kevin Murphy

We propose a method that can generate an unambiguous description (known as a referring expression) of a specific object or region in an image, and which can also comprehend or interpret such an expression to infer which object is being described.

Image Captioning Object +1

Learning like a Child: Fast Novel Visual Concept Learning from Sentence Descriptions of Images

1 code implementation ICCV 2015 Junhua Mao, Wei Xu, Yi Yang, Jiang Wang, Zhiheng Huang, Alan Yuille

In particular, we propose a transposed weight sharing scheme, which not only improves performance on image captioning, but also makes the model more suitable for the novel concept learning task.

Image Captioning Novel Concepts +1

Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)

2 code implementations 20 Dec 2014 Junhua Mao, Wei Xu, Yi Yang, Jiang Wang, Zhiheng Huang, Alan Yuille

In this paper, we present a multimodal Recurrent Neural Network (m-RNN) model for generating novel image captions.

8k Image Captioning +1

CNN-RNN: A Unified Framework for Multi-label Image Classification

1 code implementation CVPR 2016 Jiang Wang, Yi Yang, Junhua Mao, Zhiheng Huang, Chang Huang, Wei Xu

While deep convolutional neural networks (CNNs) have shown great success in single-label image classification, it is important to note that real-world images generally contain multiple labels, which could correspond to different objects, scenes, actions and attributes in an image.

Classification General Classification +2

Attention Correctness in Neural Image Captioning

no code implementations 31 May 2016 Chenxi Liu, Junhua Mao, Fei Sha, Alan Yuille

Attention mechanisms have recently been introduced in deep learning for various tasks in natural language processing and computer vision.

Image Captioning

Explain Images with Multimodal Recurrent Neural Networks

no code implementations 4 Oct 2014 Junhua Mao, Wei Xu, Yi Yang, Jiang Wang, Alan L. Yuille

In this paper, we present a multimodal Recurrent Neural Network (m-RNN) model for generating novel sentence descriptions to explain the content of images.

8k Retrieval +1

Learning From Weakly Supervised Data by The Expectation Loss SVM (e-SVM) algorithm

no code implementations NeurIPS 2014 Jun Zhu, Junhua Mao, Alan L. Yuille

We propose a novel learning algorithm called expectation loss SVM (e-SVM) that is devoted to the problems where only the "positiveness" instead of a binary label of each training sample is available.

Object Detection +1

STINet: Spatio-Temporal-Interactive Network for Pedestrian Detection and Trajectory Prediction

no code implementations CVPR 2020 Zhishuai Zhang, Jiyang Gao, Junhua Mao, Yukai Liu, Dragomir Anguelov, Cong-Cong Li

For the Waymo Open Dataset, we achieve a bird-eyes-view (BEV) detection AP of 80.73 and trajectory prediction average displacement error (ADE) of 33.67 cm for pedestrians, which establish the state-of-the-art for both tasks.

Autonomous Driving object-detection +3

Multi-modal 3D Human Pose Estimation with 2D Weak Supervision in Autonomous Driving

no code implementations 22 Dec 2021 Jingxiao Zheng, Xinwei Shi, Alexander Gorban, Junhua Mao, Yang Song, Charles R. Qi, Ting Liu, Visesh Chari, Andre Cornman, Yin Zhou, CongCong Li, Dragomir Anguelov

3D human pose estimation (HPE) in autonomous vehicles (AV) differs from other use cases in many factors, including the 3D resolution and range of data, absence of dense depth maps, failure modes for LiDAR, relative location between the camera and LiDAR, and a high bar for estimation accuracy.

3D Human Pose Estimation Autonomous Driving

Pedestrian Crossing Action Recognition and Trajectory Prediction with 3D Human Keypoints

no code implementations 1 Jun 2023 Jiachen Li, Xinwei Shi, Feiyu Chen, Jonathan Stroud, Zhishuai Zhang, Tian Lan, Junhua Mao, Jeonhyung Kang, Khaled S. Refaat, Weilong Yang, Eugene Ie, CongCong Li

Accurate understanding and prediction of human behaviors are critical prerequisites for autonomous vehicles, especially in highly dynamic and interactive scenarios such as intersections in dense urban areas.

Action Recognition Autonomous Vehicles +3
