Search Results for author: Zhengdong Zhang

Scene Transformer: A unified architecture for predicting multiple agent trajectories

3 code implementations • 15 Jun 2021 • Jiquan Ngiam, Benjamin Caine, Vijay Vasudevan, Zhengdong Zhang, Hao-Tien Lewis Chiang, Jeffrey Ling, Rebecca Roelofs, Alex Bewley, Chenxi Liu, Ashish Venugopal, David Weiss, Ben Sapp, Zhifeng Chen, Jonathon Shlens

In this work, we formulate a model for predicting the behavior of all agents jointly, producing consistent futures that account for interactions between agents.

Autonomous Driving • Language Modelling • +1

GitHub stars: 119

RAR-U-Net: a Residual Encoder to Attention Decoder by Residual Connections Framework for Spine Segmentation under Noisy Labels

no code implementations • 27 Sep 2020 • Ziyang Wang, Zhengdong Zhang, Irina Voiculescu

Segmentation algorithms for medical images are widely studied for various clinical and research purposes.

Denoising • Image Segmentation • +3

Conformer: Convolution-augmented Transformer for Speech Recognition

24 code implementations • 16 May 2020 • Anmol Gulati, James Qin, Chung-Cheng Chiu, Niki Parmar, Yu Zhang, Jiahui Yu, Wei Han, Shibo Wang, Zhengdong Zhang, Yonghui Wu, Ruoming Pang

Recently Transformer and Convolution neural network (CNN) based models have shown promising results in Automatic Speech Recognition (ASR), outperforming Recurrent neural networks (RNNs).

Ranked #12 on Speech Recognition on LibriSpeech test-other (using extra training data)

Automatic Speech Recognition • Automatic Speech Recognition (ASR) • +1

GitHub stars: 10,131

ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context

6 code implementations • 7 May 2020 • Wei Han, Zhengdong Zhang, Yu Zhang, Jiahui Yu, Chung-Cheng Chiu, James Qin, Anmol Gulati, Ruoming Pang, Yonghui Wu

We demonstrate that on the widely used LibriSpeech benchmark, ContextNet achieves a word error rate (WER) of 2.1%/4.6% without external language model (LM), 1.9%/4.1% with LM and 2.9%/7.0% with only 10M parameters on the clean/noisy LibriSpeech test sets.

Ranked #12 on Speech Recognition on LibriSpeech test-clean

Automatic Speech Recognition • Automatic Speech Recognition (ASR) • +2

GitHub stars: 900

Streaming Object Detection for 3-D Point Clouds

no code implementations • ECCV 2020 • Wei Han, Zhengdong Zhang, Benjamin Caine, Brandon Yang, Christoph Sprunk, Ouais Alsharif, Jiquan Ngiam, Vijay Vasudevan, Jonathon Shlens, Zhifeng Chen

This built-in data capture latency is artificial, and based on treating the point cloud as a camera image in order to leverage camera-inspired architectures.

Action Recognition • Autonomous Vehicles • +4

A Novel and Efficient Tumor Detection Framework for Pancreatic Cancer via CT Images

no code implementations • 11 Feb 2020 • Zhengdong Zhang, Shuai Li, Ziyang Wang, Yun Lu

Experimental results achieve competitive performance in detection with an AUC of 0.9455, which outperforms other state-of-the-art methods to the best of our knowledge, demonstrating that the proposed framework can detect pancreatic cancer tumors efficiently and accurately.

Computed Tomography (CT)

Hardware for Machine Learning: Challenges and Opportunities

1 code implementation • 22 Dec 2016 • Vivienne Sze, Yu-Hsin Chen, Joel Emer, Amr Suleiman, Zhengdong Zhang

Machine learning plays a critical role in extracting meaningful information out of the zettabytes of sensor data collected every day.

BIG-bench Machine Learning • Self-Driving Cars

A 58.6mW Real-Time Programmable Object Detector with Multi-Scale Multi-Object Support Using Deformable Parts Model on 1920x1080 Video at 30fps

no code implementations • 27 Jul 2016 • Amr Suleiman, Zhengdong Zhang, Vivienne Sze

This paper presents a programmable, energy-efficient and real-time object detection accelerator using deformable parts models (DPM), with 2x higher accuracy than traditional rigid body models.

Classification • General Classification • +3

FAST: A Framework to Accelerate Super-Resolution Processing on Compressed Videos

no code implementations • 29 Mar 2016 • Zhengdong Zhang, Vivienne Sze

State-of-the-art super-resolution (SR) algorithms require significant computational resources to achieve real-time throughput (e.g., 60Mpixels/s for HD video).

Super-Resolution

Sparkle Vision: Seeing the World through Random Specular Microfacets

no code implementations • 26 Dec 2014 • Zhengdong Zhang, Phillip Isola, Edward H. Adelson

In this paper, we study the problem of reproducing the world lighting from a single image of an object covered with random specular microfacets on the surface.
