Search Results for author: Yu Zhang

Found 278 papers, 62 papers with code

Learn to Cross-lingual Transfer with Meta Graph Learning Across Heterogeneous Languages

no code implementations EMNLP 2020 Zheng Li, Mukul Kumar, William Headden, Bing Yin, Ying WEI, Yu Zhang, Qiang Yang

Recent emergence of multilingual pre-training language model (mPLM) has enabled breakthroughs on various downstream cross-lingual transfer (CLT) tasks.

Cross-Lingual Transfer Graph Learning +1

Learning to See in the Dark with Events

no code implementations ECCV 2020 Song Zhang, Yu Zhang, Zhe Jiang, Dongqing Zou, Jimmy Ren, Bin Zhou

A detail enhancing branch is proposed to reconstruct day light-specific features from the domain-invariant representations in a residual manner, regularized by a ranking loss.

Representation Learning Unsupervised Domain Adaptation

Dynamic Feature Alignment for Semi-supervised Domain Adaptation

no code implementations 18 Oct 2021 Yu Zhang, Gongbo Liang, Nathan Jacobs

Most research on domain adaptation has focused on the purely unsupervised setting, where no labeled examples in the target domain are available.

Domain Adaptation

SpeechT5: Unified-Modal Encoder-Decoder Pre-training for Spoken Language Processing

no code implementations 14 Oct 2021 Junyi Ao, Rui Wang, Long Zhou, Shujie Liu, Shuo Ren, Yu Wu, Tom Ko, Qing Li, Yu Zhang, Zhihua Wei, Yao Qian, Jinyu Li, Furu Wei

Motivated by the success of T5 (Text-To-Text Transfer Transformer) in pre-training natural language processing models, we propose a unified-modal SpeechT5 framework that explores the encoder-decoder pre-training for self-supervised speech/text representation learning.

automatic-speech-recognition Quantization +4

Region Semantically Aligned Network for Zero-Shot Learning

no code implementations 14 Oct 2021 Ziyang Wang, Yunhao Gou, Jingjing Li, Yu Zhang, Yang Yang

Zero-shot learning (ZSL) aims to recognize unseen classes based on the knowledge of seen classes.

Transfer Learning Zero-Shot Learning

Multi-View Self-Attention Based Transformer for Speaker Recognition

no code implementations 11 Oct 2021 Rui Wang, Junyi Ao, Long Zhou, Shujie Liu, Zhihua Wei, Tom Ko, Qing Li, Yu Zhang

In this work, we propose a novel multi-view self-attention mechanism and present an empirical study of different Transformer variants with or without the proposed attention mechanism for speaker recognition.

Speaker Recognition

Universal Paralinguistic Speech Representations Using Self-Supervised Conformers

no code implementations 9 Oct 2021 Joel Shor, Aren Jansen, Wei Han, Daniel Park, Yu Zhang

Many speech applications require understanding aspects beyond the words being spoken, such as recognizing emotion, detecting whether the speaker is wearing a mask, or distinguishing real from synthetic speech.

Improving Confidence Estimation on Out-of-Domain Data for End-to-End Speech Recognition

no code implementations 7 Oct 2021 Qiujia Li, Yu Zhang, David Qiu, Yanzhang He, Liangliang Cao, Philip C. Woodland

As end-to-end automatic speech recognition (ASR) models reach promising performance, various downstream tasks rely on good confidence estimators for these systems.

automatic-speech-recognition End-To-End Speech Recognition +2

Learning Material Parameters and Hydrodynamics of Soft Robotic Fish via Differentiable Simulation

no code implementations 30 Sep 2021 John Z. Zhang, Yu Zhang, Pingchuan Ma, Elvis Nava, Tao Du, Philip Arm, Wojciech Matusik, Robert K. Katzschmann

We address this gap with our differentiable simulation tool by learning the material parameters and hydrodynamics of our robots.

Comparison of Object Detection Algorithms Using Video and Thermal Images Collected from a UAS Platform: An Application of Drones in Traffic Management

no code implementations 27 Sep 2021 Hualong Tang, Joseph Post, Achilleas Kourtellis, Brian Porter, Yu Zhang

The results show that a background subtraction-based method can achieve good detection performance on RGB images (F1 scores around 0.9 for most cases), and a more varied performance is seen on thermal images with different azimuth angles.

Object Detection

A Simple Self-calibration Method for The Internal Time Synchronization of MEMS LiDAR

no code implementations 26 Sep 2021 Yu Zhang, Xiaoguang Di, Shiyu Yan, Bin Zhang, Baoling Qi, Chunhui Wang

This paper proposes a simple self-calibration method for the internal time synchronization of MEMS (Micro-electromechanical systems) LiDAR during research and development.

Multi-Task Learning in Natural Language Processing: An Overview

no code implementations 19 Sep 2021 Shijie Chen, Yu Zhang, Qiang Yang

Deep learning approaches have achieved great success in the field of Natural Language Processing (NLP).

Multi-Task Learning

Generating Active Explicable Plans in Human-Robot Teaming

no code implementations 18 Sep 2021 Akkamahadevi Hanni, Yu Zhang

In our experimental evaluation, we verify that our approach generates more efficient explicable plans while successfully capturing the dynamic belief change of the human teammate.

Logic-level Evidence Retrieval and Graph-based Verification Network for Table-based Fact Verification

1 code implementation 14 Sep 2021 Qi Shi, Yu Zhang, Qingyu Yin, Ting Liu

Specifically, we first retrieve logic-level program-like evidence from the given table and statement as supplementary evidence for the table.

Fact Verification Table-based Fact Verification

Domain Adaptation by Maximizing Population Correlation with Neural Architecture Search

no code implementations 12 Sep 2021 Zhixiong Yue, Pengxin Guo, Yu Zhang

Based on the PC function, we propose a new method called Domain Adaptation by Maximizing Population Correlation (DAMPC) to learn a domain-invariant feature representation for DA.

Domain Adaptation Neural Architecture Search

Distantly-Supervised Named Entity Recognition with Noise-Robust Learning and Language Model Augmented Self-Training

1 code implementation 10 Sep 2021 Yu Meng, Yunyi Zhang, Jiaxin Huang, Xuan Wang, Yu Zhang, Heng Ji, Jiawei Han

We study the problem of training named entity recognition (NER) models using only distantly-labeled data, which can be automatically obtained by matching entity mentions in the raw text with entity types in a knowledge base.

Language Modelling Named Entity Recognition +1

Injecting Text in Self-Supervised Speech Pretraining

no code implementations 27 Aug 2021 Zhehuai Chen, Yu Zhang, Andrew Rosenberg, Bhuvana Ramabhadran, Gary Wang, Pedro Moreno

The proposed method, tts4pretrain, complements the power of contrastive learning in self-supervision with linguistic/lexical representations derived from synthesized speech, effectively learning from untranscribed speech and unspoken text.

Contrastive Learning Language Modelling +1

Attention-based Neural Load Forecasting: A Dynamic Feature Selection Approach

no code implementations 25 Aug 2021 Jing Xiong, Pengyang Zhou, Alan Chen, Yu Zhang

Then, a decoder with hierarchical temporal attention enables a similar day selection, which re-evaluates the importance of historical information at each time step.

Feature Selection Load Forecasting +4

Online Dictionary Learning Based Fault and Cyber Attack Detection for Power Systems

no code implementations 24 Aug 2021 Gabriel Intriago, Yu Zhang

This paper deals with the event and intrusion detection problem by leveraging a stream data mining classifier (Hoeffding adaptive tree) with semi-supervised learning techniques to distinguish cyber-attacks from regular system perturbations accurately.

Cyber Attack Detection Dictionary Learning +1

Mitigating Greenhouse Gas Emissions Through Generative Adversarial Networks Based Wildfire Prediction

no code implementations 20 Aug 2021 Sifat Chowdhury, Kai Zhu, Yu Zhang

Over the past decade, the number of wildfires has increased significantly around the world, especially in the State of California.

Data Augmentation

Reinforcement Learning for Robot Navigation with Adaptive Execution Duration (AED) in a Semi-Markov Model

no code implementations 13 Aug 2021 Yu'an Chen, Ruosong Ye, Ziyang Tao, Hongjian Liu, Guangda Chen, Jie Peng, Jun Ma, Yu Zhang, Yanyong Zhang, Jianmin Ji

Specifically, we formulate the navigation task as a Semi-Markov Decision Process (SMDP) problem to handle adaptive execution duration.

Robot Navigation

W2v-BERT: Combining Contrastive Learning and Masked Language Modeling for Self-Supervised Speech Pre-Training

no code implementations 7 Aug 2021 Yu-An Chung, Yu Zhang, Wei Han, Chung-Cheng Chiu, James Qin, Ruoming Pang, Yonghui Wu

In particular, when compared to published models such as conformer-based wav2vec 2.0 and HuBERT, our model shows 5% to 10% relative WER reduction on the test-clean and test-other subsets.

Contrastive Learning Language Modelling +1

Neighbor-Vote: Improving Monocular 3D Object Detection through Neighbor Distance Voting

1 code implementation 6 Jul 2021 Xiaomeng Chu, Jiajun Deng, Yao Li, Zhenxun Yuan, Yanyong Zhang, Jianmin Ji, Yu Zhang

As cameras are increasingly deployed in new application domains such as autonomous driving, performing 3D object detection on monocular images becomes an important task for visual scene understanding.

Autonomous Driving Monocular 3D Object Detection +2

Multi-Modal 3D Object Detection in Autonomous Driving: a Survey

no code implementations 24 Jun 2021 Yingjie Wang, Qiuyu Mao, Hanqi Zhu, Yu Zhang, Jianmin Ji, Yanyong Zhang

In this survey, we first introduce the background of popular sensors for autonomous cars, including their common data representations as well as object detection networks developed for each type of sensor data.

3D Object Detection Autonomous Driving +2

Sparse Multi-Path Corrections in Fringe Projection Profilometry

no code implementations CVPR 2021 Yu Zhang, Daniel Lau, David Wipf

Three-dimensional scanning by means of structured light illumination is an active imaging technique involving projecting and capturing a series of striped patterns and then using the observed warping of stripes to reconstruct the target object's surface through triangulating each pixel in the camera to a unique projector coordinate corresponding to a particular feature in the projected patterns.

Informative and Consistent Correspondence Mining for Cross-Domain Weakly Supervised Object Detection

no code implementations CVPR 2021 Luwei Hou, Yu Zhang, Kui Fu, Jia Li

Cross-domain weakly supervised object detection aims to adapt object-level knowledge from a fully labeled source domain dataset (i.e., with object bounding boxes) to train object detectors for target domains that are weakly labeled (i.e., with image-level tags).

Transfer Learning Weakly Supervised Object Detection

WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis

2 code implementations 17 Jun 2021 Nanxin Chen, Yu Zhang, Heiga Zen, Ron J. Weiss, Mohammad Norouzi, Najim Dehak, William Chan

The model takes an input phoneme sequence, and through an iterative refinement process, generates an audio waveform.

Speech Synthesis Text-To-Speech Synthesis

FedNILM: Applying Federated Learning to NILM Applications at the Edge

no code implementations 7 Jun 2021 Yu Zhang, Guoming Tang, Qianyi Huang, Yi Wang, Xudong Wang, Jiadong Lou

Non-intrusive load monitoring (NILM) helps disaggregate the household's main electricity consumption to energy usages of individual appliances, thus greatly cutting down the cost in fine-grained household load monitoring.

Federated Learning Model Compression +2

Rethinking Training from Scratch for Object Detection

1 code implementation 6 Jun 2021 Yang Li, Hong Zhang, Yu Zhang

The ImageNet pre-training initialization is the de-facto standard for object detection.

Object Detection

More Behind Your Electricity Bill: a Dual-DNN Approach to Non-Intrusive Load Monitoring

no code implementations 1 Jun 2021 Yu Zhang, Guoming Tang, Qianyi Huang, Yi Wang, Hong Xu

Non-intrusive load monitoring (NILM) is a well-known single-channel blind source separation problem that aims to decompose the household energy consumption into itemised energy usage of individual appliances.

Non-Intrusive Load Monitoring

Large-Signal Grid-Synchronization Stability Analysis of PLL-based VSCs Using Lyapunov's Direct Method

no code implementations 23 May 2021 Yu Zhang, Chen Zhang, Xu Cai

Grid-synchronization stability (GSS) is an emerging stability issue of grid-tied voltage source converters (VSCs), which can be provoked by severe grid voltage sags.

Scaling End-to-End Models for Large-Scale Multilingual ASR

no code implementations 30 Apr 2021 Bo Li, Ruoming Pang, Tara N. Sainath, Anmol Gulati, Yu Zhang, James Qin, Parisa Haghani, W. Ronny Huang, Min Ma, Junwen Bai

Building ASR models across many languages is a challenging multi-task learning problem due to large variations and heavily unbalanced data.

Multi-Task Learning

Deep Latent Emotion Network for Multi-Task Learning

no code implementations 18 Apr 2021 Huangbin Zhang, Chong Zhao, Yu Zhang, Danlei Wang, Haichao Yang

DLEN is deployed on a real-world multi-task feed recommendation scenario of Tencent QQ-Small-World with a dataset containing over a billion samples, and it exhibits a significant performance advantage over the SOTA MTL model in offline evaluation, together with a considerable increase of 3.02% in view-count and 2.63% in user stay-time in production.

Multi-Task Learning

CSAFL: A Clustered Semi-Asynchronous Federated Learning Framework

no code implementations 16 Apr 2021 Yu Zhang, Moming Duan, Duo Liu, Li Li, Ao Ren, Xianzhang Chen, Yujuan Tan, Chengliang Wang

Asynchronous FL has a natural advantage in mitigating the straggler effect, but there are threats of model quality degradation and server crash.

Federated Learning

Pushing the Limits of Non-Autoregressive Speech Recognition

no code implementations 7 Apr 2021 Edwin G. Ng, Chung-Cheng Chiu, Yu Zhang, William Chan

We combine recent advancements in end-to-end speech recognition to non-autoregressive automatic speech recognition.

automatic-speech-recognition End-To-End Speech Recognition +2

Exploring Targeted Universal Adversarial Perturbations to End-to-end ASR Models

no code implementations 6 Apr 2021 Zhiyun Lu, Wei Han, Yu Zhang, Liangliang Cao

To attack RNN-T, we find prepending perturbation is more effective than the additive perturbation, and can mislead the models to predict the same short target on utterances of arbitrary length.

automatic-speech-recognition End-To-End Speech Recognition +1

SpeechStew: Simply Mix All Available Speech Recognition Data to Train One Large Neural Network

no code implementations 5 Apr 2021 William Chan, Daniel Park, Chris Lee, Yu Zhang, Quoc Le, Mohammad Norouzi

We present SpeechStew, a speech recognition model that is trained on a combination of various publicly available speech recognition datasets: AMI, Broadcast News, Common Voice, LibriSpeech, Switchboard/Fisher, Tedlium, and Wall Street Journal.

Language Modelling Speech Recognition +1

Using Simulation to Aid the Design and Optimization of Intelligent User Interfaces for Quality Assurance Processes in Machine Learning

no code implementations 2 Apr 2021 Yu Zhang, Martijn Tennekes, Tim De Jong, Lyana Curier, Bob Coecke, Min Chen

Because QA4ML users have to view a non-trivial amount of data and perform many input actions to correct errors made by the ML model, an optimally-designed user interface (UI) can reduce the cost of interactions significantly.

On the limits of algorithmic prediction across the globe

no code implementations 28 Mar 2021 Xingyu Li, Difan Song, Miaozhe Han, Yu Zhang, Rene F. Kizilcec

We tested how well predictive models of human behavior trained in a developed country generalize to people in less developed countries by modeling global variation in 200 predictors of academic achievement on nationally representative student data for 65 countries.

Deciphering Star Cluster Evolution by Shape Morphology

no code implementations 4 Mar 2021 Qingshun Hu, Yu Zhang, Ali Esamdin, Jinzhong Liu, Xiangyun Zeng

A significant negative correlation between the overall ellipticities and masses is also detected for the sample clusters with log(age/year) $\geq$ 8, suggesting that the overall shapes of the clusters are possibly influenced by the number of members and masses, in addition to the external forces and the surrounding environment.

Astrophysics of Galaxies Solar and Stellar Astrophysics

Self-supervised Low Light Image Enhancement and Denoising

1 code implementation 1 Mar 2021 Yu Zhang, Xiaoguang Di, Bin Zhang, Qingyan Li, Shiyu Yan, Chunhui Wang

Both of the networks can be trained with low light images only, which is achieved by a Maximum Entropy based Retinex (ME-Retinex) model and an assumption that noises are independently distributed.

Denoising Low-Light Image Enhancement

Fronthaul Compression and Passive Beamforming Design for Intelligent Reflecting Surface-aided Cloud Radio Access Networks

no code implementations 25 Feb 2021 Yu Zhang, Xuelu Wu, Hong Peng, Caijun Zhong, Xiaoming Chen

This letter studies a cloud radio access network (C-RAN) with multiple intelligent reflecting surfaces (IRS) deployed between users and remote radio heads (RRH).


Reinforcement Learning for Beam Pattern Design in Millimeter Wave and Massive MIMO Systems

no code implementations 18 Feb 2021 Yu Zhang, Muhammad Alrabeiah, Ahmed Alkhateeb

Employing large antenna arrays is a key characteristic of millimeter wave (mmWave) and terahertz communication systems.

Echo State Speech Recognition

no code implementations 18 Feb 2021 Harsh Shrivastava, Ankush Garg, Yuan Cao, Yu Zhang, Tara Sainath

We propose automatic speech recognition (ASR) models inspired by echo state network (ESN), in which a subset of recurrent neural networks (RNN) layers in the models are randomly initialized and untrained.

automatic-speech-recognition Speech Recognition

MATCH: Metadata-Aware Text Classification in A Large Hierarchy

1 code implementation 15 Feb 2021 Yu Zhang, Zhihong Shen, Yuxiao Dong, Kuansan Wang, Jiawei Han

Multi-label text classification refers to the problem of assigning each given document its most relevant labels from the label set.

Classification General Classification +1

Multi-Objective Meta Learning

no code implementations 14 Feb 2021 Feiyang Ye, Baijiong Lin, Zhixiong Yue, Pengxin Guo, Qiao Xiao, Yu Zhang

Empirically, we show the effectiveness of the proposed MOML framework in several meta learning problems, including few-shot learning, neural architecture search, domain adaptation, and multi-task learning.

Domain Adaptation Few-Shot Learning +2

Phase discontinuities induced scintillation enhancement: coherent vortex beams propagating through weak oceanic turbulence

no code implementations 5 Feb 2021 Hantao Wang, Huajun Zhang, Mingyuan Ren, Jinren Yao, Yu Zhang

Under the impact of an infinitely extended edge phase dislocation, optical vortices (screw phase dislocations) induce scintillation enhancement.


Joint Transmit Precoding and Reflect Beamforming Design for IRS-Assisted MIMO Cognitive Radio Systems

no code implementations 2 Feb 2021 Weiheng Jiang, Yu Zhang, Jun Zhao, Zehui Xiong, Zhiguo Ding

Cognitive radio (CR) is an effective solution to improve the spectral efficiency (SE) of wireless communications by allowing the secondary users (SUs) to share spectrum with primary users (PUs).

Information Theory Signal Processing

Photoproduction $\gamma p \to K^+ \Lambda(1520)$ in an effective Lagrangian approach

no code implementations 22 Jan 2021 Neng-Chang Wei, Yu Zhang, Fei Huang, De-Min Li

In addition to the $t$-channel $K$ and $K^\ast$ exchanges, the $u$-channel $\Lambda$ exchange, the $s$-channel nucleon exchange, and the interaction current, a minimal number of nucleon resonances in the $s$ channel are introduced in constructing the reaction amplitudes to describe the data.

High Energy Physics - Phenomenology Nuclear Theory

Generative Adversarial U-Net for Domain-free Medical Image Augmentation

no code implementations 12 Jan 2021 Xiaocong Chen, Yun Li, Lina Yao, Ehsan Adeli, Yu Zhang

The shortage of annotated medical images is one of the biggest challenges in the field of medical image computing.

Computed Tomography (CT) Image Augmentation +1

Model-based cellular kinetic analysis of SARS-CoV-2 infection: different immune response modes and treatment strategies

no code implementations 12 Jan 2021 Zhengqing Zhou, Zhiheng Zhao, Shuyu Shi, Jianghua Wu, Dianjie Li, Jianwei Li, Jingpeng Zhang, Ke Gui, Yu Zhang, Heng Mei, Yu Hu, Qi Ouyang, Fangting Li

Integrating theoretical results with clinical COVID-19 patients' data, we classified the COVID-19 development processes into three typical modes of immune responses, correlated with the clinical classification of mild & moderate, severe and critical patients.

Training Weakly Supervised Video Frame Interpolation With Events

1 code implementation ICCV 2021 ZHIYANG YU, Yu Zhang, Deyuan Liu, Dongqing Zou, Xijun Chen, Yebin Liu, Jimmy S. Ren

Though trained on low frame-rate videos, our framework outperforms existing models trained with full high frame-rate videos (and events) on both GoPro dataset and a new real event-based dataset.

Video Frame Interpolation

FOC OSOD: Focus on Classification One-Shot Object Detection

no code implementations 1 Jan 2021 Hanqing Yang, Huaijin Pi, SABA GHORBANI BARZEGAR, Yu Zhang

This paper analyzes the serious false positive problem in OSOD and proposes a Focus on Classification One-Shot Object Detection (FOC OSOD) framework, which is improved in two important aspects: (1) classification cascade head with the fixed IoU threshold can enhance the robustness of classification by comparing multiple close regions; (2) classification region deformation on the query feature and the reference feature to obtain a more effective comparison region.

Classification General Classification +1

A Survey on Neural Network Interpretability

no code implementations 28 Dec 2020 Yu Zhang, Peter Tiňo, Aleš Leonardis, Ke Tang

Along with the great success of deep neural networks, there is also growing concern about their black-box nature.

Drug Discovery

On Convergence of Gradient Expected Sarsa($\lambda$)

no code implementations 14 Dec 2020 Long Yang, Gang Zheng, Yu Zhang, Qian Zheng, Pengfei Li, Gang Pan

We study the convergence of $\mathtt{Expected~Sarsa}(\lambda)$ with linear function approximation.

Improving EEG Decoding via Clustering-based Multi-task Feature Learning

no code implementations 12 Dec 2020 Yu Zhang, Tao Zhou, Wei Wu, Hua Xie, Hongru Zhu, Guoxu Zhou, Andrzej Cichocki

With the encoded label matrix, we devise a novel multi-task learning algorithm by exploiting the subclass relationship to jointly optimize the EEG pattern features from the uncovered subclasses.

EEG EEG Decoding +1

Learn to Predict Vertical Track Irregularity with Extremely Imbalanced Data

no code implementations 5 Dec 2020 Yutao Chen, Yu Zhang, Fei Yang

Railway systems require regular manual maintenance, a large part of which is dedicated to inspecting track deformation.

Ensemble Learning Time Series +1

Optical Wavelength Guided Self-Supervised Feature Learning For Galaxy Cluster Richness Estimate

no code implementations 4 Dec 2020 Gongbo Liang, Yuanyuan Su, Sheng-Chieh Lin, Yu Zhang, Yuanyuan Zhang, Nathan Jacobs

We believe the proposed method will benefit astronomy and cosmology, where a large number of unlabeled multi-band images are available, but acquiring image labels is costly.

A robust and generalizable immune-related signature for sepsis diagnostics

no code implementations 23 Nov 2020 Yueran Yang, Yu Zhang, Shuai Li, Xubin Zheng, Man-Hon Wong, Kwong-Sak Leung, Lixin Cheng

High-throughput sequencing can detect tens of thousands of genes in parallel, providing opportunities for improving the diagnostic accuracy of multiple diseases including sepsis, which is an aggressive inflammatory response to infection that can cause organ failure and death.

A Better and Faster End-to-End Model for Streaming ASR

no code implementations 21 Nov 2020 Bo Li, Anmol Gulati, Jiahui Yu, Tara N. Sainath, Chung-Cheng Chiu, Arun Narayanan, Shuo-Yiin Chang, Ruoming Pang, Yanzhang He, James Qin, Wei Han, Qiao Liang, Yu Zhang, Trevor Strohman, Yonghui Wu

To address this, we explore replacing the LSTM layers in the encoder of our E2E model with Conformer layers [4], which has shown good improvements for ASR.

Audio and Speech Processing Sound

Multi-Task Adversarial Attack

no code implementations 19 Nov 2020 Pengxin Guo, Yuancheng Xu, Baijiong Lin, Yu Zhang

More specifically, MTA uses a generator for adversarial perturbations which consists of a shared encoder for all tasks and multiple task-specific decoders.

Adversarial Attack

Effective, Efficient and Robust Neural Architecture Search

no code implementations 19 Nov 2020 Zhixiong Yue, Baijiong Lin, Xiaonan Huang, Yu Zhang

Although NAS methods can find network architectures with the state-of-the-art performance, the adversarial robustness and resource constraint are often ignored in NAS.

Neural Architecture Search

Domain Concretization from Examples: Addressing Missing Domain Knowledge via Robust Planning

no code implementations 18 Nov 2020 Akshay Sharma, Piyush Rajesh Medikeri, Yu Zhang

This problem is more challenging than partial observability in the sense that the agent is unaware of certain knowledge, in contrast to it being partially observable: the difference between known unknowns and unknown unknowns.

Decision Making

Large-scale multilingual audio visual dubbing

no code implementations 6 Nov 2020 Yi Yang, Brendan Shillingford, Yannis Assael, Miaosen Wang, Wendi Liu, Yutian Chen, Yu Zhang, Eren Sezener, Luis C. Cobo, Misha Denil, Yusuf Aytar, Nando de Freitas

The visual content is translated by synthesizing lip movements for the speaker to match the translated audio, creating a seamless audiovisual experience in the target language.


Fault Detection for Covered Conductors With High-Frequency Voltage Signals: From Local Patterns to Global Features

no code implementations 1 Nov 2020 Kunjin Chen, Tomáš Vantuch, Yu Zhang, Jun Hu, Jinliang He

The detection and characterization of partial discharge (PD) are crucial for the insulation diagnosis of overhead lines with covered conductors.

Fault Detection

Hierarchical Metadata-Aware Document Categorization under Weak Supervision

1 code implementation 26 Oct 2020 Yu Zhang, Xiusi Chen, Yu Meng, Jiawei Han

Our experiments demonstrate a consistent improvement of HiMeCat over competitive baselines and validate the contribution of our representation learning and data augmentation modules.

Data Augmentation Document Classification +1

Unsupervised Learning of Disentangled Speech Content and Style Representation

no code implementations 24 Oct 2020 Andros Tjandra, Ruoming Pang, Yu Zhang, Shigeki Karita

We present an approach for unsupervised learning of speech representation disentangling contents and styles.

Speaker Recognition

Improving Streaming Automatic Speech Recognition With Non-Streaming Model Distillation On Unsupervised Data

no code implementations 22 Oct 2020 Thibault Doutre, Wei Han, Min Ma, Zhiyun Lu, Chung-Cheng Chiu, Ruoming Pang, Arun Narayanan, Ananya Misra, Yu Zhang, Liangliang Cao

We propose a novel and effective learning method by leveraging a non-streaming ASR model as a teacher to generate transcripts on an arbitrarily large data set, which is then used to distill knowledge into streaming ASR models.

automatic-speech-recognition Model distillation +1

Pushing the Limits of Semi-Supervised Learning for Automatic Speech Recognition

no code implementations 20 Oct 2020 Yu Zhang, James Qin, Daniel S. Park, Wei Han, Chung-Cheng Chiu, Ruoming Pang, Quoc V. Le, Yonghui Wu

We employ a combination of recent developments in semi-supervised learning for automatic speech recognition to obtain state-of-the-art results on LibriSpeech utilizing the unlabeled audio of the Libri-Light dataset.

Ranked #1 on Speech Recognition on LibriSpeech test-clean (using extra training data)

automatic-speech-recognition Speech Recognition

Non-Attentive Tacotron: Robust and Controllable Neural TTS Synthesis Including Unsupervised Duration Modeling

3 code implementations 8 Oct 2020 Jonathan Shen, Ye Jia, Mike Chrzanowski, Yu Zhang, Isaac Elias, Heiga Zen, Yonghui Wu

This paper presents Non-Attentive Tacotron based on the Tacotron 2 text-to-speech model, replacing the attention mechanism with an explicit duration predictor.

Speech Recognition

Contrastive Cross-Modal Pre-Training: A General Strategy for Small Sample Medical Imaging

no code implementations 6 Oct 2020 Gongbo Liang, Connor Greenwell, Yu Zhang, Xiaoqin Wang, Ramakanth Kavuluru, Nathan Jacobs

A key challenge in training neural networks for a given medical imaging task is often the difficulty of obtaining a sufficient number of manually labeled examples.

Image Classification Text Matching +1

Cross-Modal Alignment with Mixture Experts Neural Network for Intral-City Retail Recommendation

no code implementations 17 Sep 2020 Po Li, Lei LI, Yan Fu, Jun Rong, Yu Zhang

On top of the MoE layer, we deploy a transformer layer for each task as a task tower to learn task-specific information.

Recommendation Systems

Boosting Retailer Revenue by Generated Optimized Combined Multiple Digital Marketing Campaigns

no code implementations 9 Sep 2020 Yafei Xu, Tian Xie, Yu Zhang

Secondly, based on the sub-modular optimization theory and the DMC pool by DMCNet, the generated combined multiple DMCs are ranked with respect to their revenue generation strength, and the top three ranked campaigns are then returned to the sellers' back-end management system, so that retailers can set combined multiple DMCs for their online shops in one shot.

Improved Trainable Calibration Method for Neural Networks on Medical Imaging Classification

no code implementations 9 Sep 2020 Gongbo Liang, Yu Zhang, Xiaoqin Wang, Nathan Jacobs

Recent works have shown that deep neural networks can achieve super-human performance in a wide range of image classification tasks in the medical imaging domain.

Classification Decision Making +2

Oceanic non-Kolmogorov optical turbulence and spherical wave propagation

no code implementations 5 Sep 2020 Jinren Yao, Hantao Wang, Huajun Zhang, Jiandong Cai, Mingyuan Ren, Yu Zhang, Olga Korotkova

In particular, for natural water turbulence, several models for the spatial power spectra have been developed based on the classic Kolmogorov postulates.

Atmospheric and Oceanic Physics Optics

WaveGrad: Estimating Gradients for Waveform Generation

6 code implementations ICLR 2021 Nanxin Chen, Yu Zhang, Heiga Zen, Ron J. Weiss, Mohammad Norouzi, William Chan

This paper introduces WaveGrad, a conditional model for waveform generation which estimates gradients of the data density.

Speech Synthesis Text-To-Speech Synthesis

Better Than Reference In Low Light Image Enhancement: Conditional Re-Enhancement Networks

1 code implementation 26 Aug 2020 Yu Zhang, Xiaoguang Di, Bin Zhang, Ruihang Ji, Chunhui Wang

The network takes low light images as input and the enhanced V channel as condition, then it can re-enhance the contrast and brightness of the low light image and at the same time reduce noise and color distortion.

Low-Light Image Enhancement

MiNet: Mixed Interest Network for Cross-Domain Click-Through Rate Prediction

1 code implementation 7 Aug 2020 Wentao Ouyang, Xiuwu Zhang, Lei Zhao, Jinmei Luo, Yu Zhang, Heng Zou, Zhaojie Liu, Yanlong Du

Our study is based on UC Toutiao (a news feed service integrated with the UC Browser App, serving hundreds of millions of users daily), where the source domain is the news and the target domain is the ad.

Click-Through Rate Prediction

Multi-source Heterogeneous Domain Adaptation with Conditional Weighting Adversarial Network

1 code implementation 6 Aug 2020 Yuan Yao, Xutao Li, Yu Zhang, Yunming Ye

In reality, however, it is not uncommon to obtain samples from multiple heterogeneous domains.

Domain Adaptation

COMET: Convolutional Dimension Interaction for Collaborative Filtering

no code implementations28 Jul 2020 Zhuoyi Lin, Lei Feng, Xingzhi Guo, Yu Zhang, Rui Yin, Chee Keong Kwoh, Chi Xu

In this paper, we propose a novel latent factor model called COMET (COnvolutional diMEnsion inTeraction), which simultaneously models the high-order interaction patterns among historical interactions and embedding dimensions.

Collaborative Filtering

A Study on Evaluation Standard for Automatic Crack Detection Regard the Random Fractal

no code implementations23 Jul 2020 Hongyu Li, Jihe Wang, Yu Zhang, Zi-Rui Wang, Tiejun Wang

In CovEval, a different matching process based on the idea of covering box matching is adopted for this issue.

Object Detection

Deep Image Clustering with Category-Style Representation

1 code implementation ECCV 2020 Junjie Zhao, Donghuan Lu, Kai Ma, Yu Zhang, Yefeng Zheng

In this paper, we propose a novel deep image clustering framework to learn a category-style latent representation in which the category information is disentangled from image style and can be directly used as the cluster assignment.

Deep Clustering Image Clustering

Hierarchical Topic Mining via Joint Spherical Tree and Text Embedding

1 code implementation18 Jul 2020 Yu Meng, Yunyi Zhang, Jiaxin Huang, Yu Zhang, Chao Zhang, Jiawei Han

Mining a set of meaningful topics organized into a hierarchy is intuitively appealing since topic correlations are ubiquitous in massive text corpora.

Topic Models

Multi-Site Infant Brain Segmentation Algorithms: The iSeg-2019 Challenge

no code implementations4 Jul 2020 Yue Sun, Kun Gao, Zhengwang Wu, Zhihao Lei, Ying WEI, Jun Ma, Xiaoping Yang, Xue Feng, Li Zhao, Trung Le Phan, Jitae Shin, Tao Zhong, Yu Zhang, Lequan Yu, Caizi Li, Ramesh Basnet, M. Omair Ahmad, M. N. S. Swamy, Wenao Ma, Qi Dou, Toan Duc Bui, Camilo Bermudez Noguera, Bennett Landman, Ian H. Gotlib, Kathryn L. Humphreys, Sarah Shultz, Longchuan Li, Sijie Niu, Weili Lin, Valerie Jewells, Gang Li, Dinggang Shen, Li Wang

Deep learning-based methods have achieved state-of-the-art performance; however, one of the major limitations is that the learning-based methods may suffer from the multi-site issue, that is, the models trained on a dataset from one site may not be applicable to the datasets acquired from other sites with different imaging protocols/scanners.

Brain Segmentation

Allocation of Multi-Robot Tasks with Task Variants

no code implementations1 Jul 2020 Zakk Giacometti, Yu Zhang

We referred to this new problem as the multi-robot task allocation problem with task variants.

A Multi-spectral Dataset for Evaluating Motion Estimation Systems

no code implementations1 Jul 2020 Weichen Dai, Yu Zhang, Shenzhou Chen, Donglei Sun, Da Kong

The multi-spectral images, including both color and thermal images in full sensor resolution (640 x 480), are obtained from a standard and a long-wave infrared camera at 32Hz with hardware-synchronization.

Motion Capture Motion Estimation +1

Offline Handwritten Chinese Text Recognition with Convolutional Neural Networks

1 code implementation28 Jun 2020 Brian Liu, Xianchao Xu, Yu Zhang

Deep learning based methods have been dominating the text recognition tasks in different and multilingual scenarios.

Handwritten Chinese Text Recognition Language Modelling

Neural Networks Based Beam Codebooks: Learning mmWave Massive MIMO Beams that Adapt to Deployment and Hardware

no code implementations25 Jun 2020 Muhammad Alrabeiah, Yu Zhang, Ahmed Alkhateeb

To overcome these limitations, this paper develops an efficient online machine learning framework that learns how to adapt the codebook beam patterns to the specific deployment, surrounding environment, user distribution, and hardware characteristics.

Distant Transfer Learning via Deep Random Walk

no code implementations13 Jun 2020 Qiao Xiao, Yu Zhang

Transfer learning, which is to improve the learning performance in the target domain by leveraging useful knowledge from the source domain, often requires that those two domains are very close, which limits its application scope.

Transfer Learning

M2Net: Multi-modal Multi-channel Network for Overall Survival Time Prediction of Brain Tumor Patients

no code implementations1 Jun 2020 Tao Zhou, Huazhu Fu, Yu Zhang, Changqing Zhang, Xiankai Lu, Jianbing Shen, Ling Shao

Then, we use a modality-specific network to extract implicit and high-level features from different MR scans.

Attention: to Better Stand on the Shoulders of Giants

no code implementations27 May 2020 Sha Yuan, Zhou Shao, Yu Zhang, Xingxing Wei, Tong Xiao, Yifan Wang, Jie Tang

In the progress of science, the previously discovered knowledge principally inspires new scientific ideas, and citation is a reasonably good reflection of this cumulative nature of scientific research.

Improved Noisy Student Training for Automatic Speech Recognition

no code implementations19 May 2020 Daniel S. Park, Yu Zhang, Ye Jia, Wei Han, Chung-Cheng Chiu, Bo Li, Yonghui Wu, Quoc V. Le

Noisy student training is an iterative self-training method that leverages augmentation to improve network performance.

Ranked #3 on Speech Recognition on LibriSpeech test-clean (using extra training data)

automatic-speech-recognition Image Classification +1

Conformer: Convolution-augmented Transformer for Speech Recognition

12 code implementations16 May 2020 Anmol Gulati, James Qin, Chung-Cheng Chiu, Niki Parmar, Yu Zhang, Jiahui Yu, Wei Han, Shibo Wang, Zhengdong Zhang, Yonghui Wu, Ruoming Pang

Recently Transformer and Convolution neural network (CNN) based models have shown promising results in Automatic Speech Recognition (ASR), outperforming Recurrent neural networks (RNNs).

automatic-speech-recognition Language Modelling +1

ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context

3 code implementations7 May 2020 Wei Han, Zhengdong Zhang, Yu Zhang, Jiahui Yu, Chung-Cheng Chiu, James Qin, Anmol Gulati, Ruoming Pang, Yonghui Wu

We demonstrate that on the widely used LibriSpeech benchmark, ContextNet achieves a word error rate (WER) of 2.1%/4.6% without external language model (LM), 1.9%/4.1% with LM and 2.9%/7.0% with only 10M parameters on the clean/noisy LibriSpeech test sets.

automatic-speech-recognition End-To-End Speech Recognition +2

RNN-T Models Fail to Generalize to Out-of-Domain Audio: Causes and Solutions

no code implementations7 May 2020 Chung-Cheng Chiu, Arun Narayanan, Wei Han, Rohit Prabhavalkar, Yu Zhang, Navdeep Jaitly, Ruoming Pang, Tara N. Sainath, Patrick Nguyen, Liangliang Cao, Yonghui Wu

On a long-form YouTube test set, when the nonstreaming RNN-T model is trained with shorter segments of data, the proposed combination improves word error rate (WER) from 22.3% to 14.8%; when the streaming RNN-T model is trained on short Search queries, the proposed techniques improve WER on the YouTube set from 67.0% to 25.3%.

automatic-speech-recognition Speech Recognition

Efficient Second-Order TreeCRF for Neural Dependency Parsing

2 code implementations ACL 2020 Yu Zhang, Zhenghua Li, Min Zhang

Experiments and analysis on 27 datasets from 13 languages clearly show that techniques developed before the DL era, such as structural learning (global TreeCRF loss) and high-order modeling are still useful, and can further boost parsing performance over the state-of-the-art biaffine parser, especially for partially annotated training data.

Chinese Dependency Parsing Dependency Parsing

Minimally Supervised Categorization of Text with Metadata

1 code implementation1 May 2020 Yu Zhang, Yu Meng, Jiaxin Huang, Frank F. Xu, Xuan Wang, Jiawei Han

Then, based on the same generative process, we synthesize training samples to address the bottleneck of label scarcity.

Document Classification

A Large Scale Speech Sentiment Corpus

no code implementations LREC 2020 Eric Chen, Zhiyun Lu, Hao Xu, Liangliang Cao, Yu Zhang, James Fan

We present a multimodal corpus for sentiment analysis based on the existing Switchboard-1 Telephone Speech Corpus released by the Linguistic Data Consortium.

Sentiment Analysis

Partially-Typed NER Datasets Integration: Connecting Practice to Theory

no code implementations1 May 2020 Shi Zhi, Liyuan Liu, Yu Zhang, Shiyin Wang, Qi Li, Chao Zhang, Jiawei Han

While typical named entity recognition (NER) models require the training set to be annotated with all target types, each available dataset may only cover a part of them.

Named Entity Recognition NER

Learning an Adaptive Model for Extreme Low-light Raw Image Processing

1 code implementation22 Apr 2020 Qingxu Fu, Xiaoguang Di, Yu Zhang

Furthermore, those tests illustrate that the proposed method is able to adaptively control the global image brightness according to the content of the image scene.

Denoising Low-Light Image Enhancement +1

Order Matters: Generating Progressive Explanations for Planning Tasks in Human-Robot Teaming

no code implementations16 Apr 2020 Mehrdad Zakershahrak, Shashank Rao Marpally, Akshay Sharma, Ze Gong, Yu Zhang

Given this sequential process, a formulation based on goal-based MDP for generating progressive explanations is presented.

Decision Making

Learning Event-Based Motion Deblurring

no code implementations CVPR 2020 Zhe Jiang, Yu Zhang, Dongqing Zou, Jimmy Ren, Jiancheng Lv, Yebin Liu

Recovering a sharp video sequence from a motion-blurred image is highly ill-posed due to the significant loss of motion information in the blurring process.

Ranked #8 on Deblurring on GoPro (using extra training data)


Residual Attention U-Net for Automated Multi-Class Segmentation of COVID-19 Chest CT Images

no code implementations12 Apr 2020 Xiaocong Chen, Lina Yao, Yu Zhang

The novel coronavirus disease 2019 (COVID-19) has been spreading rapidly around the world and caused significant impact on the public health and economy.

Computed Tomography (CT)

Heterogeneous Network Representation Learning: A Unified Framework with Survey and Benchmark

1 code implementation1 Apr 2020 Carl Yang, Yuxin Xiao, Yu Zhang, Yizhou Sun, Jiawei Han

Since there has already been a broad body of HNE algorithms, as the first contribution of this work, we provide a generic paradigm for the systematic categorization and analysis over the merits of various existing HNE algorithms.

Network Embedding

A Streaming On-Device End-to-End Model Surpassing Server-Side Conventional Model Quality and Latency

no code implementations28 Mar 2020 Tara N. Sainath, Yanzhang He, Bo Li, Arun Narayanan, Ruoming Pang, Antoine Bruguier, Shuo-Yiin Chang, Wei Li, Raziel Alvarez, Zhifeng Chen, Chung-Cheng Chiu, David Garcia, Alex Gruenstein, Ke Hu, Minho Jin, Anjuli Kannan, Qiao Liang, Ian McGraw, Cal Peyser, Rohit Prabhavalkar, Golan Pundak, David Rybach, Yuan Shangguan, Yash Sheth, Trevor Strohman, Mirko Visontai, Yonghui Wu, Yu Zhang, Ding Zhao

Thus far, end-to-end (E2E) models have not been shown to outperform state-of-the-art conventional models with respect to both quality, i.e., word error rate (WER), and latency, i.e., the time the hypothesis is finalized after the user stops speaking.

CF2-Net: Coarse-to-Fine Fusion Convolutional Network for Breast Ultrasound Image Segmentation

no code implementations23 Mar 2020 Zhenyuan Ning, Ke Wang, Shengzhou Zhong, Qianjin Feng, Yu Zhang

Breast ultrasound (BUS) image segmentation plays a crucial role in a computer-aided diagnosis system, which is regarded as a useful tool to help increase the accuracy of breast cancer diagnosis.

Semantic Segmentation

Fisher Deep Domain Adaptation

1 code implementation12 Mar 2020 Yinghua Zhang, Yu Zhang, Ying WEI, Kun Bai, Yangqiu Song, Qiang Yang

Though the learned representations are separable in the source domain, they usually have a large variance and samples with different class labels tend to overlap in the target domain, which yields suboptimal adaptation performance.

Domain Adaptation

Is POS Tagging Necessary or Even Helpful for Neural Dependency Parsing?

no code implementations6 Mar 2020 Houquan Zhou, Yu Zhang, Zhenghua Li, Min Zhang

In the pre-deep-learning era, part-of-speech tags were considered indispensable ingredients for feature engineering in dependency parsing.

Dependency Parsing Feature Engineering +2

Defense-PointNet: Protecting PointNet Against Adversarial Attacks

no code implementations27 Feb 2020 Yu Zhang, Gongbo Liang, Tawfiq Salem, Nathan Jacobs

Despite remarkable performance across a broad range of tasks, neural networks have been shown to be vulnerable to adversarial attacks.

2D Convolutional Neural Networks for 3D Digital Breast Tomosynthesis Classification

no code implementations27 Feb 2020 Yu Zhang, Xiaoqin Wang, Hunter Blanton, Gongbo Liang, Xin Xing, Nathan Jacobs

Automated methods for breast cancer detection have focused on 2D mammography and have largely ignored 3D digital breast tomosynthesis (DBT), which is frequently used in clinical practice.

Breast Cancer Detection Classification +1

Attention-guided Chained Context Aggregation for Semantic Segmentation

3 code implementations27 Feb 2020 Quan Tang, Fagui Liu, Tong Zhang, Jun Jiang, Yu Zhang

The way features propagate in Fully Convolutional Networks is of momentous importance to capture multi-scale contexts for obtaining precise segmentation masks.

Semantic Segmentation

Self-supervised Image Enhancement Network: Training with Low Light Images Only

1 code implementation26 Feb 2020 Yu Zhang, Xiaoguang Di, Bin Zhang, Chunhui Wang

We introduce a constraint that the maximum channel of the reflectance conforms to the maximum channel of the low light image and its entropy should be largest in our model to achieve self-supervised learning.

Low-Light Image Enhancement Self-Supervised Learning

Learning Beam Codebooks with Neural Networks: Towards Environment-Aware mmWave MIMO

1 code implementation25 Feb 2020 Yu Zhang, Muhammad Alrabeiah, Ahmed Alkhateeb

This leads to high beam training overhead and loss in the achievable beamforming gains.

Information Theory Signal Processing Information Theory

CODAR: A Contextual Duration-Aware Qubit Mapping for Various NISQ Devices

1 code implementation24 Feb 2020 Haowei Deng, Yu Zhang, Quanxi Li

Quantum computing devices in the NISQ era share common features and challenges like limited connectivity between qubits.

Quantum Physics

High-Order Paired-ASPP Networks for Semantic Segmenation

no code implementations18 Feb 2020 Yu Zhang, Xin Sun, Junyu Dong, Changrui Chen, Yue Shen

The network first introduces a High-Order Representation module to extract the contextual high-order information from all stages of the backbone.

Semantic Segmentation

Deep Multi-Task Learning via Generalized Tensor Trace Norm

no code implementations12 Feb 2020 Yi Zhang, Yu Zhang, Wei Wang

The GTTN is defined as a convex combination of matrix trace norms of all possible tensor flattenings and hence it can discover all the possible low-rank structures.

Multi-Task Learning

MDLdroid: a ChainSGD-reduce Approach to Mobile Deep Learning for Personal Mobile Sensing

no code implementations7 Feb 2020 Yu Zhang, Tao Gu, Xi Zhang

Towards pushing deep learning on devices, we present MDLdroid, a novel decentralized mobile deep learning framework to enable resource-aware on-device collaborative learning for personal mobile sensing applications.

Federated Learning Multi-Goal Reinforcement Learning

Fully-hierarchical fine-grained prosody modeling for interpretable speech synthesis

no code implementations6 Feb 2020 Guangzhi Sun, Yu Zhang, Ron J. Weiss, Yuan Cao, Heiga Zen, Yonghui Wu

This paper proposes a hierarchical, fine-grained and interpretable latent variable model for prosody based on the Tacotron 2 text-to-speech model.

Speech Synthesis

SpecAugment on Large Scale Datasets

no code implementations11 Dec 2019 Daniel S. Park, Yu Zhang, Chung-Cheng Chiu, Youzheng Chen, Bo Li, William Chan, Quoc V. Le, Yonghui Wu

Recently, SpecAugment, an augmentation scheme for automatic speech recognition that acts directly on the spectrogram of input utterances, has been shown to be highly effective in enhancing the performance of end-to-end networks on public datasets.

automatic-speech-recognition Speech Recognition

Speech Sentiment Analysis via Pre-trained Features from End-to-end ASR Models

no code implementations21 Nov 2019 Zhiyun Lu, Liangliang Cao, Yu Zhang, Chung-Cheng Chiu, James Fan

In this paper, we propose to use pre-trained features from end-to-end ASR models to solve speech sentiment analysis as a down-stream task.

End-To-End Speech Recognition Sentiment Analysis

FeCaffe: FPGA-enabled Caffe with OpenCL for Deep Learning Training and Inference on Intel Stratix 10

no code implementations18 Nov 2019 Ke He, Bo Liu, Yu Zhang, Andrew Ling, Dian Gu

In this paper, we first propose FeCaffe, i.e., FPGA-enabled Caffe, a hierarchical software and hardware design methodology based on Caffe that enables FPGAs to support mainline deep learning development features, e.g., training and inference with Caffe.

Scale- and Context-Aware Convolutional Non-intrusive Load Monitoring

no code implementations17 Nov 2019 Kunjin Chen, Yu Zhang, Qin Wang, Jun Hu, Hang Fan, Jinliang He

Non-intrusive load monitoring addresses the challenging task of decomposing the aggregate signal of a household's electricity consumption into appliance-level data without installing dedicated meters.

Non-Intrusive Load Monitoring

Solving Optimization Problems through Fully Convolutional Networks: an Application to the Travelling Salesman Problem

no code implementations27 Oct 2019 Zhengxuan Ling, Xinyu Tao, Yu Zhang, Xi Chen

Based on samples of a 10 city TSP, a fully convolutional network (FCN) is used to learn the mapping from a feasible region to an optimal solution.

Traveling Salesman Problem

ESPnet-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Text-to-Speech Toolkit

3 code implementations24 Oct 2019 Tomoki Hayashi, Ryuichi Yamamoto, Katsuki Inoue, Takenori Yoshimura, Shinji Watanabe, Tomoki Toda, Kazuya Takeda, Yu Zhang, Xu Tan

Furthermore, the unified design enables the integration of ASR functions with TTS, e.g., ASR-based objective evaluation and semi-supervised learning with both ASR and TTS models.

automatic-speech-recognition Speech Recognition

Deep Learning for Massive MIMO with 1-Bit ADCs: When More Antennas Need Fewer Pilots

1 code implementation15 Oct 2019 Yu Zhang, Muhammad Alrabeiah, Ahmed Alkhateeb

This leads to the interesting, and counter-intuitive, observation that when more antennas are employed by the massive MIMO base station, our proposed deep learning approach achieves better channel estimation performance, for the same pilot sequence length.

Information Theory Signal Processing Information Theory

End-to-End Multi-View Fusion for 3D Object Detection in LiDAR Point Clouds

no code implementations15 Oct 2019 Yin Zhou, Pei Sun, Yu Zhang, Dragomir Anguelov, Jiyang Gao, Tom Ouyang, James Guo, Jiquan Ngiam, Vijay Vasudevan

In this paper, we aim to synergize the birds-eye view and the perspective view and propose a novel end-to-end multi-view fusion (MVF) algorithm, which can effectively learn to utilize the complementary information from both.

3D Object Detection

Knowledge Distillation from Internal Representations

no code implementations8 Oct 2019 Gustavo Aguilar, Yuan Ling, Yu Zhang, Benjamin Yao, Xing Fan, Chenlei Guo

In this paper, we propose to distill the internal representations of a large model such as BERT into a simplified version of it.

Knowledge Distillation

Speech Recognition with Augmented Synthesized Speech

no code implementations25 Sep 2019 Andrew Rosenberg, Yu Zhang, Bhuvana Ramabhadran, Ye Jia, Pedro Moreno, Yonghui Wu, Zelin Wu

Recent success of the Tacotron speech synthesis architecture and its variants in producing natural sounding multi-speaker synthesized speech has raised the exciting possibility of replacing expensive, manually transcribed, domain-specific, human speech that is used to train speech recognizers.

Data Augmentation Robust Speech Recognition +1

Adversarial Representation Learning for Robust Patient-Independent Epileptic Seizure Detection

1 code implementation18 Sep 2019 Xiang Zhang, Lina Yao, Manqing Dong, Zhe Liu, Yu Zhang, Yong Li

Furthermore, to enhance the explainability, we develop an attention mechanism to automatically learn the importance of each EEG channel in the seizure diagnosis procedure.

EEG Feature Engineering +2

Gradient Q$(σ, λ)$: A Unified Algorithm with Function Approximation for Reinforcement Learning

no code implementations6 Sep 2019 Long Yang, Yu Zhang, Qian Zheng, Pengfei Li, Gang Pan

To address the above problem, we propose GQ$(\sigma,\lambda)$, which extends tabular Q$(\sigma,\lambda)$ with linear function approximation.


Heterogeneous Domain Adaptation via Soft Transfer Network

no code implementations28 Aug 2019 Yuan Yao, Yu Zhang, Xutao Li, Yunming Ye

Heterogeneous domain adaptation (HDA) aims to facilitate the learning task in a target domain by borrowing knowledge from a heterogeneous source domain.

Domain Adaptation

Multi-Spectral Visual Odometry without Explicit Stereo Matching

no code implementations23 Aug 2019 Weichen Dai, Yu Zhang, Donglei Sun, Naira Hovakimyan, Ping Li

Moreover, the proposed method can also provide a metric 3D reconstruction in semi-dense density with multi-spectral information, which is not available from existing multi-spectral methods.

3D Reconstruction Stereo Matching +2

Discriminative Topic Mining via Category-Name Guided Text Embedding

1 code implementation20 Aug 2019 Yu Meng, Jiaxin Huang, Guangyuan Wang, Zihan Wang, Chao Zhang, Yu Zhang, Jiawei Han

We propose a new task, discriminative topic mining, which leverages a set of user-provided category names to mine discriminative topics from text corpora.

Document Classification General Classification +3

Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning

1 code implementation9 Jul 2019 Yu Zhang, Ron J. Weiss, Heiga Zen, Yonghui Wu, Zhifeng Chen, RJ Skerry-Ryan, Ye Jia, Andrew Rosenberg, Bhuvana Ramabhadran

We present a multispeaker, multilingual text-to-speech (TTS) synthesis model based on Tacotron that is able to produce high quality speech in multiple languages.

Speech Synthesis

Expected Sarsa($λ$) with Control Variate for Variance Reduction

no code implementations25 Jun 2019 Long Yang, Yu Zhang, Jun Wen, Qian Zheng, Pengfei Li, Gang Pan

In this paper, for reducing the variance, we introduce control variate technique to $\mathtt{Expected}$ $\mathtt{Sarsa}$($\lambda$) and propose a tabular $\mathtt{ES}$($\lambda$)-$\mathtt{CV}$ algorithm.

Brain Network Construction and Classification Toolbox (BrainNetClass)

1 code implementation17 Jun 2019 Zhen Zhou, Xiaobo Chen, Yu Zhang, Lishan Qiao, Renping Yu, Gang Pan, Han Zhang, Dinggang Shen

The goal of this work is to introduce a toolbox namely "Brain Network Construction and Classification" (BrainNetClass) to the field to promote more advanced brain network construction methods.

Classification General Classification

Evidence for $Z_{c}^{\pm}$ decays into the $ρ^{\pm} η_{c}$ final state

no code implementations3 Jun 2019 M. Ablikim, M. N. Achasov, S. Ahmed, M. Albrecht, M. Alekseev, A. Amoroso, F. F. An, Q. An, Y. Bai, O. Bakina, R. Baldini Ferroli, Y. Ban, K. Begzsuren, D. W. Bennett, J. V. Bennett, N. Berger, M. Bertani, D. Bettoni, F. Bianchi, E. Boger, I. Boyko, R. A. Briere, H. Cai, X. Cai, A. Calcaterra, G. F. Cao, S. A. Cetin, J. Chai, J. F. Chang, W. L. Chang, G. Chelkov, G. Chen, H. S. Chen, J. C. Chen, M. L. Chen, P. L. Chen, S. J. Chen, X. R. Chen, Y. B. Chen, W. Cheng, X. K. Chu, G. Cibinetto, F. Cossio, H. L. Dai, J. P. Dai, A. Dbeyssi, D. Dedovich, Z. Y. Deng, A. Denig, I. Denysenko, M. Destefanis, F. DeMori, Y. Ding, C. Dong, J. Dong, L. Y. Dong, M. Y. Dong, Z. L. Dou, S. X. Du, P. F. Duan, J. Fang, S. S. Fang, Y. Fang, R. Farinelli, L. Fava, F. Feldbauer, G. Felici, C. Q. Feng, M. Fritsch, C. D. Fu, Q. Gao, X. L. Gao, Y. Gao, Y. G. Gao, Z. Gao, B. Garillon, I. Garzia, A. Gilman, K. Goetzen, L. Gong, W. X. Gong, W. Gradl, M. Greco, L. M. Gu, M. H. Gu, Y. T. Gu, A. Q. Guo, L. B. Guo, R. P. Guo, Y. P. Guo, A. Guskov, Z. Haddadi, S. Han, X. Q. Hao, F. A. Harris, K. L. He, F. H. Heinsius, T. Held, Y. K. Heng, Z. L. Hou, H. M. Hu, J. F. Hu, T. Hu, Y. Hu, G. S. Huang, J. S. Huang, X. T. Huang, X. Z. Huang, Z. L. Huang, T. Hussain, W. Ikegami Andersson, M. Irshad, Q. Ji, Q. P. Ji, X. B. Ji, X. L. Ji, H. L. Jiang, X. S. Jiang, X. Y. Jiang, J. B. Jiao, Z. Jiao, D. P. Jin, S. Jin, Y. Jin, T. Johansson, A. Julin, N. Kalantar-Nayestanaki, X. S. Kang, M. Kavatsyuk, B. C. Ke, I. K. Keshk, T. Khan, A. Khoukaz, P. Kiese, R. Kiuchi, R. Kliemt, L. Koch, O. B. Kolcu, B. Kopf, M. Kuemmel, M. Kuessner, A. Kupsc, M. Kurth, W. Kühn, J. S. Lange, P. Larin, L. Lavezzi, S. Leiber, H. Leithoff, C. Li, Cheng Li, D. M. Li, F. Li, F. Y. Li, G. Li, H. B. Li, H. J. Li, J. C. Li, J. W. Li, K. J. Li, Kang Li, Ke Li, Lei LI, P. L. Li, P. R. Li, Q. Y. Li, T. Li, W. D. Li, W. G. Li, X. L. Li, X. N. Li, X. Q. Li, Z. B. Li, H. Liang, Y. F. Liang, Y. T. Liang, G. R. 
Liao, L. Z. Liao, J. Libby, C. X. Lin, D. X. Lin, B. Liu, B. J. Liu