Search Results for author: Yu Zhang

Found 390 papers, 107 papers with code

All Information is Valuable: Question Matching over Full Information Transmission Network

no code implementations Findings (NAACL) 2022 Le Qi, Yu Zhang, Qingyu Yin, Guidong Zheng, Wen Junjie, Jinlong Li, Ting Liu

In this process, there are two kinds of critical information that are commonly employed: the representation information of original questions and the interactive information between pairs of questions.

Learn to Cross-lingual Transfer with Meta Graph Learning Across Heterogeneous Languages

no code implementations EMNLP 2020 Zheng Li, Mukul Kumar, William Headden, Bing Yin, Ying WEI, Yu Zhang, Qiang Yang

The recent emergence of multilingual pre-training language models (mPLM) has enabled breakthroughs on various downstream cross-lingual transfer (CLT) tasks.

Cross-Lingual Transfer Graph Learning +1

Joint Goal Segmentation and Goal Success Prediction on Multi-Domain Conversations

no code implementations COLING 2022 Meiguo Wang, Benjamin Yao, Bin Guo, Xiaohu Liu, Yu Zhang, Tuan-Hung Pham, Chenlei Guo

To evaluate the performance of a multi-domain goal-oriented Dialogue System (DS), it is important to understand what the users’ goals are for the conversations and whether those goals are successfully achieved.

Dialogue Evaluation Multi-Task Learning

DuReader_vis: A Chinese Dataset for Open-domain Document Visual Question Answering

1 code implementation Findings (ACL) 2022 Le Qi, Shangwen Lv, Hongyu Li, Jing Liu, Yu Zhang, Qiaoqiao She, Hua Wu, Haifeng Wang, Ting Liu

Open-domain question answering has been used in a wide range of applications, such as web search and enterprise search, which usually takes clean texts extracted from various formats of documents (e.g., web pages, PDFs, or Word documents) as the information source.

Open-Domain Question Answering Visual Question Answering

Learning to See in the Dark with Events

no code implementations ECCV 2020 Song Zhang, Yu Zhang, Zhe Jiang, Dongqing Zou, Jimmy Ren, Bin Zhou

A detail-enhancing branch is proposed to reconstruct daylight-specific features from the domain-invariant representations in a residual manner, regularized by a ranking loss.

Representation Learning Unsupervised Domain Adaptation

A Coarse-to-Fine Labeling Framework for Joint Word Segmentation, POS Tagging, and Constituent Parsing

1 code implementation CoNLL (EMNLP) 2021 Yang Hou, Houquan Zhou, Zhenghua Li, Yu Zhang, Min Zhang, Zhefeng Wang, Baoxing Huai, Nicholas Jing Yuan

In the coarse labeling stage, the joint model outputs a bracketed tree, in which each node corresponds to one of four labels (i.e., phrase, subphrase, word, subword).

Part-Of-Speech Tagging POS +1

Physics-guided Residual Learning for Probabilistic Power Flow Analysis

no code implementations 28 Jan 2023 Kejun Chen, Yu Zhang

In addition, based on our proposed framework, we design three methods to initialize the weights of the shortcut connection layer according to the physical characteristics of AC-PF equations.

MedSegDiff-V2: Diffusion based Medical Image Segmentation with Transformer

1 code implementation 19 Jan 2023 Junde Wu, Rao Fu, Huihui Fang, Yu Zhang, Yanwu Xu

This architectural improvement leads to a new diffusion-based medical image segmentation method called MedSegDiff-V2, which significantly improves the performance of MedSegDiff.

Image Generation Image Segmentation +2

From English to More Languages: Parameter-Efficient Model Reprogramming for Cross-Lingual Speech Recognition

no code implementations 19 Jan 2023 Chao-Han Huck Yang, Bo Li, Yu Zhang, Nanxin Chen, Rohit Prabhavalkar, Tara N. Sainath, Trevor Strohman

In this work, we propose a new parameter-efficient learning framework based on neural model reprogramming for cross-lingual speech recognition, which can re-purpose well-trained English automatic speech recognition (ASR) models to recognize other languages.

Automatic Speech Recognition speech-recognition

Super-Resolution Harmonic Retrieval of Non-Circular Signals

no code implementations 17 Jan 2023 Yu Zhang, Yue Wang, Zhi Tian, Geert Leus, Gong Zhang

This paper proposes a super-resolution harmonic retrieval method for uncorrelated strictly non-circular signals, whose covariance and pseudo-covariance present Toeplitz and Hankel structures, respectively.

Retrieval Super-Resolution

OA-BEV: Bringing Object Awareness to Bird's-Eye-View Representation for Multi-Camera 3D Object Detection

no code implementations 13 Jan 2023 Xiaomeng Chu, Jiajun Deng, Yuan Zhao, Jianmin Ji, Yu Zhang, Houqiang Li, Yanyong Zhang

To this end, we propose OA-BEV, a network that can be plugged into the BEV-based 3D object detection framework to bring out the objects by incorporating object-aware pseudo-3D features and depth features.

3D Object Detection object-detection

Learning Trajectory-Word Alignments for Video-Language Tasks

no code implementations 5 Jan 2023 Xu Yang, Zhangzikang Li, Haiyang Xu, Hanwang Zhang, Qinghao Ye, Chenliang Li, Ming Yan, Yu Zhang, Fei Huang, Songfang Huang

Besides T2W attention, we also follow previous VDL-BERTs to set a word-to-patch (W2P) attention in the cross-modal encoder.

Question Answering Retrieval +4

Adaptively Clustering Neighbor Elements for Image Captioning

no code implementations 5 Jan 2023 Zihua Wang, Xu Yang, Haiyang Xu, Hanwang Zhang, Chenliang Li, Songfang Huang, Fei Huang, Yu Zhang

We design a novel global-local Transformer named Ada-ClustFormer (ACF) to generate captions.

Image Captioning

Dynamic Sparse Network for Time Series Classification: Learning What to "See"

1 code implementation 19 Dec 2022 Qiao Xiao, Boqian Wu, Yu Zhang, Shiwei Liu, Mykola Pechenizkiy, Elena Mocanu, Decebal Constantin Mocanu

The receptive field (RF), which determines the region of the time series to be "seen" and used, is critical to improving the performance of time series classification (TSC).

Time Series Classification

Mu$^{2}$SLAM: Multitask, Multilingual Speech and Language Models

no code implementations 19 Dec 2022 Yong Cheng, Yu Zhang, Melvin Johnson, Wolfgang Macherey, Ankur Bapna

We present Mu$^{2}$SLAM, a multilingual sequence-to-sequence model pre-trained jointly on unlabeled speech, unlabeled text and supervised data spanning Automatic Speech Recognition (ASR), Automatic Speech Translation (AST) and Machine Translation (MT), in over 100 languages.

Automatic Speech Recognition Denoising +5

Effective Seed-Guided Topic Discovery by Integrating Multiple Types of Contexts

1 code implementation 12 Dec 2022 Yu Zhang, Yunyi Zhang, Martin Michalski, Yucheng Jiang, Yu Meng, Jiawei Han

Instead of mining coherent topics from a given text corpus in a completely unsupervised manner, seed-guided topic discovery methods leverage user-provided seed words to extract distinctive and coherent topics so that the mined topics can better cater to the user's interest.

Language Modelling Word Embeddings

Unsupervised Deep Learning for AC Optimal Power Flow via Lagrangian Duality

no code implementations 7 Dec 2022 Kejun Chen, Shourya Bose, Yu Zhang

Non-convex AC optimal power flow (AC-OPF) is a fundamental optimization problem in power system analysis.

Entity Set Co-Expansion in StackOverflow

no code implementations 5 Dec 2022 Yu Zhang, Yunyi Zhang, Yucheng Jiang, Martin Michalski, Yu Deng, Lucian Popa, ChengXiang Zhai, Jiawei Han

Given a few seed entities of a certain type (e.g., Software or Programming Language), entity set expansion aims to discover an extensive set of entities that share the same type as the seeds.

graph construction Management

Feature Aggregation and Propagation Network for Camouflaged Object Detection

1 code implementation 2 Dec 2022 Tao Zhou, Yi Zhou, Chen Gong, Jian Yang, Yu Zhang

In this paper, we propose a novel Feature Aggregation and Propagation Network (FAP-Net) for camouflaged object detection.

object-detection Object Detection

TSGP: Two-Stage Generative Prompting for Unsupervised Commonsense Question Answering

no code implementations 24 Nov 2022 Yueqing Sun, Yu Zhang, Le Qi, Qi Shi

In this paper, we aim to address the above limitation by leveraging the implicit knowledge stored in PrLMs and propose a two-stage prompt-based unsupervised commonsense question answering framework (TSGP).

Answer Generation Question Answering

Leveraging per Image-Token Consistency for Vision-Language Pre-training

no code implementations 20 Nov 2022 Yunhao Gou, Tom Ko, Hansi Yang, James Kwok, Yu Zhang, Mingxuan Wang

(2) Under-utilization of the unmasked tokens: CMLM primarily focuses on the masked token but it cannot simultaneously leverage other tokens to learn vision-language associations.

Language Modelling Masked Language Modeling

Disentangling Task Relations for Few-shot Text Classification via Self-Supervised Hierarchical Task Clustering

no code implementations 16 Nov 2022 Juan Zha, Zheng Li, Ying WEI, Yu Zhang

However, most prior works assume that all the tasks are sampled from a single data source, which cannot adapt to real-world scenarios where tasks are heterogeneous and lie in different distributions.

Few-Shot Text Classification text-classification +1

TLP: A Deep Learning-based Cost Model for Tensor Program Tuning

1 code implementation 7 Nov 2022 Yi Zhai, Yu Zhang, Shuo Liu, Xiaomeng Chu, Jie Peng, Jianmin Ji, Yanyong Zhang

Instead of extracting features from the tensor program itself, TLP extracts features from the schedule primitives.

Multi-Task Learning

Tuning Language Models as Training Data Generators for Augmentation-Enhanced Few-Shot Learning

1 code implementation 6 Nov 2022 Yu Meng, Martin Michalski, Jiaxin Huang, Yu Zhang, Tarek Abdelzaher, Jiawei Han

In this work, we study few-shot learning with PLMs from a different perspective: We first tune an autoregressive PLM on the few-shot samples and then use it as a generator to synthesize a large amount of novel training samples which augment the original training set.

Few-Shot Learning Pretrained Language Models

A Quantum Kernel Learning Approach to Acoustic Modeling for Spoken Command Recognition

no code implementations 2 Nov 2022 Chao-Han Huck Yang, Bo Li, Yu Zhang, Nanxin Chen, Tara N. Sainath, Sabato Marco Siniscalchi, Chin-Hui Lee

We propose a quantum kernel learning (QKL) framework to address the inherent data sparsity issues often encountered in training large-scale acoustic models in low-resource scenarios.

Spoken Command Recognition

Max Markov Chain

no code implementations 2 Nov 2022 Yu Zhang, Mitchell Bucklew

In this paper, we introduce Max Markov Chain (MMC), a novel representation for a useful subset of High-order Markov Chains (HMCs) with sparse correlations among the states.

MedSegDiff: Medical Image Segmentation with Diffusion Probabilistic Model

1 code implementation 1 Nov 2022 Junde Wu, Rao Fu, Huihui Fang, Yu Zhang, Yehui Yang, Haoyi Xiong, Huiying Liu, Yanwu Xu

Inspired by the success of DPM, we propose the first DPM based model toward general medical image segmentation tasks, which we named MedSegDiff.

Anomaly Detection Brain Tumor Segmentation +7

Modular Hybrid Autoregressive Transducer

no code implementations 31 Oct 2022 Zhong Meng, Tongzhou Chen, Rohit Prabhavalkar, Yu Zhang, Gary Wang, Kartik Audhkhasi, Jesse Emond, Trevor Strohman, Bhuvana Ramabhadran, W. Ronny Huang, Ehsan Variani, Yinghui Huang, Pedro J. Moreno

In this work, we propose a modular hybrid autoregressive transducer (MHAT) that has structurally separated label and blank decoders to predict label and blank distributions, respectively, along with a shared acoustic encoder.

Language Modelling speech-recognition +1

Accelerating RNN-T Training and Inference Using CTC guidance

no code implementations 29 Oct 2022 Yongqiang Wang, Zhehuai Chen, Chengjian Zheng, Yu Zhang, Wei Han, Parisa Haghani

We propose a novel method to accelerate the training and inference of the recurrent neural network transducer (RNN-T) based on guidance from a co-trained connectionist temporal classification (CTC) model.

Residual Adapters for Few-Shot Text-to-Speech Speaker Adaptation

no code implementations 28 Oct 2022 Nobuyuki Morioka, Heiga Zen, Nanxin Chen, Yu Zhang, Yifan Ding

Adapting a neural text-to-speech (TTS) model to a target speaker typically involves fine-tuning most if not all of the parameters of a pretrained multi-speaker backbone model.

Personalized Dialogue Generation with Persona-Adaptive Attention

1 code implementation 27 Oct 2022 Qiushi Huang, Yu Zhang, Tom Ko, Xubo Liu, Bo Wu, Wenwu Wang, Lilian Tang

Persona-based dialogue systems aim to generate consistent responses based on historical context and predefined persona.

Dialogue Generation

Maestro-U: Leveraging joint speech-text representation learning for zero supervised speech ASR

no code implementations 18 Oct 2022 Zhehuai Chen, Ankur Bapna, Andrew Rosenberg, Yu Zhang, Bhuvana Ramabhadran, Pedro Moreno, Nanxin Chen

First, we show that by combining speech representations with byte-level text representations and use of language embeddings, we can dramatically reduce the Character Error Rate (CER) on languages with no supervised speech from 64.8% to 30.8%, a relative reduction of 53%.

Representation Learning speech-recognition +2

JOIST: A Joint Speech and Text Streaming Model For ASR

no code implementations 13 Oct 2022 Tara N. Sainath, Rohit Prabhavalkar, Ankur Bapna, Yu Zhang, Zhouyuan Huo, Zhehuai Chen, Bo Li, Weiran Wang, Trevor Strohman

In addition, we explore JOIST using a streaming E2E model with an order of magnitude more data, both of which are novel compared to previous works.

Comparison of Soft and Hard Target RNN-T Distillation for Large-scale ASR

no code implementations 11 Oct 2022 Dongseong Hwang, Khe Chai Sim, Yu Zhang, Trevor Strohman

Knowledge distillation is an effective machine learning technique to transfer knowledge from a teacher model to a smaller student model, especially with unlabeled data.

Automatic Speech Recognition Knowledge Distillation +1

On Clustering Trend in Language Evolution Based on Dynamical Behaviors of Multi-Agent Model

no code implementations 3 Oct 2022 Yu Zhang, Li Liu, Chen Diao, Ning Cai

Computer models have been extensively adopted to overcome the time limitation of language evolution by transforming language theory into physical modeling mechanisms, which helps to explore the general laws of the evolution.

Fault Detection Scheme for Grid-Forming Inverters in Islanded Droop-Controlled AC Microgrids

no code implementations 26 Sep 2022 Gabriel Intriago, Andres Intriago, Raul Intriago, Yu Zhang

An observer-based fault detection scheme for grid-forming inverters operating in islanded droop-controlled AC microgrids is proposed.

Fault Detection

Spatiotemporal Multi-scale Bilateral Motion Network for Gait Recognition

no code implementations 26 Sep 2022 Xinnan Ding, Shan Du, Yu Zhang, Kejun Wang

The critical goal of gait recognition is to acquire the inter-frame walking habit representation from the gait sequences.

Gait Recognition Optical Flow Estimation

How Good Is Neural Combinatorial Optimization?

1 code implementation 22 Sep 2022 Shengcai Liu, Yu Zhang, Ke Tang, Xin Yao

Traditional solvers for tackling combinatorial optimization (CO) problems are usually designed by human experts.

Combinatorial Optimization Traveling Salesman Problem

Discrete Linear Canonical Transform on Graphs

no code implementations 21 Sep 2022 Yu Zhang, Bing-Zhao Li

In this paper, we propose and design the definition of the discrete linear canonical transform on graphs (GLCT), which is an extension of the discrete linear canonical transform (DLCT), just as the graph Fourier transform (GFT) is an extension of the discrete Fourier transform (DFT).

Online Beam Learning with Interference Nulling for Millimeter Wave MIMO Systems

no code implementations 9 Sep 2022 Yu Zhang, Tawfik Osman, Ahmed Alkhateeb

Furthermore, a hardware proof-of-concept prototype based on mmWave phased arrays is built and used to implement and evaluate the developed online beam learning solutions in realistic scenarios.

Exploiting Deep Reinforcement Learning for Edge Caching in Cell-Free Massive MIMO Systems

no code implementations 26 Aug 2022 Yu Zhang, Shuaifei Chen, Jiayi Zhang

Cell-free massive multiple-input-multiple-output is promising to meet the stringent quality-of-experience (QoE) requirements of railway wireless communications by coordinating many successional access points (APs) to serve the onboard users coherently.

reinforcement-learning reinforcement Learning

Local Low-Rank Approximation With Superpixel-Guided Locality Preserving Graph for Hyperspectral Image Classification

1 code implementation journal 2022 Shujun Yang, Yu Zhang, Yuheng Jia, Weijia Zhang

By taking advantage of the local manifold structure, a Laplacian graph is constructed from the superpixels to ensure that a typical pixel should be similar to its neighbors within the same superpixel.

Hyperspectral Image Classification Superpixels

An Adaptive Repeated-Intersection-Reduction Local Search for the Maximum Independent Set Problem

no code implementations 16 Aug 2022 Enqiang Zhu, Yu Zhang, Chanjuan Liu

The maximum independent set (MIS) problem, a classical NP-hard problem with extensive applications in various areas, aims to find the largest set of vertices with no edge among them.
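
As a reading aid for the problem statement above, here is a minimal sketch of what "a set of vertices with no edge among them" means. It is an exhaustive search for tiny graphs only and is not the adaptive local search the paper proposes; the function names and the example path graph are illustrative assumptions.

```python
from itertools import combinations


def is_independent(vertices, edges):
    """Return True if no edge has both endpoints inside `vertices`."""
    chosen = set(vertices)
    return not any(u in chosen and v in chosen for u, v in edges)


def maximum_independent_set(num_vertices, edges):
    """Exhaustive search, largest subsets first (exponential time)."""
    for size in range(num_vertices, -1, -1):
        for subset in combinations(range(num_vertices), size):
            if is_independent(subset, edges):
                return set(subset)
    return set()


# Hypothetical example: the path graph 0 - 1 - 2 - 3.
path_edges = [(0, 1), (1, 2), (2, 3)]
best = maximum_independent_set(4, path_edges)
```

Because the problem is NP-hard, a search like this is only practical for very small graphs, which is the motivation for heuristic approaches such as local search.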

See What You See: Self-supervised Cross-modal Retrieval of Visual Stimuli from Brain Activity

no code implementations 7 Aug 2022 Zesheng Ye, Lina Yao, Yu Zhang, Sylvia Gustin

Recent studies demonstrate the use of a two-stage supervised framework to generate images that depict human perception of visual stimuli from EEG, referred to as EEG-visual reconstruction.

Cross-Modal Retrieval EEG +1

CROLoss: Towards a Customizable Loss for Retrieval Models in Recommender Systems

1 code implementation 5 Aug 2022 Yongxiang Tang, Wentao Bai, Guilin Li, Xialong Liu, Yu Zhang

In this paper, we propose the Customizable Recall@N Optimization Loss (CROLoss), a loss function that can directly optimize the Recall@N metrics and is customizable for different choices of N. The proposed CROLoss formulation defines a more generalized loss function space, covering most of the conventional loss functions as special cases.

Recommendation Systems Retrieval

An Efficient Person Clustering Algorithm for Open Checkout-free Groceries

1 code implementation 5 Aug 2022 Junde Wu, Yu Zhang, Rao Fu, Yuanpei Liu, Jing Gao

Then, to ensure that the method adapts to the dynamic and unseen person flow, we propose a Graph Convolutional Network (GCN) with a simple Nearest Neighbor (NN) strategy to accurately cluster the instances of CSG.

A Study of Modeling Rising Intonation in Cantonese Neural Speech Synthesis

no code implementations 3 Aug 2022 Qibing Bai, Tom Ko, Yu Zhang

In human speech, the attitude of a speaker cannot be fully expressed only by the textual content.

Speech Synthesis

Dense Cross-Query-and-Support Attention Weighted Mask Aggregation for Few-Shot Segmentation

1 code implementation 18 Jul 2022 Xinyu Shi, Dong Wei, Yu Zhang, Donghuan Lu, Munan Ning, Jiashun Chen, Kai Ma, Yefeng Zheng

A key to this challenging task is to fully utilize the information in the support images by exploiting fine-grained correlations between the query and support images.

Few-Shot Semantic Segmentation Semantic Segmentation

Structured Light with Redundancy Codes

no code implementations 18 Jun 2022 Zhanghao Sun, Yu Zhang, Yicheng Wu, Dong Huo, Yiming Qian, Jian Wang

We propose three applications using our redundancy codes: (1) Self error-correction for SL imaging under strong ambient light, (2) Error detection for adaptive reconstruction under global illumination, and (3) Interference filtering with device-specific projection sequence encoding, especially for event camera-based SL and light curtain devices.

Co-optimization of Battery Routing and Load Restoration for Microgrids with Mobile Energy Storage Systems

no code implementations 24 May 2022 Shourya Bose, Sifat Chowdhury, Yu Zhang

Mobile energy storage systems (MESS) offer great operational flexibility to enhance the resiliency of distribution systems in an emergency condition.

Heterformer: A Transformer Architecture for Node Representation Learning on Heterogeneous Text-Rich Networks

no code implementations 20 May 2022 Bowen Jin, Yu Zhang, Qi Zhu, Jiawei Han

We study node representation learning on heterogeneous text-rich networks, where nodes and edges are multi-typed and some types of nodes are associated with text information.

Graph Attention Link Prediction +5

Transferable Physical Attack against Object Detection with Separable Attention

no code implementations 19 May 2022 Yu Zhang, Zhiqiang Gong, Yichuang Zhang, YongQian Li, Kangcheng Bin, Jiahao Qi, Wei Xue, Ping Zhong

Transferable adversarial attack is always in the spotlight since deep learning models have been demonstrated to be vulnerable to adversarial samples.

Adversarial Attack object-detection +1

Side-aware Meta-Learning for Cross-Dataset Listener Diagnosis with Subjective Tinnitus

no code implementations 3 May 2022 Yun Li, Zhe Liu, Lina Yao, Molly Lucas, Jessica J. M. Monaghan, Yu Zhang

With the development of digital technology, machine learning has paved the way for the next generation of tinnitus diagnoses.

BIG-bench Machine Learning EEG +1

Variation-cognizant Probabilistic Power Flow Analysis via Multi-task Learning

no code implementations 2 May 2022 Kejun Chen, Yu Zhang

With the increasingly high penetration of solar photovoltaic generation in electric power grids, voltage phasors and branch power flows experience more severe fluctuations.

Multi-Task Learning regression

Differentially Private Load Restoration for Microgrids with Distributed Energy Storage

no code implementations 29 Apr 2022 Shourya Bose, Yu Zhang

Distributed energy storage systems (ESSs) can be efficiently leveraged for load restoration (LR) for a microgrid (MG) in island mode.

Interpretable Graph Convolutional Network of Multi-Modality Brain Imaging for Alzheimer's Disease Diagnosis

no code implementations 27 Apr 2022 Houliang Zhou, Lifang He, Yu Zhang, Li Shen, Brian Chen

Identification of brain regions related to specific neurological disorders is of great importance for biomarker and diagnostic studies.

Adversarial Filtering Modeling on Long-term User Behavior Sequences for Click-Through Rate Prediction

no code implementations 25 Apr 2022 Xiaochen Li, Rui Zhong, Jian Liang, Xialong Liu, Yu Zhang

Rich user behavior information is of great importance for capturing and understanding user interest in click-through rate (CTR) prediction.

Click-Through Rate Prediction

MAESTRO: Matched Speech Text Representations through Modality Matching

no code implementations 7 Apr 2022 Zhehuai Chen, Yu Zhang, Andrew Rosenberg, Bhuvana Ramabhadran, Pedro Moreno, Ankur Bapna, Heiga Zen

Self-supervised learning from speech signals aims to learn the latent structure inherent in the signal, while self-supervised learning from text attempts to capture lexical information.

Language Modelling Self-Supervised Learning +3

Unsupervised Data Selection via Discrete Speech Representation for ASR

no code implementations 5 Apr 2022 Zhiyun Lu, Yongqiang Wang, Yu Zhang, Wei Han, Zhehuai Chen, Parisa Haghani

Self-supervised learning of speech representations has achieved impressive results in improving automatic speech recognition (ASR).

Automatic Speech Recognition Self-Supervised Learning +1

Modern Views of Machine Learning for Precision Psychiatry

no code implementations 4 Apr 2022 Zhe Sage Chen, Prathamesh Kulkarni, Isaac R. Galatzer-Levy, Benedetta Bigio, Carla Nasca, Yu Zhang

In this paper, we provide a comprehensive review of the ML methodologies and applications by combining neuroimaging, neuromodulation, and advanced mobile technologies in psychiatry practice.

BIG-bench Machine Learning

Improving Distortion Robustness of Self-supervised Speech Processing Tasks with Domain Adaptation

no code implementations 30 Mar 2022 Kuan Po Huang, Yu-Kuan Fu, Yu Zhang, Hung-Yi Lee

Speech distortions are a long-standing problem that degrades the performance of speech processing models trained with supervision.

Domain Adaptation

LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT

1 code implementation 29 Mar 2022 Rui Wang, Qibing Bai, Junyi Ao, Long Zhou, Zhixiang Xiong, Zhihua Wei, Yu Zhang, Tom Ko, Haizhou Li

LightHuBERT outperforms the original HuBERT on ASR and five SUPERB tasks with the HuBERT size, achieves comparable performance to the teacher model in most tasks with 29% fewer parameters, and obtains a $3.5\times$ compression ratio in three SUPERB tasks, e.g., automatic speaker verification, keyword spotting, and intent classification, with a slight accuracy loss.

Automatic Speech Recognition intent-classification +5

AnoDFDNet: A Deep Feature Difference Network for Anomaly Detection

1 code implementation 29 Mar 2022 Zhixue Wang, Yu Zhang, Lin Luo, Nan Wang

This paper proposes a novel anomaly detection (AD) approach for high-speed train images based on convolutional neural networks and the Vision Transformer.

Anomaly Detection object-detection +1

LibMTL: A Python Library for Multi-Task Learning

1 code implementation 27 Mar 2022 Baijiong Lin, Yu Zhang

This paper presents LibMTL, an open-source Python library built on PyTorch, which provides a unified, comprehensive, reproducible, and extensible implementation framework for Multi-Task Learning (MTL).

Multi-Task Learning

OneLabeler: A Flexible System for Building Data Labeling Tools

1 code implementation 27 Mar 2022 Yu Zhang, Yun Wang, Haidong Zhang, Bin Zhu, Siming Chen, Dongmei Zhang

In this paper, we propose a conceptual framework for data labeling and, based on that framework, OneLabeler, a system that supports easy building of labeling tools for diverse usage scenarios.

Contrastive Graph Learning for Population-based fMRI Classification

1 code implementation 26 Mar 2022 Xuesong Wang, Lina Yao, Islem Rekik, Yu Zhang

Nonetheless, existing contrastive methods generate resemblant pairs only on pixel-level features of 3D medical images, while the functional connectivity that reveals critical cognitive information is under-explored.

Classification Graph Learning +1

XTREME-S: Evaluating Cross-lingual Speech Representations

no code implementations 21 Mar 2022 Alexis Conneau, Ankur Bapna, Yu Zhang, Min Ma, Patrick von Platen, Anton Lozhkov, Colin Cherry, Ye Jia, Clara Rivera, Mihir Kale, Daan van Esch, Vera Axelrod, Simran Khanuja, Jonathan H. Clark, Orhan Firat, Michael Auli, Sebastian Ruder, Jason Riesa, Melvin Johnson

Covering 102 languages from 10+ language families, 3 different domains and 4 task families, XTREME-S aims to simplify multilingual speech representation evaluation, as well as catalyze research in "universal" speech representation learning.

Representation Learning Retrieval +4

Image Steganography based on Style Transfer

no code implementations 9 Mar 2022 Donghui Hu, Yu Zhang, Cong Yu, Jian Wang, Yaofei Wang

Image steganography is the art and science of using images as cover for covert communications.

Image Steganography Image Stylization +1

Ask2Mask: Guided Data Selection for Masked Speech Modeling

no code implementations 24 Feb 2022 Murali Karthick Baskar, Andrew Rosenberg, Bhuvana Ramabhadran, Yu Zhang, Pedro Moreno

They treat all unsupervised speech samples with equal weight, which hinders learning as not all samples have relevant information to learn meaningful representations.

Automatic Speech Recognition speech-recognition

Path-Aware Graph Attention for HD Maps in Motion Prediction

no code implementations 23 Feb 2022 Fang Da, Yu Zhang

The success of motion prediction for autonomous driving relies on integration of information from the HD maps.

Graph Attention Motion Forecasting +1

CUP: A Conservative Update Policy Algorithm for Safe Reinforcement Learning

1 code implementation 15 Feb 2022 Long Yang, Jiaming Ji, Juntao Dai, Yu Zhang, Pengfei Li, Gang Pan

Although using bounds as surrogate functions to design safe RL algorithms has appeared in some existing works, we develop them in at least three aspects: (i) We provide a rigorous theoretical analysis to extend the surrogate functions to generalized advantage estimator (GAE).

reinforcement-learning reinforcement Learning +2

Generating Training Data with Language Models: Towards Zero-Shot Language Understanding

1 code implementation 9 Feb 2022 Yu Meng, Jiaxin Huang, Yu Zhang, Jiawei Han

Pretrained language models (PLMs) have demonstrated remarkable performance in various natural language processing tasks: Unidirectional PLMs (e.g., GPT) are well known for their superior text generation capabilities; bidirectional PLMs (e.g., BERT) have been the prominent choice for natural language understanding (NLU) tasks.

Few-Shot Learning MNLI-m +6

Topic Discovery via Latent Space Clustering of Pretrained Language Model Representations

1 code implementation 9 Feb 2022 Yu Meng, Yunyi Zhang, Jiaxin Huang, Yu Zhang, Jiawei Han

Interestingly, there have not been standard approaches to deploy PLMs for topic discovery as better alternatives to topic models.

Language Modelling Pretrained Language Models +1

Self-supervised Learning with Random-projection Quantizer for Speech Recognition

no code implementations 3 Feb 2022 Chung-Cheng Chiu, James Qin, Yu Zhang, Jiahui Yu, Yonghui Wu

In particular the quantizer projects speech inputs with a randomly initialized matrix, and does a nearest-neighbor lookup in a randomly-initialized codebook.

Self-Supervised Learning speech-recognition +1
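
The two steps named in the snippet above (a random projection followed by a nearest-neighbor lookup in a random codebook) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the Gaussian initialization, the dimensions, and the squared Euclidean distance are all choices made here for illustration, and any further details from the paper (such as normalization) are left out.

```python
import random


def make_quantizer(input_dim, code_dim, codebook_size, seed=0):
    """Create a random projection matrix and a random codebook.

    Both are drawn once and then left unchanged (assumed Gaussian init).
    """
    rng = random.Random(seed)
    projection = [[rng.gauss(0.0, 1.0) for _ in range(input_dim)]
                  for _ in range(code_dim)]
    codebook = [[rng.gauss(0.0, 1.0) for _ in range(code_dim)]
                for _ in range(codebook_size)]
    return projection, codebook


def quantize(frame, projection, codebook):
    """Project one input frame and return the index of the nearest code."""
    projected = [sum(w * x for w, x in zip(row, frame)) for row in projection]

    def squared_distance(code):
        return sum((p - c) ** 2 for p, c in zip(projected, code))

    return min(range(len(codebook)),
               key=lambda i: squared_distance(codebook[i]))


# Hypothetical toy input: two 4-dimensional "speech frames".
projection, codebook = make_quantizer(input_dim=4, code_dim=3, codebook_size=8)
frames = [[0.1, -0.2, 0.3, 0.4], [1.0, 0.0, -1.0, 0.5]]
labels = [quantize(frame, projection, codebook) for frame in frames]
```

Each frame is mapped to a discrete index in the range 0 to 7, and because the projection and codebook are seeded and never updated, the same frame always receives the same index.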

mSLAM: Massively multilingual joint pre-training for speech and text

no code implementations 3 Feb 2022 Ankur Bapna, Colin Cherry, Yu Zhang, Ye Jia, Melvin Johnson, Yong Cheng, Simran Khanuja, Jason Riesa, Alexis Conneau

We present mSLAM, a multilingual Speech and LAnguage Model that learns cross-lingual cross-modal representations of speech and text by pre-training jointly on large amounts of unlabeled speech and text in multiple languages.

Ranked #1 on Spoken language identification on Fleurs (using extra training data)

intent-classification Intent Classification +4

LEMON: Language-Based Environment Manipulation via Execution-Guided Pre-training

no code implementations 20 Jan 2022 Qi Shi, Qian Liu, Bei Chen, Yu Zhang, Ting Liu, Jian-Guang Lou

In this work, we propose LEMON, a general framework for language-based environment manipulation tasks.

Language Modelling

Domain Adaptation via Bidirectional Cross-Attention Transformer

no code implementations 15 Jan 2022 Xiyu Wang, Pengxin Guo, Yu Zhang

Specifically, in BCAT, we design a weight-sharing quadruple-branch transformer with a bidirectional cross-attention mechanism to learn domain-invariant feature representations.

Domain Adaptation

AutoMine: An Unmanned Mine Dataset

no code implementations CVPR 2022 Yuchen Li, Zixuan Li, Siyu Teng, Yu Zhang, YuHang Zhou, Yuchang Zhu, Dongpu Cao, Bin Tian, Yunfeng Ai, Zhe XuanYuan, Long Chen

The main contributions of the AutoMine dataset are as follows: 1. The first autonomous driving dataset for perception and localization in mine scenarios.

Autonomous Driving

JointLK: Joint Reasoning with Language Models and Knowledge Graphs for Commonsense Question Answering

1 code implementation NAACL 2022 Yueqing Sun, Qi Shi, Le Qi, Yu Zhang

Specifically, JointLK performs joint reasoning between LM and GNN through a novel dense bidirectional attention module, in which each question token attends on KG nodes and each KG node attends on question tokens, and the two modal representations fuse and update mutually by multi-step interactions.

Knowledge Graphs Question Answering

Fast and Accurate End-to-End Span-based Semantic Role Labeling as Word-based Graph Parsing

1 code implementation COLING 2022 Shilin Zhou, Qingrong Xia, Zhenghua Li, Yu Zhang, Yu Hong, Min Zhang

Moreover, we propose a simple constrained Viterbi procedure to ensure the legality of the output graph according to the constraints of the SRL structure.

Chinese Word Segmentation named-entity-recognition +2

Multiple Interest and Fine Granularity Network for User Modeling

no code implementations 5 Dec 2021 Jiaxuan Xie, Jianxiong Wei, Qingsong Hua, Yu Zhang

User modeling plays a fundamental role in industrial recommender systems, in both the matching stage and the ranking stage, in terms of both the customer experience and business revenue.

Recommendation Systems

Predicting Axillary Lymph Node Metastasis in Early Breast Cancer Using Deep Learning on Primary Tumor Biopsy Slides

1 code implementation 4 Dec 2021 Feng Xu, Chuang Zhu, Wenqi Tang, Ying Wang, Yu Zhang, Jie Li, Hongchuan Jiang, Zhongyue Shi, Jun Liu, Mulan Jin

Conclusion: Our study provides a novel DL-based biomarker on primary tumor CNB slides to predict the metastatic status of ALN preoperatively for patients with EBC.

Multiple Instance Learning Specificity +1

Effective Meta-Regularization by Kernelized Proximal Regularization

no code implementations NeurIPS 2021 Weisen Jiang, James Kwok, Yu Zhang

We study the problem of meta-learning, which has proved to be advantageous to accelerate learning new tasks with a few samples.

Meta-Learning

VPFNet: Improving 3D Object Detection with Virtual Point based LiDAR and Stereo Data Fusion

no code implementations 29 Nov 2021 Hanqi Zhu, Jiajun Deng, Yu Zhang, Jianmin Ji, Qiuyu Mao, Houqiang Li, Yanyong Zhang

However, this approach often suffers from the mismatch between the resolution of point clouds and RGB images, leading to sub-optimal performance.

3D Object Detection Data Augmentation +1

Deep Safe Multi-Task Learning

no code implementations 20 Nov 2021 Zhixiong Yue, Feiyang Ye, Yu Zhang, Christy Liang, Ivor W. Tsang

We theoretically study the safeness of both learning strategies in the DSMTL model to show that the proposed methods can achieve some versions of safe multi-task learning.

Multi-Task Learning

Joint Unsupervised and Supervised Training for Multilingual ASR

no code implementations 15 Nov 2021 Junwen Bai, Bo Li, Yu Zhang, Ankur Bapna, Nikhil Siddhartha, Khe Chai Sim, Tara N. Sainath

Our average WER across all languages outperforms the average monolingual baseline by 33.3% and the state-of-the-art 2-stage XLSR by 32%.

Language Modelling Masked Language Modeling +3

RATE: Overcoming Noise and Sparsity of Textual Features in Real-Time Location Estimation

1 code implementation 12 Nov 2021 Yu Zhang, Wei Wei, Binxuan Huang, Kathleen M. Carley, Yan Zhang

Real-time location inference of social media users is fundamental to some spatial applications such as localized search and event detection.

Event Detection

MotifClass: Weakly Supervised Text Classification with Higher-order Metadata Information

1 code implementation 7 Nov 2021 Yu Zhang, Shweta Garg, Yu Meng, Xiusi Chen, Jiawei Han

We study the problem of weakly supervised text classification, which aims to classify text documents into a set of pre-defined categories with category surface names only and without any annotated training document provided.

Text Classification

Load Restoration in Islanded Microgrids: Formulation and Solution Strategies

no code implementations 3 Nov 2021 Shourya Bose, Yu Zhang

In this paper, we consider the problem of load restoration in a microgrid (MG) that is islanded from the upstream DS because of an extreme weather event.

Hybrid physics-based and data-driven modeling with calibrated uncertainty for lithium-ion battery degradation diagnosis and prognosis

no code implementations 25 Oct 2021 Jing Lin, Yu Zhang, Edwin Khoo

Advancing lithium-ion batteries (LIBs) in both design and usage is key to promoting electrification in the coming decades to mitigate human-caused climate change.

Transient Synchronization Stability Analysis of Wind Farms with MMC-HVDC Integration Under Offshore AC Grid Fault

no code implementations 25 Oct 2021 Yu Zhang, Chen Zhang, Renxin Yang, Jing Lyu, Li Liu, Xu Cai

MMC-HVDC connected offshore wind farms (OWFs) can suffer short-circuit faults (SCFs), yet their transient stability has not been well analysed.

Dynamic Feature Alignment for Semi-supervised Domain Adaptation

no code implementations 18 Oct 2021 Yu Zhang, Gongbo Liang, Nathan Jacobs

Most research on domain adaptation has focused on the purely unsupervised setting, where no labeled examples in the target domain are available.

Domain Adaptation

Region Semantically Aligned Network for Zero-Shot Learning

no code implementations 14 Oct 2021 Ziyang Wang, Yunhao Gou, Jingjing Li, Yu Zhang, Yang Yang

Zero-shot learning (ZSL) aims to recognize unseen classes based on the knowledge of seen classes.

Transfer Learning Zero-Shot Learning

SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language Processing

1 code implementation ACL 2022 Junyi Ao, Rui Wang, Long Zhou, Chengyi Wang, Shuo Ren, Yu Wu, Shujie Liu, Tom Ko, Qing Li, Yu Zhang, Zhihua Wei, Yao Qian, Jinyu Li, Furu Wei

Motivated by the success of T5 (Text-To-Text Transfer Transformer) in pre-trained natural language processing models, we propose a unified-modal SpeechT5 framework that explores the encoder-decoder pre-training for self-supervised speech/text representation learning.

Automatic Speech Recognition Quantization +6

Multi-View Self-Attention Based Transformer for Speaker Recognition

no code implementations 11 Oct 2021 Rui Wang, Junyi Ao, Long Zhou, Shujie Liu, Zhihua Wei, Tom Ko, Qing Li, Yu Zhang

In this work, we propose a novel multi-view self-attention mechanism and present an empirical study of different Transformer variants with or without the proposed attention mechanism for speaker recognition.

Speaker Recognition

Universal Paralinguistic Speech Representations Using Self-Supervised Conformers

no code implementations 9 Oct 2021 Joel Shor, Aren Jansen, Wei Han, Daniel Park, Yu Zhang

Many speech applications require understanding aspects beyond the words being spoken, such as recognizing emotion, detecting whether the speaker is wearing a mask, or distinguishing real from synthetic speech.

Improving Confidence Estimation on Out-of-Domain Data for End-to-End Speech Recognition

no code implementations 7 Oct 2021 Qiujia Li, Yu Zhang, David Qiu, Yanzhang He, Liangliang Cao, Philip C. Woodland

As end-to-end automatic speech recognition (ASR) models reach promising performance, various downstream tasks rely on good confidence estimators for these systems.

Automatic Speech Recognition Language Modelling +1

Sim2Real for Soft Robotic Fish via Differentiable Simulation

no code implementations 30 Sep 2021 John Z. Zhang, Yu Zhang, Pingchuan Ma, Elvis Nava, Tao Du, Philip Arm, Wojciech Matusik, Robert K. Katzschmann

Accurate simulation of soft mechanisms under dynamic actuation is critical for the design of soft robots.

Multi-Subspace Structured Meta-Learning

no code implementations 29 Sep 2021 Weisen Jiang, James Kwok, Yu Zhang

We propose a MUlti-Subspace structured Meta-Learning (MUSML) algorithm to learn the subspace bases.

Meta-Learning

Comparison of Object Detection Algorithms Using Video and Thermal Images Collected from a UAS Platform: An Application of Drones in Traffic Management

no code implementations 27 Sep 2021 Hualong Tang, Joseph Post, Achilleas Kourtellis, Brian Porter, Yu Zhang

The results show that a background subtraction-based method can achieve good detection performance on RGB images (F1 scores around 0.9 for most cases), and a more varied performance is seen on thermal images with different azimuth angles.

Management object-detection +1

A Simple Self-calibration Method for The Internal Time Synchronization of MEMS LiDAR

no code implementations 26 Sep 2021 Yu Zhang, Xiaoguang Di, Shiyu Yan, Bin Zhang, Baoling Qi, Chunhui Wang

This paper proposes a simple self-calibration method for the internal time synchronization of MEMS (micro-electromechanical systems) LiDAR during research and development.

Multi-Task Learning in Natural Language Processing: An Overview

no code implementations 19 Sep 2021 Shijie Chen, Yu Zhang, Qiang Yang

Deep learning approaches have achieved great success in the field of Natural Language Processing (NLP).

Multi-Task Learning Scheduling

Generating Active Explicable Plans in Human-Robot Teaming

no code implementations 18 Sep 2021 Akkamahadevi Hanni, Yu Zhang

In our experimental evaluation, we verify that our approach generates more efficient explicable plans while successfully capturing the dynamic belief change of the human teammate.

Logic-level Evidence Retrieval and Graph-based Verification Network for Table-based Fact Verification

1 code implementation EMNLP 2021 Qi Shi, Yu Zhang, Qingyu Yin, Ting Liu

Specifically, we first retrieve logic-level program-like evidence from the given table and statement as supplementary evidence for the table.

Fact Verification Retrieval +1

Domain Adaptation by Maximizing Population Correlation with Neural Architecture Search

no code implementations 12 Sep 2021 Zhixiong Yue, Pengxin Guo, Yu Zhang

Based on the PC function, we propose a new method called Domain Adaptation by Maximizing Population Correlation (DAMPC) to learn a domain-invariant feature representation for DA.

Domain Adaptation Neural Architecture Search

Distantly-Supervised Named Entity Recognition with Noise-Robust Learning and Language Model Augmented Self-Training

1 code implementation EMNLP 2021 Yu Meng, Yunyi Zhang, Jiaxin Huang, Xuan Wang, Yu Zhang, Heng Ji, Jiawei Han

We study the problem of training named entity recognition (NER) models using only distantly-labeled data, which can be automatically obtained by matching entity mentions in the raw text with entity types in a knowledge base.

Language Modelling named-entity-recognition +1

Injecting Text in Self-Supervised Speech Pretraining

no code implementations 27 Aug 2021 Zhehuai Chen, Yu Zhang, Andrew Rosenberg, Bhuvana Ramabhadran, Gary Wang, Pedro Moreno

The proposed method, tts4pretrain, complements the power of contrastive learning in self-supervision with linguistic/lexical representations derived from synthesized speech, effectively learning from untranscribed speech and unspoken text.

Contrastive Learning Language Modelling +2

Attention-based Neural Load Forecasting: A Dynamic Feature Selection Approach

no code implementations 25 Aug 2021 Jing Xiong, Pengyang Zhou, Alan Chen, Yu Zhang

Then, a decoder with hierarchical temporal attention enables a similar day selection, which re-evaluates the importance of historical information at each time step.

Load Forecasting Machine Translation +2

Online Dictionary Learning Based Fault and Cyber Attack Detection for Power Systems

no code implementations 24 Aug 2021 Gabriel Intriago, Yu Zhang

This paper deals with the event and intrusion detection problem by leveraging a stream data mining classifier (Hoeffding adaptive tree) with semi-supervised learning techniques to distinguish cyber-attacks from regular system perturbations accurately.

Cyber Attack Detection Dictionary Learning +1

Mitigating Greenhouse Gas Emissions Through Generative Adversarial Networks Based Wildfire Prediction

no code implementations 20 Aug 2021 Sifat Chowdhury, Kai Zhu, Yu Zhang

Over the past decade, the number of wildfires has increased significantly around the world, especially in the State of California.

Data Augmentation

Reinforcement Learning for Robot Navigation with Adaptive Forward Simulation Time (AFST) in a Semi-Markov Model

no code implementations 13 Aug 2021 Yu'an Chen, Ruosong Ye, Ziyang Tao, Hongjian Liu, Guangda Chen, Jie Peng, Jun Ma, Yu Zhang, Yanyong Zhang, Jianmin Ji

Deep reinforcement learning (DRL) algorithms have proven effective in robot navigation, especially in unknown environments, through directly mapping perception inputs into robot control commands.

Reinforcement Learning +1

W2v-BERT: Combining Contrastive Learning and Masked Language Modeling for Self-Supervised Speech Pre-Training

no code implementations 7 Aug 2021 Yu-An Chung, Yu Zhang, Wei Han, Chung-Cheng Chiu, James Qin, Ruoming Pang, Yonghui Wu

In particular, when compared to published models such as conformer-based wav2vec 2.0 and HuBERT, our model shows a 5% to 10% relative WER reduction on the test-clean and test-other subsets.

 Ranked #1 on Speech Recognition on LibriSpeech test-other (using extra training data)

Contrastive Learning Language Modelling +3

Neighbor-Vote: Improving Monocular 3D Object Detection through Neighbor Distance Voting

1 code implementation 6 Jul 2021 Xiaomeng Chu, Jiajun Deng, Yao Li, Zhenxun Yuan, Yanyong Zhang, Jianmin Ji, Yu Zhang

As cameras are increasingly deployed in new application domains such as autonomous driving, performing 3D object detection on monocular images becomes an important task for visual scene understanding.

Autonomous Driving Monocular 3D Object Detection +3

Multi-Modal 3D Object Detection in Autonomous Driving: a Survey

no code implementations 24 Jun 2021 Yingjie Wang, Qiuyu Mao, Hanqi Zhu, Yu Zhang, Jianmin Ji, Yanyong Zhang

In this survey, we first introduce the background of popular sensors for autonomous cars, including their common data representations as well as object detection networks developed for each type of sensor data.

3D Object Detection Autonomous Driving +2

Informative and Consistent Correspondence Mining for Cross-Domain Weakly Supervised Object Detection

no code implementations CVPR 2021 Luwei Hou, Yu Zhang, Kui Fu, Jia Li

Cross-domain weakly supervised object detection aims to adapt object-level knowledge from a fully labeled source domain dataset (i.e., with object bounding boxes) to train object detectors for target domains that are weakly labeled (i.e., with image-level tags).

object-detection Transfer Learning +1

Sparse Multi-Path Corrections in Fringe Projection Profilometry

no code implementations CVPR 2021 Yu Zhang, Daniel Lau, David Wipf

Three-dimensional scanning by means of structured light illumination is an active imaging technique involving projecting and capturing a series of striped patterns and then using the observed warping of stripes to reconstruct the target object's surface through triangulating each pixel in the camera to a unique projector coordinate corresponding to a particular feature in the projected patterns.

WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis

2 code implementations 17 Jun 2021 Nanxin Chen, Yu Zhang, Heiga Zen, Ron J. Weiss, Mohammad Norouzi, Najim Dehak, William Chan

The model takes an input phoneme sequence, and through an iterative refinement process, generates an audio waveform.

Speech Synthesis Text-To-Speech Synthesis

FedNILM: Applying Federated Learning to NILM Applications at the Edge

no code implementations 7 Jun 2021 Yu Zhang, Guoming Tang, Qianyi Huang, Yi Wang, Xudong Wang, Jiadong Lou

Non-intrusive load monitoring (NILM) helps disaggregate the household's main electricity consumption to energy usages of individual appliances, thus greatly cutting down the cost in fine-grained household load monitoring.

Federated Learning Model Compression +3

Rethinking Training from Scratch for Object Detection

1 code implementation 6 Jun 2021 Yang Li, Hong Zhang, Yu Zhang

The ImageNet pre-training initialization is the de-facto standard for object detection.

Object Detection

More Behind Your Electricity Bill: a Dual-DNN Approach to Non-Intrusive Load Monitoring

no code implementations 1 Jun 2021 Yu Zhang, Guoming Tang, Qianyi Huang, Yi Wang, Hong Xu

Non-intrusive load monitoring (NILM) is a well-known single-channel blind source separation problem that aims to decompose the household energy consumption into itemised energy usage of individual appliances.

Non-Intrusive Load Monitoring

Large-Signal Grid-Synchronization Stability Analysis of PLL-based VSCs Using Lyapunov's Direct Method

no code implementations 23 May 2021 Yu Zhang, Chen Zhang, Xu Cai

Grid-synchronization stability (GSS) is an emerging stability issue of grid-tied voltage source converters (VSCs), which can be provoked by severe grid voltage sags.

Scaling End-to-End Models for Large-Scale Multilingual ASR

no code implementations 30 Apr 2021 Bo Li, Ruoming Pang, Tara N. Sainath, Anmol Gulati, Yu Zhang, James Qin, Parisa Haghani, W. Ronny Huang, Min Ma, Junwen Bai

Building ASR models across many languages is a challenging multi-task learning problem due to large variations and heavily unbalanced data.

Multi-Task Learning

Deep Latent Emotion Network for Multi-Task Learning

no code implementations 18 Apr 2021 Huangbin Zhang, Chong Zhao, Yu Zhang, Danlei Wang, Haichao Yang

DLEN is deployed on a real-world multi-task feed recommendation scenario of Tencent QQ-Small-World with a dataset containing over a billion samples, and it exhibits a significant performance advantage over the SOTA MTL model in offline evaluation, together with a considerable increase of 3.02% in view-count and 2.63% in user stay-time in production.

Multi-Task Learning

CSAFL: A Clustered Semi-Asynchronous Federated Learning Framework

no code implementations 16 Apr 2021 Yu Zhang, Moming Duan, Duo Liu, Li Li, Ao Ren, Xianzhang Chen, Yujuan Tan, Chengliang Wang

Asynchronous FL has a natural advantage in mitigating the straggler effect, but there are threats of model quality degradation and server crash.

Federated Learning

Pushing the Limits of Non-Autoregressive Speech Recognition

no code implementations 7 Apr 2021 Edwin G. Ng, Chung-Cheng Chiu, Yu Zhang, William Chan

We combine recent advancements in end-to-end speech recognition to non-autoregressive automatic speech recognition.

Automatic Speech Recognition Language Modelling +1

Exploring Targeted Universal Adversarial Perturbations to End-to-end ASR Models

no code implementations 6 Apr 2021 Zhiyun Lu, Wei Han, Yu Zhang, Liangliang Cao

To attack RNN-T, we find prepending perturbation is more effective than the additive perturbation, and can mislead the models to predict the same short target on utterances of arbitrary length.

Automatic Speech Recognition speech-recognition

SpeechStew: Simply Mix All Available Speech Recognition Data to Train One Large Neural Network

no code implementations 5 Apr 2021 William Chan, Daniel Park, Chris Lee, Yu Zhang, Quoc Le, Mohammad Norouzi

We present SpeechStew, a speech recognition model that is trained on a combination of various publicly available speech recognition datasets: AMI, Broadcast News, Common Voice, LibriSpeech, Switchboard/Fisher, Tedlium, and Wall Street Journal.

Language Modelling speech-recognition +2

Using Simulation to Aid the Design and Optimization of Intelligent User Interfaces for Quality Assurance Processes in Machine Learning

no code implementations 2 Apr 2021 Yu Zhang, Martijn Tennekes, Tim De Jong, Lyana Curier, Bob Coecke, Min Chen

Because QA4ML users have to view a non-trivial amount of data and perform many input actions to correct errors made by the ML model, an optimally-designed user interface (UI) can reduce the cost of interactions significantly.

On the limits of algorithmic prediction across the globe

no code implementations 28 Mar 2021 Xingyu Li, Difan Song, Miaozhe Han, Yu Zhang, Rene F. Kizilcec

We tested how well predictive models of human behavior trained in a developed country generalize to people in less developed countries by modeling global variation in 200 predictors of academic achievement on nationally representative student data for 65 countries.

Residual Energy-Based Models for End-to-End Speech Recognition

no code implementations 25 Mar 2021 Qiujia Li, Yu Zhang, Bo Li, Liangliang Cao, Philip C. Woodland

End-to-end models with auto-regressive decoders have shown impressive results for automatic speech recognition (ASR).

Automatic Speech Recognition Self-Supervised Learning +1

EPRNet: Efficient Pyramid Representation Network for Real-Time Street Scene Segmentation

no code implementations IEEE Transactions on Intelligent Transportation Systems 2021 Quan Tang, Fagui Liu, Jun Jiang, Yu Zhang

Current scene segmentation methods suffer from cumbersome model structures and high computational complexity, impeding their applications to real-world scenarios that require real-time processing.

Image Classification Scene Segmentation +1

Deciphering Star Cluster Evolution by Shape Morphology

no code implementations 4 Mar 2021 Qingshun Hu, Yu Zhang, Ali Esamdin, Jinzhong Liu, Xiangyun Zeng

A significant negative correlation between the overall ellipticities and masses is also detected for the sample clusters with log(age/year) $\geq$ 8, suggesting that the overall shapes of the clusters are possibly influenced by the number of members and masses, in addition to the external forces and the surrounding environment.

Astrophysics of Galaxies Solar and Stellar Astrophysics

Self-supervised Low Light Image Enhancement and Denoising

1 code implementation 1 Mar 2021 Yu Zhang, Xiaoguang Di, Bin Zhang, Qingyan Li, Shiyu Yan, Chunhui Wang

Both of the networks can be trained with low light images only, which is achieved by a Maximum Entropy based Retinex (ME-Retinex) model and an assumption that noises are independently distributed.

Denoising Low-Light Image Enhancement

Fronthaul Compression and Passive Beamforming Design for Intelligent Reflecting Surface-aided Cloud Radio Access Networks

no code implementations 25 Feb 2021 Yu Zhang, Xuelu Wu, Hong Peng, Caijun Zhong, Xiaoming Chen

This letter studies a cloud radio access network (C-RAN) with multiple intelligent reflecting surfaces (IRS) deployed between users and remote radio heads (RRH).

Quantization

Echo State Speech Recognition

no code implementations 18 Feb 2021 Harsh Shrivastava, Ankush Garg, Yuan Cao, Yu Zhang, Tara Sainath

We propose automatic speech recognition (ASR) models inspired by echo state network (ESN), in which a subset of recurrent neural networks (RNN) layers in the models are randomly initialized and untrained.

Automatic Speech Recognition speech-recognition

MATCH: Metadata-Aware Text Classification in A Large Hierarchy

1 code implementation 15 Feb 2021 Yu Zhang, Zhihong Shen, Yuxiao Dong, Kuansan Wang, Jiawei Han

Multi-label text classification refers to the problem of assigning each given document its most relevant labels from the label set.

Classification General Classification +3

Multi-Objective Meta Learning

no code implementations NeurIPS 2021 Feiyang Ye, Baijiong Lin, Zhixiong Yue, Pengxin Guo, Qiao Xiao, Yu Zhang

Empirically, we show the effectiveness of the proposed MOML framework in several meta learning problems, including few-shot learning, neural architecture search, domain adaptation, and multi-task learning.

Domain Adaptation Few-Shot Learning +2

Phase discontinuities induced scintillation enhancement: coherent vortex beams propagating through weak oceanic turbulence

no code implementations 5 Feb 2021 Hantao Wang, Huajun Zhang, Mingyuan Ren, Jinren Yao, Yu Zhang

Under the impact of an infinitely extended edge phase dislocation, optical vortices (screw phase dislocations) induce scintillation enhancement.

Optics

Joint Transmit Precoding and Reflect Beamforming Design for IRS-Assisted MIMO Cognitive Radio Systems

no code implementations 2 Feb 2021 Weiheng Jiang, Yu Zhang, Jun Zhao, Zehui Xiong, Zhiguo Ding

Cognitive radio (CR) is an effective solution to improve the spectral efficiency (SE) of wireless communications by allowing the secondary users (SUs) to share spectrum with primary users (PUs).

Information Theory Signal Processing

Photoproduction $\gamma p \to K^+\Lambda(1520)$ in an effective Lagrangian approach

no code implementations 22 Jan 2021 Neng-Chang Wei, Yu Zhang, Fei Huang, De-Min Li

In addition to the $t$-channel $K$ and $K^\ast$ exchanges, the $u$-channel $\Lambda$ exchange, the $s$-channel nucleon exchange, and the interaction current, a minimal number of nucleon resonances in the $s$ channel are introduced in constructing the reaction amplitudes to describe the data.

High Energy Physics - Phenomenology Nuclear Theory

Model-based cellular kinetic analysis of SARS-CoV-2 infection: different immune response modes and treatment strategies

no code implementations 12 Jan 2021 Zhengqing Zhou, Zhiheng Zhao, Shuyu Shi, Jianghua Wu, Dianjie Li, Jianwei Li, Jingpeng Zhang, Ke Gui, Yu Zhang, Heng Mei, Yu Hu, Qi Ouyang, Fangting Li

Integrating theoretical results with clinical COVID-19 patients' data, we classified the COVID-19 development processes into three typical modes of immune responses, correlated with the clinical classification of mild & moderate, severe and critical patients.

Generative Adversarial U-Net for Domain-free Medical Image Augmentation

no code implementations 12 Jan 2021 Xiaocong Chen, Yun Li, Lina Yao, Ehsan Adeli, Yu Zhang

The shortage of annotated medical images is one of the biggest challenges in the field of medical image computing.