Search Results for author: Yu Zhang

Found 517 papers, 159 papers with code

SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language Processing

3 code implementations • ACL 2022 • Junyi Ao, Rui Wang, Long Zhou, Chengyi Wang, Shuo Ren, Yu Wu, Shujie Liu, Tom Ko, Qing Li, Yu Zhang, Zhihua Wei, Yao Qian, Jinyu Li, Furu Wei

Motivated by the success of T5 (Text-To-Text Transfer Transformer) in pre-trained natural language processing models, we propose a unified-modal SpeechT5 framework that explores the encoder-decoder pre-training for self-supervised speech/text representation learning.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +7

124,793

Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis

11 code implementations • NeurIPS 2018 • Ye Jia, Yu Zhang, Ron J. Weiss, Quan Wang, Jonathan Shen, Fei Ren, Zhifeng Chen, Patrick Nguyen, Ruoming Pang, Ignacio Lopez Moreno, Yonghui Wu

Clone a voice in 5 seconds to generate arbitrary speech in real-time

Speaker Verification Speech Synthesis +3

50,722

W2v-BERT: Combining Contrastive Learning and Masked Language Modeling for Self-Supervised Speech Pre-Training

3 code implementations • 7 Aug 2021 • Yu-An Chung, Yu Zhang, Wei Han, Chung-Cheng Chiu, James Qin, Ruoming Pang, Yonghui Wu

In particular, when compared to published models such as conformer-based wav2vec 2.0 and HuBERT, our model shows 5% to 10% relative WER reduction on the test-clean and test-other subsets.

Ranked #1 on Speech Recognition on LibriSpeech test-clean (using extra training data)

Contrastive Learning Language Modelling +3

29,228

Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions

30 code implementations • 16 Dec 2017 • Jonathan Shen, Ruoming Pang, Ron J. Weiss, Mike Schuster, Navdeep Jaitly, Zongheng Yang, Zhifeng Chen, Yu Zhang, Yuxuan Wang, RJ Skerry-Ryan, Rif A. Saurous, Yannis Agiomyrgiannakis, Yonghui Wu

This paper describes Tacotron 2, a neural network architecture for speech synthesis directly from text.

Ranked #2 on Speech Synthesis on North American English

Speech Synthesis

29,133

WaveGrad: Estimating Gradients for Waveform Generation

7 code implementations • ICLR 2021 • Nanxin Chen, Yu Zhang, Heiga Zen, Ron J. Weiss, Mohammad Norouzi, William Chan

This paper introduces WaveGrad, a conditional model for waveform generation which estimates gradients of the data density.

Speech Synthesis Text-To-Speech Synthesis

29,133

SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition

29 code implementations • 18 Apr 2019 • Daniel S. Park, William Chan, Yu Zhang, Chung-Cheng Chiu, Barret Zoph, Ekin D. Cubuk, Quoc V. Le

On LibriSpeech, we achieve 6.8% WER on test-other without the use of a language model, and 5.8% WER with shallow fusion with a language model.

Ranked #1 on Speech Recognition on Hub5'00 SwitchBoard

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

24,248

Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis

11 code implementations • ICML 2018 • Yuxuan Wang, Daisy Stanton, Yu Zhang, RJ Skerry-Ryan, Eric Battenberg, Joel Shor, Ying Xiao, Fei Ren, Ye Jia, Rif A. Saurous

In this work, we propose "global style tokens" (GSTs), a bank of embeddings that are jointly trained within Tacotron, a state-of-the-art end-to-end speech synthesis system.

Speech Synthesis Style Transfer +1

10,126

Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning

4 code implementations • 9 Jul 2019 • Yu Zhang, Ron J. Weiss, Heiga Zen, Yonghui Wu, Zhifeng Chen, RJ Skerry-Ryan, Ye Jia, Andrew Rosenberg, Bhuvana Ramabhadran

We present a multispeaker, multilingual text-to-speech (TTS) synthesis model based on Tacotron that is able to produce high quality speech in multiple languages.

Speech Synthesis Voice Cloning

10,126

Conformer: Convolution-augmented Transformer for Speech Recognition

24 code implementations • 16 May 2020 • Anmol Gulati, James Qin, Chung-Cheng Chiu, Niki Parmar, Yu Zhang, Jiahui Yu, Wei Han, Shibo Wang, Zhengdong Zhang, Yonghui Wu, Ruoming Pang

Recently Transformer and Convolution neural network (CNN) based models have shown promising results in Automatic Speech Recognition (ASR), outperforming Recurrent neural networks (RNNs).

Ranked #12 on Speech Recognition on LibriSpeech test-other (using extra training data)

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

10,126

ESPnet-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Text-to-Speech Toolkit

3 code implementations • 24 Oct 2019 • Tomoki Hayashi, Ryuichi Yamamoto, Katsuki Inoue, Takenori Yoshimura, Shinji Watanabe, Tomoki Toda, Kazuya Takeda, Yu Zhang, Xu Tan

Furthermore, the unified design enables the integration of ASR functions with TTS, e.g., ASR-based objective evaluation and semi-supervised learning with both ASR and TTS models.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

7,867

Simple Recurrent Units for Highly Parallelizable Recurrence

11 code implementations • EMNLP 2018 • Tao Lei, Yu Zhang, Sida I. Wang, Hui Dai, Yoav Artzi

Common recurrent neural architectures scale poorly due to the intrinsic difficulty in parallelizing their state computations.

Ranked #32 on Question Answering on SQuAD1.1 dev

General Classification Machine Translation +3

5,802

Scalability in Perception for Autonomous Driving: Waymo Open Dataset

8 code implementations • CVPR 2020 • Pei Sun, Henrik Kretzschmar, Xerxes Dotiwalla, Aurelien Chouard, Vijaysai Patnaik, Paul Tsui, James Guo, Yin Zhou, Yuning Chai, Benjamin Caine, Vijay Vasudevan, Wei Han, Jiquan Ngiam, Hang Zhao, Aleksei Timofeev, Scott Ettinger, Maxim Krivokon, Amy Gao, Aditya Joshi, Sheng Zhao, Shuyang Cheng, Yu Zhang, Jonathon Shlens, Zhifeng Chen, Dragomir Anguelov

In an effort to help align the research community's contributions with real-world self-driving problems, we introduce a new large scale, high quality, diverse dataset.

Autonomous Driving

4,790

Self-supervised Learning with Random-projection Quantizer for Speech Recognition

3 code implementations • 3 Feb 2022 • Chung-Cheng Chiu, James Qin, Yu Zhang, Jiahui Yu, Yonghui Wu

In particular the quantizer projects speech inputs with a randomly initialized matrix, and does a nearest-neighbor lookup in a randomly-initialized codebook.

Self-Supervised Learning speech-recognition +1

3,686

Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling

2 code implementations • 21 Feb 2019 • Jonathan Shen, Patrick Nguyen, Yonghui Wu, Zhifeng Chen, Mia X. Chen, Ye Jia, Anjuli Kannan, Tara Sainath, Yuan Cao, Chung-Cheng Chiu, Yanzhang He, Jan Chorowski, Smit Hinsu, Stella Laurenzo, James Qin, Orhan Firat, Wolfgang Macherey, Suyog Gupta, Ankur Bapna, Shuyuan Zhang, Ruoming Pang, Ron J. Weiss, Rohit Prabhavalkar, Qiao Liang, Benoit Jacob, Bowen Liang, HyoukJoong Lee, Ciprian Chelba, Sébastien Jean, Bo Li, Melvin Johnson, Rohan Anil, Rajat Tibrewal, Xiaobing Liu, Akiko Eriguchi, Navdeep Jaitly, Naveen Ari, Colin Cherry, Parisa Haghani, Otavio Good, Youlong Cheng, Raziel Alvarez, Isaac Caswell, Wei-Ning Hsu, Zongheng Yang, Kuan-Chieh Wang, Ekaterina Gonina, Katrin Tomanek, Ben Vanik, Zelin Wu, Llion Jones, Mike Schuster, Yanping Huang, Dehao Chen, Kazuki Irie, George Foster, John Richardson, Klaus Macherey, Antoine Bruguier, Heiga Zen, Colin Raffel, Shankar Kumar, Kanishka Rao, David Rybach, Matthew Murray, Vijayaditya Peddinti, Maxim Krikun, Michiel A. U. Bacchiani, Thomas B. Jablin, Rob Suderman, Ian Williams, Benjamin Lee, Deepti Bhatia, Justin Carlson, Semih Yavuz, Yu Zhang, Ian McGraw, Max Galkin, Qi Ge, Golan Pundak, Chad Whipkey, Todd Wang, Uri Alon, Dmitry Lepikhin, Ye Tian, Sara Sabour, William Chan, Shubham Toshniwal, Baohua Liao, Michael Nirschl, Pat Rondon

Lingvo is a Tensorflow framework offering a complete solution for collaborative deep learning research, with a particular focus towards sequence-to-sequence models.

Sequence-To-Sequence Speech Recognition

2,779

Training RNNs as Fast as CNNs

1 code implementation • ICLR 2018 • Tao Lei, Yu Zhang, Yoav Artzi

Common recurrent neural network architectures scale poorly due to the intrinsic difficulty in parallelizing their state computations.

General Classification Language Modelling +4

2,098

Reasonable Effectiveness of Random Weighting: A Litmus Test for Multi-Task Learning

1 code implementation • 20 Nov 2021 • Baijiong Lin, Feiyang Ye, Yu Zhang, Ivor W. Tsang

Multi-Task Learning (MTL) has achieved success in various fields.

Multi-Task Learning

1,658

LibMTL: A Python Library for Multi-Task Learning

1 code implementation • 27 Mar 2022 • Baijiong Lin, Yu Zhang

This paper presents LibMTL, an open-source Python library built on PyTorch, which provides a unified, comprehensive, reproducible, and extensible implementation framework for Multi-Task Learning (MTL).

Multi-Task Learning

1,658

Dual-Balancing for Multi-Task Learning

1 code implementation • 23 Aug 2023 • Baijiong Lin, Weisen Jiang, Feiyang Ye, Yu Zhang, Pengguang Chen, Ying-Cong Chen, Shu Liu, James T. Kwok

Multi-task learning (MTL), a learning paradigm to learn multiple related tasks simultaneously, has achieved great success in various fields.

Multi-Task Learning

1,658

Advances in Joint CTC-Attention based End-to-End Speech Recognition with a Deep CNN Encoder and RNN-LM

6 code implementations • 8 Jun 2017 • Takaaki Hori, Shinji Watanabe, Yu Zhang, William Chan

The CTC network sits on top of the encoder and is jointly trained with the attention-based decoder.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

1,158

DuReader_vis: A Chinese Dataset for Open-domain Document Visual Question Answering

1 code implementation • Findings (ACL) 2022 • Le Qi, Shangwen Lv, Hongyu Li, Jing Liu, Yu Zhang, Qiaoqiao She, Hua Wu, Haifeng Wang, Ting Liu

Open-domain question answering has been used in a wide range of applications, such as web search and enterprise search, which usually takes clean texts extracted from various formats of documents (e.g., web pages, PDFs, or Word documents) as the information source.

document understanding Open-Domain Question Answering +1

1,101

MedSegDiff: Medical Image Segmentation with Diffusion Probabilistic Model

2 code implementations • 1 Nov 2022 • Junde Wu, Rao Fu, Huihui Fang, Yu Zhang, Yehui Yang, Haoyi Xiong, Huiying Liu, Yanwu Xu

Inspired by the success of DPM, we propose the first DPM based model toward general medical image segmentation tasks, which we named MedSegDiff.

Anomaly Detection Brain Tumor Segmentation +8

916

ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context

6 code implementations • 7 May 2020 • Wei Han, Zhengdong Zhang, Yu Zhang, Jiahui Yu, Chung-Cheng Chiu, James Qin, Anmol Gulati, Ruoming Pang, Yonghui Wu

We demonstrate that on the widely used LibriSpeech benchmark, ContextNet achieves a word error rate (WER) of 2.1%/4.6% without external language model (LM), 1.9%/4.1% with LM and 2.9%/7.0% with only 10M parameters on the clean/noisy LibriSpeech test sets.

Ranked #12 on Speech Recognition on LibriSpeech test-clean

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

901

Efficient Second-Order TreeCRF for Neural Dependency Parsing

2 code implementations • ACL 2020 • Yu Zhang, Zhenghua Li, Min Zhang

Experiments and analysis on 27 datasets from 13 languages clearly show that techniques developed before the DL era, such as structural learning (global TreeCRF loss) and high-order modeling are still useful, and can further boost parsing performance over the state-of-the-art biaffine parser, especially for partially annotated training data.

Ranked #1 on Dependency Parsing on CoNLL-2009

Chinese Dependency Parsing Dependency Parsing

810

Fast and Accurate Neural CRF Constituency Parsing

2 code implementations • IJCAI 2020 • Yu Zhang, Houquan Zhou, Zhenghua Li

Estimating probability distribution is one of the core issues in the NLP field.

Ranked #1 on Constituency Parsing on CTB7

Constituency Parsing Dependency Parsing

810

Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models

1 code implementation • 3 Sep 2023 • Yue Zhang, Yafu Li, Leyang Cui, Deng Cai, Lemao Liu, Tingchen Fu, Xinting Huang, Enbo Zhao, Yu Zhang, Yulong Chen, Longyue Wang, Anh Tuan Luu, Wei Bi, Freda Shi, Shuming Shi

While large language models (LLMs) have demonstrated remarkable capabilities across a range of downstream tasks, a significant concern revolves around their propensity to exhibit hallucinations: LLMs occasionally generate content that diverges from the user input, contradicts previously generated context, or misaligns with established world knowledge.

Hallucination World Knowledge

807

MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models

1 code implementation • 21 Sep 2023 • Longhui Yu, Weisen Jiang, Han Shi, Jincheng Yu, Zhengying Liu, Yu Zhang, James T. Kwok, Zhenguo Li, Adrian Weller, Weiyang Liu

Our MetaMath-7B model achieves 66.4% on GSM8K and 19.4% on MATH, exceeding the state-of-the-art models of the same size by 11.5% and 8.7%.

Ranked #53 on Arithmetic Reasoning on GSM8K (using extra training data)

Arithmetic Reasoning GSM8K +4

304

SplattingAvatar: Realistic Real-Time Human Avatars with Mesh-Embedded Gaussian Splatting

1 code implementation • 8 Mar 2024 • Zhijing Shao, Zhaolong Wang, Zhuang Li, Duotun Wang, Xiangru Lin, Yu Zhang, Mingming Fan, Zeyu Wang

We present SplattingAvatar, a hybrid 3D representation of photorealistic human avatars with Gaussian Splatting embedded on a triangle mesh, which renders over 300 FPS on a modern GPU and 30 FPS on a mobile device.

258

Heterogeneous Network Representation Learning: A Unified Framework with Survey and Benchmark

1 code implementation • 1 Apr 2020 • Carl Yang, Yuxin Xiao, Yu Zhang, Yizhou Sun, Jiawei Han

Since there has already been a broad body of HNE algorithms, as the first contribution of this work, we provide a generic paradigm for the systematic categorization and analysis over the merits of various existing HNE algorithms.

Attribute Network Embedding

246

LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech

5 code implementations • 5 Apr 2019 • Heiga Zen, Viet Dang, Rob Clark, Yu Zhang, Ron J. Weiss, Ye Jia, Zhifeng Chen, Yonghui Wu

This paper introduces a new speech corpus called "LibriTTS" designed for text-to-speech use.

Sound Audio and Speech Processing

223

Unsupervised Learning of Disentangled and Interpretable Representations from Sequential Data

3 code implementations • NeurIPS 2017 • Wei-Ning Hsu, Yu Zhang, James Glass

We present a factorized hierarchical variational autoencoder, which learns disentangled and interpretable representations from sequential data without supervision.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

149

An Efficient Person Clustering Algorithm for Open Checkout-free Groceries

1 code implementation • 5 Aug 2022 • Junde Wu, Yu Zhang, Rao Fu, Yuanpei Liu, Jing Gao

Then, to ensure that the method adapts to the dynamic and unseen person flow, we propose Graph Convolutional Network (GCN) with a simple Nearest Neighbor (NN) strategy to accurately cluster the instances of CSG.

Clustering

148

MATCH: Metadata-Aware Text Classification in A Large Hierarchy

1 code implementation • 15 Feb 2021 • Yu Zhang, Zhihong Shen, Yuxiao Dong, Kuansan Wang, Jiawei Han

Multi-label text classification refers to the problem of assigning each given document its most relevant labels from the label set.

General Classification Multi Label Text Classification +2

136

Offline Handwritten Chinese Text Recognition with Convolutional Neural Networks

1 code implementation • 28 Jun 2020 • Brian Liu, Xianchao Xu, Yu Zhang

Deep learning based methods have been dominating the text recognition tasks in different and multilingual scenarios.

Handwritten Chinese Text Recognition Language Modelling

135

Self-supervised Image Enhancement Network: Training with Low Light Images Only

1 code implementation • 26 Feb 2020 • Yu Zhang, Xiaoguang Di, Bin Zhang, Chunhui Wang

We introduce a constraint that the maximum channel of the reflectance conforms to the maximum channel of the low light image and its entropy should be largest in our model to achieve self-supervised learning.

Low-Light Image Enhancement Self-Supervised Learning

133

Cross-type Biomedical Named Entity Recognition with Deep Multi-Task Learning

2 code implementations • 30 Jan 2018 • Xuan Wang, Yu Zhang, Xiang Ren, Yuhao Zhang, Marinka Zitnik, Jingbo Shang, Curtis Langlotz, Jiawei Han

Motivation: State-of-the-art biomedical named entity recognition (BioNER) systems often require handcrafted features specific to each entity type, such as genes, chemicals and diseases.

Feature Engineering Multi-Task Learning +4

129

Cross-lingual Knowledge Graph Alignment via Graph Convolutional Networks

1 code implementation • EMNLP 2018 • Zhichun Wang, Qingsong Lv, Xiaohan Lan, Yu Zhang

Embeddings can be learned from both the structural and attribute information of entities, and the results of structure embedding and attribute embedding are combined to get accurate alignments.

Ranked #5 on Entity Alignment on YAGO-WIKI50K

Attribute Entity Alignment +3

127

Non-Attentive Tacotron: Robust and Controllable Neural TTS Synthesis Including Unsupervised Duration Modeling

6 code implementations • 8 Oct 2020 • Jonathan Shen, Ye Jia, Mike Chrzanowski, Yu Zhang, Isaac Elias, Heiga Zen, Yonghui Wu

This paper presents Non-Attentive Tacotron based on the Tacotron 2 text-to-speech model, replacing the attention mechanism with an explicit duration predictor.

Speech Recognition

110

WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis

3 code implementations • 17 Jun 2021 • Nanxin Chen, Yu Zhang, Heiga Zen, Ron J. Weiss, Mohammad Norouzi, Najim Dehak, William Chan

The model takes an input phoneme sequence, and through an iterative refinement process, generates an audio waveform.

Speech Synthesis Text-To-Speech Synthesis

110

Hierarchical Attention Transfer Network for Cross-Domain Sentiment Classification

1 code implementation • Thirty-Second AAAI Conference on Artificial Intelligence 2018 • Zheng Li, Ying WEI, Yu Zhang, Qiang Yang

Existing cross-domain sentiment classification methods cannot automatically capture non-pivots, i.e., the domain-specific sentiment words, and pivots, i.e., the domain-shared sentiment words, simultaneously.

Classification Cross-Domain Text Classification +4

Miipher: A Robust Speech Restoration Model Integrating Self-Supervised Speech and Text Representations

1 code implementation • 3 Mar 2023 • Yuma Koizumi, Heiga Zen, Shigeki Karita, Yifan Ding, Kohei Yatabe, Nobuyuki Morioka, Yu Zhang, Wei Han, Ankur Bapna, Michiel Bacchiani

Experiments show that Miipher (i) is robust against various audio degradation and (ii) enables us to train a high-quality text-to-speech (TTS) model from restored speech samples collected from the Web.

Speech Denoising Speech Enhancement

Brain Network Construction and Classification Toolbox (BrainNetClass)

1 code implementation • 17 Jun 2019 • Zhen Zhou, Xiaobo Chen, Yu Zhang, Lishan Qiao, Renping Yu, Gang Pan, Han Zhang, Dinggang Shen

The goal of this work is to introduce a toolbox namely "Brain Network Construction and Classification" (BrainNetClass) to the field to promote more advanced brain network construction methods.

Classification General Classification

Topic Discovery via Latent Space Clustering of Pretrained Language Model Representations

1 code implementation • 9 Feb 2022 • Yu Meng, Yunyi Zhang, Jiaxin Huang, Yu Zhang, Jiawei Han

Interestingly, there have not been standard approaches to deploy PLMs for topic discovery as better alternatives to topic models.

Clustering Language Modelling +1

MiNet: Mixed Interest Network for Cross-Domain Click-Through Rate Prediction

1 code implementation • 7 Aug 2020 • Wentao Ouyang, Xiuwu Zhang, Lei Zhao, Jinmei Luo, Yu Zhang, Heng Zou, Zhaojie Liu, Yanlong Du

Our study is based on UC Toutiao (a news feed service integrated with the UC Browser App, serving hundreds of millions of users daily), where the source domain is the news and the target domain is the ad.

Click-Through Rate Prediction

The Effect of Metadata on Scientific Literature Tagging: A Cross-Field Cross-Model Study

1 code implementation • 7 Feb 2023 • Yu Zhang, Bowen Jin, Qi Zhu, Yu Meng, Jiawei Han

Due to the exponential growth of scientific publications on the Web, there is a pressing need to tag each paper with fine-grained topics so that researchers can track their interested fields of study rather than drowning in the whole literature.

Language Modelling Multi Label Text Classification +3

LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT

1 code implementation • 29 Mar 2022 • Rui Wang, Qibing Bai, Junyi Ao, Long Zhou, Zhixiang Xiong, Zhihua Wei, Yu Zhang, Tom Ko, Haizhou Li

LightHuBERT outperforms the original HuBERT on ASR and five SUPERB tasks with the HuBERT size, achieves comparable performance to the teacher model in most tasks with a reduction of 29% parameters, and obtains a 3.5× compression ratio in three SUPERB tasks, e.g., automatic speaker verification, keyword spotting, and intent classification, with a slight accuracy loss.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +6

HiGitClass: Keyword-Driven Hierarchical Classification of GitHub Repositories

2 code implementations • 16 Oct 2019 • Yu Zhang, Frank F. Xu, Sha Li, Yu Meng, Xuan Wang, Qi Li, Jiawei Han

With the massive number of repositories available, there is a pressing need for topic-based search.

Classification General Classification +1

Transferable End-to-End Aspect-based Sentiment Analysis with Selective Adversarial Learning

1 code implementation • IJCNLP 2019 • Zheng Li, Xin Li, Ying WEI, Lidong Bing, Yu Zhang, Qiang Yang

Joint extraction of aspects and sentiments can be effectively formulated as a sequence labeling problem.

Aspect-Based Sentiment Analysis Aspect-Based Sentiment Analysis (ABSA) +1

Distantly-Supervised Named Entity Recognition with Noise-Robust Learning and Language Model Augmented Self-Training

1 code implementation • EMNLP 2021 • Yu Meng, Yunyi Zhang, Jiaxin Huang, Xuan Wang, Yu Zhang, Heng Ji, Jiawei Han

We study the problem of training named entity recognition (NER) models using only distantly-labeled data, which can be automatically obtained by matching entity mentions in the raw text with entity types in a knowledge base.

Language Modelling named-entity-recognition +2

Pushing the Limits of Semi-Supervised Learning for Automatic Speech Recognition

1 code implementation • 20 Oct 2020 • Yu Zhang, James Qin, Daniel S. Park, Wei Han, Chung-Cheng Chiu, Ruoming Pang, Quoc V. Le, Yonghui Wu

We employ a combination of recent developments in semi-supervised learning for automatic speech recognition to obtain state-of-the-art results on LibriSpeech utilizing the unlabeled audio of the Libri-Light dataset.

Ranked #1 on Speech Recognition on LibriSpeech test-clean (using extra training data)

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Semantic Role Labeling as Dependency Parsing: Exploring Latent Tree Structures Inside Arguments

1 code implementation • COLING 2022 • Yu Zhang, Qingrong Xia, Shilin Zhou, Yong Jiang, Guohong Fu, Min Zhang

Semantic role labeling (SRL) is a fundamental yet challenging task in the NLP community.

Ranked #2 on Semantic Role Labeling (predicted predicates) on CoNLL 2012

Dependency Parsing Semantic Role Labeling (predicted predicates)

Hierarchical Topic Mining via Joint Spherical Tree and Text Embedding

1 code implementation • 18 Jul 2020 • Yu Meng, Yunyi Zhang, Jiaxin Huang, Yu Zhang, Chao Zhang, Jiawei Han

Mining a set of meaningful topics organized into a hierarchy is intuitively appealing since topic correlations are ubiquitous in massive text corpora.

Ranked #1 on Topic Models on Arxiv HEP-TH citation graph

text-classification Topic Models

Generating Training Data with Language Models: Towards Zero-Shot Language Understanding

1 code implementation • 9 Feb 2022 • Yu Meng, Jiaxin Huang, Yu Zhang, Jiawei Han

Pretrained language models (PLMs) have demonstrated remarkable performance in various natural language processing tasks: Unidirectional PLMs (e.g., GPT) are well known for their superior text generation capabilities; bidirectional PLMs (e.g., BERT) have been the prominent choice for natural language understanding (NLU) tasks.

Ranked #5 on Zero-Shot Text Classification on AG News

Few-Shot Learning MNLI-m +5

Minimally Supervised Categorization of Text with Metadata

1 code implementation • 1 May 2020 • Yu Zhang, Yu Meng, Jiaxin Huang, Frank F. Xu, Xuan Wang, Jiawei Han

Then, based on the same generative process, we synthesize training samples to address the bottleneck of label scarcity.

Document Classification

Dense Cross-Query-and-Support Attention Weighted Mask Aggregation for Few-Shot Segmentation

1 code implementation • 18 Jul 2022 • Xinyu Shi, Dong Wei, Yu Zhang, Donghuan Lu, Munan Ning, Jiashun Chen, Kai Ma, Yefeng Zheng

A key to this challenging task is to fully utilize the information in the support images by exploiting fine-grained correlations between the query and support images.

Ranked #4 on Few-Shot Semantic Segmentation on COCO-20i (1-shot)

Few-Shot Semantic Segmentation Segmentation +1

Learning an Adaptive Model for Extreme Low-light Raw Image Processing

1 code implementation • 22 Apr 2020 • Qingxu Fu, Xiaoguang Di, Yu Zhang

Furthermore, those tests illustrate that the proposed method is able to adaptively control the global image brightness according to the content of the image scene.

Denoising Low-Light Image Enhancement +1

Hierarchical Metadata-Aware Document Categorization under Weak Supervision

1 code implementation • 26 Oct 2020 • Yu Zhang, Xiusi Chen, Yu Meng, Jiawei Han

Our experiments demonstrate a consistent improvement of HiMeCat over competitive baselines and validate the contribution of our representation learning and data augmentation modules.

Data Augmentation Document Classification +1

Edgeformers: Graph-Empowered Transformers for Representation Learning on Textual-Edge Networks

1 code implementation • 21 Feb 2023 • Bowen Jin, Yu Zhang, Yu Meng, Jiawei Han

Edges in many real-world social/information networks are associated with rich text information (e.g., user-user communications or user-product reviews).

Edge Classification Link Prediction +1

Chain-of-Skills: A Configurable Model for Open-domain Question Answering

1 code implementation • 4 May 2023 • Kaixin Ma, Hao Cheng, Yu Zhang, Xiaodong Liu, Eric Nyberg, Jianfeng Gao

Our approach outperforms recent self-supervised retrievers in zero-shot evaluations and achieves state-of-the-art fine-tuned retrieval performance on NQ, HotpotQA and OTT-QA.

Ranked #4 on Question Answering on HotpotQA

Open-Domain Question Answering Retrieval +1

Discriminative Topic Mining via Category-Name Guided Text Embedding

1 code implementation • 20 Aug 2019 • Yu Meng, Jiaxin Huang, Guangyuan Wang, Zihan Wang, Chao Zhang, Yu Zhang, Jiawei Han

We propose a new task, discriminative topic mining, which leverages a set of user-provided category names to mine discriminative topics from text corpora.

Document Classification General Classification +3

Simplifying Low-Light Image Enhancement Networks with Relative Loss Functions

1 code implementation • 6 Apr 2023 • Yu Zhang, Xiaoguang Di, Junde Wu, Rao Fu, Yong Li, Yue Wang, Yanwu Xu, Guohui YANG, Chunhui Wang

In this paper, to make the learning easier in low-light image enhancement, we introduce FLW-Net (Fast and LightWeight Network) and two relative loss functions.

Low-Light Image Enhancement

Predicting Axillary Lymph Node Metastasis in Early Breast Cancer Using Deep Learning on Primary Tumor Biopsy Slides

1 code implementation • 4 Dec 2021 • Feng Xu, Chuang Zhu, Wenqi Tang, Ying Wang, Yu Zhang, Jie Li, Hongchuan Jiang, Zhongyue Shi, Jun Liu, Mulan Jin

Conclusion: Our study provides a novel DL-based biomarker on primary tumor CNB slides to predict the metastatic status of ALN preoperatively for patients with EBC.

Multiple Instance Learning Specificity +1

Balanced and Hierarchical Relation Learning for One-Shot Object Detection

1 code implementation • CVPR 2022 • Hanqing Yang, Sijia Cai, Hualian Sheng, Bing Deng, Jianqiang Huang, Xian-Sheng Hua, Yong Tang, Yu Zhang

In this paper, we introduce the balanced and hierarchical learning for our detector.

Metric Learning object-detection +2

Exploiting Coarse-to-Fine Task Transfer for Aspect-level Sentiment Classification

1 code implementation • AAAI 2019 2018 • Zheng Li, Ying WEI, Yu Zhang, Xiang Zhang, Xin Li, Qiang Yang

Aspect-level sentiment classification (ASC) aims at identifying sentiment polarities towards aspects in a sentence, where the aspect can behave as a general Aspect Category (AC) or a specific Aspect Term (AT).

Ranked #19 on Aspect-Based Sentiment Analysis (ABSA) on SemEval-2014 Task-4

General Classification Sentence +2

PEAL: Prior-Embedded Explicit Attention Learning for Low-Overlap Point Cloud Registration

1 code implementation • CVPR 2023 • Junle Yu, Luwei Ren, Yu Zhang, Wenhui Zhou, Lili Lin, Guojun Dai

Recently, it has achieved huge success in incorporating Transformer into point cloud feature representation, which usually adopts a self-attention module to learn intra-point-cloud features first, then utilizes a cross-attention module to perform feature exchange between input point clouds.

Point Cloud Registration

TLP: A Deep Learning-based Cost Model for Tensor Program Tuning

1 code implementation • 7 Nov 2022 • Yi Zhai, Yu Zhang, Shuo Liu, Xiaomeng Chu, Jie Peng, Jianmin Ji, Yanyong Zhang

Instead of extracting features from the tensor program itself, TLP extracts features from the schedule primitives.

Multi-Task Learning

Visually-Aware Audio Captioning With Adaptive Audio-Visual Attention

1 code implementation • 28 Oct 2022 • Xubo Liu, Qiushi Huang, Xinhao Mei, Haohe Liu, Qiuqiang Kong, Jianyuan Sun, Shengchen Li, Tom Ko, Yu Zhang, Lilian H. Tang, Mark D. Plumbley, Volkan Kılıç, Wenwu Wang

Audio captioning aims to generate text descriptions of audio clips.

AudioCaps Audio captioning +1

Integrating Local Context and Global Cohesiveness for Open Information Extraction

1 code implementation • 26 Apr 2018 • Qi Zhu, Xiang Ren, Jingbo Shang, Yu Zhang, Ahmed El-Kishky, Jiawei Han

However, current Open IE systems focus on modeling local context information in a sentence to extract relation tuples, while ignoring the fact that global statistics in a large corpus can be collectively leveraged to identify high-quality sentence-level extractions.

Open Information Extraction Relation +1

Deep Learning for Massive MIMO with 1-Bit ADCs: When More Antennas Need Fewer Pilots

1 code implementation • 15 Oct 2019 • Yu Zhang, Muhammad Alrabeiah, Ahmed Alkhateeb

This leads to the interesting, and counter-intuitive, observation that when more antennas are employed by the massive MIMO base station, our proposed deep learning approach achieves better channel estimation performance, for the same pilot sequence length.

Information Theory Signal Processing Information Theory

Paper
Code

Attention-guided Chained Context Aggregation for Semantic Segmentation

3 code implementations • 27 Feb 2020 • Quan Tang, Fagui Liu, Tong Zhang, Jun Jiang, Yu Zhang

The way features propagate in Fully Convolutional Networks is of momentous importance to capture multi-scale contexts for obtaining precise segmentation masks.

Ranked #23 on Semantic Segmentation on SUN-RGBD (using extra training data)

Semantic Segmentation

Paper
Code

Bi-LRFusion: Bi-Directional LiDAR-Radar Fusion for 3D Dynamic Object Detection

1 code implementation • CVPR 2023 • Yingjie Wang, Jiajun Deng, Yao Li, Jinshui Hu, Cong Liu, Yu Zhang, Jianmin Ji, Wanli Ouyang, Yanyong Zhang

LiDAR and Radar are two complementary sensing approaches in that LiDAR specializes in capturing an object's 3D shape while Radar provides longer detection ranges as well as velocity hints.

object-detection Object Detection

Paper
Code

A Multi-spectral Dataset for Evaluating Motion Estimation Systems

1 code implementation • 1 Jul 2020 • Weichen Dai, Yu Zhang, Shenzhou Chen, Donglei Sun, Da Kong

The multi-spectral images, including both color and thermal images in full sensor resolution (640 x 480), are obtained from a standard and a long-wave infrared camera at 32Hz with hardware-synchronization.

Motion Estimation Stereo Matching

Paper
Code

Self-supervised Low Light Image Enhancement and Denoising

1 code implementation • 1 Mar 2021 • Yu Zhang, Xiaoguang Di, Bin Zhang, Qingyan Li, Shiyu Yan, Chunhui Wang

Both of the networks can be trained with low light images only, which is achieved by a Maximum Entropy based Retinex (ME-Retinex) model and an assumption that noises are independently distributed.

Denoising Low-Light Image Enhancement

Paper
Code

PCR-CG: Point Cloud Registration via Deep Explicit Color and Geometry

1 code implementation • 28 Feb 2023 • Yu Zhang, Junle Yu, Xiaolin Huang, Wenhui Zhou, Ji Hou

Different from previous methods that only use geometry representation, our module is specifically designed to effectively correlate color into geometry for the point cloud registration task.

Point Cloud Registration

Paper
Code

OneLabeler: A Flexible System for Building Data Labeling Tools

1 code implementation • 27 Mar 2022 • Yu Zhang, Yun Wang, Haidong Zhang, Bin Zhu, Siming Chen, Dongmei Zhang

In this paper, we propose a conceptual framework for data labeling and OneLabeler based on the conceptual framework to support easy building of labeling tools for diverse usage scenarios.

Paper
Code

Graph Chain-of-Thought: Augmenting Large Language Models by Reasoning on Graphs

1 code implementation • 10 Apr 2024 • Bowen Jin, Chulin Xie, Jiawei Zhang, Kashob Kumar Roy, Yu Zhang, Suhang Wang, Yu Meng, Jiawei Han

Then, we propose a simple and effective framework called Graph Chain-of-thought (Graph-CoT) to augment LLMs with graphs by encouraging LLMs to reason on the graph iteratively.

Paper
Code

Tuning Language Models as Training Data Generators for Augmentation-Enhanced Few-Shot Learning

1 code implementation • 6 Nov 2022 • Yu Meng, Martin Michalski, Jiaxin Huang, Yu Zhang, Tarek Abdelzaher, Jiawei Han

In this work, we study few-shot learning with PLMs from a different perspective: We first tune an autoregressive PLM on the few-shot samples and then use it as a generator to synthesize a large amount of novel training samples which augment the original training set.

Few-Shot Learning

Paper
Code

Improved Noisy Student Training for Automatic Speech Recognition

1 code implementation • 19 May 2020 • Daniel S. Park, Yu Zhang, Ye Jia, Wei Han, Chung-Cheng Chiu, Bo Li, Yonghui Wu, Quoc V. Le

Noisy student training is an iterative self-training method that leverages augmentation to improve network performance.

Ranked #5 on Speech Recognition on LibriSpeech test-clean

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Paper
Code

Multi-view Self-supervised Disentanglement for General Image Denoising

1 code implementation • ICCV 2023 • Hao Chen, Chenyuan Qu, Yu Zhang, Chen Chen, Jianbo Jiao

It is understandable as the model is designed to learn paired mapping (e.g., from a noisy image to its clean version).

Ranked #1 on Denoising on CBSD68 sigm75

Disentanglement Image Denoising +1

Paper
Code

Deep Reinforcement Learning for Chinese Zero pronoun Resolution

1 code implementation • ACL 2018 • Qingyu Yin, Yu Zhang, Wei-Nan Zhang, Ting Liu, William Yang Wang

In this study, we show how to integrate local and global decision-making by exploiting deep reinforcement learning models.

Chinese Zero Pronoun Resolution Decision Making +2

Paper
Code

Metadata-Induced Contrastive Learning for Zero-Shot Multi-Label Text Classification

1 code implementation • 11 Feb 2022 • Yu Zhang, Zhihong Shen, Chieh-Han Wu, Boya Xie, Junheng Hao, Ye-Yi Wang, Kuansan Wang, Jiawei Han

Large-scale multi-label text classification (LMTC) aims to associate a document with its relevant labels from a large candidate set.

Contrastive Learning Multi Label Text Classification +3

Paper
Code

NTIRE 2024 Challenge on Short-form UGC Video Quality Assessment: Methods and Results

1 code implementation • 17 Apr 2024 • Xin Li, Kun Yuan, Yajing Pei, Yiting Lu, Ming Sun, Chao Zhou, Zhibo Chen, Radu Timofte, Wei Sun, HaoNing Wu, ZiCheng Zhang, Jun Jia, Zhichao Zhang, Linhan Cao, Qiubo Chen, Xiongkuo Min, Weisi Lin, Guangtao Zhai, Jianhui Sun, Tianyi Wang, Lei LI, Han Kong, Wenxuan Wang, Bing Li, Cheng Luo, Haiqiang Wang, Xiangguang Chen, Wenhui Meng, Xiang Pan, Huiying Shi, Han Zhu, Xiaozhong Xu, Lei Sun, Zhenzhong Chen, Shan Liu, Fangyuan Kong, Haotian Fan, Yifang Xu, Haoran Xu, Mengduo Yang, Jie zhou, Jiaze Li, Shijie Wen, Mai Xu, Da Li, Shunyu Yao, Jiazhi Du, WangMeng Zuo, Zhibo Li, Shuai He, Anlong Ming, Huiyuan Fu, Huadong Ma, Yong Wu, Fie Xue, Guozhi Zhao, Lina Du, Jie Guo, Yu Zhang, huimin zheng, JunHao Chen, Yue Liu, Dulan Zhou, Kele Xu, Qisheng Xu, Tao Sun, Zhixiang Ding, Yuhang Hu

This paper reviews the NTIRE 2024 Challenge on Short-form UGC Video Quality Assessment (S-UGC VQA), where various excellent solutions were submitted and evaluated on the collected dataset KVQ from a popular short-form video platform, i.e., the Kuaishou/Kwai platform.

valid Video Quality Assessment +1

Paper
Code

JointLK: Joint Reasoning with Language Models and Knowledge Graphs for Commonsense Question Answering

1 code implementation • NAACL 2022 • Yueqing Sun, Qi Shi, Le Qi, Yu Zhang

Specifically, JointLK performs joint reasoning between LM and GNN through a novel dense bidirectional attention module, in which each question token attends on KG nodes and each KG node attends on question tokens, and the two modal representations fuse and update mutually by multi-step interactions.

Knowledge Graphs Question Answering

Paper
Code

Leveraging Pseudo-labeled Data to Improve Direct Speech-to-Speech Translation

1 code implementation • 18 May 2022 • Qianqian Dong, Fengpeng Yue, Tom Ko, Mingxuan Wang, Qibing Bai, Yu Zhang

Direct Speech-to-speech translation (S2ST) has drawn more and more attention recently.

Speech-to-Speech Translation Translation

Paper
Code

Deformer: Towards Displacement Field Learning for Unsupervised Medical Image Registration

1 code implementation • 7 Jul 2022 • Jiashun Chen, Donghuan Lu, Yu Zhang, Dong Wei, Munan Ning, Xinyu Shi, Zhe Xu, Yefeng Zheng

In this study, we propose a novel Deformer module along with a multi-scale framework for the deformable image registration task.

Image Registration Medical Image Registration

Paper
Code

Rethinking Training from Scratch for Object Detection

1 code implementation • 6 Jun 2021 • Yang Li, Hong Zhang, Yu Zhang

The ImageNet pre-training initialization is the de-facto standard for object detection.

Object object-detection +1

Paper
Code

Fault Location in Power Distribution Systems via Deep Graph Convolutional Networks

1 code implementation • 22 Dec 2018 • Kunjin Chen, Jun Hu, Yu Zhang, Zhanqing Yu, Jinliang He

This paper develops a novel graph convolutional network (GCN) framework for fault location in power distribution networks.

Data Augmentation Data Visualization

Paper
Code

Personalized Dialogue Generation with Persona-Adaptive Attention

1 code implementation • 27 Oct 2022 • Qiushi Huang, Yu Zhang, Tom Ko, Xubo Liu, Bo Wu, Wenwu Wang, Lilian Tang

Persona-based dialogue systems aim to generate consistent responses based on historical context and predefined persona.

Dialogue Generation

Paper
Code

Dynamic Sparse Network for Time Series Classification: Learning What to "see"

1 code implementation • 19 Dec 2022 • Qiao Xiao, Boqian Wu, Yu Zhang, Shiwei Liu, Mykola Pechenizkiy, Elena Mocanu, Decebal Constantin Mocanu

The receptive field (RF), which determines the region of time series to be "seen" and used, is critical to improve the performance for time series classification (TSC).

Time Series Time Series Analysis +1

Paper
Code

Zero Pronoun Resolution with Attention-based Neural Network

1 code implementation • COLING 2018 • Qingyu Yin, Yu Zhang, Wei-Nan Zhang, Ting Liu, William Yang Wang

Recent neural network methods for zero pronoun resolution explore multiple models for generating representation vectors for zero pronouns and their candidate antecedents.

Chinese Zero Pronoun Resolution

Paper
Code

Deep Image Clustering with Category-Style Representation

1 code implementation • ECCV 2020 • Junjie Zhao, Donghuan Lu, Kai Ma, Yu Zhang, Yefeng Zheng

In this paper, we propose a novel deep image clustering framework to learn a category-style latent representation in which the category information is disentangled from image style and can be directly used as the cluster assignment.

Clustering Deep Clustering +1

Paper
Code

Fast and Accurate End-to-End Span-based Semantic Role Labeling as Word-based Graph Parsing

1 code implementation • COLING 2022 • Shilin Zhou, Qingrong Xia, Zhenghua Li, Yu Zhang, Yu Hong, Min Zhang

Moreover, we propose a simple constrained Viterbi procedure to ensure the legality of the output graph according to the constraints of the SRL structure.

Chinese Word Segmentation named-entity-recognition +3

Paper
Code

Deep Bayesian Video Frame Interpolation

1 code implementation • Conference 2022 • ZHIYANG YU, Yu Zhang, Xujie Xiang, Dongqing Zou, Xijun Chen, Jimmy S. Ren

Ranked #1 on Video Frame Interpolation on GoPro

Video Frame Interpolation

Paper
Code

Non-autoregressive Text Editing with Copy-aware Latent Alignments

1 code implementation • 11 Oct 2023 • Yu Zhang, Yue Zhang, Leyang Cui, Guohong Fu

In this work, we propose a novel non-autoregressive text editing method to circumvent the above issues, by modeling the edit process with latent CTC alignments.

Management Sentence +1

Paper
Code

Better Than Reference In Low Light Image Enhancement: Conditional Re-Enhancement Networks

1 code implementation • 26 Aug 2020 • Yu Zhang, Xiaoguang Di, Bin Zhang, Ruihang Ji, Chunhui Wang

The network takes low light images as input and the enhanced V channel as condition, then it can re-enhance the contrast and brightness of the low light image and at the same time reduce noise and color distortion.

Low-Light Image Enhancement

Paper
Code

A Survey on Multi-Task Learning

1 code implementation • 25 Jul 2017 • Yu Zhang, Qiang Yang

Multi-Task Learning (MTL) is a learning paradigm in machine learning and its aim is to leverage useful information contained in multiple related tasks to help improve the generalization performance of all the tasks.

Active Learning Clustering +3

Paper
Code

Multi-source Heterogeneous Domain Adaptation with Conditional Weighting Adversarial Network

1 code implementation • 6 Aug 2020 • Yuan Yao, Xutao Li, Yu Zhang, Yunming Ye

In reality, however, it is not uncommon to obtain samples from multiple heterogeneous domains.

Domain Adaptation

Paper
Code

Confidence Estimation for Attention-based Sequence-to-sequence Models for Speech Recognition

1 code implementation • 22 Oct 2020 • Qiujia Li, David Qiu, Yu Zhang, Bo Li, Yanzhang He, Philip C. Woodland, Liangliang Cao, Trevor Strohman

For various speech-related tasks, confidence scores from a speech recogniser are a useful measure to assess the quality of transcriptions.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Paper
Code

Personalized Image Semantic Segmentation

1 code implementation • ICCV 2021 • Yu Zhang, Chang-Bin Zhang, Peng-Tao Jiang, Ming-Ming Cheng, Feng Mao

In this paper, we address the problem of personalized image segmentation.

Image Segmentation Segmentation +1

Paper
Code

Seed-Guided Topic Discovery with Out-of-Vocabulary Seeds

1 code implementation • NAACL 2022 • Yu Zhang, Yu Meng, Xuan Wang, Sheng Wang, Jiawei Han

Discovering latent topics from text corpora has been studied for decades.

General Knowledge Topic Models

Paper
Code

Heterformer: Transformer-based Deep Node Representation Learning on Heterogeneous Text-Rich Networks

1 code implementation • 20 May 2022 • Bowen Jin, Yu Zhang, Qi Zhu, Jiawei Han

In heterogeneous text-rich networks, this task is more challenging due to (1) presence or absence of text: Some nodes are associated with rich textual information, while others are not; (2) diversity of types: Nodes and edges of multiple types form a heterogeneous network structure.

Clustering Graph Attention +5

Paper
Code

Feature Aggregation and Propagation Network for Camouflaged Object Detection

1 code implementation • 2 Dec 2022 • Tao Zhou, Yi Zhou, Chen Gong, Jian Yang, Yu Zhang

In this paper, we propose a novel Feature Aggregation and Propagation Network (FAP-Net) for camouflaged object detection.

Object object-detection +1

Paper
Code

Hierarchical Generative Modeling for Controllable Speech Synthesis

2 code implementations • ICLR 2019 • Wei-Ning Hsu, Yu Zhang, Ron J. Weiss, Heiga Zen, Yonghui Wu, Yuxuan Wang, Yuan Cao, Ye Jia, Zhifeng Chen, Jonathan Shen, Patrick Nguyen, Ruoming Pang

This paper proposes a neural sequence-to-sequence text-to-speech (TTS) model which can control latent attributes in the generated speech that are rarely annotated in the training data, such as speaking style, accent, background noise, and recording conditions.

Attribute Speech Synthesis

Paper
Code

Training Weakly Supervised Video Frame Interpolation With Events

1 code implementation • ICCV 2021 • ZHIYANG YU, Yu Zhang, Deyuan Liu, Dongqing Zou, Xijun Chen, Yebin Liu, Jimmy S. Ren

Though trained on low frame-rate videos, our framework outperforms existing models trained with full high frame-rate videos (and events) on both GoPro dataset and a new real event-based dataset.

Video Frame Interpolation

Paper
Code

E2NeRF: Event Enhanced Neural Radiance Fields from Blurry Images

1 code implementation • ICCV 2023 • Yunshan Qi, Lin Zhu, Yu Zhang, Jia Li

To solve this problem, we propose a novel Event-Enhanced NeRF (E2NeRF) by utilizing the combined data of a bio-inspired event camera and a standard RGB camera.

Deblurring Image Deblurring +2

Paper
Code

Effective Seed-Guided Topic Discovery by Integrating Multiple Types of Contexts

1 code implementation • 12 Dec 2022 • Yu Zhang, Yunyi Zhang, Martin Michalski, Yucheng Jiang, Yu Meng, Jiawei Han

Instead of mining coherent topics from a given text corpus in a completely unsupervised manner, seed-guided topic discovery methods leverage user-provided seed words to extract distinctive and coherent topics so that the mined topics can better cater to the user's interest.

Language Modelling Word Embeddings

Paper
Code

Weakly Supervised Multi-Label Classification of Full-Text Scientific Papers

1 code implementation • 24 Jun 2023 • Yu Zhang, Bowen Jin, Xiusi Chen, Yanzhen Shen, Yunyi Zhang, Yu Meng, Jiawei Han

Instead of relying on human-annotated training samples to build a classifier, weakly supervised scientific paper classification aims to classify papers only using category descriptions (e.g., category names, category-indicative keywords).

Multi-Label Classification

Paper
Code

VLLaVO: Mitigating Visual Gap through LLMs

1 code implementation • 6 Jan 2024 • Shuhao Chen, Yulong Zhang, Weisen Jiang, Jiangang Lu, Yu Zhang

Recent advances achieved by deep learning models rely on the independent and identically distributed assumption, hindering their applications in real-world scenarios with domain shifts.

Domain Generalization Language Modelling +2

Paper
Code

Learning Beam Codebooks with Neural Networks: Towards Environment-Aware mmWave MIMO

1 code implementation • 25 Feb 2020 • Yu Zhang, Muhammad Alrabeiah, Ahmed Alkhateeb

This leads to high beam training overhead and loss in the achievable beamforming gains.

Information Theory Signal Processing Information Theory

Paper
Code

MotifClass: Weakly Supervised Text Classification with Higher-order Metadata Information

1 code implementation • 7 Nov 2021 • Yu Zhang, Shweta Garg, Yu Meng, Xiusi Chen, Jiawei Han

We study the problem of weakly supervised text classification, which aims to classify text documents into a set of pre-defined categories with category surface names only and without any annotated training document provided.

text-classification Text Classification

Paper
Code

CoNet: Collaborative Cross Networks for Cross-Domain Recommendation

1 code implementation • 18 Apr 2018 • Guang-Neng Hu, Yu Zhang, Qiang Yang

CoNet enables dual knowledge transfer across domains by introducing cross connections from one base network to another and vice versa.

Recommendation Systems Transfer Learning

Paper
Code

Contrastive Graph Learning for Population-based fMRI Classification

1 code implementation • 26 Mar 2022 • Xuesong Wang, Lina Yao, Islem Rekik, Yu Zhang

Nonetheless, existing contrastive methods generate resemblant pairs only on pixel-level features of 3D medical images, while the functional connectivity that reveals critical cognitive information is under-explored.

Classification Graph Learning +1

Paper
Code

Memory-Efficient Reversible Spiking Neural Networks

1 code implementation • 13 Dec 2023 • Hong Zhang, Yu Zhang

In this paper, we propose the reversible spiking neural network to reduce the memory cost of intermediate activations and membrane potentials during training.

Paper
Code

KICGPT: Large Language Model with Knowledge in Context for Knowledge Graph Completion

1 code implementation • 4 Feb 2024 • Yanbin Wei, Qiushi Huang, James T. Kwok, Yu Zhang

Knowledge Graph Completion (KGC) is crucial for addressing knowledge graph incompleteness and supporting downstream applications.

In-Context Learning Language Modelling +1

Paper
Code

CUP: A Conservative Update Policy Algorithm for Safe Reinforcement Learning

1 code implementation • 15 Feb 2022 • Long Yang, Jiaming Ji, Juntao Dai, Yu Zhang, Pengfei Li, Gang Pan

Although using bounds as surrogate functions to design safe RL algorithms has appeared in some existing works, we develop them in at least three aspects: (i) We provide a rigorous theoretical analysis to extend the surrogate functions to the generalized advantage estimator (GAE).

reinforcement-learning Reinforcement Learning (RL) +2

Paper
Code

AnoDFDNet: A Deep Feature Difference Network for Anomaly Detection

1 code implementation • 29 Mar 2022 • Zhixue Wang, Yu Zhang, Lin Luo, Nan Wang

This paper proposes a novel anomaly detection (AD) approach for high-speed train images based on convolutional neural networks and the Vision Transformer.

Anomaly Detection object-detection +1

Paper
Code

MATNilm: Multi-appliance-task Non-intrusive Load Monitoring with Limited Labeled Data

1 code implementation • 27 Jul 2023 • Jing Xiong, Tianqi Hong, Dongbo Zhao, Yu Zhang

Non-intrusive load monitoring (NILM) identifies the status and power consumption of various household appliances by disaggregating the total power usage signal of an entire house.

energy management Non-Intrusive Load Monitoring

Paper
Code

Fisher Deep Domain Adaptation

1 code implementation • 12 Mar 2020 • Yinghua Zhang, Yu Zhang, Ying WEI, Kun Bai, Yangqiu Song, Qiang Yang

Though the learned representations are separable in the source domain, they usually have a large variance and samples with different class labels tend to overlap in the target domain, which yields suboptimal adaptation performance.

Domain Adaptation

Paper
Code

Logic-level Evidence Retrieval and Graph-based Verification Network for Table-based Fact Verification

1 code implementation • EMNLP 2021 • Qi Shi, Yu Zhang, Qingyu Yin, Ting Liu

Specifically, we first retrieve logic-level program-like evidence from the given table and statement as supplementary evidence for the table.

Fact Verification Retrieval +1

Paper
Code

FLEURS: Few-shot Learning Evaluation of Universal Representations of Speech

1 code implementation • 25 May 2022 • Alexis Conneau, Min Ma, Simran Khanuja, Yu Zhang, Vera Axelrod, Siddharth Dalmia, Jason Riesa, Clara Rivera, Ankur Bapna

We introduce FLEURS, the Few-shot Learning Evaluation of Universal Representations of Speech benchmark.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +6

Paper
Code

CROLoss: Towards a Customizable Loss for Retrieval Models in Recommender Systems

1 code implementation • 5 Aug 2022 • Yongxiang Tang, Wentao Bai, Guilin Li, Xialong Liu, Yu Zhang

In this paper, we proposed the Customizable Recall@N Optimization Loss (CROLoss), a loss function that can directly optimize the Recall@N metrics and is customizable for different choices of N. This proposed CROLoss formulation defines a more generalized loss function space, covering most of the conventional loss functions as special cases.

Recommendation Systems Retrieval

Paper
Code

DiffCR: A Fast Conditional Diffusion Framework for Cloud Removal from Optical Satellite Images

1 code implementation • 8 Aug 2023 • Xuechao Zou, Kai Li, Junliang Xing, Yu Zhang, Shiying Wang, Lei Jin, Pin Tao

Optical satellite images are a critical data source; however, cloud cover often compromises their quality, hindering image applications and analysis.

Cloud Removal Image Generation

Paper
Code

Very Deep Convolutional Networks for End-to-End Speech Recognition

2 code implementations • 10 Oct 2016 • Yu Zhang, William Chan, Navdeep Jaitly

Sequence-to-sequence models have shown success in end-to-end speech recognition.

speech-recognition Speech Recognition

Paper
Code

A Self-Training Framework Based on Multi-Scale Attention Fusion for Weakly Supervised Semantic Segmentation

1 code implementation • 10 May 2023 • Guoqing Yang, Chuang Zhu, Yu Zhang

Weakly supervised semantic segmentation (WSSS) based on image-level labels is challenging since it is hard to obtain complete semantic regions.

Denoising Weakly supervised Semantic Segmentation +1

Paper
Code

PIEClass: Weakly-Supervised Text Classification with Prompting and Noise-Robust Iterative Ensemble Training

1 code implementation • 23 May 2023 • Yunyi Zhang, Minhao Jiang, Yu Meng, Yu Zhang, Jiawei Han

Weakly-supervised text classification trains a classifier using the label name of each target class as the only supervision, which largely reduces human annotation efforts.

Pseudo Label Sentiment Analysis +3

Paper
Code

Unify word-level and span-level tasks: NJUNLP's Participation for the WMT2023 Quality Estimation Shared Task

1 code implementation • 23 Sep 2023 • Xiang Geng, Zhejian Lai, Yu Zhang, Shimin Tao, Hao Yang, Jiajun Chen, ShuJian Huang

We generate pseudo MQM data using parallel data from the WMT translation task.

Sentence

Paper
Code

Adversarial Representation Learning for Robust Patient-Independent Epileptic Seizure Detection

1 code implementation • 18 Sep 2019 • Xiang Zhang, Lina Yao, Manqing Dong, Zhe Liu, Yu Zhang, Yong Li

Furthermore, to enhance the explainability, we develop an attention mechanism to automatically learn the importance of each EEG channel in the seizure diagnosis procedure.

EEG Feature Engineering +2

Paper
Code

CODAR: A Contextual Duration-Aware Qubit Mapping for Various NISQ Devices

1 code implementation • 24 Feb 2020 • Haowei Deng, Yu Zhang, Quanxi Li

Quantum computing devices in the NISQ era share common features and challenges like limited connectivity between qubits.

Quantum Physics

Paper
Code

Neighbor-Vote: Improving Monocular 3D Object Detection through Neighbor Distance Voting

1 code implementation • 6 Jul 2021 • Xiaomeng Chu, Jiajun Deng, Yao Li, Zhenxun Yuan, Yanyong Zhang, Jianmin Ji, Yu Zhang

As cameras are increasingly deployed in new application domains such as autonomous driving, performing 3D object detection on monocular images becomes an important task for visual scene understanding.

Autonomous Driving Monocular 3D Object Detection +4

Paper
Code

RATE: Overcoming Noise and Sparsity of Textual Features in Real-Time Location Estimation

1 code implementation • 12 Nov 2021 • Yu Zhang, Wei Wei, Binxuan Huang, Kathleen M. Carley, Yan Zhang

Real-time location inference of social media users is fundamental to some spatial applications such as localized search and event detection.

Event Detection

Paper
Code

Temporal Consistent Automatic Video Colorization via Semantic Correspondence

1 code implementation • 13 May 2023 • Yu Zhang, Siqi Chen, Mingdao Wang, Xianlin Zhang, Chuang Zhu, Yue Zhang, Xueming Li

Extensive experiments demonstrate that our method outperforms other methods in maintaining temporal consistency both qualitatively and quantitatively.

Colorization Image Colorization +1

Paper
Code

Seed-Guided Fine-Grained Entity Typing in Science and Engineering Domains

1 code implementation • 23 Jan 2024 • Yu Zhang, Yunyi Zhang, Yanzhen Shen, Yu Deng, Lucian Popa, Larisa Shwartz, ChengXiang Zhai, Jiawei Han

In this paper, we study the task of seed-guided fine-grained entity typing in science and engineering domains, which takes the name and a few seed entities for each entity type as the only supervision and aims to classify new entity mentions into both seen and unseen types (i.e., those without seed entities).

Entity Typing Natural Language Inference

Paper
Code

Dynamic Inertial Poser (DynaIP): Part-Based Motion Dynamics Learning for Enhanced Human Pose Estimation with Sparse Inertial Sensors

1 code implementation • 2 Dec 2023 • Yu Zhang, Songpengcheng Xia, Lei Chu, Jiarui Yang, Qi Wu, Ling Pei

This paper introduces a novel human pose estimation approach using sparse inertial sensors, addressing the shortcomings of previous methods reliant on synthetic data.

Pose Estimation

Paper
Code

Neural Networks Based Beam Codebooks: Learning mmWave Massive MIMO Beams that Adapt to Deployment and Hardware

1 code implementation • 25 Jun 2020 • Muhammad Alrabeiah, Yu Zhang, Ahmed Alkhateeb

To overcome these limitations, this paper develops an efficient online machine learning framework that learns how to adapt the codebook beam patterns to the specific deployment, surrounding environment, user distribution, and hardware characteristics.

Paper
Code

SPColor: Semantic Prior Guided Exemplar-based Image Colorization

1 code implementation • 13 Apr 2023 • Siqi Chen, Xueming Li, Xianlin Zhang, Mingdao Wang, Yu Zhang, Yue Zhang

Previous methods search for correspondence across the entire reference image, and this type of global matching is prone to mismatches.

Colorization Image Colorization +1

Paper
Code

Capturing Conversion Rate Fluctuation during Sales Promotions: A Novel Historical Data Reuse Approach

1 code implementation • 22 May 2023 • Zhangming Chan, Yu Zhang, Shuguang Han, Yong Bai, Xiang-Rong Sheng, Siyuan Lou, Jiacen Hu, Baolin Liu, Yuning Jiang, Jian Xu, Bo Zheng

However, we observe that a well-trained CVR prediction model often performs sub-optimally during sales promotions.

Recommendation Systems Retrieval

Paper
Code

How to Estimate Model Transferability of Pre-Trained Speech Models?

1 code implementation • 1 Jun 2023 • Zih-Ching Chen, Chao-Han Huck Yang, Bo Li, Yu Zhang, Nanxin Chen, Shuo-Yiin Chang, Rohit Prabhavalkar, Hung-Yi Lee, Tara N. Sainath

In this work, we introduce a "score-based assessment" framework for estimating the transferability of pre-trained speech models (PSMs) for fine-tuning target tasks.

Paper
Code

Gotta: Generative Few-shot Question Answering by Prompt-based Cloze Data Augmentation

1 code implementation • 7 Jun 2023 • Xiusi Chen, Yu Zhang, Jinliang Deng, Jyun-Yu Jiang, Wei Wang

Few-shot question answering (QA) aims at precisely discovering answers to a set of questions from context passages while only a few training samples are available.

Data Augmentation Question Answering

Paper
Code

FLEET: Butterfly Estimation from a Bipartite Graph Stream

1 code implementation • 8 Dec 2018 • Seyed-Vahid Sanei-Mehri, Yu Zhang, Ahmet Erdem Sariyuce, Srikanta Tirthapura

We consider space-efficient single-pass estimation of the number of butterflies, a fundamental bipartite graph motif, from a massive bipartite graph stream where each edge represents a connection between entities in two different partitions.

Data Structures and Algorithms

Paper
Code

A Coarse-to-Fine Labeling Framework for Joint Word Segmentation, POS Tagging, and Constituent Parsing

1 code implementation • CoNLL (EMNLP) 2021 • Yang Hou, Houquan Zhou, Zhenghua Li, Yu Zhang, Min Zhang, Zhefeng Wang, Baoxing Huai, Nicholas Jing Yuan

In the coarse labeling stage, the joint model outputs a bracketed tree, in which each node corresponds to one of four labels (i.e., phrase, subphrase, word, subword).

Part-Of-Speech Tagging POS +2

Paper
Code

Improving generalizability of distilled self-supervised speech processing models under distorted settings

1 code implementation • 14 Oct 2022 • Kuan-Po Huang, Yu-Kuan Fu, Tsu-Yuan Hsu, Fabian Ritter Gutierrez, Fan-Lin Wang, Liang-Hsuan Tseng, Yu Zhang, Hung-Yi Lee

Self-supervised learned (SSL) speech pre-trained models perform well across various speech processing tasks.

Knowledge Distillation

Paper
Code

HiPose: Hierarchical Binary Surface Encoding and Correspondence Pruning for RGB-D 6DoF Object Pose Estimation

1 code implementation • 21 Nov 2023 • Yongliang Lin, Yongzhi Su, Praveen Nathan, Sandeep Inuganti, Yan Di, Martin Sundermeyer, Fabian Manhardt, Didier Stricker, Jason Rambach, Yu Zhang

In this work, we present a novel dense-correspondence method for 6DoF object pose estimation from a single RGB-D image.

Pose Estimation

Paper
Code

Online Test-Time Adaptation of Spatial-Temporal Traffic Flow Forecasting

1 code implementation • 8 Jan 2024 • Pengxin Guo, Pengrong Jin, Ziyue Li, Lei Bai, Yu Zhang

To make the model trained on historical data better adapt to future data in a fully online manner, this paper conducts the first study of the online test-time adaptation techniques for spatial-temporal traffic flow forecasting problems.

Ranked #4 on Traffic Prediction on PeMS07

Test-time Adaptation Traffic Prediction

Paper
Code

Deep Multi-Task Augmented Feature Learning via Hierarchical Graph Neural Network

1 code implementation • 12 Feb 2020 • Pengxin Guo, Chang Deng, Linjie Xu, Xiaonan Huang, Yu Zhang

The proposed feature augmentation strategy can be used in many deep multi-task learning models.

Multi-Task Learning

Paper
Code

Is POS Tagging Necessary or Even Helpful for Neural Dependency Parsing?

1 code implementation • 6 Mar 2020 • Houquan Zhou, Yu Zhang, Zhenghua Li, Min Zhang

In the pre-deep-learning era, part-of-speech tags were considered indispensable ingredients for feature engineering in dependency parsing.

Dependency Parsing Feature Engineering +4

Paper
Code

LEMON: Language-Based Environment Manipulation via Execution-Guided Pre-training

2 code implementations • 20 Jan 2022 • Qi Shi, Qian Liu, Bei Chen, Yu Zhang, Ting Liu, Jian-Guang Lou

In this work, we propose LEMON, a general framework for language-based environment manipulation tasks.

Language Modelling

Paper
Code

Transforming Visual Scene Graphs to Image Captions

1 code implementation • 3 May 2023 • Xu Yang, Jiawei Peng, Zihua Wang, Haiyang Xu, Qinghao Ye, Chenliang Li, Songfang Huang, Fei Huang, Zhangzikang Li, Yu Zhang

In TSG, we apply multi-head attention (MHA) to design the Graph Neural Network (GNN) for embedding scene graphs.

Attribute Descriptive +1

Paper
Code

A Unifying Framework of Attention-based Neural Load Forecasting

1 code implementation • 8 May 2023 • Jing Xiong, Yu Zhang

In this paper, we propose a unifying deep learning framework for load forecasting, which includes time-varying feature weighting, hierarchical temporal attention, and feature-reinforced error correction.

Load Forecasting

Paper
Code

Explanation Graph Generation via Generative Pre-training over Synthetic Graphs

1 code implementation • 1 Jun 2023 • Han Cui, Shangzhan Li, Yu Zhang, Qi Shi

The generation of explanation graphs is a significant task that aims to produce explanation graphs in response to user input, revealing the internal reasoning process.

Graph Generation Language Modelling

Paper
Code

LEFormer: A Hybrid CNN-Transformer Architecture for Accurate Lake Extraction from Remote Sensing Imagery

1 code implementation • 8 Aug 2023 • Ben Chen, Xuechao Zou, Yu Zhang, Jiayu Li, Kai Li, Junliang Xing, Pin Tao

LEFormer contains three main modules: CNN encoder, Transformer encoder, and cross-encoder fusion.

Paper
Code

Selective Partial Domain Adaptation

2 code implementations • British Machine Vision Conference 2022 • Pengxin Guo, Jinjing Zhu, Yu Zhang

To solve this problem, we propose a Selective Partial Domain Adaptation (SPDA) method, which selects useful data for the adaptation to the target domain.

Ranked #1 on Partial Domain Adaptation on VisDA2017

Partial Domain Adaptation

Paper
Code

A Unified Taxonomy-Guided Instruction Tuning Framework for Entity Set Expansion and Taxonomy Expansion

1 code implementation • 20 Feb 2024 • Yanzhen Shen, Yu Zhang, Yunyi Zhang, Jiawei Han

Entity Set Expansion, Taxonomy Expansion, and Seed-Guided Taxonomy Construction are three representative tasks that can be used to automatically populate an existing taxonomy with new entities.

Language Modelling Large Language Model +1

Paper
Code

Effective Structured Prompting by Meta-Learning and Representative Verbalizer

1 code implementation • 1 Jun 2023 • Weisen Jiang, Yu Zhang, James T. Kwok

Combining meta-learning the prompt pool and RepVerb, we propose MetaPrompter for effective structured prompting.

Meta-Learning

Paper
Code

UAlign: Pushing the Limit of Template-free Retrosynthesis Prediction with Unsupervised SMILES Alignment

1 code implementation • 25 Mar 2024 • Kaipeng Zeng, Bo Yang, Xin Zhao, Yu Zhang, Fan Nie, Xiaokang Yang, Yaohui Jin, Yanyan Xu

Single-step retrosynthesis prediction, a crucial step in the planning process, has witnessed a surge in interest in recent years due to advancements in AI for science.

Graph-to-Sequence molecular representation +3

Paper
Code

Learning to Multitask

no code implementations • NeurIPS 2018 • Yu Zhang, Ying Wei, Qiang Yang

Based on such a training set, L2MT first uses a proposed layerwise graph neural network to learn task embeddings for all the tasks in a multitask problem and then learns an estimation function to estimate the relative test error based on task embeddings and the representation of the multitask model based on a unified formulation.

Paper
Add Code

Image Co-segmentation via Multi-scale Local Shape Transfer

no code implementations • 15 May 2018 • Wei Teng, Yu Zhang, Xiaowu Chen, Jia Li, Zhiqiang He

Image co-segmentation is a challenging task in computer vision that aims to segment all pixels of the objects from a predefined semantic category.

Paper
Add Code

Parameter Transfer Unit for Deep Neural Networks

no code implementations • 23 Apr 2018 • Yinghua Zhang, Yu Zhang, Qiang Yang

Unfortunately, the transferability is usually defined as discrete states and it differs with domains and network architectures.

Paper
Add Code

Expert Finding in Community Question Answering: A Review

no code implementations • 21 Apr 2018 • Sha Yuan, Yu Zhang, Jie Tang, Juan Bautista Cabotà

Moreover, we use innovative diagrams to clarify several important concepts of ensemble learning, and find that ensemble models with several specific single models can further boost performance.

Community Question Answering Ensemble Learning +2

Paper
Add Code

Cross-domain Dialogue Policy Transfer via Simultaneous Speech-act and Slot Alignment

no code implementations • 20 Apr 2018 • Kaixiang Mo, Yu Zhang, Qiang Yang, Pascale Fung

Also, they depend on either common slots or slot entropy, which are not available when the source and target slots are totally disjoint and no database is available to calculate the slot entropy.

Paper
Add Code

Image Matters: Visually modeling user behaviors using Advanced Model Server

no code implementations • 17 Nov 2017 • Tiezheng Ge, Liqin Zhao, Guorui Zhou, Keyu Chen, Shuying Liu, Huimin Yi, Zelin Hu, Bochao Liu, Peng Sun, Haoyu Liu, Pengtao Yi, Sui Huang, Zhiqiang Zhang, Xiaoqiang Zhu, Yu Zhang, Kun Gai

So we propose to model user preference jointly with user behavior ID features and behavior images.

Click-Through Rate Prediction

Paper
Add Code

Weakly-supervised Relation Extraction by Pattern-enhanced Embedding Learning

no code implementations • 9 Nov 2017 • Meng Qu, Xiang Ren, Yu Zhang, Jiawei Han

We propose a novel co-training framework with a distributional module and a pattern module.

Knowledge Base Completion Relation +1

Paper
Add Code

Explicablility as Minimizing Distance from Expected Behavior

no code implementations • 16 Nov 2016 • Anagha Kulkarni, Yantian Zha, Tathagata Chakraborti, Satya Gautam Vadlamudi, Yu Zhang, Subbarao Kambhampati

In order to have effective human-AI collaboration, it is necessary to address how the AI agent's behavior is being perceived by the humans-in-the-loop.

Paper
Add Code

Fine Grained Knowledge Transfer for Personalized Task-oriented Dialogue Systems

no code implementations • 11 Nov 2017 • Kaixiang Mo, Yu Zhang, Qiang Yang, Pascale Fung

Training a personalized dialogue system requires a lot of data, and the data collected for a single user is usually insufficient.

Sentence Task-Oriented Dialogue Systems +1

Paper
Add Code

Integrating User and Agent Models: A Deep Task-Oriented Dialogue System

no code implementations • 10 Nov 2017 • Weiyan Wang, Yuxiang Wu, Yu Zhang, Zhongqi Lu, Kaixiang Mo, Qiang Yang

Then the built user model is used as a leverage to train the agent model by deep reinforcement learning.

Task-Oriented Dialogue Systems

Paper
Add Code

Learning Graphical Models from a Distributed Stream

no code implementations • 5 Oct 2017 • Yu Zhang, Srikanta Tirthapura, Graham Cormode

We study Bayesian networks, the workhorse of graphical models, and present a communication-efficient method for continuously learning and maintaining a Bayesian network model over data that is arriving as a distributed stream partitioned across multiple processors.

Management

Paper
Add Code

A Deep Neural Network for Chinese Zero Pronoun Resolution

no code implementations • 20 Apr 2016 • Qingyu Yin, Wei-Nan Zhang, Yu Zhang, Ting Liu

This is because zero pronouns have no descriptive information, which results in difficulty in explicitly capturing their semantic similarities with antecedents.

Chinese Zero Pronoun Resolution Descriptive

Paper
Add Code

Learning Latent Representations for Speech Generation and Transformation

no code implementations • 13 Apr 2017 • Wei-Ning Hsu, Yu Zhang, James Glass

In this paper, we apply a convolutional VAE to model the generative process of natural speech.

Paper
Add Code

Unsupervised Domain Adaptation for Robust Speech Recognition via Variational Autoencoder-Based Data Augmentation

no code implementations • 19 Jul 2017 • Wei-Ning Hsu, Yu Zhang, James Glass

Research on robust speech recognition can be regarded as trying to overcome this domain mismatch issue.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

Paper
Add Code

Flexible End-to-End Dialogue System for Knowledge Grounded Conversation

no code implementations • 13 Sep 2017 • Wenya Zhu, Kaixiang Mo, Yu Zhang, Zhangbin Zhu, Xuezheng Peng, Qiang Yang

Although existing generative question answering (QA) systems can be applied to knowledge grounded conversation, they either have at most one entity in a response or cannot deal with out-of-vocabulary entities.

Generative Question Answering

Paper
Add Code

Learning to Transfer

no code implementations • 18 Aug 2017 • Ying Wei, Yu Zhang, Qiang Yang

We establish the L2T framework in two stages: 1) we first learn a reflection function encrypting transfer learning skills from experiences; and 2) we infer what and how to transfer for a newly arrived pair of domains by optimizing the reflection function.

Transfer Learning

Paper
Add Code

AI Challenges in Human-Robot Cognitive Teaming

no code implementations • 15 Jul 2017 • Tathagata Chakraborti, Subbarao Kambhampati, Matthias Scheutz, Yu Zhang

Among the many anticipated roles for robots in the future is that of being a human teammate.

Paper
Add Code

Causes and Corrections for Bimodal Multipath Scanning with Structured Light

no code implementations • 8 Jun 2017 • Yu Zhang, Daniel L. Lau, Ying Yu

Structured light illumination is an active 3-D scanning technique based on projecting/capturing a set of striped patterns and measuring the warping of the patterns as they reflect off a target object's surface.

Paper
Add Code

Structured Light Phase Measuring Profilometry Pattern Design for Binary Spatial Light Modulators

no code implementations • 8 Jun 2017 • Daniel L. Lau, Yu Zhang, Kai Liu

In the case of phase measuring profilometry (PMP), the projected patterns are composed of a rolling sinusoidal wave, but as a set of time-multiplexed patterns, PMP requires the target surface to remain motionless or for scanning to be performed at such high rates that any movement is small.

Paper
Add Code

Plan Explanations as Model Reconciliation: Moving Beyond Explanation as Soliloquy

no code implementations • 28 Jan 2017 • Tathagata Chakraborti, Sarath Sreedharan, Yu Zhang, Subbarao Kambhampati

When AI systems interact with humans in the loop, they are often called on to provide explanations for their plans and behavior.

Paper
Add Code

Personalizing a Dialogue System with Transfer Reinforcement Learning

no code implementations • 10 Oct 2016 • Kaixiang Mo, Shuangyin Li, Yu Zhang, Jiajun Li, Qiang Yang

One way to solve this problem is to consider a collection of multiple users' data as a source domain and an individual user's data as a target domain, and to perform a transfer learning from the source to the target domain.

reinforcement-learning Reinforcement Learning (RL) +1

Paper
Add Code

Group Component Analysis for Multiblock Data: Common and Individual Feature Extraction

no code implementations • 17 Dec 2012 • Guoxu Zhou, Andrzej Cichocki, Yu Zhang, Danilo Mandic

Very often data we encounter in practice is a collection of matrices rather than a single matrix.

blind source separation Clustering +1

Paper
Add Code

Sequence-based Multimodal Apprenticeship Learning For Robot Perception and Decision Making

no code implementations • 24 Feb 2017 • Fei Han, Xue Yang, Yu Zhang, Hao Zhang

Apprenticeship learning has recently attracted wide attention due to its capability of allowing robots to learn physical tasks directly from demonstrations provided by human experts.

Decision Making

Paper
Add Code

Simultaneous Feature and Body-Part Learning for Real-Time Robot Awareness of Human Behaviors

no code implementations • 24 Feb 2017 • Fei Han, Xue Yang, Christopher Reardon, Yu Zhang, Hao Zhang

We formulate FABL as a regression-like optimization problem with structured sparsity-inducing norms to model interrelationships of body parts and features.

Paper
Add Code

Latent Sequence Decompositions

no code implementations • 10 Oct 2016 • William Chan, Yu Zhang, Quoc Le, Navdeep Jaitly

We present the Latent Sequence Decompositions (LSD) framework.

speech-recognition Speech Recognition +1

Paper
Add Code

Multivariate Regression with Grossly Corrupted Observations: A Robust Approach and its Applications

no code implementations • 11 Jan 2017 • Xiaowei Zhang, Chi Xu, Yu Zhang, Tingshao Zhu, Li Cheng

The implementation of our approach and comparison methods as well as the involved datasets are made publicly available in support of the open-source and reproducible research initiatives.

Hand Pose Estimation regression

Paper
Add Code

Visual Compiler: Synthesizing a Scene-Specific Pedestrian Detector and Pose Estimator

no code implementations • 15 Dec 2016 • Namhoon Lee, Xinshuo Weng, Vishnu Naresh Boddeti, Yu Zhang, Fares Beainy, Kris Kitani, Takeo Kanade

We introduce the concept of a Visual Compiler that generates a scene-specific pedestrian detector and pose estimator without any pedestrian observations.

Human Detection Pose Estimation

Paper
Add Code

Learning to Search on Manifolds for 3D Pose Estimation of Articulated Objects

no code implementations • 2 Dec 2016 • Yu Zhang, Chi Xu, Li Cheng

This paper focuses on the challenging problem of 3D pose estimation of a diverse spectrum of articulated objects from single depth images.

3D Pose Estimation Structured Prediction

Paper
Add Code

Lie-X: Depth Image Based Articulated Object Pose Estimation, Tracking, and Action Recognition on Lie Groups

no code implementations • 13 Sep 2016 • Chi Xu, Lakshmi Narasimhan Govindarajan, Yu Zhang, Li Cheng

Pose estimation, tracking, and action recognition of articulated objects from depth images are important and challenging problems, which are normally considered separately.

Action Recognition Pose Estimation +2

Paper
Add Code

Proactive Decision Support using Automated Planning

no code implementations • 24 Jun 2016 • Satya Gautam Vadlamudi, Tathagata Chakraborti, Yu Zhang, Subbarao Kambhampati

Proactive decision support (PDS) helps in improving the decision making experience of human decision makers in human-in-the-loop planning environments.

Decision Making

Paper
Add Code

Exploit Bounding Box Annotations for Multi-label Object Recognition

no code implementations • CVPR 2016 • Hao Yang, Joey Tianyi Zhou, Yu Zhang, Bin-Bin Gao, Jianxin Wu, Jianfei Cai

With strong labels, our framework is able to achieve state-of-the-art results in both datasets.

Ranked #16 on Multi-Label Classification on PASCAL VOC 2007

Multi-Label Classification Object +1

Paper
Add Code

Neural Recovery Machine for Chinese Dropped Pronoun

no code implementations • 7 May 2016 • Wei-Nan Zhang, Ting Liu, Qingyu Yin, Yu Zhang

Dropped pronouns (DPs) are ubiquitous in pro-drop languages such as Chinese and Japanese.

Feature Engineering

Paper
Add Code

Plan Explicability and Predictability for Robot Task Planning

no code implementations • 25 Nov 2015 • Yu Zhang, Sarath Sreedharan, Anagha Kulkarni, Tathagata Chakraborti, Hankz Hankui Zhuo, Subbarao Kambhampati

Hence, for such agents to be helpful, one important requirement is for them to synthesize plans that can be easily understood by humans.

Motion Planning Robot Task Planning

Paper
Add Code

Recurrent Neural Network Encoder with Attention for Community Question Answering

no code implementations • 23 Mar 2016 • Wei-Ning Hsu, Yu Zhang, James Glass

We apply a general recurrent neural network (RNN) encoder framework to community question answering (cQA) tasks.

Community Question Answering Information Retrieval +2

Paper
Add Code

Storm Detection by Visual Learning Using Satellite Images

no code implementations • 1 Mar 2016 • Yu Zhang, Stephen Wistar, Jia Li, Michael Steinberg, James Z. Wang

In our system, we extract and summarize important visual storm evidence from satellite image sequences in the way that meteorologists interpret the images.

Weather Forecasting

Paper
Add Code

On Training Bi-directional Neural Network Language Model with Noise Contrastive Estimation

1 code implementation • 19 Feb 2016 • Tianxing He, Yu Zhang, Jasha Droppo, Kai Yu

We propose to train a bi-directional neural network language model (NNLM) with noise contrastive estimation (NCE).

Language Modelling

Paper
Code

Highway Long Short-Term Memory RNNs for Distant Speech Recognition

no code implementations • 30 Oct 2015 • Yu Zhang, Guoguo Chen, Dong Yu, Kaisheng Yao, Sanjeev Khudanpur, James Glass

In this paper, we extend the deep long short-term memory (DLSTM) recurrent neural networks by introducing gated direct connections between memory cells in adjacent layers.

Distant Speech Recognition speech-recognition

Paper
Add Code

Prediction-Adaptation-Correction Recurrent Neural Networks for Low-Resource Language Speech Recognition

no code implementations • 30 Oct 2015 • Yu Zhang, Ekapol Chuangsuwanich, James Glass, Dong Yu

In this paper, we investigate the use of prediction-adaptation-correction recurrent neural networks (PAC-RNNs) for low-resource speech recognition.

speech-recognition Speech Recognition +1

Paper
Add Code

Linked Component Analysis from Matrices to High Order Tensors: Applications to Biomedical Data

no code implementations • 29 Aug 2015 • Guoxu Zhou, Qibin Zhao, Yu Zhang, Tülay Adalı, Shengli Xie, Andrzej Cichocki

With the increasing availability of various sensor technologies, we now have access to large amounts of multi-block (also called multi-set, multi-relational, or multi-view) data that need to be jointly analyzed to explore their latent connections.

Tensor Decomposition

Paper
Add Code

Weakly Supervised Fine-Grained Image Categorization

no code implementations • 20 Apr 2015 • Yu Zhang, Xiu-Shen Wei, Jianxin Wu, Jianfei Cai, Jiangbo Lu, Viet-Anh Nguyen, Minh N. Do

Most existing works heavily rely on object / part detectors to build the correspondence between object parts by using object or object part annotations inside training images.

Fine-Grained Image Classification Image Categorization +1

Paper
Add Code

Plan or not: Remote Human-robot Teaming with Incomplete Task Information

no code implementations • 9 Dec 2014 • Vignesh Narayanan, Yu Zhang, Nathaniel Mendoza, Subbarao Kambhampati

While information asymmetry can be desirable sometimes, it may also lead to the robot choosing improper actions that negatively influence the teaming performance.

Paper
Add Code

Learning of Agent Capability Models with Applications in Multi-agent Planning

no code implementations • 4 Nov 2014 • Yu Zhang, Subbarao Kambhampati

Thus far, there are two common representations of agent models: MDP based and action based, which are both based on action modeling.

Paper
Add Code

A Formal Analysis of Required Cooperation in Multi-agent Planning

no code implementations • 22 Apr 2014 • Yu Zhang, Subbarao Kambhampati

Then, by dividing the problems that require cooperation (referred to as RC problems) into two classes -- problems with heterogeneous and homogeneous agents, we aim to identify all the conditions that can cause RC in these two classes.

Paper
Add Code

Electricity Market Forecasting via Low-Rank Multi-Kernel Learning

no code implementations • 2 Oct 2013 • Vassilis Kekatos, Yu Zhang, Georgios B. Giannakis

The smart grid vision entails advanced information technology and data analytics to enhance the efficiency, sustainability, and economics of the power grid infrastructure.

Computational Efficiency

Paper
Add Code

Frequency Recognition in SSVEP-based BCI using Multiset Canonical Correlation Analysis

no code implementations • 26 Aug 2013 • Yu Zhang, Guoxu Zhou, Jing Jin, Xingyu Wang, Andrzej Cichocki

Canonical correlation analysis (CCA) has been one of the most popular methods for frequency recognition in steady-state visual evoked potential (SSVEP)-based brain-computer interfaces (BCIs).

EEG SSVEP

Paper
Add Code

An Active Learning Approach for Jointly Estimating Worker Performance and Annotation Reliability with Crowdsourced Data

no code implementations • 16 Jan 2014 • Liyue Zhao, Yu Zhang, Gita Sukthankar

Crowdsourcing platforms offer a practical solution to the problem of affordably annotating large datasets for training supervised classifiers.

Active Learning

Paper
Add Code

Ensemble of Distributed Learners for Online Classification of Dynamic Data Streams

no code implementations • 24 Aug 2013 • Luca Canzian, Yu Zhang, Mihaela van der Schaar

We present an efficient distributed online learning scheme to classify data captured from distributed, heterogeneous, and dynamic data sources.

Ensemble Learning General Classification

Paper
Add Code
