Search Results for author: Yun Tang

Found 31 papers, 7 papers with code

ProTIP: Probabilistic Robustness Verification on Text-to-Image Diffusion Models against Stochastic Perturbation

1 code implementation • 23 Feb 2024 • Yi Zhang, Yun Tang, Wenjie Ruan, Xiaowei Huang, Siddartha Khastgir, Paul Jennings, Xingyu Zhao

Text-to-Image (T2I) Diffusion Models (DMs) have shown impressive abilities in generating high-quality images based on simple text descriptions.

Domain Knowledge Distillation from Large Language Model: An Empirical Study in the Autonomous Driving Domain

no code implementations • 17 Jul 2023 • Yun Tang, Antonio A. Bruto da Costa, Jason Zhang, Irvine Patrick, Siddartha Khastgir, Paul Jennings

Engineering knowledge-based (or expert) systems require extensive manual effort and domain knowledge.

Autonomous Driving Knowledge Distillation +3

Hybrid Transducer and Attention based Encoder-Decoder Modeling for Speech-to-Text Tasks

no code implementations • 4 May 2023 • Yun Tang, Anna Y. Sun, Hirofumi Inaguma, Xinyue Chen, Ning Dong, Xutai Ma, Paden D. Tomasello, Juan Pino

In order to leverage strengths of both modeling methods, we propose a solution by combining Transducer and Attention based Encoder-Decoder (TAED) for speech-to-text tasks.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

ESPnet-ST-v2: Multipurpose Spoken Language Translation Toolkit

1 code implementation • 10 Apr 2023 • Brian Yan, Jiatong Shi, Yun Tang, Hirofumi Inaguma, Yifan Peng, Siddharth Dalmia, Peter Polák, Patrick Fernandes, Dan Berrebbi, Tomoki Hayashi, Xiaohui Zhang, Zhaoheng Ni, Moto Hira, Soumi Maiti, Juan Pino, Shinji Watanabe

ESPnet-ST-v2 is a revamp of the open-source ESPnet-ST toolkit necessitated by the broadening interests of the spoken language translation community.

Benchmarking Simultaneous Speech-to-Text Translation +2

7,871

Enhancing Speech-to-Speech Translation with Multiple TTS Targets

no code implementations • 10 Apr 2023 • Jiatong Shi, Yun Tang, Ann Lee, Hirofumi Inaguma, Changhan Wang, Juan Pino, Shinji Watanabe

It has been known that direct speech-to-speech translation (S2ST) models usually suffer from the data scarcity issue because of the limited existing parallel materials for both source and target speech.

Speech-to-Speech Translation Speech-to-Text Translation +1

UnitY: Two-pass Direct Speech-to-speech Translation with Discrete Units

1 code implementation • 15 Dec 2022 • Hirofumi Inaguma, Sravya Popuri, Ilia Kulikov, Peng-Jen Chen, Changhan Wang, Yu-An Chung, Yun Tang, Ann Lee, Shinji Watanabe, Juan Pino

We enhance the model performance by subword prediction in the first-pass decoder, advanced two-pass decoder architecture design and search strategy, and better training regularization.

Denoising Speech-to-Speech Translation +3

29,237

Improving Speech-to-Speech Translation Through Unlabeled Text

no code implementations • 26 Oct 2022 • Xuan-Phi Nguyen, Sravya Popuri, Changhan Wang, Yun Tang, Ilia Kulikov, Hongyu Gong

Direct speech-to-speech translation (S2ST) is among the most challenging problems in the translation paradigm due to the significant scarcity of S2ST data.

Machine Translation speech-recognition +3

Named Entity Detection and Injection for Direct Speech Translation

no code implementations • 21 Oct 2022 • Marco Gaido, Yun Tang, Ilia Kulikov, Rongqing Huang, Hongyu Gong, Hirofumi Inaguma

In a sentence, certain words are critical for its semantic.

Sentence Translation

Simple and Effective Unsupervised Speech Translation

no code implementations • 18 Oct 2022 • Changhan Wang, Hirofumi Inaguma, Peng-Jen Chen, Ilia Kulikov, Yun Tang, Wei-Ning Hsu, Michael Auli, Juan Pino

The amount of labeled data to train models for speech tasks is limited for most languages, however, the data scarcity is exacerbated for speech translation which requires labeled data covering two different languages.

Machine Translation speech-recognition +6

Unified Speech-Text Pre-training for Speech Translation and Recognition

no code implementations • ACL 2022 • Yun Tang, Hongyu Gong, Ning Dong, Changhan Wang, Wei-Ning Hsu, Jiatao Gu, Alexei Baevski, Xian Li, Abdelrahman Mohamed, Michael Auli, Juan Pino

Two pre-training configurations for speech translation and recognition, respectively, are presented to alleviate subtask interference.

speech-recognition Speech Recognition +1

A Survey on Scenario-Based Testing for Automated Driving Systems in High-Fidelity Simulation

no code implementations • 2 Dec 2021 • Ziyuan Zhong, Yun Tang, Yuan Zhou, Vania de Oliveira Neves, Yang Liu, Baishakhi Ray

To bridge this gap, in this work, we provide a generic formulation of scenario-based testing in high-fidelity simulation and conduct a literature review on the existing works.

From Start to Finish: Latency Reduction Strategies for Incremental Speech Synthesis in Simultaneous Speech-to-Speech Translation

no code implementations • 15 Oct 2021 • Danni Liu, Changhan Wang, Hongyu Gong, Xutai Ma, Yun Tang, Juan Pino

Speech-to-speech translation (S2ST) converts input speech to speech in another language.

Data Augmentation Speech Synthesis +2

Direct Simultaneous Speech-to-Speech Translation with Variational Monotonic Multihead Attention

no code implementations • 15 Oct 2021 • Xutai Ma, Hongyu Gong, Danni Liu, Ann Lee, Yun Tang, Peng-Jen Chen, Wei-Ning Hsu, Philipp Koehn, Juan Pino

We present a direct simultaneous speech-to-speech translation (Simul-S2ST) model. Furthermore, the generation of translation is independent from intermediate text representations.

Speech Synthesis Speech-to-Speech Translation +1

Contrastive Clustering to Mine Pseudo Parallel Data for Unsupervised Translation

no code implementations • ICLR 2022 • Xuan-Phi Nguyen, Hongyu Gong, Yun Tang, Changhan Wang, Philipp Koehn, Shafiq Joty

Modern unsupervised machine translation systems mostly train their models by generating synthetic parallel training data from large unlabeled monolingual corpora of different languages through various means, such as iterative back-translation.

Clustering Translation +1

Multilingual Speech Translation from Efficient Finetuning of Pretrained Models

no code implementations • ACL 2021 • Xian Li, Changhan Wang, Yun Tang, Chau Tran, Yuqing Tang, Juan Pino, Alexei Baevski, Alexis Conneau, Michael Auli

We present a simple yet effective approach to build multilingual speech-to-text (ST) translation through efficient transfer learning from a pretrained speech encoder and text decoder.

Text Generation Transfer Learning +1

FST: the FAIR Speech Translation System for the IWSLT21 Multilingual Shared Task

no code implementations • ACL (IWSLT) 2021 • Yun Tang, Hongyu Gong, Xian Li, Changhan Wang, Juan Pino, Holger Schwenk, Naman Goyal

In this paper, we describe our end-to-end multilingual speech translation system submitted to the IWSLT 2021 evaluation campaign on the Multilingual Speech Translation shared task.

Transfer Learning Translation

Direct speech-to-speech translation with discrete units

1 code implementation • ACL 2022 • Ann Lee, Peng-Jen Chen, Changhan Wang, Jiatao Gu, Sravya Popuri, Xutai Ma, Adam Polyak, Yossi Adi, Qing He, Yun Tang, Juan Pino, Wei-Ning Hsu

When target text transcripts are available, we design a joint speech and text training framework that enables the model to generate dual modality output (speech and text) simultaneously in the same inference pass.

Speech-to-Speech Translation Text Generation +1

157

Improving Speech Translation by Understanding and Learning from the Auxiliary Text Translation Task

no code implementations • ACL 2021 • Yun Tang, Juan Pino, Xian Li, Changhan Wang, Dmitriy Genzel

Pretraining and multitask learning are widely used to improve the speech to text translation performance.

Knowledge Distillation Speech-to-Text Translation +2

Pay Better Attention to Attention: Head Selection in Multilingual and Multi-Domain Sequence Modeling

no code implementations • NeurIPS 2021 • Hongyu Gong, Yun Tang, Juan Pino, Xian Li

We further propose attention sharing strategies to facilitate parameter sharing and specialization in multilingual and multi-domain sequence modeling.

speech-recognition Speech Recognition +2

Multilingual Speech Translation with Efficient Finetuning of Pretrained Models

no code implementations • 24 Oct 2020 • Xian Li, Changhan Wang, Yun Tang, Chau Tran, Yuqing Tang, Juan Pino, Alexei Baevski, Alexis Conneau, Michael Auli

We present a simple yet effective approach to build multilingual speech-to-text (ST) translation by efficient transfer learning from pretrained speech encoder and text decoder.

Cross-Lingual Transfer Text Generation +2

A General Multi-Task Learning Framework to Leverage Text Data for Speech to Text Tasks

no code implementations • 21 Oct 2020 • Yun Tang, Juan Pino, Changhan Wang, Xutai Ma, Dmitriy Genzel

We demonstrate that representing text input as phoneme sequences can reduce the difference between speech and text inputs, and enhance the knowledge transfer from text corpora to the speech to text tasks.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +5

fairseq S2T: Fast Speech-to-Text Modeling with fairseq

3 code implementations • Asian Chapter of the Association for Computational Linguistics 2020 • Changhan Wang, Yun Tang, Xutai Ma, Anne Wu, Sravya Popuri, Dmytro Okhonko, Juan Pino

We introduce fairseq S2T, a fairseq extension for speech-to-text (S2T) modeling tasks such as end-to-end speech recognition and speech-to-text translation.

Ranked #8 on Speech-to-Text Translation on MuST-C EN->DE

Machine Translation Multi-Task Learning +4

124,984

Self-Training for End-to-End Speech Translation

no code implementations • 3 Jun 2020 • Juan Pino, Qiantong Xu, Xutai Ma, Mohammad Javad Dousti, Yun Tang

One of the main challenges for end-to-end speech translation is data scarcity.

speech-recognition Speech Recognition +1

Orthogonal Relation Transforms with Graph Context Modeling for Knowledge Graph Embedding

no code implementations • ACL 2020 • Yun Tang, Jing Huang, Guangtao Wang, Xiaodong He, Bo-Wen Zhou

Translational distance-based knowledge graph embedding has shown progressive improvements on the link prediction task, from TransE to the latest state-of-the-art RotatE.

Ranked #18 on Link Prediction on FB15k-237

Knowledge Graph Embedding Link Prediction +1

Relation Module for Non-Answerable Predictions on Reading Comprehension

no code implementations • CONLL 2019 • Kevin Huang, Yun Tang, Jing Huang, Xiaodong He, Bo-Wen Zhou

We test the relation module on the SQuAD 2.0 dataset using both the BiDAF and BERT models as baseline readers.

Machine Reading Comprehension Relation +2

Relation Module for Non-answerable Prediction on Question Answering

no code implementations • 23 Oct 2019 • Kevin Huang, Yun Tang, Jing Huang, Xiaodong He, Bo-Wen Zhou

In this paper, we aim to improve a MRC model's ability to determine whether a question has an answer in a given context (e.g. the recently proposed SQuAD 2.0 task).

Machine Reading Comprehension Question Answering +3

Zero-shot Text-to-SQL Learning with Auxiliary Task

1 code implementation • 29 Aug 2019 • Shuaichen Chang, PengFei Liu, Yun Tang, Jing Huang, Xiaodong He, Bo-Wen Zhou

Recent years have seen great success in the use of neural seq2seq models on the text-to-SQL task.

Text-To-SQL

Multi-hop Reading Comprehension across Multiple Documents by Reasoning over Heterogeneous Graphs

no code implementations • ACL 2019 • Ming Tu, Guangtao Wang, Jing Huang, Yun Tang, Xiaodong He, Bo-Wen Zhou

We introduce a heterogeneous graph with different types of nodes and edges, which is named as Heterogeneous Document-Entity (HDE) graph.

Multi-Hop Reading Comprehension

I4U Submission to NIST SRE 2018: Leveraging from a Decade of Shared Experiences

no code implementations • 16 Apr 2019 • Kong Aik Lee, Ville Hautamaki, Tomi Kinnunen, Hitoshi Yamamoto, Koji Okabe, Ville Vestman, Jing Huang, Guohong Ding, Hanwu Sun, Anthony Larcher, Rohan Kumar Das, Haizhou Li, Mickael Rouvier, Pierre-Michel Bousquet, Wei Rao, Qing Wang, Chunlei Zhang, Fahimeh Bahmaninezhad, Hector Delgado, Jose Patino, Qiongqiong Wang, Ling Guo, Takafumi Koshinaka, Jiacen Zhang, Koichi Shinoda, Trung Ngo Trong, Md Sahidullah, Fan Lu, Yun Tang, Ming Tu, Kah Kuan Teh, Huy Dat Tran, Kuruvachan K. George, Ivan Kukanov, Florent Desnous, Jichen Yang, Emre Yilmaz, Longting Xu, Jean-Francois Bonastre, Cheng-Lin Xu, Zhi Hao Lim, Eng Siong Chng, Shivesh Ranjan, John H. L. Hansen, Massimiliano Todisco, Nicholas Evans

The I4U consortium was established to facilitate a joint entry to NIST speaker recognition evaluations (SRE).

Domain Adaptation Speaker Recognition

Deep Speaker Embedding Learning with Multi-Level Pooling for Text-Independent Speaker Verification

no code implementations • 21 Feb 2019 • Yun Tang, Guohong Ding, Jing Huang, Xiaodong He, Bo-Wen Zhou

This paper aims to improve the widely used deep speaker embedding x-vector model.

Text-Independent Speaker Verification

End-to-end Structure-Aware Convolutional Networks for Knowledge Base Completion

1 code implementation • 11 Nov 2018 • Chao Shang, Yun Tang, Jing Huang, Jinbo Bi, Xiaodong He, Bo-Wen Zhou

The recent graph convolutional network (GCN) provides another way of learning graph node embedding by successfully utilizing graph connectivity structure.

Ranked #28 on Link Prediction on FB15k-237

Knowledge Base Completion Knowledge Graph Embedding +2

109
