no code implementations • 24 May 2023 • Rongjie Huang, Huadai Liu, Xize Cheng, Yi Ren, Linjun Li, Zhenhui Ye, Jinzheng He, Lichao Zhang, Jinglin Liu, Xiang Yin, Zhou Zhao
Direct speech-to-speech translation (S2ST) aims to convert speech from one language into another, and has demonstrated significant progress to date.
no code implementations • 1 May 2023 • Zhenhui Ye, Jinzheng He, Ziyue Jiang, Rongjie Huang, Jiawei Huang, Jinglin Liu, Yi Ren, Xiang Yin, Zejun Ma, Zhou Zhao
Recently, the neural radiance field (NeRF) has become a popular rendering technique in this field, since it can achieve high-fidelity and 3D-consistent talking face generation from a few-minute-long training video.
1 code implementation • 25 Apr 2023 • Rongjie Huang, Mingze Li, Dongchao Yang, Jiatong Shi, Xuankai Chang, Zhenhui Ye, Yuning Wu, Zhiqing Hong, Jiawei Huang, Jinglin Liu, Yi Ren, Zhou Zhao, Shinji Watanabe
In this work, we propose a multi-modal AI system named AudioGPT, which complements LLMs (i.e., ChatGPT) with 1) foundation models to process complex audio information and solve numerous understanding and generation tasks; and 2) the input/output interface (ASR, TTS) to support spoken dialogue.
1 code implementation • 17 Apr 2023 • GuangYu Nie, Changhoon Kim, Yezhou Yang, Yi Ren
We use StyleGAN2 and the latent diffusion model to demonstrate the efficacy of our method.
no code implementations • 11 Apr 2023 • Yi Ren, Hongyan Tang, Jiangpeng Rong, Siwen Zhu
As pairwise learning is well suited to ranking tasks, the previously proposed unbiased pairwise learning algorithm already achieves state-of-the-art performance.
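Below is a minimal sketch of the general idea behind unbiased pairwise learning-to-rank: weight a pairwise loss by inverse examination propensities so that position-biased clicks do not dominate training. The weighting scheme, propensity values, and function name are illustrative, not the exact algorithm from this entry.

```python
import numpy as np

def unbiased_pairwise_loss(score_pos, score_neg, prop_pos, prop_neg):
    """Inverse-propensity-weighted pairwise logistic loss (illustrative form).

    score_pos / score_neg: ranker scores for the clicked and unclicked items.
    prop_pos / prop_neg: estimated examination propensities of their positions.
    Dividing by the propensities corrects for position bias in the click data.
    """
    weight = 1.0 / (prop_pos * prop_neg)
    return weight * np.log1p(np.exp(-(score_pos - score_neg)))

# Toy usage: an item clicked at a low-propensity position gets a larger weight.
print(unbiased_pairwise_loss(2.0, 1.0, prop_pos=0.3, prop_neg=0.9))
```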
no code implementations • 24 Mar 2023 • Qinglin Zhang, Chong Deng, Jiaqing Liu, Hai Yu, Qian Chen, Wen Wang, Zhijie Yan, Jinglin Liu, Yi Ren, Zhou Zhao
ICASSP2023 General Meeting Understanding and Generation Challenge (MUG) focuses on prompting a wide range of spoken language processing (SLP) research on meeting transcripts, as SLP applications are critical to improve users' efficiency in grasping important information in meetings.
1 code implementation • 24 Mar 2023 • Qinglin Zhang, Chong Deng, Jiaqing Liu, Hai Yu, Qian Chen, Wen Wang, Zhijie Yan, Jinglin Liu, Yi Ren, Zhou Zhao
To prompt SLP advancement, we establish a large-scale general Meeting Understanding and Generation Benchmark (MUG) to benchmark the performance of a wide range of SLP tasks, including topic segmentation, topic-level and session-level extractive summarization and topic title generation, keyphrase extraction, and action item detection.
no code implementations • 8 Mar 2023 • Yi Ren, Hongyan Tang, Siwen Zhu
It is a well-known challenge to learn an unbiased ranker with biased feedback.
no code implementations • 28 Feb 2023 • Shenzheng Zhang, Qi Tan, Xinzhi Zheng, Yi Ren, Xu Zhao
The gap between the randomly initialized item ID embedding and the well-trained warm item ID embedding makes it hard for cold items to fit into the recommendation system, which is trained on data from historical warm items.
1 code implementation • 24 Feb 2023 • Yi Ren, Xiao Han, Xu Zhao, Shenzheng Zhang, Yan Zhang
Therefore, the ranking stage is still essential for most applications to provide a high-quality candidate set for the re-ranking stage.
no code implementations • 11 Feb 2023 • Yi Ren, Shangmin Guo, Wonho Bae, Danica J. Sutherland
We identify a significant trend in the effect of changes in this initial energy on the resulting features after fine-tuning.
no code implementations • 31 Jan 2023 • Zhenhui Ye, Ziyue Jiang, Yi Ren, Jinglin Liu, Jinzheng He, Zhou Zhao
Generating photo-realistic video portrait with arbitrary speech audio is a crucial problem in film-making and virtual reality.
no code implementations • 30 Jan 2023 • Rongjie Huang, Jiawei Huang, Dongchao Yang, Yi Ren, Luping Liu, Mingze Li, Zhenhui Ye, Jinglin Liu, Xiang Yin, Zhou Zhao
Its application to audio still lags behind for two main reasons: the lack of large-scale datasets with high-quality text-audio pairs, and the complexity of modeling long continuous audio data.
no code implementations • 27 Jan 2023 • Yaoxian Song, Penglei Sun, Yi Ren, Yu Zheng, Yue Zhang
To evaluate the effectiveness, we perform multi-level difficulty part language grounding grasping experiments and deploy our proposed model on a real robot.
no code implementations • 25 Jan 2023 • Yingchaojie Feng, Xingbo Wang, Bo Pan, Kam Kwai Wong, Yi Ren, Shi Liu, Zihan Yan, Yuxin Ma, Huamin Qu, Wei Chen
Our research explores how to provide explanations for NLIs to help users locate the problems and further revise the queries.
1 code implementation • 28 Nov 2022 • Xuechao Zhang, Xuda Ding, Yi Ren, Yu Zheng, Chongrong Fang, Jianping He
Then, we form a single quantity that measures the sensing quality of the targets by the camera network.
1 code implementation • 21 Nov 2022 • Luping Liu, Yi Ren, Xize Cheng, Zhou Zhao
Under this assumption, we design new detection methods and indicator scores.
no code implementations • 19 Nov 2022 • Chenye Cui, Yi Ren, Jinglin Liu, Rongjie Huang, Zhou Zhao
In this paper, we pose the task of generating sound with a specific timbre given a video input and a reference audio sample.
no code implementations • 11 Nov 2022 • Yi Ren, Yanyang Xiao, Guo-Qiang Bi, Pek-Ming Lau
With these understandings, we developed an STDP-based learning rule that can drive the network to remember any presupposed sequence.
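As a rough illustration of the ingredient named here, the following is a textbook pair-based STDP window, not the paper's exact rule; parameter values are arbitrary.

```python
import numpy as np

def stdp_delta_w(t_pre, t_post, a_plus=0.01, a_minus=0.012,
                 tau_plus=20.0, tau_minus=20.0):
    """Pair-based STDP window (generic textbook form, not this paper's rule).

    Potentiate when the presynaptic spike precedes the postsynaptic spike,
    depress when it follows; the magnitude decays exponentially with the gap.
    """
    dt = t_post - t_pre          # spike-time difference in ms
    if dt > 0:
        return a_plus * np.exp(-dt / tau_plus)
    return -a_minus * np.exp(dt / tau_minus)

# Repeated pre->post pairings strengthen the synapse, encoding the order of a
# presupposed sequence in the weights.
print(stdp_delta_w(t_pre=10.0, t_post=15.0))   # > 0 (potentiation)
print(stdp_delta_w(t_pre=15.0, t_post=10.0))   # < 0 (depression)
```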
1 code implementation • 1 Sep 2022 • Yan Xia, Zhou Zhao, Shangwei Ye, Yang Zhao, Haoyuan Li, Yi Ren
To rectify the discriminative phonemes and extract video-related information from noisy audio, we develop a novel video-guided curriculum learning (VGCL) during the audio pre-training process, which can make use of the vital visual perceptions to help understand the spoken language and suppress the external noise.
no code implementations • 31 Jul 2022 • Guangyao Zhai, Yu Zheng, Ziwei Xu, Xin Kong, Yong Liu, Benjamin Busam, Yi Ren, Nassir Navab, Zhengyou Zhang
In this paper, we introduce DA$^2$, the first large-scale dual-arm dexterity-aware dataset for the generation of optimal bimanual grasping pairs for arbitrary large objects.
2 code implementations • 13 Jul 2022 • Rongjie Huang, Zhou Zhao, Huadai Liu, Jinglin Liu, Chenye Cui, Yi Ren
Through the preliminary study on diffusion model parameterization, we find that previous gradient-based TTS models require hundreds or thousands of iterations to guarantee high sample quality, which poses a challenge for accelerating sampling.
no code implementations • NAACL 2022 • Kexun Zhang, Rui Wang, Xu Tan, Junliang Guo, Yi Ren, Tao Qin, Tie-Yan Liu
Furthermore, we take the best of both and design a new loss function to better handle the complicated syntactic multi-modality in real-world datasets.
no code implementations • 5 Jul 2022 • Lei Zhang, Mukesh Ghimire, Wenlong Zhang, Zhe Xu, Yi Ren
This paper investigates two potential solutions to this problem: a hybrid method that leverages both supervised Nash equilibria and the HJI PDE, and a value-hardening method where a sequence of HJIs are solved with a gradually hardening reward.
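A minimal sketch of the hybrid idea described above, under the assumption of a generic value network and a placeholder Hamiltonian: combine a supervised loss on known Nash equilibrium values with a residual loss on the HJI PDE. Names, sizes, and the toy data are illustrative.

```python
import torch

# Hybrid loss sketch: fit a value network to supervised equilibrium values
# while penalizing the residual of the HJI PDE  dV/dt + H(x, dV/dx) = 0.
value_net = torch.nn.Sequential(
    torch.nn.Linear(3, 64), torch.nn.Tanh(), torch.nn.Linear(64, 1))

def hamiltonian(x, grad_x):
    # Placeholder Hamiltonian; the real one comes from the game's dynamics/costs.
    return -0.5 * (grad_x ** 2).sum(dim=1, keepdim=True) + (x ** 2).sum(dim=1, keepdim=True)

def hybrid_loss(t, x, v_label):
    tx = torch.cat([t, x], dim=1).requires_grad_(True)
    v = value_net(tx)
    grads = torch.autograd.grad(v.sum(), tx, create_graph=True)[0]
    dv_dt, dv_dx = grads[:, :1], grads[:, 1:]
    pde_residual = dv_dt + hamiltonian(tx[:, 1:], dv_dx)
    supervised = (v - v_label).pow(2).mean()   # anchor on known Nash equilibrium values
    physics = pde_residual.pow(2).mean()       # enforce the HJI PDE
    return supervised + physics

t = torch.rand(32, 1); x = torch.rand(32, 2); v_label = torch.rand(32, 1)
print(hybrid_loss(t, x, v_label).item())
```

In practice the supervised and PDE-residual terms would typically be evaluated on different batches of points and weighted against each other; the equal weighting above is only for illustration.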
1 code implementation • 5 Jun 2022 • Ziyue Jiang, Su Zhe, Zhou Zhao, Qian Yang, Yi Ren, Jinglin Liu, Zhenhui Ye
This paper tackles the polyphone disambiguation problem from a concise and novel perspective: we propose Dict-TTS, a semantic-aware generative text-to-speech model with an online website dictionary (the existing prior information in the natural language).
1 code implementation • 27 May 2022 • Xu Zhao, Yi Ren, Ying Du, Shenzheng Zhang, Nian Wang
This paper attempts to tackle the item cold-start problem by generating enhanced warmed-up ID embeddings for cold items with historical data and limited interaction records.
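A minimal sketch of the warm-up idea, assuming a hypothetical generator that maps item content features plus an aggregate of the few available interactions to an enhanced warmed-up ID embedding; the architecture and feature dimensions are illustrative, not the paper's model.

```python
import torch

class WarmUpEmbeddingGenerator(torch.nn.Module):
    """Maps item side information + early-interaction summary to a warmed-up
    ID embedding (hypothetical architecture for illustration)."""
    def __init__(self, content_dim=32, interaction_dim=16, emb_dim=64):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(content_dim + interaction_dim, 128),
            torch.nn.ReLU(),
            torch.nn.Linear(128, emb_dim))

    def forward(self, content_feat, interaction_feat):
        return self.net(torch.cat([content_feat, interaction_feat], dim=-1))

gen = WarmUpEmbeddingGenerator()
warmed = gen(torch.randn(8, 32), torch.randn(8, 16))
# Train it, e.g., to match the well-trained embeddings of historically warm items.
loss = torch.nn.functional.mse_loss(warmed, torch.randn(8, 64))
print(loss.item())
```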
no code implementations • 25 May 2022 • Rongjie Huang, Jinglin Liu, Huadai Liu, Yi Ren, Lichao Zhang, Jinzheng He, Zhou Zhao
Specifically, a sequence of discrete representations derived in a self-supervised manner is predicted by the model and passed to a vocoder for speech reconstruction, while still facing the following challenges: 1) Acoustic multimodality: the discrete units derived from speech with the same content can be indeterministic due to acoustic properties (e.g., rhythm, pitch, and energy), which causes deterioration of translation accuracy; 2) High latency: current S2ST systems use autoregressive models, which predict each unit conditioned on the previously generated sequence and thus fail to take full advantage of parallelism.
1 code implementation • 15 May 2022 • Rongjie Huang, Yi Ren, Jinglin Liu, Chenye Cui, Zhou Zhao
Style transfer for out-of-domain (OOD) speech synthesis aims to generate speech samples with unseen style (e.g., speaker identity, emotion, and prosody) derived from an acoustic reference, while facing the following challenges: 1) The highly dynamic style features in expressive voice are difficult to model and transfer; and 2) the TTS models should be robust enough to handle diverse OOD conditions that differ from the source data.
no code implementations • 27 Apr 2022 • Sheng Cheng, Yi Ren, Yezhou Yang
This paper follows cognitive studies to investigate a graph representation for sketches, where the information of strokes, i.e., parts of a sketch, is encoded on vertices and inter-stroke information on edges.
2 code implementations • 21 Apr 2022 • Rongjie Huang, Max W. Y. Lam, Jun Wang, Dan Su, Dong Yu, Yi Ren, Zhou Zhao
Also, FastDiff enables a sampling speed 58x faster than real-time on a V100 GPU, making diffusion models practically applicable to speech synthesis deployment for the first time.
Ranked #7 on Text-To-Speech Synthesis on LJSpeech (using extra training data)
1 code implementation • ICLR 2022 • Yi Ren, Shangmin Guo, Danica J. Sutherland
Observing the learning path not only provides a new perspective for understanding knowledge distillation, overfitting, and learning dynamics, but also reveals that the supervisory signal of a teacher network can be very unstable near the best points in training on real tasks.
3 code implementations • ACL 2022 • Jinglin Liu, Chengxi Li, Yi Ren, Zhiying Zhu, Zhou Zhao
Furthermore, we propose a latent-mapping algorithm in the latent space to convert the amateur vocal tone to the professional one.
no code implementations • ACL 2022 • Yi Ren, Xu Tan, Tao Qin, Zhou Zhao, Tie-Yan Liu
Then we conduct a comprehensive study on NAR-TTS models that use some advanced modeling methods.
6 code implementations • ICLR 2022 • Luping Liu, Yi Ren, Zhijie Lin, Zhou Zhao
Under such a perspective, we propose pseudo numerical methods for diffusion models (PNDMs).
Ranked #8 on Image Generation on CelebA 64x64
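As a rough sketch of the pseudo numerical sampling idea in the PNDM entry above: combine the last four noise predictions with classical linear multistep (Adams-Bashforth) coefficients and apply a deterministic transfer step, written here in the equivalent DDIM form. The noise-prediction network, schedule handling, and toy inputs are abstractions, not the paper's exact implementation.

```python
import numpy as np

def ddim_transfer(x_t, eps, alpha_bar_t, alpha_bar_s):
    """Deterministic DDIM-style update from noise level alpha_bar_t to alpha_bar_s."""
    x0_pred = (x_t - np.sqrt(1.0 - alpha_bar_t) * eps) / np.sqrt(alpha_bar_t)
    return np.sqrt(alpha_bar_s) * x0_pred + np.sqrt(1.0 - alpha_bar_s) * eps

def plms_step(x_t, eps_t, eps_buffer, alpha_bar_t, alpha_bar_s):
    """One pseudo linear multistep step.

    eps_buffer holds the noise predictions from the three previous steps; the
    55/-59/37/-9 coefficients are the classical 4th-order Adams-Bashforth weights.
    """
    e1, e2, e3 = eps_buffer
    eps_prime = (55 * eps_t - 59 * e1 + 37 * e2 - 9 * e3) / 24.0
    return ddim_transfer(x_t, eps_prime, alpha_bar_t, alpha_bar_s)

# Toy usage with random stand-ins for the model's noise predictions.
x = np.random.randn(16)
eps_now = np.random.randn(16)
buf = [np.random.randn(16) for _ in range(3)]
print(plms_step(x, eps_now, buf, alpha_bar_t=0.5, alpha_bar_s=0.6).shape)
```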
no code implementations • 16 Feb 2022 • Yi Ren, Ming Lei, Zhiying Huang, Shiliang Zhang, Qian Chen, Zhijie Yan, Zhou Zhao
Specifically, we first introduce a word-level prosody encoder, which quantizes the low-frequency band of the speech and compresses prosody attributes in the latent prosody vector (LPV).
no code implementations • 8 Feb 2022 • Achraf Bahamou, Donald Goldfarb, Yi Ren
Specifically, our method uses a block-diagonal approximation to the empirical Fisher matrix, where for each layer in the DNN, whether it is convolutional or feed-forward and fully connected, the associated diagonal block is itself block-diagonal and is composed of a large number of mini-blocks of modest size.
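A minimal sketch of the mini-block idea for one fully-connected layer, assuming (for illustration only) that the weights feeding each output neuron form one mini-block; the actual block structure and damping in the paper may differ.

```python
import numpy as np

def miniblock_fisher(per_sample_grads):
    """Mini-block approximation of a layer's empirical Fisher (illustrative grouping).

    per_sample_grads: array of shape (n_samples, n_out, n_in), per-sample gradients
    of a fully-connected weight matrix. Instead of the full (n_out*n_in)^2 Fisher,
    keep one (n_in x n_in) block per output neuron.
    """
    n, n_out, n_in = per_sample_grads.shape
    blocks = np.zeros((n_out, n_in, n_in))
    for j in range(n_out):
        g = per_sample_grads[:, j, :]            # (n, n_in) gradients for neuron j
        blocks[j] = g.T @ g / n                  # empirical Fisher mini-block
    return blocks

def precondition(grad, blocks, damping=1e-3):
    """Apply the damped inverse mini-blocks to a gradient of shape (n_out, n_in)."""
    n_out, n_in = grad.shape
    out = np.empty_like(grad)
    for j in range(n_out):
        out[j] = np.linalg.solve(blocks[j] + damping * np.eye(n_in), grad[j])
    return out

grads = np.random.randn(128, 4, 8)               # toy per-sample gradients
step = precondition(grads.mean(axis=0), miniblock_fisher(grads))
print(step.shape)                                # (4, 8)
```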
no code implementations • 30 Jan 2022 • Xianye Ben, Yi Ren, Junping Zhang, Su-Jing Wang, Kidiyo Kpalma, Weixiao Meng, Yong-Jin Liu
Unlike the conventional facial expressions, micro-expressions are involuntary and transient facial expressions capable of revealing the genuine emotions that people attempt to hide.
no code implementations • 11 Jan 2022 • Shoutong Wang, Jinglin Liu, Yi Ren, Zhen Wang, Changliang Xu, Zhou Zhao
However, they face several challenges: 1) the fixed-size speaker embedding is not powerful enough to capture full details of the target timbre; 2) single reference audio does not contain sufficient timbre information of the target speaker; 3) the pitch inconsistency between different speakers also leads to a degradation in the generated voice.
1 code implementation • MM '21: Proceedings of the 29th ACM International Conference on Multimedia 2021 • Rongjie Huang, Feiyang Chen, Yi Ren, Jinglin Liu, Chenye Cui, Zhou Zhao
High-fidelity multi-singer singing voice synthesis is challenging for neural vocoder due to the singing voice data shortage, limited singer generalization, and large computational cost.
1 code implementation • 25 Nov 2021 • Yi Ren, Hongyan Tang, Siwen Zhu
To provide personalized high quality recommendation results, conventional systems usually train pointwise rankers to predict the absolute value of objectives and leverage a distinct shallow tower to estimate and alleviate the impact of position bias.
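A minimal sketch of the shallow-tower pattern described here: a main ranker scores the (user, item) features, a small tower scores the logged position, their logits are summed during training on click data, and the position tower is dropped at serving time. Dimensions and names are illustrative.

```python
import torch

class PointwiseRankerWithPositionTower(torch.nn.Module):
    def __init__(self, feat_dim=32, n_positions=50):
        super().__init__()
        self.main_tower = torch.nn.Sequential(
            torch.nn.Linear(feat_dim, 64), torch.nn.ReLU(), torch.nn.Linear(64, 1))
        self.position_tower = torch.nn.Embedding(n_positions, 1)  # shallow bias tower

    def forward(self, features, position=None):
        logit = self.main_tower(features)
        if position is not None:                   # training: add position-bias logit
            logit = logit + self.position_tower(position)
        return torch.sigmoid(logit)                # predicted click probability

model = PointwiseRankerWithPositionTower()
feats = torch.randn(4, 32)
pos = torch.tensor([0, 1, 2, 3])
print(model(feats, pos).shape)      # training-time prediction with position feature
print(model(feats).shape)           # serving-time prediction without the bias tower
```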
no code implementations • 22 Oct 2021 • Yujie Lu, Ping Nie, Shengyu Zhang, Ming Zhao, Ruobing Xie, William Yang Wang, Yi Ren
However, existing work is primarily built upon pre-defined retrieval channels, including User-CF (U2U), Item-CF (I2I), and Embedding-based Retrieval (U2I), and thus only has access to the limited user-item correlations entailed by partial information about latent interactions.
no code implementations • 14 Oct 2021 • Rongjie Huang, Chenye Cui, Feiyang Chen, Yi Ren, Jinglin Liu, Zhou Zhao, Baoxing Huai, Zhefeng Wang
In this work, we propose SingGAN, a generative adversarial network designed for high-fidelity singing voice synthesis.
no code implementations • 14 Oct 2021 • Ziyue Jiang, Yi Ren, Ming Lei, Zhou Zhao
Federated learning enables collaborative training of machine learning models under strict privacy restrictions, and federated text-to-speech aims to synthesize natural speech for multiple users with a few audio training samples stored locally on their devices.
3 code implementations • NeurIPS 2021 • Yi Ren, Jinglin Liu, Zhou Zhao
Non-autoregressive text-to-speech (NAR-TTS) models such as FastSpeech 2 and Glow-TTS can synthesize high-quality speech from the given text in parallel.
Tasks: Text-To-Speech Synthesis, Vocal Bursts Intensity Prediction, +1
no code implementations • ICLR 2022 • Shangmin Guo, Yi Ren, Kory Wallace Mathewson, Simon Kirby, Stefano V Albrecht, Kenny Smith
Researchers are using deep learning models to explore the emergence of language in various language games, where simulated agents interact and develop an emergent language to solve a task.
no code implementations • 29 Sep 2021 • Ziyue Jiang, Yi Ren, Zhou Zhao
In this work, we propose a novel phase-oriented algorithm named PhaseFool that can efficiently construct imperceptible audio adversarial examples with energy dissipation.
Tasks: Automatic Speech Recognition (ASR), +1
no code implementations • 29 Sep 2021 • Rongjie Huang, Max W. Y. Lam, Jun Wang, Dan Su, Dong Yu, Zhou Zhao, Yi Ren
Learning generalizable speech representations for unseen samples in different domains has been a challenge of ever-increasing importance.
1 code implementation • 16 Sep 2021 • Prasanth Buddareddygari, Travis Zhang, Yezhou Yang, Yi Ren
This paper investigates the feasibility of targeted attacks through visually learned patterns placed on physical objects in the environment, a threat model that combines the practicality and effectiveness of the existing ones.
no code implementations • 10 Sep 2021 • Zhehua Zhou, Ozgur S. Oguz, Yi Ren, Marion Leibold, Martin Buss
Safe reinforcement learning aims to learn a control policy while ensuring that neither the system nor the environment gets damaged during the learning process.
1 code implementation • 6 Sep 2021 • Sheng Cheng, Yang Jiao, Yi Ren
This paper considers the open challenge of identifying complete, concise, and explainable quantitative microstructure representations for disordered heterogeneous material systems.
1 code implementation • 14 Jul 2021 • Jinglin Liu, Zhiying Zhu, Yi Ren, Wencan Huang, Baoxing Huai, Nicholas Yuan, Zhou Zhao
However, the AR decoding manner generates current lip frame conditioned on frames generated previously, which inherently hinders the inference speed, and also has a detrimental effect on the quality of generated lip frames due to error propagation.
no code implementations • 17 Jun 2021 • Chenye Cui, Yi Ren, Jinglin Liu, Feiyang Chen, Rongjie Huang, Ming Lei, Zhou Zhao
Finally, by showing a comparable performance in the emotional speech synthesis task, we successfully demonstrate the ability of the proposed model.
1 code implementation • 7 Jun 2021 • Shangmin Guo, Yi Ren, Kory Mathewson, Simon Kirby, Stefano V. Albrecht, Kenny Smith
Researchers are using deep learning models to explore the emergence of language in various language games, where agents interact and develop an emergent language to solve tasks.
1 code implementation • NeurIPS 2021 • Yi Ren, Donald Goldfarb
Based on the so-called tensor normal (TN) distribution, we propose and analyze a brand new approximate natural gradient method, Tensor Normal Training (TNT), which like Shampoo, only requires knowledge of the shape of the training parameters.
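TNT's exact update is not reproduced here; as a rough illustration of the Kronecker-factored family it belongs to, the following KFAC-style sketch preconditions one fully-connected layer's gradient with two small factor matrices instead of one huge layer-wise curvature matrix.

```python
import numpy as np

def kron_factored_step(grad_W, activations, backprops, damping=1e-3):
    """Kronecker-factored preconditioning for one fully-connected layer.

    grad_W: (n_out, n_in) layer gradient; activations: (n, n_in) layer inputs;
    backprops: (n, n_out) backpropagated output gradients. Only the two small
    factors are formed and inverted, never the full curvature matrix.
    This is a generic KFAC-style sketch, not the exact TNT update.
    """
    n = activations.shape[0]
    A = activations.T @ activations / n          # input-side factor (n_in x n_in)
    G = backprops.T @ backprops / n              # output-side factor (n_out x n_out)
    A_inv = np.linalg.inv(A + damping * np.eye(A.shape[0]))
    G_inv = np.linalg.inv(G + damping * np.eye(G.shape[0]))
    return G_inv @ grad_W @ A_inv                # preconditioned gradient

step = kron_factored_step(np.random.randn(4, 8), np.random.randn(64, 8), np.random.randn(64, 4))
print(step.shape)    # (4, 8)
```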
6 code implementations • 6 May 2021 • Jinglin Liu, Chengxi Li, Yi Ren, Feiyang Chen, Zhou Zhao
Singing voice synthesis (SVS) systems are built to synthesize high-quality and expressive singing voice, in which the acoustic model generates the acoustic features (e.g., mel-spectrogram) given a music score.
no code implementations • 30 Mar 2021 • Feng Li, Zhenrui Chen, Pengjie Wang, Yi Ren, Di Zhang, Xiaoyu Zhu
Moreover, it is difficult for users to move beyond their specific historical behaviors to explore new interests, namely the weak generalization problem.
no code implementations • 12 Feb 2021 • Yi Ren, Achraf Bahamou, Donald Goldfarb
Several improvements to the methods in Goldfarb et al. (2020) are also proposed that can be applied to both MLPs and CNNs.
no code implementations • 21 Jan 2021 • Ming Yang, Alceste Z. Bonanos, Biwei Jiang, Man I Lam, Jian Gao, Panagiotis Gavras, Grigoris Maravelias, Shu Wang, Xiao-Dian Chen, Frank Tramper, Yi Ren, Zoi T. Spetsieri
Further separation of RSG candidates from the rest of the LSG candidates is done by using semi-empirical criteria on NIR CMDs, resulting in 323 RSG candidates.
Solar and Stellar Astrophysics; Astrophysics of Galaxies
no code implementations • 17 Dec 2020 • Chen Zhang, Yi Ren, Xu Tan, Jinglin Liu, Kejun Zhang, Tao Qin, Sheng Zhao, Tie-Yan Liu
In DenoiSpeech, we handle real-world noisy speech by modeling the fine-grained frame-level noise with a noise condition module, which is jointly trained with the TTS model.
1 code implementation • 9 Dec 2020 • Zhonghao Sheng, Kaitao Song, Xu Tan, Yi Ren, Wei Ye, Shikun Zhang, Tao Qin
Automatic song writing aims to compose a song (lyric and/or melody) by machine, which is an interesting topic in both academia and industry.
1 code implementation • 4 Dec 2020 • Shangmin Guo, Yi Ren, Agnieszka Słowik, Kory Mathewson
Referential games and reconstruction games are the most common game types for studying emergent languages.
no code implementations • 30 Nov 2020 • Xu Chen, Yuanxing Zhang, Lun Du, Zheng Fang, Yi Ren, Kaigui Bian, Kunqing Xie
Further analysis indicates that the locality and globality of the traffic networks are critical to traffic flow prediction and the proposed TSSRGCN model can adapt to the various temporal traffic patterns.
no code implementations • 11 Nov 2020 • Shidong Wang, Yi Ren, Gerard Parr, Yu Guan, Ling Shao
In this article, we propose a novel invariant deep compressible covariance pooling (IDCCP) to solve nuisance variations in aerial scene categorization.
no code implementations • ICLR 2021 • Changhoon Kim, Yi Ren, Yezhou Yang
Growing applications of generative models have led to new threats such as malicious personation and digital copyright infringement.
1 code implementation • 18 Aug 2020 • Yi Ren, Jinzheng He, Xu Tan, Tao Qin, Zhou Zhao, Tie-Yan Liu
To improve harmony, in this paper, we propose a novel MUlti-track MIDI representation (MuMIDI), which enables simultaneous multi-track generation in a single sequence and explicitly models the dependency of the notes from different tracks.
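A hypothetical flattening of multi-track notes into a single token sequence, in the spirit of the representation described here; the token vocabulary and note attributes below are invented for illustration and are not the actual MuMIDI specification.

```python
# Hypothetical single-sequence encoding of multi-track notes: each note
# contributes a track token plus its attributes, and bar/position tokens
# interleave the tracks so cross-track dependencies appear in one stream.
notes = [
    # (bar, position, track, pitch, duration)
    (0, 0, "melody", 64, 4),
    (0, 0, "bass",   40, 8),
    (0, 4, "melody", 67, 4),
]

def flatten_to_tokens(notes):
    tokens, current_bar = [], None
    for bar, pos, track, pitch, dur in sorted(notes):
        if bar != current_bar:                 # emit a bar token on bar changes
            tokens.append(f"<bar_{bar}>")
            current_bar = bar
        tokens += [f"<pos_{pos}>", f"<track_{track}>",
                   f"<pitch_{pitch}>", f"<dur_{dur}>"]
    return tokens

print(flatten_to_tokens(notes))
```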
no code implementations • 9 Aug 2020 • Jin Xu, Xu Tan, Yi Ren, Tao Qin, Jian Li, Sheng Zhao, Tie-Yan Liu
However, there are more than 6,000 languages in the world and most of them lack speech training data, which poses significant challenges when building TTS and ASR systems for extremely low-resource languages.
Tasks: Automatic Speech Recognition (ASR), +3
no code implementations • 6 Aug 2020 • Jinglin Liu, Yi Ren, Zhou Zhao, Chen Zhang, Baoxing Huai, Nicholas Jing Yuan
NAR lipreading is a challenging task with many difficulties: 1) the discrepancy in sequence lengths between source and target makes it difficult to estimate the length of the output sequence; 2) the conditionally independent behavior of NAR generation lacks correlation across time, which leads to a poor approximation of the target distribution; 3) the feature representation ability of the encoder can be weak due to the lack of an effective alignment mechanism; and 4) the removal of the AR language model exacerbates the inherent ambiguity problem of lipreading.
no code implementations • 17 Jul 2020 • Jinglin Liu, Yi Ren, Xu Tan, Chen Zhang, Tao Qin, Zhou Zhao, Tie-Yan Liu
SAT contains a hyperparameter k, and each k value defines a SAT task with different degrees of parallelism.
no code implementations • 9 Jul 2020 • Yi Ren, Xu Tan, Tao Qin, Jian Luan, Zhou Zhao, Tie-Yan Liu
DeepSinger has several advantages over previous SVS systems: 1) to the best of our knowledge, it is the first SVS system that directly mines training data from music websites; 2) the lyrics-to-singing alignment model further avoids any human effort for alignment labeling and greatly reduces labeling cost; 3) the singing model based on a feed-forward Transformer is simple and efficient, removing the complicated acoustic feature modeling in parametric synthesis and leveraging a reference encoder to capture the timbre of a singer from noisy singing data; and 4) it can synthesize singing voices in multiple languages and for multiple singers.
no code implementations • WS 2020 • Enmin Su, Yi Ren
We present in this report our submission to the IWSLT 2020 Open Domain Translation Task.
no code implementations • ACL 2020 • Yi Ren, Jinglin Liu, Xu Tan, Chen Zhang, Tao Qin, Zhou Zhao, Tie-Yan Liu
In this work, we develop SimulSpeech, an end-to-end simultaneous speech to text translation system which translates speech in source language to text in target language concurrently.
Tasks: Automatic Speech Recognition (ASR), +6
1 code implementation • NeurIPS 2020 • Donald Goldfarb, Yi Ren, Achraf Bahamou
We consider the development of practical stochastic quasi-Newton, and in particular Kronecker-factored block-diagonal BFGS and L-BFGS methods, for training deep neural networks (DNNs).
no code implementations • 14 Jun 2020 • Chen Zhang, Xu Tan, Yi Ren, Tao Qin, Ke-jun Zhang, Tie-Yan Liu
Existing speech to speech translation systems heavily rely on the text of target language: they usually translate source language either to target text and then synthesize target speech from text, or directly to target speech with target text for auxiliary training.
31 code implementations • ICLR 2021 • Yi Ren, Chenxu Hu, Xu Tan, Tao Qin, Sheng Zhao, Zhou Zhao, Tie-Yan Liu
In this paper, we propose FastSpeech 2, which addresses the issues in FastSpeech and better solves the one-to-many mapping problem in TTS by 1) directly training the model with the ground-truth target instead of the simplified output from the teacher, and 2) introducing more variation information of speech (e.g., pitch, energy, and more accurate duration) as conditional inputs.
Ranked #6 on Text-To-Speech Synthesis on LJSpeech (using extra training data)
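A minimal sketch of a FastSpeech 2-style variance adaptor: predict duration, pitch, and energy from the encoder hidden states and add quantized pitch/energy embeddings back as conditional inputs. Layer sizes, the number of bins, and the bucketing are illustrative, and at training time the ground-truth pitch/energy would normally be used for the embedding lookup.

```python
import torch

class VarianceAdaptor(torch.nn.Module):
    def __init__(self, hidden=256, n_bins=256):
        super().__init__()
        self.duration_predictor = torch.nn.Linear(hidden, 1)
        self.pitch_predictor = torch.nn.Linear(hidden, 1)
        self.energy_predictor = torch.nn.Linear(hidden, 1)
        self.pitch_embed = torch.nn.Embedding(n_bins, hidden)
        self.energy_embed = torch.nn.Embedding(n_bins, hidden)
        self.n_bins = n_bins

    def forward(self, h):                                 # h: (batch, time, hidden)
        log_duration = self.duration_predictor(h)          # supervised with GT durations
        pitch = self.pitch_predictor(h).squeeze(-1)
        energy = self.energy_predictor(h).squeeze(-1)
        # Bucketize predicted values into embedding indices (GT values at training time).
        pitch_id = torch.clamp((pitch.sigmoid() * self.n_bins).long(), 0, self.n_bins - 1)
        energy_id = torch.clamp((energy.sigmoid() * self.n_bins).long(), 0, self.n_bins - 1)
        h = h + self.pitch_embed(pitch_id) + self.energy_embed(energy_id)
        return h, log_duration

h = torch.randn(2, 10, 256)
out, dur = VarianceAdaptor()(h)
print(out.shape, dur.shape)
```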
1 code implementation • 8 Jun 2020 • Mingjian Chen, Xu Tan, Yi Ren, Jin Xu, Hao Sun, Sheng Zhao, Tao Qin, Tie-Yan Liu
Transformer-based text to speech (TTS) models (e.g., Transformer TTS~\cite{li2019neural}, FastSpeech~\cite{ren2019fastspeech}) have shown the advantages of training and inference efficiency over RNN-based models (e.g., Tacotron~\cite{shen2018natural}) due to their parallel computation in training and/or inference.
1 code implementation • 28 May 2020 • Yadu Babuji, Ben Blaiszik, Tom Brettin, Kyle Chard, Ryan Chard, Austin Clyde, Ian Foster, Zhi Hong, Shantenu Jha, Zhuozhao Li, Xuefeng Liu, Arvind Ramanathan, Yi Ren, Nicholaus Saint, Marcus Schwarting, Rick Stevens, Hubertus van Dam, Rick Wagner
Researchers across the globe are seeking to rapidly repurpose existing drugs or discover new drugs to counter the novel coronavirus disease (COVID-19) caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2).
no code implementations • ACL 2020 • Yi Ren, Jinglin Liu, Xu Tan, Zhou Zhao, Sheng Zhao, Tie-Yan Liu
In this work, we conduct a study to understand the difficulty of NAR sequence generation and try to answer: (1) Why NAR models can catch up with AR models in some tasks but not all?
Tasks: Automatic Speech Recognition (ASR), +4
1 code implementation • ICLR 2020 • Yi Ren, Shangmin Guo, Matthieu Labeau, Shay B. Cohen, Simon Kirby
The principle of compositionality, which enables natural language to represent complex concepts via a structured combination of simpler ones, allows us to convey an open-ended set of messages using a limited vocabulary.
no code implementations • 25 Dec 2019 • Xu Tan, Yichong Leng, Jiale Chen, Yi Ren, Tao Qin, Tie-Yan Liu
Multilingual neural machine translation (NMT) has recently been investigated from different aspects (e.g., pivot translation, zero-shot translation, fine-tuning, or training from scratch) and in different settings (e.g., rich resource and low resource, one-to-many, and many-to-one translation).
no code implementations • 21 Oct 2019 • Jiaying Lu, Xin Ye, Yi Ren, Yezhou Yang
Multiple-choice VQA has drawn increasing attention from researchers and end-users recently.
no code implementations • 11 Oct 2019 • Shangmin Guo, Yi Ren, Serhii Havrylov, Stella Frank, Ivan Titov, Kenny Smith
Since first introduced, computer simulation has been an increasingly important tool in evolutionary linguistics.
no code implementations • 5 Jun 2019 • Yi Ren, Donald Goldfarb
We present practical Levenberg-Marquardt variants of Gauss-Newton and natural gradient methods for solving non-convex optimization problems that arise in training deep neural networks involving enormous numbers of variables and huge data sets.
10 code implementations • 22 May 2019 • Yi Ren, Yangjun Ruan, Xu Tan, Tao Qin, Sheng Zhao, Zhou Zhao, Tie-Yan Liu
Compared with traditional concatenative and statistical parametric approaches, neural network based end-to-end models suffer from slow inference speed, and the synthesized speech is usually not robust (i.e., some words are skipped or repeated) and lacks controllability (voice speed or prosody control).
21 code implementations • NeurIPS 2019 • Yi Ren, Yangjun Ruan, Xu Tan, Tao Qin, Sheng Zhao, Zhou Zhao, Tie-Yan Liu
In this work, we propose a novel feed-forward network based on Transformer to generate mel-spectrogram in parallel for TTS.
Ranked #10 on Text-To-Speech Synthesis on LJSpeech (using extra training data)
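A minimal sketch of the length-regulator idea that makes parallel mel-spectrogram generation possible: repeat each phoneme's encoder state according to its (predicted or ground-truth) duration so the decoder sees a frame-level sequence and can emit all frames at once. Names and sizes are illustrative.

```python
import torch

def length_regulate(phoneme_states, durations):
    """phoneme_states: (n_phonemes, hidden); durations: (n_phonemes,) integer frame counts."""
    return torch.repeat_interleave(phoneme_states, durations, dim=0)

states = torch.randn(4, 256)                    # encoder states for 4 phonemes
durations = torch.tensor([3, 5, 2, 4])          # frames per phoneme
frames = length_regulate(states, durations)
print(frames.shape)                             # torch.Size([14, 256])
```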
no code implementations • 13 May 2019 • Yi Ren, Xu Tan, Tao Qin, Sheng Zhao, Zhou Zhao, Tie-Yan Liu
Text to speech (TTS) and automatic speech recognition (ASR) are two dual tasks in speech processing, and both achieve impressive performance thanks to recent advances in deep learning and the large amount of aligned speech and text data.
Tasks: Automatic Speech Recognition (ASR), +3
1 code implementation • ICLR 2019 • Xu Tan, Yi Ren, Di He, Tao Qin, Zhou Zhao, Tie-Yan Liu
Multilingual machine translation, which translates multiple languages with a single model, has attracted much attention due to its efficiency of offline training and online serving.
no code implementations • 20 Feb 2019 • Yi Ren, B. W. Jiang, Ming Yang, Jian Gao
The period-luminosity (P-L) relation is analyzed for the RSGs in the fundamental mode.
Solar and Stellar Astrophysics; Astrophysics of Galaxies
no code implementations • 9 Feb 2019 • Houpu Yao, Malcolm Regan, Yezhou Yang, Yi Ren
We demonstrate in this paper that a generative model can be designed to perform classification tasks under challenging settings, including adversarial attacks and input distribution shifts.
no code implementations • 7 Feb 2019 • Houpu Yao, Jingjing Wen, Yi Ren, Bin Wu, Ze Ji
The results show that the proposed network is capable of mapping low-end shock signals to their high-end counterparts with satisfactory accuracy.
no code implementations • 31 Jan 2019 • Houpu Yao, Zhe Wang, GuangYu Nie, Yassine Mazboudi, Yezhou Yang, Yi Ren
The vulnerability of neural networks under adversarial attacks has raised serious concerns and motivated extensive research.
no code implementations • 28 Jan 2019 • Yi Ren, Steven Elliott, Yiwei Wang, Yezhou Yang, Wenlong Zhang
While intelligence of autonomous vehicles (AVs) has significantly advanced in recent years, accidents involving AVs suggest that these autonomous systems lack gracefulness in driving when interacting with human drivers.
Robotics; Computer Science and Game Theory
1 code implementation • 27 Jul 2018 • Ruijin Cang, Hope Yao, Yi Ren
We introduce a theory-driven mechanism for learning a neural network model that performs generative topology design in one shot given a problem setting, circumventing the conventional iterative process that computational design tasks usually entail.
no code implementations • 11 Jun 2018 • Hao Dong, Shuai Li, Dongchang Xu, Yi Ren, Di Zhang
The training of Deep Neural Networks usually needs tremendous computing resources.
1 code implementation • 7 Dec 2017 • Ruijin Cang, Hechao Li, Hope Yao, Yang Jiao, Yi Ren
Direct prediction of material properties from microstructures through statistical models has been shown to be a potential approach to accelerating computational material design with large design spaces.
Computational Physics; Materials Science
no code implementations • 23 Sep 2016 • Yi Ren, Yaniv Romano, Michael Elad
Image and texture synthesis is a challenging task that has long been drawing attention in the fields of image processing, graphics, and machine learning.