Search Results for author: Tingle Li

Found 8 papers, 3 papers with code

Unconstrained Dysfluency Modeling for Dysfluent Speech Transcription and Detection

no code implementations • 20 Dec 2023 • Jiachen Lian, Carly Feng, Naasir Farooqi, Steve Li, Anshul Kashyap, Cheol Jun Cho, Peter Wu, Robbie Netzorg, Tingle Li, Gopala Krishna Anumanchipalli

Dysfluent speech modeling requires time-accurate and silence-aware transcription at both the word level and the phonetic level.

Deep Speech Synthesis from MRI-Based Articulatory Representations

1 code implementation • 5 Jul 2023 • Peter Wu, Tingle Li, Yijing Lu, Yubin Zhang, Jiachen Lian, Alan W Black, Louis Goldstein, Shinji Watanabe, Gopala K. Anumanchipalli

Finally, through a series of ablations, we show that the proposed MRI representation is more comprehensive than EMA and identify the most suitable MRI feature subset for articulatory synthesis.

Computational Efficiency, Denoising

On Uni-Modal Feature Learning in Supervised Multi-Modal Learning

1 code implementation • 2 May 2023 • Chenzhuang Du, Jiaye Teng, Tingle Li, Yichen Liu, Tianyuan Yuan, Yue Wang, Yang Yuan, Hang Zhao

We abstract the features (i.e., learned representations) of multi-modal data into 1) uni-modal features, which can be learned from uni-modal training, and 2) paired features, which can only be learned from cross-modal interactions.

Learning Visual Styles from Audio-Visual Associations

no code implementations • 10 May 2022 • Tingle Li, Yichen Liu, Andrew Owens, Hang Zhao

Our model learns to manipulate the texture of a scene to match a sound, a problem we term audio-driven image stylization.

Image Stylization

Neural Dubber: Dubbing for Videos According to Scripts

no code implementations • NeurIPS 2021 • Chenxu Hu, Qiao Tian, Tingle Li, Yuping Wang, Yuxuan Wang, Hang Zhao

Neural Dubber is a multi-modal text-to-speech (TTS) model that utilizes the lip movement in the video to control the prosody of the generated speech.

Modality Laziness: Everybody's Business is Nobody's Business

no code implementations • 29 Sep 2021 • Chenzhuang Du, Jiaye Teng, Tingle Li, Yichen Liu, Yue Wang, Yang Yuan, Hang Zhao

We name this problem of multi-modal training Modality Laziness.

Improving Multi-Modal Learning with Uni-Modal Teachers

no code implementations • 21 Jun 2021 • Chenzhuang Du, Tingle Li, Yichen Liu, Zixin Wen, Tianyu Hua, Yue Wang, Hang Zhao

We name this problem Modality Failure, and hypothesize that the imbalance of modalities and the implicit bias of common objectives in fusion methods prevent the encoders of each modality from learning sufficient features.

Ranked #60 on Semantic Segmentation on NYU Depth v2

Image Segmentation, Semantic Segmentation

Sams-Net: A Sliced Attention-based Neural Network for Music Source Separation

1 code implementation • 12 Sep 2019 • Tingle Li, Jia-Wei Chen, Haowen Hou, Ming Li

Convolutional neural network (CNN) or long short-term memory (LSTM) based models that take spectrograms or waveforms as input are commonly used for deep-learning-based audio source separation.

Ranked #23 on Music Source Separation on MUSDB18

Audio Source Separation, Music Source Separation
