Search Results for author: Aoxiong Yin

Found 10 papers, 3 papers with code

TransFace: Unit-Based Audio-Visual Speech Synthesizer for Talking Head Translation

no code implementations 23 Dec 2023 Xize Cheng, Rongjie Huang, Linjun Li, Tao Jin, Zehan Wang, Aoxiong Yin, Minglei Li, Xinyu Duan, Changpeng Yang, Zhou Zhao

However, talking head translation, converting audio-visual speech (i.e., talking head video) from one language into another, still confronts several challenges compared to audio speech: (1) Existing methods invariably rely on cascading, synthesizing via both audio and text, resulting in delays and cascading errors.

Self-Supervised Learning Speech-to-Speech Translation +1

TrainerAgent: Customizable and Efficient Model Training through LLM-Powered Multi-Agent System

no code implementations 11 Nov 2023 Haoyuan Li, Hao Jiang, Tianke Zhang, Zhelun Yu, Aoxiong Yin, Hao Cheng, Siming Fu, Yuhao Zhang, Wanggui He

We anticipate that our work will contribute to the advancement of research on TrainerAgent in both academic and industry communities, potentially establishing it as a new paradigm for model development in the field of AI.

Decision Making Language Modelling +1

3DRP-Net: 3D Relative Position-aware Network for 3D Visual Grounding

no code implementations 25 Jul 2023 Zehan Wang, Haifeng Huang, Yang Zhao, Linjun Li, Xize Cheng, Yichen Zhu, Aoxiong Yin, Zhou Zhao

3D visual grounding aims to localize the target object in a 3D point cloud by a free-form language description.

Object Position +3

Distilling Coarse-to-Fine Semantic Matching Knowledge for Weakly Supervised 3D Visual Grounding

1 code implementation ICCV 2023 Zehan Wang, Haifeng Huang, Yang Zhao, Linjun Li, Xize Cheng, Yichen Zhu, Aoxiong Yin, Zhou Zhao

To accomplish this, we design a novel semantic matching model that analyzes the semantic similarity between object proposals and sentences in a coarse-to-fine manner.

Object Semantic Similarity +3

Gloss Attention for Gloss-free Sign Language Translation

1 code implementation CVPR 2023 Aoxiong Yin, Tianyun Zhong, Li Tang, Weike Jin, Tao Jin, Zhou Zhao

We find that it can provide two aspects of information for the model: 1) it can help the model implicitly learn the location of semantic boundaries in continuous sign language videos, and 2) it can help the model understand the sign language video globally.

Gloss-free Sign Language Translation Language Modelling +4

Connecting Multi-modal Contrastive Representations

no code implementations NeurIPS 2023 Zehan Wang, Yang Zhao, Xize Cheng, Haifeng Huang, Jiageng Liu, Li Tang, Linjun Li, Yongqi Wang, Aoxiong Yin, Ziang Zhang, Zhou Zhao

This paper proposes a novel training-efficient method for learning MCR without paired data called Connecting Multi-modal Contrastive Representations (C-MCR).

3D Point Cloud Classification counterfactual +4

MixSpeech: Cross-Modality Self-Learning with Audio-Visual Stream Mixup for Visual Speech Translation and Recognition

2 code implementations ICCV 2023 Xize Cheng, Linjun Li, Tao Jin, Rongjie Huang, Wang Lin, Zehan Wang, Huangdai Liu, Ye Wang, Aoxiong Yin, Zhou Zhao

However, despite researchers exploring cross-lingual translation techniques such as machine translation and audio speech translation to overcome language barriers, there is still a shortage of cross-lingual studies on visual speech.

Lip Reading Machine Translation +4

MLSLT: Towards Multilingual Sign Language Translation

no code implementations CVPR 2022 Aoxiong Yin, Zhou Zhao, Weike Jin, Meng Zhang, Xingshan Zeng, Xiaofei He

In addition, we also explore zero-shot translation in sign language and find that our model can achieve comparable performance to the supervised BSLT model on some language pairs.

Sign Language Translation Translation

SimulSLT: End-to-End Simultaneous Sign Language Translation

no code implementations 8 Dec 2021 Aoxiong Yin, Zhou Zhao, Jinglin Liu, Weike Jin, Meng Zhang, Xingshan Zeng, Xiaofei He

Sign language translation, a technology with profound social significance, has attracted growing interest from researchers in recent years.

Sign Language Translation Translation
