2 code implementations • 11 Mar 2021 • Yuqi Huo, Manli Zhang, Guangzhen Liu, Haoyu Lu, Yizhao Gao, Guoxing Yang, Jingyuan Wen, Heng Zhang, Baogui Xu, Weihao Zheng, Zongzheng Xi, Yueqian Yang, Anwen Hu, Jinming Zhao, Ruichen Li, Yida Zhao, Liang Zhang, Yuqing Song, Xin Hong, Wanqing Cui, Danyang Hou, Yingyan Li, Junyi Li, Peiyu Liu, Zheng Gong, Chuhao Jin, Yuchong Sun, ShiZhe Chen, Zhiwu Lu, Zhicheng Dou, Qin Jin, Yanyan Lan, Wayne Xin Zhao, Ruihua Song, Ji-Rong Wen
We further construct a large Chinese multi-source image-text dataset called RUC-CAS-WenLan for pre-training our BriVL model.
Ranked #1 on Image Retrieval on RUC-CAS-WenLan
3 code implementations • 18 Apr 2023 • Zheng Lian, Haiyang Sun, Licai Sun, Kang Chen, Mingyu Xu, Kexin Wang, Ke Xu, Yu He, Ying Li, Jinming Zhao, Ye Liu, Bin Liu, Jiangyan Yi, Meng Wang, Erik Cambria, Guoying Zhao, Björn W. Schuller, JianHua Tao
The first Multimodal Emotion Recognition Challenge (MER 2023) was successfully held at ACM Multimedia.
1 code implementation • ACL 2022 • Jinming Zhao, Tenggan Zhang, Jingwen Hu, Yuchen Liu, Qin Jin, Xinchao Wang, Haizhou Li
In this work, we propose a Multi-modal Multi-scene Multi-label Emotional Dialogue dataset, M3ED, which contains 990 dyadic emotional dialogues from 56 different TV series, a total of 9,082 turns and 24,449 utterances.
Cultural Vocal Bursts Intensity Prediction • Emotion Recognition
1 code implementation • ACL 2021 • Jingwen Hu, Yuchen Liu, Jinming Zhao, Qin Jin
Emotion recognition in conversation (ERC) is a crucial component in affective dialogue systems, which helps the system understand users' emotions and generate empathetic responses.
1 code implementation • ACL 2021 • Jinming Zhao, Ruichen Li, Qin Jin
However, in real-world applications, we often encounter the problem of missing modalities, and it is uncertain which modalities will be missing.
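The missing-modality setting described above can be illustrated with a minimal late-fusion sketch: average the feature vectors of whichever modalities are present, and zero-fill when everything is missing. The `fuse` function and the 4-dimensional toy features are illustrative assumptions, not the paper's method.

```python
def fuse(features, dim=4):
    """Late fusion over available modalities.

    `features` maps modality names to feature vectors (lists of floats),
    with None marking a missing modality. Present vectors are averaged
    element-wise; if every modality is missing, a zero vector is returned.
    """
    present = [v for v in features.values() if v is not None]
    if not present:
        return [0.0] * dim                      # all modalities missing
    n = len(present)
    return [sum(col) / n for col in zip(*present)]

# Full input vs. the audio modality missing at test time
full = fuse({"text": [1, 0, 0, 0], "audio": [0, 1, 0, 0]})
partial = fuse({"text": [1, 0, 0, 0], "audio": None})
```

Because the fusion degrades gracefully to whichever modalities remain, the same model can be evaluated under arbitrary missing-modality patterns without retraining the toy fusion step.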
1 code implementation • 17 Jul 2020 • Jinming Zhao, Ming Liu, Longxiang Gao, Yuan Jin, Lan Du, He Zhao, He Zhang, Gholamreza Haffari
Obtaining training data for multi-document summarization (MDS) is time consuming and resource-intensive, so recent neural models can only be trained for limited domains.
1 code implementation • 27 Oct 2022 • Haolin Zuo, Rui Liu, Jinming Zhao, Guanglai Gao, Haizhou Li
Multimodal emotion recognition leverages complementary information across modalities to gain performance.
1 code implementation • 17 Oct 2022 • Tongtong Wu, Guitao Wang, Jinming Zhao, Zhaoran Liu, Guilin Qi, Yuan-Fang Li, Gholamreza Haffari
We explore speech relation extraction via two approaches: a pipeline approach that applies text-based extraction to the output of a pretrained ASR module, and an end-to-end approach via a newly proposed encoder-decoder model, which we call SpeechRE.
Automatic Speech Recognition (ASR) +3
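The pipeline approach from the entry above can be sketched in a few lines: transcribe the audio, then run a text-based relation extractor on the transcript. Both components here are stubs for illustration only — `asr_transcribe` simply returns a stored transcript and `extract_relations` is a toy pattern matcher, standing in for a pretrained ASR module and a trained RE model.

```python
import re

def asr_transcribe(audio):
    """Stub standing in for a pretrained ASR module; a real system
    would decode the audio waveform here."""
    return audio["gold_transcript"]

def extract_relations(text):
    """Toy text-based relation extractor: matches 'X works for Y'
    and emits (subject, relation, object) triples."""
    pattern = re.compile(r"(\w+) works for (\w+)")
    return [(subj, "works_for", obj) for subj, obj in pattern.findall(text)]

def pipeline_speech_re(audio):
    """Pipeline speech relation extraction: ASR, then text-based RE."""
    return extract_relations(asr_transcribe(audio))

triples = pipeline_speech_re({"gold_transcript": "Alice works for Acme"})
# triples == [("Alice", "works_for", "Acme")]
```

One known weakness of such pipelines, which motivates the end-to-end alternative, is that ASR transcription errors propagate directly into the extraction step.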
1 code implementation • 19 Jul 2022 • Tenggan Zhang, Chuanhe Liu, Xiaolong Liu, Yuchen Liu, Liyu Meng, Lei Sun, Wenqiang Jiang, Fengyuan Zhang, Jinming Zhao, Qin Jin
This paper presents our system for the Multi-Task Learning (MTL) Challenge in the 4th Affective Behavior Analysis in-the-wild (ABAW) competition.
1 code implementation • 15 Oct 2022 • Jinming Zhao, Gholamreza Haffari, Ehsan Shareghi
Training end-to-end speech translation (ST) systems requires sufficiently large-scale data, which is unavailable for most language pairs and domains.
1 code implementation • EMNLP 2021 • Jinming Zhao, Philip Arthur, Gholamreza Haffari, Trevor Cohn, Ehsan Shareghi
Most existing simultaneous machine translation (SiMT) systems are trained and evaluated on offline translation corpora.
1 code implementation • 3 Jul 2022 • Jinming Zhao, Hao Yang, Ehsan Shareghi, Gholamreza Haffari
End-to-end speech-to-text translation models are often initialized with a pre-trained speech encoder and a pre-trained text decoder.
1 code implementation • 28 May 2023 • Hao Yang, Jinming Zhao, Gholamreza Haffari, Ehsan Shareghi
Pre-trained speech encoders have been central to pushing state-of-the-art results across various speech understanding and generation tasks.
1 code implementation • 24 Oct 2022 • Hao Yang, Jinming Zhao, Gholamreza Haffari, Ehsan Shareghi
Pre-trained speech Transformers have facilitated great success across various speech processing tasks.
no code implementations • 27 Oct 2021 • Jinming Zhao, Ruichen Li, Qin Jin, Xinchao Wang, Haizhou Li
Multimodal emotion recognition study is hindered by the lack of labelled corpora in terms of scale and diversity, due to the high annotation cost and label ambiguity.
no code implementations • COLING 2022 • Yuchen Liu, Jinming Zhao, Jingwen Hu, Ruichen Li, Qin Jin
Emotion Recognition in Conversation (ERC) has attracted increasing attention in the affective computing research field.
no code implementations • 16 Oct 2022 • Jinming Zhao, Hao Yang, Gholamreza Haffari, Ehsan Shareghi
Pre-trained speech Transformers in speech translation (ST) have facilitated state-of-the-art (SotA) results; yet, using such encoders is computationally expensive.
no code implementations • 23 Apr 2023 • Jinming Zhao, Yuka Ko, Kosuke Doi, Ryo Fukuda, Katsuhito Sudoh, Satoshi Nakamura
Research has been limited due to the lack of a large-scale training corpus.
no code implementations • 13 Sep 2023 • Minghan Wang, Jinming Zhao, Thuy-Trang Vu, Fatemeh Shiri, Ehsan Shareghi, Gholamreza Haffari
The results show that the LLM outperforms dedicated MT models in terms of BLEU and LAAL metrics.
1 code implementation • 27 Jan 2024 • Jingqi Kang, Tongtong Wu, Jinming Zhao, Guitao Wang, Guilin Qi, Yuan-Fang Li, Gholamreza Haffari
While text-based event extraction has been an active research area and has seen successful application in many domains, extracting semantic events from speech directly is an under-explored problem.
no code implementations • 20 Apr 2024 • Jingqi Kang, Tongtong Wu, Jinming Zhao, Guitao Wang, Yinwei Wei, Hao Yang, Guilin Qi, Yuan-Fang Li, Gholamreza Haffari
To address the challenges of catastrophic forgetting and effective disentanglement, we propose a novel method, 'Double Mixture.'