Search Results for author: Zejun Li

Found 9 papers, 4 papers with code

DELAN: Dual-Level Alignment for Vision-and-Language Navigation by Cross-Modal Contrastive Learning

1 code implementation • 2 Apr 2024 • Mengfei Du, Binhao Wu, Jiwen Zhang, Zhihao Fan, Zejun Li, Ruipu Luo, Xuanjing Huang, Zhongyu Wei

For task completion, the agent needs to align and integrate various navigation modalities, including instruction, observation and navigation history.

Contrastive Learning • Decision Making • +2

ReForm-Eval: Evaluating Large Vision Language Models via Unified Re-Formulation of Task-Oriented Benchmarks

1 code implementation • 4 Oct 2023 • Zejun Li, Ye Wang, Mengfei Du, Qingwen Liu, Binhao Wu, Jiwen Zhang, Chengxing Zhou, Zhihao Fan, Jie Fu, Jingjing Chen, Xuanjing Huang, Zhongyu Wei

Recent years have witnessed remarkable progress in the development of large vision-language models (LVLMs).

A Unified Continuous Learning Framework for Multi-modal Knowledge Discovery and Pre-training

no code implementations • 11 Jun 2022 • Zhihao Fan, Zhongyu Wei, Jingjing Chen, Siyuan Wang, Zejun Li, Jiarong Xu, Xuanjing Huang

These two steps are iteratively performed in our framework for continuous learning.

MVPTR: Multi-Level Semantic Alignment for Vision-Language Pre-Training via Multi-Stage Learning

1 code implementation • 29 Jan 2022 • Zejun Li, Zhihao Fan, Huaixiao Tou, Jingjing Chen, Zhongyu Wei, Xuanjing Huang

In MVPTR, we follow the nested structure of both modalities to introduce concepts as high-level semantics.

Image-text matching • Language Modelling • +3

Negative Sample is Negative in Its Own Way: Tailoring Negative Sentences for Image-Text Retrieval

1 code implementation • Findings (NAACL) 2022 • Zhihao Fan, Zhongyu Wei, Zejun Li, Siyuan Wang, Jianqing Fan

We propose our TAiloring neGative Sentences with Discrimination and Correction (TAGS-DC) to generate synthetic sentences automatically as negative samples.

Retrieval • Sentence • +1

Constructing Phrase-level Semantic Labels to Form Multi-Grained Supervision for Image-Text Retrieval

no code implementations • 12 Sep 2021 • Zhihao Fan, Zhongyu Wei, Zejun Li, Siyuan Wang, Haijun Shan, Xuanjing Huang, Jianqing Fan

Existing research for image text retrieval mainly relies on sentence-level supervision to distinguish matched and mismatched sentences for a query image.

Representation Learning • Retrieval • +2

TCIC: Theme Concepts Learning Cross Language and Vision for Image Captioning

no code implementations • 21 Jun 2021 • Zhihao Fan, Zhongyu Wei, Siyuan Wang, Ruize Wang, Zejun Li, Haijun Shan, Xuanjing Huang

Considering that theme concepts can be learned from both images and captions, we propose two settings for their representations learning based on TTN.

Image Captioning • Representation Learning

An Unsupervised Sampling Approach for Image-Sentence Matching Using Document-Level Structural Information

no code implementations • 21 Mar 2021 • Zejun Li, Zhongyu Wei, Zhihao Fan, Haijun Shan, Xuanjing Huang

In this paper, we focus on the problem of unsupervised image-sentence matching.

Representation Learning • Semantic Similarity • +2

AdaDNNs: Adaptive Ensemble of Deep Neural Networks for Scene Text Recognition

no code implementations • 10 Oct 2017 • Chun Yang, Xu-Cheng Yin, Zejun Li, Jianwei Wu, Chunchao Guo, Hongfa Wang, Lei Xiao

Recognizing text in the wild is a really challenging task because of complex backgrounds, various illuminations and diverse distortions, even with deep neural networks (convolutional neural networks and recurrent neural networks).

Scene Text Recognition
