Search Results for author: Qi Zheng

Found 25 papers, 12 papers with code

Understanding Gender Bias in Knowledge Base Embeddings

no code implementations ACL 2022 Yupei Du, Qi Zheng, Yuanbin Wu, Man Lan, Yan Yang, Meirong Ma

To exemplify the potential applications of our study, we also present two strategies (by adding and removing KB triples) to mitigate gender biases in KB embeddings.

LayoutLLM: Layout Instruction Tuning with Large Language Models for Document Understanding

3 code implementations 8 Apr 2024 Chuwei Luo, Yufan Shen, Zhaoqing Zhu, Qi Zheng, Zhi Yu, Cong Yao

The core of LayoutLLM is a layout instruction tuning strategy, which is specially designed to enhance the comprehension and utilization of document layouts.

document understanding

LORE++: Logical Location Regression Network for Table Structure Recognition with Pre-training

no code implementations 3 Jan 2024 Rujiao Long, Hangdi Xing, Zhibo Yang, Qi Zheng, Zhi Yu, Cong Yao, Fei Huang

We model TSR as a logical location regression problem and propose a new TSR framework called LORE, standing for LOgical location REgression network, which for the first time regresses logical location as well as spatial location of table cells in a unified network.

regression

GeoLayoutLM: Geometric Pre-training for Visual Information Extraction

1 code implementation CVPR 2023 Chuwei Luo, Changxu Cheng, Qi Zheng, Cong Yao

Additionally, novel relation heads, which are pre-trained by the geometric pre-training tasks and fine-tuned for RE, are elaborately designed to enrich and enhance the feature representation.

Document AI entity_extraction +3

LORE: Logical Location Regression Network for Table Structure Recognition

1 code implementation 7 Mar 2023 Hangdi Xing, Feiyu Gao, Rujiao Long, Jiajun Bu, Qi Zheng, Liangcheng Li, Cong Yao, Zhi Yu

Table structure recognition (TSR) aims at extracting tables in images into machine-understandable formats.

regression Table Recognition

ESceme: Vision-and-Language Navigation with Episodic Scene Memory

1 code implementation 2 Mar 2023 Qi Zheng, Daqing Liu, Chaoyue Wang, Jing Zhang, Dadong Wang, DaCheng Tao

Vision-and-language navigation (VLN) simulates a visual agent that follows natural-language navigation instructions in real-world scenes.

Vision and Language Navigation

Modeling Video As Stochastic Processes for Fine-Grained Video Representation Learning

1 code implementation CVPR 2023 Heng Zhang, Daqing Liu, Qi Zheng, Bing Su

Specifically, we enforce the embeddings of the frame sequence of interest to approximate a goal-oriented stochastic process, i.e., Brownian bridge, in the latent space via a process-based contrastive loss.

Contrastive Learning Representation Learning +3

Cross-Modal Contrastive Learning for Robust Reasoning in VQA

1 code implementation 21 Nov 2022 Qi Zheng, Chaoyue Wang, Daqing Liu, Dadong Wang, DaCheng Tao

For each positive pair, we regard the images from different graphs as negative samples and derive a multi-positive version of contrastive learning.

Contrastive Learning Question Answering +1

Bi-VLDoc: Bidirectional Vision-Language Modeling for Visually-Rich Document Understanding

no code implementations 27 Jun 2022 Chuwei Luo, Guozhi Tang, Qi Zheng, Cong Yao, Lianwen Jin, Chenliang Li, Yang Xue, Luo Si

Multi-modal document pre-trained models have proven to be very effective in a variety of visually-rich document understanding (VrDU) tasks.

Document Classification document understanding +2

Bypass Network for Semantics Driven Image Paragraph Captioning

no code implementations 21 Jun 2022 Qi Zheng, Chaoyue Wang, Dadong Wang

Most existing methods model the coherence through the topic transition that dynamically infers a topic vector from preceding sentences.

Image Paragraph Captioning Sentence

Visual Superordinate Abstraction for Robust Concept Learning

no code implementations 28 May 2022 Qi Zheng, Chaoyue Wang, Dadong Wang, DaCheng Tao

Concept learning constructs visual representations that are connected to linguistic semantics, which is fundamental to vision-language tasks.

Attribute Question Answering +1

FAVER: Blind Quality Prediction of Variable Frame Rate Videos

1 code implementation 5 Jan 2022 Qi Zheng, Zhengzhong Tu, Pavan C. Madhusudana, Xiaoyang Zeng, Alan C. Bovik, Yibo Fan

Video quality assessment (VQA) remains an important and challenging problem that affects many applications at the widest scales.

Cloud Computing Video Quality Assessment +1

Decoupling Visual-Semantic Feature Learning for Robust Scene Text Recognition

no code implementations 24 Nov 2021 Changxu Cheng, Bohan Li, Qi Zheng, Yongpan Wang, Wenyu Liu

As a result, the learning of semantic features is prone to bias toward the limited vocabulary of the training set, which is called vocabulary reliance.

Scene Text Recognition

SentiPrompt: Sentiment Knowledge Enhanced Prompt-Tuning for Aspect-Based Sentiment Analysis

no code implementations 17 Sep 2021 Chengxi Li, Feiyu Gao, Jiajun Bu, Lu Xu, Xiang Chen, Yu Gu, Zirui Shao, Qi Zheng, Ningyu Zhang, Yongpan Wang, Zhi Yu

We inject sentiment knowledge regarding aspects, opinions, and polarities into prompt and explicitly model term relations via constructing consistency and polarity judgment templates from the ground truth triplets.

Aspect-Based Sentiment Analysis Aspect-Based Sentiment Analysis (ABSA) +3

Inference for High Dimensional Censored Quantile Regression

1 code implementation 22 Jul 2021 Zhe Fei, Qi Zheng, Hyokyoung G. Hong, Yi Li

To our knowledge, there is little work available to draw inference on the effects of high dimensional predictors for censored quantile regression.

Epidemiology regression +2

Pouring Dynamics Estimation Using Gated Recurrent Units

no code implementations 8 May 2021 Qi Zheng

One of the most commonly performed manipulations in a human's daily life is pouring.

Progressive Localization Networks for Language-based Moment Localization

no code implementations 2 Feb 2021 Qi Zheng, Jianfeng Dong, Xiaoye Qu, Xun Yang, Yabing Wang, Pan Zhou, Baolong Liu, Xun Wang

The language-based setting of this task allows for an open set of target activities, resulting in a large variation of the temporal lengths of video moments.

Syntax-Aware Action Targeting for Video Captioning

1 code implementation CVPR 2020 Qi Zheng, Chaoyue Wang, Dacheng Tao

Existing methods on video captioning have made great efforts to identify objects/instances in videos, but few of them emphasize the prediction of action.

Video Captioning

Hierarchical Bilinear Pooling for Fine-Grained Visual Recognition

2 code implementations ECCV 2018 Chaojian Yu, Xinyi Zhao, Qi Zheng, Peng Zhang, Xinge You

Fine-grained visual recognition is challenging because it highly relies on the modeling of various semantic parts and fine-grained feature learning.

Fine-Grained Visual Recognition

Coarse-to-Fine Salient Object Detection with Low-Rank Matrix Recovery

no code implementations 21 May 2018 Qi Zheng, Shujian Yu, Xinge You, Qinmu Peng

Low-Rank Matrix Recovery (LRMR) has recently been applied to saliency detection by decomposing image features into a low-rank component associated with background and a sparse component associated with visual salient regions.

object-detection RGB Salient Object Detection +2
