Search Results for author: Zhendong Mao

Found 33 papers, 21 papers with code

EmRel: Joint Representation of Entities and Embedded Relations for Multi-triple Extraction

1 code implementation • NAACL 2022 • Benfeng Xu, Quan Wang, Yajuan Lyu, Yabing Shi, Yong Zhu, Jie Gao, Zhendong Mao

Multi-triple extraction is a challenging task due to the existence of informative inter-triple correlations, and consequently rich interactions across the constituent entities and relations. While existing works only explore entity representations, we propose to explicitly introduce relation representation, jointly represent it with entities, and novelly align them to identify valid triples. We perform comprehensive experiments on document-level relation extraction and joint entity and relation extraction along with ablations to demonstrate the advantage of the proposed method.

Document-level Relation Extraction Joint Entity and Relation Extraction +2

Paper
Code

Sentiment-oriented Transformer-based Variational Autoencoder Network for Live Video Commenting

1 code implementation • 19 Apr 2024 • Fengyi Fu, Shancheng Fang, Weidong Chen, Zhendong Mao

Furthermore, a batch attention module is also proposed in this paper to alleviate the problem of missing sentimental samples, caused by the data imbalance, which is common in live videos as the popularity of videos varies.

Paper
Code

Benchmarking and Improving Compositional Generalization of Multi-aspect Controllable Text Generation

1 code implementation • 5 Apr 2024 • Tianqi Zhong, Zhaoyi Li, Quan Wang, Linqi Song, Ying Wei, Defu Lian, Zhendong Mao

Compositional generalization, representing the model's ability to generate text with new attribute combinations obtained by recombining single attributes from the training data, is a crucial property for multi-aspect controllable text generation (MCTG) methods.

Attribute Benchmarking +2

Paper
Code

RealCustom: Narrowing Real Text Word for Real-Time Open-Domain Text-to-Image Customization

no code implementations • 1 Mar 2024 • Mengqi Huang, Zhendong Mao, Mingcong Liu, Qian He, Yongdong Zhang

However, the inherent entangled influence scope of pseudo-words with the given text results in a dual-optimum paradox, i.e., the similarity of the given subjects and the controllability of the given text could not be optimal simultaneously.

Paper

Gradual Residuals Alignment: A Dual-Stream Framework for GAN Inversion and Image Attribute Editing

no code implementations • 22 Feb 2024 • Hao Li, Mengqi Huang, Lei Zhang, Bo Hu, Yi Liu, Zhendong Mao

GAN-based image attribute editing first leverages GAN inversion to project real images into the latent space of a GAN and then manipulates the corresponding latent codes.

Attribute

Paper

Benchmarking Large Language Models on Controllable Generation under Diversified Instructions

1 code implementation • 1 Jan 2024 • Yihan Chen, Benfeng Xu, Quan Wang, Yi Liu, Zhendong Mao

While large language models (LLMs) have exhibited impressive instruction-following capabilities, it is still unclear whether and to what extent they can respond to explicit constraints that might be entailed in various instructions.

Benchmarking Instruction Following +1

Paper
Code

E-CORE: Emotion Correlation Enhanced Empathetic Dialogue Generation

no code implementations • 25 Nov 2023 • Fengyi Fu, Lei Zhang, Quan Wang, Zhendong Mao

Then we propose an emotion correlation enhanced decoder, with a novel correlation-aware aggregation and soft/hard strategy, respectively improving the emotion perception and response generation.

Decoder Dialogue Generation +1

Paper

Grammatical Error Correction via Mixed-Grained Weighted Training

no code implementations • 23 Nov 2023 • Jiahao Li, Quan Wang, Chiwei Zhu, Zhendong Mao, Yongdong Zhang

In this paper, the inherent discrepancies are manifested in two aspects, namely, accuracy of data annotation and diversity of potential annotations.

Grammatical Error Correction Sentence

Paper

On the Calibration of Large Language Models and Alignment

no code implementations • 22 Nov 2023 • Chiwei Zhu, Benfeng Xu, Quan Wang, Yongdong Zhang, Zhendong Mao

As large language models attract increasing attention and find widespread application, challenges of reliability arise concurrently.

Paper

Improving Image Captioning via Predicting Structured Concepts

no code implementations • 14 Nov 2023 • Ting Wang, Weidong Chen, Yuanhe Tian, Yan Song, Zhendong Mao

Because bridging the semantic gap between images and texts is difficult in the image captioning task, conventional studies in this area have paid some attention to treating semantic concepts as a bridge between the two modalities and improved captioning performance accordingly.

Image Captioning

Paper

Random Entity Quantization for Parameter-Efficient Compositional Knowledge Graph Representation

1 code implementation • 24 Oct 2023 • Jiaang Li, Quan Wang, Yi Liu, Licheng Zhang, Zhendong Mao

We analyze this phenomenon and reveal that entity codes, the quantization outcomes for expressing entities, have higher entropy at the code level and Jaccard distance at the codeword level under random entity quantization.

Knowledge Graphs Quantization +1

Paper
Code

Air-Decoding: Attribute Distribution Reconstruction for Decoding-Time Controllable Text Generation

1 code implementation • 23 Oct 2023 • Tianqi Zhong, Quan Wang, Jingxuan Han, Yongdong Zhang, Zhendong Mao

Then we design a novel attribute distribution reconstruction method to balance the obtained distributions and use the reconstructed distributions to guide language models for generation, effectively avoiding the issue of Attribute Collapse.

Attribute Text Generation

Paper
Code

DreamIdentity: Improved Editability for Efficient Face-identity Preserved Image Generation

no code implementations • 1 Jul 2023 • Zhuowei Chen, Shancheng Fang, Wei Liu, Qian He, Mengqi Huang, Yongdong Zhang, Zhendong Mao

While large-scale pre-trained text-to-image models can synthesize diverse and high-quality human-centric images, an intractable problem is how to preserve the face identity for conditioned face images.

Image Generation

Paper

ExpertPrompting: Instructing Large Language Models to be Distinguished Experts

2 code implementations • 24 May 2023 • Benfeng Xu, An Yang, Junyang Lin, Quan Wang, Chang Zhou, Yongdong Zhang, Zhendong Mao

The answering quality of an aligned large language model (LLM) can be drastically improved if treated with proper crafting of prompts.

In-Context Learning Instruction Following +2

288

Paper
Code

Not All Image Regions Matter: Masked Vector Quantization for Autoregressive Image Generation

1 code implementation • CVPR 2023 • Mengqi Huang, Zhendong Mao, Quan Wang, Yongdong Zhang

Existing autoregressive models follow the two-stage generation paradigm that first learns a codebook in the latent space for image reconstruction and then completes the image generation autoregressively based on the learned codebook.

Image Generation Image Reconstruction +1

Paper
Code

Towards Accurate Image Coding: Improved Autoregressive Image Generation with Dynamic Vector Quantization

1 code implementation • CVPR 2023 • Mengqi Huang, Zhendong Mao, Zhuowei Chen, Yongdong Zhang

Existing vector quantization (VQ) based autoregressive models follow a two-stage generation paradigm that first learns a codebook to encode images as discrete codes, and then completes generation based on the learned codebook.

Image Generation Position +1

111

Paper
Code

Inductive Relation Prediction from Relational Paths and Context with Hierarchical Transformers

1 code implementation • 1 Apr 2023 • Jiaang Li, Quan Wang, Zhendong Mao

Relation prediction on knowledge graphs (KGs) is a key research topic.

Inductive Relation Prediction Knowledge Graphs +1

Paper
Code

$k$NN Prompting: Beyond-Context Learning with Calibration-Free Nearest Neighbor Inference

1 code implementation • 24 Mar 2023 • Benfeng Xu, Quan Wang, Zhendong Mao, Yajuan Lyu, Qiaoqiao She, Yongdong Zhang

In-Context Learning (ICL), which formulates target tasks as prompt completion conditioned on in-context demonstrations, has become the prevailing utilization of LLMs.

In-Context Learning

Paper
Code

Learning Semantic Relationship Among Instances for Image-Text Matching

1 code implementation • CVPR 2023 • Zheren Fu, Zhendong Mao, Yan Song, Yongdong Zhang

Image-text matching, a bridge connecting image and language, is an important task, which generally learns a holistic cross-modal embedding to achieve a high-quality semantic alignment between the two modalities.

Image Retrieval Image-text matching +8

Paper
Code

Crossing the Gap: Domain Generalization for Image Captioning

no code implementations • CVPR 2023 • Yuchen Ren, Zhendong Mao, Shancheng Fang, Yan Lu, Tong He, Hao Du, Yongdong Zhang, Wanli Ouyang

In this paper, we introduce a new setting called Domain Generalization for Image Captioning (DGIC), where the data from the target domain is unseen in the learning process.

Domain Generalization Image Captioning +1

Paper

Intra-class Adaptive Augmentation with Neighbor Correction for Deep Metric Learning

1 code implementation • 29 Nov 2022 • Zheren Fu, Zhendong Mao, Bo Hu, An-An Liu, Yongdong Zhang

They have overlooked the wide characteristic changes of different classes and cannot model abundant intra-class variations for generations.

Image Augmentation Image Retrieval +5

Paper
Code

ABINet++: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Spotting

1 code implementation • 19 Nov 2022 • Shancheng Fang, Zhendong Mao, Hongtao Xie, Yuxin Wang, Chenggang Yan, Yongdong Zhang

In this paper, we argue that the limited capacity of language models comes from 1) implicit language modeling; 2) unidirectional feature representation; and 3) language model with noise input.

Ranked #4 on Text Spotting on SCUT-CTW1500

Blocking Language Modelling +2

Paper
Code

UniRel: Unified Representation and Interaction for Joint Relational Triple Extraction

1 code implementation • 16 Nov 2022 • Wei Tang, Benfeng Xu, Yuyue Zhao, Zhendong Mao, Yifeng Liu, Yong Liao, Haiyong Xie

Relational triple extraction is challenging for its difficulty in capturing rich correlations between entities and relations.

Ranked #1 on Relation Extraction on WebNLG

Relation Extraction

Paper
Code

Improving Chinese Spelling Check by Character Pronunciation Prediction: The Effects of Adaptivity and Granularity

1 code implementation • 20 Oct 2022 • Jiahao Li, Quan Wang, Zhendong Mao, Junbo Guo, Yanyan Yang, Yongdong Zhang

In this paper, we consider introducing an auxiliary task of Chinese pronunciation prediction (CPP) to improve CSC, and, for the first time, systematically discuss the adaptivity and granularity of this auxiliary task.

Paper
Code

DSE-GAN: Dynamic Semantic Evolution Generative Adversarial Network for Text-to-Image Generation

no code implementations • 3 Sep 2022 • Mengqi Huang, Zhendong Mao, Penghui Wang, Quan Wang, Yongdong Zhang

Text-to-image generation aims at generating realistic images which are semantically consistent with the given text.

Generative Adversarial Network Text-to-Image Generation

Paper

Negative-Aware Attention Framework for Image-Text Matching

1 code implementation • CVPR 2022 • Kun Zhang, Zhendong Mao, Quan Wang, Yongdong Zhang

Image-text matching, as a fundamental task, bridges the gap between vision and language.

Image-text matching Text Matching +1

Paper
Code

Lesion-Aware Transformers for Diabetic Retinopathy Grading

no code implementations • CVPR 2021 • Rui Sun, Yihao Li, Tianzhu Zhang, Zhendong Mao, Feng Wu, Yongdong Zhang

First, to the best of our knowledge, this is the first work to formulate lesion discovery as a weakly supervised lesion localization problem via a transformer decoder.

Decoder Diabetic Retinopathy Grading

Paper

Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition

3 code implementations • CVPR 2021 • Shancheng Fang, Hongtao Xie, Yuxin Wang, Zhendong Mao, Yongdong Zhang

Additionally, based on the ensemble of iterative predictions, we propose a self-training method which can learn from unlabeled images effectively.

Language Modelling Scene Text Recognition

38,684

Paper
Code

Entity Structure Within and Throughout: Modeling Mention Dependencies for Document-Level Relation Extraction

3 code implementations • 20 Feb 2021 • Benfeng Xu, Quan Wang, Yajuan Lyu, Yong Zhu, Zhendong Mao

Our experiments demonstrate the usefulness of the proposed entity structure and the effectiveness of SSAN.

Ranked #3 on Relation Extraction on DocRED

Document-level Relation Extraction Relation

1,694

Paper
Code

Overcoming Language Priors with Self-supervised Learning for Visual Question Answering

1 code implementation • 17 Dec 2020 • Xi Zhu, Zhendong Mao, Chunxiao Liu, Peng Zhang, Bin Wang, Yongdong Zhang

Our method can compensate for the data biases by generating balanced data without introducing external annotations.

Question Answering Self-Supervised Learning +1

Paper
Code

Image Captioning with Context-Aware Auxiliary Guidance

no code implementations • 10 Dec 2020 • Zeliang Song, Xiaofei Zhou, Zhendong Mao, Jianlong Tan

Image captioning is a challenging computer vision task, which aims to generate a natural language description of an image.

Decoder Image Captioning

Paper

Curriculum Learning for Natural Language Understanding

no code implementations • ACL 2020 • Benfeng Xu, Licheng Zhang, Zhendong Mao, Quan Wang, Hongtao Xie, Yongdong Zhang

With the great success of pre-trained language models, the pretrain-finetune paradigm has become the undoubtedly dominant solution for natural language understanding (NLU) tasks.

Natural Language Understanding

Paper

Graph Structured Network for Image-Text Matching

1 code implementation • CVPR 2020 • Chunxiao Liu, Zhendong Mao, Tianzhu Zhang, Hongtao Xie, Bin Wang, Yongdong Zhang

The GSMN explicitly models object, relation and attribute as a structured phrase, which not only allows the correspondence of object, relation and attribute to be learned separately, but also helps in learning the fine-grained correspondence of the structured phrase.

Ranked #16 on Cross-Modal Retrieval on Flickr30k

Attribute Image-text matching +3

160

Paper
Code
