Search Results for author: Guanghui Xu

Found 8 papers, 4 papers with code

Debiased Visual Question Answering from Feature and Sample Perspectives

1 code implementation NeurIPS 2021 Zhiquan Wen, Guanghui Xu, Mingkui Tan, Qingyao Wu, Qi Wu

From the sample perspective, we construct two types of negative samples to assist the training of the models, without introducing additional annotations.

Bias Detection Question Answering +1

AdaXpert: Adapting Neural Architecture for Growing Data

1 code implementation 1 Jul 2021 Shuaicheng Niu, Jiaxiang Wu, Guanghui Xu, Yifan Zhang, Yong Guo, Peilin Zhao, Peng Wang, Mingkui Tan

To address this, we present a neural architecture adaptation method, namely Adaptation eXpert (AdaXpert), to efficiently adjust previous architectures on the growing data.

Towards Accurate Text-based Image Captioning with Content Diversity Exploration

1 code implementation CVPR 2021 Guanghui Xu, Shuaicheng Niu, Mingkui Tan, Yucheng Luo, Qing Du, Qi Wu

This task, however, is very challenging because an image often contains complex text and visual information that is hard to describe comprehensively.

Image Captioning

How to Train Your Agent to Read and Write

1 code implementation 4 Jan 2021 Li Liu, Mengge He, Guanghui Xu, Mingkui Tan, Qi Wu

Typically, this requires an agent to fully understand the knowledge from the given text materials and generate correct and fluent novel paragraphs, which is very challenging in practice.

KG-to-Text Generation Knowledge Graphs

Improving Prosody Modelling with Cross-Utterance BERT Embeddings for End-to-end Speech Synthesis

no code implementations 6 Nov 2020 Guanghui Xu, Wei Song, Zhengchen Zhang, Chao Zhang, Xiaodong He, BoWen Zhou

Although prosody is related to linguistic information up to the discourse structure, most text-to-speech (TTS) systems take into account only the information within each sentence, which makes it challenging to convert a paragraph of text into natural and expressive speech.

Sentence Embeddings Speech Synthesis

Building a mixed-lingual neural TTS system with only monolingual data

no code implementations 12 Apr 2019 Liumeng Xue, Wei Song, Guanghui Xu, Lei Xie, Zhizheng Wu

When deploying a Chinese neural text-to-speech (TTS) synthesis system, one of the challenges is to synthesize Chinese utterances with English phrases or words embedded.

You Only Look & Listen Once: Towards Fast and Accurate Visual Grounding

no code implementations 12 Feb 2019 Chaorui Deng, Qi Wu, Guanghui Xu, Zhuliang Yu, Yanwu Xu, Kui Jia, Mingkui Tan

Most state-of-the-art methods in visual grounding (VG) operate in a two-stage manner: in the first stage, an object detector is adopted to generate a set of object proposals from the input image, and the second stage is simply formulated as a cross-modal matching problem that finds the best match between the language query and all region proposals.

Object Detection Region Proposal +1
