no code implementations • NAACL (GeBNLP) 2022 • Xiuying Chen, Mingzhe Li, Rui Yan, Xin Gao, Xiangliang Zhang
Word embeddings learned from massive text collections have demonstrated significant levels of discriminative biases. However, debiasing for the Chinese language, one of the most widely spoken languages, has been less explored. Meanwhile, existing literature relies on manually created supplementary data, which is time- and energy-consuming. In this work, we propose the first Chinese Gender-neutral word Embedding model (CGE) based on Word2vec, which learns gender-neutral word embeddings without any labeled data. Concretely, CGE utilizes and emphasizes the rich feminine and masculine information contained in radicals, i.e., a kind of component in Chinese characters, during the training procedure. This consequently alleviates discriminative gender biases. Experimental results on public benchmark datasets show that our unsupervised method outperforms the state-of-the-art supervised debiased word embedding models without sacrificing the functionality of the embedding model.
no code implementations • 12 Feb 2024 • Mingzhe Li, Xiuying Chen, Jing Xiang, Qishen Zhang, Changsheng Ma, Chenchen Dai, Jinxiong Chang, Zhongyi Liu, Guannan Zhang
Since attributes from two ends are often not aligned in terms of number and type, we propose to exploit the benefit of attributes by multiple-intent modeling.
1 code implementation • 1 Jun 2023 • Xiuying Chen, Guodong Long, Chongyang Tao, Mingzhe Li, Xin Gao, Chengqi Zhang, Xiangliang Zhang
The other factor is in the latent space, where the attacked inputs bring more variations to the hidden states.
no code implementations • 19 May 2023 • Xiuying Chen, Mingzhe Li, Shen Gao, Xin Cheng, Qiang Yang, Qishen Zhang, Xin Gao, Xiangliang Zhang
To address these two challenges, we first propose a unified topic encoder, which jointly discovers latent topics from the document and various kinds of side information.
no code implementations • 17 Mar 2023 • Xiuying Chen, Mingzhe Li, Jiayi Zhang, Xiaoqiang Xia, Chen Wei, Jianwei Cui, Xin Gao, Xiangliang Zhang, Rui Yan
As it is cumbersome and expensive to acquire a huge amount of data for training neural dialog models, data augmentation is proposed to effectively utilize existing training samples.
no code implementations • 27 Jan 2023 • Xin Cheng, Shen Gao, Yuchi Zhang, Yongliang Wang, Xiuying Chen, Mingzhe Li, Dongyan Zhao, Rui Yan
Review summarization is a non-trivial task that aims to summarize the main idea of a product review on an e-commerce website.
no code implementations • 3 Jan 2023 • Mingzhe Li, Xiuying Chen, Weiheng Liao, Yang Song, Tao Zhang, Dongyan Zhao, Rui Yan
The key idea is to reduce the number of parameters that rely on interview dialogs by disentangling the knowledge selector and dialog generator so that most parameters can be trained with ungrounded dialogs as well as the resume data that are not low-resource.
1 code implementation • 2 Jan 2023 • Xiuying Chen, Mingzhe Li, Shen Gao, Zhangming Chan, Dongyan Zhao, Xin Gao, Xiangliang Zhang, Rui Yan
Nowadays, time-stamped web documents related to a general news query flood the Internet, and timeline summarization aims to concisely summarize the evolution trajectory of events along the timeline.
no code implementations • 8 Dec 2022 • Xiuying Chen, Mingzhe Li, Shen Gao, Rui Yan, Xin Gao, Xiangliang Zhang
We first propose a Multi-granularity Unsupervised Summarization model (MUS) as a simple and low-cost solution to the task.
1 code implementation • 4 Oct 2022 • Xiuying Chen, Mingzhe Li, Xin Gao, Xiangliang Zhang
The evaluation of factual consistency also shows that our model generates more faithful summaries than baselines.
no code implementations • ACL 2022 • Mingzhe Li, Xiexiong Lin, Xiuying Chen, Jinxiong Chang, Qishen Zhang, Feng Wang, Taifeng Wang, Zhongyi Liu, Wei Chu, Dongyan Zhao, Rui Yan
Contrastive learning has achieved impressive success in generation tasks to mitigate the "exposure bias" problem and discriminatively exploit the different quality of references.
1 code implementation • 26 May 2022 • Xiuying Chen, Hind Alamro, Mingzhe Li, Shen Gao, Rui Yan, Xin Gao, Xiangliang Zhang
The related work section is an important component of a scientific paper, which highlights the contribution of the target paper in the context of the reference papers.
1 code implementation • ACL 2021 • Xiuying Chen, Hind Alamro, Mingzhe Li, Shen Gao, Xiangliang Zhang, Dongyan Zhao, Rui Yan
Hence, in this paper, we propose a Relation-aware Related work Generator (RRG), which generates an abstractive related work section from multiple given scientific papers in the same research area.
no code implementations • 8 Jan 2021 • Mingzhe Li, Sourabh Palande, Lin Yan, Bei Wang
That is, given a large set T of merge trees, we would like to find a much smaller basis set S such that each tree in T can be approximately reconstructed from a linear combination of merge trees in S. A set of high-dimensional vectors can be sketched via matrix sketching techniques such as principal component analysis and column subset selection.
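As a rough illustration of the vector case mentioned above, the following sketch applies a greedy form of column subset selection to a plain matrix whose columns stand in for the items to be sketched. It assumes each item has already been represented as a column vector; how merge trees are turned into vectors is not shown in this snippet, and the function name `column_subset_sketch` and the greedy selection rule are hypothetical choices for illustration, not the paper's method.

```python
import numpy as np

def column_subset_sketch(A, k, tol=1e-12):
    """Greedily pick up to k columns of A as a basis, then express every
    column of A as a least-squares linear combination of the picked columns.

    Illustrative only: the greedy rule (largest residual norm first) is one
    simple heuristic for column subset selection.
    """
    residual = np.asarray(A, dtype=float).copy()
    selected = []
    for _ in range(k):
        norms = np.linalg.norm(residual, axis=0)
        j = int(np.argmax(norms))
        if norms[j] <= tol:
            # Every column is already spanned by the selected ones.
            break
        selected.append(j)
        # Remove the direction of the chosen column from all residuals.
        q = residual[:, j] / norms[j]
        residual -= np.outer(q, q @ residual)
    basis = np.asarray(A, dtype=float)[:, selected]
    # Coefficients such that basis @ coeffs approximates A.
    coeffs, *_ = np.linalg.lstsq(basis, A, rcond=None)
    return selected, coeffs

# Usage: a rank-2 matrix with 6 columns is reconstructed from 2 of its columns.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 6))
selected, coeffs = column_subset_sketch(A, k=2)
approx = A[:, selected] @ coeffs
print(selected, np.linalg.norm(A - approx))
```

Here the selected columns play the role of the small basis set S, and `coeffs` holds the linear-combination weights that approximately reconstruct each member of the large set T.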
no code implementations • 14 Dec 2020 • Mingzhe Li, Xiuying Chen, Min Yang, Shen Gao, Dongyan Zhao, Rui Yan
In this paper, we propose a Disentanglement-based Attractive Headline Generator (DAHG) that generates a headline which captures the attractive content while following the attractive style.
1 code implementation • EMNLP 2020 • Mingzhe Li, Xiuying Chen, Shen Gao, Zhangming Chan, Dongyan Zhao, Rui Yan
Hence, in this paper, we propose the task of Video-based Multimodal Summarization with Multimodal Output (VMSMO) to tackle such a problem.