Search Results for author: Xiaojun Wan

Found 160 papers, 49 papers with code

Comparing Knowledge-Intensive and Data-Intensive Models for English Resource Semantic Parsing

no code implementations CL (ACL) 2021 Junjie Cao, Zi Lin, Weiwei Sun, Xiaojun Wan

In this work, we present a phenomenon-oriented comparative analysis of the two dominant approaches in English Resource Semantic (ERS) parsing: classic, knowledge-intensive and neural, data-intensive models.

Semantic Parsing

DialSummEval: Revisiting Summarization Evaluation for Dialogues

1 code implementation NAACL 2022 Mingqi Gao, Xiaojun Wan

Dialogue summarization is receiving increasing attention from researchers due to its extraordinary difficulty and unique application value.

How Do Seq2Seq Models Perform on End-to-End Data-to-Text Generation?

1 code implementation ACL 2022 Xunjian Yin, Xiaojun Wan

With the rapid development of deep learning, the Seq2Seq paradigm has become prevalent for end-to-end data-to-text generation, and the BLEU scores have been increasing in recent years.

Data-to-Text Generation

DelibGAN: Coarse-to-Fine Text Generation via Adversarial Network

no code implementations ICLR 2019 Ke Wang, Xiaojun Wan

In this paper, we propose a novel adversarial learning framework, namely DelibGAN, for generating high-quality sentences without supervision.

Decoder Descriptive +1

Routing Enforced Generative Model for Recipe Generation

no code implementations EMNLP 2020 Zhiwei Yu, Hongyu Zang, Xiaojun Wan

One of the most challenging parts of recipe generation is to deal with the complex restrictions among the input ingredients.

Recipe Generation

Revisiting Pivot-Based Paraphrase Generation: Language Is Not the Only Optional Pivot

no code implementations EMNLP 2021 Yitao Cai, Yue Cao, Xiaojun Wan

Concretely, we transform a sentence into a variety of different semantic or syntactic representations (including AMR, UD, and latent semantic representation), and then decode the sentence back from the semantic representations.

Paraphrase Generation Sentence

$B^4$: A Black-Box Scrubbing Attack on LLM Watermarks

no code implementations 2 Nov 2024 Baizhou Huang, Xiao Pu, Xiaojun Wan

Specifically, we formulate the watermark scrubbing attack as a constrained optimization problem by capturing its objectives with two distributions, a Watermark Distribution and a Fidelity Distribution.

Analyzing and Evaluating Correlation Measures in NLG Meta-Evaluation

no code implementations 22 Oct 2024 Mingqi Gao, Xinyu Hu, Li Lin, Xiaojun Wan

The correlation between NLG automatic evaluation metrics and human evaluation is often regarded as a critical criterion for assessing the capability of an evaluation metric.

nlg evaluation

Style-Compress: An LLM-Based Prompt Compression Framework Considering Task-Specific Styles

no code implementations 17 Oct 2024 Xiao Pu, Tianxing He, Xiaojun Wan

In a preliminary study, we discover that when instructing language models to compress prompts, different compression styles (e.g., extractive or abstractive) impact performance of compressed prompts on downstream tasks.

In-Context Learning Informativeness +2

Evaluating Self-Generated Documents for Enhancing Retrieval-Augmented Generation with Large Language Models

no code implementations 17 Oct 2024 Jiatao Li, Xinyu Hu, Xunjian Yin, Xiaojun Wan

In retrieval-augmented generation systems, the integration of self-generated documents (SGDs) alongside retrieved content has emerged as a promising strategy for enhancing the performance of large language models.

Language Modelling RAG +1

Gödel Agent: A Self-Referential Agent Framework for Recursive Self-Improvement

2 code implementations 6 Oct 2024 Xunjian Yin, Xinyi Wang, Liangming Pan, Xiaojun Wan, William Yang Wang

The rapid advancement of large language models (LLMs) has significantly enhanced the capabilities of AI-driven agents across various tasks.

Mathematical Reasoning Meta-Learning

SMART-RAG: Selection using Determinantal Matrices for Augmented Retrieval

no code implementations 21 Sep 2024 Jiatao Li, Xinyu Hu, Xiaojun Wan

Retrieval-Augmented Generation (RAG) has greatly improved large language models (LLMs) by enabling them to generate accurate, contextually grounded responses through the integration of external information.

Diversity Point Processes +3

PaCoST: Paired Confidence Significance Testing for Benchmark Contamination Detection in Large Language Models

no code implementations 26 Jun 2024 Huixuan Zhang, Yun Lin, Xiaojun Wan

We validate the effectiveness of PaCoST and apply it on popular open-source models and benchmarks.

MC-MKE: A Fine-Grained Multimodal Knowledge Editing Benchmark Emphasizing Modality Consistency

no code implementations 19 Jun 2024 Junzhe Zhang, Huixuan Zhang, Xunjian Yin, Baizhou Huang, Xu Zhang, Xinyu Hu, Xiaojun Wan

Our benchmark facilitates independent correction of misreading and misrecognition errors by editing the corresponding knowledge component.

knowledge editing

ContraSolver: Self-Alignment of Language Models by Resolving Internal Preference Contradictions

no code implementations 13 Jun 2024 Xu Zhang, Xunjian Yin, Xiaojun Wan

While substantial advancements have been made in developing large language models (LLMs), achieving control over their behavior can be difficult.

Better than Random: Reliable NLG Human Evaluation with Constrained Active Sampling

1 code implementation 12 Jun 2024 Jie Ruan, Xiao Pu, Mingqi Gao, Xiaojun Wan, Yuesheng Zhu

Human evaluation is viewed as a reliable evaluation method for NLG, but it is expensive and time-consuming.

nlg evaluation

WaterPool: A Watermark Mitigating Trade-offs among Imperceptibility, Efficacy and Robustness

no code implementations 22 May 2024 Baizhou Huang, Xiaojun Wan

To this end, we introduce WaterPool, a simple yet effective key module that preserves a complete key sampling space required by imperceptibility while utilizing semantics-based search to improve the key restoration process.

Automated Similarity Metric Generation for Recommendation

no code implementations 18 Apr 2024 Liang Qu, Yun Lin, Wei Yuan, Xiaojun Wan, Yuhui Shi, Hongzhi Yin

Given the critical role of similarity metrics in recommender systems, existing methods mainly employ handcrafted similarity metrics to capture the complex characteristics of user-item interactions.

Recommendation Systems

WikiTableEdit: A Benchmark for Table Editing by Natural Language Instruction

no code implementations 5 Mar 2024 Zheng Li, Xiang Chen, Xiaojun Wan

Subsequently, we evaluate several representative large language models on the WikiTableEdit dataset to demonstrate the challenge of this task.

Quantity Matters: Towards Assessing and Mitigating Number Hallucination in Large Vision-Language Models

no code implementations 3 Mar 2024 Huixuan Zhang, Junzhe Zhang, Xiaojun Wan

Large-scale vision-language models have demonstrated impressive skill in handling tasks that involve both areas.

Hallucination

Enhancing Jailbreak Attacks with Diversity Guidance

no code implementations 1 Mar 2024 Xu Zhang, Dinghao Jing, Xiaojun Wan

Therefore, we propose DPP-based Stochastic Trigger Searching (DSTS), a new optimization algorithm for jailbreak attacks.

Diversity Language Modelling +2

EAMA: Entity-Aware Multimodal Alignment Based Approach for News Image Captioning

no code implementations 29 Feb 2024 Junzhe Zhang, Huixuan Zhang, Xunjian Yin, Xiaojun Wan

News image captioning requires a model to generate an informative caption rich in entities, given the news image and the associated news article.

Image Captioning Sentence

Are LLM-based Evaluators Confusing NLG Quality Criteria?

2 code implementations 19 Feb 2024 Xinyu Hu, Mingqi Gao, Sen Hu, Yang Zhang, Yicheng Chen, Teng Xu, Xiaojun Wan

Some prior work has shown that LLMs perform well in NLG evaluation for different tasks.

nlg evaluation

Benchmarking Knowledge Boundary for Large Language Models: A Different Perspective on Model Evaluation

1 code implementation 18 Feb 2024 Xunjian Yin, Xu Zhang, Jie Ruan, Xiaojun Wan

In recent years, substantial advancements have been made in the development of large language models, achieving remarkable performance across diverse tasks.

Benchmarking Language Modelling +2

Selecting Large Language Model to Fine-tune via Rectified Scaling Law

no code implementations 4 Feb 2024 Haowei Lin, Baizhou Huang, Haotian Ye, Qinyu Chen, ZiHao Wang, Sujian Li, Jianzhu Ma, Xiaojun Wan, James Zou, Yitao Liang

The ever-growing ecosystem of LLMs has posed a challenge in selecting the most appropriate pre-trained model to fine-tune amidst a sea of options.

Language Modelling Large Language Model

LLM-based NLG Evaluation: Current Status and Challenges

no code implementations 2 Feb 2024 Mingqi Gao, Xinyu Hu, Jie Ruan, Xiao Pu, Xiaojun Wan

Evaluating natural language generation (NLG) is a vital but challenging problem in artificial intelligence.

nlg evaluation Text Generation

History Matters: Temporal Knowledge Editing in Large Language Model

1 code implementation 9 Dec 2023 Xunjian Yin, Jin Jiang, Liming Yang, Xiaojun Wan

The imperative task of revising or updating the knowledge stored within large language models arises from two distinct sources: intrinsic errors inherent in the model which should be corrected and outdated knowledge due to external shifts in the real world which should be updated.

knowledge editing Language Modelling +1

OpinSummEval: Revisiting Automated Evaluation for Opinion Summarization

1 code implementation 27 Oct 2023 Yuchen Shen, Xiaojun Wan

Opinion summarization sets itself apart from other types of summarization tasks due to its distinctive focus on aspects and sentiments.

Opinion Summarization

Evaluating, Understanding, and Improving Constrained Text Generation for Large Language Models

no code implementations 25 Oct 2023 Xiang Chen, Xiaojun Wan

Advancements in natural language generation (NLG) and large language models (LLMs) have led to proficient text generation in various tasks.

Text Generation

ALCUNA: Large Language Models Meet New Knowledge

1 code implementation 23 Oct 2023 Xunjian Yin, Baizhou Huang, Xiaojun Wan

With the rapid development of NLP, large-scale language models (LLMs) now excel in various tasks across multiple domains.

A New Benchmark and Reverse Validation Method for Passage-level Hallucination Detection

1 code implementation 10 Oct 2023 Shiping Yang, Renliang Sun, Xiaojun Wan

Contrasting previous studies of zero-resource hallucination detection, our method and benchmark concentrate on passage-level detection instead of sentence-level.

Hallucination Sentence

WikiIns: A High-Quality Dataset for Controlled Text Editing by Natural Language Instruction

1 code implementation 8 Oct 2023 Xiang Chen, Zheng Li, Xiaojun Wan

In this paper, we study the problem of controlled text editing by natural language instruction.

Informativeness

Enhancing Large Language Models in Coding Through Multi-Perspective Self-Consistency

1 code implementation 29 Sep 2023 Baizhou Huang, Shuai Lu, Weizhu Chen, Xiaojun Wan, Nan Duan

We propose the Multi-Perspective Self-Consistency (MPSC) framework incorporating both inter- and intra-consistency across outputs from multiple perspectives.

Code Generation HumanEval

Summarization is (Almost) Dead

no code implementations 18 Sep 2023 Xiao Pu, Mingqi Gao, Xiaojun Wan

How well can large language models (LLMs) generate summaries?

Text Summarization

A Comprehensive Evaluation and Analysis Study for Chinese Spelling Check

no code implementations 25 Jul 2023 Xunjian Yin, Xiaojun Wan

With the development of pre-trained models and the incorporation of phonetic and graphic information, neural models have achieved high scores in Chinese Spelling Check (CSC).

Image Matters: A New Dataset and Empirical Study for Multimodal Hyperbole Detection

1 code implementation 1 Jul 2023 Huixuan Zhang, Xiaojun Wan

We create a multimodal detection dataset from Weibo (a Chinese social media platform) and carry out some studies on it.

SituatedGen: Incorporating Geographical and Temporal Contexts into Generative Commonsense Reasoning

2 code implementations NeurIPS 2023 Yunxiang Zhang, Xiaojun Wan

Generative commonsense reasoning is the task that requires machines, given a group of keywords, to compose a single coherent sentence with commonsense plausibility.

Sentence Text Generation

Reference Matters: Benchmarking Factual Error Correction for Dialogue Summarization with Fine-grained Evaluation Framework

1 code implementation 8 Jun 2023 Mingqi Gao, Xiaojun Wan, Jia Su, Zhefeng Wang, Baoxing Huai

To address this problem, we are the first to manually annotate an FEC dataset for dialogue summarization containing 4000 items and propose FERRANTI, a fine-grained evaluation framework based on reference correction that automatically evaluates the performance of FEC models on different error categories.

Benchmarking

A New Dataset and Empirical Study for Sentence Simplification in Chinese

1 code implementation 7 Jun 2023 Shiping Yang, Renliang Sun, Xiaojun Wan

Sentence Simplification is a valuable technique that can benefit language learners and children a lot.

Few-Shot Learning Sentence

Is Summary Useful or Not? An Extrinsic Human Evaluation of Text Summaries on Downstream Tasks

no code implementations 24 May 2023 Xiao Pu, Mingqi Gao, Xiaojun Wan

The results show that summaries generated by fine-tuned models lead to higher consistency in usefulness across all three tasks, as rankings of fine-tuned summarization systems are close across downstream tasks according to the proposed extrinsic metrics.

Informativeness Question Answering +4

Teaching the Pre-trained Model to Generate Simple Texts for Text Simplification

1 code implementation 21 May 2023 Renliang Sun, Wei Xu, Xiaojun Wan

In this paper, we propose a new continued pre-training strategy to teach the pre-trained model to generate simple texts.

Lexical Simplification Sentence +1

Human-like Summarization Evaluation with ChatGPT

1 code implementation 5 Apr 2023 Mingqi Gao, Jie Ruan, Renliang Sun, Xunjian Yin, Shiping Yang, Xiaojun Wan

Evaluating text summarization is a challenging problem, and existing evaluation metrics are far from satisfactory.

Text Summarization

Models See Hallucinations: Evaluating the Factuality in Video Captioning

no code implementations 6 Mar 2023 Hui Liu, Xiaojun Wan

In this work, we conduct a detailed human evaluation of the factuality in video captioning and collect two annotated factuality datasets.

Text Generation Video Captioning

Exploiting Summarization Data to Help Text Simplification

1 code implementation 14 Feb 2023 Renliang Sun, Zhixian Yang, Xiaojun Wan

One of the major problems with text simplification is the lack of high-quality data.

Sentence Text Simplification +1

How to Describe Images in a More Funny Way? Towards a Modular Approach to Cross-Modal Sarcasm Generation

no code implementations 20 Nov 2022 Jie Ruan, Yue Wu, Xiaojun Wan, Yuesheng Zhu

Sarcasm generation has been investigated in previous studies by considering it as a text-to-text generation problem, i.e., generating a sarcastic sentence for an input sentence.

Descriptive Sentence +1

Error-Robust Retrieval for Chinese Spelling Check

1 code implementation 15 Nov 2022 Xunjian Yin, Xinyu Hu, Jin Jiang, Xiaojun Wan

Chinese Spelling Check (CSC) aims to detect and correct error tokens in Chinese contexts, which has a wide range of applications.

Retrieval

Social Biases in Automatic Evaluation Metrics for NLG

no code implementations 17 Oct 2022 Mingqi Gao, Xiaojun Wan

Many studies have revealed that word embeddings, language models, and models for specific downstream tasks in NLP are prone to social biases, especially gender bias.

Sentence Sentence Embeddings +3

An Empirical Study of Automatic Post-Editing

no code implementations 16 Sep 2022 Xu Zhang, Xiaojun Wan

In view of the importance of data augmentation in APE, we separately study the impact of the construction method of artificial corpora and artificial data domain on the performance of APE models.

Automatic Post-Editing Data Augmentation

CC-Riddle: A Question Answering Dataset of Chinese Character Riddles

2 code implementations 28 Jun 2022 Fan Xu, Yunxiang Zhang, Xiaojun Wan

Solving Chinese character riddles is a challenging task that demands understanding of character glyph, general knowledge, and a grasp of figurative language.

General Knowledge Language Modelling +2

Nearest Neighbor Knowledge Distillation for Neural Machine Translation

1 code implementation NAACL 2022 Zhixian Yang, Renliang Sun, Xiaojun Wan

k-nearest-neighbor machine translation (kNN-MT), proposed by Khandelwal et al. (2021), has achieved many state-of-the-art results in machine translation tasks.

Knowledge Distillation Machine Translation +2

SimpleBERT: A Pre-trained Model That Learns to Generate Simple Words

no code implementations 16 Apr 2022 Renliang Sun, Xiaojun Wan

We use a small-scale simple text dataset for continued pre-training and employ two methods to identify simple words from the texts.

Language Modelling Lexical Simplification +3

Dependency-based Mixture Language Models

1 code implementation ACL 2022 Zhixian Yang, Xiaojun Wan

Various models have been proposed to incorporate knowledge of syntactic structures into neural language models.

Language Modelling Text Generation

A Simple Information-Based Approach to Unsupervised Domain-Adaptive Aspect-Based Sentiment Analysis

1 code implementation 29 Jan 2022 Xiang Chen, Xiaojun Wan

Aspect-based sentiment analysis (ABSA) is a fine-grained sentiment analysis task which aims to extract the aspects from sentences and identify their corresponding sentiments.

Aspect-Based Sentiment Analysis Aspect-Based Sentiment Analysis (ABSA) +4

Visual Information Guided Zero-Shot Paraphrase Generation

1 code implementation COLING 2022 Zhe Lin, Xiaojun Wan

Zero-shot paraphrase generation has drawn much attention as the large-scale high-quality paraphrase corpus is limited.

Diversity Image Captioning +2

Neural Content Extraction for Poster Generation of Scientific Papers

no code implementations 16 Dec 2021 Sheng Xu, Xiaojun Wan

Then we propose a three-step framework to tackle this task and focus on the content extraction step in this study.

Document Summarization

A Syntax-Guided Grammatical Error Correction Model with Dependency Tree Correction

no code implementations 5 Nov 2021 Zhaohong Wan, Xiaojun Wan

However, these methods lack the use of syntactic knowledge which plays an important role in the correction of grammatical errors.

Data Augmentation Grammatical Error Correction +3

Document-Level Text Simplification: Dataset, Criteria and Baseline

1 code implementation EMNLP 2021 Renliang Sun, Hanqi Jin, Xiaojun Wan

Finally, we select several representative models as baseline models for this task and perform automatic evaluation and human evaluation.

Sentence Text Simplification

BiRdQA: A Bilingual Dataset for Question Answering on Tricky Riddles

no code implementations 23 Sep 2021 Yunxiang Zhang, Xiaojun Wan

A riddle is a question or statement with double or veiled meanings, followed by an unexpected answer.

Multiple-choice Question Answering

CodeQA: A Question Answering Dataset for Source Code Comprehension

1 code implementation Findings (EMNLP) 2021 Chenxiao Liu, Xiaojun Wan

We propose CodeQA, a free-form question answering dataset for the purpose of source code comprehension: given a code snippet and a question, a textual answer is required to be generated.

Machine Reading Comprehension Question Answering

MOVER: Mask, Over-generate and Rank for Hyperbole Generation

1 code implementation NAACL 2022 Yunxiang Zhang, Xiaojun Wan

In this paper, we tackle the challenging task of hyperbole generation to transfer a literal sentence into its hyperbolic paraphrase.

Sentence

Pushing Paraphrase Away from Original Sentence: A Multi-Round Paraphrase Generation Approach

1 code implementation Findings (ACL) 2021 Zhe Lin, Xiaojun Wan

Both automatic and human evaluation show BTmPG can improve the diversity of paraphrase while preserving the semantics of the original sentence.

Diversity Paraphrase Generation +2

Continual Learning for Neural Machine Translation

no code implementations NAACL 2021 Yue Cao, Hao-Ran Wei, Boxing Chen, Xiaojun Wan

In practical applications, NMT models are usually trained on a general domain corpus and then fine-tuned by continuing training on the in-domain corpus.

Continual Learning Knowledge Distillation +3

Bridging the Domain Gap: Improve Informal Language Translation via Counterfactual Domain Adaptation

no code implementations AAAI 2021 Ke Wang, Guandan Chen, Zhongqiang Huang, Xiaojun Wan, Fei Huang

Despite the near-human performances already achieved on formal texts such as news articles, neural machine translation still has difficulty in dealing with "user-generated" texts that have diverse linguistic phenomena but lack large-scale high-quality parallel corpora.

counterfactual Domain Adaptation +2

Learning a Product Relevance Model from Click-Through Data in E-Commerce

no code implementations 14 Feb 2021 Shaowei Yao, Jiwei Tan, Xi Chen, Keping Yang, Rong Xiao, Hongbo Deng, Xiaojun Wan

We propose a novel way to consider samples of different relevance confidence, and come up with a new training objective to learn a robust relevance model with desirable score distribution.

Click-Through Rate Prediction Computational Efficiency

ParaSCI: A Large Scientific Paraphrase Dataset for Longer Paraphrase Generation

1 code implementation EACL 2021 Qingxiu Dong, Xiaojun Wan, Yue Cao

We propose ParaSCI, the first large-scale paraphrase dataset in the scientific field, including 33,981 paraphrase pairs from ACL (ParaSCI-ACL) and 316,063 pairs from arXiv (ParaSCI-arXiv).

Diversity Paraphrase Generation

On the Helpfulness of Document Context to Sentence Simplification

1 code implementation COLING 2020 Renliang Sun, Zhe Lin, Xiaojun Wan

Our model uses neural networks to learn the different effects of the preceding sentences and the following sentences on the current sentence and applies them to the improved transformer model.

Sentence Text Simplification

IGSQL: Database Schema Interaction Graph Based Neural Model for Context-Dependent Text-to-SQL Generation

2 code implementations EMNLP 2020 Yitao Cai, Xiaojun Wan

Our model outperforms previous state-of-the-art model by a large margin and achieves new state-of-the-art results on the two datasets.

Text-To-SQL

Adversarial Text Generation via Sequence Contrast Discrimination

no code implementations Findings of the Association for Computational Linguistics 2020 Ke Wang, Xiaojun Wan

In this paper, we propose a sequence contrast loss driven text generation framework, which learns the difference between real texts and generated texts and uses that difference.

Adversarial Text Text Generation

TransModality: An End2End Fusion Method with Transformer for Multimodal Sentiment Analysis

no code implementations 7 Sep 2020 Zilong Wang, Zhaohong Wan, Xiaojun Wan

Enlightened by recent success of Transformer in the area of machine translation, we propose a new fusion method, TransModality, to address the task of multimodal sentiment analysis.

 Ranked #1 on Multimodal Sentiment Analysis on CMU-MOSI (F1-score (Weighted) metric)

Machine Translation Multimodal Sentiment Analysis +1

Constructing a Family Tree of Ten Indo-European Languages with Delexicalized Cross-linguistic Transfer Patterns

no code implementations 17 Jul 2020 Yuanyuan Zhao, Weiwei Sun, Xiaojun Wan

It is reasonable to hypothesize that the divergence patterns formulated by historical linguists and typologists reflect constraints on human languages, and are thus consistent with Second Language Acquisition (SLA) in a certain way.

Language Acquisition

Jointly Learning to Align and Summarize for Neural Cross-Lingual Summarization

no code implementations ACL 2020 Yue Cao, Hui Liu, Xiaojun Wan

However, it is a big challenge for the model to directly learn cross-lingual summarization as it requires learning to understand different languages and learning how to summarize at the same time.

Cross-Lingual Transfer

Multi-Granularity Interaction Network for Extractive and Abstractive Multi-Document Summarization

no code implementations ACL 2020 Hanqi Jin, Tianming Wang, Xiaojun Wan

In this paper, we propose a multi-granularity interaction network for extractive and abstractive multi-document summarization, which jointly learns semantic representations for words, sentences, and documents.

Document Summarization Extractive Summarization +2

Heterogeneous Graph Transformer for Graph-to-Sequence Learning

no code implementations ACL 2020 Shaowei Yao, Tianming Wang, Xiaojun Wan

The graph-to-sequence (Graph2Seq) learning aims to transduce graph-structured representations to word sequences for text generation.

AMR-to-Text Generation Graph-to-Sequence +3

Multimodal Transformer for Multimodal Machine Translation

1 code implementation ACL 2020 Shaowei Yao, Xiaojun Wan

Multimodal Machine Translation (MMT) aims to introduce information from other modalities, generally static images, to improve the translation quality.

Multimodal Machine Translation Translation

Automatic Generation of Citation Texts in Scholarly Papers: A Pilot Study

no code implementations ACL 2020 Xinyu Xing, Xiaosheng Fan, Xiaojun Wan

In this paper, we study the challenging problem of automatic generation of citation texts in scholarly papers.

Text Generation

AMR-To-Text Generation with Graph Transformer

no code implementations TACL 2020 Tianming Wang, Xiaojun Wan, Hanqi Jin

Abstract meaning representation (AMR)-to-text generation is the challenging task of generating natural language texts from AMR graphs, where nodes represent concepts and edges denote relations.

Abstract Meaning Representation AMR-to-Text Generation +2

Towards a Unified End-to-End Approach for Fully Unsupervised Cross-Lingual Sentiment Analysis

no code implementations CONLL 2019 Yanlin Feng, Xiaojun Wan

Cross-lingual sentiment analysis (CLSA) aims to improve the performance on these languages by leveraging annotated data from other languages.

Cross-Lingual Word Embeddings Sentiment Analysis +1

Automated Chess Commentator Powered by Neural Chess Engine

2 code implementations ACL 2019 Hongyu Zang, Zhiwei Yu, Xiaojun Wan

In this paper, we explore a new approach for automated chess commentary generation, which aims to generate chess commentary texts in different categories (e.g., description, comparison, planning, etc.).

Text Generation

A Neural Approach to Irony Generation

1 code implementation 13 Sep 2019 Mengdi Zhu, Zhiwei Yu, Xiaojun Wan

Ironies can not only express stronger emotions but also show a sense of humor.

Reinforcement Learning Style Transfer

INS: An Interactive Chinese News Synthesis System

no code implementations NAACL 2019 Hui Liu, Wentao Qin, Xiaojun Wan

So it is of vital importance to automatically synthesize a batch of news articles related to the event or topic into a new synthesis article (or overview article) for reader's convenience.

A Comparative Analysis of Knowledge-Intensive and Data-Intensive Semantic Parsers

no code implementations 4 Jul 2019 Junjie Cao, Zi Lin, Weiwei Sun, Xiaojun Wan

We present a phenomenon-oriented comparative analysis of the two dominant approaches in task-independent semantic parsing: classic, knowledge-intensive and neural, data-intensive models.

Semantic Parsing

T-CVAE: Transformer-Based Conditioned Variational Autoencoder for Story Completion

1 code implementation International Joint Conference on Artificial Intelligence 2019 Tianming Wang, Xiaojun Wan

Our model uses shared attention layers for encoder and decoder, which make the most of the contextual clues, and a latent variable for learning the distribution of coherent story plots.

Decoder Diversity +1

Asking the Crowd: Question Analysis, Evaluation and Generation for Open Discussion on Online Forums

1 code implementation ACL 2019 Zi Chai, Xinyu Xing, Xiaojun Wan, Bo Huang

For openQG task, we construct OQGenD, the first dataset as far as we know, and propose a model based on conditional generative adversarial networks and our question evaluation model.

Text Generation

A Semi-Supervised Approach for Low-Resourced Text Generation

1 code implementation 3 Jun 2019 Hongyu Zang, Xiaojun Wan

The low-resource (of labeled data) problem is quite common in different text generation tasks, but unlabeled data are usually abundant.

Decoder Denoising +4

Massive Styles Transfer with Limited Labeled Data

1 code implementation 3 Jun 2019 Hongyu Zang, Xiaojun Wan

In this paper, we propose a multi-agent style transfer system (MAST) for addressing multiple style transfer tasks with limited labeled data, by leveraging abundant unlabeled data and the mutual benefit among the multiple styles.

Denoising Style Transfer +1

Learning Bilingual Sentiment-Specific Word Embeddings without Cross-lingual Supervision

no code implementations NAACL 2019 Yanlin Feng, Xiaojun Wan

Our method only requires a sentiment corpus in the source language and pretrained monolingual word embeddings of both languages.

Sentiment Analysis Translation +3

How to Avoid Sentences Spelling Boring? Towards a Neural Approach to Unsupervised Metaphor Generation

no code implementations NAACL 2019 Zhiwei Yu, Xiaojun Wan

In order to create novel metaphors, we propose a neural approach to metaphor generation and explore the shared inferential structure of a metaphorical usage and a literal usage of a verb.

Language Modelling Text Generation

Controllable Unsupervised Text Attribute Transfer via Editing Entangled Latent Representation

2 code implementations NeurIPS 2019 Ke Wang, Hang Hua, Xiaojun Wan

Unsupervised text attribute transfer automatically transforms a text to alter a specific attribute (e.g., sentiment) without using any parallel data, while simultaneously preserving its attribute-independent content.

Attribute Text Attribute Transfer

AMRec: An Intelligent System for Academic Method Recommendation

no code implementations 10 Apr 2019 Shanshan Huang, Xiaojun Wan, Xuewei Tang

Finding new academic methods for research problems is the key task in a researcher's research career.

Parsing Chinese Sentences with Grammatical Relations

no code implementations CL 2019 Weiwei Sun, Yufei Chen, Xiaojun Wan, Meichun Liu

In this work, we propose to represent grammatical information using general directed dependency graphs.

Semantic Role Labeling for Learner Chinese: the Importance of Syntactic Parsing and L2-L1 Parallel Data

1 code implementation EMNLP 2018 Zi Lin, Yuguang Duan, Yuan-Yuan Zhao, Weiwei Sun, Xiaojun Wan

This paper studies semantic parsing for interlanguage (L2), taking semantic role labeling (SRL) as a case task and learner Chinese as a case language.

Semantic Parsing Semantic Role Labeling +1

Point Precisely: Towards Ensuring the Precision of Data in Generated Texts Using Delayed Copy Mechanism

no code implementations COLING 2018 Liunian Li, Xiaojun Wan

Our approach first adopts an encoder-decoder model to generate a template text with data slots to be filled and then leverages a proposed delayed copy mechanism to fill in the slots with proper data records.

Data-to-Text Generation Decoder +2

Sense-Aware Neural Models for Pun Location in Texts

no code implementations ACL 2018 Yitao Cai, Yin Li, Xiaojun Wan

In this paper, we focus on the task of pun location, which aims to identify the pun word in a given short text.

Word Sense Disambiguation

A Neural Approach to Pun Generation

no code implementations ACL 2018 Zhiwei Yu, Jiwei Tan, Xiaojun Wan

Since sequence-to-sequence models provide an effective technique for text generation, it is promising to investigate these models on the pun generation task.

Diversity Image Captioning +4

Language Generation via DAG Transduction

no code implementations ACL 2018 Yajie Ye, Weiwei Sun, Xiaojun Wan

This remarkable result demonstrates the feasibility of applying a DAG transducer to resolve NLG, as well as the effectiveness of our design.

Semantic Parsing Text Generation

Accurate SHRG-Based Semantic Parsing

no code implementations ACL 2018 Yufei Chen, Weiwei Sun, Xiaojun Wan

We demonstrate that an SHRG-based parser can produce semantic graphs much more accurately than previously shown, by relating synchronous production rules to the syntacto-semantic composition process.

Semantic Composition Semantic Parsing

Pre- and In-Parsing Models for Neural Empty Category Detection

no code implementations ACL 2018 Yufei Chen, Yuan-Yuan Zhao, Weiwei Sun, Xiaojun Wan

Motivated by the positive impact of empty category on syntactic parsing, we study neural models for pre- and in-parsing detection of empty category, which has not previously been investigated.

Dependency Parsing Structured Prediction

Towards a Neural Network Approach to Abstractive Multi-Document Summarization

no code implementations 24 Apr 2018 Jianmin Zhang, Jiwei Tan, Xiaojun Wan

In this paper, we investigate neural abstractive methods for MDS by adapting a state-of-the-art neural abstractive summarization model for SDS.

Abstractive Text Summarization Document Summarization +1

Leveraging Diverse Lexical Chains to Construct Essays for Chinese College Entrance Examination

no code implementations IJCNLP 2017 Liunian Li, Xiaojun Wan, Jin-Ge Yao, Siming Yan

In this work we study the challenging task of automatically constructing essays for Chinese college entrance examination where the topic is specified in advance.

Sentence

Towards Automatic Generation of Entertaining Dialogues in Chinese Crosstalks

no code implementations 1 Nov 2017 Shikang Du, Xiaojun Wan, Yajie Ye

Crosstalk, also known by its Chinese name xiangsheng, is a traditional Chinese comedic performing art featuring jokes and funny dialogues, and one of China's most popular cultural elements.

Dialogue Generation Translation

Towards a Universal Sentiment Classifier in Multiple languages

no code implementations EMNLP 2017 Kui Xu, Xiaojun Wan

We present the evaluation results of our universal sentiment classifier in five languages, and the results are very promising even when the parallel data between English and the target languages are not used.

General Classification Machine Translation +2

Towards Automatic Construction of News Overview Articles by News Synthesis

no code implementations EMNLP 2017 Jianmin Zhang, Xiaojun Wan

In this paper we investigate a new task of automatically constructing an overview article from a given set of news articles about a news event.

Document Summarization Multi-Document Summarization

Quasi-Second-Order Parsing for 1-Endpoint-Crossing, Pagenumber-2 Graphs

no code implementations EMNLP 2017 Junjie Cao, Sheng Huang, Weiwei Sun, Xiaojun Wan

We propose a new Maximum Subgraph algorithm for first-order parsing to 1-endpoint-crossing, pagenumber-2 graphs.

ARC Dependency Parsing

Parsing for Grammatical Relations via Graph Merging

no code implementations CONLL 2017 Weiwei Sun, Yantao Du, Xiaojun Wan

This paper is concerned with building deep grammatical relation (GR) analysis using a data-driven approach.

The Covert Helps Parse the Overt

no code implementations CONLL 2017 Xun Zhang, Weiwei Sun, Xiaojun Wan

This paper is concerned with whether deep syntactic information can help surface parsing, with a particular focus on empty categories.

Dependency Parsing

Semantic Dependency Parsing via Book Embedding

no code implementations ACL 2017 Weiwei Sun, Junjie Cao, Xiaojun Wan

We model a dependency graph as a book, a particular kind of topological space, for semantic dependency parsing.

Combinatorial Optimization Dependency Parsing +2

Abstractive Document Summarization with a Graph-Based Attentional Neural Model

no code implementations ACL 2017 Jiwei Tan, Xiaojun Wan, Jianguo Xiao

Abstractive summarization is the ultimate goal of document summarization research, but previously it is less investigated due to the immaturity of text generation techniques.

Abstractive Text Summarization Document Summarization +5

Learning to Identify Ambiguous and Misleading News Headlines

no code implementations 17 May 2017 Wei Wei, Xiaojun Wan

For the identification of misleading headlines, we extract features based on the congruence between headlines and bodies.

Diversity

PKUSUMSUM : A Java Platform for Multilingual Document Summarization

no code implementations COLING 2016 Jianmin Zhang, Tianming Wang, Xiaojun Wan

PKUSUMSUM is a Java platform for multilingual document summarization, and it supports multiple languages, integrates 10 automatic summarization methods, and tackles three typical summarization tasks.

Chinese Word Segmentation Document Summarization +1

Learning to Mine Chinese Coordinate Terms Using the Web

no code implementations 8 Jul 2015 Xiaojiang Huang, Xiaojun Wan, Jianguo Xiao

Coordinate relation refers to the relation between instances of a concept and the relation between the direct hyponyms of a concept.

Relation

Multi-Document Summarization via Discriminative Summary Reranking

no code implementations 8 Jul 2015 Xiaojun Wan, Ziqiang Cao, Furu Wei, Sujian Li, Ming Zhou

However, according to our quantitative analysis, none of the existing summarization models can always produce high-quality summaries for different document sets, and even a summarization model with good overall performance may produce low-quality summaries for some document sets.

Document Summarization Multi-Document Summarization +1

Mining and Analyzing the Future Works in Scientific Articles

no code implementations 8 Jul 2015 Yue Hu, Xiaojun Wan

Third, we apply the extraction method and the classification model to a paper dataset in the computer science field and conduct a further analysis of the future works.

Classification General Classification

Data-driven, PCFG-based and Pseudo-PCFG-based Models for Chinese Dependency Parsing

no code implementations TACL 2013 Weiwei Sun, Xiaojun Wan

We present a comparative study of transition-, graph- and PCFG-based models aimed at illuminating more precisely the likely contribution of CFGs in improving Chinese dependency parsing accuracy, especially by combining heterogeneous models.

Chinese Dependency Parsing Dependency Parsing +3
