Search Results for author: Xiaojun Wan

Found 136 papers, 38 papers with code

Routing Enforced Generative Model for Recipe Generation

no code implementations EMNLP 2020 Zhiwei Yu, Hongyu Zang, Xiaojun Wan

One of the most challenging parts of recipe generation is to deal with the complex restrictions among the input ingredients.

Recipe Generation

Comparing Knowledge-Intensive and Data-Intensive Models for English Resource Semantic Parsing

no code implementations CL (ACL) 2021 Junjie Cao, Zi Lin, Weiwei Sun, Xiaojun Wan

In this work, we present a phenomenon-oriented comparative analysis of the two dominant approaches in English Resource Semantic (ERS) parsing: classic, knowledge-intensive and neural, data-intensive models.

Semantic Parsing

Homophonic Pun Generation with Lexically Constrained Rewriting

no code implementations EMNLP 2020 Zhiwei Yu, Hongyu Zang, Xiaojun Wan

Punning is a creative way to make conversation enjoyable and literary writing elegant.

DelibGAN: Coarse-to-Fine Text Generation via Adversarial Network

no code implementations ICLR 2019 Ke Wang, Xiaojun Wan

In this paper, we propose a novel adversarial learning framework, namely DelibGAN, for generating high-quality sentences without supervision.

Descriptive Text Generation

DialSummEval: Revisiting Summarization Evaluation for Dialogues

1 code implementation NAACL 2022 Mingqi Gao, Xiaojun Wan

Dialogue summarization is receiving increasing attention from researchers due to its extraordinary difficulty and unique application value.

How Do Seq2Seq Models Perform on End-to-End Data-to-Text Generation?

1 code implementation ACL 2022 Xunjian Yin, Xiaojun Wan

With the rapid development of deep learning, the Seq2Seq paradigm has become prevalent for end-to-end data-to-text generation, and the BLEU scores have been increasing in recent years.

Data-to-Text Generation

Revisiting Pivot-Based Paraphrase Generation: Language Is Not the Only Optional Pivot

no code implementations EMNLP 2021 Yitao Cai, Yue Cao, Xiaojun Wan

Concretely, we transform a sentence into a variety of different semantic or syntactic representations (including AMR, UD, and latent semantic representation), and then decode the sentence back from the semantic representations.

Paraphrase Generation

OpinSummEval: Revisiting Automated Evaluation for Opinion Summarization

1 code implementation 27 Oct 2023 Yuchen Shen, Xiaojun Wan

Opinion summarization sets itself apart from other types of summarization tasks due to its distinctive focus on aspects and sentiments.

A Comprehensive Evaluation of Constrained Text Generation for Large Language Models

no code implementations 25 Oct 2023 Xiang Chen, Xiaojun Wan

Results illuminate LLMs' capacity and deficiency to incorporate constraints and provide insights for future developments in constrained text generation.

Text Generation

ALCUNA: Large Language Models Meet New Knowledge

1 code implementation 23 Oct 2023 Xunjian Yin, Baizhou Huang, Xiaojun Wan

With the rapid development of NLP, large-scale language models (LLMs) excel in various tasks across multiple domains now.

A New Benchmark and Reverse Validation Method for Passage-level Hallucination Detection

1 code implementation 10 Oct 2023 Shiping Yang, Renliang Sun, Xiaojun Wan

Contrasting previous studies of zero-resource hallucination detection, our method and benchmark concentrate on passage-level detection instead of sentence-level.

WikiIns: A High-Quality Dataset for Controlled Text Editing by Natural Language Instruction

1 code implementation 8 Oct 2023 Xiang Chen, Zheng Li, Xiaojun Wan

In this paper, we study the problem of controlled text editing by natural language instruction.


Enhancing Large Language Models in Coding Through Multi-Perspective Self-Consistency

no code implementations 29 Sep 2023 Baizhou Huang, Shuai Lu, Weizhu Chen, Xiaojun Wan, Nan Duan

Specifically, we ask LLMs to sample multiple diverse outputs from various perspectives for a given query and then construct a multipartite graph based on them.

Code Generation

Summarization is (Almost) Dead

no code implementations 18 Sep 2023 Xiao Pu, Mingqi Gao, Xiaojun Wan

How well can large language models (LLMs) generate summaries?

Text Summarization

A Comprehensive Evaluation and Analysis Study for Chinese Spelling Check

no code implementations 25 Jul 2023 Xunjian Yin, Xiaojun Wan

With the development of pre-trained models and the incorporation of phonetic and graphic information, neural models have achieved high scores in Chinese Spelling Check (CSC).


Reference Matters: Benchmarking Factual Error Correction for Dialogue Summarization with Fine-grained Evaluation Framework

1 code implementation 8 Jun 2023 Mingqi Gao, Xiaojun Wan, Jia Su, Zhefeng Wang, Baoxing Huai

To address this problem, we are the first to manually annotate a FEC dataset for dialogue summarization containing 4000 items and propose FERRANTI, a fine-grained evaluation framework based on reference correction that automatically evaluates the performance of FEC models on different error categories.


A New Dataset and Empirical Study for Sentence Simplification in Chinese

1 code implementation 7 Jun 2023 Shiping Yang, Renliang Sun, Xiaojun Wan

Sentence Simplification is a valuable technique that can benefit language learners and children a lot.

Few-Shot Learning

Is Summary Useful or Not? An Extrinsic Human Evaluation of Text Summaries on Downstream Tasks

no code implementations 24 May 2023 Xiao Pu, Mingqi Gao, Xiaojun Wan

The results show that summaries generated by fine-tuned models lead to higher consistency in usefulness across all three tasks, as rankings of fine-tuned summarization systems are close across downstream tasks according to the proposed extrinsic metrics.

Informativeness Question Answering +4

Teaching the Pre-trained Model to Generate Simple Texts for Text Simplification

1 code implementation 21 May 2023 Renliang Sun, Wei Xu, Xiaojun Wan

In this paper, we propose a new continued pre-training strategy to teach the pre-trained model to generate simple texts.

Lexical Simplification Text Simplification

Human-like Summarization Evaluation with ChatGPT

1 code implementation 5 Apr 2023 Mingqi Gao, Jie Ruan, Renliang Sun, Xunjian Yin, Shiping Yang, Xiaojun Wan

Evaluating text summarization is a challenging problem, and existing evaluation metrics are far from satisfactory.

Text Summarization

Models See Hallucinations: Evaluating the Factuality in Video Captioning

no code implementations 6 Mar 2023 Hui Liu, Xiaojun Wan

In this work, we conduct a detailed human evaluation of the factuality in video captioning and collect two annotated factuality datasets.

Text Generation Video Captioning

How to Describe Images in a More Funny Way? Towards a Modular Approach to Cross-Modal Sarcasm Generation

no code implementations 20 Nov 2022 Jie Ruan, Yue Wu, Xiaojun Wan, Yuesheng Zhu

Sarcasm generation has been investigated in previous studies by considering it as a text-to-text generation problem, i.e., generating a sarcastic sentence for an input sentence.

Descriptive Text Generation

Chinese Spelling Check with Nearest Neighbors

no code implementations 15 Nov 2022 Xunjian Yin, Xinyu Hu, Xiaojun Wan

Chinese Spelling Check (CSC) aims to detect and correct error tokens in Chinese contexts, which has a wide range of applications.


Social Biases in Automatic Evaluation Metrics for NLG

no code implementations 17 Oct 2022 Mingqi Gao, Xiaojun Wan

Many studies have revealed that word embeddings, language models, and models for specific downstream tasks in NLP are prone to social biases, especially gender bias.

Sentence Embeddings Test +3

An Empirical Study of Automatic Post-Editing

no code implementations 16 Sep 2022 Xu Zhang, Xiaojun Wan

In view of the importance of data augmentation in APE, we separately study the impact of the construction method of artificial corpora and artificial data domain on the performance of APE models.

Automatic Post-Editing Data Augmentation

CC-Riddle: A Question Answering Dataset of Chinese Character Riddles

2 code implementations 28 Jun 2022 Fan Xu, Yunxiang Zhang, Xiaojun Wan

Solving Chinese character riddles is a challenging task that demands understanding of character glyph, general knowledge, and a grasp of figurative language.

General Knowledge Language Modelling +2

Nearest Neighbor Knowledge Distillation for Neural Machine Translation

1 code implementation NAACL 2022 Zhixian Yang, Renliang Sun, Xiaojun Wan

k-nearest-neighbor machine translation (kNN-MT), proposed by Khandelwal et al. (2021), has achieved many state-of-the-art results in machine translation tasks.

Knowledge Distillation Machine Translation +2

SimpleBERT: A Pre-trained Model That Learns to Generate Simple Words

no code implementations 16 Apr 2022 Renliang Sun, Xiaojun Wan

We use a small-scale simple text dataset for continued pre-training and employ two methods to identify simple words from the texts.

Language Modelling Lexical Simplification +2

Dependency-based Mixture Language Models

1 code implementation ACL 2022 Zhixian Yang, Xiaojun Wan

Various models have been proposed to incorporate knowledge of syntactic structures into neural language models.

Language Modelling Text Generation

A Simple Information-Based Approach to Unsupervised Domain-Adaptive Aspect-Based Sentiment Analysis

1 code implementation 29 Jan 2022 Xiang Chen, Xiaojun Wan

Aspect-based sentiment analysis (ABSA) is a fine-grained sentiment analysis task which aims to extract the aspects from sentences and identify their corresponding sentiments.

Aspect-Based Sentiment Analysis Aspect-Based Sentiment Analysis (ABSA) +4

Visual Information Guided Zero-Shot Paraphrase Generation

1 code implementation COLING 2022 Zhe Lin, Xiaojun Wan

Zero-shot paraphrase generation has drawn much attention as the large-scale high-quality paraphrase corpus is limited.

Image Captioning Paraphrase Generation +1

Neural Content Extraction for Poster Generation of Scientific Papers

no code implementations 16 Dec 2021 Sheng Xu, Xiaojun Wan

Then we propose a three-step framework to tackle this task and focus on the content extraction step in this study.

Document Summarization

A Syntax-Guided Grammatical Error Correction Model with Dependency Tree Correction

no code implementations 5 Nov 2021 Zhaohong Wan, Xiaojun Wan

However, these methods lack the use of syntactic knowledge which plays an important role in the correction of grammatical errors.

Data Augmentation Grammatical Error Correction +3

Document-Level Text Simplification: Dataset, Criteria and Baseline

1 code implementation EMNLP 2021 Renliang Sun, Hanqi Jin, Xiaojun Wan

Finally, we select several representative models as baseline models for this task and perform automatic evaluation and human evaluation.

Text Simplification

BiRdQA: A Bilingual Dataset for Question Answering on Tricky Riddles

no code implementations 23 Sep 2021 Yunxiang Zhang, Xiaojun Wan

A riddle is a question or statement with double or veiled meanings, followed by an unexpected answer.

Multiple-choice Question Answering

CodeQA: A Question Answering Dataset for Source Code Comprehension

1 code implementation Findings (EMNLP) 2021 Chenxiao Liu, Xiaojun Wan

We propose CodeQA, a free-form question answering dataset for the purpose of source code comprehension: given a code snippet and a question, a textual answer is required to be generated.

Machine Reading Comprehension Question Answering

MOVER: Mask, Over-generate and Rank for Hyperbole Generation

1 code implementation NAACL 2022 Yunxiang Zhang, Xiaojun Wan

In this paper, we tackle the challenging task of hyperbole generation to transfer a literal sentence into its hyperbolic paraphrase.

Pushing Paraphrase Away from Original Sentence: A Multi-Round Paraphrase Generation Approach

1 code implementation Findings (ACL) 2021 Zhe Lin, Xiaojun Wan

Both automatic and human evaluation show BTmPG can improve the diversity of paraphrase while preserving the semantics of the original sentence.

Paraphrase Generation Translation

Making Better Use of Bilingual Information for Cross-Lingual AMR Parsing

1 code implementation Findings (ACL) 2021 Yitao Cai, Zhe Lin, Xiaojun Wan

We argue that the misprediction of concepts is due to the high relevance between English tokens and AMR concepts.

AMR Parsing

Continual Learning for Neural Machine Translation

no code implementations NAACL 2021 Yue Cao, Hao-Ran Wei, Boxing Chen, Xiaojun Wan

In practical applications, NMT models are usually trained on a general domain corpus and then fine-tuned by continuing training on the in-domain corpus.

Continual Learning Knowledge Distillation +3

Bridging the Domain Gap: Improve Informal Language Translation via Counterfactual Domain Adaptation

no code implementations AAAI 2021 Ke Wang, Guandan Chen, Zhongqiang Huang, Xiaojun Wan, Fei Huang

Despite the near-human performances already achieved on formal texts such as news articles, neural machine translation still has difficulty in dealing with "user-generated" texts that have diverse linguistic phenomena but lack large-scale high-quality parallel corpora.

counterfactual Domain Adaptation +2

Learning a Product Relevance Model from Click-Through Data in E-Commerce

no code implementations 14 Feb 2021 Shaowei Yao, Jiwei Tan, Xi Chen, Keping Yang, Rong Xiao, Hongbo Deng, Xiaojun Wan

We propose a novel way to consider samples of different relevance confidence, and come up with a new training objective to learn a robust relevance model with desirable score distribution.

Click-Through Rate Prediction

ParaSCI: A Large Scientific Paraphrase Dataset for Longer Paraphrase Generation

1 code implementation EACL 2021 Qingxiu Dong, Xiaojun Wan, Yue Cao

We propose ParaSCI, the first large-scale paraphrase dataset in the scientific field, including 33,981 paraphrase pairs from ACL (ParaSCI-ACL) and 316,063 pairs from arXiv (ParaSCI-arXiv).

Paraphrase Generation

On the Helpfulness of Document Context to Sentence Simplification

1 code implementation COLING 2020 Renliang Sun, Zhe Lin, Xiaojun Wan

Our model uses neural networks to learn the different effects of the preceding sentences and the following sentences on the current sentence and applies them to the improved transformer model.

Text Simplification

IGSQL: Database Schema Interaction Graph Based Neural Model for Context-Dependent Text-to-SQL Generation

2 code implementations EMNLP 2020 Yitao Cai, Xiaojun Wan

Our model outperforms the previous state-of-the-art model by a large margin and achieves new state-of-the-art results on the two datasets.


Adversarial Text Generation via Sequence Contrast Discrimination

no code implementations Findings of the Association for Computational Linguistics 2020 Ke Wang, Xiaojun Wan

In this paper, we propose a sequence contrast loss driven text generation framework, which learns the difference between real texts and generated texts and uses that difference.

Adversarial Text Text Generation

TransModality: An End2End Fusion Method with Transformer for Multimodal Sentiment Analysis

no code implementations 7 Sep 2020 Zilong Wang, Zhaohong Wan, Xiaojun Wan

Enlightened by recent success of Transformer in the area of machine translation, we propose a new fusion method, TransModality, to address the task of multimodal sentiment analysis.

 Ranked #1 on Multimodal Sentiment Analysis on CMU-MOSI (F1-score (Weighted) metric)

Machine Translation Multimodal Sentiment Analysis +1

Constructing a Family Tree of Ten Indo-European Languages with Delexicalized Cross-linguistic Transfer Patterns

no code implementations 17 Jul 2020 Yuanyuan Zhao, Weiwei Sun, Xiaojun Wan

It is reasonable to hypothesize that the divergence patterns formulated by historical linguists and typologists reflect constraints on human languages, and are thus consistent with Second Language Acquisition (SLA) in a certain way.

Language Acquisition

Jointly Learning to Align and Summarize for Neural Cross-Lingual Summarization

no code implementations ACL 2020 Yue Cao, Hui Liu, Xiaojun Wan

However, it is a big challenge for the model to directly learn cross-lingual summarization as it requires learning to understand different languages and learning how to summarize at the same time.

Cross-Lingual Transfer

Automatic Generation of Citation Texts in Scholarly Papers: A Pilot Study

no code implementations ACL 2020 Xinyu Xing, Xiaosheng Fan, Xiaojun Wan

In this paper, we study the challenging problem of automatic generation of citation texts in scholarly papers.

Text Generation

Multimodal Transformer for Multimodal Machine Translation

1 code implementation ACL 2020 Shaowei Yao, Xiaojun Wan

Multimodal Machine Translation (MMT) aims to introduce information from other modalities, generally static images, to improve the translation quality.

Multimodal Machine Translation Translation

Heterogeneous Graph Transformer for Graph-to-Sequence Learning

no code implementations ACL 2020 Shaowei Yao, Tianming Wang, Xiaojun Wan

The graph-to-sequence (Graph2Seq) learning aims to transduce graph-structured representations to word sequences for text generation.

AMR-to-Text Generation Graph-to-Sequence +3

Multi-Granularity Interaction Network for Extractive and Abstractive Multi-Document Summarization

no code implementations ACL 2020 Hanqi Jin, Tianming Wang, Xiaojun Wan

In this paper, we propose a multi-granularity interaction network for extractive and abstractive multi-document summarization, which jointly learns semantic representations for words, sentences, and documents.

Document Summarization Extractive Summarization +1

AMR-To-Text Generation with Graph Transformer

no code implementations TACL 2020 Tianming Wang, Xiaojun Wan, Hanqi Jin

Abstract meaning representation (AMR)-to-text generation is the challenging task of generating natural language texts from AMR graphs, where nodes represent concepts and edges denote relations.

AMR-to-Text Generation Graph-to-Sequence +1

Towards a Unified End-to-End Approach for Fully Unsupervised Cross-Lingual Sentiment Analysis

no code implementations CONLL 2019 Yanlin Feng, Xiaojun Wan

Cross-lingual sentiment analysis (CLSA) aims to improve the performance on these languages by leveraging annotated data from other languages.

Cross-Lingual Word Embeddings Sentiment Analysis +1

Automated Chess Commentator Powered by Neural Chess Engine

2 code implementations ACL 2019 Hongyu Zang, Zhiwei Yu, Xiaojun Wan

In this paper, we explore a new approach for automated chess commentary generation, which aims to generate chess commentary texts in different categories (e.g., description, comparison, planning, etc.).

Text Generation

A Neural Approach to Irony Generation

1 code implementation 13 Sep 2019 Mengdi Zhu, Zhiwei Yu, Xiaojun Wan

Ironies can not only express stronger emotions but also show a sense of humor.

Style Transfer

INS: An Interactive Chinese News Synthesis System

no code implementations NAACL 2019 Hui Liu, Wentao Qin, Xiaojun Wan

So it is of vital importance to automatically synthesize a batch of news articles related to the event or topic into a new synthesis article (or overview article) for reader's convenience.

A Comparative Analysis of Knowledge-Intensive and Data-Intensive Semantic Parsers

no code implementations 4 Jul 2019 Junjie Cao, Zi Lin, Weiwei Sun, Xiaojun Wan

We present a phenomenon-oriented comparative analysis of the two dominant approaches in task-independent semantic parsing: classic, knowledge-intensive and neural, data-intensive models.

Semantic Parsing

T-CVAE: Transformer-Based Conditioned Variational Autoencoder for Story Completion

1 code implementation International Joint Conference on Artificial Intelligence 2019 Tianming Wang, Xiaojun Wan

Our model uses shared attention layers for encoder and decoder, which make the most of the contextual clues, and a latent variable for learning the distribution of coherent story plots.

Story Completion

Asking the Crowd: Question Analysis, Evaluation and Generation for Open Discussion on Online Forums

1 code implementation ACL 2019 Zi Chai, Xinyu Xing, Xiaojun Wan, Bo Huang

For openQG task, we construct OQGenD, the first dataset as far as we know, and propose a model based on conditional generative adversarial networks and our question evaluation model.

Text Generation

Massive Styles Transfer with Limited Labeled Data

1 code implementation 3 Jun 2019 Hongyu Zang, Xiaojun Wan

In this paper, we propose a multi-agent style transfer system (MAST) for addressing multiple style transfer tasks with limited labeled data, by leveraging abundant unlabeled data and the mutual benefit among the multiple styles.

Denoising Style Transfer +1

A Semi-Supervised Approach for Low-Resourced Text Generation

1 code implementation 3 Jun 2019 Hongyu Zang, Xiaojun Wan

The low-resource (of labeled data) problem is quite common in different text generation tasks, but unlabeled data are usually abundant.

Denoising Language Modelling +2

Learning Bilingual Sentiment-Specific Word Embeddings without Cross-lingual Supervision

no code implementations NAACL 2019 Yanlin Feng, Xiaojun Wan

Our method only requires a sentiment corpus in the source language and pretrained monolingual word embeddings of both languages.

Sentiment Analysis Translation +3

How to Avoid Sentences Spelling Boring? Towards a Neural Approach to Unsupervised Metaphor Generation

no code implementations NAACL 2019 Zhiwei Yu, Xiaojun Wan

In order to create novel metaphors, we propose a neural approach to metaphor generation and explore the shared inferential structure of a metaphorical usage and a literal usage of a verb.

Language Modelling Text Generation

Controllable Unsupervised Text Attribute Transfer via Editing Entangled Latent Representation

2 code implementations NeurIPS 2019 Ke Wang, Hang Hua, Xiaojun Wan

Unsupervised text attribute transfer automatically transforms a text to alter a specific attribute (e.g., sentiment) without using any parallel data, while simultaneously preserving its attribute-independent content.

Text Attribute Transfer

AMRec: An Intelligent System for Academic Method Recommendation

no code implementations 10 Apr 2019 Shanshan Huang, Xiaojun Wan, Xuewei Tang

Finding new academic methods for research problems is the key task in a researcher's research career.

Parsing Chinese Sentences with Grammatical Relations

no code implementations CL 2019 Weiwei Sun, Yufei Chen, Xiaojun Wan, Meichun Liu

In this work, we propose to represent grammatical information using general directed dependency graphs.

Semantic Role Labeling for Learner Chinese: the Importance of Syntactic Parsing and L2-L1 Parallel Data

1 code implementation EMNLP 2018 Zi Lin, Yuguang Duan, Yuan-Yuan Zhao, Weiwei Sun, Xiaojun Wan

This paper studies semantic parsing for interlanguage (L2), taking semantic role labeling (SRL) as a case task and learner Chinese as a case language.

Semantic Parsing Semantic Role Labeling

Point Precisely: Towards Ensuring the Precision of Data in Generated Texts Using Delayed Copy Mechanism

no code implementations COLING 2018 Liunian Li, Xiaojun Wan

Our approach first adopts an encoder-decoder model to generate a template text with data slots to be filled and then leverages a proposed delayed copy mechanism to fill in the slots with proper data records.

Data-to-Text Generation Descriptive +1

Language Generation via DAG Transduction

no code implementations ACL 2018 Yajie Ye, Weiwei Sun, Xiaojun Wan

This remarkable result demonstrates the feasibility of applying a DAG transducer to resolve NLG, as well as the effectiveness of our design.

Semantic Parsing Text Generation

Accurate SHRG-Based Semantic Parsing

no code implementations ACL 2018 Yufei Chen, Weiwei Sun, Xiaojun Wan

We demonstrate that an SHRG-based parser can produce semantic graphs much more accurately than previously shown, by relating synchronous production rules to the syntacto-semantic composition process.

Semantic Composition Semantic Parsing

Pre- and In-Parsing Models for Neural Empty Category Detection

no code implementations ACL 2018 Yufei Chen, Yuan-Yuan Zhao, Weiwei Sun, Xiaojun Wan

Motivated by the positive impact of empty category on syntactic parsing, we study neural models for pre- and in-parsing detection of empty category, which has not previously been investigated.

Dependency Parsing Structured Prediction

A Neural Approach to Pun Generation

no code implementations ACL 2018 Zhiwei Yu, Jiwei Tan, Xiaojun Wan

Since sequence-to-sequence models provide an effective technique for text generation, it is promising to investigate these models on the pun generation task.

Image Captioning Language Modelling +3

Sense-Aware Neural Models for Pun Location in Texts

no code implementations ACL 2018 Yitao Cai, Yin Li, Xiaojun Wan

In this paper, we focus on the task of pun location, which aims to identify the pun word in a given short text.

Word Sense Disambiguation

Towards a Neural Network Approach to Abstractive Multi-Document Summarization

no code implementations 24 Apr 2018 Jianmin Zhang, Jiwei Tan, Xiaojun Wan

In this paper, we investigate neural abstractive methods for MDS by adapting a state-of-the-art neural abstractive summarization model for SDS.

Abstractive Text Summarization Document Summarization +1

Towards Automatic Generation of Entertaining Dialogues in Chinese Crosstalks

no code implementations 1 Nov 2017 Shikang Du, Xiaojun Wan, Yajie Ye

Crosstalk, also known by its Chinese name xiangsheng, is a traditional Chinese comedic performing art featuring jokes and funny dialogues, and one of China's most popular cultural elements.

Dialogue Generation Translation

Leveraging Diverse Lexical Chains to Construct Essays for Chinese College Entrance Examination

no code implementations IJCNLP 2017 Liunian Li, Xiaojun Wan, Jin-Ge Yao, Siming Yan

In this work we study the challenging task of automatically constructing essays for Chinese college entrance examination where the topic is specified in advance.

Towards a Universal Sentiment Classifier in Multiple languages

no code implementations EMNLP 2017 Kui Xu, Xiaojun Wan

We present the evaluation results of our universal sentiment classifier in five languages, and the results are very promising even when the parallel data between English and the target languages are not used.

General Classification Machine Translation +2

Towards Automatic Construction of News Overview Articles by News Synthesis

no code implementations EMNLP 2017 Jianmin Zhang, Xiaojun Wan

In this paper we investigate a new task of automatically constructing an overview article from a given set of news articles about a news event.

Document Summarization Multi-Document Summarization

Quasi-Second-Order Parsing for 1-Endpoint-Crossing, Pagenumber-2 Graphs

no code implementations EMNLP 2017 Junjie Cao, Sheng Huang, Weiwei Sun, Xiaojun Wan

We propose a new Maximum Subgraph algorithm for first-order parsing to 1-endpoint-crossing, pagenumber-2 graphs.

Dependency Parsing

Parsing for Grammatical Relations via Graph Merging

no code implementations CONLL 2017 Weiwei Sun, Yantao Du, Xiaojun Wan

This paper is concerned with building deep grammatical relation (GR) analysis using a data-driven approach.

The Covert Helps Parse the Overt

no code implementations CONLL 2017 Xun Zhang, Weiwei Sun, Xiaojun Wan

This paper is concerned with whether deep syntactic information can help surface parsing, with a particular focus on empty categories.

Dependency Parsing

Abstractive Document Summarization with a Graph-Based Attentional Neural Model

no code implementations ACL 2017 Jiwei Tan, Xiaojun Wan, Jianguo Xiao

Abstractive summarization is the ultimate goal of document summarization research, but it has previously been less investigated due to the immaturity of text generation techniques.

Abstractive Text Summarization Document Summarization +4

Semantic Dependency Parsing via Book Embedding

no code implementations ACL 2017 Weiwei Sun, Junjie Cao, Xiaojun Wan

We model a dependency graph as a book, a particular kind of topological space, for semantic dependency parsing.

Combinatorial Optimization Dependency Parsing +1

Learning to Identify Ambiguous and Misleading News Headlines

no code implementations 17 May 2017 Wei Wei, Xiaojun Wan

For the identification of misleading headlines, we extract features based on the congruence between headlines and bodies.

PKUSUMSUM : A Java Platform for Multilingual Document Summarization

no code implementations COLING 2016 Jianmin Zhang, Tianming Wang, Xiaojun Wan

PKUSUMSUM is a Java platform for multilingual document summarization, and it supports multiple languages, integrates 10 automatic summarization methods, and tackles three typical summarization tasks.

Chinese Word Segmentation Document Summarization +1

Mining and Analyzing the Future Works in Scientific Articles

no code implementations 8 Jul 2015 Yue Hu, Xiaojun Wan

Third, we apply the extraction method and the classification model to a paper dataset in the computer science field and conduct a further analysis of the future works.

Classification General Classification

Multi-Document Summarization via Discriminative Summary Reranking

no code implementations 8 Jul 2015 Xiaojun Wan, Ziqiang Cao, Furu Wei, Sujian Li, Ming Zhou

However, according to our quantitative analysis, none of the existing summarization models can always produce high-quality summaries for different document sets, and even a summarization model with good overall performance may produce low-quality summaries for some document sets.

Document Summarization Multi-Document Summarization

Learning to Mine Chinese Coordinate Terms Using the Web

no code implementations 8 Jul 2015 Xiaojiang Huang, Xiaojun Wan, Jianguo Xiao

Coordinate relation refers to the relation between instances of a concept and the relation between the direct hyponyms of a concept.

Data-driven, PCFG-based and Pseudo-PCFG-based Models for Chinese Dependency Parsing

no code implementations TACL 2013 Weiwei Sun, Xiaojun Wan

We present a comparative study of transition-, graph- and PCFG-based models aimed at illuminating more precisely the likely contribution of CFGs in improving Chinese dependency parsing accuracy, especially by combining heterogeneous models.

Chinese Dependency Parsing Dependency Parsing +1
