Search Results for author: Dayiheng Liu

Found 81 papers, 45 papers with code

RoBLEURT Submission for WMT2021 Metrics Task

no code implementations WMT (EMNLP) 2021 Yu Wan, Dayiheng Liu, Baosong Yang, Tianchi Bi, Haibo Zhang, Boxing Chen, Weihua Luo, Derek F. Wong, Lidia S. Chao

After investigating the recent advances of trainable metrics, we identify several aspects as vital to obtaining a well-performing metric model: 1) jointly leveraging the advantages of a source-included model and a reference-only model, 2) continuously pre-training the model on massive synthetic data pairs, and 3) fine-tuning the model with a data denoising strategy.

Denoising

GCPG: A General Framework for Controllable Paraphrase Generation

no code implementations Findings (ACL) 2022 Kexin Yang, Dayiheng Liu, Wenqiang Lei, Baosong Yang, Haibo Zhang, Xue Zhao, Wenqing Yao, Boxing Chen

Under GCPG, we reconstruct the commonly adopted lexical condition (i.e., Keywords) and syntactic conditions (i.e., Part-Of-Speech sequence, Constituent Tree, Masked Template, and Sentential Exemplar) and study combinations of the two types.

Decoder Paraphrase Generation +1

CoRT: Code-integrated Reasoning within Thinking

1 code implementation11 Jun 2025 Chengpeng Li, Zhengyang Tang, Ziniu Li, Mingfeng Xue, Keqin Bao, Tian Ding, Ruoyu Sun, Benyou Wang, Xiang Wang, Junyang Lin, Dayiheng Liu

Large Reasoning Models (LRMs) like o1 and DeepSeek-R1 have shown remarkable progress in natural language reasoning with long chain-of-thought (CoT), yet they remain inefficient or inaccurate when handling complex mathematical operations.

Mathematical Reasoning

Qwen3 Embedding: Advancing Text Embedding and Reranking Through Foundation Models

no code implementations5 Jun 2025 Yanzhao Zhang, Mingxin Li, Dingkun Long, Xin Zhang, Huan Lin, Baosong Yang, Pengjun Xie, An Yang, Dayiheng Liu, Junyang Lin, Fei Huang, Jingren Zhou

In this work, we introduce the Qwen3 Embedding series, built upon the Qwen3 foundation models, a significant advancement over its predecessor, the GTE-Qwen series, in text embedding and reranking capabilities.

Reranking Retrieval +1

Reasoning Like an Economist: Post-Training on Economic Problems Induces Strategic Generalization in LLMs

1 code implementation31 May 2025 Yufa Zhou, Shaobo Wang, Xingyu Dong, Xiangqi Jin, Yifang Chen, Yue Min, Kexin Yang, Xingzhang Ren, Dayiheng Liu, Linfeng Zhang

Directly training Large Language Models (LLMs) for Multi-Agent Systems (MAS) remains challenging due to intricate reward modeling, dynamic agent interactions, and demanding generalization requirements.

Parallel Scaling Law for Language Models

1 code implementation15 May 2025 Mouxiang Chen, Binyuan Hui, Zeyu Cui, Jiaxi Yang, Dayiheng Liu, Jianling Sun, Junyang Lin, Zhongxin Liu

We apply $P$ diverse and learnable transformations to the input, execute forward passes of the model in parallel, and dynamically aggregate the $P$ outputs.
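The mechanism is concrete enough to sketch. Below is a minimal, hypothetical PyTorch rendering, assuming the $P$ learnable transformations are additive input offsets and the dynamic aggregation is an input-conditioned softmax over streams; the paper's exact parameterization may differ, and `base_model` stands in for any backbone mapping (batch, seq, d_model) to the same shape.

```python
import torch
import torch.nn as nn

class ParallelScaled(nn.Module):
    """Sketch of parallel scaling: P learnable input transformations,
    P forward passes run as one batched pass, dynamic aggregation."""

    def __init__(self, base_model: nn.Module, d_model: int, p: int = 4):
        super().__init__()
        self.base_model = base_model                # shared backbone
        self.p = p
        # Assumed form of the transformations: one additive offset per stream.
        self.offsets = nn.Parameter(torch.randn(p, d_model) * 0.02)
        # Dynamic aggregation: input-conditioned weights over the P streams.
        self.gate = nn.Linear(d_model, p)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, s, d = x.shape
        # Replicate the input P times and apply each transformation.
        xs = x.unsqueeze(0) + self.offsets[:, None, None, :]      # (P, b, s, d)
        # One batched forward pass stands in for P parallel passes.
        ys = self.base_model(xs.reshape(self.p * b, s, d)).reshape(self.p, b, s, d)
        # Aggregate the P outputs with input-conditioned softmax weights.
        w = torch.softmax(self.gate(x), dim=-1)                   # (b, s, P)
        return torch.einsum("bsp,pbsd->bsd", w, ys)

# Smoke test with a trivial backbone:
# ParallelScaled(nn.Identity(), d_model=64, p=4)(torch.randn(2, 10, 64))
```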

WorldPM: Scaling Human Preference Modeling

1 code implementation15 May 2025 Binghai Wang, Runji Lin, Keming Lu, Le Yu, Zhenru Zhang, Fei Huang, Chujie Zheng, Kai Dang, Yang Fan, Xingzhang Ren, An Yang, Binyuan Hui, Dayiheng Liu, Tao Gui, Qi Zhang, Xuanjing Huang, Yu-Gang Jiang, Bowen Yu, Jingren Zhou, Junyang Lin

Motivated by scaling laws in language modeling that demonstrate how test loss scales as a power law with model and dataset sizes, we find that similar laws exist in preference modeling.

Language Modeling
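For reference, the language-modeling scaling laws alluded to here are conventionally written in the power-law form below; the constants $E, A, B, \alpha, \beta$ are fitted quantities, not values from this paper, which reports that preference-modeling test loss follows an analogous trend.

```latex
% Canonical power-law form of test loss in model size N and data size D:
L(N, D) = E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}
```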

Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free

1 code implementation10 May 2025 Zihan Qiu, Zekun Wang, Bo Zheng, Zeyu Huang, Kaiyue Wen, Songlin Yang, Rui Men, Le Yu, Fei Huang, Suozhi Huang, Dayiheng Liu, Jingren Zhou, Junyang Lin

Gating mechanisms have been widely utilized, from early models like LSTMs and Highway Networks to recent state space models, linear attention, and also softmax attention.

Attribute Mixture-of-Experts +1
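As a concrete reference point, here is a minimal sketch of one gate placement in this design space: an elementwise, input-conditioned sigmoid gate applied to the output of scaled dot-product attention. The placement and granularity shown are assumptions for illustration; the paper ablates several variants.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedAttention(nn.Module):
    """Sketch: multi-head attention whose output is modulated by an
    elementwise sigmoid gate computed from the layer input."""

    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.n_heads, self.d_head = n_heads, d_model // n_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.gate = nn.Linear(d_model, d_model)   # per-dimension gate scores
        self.out = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, s, d = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q = q.view(b, s, self.n_heads, self.d_head).transpose(1, 2)
        k = k.view(b, s, self.n_heads, self.d_head).transpose(1, 2)
        v = v.view(b, s, self.n_heads, self.d_head).transpose(1, 2)
        y = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        y = y.transpose(1, 2).reshape(b, s, d)
        # Sigmoid gating adds a non-linearity after attention and lets the
        # output be driven toward zero, the property the paper connects to
        # sparsity and the removal of attention sinks.
        return self.out(torch.sigmoid(self.gate(x)) * y)
```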

START: Self-taught Reasoner with Tools

no code implementations6 Mar 2025 Chengpeng Li, Mingfeng Xue, Zhenru Zhang, Jiaxi Yang, Beichen Zhang, Xiang Wang, Bowen Yu, Binyuan Hui, Junyang Lin, Dayiheng Liu

In this paper, we introduce START (Self-Taught Reasoner with Tools), a novel tool-integrated long CoT reasoning LLM that significantly enhances reasoning capabilities by leveraging external tools.

Math Self-Learning

DataMan: Data Manager for Pre-training Large Language Models

no code implementations26 Feb 2025 Ru Peng, Kexin Yang, Yawen Zeng, Junyang Lin, Dayiheng Liu, Junbo Zhao

In this paper, we train a Data Manager (DataMan) to learn quality ratings and domain recognition from pointwise ratings, and use it to annotate a 447B-token pre-training corpus with 14 quality ratings and domain types.

In-Context Learning Instruction Following

SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines

no code implementations20 Feb 2025 M-A-P Team, Xinrun Du, Yifan Yao, Kaijing Ma, Bingli Wang, Tianyu Zheng, King Zhu, Minghao Liu, Yiming Liang, Xiaolong Jin, Zhenlin Wei, Chujie Zheng, Kaixin Deng, Shawn Gavin, Shian Jia, Sichao Jiang, Yiyan Liao, Rui Li, Qinrui Li, Sirun Li, Yizhi Li, Yunwen Li, David Ma, Yuansheng Ni, Haoran Que, Qiyao Wang, Zhoufutu Wen, Siwei Wu, Tyshawn Hsing, Ming Xu, Zhenzhu Yang, Zekun Moore Wang, Junting Zhou, Yuelin Bai, Xingyuan Bu, Chenglin Cai, Liang Chen, Yifan Chen, Chengtuo Cheng, Tianhao Cheng, Keyi Ding, Siming Huang, Yun Huang, Yaoru Li, Yizhe Li, Zhaoqun Li, Tianhao Liang, Chengdong Lin, Hongquan Lin, Yinghao Ma, Tianyang Pang, Zhongyuan Peng, Zifan Peng, Qige Qi, Shi Qiu, Xingwei Qu, Shanghaoran Quan, Yizhou Tan, Zili Wang, Chenqing Wang, Hao Wang, Yiya Wang, YuBo Wang, Jiajun Xu, Kexin Yang, Ruibin Yuan, Yuanhao Yue, Tianyang Zhan, Chun Zhang, Jinyang Zhang, Xiyue Zhang, Xingjian Zhang, Yue Zhang, Yongchi Zhao, Xiangyu Zheng, Chenghua Zhong, Yang Gao, Zhoujun Li, Dayiheng Liu, Qian Liu, Tianyu Liu, Shiwen Ni, Junran Peng, Yujia Qin, Wenbo Su, Guoyin Wang, Shi Wang, Jian Yang, Min Yang, Meng Cao, Xiang Yue, Zhaoxiang Zhang, Wangchunshu Zhou, Jiaheng Liu, Qunshu Lin, Wenhao Huang, Ge Zhang

To address this gap, we present SuperGPQA, a comprehensive benchmark that evaluates graduate-level knowledge and reasoning capabilities across 285 disciplines.

Collaborative Filtering

Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models

no code implementations21 Jan 2025 Zihan Qiu, Zeyu Huang, Bo Zheng, Kaiyue Wen, Zekun Wang, Rui Men, Ivan Titov, Dayiheng Liu, Jingren Zhou, Junyang Lin

Existing MoE training frameworks usually employ the parallel training strategy so that $f_i$ and the LBL are calculated within a $\textbf{micro-batch}$ and then averaged across parallel groups.

Mixture-of-Experts
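To unpack the notation, below is a minimal sketch of the standard load-balancing loss and of where the micro-batch vs. global-batch distinction enters; the paper's proposed remedy is not reproduced here.

```python
import torch

def load_balancing_loss(router_probs: torch.Tensor,
                        expert_index: torch.Tensor,
                        n_experts: int) -> torch.Tensor:
    """Standard LBL for one batch of tokens: N_E * sum_i f_i * p_i, where
    f_i is the fraction of tokens dispatched to expert i and p_i is the
    mean router probability assigned to expert i.

    router_probs: (tokens, n_experts) softmax outputs of the router
    expert_index: (tokens,) top-1 expert id chosen per token
    """
    counts = torch.bincount(expert_index, minlength=n_experts).float()
    f = counts / expert_index.numel()      # fraction of tokens per expert
    p = router_probs.mean(dim=0)           # mean router probability
    return n_experts * torch.sum(f * p)

# The subtlety the paper studies: parallel training computes this loss on
# each micro-batch and averages across parallel groups, which pushes every
# micro-batch toward uniform routing. A global-batch variant would instead
# all-reduce `counts` across groups before forming f, e.g. (assumed):
#   torch.distributed.all_reduce(counts)
```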

The Lessons of Developing Process Reward Models in Mathematical Reasoning

no code implementations13 Jan 2025 Zhenru Zhang, Chujie Zheng, Yangzhen Wu, Beichen Zhang, Runji Lin, Bowen Yu, Dayiheng Liu, Jingren Zhou, Junyang Lin

Furthermore, we identify potential biases in conventional Best-of-N (BoN) evaluation strategies for PRMs: (1) unreliable policy models generate responses with correct answers but flawed processes, leading to a misalignment between the evaluation criteria of BoN and the PRM objectives of process verification.

Mathematical Reasoning
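Best-of-N evaluation itself is simple to state in code. In the sketch below, `policy_sample` and `prm_score` are hypothetical stand-ins for a policy model and a process reward model; the bias described above arises because a candidate can reach the correct final answer through a flawed process, which answer-matching credits but a process-focused PRM should not.

```python
def best_of_n(prompt, policy_sample, prm_score, n=8):
    """Sample N candidate solutions and return the one the PRM scores
    highest; `policy_sample` and `prm_score` are hypothetical callables."""
    candidates = [policy_sample(prompt) for _ in range(n)]
    scored = [(prm_score(prompt, c), c) for c in candidates]
    return max(scored, key=lambda sc: sc[0])[1]
```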

Enabling Scalable Oversight via Self-Evolving Critic

no code implementations10 Jan 2025 Zhengyang Tang, Ziniu Li, Zhenyang Xiao, Tian Ding, Ruoyu Sun, Benyou Wang, Dayiheng Liu, Fei Huang, Tianyu Liu, Bowen Yu, Junyang Lin

Despite their remarkable performance, the development of Large Language Models (LLMs) faces a critical challenge in scalable oversight: providing effective feedback for tasks where human evaluation is difficult or where LLMs outperform humans.

ProcessBench: Identifying Process Errors in Mathematical Reasoning

1 code implementation9 Dec 2024 Chujie Zheng, Zhenru Zhang, Beichen Zhang, Runji Lin, Keming Lu, Bowen Yu, Dayiheng Liu, Jingren Zhou, Junyang Lin

We conduct extensive evaluation on ProcessBench, involving two types of models: process reward models (PRMs) and critic models, where for the latter we prompt general language models to critique each solution step by step.

GSM8K Math +1

Hallucination Detection: Robustly Discerning Reliable Answers in Large Language Models

no code implementations4 Jul 2024 Yuyan Chen, Qiang Fu, Yichen Yuan, Zhihao Wen, Ge Fan, Dayiheng Liu, Dongmei Zhang, Zhixu Li, Yanghua Xiao

Large Language Models (LLMs) have gained widespread adoption in various natural language processing tasks, including question answering and dialogue systems.

Hallucination Question Answering

DotaMath: Decomposition of Thought with Code Assistance and Self-correction for Mathematical Reasoning

1 code implementation4 Jul 2024 Chengpeng Li, Guanting Dong, Mingfeng Xue, Ru Peng, Xiang Wang, Dayiheng Liu

In this paper, we introduce a series of LLMs that employ Decomposition of thought with code assistance and self-correction for mathematical reasoning, dubbed DotaMath.

Avg GSM8K +3

MAPO: Boosting Large Language Model Performance with Model-Adaptive Prompt Optimization

no code implementations4 Jul 2024 Yuyan Chen, Zhihao Wen, Ge Fan, Zhengyu Chen, Wei Wu, Dayiheng Liu, Zhixu Li, Bang Liu, Yanghua Xiao

Prompt engineering, as an efficient and effective way to leverage Large Language Models (LLMs), has drawn much attention from the research community.

Language Modeling +3

LLM Critics Help Catch Bugs in Mathematics: Towards a Better Mathematical Verifier with Natural Language Feedback

1 code implementation20 Jun 2024 Bofei Gao, Zefan Cai, Runxin Xu, Peiyi Wang, Ce Zheng, Runji Lin, Keming Lu, Dayiheng Liu, Chang Zhou, Wen Xiao, Junjie Hu, Tianyu Liu, Baobao Chang

In recent progress, mathematical verifiers have achieved success in mathematical reasoning tasks by validating the correctness of solutions generated by policy models.

Binary Classification GSM8K +2

An Empirical Study of Parameter Efficient Fine-tuning on Vision-Language Pre-train Model

no code implementations13 Mar 2024 Yuxin Tian, Mouxing Yang, Yunfan Li, Dayiheng Liu, Xingzhang Ren, Xi Peng, Jiancheng Lv

A natural expectation for PEFTs is that a method's performance is positively related to the data size and the number of fine-tunable parameters.

parameter-efficient fine-tuning

Noisy Pair Corrector for Dense Retrieval

no code implementations7 Nov 2023 Hang Zhang, Yeyun Gong, Xingwei He, Dayiheng Liu, Daya Guo, Jiancheng Lv, Jian Guo

Most dense retrieval models contain an implicit assumption: the training query-document pairs are exactly matched.

Code Search Text Retrieval +1

OccuQuest: Mitigating Occupational Bias for Inclusive Large Language Models

1 code implementation25 Oct 2023 Mingfeng Xue, Dayiheng Liu, Kexin Yang, Guanting Dong, Wenqiang Lei, Zheng Yuan, Chang Zhou, Jingren Zhou

Furthermore, we assemble three test sets for comprehensive evaluation: an occu-test set covering 25 occupational categories, an estate set focusing on real estate, and an occu-quora set containing real-world questions from Quora.

How Abilities in Large Language Models are Affected by Supervised Fine-tuning Data Composition

2 code implementations9 Oct 2023 Guanting Dong, Hongyi Yuan, Keming Lu, Chengpeng Li, Mingfeng Xue, Dayiheng Liu, Wei Wang, Zheng Yuan, Chang Zhou, Jingren Zhou

We propose four intriguing research questions to explore the association between model performance and various factors including data amount, composition ratio, model size and SFT strategies.

Code Generation Instruction Following +2

Bridging the Domain Gaps in Context Representations for k-Nearest Neighbor Neural Machine Translation

1 code implementation26 May 2023 Zhiwei Cao, Baosong Yang, Huan Lin, Suhang Wu, Xiangpeng Wei, Dayiheng Liu, Jun Xie, Min Zhang, Jinsong Su

$k$-Nearest neighbor machine translation ($k$NN-MT) has attracted increasing attention due to its ability to non-parametrically adapt to new translation domains.

Domain Adaptation Machine Translation +3
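As background, here is a minimal sketch of the vanilla kNN-MT interpolation this work builds on; it is the baseline mechanism, not the paper's proposed domain-gap bridging, and the temperature, k, and interpolation weight lam are illustrative.

```python
import torch

def knn_mt_distribution(hidden: torch.Tensor,
                        datastore_keys: torch.Tensor,
                        datastore_values: torch.Tensor,
                        nmt_probs: torch.Tensor,
                        k: int = 8, temperature: float = 10.0,
                        lam: float = 0.5) -> torch.Tensor:
    """Vanilla kNN-MT step: retrieve the k nearest stored decoder states,
    turn distances into a distribution over their paired target tokens,
    and interpolate with the NMT model's distribution.

    hidden:           (d,)   current decoder state (the query)
    datastore_keys:   (n, d) stored decoder states
    datastore_values: (n,)   long tensor of paired target-token ids
    nmt_probs:        (v,)   model distribution over the vocabulary
    """
    dists = torch.cdist(hidden[None, None, :], datastore_keys[None])[0, 0]  # (n,)
    knn = torch.topk(-dists, k)  # nearest neighbors = smallest distances
    weights = torch.softmax(knn.values / temperature, dim=0)  # exp(-d/T), normalized
    knn_probs = torch.zeros_like(nmt_probs)
    knn_probs.scatter_add_(0, datastore_values[knn.indices], weights)
    return lam * knn_probs + (1 - lam) * nmt_probs
```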

Interactive Natural Language Processing

no code implementations22 May 2023 Zekun Wang, Ge Zhang, Kexin Yang, Ning Shi, Wangchunshu Zhou, Shaochun Hao, Guangzheng Xiong, Yizhi Li, Mong Yuan Sim, Xiuying Chen, Qingqing Zhu, Zhenzhu Yang, Adam Nik, Qi Liu, Chenghua Lin, Shi Wang, Ruibo Liu, Wenhu Chen, Ke Xu, Dayiheng Liu, Yike Guo, Jie Fu

Interactive Natural Language Processing (iNLP) has emerged as a novel paradigm within the field of NLP, aimed at addressing limitations in existing frameworks while aligning with the ultimate goals of artificial intelligence.

Decision Making

Towards Fine-Grained Information: Identifying the Type and Location of Translation Errors

no code implementations17 Feb 2023 Keqin Bao, Yu Wan, Dayiheng Liu, Baosong Yang, Wenqiang Lei, Xiangnan He, Derek F. Wong, Jun Xie

In this paper, we propose the Fine-Grained Translation Error Detection (FG-TED) task, which aims to identify both the position and the type of translation errors in given source-hypothesis sentence pairs.

Position Sentence +1

Alibaba-Translate China's Submission for WMT 2022 Quality Estimation Shared Task

1 code implementation18 Oct 2022 Keqin Bao, Yu Wan, Dayiheng Liu, Baosong Yang, Wenqiang Lei, Xiangnan He, Derek F. Wong, Jun Xie

In this paper, we present our submission to the sentence-level MQM benchmark at Quality Estimation Shared Task, named UniTE (Unified Translation Evaluation).

Language Modeling Sentence +1

Draft, Command, and Edit: Controllable Text Editing in E-Commerce

no code implementations11 Aug 2022 Kexin Yang, Dayiheng Liu, Wenqiang Lei, Baosong Yang, Qian Qu, Jiancheng Lv

To address this challenge, we explore a new draft-command-edit manner of description generation, leading to the proposed new task: controllable text editing in E-commerce.

Attribute Data Augmentation

Should We Rely on Entity Mentions for Relation Extraction? Debiasing Relation Extraction with Counterfactual Analysis

1 code implementation NAACL 2022 Yiwei Wang, Muhao Chen, Wenxuan Zhou, Yujun Cai, Yuxuan Liang, Dayiheng Liu, Baosong Yang, Juncheng Liu, Bryan Hooi

In this paper, we propose the CORE (Counterfactual Analysis based Relation Extraction) debiasing method that guides the RE models to focus on the main effects of textual context without losing the entity information.

counterfactual Relation +2

RoBLEURT Submission for the WMT2021 Metrics Task

no code implementations28 Apr 2022 Yu Wan, Dayiheng Liu, Baosong Yang, Tianchi Bi, Haibo Zhang, Boxing Chen, Weihua Luo, Derek F. Wong, Lidia S. Chao

After investigating the recent advances of trainable metrics, we identify several aspects as vital to obtaining a well-performing metric model: 1) jointly leveraging the advantages of a source-included model and a reference-only model, 2) continuously pre-training the model on massive synthetic data pairs, and 3) fine-tuning the model with a data denoising strategy.

Denoising

Tailor: A Prompt-Based Approach to Attribute-Based Controlled Text Generation

no code implementations28 Apr 2022 Kexin Yang, Dayiheng Liu, Wenqiang Lei, Baosong Yang, Mingfeng Xue, Boxing Chen, Jun Xie

We experimentally find that these prompts can simply be concatenated as a whole for multi-attribute CTG without any re-training, yet this raises problems of decreased fluency and position sensitivity.

Attribute Position +1

RMBR: A Regularized Minimum Bayes Risk Reranking Framework for Machine Translation

no code implementations1 Mar 2022 Yidan Zhang, Yu Wan, Dayiheng Liu, Baosong Yang, Zhenan He

Recently, Minimum Bayes Risk (MBR) decoding has been proposed to improve translation quality in NMT; it seeks a consensus translation that is closest on average to the other candidates in the n-best list.

Machine Translation NMT +2
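The consensus idea is compact enough to show directly. The sketch below uses a toy token-overlap utility so it stays self-contained; in practice the utility is a real metric such as sentence-level BLEU, and RMBR (the paper's method) adds a regularization term to the expected-utility score.

```python
def utility(hyp: str, ref: str) -> float:
    """Toy stand-in for a real utility such as sentence BLEU."""
    h, r = set(hyp.split()), set(ref.split())
    return len(h & r) / max(len(h | r), 1)

def mbr_decode(nbest: list[str]) -> str:
    """Return the candidate with the highest average utility against the
    other candidates, i.e. the consensus translation."""
    def expected_utility(cand: str) -> float:
        others = [h for h in nbest if h is not cand]
        return sum(utility(cand, h) for h in others) / max(len(others), 1)
    return max(nbest, key=expected_utility)
```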

Frequency-Aware Contrastive Learning for Neural Machine Translation

no code implementations29 Dec 2021 Tong Zhang, Wei Ye, Baosong Yang, Long Zhang, Xingzhang Ren, Dayiheng Liu, Jinan Sun, Shikun Zhang, Haibo Zhang, Wen Zhao

Inspired by the observation that low-frequency words form a more compact embedding space, we tackle this challenge from a representation learning perspective.

Contrastive Learning Diversity +5

KGR^4: Retrieval, Retrospect, Refine and Rethink for Commonsense Generation

1 code implementation15 Dec 2021 Xin Liu, Dayiheng Liu, Baosong Yang, Haibo Zhang, Junwei Ding, Wenqing Yao, Weihua Luo, Haiying Zhang, Jinsong Su

Generative commonsense reasoning requires machines to generate sentences describing an everyday scenario given several concepts, which has attracted much attention recently.

Retrieval Sentence

Leveraging Advantages of Interactive and Non-Interactive Models for Vector-Based Cross-Lingual Information Retrieval

no code implementations3 Nov 2021 Linlong Xu, Baosong Yang, Xiaoyu Lv, Tianchi Bi, Dayiheng Liu, Haibo Zhang

Interactive and non-interactive models are the two de facto standard frameworks in vector-based cross-lingual information retrieval (V-CLIR), which embed queries and documents in synchronous and asynchronous fashions, respectively.

Computational Efficiency Cross-Lingual Information Retrieval +5

POS-Constrained Parallel Decoding for Non-autoregressive Generation

1 code implementation ACL 2021 Kexin Yang, Wenqiang Lei, Dayiheng Liu, Weizhen Qi, Jiancheng Lv

However, in this work, we experimentally reveal that this assumption does not always hold for the text generation tasks like text summarization and story ending generation.

Knowledge Distillation POS +2

Prediction, Selection, and Generation: Exploration of Knowledge-Driven Conversation System

no code implementations23 Apr 2021 Cheng Luo, Dayiheng Liu, Chanjuan Li, Li Lu, Jiancheng Lv

The system includes modules such as dialogue topic prediction, knowledge matching and dialogue generation.

Dialogue Generation

ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training

3 code implementations Findings of the Association for Computational Linguistics 2020 Weizhen Qi, Yu Yan, Yeyun Gong, Dayiheng Liu, Nan Duan, Jiusheng Chen, Ruofei Zhang, Ming Zhou

This paper presents a new sequence-to-sequence pre-training model called ProphetNet, which introduces a novel self-supervised objective named future n-gram prediction and the proposed n-stream self-attention mechanism.

Abstractive Text Summarization Prediction +2
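The future n-gram objective is easy to picture via its target construction: at each position the model is trained to predict the next n tokens rather than only the next one. The sketch below builds only the shifted targets; the n-stream self-attention that produces the n predictions per step is omitted, and the padding convention is an assumption.

```python
import torch

def future_ngram_targets(tokens: torch.Tensor, n: int = 2,
                         pad_id: int = 0) -> torch.Tensor:
    """tokens: (seq,) -> targets: (n, seq); row i holds the sequence
    shifted left by i + 1, so position t is trained to predict the
    next n tokens y_{t+1}, ..., y_{t+n}."""
    seq = tokens.numel()
    targets = torch.full((n, seq), pad_id, dtype=tokens.dtype)
    for i in range(n):
        shift = i + 1
        targets[i, : seq - shift] = tokens[shift:]
    return targets
```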

Tell Me How to Ask Again: Question Data Augmentation with Controllable Rewriting in Continuous Space

1 code implementation EMNLP 2020 Dayiheng Liu, Yeyun Gong, Jie Fu, Yu Yan, Jiusheng Chen, Jiancheng Lv, Nan Duan, Ming Zhou

In this paper, we propose a novel data augmentation method, referred to as Controllable Rewriting based Question Data Augmentation (CRQDA), for machine reading comprehension (MRC), question generation, and question-answering natural language inference tasks.

Data Augmentation Machine Reading Comprehension +6

RikiNet: Reading Wikipedia Pages for Natural Question Answering

no code implementations ACL 2020 Dayiheng Liu, Yeyun Gong, Jie Fu, Yu Yan, Jiusheng Chen, Daxin Jiang, Jiancheng Lv, Nan Duan

The representations are then fed into the predictor to obtain the span of the short answer, the paragraph of the long answer, and the answer type in a cascaded manner.

Natural Language Understanding Natural Questions +1

Let's be Humorous: Knowledge Enhanced Humor Generation

no code implementations ACL 2020 Hang Zhang, Dayiheng Liu, Jiancheng Lv, Cheng Luo

To our knowledge, this is the first attempt to generate punchlines with a knowledge-enhanced model.

Sentence

Diverse, Controllable, and Keyphrase-Aware: A Corpus and Method for News Multi-Headline Generation

1 code implementation EMNLP 2020 Dayiheng Liu, Yeyun Gong, Jie Fu, Wei Liu, Yu Yan, Bo Shao, Daxin Jiang, Jiancheng Lv, Nan Duan

Furthermore, we propose a simple and effective method to mine the keyphrases of interest in a news article, and we build the first large-scale keyphrase-aware news headline corpus, which contains over 180K aligned triples of $<$news article, headline, keyphrase$>$.

Decoder Diversity +2

Generating Chinese Poetry from Images via Concrete and Abstract Information

no code implementations24 Mar 2020 Yusen Liu, Dayiheng Liu, Jiancheng Lv, Yongsheng Sang

We propose an infilling-based Chinese poetry generation model that explicitly infills Concrete keywords into each line of a poem, together with an abstract-information embedding that integrates Abstract information into the generated poems.

ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training

5 code implementations13 Jan 2020 Weizhen Qi, Yu Yan, Yeyun Gong, Dayiheng Liu, Nan Duan, Jiusheng Chen, Ruofei Zhang, Ming Zhou

This paper presents a new sequence-to-sequence pre-training model called ProphetNet, which introduces a novel self-supervised objective named future n-gram prediction and the proposed n-stream self-attention mechanism.

Ranked #6 on Question Generation on SQuAD1.1 (using extra training data)

Abstractive Text Summarization Prediction +2

Deep Poetry: A Chinese Classical Poetry Generation System

no code implementations19 Nov 2019 Yusen Liu, Dayiheng Liu, Jiancheng Lv

For users' convenience, we deploy the system on the WeChat applet platform, so users can access it on a mobile device anytime, anywhere.

Deep Learning-Based Automatic Downbeat Tracking: A Brief Review

1 code implementation10 Jun 2019 Bijue Jia, Jiancheng Lv, Dayiheng Liu

Among these, downbeat tracking has been a fundamental and long-standing problem in the Music Information Retrieval (MIR) area.

Deep Learning Downbeat Tracking +5

Revision in Continuous Space: Unsupervised Text Style Transfer without Adversarial Learning

1 code implementation29 May 2019 Dayiheng Liu, Jie Fu, Yidan Zhang, Chris Pal, Jiancheng Lv

We propose a new framework that utilizes the gradients to revise the sentence in a continuous space during inference to achieve text style transfer.

Attribute Disentanglement +4
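The revision loop described here can be sketched as inference-time gradient steps on a sentence's continuous representation. Everything below is a hypothetical rendering from the abstract alone: `loss_fn` stands in for whatever differentiable objective scores the revised representation (e.g., a style classifier's loss plus a proximity term), and the revised vector would then be decoded back into text.

```python
import torch

def revise_in_continuous_space(z: torch.Tensor, loss_fn,
                               steps: int = 10, lr: float = 0.1) -> torch.Tensor:
    """Hypothetical sketch: follow the gradient of `loss_fn` to revise a
    sentence representation z at inference time, without any adversarial
    training; the caller decodes the returned z back into a sentence."""
    z = z.detach().clone().requires_grad_(True)
    opt = torch.optim.SGD([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss_fn(z).backward()   # gradient of the attribute objective w.r.t. z
        opt.step()
    return z.detach()
```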

TIGS: An Inference Algorithm for Text Infilling with Gradient Search

1 code implementation ACL 2019 Dayiheng Liu, Jie Fu, PengFei Liu, Jiancheng Lv

Text infilling is the task of filling in the missing part of a sentence or paragraph, and it arises in many real-world natural language generation scenarios.

Sentence Text Infilling

mu-Forcing: Training Variational Recurrent Autoencoders for Text Generation

2 code implementations24 May 2019 Dayiheng Liu, Xu Yang, Feng He, YuanYuan Chen, Jiancheng Lv

It has previously been observed that training Variational Recurrent Autoencoders (VRAE) for text generation suffers from a serious uninformative-latent-variable problem.

Language Modeling +1

A Multi-Modal Chinese Poetry Generation Model

1 code implementation26 Jun 2018 Dayiheng Liu, Quan Guo, Wubo Li, Jiancheng Lv

Given a picture, the first line, the title and the other lines of the poem are successively generated in three stages.

Decoder model +1

BFGAN: Backward and Forward Generative Adversarial Networks for Lexically Constrained Sentence Generation

no code implementations21 Jun 2018 Dayiheng Liu, Jie Fu, Qian Qu, Jiancheng Lv

Incorporating prior knowledge such as lexical constraints into the model's output to generate meaningful and coherent sentences has many applications in dialogue systems, machine translation, image captioning, etc.

Image Captioning Machine Translation +2
