Search Results for author: Jingjing Xu

Found 79 papers, 40 papers with code

Seed1.5-Thinking: Advancing Superb Reasoning Models with Reinforcement Learning

no code implementations 10 Apr 2025 ByteDance Seed, Jiaze Chen, Tiantian Fan, Xin Liu, Lingjun Liu, Zhiqi Lin, Mingxuan Wang, Chengyi Wang, Xiangpeng Wei, Wenyuan Xu, Yufeng Yuan, Yu Yue, Lin Yan, Qiying Yu, Xiaochen Zuo, Chi Zhang, Ruofei Zhu, Zhecheng An, Zhihao Bai, Yu Bao, Xingyan Bin, Jiangjie Chen, Feng Chen, Hongmin Chen, Riwei Chen, Liangqiang Chen, Zixin Chen, Jinsong Chen, Siyan Chen, Kaiyuan Chen, Zhi Chen, Jin Chen, Jiecao Chen, Jinxin Chi, Weinan Dai, Ning Dai, Jiahui Dai, Shihan Dou, Yantao Du, Zhengyin Du, Jianhui Duan, Chen Dun, Ting-Han Fan, Jiazhan Feng, Junda Feng, Ziyuan Feng, Yuwei Fu, Wenqi Fu, Hanjie Fu, Hao Ge, Hongyi Guo, Mingji Han, Li Han, Wenhao Hao, Xintong Hao, Qianyu He, Jerry He, Feng He, Wen Heng, Zehua Hong, Qi Hou, Liang Hu, Shengding Hu, Nan Hu, Kai Hua, Qi Huang, Ziyue Huang, Hongzhi Huang, Zihao Huang, Ting Huang, Wenhao Huang, Wei Jia, Bin Jia, Xiaoying Jia, Yuhua Jiang, Haobin Jiang, Ziheng Jiang, Kaihua Jiang, Chengquan Jiang, Jianpeng Jiao, Xiaoran Jin, Xing Jin, Xunhao Lai, Xiang Li, Liyi Li, Hongkai Li, Zheng Li, Shengxian Wan, Ya Wang, Yunshui Li, Chenggang Li, Niuniu Li, Siyu Li, Xi Li, Xiao Li, Aoyan Li, Yuntao Li, Nianning Liang, Xinnian Liang, Haibin Lin, Weijian Lin, Ye Lin, Zhicheng Liu, Guanlin Liu, Chenxiao Liu, Yan Liu, Gaohong Liu, Juncai Liu, Chundian Liu, Deyi Liu, Kaibo Liu, Siyao Liu, Qi Liu, Yongfei Liu, Kang Liu, Gan Liu, Boyi Liu, Rui Long, Weiqiang Lou, Chenwei Lou, Xiang Luo, Yao Luo, Caiping Lv, Heyang Lv, Bole Ma, Qianli Ma, Hongzhi Ma, Yiyuan Ma, Jin Ma, Wenchang Ma, Tingting Ma, Chen Mao, Qiyang Min, Zhe Nan, Guanghan Ning, Jinxiang Ou, Haojie Pan, Renming Pang, Yanghua Peng, Tao Peng, Lihua Qian, Mu Qiao, Meng Qu, Cheng Ren, Hongbin Ren, Yong Shan, Wei Shen, Ke Shen, Kai Shen, Guangming Sheng, Jinlong Shi, Wenlei Shi, Guang Shi, Shuai Shuai Cao, Yuxin Song, Zuquan Song, Jing Su, Yifan Sun, Tao Sun, Zewei Sun, Borui Wan, Xiaohui Wang, Xi Wang, Shuguang Wang, Jun Wang, Qinlong Wang, Chenyuan Wang, 
Shuai Wang, Zihan Wang, Changbao Wang, Jiaqiang Wang, Shihang Wang, Xuwu Wang, Zaiyuan Wang, Yuxuan Wang, Wenqi Wang, Taiqing Wang, Chengzhi Wei, Houmin Wei, Ziyun Wei, Shufa Wei, Zheng Wu, Yonghui Wu, Yangjun Wu, Bohong Wu, Shuang Wu, Jingqiao Wu, Ning Wu, Shuangzhi Wu, Jianmin Wu, Chenguang Xi, Fan Xia, Yuqiao Xian, Liang Xiang, Boren Xiang, Bowen Xiao, Zhen Xiao, Xia Xiao, Yongsheng Xiao, Chao Xin, Shulin Xin, Yuwen Xiong, Jingjing Xu, Ziwen Xu, Chenyin Xu, Jiayi Xu, Yifan Xu, Wei Xu, Yufei Xu, Shikun Xu, Shipeng Yan, Shen Yan, Qingping Yang, Xi Yang, Tianhao Yang, Yuehang Yang, Yuan Yang, Ximing Yang, Zeyu Yang, Guang Yang, Yifan Yang, Xuesong Yao, Bairen Yi, Fan Yin, Jianian Yin, Ziqiang Ying, Xiangyu Yu, Hongli Yu, Song Yu, Menghan Yu, Huan Yu, Siyu Yuan, Jun Yuan, Yutao Zeng, Tianyang Zhan, Zheng Zhang, Yun Zhang, Mofan Zhang, Wang Zhang, Ru Zhang, Zhi Zhang, Tianqi Zhang, Xinyi Zhang, Zhexi Zhang, Sijun Zhang, Wenqiang Zhang, Xiangxiang Zhang, Yongtao Zhang, Yuyu Zhang, Ge Zhang, He Zhang, Yue Zhang, Renjie Zheng, Ningxin Zheng, Zhuolin Zheng, Yaowei Zheng, Chen Zheng, Xiaoyun Zhi, Wanjun Zhong, Cheng Zhong, Zheng Zhong, Baoquan Zhong, Xun Zhou, Na Zhou, Huan Zhou, Hang Zhu, Defa Zhu, Wenjia Zhu, Lei Zuo

We introduce Seed1.5-Thinking, capable of reasoning through thinking before responding, resulting in improved performance on a wide range of benchmarks.

Mixture-of-Experts reinforcement-learning +1

Enhancing Auto-regressive Chain-of-Thought through Loop-Aligned Reasoning

no code implementations 12 Feb 2025 Qifan Yu, Zhenyu He, Sijie Li, Xun Zhou, Jun Zhang, Jingjing Xu, Di He

Specifically, we align the steps of Chain-of-Thought (CoT) reasoning with loop iterations and apply intermediate supervision during the training of Looped Transformers.
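The core mechanic the abstract describes can be sketched loosely: a looped model reuses one parameter-shared block for several iterations, and each iteration's state offers a natural attachment point for intermediate supervision. A minimal illustration only (the `block` below is a hypothetical stand-in, not the paper's training recipe):

```python
import numpy as np

def looped_forward(x, block, n_loops):
    """Run one parameter-shared block for n_loops iterations, keeping every
    intermediate state. Aligning these states with chain-of-thought steps is
    where intermediate supervision could attach (illustrative sketch only)."""
    states = []
    for _ in range(n_loops):
        x = block(x)          # the same weights are reused at every iteration
        states.append(x)
    return states

block = lambda h: np.tanh(h + 1.0)   # hypothetical stand-in for a shared transformer block
states = looped_forward(np.zeros(3), block, n_loops=4)
```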

Teaching Language Models to Critique via Reinforcement Learning

no code implementations 5 Feb 2025 Zhihui Xie, Jie Chen, Liyu Chen, Weichao Mao, Jingjing Xu, Lingpeng Kong

Teaching large language models (LLMs) to critique and refine their outputs is crucial for building systems that can iteratively improve, yet it is fundamentally limited by the ability to provide accurate judgments and actionable suggestions.

Code Generation reinforcement-learning +1

A Unified Hyperparameter Optimization Pipeline for Transformer-Based Time Series Forecasting Models

1 code implementation 2 Jan 2025 Jingjing Xu, Caesar Wu, Yuan-Fang Li, Grégoire Danoy, Pascal Bouvry

Transformer-based models for time series forecasting (TSF) have attracted significant attention in recent years due to their effectiveness and versatility.

Hyperparameter Optimization Mamba +2

Why Does the Effective Context Length of LLMs Fall Short?

no code implementations 24 Oct 2024 Chenxin An, Jun Zhang, Ming Zhong, Lei LI, Shansan Gong, Yao Luo, Jingjing Xu, Lingpeng Kong

Advancements in distributed training and efficient attention mechanisms have significantly expanded the context window sizes of large language models (LLMs).

Attribute

FAN: Fourier Analysis Networks

2 code implementations 3 Oct 2024 Yihong Dong, Ge Li, Yongding Tao, Xue Jiang, Kechi Zhang, Jia Li, Jinliang Deng, Jing Su, Jun Zhang, Jingjing Xu

Despite the remarkable successes of general-purpose neural networks, such as MLPs and Transformers, we find that they exhibit notable shortcomings in modeling and reasoning about periodic phenomena, achieving only marginal performance within the training domain and failing to generalize effectively to out-of-domain (OOD) scenarios.
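One way to make periodic structure explicit, in the spirit of this abstract, is to route part of a layer's projection through sin/cos features while keeping an ordinary nonlinear branch. This is a generic Fourier-feature sketch under our own assumptions, not the paper's exact layer:

```python
import numpy as np

def fourier_style_layer(x, W_periodic, W_standard, b_standard):
    """Concatenate an explicitly periodic branch (cos/sin of a linear
    projection) with a standard nonlinear branch. Illustrative only;
    the paper's parameterization differs."""
    z = W_periodic @ x
    periodic = np.concatenate([np.cos(z), np.sin(z)])   # bounded, periodic in z
    standard = np.tanh(W_standard @ x + b_standard)     # ordinary MLP-style branch
    return np.concatenate([periodic, standard])

rng = np.random.default_rng(0)
x = rng.normal(size=4)
out = fourier_style_layer(x, rng.normal(size=(3, 4)), rng.normal(size=(2, 4)), np.zeros(2))
```

The periodic branch extrapolates by construction (cos/sin repeat outside the training range), which is the intuition behind addressing the OOD failure the abstract mentions.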

Language Modeling Language Modelling +1

Survey and Taxonomy: The Role of Data-Centric AI in Transformer-Based Time Series Forecasting

no code implementations 29 Jul 2024 Jingjing Xu, Caesar Wu, Yuan-Fang Li, Gregoire Danoy, Pascal Bouvry

We review the previous research works from a data-centric AI perspective and we intend to lay the foundation work for the future development of transformer-based architecture and data-centric AI.

Time Series Time Series Forecasting

Dynamic Encoder Size Based on Data-Driven Layer-wise Pruning for Speech Recognition

no code implementations 10 Jul 2024 Jingjing Xu, Wei Zhou, Zijian Yang, Eugen Beck, Ralf Schlueter

Varying-size models are often required to deploy ASR systems under different hardware and/or application constraints such as memory and latency.

speech-recognition Speech Recognition

Let the Code LLM Edit Itself When You Edit the Code

no code implementations 3 Jul 2024 Zhenyu He, Jun Zhang, Shengjie Luo, Jingjing Xu, Zhi Zhang, Di He

Simply encoding the edited subsequence and integrating it to the original KV cache meets the temporal confusion problem, leading to significantly worse performance.

8k Code Generation +2

An Expert is Worth One Token: Synergizing Multiple Expert LLMs as Generalist via Expert Token Routing

no code implementations 25 Mar 2024 Ziwei Chai, Guoyin Wang, Jing Su, Tianjie Zhang, Xuanwen Huang, Xuwu Wang, Jingjing Xu, Jianbo Yuan, Hongxia Yang, Fei Wu, Yang Yang

We present Expert-Token-Routing, a unified generalist framework that facilitates seamless integration of multiple expert LLMs.

Empowering Large Language Model Agents through Action Learning

1 code implementation 24 Feb 2024 Haiteng Zhao, Chang Ma, Guoyin Wang, Jing Su, Lingpeng Kong, Jingjing Xu, Zhi-Hong Deng, Hongxia Yang

Large Language Model (LLM) Agents have recently garnered increasing interest yet they are limited in their ability to learn from trial and error, a key element of intelligent behavior.

Language Modeling Language Modelling +2

Two Stones Hit One Bird: Bilevel Positional Encoding for Better Length Extrapolation

1 code implementation 29 Jan 2024 Zhenyu He, Guhao Feng, Shengjie Luo, Kai Yang, LiWei Wang, Jingjing Xu, Zhi Zhang, Hongxia Yang, Di He

In this work, we leverage the intrinsic segmentation of language sequences and design a new positional encoding method called Bilevel Positional Encoding (BiPE).
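The "bilevel" idea can be illustrated by computing, for each token, both its position within the current segment and the index of the segment itself (segments delimited here by a hypothetical separator token). A sketch of the indexing only; how BiPE actually encodes these two indices differs:

```python
def bilevel_positions(tokens, sep="."):
    """Assign each token an intra-segment position (index within its segment)
    and an inter-segment position (index of the segment). Illustrative sketch
    of the bilevel indexing idea."""
    intra, inter = [], []
    seg, pos = 0, 0
    for tok in tokens:
        intra.append(pos)
        inter.append(seg)
        if tok == sep:          # a separator closes the current segment
            seg, pos = seg + 1, 0
        else:
            pos += 1
    return intra, inter

intra, inter = bilevel_positions(["a", "b", ".", "c", "d", "e", "."])
```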

Disentanglement Position

InfiAgent-DABench: Evaluating Agents on Data Analysis Tasks

1 code implementation 10 Jan 2024 Xueyu Hu, Ziyu Zhao, Shuang Wei, Ziwei Chai, Qianli Ma, Guoyin Wang, Xuwu Wang, Jing Su, Jingjing Xu, Ming Zhu, Yao Cheng, Jianbo Yuan, Jiwei Li, Kun Kuang, Yang Yang, Hongxia Yang, Fei Wu

In this paper, we introduce InfiAgent-DABench, the first benchmark specifically designed to evaluate LLM-based agents on data analysis tasks.

Benchmarking

Transformer Multivariate Forecasting: Less is More?

1 code implementation 30 Dec 2023 Jingjing Xu, Caesar Wu, Yuan-Fang Li, Pascal Bouvry

From the model perspective, one of the PCA-enhanced models, PCA+Crossformer, reduces mean square error (MSE) by 33.3% and decreases runtime by 49.2% on average.
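The PCA preprocessing step itself is standard: project the multivariate series onto its top principal components before feeding the reduced series to a forecaster. A minimal NumPy sketch of that idea (illustrative, not the paper's exact pipeline; data is synthetic):

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project a multivariate series X (time x variables) onto its top
    principal components, shrinking the variable dimension before forecasting."""
    Xc = X - X.mean(axis=0)                  # center each variable
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T          # scores in the reduced space

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))                # hypothetical 8-variable series
Z = pca_reduce(X, n_components=3)            # 3-variable reduced series
```

Fewer input variables is what drives the runtime reduction the abstract reports.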

Temporal Sequences Time Series +1

Trustworthy AI: Deciding What to Decide

no code implementations 21 Nov 2023 Caesar Wu, Yuan-Fang Li, Jian Li, Jingjing Xu, Bouvry Pascal

We aim to use this framework to conduct TAI experiments with quantitative and qualitative research methods to satisfy TAI properties for the decision-making context.

Decision Making

Extrapolating Large Language Models to Non-English by Aligning Languages

2 code implementations 9 Aug 2023 Wenhao Zhu, Yunzhe Lv, Qingxiu Dong, Fei Yuan, Jingjing Xu, ShuJian Huang, Lingpeng Kong, Jiajun Chen, Lei LI

We start from targeting individual languages by performing cross-lingual instruction-tuning (CoIT) on LLaMA, i.e. tuning it with translation task data and cross-lingual general task data to obtain cross-lingual models (x-LLaMAs), and formulate underlying scaling laws to investigate the advantages of using scalable translation data.

Translation

INK: Injecting kNN Knowledge in Nearest Neighbor Machine Translation

1 code implementation 10 Jun 2023 Wenhao Zhu, Jingjing Xu, ShuJian Huang, Lingpeng Kong, Jiajun Chen

We propose an effective training framework INK to directly smooth the representation space via adjusting representations of kNN neighbors with a small number of new parameters.
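For background, kNN machine translation (which INK builds on) turns retrieved neighbor representations into a token distribution and interpolates it with the base model's distribution. A sketch of that standard interpolation under simplified assumptions, not of INK's representation-smoothing itself:

```python
import numpy as np

def knn_interpolate(p_model, neighbor_ids, neighbor_dists, vocab_size, lam=0.5, temp=10.0):
    """Blend a base model distribution with a distribution induced by
    retrieved neighbors (target-token ids + distances), the standard
    kNN-MT interpolation. Illustrative values throughout."""
    weights = np.exp(-np.asarray(neighbor_dists) / temp)  # closer neighbors weigh more
    weights /= weights.sum()
    p_knn = np.zeros(vocab_size)
    for tok, w in zip(neighbor_ids, weights):
        p_knn[tok] += w
    return lam * p_knn + (1.0 - lam) * p_model            # linear interpolation

p_model = np.full(5, 0.2)                                 # uniform base distribution
p = knn_interpolate(p_model, [1, 1, 3], [0.1, 0.2, 2.0], vocab_size=5)
```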

Machine Translation Translation

M$^3$IT: A Large-Scale Dataset towards Multi-Modal Multilingual Instruction Tuning

no code implementations 7 Jun 2023 Lei LI, Yuwei Yin, Shicheng Li, Liang Chen, Peiyi Wang, Shuhuai Ren, Mukai Li, Yazheng Yang, Jingjing Xu, Xu sun, Lingpeng Kong, Qi Liu

To tackle this challenge and promote research in the vision-language field, we introduce the Multi-Modal, Multilingual Instruction Tuning (M$^3$IT) dataset, designed to optimize VLM alignment with human instructions.

World Knowledge

ImageNetVC: Zero- and Few-Shot Visual Commonsense Evaluation on 1000 ImageNet Categories

1 code implementation 24 May 2023 Heming Xia, Qingxiu Dong, Lei LI, Jingjing Xu, Tianyu Liu, Ziwei Qin, Zhifang Sui

Recently, Large Language Models (LLMs) have been serving as general-purpose interfaces, posing a significant demand for comprehensive visual knowledge.

Common Sense Reasoning

Can Language Models Understand Physical Concepts?

1 code implementation 23 May 2023 Lei LI, Jingjing Xu, Qingxiu Dong, Ce Zheng, Qi Liu, Lingpeng Kong, Xu sun

Language models (LMs) gradually become general-purpose interfaces in the interactive and embodied world, where the understanding of physical concepts is an essential prerequisite.

Can We Edit Factual Knowledge by In-Context Learning?

2 code implementations 22 May 2023 Ce Zheng, Lei LI, Qingxiu Dong, Yuxuan Fan, Zhiyong Wu, Jingjing Xu, Baobao Chang

Inspired by in-context learning (ICL), a new paradigm based on demonstration contexts without parameter updating, we explore whether ICL can edit factual knowledge.

In-Context Learning knowledge editing

Extrapolating Multilingual Understanding Models as Multilingual Generators

no code implementations 22 May 2023 Bohong Wu, Fei Yuan, Hai Zhao, Lei LI, Jingjing Xu

Considering that encoder-based models have the advantages of efficient generation and self-correction, this paper explores methods to endow multilingual understanding models with generation abilities, yielding a unified model.

Denoising Language Modeling +6

A Challenging Benchmark for Low-Resource Learning

1 code implementation 7 Mar 2023 Yudong Wang, Chang Ma, Qingxiu Dong, Lingpeng Kong, Jingjing Xu

Experiments on a wide range of models show that neural networks, even pre-trained language models, suffer sharp performance drops on our benchmark, demonstrating its effectiveness in evaluating the weaknesses of neural networks.

OpenICL: An Open-Source Framework for In-context Learning

3 code implementations 6 Mar 2023 Zhenyu Wu, Yaoxiang Wang, Jiacheng Ye, Jiangtao Feng, Jingjing Xu, Yu Qiao, Zhiyong Wu

However, the implementation of ICL is sophisticated due to the diverse retrieval and inference methods involved, as well as the varying pre-processing requirements for different models, datasets, and tasks.

In-Context Learning Language Modeling +5

Analyzing And Improving Neural Speaker Embeddings for ASR

no code implementations 11 Jan 2023 Christoph Lüscher, Jingjing Xu, Mohammad Zeineldeen, Ralf Schlüter, Hermann Ney

By further adding neural speaker embeddings, we gain an additional ~3% relative WER improvement on Hub5'00.

Speaker Verification

A Survey on In-context Learning

1 code implementation 31 Dec 2022 Qingxiu Dong, Damai Dai, Ce Zheng, Jingyuan Ma, Rui Li, Heming Xia, Jingjing Xu, Zhiyong Wu, Tianyu Liu, Baobao Chang, Xu sun, Lei LI, Zhifang Sui

With the increasing capabilities of large language models (LLMs), in-context learning (ICL) has emerged as a new paradigm for natural language processing (NLP), where LLMs make predictions based on contexts augmented with a few examples.
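The basic mechanic surveyed here needs no weight updates: demonstrations are concatenated ahead of the test query and the LLM continues the pattern. A minimal sketch of prompt assembly (the demonstrations and format below are hypothetical, not from the survey):

```python
def build_icl_prompt(demonstrations, query):
    """Concatenate input-output demonstrations followed by the test query:
    the basic prompt format used for in-context learning (no parameter
    updates involved)."""
    lines = [f"Input: {x}\nOutput: {y}" for x, y in demonstrations]
    lines.append(f"Input: {query}\nOutput:")   # model completes after the final "Output:"
    return "\n\n".join(lines)

demos = [("great movie!", "positive"), ("boring plot.", "negative")]  # hypothetical examples
prompt = build_icl_prompt(demos, "what a fantastic cast!")
```

Much of the design space the survey covers (demonstration selection, ordering, formatting) amounts to varying how this context is built.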

In-Context Learning Survey

Go-tuning: Improving Zero-shot Learning Abilities of Smaller Language Models

no code implementations 20 Dec 2022 Jingjing Xu, Qingxiu Dong, Hongyi Liu, Lei LI

With increasing scale, large language models demonstrate both quantitative improvement and new qualitative capabilities, especially as zero-shot learners, like GPT-3.

Language Modeling Language Modelling +3

Lego-MT: Learning Detachable Models for Massively Multilingual Machine Translation

1 code implementation 20 Dec 2022 Fei Yuan, Yinquan Lu, Wenhao Zhu, Lingpeng Kong, Lei LI, Yu Qiao, Jingjing Xu

To address the needs of learning representations for all languages in a unified space, we propose a novel efficient training recipe, upon which we build an effective detachable model, Lego-MT.

Machine Translation Translation

BigText-QA: Question Answering over a Large-Scale Hybrid Knowledge Graph

no code implementations 12 Dec 2022 Jingjing Xu, Maria Biryukov, Martin Theobald, Vinu Ellampallil Venugopal

Answering complex questions over textual resources remains a challenge, particularly when dealing with nuanced relationships between multiple entities expressed within natural-language sentences.

Diversity Question Answering

Enhancing and Adversarial: Improve ASR with Speaker Labels

no code implementations 11 Nov 2022 Wei Zhou, Haotian Wu, Jingjing Xu, Mohammad Zeineldeen, Christoph Lüscher, Ralf Schlüter, Hermann Ney

Detailed analysis and experimental verification are conducted to show the optimal positions in the ASR neural network (NN) to apply speaker enhancing and adversarial training.

Multi-Task Learning

Improving the Training Recipe for a Robust Conformer-based Hybrid Model

no code implementations 26 Jun 2022 Mohammad Zeineldeen, Jingjing Xu, Christoph Lüscher, Ralf Schlüter, Hermann Ney

In this work, we investigate various methods for speaker adaptive training (SAT) based on feature-space approaches for a conformer-based acoustic model (AM) on the Switchboard 300h dataset.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

KNAS: Green Neural Architecture Search

1 code implementation 26 Nov 2021 Jingjing Xu, Liang Zhao, Junyang Lin, Rundong Gao, Xu sun, Hongxia Yang

Many existing neural architecture search (NAS) solutions rely on downstream training for architecture evaluation, which takes enormous computations.

image-classification Image Classification +3

A Survey on Green Deep Learning

no code implementations 8 Nov 2021 Jingjing Xu, Wangchunshu Zhou, Zhiyi Fu, Hao Zhou, Lei LI

In recent years, larger and deeper models are springing up and continuously pushing state-of-the-art (SOTA) results across various fields like natural language processing (NLP) and computer vision (CV).

Deep Learning Knowledge Distillation +2

Conformer-based Hybrid ASR System for Switchboard Dataset

no code implementations 5 Nov 2021 Mohammad Zeineldeen, Jingjing Xu, Christoph Lüscher, Wilfried Michel, Alexander Gerstenberger, Ralf Schlüter, Hermann Ney

The recently proposed conformer architecture has been successfully used for end-to-end automatic speech recognition (ASR) architectures achieving state-of-the-art performance on different datasets.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Information-theoretic Vocabularization via Optimal Transport

no code implementations 1 Jan 2021 Jingjing Xu, Hao Zhou, Chun Gan, Zaixiang Zheng, Lei LI

In this paper, we find an exciting relation between an information-theoretic feature and the performance of NLP tasks such as machine translation with a given vocabulary.

Machine Translation Translation

A Gradient-based Kernel Approach for Efficient Network Architecture Search

no code implementations 1 Jan 2021 Jingjing Xu, Liang Zhao, Junyang Lin, Xu sun, Hongxia Yang

Inspired by our new finding, we explore a simple yet effective network architecture search (NAS) approach that leverages gradient correlation and gradient values to find well-performing architectures.

image-classification Image Classification +2

Graph-based Multi-hop Reasoning for Long Text Generation

no code implementations 28 Sep 2020 Liang Zhao, Jingjing Xu, Junyang Lin, Yichang Zhang, Hongxia Yang, Xu sun

The reasoning module is responsible for searching skeleton paths from a knowledge graph to imitate the imagination process in the human writing for semantic transfer.

Review Generation Sentence +1

MUSE: Parallel Multi-Scale Attention for Sequence to Sequence Learning

3 code implementations 17 Nov 2019 Guangxiang Zhao, Xu sun, Jingjing Xu, Zhiyuan Zhang, Liangchen Luo

In this work, we explore parallel multi-scale representation learning on sequence data, striving to capture both long-range and short-range language structures.

Machine Translation Representation Learning +1

Understanding and Improving Layer Normalization

2 code implementations NeurIPS 2019 Jingjing Xu, Xu sun, Zhiyuan Zhang, Guangxiang Zhao, Junyang Lin

Unlike them, we find that the derivatives of the mean and variance are more important than forward normalization by re-centering and re-scaling backward gradients.
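For context, standard layer normalization re-centers each hidden vector by its mean and re-scales by its standard deviation; the paper's analysis concerns the backward derivatives of exactly these statistics. A minimal NumPy sketch of the standard forward pass (not the paper's modified variant):

```python
import numpy as np

def layer_norm(x, gamma, beta, eps=1e-5):
    """Standard layer normalization over the last (feature) dimension."""
    mu = x.mean(axis=-1, keepdims=True)        # per-vector mean (re-centering)
    var = x.var(axis=-1, keepdims=True)        # per-vector variance (re-scaling)
    x_hat = (x - mu) / np.sqrt(var + eps)
    return gamma * x_hat + beta                # learnable gain and bias

x = np.array([[1.0, 2.0, 3.0, 4.0]])
y = layer_norm(x, gamma=np.ones(4), beta=np.zeros(4))
```

The abstract's claim is that, for training, the gradients flowing through `mu` and `var` matter more than the forward re-centering/re-scaling itself.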

Machine Translation Translation

Specificity-Driven Cascading Approach for Unsupervised Sentiment Modification

no code implementations IJCNLP 2019 Pengcheng Yang, Junyang Lin, Jingjing Xu, Jun Xie, Qi Su, Xu sun

The task of unsupervised sentiment modification aims to reverse the sentiment polarity of the input text while preserving its semantic content without any parallel data.

Specificity

Reasoning Over Semantic-Level Graph for Fact Checking

no code implementations ACL 2020 Wanjun Zhong, Jingjing Xu, Duyu Tang, Zenan Xu, Nan Duan, Ming Zhou, Jiahai Wang, Jian Yin

We evaluate our system on FEVER, a benchmark dataset for fact checking, and find that rich structural information is helpful and both our graph-based mechanisms improve the accuracy.

Claim Verification Fact Checking +5

Coherent Comments Generation for Chinese Articles with a Graph-to-Sequence Model

1 code implementation ACL 2019 Wei Li, Jingjing Xu, Yancheng He, ShengLi Yan, Yunfang Wu, Xu sun

In this paper, we propose to generate comments with a graph-to-sequence model that models the input news as a topic interaction graph.

Articles Decoder +1

PKUSEG: A Toolkit for Multi-Domain Chinese Word Segmentation

4 code implementations 27 Jun 2019 Ruixuan Luo, Jingjing Xu, Yi Zhang, Zhiyuan Zhang, Xuancheng Ren, Xu sun

Through this method, we generate synthetic data using a large amount of unlabeled data in the target domain and then obtain a word segmentation model for the target domain.

Chinese Word Segmentation Domain Adaptation +3

Coherent Comment Generation for Chinese Articles with a Graph-to-Sequence Model

1 code implementation 4 Jun 2019 Wei Li, Jingjing Xu, Yancheng He, ShengLi Yan, Yunfang Wu, Xu sun

In this paper, we propose to generate comments with a graph-to-sequence model that models the input news as a topic interaction graph.

Articles Comment Generation +2

Learning Unsupervised Word Mapping by Maximizing Mean Discrepancy

no code implementations 1 Nov 2018 Pengcheng Yang, Fuli Luo, Shuangzhi Wu, Jingjing Xu, Dong-dong Zhang, Xu sun

In order to avoid such sophisticated alternate optimization, we propose to learn unsupervised word mapping by directly maximizing the mean discrepancy between the distribution of transferred embedding and target embedding.
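The objective here rests on the (squared) maximum mean discrepancy between two embedding distributions. A sketch of the MMD estimator with a Gaussian RBF kernel, on synthetic data (illustrative of the quantity, not the paper's training loop):

```python
import numpy as np

def mmd2(X, Y, sigma=1.0):
    """Biased estimate of squared MMD between samples X and Y under a
    Gaussian RBF kernel: E[k(x,x')] + E[k(y,y')] - 2 E[k(x,y)]."""
    def k(A, B):
        # Pairwise squared distances, then RBF kernel values.
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * sigma ** 2))
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()

rng = np.random.default_rng(0)
same = mmd2(rng.normal(size=(100, 4)), rng.normal(size=(100, 4)))       # matched distributions
shifted = mmd2(rng.normal(size=(100, 4)), rng.normal(3.0, size=(100, 4)))  # mean-shifted
```

Maximizing this quantity between transferred and target embeddings is the alternative the abstract proposes to adversarial alternate optimization.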

Cross-Lingual Word Embeddings Density Estimation +4

An Auto-Encoder Matching Model for Learning Utterance-Level Semantic Dependency in Dialogue Generation

1 code implementation EMNLP 2018 Liangchen Luo, Jingjing Xu, Junyang Lin, Qi Zeng, Xu sun

Different from conventional text generation tasks, the mapping between inputs and responses in conversations is more complicated, which highly demands the understanding of utterance-level semantic dependency, a relation between the whole meanings of inputs and outputs.

Dialogue Generation

Learning Sentiment Memories for Sentiment Modification without Parallel Data

1 code implementation EMNLP 2018 Yi Zhang, Jingjing Xu, Pengcheng Yang, Xu sun

The task of sentiment modification requires reversing the sentiment of the input and preserving the sentiment-independent content.

Text Style Transfer

A Skeleton-Based Model for Promoting Coherence Among Sentences in Narrative Story Generation

1 code implementation EMNLP 2018 Jingjing Xu, Xuancheng Ren, Yi Zhang, Qi Zeng, Xiaoyan Cai, Xu sun

Compared to the state-of-the-art models, our skeleton-based model can generate significantly more coherent text according to human evaluation and automatic evaluation.

Reinforcement Learning Sentence +1

Primal Meaning Recommendation via On-line Encyclopedia

no code implementations 14 Aug 2018 Zhiyuan Zhang, Wei Li, Jingjing Xu, Xu sun

We define the primal meaning of an expression to be a frequently used sense of that expression from which its other frequent senses can be deduced.

A Discourse-Level Named Entity Recognition and Relation Extraction Dataset for Chinese Literature Text

2 code implementations 19 Nov 2017 Jingjing Xu, Ji Wen, Xu sun, Qi Su

To build a high quality dataset, we propose two tagging methods to solve the problem of data inconsistency, including a heuristic tagging method and a machine auxiliary tagging method.

Articles named-entity-recognition +4

Training Simplification and Model Simplification for Deep Learning: A Minimal Effort Back Propagation Method

3 code implementations 17 Nov 2017 Xu Sun, Xuancheng Ren, Shuming Ma, Bingzhen Wei, Wei Li, Jingjing Xu, Houfeng Wang, Yi Zhang

Based on the sparsified gradients, we further simplify the model by eliminating the rows or columns that are seldom updated, which will reduce the computational cost both in the training and decoding, and potentially accelerate decoding in real-world applications.
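The sparsification step underlying minimal effort back propagation keeps only the top-k magnitude entries of a gradient; rows or columns whose entries rarely survive this filter are the ones later pruned. A toy sketch of the top-k step only (not the full training procedure):

```python
import numpy as np

def top_k_sparsify(grad, k):
    """Zero out all but the k largest-magnitude entries of a gradient
    vector: the core sparsification step of minimal-effort backprop."""
    idx = np.argsort(np.abs(grad))[-k:]   # indices of the k largest magnitudes
    sparse = np.zeros_like(grad)
    sparse[idx] = grad[idx]               # keep only those entries
    return sparse

g = np.array([0.05, -3.0, 0.2, 1.5, -0.01])
sg = top_k_sparsify(g, k=2)               # keeps -3.0 and 1.5, zeros the rest
```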

Deep Stacking Networks for Low-Resource Chinese Word Segmentation with Transfer Learning

no code implementations 4 Nov 2017 Jingjing Xu, Xu sun, Sujian Li, Xiaoyan Cai, Bingzhen Wei

In this paper, we propose a deep stacking framework to improve the performance on word segmentation tasks with insufficient data by integrating datasets from diverse domains.

Chinese Word Segmentation Transfer Learning

Shallow Discourse Parsing with Maximum Entropy Model

no code implementations 31 Oct 2017 Jingjing Xu

The head-based representation of the PDTB is adopted in the arguments identifier, which turns the problem of identifying the arguments of a discourse connective into finding the head and end of the arguments.

Discourse Parsing model

Improving Social Media Text Summarization by Learning Sentence Weight Distribution

no code implementations 31 Oct 2017 Jingjing Xu

Recently, encoder-decoder models are widely used in social media text summarization.

Decoder Sentence +1

Minimal Effort Back Propagation for Convolutional Neural Networks

no code implementations 18 Sep 2017 Bingzhen Wei, Xu sun, Xuancheng Ren, Jingjing Xu

As traditional neural networks consume a significant amount of computing resources during back propagation, Sun et al. (2017) propose a simple yet effective technique to alleviate this problem.

Transfer Deep Learning for Low-Resource Chinese Word Segmentation with a Novel Neural Network

no code implementations 15 Feb 2017 Jingjing Xu, Xu sun

First, we train a teacher model on high-resource corpora and then use the learned knowledge to initialize a student model.

Chinese Word Segmentation Segmentation +1
