Nested Named Entity Recognition with Span-level Graphs

no code implementations ACL 2022 Juncheng Wan, Dongyu Ru, Weinan Zhang, Yong Yu

In this work, we try to improve the span representation by utilizing retrieval-based span-level graphs, connecting spans and entities in the training data based on n-gram features.

NER Nested Named Entity Recognition

Multi-Level Interaction Reranking with User Behavior History

1 code implementation 20 Apr 2022 Yunjia Xi, Weiwen Liu, Jieming Zhu, Xilong Zhao, Xinyi Dai, Ruiming Tang, Weinan Zhang, Rui Zhang, Yong Yu

MIR combines low-level cross-item interaction and high-level set-to-list interaction, where we view the candidate items to be reranked as a set and the users' behavior history in chronological order as a list.

Recommendation Systems

PerfectDou: Dominating DouDizhu with Perfect Information Distillation

no code implementations 30 Mar 2022 Guan Yang, Minghuan Liu, Weijun Hong, Weinan Zhang, Fei Fang, Guangjun Zeng, Yue Lin

To this end, we characterize card and game features for DouDizhu to represent the perfect and imperfect information.

Model-based Multi-agent Reinforcement Learning: Recent Progress and Prospects

no code implementations 20 Mar 2022 Xihuai Wang, Zhicheng Zhang, Weinan Zhang

Significant advances have recently been achieved in Multi-Agent Reinforcement Learning (MARL) which tackles sequential decision-making problems involving multiple participants.

Decision Making Multi-agent Reinforcement Learning +1

Plan Your Target and Learn Your Skills: Transferable State-Only Imitation Learning via Decoupled Policy Optimization

no code implementations 4 Mar 2022 Minghuan Liu, Zhengbang Zhu, Yuzheng Zhuang, Weinan Zhang, Jianye Hao, Yong Yu, Jun Wang

Recent progress in state-only imitation learning extends the scope of applicability of imitation learning to real-world settings by relieving the need for observing expert actions.

Imitation Learning Transfer Learning

Multi-View Graph Representation for Programming Language Processing: An Investigation into Algorithm Detection

1 code implementation 25 Feb 2022 Ting Long, Yutong Xie, Xianyu Chen, Weinan Zhang, Qinxiang Cao, Yong Yu

We thoroughly evaluate our proposed MVG approach in the context of algorithm detection, an important and challenging subfield of PLP.

Neural Re-ranking in Multi-stage Recommender Systems: A Review

1 code implementation 14 Feb 2022 Weiwen Liu, Yunjia Xi, Jiarui Qin, Fei Sun, Bo Chen, Weinan Zhang, Rui Zhang, Ruiming Tang

As the final stage of the multi-stage recommender system (MRS), re-ranking directly affects user experience and satisfaction by rearranging the input ranking lists, and thereby plays a critical role in MRS. With the advances in deep learning, neural re-ranking has become a trending topic and been widely applied in industrial applications.

Recommendation Systems Re-Ranking

Who to Watch Next: Two-side Interactive Networks for Live Broadcast Recommendation

no code implementations 9 Feb 2022 Jiarui Jin, Xianyu Chen, Yuanbo Chen, Weinan Zhang, Renting Rui, Zaifan Jiang, Zhewen Su, Yong Yu

With the prevalence of live broadcast business nowadays, a new type of recommendation service, called live broadcast recommendation, is widely used in many mobile e-commerce Apps.

Learn over Past, Evolve for Future: Search-based Time-aware Recommendation with Sequential Behavior Data

no code implementations 7 Feb 2022 Jiarui Jin, Xianyu Chen, Weinan Zhang, JunJie Huang, Ziming Feng, Yong Yu

More concretely, we first design a search-based module to retrieve a user's relevant historical behaviors, which are then mixed up with her recent records to be fed into a time-aware sequential network for capturing her time-sensitive demands.

Click-Through Rate Prediction

Efficient Policy Space Response Oracles

no code implementations 28 Jan 2022 Ming Zhou, Jingxiao Chen, Ying Wen, Weinan Zhang, Yaodong Yang, Yong Yu

The Policy Space Response Oracle (PSRO) method provides a general solution to Nash equilibrium in two-player zero-sum games but suffers from two problems: (1) computational inefficiency due to consistently evaluating current populations by simulations; and (2) exploration inefficiency due to learning best responses against a fixed meta-strategy at each iteration.

Efficient Exploration

Generative Adversarial Exploration for Reinforcement Learning

no code implementations 27 Jan 2022 Weijun Hong, Menghui Zhu, Minghuan Liu, Weinan Zhang, Ming Zhou, Yong Yu, Peng Sun

Exploration is crucial for training the optimal reinforcement learning (RL) policy, where the key is to discriminate whether a visited state is novel.

Montezuma's Revenge reinforcement-learning

Goal-Conditioned Reinforcement Learning: Problems and Solutions

1 code implementation 20 Jan 2022 Minghuan Liu, Menghui Zhu, Weinan Zhang

Goal-conditioned reinforcement learning (GCRL), related to a set of complex RL problems, trains an agent to achieve different goals under particular scenarios.


Phrase-level Adversarial Example Generation for Neural Machine Translation

no code implementations 6 Jan 2022 Juncheng Wan, Jian Yang, Shuming Ma, Dongdong Zhang, Weinan Zhang, Yong Yu, Furu Wei

In this paper, we propose a phrase-level adversarial example generation (PAEG) method to enhance the robustness of the model.

Machine Translation Translation

Offline Pre-trained Multi-Agent Decision Transformer: One Big Sequence Model Tackles All SMAC Tasks

1 code implementation 6 Dec 2021 Linghui Meng, Muning Wen, Yaodong Yang, Chenyang Le, Xiyun Li, Weinan Zhang, Ying Wen, Haifeng Zhang, Jun Wang, Bo Xu

In this paper, we facilitate the research by providing large-scale datasets, and use them to examine the usage of the Decision Transformer in the context of MARL.

Offline RL reinforcement-learning +3

Curriculum Offline Imitating Learning

no code implementations NeurIPS 2021 Minghuan Liu, Hanye Zhao, Zhengyu Yang, Jian Shen, Weinan Zhang, Li Zhao, Tie-Yan Liu

However, IL is usually limited by the capability of the behavioral policy and tends to learn a mediocre behavior from a dataset collected by a mixture of policies.

Continuous Control Imitation Learning +1

Towards Return Parity in Markov Decision Processes

1 code implementation 19 Nov 2021 Jianfeng Chi, Jian Shen, Xinyi Dai, Weinan Zhang, Yuan Tian, Han Zhao

We first provide a decomposition theorem for return disparity, which decomposes the return disparity of any two MDPs sharing the same state and action spaces into the distance between group-wise reward functions, the discrepancy of group policies, and the discrepancy between state visitation distributions induced by the group policies.

Fairness Recommendation Systems

QA4PRF: A Question Answering based Framework for Pseudo Relevance Feedback

no code implementations 16 Nov 2021 Handong Ma, Jiawei Hou, Chenxu Zhu, Weinan Zhang, Ruiming Tang, Jincai Lai, Jieming Zhu, Xiuqiang He, Yong Yu

Pseudo relevance feedback (PRF) automatically performs query expansion based on top-retrieved documents to better represent the user's information need so as to improve the search results.

Question Answering Semantic Similarity +1

AIM: Automatic Interaction Machine for Click-Through Rate Prediction

1 code implementation 5 Nov 2021 Chenxu Zhu, Bo Chen, Weinan Zhang, Jincai Lai, Ruiming Tang, Xiuqiang He, Zhenguo Li, Yong Yu

To address these three issues mentioned above, we propose Automatic Interaction Machine (AIM) with three core components, namely, Feature Interaction Search (FIS), Interaction Function Search (IFS) and Embedding Dimension Search (EDS), to select significant feature interactions, appropriate interaction functions and necessary embedding dimensions automatically in a unified framework.

Click-Through Rate Prediction

Curriculum Offline Imitation Learning

1 code implementation 3 Nov 2021 Minghuan Liu, Hanye Zhao, Zhengyu Yang, Jian Shen, Weinan Zhang, Li Zhao, Tie-Yan Liu

However, IL is usually limited by the capability of the behavioral policy and tends to learn a mediocre behavior from a dataset collected by a mixture of policies.

Continuous Control Imitation Learning +1

Context-aware Reranking with Utility Maximization for Recommendation

no code implementations 18 Oct 2021 Yunjia Xi, Weiwen Liu, Xinyi Dai, Ruiming Tang, Weinan Zhang, Qing Liu, Xiuqiang He, Yong Yu

As a critical task for large-scale commercial recommender systems, reranking has shown the potential of improving recommendation results by uncovering mutual influence among items.

Graph Attention Recommendation Systems

Why Propagate Alone? Parallel Use of Labels and Features on Graphs

no code implementations ICLR 2022 Yangkun Wang, Jiarui Jin, Weinan Zhang, Yongyi Yang, Jiuhai Chen, Quan Gan, Yong Yu, Zheng Zhang, Zengfeng Huang, David Wipf

In this regard, it has recently been proposed to use a randomly-selected portion of the training labels as GNN inputs, concatenated with the original node features for making predictions on the remaining labels.

Node Property Prediction

Inductive Relation Prediction Using Analogy Subgraph Embeddings

no code implementations ICLR 2022 Jiarui Jin, Yangkun Wang, Kounianhua Du, Weinan Zhang, Zheng Zhang, David Wipf, Yong Yu, Quan Gan

Prevailing methods for relation prediction in heterogeneous graphs aim at learning latent representations (i.e., embeddings) of observed nodes and relations, and thus are limited to the transductive setting where the relation types must be known during training.

Inductive Relation Prediction

Graph-Enhanced Exploration for Goal-oriented Reinforcement Learning

no code implementations ICLR 2022 Jiarui Jin, Sijin Zhou, Weinan Zhang, Tong He, Yong Yu, Rasool Fakoor

Goal-oriented Reinforcement Learning (GoRL) is a promising approach for scaling up RL techniques on sparse reward environments requiring long horizon planning.

Continuous Control graph construction +1

AARL: Automated Auxiliary Loss for Reinforcement Learning

no code implementations 29 Sep 2021 Tairan He, Yuge Zhang, Kan Ren, Che Wang, Weinan Zhang, Dongsheng Li, Yuqing Yang

A good state representation is crucial to reinforcement learning (RL), while an ideal representation is hard to learn with signals from the RL objective alone.


Deep Ensemble Policy Learning

no code implementations 29 Sep 2021 Zhengyu Yang, Kan Ren, Xufang Luo, Weiqing Liu, Jiang Bian, Weinan Zhang, Dongsheng Li

Ensemble learning, which can consistently improve the prediction performance in supervised learning, has drawn increasing attention in reinforcement learning (RL).

Ensemble Learning

Task-wise Split Gradient Boosting Trees for Multi-center Diabetes Prediction

no code implementations 16 Aug 2021 Mingcheng Chen, Zhenghui Wang, Zhiyun Zhao, Weinan Zhang, Xiawei Guo, Jian Shen, Yanru Qu, Jieli Lu, Min Xu, Yu Xu, Tiange Wang, Mian Li, Wei-Wei Tu, Yong Yu, Yufang Bi, Weiqing Wang, Guang Ning

To tackle the above challenges, we employ gradient boosting decision trees (GBDT) to handle data heterogeneity and introduce multi-task learning (MTL) to solve data insufficiency.

Diabetes Prediction Multi-Task Learning

Retrieval & Interaction Machine for Tabular Data Prediction

1 code implementation 11 Aug 2021 Jiarui Qin, Weinan Zhang, Rong Su, Zhirong Liu, Weiwen Liu, Ruiming Tang, Xiuqiang He, Yong Yu

Prediction over tabular data is an essential task in many data science applications such as recommender systems, online advertising, medical treatment, etc.

Click-Through Rate Prediction Recommendation Systems

MALib: A Parallel Framework for Population-based Multi-agent Reinforcement Learning

1 code implementation 5 Jun 2021 Ming Zhou, Ziyu Wan, Hanjing Wang, Muning Wen, Runzhe Wu, Ying Wen, Yaodong Yang, Weinan Zhang, Jun Wang

Our framework is comprised of three key components: (1) a centralized task dispatching model, which supports the self-generated tasks and scalable training with heterogeneous policy combinations; (2) a programming architecture named Actor-Evaluator-Learner, which achieves high parallelism for both training and sampling, and meets the evaluation requirement of auto-curriculum learning; (3) a higher-level abstraction of MARL training paradigms, which enables efficient code reuse and flexible deployments on different distributed computing paradigms.

Atari Games Distributed Computing +2

Learning to Select Cuts for Efficient Mixed-Integer Programming

no code implementations 28 May 2021 Zeren Huang, Kerong Wang, Furui Liu, Hui-Ling Zhen, Weinan Zhang, Mingxuan Yuan, Jianye Hao, Yong Yu, Jun Wang

In online A/B testing on product planning problems with more than $10^7$ variables and constraints daily, Cut Ranking has achieved an average speedup ratio of 12.42% over the production solver without any loss in solution accuracy.

Multiple Instance Learning

MapGo: Model-Assisted Policy Optimization for Goal-Oriented Tasks

1 code implementation 13 May 2021 Menghui Zhu, Minghuan Liu, Jian Shen, Zhicheng Zhang, Sheng Chen, Weinan Zhang, Deheng Ye, Yong Yu, Qiang Fu, Wei Yang

In Goal-oriented Reinforcement learning, relabeling the raw goals in past experience to provide agents with hindsight ability is a major solution to the reward sparsity problem.


Model-based Multi-agent Policy Optimization with Adaptive Opponent-wise Rollouts

1 code implementation 7 May 2021 Weinan Zhang, Xihuai Wang, Jian Shen, Ming Zhou

We specify the dynamics sample complexity and the opponent sample complexity in MARL, and conduct a theoretical analysis of the upper bound on return discrepancy.

Multi-agent Reinforcement Learning reinforcement-learning

Deep Learning for Click-Through Rate Estimation

no code implementations 21 Apr 2021 Weinan Zhang, Jiarui Qin, Wei Guo, Ruiming Tang, Xiuqiang He

In this survey, we provide a comprehensive review of deep learning models for CTR estimation tasks.

Recommendation Systems

An Adversarial Imitation Click Model for Information Retrieval

1 code implementation 13 Apr 2021 Xinyi Dai, Jianghao Lin, Weinan Zhang, Shuai Li, Weiwen Liu, Ruiming Tang, Xiuqiang He, Jianye Hao, Jun Wang, Yong Yu

Modern information retrieval systems, including web search, ads placement, and recommender systems, typically rely on learning from user feedback.

Imitation Learning Information Retrieval +1

Bag of Tricks for Node Classification with Graph Neural Networks

1 code implementation 24 Mar 2021 Yangkun Wang, Jiarui Jin, Weinan Zhang, Yong Yu, Zheng Zhang, David Wipf

Over the past few years, graph neural networks (GNN) and label propagation-based methods have made significant progress in addressing node classification tasks on graphs.

Classification General Classification +2

NeoRL: A Near Real-World Benchmark for Offline Reinforcement Learning

3 code implementations 1 Feb 2021 Rongjun Qin, Songyi Gao, Xingyuan Zhang, Zhen Xu, Shengkai Huang, Zewen Li, Weinan Zhang, Yang Yu

We evaluate existing offline RL algorithms on NeoRL and argue that the performance of a policy should also be compared with the deterministic version of the behavior policy, instead of the dataset reward.

Offline RL reinforcement-learning

Universal Trading for Order Execution with Oracle Policy Distillation

no code implementations 28 Jan 2021 Yuchen Fang, Kan Ren, Weiqing Liu, Dong Zhou, Weinan Zhang, Jiang Bian, Yong Yu, Tie-Yan Liu

As a fundamental problem in algorithmic trading, order execution aims at fulfilling a specific trading order, either liquidation or acquirement, for a given instrument.

Algorithmic Trading reinforcement-learning

Regioned Episodic Reinforcement Learning

no code implementations 1 Jan 2021 Jiarui Jin, Cong Chen, Ming Zhou, Weinan Zhang, Rasool Fakoor, David Wipf, Yong Yu, Jun Wang, Alex Smola

Goal-oriented reinforcement learning algorithms are often good at exploration, not exploitation, while episodic algorithms excel at exploitation, not exploration.


Explore with Dynamic Map: Graph Structured Reinforcement Learning

no code implementations 1 Jan 2021 Jiarui Jin, Sijin Zhou, Weinan Zhang, Rasool Fakoor, David Wipf, Tong He, Yong Yu, Zheng Zhang, Alex Smola

In reinforcement learning, a map with states and transitions built based on historical trajectories is often helpful in exploration and exploitation.


Non-iterative Parallel Text Generation via Glancing Transformer

no code implementations 1 Jan 2021 Lihua Qian, Hao Zhou, Yu Bao, Mingxuan Wang, Lin Qiu, Weinan Zhang, Yong Yu, Lei LI

Although non-autoregressive models with one-iteration generation achieve remarkable inference speed-up, they still fall behind their autoregressive counterparts in prediction accuracy.

Language Modelling Text Generation

Which Heroes to Pick? Learning to Draft in MOBA Games with Neural Networks and Tree Search

no code implementations 18 Dec 2020 Sheng Chen, Menghui Zhu, Deheng Ye, Weinan Zhang, Qiang Fu, Wei Yang

Hero drafting is essential in MOBA game playing as it builds the team of each side and directly affects the match outcome.

An Embedding Learning Framework for Numerical Features in CTR Prediction

1 code implementation 16 Dec 2020 Huifeng Guo, Bo Chen, Ruiming Tang, Weinan Zhang, Zhenguo Li, Xiuqiang He

In this paper, we propose a novel embedding learning framework for numerical features in CTR prediction (AutoDis) with high model capacity, end-to-end training and unique representation properties preserved.

Click-Through Rate Prediction Feature Engineering +1

Fork or Fail: Cycle-Consistent Training with Many-to-One Mappings

1 code implementation 14 Dec 2020 Qipeng Guo, Zhijing Jin, Ziyu Wang, Xipeng Qiu, Weinan Zhang, Jun Zhu, Zheng Zhang, David Wipf

Cycle-consistent training is widely used for jointly learning a forward and inverse mapping between two domains of interest without the cumbersome requirement of collecting matched pairs within each domain.

Knowledge Graphs Text Generation

Towards Generalized Implementation of Wasserstein Distance in GANs

1 code implementation 7 Dec 2020 Minkai Xu, Zhiming Zhou, Guansong Lu, Jian Tang, Weinan Zhang, Yong Yu

Wasserstein GANs (WGANs), built upon the Kantorovich-Rubinstein (KR) duality of Wasserstein distance, are among the most theoretically sound GAN models.

Reciprocal Supervised Learning Improves Neural Machine Translation

1 code implementation 5 Dec 2020 Minkai Xu, Mingxuan Wang, Zhouhan Lin, Hao Zhou, Weinan Zhang, Lei LI

Despite its recent success in image classification, self-training has only achieved limited gains on structured prediction tasks such as neural machine translation (NMT).

Knowledge Distillation Machine Translation +2

U-rank: Utility-oriented Learning to Rank with Implicit Feedback

no code implementations 1 Nov 2020 Xinyi Dai, Jiawei Hou, Qing Liu, Yunjia Xi, Ruiming Tang, Weinan Zhang, Xiuqiang He, Jun Wang, Yong Yu

To this end, we propose a novel ranking framework called U-rank that directly optimizes the expected utility of the ranking list.

Click-Through Rate Prediction Learning-To-Rank +1

Efficient Projection-Free Algorithms for Saddle Point Problems

no code implementations NeurIPS 2020 Cheng Chen, Luo Luo, Weinan Zhang, Yong Yu

The Frank-Wolfe algorithm is a classic method for constrained optimization problems.

Model-based Policy Optimization with Unsupervised Model Adaptation

1 code implementation NeurIPS 2020 Jian Shen, Han Zhao, Weinan Zhang, Yong Yu

However, due to the potential distribution mismatch between simulated data and real data, this could lead to degraded performance.

Continuous Control Model-based Reinforcement Learning +1

Feature-Based Matrix Factorization

no code implementations 11 Sep 2011 Tianqi Chen, Zhao Zheng, Qiuxia Lu, Weinan Zhang, Yong Yu

Recommender systems have recently become increasingly popular and are widely used in many applications.

Recommendation Systems

