In this work, we try to improve the span representation by utilizing retrieval-based span-level graphs, connecting spans and entities in the training data based on n-gram features.
MIR combines low-level cross-item interaction and high-level set-to-list interaction, where we view the candidate items to be reranked as a set and the users' behavior history in chronological order as a list.
Recent progress in state-only imitation learning extends the scope of applicability of imitation learning to real-world settings by relieving the need for observing expert actions.
We thoroughly evaluate our proposed MVG approach in the context of algorithm detection, an important and challenging subfield of PLP.
With the prevalence of live broadcast business nowadays, a new type of recommendation service, called live broadcast recommendation, is widely used in many mobile e-commerce Apps.
More concretely, we first design a search-based module to retrieve a user's relevant historical behaviors, which are then mixed up with her recent records to be fed into a time-aware sequential network for capturing her time-sensitive demands.
Policy Space Response Oracle method (PSRO) provides a general solution to Nash equilibrium in two-player zero-sum games but suffers from two problems: (1) the computation inefficiency due to consistently evaluating current populations by simulations; and (2) the exploration inefficiency due to learning best responses against a fixed meta-strategy at each iteration.
Exploration is crucial for training the optimal reinforcement learning (RL) policy, where the key is to determine whether a visited state is novel.
Neural architecture search (NAS) has shown encouraging results in automating the architecture design.
In this paper, we propose a phrase-level adversarial example generation (PAEG) method to enhance the robustness of the model.
Pseudo relevance feedback (PRF) automatically performs query expansion based on top-retrieved documents to better represent the user's information need so as to improve the search results.
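The general PRF idea described above can be sketched as follows. This is a minimal illustration, not the method of the paper: the function name `expand_query`, the whitespace tokenizer, and the frequency-based term selection are all assumptions made for the example.

```python
from collections import Counter


def expand_query(query_terms, ranked_docs, top_k=2, n_expansion=2):
    """Append the most frequent new terms from the top_k retrieved documents.

    Hypothetical helper: real PRF models usually weight terms with a
    relevance model rather than raw counts.
    """
    counts = Counter()
    for doc in ranked_docs[:top_k]:
        # Count only terms that are not already part of the query.
        counts.update(t for t in doc.lower().split() if t not in query_terms)
    # Sort by descending frequency, then alphabetically so ties are deterministic.
    ranked_terms = sorted(counts, key=lambda t: (-counts[t], t))
    return list(query_terms) + ranked_terms[:n_expansion]


# Documents are assumed to be already ranked by a first-pass retriever.
docs = [
    "jaguar car engine speed",
    "jaguar car dealer price",
    "jaguar cat habitat rainforest",
]
expanded = expand_query(["jaguar"], docs)
print(expanded)
```

Only the top-ranked documents contribute expansion terms, so the third document, which is about a different sense of the query, does not influence the result here.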
Model-based reinforcement learning has attracted wide attention due to its superior sample efficiency.
LogiRE treats logic rules as latent variables and consists of two modules: a rule generator and a relation extractor.
Ranked #18 on Relation Extraction on DocRED
To address these three issues mentioned above, we propose Automatic Interaction Machine (AIM) with three core components, namely, Feature Interaction Search (FIS), Interaction Function Search (IFS) and Embedding Dimension Search (EDS), to select significant feature interactions, appropriate interaction functions and necessary embedding dimensions automatically in a unified framework.
As a critical task for large-scale commercial recommender systems, reranking has shown the potential of improving recommendation results by uncovering mutual influence among items.
In this regard, it has recently been proposed to use a randomly-selected portion of the training labels as GNN inputs, concatenated with the original node features for making predictions on the remaining labels.
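A minimal sketch of this label-as-input idea, assuming one-hot label encoding and plain Python lists in place of tensors. The helper name `build_label_augmented_inputs` and the `input_ratio` parameter are hypothetical and chosen only for illustration.

```python
import random


def build_label_augmented_inputs(features, labels, train_idx, num_classes,
                                 input_ratio=0.5, seed=0):
    """Concatenate one-hot labels of a random subset of training nodes to the features.

    Returns the augmented features and the remaining training nodes, whose
    labels stay hidden from the input and serve as prediction targets.
    """
    rng = random.Random(seed)
    shuffled = list(train_idx)
    rng.shuffle(shuffled)
    n_input = int(len(shuffled) * input_ratio)
    input_nodes = set(shuffled[:n_input])     # labels revealed as input
    target_nodes = sorted(shuffled[n_input:])  # labels to be predicted

    augmented = []
    for node, feat in enumerate(features):
        one_hot = [0.0] * num_classes
        if node in input_nodes:
            one_hot[labels[node]] = 1.0
        # Nodes without a revealed label get an all-zero label block.
        augmented.append(list(feat) + one_hot)
    return augmented, target_nodes


features = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]
labels = [0, 1, 1, 0]
augmented, target_nodes = build_label_augmented_inputs(
    features, labels, train_idx=[0, 1, 2, 3], num_classes=2
)
```

Resampling the split at every training step (for example by varying `seed`) would let each training label act as an input in some steps and as a target in others.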
State-only imitation learning (SOIL) enables agents to learn from massive demonstrations without explicit action or reward information.
Goal-oriented Reinforcement Learning (GoRL) is a promising approach for scaling up RL techniques on sparse reward environments requiring long horizon planning.
Prevailing methods for relation prediction in heterogeneous graphs aim at learning latent representations (i.e., embeddings) of observed nodes and relations, and thus are limited to the transductive setting where the relation types must be known during training.
no code implementations • 16 Aug 2021 • Mingcheng Chen, Zhenghui Wang, Zhiyun Zhao, Weinan Zhang, Xiawei Guo, Jian Shen, Yanru Qu, Jieli Lu, Min Xu, Yu Xu, Tiange Wang, Mian Li, Wei-Wei Tu, Yong Yu, Yufang Bi, Weiqing Wang, Guang Ning
To tackle the above challenges, we employ gradient boosting decision trees (GBDT) to handle data heterogeneity and introduce multi-task learning (MTL) to solve data insufficiency.
Prediction over tabular data is an essential task in many data science applications such as recommender systems, online advertising, medical treatment, etc.
To better exploit search logs and model users' behavior patterns, numerous click models are proposed to extract users' implicit interaction feedback.
In online A/B testing on product planning problems with more than $10^7$ variables and constraints daily, Cut Ranking has achieved an average speedup ratio of 12.42% over the production solver without any loss of solution accuracy.
In Goal-oriented Reinforcement learning, relabeling the raw goals in past experience to provide agents with hindsight ability is a major solution to the reward sparsity problem.
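Goal relabeling can be illustrated with a short sketch. It assumes the simplest strategy, replacing the raw goal with the final state actually reached, and a sparse reward of 0 on reaching the goal and -1 otherwise. The tuple layout and the function name `relabel_with_final_state` are assumptions for the example, not taken from the paper.

```python
def relabel_with_final_state(trajectory):
    """Relabel every transition with the state reached at the end of the episode.

    trajectory: list of (state, action, next_state, goal) tuples.
    Returns (state, action, next_state, new_goal, reward) tuples.
    """
    new_goal = trajectory[-1][2]  # final achieved state becomes the goal
    relabeled = []
    for state, action, next_state, _raw_goal in trajectory:
        # Sparse reward recomputed against the relabeled goal.
        reward = 0.0 if next_state == new_goal else -1.0
        relabeled.append((state, action, next_state, new_goal, reward))
    return relabeled


# The agent aimed for state 5 but only reached state 3.
trajectory = [(0, "right", 1, 5), (1, "right", 2, 5), (2, "right", 3, 5)]
relabeled = relabel_with_final_state(trajectory)
```

Under the raw goal every transition in this episode would carry a reward of -1; after relabeling, the last transition becomes a successful example.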
Modern information retrieval systems, including web search, ads placement, and recommender systems, typically rely on learning from user feedback.
Over the past few years, graph neural networks (GNN) and label propagation-based methods have made significant progress in addressing node classification tasks on graphs.
Ranked #1 on Node Property Prediction on ogbn-proteins
Searching for novel molecules with desired chemical properties is crucial in drug discovery.
As a fundamental problem in algorithmic trading, order execution aims at fulfilling a specific trading order, either liquidation or acquirement, for a given instrument.
In reinforcement learning, a map with states and transitions built based on historical trajectories is often helpful in exploration and exploitation.
Goal-oriented reinforcement learning algorithms are often good at exploration, not exploitation, while episodic algorithms excel at exploitation, not exploration.
Although non-autoregressive models with one-iteration generation achieve remarkable inference speed-up, they still fall behind their autoregressive counterparts in prediction accuracy.
Knowledge tracing (KT) defines the task of predicting whether students can correctly answer questions based on their historical response.
Ranked #3 on Knowledge Tracing on EdNet
Wasserstein GANs (WGANs), built upon the Kantorovich-Rubinstein (KR) duality of Wasserstein distance, are among the most theoretically sound GAN models.
Heterogeneous information network (HIN) has been widely used to characterize entities of various types and their complex relations.
To this end, we propose a novel ranking framework called U-rank that directly optimizes the expected utility of the ranking list.
However, due to the potential distribution mismatch between simulated data and real data, this could lead to degraded performance.
Although many research works and projects turn to this direction for energy saving, the application into the optimization problem remains a challenging task.
The heavy traffic congestion problem has always been a concern for modern cities.
With the rapid development in online education, knowledge tracing (KT) has become a fundamental problem which traces students' knowledge status and predicts their performance on new questions.
Ranked #8 on Knowledge Tracing on EdNet
With GLM, we develop Glancing Transformer (GLAT) for machine translation.
Model-based reinforcement learning approaches leverage a forward dynamics model to support planning and decision making, which, however, may fail catastrophically if the model is inaccurate.
To the best of our knowledge, this is the first work providing an efficient neighborhood-based interaction model in the HIN-based recommendations.
Interactive recommender system (IRS) has drawn huge attention because of its flexible recommendation strategy and the consideration of optimal long-term user experiences.
These retrieved behaviors are then fed into a deep model to make the final prediction instead of simply using the most recent ones.
Position bias is a critical problem in information retrieval when dealing with implicit yet biased user feedback data.
We propose adversarial uncertainty sampling in discrete space (AUSDS) to retrieve informative unlabeled samples more efficiently.
In this paper, motivated by the inherent connections between neural joint source-channel coding and discrete representation learning, we propose a novel regularization method called Infomax Adversarial-Bit-Flip (IABF) to improve the stability and robustness of the neural joint source-channel coding scheme.
By implementing a regularized optimizer over the architecture parameters, the model can automatically identify and remove the redundant feature interactions during the training process of the model.
Ranked #15 on Click-Through Rate Prediction on Criteo
Recent advances in large-scale optimal transport have greatly extended its application scenarios in machine learning.
In this paper, we cast the multi-agent interactions modeling problem into a multi-agent imitation learning framework with explicit modeling of correlated policies by approximating opponents' policies, which can recover agents' policies that can regenerate similar interactions.
Domain adaptation aims to leverage the supervision signal of a source domain to obtain an accurate model for a target domain, where labels are not available.
Various sequential recommendation methods are proposed to model the dynamic user behaviors.
Paraphrasing plays an important role in various natural language processing (NLP) tasks, such as question answering, information retrieval and sentence simplification.
Improving the efficiency of dispatching orders to vehicles is a research hotspot in online ride-hailing systems.
Although existing works formulate this problem into a centralized learning with decentralized execution framework, which avoids the non-stationary problem in training, their decentralized execution paradigm limits the agents' capability to coordinate.
GPT-2 and BERT demonstrate the effectiveness of using pre-trained language models (LMs) on various natural language processing tasks.
Predicting urban traffic is of great importance to intelligent transportation systems and public safety, yet is very challenging because of two aspects: 1) complex spatio-temporal correlations of urban traffic, including spatial correlations between locations along with temporal correlations among timestamps; 2) diversity of such spatio-temporal correlations, which vary from location to location and depend on the surrounding geographical information, e.g., points of interest and road networks.
Knowledge base is one of the main forms to represent information in a structured way.
However, many difficult questions require multiple pieces of supporting evidence from text scattered across two or more documents.
Ranked #34 on Question Answering on HotpotQA
The most commonly used open-source traffic simulator SUMO is, however, not scalable to large road networks and heavy traffic flows, which hinders the study of reinforcement learning on traffic scenarios.
The problem is formulated as forecasting the probability distribution of the market price for each ad auction.
In order to tackle these challenges, in this paper, we propose a Hierarchical Periodic Memory Network for lifelong sequential modeling with personalized memorization of sequential patterns for each user.
In this paper we propose a hybrid architecture of actor-critic algorithms for reinforcement learning in parameterized action space, which consists of multiple parallel sub-actor networks to decompose the structured action space into simpler action spaces along with a critic network to guide the training of all sub-actor networks.
By contrast, Wasserstein GAN (WGAN), where the discriminative function is restricted to 1-Lipschitz, does not suffer from such a gradient uninformativeness problem.
CycleGAN is capable of learning a one-to-one mapping between two data distributions without paired examples, achieving the task of unsupervised data translation.
Reinforcement learning (RL) has recently been introduced to interactive recommender systems (IRS) because of its nature of learning from dynamic interactions and planning for long-run performance.
With the rapid growth of the express industry, intelligent warehouses that employ autonomous robots for carrying parcels have been widely used to handle the vast express volume.
In this paper, we propose a general framework (HyperST-Net) based on hypernetworks for deep ST models.
TGE-PS uses Pairs Sampling (PS) to improve the sampling strategy of RW, being able to reduce ~99% training samples while preserving competitive performance.
By capturing the time dependency through modeling the conditional probability of the event for each sample, our method predicts the likelihood of the true event occurrence and estimates the survival rate over time, i.e., the probability of the non-occurrence of the event, for the censored data.
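The link between per-step conditional probabilities and the survival rate can be sketched as follows, assuming discrete time steps and that `hazards[t]` is the probability of the event at step t given that it has not occurred earlier. The function name and the example values are illustrative assumptions.

```python
def survival_curve(hazards):
    """Survival rate after each step: the running product of (1 - hazard)."""
    survival = []
    alive = 1.0  # probability that the event has not occurred so far
    for h in hazards:
        alive *= 1.0 - h
        survival.append(alive)
    return survival


# Hypothetical conditional event probabilities for three time steps.
curve = survival_curve([0.1, 0.2, 0.5])
print(curve)
```

For a censored sample, only the survival rate up to the censoring time is observed, which is why the survival rate rather than the event likelihood is the quantity estimated in that case.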
To achieve this, we utilize sequence-to-sequence prediction for user clicks, and combine both post-view and post-click attribution patterns together for the final conversion estimation.
In this paper, we investigate the underlying factor that leads to failure and success in the training of GANs.
User response prediction is a crucial component for personalized information retrieval and filtering scenarios, such as recommender system and web search.
We introduce a new function-preserving transformation for efficient neural architecture search.
We study the problem of named entity recognition (NER) from electronic medical records, which is one of the most fundamental and critical problems for medical text mining.
Information Extraction (IE) refers to automatically extracting structured relation tuples from unstructured texts.
From the learning perspective, we show that the bidding machine can be updated smoothly with either offline periodic batch or online sequential training schemes.
This paper addresses the problem of unsupervised domain adaptation on the task of pedestrian detection in crowded scenes.
We introduce Texygen, a benchmarking platform to support research on open-domain text generation models.
Unlike previous research platforms on single or multi-agent reinforcement learning, MAgent focuses on supporting the tasks and the applications that require hundreds to millions of agents.
Recently, supervised hashing methods have attracted much attention since they can optimize retrieval speed and storage cost while preserving semantic information.
Automatically generating coherent and semantically meaningful text has many applications in machine translation, dialogue systems, image captioning, etc.
Ranked #1 on Text Generation on COCO Captions
We conduct an empirical study on discovering the ordered collective dynamics obtained by a population of intelligent agents, driven by million-agent reinforcement learning.
Techniques for automatically designing deep neural network architectures such as reinforcement learning based approaches have recently shown promising results.
Ranked #118 on Image Classification on CIFAR-10
Inspired by Wasserstein GAN, in this paper we propose a novel approach to learn domain invariant feature representations, namely Wasserstein Distance Guided Representation Learning (WDGRL).
In typical reinforcement learning (RL), the environment is assumed given and the goal of the learning is to identify an optimal policy for the agent taking actions through its interactions with the environment.
In this paper, we formulate the bid decision process as a reinforcement learning problem, where the state space is represented by the auction information and the campaign's real-time parameters, while an action is the bid price to set.
Predicting user responses, such as clicks and conversions, is of great importance and has found its usage in many Web applications including recommender systems, web search and online advertising.
Ranked #1 on Click-Through Rate Prediction on iPinYou
As a new way of training generative models, Generative Adversarial Nets (GAN), which use a discriminative model to guide the training of the generative model, have enjoyed considerable success in generating real-valued data.
Ranked #3 on Text Generation on Chinese Poems
We present a question answering system over DBpedia, filling the gap between user information needs expressed in natural language and a structured query interface expressed in SPARQL over the underlying knowledge base (KB).
Unlike previous approaches, our approach models the clothing attributes as latent variables and thus requires no explicit labeling for the clothing attributes.
In this paper, we tackle this challenge with a novel parallel and efficient algorithm for feature-based matrix factorization.
In this paper, we utilize structured learning to simultaneously address two intertwined problems: human pose estimation (HPE) and garment attribute classification (GAC), which are valuable for a variety of computer vision and multimedia applications.