Open-domain dialogue generation suffers from the data insufficiency problem due to the vast size of potential responses.
We first evaluate CLIP's zero-shot performance on a typical visual question answering task and demonstrate a zero-shot cross-modality transfer capability of CLIP on the visual entailment task.
Achieving human-level performance on some of the Machine Reading Comprehension (MRC) datasets is no longer challenging with the help of powerful Pre-trained Language Models (PLMs).
Generating open-domain conversational responses in the desired style usually suffers from the lack of parallel data in the style.
Maintaining consistent personas is essential for dialogue agents.
Unstructured documents serving as external knowledge of the dialogues help to generate more informative responses.
The heavy traffic congestion problem has always been a concern for modern cities.
With the rapid development in online education, knowledge tracing (KT) has become a fundamental problem which traces students' knowledge status and predicts their performance on new questions.
Ranked #8 on Knowledge Tracing on EdNet
To drive purchases in online advertising, it is of great interest to advertisers to optimize the sequential advertising strategy, for which both performance and interpretability are important.
With GLM, we develop Glancing Transformer (GLAT) for machine translation.
Model-based reinforcement learning approaches leverage a forward dynamics model to support planning and decision making, which, however, may fail catastrophically if the model is inaccurate.
To the best of our knowledge, this is the first work providing an efficient neighborhood-based interaction model in the HIN-based recommendations.
Interactive recommender systems (IRS) have drawn huge attention because of their flexible recommendation strategy and their consideration of optimal long-term user experiences.
Due to the difficulty and high cost of data collection, the supervised data available in the two fields are usually on the order of tens of thousands of examples, for example, 18K in the WebNLG 2017 dataset after preprocessing, which is far fewer than the millions of examples available for other tasks such as machine translation.
Though practical, current methods rely on restrictive assumptions to decompose the centralized value function across agents for execution.
These retrieved behaviors are then fed into a deep model to make the final prediction instead of simply using the most recent ones.
Position bias is a critical problem in information retrieval when dealing with implicit yet biased user feedback data.
We tackle a common scenario in imitation learning (IL), where agents try to recover the optimal policy from expert demonstrations without further access to the expert or environment reward signals.
We propose adversarial uncertainty sampling in discrete space (AUSDS) to retrieve informative unlabeled samples more efficiently.
We believe that extracting information from unstructured documents is the future trend of the DS because a great amount of human knowledge lies in these documents.
Maintaining a consistent personality in conversations is quite natural for human beings, but is still a non-trivial task for machines.
By implementing a regularized optimizer over the architecture parameters, the model can automatically identify and remove the redundant feature interactions during the training process of the model.
Ranked #15 on Click-Through Rate Prediction on Criteo
Recent advances in large-scale optimal transport have greatly extended its application scenarios in machine learning.
Molecular graph generation is a fundamental problem for drug discovery and has been attracting growing attention.
Ranked #1 on Molecular Graph Generation on MOSES
In this paper, we cast the multi-agent interactions modeling problem into a multi-agent imitation learning framework with explicit modeling of correlated policies by approximating opponents' policies, which can recover agents' policies that can regenerate similar interactions.
Story Ending Prediction is the task of selecting an appropriate ending for a given story, which requires the machine to understand the story and sometimes to draw on commonsense knowledge.
Domain adaptation aims to leverage the supervision signal of a source domain to obtain an accurate model for a target domain, where labels are not available.
Recurrent Neural Networks (RNN) are known as powerful models for handling sequential data, and are especially widely used in various natural language processing tasks.
Various sequential recommendation methods are proposed to model the dynamic user behaviors.
Paraphrasing plays an important role in various natural language processing (NLP) tasks, such as question answering, information retrieval and sentence simplification.
Improving the efficiency of dispatching orders to vehicles is a research hotspot in online ride-hailing systems.
The heart of TripleNet is a novel attention mechanism named triple attention to model the relationships within the triple at four levels.
Although existing works formulate this problem as a centralized learning with decentralized execution framework, which avoids the non-stationarity problem in training, their decentralized execution paradigm limits the agents' capability to coordinate.
GPT-2 and BERT demonstrate the effectiveness of using pre-trained language models (LMs) on various natural language processing tasks.
This paper studies graph-based recommendation, where an interaction graph is constructed from historical records and is leveraged to alleviate data sparsity and cold start problems.
Ranked #1 on Click-Through Rate Prediction on MovieLens 1M
Given a conversational context with persona information, how a chatbot can exploit that information to generate diverse and sustainable conversations is still a non-trivial problem.
How to optimally dispatch orders to vehicles and how to trade off between immediate and future returns are fundamental questions for a typical ride-hailing platform.
A knowledge base is one of the main ways to represent information in a structured form.
However, many difficult questions require multiple pieces of supporting evidence scattered across two or more documents.
Ranked #34 on Question Answering on HotpotQA
The most commonly used open-source traffic simulator, SUMO, is, however, not scalable to large road networks and large traffic flows, which hinders the study of reinforcement learning on traffic scenarios.
To enable cooperation of traffic signals, in this paper, we propose a model, CoLight, which uses graph attentional networks to facilitate communication.
The problem is formulated as forecasting the probability distribution of the market price for each ad auction.
In order to tackle these challenges, in this paper, we propose a Hierarchical Periodic Memory Network for lifelong sequential modeling with personalized memorization of sequential patterns for each user.
In this paper we propose a hybrid architecture of actor-critic algorithms for reinforcement learning in parameterized action space, which consists of multiple parallel sub-actor networks to decompose the structured action space into simpler action spaces along with a critic network to guide the training of all sub-actor networks.
By contrast, Wasserstein GAN (WGAN), where the discriminative function is restricted to 1-Lipschitz, does not suffer from such a gradient uninformativeness problem.
In this paper, we propose CommunityGAN, a novel community detection framework that jointly solves overlapping community detection and graph representation learning.
Ranked #1 on Community Detection on Amazon
With the rapid growth of the express industry, intelligent warehouses that employ autonomous robots for carrying parcels have been widely used to handle the vast express volume.
Reinforcement learning (RL) has recently been introduced to interactive recommender systems (IRS) because of its nature of learning from dynamic interactions and planning for long-run performance.
The DRR framework treats recommendation as a sequential decision making procedure and adopts an "Actor-Critic" reinforcement learning scheme to model the interactions between the users and recommender systems, which can consider both the dynamic adaptation and long-term rewards.
Dialogue systems are usually built on either generation-based or retrieval-based approaches, yet they do not benefit from the advantages of different models.
TGE-PS uses Pairs Sampling (PS) to improve the sampling strategy of RW, reducing the number of training samples by ~99% while preserving competitive performance.
Deep Q-learning has achieved significant success in single-agent decision-making tasks.
In this paper, we investigate the problem of advertising with adaptive exposure: can we dynamically determine the number and positions of ads for each user visit under certain business constraints so that the platform revenue can be increased?
By capturing the time dependency through modeling the conditional probability of the event for each sample, our method predicts the likelihood of the true event occurrence and estimates the survival rate over time, i.e., the probability of the non-occurrence of the event, for the censored data.
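The relationship between per-step conditional event probabilities and the survival rate can be illustrated with a short sketch. It assumes discrete time steps and that each input value is the probability of the event at step t given that it has not occurred earlier; the function names `survival_curve` and `event_probabilities` are hypothetical and are not taken from the paper.

```python
from typing import List


def _check(hazards: List[float]) -> None:
    # Each conditional probability must be a valid probability.
    for h in hazards:
        if not 0.0 <= h <= 1.0:
            raise ValueError("each conditional probability must lie in [0, 1]")


def survival_curve(hazards: List[float]) -> List[float]:
    """Survival rate after each step.

    Assumption: hazards[t] = P(event at step t | no event before step t).
    The survival rate is then the running product of (1 - hazards[t]).
    """
    _check(hazards)
    survival = []
    alive = 1.0
    for h in hazards:
        alive *= 1.0 - h
        survival.append(alive)
    return survival


def event_probabilities(hazards: List[float]) -> List[float]:
    """Unconditional probability that the event occurs exactly at each step."""
    _check(hazards)
    probs = []
    alive = 1.0
    for h in hazards:
        probs.append(alive * h)  # survived so far, then the event happens
        alive *= 1.0 - h
    return probs
```

For a censored sample, only the survival rate up to the censoring step is observed to be consistent with the data, which is why the survival curve rather than the event probability is the quantity of interest there.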
To achieve this, we utilize sequence-to-sequence prediction for user clicks, and combine both post-view and post-click attribution patterns together for the final conversion estimation.
Recent neural network methods for zero pronoun resolution explore multiple models for generating representation vectors for zero pronouns and their candidate antecedents.
Despite the success of existing work on single-turn conversation generation, human conversation is, once coherence is taken into consideration, actually a context-sensitive process.
Most existing knowledge graphs (KGs) in academic domains suffer from problems of insufficient multi-relational information, name ambiguity and improper data format for large-scale machine processing.
In this paper, we investigate the underlying factor that leads to failure and success in the training of GANs.
User response prediction is a crucial component of personalized information retrieval and filtering scenarios, such as recommender systems and web search.
In this tutorial, we focus on discussing the GAN techniques and the variants on discrete data fitting in various information retrieval scenarios.
In this study, we show how to integrate local and global decision-making by exploiting deep reinforcement learning models.
We introduce a new function-preserving transformation for efficient neural architecture search.
We study the problem of named entity recognition (NER) from electronic medical records, which is one of the most fundamental and critical problems for medical text mining.
Information Extraction (IE) refers to automatically extracting structured relation tuples from unstructured texts.
From the advertisers' side, paying for sponsored search advertisements to participate in the ranking of search results attracts more awareness and purchases, which facilitates their commercial goals.
From the learning perspective, we show that the bidding machine can be updated smoothly with either offline periodic batch or online sequential training schemes.
Real-time advertising allows advertisers to bid for each impression for a visiting user.
Existing multi-agent reinforcement learning methods are typically limited to a small number of agents.
We introduce Texygen, a benchmarking platform to support research on open-domain text generation models.
Unlike previous research platforms on single or multi-agent reinforcement learning, MAgent focuses on supporting the tasks and the applications that require hundreds to millions of agents.
In this paper, we show that the recent integration of statistical models with deep recurrent neural networks provides a new way of formulating volatility (the degree of variation of time series) models that have been widely used in time series analysis and prediction in finance.
The goal of graph representation learning is to embed each vertex in a graph into a low-dimensional vector space.
Ranked #1 on Node Classification on Wikipedia
Although the word-popularity based negative sampler has shown superb performance in the skip-gram model, the theoretical motivation behind oversampling popular (non-observed) words as negative samples is still not well understood.
In this paper, we introduce the first evaluation of Chinese human-computer dialogue technology.
Automatically generating coherent and semantically meaningful text has many applications in machine translation, dialogue systems, image captioning, etc.
Ranked #1 on Text Generation on COCO Captions
We conduct an empirical study on discovering the ordered collective dynamics obtained by a population of intelligent agents, driven by million-agent reinforcement learning.
Existing approaches for Chinese zero pronoun resolution typically utilize only syntactical and lexical features while ignoring semantic information.
In this article, we mathematically study several GAN related topics, including Inception score, label smoothing, gradient vanishing and the -log(D(x)) alternative.
Techniques for automatically designing deep neural network architectures such as reinforcement learning based approaches have recently shown promising results.
Ranked #118 on Image Classification on CIFAR-10
In typical reinforcement learning (RL), the environment is assumed given and the goal of the learning is to identify an optimal policy for the agent taking actions through its interactions with the environment.
Inspired by Wasserstein GAN, in this paper we propose a novel approach to learn domain invariant feature representations, namely Wasserstein Distance Guided Representation Learning (WDGRL).
This paper provides a unified account of two schools of thinking in information retrieval modelling: the generative retrieval focusing on predicting relevant documents given a query, and the discriminative retrieval focusing on predicting relevancy given a query-document pair.
In this paper, we formulate the bid decision process as a reinforcement learning problem, where the state space is represented by the auction information and the campaign's real-time parameters, while an action is the bid price to set.
Moreover, the lexical divergence of the responses generated by the 5 personalized models indicates that the proposed two-phase approach achieves good results in modeling the responding style of humans and generating personalized responses for conversational systems.
Predicting user responses, such as clicks and conversions, is of great importance and has found its usage in many Web applications including recommender systems, web search and online advertising.
Ranked #1 on Click-Through Rate Prediction on iPinYou
The most significant progress in recent years in online display advertising is what is known as the Real-Time Bidding (RTB) mechanism to buy and sell ads.
Computer Science and Game Theory
As a new way of training generative models, the Generative Adversarial Net (GAN), which uses a discriminative model to guide the training of the generative model, has enjoyed considerable success in generating real-valued data.
Ranked #3 on Text Generation on Chinese Poems
An obvious drawback of these works is that there is no learnable relationship between words and the start symbol.
Recently, the rapid development of word embedding and neural networks has brought new inspiration to various NLP and IR tasks.
Most existing approaches for zero pronoun resolution rely heavily on annotated data, which is often released by shared task organizers.
This is because zero pronouns have no descriptive information, which results in difficulty in explicitly capturing their semantic similarities with antecedents.
In this paper, we propose a feedback control mechanism for RTB which helps advertisers dynamically adjust the bids to effectively control the KPIs, e.g., the auction winning ratio and the effective cost per click.
Computer Science and Game Theory; Systems and Control
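To make the idea of feedback control over a KPI concrete, the following is a minimal sketch of a proportional-integral controller that scales a base bid toward a target auction winning ratio. The choice of a proportional-integral controller, the class name `BidController`, and the gain values are illustrative assumptions, not the mechanism described in the paper.

```python
class BidController:
    """Hypothetical proportional-integral controller for a winning-ratio KPI.

    Assumption: a higher bid raises the winning ratio, so a positive error
    (measured ratio below target) should increase the bid. For a cost-type
    KPI such as effective cost per click, the sign of the error would need
    to be flipped.
    """

    def __init__(self, target: float, kp: float = 0.5, ki: float = 0.1):
        self.target = target      # desired KPI value, e.g. winning ratio 0.4
        self.kp = kp              # proportional gain (illustrative value)
        self.ki = ki              # integral gain (illustrative value)
        self.integral = 0.0       # accumulated error over past rounds

    def adjust(self, base_bid: float, measured: float) -> float:
        """Return the adjusted bid given the latest measured KPI value."""
        error = self.target - measured
        self.integral += error
        signal = self.kp * error + self.ki * self.integral
        # Scale the base bid by the control signal; never bid below zero.
        return max(0.0, base_bid * (1.0 + signal))
```

A controller would typically be called once per control interval with the KPI measured over that interval, so the integral term accumulates persistent under- or over-delivery.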
In this work, we propose a general framework which learns the user profiles based on their online browsing behaviour, and transfers the learned knowledge onto prediction of their ad response.
Unlike the continuous raw features usually found in the image and audio domains, the input features in web space are always multi-field and are mostly discrete and categorical, while their dependencies are little known.
Ranked #2 on Click-Through Rate Prediction on Company*
This dataset directly supports the experiments of some important research problems such as bid optimisation and CTR estimation.
Computer Science and Game Theory; Computers and Society