Call centers, in which human operators attend clients using textual chat, are very common in modern e-commerce.
Because of this decision tree equivalence, any function approximator can be used during training, including a neural network, while yielding a decision tree policy for the base MDP.
In this paper, we propose to approach economic time series forecasting of multiple financial assets in a novel way via video prediction.
In this work, we take a novel approach by leveraging advances in deep learning to extend the field of time series forecasting to a visual setting.
One of the key characteristics of these applications is the wide range of strategies that an adversary may choose as they adapt their strategy dynamically to sustain benefits and evade authorities.
Task-specific fine-tuning of a pre-trained neural language model with a custom softmax output layer has recently become the de facto approach to document classification problems.
We show that conventional crowdsourcing algorithms struggle in this user feedback setting, and present a new algorithm, SURF, that can cope with this non-response ambiguity.
We introduce a data management problem called metadata debt, to identify the mapping between data concepts and their logical representations.
Digital reports are often created based on tedious manual analysis as well as visualization of the underlying trends and characteristics of data.
Training multi-agent systems (MAS) to achieve realistic equilibria gives us a useful tool to understand and model real-world systems.
We introduce a novel framework to account for sensitivity to reward uncertainty in sequential decision-making problems.
To encourage the development of methods with reproducible and robust training behavior, we propose a challenge paradigm where competitors are evaluated directly on the performance of their learning procedures rather than pre-trained agents.
Document classification is ubiquitous in a business setting, but often the end users of a classifier are engaged in an ongoing feedback-retrain loop with the team that maintains it.
We consider the problem of aggregating predictions or measurements from a set of human forecasters, models, sensors or other instruments which may be subject to bias or miscalibration and random heteroscedastic noise.
Through experiments with simulated and real-world scientific collaboration, transportation and global trade networks, we demonstrate that the performance of the proposed heuristics increases with the richness of the correlation structure among connection types, and that they significantly outperform their baseline heuristics for ordinary networks with a single connection type.
Machine learning (especially reinforcement learning) methods for trading are increasingly reliant on simulation for agent training and testing.
In this work we explore the use of latent representations obtained from multiple input sensory modalities (such as images or sounds) in allowing an agent to learn and exploit policies over different subsets of input modalities.
The dynamics of financial markets are driven by the interactions between participants, as well as the trading mechanisms and regulatory frameworks that govern these interactions.
Market makers play an important role in providing liquidity to markets by continuously quoting prices at which they are willing to buy and sell, and managing inventory risk.
In this paper, we propose using vibrations and force-torque feedback from the interactions to adapt the slicing motions and monitor for contact events.
Therefore, we introduce a comprehensive, large-scale, simulator-paired dataset of human demonstrations: MineRL.
The art of systematic financial trading evolved with an array of approaches, ranging from simple strategies to complex algorithms, all relying primarily on aspects of time-series analysis.
In this study, we examine whether binary decisions are better made from the numeric or the visual representation of the same data.
There is a growing desire in the field of reinforcement learning (and machine learning in general) to move from black-box models toward more "interpretable AI."
Though reinforcement learning has greatly benefited from the incorporation of neural networks, the inability to verify the correctness of such systems limits their use.
1 code implementation • 22 Apr 2019 • William H. Guss, Cayden Codel, Katja Hofmann, Brandon Houghton, Noboru Kuno, Stephanie Milani, Sharada Mohanty, Diego Perez Liebana, Ruslan Salakhutdinov, Nicholay Topin, Manuela Veloso, Phillip Wang
To that end, we introduce: (1) the Minecraft ObtainDiamond task, a sequential decision-making environment requiring long-term planning, hierarchical control, and efficient exploration methods; and (2) the MineRL-v0 dataset, a large-scale collection of over 60 million state-action pairs of human demonstrations that can be resimulated into embodied trajectories with arbitrary modifications to game state and visuals.
In this pilot study, we investigate (1) in what way a robot can express a certain mood to influence a human's decision-making behavioral model; and (2) how and to what extent the human will be influenced in a game-theoretic setting.
With the success of deep learning, recent efforts have been focused on analyzing how learned networks make their classifications.
In this paper, we propose the concept of coordination between CoBot and the Parrot ARDrone 2.0 to perform service-based object search tasks, in which CoBot localizes and navigates to the general search areas carrying the ARDrone and the ARDrone searches locally for objects.
Multisensory policies are known to enhance both state estimation and target tracking.
In order to compute a solution for a probabilistic planning problem, planners need to manage the uncertainty associated with the different paths from the initial state to a goal state.