Value prediction
15 papers with code • 1 benchmark • 0 datasets
Latest papers
ExtremeCast: Boosting Extreme Value Prediction for Global Weather Forecast
Data-driven weather forecast based on machine learning (ML) has experienced rapid development and demonstrated superior performance in the global medium-range forecast compared to traditional physics-based dynamical models.
A Multi-Granularity-Aware Aspect Learning Model for Multi-Aspect Dense Retrieval
Dense retrieval methods have been mostly focused on unstructured text and less attention has been drawn to structured data with various aspects, e.g., products with aspects such as category and brand.
Reinforcement Learning from Passive Data via Latent Intentions
Passive observational data, such as human videos, is abundant and rich in information, yet remains largely untapped by current RL methods.
Learning, Fast and Slow: A Goal-Directed Memory-Based Approach for Dynamic Environments
To address these challenges, we do the following: i) Instead of a neural network, we do model-based planning using a parallel memory retrieval system (which we term the slow mechanism); ii) Instead of learning state values, we guide the agent's actions using goal-directed exploration, by using a neural network to choose the next action given the current state and the goal state (which we term the fast mechanism).
Uncertainty-Based Offline Reinforcement Learning with Diversified Q-Ensemble
However, prior methods typically require accurate estimation of the behavior policy or sampling from OOD data points, which themselves can be a non-trivial problem.
On the Estimation Bias in Double Q-Learning
Double Q-learning is a classical method for reducing overestimation bias, which is caused by taking maximum estimated values in the Bellman operation.
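The overestimation described above can be illustrated with a small simulation. This is a minimal sketch, not code from the paper: it assumes every action has a true value of 0 and that value estimates are sample means of Gaussian noise, and the function names (`max_estimator`, `double_estimator`, `average_estimates`) are hypothetical. Taking the maximum over one set of noisy estimates is biased upward, while selecting the action with one set of estimates and evaluating it with an independent set removes that upward bias in this setting.

```python
import random

def max_estimator(samples_per_action):
    # Single estimator: average each action's samples, then take the max of the averages.
    means = [sum(s) / len(s) for s in samples_per_action]
    return max(means)

def double_estimator(samples_per_action):
    # Double estimator: split each action's samples into two disjoint halves, A and B.
    a_means, b_means = [], []
    for s in samples_per_action:
        half = len(s) // 2
        a_means.append(sum(s[:half]) / half)
        b_means.append(sum(s[half:]) / (len(s) - half))
    # Select the greedy action with estimates A, evaluate it with estimates B.
    best = max(range(len(a_means)), key=a_means.__getitem__)
    return b_means[best]

def average_estimates(n_actions=10, n_samples=10, n_trials=2000, seed=0):
    # Assumption: all actions have true value 0, so the true maximum is 0.
    rng = random.Random(seed)
    single_total = 0.0
    double_total = 0.0
    for _ in range(n_trials):
        samples = [
            [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
            for _ in range(n_actions)
        ]
        single_total += max_estimator(samples)
        double_total += double_estimator(samples)
    return single_total / n_trials, double_total / n_trials

single_avg, double_avg = average_estimates()
print(f"single estimator average: {single_avg:.3f}")
print(f"double estimator average: {double_avg:.3f}")
```

With equal true values the double estimator averages close to 0, whereas the single estimator stays clearly positive. When true action values differ, the double estimator can instead underestimate the maximum, so this sketch shows only the direction of the bias, not a general guarantee.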
Learning State Representations from Random Deep Action-conditional Predictions
Our main contribution in this work is an empirical finding that random General Value Functions (GVFs), i.e., deep action-conditional predictions -- random both in what feature of observations they predict as well as in the sequence of actions the predictions are conditioned upon -- form good auxiliary tasks for reinforcement learning (RL) problems.
DATE: Dual Attentive Tree-aware Embedding for Customs Fraud Detection
Intentional manipulation of invoices that leads to undervaluation of trade goods is the most common type of customs fraud to avoid ad valorem duties and taxes.
timeXplain -- A Framework for Explaining the Predictions of Time Series Classifiers
Modern time series classifiers display impressive predictive capabilities, yet their decision-making processes mostly remain black boxes to the user.
PIVEN: A Deep Neural Network for Prediction Intervals with Specific Value Prediction
Improving the robustness of neural nets in regression tasks is key to their application in multiple domains.