Numerical results confirm the effectiveness of our model and the computational efficacy of the algorithms.
The detection of the underlying shopping intentions of users based on their historical interactions is a crucial aspect for e-commerce platforms, such as Amazon, to enhance the convenience and efficiency of their customers' shopping experiences.
In this paper, we introduce AdaSelection, an adaptive sub-sampling method to identify the most informative sub-samples within each minibatch to speed up the training of large-scale deep learning models without sacrificing model performance.
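As a rough illustration of sub-sampling within a minibatch, the sketch below keeps only the samples with the largest per-sample loss. This is an assumption made for illustration: the sentence above does not state how AdaSelection scores informativeness, and the names `select_informative` and `keep_ratio` are hypothetical.

```python
import math


def select_informative(losses, keep_ratio=0.5):
    """Return sorted indices of the highest-loss samples in one minibatch.

    Assumption: per-sample loss stands in for "informativeness". The
    scoring rule actually used by AdaSelection is not given in the text.
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    if not losses:
        return []
    # Keep at least one sample, rounding the kept count up.
    k = max(1, math.ceil(keep_ratio * len(losses)))
    # Rank sample indices by loss, largest first.
    ranked = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    return sorted(ranked[:k])


# Usage: only the selected indices would take part in the backward pass.
batch_losses = [0.1, 2.0, 0.5, 1.5]
kept = select_informative(batch_losses, keep_ratio=0.5)
```

Under this toy rule, the backward pass would run on `kept` only, which is where the training-time saving would come from.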
Moreover, it formulates multiple goals that may be conflicting yet important to optimize for simultaneously, e.g., in product search, a ranking model can be trained based on product quality and purchase likelihood to increase revenue.
We show that the delay impacts in both cases can still be upper bounded by an additive penalty on both the regret and total incentive costs.
In this paper, we investigate a new multi-armed bandit (MAB) online learning model that considers real-world phenomena in many recommender systems: (i) the learning agent cannot pull the arms by itself and thus has to offer rewards to users to incentivize arm-pulling indirectly; and (ii) if users with specific arm preferences are well rewarded, they induce a "self-reinforcing" effect in the sense that they will attract more users of similar arm preferences.
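The two phenomena can be mimicked in a toy simulation. The sketch below is not the paper's model or algorithm: it assumes Bernoulli rewards, a fixed payment whenever the agent steers a user away from the user's preferred arm, and an urn-style weight update for the self-reinforcing effect. All names (`simulate`, `explore_rounds`, `payment`) are hypothetical.

```python
import random


def simulate(means, horizon, explore_rounds, payment=1.0, seed=0):
    """Toy incentivized bandit with self-reinforcing user preferences."""
    rng = random.Random(seed)
    n = len(means)
    weights = [1.0] * n   # assumed urn weights: popularity of each preference
    pulls = [0] * n
    sums = [0.0] * n
    cost = 0.0
    for t in range(horizon):
        # The arriving user's preferred arm is drawn proportionally to the weights.
        preferred = rng.choices(range(n), weights=weights)[0]
        if t < explore_rounds:
            target = t % n  # round-robin exploration
        else:
            # Exploit the arm with the best empirical mean so far.
            target = max(
                range(n),
                key=lambda i: sums[i] / pulls[i] if pulls[i] else 0.0,
            )
        # (i) The agent cannot pull arms itself: it pays the user to switch.
        if target != preferred:
            cost += payment
        reward = 1.0 if rng.random() < means[target] else 0.0
        pulls[target] += 1
        sums[target] += reward
        # (ii) Self-reinforcement: a rewarded pull attracts similar users.
        weights[target] += reward
    return pulls, cost


pulls, cost = simulate([0.2, 0.8], horizon=200, explore_rounds=20)
```

In this toy setting, once the better arm accumulates weight, most arriving users already prefer it, so later rounds tend to need fewer payments.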
Deep learning models in large-scale machine learning systems are often continuously trained with enormous data from production environments.
We study the problem of learning the objective functions or constraints of a multiobjective decision making model, based on a set of sequentially arriving decisions.
Our approach allows the learner to continuously estimate real-time risk preferences using concurrently observed portfolios and market price data.
To hedge against the uncertainties in the hypothetical decision making problem (DMP), the data, and the parameter space, we investigate the distributionally robust approach for inverse multiobjective optimization in this paper.
Inverse optimization is a powerful paradigm for learning preferences and restrictions that explain the behavior of a decision maker, based on a set of pairs of external signals and corresponding decisions.
Given a set of observed human decisions, inverse optimization has been developed and used to infer the underlying decision making problem.
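To make the idea concrete, here is a minimal sketch of inverse optimization on a toy problem. It assumes a decision maker who picks, from a finite list of options with two costs each, the one minimizing `w * f1 + (1 - w) * f2`, and it recovers a weight `w` consistent with the observed choices by grid search. The finite option sets, the scalarized objective, and the function names are illustrative assumptions, not a method taken from the text.

```python
def forward(options, w):
    """Assumed decision making problem: minimize w*f1 + (1-w)*f2 over options."""
    return min(
        range(len(options)),
        key=lambda i: w * options[i][0] + (1 - w) * options[i][1],
    )


def infer_weight(observations, grid_size=101):
    """Find a weight on a uniform grid that explains the most observed choices.

    observations: list of (options, chosen_index) pairs, where options is a
    list of (f1, f2) cost tuples (the signal) and chosen_index the decision.
    """
    best_w, best_hits = None, -1
    for k in range(grid_size):
        w = k / (grid_size - 1)
        hits = sum(
            1 for options, chosen in observations if forward(options, w) == chosen
        )
        if hits > best_hits:  # ties keep the smallest weight
            best_w, best_hits = w, hits
    return best_w, best_hits


# Usage: generate decisions with a hidden weight, then try to explain them.
signals = [
    [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0)],
    [(0.5, 3.0), (3.0, 0.5)],
    [(1.0, 1.5), (1.2, 1.0), (2.0, 0.2)],
]
true_w = 0.7
observations = [(opts, forward(opts, true_w)) for opts in signals]
w_hat, hits = infer_weight(observations)
```

The recovered `w_hat` need not equal `true_w`: any weight that reproduces every observed choice explains the data equally well, which is the usual identifiability caveat in inverse optimization.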