1 code implementation • 3 May 2023 • Xiong-Hui Chen, Bowei He, Yang Yu, Qingyang Li, Zhiwei Qin, Wenjie Shang, Jieping Ye, Chen Ma
However, building a user simulator with no reality gap, i.e., one that can predict users' feedback exactly, is unrealistic because users' reaction patterns are complex and historical logs for each user are limited, which might mislead the simulator-based recommendation policy.
no code implementations • 28 Feb 2023 • Alex Chin, Zhiwei Qin
Ridesharing platforms are a type of two-sided marketplace where "supply-demand balance" is critical for market efficiency and yet is complex to define and analyze.
no code implementations • 6 Nov 2022 • Yanqiu Wu, Qingyang Li, Zhiwei Qin
Motivated by this observation, we make an attempt to optimize the distribution of demand to handle this problem by learning the long-term spatio-temporal values as a guideline for pricing strategy.
no code implementations • 10 Feb 2022 • Soheil Sadeghi Eshkevari, Xiaocheng Tang, Zhiwei Qin, Jinhan Mei, Cheng Zhang, Qianying Meng, Jia Xu
In this study, a real-time dispatching algorithm based on reinforcement learning is proposed and, for the first time, deployed at large scale.
1 code implementation • NeurIPS 2021 • Xiong-Hui Chen, Yang Yu, Qingyang Li, Fan-Ming Luo, Zhiwei Qin, Wenjie Shang, Jieping Ye
Current offline reinforcement learning methods commonly learn in the policy space constrained to in-support regions by the offline dataset, in order to ensure the robustness of the outcome policies.
no code implementations • 8 Jun 2021 • Xiaocheng Tang, Zhiwei Qin, Fan Zhang, Zhaodong Wang, Zhe Xu, Yintai Ma, Hongtu Zhu, Jieping Ye
In this work, we propose a deep reinforcement learning-based solution for order dispatching, and we conduct large-scale online A/B tests on DiDi's ride-dispatching platform to show that the proposed method achieves significant improvement in both total driver income and user-experience-related metrics.
no code implementations • 18 May 2021 • Xiaocheng Tang, Fan Zhang, Zhiwei Qin, Yansheng Wang, Dingyuan Shi, Bingchen Song, Yongxin Tong, Hongtu Zhu, Jieping Ye
In this paper we propose a unified value-based dynamic learning framework (V1D3) for tackling both tasks.
no code implementations • 3 May 2021 • Zhiwei Qin, Hongtu Zhu, Jieping Ye
In this paper, we present a comprehensive, in-depth survey of the literature on reinforcement learning approaches to decision optimization problems in a typical ridesharing system.
no code implementations • 8 Mar 2021 • Yan Jiao, Xiaocheng Tang, Zhiwei Qin, Shuaiji Li, Fan Zhang, Hongtu Zhu, Jieping Ye
We present a new practical framework based on deep reinforcement learning and decision-time planning for real-world vehicle repositioning on ride-hailing (a type of mobility-on-demand, MoD) platforms.
no code implementations • 1 Oct 2020 • Yayi Zou, Zhiwei Qin
This framework is based on our proposed fast-adaptation variation to Gradient-EM Bayesian Meta-learning and the fast-update advantage of DQN, which allows for fast adaptation to new scenarios with continual learning ability and robustness to uncertainty.
1 code implementation • 2 Apr 2020 • Mengyue Yang, Qingyang Li, Zhiwei Qin, Jieping Ye
In this paper, we propose a hierarchical adaptive contextual bandit method (HATCH) to conduct the policy learning of contextual bandits with a budget constraint.
no code implementations • 25 Nov 2019 • John Holler, Risto Vuorio, Zhiwei Qin, Xiaocheng Tang, Yan Jiao, Tiancheng Jin, Satinder Singh, Chenxi Wang, Jieping Ye
Order dispatching and driver repositioning (also known as fleet management) in the face of spatially and temporally varying supply and demand are central to a ride-sharing platform marketplace.
no code implementations • 7 Oct 2019 • Ming Zhou, Jiarui Jin, Wei-Nan Zhang, Zhiwei Qin, Yan Jiao, Chenxi Wang, Guobin Wu, Yong Yu, Jieping Ye
Improving the efficiency of dispatching orders to vehicles is a research hotspot in online ride-hailing systems.
no code implementations • 28 Aug 2019 • Donghui Yan, Songxiang Gu, Ying Xu, Zhiwei Qin
Similarity plays a fundamental role in many areas, including data mining, machine learning, statistics and various applied domains.
no code implementations • 12 Jul 2019 • Wenjie Shang, Yang Yu, Qingyang Li, Zhiwei Qin, Yiping Meng, Jieping Ye
DEMER also derives a recommendation policy with a significantly improved performance in the test phase of the real application.
no code implementations • 27 May 2019 • Jiarui Jin, Ming Zhou, Wei-Nan Zhang, Minne Li, Zilong Guo, Zhiwei Qin, Yan Jiao, Xiaocheng Tang, Chenxi Wang, Jun Wang, Guobin Wu, Jieping Ye
How to optimally dispatch orders to vehicles and how to trade off between immediate and future returns are fundamental questions for a typical ride-hailing platform.
no code implementations • 2 Jan 2019 • Donghui Yan, Zhiwei Qin, Songxiang Gu, Haiping Xu, Ming Shao
Many applications require the collection of data on different variables or measurements over many system performance metrics.
no code implementations • 11 Nov 2018 • Ishan Jindal, Zhiwei Qin, Xue-wen Chen, Matthew Nokleby, Jieping Ye
In this paper, we develop a reinforcement learning (RL) based system to learn an effective policy for carpooling that maximizes transportation efficiency so that fewer cars are required to fulfill the given amount of trip demand.
no code implementations • 16 Nov 2014 • Zhiwei Qin, Xiaocheng Tang, Ioannis Akrotirianakis, Amit Chakraborty
We consider classification tasks in the regime of scarce labeled training data in high dimensional feature space, where specific expert knowledge is also available.
no code implementations • 24 Nov 2013 • Donald Goldfarb, Zhiwei Qin
Robust tensor recovery plays an instrumental role in robustifying tensor decompositions for multilinear data analysis against outliers, gross corruptions and missing values and has a diverse array of applications.