Search Results for author: Zhiwei Qin

Found 20 papers, 3 papers with code

Sim2Rec: A Simulator-based Decision-making Approach to Optimize Real-World Long-term User Engagement in Sequential Recommender Systems

1 code implementation • 3 May 2023 • Xiong-Hui Chen, Bowei He, Yang Yu, Qingyang Li, Zhiwei Qin, Wenjie Shang, Jieping Ye, Chen Ma

However, building a user simulator with no reality gap, i.e., one that can predict users' feedback exactly, is unrealistic because users' reaction patterns are complex and historical logs for each user are limited, which might mislead the simulator-based recommendation policy.

Decision Making Recommendation Systems +1

A Unified Representation Framework for Rideshare Marketplace Equilibrium and Efficiency

no code implementations • 28 Feb 2023 • Alex Chin, Zhiwei Qin

Ridesharing platforms are a type of two-sided marketplace where "supply-demand balance" is critical for market efficiency and yet is complex to define and analyze.

Spatio-temporal Incentives Optimization for Ride-hailing Services with Offline Deep Reinforcement Learning

no code implementations • 6 Nov 2022 • Yanqiu Wu, Qingyang Li, Zhiwei Qin

Motivated by this observation, we make an attempt to optimize the distribution of demand to handle this problem by learning the long-term spatio-temporal values as a guideline for pricing strategy.

reinforcement-learning Reinforcement Learning (RL)

Reinforcement Learning in the Wild: Scalable RL Dispatching Algorithm Deployed in Ridehailing Marketplace

no code implementations • 10 Feb 2022 • Soheil Sadeghi Eshkevari, Xiaocheng Tang, Zhiwei Qin, Jinhan Mei, Cheng Zhang, Qianying Meng, Jia Xu

In this study, a real-time dispatching algorithm based on reinforcement learning is proposed and, for the first time, deployed at large scale.

Causal Inference reinforcement-learning +1

Offline Model-based Adaptable Policy Learning

1 code implementation • NeurIPS 2021 • Xiong-Hui Chen, Yang Yu, Qingyang Li, Fan-Ming Luo, Zhiwei Qin, Wenjie Shang, Jieping Ye

Current offline reinforcement learning methods commonly learn in the policy space constrained to in-support regions by the offline dataset, in order to ensure the robustness of the outcome policies.

Decision Making reinforcement-learning +1

A Deep Value-network Based Approach for Multi-Driver Order Dispatching

no code implementations • 8 Jun 2021 • Xiaocheng Tang, Zhiwei Qin, Fan Zhang, Zhaodong Wang, Zhe Xu, Yintai Ma, Hongtu Zhu, Jieping Ye

In this work, we propose a deep reinforcement learning based solution for order dispatching and we conduct large scale online A/B tests on DiDi's ride-dispatching platform to show that the proposed method achieves significant improvement on both total driver income and user experience related metrics.

reinforcement-learning Reinforcement Learning (RL) +1

Value Function is All You Need: A Unified Learning Framework for Ride Hailing Platforms

no code implementations • 18 May 2021 • Xiaocheng Tang, Fan Zhang, Zhiwei Qin, Yansheng Wang, Dingyuan Shi, Bingchen Song, Yongxin Tong, Hongtu Zhu, Jieping Ye

In this paper we propose a unified value-based dynamic learning framework (V1D3) for tackling both tasks.

Reinforcement Learning for Ridesharing: An Extended Survey

no code implementations • 3 May 2021 • Zhiwei Qin, Hongtu Zhu, Jieping Ye

In this paper, we present a comprehensive, in-depth survey of the literature on reinforcement learning approaches to decision optimization problems in a typical ridesharing system.

reinforcement-learning Reinforcement Learning (RL)

Real-world Ride-hailing Vehicle Repositioning using Deep Reinforcement Learning

no code implementations • 8 Mar 2021 • Yan Jiao, Xiaocheng Tang, Zhiwei Qin, Shuaiji Li, Fan Zhang, Hongtu Zhu, Jieping Ye

We present a new practical framework based on deep reinforcement learning and decision-time planning for real-world vehicle repositioning on ride-hailing (a type of mobility-on-demand, MoD) platforms.

reinforcement-learning Reinforcement Learning (RL)

Bayesian Meta-reinforcement Learning for Traffic Signal Control

no code implementations • 1 Oct 2020 • Yayi Zou, Zhiwei Qin

This framework is based on our proposed fast-adaptation variation to Gradient-EM Bayesian Meta-learning and the fast-update advantage of DQN, which allows for fast adaptation to new scenarios with continual learning ability and robustness to uncertainty.

Continual Learning Meta-Learning +3

Hierarchical Adaptive Contextual Bandits for Resource Constraint based Recommendation

1 code implementation • 2 Apr 2020 • Mengyue Yang, Qingyang Li, Zhiwei Qin, Jieping Ye

In this paper, we propose a hierarchical adaptive contextual bandit method (HATCH) to conduct the policy learning of contextual bandits with a budget constraint.

Multi-Armed Bandits

Deep Reinforcement Learning for Multi-Driver Vehicle Dispatching and Repositioning Problem

no code implementations • 25 Nov 2019 • John Holler, Risto Vuorio, Zhiwei Qin, Xiaocheng Tang, Yan Jiao, Tiancheng Jin, Satinder Singh, Chenxi Wang, Jieping Ye

Order dispatching and driver repositioning (also known as fleet management) in the face of spatially and temporally varying supply and demand are central to a ride-sharing platform marketplace.

BIG-bench Machine Learning Decision Making +3

Multi-Agent Reinforcement Learning for Order-dispatching via Order-Vehicle Distribution Matching

no code implementations • 7 Oct 2019 • Ming Zhou, Jiarui Jin, Wei-Nan Zhang, Zhiwei Qin, Yan Jiao, Chenxi Wang, Guobin Wu, Yong Yu, Jieping Ye

Improving the efficiency of dispatching orders to vehicles is a research hotspot in online ride-hailing systems.

Multi-agent Reinforcement Learning reinforcement-learning +1

Similarity Kernel and Clustering via Random Projection Forests

no code implementations • 28 Aug 2019 • Donghui Yan, Songxiang Gu, Ying Xu, Zhiwei Qin

Similarity plays a fundamental role in many areas, including data mining, machine learning, statistics and various applied domains.

Clustering Clustering Ensemble

Environment Reconstruction with Hidden Confounders for Reinforcement Learning based Recommendation

no code implementations • 12 Jul 2019 • Wenjie Shang, Yang Yu, Qingyang Li, Zhiwei Qin, Yiping Meng, Jieping Ye

DEMER also derives a recommendation policy with a significantly improved performance in the test phase of the real application.

Imitation Learning reinforcement-learning +1

CoRide: Joint Order Dispatching and Fleet Management for Multi-Scale Ride-Hailing Platforms

no code implementations • 27 May 2019 • Jiarui Jin, Ming Zhou, Wei-Nan Zhang, Minne Li, Zilong Guo, Zhiwei Qin, Yan Jiao, Xiaocheng Tang, Chenxi Wang, Jun Wang, Guobin Wu, Jieping Ye

How to optimally dispatch orders to vehicles and how to trade off between immediate and future returns are fundamental questions for a typical ride-hailing platform.

Multiagent Systems

Cost-sensitive Selection of Variables by Ensemble of Model Sequences

no code implementations • 2 Jan 2019 • Donghui Yan, Zhiwei Qin, Songxiang Gu, Haiping Xu, Ming Shao

Many applications require the collection of data on different variables or measurements over many system performance metrics.

Optimizing Taxi Carpool Policies via Reinforcement Learning and Spatio-Temporal Mining

no code implementations • 11 Nov 2018 • Ishan Jindal, Zhiwei Qin, Xue-wen Chen, Matthew Nokleby, Jieping Ye

In this paper, we develop a reinforcement learning (RL) based system to learn an effective policy for carpooling that maximizes transportation efficiency so that fewer cars are required to fulfill the given amount of trip demand.

reinforcement-learning Reinforcement Learning (RL) +1

HIPAD - A Hybrid Interior-Point Alternating Direction algorithm for knowledge-based SVM and feature selection

no code implementations • 16 Nov 2014 • Zhiwei Qin, Xiaocheng Tang, Ioannis Akrotirianakis, Amit Chakraborty

We consider classification tasks in the regime of scarce labeled training data in high dimensional feature space, where specific expert knowledge is also available.

feature selection General Classification

Robust Low-rank Tensor Recovery: Models and Algorithms

no code implementations • 24 Nov 2013 • Donald Goldfarb, Zhiwei Qin

Robust tensor recovery plays an instrumental role in robustifying tensor decompositions for multilinear data analysis against outliers, gross corruptions and missing values and has a diverse array of applications.
