Search Results for author: Pengfei Li

Found 70 papers, 33 papers with code

Automatic True/False Question Generation for Educational Purpose

no code implementations NAACL (BEA) 2022 Bowei Zou, Pengfei Li, Liangming Pan, Ai Ti Aw

In the field of teaching, true/false questioning is an important educational method for assessing students’ general understanding of learning materials.

Fact Verification Question Generation +2

End-to-End Simultaneous Speech Translation with Pretraining and Distillation: Huawei Noah’s System for AutoSimTranS 2022

no code implementations NAACL (AutoSimTrans) 2022 Xingshan Zeng, Pengfei Li, Liangyou Li, Qun Liu

This paper describes the system submitted to AutoSimTrans 2022 from Huawei Noah’s Ark Lab, which won the first place in the audio input track of the Chinese-English translation task.

Decoder Knowledge Distillation +2

A Water Efficiency Dataset for African Data Centers

no code implementations 4 Dec 2024 Noah Shumba, Opelo Tshekiso, Pengfei Li, Giulia Fanti, Shaolei Ren

While most attention has been paid to developed countries such as the U.S., this paper presents the first-of-its-kind dataset that combines nation-level weather and electricity generation data to estimate water usage efficiency for data centers in 41 African countries across five different climate regions.

LLM-PySC2: Starcraft II learning environment for Large Language Models

1 code implementation 8 Nov 2024 Zongyuan Li, Yanan Ni, Runnan Qi, Lumin Jiang, Chang Lu, Xiaojie Xu, Xiangbei Liu, Pengfei Li, Yunzheng Guo, Zhe Ma, Xian Guo, Kuihua Huang, Xuebo Zhang

This paper introduces a new environment, LLM-PySC2 (the Large Language Model StarCraft II Learning Environment), a platform derived from DeepMind's StarCraft II Learning Environment that serves to develop Large Language Model (LLM)-based decision-making methodologies.

Decision Making Language Modelling +4

Online Budgeted Matching with General Bids

no code implementations 6 Nov 2024 Jianyi Yang, Pengfei Li, Adam Wierman, Shaolei Ren

In this paper, we remove the FLM assumption and tackle the open problem of OBM with general bids.

Parameter-Efficient Fine-Tuning Medical Multimodal Large Language Models for Medical Visual Grounding

no code implementations 31 Oct 2024 Jinlong He, Pengfei Li, Gang Liu, Shenjun Zhong

Multimodal Large Language Models (MLLMs) inherit the superior text understanding capabilities of LLMs and extend these capabilities to multimodal scenarios.

parameter-efficient fine-tuning Visual Grounding

Bench4Merge: A Comprehensive Benchmark for Merging in Realistic Dense Traffic with Micro-Interactive Vehicles

no code implementations 21 Oct 2024 Zhengming Wang, Junli Wang, Pengfei Li, Zhaohan Li, Peng Li, Yilun Chen

While the capabilities of autonomous driving have advanced rapidly, merging into dense traffic remains a significant challenge. Many motion planning methods for this scenario have been proposed, but they are hard to evaluate.

Autonomous Driving Diversity +1

ChatSchema: A pipeline of extracting structured information with Large Multimodal Models based on schema

no code implementations 26 Jul 2024 Fei Wang, Yuewen Zheng, Qin Li, Jingyi Wu, Pengfei Li, Luxia Zhang

For the result of value extraction based on correct key extraction, the overall accuracy was 97.2%, precision was 95.8%, recall was 95.8%, and F1-score was 95.8%.

Optical Character Recognition Optical Character Recognition (OCR)

Positive and Unlabeled Data: Model, Estimation, Inference, and Classification

no code implementations 13 Jul 2024 Siyan Liu, Chi-Kuang Yeh, Xin Zhang, Qinglong Tian, Pengfei Li

This study introduces a new approach to addressing positive and unlabeled (PU) data through the double exponential tilting model (DETM).

ModWaveMLP: MLP-Based Mode Decomposition and Wavelet Denoising Model to Defeat Complex Structures in Traffic Forecasting

1 code implementation The 38th Annual AAAI Conference on Artificial Intelligence 2024 Ke Sun, Pei Liu, Pengfei Li, Zhifang Liao

Additionally, when handling traffic data, researchers tend to manually design the model structure based on the data features, which makes the structure of traffic prediction redundant and the model generalizability limited.

Denoising Traffic Prediction

Online DPO: Online Direct Preference Optimization with Fast-Slow Chasing

no code implementations 8 Jun 2024 Biqing Qi, Pengfei Li, Fangyuan Li, Junqi Gao, Kaiyan Zhang, BoWen Zhou

Inspired by intraspecific competition driving species evolution, we propose an Online Fast-Slow chasing DPO (OFS-DPO) for preference alignment, simulating competition through fast and slow chasing among models to facilitate rapid adaptation.

Continual Learning

Exploring Adversarial Robustness of Deep State Space Models

1 code implementation 8 Jun 2024 Biqing Qi, Yang Luo, Junqi Gao, Pengfei Li, Kai Tian, Zhiyuan Ma, BoWen Zhou

We find that fixed-parameterized SSMs have output error bounds strictly related to their parameters, limiting their AT benefits, while input-dependent SSMs may face the problem of error explosion.

Adversarial Robustness State Space Models

Building Socially-Equitable Public Models

1 code implementation 4 Jun 2024 Yejia Liu, Jianyi Yang, Pengfei Li, Tongxin Li, Shaolei Ren

Public models offer predictions to a variety of downstream tasks and have played a crucial role in various AI applications, showcasing their proficiency in accurate predictions.

Decision Making Fairness

PRICE: A Pretrained Model for Cross-Database Cardinality Estimation

1 code implementation 3 Jun 2024 Tianjing Zeng, Junwei Lan, Jiahong Ma, Wenqing Wei, Rong Zhu, Pengfei Li, Bolin Ding, Defu Lian, Zhewei Wei, Jingren Zhou

It is generally applicable to any unseen new database to attain high estimation accuracy, while its preparation cost is as little as the basic one-dimensional histogram-based CardEst methods.

A Dataset for Research on Water Sustainability

no code implementations 24 May 2024 Pranjol Sen Gupta, Md Rajib Hossen, Pengfei Li, Shaolei Ren, Mohammad A. Islam

Freshwater scarcity is a global problem that requires collective efforts across all industry sectors.

A New Method in Facial Registration in Clinics Based on Structure Light Images

no code implementations 23 May 2024 Pengfei Li, Ziyue Ma, Hong Wang, Juan Deng, Yan Wang, Zhenyu Xu, Feng Yan, Wenjun Tu, Hong Sha

To enrich traditional image methods with depth information, a method for registering depth images with traditional clinical images was investigated.

Face Recognition

TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes

1 code implementation 28 Mar 2024 Bu Jin, Yupeng Zheng, Pengfei Li, Weize Li, Yuhang Zheng, Sujie Hu, Xinyu Liu, Jinwei Zhu, Zhijie Yan, Haiyang Sun, Kun Zhan, Peng Jia, Xiaoxiao Long, Yilun Chen, Hao Zhao

However, the exploration of 3D dense captioning in outdoor scenes is hindered by two major challenges: 1) the domain gap between indoor and outdoor scenes, such as dynamics and sparse visual inputs, makes it difficult to directly adapt existing indoor methods; 2) the lack of data with comprehensive box-caption pair annotations specifically tailored for outdoor scenes.

3D dense captioning Dense Captioning

P-MapNet: Far-seeing Map Generator Enhanced by both SDMap and HDMap Priors

no code implementations 15 Mar 2024 Zhou Jiang, Zhenxin Zhu, Pengfei Li, Huan-ang Gao, Tianyuan Yuan, Yongliang Shi, Hang Zhao, Hao Zhao

On the other hand, we exploit a masked autoencoder to capture the prior distribution of HDMap, which can serve as a refinement module to mitigate occlusions and artifacts.

Autonomous Vehicles

GaussianGrasper: 3D Language Gaussian Splatting for Open-vocabulary Robotic Grasping

1 code implementation 14 Mar 2024 Yuhang Zheng, Xiangyu Chen, Yupeng Zheng, Songen Gu, Runyi Yang, Bu Jin, Pengfei Li, Chengliang Zhong, Zengmao Wang, Lina Liu, Chao Yang, Dawei Wang, Zhen Chen, Xiaoxiao Long, Meiqing Wang

In particular, we propose an Efficient Feature Distillation (EFD) module that employs contrastive learning to efficiently and accurately distill language embeddings derived from foundational models.

Contrastive Learning Robotic Grasping +1

MonoOcc: Digging into Monocular Semantic Occupancy Prediction

1 code implementation 13 Mar 2024 Yupeng Zheng, Xiang Li, Pengfei Li, Yuhang Zheng, Bu Jin, Chengliang Zhong, Xiaoxiao Long, Hao Zhao, Qichao Zhang

However, existing methods rely on a complex cascaded framework with relatively limited information to restore 3D scenes, including a dependency on supervision solely on the whole network's output, single-frame input, and the utilization of a small backbone.

3D geometry Autonomous Vehicles

PeFoMed: Parameter Efficient Fine-tuning of Multimodal Large Language Models for Medical Imaging

1 code implementation 5 Jan 2024 Gang Liu, Jinlong He, Pengfei Li, Genrong He, Zhaolin Chen, Shenjun Zhong

In this paper, we propose a parameter efficient framework for fine-tuning MLLMs, specifically validated on medical visual question answering (Med-VQA) and medical report generation (MRG) tasks, using public benchmark datasets.

Ranked #1 on Medical Visual Question Answering on VQA-RAD (using extra training data)

Medical Report Generation Medical Visual Question Answering +5

Reputation-Based Federated Learning Defense to Mitigate Threats in EEG Signal Classification

no code implementations 22 Oct 2023 Zhibo Zhang, Pengfei Li, Ahmed Y. Al Hammadi, Fusen Guo, Ernesto Damiani, Chan Yeob Yeun

This paper presents a reputation-based threat mitigation framework that defends against potential security threats in electroencephalogram (EEG) signal classification during model aggregation of Federated Learning.

Brain Computer Interface Data Poisoning +5

A Robust Adversary Detection-Deactivation Method for Metaverse-oriented Collaborative Deep Learning

no code implementations 21 Oct 2023 Pengfei Li, Zhibo Zhang, Ameena S. Al-Sumaiti, Naoufel Werghi, Chan Yeob Yeun

The Metaverse is trending toward creating a digital environment that can transfer the real world to an online platform supported by large quantities of real-time interactions.

Generative Adversarial Network

Learning Point-wise Abstaining Penalty for Point Cloud Anomaly Detection

1 code implementation 19 Sep 2023 Shaocong Xu, Pengfei Li, Xinyu Liu, Qianpu Sun, Yang Li, Shihui Guo, Zhen Wang, Bo Jiang, Rui Wang, Kehua Sheng, Bo Zhang, Hao Zhao

We demonstrate that learning different abstaining penalties, apart from point-wise penalty, for different types of (synthesized) outliers can further improve the performance.

Anomaly Detection Autonomous Driving +1

3D Implicit Transporter for Temporally Consistent Keypoint Discovery

1 code implementation ICCV 2023 Chengliang Zhong, Yuhang Zheng, Yupeng Zheng, Hao Zhao, Li Yi, Xiaodong Mu, Ling Wang, Pengfei Li, Guyue Zhou, Chao Yang, Xinliang Zhang, Jian Zhao

To address this issue, the Transporter method was introduced for 2D data, which reconstructs the target frame from the source frame to incorporate both spatial and temporal information.

Masked Vision and Language Pre-training with Unimodal and Multimodal Contrastive Losses for Medical Visual Question Answering

1 code implementation 11 Jul 2023 Pengfei Li, Gang Liu, Jinlong He, Zixu Zhao, Shenjun Zhong

Medical visual question answering (VQA) is a challenging task that requires answering clinical questions about a given medical image by taking into consideration both visual and language information.

Medical Visual Question Answering

Towards Environmentally Equitable AI via Geographical Load Balancing

1 code implementation 20 Jun 2023 Pengfei Li, Jianyi Yang, Adam Wierman, Shaolei Ren

The results demonstrate that existing GLB approaches may amplify environmental inequity while our proposed equity-aware GLB can significantly reduce the regional disparity in terms of carbon and water footprints.

Learning-Augmented Decentralized Online Convex Optimization in Networks

no code implementations 16 Jun 2023 Pengfei Li, Jianyi Yang, Adam Wierman, Shaolei Ren

This paper studies decentralized online convex optimization in a networked multi-agent system and proposes a novel algorithm, Learning-Augmented Decentralized Online optimization (LADO), for individual agents to select actions only based on local online information.

Learning for Edge-Weighted Online Bipartite Matching with Robustness Guarantees

1 code implementation 31 May 2023 Pengfei Li, Jianyi Yang, Shaolei Ren

The key novelty of LOMAR is a new online switching operation which, based on a judicious condition to hedge against future uncertainties, decides whether to follow the expert's decision or the RL decision for each online item.

Reinforcement Learning (RL)

Robustified Learning for Online Optimization with Memory Costs

no code implementations 1 May 2023 Pengfei Li, Jianyi Yang, Shaolei Ren

In this paper, we propose a novel expert-robustified learning (ERL) approach, achieving both good average performance and robustness.

Scheduling

DQS3D: Densely-matched Quantization-aware Semi-supervised 3D Detection

1 code implementation ICCV 2023 Huan-ang Gao, Beiwen Tian, Pengfei Li, Hao Zhao, Guyue Zhou

While this paradigm is natural for image-level or pixel-level prediction, adapting it to the detection problem is challenged by the issue of proposal matching.

3D Object Detection object-detection +1

Making AI Less "Thirsty": Uncovering and Addressing the Secret Water Footprint of AI Models

1 code implementation 6 Apr 2023 Pengfei Li, Jianyi Yang, Mohammad A. Islam, Shaolei Ren

To respond to the global water challenges, AI models can, and also must, take social responsibility and lead by example by addressing their own water footprint.

PanGu-Σ: Towards Trillion Parameter Language Model with Sparse Heterogeneous Computing

no code implementations 20 Mar 2023 Xiaozhe Ren, Pingyi Zhou, Xinfan Meng, Xinjing Huang, Yadao Wang, Weichao Wang, Pengfei Li, Xiaoda Zhang, Alexander Podolskiy, Grigory Arshinov, Andrey Bout, Irina Piontkovskaya, Jiansheng Wei, Xin Jiang, Teng Su, Qun Liu, Jun Yao

In this work, we develop a system that trained a trillion-parameter language model on a cluster of Ascend 910 AI processors and MindSpore framework, and present the language model with 1.085T parameters named PanGu-Σ.

Code Generation Language Modelling +4

LODE: Locally Conditioned Eikonal Implicit Scene Completion from Sparse LiDAR

1 code implementation 27 Feb 2023 Pengfei Li, Ruowen Zhao, Yongliang Shi, Hao Zhao, Jirui Yuan, Guyue Zhou, Ya-Qin Zhang

In this paper, we propose a novel Eikonal formulation that conditions the implicit representation on localized shape priors which function as dense boundary value constraints, and demonstrate it works on SemanticKITTI and SemanticPOSS.

Autonomous Driving Representation Learning

ADAPT: Action-aware Driving Caption Transformer

1 code implementation 1 Feb 2023 Bu Jin, Xinyu Liu, Yupeng Zheng, Pengfei Li, Hao Zhao, Tong Zhang, Yuhang Zheng, Guyue Zhou, Jingjing Liu

To bridge the gap, we propose an end-to-end transformer-based architecture, ADAPT (Action-aware Driving cAPtion Transformer), which provides user-friendly natural language narrations and reasoning for each decision making step of autonomous vehicular control and action.

Autonomous Driving Decision Making

Self-supervised vision-language pretraining for Medical visual question answering

2 code implementations 24 Nov 2022 Pengfei Li, Gang Liu, Lin Tan, Jinying Liao, Shenjun Zhong

Medical image visual question answering (VQA) is a task to answer clinical questions, given a radiographic image, which is a challenging problem that requires a model to integrate both vision and language information.

Contrastive Learning Image-text matching +6

TOIST: Task Oriented Instance Segmentation Transformer with Noun-Pronoun Distillation

1 code implementation 19 Oct 2022 Pengfei Li, Beiwen Tian, Yongliang Shi, Xiaoxue Chen, Hao Zhao, Guyue Zhou, Ya-Qin Zhang

As such, we study the challenging problem of task oriented detection, which aims to find objects that best afford an action indicated by verbs like sit comfortably on.

Instance Segmentation Referring Expression +2

City-scale Incremental Neural Mapping with Three-layer Sampling and Panoptic Representation

no code implementations 28 Sep 2022 Yongliang Shi, Runyi Yang, Pengfei Li, Zirui Wu, Hao Zhao, Guyue Zhou

Neural implicit representations are drawing a lot of attention from the robotics community recently, as they are expressive, continuous and compact.

LATITUDE: Robotic Global Localization with Truncated Dynamic Low-pass Filter in City-scale NeRF

1 code implementation 18 Sep 2022 Zhenxin Zhu, Yuantao Chen, Zirui Wu, Chao Hou, Yongliang Shi, Chuxuan Li, Pengfei Li, Hao Zhao, Guyue Zhou

In this paper, we present LATITUDE: Global Localization with Truncated Dynamic Low-pass Filter, which introduces a two-stage localization mechanism in city-scale NeRF.

Pose Prediction

Constrained Update Projection Approach to Safe Policy Optimization

3 code implementations 15 Sep 2022 Long Yang, Jiaming Ji, Juntao Dai, Linrui Zhang, Binbin Zhou, Pengfei Li, Yaodong Yang, Gang Pan

Compared to previous safe RL methods, CUP offers the following benefit: 1) it generalizes the surrogate functions to the generalized advantage estimator (GAE), leading to strong empirical performance.

Reinforcement Learning (RL) Safe Reinforcement Learning

TRIE++: Towards End-to-End Information Extraction from Visually Rich Documents

no code implementations 14 Jul 2022 Zhanzhan Cheng, Peng Zhang, Can Li, Qiao Liang, Yunlu Xu, Pengfei Li, ShiLiang Pu, Yi Niu, Fei Wu

Most existing methods divide this task into two subparts: the text reading part for obtaining the plain text from the original document images and the information extraction part for extracting key contents.

Language Modelling

DavarOCR: A Toolbox for OCR and Multi-Modal Document Understanding

1 code implementation 14 Jul 2022 Liang Qiao, Hui Jiang, Ying Chen, Can Li, Pengfei Li, Zaisheng Li, Baorui Zou, Dashan Guo, Yingda Xu, Yunlu Xu, Zhanzhan Cheng, Yi Niu

Compared with the previous open-source OCR toolbox, DavarOCR has relatively more complete support for the sub-tasks of the cutting-edge technology of document understanding.

document understanding Optical Character Recognition (OCR)

An Extendable Maneuver Management Framework with Fault-Tolerant Mechanism for Vehicle Platoon Control System in Highway Scenario

no code implementations 4 Jul 2022 Chang Liu, Yugong Luo, Pengfei Li, Chunhui Xing, Weiwei Kong

To deal with this problem, this paper introduces a two-dimensional maneuver management framework with a fault-tolerant mechanism on the basis of the proposed hierarchical architecture for the platoon control system.

Management

Expert-Calibrated Learning for Online Optimization with Switching Costs

no code implementations 18 Apr 2022 Pengfei Li, Jianyi Yang, Shaolei Ren

Nonetheless, by using the standard practice of training an ML model as a standalone optimizer and plugging it into an ML-augmented algorithm, the average cost performance can be highly unsatisfactory.

CUP: A Conservative Update Policy Algorithm for Safe Reinforcement Learning

1 code implementation 15 Feb 2022 Long Yang, Jiaming Ji, Juntao Dai, Yu Zhang, Pengfei Li, Gang Pan

Although using bounds as surrogate functions to design safe RL algorithms has appeared in some existing works, we develop them in at least three aspects: (i) We provide a rigorous theoretical analysis to extend the surrogate functions to the generalized advantage estimator (GAE).

reinforcement-learning Reinforcement Learning +3

Semi-supervised Implicit Scene Completion from Sparse LiDAR

1 code implementation 29 Nov 2021 Pengfei Li, Yongliang Shi, Tianyu Liu, Hao Zhao, Guyue Zhou, Ya-Qin Zhang

Recent advances show that semi-supervised implicit representation learning can be achieved through physical constraints like Eikonal equations.

Representation Learning

Rapid Assessments of Light-Duty Gasoline Vehicle Emissions Using On-Road Remote Sensing and Machine Learning

no code implementations 1 Oct 2021 Yan Xia, Linhui Jiang, Lu Wang, Xue Chen, Jianjie Ye, Tangyan Hou, Liqiang Wang, Yibo Zhang, Mengying Li, Zhen Li, Zhe Song, Yaping Jiang, Weiping Liu, Pengfei Li, Daniel Rosenfeld, John H. Seinfeld, Shaocai Yu

Our results show that the ORRS measurements, assisted by the machine-learning-based ensemble model developed here, can realize day-to-day supervision of on-road vehicle-specific emissions.

1st Place Solution to ICDAR 2021 RRC-ICTEXT End-to-end Text Spotting and Aesthetic Assessment on Integrated Circuit

no code implementations 8 Apr 2021 Qiyao Wang, Pengfei Li, Li Zhu, Yi Niu

For the text spotting task, we detect the characters on integrated circuits and classify them based on the YOLOv5 detection model.

Text Spotting

A cautionary tale in fitting galaxy rotation curves with Bayesian techniques: does Newton's constant vary from galaxy to galaxy?

no code implementations 27 Jan 2021 Pengfei Li, Federico Lelli, Stacy McGaugh, James Schombert, Kyu-Hyun Chae

The application of Bayesian techniques to astronomical data is generally non-trivial because the fitting parameters can be strongly degenerate and the formal uncertainties are themselves uncertain.

Astrophysics of Galaxies Cosmology and Nongalactic Astrophysics Instrumentation and Methods for Astrophysics

Towards Reducing Severe Defocus Spread Effects for Multi-Focus Image Fusion via an Optimization Based Strategy

1 code implementation 29 Dec 2020 Shuang Xu, Lizhen Ji, Zhe Wang, Pengfei Li, Kai Sun, Chunxia Zhang, Jiangshe Zhang

According to the idea that each local region in the fused image should be similar to the sharpest one among source images, this paper presents an optimization-based approach to reduce defocus spread effects.

Multi Focus Image Fusion SSIM

CARE: Commonsense-Aware Emotional Response Generation with Latent Concepts

no code implementations 15 Dec 2020 Peixiang Zhong, Di Wang, Pengfei Li, Chen Zhang, Hao Wang, Chunyan Miao

Experimental results on two large-scale datasets support our hypothesis and show that our model can produce more accurate and commonsense-aware emotional responses and achieve better human ratings than state-of-the-art models that only specialize in one aspect.

Response Generation

On Convergence of Gradient Expected Sarsa($λ$)

no code implementations 14 Dec 2020 Long Yang, Gang Zheng, Yu Zhang, Qian Zheng, Pengfei Li, Gang Pan

We study the convergence of $\mathtt{Expected~Sarsa}(\lambda)$ with linear function approximation.

DIDFuse: Deep Image Decomposition for Infrared and Visible Image Fusion

2 code implementations 20 Mar 2020 Zixiang Zhao, Shuang Xu, Chun-Xia Zhang, Junmin Liu, Pengfei Li, Jiangshe Zhang

Infrared and visible image fusion, a hot topic in the field of image processing, aims at obtaining fused images keeping the advantages of source images.

Decoder Infrared And Visible Image Fusion +1

Car Pose in Context: Accurate Pose Estimation with Ground Plane Constraints

no code implementations 9 Dec 2019 Pengfei Li, Weichao Qiu, Michael Peven, Gregory D. Hager, Alan L. Yuille

Scene context is a powerful constraint on the geometry of objects within the scene in cases, such as surveillance, where the camera geometry is unknown and image quality may be poor.

3D geometry Car Pose Estimation

Improving Relation Extraction with Knowledge-attention

no code implementations IJCNLP 2019 Pengfei Li, Kezhi Mao, Xuefeng Yang, Qi Li

While attention mechanisms have been proven to be effective in many NLP tasks, the majority of them are data-driven.

Relation Relation Extraction

Gradient Q$(σ, λ)$: A Unified Algorithm with Function Approximation for Reinforcement Learning

no code implementations 6 Sep 2019 Long Yang, Yu Zhang, Qian Zheng, Pengfei Li, Gang Pan

To address the above problem, we propose GQ$(\sigma,\lambda)$, which extends tabular Q$(\sigma,\lambda)$ with linear function approximation.

Q-Learning Reinforcement Learning +1

Expected Sarsa($λ$) with Control Variate for Variance Reduction

no code implementations 25 Jun 2019 Long Yang, Yu Zhang, Jun Wen, Qian Zheng, Pengfei Li, Gang Pan

In this paper, to reduce the variance, we introduce the control variate technique to $\mathtt{Expected}$ $\mathtt{Sarsa}$($\lambda$) and propose a tabular $\mathtt{ES}$($\lambda$)-$\mathtt{CV}$ algorithm.

Off-policy evaluation Reinforcement Learning

A Scalable Learned Index Scheme in Storage Systems

no code implementations 8 May 2019 Pengfei Li, Yu Hua, Pengfei Zuo, Jingnan Jia

Index structures, which are important for efficient data access, have been widely used to improve performance in many in-memory systems.

Qualitative Measurements of Policy Discrepancy for Return-Based Deep Q-Network

no code implementations 14 Jun 2018 Wenjia Meng, Qian Zheng, Long Yang, Pengfei Li, Gang Pan

In this paper, we propose a general framework to combine DQN and most of the return-based reinforcement learning algorithms, named R-DQN.

OpenAI Gym reinforcement-learning +2
