no code implementations • 27 Nov 2024 • Xiaoxuan Li, Yao Liu, Ruoyu Wang, Lina Yao
As the significance of understanding the cause-and-effect relationships among variables increases in the development of modern systems and algorithms, learning causality from observational data has become a preferred and efficient approach over conducting randomized control trials.
no code implementations • 18 Oct 2024 • Zhepeng Cen, Yao Liu, Siliang Zeng, Pratik Chaudhari, Huzefa Rangwala, George Karypis, Rasool Fakoor
Our first approach is Batch-Scheduled Sampling, where, during training, we stochastically choose between the ground-truth token from the dataset and the model's own generated token as input to predict the next token.
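The core stochastic choice can be sketched in a few lines. This is a toy per-token version with a fixed mixing probability; the paper's Batch-Scheduled Sampling schedules this choice during training, and the function name here is illustrative, not from the paper:

```python
import random

def scheduled_sampling_inputs(ground_truth, model_predictions, p_model, seed=0):
    """For each position, stochastically pick the next-token input:
    the model's own previous prediction with probability p_model,
    otherwise the ground-truth token from the dataset."""
    rng = random.Random(seed)
    inputs = []
    for gt, pred in zip(ground_truth, model_predictions):
        inputs.append(pred if rng.random() < p_model else gt)
    return inputs
```

Setting `p_model=0` recovers ordinary teacher forcing; `p_model=1` feeds the model only its own generations.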
no code implementations • 17 Oct 2024 • Ke Yang, Yao Liu, Sapana Chaudhary, Rasool Fakoor, Pratik Chaudhari, George Karypis, Huzefa Rangwala
On the other hand, there has been limited study of the misalignment between a web agent's observation/action representation and the pre-training data of the LLM it is based on.
no code implementations • 4 Sep 2024 • Ruoyu Wang, Yao Liu, Yuanjiang Cao, Lina Yao
We address these constraints by first exploring the unique differences between navigation tasks and other sequential-data tasks through the lens of causality, presenting a causal framework to elucidate the inadequacies of conventional sequential methods for navigation.
no code implementations • 25 Jun 2024 • Jesse Zhang, Minho Heo, Zuxin Liu, Erdem Biyik, Joseph J Lim, Yao Liu, Rasool Fakoor
Prior work in skill-based RL either requires expert supervision to define useful skills, which is hard to scale, or learns a skill-space from offline data with heuristics that limit the adaptability of the skills, making them difficult to transfer during downstream RL.
no code implementations • 3 Jun 2024 • Kavosh Asadi, Yao Liu, Shoham Sabach, Ming Yin, Rasool Fakoor
We focus on the task of learning the value function in the reinforcement learning (RL) setting.
no code implementations • 12 May 2024 • Yao Liu, Quan Z. Sheng, Lina Yao
In response, we propose the Energy Plan Denoising (EPD) model for stochastic trajectory prediction.
no code implementations • 11 May 2024 • Yao Liu, Ruoyu Wang, Yuanjiang Cao, Quan Z. Sheng, Lina Yao
The exploration of high-speed movement by robots or road traffic agents is crucial for autonomous driving and navigation.
no code implementations • 25 Feb 2024 • Jessica Echterhoff, Yao Liu, Abeer Alessa, Julian McAuley, Zexue He
Large language models (LLMs) offer significant potential as tools to support an expanding range of decision-making tasks.
1 code implementation • 13 Feb 2024 • Jianing Wang, Junda Wu, Yupeng Hou, Yao Liu, Ming Gao, Julian McAuley
In this paper, we propose InstructGraph, a framework that empowers LLMs with the abilities of graph reasoning and generation by instruction tuning and preference alignment.
no code implementations • 26 Dec 2023 • Yao Liu, Binghao Li, Xianzhi Wang, Claude Sammut, Lina Yao
We propose Attention-aware Social Graph Transformer Networks for multi-modal trajectory prediction.
1 code implementation • 30 Nov 2023 • Xinzhe Li, Sun Rui, Yiming Niu, Yao Liu
Specifically, the framework consists of a precipitation predictor with multiple lightweight heads (learners) and a controller that combines the outputs from these heads.
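One natural way to realize a controller that combines multiple lightweight heads is a softmax-weighted mixture. The sketch below is a minimal illustration under that assumption; the names and the softmax choice are mine, not details from the paper:

```python
import math

def combine_heads(head_outputs, controller_logits):
    """Combine scalar forecasts from multiple lightweight heads into one
    prediction, weighting each head by a softmax over controller logits
    (a convex combination). Returns (combined_prediction, weights)."""
    m = max(controller_logits)                       # stabilize the softmax
    exps = [math.exp(l - m) for l in controller_logits]
    z = sum(exps)
    weights = [e / z for e in exps]
    return sum(w * h for w, h in zip(weights, head_outputs)), weights
```

With equal logits the controller simply averages the heads; as one logit grows, the mixture approaches that single head's forecast.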
no code implementations • 13 Nov 2023 • Rui Duan, Zhe Qu, Leah Ding, Yao Liu, Zhuo Lu
Motivated by recent advancements in voice conversion (VC), we propose to use knowledge from a single short sentence to generate additional synthetic speech samples that sound like the target speaker, which we call parrot speech.
no code implementations • 21 Oct 2023 • Zexue He, Yu Wang, An Yan, Yao Liu, Eric Y. Chang, Amilcare Gentili, Julian McAuley, Chun-Nan Hsu
Curated datasets for healthcare are often limited due to the need for human annotation by experts.
no code implementations • 9 Oct 2023 • Zuxin Liu, Jesse Zhang, Kavosh Asadi, Yao Liu, Ding Zhao, Shoham Sabach, Rasool Fakoor
Inspired by recent advancements in parameter-efficient fine-tuning in language domains, we explore efficient fine-tuning techniques -- e.g., Bottleneck Adapters, P-Tuning, and Low-Rank Adaptation (LoRA) -- in TAIL to adapt large pretrained models for new tasks with limited demonstration data.
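Of the techniques listed, LoRA has a particularly compact form: the pretrained weight is kept frozen and a trainable low-rank additive update is learned on top of it. A minimal sketch of a LoRA-style forward pass (the `alpha / r` scaling follows the original LoRA formulation; `lora_forward` is an illustrative name, not TAIL's API):

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=16.0):
    """LoRA-style linear layer: frozen pretrained weight W plus a trainable
    rank-r update B @ A, scaled by alpha / r.
    Shapes: x (d_in,), W (d_out, d_in), A (r, d_in), B (d_out, r)."""
    r = A.shape[0]
    delta = B @ A                       # rank-r additive update to W
    return (W + (alpha / r) * delta) @ x
```

Initializing `B` to zeros (as is standard for LoRA) makes the adapted layer start out identical to the pretrained one.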
1 code implementation • 3 Aug 2023 • Yao Liu, Hang Shao, Bing Bai
This paper introduces a new Convolutional Neural Network (ConvNet) architecture inspired by a class of partial differential equations (PDEs) called quasi-linear hyperbolic systems.
no code implementations • 22 Jul 2023 • Yao Liu, Gangfeng Cui, Jiahui Luo, Xiaojun Chang, Lina Yao
Subsequently, a frame features learning module and a two-stream multi-level feature aggregation module extract global and partial features from the sampled frames, effectively representing the local-region spatial information, appearance information, and motion information related to the interactions.
no code implementations • NeurIPS 2023 • Yao Liu, Pratik Chaudhari, Rasool Fakoor
The main challenge of offline reinforcement learning, where data is limited, arises from a sequence of counterfactual reasoning dilemmas within the realm of potential actions: What if we were to choose a different course of action?
no code implementations • 11 Apr 2023 • Sherry Ruan, Allen Nie, William Steenbergen, Jiayu He, JQ Zhang, Meng Guo, Yao Liu, Kyle Dang Nguyen, Catherine Y Wang, Rui Ying, James A Landay, Emma Brunskill
Resource limitations make it hard to provide all students with one of the most effective educational interventions: personalized instruction.
no code implementations • 15 Mar 2023 • Yao Liu, Zesheng Ye, Rui Wang, Binghao Li, Quan Z. Sheng, Lina Yao
Tremendous efforts have been put forth on predicting pedestrian trajectory with generative models to accommodate uncertainty and multi-modality in human behaviors.
1 code implementation • NeurIPS 2023 • Xiangning Chen, Chen Liang, Da Huang, Esteban Real, Kaiyuan Wang, Yao Liu, Hieu Pham, Xuanyi Dong, Thang Luong, Cho-Jui Hsieh, Yifeng Lu, Quoc V. Le
On diffusion models, Lion outperforms Adam by achieving a better FID score and reducing the training compute by up to 2.3x.
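Lion's update rule is notably small: take the sign of an interpolation between the momentum buffer and the current gradient, apply decoupled weight decay, then refresh the momentum with a second interpolation. A minimal single-step sketch (hyperparameter defaults and the helper name are illustrative):

```python
import numpy as np

def lion_step(theta, grad, m, lr=1e-4, beta1=0.9, beta2=0.99, wd=0.0):
    """One Lion optimizer step.
    The update direction is sign(beta1 * m + (1 - beta1) * grad),
    so every coordinate moves by exactly +/- lr (plus weight decay)."""
    c = beta1 * m + (1 - beta1) * grad
    theta = theta - lr * (np.sign(c) + wd * theta)
    m = beta2 * m + (1 - beta2) * grad      # momentum refreshed with beta2
    return theta, m
```

Because the update is a sign vector, Lion needs only the momentum buffer as state, roughly half the optimizer memory of Adam.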
no code implementations • 26 Jul 2022 • Rui Duan, Zhe Qu, Shangqing Zhao, Leah Ding, Yao Liu, Zhuo Lu
In this work, we formulate the adversarial attack against music signals as a new perception-aware attack framework, which integrates human study into adversarial attack design.
no code implementations • 10 Jul 2022 • Yunyun Wang, Yao Liu, Songcan Chen
In this paper, we propose a new UniDA method with adaptive Unknown Authentication by Classifier Paradox (UACP), considering that samples with paradoxical predictions are probably unknowns belonging to none of the source classes.
1 code implementation • 1 Jul 2022 • Yao Liu, Yannis Flet-Berliac, Emma Brunskill
Offline policy optimization could have a large impact on many real-world decision-making problems, as online learning may be infeasible in many applications.
1 code implementation • 6 Jun 2022 • Zhe Qu, Xingyu Li, Rui Duan, Yao Liu, Bo Tang, Zhuo Lu
Therefore, in this paper, we revisit the solutions to the distribution shift problem in FL with a focus on local learning generality.
no code implementations • 27 May 2022 • Yao Liu, Dipendra Misra, Miro Dudík, Robert E. Schapire
We study reinforcement learning (RL) in settings where observations are high-dimensional, but where an RL agent has access to abstract knowledge about the structure of the state space, as is the case, for example, when a robot is tasked to go to a specific room in a building using observations from its own camera, while having access to the floor plan.
1 code implementation • 11 Feb 2022 • Yabin Zhu, Chenglong Li, Yao Liu, Xiao Wang, Jin Tang, Bin Luo, Zhixiang Huang
Tiny objects, frequently appearing in practical applications, have weak appearance and features, and receive increasing interest in many vision tasks, such as object detection and segmentation.
no code implementations • 10 Jan 2022 • Tao Hou, Tao Wang, Zhuo Lu, Yao Liu, Yalin Sagduyu
In this research, we propose a novel attack strategy named IoTGAN to manipulate an IoT device's traffic such that it can evade machine learning-based IoT device identification.
no code implementations • 8 Jan 2022 • Xingyu Li, Zhe Qu, Shangqing Zhao, Bo Tang, Zhuo Lu, Yao Liu
Federated learning (FL) provides a highly efficient decentralized machine learning framework, where the training data remains distributed at remote clients in a network.
no code implementations • 2 Dec 2021 • Zhe Qu, Rui Duan, Lixing Chen, Jie Xu, Zhuo Lu, Yao Liu
In addition, client selection for HFL faces more challenges than conventional FL, e.g., the time-varying connection of client-ES pairs and the limited budget of the Network Operator (NO).
no code implementations • 29 Sep 2021 • Yao Liu, Emma Brunskill
Offline policy optimization has a critical impact on many real-world decision-making problems, as online learning is costly and concerning in many applications.
no code implementations • 21 Jun 2021 • Na Li, Yao Liu
We further apply our proposed methods to super-resolution, yielding the first spherical super-resolution model that operates directly on a mesh representation of spherical pixels of 360-degree data.
no code implementations • NeurIPS 2020 • Yao Liu, Adith Swaminathan, Alekh Agarwal, Emma Brunskill
Doing batch RL in a way that yields a reliable new policy in large domains is challenging: a new decision policy may visit states and actions outside the support of the batch data, and function approximation and optimization with limited samples can further increase the potential of learning policies with overly optimistic estimates of their future performance.
1 code implementation • 16 Jul 2020 • Yao Liu, Adith Swaminathan, Alekh Agarwal, Emma Brunskill
Doing batch RL in a way that yields a reliable new policy in large domains is challenging: a new decision policy may visit states and actions outside the support of the batch data, and function approximation and optimization with limited samples can further increase the potential of learning policies with overly optimistic estimates of their future performance.
no code implementations • ICML 2020 • Omer Gottesman, Joseph Futoma, Yao Liu, Sonali Parbhoo, Leo Anthony Celi, Emma Brunskill, Finale Doshi-Velez
Off-policy evaluation in reinforcement learning offers the chance of using observational data to improve future outcomes in domains such as healthcare and education, but safe deployment in high stakes settings requires ways of assessing its validity.
no code implementations • 20 Nov 2019 • Sheng Jin, Shangchen Zhou, Yao Liu, Chao Chen, Xiaoshuai Sun, Hongxun Yao, Xian-Sheng Hua
In this paper, we propose a novel Semi-supervised Self-paced Adversarial Hashing method, named SSAH, to solve the above problems in a unified framework.
no code implementations • 21 Oct 2019 • Benjamin Petit, Loren Amdahl-Culleton, Yao Liu, Jimmy Smith, Pierre-Luc Bacon
While often stated as an instance of the likelihood ratio trick [Rubinstein, 1989], the original policy gradient theorem [Sutton, 1999] involves an integral over the action space.
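The contrast drawn above can be written out explicitly. In standard notation (with d^pi the discounted state distribution and Q^pi the action-value function; this is textbook material, not reproduced from the paper), the likelihood-ratio form is an expectation over sampled actions, while the original theorem's form integrates over the whole action space:

```latex
% Likelihood-ratio ("score function") form: an expectation over actions
\nabla_\theta J(\theta)
  = \mathbb{E}_{s \sim d^\pi,\; a \sim \pi_\theta}
      \left[ \nabla_\theta \log \pi_\theta(a \mid s)\, Q^\pi(s, a) \right]

% All-action form of the original policy gradient theorem:
% an integral over the entire action space
\nabla_\theta J(\theta)
  = \int_{\mathcal{S}} d^\pi(s)
      \int_{\mathcal{A}} \nabla_\theta \pi_\theta(a \mid s)\, Q^\pi(s, a)
      \, \mathrm{d}a \, \mathrm{d}s
```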
no code implementations • ICML 2020 • Yao Liu, Pierre-Luc Bacon, Emma Brunskill
Surprisingly, we find that in finite-horizon MDPs there is no strict variance reduction of per-decision importance sampling or stationary importance sampling compared with vanilla importance sampling.
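For concreteness, the two main estimators being compared can be sketched on a single trajectory (the stationary variant is omitted; function names are illustrative). Vanilla importance sampling weights the whole return by the full product of per-step ratios, while per-decision importance sampling weights each reward only by the ratios up to that step:

```python
def vanilla_is(rewards, behavior_probs, target_probs, gamma=1.0):
    """Trajectory-wise IS: the entire return is scaled by the product
    of likelihood ratios over the whole trajectory."""
    rho = 1.0
    for pb, pt in zip(behavior_probs, target_probs):
        rho *= pt / pb
    return rho * sum(gamma**t * r for t, r in enumerate(rewards))

def per_decision_is(rewards, behavior_probs, target_probs, gamma=1.0):
    """Per-decision IS: reward r_t is scaled only by the ratios of
    the first t+1 steps, which cannot increase its magnitude of weighting."""
    est, rho = 0.0, 1.0
    for t, (r, pb, pt) in enumerate(zip(rewards, behavior_probs, target_probs)):
        rho *= pt / pb
        est += rho * gamma**t * r
    return est
```

Both are unbiased under the usual support conditions; the paper's point is that, despite the intuition, neither reweighting strictly reduces variance relative to the vanilla estimator in finite-horizon MDPs.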
no code implementations • 14 May 2019 • Omer Gottesman, Yao Liu, Scott Sussex, Emma Brunskill, Finale Doshi-Velez
We consider a model-based approach to perform batch off-policy evaluation in reinforcement learning.
no code implementations • 17 Apr 2019 • Yao Liu, Adith Swaminathan, Alekh Agarwal, Emma Brunskill
We study the problem of off-policy policy optimization in Markov decision processes, and develop a novel off-policy policy gradient method.
no code implementations • 27 Feb 2019 • Yao Liu, Ying Tai, Jilin Li, Shouhong Ding, Chengjie Wang, Feiyue Huang, Dongyang Li, Wenshuai Qi, Rongrong Ji
In this paper, we propose a light-reflection-based face anti-spoofing method named Aurora Guard (AG), which is fast, simple, and effective, and has already been deployed in real-world systems serving millions of users.
no code implementations • 3 Jul 2018 • Aniruddh Raghu, Omer Gottesman, Yao Liu, Matthieu Komorowski, Aldo Faisal, Finale Doshi-Velez, Emma Brunskill
In this work, we consider the problem of estimating a behaviour policy for use in Off-Policy Policy Evaluation (OPE) when the true behaviour policy is unknown.
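The simplest instance of this problem setup is a count-based estimate of the behaviour policy from logged state-action pairs, which can then be plugged into an importance-sampling OPE estimator. A toy sketch (not the paper's actual estimator; names are illustrative):

```python
from collections import Counter, defaultdict

def estimate_behavior_policy(logged_pairs):
    """Estimate an unknown behaviour policy from logged (state, action)
    pairs via empirical per-state action frequencies.
    Returns {state: {action: probability}}."""
    counts = defaultdict(Counter)
    for state, action in logged_pairs:
        counts[state][action] += 1
    return {
        state: {a: c / sum(cnt.values()) for a, c in cnt.items()}
        for state, cnt in counts.items()
    }
```

In practice (and as the paper's setting suggests) states are rarely discrete and revisited, which is exactly why estimating the behaviour policy well is nontrivial.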
1 code implementation • NeurIPS 2018 • Yao Liu, Omer Gottesman, Aniruddh Raghu, Matthieu Komorowski, Aldo Faisal, Finale Doshi-Velez, Emma Brunskill
We study the problem of off-policy policy evaluation (OPPE) in RL.
no code implementations • 23 May 2018 • Yao Liu, Emma Brunskill
Efficient exploration is one of the key challenges for reinforcement learning (RL) algorithms.