Search Results for author: Yao Liu

Found 44 papers, 9 papers with code

Regularized Multi-LLMs Collaboration for Enhanced Score-based Causal Discovery

no code implementations • 27 Nov 2024 • Xiaoxuan Li, Yao Liu, Ruoyu Wang, Lina Yao

As the significance of understanding the cause-and-effect relationships among variables increases in the development of modern systems and algorithms, learning causality from observational data has become a preferred and efficient approach over conducting randomized control trials.

Causal Discovery

Bridging the Training-Inference Gap in LLMs by Leveraging Self-Generated Tokens

no code implementations • 18 Oct 2024 • Zhepeng Cen, Yao Liu, Siliang Zeng, Pratik Chaudhari, Huzefa Rangwala, George Karypis, Rasool Fakoor

Our first approach is Batch-Scheduled Sampling, where, during training, we stochastically choose between the ground-truth token from the dataset and the model's own generated token as input to predict the next token.

Math Question Answering

AgentOccam: A Simple Yet Strong Baseline for LLM-Based Web Agents

no code implementations • 17 Oct 2024 • Ke Yang, Yao Liu, Sapana Chaudhary, Rasool Fakoor, Pratik Chaudhari, George Karypis, Huzefa Rangwala

On the other hand, there has been limited study on the misalignment between a web agent's observation/action representation and the pre-training data of the LLM it's based on.

Causality-Aware Transformer Networks for Robotic Navigation

no code implementations • 4 Sep 2024 • Ruoyu Wang, Yao Liu, Yuanjiang Cao, Lina Yao

We address these constraints by initially exploring the unique differences between Navigation tasks and other sequential data tasks through the lens of Causality, presenting a causal framework to elucidate the inadequacies of conventional sequential methods for Navigation.

Visual Navigation

EXTRACT: Efficient Policy Learning by Extracting Transferable Robot Skills from Offline Data

no code implementations • 25 Jun 2024 • Jesse Zhang, Minho Heo, Zuxin Liu, Erdem Biyik, Joseph J Lim, Yao Liu, Rasool Fakoor

Prior work in skill-based RL either requires expert supervision to define useful skills, which is hard to scale, or learns a skill-space from offline data with heuristics that limit the adaptability of the skills, making them difficult to transfer during downstream RL.

Reinforcement Learning (RL) Robot Manipulation

Learning the Target Network in Function Space

no code implementations • 3 Jun 2024 • Kavosh Asadi, Yao Liu, Shoham Sabach, Ming Yin, Rasool Fakoor

We focus on the task of learning the value function in the reinforcement learning (RL) setting.

Reinforcement Learning (RL)

Multi-agent Traffic Prediction via Denoised Endpoint Distribution

no code implementations • 11 May 2024 • Yao Liu, Ruoyu Wang, Yuanjiang Cao, Quan Z. Sheng, Lina Yao

The exploration of high-speed movement by robots or road traffic agents is crucial for autonomous driving and navigation.

Autonomous Driving Traffic Prediction +1

Cognitive Bias in Decision-Making with LLMs

no code implementations • 25 Feb 2024 • Jessica Echterhoff, Yao Liu, Abeer Alessa, Julian McAuley, Zexue He

Large language models (LLMs) offer significant potential as tools to support an expanding range of decision-making tasks.

Decision Making

InstructGraph: Boosting Large Language Models via Graph-centric Instruction Tuning and Preference Alignment

1 code implementation • 13 Feb 2024 • Jianing Wang, Junda Wu, Yupeng Hou, Yao Liu, Ming Gao, Julian McAuley

In this paper, we propose InstructGraph, a framework that empowers LLMs with the abilities of graph reasoning and generation by instruction tuning and preference alignment.

Hallucination

Precipitation Prediction Using an Ensemble of Lightweight Learners

1 code implementation • 30 Nov 2023 • Xinzhe Li, Sun Rui, Yiming Niu, Yao Liu

Specifically, the framework consists of a precipitation predictor with multiple lightweight heads (learners) and a controller that combines the outputs from these heads.

Ensemble Learning

Parrot-Trained Adversarial Examples: Pushing the Practicality of Black-Box Audio Attacks against Speaker Recognition Models

no code implementations • 13 Nov 2023 • Rui Duan, Zhe Qu, Leah Ding, Yao Liu, Zhuo Lu

Motivated by recent advancements in voice conversion (VC), we propose to use the knowledge of one short sentence to generate more synthetic speech samples that sound like the target speaker, called parrot speech.

Sentence Speaker Recognition +1

TAIL: Task-specific Adapters for Imitation Learning with Large Pretrained Models

no code implementations • 9 Oct 2023 • Zuxin Liu, Jesse Zhang, Kavosh Asadi, Yao Liu, Ding Zhao, Shoham Sabach, Rasool Fakoor

Inspired by recent advancements in parameter-efficient fine-tuning in language domains, we explore efficient fine-tuning techniques -- e.g., Bottleneck Adapters, P-Tuning, and Low-Rank Adaptation (LoRA) -- in TAIL to adapt large pretrained models for new tasks with limited demonstration data.

Continual Learning Imitation Learning +1

A Novel Convolutional Neural Network Architecture with a Continuous Symmetry

1 code implementation • 3 Aug 2023 • Yao Liu, Hang Shao, Bing Bai

This paper introduces a new Convolutional Neural Network (ConvNet) architecture inspired by a class of partial differential equations (PDEs) called quasi-linear hyperbolic systems.

Image Classification

Two-stream Multi-level Dynamic Point Transformer for Two-person Interaction Recognition

no code implementations • 22 Jul 2023 • Yao Liu, Gangfeng Cui, Jiahui Luo, Xiaojun Chang, Lina Yao

Subsequently, a frame features learning module and a two-stream multi-level feature aggregation module extract global and partial features from the sampled frames, effectively representing the local-region spatial information, appearance information, and motion information related to the interactions.

Action Recognition Temporal Action Localization

Budgeting Counterfactual for Offline RL

no code implementations • NeurIPS 2023 • Yao Liu, Pratik Chaudhari, Rasool Fakoor

The main challenge of offline reinforcement learning, where data is limited, arises from a sequence of counterfactual reasoning dilemmas within the realm of potential actions: What if we were to choose a different course of action?

counterfactual Counterfactual Reasoning +2

Uncertainty-Aware Pedestrian Trajectory Prediction via Distributional Diffusion

no code implementations • 15 Mar 2023 • Yao Liu, Zesheng Ye, Rui Wang, Binghao Li, Quan Z. Sheng, Lina Yao

Tremendous efforts have been put forth on predicting pedestrian trajectory with generative models to accommodate uncertainty and multi-modality in human behaviors.

Denoising Pedestrian Trajectory Prediction +1

Perception-Aware Attack: Creating Adversarial Music via Reverse-Engineering Human Perception

no code implementations • 26 Jul 2022 • Rui Duan, Zhe Qu, Shangqing Zhao, Leah Ding, Yao Liu, Zhuo Lu

In this work, we formulate the adversarial attack against music signals as a new perception-aware attack framework, which integrates human study into adversarial attack design.

Adversarial Attack Speaker Recognition +2

Towards Adaptive Unknown Authentication for Universal Domain Adaptation by Classifier Paradox

no code implementations • 10 Jul 2022 • Yunyun Wang, Yao Liu, Songcan Chen

In this paper, we propose a new UniDA method with adaptive Unknown Authentication by Classifier Paradox (UACP), considering that samples with paradoxical predictions are probably unknowns belonging to none of the source classes.

Universal Domain Adaptation Unsupervised Domain Adaptation

Offline Policy Optimization with Eligible Actions

1 code implementation • 1 Jul 2022 • Yao Liu, Yannis Flet-Berliac, Emma Brunskill

Offline policy optimization could have a large impact on many real-world decision-making problems, as online learning may be infeasible in many applications.

continuous-control Continuous Control +1

Generalized Federated Learning via Sharpness Aware Minimization

1 code implementation • 6 Jun 2022 • Zhe Qu, Xingyu Li, Rui Duan, Yao Liu, Bo Tang, Zhuo Lu

Therefore, in this paper, we revisit the solutions to the distribution shift problem in FL with a focus on local learning generality.

Federated Learning Privacy Preserving

Provably Sample-Efficient RL with Side Information about Latent Dynamics

no code implementations • 27 May 2022 • Yao Liu, Dipendra Misra, Miro Dudík, Robert E. Schapire

We study reinforcement learning (RL) in settings where observations are high-dimensional, but where an RL agent has access to abstract knowledge about the structure of the state space, as is the case, for example, when a robot is tasked to go to a specific room in a building using observations from its own camera, while having access to the floor plan.

reinforcement-learning Reinforcement Learning (RL) +1

Tiny Object Tracking: A Large-scale Dataset and A Baseline

1 code implementation • 11 Feb 2022 • Yabin Zhu, Chenglong Li, Yao Liu, Xiao Wang, Jin Tang, Bin Luo, Zhixiang Huang

Tiny objects, frequently appearing in practical applications, have weak appearance and features, and receive increasing interest in many vision tasks, such as object detection and segmentation.

Attribute Knowledge Distillation +4

IoTGAN: GAN Powered Camouflage Against Machine Learning Based IoT Device Identification

no code implementations • 10 Jan 2022 • Tao Hou, Tao Wang, Zhuo Lu, Yao Liu, Yalin Sagduyu

In this research, we propose a novel attack strategy named IoTGAN to manipulate an IoT device's traffic such that it can evade machine learning based IoT device identification.

BIG-bench Machine Learning IoT Device Identification

LoMar: A Local Defense Against Poisoning Attack on Federated Learning

no code implementations • 8 Jan 2022 • Xingyu Li, Zhe Qu, Shangqing Zhao, Bo Tang, Zhuo Lu, Yao Liu

Federated learning (FL) provides a highly efficient decentralized machine learning framework, where the training data remains distributed at remote clients in a network.

Density Estimation Edge-computing +2

Context-Aware Online Client Selection for Hierarchical Federated Learning

no code implementations • 2 Dec 2021 • Zhe Qu, Rui Duan, Lixing Chen, Jie Xu, Zhuo Lu, Yao Liu

In addition, client selection for HFL faces more challenges than conventional FL, e.g., the time-varying connection of client-ES pairs and the limited budget of the Network Operator (NO).

Federated Learning

Avoiding Overfitting to the Importance Weights in Offline Policy Optimization

no code implementations • 29 Sep 2021 • Yao Liu, Emma Brunskill

Offline policy optimization has a critical impact on many real-world decision-making problems, as online learning is costly and concerning in many applications.

Decision Making

Applying VertexShuffle Toward 360-Degree Video Super-Resolution on Focused-Icosahedral-Mesh

no code implementations • 21 Jun 2021 • Na Li, Yao Liu

We further apply our proposed methods to a super-resolution model, which is the first spherical super-resolution model that directly operates on a mesh representation of spherical pixels of 360-degree data.

Video Super-Resolution

Provably Good Batch Off-Policy Reinforcement Learning Without Great Exploration

no code implementations • NeurIPS 2020 • Yao Liu, Adith Swaminathan, Alekh Agarwal, Emma Brunskill

Doing batch RL in a way that yields a reliable new policy in large domains is challenging: a new decision policy may visit states and actions outside the support of the batch data, and function approximation and optimization with limited samples can further increase the potential of learning policies with overly optimistic estimates of their future performance.

reinforcement-learning Reinforcement Learning +1

Provably Good Batch Reinforcement Learning Without Great Exploration

1 code implementation • 16 Jul 2020 • Yao Liu, Adith Swaminathan, Alekh Agarwal, Emma Brunskill

Doing batch RL in a way that yields a reliable new policy in large domains is challenging: a new decision policy may visit states and actions outside the support of the batch data, and function approximation and optimization with limited samples can further increase the potential of learning policies with overly optimistic estimates of their future performance.

reinforcement-learning Reinforcement Learning +1

Interpretable Off-Policy Evaluation in Reinforcement Learning by Highlighting Influential Transitions

no code implementations • ICML 2020 • Omer Gottesman, Joseph Futoma, Yao Liu, Sonali Parbhoo, Leo Anthony Celi, Emma Brunskill, Finale Doshi-Velez

Off-policy evaluation in reinforcement learning offers the chance of using observational data to improve future outcomes in domains such as healthcare and education, but safe deployment in high stakes settings requires ways of assessing its validity.

Off-policy evaluation reinforcement-learning +1

SSAH: Semi-supervised Adversarial Deep Hashing with Self-paced Hard Sample Generation

no code implementations • 20 Nov 2019 • Sheng Jin, Shangchen Zhou, Yao Liu, Chao Chen, Xiaoshuai Sun, Hongxun Yao, Xian-Sheng Hua

In this paper, we propose a novel Semi-supervised Self-pace Adversarial Hashing method, named SSAH, to solve the above problems in a unified framework.

Deep Hashing Generative Adversarial Network

All-Action Policy Gradient Methods: A Numerical Integration Approach

no code implementations • 21 Oct 2019 • Benjamin Petit, Loren Amdahl-Culleton, Yao Liu, Jimmy Smith, Pierre-Luc Bacon

While often stated as an instance of the likelihood ratio trick [Rubinstein, 1989], the original policy gradient theorem [Sutton, 1999] involves an integral over the action space.

continuous-control Continuous Control +2

Understanding the Curse of Horizon in Off-Policy Evaluation via Conditional Importance Sampling

no code implementations • ICML 2020 • Yao Liu, Pierre-Luc Bacon, Emma Brunskill

Surprisingly, we find that in finite horizon MDPs there is no strict variance reduction of per-decision importance sampling or stationary importance sampling, compared with vanilla importance sampling.

Off-policy evaluation Reinforcement Learning

Off-Policy Policy Gradient with State Distribution Correction

no code implementations • 17 Apr 2019 • Yao Liu, Adith Swaminathan, Alekh Agarwal, Emma Brunskill

We study the problem of off-policy policy optimization in Markov decision processes, and develop a novel off-policy policy gradient method.

Aurora Guard: Real-Time Face Anti-Spoofing via Light Reflection

no code implementations • 27 Feb 2019 • Yao Liu, Ying Tai, Jilin Li, Shouhong Ding, Chengjie Wang, Feiyue Huang, Dongyang Li, Wenshuai Qi, Rongrong Ji

In this paper, we propose a light reflection based face anti-spoofing method named Aurora Guard (AG), which is fast, simple, yet effective, and has already been deployed in real-world systems serving millions of users.

Face Anti-Spoofing General Classification

Behaviour Policy Estimation in Off-Policy Policy Evaluation: Calibration Matters

no code implementations • 3 Jul 2018 • Aniruddh Raghu, Omer Gottesman, Yao Liu, Matthieu Komorowski, Aldo Faisal, Finale Doshi-Velez, Emma Brunskill

In this work, we consider the problem of estimating a behaviour policy for use in Off-Policy Policy Evaluation (OPE) when the true behaviour policy is unknown.
