Search Results for author: Shuang Qiu

Found 33 papers, 8 papers with code

Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards

1 code implementation 28 Feb 2024 Haoxiang Wang, Yong Lin, Wei Xiong, Rui Yang, Shizhe Diao, Shuang Qiu, Han Zhao, Tong Zhang

Additionally, DPA models user preferences as directions (i.e., unit vectors) in the reward space to achieve user-dependent preference control.

Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment

1 code implementation 15 Feb 2024 Rui Yang, Xiaoman Pan, Feng Luo, Shuang Qiu, Han Zhong, Dong Yu, Jianshu Chen

We consider the problem of multi-objective alignment of foundation models with human preferences, which is a critical step towards helpful and harmless AI systems.

Reinforcement Learning (RL)

A Temporal-Spectral Fusion Transformer with Subject-specific Adapter for Enhancing RSVP-BCI Decoding

no code implementations 12 Jan 2024 Xujin Li, Wei Wei, Shuang Qiu, Huiguang He

The performance improvement of traditional decoding methods relies on a substantial amount of training data from new test subjects, which increases preparation time for BCI systems.

Brain Computer Interface EEG

StairNetV3: Depth-aware Stair Modeling using Deep Learning

no code implementations 13 Aug 2023 Chen Wang, Zhongcai Pei, Shuang Qiu, Yachun Wang, Zhiyong Tang

Experiments on our dataset show that our method significantly improves on the previous best monocular vision method, with an intersection over union (IOU) increase of 3.4%, and that the lightweight version has a fast detection speed and can meet the requirements of most real-time applications.

Point cloud reconstruction

On the Value of Myopic Behavior in Policy Reuse

no code implementations 28 May 2023 Kang Xu, Chenjia Bai, Shuang Qiu, Haoran He, Bin Zhao, Zhen Wang, Wei Li, Xuelong Li

Leveraging learned strategies in unfamiliar scenarios is fundamental to human intelligence.

RGB-D-based Stair Detection using Deep Learning for Autonomous Stair Climbing

no code implementations 2 Dec 2022 Chen Wang, Zhongcai Pei, Shuang Qiu, Zhiyong Tang

Specifically, we design a selective module, which can make the network learn the complementary relationship between the RGB map and the depth map and effectively combine the information from the RGB map and the depth map in different scenes.

Contrastive UCB: Provably Efficient Contrastive Self-Supervised Learning in Online Reinforcement Learning

1 code implementation 29 Jul 2022 Shuang Qiu, Lingxiao Wang, Chenjia Bai, Zhuoran Yang, Zhaoran Wang

Moreover, under the online setting, we propose novel upper confidence bound (UCB)-type algorithms that incorporate such a contrastive loss with online RL algorithms for MDPs or MGs.

Contrastive Learning reinforcement-learning +3

Provably Efficient Fictitious Play Policy Optimization for Zero-Sum Markov Games with Structured Transitions

no code implementations 25 Jul 2022 Shuang Qiu, Xiaohan Wei, Jieping Ye, Zhaoran Wang, Zhuoran Yang

Our algorithms feature a combination of Upper Confidence Bound (UCB)-type optimism and fictitious play under the scope of simultaneous policy optimization in a non-stationary environment.

Stochastic Gradient Descent without Full Data Shuffle

1 code implementation 12 Jun 2022 Lijie Xu, Shuang Qiu, Binhang Yuan, Jiawei Jiang, Cedric Renggli, Shaoduo Gan, Kaan Kara, Guoliang Li, Ji Liu, Wentao Wu, Jieping Ye, Ce Zhang

In this paper, we first conduct a systematic empirical study on existing data shuffling strategies, which reveals that all existing strategies have room for improvement -- they all suffer in terms of I/O performance or convergence rate.

Computational Efficiency

Safe Screening for Sparse Conditional Random Fields

no code implementations 27 Nov 2021 Weizhong Zhang, Shuang Qiu

To the best of our knowledge, this is the first screening method which introduces the dual optimum estimation technique -- by carefully exploring and exploiting the strong convexity and the complex structure of the dual problem -- in static screening methods to dynamic screening.

Structured Prediction

On Reward-Free RL with Kernel and Neural Function Approximations: Single-Agent MDP and Markov Game

no code implementations 19 Oct 2021 Shuang Qiu, Jieping Ye, Zhaoran Wang, Zhuoran Yang

Then, given any extrinsic reward, the agent computes the policy via a planning algorithm with offline data collected in the exploration phase.

Reinforcement Learning (RL)

Stylized Neural Painting

4 code implementations CVPR 2021 Zhengxia Zou, Tianyang Shi, Shuang Qiu, Yi Yuan, Zhenwei Shi

Different from previous image-to-image translation methods that formulate the translation as pixel-wise prediction, we deal with such an artistic creation process in a vectorized environment and produce a sequence of physically meaningful stroke parameters that can be further used for rendering.

Disentanglement Image-to-Image Translation +2

Single-Timescale Stochastic Nonconvex-Concave Optimization for Smooth Nonlinear TD Learning

no code implementations 23 Aug 2020 Shuang Qiu, Zhuoran Yang, Xiaohan Wei, Jieping Ye, Zhaoran Wang

Existing approaches for this problem are based on two-timescale or double-loop stochastic gradient algorithms, which may also require sampling large-batch data.

Low-Resource Generation of Multi-hop Reasoning Questions

no code implementations ACL 2020 Jianxing Yu, Wei Liu, Shuang Qiu, Qinliang Su, Kai Wang, Xiaojun Quan, Jian Yin

Specifically, we first build a multi-hop generation model and guide it to satisfy the logical rationality by the reasoning chain extracted from a given text.

Machine Reading Comprehension valid

Gradient-Variation Bound for Online Convex Optimization with Constraints

no code implementations 22 Jun 2020 Shuang Qiu, Xiaohan Wei, Mladen Kolar

We study online convex optimization with constraints consisting of multiple functional constraints and a relatively simple constraint set, such as a Euclidean ball.

Energy-Aware DNN Graph Optimization

1 code implementation 12 May 2020 Yu Wang, Rong Ge, Shuang Qiu

Unlike existing work on deep neural network (DNN) graph optimization for inference performance, we explore DNN graph optimization for energy awareness and savings on power- and resource-constrained machine learning devices.

Referring Image Segmentation by Generative Adversarial Learning

no code implementations IEEE 2020 Shuang Qiu, Yao Zhao, Jianbo Jiao, Yunchao Wei, Shikui Wei

To this end, we propose to train the referring image segmentation model in a generative adversarial fashion, which well addresses the distribution similarity problem.

Image Segmentation Referring Expression +4

Upper Confidence Primal-Dual Reinforcement Learning for CMDP with Adversarial Loss

no code implementations NeurIPS 2020 Shuang Qiu, Xiaohan Wei, Zhuoran Yang, Jieping Ye, Zhaoran Wang

In particular, we prove that the proposed algorithm achieves $\widetilde{\mathcal{O}}(L|\mathcal{S}|\sqrt{|\mathcal{A}|T})$ upper bounds of both the regret and the constraint violation, where $L$ is the length of each episode.

reinforcement-learning Reinforcement Learning (RL)

Central Server Free Federated Learning over Single-sided Trust Social Networks

1 code implementation 11 Oct 2019 Chaoyang He, Conghui Tan, Hanlin Tang, Shuang Qiu, Ji Liu

However, in many social network scenarios, centralized federated learning is not applicable (e.g., a central agent or server connecting all users may not exist, or the communication cost to the central server is not affordable).

Federated Learning

Robust One-Bit Recovery via ReLU Generative Networks: Improved Statistical Rate and Global Landscape Analysis

no code implementations NeurIPS Workshop Deep_Invers 2019 Shuang Qiu, Xiaohan Wei, Zhuoran Yang

In this paper, we consider a new framework for the one-bit sensing problem where the sparsity is implicitly enforced via mapping a low dimensional representation $x_0$ through a known $n$-layer ReLU generative network $G:\mathbb{R}^k\rightarrow\mathbb{R}^d$.

Robust One-Bit Recovery via ReLU Generative Networks: Near-Optimal Statistical Rate and Global Landscape Analysis

no code implementations ICML 2020 Shuang Qiu, Xiaohan Wei, Zhuoran Yang

Specifically, we consider a new framework for this problem where the sparsity is implicitly enforced via mapping a low dimensional representation $x_0 \in \mathbb{R}^k$ through a known $n$-layer ReLU generative network $G:\mathbb{R}^k\rightarrow\mathbb{R}^d$ such that $\theta_0 = G(x_0)$.

$\texttt{DeepSqueeze}$: Decentralization Meets Error-Compensated Compression

no code implementations 17 Jul 2019 Hanlin Tang, Xiangru Lian, Shuang Qiu, Lei Yuan, Ce Zhang, Tong Zhang, Ji Liu

Since decentralized training has been shown to be superior to traditional centralized training in communication-restricted scenarios, a natural question to ask is how to apply the error-compensated technology to decentralized learning to further reduce the communication cost.

Decentralized Online Learning: Take Benefits from Others' Data without Sharing Your Own to Track Global Trend

no code implementations 29 Jan 2019 Yawei Zhao, Chen Yu, Peilin Zhao, Hanlin Tang, Shuang Qiu, Ji Liu

Decentralized Online Learning (online learning in decentralized networks) is attracting increasing attention, since it is believed to help data providers cooperatively solve their online problems better without sharing their private data with a third party or other providers.

Proximal Online Gradient is Optimum for Dynamic Regret

no code implementations 8 Oct 2018 Yawei Zhao, Shuang Qiu, Ji Liu

While the online gradient method has been shown to be optimal for the static regret metric, the optimal algorithm for the dynamic regret remains unknown.

P^2IR: Universal Deep Node Representation via Partial Permutation Invariant Set Functions

no code implementations 27 Sep 2018 Shupeng Gui, Xiangliang Zhang, Shuang Qiu, Mingrui Wu, Jieping Ye, Ji Liu

Our method can 1) learn an arbitrary form of the representation function from the neighborhood, without losing any potential dependence structures, 2) automatically decide the significance of neighbors at different distances, and 3) be applicable to both homogeneous and heterogeneous graph embedding, which may contain multiple types of nodes.

Graph Embedding Representation Learning

GESF: A Universal Discriminative Mapping Mechanism for Graph Representation Learning

no code implementations 28 May 2018 Shupeng Gui, Xiangliang Zhang, Shuang Qiu, Mingrui Wu, Jieping Ye, Ji Liu

Graph embedding is a central problem in social network analysis and many other applications, aiming to learn the vector representation for each node.

Graph Embedding Graph Representation Learning

Nonconvex One-bit Single-label Multi-label Learning

no code implementations 17 Mar 2017 Shuang Qiu, Tingjin Luo, Jieping Ye, Ming Lin

We study an extreme scenario in multi-label learning where each training instance is endowed with a single one-bit label out of multiple labels.

Multi-Label Learning

The Second Order Linear Model

no code implementations 2 Mar 2017 Ming Lin, Shuang Qiu, Bin Hong, Jieping Ye

We show that the conventional gradient descent heuristic is biased by the skewness of the distribution and is therefore no longer the best practice for learning the SLM.

Open-Ended Question Answering
