no code implementations • 1 Nov 2023 • Cong Guan, Lichao Zhang, Chunpeng Fan, Yichen Li, Feng Chen, Lihe Li, Yunjia Tian, Lei Yuan, Yang Yu
Developing intelligent agents capable of seamless coordination with humans is a critical step towards achieving artificial general intelligence.
1 code implementation • 16 Oct 2023 • Ziniu Li, Tian Xu, Yushun Zhang, Yang Yu, Ruoyu Sun, Zhi-Quan Luo
This is due to the computational overhead of the value model, which does not exist in ReMax.
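As a hedged illustration of the value-model-free idea described above (not the paper's implementation), the toy sketch below runs a REINFORCE-style update over three candidate responses and uses the reward of the greedy response as the baseline; the reward values and learning rate are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setting: a softmax "policy" over 3 candidate responses, a fixed reward
# stand-in, and a REINFORCE update that uses the greedy response's reward as
# the baseline instead of training a separate value model.
logits = np.zeros(3)
reward_model = np.array([0.2, 0.5, 1.0])

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

for step in range(300):
    probs = softmax(logits)
    a = rng.choice(3, p=probs)                        # sampled response
    baseline = reward_model[int(np.argmax(probs))]    # reward of the greedy response
    advantage = reward_model[a] - baseline
    grad = advantage * ((np.arange(3) == a).astype(float) - probs)  # d/dlogits of log pi(a)
    logits += 0.5 * grad

print(np.round(softmax(logits), 3))  # probability mass moves toward the best response
```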
1 code implementation • 15 Oct 2023 • Yang Yu, Qi Liu, Kai Zhang, Yuren Zhang, Chao Song, Min Hou, Yuqing Yuan, Zhihao Ye, Zaixi Zhang, Sanshi Lei Yu
Specifically, we adopt a multiple pairwise ranking loss which trains the user model to capture the similarity orders between the implicitly augmented view, the explicitly augmented view, and views from other users.
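The following is a minimal interpretation of the described loss (not the authors' code): hinge terms enforce that the anchor is more similar to its implicitly augmented view than to its explicitly augmented view, and more similar to that than to another user's view. The cosine similarity and margin value are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def multi_pairwise_ranking_loss(anchor, implicit_view, explicit_view, other_view, margin=0.1):
    # Hinge terms enforcing sim(anchor, implicit) > sim(anchor, explicit) > sim(anchor, other).
    s_imp = cosine(anchor, implicit_view)
    s_exp = cosine(anchor, explicit_view)
    s_oth = cosine(anchor, other_view)
    return max(0.0, margin - (s_imp - s_exp)) + max(0.0, margin - (s_exp - s_oth))

u = rng.normal(size=8)
loss = multi_pairwise_ranking_loss(u,
                                   u + 0.01 * rng.normal(size=8),   # implicitly augmented view
                                   u + 0.30 * rng.normal(size=8),   # explicitly augmented view
                                   rng.normal(size=8))              # view from another user
print(round(loss, 4))
```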
no code implementations • 9 Oct 2023 • Fan-Ming Luo, Tian Xu, Xingchen Cao, Yang Yu
MOREC learns a generalizable dynamics reward function from offline data, which is subsequently employed as a transition filter in any offline MBRL method: when generating transitions, the dynamics model generates a batch of transitions and selects the one with the highest dynamics reward value.
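Below is a small sketch of the described filtering step, with an illustrative stand-in for the learned dynamics reward (the real MOREC reward is learned from offline data); the candidate next states play the role of the generated batch of transitions.

```python
import numpy as np

rng = np.random.default_rng(0)

def dynamics_reward(s, a, s_next):
    # Stand-in for the learned, generalizable dynamics reward; here it simply
    # prefers transitions that stay close to a nominal model s_next = s + a.
    return -float(np.linalg.norm(s_next - (s + a)))

def filtered_transition(s, a, candidate_next_states):
    # Score every candidate with the dynamics reward and keep the best one.
    scores = [dynamics_reward(s, a, c) for c in candidate_next_states]
    return candidate_next_states[int(np.argmax(scores))]

s, a = np.zeros(4), np.ones(4)
candidates = [s + a + k * rng.normal(size=4) for k in (0.5, 0.1, 0.01)]  # a "batch" of generated transitions
print(filtered_transition(s, a, candidates))
```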
no code implementations • 9 Oct 2023 • Xiong-Hui Chen, Junyin Ye, Hang Zhao, Yi-Chen Li, Haoran Shi, Yu-Yan Xu, Zhihao Ye, Si-Hang Yang, Anqi Huang, Kai Xu, Zongzhang Zhang, Yang Yu
In this work, we focus on imitator learning based on only one expert demonstration.
no code implementations • 21 Sep 2023 • Zhourui Guo, Meng Yao, Yang Yu, Qiyue Yin
We assume that the interaction can be modeled as a sequence of templated questions and answers, and that there is a large corpus of previous interactions available.
no code implementations • 12 Sep 2023 • Chenxiao Gao, Chenyang Wu, Mingjun Cao, Rui Kong, Zongzhang Zhang, Yang Yu
Third, we train an Advantage-Conditioned Transformer (ACT) to generate actions conditioned on the estimated advantages.
1 code implementation • 6 Sep 2023 • Yu Chen, Tingxin Li, Huiming Liu, Yang Yu
Numerous companies have started offering services based on large language models (LLM), such as ChatGPT, which inevitably raises privacy concerns as users' prompts are exposed to the model provider.
no code implementations • 26 Aug 2023 • Jiajin Luo, Baojian Zhou, Yang Yu, Ping Zhang, Xiaohui Peng, Jianglei Ma, Peiying Zhu, Jianmin Lu, Wen Tong
In order to address the lack of applicable channel models for ISAC research and evaluation, we release Sensiverse, a dataset that can be used for ISAC research.
no code implementations • 17 Aug 2023 • Yang Yu, Han Chen
Structural Health Monitoring (SHM) plays an indispensable role in ensuring the longevity and safety of infrastructure.
no code implementations • 4 Aug 2023 • Han Chen, Yang Yu, Pengtao Li
Mechanical vibration signal denoising is a pivotal task in various industrial applications, including system health monitoring and failure prediction.
1 code implementation • 3 Aug 2023 • Guanzhou Ke, Yang Yu, Guoqing Chao, Xiaoli Wang, Chenyang Xu, Shengfeng He
In this paper, we propose a novel multi-view representation disentangling method that aims to go beyond inductive biases, ensuring both interpretability and generalizability of the resulting representations.
1 code implementation • 26 Jul 2023 • Tianyu Liu, Hao Zhao, Yang Yu, Guyue Zhou, Ming Liu
However, previous studies learned within a sequence of autonomous driving datasets, resulting in unsatisfactory blurring when rotating the car in the simulator.
1 code implementation • PMLR 2023 • Yihao Sun, Jiaji Zhang, Chengxing Jia, Haoxin Lin, Junyin Ye, Yang Yu
MOBILE conducts uncertainty quantification through the inconsistency of Bellman estimations under an ensemble of learned dynamics models, which can better approximate the true Bellman error, and penalizes the Bellman estimation based on this uncertainty.
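A compact sketch of the penalization idea as described, assuming the uncertainty is the standard deviation of Bellman targets across imagined next states; the Q function, discount, and penalty weight here are placeholders rather than MOBILE's actual components.

```python
import numpy as np

def penalized_bellman_target(reward, next_states, q_fn, gamma=0.99, beta=1.0):
    # Bellman targets under next states imagined by different dynamics models;
    # their standard deviation serves as the uncertainty penalty.
    targets = np.array([reward + gamma * q_fn(s) for s in next_states])
    return float(targets.mean() - beta * targets.std())

q_fn = lambda s: float(np.sum(s))                      # stand-in Q estimate
imagined = [np.array([1.0, 0.0]), np.array([1.1, 0.1]), np.array([0.4, 0.2])]
print(penalized_bellman_target(reward=0.5, next_states=imagined, q_fn=q_fn))
```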
no code implementations • 28 Jun 2023 • Ziqiao Meng, Peilin Zhao, Yang Yu, Irwin King
Reaction and retrosynthesis prediction are fundamental tasks in computational chemistry that have recently garnered attention from both the machine learning and drug discovery communities.
no code implementations • 15 Jun 2023 • Bo Wang, Yifan Zhang, Jian Li, Yang Yu, Zhenping Sun, Li Liu, Dewen Hu
The occlusion problem remains a key challenge in Optical Flow Estimation (OFE), despite the significant recent progress brought by deep learning in the field.
no code implementations • 12 Jun 2023 • Yu Chen, Yang Yu, Rongrong Ni, Yao Zhao, Haoliang Li
Next, we design a phoneme-viseme awareness module for cross-modal feature fusion and representation alignment, so that the modality gap can be reduced and the intrinsic complementarity of the two modalities can be better explored.
2 code implementations • 11 Jun 2023 • Yuhang Ran, Yi-Chen Li, Fuxiang Zhang, Zongzhang Zhang, Yang Yu
A common category of existing offline RL works is policy regularization, which typically constrains the learned policy to the distribution or support of the behavior policy.
1 code implementation • 11 Jun 2023 • Tian Xu, Ziniu Li, Yang Yu, Zhi-Quan Luo
Adversarial imitation learning (AIL), a subset of IL methods, is particularly promising, but its theoretical foundation in the presence of unknown transitions has yet to be fully developed.
no code implementations • 5 Jun 2023 • Ziqiao Meng, Peilin Zhao, Yang Yu, Irwin King
However, the current non-autoregressive decoder does not satisfy two essential rules of electron redistribution modeling simultaneously: the electron-counting rule and the symmetry rule.
no code implementations • 23 May 2023 • Jing-Cheng Pang, Pengyuan Wang, Kaiyuan Li, Xiong-Hui Chen, Jiacheng Xu, Zongzhang Zhang, Yang Yu
We demonstrate that SIRLC can be applied to various NLP tasks, such as reasoning problems, text generation, and machine translation.
1 code implementation • 10 May 2023 • Lei Yuan, Zi-Qian Zhang, Ke Xue, Hao Yin, Feng Chen, Cong Guan, Li-He Li, Chao Qian, Yang Yu
Concretely, to avoid the ego-system overfitting to a specific attacker, we maintain a set of attackers that is optimized to guarantee both high attack quality and behavioral diversity.
no code implementations • 9 May 2023 • Lei Yuan, Feng Chen, Zongzhang Zhang, Yang Yu
Specifically, we introduce a novel message-attacking approach that models the learning of the auxiliary attacker as a cooperative problem under a shared goal of minimizing the coordination ability of the ego system, so that every information channel may suffer from distinct message attacks.
no code implementations • 7 May 2023 • Lei Yuan, Lihe Li, Ziqian Zhang, Fuxiang Zhang, Cong Guan, Yang Yu
Towards tackling the mentioned issue, this paper proposes an approach Multi-Agent Continual Coordination via Progressive Task Contextualization, dubbed MACPro.
no code implementations • 7 May 2023 • Lei Yuan, Tao Jiang, Lihe Li, Feng Chen, Zongzhang Zhang, Yang Yu
Many multi-agent scenarios require message sharing among agents to promote coordination, which calls for robust multi-agent communication when policies are deployed in environments with message perturbations.
1 code implementation • 3 May 2023 • Xiong-Hui Chen, Bowei He, Yang Yu, Qingyang Li, Zhiwei Qin, Wenjie Shang, Jieping Ye, Chen Ma
However, building a user simulator with no reality gap, i.e., one that can predict users' feedback exactly, is unrealistic because users' reaction patterns are complex and the historical logs for each user are limited, which might mislead the simulator-based recommendation policy.
no code implementations • 21 Mar 2023 • Yang Yu, Danruo Deng, Furui Liu, Yueming Jin, Qi Dou, Guangyong Chen, Pheng-Ann Heng
Open-set semi-supervised learning (Open-set SSL) considers a more practical scenario, where unlabeled data and test data contain new categories (outliers) not observed in labeled data (inliers).
no code implementations • 9 Mar 2023 • Zhengmao Zhu, YuRen Liu, Honglong Tian, Yang Yu, Kun Zhang
Playing an important role in Model-Based Reinforcement Learning (MBRL), environment models aim to predict future states based on the past.
Tasks: Model-based Reinforcement Learning, Reinforcement Learning (+1 more)
1 code implementation • 3 Mar 2023 • Xu-Hui Liu, Feng Xu, Xinyu Zhang, Tianyuan Liu, Shengyi Jiang, Ruifeng Chen, Zongzhang Zhang, Yang Yu
In this paper, we propose a novel active imitation learning framework based on a teacher-student interaction model, in which the teacher's goal is to identify the best teaching behavior and actively affect the student's learning process.
1 code implementation • 3 Mar 2023 • Danruo Deng, Guangyong Chen, Yang Yu, Furui Liu, Pheng-Ann Heng
To address this problem, we propose a novel method, Fisher Information-based Evidential Deep Learning ($\mathcal{I}$-EDL).
no code implementations • 19 Feb 2023 • Cong Guan, Feng Chen, Lei Yuan, Zongzhang Zhang, Yang Yu
We also release the built offline benchmarks in this paper as a testbed for communication ability validation to facilitate further future research.
no code implementations • 18 Feb 2023 • Jing-Cheng Pang, Xin-Yu Yang, Si-Hang Yang, Yang Yu
To ease the learning burden of the policy, we investigate an inside-out scheme for natural language-conditioned RL by developing a task language (TL) that is task-related and unique.
1 code implementation • 27 Jan 2023 • Ziniu Li, Tian Xu, Yang Yu, Zhi-Quan Luo
This paper considers a situation where, besides the small amount of expert data, a supplementary dataset is available, which can be collected cheaply from sub-optimal policies.
1 code implementation • 5 Jan 2023 • Shaowei Zhang, Jiahan Cao, Lei Yuan, Yang Yu, De-Chuan Zhan
In cooperative multi-agent reinforcement learning (CMARL), it is critical for agents to achieve a balance between self-exploration and team collaboration.
1 code implementation • 28 Dec 2022 • Guanzhou Ke, Guoqing Chao, Xiaoli Wang, Chenyang Xu, Yongqi Zhu, Yang Yu
To this end, we utilize a deep fusion network to fuse view-specific representations into the view-common representation, extracting high-level semantics for obtaining robust representation.
1 code implementation • 11 Dec 2022 • Yang Yu, Qi Liu, Likang Wu, Runlong Yu, Sanshi Lei Yu, Zaixi Zhang
Experiments on two public datasets show that ClusterAttack can effectively degrade the performance of FedRec systems while circumventing many defense methods, and UNION can improve the resistance of the system against various untargeted attacks, including our ClusterAttack.
no code implementations • 8 Dec 2022 • Xingxing Zhang, Yiran Liu, Xun Wang, Pengcheng He, Yang Yu, Si-Qing Chen, Wayne Xiong, Furu Wei
The input and output of most text generation tasks can be transformed into two sequences of tokens, which can be modeled with sequence-to-sequence learning tools such as Transformers.
Ranked #2 on Text Summarization on SAMSum Corpus
1 code implementation • 5 Dec 2022 • Hang Zhao, Zherong Pan, Yang Yu, Kai Xu
We study the problem of learning online packing skills for irregular 3D shapes, which is arguably the most challenging setting of bin packing problems.
no code implementations • 29 Nov 2022 • Runjia Li, Yang Yu, Charlie Haywood
In this paper, we address the problem of blind deblurring with high efficiency.
no code implementations • 14 Nov 2022 • Yiran Liu, Xiao Liu, Haotian Chen, Yang Yu
We use our theoretical framework to explain why the current debiasing methods cause performance degradation.
no code implementations • 6 Nov 2022 • Haotian Chen, Lingwei Zhang, Yiran Liu, Fanchao Chen, Yang Yu
To validate our theoretical analysis, we further propose another method using our proposed Causality-Aware Self-Attention Mechanism (CASAM) to guide the model to learn the underlying causality knowledge in legal texts.
2 code implementations • ACM Multimedia 2022 • Meiyu Liang, Junping Du, Xiaowen Cao, Yang Yu, Kangkang Lu, Zhe Xue, Min Zhang
Secondly, to further improve the learning of implicit cross-media semantic associations, a semantic label association graph is constructed and a graph convolutional network is used to mine implicit semantic structures, thereby guiding the learning of discriminative features for different modalities.
no code implementations • 19 Oct 2022 • Yingchun Guo, Huan He, Ye Zhu, Yang Yu
Domain generalization person re-identification (DG Re-ID) aims to directly deploy a model trained on the source domain to the unseen target domain with good generalization, which is a challenging problem and has practical value in a real-world deployment.
1 code implementation • 13 Oct 2022 • Ke Xue, Jiacheng Xu, Lei Yuan, Miqing Li, Chao Qian, Zongzhang Zhang, Yang Yu
MA-DAC formulates the dynamic configuration of a complex algorithm with multiple types of hyperparameters as a contextual multi-agent Markov decision process and solves it by a cooperative multi-agent RL (MARL) algorithm.
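The sketch below only illustrates this framing, assuming each hyperparameter is one agent acting on a shared context; the random stand-in policy is not the cooperative MARL algorithm used by MA-DAC, and the hyperparameter names are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Each hyperparameter is controlled by its own agent that picks a value every
# iteration, conditioned on a shared context of the optimization state.
HYPERPARAM_CHOICES = {
    "mutation_rate": [0.01, 0.1, 0.3],
    "crossover_rate": [0.5, 0.7, 0.9],
}

def agent_policy(context, choices):
    # Stand-in for a learned cooperative MARL policy.
    return choices[rng.integers(len(choices))]

def run_episode(steps=5):
    context = {"progress": 0.0}
    for t in range(steps):
        config = {name: agent_policy(context, c) for name, c in HYPERPARAM_CHOICES.items()}
        # ... one iteration of the configured target algorithm would run here ...
        context["progress"] = (t + 1) / steps
        print(t, config)

run_episode()
```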
no code implementations • 11 Oct 2022 • Zhengbang Zhu, Rongjun Qin, JunJie Huang, Xinyi Dai, Yang Yu, Yong Yu, Weinan Zhang
In this paper, we present a general framework for benchmarking the degree of manipulations of recommendation algorithms, in both slate recommendation and sequential recommendation scenarios.
no code implementations • 27 Sep 2022 • Jiahan Liu, Chaochao Yan, Yang Yu, Chan Lu, Junzhou Huang, Le Ou-Yang, Peilin Zhao
In this paper, we propose a novel end-to-end graph generation model for retrosynthesis prediction, which sequentially identifies the reaction center, generates the synthons, and adds motifs to the synthons to generate reactants.
2 code implementations • 23 Sep 2022 • Ruo-Ze Liu, Zhen-Jia Pang, Zhou-Yu Meng, Wenhai Wang, Yang Yu, Tong Lu
In this work, we investigate a set of RL techniques for the full-length game of StarCraft II.
no code implementations • 21 Sep 2022 • Hui Su, Xiao Zhou, Houjin Yu, Xiaoyu Shen, YuWen Chen, Zilin Zhu, Yang Yu, Jie Zhou
Large Language Models pre-trained with self-supervised learning have demonstrated impressive zero-shot generalization capabilities on a wide spectrum of tasks.
1 code implementation • 16 Sep 2022 • Lanqing Li, Liang Zeng, Ziqi Gao, Shen Yuan, Yatao Bian, Bingzhe Wu, Hengtong Zhang, Yang Yu, Chan Lu, Zhipeng Zhou, Hongteng Xu, Jia Li, Peilin Zhao, Pheng-Ann Heng
The last decade has witnessed a prosperous development of computational methods and dataset curation for AI-aided drug discovery (AIDD).
no code implementations • 12 Sep 2022 • Haoxin Lin, Yihao Sun, Jiaji Zhang, Yang Yu
The new model-based reinforcement learning algorithm MPPVE (Model-based Planning Policy Learning with Multi-step Plan Value Estimation) shows a better utilization of the learned model and achieves a better sample efficiency than state-of-the-art model-based RL approaches.
Tasks: Model-based Reinforcement Learning, Reinforcement Learning (+1 more)
no code implementations • 31 Aug 2022 • Chao Chen, Dawei Wang, Feng Mao, Zongzhang Zhang, Yang Yu
Semi-supervised Anomaly Detection (AD) is a data mining task that aims to learn features from partially labeled datasets to help detect outliers.
1 code implementation • 26 Aug 2022 • Guanzhou Ke, Yongqi Zhu, Yang Yu
To this end, in this paper, we propose a hybrid contrastive fusion algorithm to extract robust view-common representations from unlabeled data.
no code implementations • 23 Aug 2022 • Xinbin Liang, Yaru Liu, Yang Yu, Kaixuan Liu, Yadong Liu, Zongtan Zhou
Significance: We improve the classification performance of 3 CNNs on 2 datasets by using TRM, indicating its capability to mine EEG spatial topological information.
no code implementations • 19 Aug 2022 • Rong-Jun Qin, Fan-Ming Luo, Hong Qian, Yang Yu
This paper addresses policy learning in non-stationary environments and games with continuous actions.
no code implementations • 9 Aug 2022 • Ke Xue, Yutong Wang, Lei Yuan, Cong Guan, Chao Qian, Yang Yu
Experimental results on a collaborative cooking task show the necessity of considering the heterogeneous setting and illustrate that our proposed method is a promising solution for heterogeneous cooperative MARL.
no code implementations • 3 Aug 2022 • Tian Xu, Ziniu Li, Yang Yu, Zhi-Quan Luo
Imitation learning learns a policy from expert trajectories.
no code implementations • 20 Jul 2022 • Yang Yu, Zixu Zhao, Yueming Jin, Guangyong Chen, Qi Dou, Pheng-Ann Heng
Concretely, for trusty representation learning, we propose to incorporate pseudo labels to instruct the pair selection, obtaining more reliable representation pairs for pixel contrast.
no code implementations • 19 Jun 2022 • Fan-Ming Luo, Tian Xu, Hang Lai, Xiong-Hui Chen, Weinan Zhang, Yang Yu
In this survey, we take a review of MBRL with a focus on the recent progress in deep RL.
no code implementations • 4 Jun 2022 • Xue-Kun Jin, Xu-Hui Liu, Shengyi Jiang, Yang Yu
Value function estimation is an indispensable subroutine in reinforcement learning, which becomes more challenging in the offline setting.
no code implementations • 3 Jun 2022 • Zheng-Mao Zhu, Xiong-Hui Chen, Hong-Long Tian, Kun Zhang, Yang Yu
Model-based methods have recently shown promise for offline reinforcement learning (RL), aiming to learn good policies from historical data without interacting with the environment.
no code implementations • 1 Jun 2022 • Fan-Ming Luo, Xingchen Cao, Yang Yu
Empirical results compared with the state-of-the-art AIL methods show that DARL can learn a reward that is more consistent with the true reward, thus obtaining higher environment returns.
no code implementations • 1 Jun 2022 • Chengxing Jia, Hao Yin, Chenxiao Gao, Tian Xu, Lei Yuan, Zongzhang Zhang, Yang Yu
Model-based offline optimization with dynamics-aware policy provides a new perspective for policy learning and out-of-distribution generalization, where the learned policy could adapt to different dynamics enumerated at the training stage.
1 code implementation • 29 Mar 2022 • Yueming Jin, Yang Yu, Cheng Chen, Zixu Zhao, Pheng-Ann Heng, Danail Stoyanov
Automatic surgical scene segmentation is fundamental for facilitating cognitive intelligence in the modern operating theatre.
no code implementations • 28 Mar 2022 • Yangyang Hu, Yang Yu
On a mathematical reasoning dataset, we adopt the recently proposed abductive learning framework, and propose the ABL-Sym algorithm that combines the Transformer neural models with a symbolic mathematics library.
no code implementations • 22 Mar 2022 • Ziniu Li, Tian Xu, Yang Yu
In particular, we demonstrate that the sample complexity of the target Q-learning algorithm in [Lee and He, 2020] is $\widetilde{\mathcal O}(|\mathcal S|^2|\mathcal A|^2 (1-\gamma)^{-5}\varepsilon^{-2})$.
no code implementations • 9 Mar 2022 • Rongjun Qin, Feng Chen, Tonghan Wang, Lei Yuan, Xiaoran Wu, Zongzhang Zhang, Chongjie Zhang, Yang Yu
We demonstrate that the task representation can capture the relationship among tasks, and can generalize to unseen tasks.
1 code implementation • 24 Feb 2022 • Quan Wang, Yang Yu, Jason Pelecanos, Yiling Huang, Ignacio Lopez Moreno
In this paper, we introduce a novel language identification system based on conformer layers.
no code implementations • 5 Feb 2022 • Ziniu Li, Tian Xu, Yang Yu, Zhi-Quan Luo
First, we show that ValueDice could reduce to BC under the offline setting.
no code implementations • 28 Dec 2021 • Qixin Zhang, Wenbing Ye, Zaiyi Chen, Haoyuan Hu, Enhong Chen, Yang Yu
As a result, only limited violations of constraints or pessimistic competitive bounds could be guaranteed.
1 code implementation • 20 Dec 2021 • Chaochao Yan, Peilin Zhao, Chan Lu, Yang Yu, Junzhou Huang
To overcome this limitation, we propose an innovative retrosynthesis prediction framework that can compose novel templates beyond training templates.
Ranked #8 on Single-step retrosynthesis on USPTO-50k
no code implementations • 8 Dec 2021 • Zhenxin Wu, Qingliang Chen, Yifeng Liu, Yinqi Zhang, Chengkai Zhu, Yang Yu
Finally, using progressive training (P), the features extracted by the model at different stages can be fully utilized and fused with each other.
1 code implementation • 2 Dec 2021 • Yang Yu, Fangzhao Wu, Chuhan Wu, Jingwei Yi, Qi Liu
We further propose a two-stage knowledge distillation method to improve the efficiency of the large PLM-based news recommendation model while maintaining its performance.
1 code implementation • NeurIPS 2021 • Chenyang Wu, Guoyu Yang, Zongzhang Zhang, Yang Yu, Dong Li, Wulong Liu, Jianye Hao
A belief is a distribution of states representing state uncertainty.
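For readers unfamiliar with beliefs, here is a standard discrete Bayes-filter belief update (general POMDP background, not this paper's method); the toy transition and observation matrices are made up.

```python
import numpy as np

def belief_update(belief, action, observation, T, O):
    # One Bayes-filter step: predict through the transition model, then
    # reweight by the observation likelihood and renormalize.
    predicted = T[action].T @ belief          # sum_s T[a][s, s'] * b(s)
    updated = O[observation] * predicted      # elementwise likelihood weighting
    return updated / updated.sum()

# Toy 2-state POMDP: T[a][s, s'] transition probs, O[o][s'] observation probs.
T = np.array([[[0.9, 0.1], [0.2, 0.8]]])     # a single action
O = np.array([[0.7, 0.3], [0.3, 0.7]])       # two possible observations
b = np.array([0.5, 0.5])
print(belief_update(b, action=0, observation=1, T=T, O=O))
```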
1 code implementation • NeurIPS 2021 • Xiong-Hui Chen, Shengyi Jiang, Feng Xu, Zongzhang Zhang, Yang Yu
Experiments on MuJoCo and Hand Manipulation Suite tasks show that agents deployed with our method achieve performance similar to what they had in the source domain, while those deployed with previous methods designed for same-modal domain adaptation suffer a larger performance gap.
1 code implementation • NeurIPS 2021 • Xiong-Hui Chen, Yang Yu, Qingyang Li, Fan-Ming Luo, Zhiwei Qin, Wenjie Shang, Jieping Ye
Current offline reinforcement learning methods commonly learn in the policy space constrained to in-support regions by the offline dataset, in order to ensure the robustness of the outcome policies.
no code implementations • 24 Nov 2021 • Yang Li, Kang Li, Zhen Yang, Yang Yu, Runnan Xu, Miaosen Yang
To solve this model, this research combines the Jaya algorithm and the interior point method (IPM) into a hybrid analysis-heuristic solution method called Jaya-IPM, where the lower and upper levels are addressed by the IPM and Jaya respectively, and the scheduling scheme is obtained via iterations between the two levels.
no code implementations • 20 Nov 2021 • Yang Hu, Zhui Zhu, Sirui Song, Xue Liu, Yang Yu
Experimental results in an exemplary environment show that our MARL approach is able to demonstrate the effectiveness and necessity of restrictions on individual liberty for collaborative supply of public goods.
1 code implementation • ICLR 2022 • Hang Zhao, Yang Yu, Kai Xu
PCT is a full-fledged description of the state and action space of bin packing which can support packing policy learning based on deep reinforcement learning (DRL).
no code implementations • 26 Sep 2021 • Jiahan Cao, Lei Yuan, Jianhao Wang, Shaowei Zhang, Chongjie Zhang, Yang Yu, De-Chuan Zhan
During long-time observations, agents can build awareness of teammates to alleviate the problem of partial observability.
no code implementations • 3 Sep 2021 • Chuhan Wu, Fangzhao Wu, Yang Yu, Tao Qi, Yongfeng Huang, Xing Xie
Two self-supervision tasks are incorporated in UserBERT for user model pre-training on unlabeled user behavior data to empower user modeling.
no code implementations • 16 Aug 2021 • Zhao-Hua Li, Yang Yu, Yingfeng Chen, Ke Chen, Zhipeng Hu, Changjie Fan
The empirical results show that the proposed method preserves a higher cumulative reward than behavior cloning and learns a policy more consistent with the original one.
1 code implementation • 12 Aug 2021 • Jiarui Fang, Zilin Zhu, Shenggui Li, Hui Su, Yang Yu, Jie Zhou, Yang You
PatrickStar uses the CPU-GPU heterogeneous memory space to store the model data.
no code implementations • 16 Jul 2021 • Yongqing Gao, Guangda Huzhang, Weijie Shen, Yawen Liu, Wen-Ji Zhou, Qing Da, Yang Yu
Recent E-commerce applications benefit from the growth of deep learning techniques.
no code implementations • 19 Jun 2021 • Tian Xu, Ziniu Li, Yang Yu, Zhi-Quan Luo
For some MDPs, we show that vanilla AIL has a worse sample complexity than BC.
no code implementations • ACL 2021 • Tao Qi, Fangzhao Wu, Chuhan Wu, Peiru Yang, Yang Yu, Xing Xie, Yongfeng Huang
Instead of a single user embedding, in our method each user is represented in a hierarchical interest tree to better capture their diverse and multi-grained interest in news.
1 code implementation • ICLR 2022 • Tonghan Wang, Liang Zeng, Weijun Dong, Qianlan Yang, Yang Yu, Chongjie Zhang
Learning sparse coordination graphs adaptive to the coordination dynamics among agents is a long-standing problem in cooperative multi-agent learning.
1 code implementation • ICLR 2022 • Siyuan Li, Jin Zhang, Jianhao Wang, Yang Yu, Chongjie Zhang
Although GCHRL possesses superior exploration ability by decomposing tasks via subgoals, existing GCHRL methods struggle in temporally extended tasks with sparse external rewards, since the high-level policy learning relies on external rewards.
no code implementations • 18 May 2021 • Jing-Cheng Pang, Tian Xu, Shengyi Jiang, Yu-Ren Liu, Yang Yu
Reinforcement learning (RL) has made remarkable progress in many decision-making tasks, such as Go, game playing, and robotics control.
1 code implementation • NeurIPS 2021 • Xu-Hui Liu, Zhenghai Xue, Jing-Cheng Pang, Shengyi Jiang, Feng Xu, Yang Yu
In reinforcement learning, experience replay stores past samples for further reuse.
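As background for the sentence above, a minimal uniform replay buffer looks like the sketch below (a generic baseline, not the selection scheme proposed in the paper).

```python
import random
from collections import deque

class ReplayBuffer:
    """Minimal FIFO experience replay: store transitions, sample uniformly."""
    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        return random.sample(list(self.buffer), batch_size)

buf = ReplayBuffer()
for t in range(100):
    buf.push(t, t % 4, 1.0, t + 1, False)
print(len(buf.buffer), buf.sample(4))
```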
1 code implementation • 14 Apr 2021 • Ruo-Ze Liu, Wenhai Wang, Yanjie Shen, Zhiqi Li, Yang Yu, Tong Lu
StarCraft II (SC2) is a real-time strategy game in which players produce and control multiple units to fight against opponent's units.
1 code implementation • 19 Feb 2021 • Yang Yu, Shih-Kang Chao, Guang Cheng
We propose a distributed bootstrap method for simultaneous inference on high-dimensional massive data that are stored and processed with many machines.
no code implementations • 10 Feb 2021 • Hong Qian, Yang Yu
In this article, we summarize methods of derivative-free reinforcement learning to date, and organize the methods in aspects including parameter updating, model selection, exploration, and parallel/distributed methods.
no code implementations • 10 Feb 2021 • Peiyi Zhang, Xiaodong Jiang, Ginger M Holt, Nikolay Pavlovich Laptev, Caner Komurlu, Peng Gao, Yang Yu
Hyper-parameters of time series models play an important role in time series analysis.
no code implementations • Findings (EMNLP) 2021 • Chuhan Wu, Fangzhao Wu, Yang Yu, Tao Qi, Yongfeng Huang, Qi Liu
However, existing language models are pre-trained and distilled on general corpus like Wikipedia, which has some gaps with the news domain and may be suboptimal for news intelligence.
no code implementations • 1 Feb 2021 • Yang Yu, Hai-Feng Wang, Wen-Yuan Cui, Lin-Lin Li, Chao Liu, Bo Zhang, Hao Tian, Zhen-Yan Huo, Jie Ju, Zhi-Cun Liu, Fang Wen, Shuai Feng
We present an analysis of the spatial density structure of the outer disk from 8–14 kpc using 13,534 LAMOST DR5 OB-type stars, and observe similar flaring on the north and south sides of the disk, implying that the flaring structure is symmetric about the Galactic plane, with a scale height ranging from 0.14 to 0.5 kpc at different Galactocentric distances.
Astrophysics of Galaxies
3 code implementations • 1 Feb 2021 • Rongjun Qin, Songyi Gao, Xingyuan Zhang, Zhen Xu, Shengkai Huang, Zewen Li, Weinan Zhang, Yang Yu
We evaluate existing offline RL algorithms on NeoRL and argue that the performance of a policy should also be compared with the deterministic version of the behavior policy, instead of the dataset reward.
no code implementations • 27 Jan 2021 • Yang Yu, Shangce Gao, Yirui Wang, Jiujun Cheng, Yuki Todo
The proposed method, adaptive step length based on memory selection BSO (ASBSO), applies multiple step lengths to modify the generation of new solutions, thus providing a flexible search that adapts to the problem at hand and the convergence stage.
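A hedged sketch of the multiple-step-length idea on a toy sphere function: a success memory biases which step length generates the next solution. The selection rule here is a simplification, not the exact ASBSO memory mechanism.

```python
import numpy as np

rng = np.random.default_rng(0)

def sphere(x):
    return float(np.sum(x ** 2))

steps = np.array([1.0, 0.1, 0.01])       # candidate step lengths
success = np.ones(len(steps))            # memory of how often each step length helped

x = rng.normal(size=5)
fx = sphere(x)
for it in range(300):
    probs = success / success.sum()
    k = rng.choice(len(steps), p=probs)  # pick a step length biased by the memory
    cand = x + steps[k] * rng.normal(size=x.shape)
    if sphere(cand) < fx:
        x, fx = cand, sphere(cand)
        success[k] += 1                  # reinforce the step length that worked
print(round(fx, 4))
```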
1 code implementation • 1 Jan 2021 • Xiong-Hui Chen, Shengyi Jiang, Feng Xu, Yang Yu
Domain adaptation is a promising direction for deploying RL agents in real-world applications, where vision-based robotics tasks constitute an important part.
no code implementations • 1 Jan 2021 • Xiong-Hui Chen, Yang Yu, Qingyang Li, Zhiwei Tony Qin, Wenjie Shang, Yiping Meng, Jieping Ye
Instead of increasing the fidelity of models for policy learning, we handle the distortion issue via learning to adapt to diverse simulators generated by the offline dataset.
no code implementations • 9 Dec 2020 • Yang Yu, Zhenhao Gu, Rong Tao, Jingtian Ge, Kenglun Chang
With the continuous development of machine learning technology, major e-commerce platforms have launched recommendation systems based on it to serve a large number of customers with different needs more efficiently.
no code implementations • 3 Dec 2020 • Wei Zhang, Murray Campbell, Yang Yu, Sadhana Kumaravel
Human judgments of word similarity have been a popular method of evaluating the quality of word embedding.
no code implementations • NeurIPS 2020 • Shengyi Jiang, JingCheng Pang, Yang Yu
In this work, we investigate policy learning in the condition of a few expert demonstrations and a simulator with misspecified dynamics.
no code implementations • 24 Nov 2020 • Jing Yang, Chun Ouyang, Wil M. P. van der Aalst, Arthur H. M. ter Hofstede, Yang Yu
We demonstrate the feasibility of this framework by proposing an approach underpinned by the framework for organizational model discovery, and also conduct experiments on real-life event logs to discover and evaluate organizational models.
no code implementations • 22 Nov 2020 • Shenglan Liu, Yang Yu
As a widely used method in machine learning, principal component analysis (PCA) shows excellent properties for dimensionality reduction.
1 code implementation • NeurIPS 2020 • Chaochao Yan, Qianggang Ding, Peilin Zhao, Shuangjia Zheng, Jinyu Yang, Yang Yu, Junzhou Huang
Retrosynthesis is the process of recursively decomposing target molecules into available building blocks.
no code implementations • 27 Oct 2020 • Yang Yu, Rongrong Ni, Yao Zhao
Recently, AI-manipulated face techniques have developed rapidly and constantly, which has raised new security issues in society.
no code implementations • Knowledge Based Systems 2020 • Chao Wu, Qingyu Xiong, Hualing Yi, Yang Yu, Qiwu Zhu, Min Gao, Jie Chen
In this paper, we propose a novel end-to-end multiple-element joint detection model (MEJD), which effectively extracts all (target, aspect, sentiment) triples from a sentence.
no code implementations • NeurIPS 2020 • Tian Xu, Ziniu Li, Yang Yu
In this paper, we first analyze the value gap between the expert policy and imitated policies under two imitation methods, behavioral cloning and generative adversarial imitation.
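For reference, behavioral cloning in its simplest form is just supervised regression from expert states to expert actions, as in this toy least-squares sketch (illustrative only; the paper's analysis is theoretical).

```python
import numpy as np

rng = np.random.default_rng(0)

# Behavioral cloning sketch: fit a linear policy to expert (state, action) pairs
# by least squares; the imitated policy simply predicts the expert action.
expert_states = rng.normal(size=(200, 4))
true_weights = rng.normal(size=4)
expert_actions = expert_states @ true_weights          # deterministic expert

w, *_ = np.linalg.lstsq(expert_states, expert_actions, rcond=None)
test_state = rng.normal(size=4)
print(test_state @ w, test_state @ true_weights)       # imitated vs expert action
```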
no code implementations • 16 Oct 2020 • Xiao Liu, Jiajie Zhang, Siting Li, Zuotong Wu, Yang Yu
We discover that pixel normalization causes object entanglement by in-painting the area occupied by ablated objects.
no code implementations • 9 Oct 2020 • Jiarui Fang, Yang Yu, Chengduo Zhao, Jie Zhou
This paper designed a transformer serving system called TurboTransformers, which consists of a computing runtime and a serving framework to solve the above challenges.
1 code implementation • 4 Aug 2020 • Sirui Song, Zefang Zong, Yong Li, Xue Liu, Yang Yu
Saving lives or economy is a dilemma for epidemic control in most cities while smart-tracing technology raises people's privacy concerns.
4 code implementations • ICLR 2021 • Jianhao Wang, Zhizhou Ren, Terry Liu, Yang Yu, Chongjie Zhang
This paper presents a novel MARL approach, called duPLEX dueling multi-agent Q-learning (QPLEX), which takes a duplex dueling network architecture to factorize the joint value function.
no code implementations • 29 Jun 2020 • Shenglan Liu, Yang Yu
Manifold Learning occupies a vital role in the field of nonlinear dimensionality reduction and its ideas also serve for other relevant methods.
no code implementations • LREC 2020 • Linrui Zhang, Hsin-Lun Huang, Yang Yu, Dan Moldovan
As opposed to the traditional machine learning models which require considerable effort in designing task specific features, our model can be well adapted to the proposed tasks with a very limited amount of fine-tuning, which significantly reduces the manual effort in feature engineering.
no code implementations • 16 Apr 2020 • Tianyu Liu, Qinghai Liao, Lu Gan, Fulong Ma, Jie Cheng, Xupeng Xie, Zhe Wang, Yingbing Chen, Yilong Zhu, Shuyang Zhang, Zhengyong Chen, Yang Liu, Meng Xie, Yang Yu, Zitong Guo, Guang Li, Peidong Yuan, Dong Han, Yuying Chen, Haoyang Ye, Jianhao Jiao, Peng Yun, Zhenhua Xu, Hengli Wang, Huaiyang Huang, Sukai Wang, Peide Cai, Yuxiang Sun, Yandong Liu, Lujia Wang, Ming Liu
Moreover, many countries have imposed tough lockdown measures (e.g., on retail and catering) to reduce virus transmission during the pandemic, which causes inconvenience in daily life.
no code implementations • 25 Mar 2020 • Guangda Huzhang, Zhen-Jia Pang, Yongqing Gao, Yawen Liu, Weijie Shen, Wen-Ji Zhou, Qing Da, An-Xiang Zeng, Han Yu, Yang Yu, Zhi-Hua Zhou
The framework consists of an evaluator that generalizes to evaluate recommendations involving the context, a generator that maximizes the evaluator score by reinforcement learning, and a discriminator that ensures the generalization of the evaluator.
1 code implementation • 1 Mar 2020 • Chao Wang, Ruo-Ze Liu, Han-Jia Ye, Yang Yu
We disclose that a classically fully trained feature extractor can leave little embedding space for unseen classes, which keeps the model from well-fitting the new classes.
no code implementations • ICML 2020 • Yang Yu, Shih-Kang Chao, Guang Cheng
In this paper, we propose a bootstrap method applied to massive data processed distributedly in a large number of machines.
no code implementations • 19 Feb 2020 • Chi-Hua Wang, Yang Yu, Botao Hao, Guang Cheng
In this paper, we propose a novel perturbation-based exploration method in bandit algorithms with bounded or unbounded rewards, called residual bootstrap exploration (\texttt{ReBoot}).
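The sketch below conveys only the resampling intuition: an arm's index is its empirical mean plus the mean of bootstrap-resampled residuals. The actual ReBoot method additionally injects pseudo-residuals to guarantee enough exploration, which this simplified version omits; all names here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def bootstrap_index(rewards):
    # Empirical mean perturbed by the mean of resampled residuals: arms with
    # few or noisy observations receive larger random perturbations.
    mean = float(np.mean(rewards))
    residuals = np.array(rewards) - mean
    boot = rng.choice(residuals, size=len(residuals), replace=True)
    return mean + float(boot.mean())

# Toy 2-armed bandit run driven by this index.
true_means = [0.3, 0.6]
history = [[rng.normal(m)] for m in true_means]
for t in range(500):
    a = int(np.argmax([bootstrap_index(h) for h in history]))
    history[a].append(rng.normal(true_means[a]))
print([len(h) for h in history])   # pull counts per arm
```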
no code implementations • 6 Feb 2020 • Wen-Ji Zhou, Yang Yu
Hierarchical reinforcement learning (HRL) helps address large-scale and sparse reward issues in reinforcement learning.
no code implementations • 12 Dec 2019 • Jingshi Cui, Haoxiang Wang, Chenye Wu, Yang Yu
To enable an efficient electricity market, a good pricing scheme is of vital importance.
no code implementations • 1 Dec 2019 • Jiaman Wu, Zhiqi Wang, Chenye Wu, Kui Wang, Yang Yu
Dynamic pricing is both an opportunity and a challenge to the demand side.
1 code implementation • NeurIPS 2019 • Wang-Zhou Dai, Qiu-Ling Xu, Yang Yu, Zhi-Hua Zhou
In the area of artificial intelligence (AI), the two abilities are usually realised by machine learning and logic programming, respectively.
no code implementations • 27 Nov 2019 • Rong-Jun Qin, Jing-Cheng Pang, Yang Yu
However, learning to beat a pool in stochastic games, i.e., a wide distribution over policy models, is either sample-consuming or insufficient to exploit all models with a limited number of samples.
no code implementations • 18 Nov 2019 • Jingshi Cui, Haoxiang Wang, Chenye Wu, Yang Yu
In this paper, from an adversarial machine learning point of view, we examine the vulnerability of data-driven electricity market design.
no code implementations • 16 Nov 2019 • Tian Xu, Ziniu Li, Yang Yu
We also show that the framework leads to a value discrepancy of GAIL on the order of $O((1-\gamma)^{-1})$.
no code implementations • 16 Nov 2019 • Jiaman Wu, Zhiqi Wang, Yang Yu, Chenye Wu
Renewable energy brings huge uncertainties to the power system, which challenges the traditional power system operation with limited flexible resources.
Systems and Control; Optimization and Control
no code implementations • 9 Nov 2019 • Kui Wang, Jian Sun, Chenye Wu, Yang Yu
Conductor galloping is the high-amplitude, low-frequency oscillation of overhead power lines due to wind.
no code implementations • 21 Oct 2019 • Shengye Wang, Li Wan, Yang Yu, Ignacio Lopez Moreno
We compare the performance of a lattice-based ensemble model and a deep neural network model to combine signals from recognizers with that of a baseline that only uses low-level acoustic signals.
no code implementations • 25 Sep 2019 • Zi-Niu Li, Xiong-Hui Chen, Yang Yu
Efficient exploration is essential to reinforcement learning in huge state space.
no code implementations • 16 Sep 2019 • Shenglan Liu, Yang Yu, Yang Liu, Hong Qiao, Lin Feng, Jiashi Feng
Manifold learning now plays a very important role in machine learning and many relevant applications.
1 code implementation • 5 Sep 2019 • Yu Chen, Yingfeng Chen, Zhipeng Hu, Tianpei Yang, Changjie Fan, Yang Yu, Jianye Hao
Transfer learning (TL) is a promising way to improve the sample efficiency of reinforcement learning.
1 code implementation • IJCNLP 2019 • Ming Tan, Yang Yu, Haoyu Wang, Dakuo Wang, Saloni Potdar, Shiyu Chang, Mo Yu
Out-of-domain (OOD) detection for low-resource text classification is a realistic but understudied task.
no code implementations • 28 Jul 2019 • Chao Bian, Chao Qian, Yang Yu, Ke Tang
Sampling is a popular strategy, which evaluates the objective multiple times and employs the mean of these evaluation results as an estimate of the objective value.
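Concretely, the sampling strategy amounts to the following (a generic illustration with a made-up noisy objective, not the paper's analysis):

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_objective(x, noise=1.0):
    # True objective plus additive Gaussian noise.
    return float(np.sum(x ** 2)) + rng.normal(scale=noise)

def sampled_estimate(x, k=10):
    # Evaluate the noisy objective k times and return the mean.
    return float(np.mean([noisy_objective(x) for _ in range(k)]))

x = np.ones(3)
print(noisy_objective(x), sampled_estimate(x, k=50))   # the mean is far less noisy
```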
no code implementations • 24 Jul 2019 • Jorge G. Madrid, Hugo Jair Escalante, Eduardo F. Morales, Wei-Wei Tu, Yang Yu, Lisheng Sun-Hosoya, Isabelle Guyon, Michele Sebag
We extend Auto-Sklearn with sound and intuitive mechanisms that allow it to cope with this sort of problem.
no code implementations • 12 Jul 2019 • Wenjie Shang, Yang Yu, Qingyang Li, Zhiwei Qin, Yiping Meng, Jieping Ye
DEMER also derives a recommendation policy with a significantly improved performance in the test phase of the real application.
no code implementations • 17 Jun 2019 • Chao Bian, Chao Qian, Ke Tang, Yang Yu
Evolutionary algorithms (EAs) have found many successful real-world applications, where the optimization problems are often subject to a wide range of uncertainties.
no code implementations • 7 Jun 2019 • Rui Fan, Jianhao Jiao, Haoyang Ye, Yang Yu, Ioannis Pitas, Ming Liu
Over the past decade, many research articles have been published in the area of autonomous driving.
no code implementations • 31 May 2019 • Mayukh Das, Devendra Singh Dhami, Yang Yu, Gautam Kunapuli, Sriraam Natarajan
Recently, deep models have had considerable success in several tasks, especially with low-level representations.
no code implementations • 31 May 2019 • Yi-Qi Hu, Yang Yu, Jun-Da Liao
We show theoretically that ER-UCB has a regret upper bound of $O\left(K \ln n\right)$ with independent feedback, which is as efficient as the classical UCB bandit.
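For context, the classical UCB index referred to above looks like this sketch (standard UCB1, not the paper's ER-UCB index; the arm means and horizon are made up):

```python
import math
import random

random.seed(0)

def ucb_bandit(means, horizon=2000):
    # UCB1: pull the arm with the highest mean estimate plus an exploration
    # bonus sqrt(2 ln n / n_i).
    k = len(means)
    counts, sums = [0] * k, [0.0] * k
    for n in range(1, horizon + 1):
        if n <= k:
            arm = n - 1                             # pull each arm once first
        else:
            arm = max(range(k), key=lambda i: sums[i] / counts[i]
                      + math.sqrt(2 * math.log(n) / counts[i]))
        sums[arm] += random.gauss(means[arm], 1.0)
        counts[arm] += 1
    return counts

print(ucb_bandit([0.1, 0.5, 0.9]))   # pull counts concentrate on the best arm
```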
no code implementations • 31 May 2019 • Wen-Ji Zhou, Yang Yu, Yingfeng Chen, Kai Guan, Tangjie Lv, Changjie Fan, Zhi-Hua Zhou
Experience reuse is key to sample-efficient reinforcement learning.
no code implementations • 27 May 2019 • Ye Tian, Li Yang, Wei Wang, Jing Zhang, Qing Tang, Mili Ji, Yang Yu, Yu Li, Hong Yang, Airong Qian
Traditionally, the most indispensable step in diagnosing cervical squamous carcinoma is histopathological assessment, which is performed under a microscope by a pathologist.
no code implementations • 13 May 2019 • Jianhao Jiao, Yang Yu, Qinghai Liao, Haoyang Ye, Ming Liu
Multiple LiDARs have progressively emerged on autonomous vehicles for rendering a wide field of view and dense measurements.
no code implementations • ICLR 2019 • Mayukh Das, Yang Yu, Devendra Singh Dhami, Gautam Kunapuli, Sriraam Natarajan
While extremely successful in several applications, especially those with low-level representations, most deep models still face open challenges from sparse, noisy samples and structured domains (with multiple objects and interactions).
no code implementations • 27 Apr 2019 • Jianhao Jiao, Qinghai Liao, Yilong Zhu, Tianyu Liu, Yang Yu, Rui Fan, Lujia Wang, Ming Liu
Multiple lidars are prevalently used on mobile vehicles for rendering a broad view to enhance the performance of localization and perception systems.
no code implementations • 15 Apr 2019 • Mayukh Das, Yang Yu, Devendra Singh Dhami, Gautam Kunapuli, Sriraam Natarajan
Recently, deep models have been successfully applied in several applications, especially with low-level representations.
1 code implementation • 2 Mar 2019 • Ruo-Ze Liu, Haifeng Guo, Xiaozhong Ji, Yang Yu, Zhen-Jia Pang, Zitai Xiao, Yuzhou Wu, Tong Lu
Injecting human knowledge is an effective way to accelerate reinforcement learning (RL).
no code implementations • 18 Feb 2019 • Yu-An Wang, Yang Yu, Ming Liu
Finally, we extend the SORT algorithm with this instance framework to realize tracking in 3D LiDAR point cloud data.
no code implementations • 2 Feb 2019 • Yijiang Lian, Zhijie Chen, Jinlong Hu, Kefeng Zhang, Chunwei Yan, Muchenxuan Tong, Wenying Han, Hanju Guan, Ying Li, Ying Cao, Yang Yu, Zhigang Li, Xiaochun Liu, Yue Wang
In this paper, we present a generative retrieval method for sponsored search engine, which uses neural machine translation (NMT) to generate keywords directly from query.
1 code implementation • 29 Nov 2018 • Li Wan, Prashant Sridhar, Yang Yu, Quan Wang, Ignacio Lopez Moreno
In many scenarios of a language identification task, the user will specify a small set of languages which he/she can speak instead of a large set of all possible languages.
1 code implementation • 26 Nov 2018 • Yang Yu, Ke Han, Washington Ochieng
These two variants, serving as base models, are further extended with two features: bounded rationality (BR) and information sharing.
Physics and Society; Optimization and Control
1 code implementation • 31 Oct 2018 • Quanming Yao, Mengshuo Wang, Yuqiang Chen, Wenyuan Dai, Yu-Feng Li, Wei-Wei Tu, Qiang Yang, Yang Yu
We hope this survey can serve as not only an insightful guideline for AutoML beginners but also an inspiration for future research.
no code implementations • 11 Oct 2018 • Chao Qian, Chao Bian, Yang Yu, Ke Tang, Xin Yao
In noisy evolutionary optimization, sampling is a common strategy to deal with noise.
no code implementations • 27 Sep 2018 • Wei-Yang Qu, Yang Yu, Tang-Jie Lv, Ying-Feng Chen, Chang-Jie Fan
This approach uses two policies: the exploration policy is used for exploratory sampling in the environment, and the benchmark policy is then updated with the data provided by the exploration policy.
no code implementations • 23 Sep 2018 • Zhen-Jia Pang, Ruo-Ze Liu, Zhou-Yu Meng, Yi Zhang, Yang Yu, Tong Lu
The reinforcement training algorithm for this architecture is also investigated.
Tasks: Hierarchical Reinforcement Learning, Reinforcement Learning (+4 more)
1 code implementation • NeurIPS 2018 • Ji Feng, Yang Yu, Zhi-Hua Zhou
Multi-layered representation is believed to be the key ingredient of deep neural networks especially in cognitive tasks like computer vision.
1 code implementation • 25 May 2018 • Jing-Cheng Shi, Yang Yu, Qing Da, Shi-Yong Chen, An-Xiang Zeng
Applying reinforcement learning in physical-world tasks is extremely challenging.
no code implementations • 23 Mar 2018 • Yang Yu, Vincent Ng
Keyphrases are an efficient representation of the main ideas of documents.
1 code implementation • 2 Mar 2018 • Yujing Hu, Qing Da, An-Xiang Zeng, Yang Yu, Yinghui Xu
For better utilizing the correlation between different ranking steps, in this paper, we propose to use reinforcement learning (RL) to learn an optimal ranking policy which maximizes the expected accumulative rewards in a search session.
no code implementations • 1 Mar 2018 • Yang Yu, Kazi Saidul Hasan, Mo Yu, Wei Zhang, Zhiguo Wang
Relation detection is a core component for Knowledge Base Question Answering (KBQA).
1 code implementation • 4 Feb 2018 • Wang-Zhou Dai, Qiu-Ling Xu, Yang Yu, Zhi-Hua Zhou
Perception and reasoning are basic human abilities that are seamlessly connected as part of human intelligence.
3 code implementations • 31 Dec 2017 • Yu-Ren Liu, Yi-Qi Hu, Hong Qian, Chao Qian, Yang Yu
Recent advances in derivative-free optimization allow efficient approximation of the global-optimal solutions of sophisticated functions, such as functions with many local optima, non-differentiable and non-continuous functions.
no code implementations • NeurIPS 2017 • Chao Qian, Jing-Cheng Shi, Yang Yu, Ke Tang, Zhi-Hua Zhou
The problem of selecting the best $k$-element subset from a universe is involved in many applications.
no code implementations • 20 Nov 2017 • Chao Qian, Yang Yu, Ke Tang, Xin Yao, Zhi-Hua Zhou
To provide a general theoretical explanation of the behavior of EAs, it is desirable to study their performance on general classes of combinatorial optimization problems.
no code implementations • 24 May 2017 • Yang Yu, Wei-Yang Qu, Nan Li, Zimin Guo
ASG generates positive and negative samples of seen categories in an unsupervised manner via an adversarial learning strategy.
no code implementations • 31 Oct 2016 • Yang Yu, Wei Zhang, Kazi Hasan, Mo Yu, Bing Xiang, Bo-Wen Zhou
This paper proposes dynamic chunk reader (DCR), an end-to-end neural reading comprehension (RC) model that is able to extract and rank a set of answer candidates from a given document to answer questions.
Ranked #49 on Question Answering on SQuAD1.1 dev
no code implementations • 10 Jun 2016 • Chao Qian, Yang Yu, Zhi-Hua Zhou
Our results imply that the increase of population size, while usually desired in practice, bears the risk of increasing the lower bound of the running time and thus should be carefully considered.
no code implementations • NeurIPS 2015 • Chao Qian, Yang Yu, Zhi-Hua Zhou
Selecting the optimal subset from a large set of variables is a fundamental problem in various learning tasks such as feature selection, sparse regression, dictionary learning, etc.
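As a concrete baseline for the subset selection problem, greedy forward selection for sparse regression is sketched below; note this is the classical greedy heuristic, not the Pareto optimization approach studied in the paper, and the data are synthetic.

```python
import numpy as np

def greedy_subset_selection(X, y, k):
    # Greedy forward selection: repeatedly add the variable that most reduces
    # the least-squares residual error.
    selected = []
    for _ in range(k):
        best, best_err = None, np.inf
        for j in range(X.shape[1]):
            if j in selected:
                continue
            cols = selected + [j]
            w, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)
            err = float(np.sum((X[:, cols] @ w - y) ** 2))
            if err < best_err:
                best, best_err = j, err
        selected.append(best)
    return selected

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
y = X[:, 2] + 0.5 * X[:, 7] + 0.01 * rng.normal(size=100)
print(greedy_subset_selection(X, y, k=2))   # likely recovers columns 2 and 7
```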
no code implementations • 26 Oct 2015 • Yang Yu, Wei Zhang, Chung-Wei Hang, Bing Xiang, Bo-Wen Zhou
In this paper we explore deep learning models with memory component or attention mechanism for question answering task.