no code implementations • 20 Feb 2025 • Pengxiang Ding, Jianfei Ma, Xinyang Tong, Binghong Zou, Xinxin Luo, Yiguo Fan, Ting Wang, Hongchao Lu, Panzhong Mo, Jinxin Liu, Yuefan Wang, Huaicheng Zhou, Wenshuo Feng, Jiacheng Liu, Siteng Huang, Donglin Wang
This paper addresses the limitations of current humanoid robot control frameworks, which primarily rely on reactive mechanisms and lack autonomous interaction capabilities due to data scarcity.
1 code implementation • 25 Nov 2024 • Amy Xin, Jinxin Liu, Zijun Yao, Zhicheng Lee, Shulin Cao, Lei Hou, Juanzi Li
Drawing inspiration from the graph modeling of knowledge, AtomR leverages large language models (LLMs) to decompose complex questions into combinations of three atomic knowledge operators, significantly enhancing the reasoning process at both the planning and execution stages.
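To make the operator-composition idea concrete, here is a purely hypothetical sketch that answers a toy question by executing a hand-written plan of made-up atomic operators over a dictionary "knowledge base"; the operator names (retrieve, filter, count), the plan, and the data are illustrative assumptions, not AtomR's actual operator set or its LLM-generated plans.

```python
# Hypothetical sketch: compose a few atomic knowledge operators bottom-up.
# Operator names, plan, and data are illustrative only, not AtomR's.
KB = {
    "Marie Curie": {"field": "physics", "nobel_prizes": ["1903", "1911"]},
    "Albert Einstein": {"field": "physics", "nobel_prizes": ["1921"]},
}

def retrieve(entity, attribute):
    """Atomic operator: look up an attribute of an entity."""
    return KB[entity][attribute]

def filter_items(items, predicate):
    """Atomic operator: keep the items satisfying a condition."""
    return [x for x in items if predicate(x)]

def count(items):
    """Atomic operator: count items."""
    return len(items)

# "How many Nobel prizes did Marie Curie win after 1905?"
answer = count(filter_items(retrieve("Marie Curie", "nobel_prizes"),
                            lambda year: int(year) > 1905))
print(answer)  # -> 1
```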
no code implementations • 27 Oct 2024 • Xiao Tang, Yudan Jiang, Jinxin Liu, Qinghe Du, Dusit Niyato, Zhu Han
This paper reveals the potential of movable antennas in enhancing anti-jamming communication.
1 code implementation • 22 Aug 2024 • Chaoyi Wu, Pengcheng Qiu, Jinxin Liu, Hongfei Gu, Na Li, Ya zhang, Yanfeng Wang, Weidi Xie
To promote further advancements in the application of LLMs to clinical challenges, we have made the MedS-Ins dataset fully accessible and invite the research community to contribute to its expansion. Additionally, we have launched a dynamic leaderboard for MedS-Bench, whose test set we plan to update regularly to track progress and enhance the adaptation of general LLMs to the medical domain.
no code implementations • 20 Aug 2024 • Jinxin Liu, Zao Yang
This work implements Influence Functions (IFs) to trace privacy leakage back to the training data, thereby mitigating privacy concerns of Language Models (LMs).
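For readers unfamiliar with influence functions, a minimal sketch of the standard first-order approximation I(z, z_test) ≈ -∇L(z_test)ᵀ H⁻¹ ∇L(z) is shown below on a toy ridge-regression model; the model, data, and damping term are illustrative assumptions, not the paper's setup for language models.

```python
# Illustrative sketch of the classic influence-function approximation
# I(z, z_test) ~= -grad L(z_test)^T H^{-1} grad L(z),
# here on a toy ridge-regression model; not the paper's implementation.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))           # training inputs
w_true = rng.normal(size=5)
y = X @ w_true + 0.1 * rng.normal(size=100)
lam = 1e-2                              # ridge (damping) term, assumed

# Fit parameters; H is the Hessian of the average regularized squared loss.
H = X.T @ X / len(X) + lam * np.eye(5)
w = np.linalg.solve(H, X.T @ y / len(X))

def grad_loss(x, target):
    """Gradient of the squared loss at a single example."""
    return (x @ w - target) * x

x_test, y_test = rng.normal(size=5), 0.0
H_inv_grad_test = np.linalg.solve(H, grad_loss(x_test, y_test))

# Influence score of each training point on the test loss: a ranking of
# which training examples most affect this test prediction.
influence = np.array([-grad_loss(X[i], y[i]) @ H_inv_grad_test
                      for i in range(len(X))])
print(influence.argsort()[-5:])         # indices of the 5 most influential points
```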
1 code implementation • 23 May 2024 • Jinxin Liu, Xinghong Guo, Zifeng Zhuang, Donglin Wang
The goal of DIDI is to learn a diverse set of skills from a mixture of label-free offline data.
1 code implementation • 14 May 2024 • Zifeng Zhuang, Dengyun Peng, Jinxin Liu, Ziqi Zhang, Donglin Wang
In this work, we introduce the concept of max-return sequence modeling which integrates the goal of maximizing returns into existing sequence models.
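One way to make "integrating return maximization into a sequence model" concrete is to train the model's return head with an expectile loss whose expectile is pushed toward 1, so the predicted return used for conditioning approaches the best return achievable in the data. The tiny MLP, the expectile value, and the random data below are illustrative assumptions, not the paper's architecture.

```python
# Hedged sketch of the "max-return" idea: an expectile loss (tau -> 1) pulls
# the return prediction toward the highest returns seen in the offline data.
# The toy model and data are assumptions, not the paper's sequence model.
import torch

def expectile_loss(pred, target, tau=0.99):
    diff = target - pred
    weight = torch.where(diff > 0, torch.full_like(diff, tau),
                         torch.full_like(diff, 1.0 - tau))
    return (weight * diff.pow(2)).mean()

# Toy "sequence model": predicts return-to-go from a state embedding.
return_head = torch.nn.Sequential(torch.nn.Linear(8, 64), torch.nn.ReLU(),
                                  torch.nn.Linear(64, 1))
opt = torch.optim.Adam(return_head.parameters(), lr=1e-3)

states = torch.randn(256, 8)    # offline batch (illustrative)
returns = torch.randn(256, 1)   # observed returns-to-go

for _ in range(100):
    loss = expectile_loss(return_head(states), returns)
    opt.zero_grad(); loss.backward(); opt.step()

# At inference, the predicted (near-max) return would condition action generation.
print(return_head(states[:1]))
```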
1 code implementation • 10 May 2024 • Yao Lai, Jinxin Liu, David Z. Pan, Ping Luo
We believe our work will offer valuable insights into hardware design, further improving speed and reducing size through the refined search space and our tree-generation methodologies.
1 code implementation • 13 Feb 2024 • Mohammad Ghazi Vakili, Christoph Gorgulla, AkshatKumar Nigam, Dmitry Bezrukov, Daniel Varoli, Alex Aliper, Daniil Polykovsky, Krishna M. Padmanabha Das, Jamie Snider, Anna Lyakisheva, Ardalan Hosseini Mansob, Zhong Yao, Lela Bitar, Eugene Radchenko, Xiao Ding, Jinxin Liu, Fanye Meng, Feng Ren, Yudong Cao, Igor Stagljar, Alán Aspuru-Guzik, Alex Zhavoronkov
The discovery of small molecules with therapeutic potential is a long-standing challenge in chemistry and biology.
no code implementations • 29 Jan 2024 • Ziqi Zhang, Jingzehua Xu, Jinxin Liu, Zifeng Zhuang, Donglin Wang, Miao Liu, Shuai Zhang
Offline reinforcement learning (RL) algorithms can learn better decision-making than the behavior policies by stitching together suboptimal trajectories to derive better ones.
no code implementations • 18 Jan 2024 • Jinxin Liu, Petar Djukic, Michel Kulhandjian, Burak Kantarci
We propose Deep Dict, a deep learning-based lossy time series compressor designed to achieve a high compression ratio while maintaining decompression error within a predefined range.
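The "decompression error within a predefined range" guarantee can be illustrated generically by quantizing prediction residuals with step 2·eps, so the reconstruction never deviates from the input by more than eps; the naive last-value predictor below is an assumption for illustration and is not Deep Dict's learned predictor.

```python
# Generic illustration of bounding reconstruction error: quantize prediction
# residuals with step 2*eps so that |x - x_hat| <= eps everywhere.
# The last-value predictor is an assumption, not Deep Dict's architecture.
import numpy as np

def compress(x, eps):
    codes, prev = [], 0.0
    for v in x:
        q = int(round((v - prev) / (2 * eps)))   # quantized residual
        codes.append(q)
        prev = prev + q * 2 * eps                # decoder-visible reconstruction
    return codes

def decompress(codes, eps):
    out, prev = [], 0.0
    for q in codes:
        prev = prev + q * 2 * eps
        out.append(prev)
    return np.array(out)

x = np.cumsum(np.random.default_rng(1).normal(size=1000))
eps = 0.05
x_hat = decompress(compress(x, eps), eps)
assert np.max(np.abs(x - x_hat)) <= eps + 1e-9   # error stays within the bound
```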
no code implementations • 11 Jan 2024 • Jinxin Liu, Shulin Cao, Jiaxin Shi, Tingjian Zhang, Lunyiu Nie, Linmei Hu, Lei Hou, Juanzi Li
A typical approach to KBQA is semantic parsing, which translates a question into an executable logical form in a formal language.
1 code implementation • 12 Dec 2023 • Ziqi Zhang, Jingzehua Xu, Zifeng Zhuang, Hongyin Zhang, Jinxin Liu, Donglin Wang, Shuai Zhang
Unlike previous clipping approaches, we propose a bi-level proximal policy optimization objective that can dynamically adjust the clipping bound to better reflect the preference (maximizing Return) of these RL tasks.
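For context, below is a minimal sketch of a PPO-style clipped surrogate in which the clipping bound is a tunable parameter rather than a fixed constant; how an outer level would actually adapt the bound is left out, and nothing here is the paper's bi-level objective.

```python
# Sketch: the standard PPO clipped surrogate with an adjustable clipping bound.
# The adaptation rule itself is omitted; this is not the paper's objective.
import torch

def clipped_surrogate(ratio, advantage, clip_eps):
    """Clipped surrogate objective; clip_eps is a tunable bound."""
    unclipped = ratio * advantage
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantage
    return torch.min(unclipped, clipped).mean()

ratio = torch.exp(0.1 * torch.randn(64))   # pi_new / pi_old (illustrative)
advantage = torch.randn(64)

# A tight bound (vanilla PPO default) vs. a looser bound an outer level might
# choose; a dynamic bound lets the update pursue returns more aggressively.
print(clipped_surrogate(ratio, advantage, clip_eps=0.2))
print(clipped_surrogate(ratio, advantage, clip_eps=0.4))
```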
1 code implementation • NeurIPS 2023 • Bowei He, Zexu Sun, Jinxin Liu, Shuai Zhang, Xu Chen, Chen Ma
We theoretically analyze the influence of the generated expert data and the improvement of generalization.
no code implementations • 7 Oct 2023 • Ziqi Zhang, Xiao Xiong, Zifeng Zhuang, Jinxin Liu, Donglin Wang
Studying how to fine-tune policies pre-trained with offline reinforcement learning (RL) is profoundly important for improving the sample efficiency of RL algorithms.
no code implementations • 3 Sep 2023 • Jinxin Liu, Murat Simsek, Michele Nogueira, Burak Kantarci
Timely response of Network Intrusion Detection Systems (NIDS) is constrained by the flow generation process, which requires the accumulation of network packets.
1 code implementation • 19 Jul 2023 • Yachen Kang, Li He, Jinxin Liu, Zifeng Zhuang, Donglin Wang
Due to the existence of the similarity trap, such consistency regularization improperly strengthens the consistency of the model's predictions across segment pairs and thus reduces the confidence in reward learning, since the augmented distribution does not match the original one in PbRL.
no code implementations • 26 Jun 2023 • Yao Lai, Jinxin Liu, Zhentao Tang, Bin Wang, Jianye Hao, Ping Luo
To resolve these challenges, we cast the chip placement as an offline RL formulation and present ChiPFormer that enables learning a transferable placement policy from fixed offline data.
no code implementations • NeurIPS 2023 • Jinxin Liu, Hongyin Zhang, Zifeng Zhuang, Yachen Kang, Donglin Wang, Bin Wang
Naturally, such a paradigm raises three core questions that are not fully answered by prior non-iterative offline RL counterparts like reward-conditioned policy: (q1) What information should we transfer from the inner-level to the outer-level?
no code implementations • 23 Jun 2023 • Jinxin Liu, Lipeng Zu, Li He, Donglin Wang
As a remedy for labor-intensive labeling, we propose to endow offline RL tasks with a small amount of expert data and use this limited expert data to drive intrinsic rewards, thus eliminating the need for extrinsic rewards.
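As a rough illustration of deriving an intrinsic reward from a handful of expert data, the sketch below rewards an agent state by its negative distance to the nearest expert state; the metric and scaling are assumptions, not the paper's reward design.

```python
# Illustrative sketch: an intrinsic reward from a few expert states, defined
# as negative distance to the nearest expert state.  The distance metric and
# scaling are assumptions for illustration, not the paper's exact design.
import numpy as np

expert_states = np.random.default_rng(0).normal(size=(20, 3))   # few expert data

def intrinsic_reward(state):
    d = np.linalg.norm(expert_states - state, axis=1).min()
    return -d   # closer to expert behavior -> higher reward

print(intrinsic_reward(np.zeros(3)))
```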
1 code implementation • 22 Jun 2023 • Jinxin Liu, Ziqi Zhang, Zhenyu Wei, Zifeng Zhuang, Yachen Kang, Sibo Gai, Donglin Wang
Offline reinforcement learning (RL) aims to learn a policy using only pre-collected and fixed data.
1 code implementation • 15 Jun 2023 • Jifan Yu, Xiaozhi Wang, Shangqing Tu, Shulin Cao, Daniel Zhang-li, Xin Lv, Hao Peng, Zijun Yao, Xiaohan Zhang, Hanming Li, Chunyang Li, Zheyuan Zhang, Yushi Bai, Yantao Liu, Amy Xin, Nianyi Lin, Kaifeng Yun, Linlu Gong, Jianhui Chen, Zhili Wu, Yunjia Qi, Weikai Li, Yong Guan, Kaisheng Zeng, Ji Qi, Hailong Jin, Jinxin Liu, Yu Gu, Yuan YAO, Ning Ding, Lei Hou, Zhiyuan Liu, Bin Xu, Jie Tang, Juanzi Li
The unprecedented performance of large language models (LLMs) necessitates improvements in evaluations.
1 code implementation • 25 May 2023 • Yachen Kang, Diyuan Shi, Jinxin Liu, Li He, Donglin Wang
Instead, the agent is provided with fixed offline trajectories and human preferences between pairs of trajectories to extract the dynamics and task information, respectively.
1 code implementation • 23 May 2023 • Ji Qi, Chuchun Zhang, Xiaozhi Wang, Kaisheng Zeng, Jifan Yu, Jinxin Liu, Jiuding Sun, Yuxiang Chen, Lei Hou, Juanzi Li, Bin Xu
In this paper, we present the first benchmark that simulates the evaluation of open information extraction models in the real world, where the syntactic and expressive distributions under the same knowledge meaning may drift in various ways.
2 code implementations • 22 Feb 2023 • Zifeng Zhuang, Kun Lei, Jinxin Liu, Donglin Wang, Yilang Guo
Offline reinforcement learning (RL) is a challenging setting where existing off-policy actor-critic methods perform poorly due to the overestimation of out-of-distribution state-action pairs.
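A common, generic remedy for this overestimation, shown in the sketch below, is to add a conservative term that pushes Q-values down on actions sampled away from the dataset and up on dataset actions; this is a standard regularizer for illustration, not necessarily the mechanism this paper proposes.

```python
# Minimal illustration of a conservative penalty against out-of-distribution
# actions; a generic regularizer, not necessarily this paper's mechanism.
import torch

q_net = torch.nn.Sequential(torch.nn.Linear(4, 64), torch.nn.ReLU(),
                            torch.nn.Linear(64, 1))

s = torch.randn(32, 2)
a_data = torch.randn(32, 2)          # actions seen in the offline dataset
a_rand = torch.rand(32, 2) * 2 - 1   # actions sampled off the data manifold

q_data = q_net(torch.cat([s, a_data], -1))
q_rand = q_net(torch.cat([s, a_rand], -1))

# Conservative term: lower Q on unsupported actions, raise it on dataset actions.
conservative_loss = q_rand.mean() - q_data.mean()
td_loss = torch.zeros(())            # placeholder for the usual Bellman loss
loss = td_loss + 5.0 * conservative_loss
loss.backward()
```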
no code implementations • 8 Oct 2022 • Ji Qi, Bin Xu, Kaisheng Zeng, Jinxin Liu, Jifan Yu, Qi Gao, Juanzi Li, Lei Hou
Document-level relation extraction with graph neural networks faces a fundamental graph-construction gap between training and inference: the gold graph structure is only available during training, which forces most methods to adopt heuristic or syntactic rules to construct a prior graph as a pseudo proxy.
no code implementations • 7 Apr 2022 • Zhiyan Chen, Jinxin Liu, Yu Shen, Murat Simsek, Burak Kantarci, Hussein T. Mouftah, Petar Djukic
Advanced persistent threats (APTs) are a prominent means for cybercriminals to compromise networks, and they are critical due to their long-term and harmful characteristics.
no code implementations • ICLR 2022 • Jinxin Liu, Hongyin Zhang, Donglin Wang
Specifically, DARA emphasizes learning from those source transition pairs that are adaptive for the target environment and mitigates the offline dynamics shift by characterizing state-action-next-state pairs instead of the typical state-action distribution sketched by prior offline RL methods.
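A hedged sketch of such a dynamics-aware correction is shown below: source-domain rewards are adjusted by an estimate of log p_target(s'|s,a) - log p_source(s'|s,a) obtained from classifiers over (s, a, s') and (s, a) pairs; the tiny untrained networks and the weight eta are illustrative assumptions, not the paper's exact estimator.

```python
# Hedged sketch of a dynamics-aware reward correction: source rewards are
# penalized where (s, a, s') transitions are unlikely under the target
# dynamics, with the log-ratio estimated from classifier logits.
# The tiny untrained networks and eta are assumptions for illustration.
import torch

clf_sas = torch.nn.Sequential(torch.nn.Linear(6, 32), torch.nn.ReLU(),
                              torch.nn.Linear(32, 1))   # scores (s, a, s')
clf_sa = torch.nn.Sequential(torch.nn.Linear(4, 32), torch.nn.ReLU(),
                             torch.nn.Linear(32, 1))    # scores (s, a)

def dynamics_gap(s, a, s_next):
    """Estimate log p_target(s'|s,a) - log p_source(s'|s,a) from logits."""
    sas = torch.cat([s, a, s_next], dim=-1)
    sa = torch.cat([s, a], dim=-1)
    return clf_sas(sas).squeeze(-1) - clf_sa(sa).squeeze(-1)

eta = 1.0                                               # illustrative weight
s, a, s_next = torch.randn(8, 2), torch.randn(8, 2), torch.randn(8, 2)
r_source = torch.randn(8)
r_adapted = r_source + eta * dynamics_gap(s, a, s_next) # reward used for offline RL
print(r_adapted)
```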
no code implementations • NeurIPS 2021 • Jinxin Liu, Hao Shen, Donglin Wang, Yachen Kang, Qiangxing Tian
Unsupervised reinforcement learning aims to acquire skills without prior goal representations, where an agent automatically explores an open-ended environment to represent goals and learn the goal-conditioned policy.
no code implementations • 21 Oct 2021 • Yachen Kang, Jinxin Liu, Xin Cao, Donglin Wang
To achieve this, the widely used GAN-inspired IRL method is adopted, and its discriminator, which recognizes policy-generated trajectories, is modified to incorporate a quantification of the dynamics difference.
no code implementations • 29 Aug 2021 • Jinxin Liu, Murat Simsek, Burak Kantarci, Melike Erol-Kantarci, Andrew Malton, Andrew Walenstein
The risk levels are associated with access control decisions recommended by a security policy.
no code implementations • 11 Apr 2021 • Jinxin Liu, Donglin Wang, Qiangxing Tian, Zhengyu Chen
It is important for an agent to learn a widely applicable and general-purpose policy that can achieve diverse goals, including images and text descriptions.
no code implementations • 25 Sep 2019 • Qiangxing Tian, Jinxin Liu, Donglin Wang
By maximizing an information theoretic objective, a few recent methods empower the agent to explore the environment and learn useful skills without supervision.