Search Results for author: Weinan Zhang

Found 178 papers, 79 papers with code

Nested Named Entity Recognition with Span-level Graphs

no code implementations ACL 2022 Juncheng Wan, Dongyu Ru, Weinan Zhang, Yong Yu

In this work, we try to improve the span representation by utilizing retrieval-based span-level graphs, connecting spans and entities in the training data based on n-gram features.

named-entity-recognition Named Entity Recognition +3

LIBER: Lifelong User Behavior Modeling Based on Large Language Models

no code implementations 22 Nov 2024 Chenxu Zhu, Shigang Quan, Bo Chen, Jianghao Lin, Xiaoling Cai, Hong Zhu, Xiangyang Li, Yunjia Xi, Weinan Zhang, Ruiming Tang

On the one hand, it presents difficulties for LLMs in effectively capturing the dynamic shifts in user interests within these sequences, and on the other hand, there exists the issue of substantial computational overhead if the LLMs necessitate recurrent calls upon each update to the user sequences.

Click-Through Rate Prediction Music Recommendation +1

Multi-LLM-Agent Systems: Techniques and Business Perspectives

no code implementations 21 Nov 2024 Yingxuan Yang, Qiuying Peng, Jun Wang, Weinan Zhang

In the era of (multi-modal) large language models, most operational processes can be reformulated and reproduced using LLM agents.

Unstructured Text Enhanced Open-domain Dialogue System: A Systematic Survey

no code implementations 14 Nov 2024 Longxuan Ma, Mingda Li, Weinan Zhang, Jiapeng Li, Ting Liu

The retrieval models consist of Fusion, Matching, and Ranking modules, while the generative models comprise Dialogue and Knowledge Encoding, Knowledge Selection, and Response Generation modules.

Dialogue Generation Response Generation +2

Beyond Positive History: Re-ranking with List-level Hybrid Feedback

no code implementations 28 Oct 2024 Muyan Weng, Yunjia Xi, Weiwen Liu, Bo Chen, Jianghao Lin, Ruiming Tang, Weinan Zhang, Yong Yu

It captures a user's preferences and behavior patterns with three modules: a Disentangled Interest Miner to disentangle the user's preferences into interests and disinterests, a Sequential Preference Mixer to learn the user's entangled preferences considering the context of feedback, and a Comparison-aware Pattern Extractor to capture the user's behavior patterns within each list.

Contrastive Learning Recommendation Systems +1

Learning ID-free Item Representation with Token Crossing for Multimodal Recommendation

no code implementations 25 Oct 2024 Kangning Zhang, Jiarui Jin, Yingjie Qin, Ruilong Su, Jianghao Lin, Yong Yu, Weinan Zhang

Furthermore, the unique nature of item-specific ID embeddings hinders information exchange among related items, and the space required for ID embeddings grows with the number of items.

Multimodal Recommendation Quantization

Building Dialogue Understanding Models for Low-resource Language Indonesian from Scratch

no code implementations 24 Oct 2024 Donglin Di, Weinan Zhang, Yue Zhang, Fanglin Wang

Using off-the-shelf resources from resource-rich languages to transfer knowledge to low-resource languages has attracted much attention recently.

Cross-Lingual Transfer Decoder +6

Unleashing the Potential of Multi-Channel Fusion in Retrieval for Personalized Recommendations

no code implementations 21 Oct 2024 JunJie Huang, Jiarui Qin, Jianghao Lin, Ziming Feng, Yong Yu, Weinan Zhang

Despite advancements in individual retrieval methods, multi-channel fusion, the process of efficiently merging multi-channel retrieval results, remains underexplored.

Bayesian Optimization Recommendation Systems +1

Agentic Information Retrieval

no code implementations 13 Oct 2024 Weinan Zhang, Junwei Liao, Ning li, Kounianhua Du

We propose that agentic IR holds promise for generating innovative applications, potentially becoming a central information entry point in future digital ecosystems.

Information Retrieval Recommendation Systems +1

ELF-Gym: Evaluating Large Language Models Generated Features for Tabular Prediction

1 code implementation 13 Oct 2024 Yanlin Zhang, Ning li, Quan Gan, Weinan Zhang, David Wipf, Minjie Wang

But despite this potential, evaluations thus far are primarily based on the end performance of a complete ML pipeline, providing limited insight into precisely how LLMs behave relative to human experts in feature engineering.

Feature Engineering

OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models

1 code implementation 12 Oct 2024 Jun Wang, Meng Fang, Ziyu Wan, Muning Wen, Jiachen Zhu, Anjie Liu, Ziqin Gong, Yan Song, Lei Chen, Lionel M. Ni, Linyi Yang, Ying Wen, Weinan Zhang

Inspired by the success of OpenAI's o1 model, which demonstrated improved reasoning abilities through step-by-step reasoning and reinforcement learning, OpenR integrates test-time compute, reinforcement learning, and process supervision to improve reasoning in LLMs.

Math reinforcement-learning +1

Hammer: Robust Function-Calling for On-Device Language Models via Function Masking

1 code implementation 6 Oct 2024 Qiqiang Lin, Muning Wen, Qiuying Peng, Guanyu Nie, Junwei Liao, Xiaoyun Mo, Jiamu Zhou, Cheng Cheng, Yin Zhao, Jun Wang, Weinan Zhang

Large language models have demonstrated impressive value in performing as autonomous agents when equipped with external tools and API calls.

GenSim2: Scaling Robot Data Generation with Multi-modal and Reasoning LLMs

no code implementations 4 Oct 2024 Pu Hua, Minghuan Liu, Annabella Macaluso, Yunfeng Lin, Weinan Zhang, Huazhe Xu, Lirui Wang

The pipeline can generate data for up to 100 articulated tasks with 200 objects and reduce the required human effort.

Computing Ex Ante Equilibrium in Heterogeneous Zero-Sum Team Games

no code implementations 2 Oct 2024 Naming Liu, Mingzhi Wang, Xihuai Wang, Weinan Zhang, Yaodong Yang, Youzhi Zhang, Bo An, Ying Wen

Such insufficient policy expressiveness causes Team PSRO to be trapped in a sub-optimal ex ante equilibrium with significantly higher exploitability and to never converge to the global ex ante equilibrium.

LoopSR: Looping Sim-and-Real for Lifelong Policy Adaptation of Legged Robots

no code implementations 26 Sep 2024 Peilin Wu, Weiji Xie, Jiahang Cao, Hang Lai, Weinan Zhang

Reinforcement Learning (RL) has shown remarkable and generalizable capability in legged locomotion through sim-to-real transfer.

Contrastive Learning Decoder +1

World Model-based Perception for Visual Legged Locomotion

no code implementations 25 Sep 2024 Hang Lai, Jiahang Cao, Jiafeng Xu, Hongtao Wu, Yunfeng Lin, Tao Kong, Yong Yu, Weinan Zhang

To address this issue, traditional methods attempt to learn a teacher policy with access to privileged information first and then learn a student policy to imitate the teacher's behavior with visual input.

RethinkMCTS: Refining Erroneous Thoughts in Monte Carlo Tree Search for Code Generation

no code implementations 15 Sep 2024 Qingyao Li, Wei Xia, Kounianhua Du, Xinyi Dai, Ruiming Tang, Yasheng Wang, Yong Yu, Weinan Zhang

More importantly, we construct verbal feedback from fine-grained code execution feedback to refine erroneous thoughts during the search.

Code Generation HumanEval

Autonomous Goal Detection and Cessation in Reinforcement Learning: A Case Study on Source Term Estimation

no code implementations 14 Sep 2024 Yiwei Shi, Muning Wen, Qi Zhang, Weinan Zhang, Cunjia Liu, Weiru Liu

Reinforcement Learning has revolutionized decision-making processes in dynamic environments, yet it often struggles with autonomously detecting and achieving goals without clear feedback signals.

Decision Making

A Survey on Diffusion Models for Recommender Systems

1 code implementation 8 Sep 2024 Jianghao Lin, Jiaqi Liu, Jiachen Zhu, Yunjia Xi, Chengkai Liu, Yangtian Zhang, Yong Yu, Weinan Zhang

While traditional recommendation techniques have made significant strides in the past decades, they still suffer from limited generalization performance caused by factors like inadequate collaborative signals, weak latent representations, and noisy data.

Data Augmentation Recommendation Systems +1

A Decoding Acceleration Framework for Industrial Deployable LLM-based Recommender Systems

1 code implementation 11 Aug 2024 Yunjia Xi, Hangyu Wang, Bo Chen, Jianghao Lin, Menghui Zhu, Weiwen Liu, Ruiming Tang, Weinan Zhang, Yong Yu

This generation inefficiency stems from the autoregressive nature of LLMs, and a promising direction for acceleration is speculative decoding, a Draft-then-Verify paradigm that increases the number of generated tokens per decoding step.

Recommendation Systems Retrieval

P3: A Policy-Driven, Pace-Adaptive, and Diversity-Promoted Framework for data pruning in LLM Training

no code implementations 10 Aug 2024 Yingxuan Yang, Huayi Wang, Muning Wen, Xiaoyun Mo, Qiuying Peng, Jun Wang, Weinan Zhang

In the rapidly advancing field of Large Language Models (LLMs), effectively leveraging existing datasets during fine-tuning to maximize the model's potential is of paramount importance.

Diversity Logical Reasoning +1

Lifelong Personalized Low-Rank Adaptation of Large Language Models for Recommendation

no code implementations 7 Aug 2024 Jiachen Zhu, Jianghao Lin, Xinyi Dai, Bo Chen, Rong Shan, Jieming Zhu, Ruiming Tang, Yong Yu, Weinan Zhang

Thus, LLMs only see a small fraction of the datasets (e.g., less than 10%) instead of the whole datasets, limiting their exposure to the full training space.

Logical Reasoning Recommendation Systems +1

SR-CIS: Self-Reflective Incremental System with Decoupled Memory and Reasoning

no code implementations 4 Aug 2024 Biqing Qi, Junqi Gao, Xinquan Chen, Dong Li, Weinan Zhang, BoWen Zhou

The ability of humans to rapidly learn new knowledge while retaining old memories poses a significant challenge for current deep learning models.

Anomaly Detection Incremental Learning

Diffusion-DICE: In-Sample Diffusion Guidance for Offline Reinforcement Learning

no code implementations 29 Jul 2024 Liyuan Mao, Haoran Xu, Xianyuan Zhan, Weinan Zhang, Amy Zhang

In this work, we show that DICE-based methods can be viewed as a transformation from the behavior distribution to the optimal policy distribution.

Offline RL reinforcement-learning +1

A Comprehensive Survey on Retrieval Methods in Recommender Systems

no code implementations 11 Jul 2024 JunJie Huang, Jizheng Chen, Jianghao Lin, Jiarui Qin, Ziming Feng, Weinan Zhang, Yong Yu

By detailing the retrieval stage, which is fundamental for effective recommendation, this survey aims to bridge the existing knowledge gap and serve as a cornerstone for researchers interested in optimizing this critical component of cascade recommender systems.

Benchmarking Recommendation Systems +2

MemoCRS: Memory-enhanced Sequential Conversational Recommender Systems with Large Language Models

1 code implementation 6 Jul 2024 Yunjia Xi, Weiwen Liu, Jianghao Lin, Bo Chen, Ruiming Tang, Weinan Zhang, Yong Yu

The preferences embedded in the user's historical dialogue sessions and the current session exhibit continuity and sequentiality, and we refer to CRSs with this characteristic as sequential CRSs.

Recommendation Systems

SINKT: A Structure-Aware Inductive Knowledge Tracing Model with Large Language Model

1 code implementation 1 Jul 2024 Lingyue Fu, Hao Guan, Kounianhua Du, Jianghao Lin, Wei Xia, Weinan Zhang, Ruiming Tang, Yasheng Wang, Yong Yu

Knowledge Tracing (KT) aims to determine whether students will respond correctly to the next question, which is a crucial task in intelligent tutoring systems (ITS).

Knowledge Tracing Language Modelling +1

ELCoRec: Enhance Language Understanding with Co-Propagation of Numerical and Categorical Features for Recommendation

no code implementations 27 Jun 2024 Jizheng Chen, Kounianhua Du, Jianghao Lin, Bo Chen, Ruiming Tang, Weinan Zhang

Concretely, we propose to inject the preference understanding capability into LLM via a GAT expert model where the user preference is better encoded by parallelly propagating the temporal relations, and rating signals as well as various side information of historical items.

OmniH2O: Universal and Dexterous Human-to-Humanoid Whole-Body Teleoperation and Learning

no code implementations 13 Jun 2024 Tairan He, Zhengyi Luo, Xialin He, Wenli Xiao, Chong Zhang, Weinan Zhang, Kris Kitani, Changliu Liu, Guanya Shi

We present OmniH2O (Omni Human-to-Humanoid), a learning-based system for whole-body humanoid teleoperation and autonomy.

RACon: Retrieval-Augmented Simulated Character Locomotion Control

no code implementations 11 Jun 2024 Yuxuan Mu, Shihao Zou, Kangning Yin, Zheng Tian, Li Cheng, Weinan Zhang, Jun Wang

The retriever searches motion experts from a user-specified database in a task-oriented fashion, which boosts the responsiveness to the user's control.

Hierarchical Reinforcement Learning Retrieval

Large Language Models Make Sample-Efficient Recommender Systems

no code implementations 4 Jun 2024 Jianghao Lin, Xinyi Dai, Rong Shan, Bo Chen, Ruiming Tang, Yong Yu, Weinan Zhang

Hence, we propose and verify our core viewpoint: Large Language Models Make Sample-Efficient Recommender Systems.

Recommendation Systems

Unraveling and Mitigating Retriever Inconsistencies in Retrieval-Augmented Large Language Models

1 code implementation 31 May 2024 Mingda Li, Xinyu Li, Yifan Chen, Wenfeng Xuan, Weinan Zhang

Although Retrieval-Augmented Large Language Models (RALMs) demonstrate their superiority in terms of factuality, they do not consistently outperform the original retrieval-free Language Models (LMs).

Open-Domain Question Answering Retrieval

Long-Horizon Rollout via Dynamics Diffusion for Offline Reinforcement Learning

1 code implementation 29 May 2024 Hanye Zhao, Xiaoshen Han, Zhengbang Zhu, Minghuan Liu, Yong Yu, Weinan Zhang

We propose Dynamics Diffusion, DyDiff for short, which can iteratively inject information from the learning policy into DMs.

Decision Making reinforcement-learning +1

Diffusion-based Reinforcement Learning via Q-weighted Variational Policy Optimization

no code implementations 25 May 2024 Shutong Ding, Ke Hu, Zhenhao Zhang, Kan Ren, Weinan Zhang, Jingyi Yu, Jingya Wang, Ye Shi

To overcome this, we propose a novel model-free diffusion-based online RL algorithm, Q-weighted Variational Policy Optimization (QVPO).

continuous-control Continuous Control +4

Look into the Future: Deep Contextualized Sequential Recommendation

no code implementations 23 May 2024 Lei Zheng, Ning li, Yanhuan Huang, Ruiwen Xu, Weinan Zhang, Yong Yu

In LIFT, the context of a target user's interaction is represented based on i) his own past behaviors and ii) the past and future behaviors of the retrieved similar interactions from other users.

Click-Through Rate Prediction Retrieval +1

Learning Structure and Knowledge Aware Representation with Large Language Models for Concept Recommendation

no code implementations 21 May 2024 Qingyao Li, Wei Xia, Kounianhua Du, Qiji Zhang, Weinan Zhang, Ruiming Tang, Yong Yu

However, integrating LLMs into concept recommendation presents two urgent challenges: 1) How to construct text for concepts that effectively incorporate the human knowledge system?

Contrastive Learning Knowledge Tracing +1

DisCo: Towards Harmonious Disentanglement and Collaboration between Tabular and Semantic Space for Recommendation

1 code implementation 20 May 2024 Kounianhua Du, Jizheng Chen, Jianghao Lin, Yunjia Xi, Hangyu Wang, Xinyi Dai, Bo Chen, Ruiming Tang, Weinan Zhang

In this paper, we propose DisCo to Disentangle the unique patterns from the two representation spaces and Collaborate the two spaces for recommendation enhancement, where both the specificity and the consistency of the two spaces are captured.

Disentanglement Recommendation Systems +1

FINED: Feed Instance-Wise Information Need with Essential and Disentangled Parametric Knowledge from the Past

no code implementations 20 May 2024 Kounianhua Du, Jizheng Chen, Jianghao Lin, Menghui Zhu, Bo Chen, Shuai Li, Yong Yu, Weinan Zhang

In this paper, we propose FINED to Feed INstance-wise information need with Essential and Disentangled parametric knowledge from past data for recommendation enhancement.

Disentanglement Memorization

CodeGRAG: Bridging the Gap between Natural Language and Programming Language via Graphical Retrieval Augmented Generation

no code implementations 3 May 2024 Kounianhua Du, Jizheng Chen, Renting Rui, Huacan Chai, Lingyue Fu, Wei Xia, Yasheng Wang, Ruiming Tang, Yong Yu, Weinan Zhang

Despite the intelligence shown by the general large language models, their specificity in code generation can still be improved due to the syntactic gap and mismatched vocabulary existing among natural language and different programming languages.

Code Generation Language Modelling +3

4DBInfer: A 4D Benchmarking Toolbox for Graph-Centric Predictive Modeling on Relational DBs

1 code implementation 28 Apr 2024 Minjie Wang, Quan Gan, David Wipf, Zhenkun Cai, Ning li, Jianheng Tang, Yanlin Zhang, Zizhao Zhang, Zunyao Mao, Yakun Song, Yanbo Wang, Jiahang Li, Han Zhang, Guang Yang, Xiao Qin, Chuan Lei, Muhan Zhang, Weinan Zhang, Christos Faloutsos, Zheng Zhang

Although RDBs store vast amounts of rich, informative data spread across interconnected tables, the progress of predictive machine learning models as applied to such tasks arguably falls well behind advances in other domains such as computer vision or natural language processing.

Benchmarking

Retrieval and Distill: A Temporal Data Shift-Free Paradigm for Online Recommendation System

no code implementations 24 Apr 2024 Lei Zheng, Ning li, Weinan Zhang, Yong Yu

Current recommendation systems are significantly affected by a serious issue of temporal data shift, which is the inconsistency between the distribution of historical data and that of online data.

Recommendation Systems Retrieval

DREAM: A Dual Representation Learning Model for Multimodal Recommendation

no code implementations 17 Apr 2024 Kangning Zhang, Yingjie Qin, Jiarui Jin, Yifan Liu, Ruilong Su, Weinan Zhang, Yong Yu

For sufficient information extraction, we introduce separate dual lines, including Behavior Line and Modal Line, in which the Modal-specific Encoder is applied to empower modal representations.

Multimodal Recommendation Representation Learning

Recall-Augmented Ranking: Enhancing Click-Through Rate Prediction Accuracy with Cross-Stage Data

no code implementations 15 Apr 2024 JunJie Huang, Guohao Cai, Jieming Zhu, Zhenhua Dong, Ruiming Tang, Weinan Zhang, Yong Yu

RAR consists of two key sub-modules, which synergistically gather information from a vast pool of look-alike users and recall items, resulting in enriched user representations.

Click-Through Rate Prediction

M-scan: A Multi-Scenario Causal-driven Adaptive Network for Recommendation

no code implementations 11 Apr 2024 Jiachen Zhu, Yichao Wang, Jianghao Lin, Jiarui Qin, Ruiming Tang, Weinan Zhang, Yong Yu

Furthermore, through causal graph analysis, we have discovered that the scenario itself directly influences click behavior, yet existing approaches directly incorporate data from other scenarios during the training of the current scenario, leading to prediction biases when they directly utilize click behaviors from other scenarios to train models.

counterfactual Counterfactual Inference

Play to Your Strengths: Collaborative Intelligence of Conventional Recommender Models and Large Language Models

no code implementations 25 Mar 2024 Yunjia Xi, Weiwen Liu, Jianghao Lin, Chuhan Wu, Bo Chen, Ruiming Tang, Weinan Zhang, Yong Yu

The rise of large language models (LLMs) has opened new opportunities in Recommender Systems (RSs) by enhancing user behavior modeling and content understanding.

Language Modelling Large Language Model +1

AlignRec: Aligning and Training in Multimodal Recommendations

1 code implementation 19 Mar 2024 Yifan Liu, Kangning Zhang, Xiangyuan Ren, Yanhua Huang, Jiarui Jin, Yingjie Qin, Ruilong Su, Ruiwen Xu, Yong Yu, Weinan Zhang

Each alignment is characterized by a specific objective function and is integrated into our multimodal recommendation framework.

Multimodal Recommendation

TRAD: Enhancing LLM Agents with Step-Wise Thought Retrieval and Aligned Decision

1 code implementation 10 Mar 2024 Ruiwen Zhou, Yingxuan Yang, Muning Wen, Ying Wen, Wenhao Wang, Chunling Xi, Guoqiang Xu, Yong Yu, Weinan Zhang

Many of these works utilize in-context examples to achieve generalization without the need for fine-tuning, while few have considered the problem of how to select and effectively utilize these examples.

Language Modelling Large Language Model +2

Looking Ahead to Avoid Being Late: Solving Hard-Constrained Traveling Salesman Problem

no code implementations 8 Mar 2024 Jingxiao Chen, Ziqin Gong, Minghuan Liu, Jun Wang, Yong Yu, Weinan Zhang

To overcome this problem and handle hard constraints effectively, we propose a novel learning-based method that uses look-ahead information as a feature to improve the legality of solutions to TSP with Time Windows (TSPTW).

Traveling Salesman Problem

Towards Efficient and Effective Unlearning of Large Language Models for Recommendation

1 code implementation 6 Mar 2024 Hangyu Wang, Jianghao Lin, Bo Chen, Yang Yang, Ruiming Tang, Weinan Zhang, Yong Yu

However, in order to protect user privacy and optimize utility, it is also crucial for LLMRec to intentionally forget specific user data, which is generally referred to as recommendation unlearning.

World Knowledge

Offline Fictitious Self-Play for Competitive Games

no code implementations 29 Feb 2024 Jingxiao Chen, Weiji Xie, Weinan Zhang, Yong Yu, Ying Wen

First, without knowledge of the game structure, it is impossible to interact with the opponents and conduct self-play, a major learning paradigm for competitive games.

Offline RL Reinforcement Learning (RL)

Learning an Actionable Discrete Diffusion Policy via Large-Scale Actionless Video Pre-Training

no code implementations 22 Feb 2024 Haoran He, Chenjia Bai, Ling Pan, Weinan Zhang, Bin Zhao, Xuelong Li

In the pre-training stage, we employ a discrete diffusion model with a mask-and-replace diffusion strategy to predict future video tokens in the latent space.

Entropy-Regularized Token-Level Policy Optimization for Language Agent Reinforcement

1 code implementation 9 Feb 2024 Muning Wen, Junwei Liao, Cheng Deng, Jun Wang, Weinan Zhang, Ying Wen

We assess the effectiveness of ETPO within a simulated environment that models data science code generation as a series of multi-step interactive tasks; results underline ETPO's potential as a robust method for refining the interactive decision-making capabilities of language agents.

Code Generation Decision Making +3

CityFlowER: An Efficient and Realistic Traffic Simulator with Embedded Machine Learning Models

no code implementations 9 Feb 2024 Longchao Da, Chen Chu, Weinan Zhang, Hua Wei

Addressing these limitations, we introduce CityFlowER, an advancement over the existing CityFlow simulator, designed for efficient and realistic city-wide traffic simulation.

Contrastive Diffuser: Planning Towards High Return States via Contrastive Learning

no code implementations 5 Feb 2024 Yixiang Shan, Zhengbang Zhu, Ting Long, Qifan Liang, Yi Chang, Weinan Zhang, Liang Yin

The performance of offline reinforcement learning (RL) is sensitive to the proportion of high-return trajectories in the offline dataset.

Contrastive Learning D4RL +2

DiffStitch: Boosting Offline Reinforcement Learning with Diffusion-based Trajectory Stitching

no code implementations 4 Feb 2024 Guanghe Li, Yixiang Shan, Zhengbang Zhu, Ting Long, Weinan Zhang

In offline reinforcement learning (RL), the performance of the learned policy highly depends on the quality of offline datasets.

D4RL Data Augmentation +4

ODICE: Revealing the Mystery of Distribution Correction Estimation via Orthogonal-gradient Update

1 code implementation 1 Feb 2024 Liyuan Mao, Haoran Xu, Weinan Zhang, Xianyuan Zhan

To resolve this issue, we propose a simple yet effective modification that projects the backward gradient onto the normal plane of the forward gradient, resulting in an orthogonal-gradient update, a new learning rule for DICE-based methods.

Imitation Learning Offline RL +1

InfoRank: Unbiased Learning-to-Rank via Conditional Mutual Information Minimization

no code implementations 23 Jan 2024 Jiarui Jin, Zexue He, Mengyue Yang, Weinan Zhang, Yong Yu, Jun Wang, Julian McAuley

Subsequently, we minimize the mutual information between the observation estimation and the relevance estimation conditioned on the input features.

Learning-To-Rank Recommendation Systems

D2K: Turning Historical Data into Retrievable Knowledge for Recommender Systems

no code implementations 21 Jan 2024 Jiarui Qin, Weiwen Liu, Ruiming Tang, Weinan Zhang, Yong Yu

A personalized knowledge adaptation unit is devised to effectively exploit the information from the knowledge base by adapting the retrieved knowledge to the target samples.

Recommendation Systems

Adapting Large Language Models for Education: Foundational Capabilities, Potentials, and Challenges

no code implementations 27 Dec 2023 Qingyao Li, Lingyue Fu, Weiming Zhang, Xianyu Chen, Jingwei Yu, Wei Xia, Weinan Zhang, Ruiming Tang, Yong Yu

Solving the problems encountered by students poses a significant challenge for traditional deep learning models, as it requires not only a broad spectrum of subject knowledge but also the ability to understand what constitutes a student's individual difficulties.

Question Answering

GFS: Graph-based Feature Synthesis for Prediction over Relational Databases

no code implementations 4 Dec 2023 Han Zhang, Quan Gan, David Wipf, Weinan Zhang

Consequently, the prevalent approach for training machine learning models on data stored in relational databases involves performing feature engineering to merge the data from multiple tables into a single table and subsequently applying single table models.

Feature Engineering Inductive Bias

Vision-Language Foundation Models as Effective Robot Imitators

no code implementations 2 Nov 2023 Xinghang Li, Minghuan Liu, Hanbo Zhang, Cunjun Yu, Jie Xu, Hongtao Wu, Chilam Cheang, Ya Jing, Weinan Zhang, Huaping Liu, Hang Li, Tao Kong

We believe RoboFlamingo has the potential to be a cost-effective and easy-to-use solution for robotics manipulation, empowering everyone with the ability to fine-tune their own robotics policy.

Imitation Learning

FLIP: Fine-grained Alignment between ID-based Models and Pretrained Language Models for CTR Prediction

1 code implementation 30 Oct 2023 Hangyu Wang, Jianghao Lin, Xiangyang Li, Bo Chen, Chenxu Zhu, Ruiming Tang, Weinan Zhang, Yong Yu

The traditional ID-based models for CTR prediction take as inputs the one-hot encoded ID features of tabular modality, which capture the collaborative signals via feature interaction modeling.

Click-Through Rate Prediction Contrastive Learning +1

Specify Robust Causal Representation from Mixed Observations

1 code implementation 21 Oct 2023 Mengyue Yang, Xinyu Cai, Furui Liu, Weinan Zhang, Jun Wang

Under the hypothesis that the intrinsic latent factors follow some causal generative models, we argue that by learning a causal representation, which comprises the minimal sufficient causes of the whole system, we can improve the robustness and generalization performance of machine learning models.

GEAR: A GPU-Centric Experience Replay System for Large Reinforcement Learning Models

1 code implementation 8 Oct 2023 Hanjing Wang, Man-Kit Sit, Congjie He, Ying Wen, Weinan Zhang, Jun Wang, Yaodong Yang, Luo Mai

This paper introduces a distributed, GPU-centric experience replay system, GEAR, designed to perform scalable reinforcement learning (RL) with large sequence models (such as transformers).

Reinforcement Learning (RL)

ZSC-Eval: An Evaluation Toolkit and Benchmark for Multi-agent Zero-shot Coordination

2 code implementations 8 Oct 2023 Xihuai Wang, Shao Zhang, WenHao Zhang, Wentao Dong, Jingxiao Chen, Ying Wen, Weinan Zhang

ZSC-Eval consists of: 1) Generation of evaluation partner candidates through behavior-preferring rewards to approximate deployment-time partners' distribution; 2) Selection of evaluation partners by Best-Response Diversity (BR-Div); 3) Measurement of generalization performance with various evaluation partners via the Best-Response Proximity (BR-Prox) metric.

Diversity Multi-agent Reinforcement Learning

Alphazero-like Tree-Search can Guide Large Language Model Decoding and Training

1 code implementation 29 Sep 2023 Xidong Feng, Ziyu Wan, Muning Wen, Stephen Marcus McAleer, Ying Wen, Weinan Zhang, Jun Wang

Empirical results across reasoning, planning, alignment, and decision-making tasks show that TS-LLM outperforms existing approaches and can handle trees with a depth of 64.

Decision Making Language Modelling +1

CodeApex: A Bilingual Programming Evaluation Benchmark for Large Language Models

1 code implementation 5 Sep 2023 Lingyue Fu, Huacan Chai, Shuang Luo, Kounianhua Du, Weiming Zhang, Longteng Fan, Jiayi Lei, Renting Rui, Jianghao Lin, Yuchen Fang, Yifan Liu, Jingkuan Wang, Siyuan Qi, Kangning Zhang, Weinan Zhang, Yong Yu

With the emergence of Large Language Models (LLMs), there has been a significant improvement in the programming capabilities of models, attracting growing attention from researchers.

Code Generation Multiple-choice

ReLLa: Retrieval-enhanced Large Language Models for Lifelong Sequential Behavior Comprehension in Recommendation

1 code implementation 22 Aug 2023 Jianghao Lin, Rong Shan, Chenxu Zhu, Kounianhua Du, Bo Chen, Shigang Quan, Ruiming Tang, Yong Yu, Weinan Zhang

With large language models (LLMs) achieving remarkable breakthroughs in natural language processing (NLP) domains, LLM-enhanced recommender systems have received much attention and are being actively explored.

Data Augmentation Language Modelling +3

Through the Lens of Core Competency: Survey on Evaluation of Large Language Models

no code implementations 15 Aug 2023 Ziyu Zhuang, Qiguang Chen, Longxuan Ma, Mingda Li, Yi Han, Yushan Qian, Haopeng Bai, Zixian Feng, Weinan Zhang, Ting Liu

From pre-trained language model (PLM) to large language model (LLM), the field of natural language processing (NLP) has witnessed steep performance gains and wide practical uses.

Language Modelling Large Language Model

Replace Scoring with Arrangement: A Contextual Set-to-Arrangement Framework for Learning-to-Rank

no code implementations 5 Aug 2023 Jiarui Jin, Xianyu Chen, Weinan Zhang, Mengyue Yang, Yang Wang, Yali Du, Yong Yu, Jun Wang

Noticing that these ranking metrics do not consider the effects of the contextual dependence among the items in the list, we design a new family of simulation-based ranking metrics, where existing metrics can be regarded as special cases.

Learning-To-Rank

MAP: A Model-agnostic Pretraining Framework for Click-through Rate Prediction

1 code implementation 3 Aug 2023 Jianghao Lin, Yanru Qu, Wei Guo, Xinyi Dai, Ruiming Tang, Yong Yu, Weinan Zhang

The large capacity of neural models helps digest such massive amounts of data under the supervised learning paradigm, yet they fail to utilize the substantial data to its full potential, since the 1-bit click signal is not sufficient to guide the model to learn capable representations of features and instances.

Binary Classification Click-Through Rate Prediction +1

Learning Multi-Agent Intention-Aware Communication for Optimal Multi-Order Execution in Finance

no code implementations 6 Jul 2023 Yuchen Fang, Zhenggang Tang, Kan Ren, Weiqing Liu, Li Zhao, Jiang Bian, Dongsheng Li, Weinan Zhang, Yong Yu, Tie-Yan Liu

Order execution is a fundamental task in quantitative finance, aiming at finishing acquisition or liquidation for a number of trading orders of the specific assets.

Reinforcement Learning (RL)

Is Risk-Sensitive Reinforcement Learning Properly Resolved?

no code implementations 2 Jul 2023 Ruiwen Zhou, Minghuan Liu, Kan Ren, Xufang Luo, Weinan Zhang, Dongsheng Li

Due to the nature of risk management in learning applicable policies, risk-sensitive reinforcement learning (RSRL) has been realized as an important direction.

Distributional Reinforcement Learning Management +3

Large Sequence Models for Sequential Decision-Making: A Survey

no code implementations 24 Jun 2023 Muning Wen, Runji Lin, Hanjing Wang, Yaodong Yang, Ying Wen, Luo Mai, Jun Wang, Haifeng Zhang, Weinan Zhang

Transformer architectures have facilitated the development of large-scale and general-purpose sequence models for prediction tasks in natural language processing and computer vision, e.g., GPT-3 and Swin Transformer.

Decision Making Sequential Decision Making +1

Towards Open-World Recommendation with Knowledge Augmentation from Large Language Models

1 code implementation 19 Jun 2023 Yunjia Xi, Weiwen Liu, Jianghao Lin, Xiaoling Cai, Hong Zhu, Jieming Zhu, Bo Chen, Ruiming Tang, Weinan Zhang, Rui Zhang, Yong Yu

In this work, we propose an Open-World Knowledge Augmented Recommendation Framework with Large Language Models, dubbed KAR, to acquire two types of external knowledge from LLMs -- the reasoning knowledge on user preferences and the factual knowledge on items.

Music Recommendation Recommendation Systems +1

ReLoop2: Building Self-Adaptive Recommendation Models via Responsive Error Compensation Loop

2 code implementations 15 Jun 2023 Jieming Zhu, Guohao Cai, JunJie Huang, Zhenhua Dong, Ruiming Tang, Weinan Zhang

The error memory module is designed with fast access capabilities and undergoes continual refreshing with newly observed data samples during the model serving phase to support fast model adaptation.

Recommendation Systems

MetricPrompt: Prompting Model as a Relevance Metric for Few-shot Text Classification

1 code implementation 15 Jun 2023 Hongyuan Dong, Weinan Zhang, Wanxiang Che

Despite the promising prospects, the performance of a prompting model largely depends on the design of the prompt template and verbalizer.

Few-Shot Text Classification text-classification

I run as fast as a rabbit, can you? A Multilingual Simile Dialogue Dataset

1 code implementation 9 Jun 2023 Longxuan Ma, Weinan Zhang, Shuhan Zhou, Churui Sun, Changxin Ke, Ting Liu

Meanwhile, the MSD data can also be used on dialogue tasks to test the ability of dialogue systems when using similes.

Retrieval Sentence +1

How Can Recommender Systems Benefit from Large Language Models: A Survey

1 code implementation 9 Jun 2023 Jianghao Lin, Xinyi Dai, Yunjia Xi, Weiwen Liu, Bo Chen, Hao Zhang, Yong Liu, Chuhan Wu, Xiangyang Li, Chenxu Zhu, Huifeng Guo, Yong Yu, Ruiming Tang, Weinan Zhang

In this paper, we conduct a comprehensive survey on this research direction from the perspective of the whole pipeline in real-world recommender systems.

Ethics Feature Engineering +5

Set-to-Sequence Ranking-based Concept-aware Learning Path Recommendation

no code implementations 7 Jun 2023 Xianyu Chen, Jian Shen, Wei Xia, Jiarui Jin, Yakun Song, Weinan Zhang, Weiwen Liu, Menghui Zhu, Ruiming Tang, Kai Dong, Dingyin Xia, Yong Yu

Noticing that existing approaches fail to consider the correlations of concepts in the path, we propose a novel framework named Set-to-Sequence Ranking-based Concept-aware Learning Path Recommendation (SRC), which formulates the recommendation task under a set-to-sequence paradigm.

Decoder Knowledge Tracing +1

Diffusion Model is an Effective Planner and Data Synthesizer for Multi-Task Reinforcement Learning

1 code implementation NeurIPS 2023 Haoran He, Chenjia Bai, Kang Xu, Zhuoran Yang, Weinan Zhang, Dong Wang, Bin Zhao, Xuelong Li

Specifically, we propose Multi-Task Diffusion Model (MTDiff), a diffusion-based method that incorporates Transformer backbones and prompt learning for generative planning and data synthesis in multi-task offline settings.

Reinforcement Learning (RL)

MADiff: Offline Multi-agent Learning with Diffusion Models

1 code implementation 27 May 2023 Zhengbang Zhu, Minghuan Liu, Liyuan Mao, Bingyi Kang, Minkai Xu, Yong Yu, Stefano Ermon, Weinan Zhang

MADiff is realized with an attention-based diffusion model to model the complex coordination among behaviors of multiple agents.

Offline RL Trajectory Prediction

An Empirical Study on Google Research Football Multi-agent Scenarios

1 code implementation 16 May 2023 Yan Song, He Jiang, Zheng Tian, Haifeng Zhang, Yingping Zhang, Jiangcheng Zhu, Zonghong Dai, Weinan Zhang, Jun Wang

Few multi-agent reinforcement learning (MARL) studies on Google Research Football (GRF) focus on the 11v11 multi-agent full-game scenario, and to the best of our knowledge, no open benchmark on this scenario has been released to the public.

Benchmarking Multi-agent Reinforcement Learning +2

Covidia: COVID-19 Interdisciplinary Academic Knowledge Graph

no code implementations 14 Apr 2023 Cheng Deng, Jiaxin Ding, Luoyi Fu, Weinan Zhang, Xinbing Wang, Chenghu Zhou

In this work, we propose Covidia, a COVID-19 interdisciplinary academic knowledge graph, to bridge the gap between knowledge of COVID-19 across different domains.

Classification Contrastive Learning +2

FMGNN: Fused Manifold Graph Neural Network

no code implementations 3 Apr 2023 Cheng Deng, Fan Xu, Jiaxing Ding, Luoyi Fu, Weinan Zhang, Xinbing Wang

Graph representation learning has been widely studied and demonstrated effectiveness in various graph tasks.

Graph Neural Network Graph Representation Learning +2

Text Classification in the Wild: a Large-scale Long-tailed Name Normalization Dataset

1 code implementation 19 Feb 2023 Jiexing Qi, Shuhao Li, Zhixin Guo, Yusheng Huang, Chenghu Zhou, Weinan Zhang, Xinbing Wang, Zhouhan Lin

In this work, we first collect a large-scale institution name normalization dataset, LoT-insts, which contains over 25k classes that exhibit a naturally long-tailed distribution.

Long-tail Learning open-set classification +4

Order Matters: Agent-by-agent Policy Optimization

1 code implementation 13 Feb 2023 Xihuai Wang, Zheng Tian, Ziyu Wan, Ying Wen, Jun Wang, Weinan Zhang

In this paper, we propose the Agent-by-agent Policy Optimization (A2PO) algorithm to improve the sample efficiency and retain the guarantees of monotonic improvement for each agent during training.

Visual Imitation Learning with Patch Rewards

1 code implementation 2 Feb 2023 Minghuan Liu, Tairan He, Weinan Zhang, Shuicheng Yan, Zhongwen Xu

Specifically, we present Adversarial Imitation Learning with Patch Rewards (PatchAIL), which employs a patch-based discriminator to measure the expertise of different local parts from given images and provide patch rewards.

Imitation Learning

Refined Edge Usage of Graph Neural Networks for Edge Prediction

no code implementations 25 Dec 2022 Jiarui Jin, Yangkun Wang, Weinan Zhang, Quan Gan, Xiang Song, Yong Yu, Zheng Zhang, David Wipf

However, existing methods lack elaborate design regarding the distinctions between two tasks that have been frequently overlooked: (i) edges only constitute the topology in the node classification task but can be used as both the topology and the supervisions (i.e., labels) in the edge prediction task; (ii) the node classification makes prediction over each individual node, while the edge prediction is determined by each pair of nodes.

Link Prediction Node Classification

On Realization of Intelligent Decision-Making in the Real World: A Foundation Decision Model Perspective

1 code implementation 24 Dec 2022 Ying Wen, Ziyu Wan, Ming Zhou, Shufang Hou, Zhe Cao, Chenyang Le, Jingxiao Chen, Zheng Tian, Weinan Zhang, Jun Wang

The pervasive uncertainty and dynamic nature of real-world environments present significant challenges for the widespread implementation of machine-driven Intelligent Decision-Making (IDM) systems.

Decision Making Image Captioning +2

Planning Immediate Landmarks of Targets for Model-Free Skill Transfer across Agents

no code implementations 18 Dec 2022 Minghuan Liu, Zhengbang Zhu, Menghui Zhu, Yuzheng Zhuang, Weinan Zhang, Jianye Hao

In reinforcement learning applications like robotics, agents usually need to deal with various input/output features when specified with different state/action spaces by their developers or physical restrictions.

Sim-to-Real Transfer for Quadrupedal Locomotion via Terrain Transformer

no code implementations 15 Dec 2022 Hang Lai, Weinan Zhang, Xialin He, Chen Yu, Zheng Tian, Yong Yu, Jun Wang

Deep reinforcement learning has recently emerged as an appealing alternative for legged locomotion over multiple terrains by training a policy in physical simulation and then transferring it to the real world (i.e., sim-to-real transfer).

Decision Making Deep Reinforcement Learning

A Bird's-eye View of Reranking: from List Level to Page Level

1 code implementation 17 Nov 2022 Yunjia Xi, Jianghao Lin, Weiwen Liu, Xinyi Dai, Weinan Zhang, Rui Zhang, Ruiming Tang, Yong Yu

Moreover, simply applying a shared network for all the lists fails to capture the commonalities and distinctions in user behaviors on different lists.

Recommendation Systems

NeurIPS 2022 Competition: Driving SMARTS

no code implementations 14 Nov 2022 Amir Rasouli, Randy Goebel, Matthew E. Taylor, Iuliia Kotseruba, Soheil Alizadeh, Tianpei Yang, Montgomery Alban, Florian Shkurti, Yuzheng Zhuang, Adam Scibior, Kasra Rezaee, Animesh Garg, David Meger, Jun Luo, Liam Paull, Weinan Zhang, Xinyu Wang, Xi Chen

The proposed competition supports methodologically diverse solutions, such as reinforcement learning (RL) and offline learning methods, trained on a combination of naturalistic AD data and the open-source simulation platform SMARTS.

Autonomous Driving Reinforcement Learning (RL)

Understanding or Manipulation: Rethinking Online Performance Gains of Modern Recommender Systems

no code implementations 11 Oct 2022 Zhengbang Zhu, Rongjun Qin, JunJie Huang, Xinyi Dai, Yang Yu, Yong Yu, Weinan Zhang

The increase in the measured performance, however, can have two possible attributions: a better understanding of user preferences, and a more proactive ability to utilize human bounded rationality to seduce user over-consumption.

Benchmarking Sequential Recommendation

Forgetting Fast in Recommender Systems

no code implementations 14 Aug 2022 Wenyan Liu, Juncheng Wan, Xiaoling Wang, Weinan Zhang, Dell Zhang, Hang Li

In this paper, we investigate fast machine unlearning techniques for recommender systems that can remove the effect of a small amount of training data from the recommendation model without incurring the full cost of retraining.

Machine Unlearning Recommendation Systems

Multi-Scale User Behavior Network for Entire Space Multi-Task Learning

no code implementations 3 Aug 2022 Jiarui Jin, Xianyu Chen, Weinan Zhang, Yuanbo Chen, Zaifan Jiang, Zekun Zhu, Zhewen Su, Yong Yu

Modelling the user's multiple behaviors is an essential part of modern e-commerce, whose widely adopted application is to jointly optimize click-through rate (CTR) and conversion rate (CVR) predictions.

Multi-Task Learning Survival Analysis

Bootstrapped Transformer for Offline Reinforcement Learning

no code implementations 17 Jun 2022 Kerong Wang, Hanye Zhao, Xufang Luo, Kan Ren, Weinan Zhang, Dongsheng Li

Offline reinforcement learning (RL) aims at learning policies from previously collected static trajectory data without interacting with the real environment.

Offline RL reinforcement-learning +2

An F-shape Click Model for Information Retrieval on Multi-block Mobile Pages

1 code implementation 17 Jun 2022 Lingyue Fu, Jianghao Lin, Weiwen Liu, Ruiming Tang, Weinan Zhang, Rui Zhang, Yong Yu

However, with the development of user interface (UI) design, the layout of displayed items on a result page tends to be multi-block (i.e., multi-list) style instead of a single list, which requires different assumptions to model user behaviors more accurately.

Information Retrieval Retrieval

Multi-Agent Reinforcement Learning is a Sequence Modeling Problem

1 code implementation 30 May 2022 Muning Wen, Jakub Grudzien Kuba, Runji Lin, Weinan Zhang, Ying Wen, Jun Wang, Yaodong Yang

In this paper, we introduce a novel architecture named Multi-Agent Transformer (MAT) that effectively casts cooperative multi-agent reinforcement learning (MARL) into sequence modeling (SM) problems wherein the task is to map agents' observation sequence to agents' optimal action sequence.

Decision Making Multi-agent Reinforcement Learning +4

Spatio-Temporal Graph Few-Shot Learning with Cross-City Knowledge Transfer

1 code implementation 27 May 2022 Bin Lu, Xiaoying Gan, Weinan Zhang, Huaxiu Yao, Luoyi Fu, Xinbing Wang

To address this challenge, cross-city knowledge transfer has shown its promise, where the model learned from data-sufficient cities is leveraged to benefit the learning process of data-scarce cities.

Few-Shot Learning Graph Learning +2

Geometer: Graph Few-Shot Class-Incremental Learning via Prototype Representation

1 code implementation 27 May 2022 Bin Lu, Xiaoying Gan, Lina Yang, Weinan Zhang, Luoyi Fu, Xinbing Wang

Instead of replacing and retraining the fully connected neural network classifier, Geometer predicts the label of a node by finding the nearest class prototype.

class-incremental learning Few-Shot Class-Incremental Learning +4

Towards Applicable Reinforcement Learning: Improving the Generalization and Sample Efficiency with Policy Ensemble

no code implementations 19 May 2022 Zhengyu Yang, Kan Ren, Xufang Luo, Minghuan Liu, Weiqing Liu, Jiang Bian, Weinan Zhang, Dongsheng Li

Considering the great performance of ensemble methods on both accuracy and generalization in supervised learning (SL), we design a robust and applicable method named Ensemble Proximal Policy Optimization (EPPO), which learns ensemble policies in an end-to-end manner.

Diversity reinforcement-learning +1

Multi-Level Interaction Reranking with User Behavior History

1 code implementation 20 Apr 2022 Yunjia Xi, Weiwen Liu, Jieming Zhu, Xilong Zhao, Xinyi Dai, Ruiming Tang, Weinan Zhang, Rui Zhang, Yong Yu

MIR combines low-level cross-item interaction and high-level set-to-list interaction, where we view the candidate items to be reranked as a set and the users' behavior history in chronological order as a list.

Recommendation Systems

PerfectDou: Dominating DouDizhu with Perfect Information Distillation

1 code implementation 30 Mar 2022 Guan Yang, Minghuan Liu, Weijun Hong, Weinan Zhang, Fei Fang, Guangjun Zeng, Yue Lin

To this end, we characterize card and game features for DouDizhu to represent the perfect and imperfect information.

Model-based Multi-agent Reinforcement Learning: Recent Progress and Prospects

no code implementations 20 Mar 2022 Xihuai Wang, Zhicheng Zhang, Weinan Zhang

Significant advances have recently been achieved in Multi-Agent Reinforcement Learning (MARL) which tackles sequential decision-making problems involving multiple participants.

Decision Making Multi-agent Reinforcement Learning +4

Plan Your Target and Learn Your Skills: Transferable State-Only Imitation Learning via Decoupled Policy Optimization

2 code implementations 4 Mar 2022 Minghuan Liu, Zhengbang Zhu, Yuzheng Zhuang, Weinan Zhang, Jianye Hao, Yong Yu, Jun Wang

Recent progress in state-only imitation learning extends the scope of applicability of imitation learning to real-world settings by relieving the need for observing expert actions.

Imitation Learning