Search Results for author: Ji-Rong Wen

Found 238 papers, 152 papers with code

Finding the Dominant Winning Ticket in Pre-Trained Language Models

no code implementations Findings (ACL) 2022 Zhuocheng Gong, Di He, Yelong Shen, Tie-Yan Liu, Weizhu Chen, Dongyan Zhao, Ji-Rong Wen, Rui Yan

Empirically, we show that (a) the dominant winning ticket can achieve performance that is comparable with that of the full-parameter model, (b) the dominant winning ticket is transferable across different tasks, and (c) the dominant winning ticket has a natural structure within each parameter matrix.

There Are a Thousand Hamlets in a Thousand People’s Eyes: Enhancing Knowledge-grounded Dialogue with Personal Memory

no code implementations ACL 2022 Tingchen Fu, Xueliang Zhao, Chongyang Tao, Ji-Rong Wen, Rui Yan

Knowledge-grounded conversation (KGC) shows great potential in building an engaging and knowledgeable chatbot, and knowledge selection is a key ingredient in it.

Chatbot

Optimal Partial Transport Based Sentence Selection for Long-form Document Matching

1 code implementation COLING 2022 Weijie Yu, Liang Pang, Jun Xu, Bing Su, Zhenhua Dong, Ji-Rong Wen

Enjoying the partial transport properties of OPT, the selected key sentences can not only effectively enhance the matching accuracy, but also be explained as the rationales for the matching results.

Sentence

Semantic Sentence Matching via Interacting Syntax Graphs

no code implementations COLING 2022 Chen Xu, Jun Xu, Zhenhua Dong, Ji-Rong Wen

In this paper, we formalize the task of semantic sentence matching as a problem of graph matching in which each sentence is represented as a directed graph according to its syntactic structures.

Graph Matching Sentence

Enhancing Graph Contrastive Learning with Reliable and Informative Augmentation for Recommendation

no code implementations 9 Sep 2024 Bowen Zheng, Junjie Zhang, Hongyu Lu, Yu Chen, Ming Chen, Wayne Xin Zhao, Ji-Rong Wen

Based on these discrete codes, we enhance the collaborative information of contrastive views by considering neighborhood structure and semantic relevance respectively.

Collaborative Filtering Contrastive Learning +2

Modeling Domain and Feedback Transitions for Cross-Domain Sequential Recommendation

no code implementations 15 Aug 2024 Changshuo Zhang, Teng Shi, Xiao Zhang, Qi Liu, Ruobing Xie, Jun Xu, Ji-Rong Wen

In this paper, we propose $\text{Transition}^2$, a novel method to model transitions across both domains and types of user feedback.

Sequential Recommendation

Towards Effective and Efficient Continual Pre-training of Large Language Models

no code implementations 26 Jul 2024 Jie Chen, Zhipeng Chen, Jiapeng Wang, Kun Zhou, Yutao Zhu, Jinhao Jiang, Yingqian Min, Wayne Xin Zhao, Zhicheng Dou, Jiaxin Mao, Yankai Lin, Ruihua Song, Jun Xu, Xu Chen, Rui Yan, Zhewei Wei, Di Hu, Wenbing Huang, Ji-Rong Wen

To make the CPT approach more traceable, this paper presents a technical report for continually pre-training Llama-3 (8B), which significantly enhances the Chinese language ability and scientific reasoning ability of the backbone model.

Math

Very Large-Scale Multi-Agent Simulation in AgentScope

1 code implementation 25 Jul 2024 Xuchen Pan, Dawei Gao, Yuexiang Xie, Zhewei Wei, Yaliang Li, Bolin Ding, Ji-Rong Wen, Jingren Zhou

Recent advances in large language models (LLMs) have opened new avenues for applying multi-agent systems in very large-scale simulations.

Mix-CPT: A Domain Adaptation Framework via Decoupling Knowledge Learning and Format Alignment

no code implementations 15 Jul 2024 Jinhao Jiang, Junyi Li, Wayne Xin Zhao, Yang song, Tao Zhang, Ji-Rong Wen

However, this method may result in inefficient knowledge memorization due to a lack of awareness of knowledge utilization and imposes substantial demands on LLMs to simultaneously learn knowledge utilization and format alignment with limited training samples.

Domain Adaptation Memorization

LLMBox: A Comprehensive Library for Large Language Models

1 code implementation 8 Jul 2024 Tianyi Tang, Yiwen Hu, Bingqian Li, Wenyang Luo, Zijing Qin, Haoxiang Sun, Jiapeng Wang, Shiyi Xu, Xiaoxue Cheng, Geyang Guo, Han Peng, Bowen Zheng, Yiru Tang, Yingqian Min, Yushuo Chen, Jie Chen, Yuanqian Zhao, Luran Ding, Yuhao Wang, Zican Dong, Chunxuan Xia, Junyi Li, Kun Zhou, Wayne Xin Zhao, Ji-Rong Wen

To facilitate the research on large language models (LLMs), this paper presents a comprehensive and unified library, LLMBox, to ease the development, use, and evaluation of LLMs.

Discovering symbolic expressions with parallelized tree search

1 code implementation 5 Jul 2024 Kai Ruan, Ze-Feng Gao, Yike Guo, Hao Sun, Ji-Rong Wen, Yang Liu

Symbolic regression plays a crucial role in modern scientific research thanks to its capability of discovering concise and interpretable mathematical expressions from data.

Equation Discovery regression

Query-oriented Data Augmentation for Session Search

no code implementations 4 Jul 2024 Haonan Chen, Zhicheng Dou, Yutao Zhu, Ji-Rong Wen

However, this paradigm neglects the symmetric nature of the relevance between the session context and document, i.e., the clicked documents can also be paired with different search contexts when training.

Data Augmentation Session Search

Understand What LLM Needs: Dual Preference Alignment for Retrieval-Augmented Generation

1 code implementation 26 Jun 2024 Guanting Dong, Yutao Zhu, Chenghao Zhang, Zechen Wang, Zhicheng Dou, Ji-Rong Wen

Based on preference data, DPA-RAG accomplishes both external and internal preference alignment: 1) It jointly integrates pair-wise, point-wise, and contrastive preference alignment abilities into the reranker, achieving external preference alignment among RAG components.

Hallucination Knowledge Base Question Answering +2

ICLEval: Evaluating In-Context Learning Ability of Large Language Models

1 code implementation 21 Jun 2024 Wentong Chen, Yankai Lin, ZhenHao Zhou, HongYun Huang, Yantao Jia, Zhao Cao, Ji-Rong Wen

In-Context Learning (ICL) is a critical capability of Large Language Models (LLMs) as it empowers them to comprehend and reason across interconnected inputs.

In-Context Learning

Towards Event-oriented Long Video Understanding

1 code implementation 20 Jun 2024 Yifan Du, Kun Zhou, Yuqi Huo, YiFan Li, Wayne Xin Zhao, Haoyu Lu, Zijia Zhao, Bingning Wang, WeiPeng Chen, Ji-Rong Wen

Leveraging an effective instruction synthesis method and an adaptive model architecture, VIM surpasses both state-of-the-art open-source models and GPT-4V on the Event-Bench.

Video Understanding

Investigating the Pre-Training Dynamics of In-Context Learning: Task Recognition vs. Task Learning

1 code implementation 20 Jun 2024 Xiaolei Wang, Xinyu Tang, Wayne Xin Zhao, Ji-Rong Wen

The emergence of in-context learning (ICL) is potentially attributed to two major abilities: task recognition (TR) for recognizing the task from demonstrations and utilizing pre-trained priors, and task learning (TL) for learning from demonstrations.

Ensemble Learning In-Context Learning

Unveiling the Flaws: Exploring Imperfections in Synthetic Data and Mitigation Strategies for Large Language Models

no code implementations 18 Jun 2024 Jie Chen, Yupeng Zhang, Bingning Wang, Wayne Xin Zhao, Ji-Rong Wen, WeiPeng Chen

Synthetic data has been proposed as a solution to address the issue of high-quality data scarcity in the training of large language models (LLMs).

Instruction Following

Low-Redundant Optimization for Large Language Model Alignment

1 code implementation 18 Jun 2024 Zhipeng Chen, Kun Zhou, Wayne Xin Zhao, Jingyuan Wang, Ji-Rong Wen

Concretely, we first identify the neurons that are related to the human preference data by a gradient-based strategy, then identify the alignment-related key tokens by reward models for computing loss.

Language Modelling Large Language Model

Small Agent Can Also Rock! Empowering Small Language Models as Hallucination Detector

1 code implementation 17 Jun 2024 Xiaoxue Cheng, Junyi Li, Wayne Xin Zhao, Hongzhi Zhang, Fuzheng Zhang, Di Zhang, Kun Gai, Ji-Rong Wen

Hallucination detection is a challenging task for large language models (LLMs), and existing studies heavily rely on powerful closed-source LLMs such as GPT-4.

2k Hallucination

Counteracting Duration Bias in Video Recommendation via Counterfactual Watch Time

1 code implementation 12 Jun 2024 Haiyuan Zhao, Guohao Cai, Jieming Zhu, Zhenhua Dong, Jun Xu, Ji-Rong Wen

In video recommendation, an ongoing effort is to satisfy users' personalized information needs by leveraging their logged watch time.

counterfactual Recommendation Systems

QAGCF: Graph Collaborative Filtering for Q&A Recommendation

no code implementations 7 Jun 2024 Changshuo Zhang, Teng Shi, Xiao Zhang, Yanping Zheng, Ruobing Xie, Qi Liu, Jun Xu, Ji-Rong Wen

Traditional recommendation methods treat the question-answer pair as a whole or only consider the answer as a single item, which overlooks the two challenges and cannot effectively model user interests.

Collaborative Filtering Contrastive Learning +1

Unlocking the Power of Spatial and Temporal Information in Medical Multimodal Pre-training

1 code implementation 30 May 2024 Jinxia Yang, Bing Su, Wayne Xin Zhao, Ji-Rong Wen

In this paper, we introduce the Med-ST framework for fine-grained spatial and temporal modeling to exploit information from multiple spatial views of chest radiographs and temporal historical records.

Temporal Sequences

Exploring Context Window of Large Language Models via Decomposed Positional Vectors

no code implementations 28 May 2024 Zican Dong, Junyi Li, Xin Men, Wayne Xin Zhao, Bingbing Wang, Zhen Tian, WeiPeng Chen, Ji-Rong Wen

Based on our findings, we design two training-free context window extension methods, positional vector replacement and attention window extension.

Tool Learning with Large Language Models: A Survey

1 code implementation 28 May 2024 Changle Qu, Sunhao Dai, Xiaochi Wei, Hengyi Cai, Shuaiqiang Wang, Dawei Yin, Jun Xu, Ji-Rong Wen

In this survey, we focus on reviewing existing literature from the two primary aspects (1) why tool learning is beneficial and (2) how tool learning is implemented, enabling a comprehensive understanding of tool learning with LLMs.

Response Generation

Cocktail: A Comprehensive Information Retrieval Benchmark with LLM-Generated Documents Integration

1 code implementation 26 May 2024 Sunhao Dai, Weihao Liu, Yuqi Zhou, Liang Pang, Rongju Ruan, Gang Wang, Zhenhua Dong, Jun Xu, Ji-Rong Wen

The proliferation of Large Language Models (LLMs) has led to an influx of AI-generated content (AIGC) on the internet, transforming the corpus of Information Retrieval (IR) systems from solely human-written to a coexistence with LLM-generated content.

Information Retrieval Text Retrieval

Towards Completeness-Oriented Tool Retrieval for Large Language Models

1 code implementation 25 May 2024 Changle Qu, Sunhao Dai, Xiaochi Wei, Hengyi Cai, Shuaiqiang Wang, Dawei Yin, Jun Xu, Ji-Rong Wen

Existing tool retrieval methods primarily focus on semantic matching between user queries and tool descriptions, frequently leading to the retrieval of redundant, similar tools.

Retrieval

JiuZhang3.0: Efficiently Improving Mathematical Reasoning by Training Small Data Synthesis Models

2 code implementations 23 May 2024 Kun Zhou, Beichen Zhang, Jiapeng Wang, Zhipeng Chen, Wayne Xin Zhao, Jing Sha, Zhichao Sheng, Shijin Wang, Ji-Rong Wen

We leverage it to synthesize 6 million math problems for pre-training our JiuZhang3.0 model, which only needs to invoke GPT-4 API 9.3k times and pre-train on 4.6B data.

Knowledge Distillation Math +1

Unlocking Data-free Low-bit Quantization with Matrix Decomposition for KV Cache Compression

1 code implementation 21 May 2024 Peiyu Liu, Ze-Feng Gao, Wayne Xin Zhao, Yipeng Ma, Tao Wang, Ji-Rong Wen

In this paper, we introduce DecoQuant, a novel data-free low-bit quantization technique based on tensor decomposition methods, to effectively compress KV cache.

Quantization Tensor Decomposition

A Survey on the Memory Mechanism of Large Language Model based Agents

1 code implementation 21 Apr 2024 Zeyu Zhang, Xiaohe Bo, Chen Ma, Rui Li, Xu Chen, Quanyu Dai, Jieming Zhu, Zhenhua Dong, Ji-Rong Wen

Compared with original LLMs, LLM-based agents are characterized by their self-evolving capability, which is the basis for solving real-world problems that need long-term and complex agent-environment interactions.

Language Modelling Large Language Model

Towards Coarse-to-Fine Evaluation of Inference Efficiency for Large Language Models

1 code implementation 17 Apr 2024 Yushuo Chen, Tianyi Tang, Erge Xiang, Linjiang Li, Wayne Xin Zhao, Jing Wang, Yunpeng Chai, Ji-Rong Wen

In the real world, large language models (LLMs) can serve as assistants that help users accomplish their jobs, and also support the development of advanced applications.

Dynamic Prompt Optimizing for Text-to-Image Generation

1 code implementation CVPR 2024 Wenyi Mo, Tianyu Zhang, Yalong Bai, Bing Su, Ji-Rong Wen, Qing Yang

Users assign weights or alter the injection time steps of certain words in the text prompts to improve the quality of generated images.

Text-to-Image Generation

Investigating the Robustness of Counterfactual Learning to Rank Models: A Reproducibility Study

1 code implementation 4 Apr 2024 Zechun Niu, Jiaxin Mao, Qingyao Ai, Ji-Rong Wen

Counterfactual learning to rank (CLTR) has attracted extensive attention in the IR community for its ability to leverage massive logged user interaction data to train ranking models.

counterfactual Learning-To-Rank +1

Interpreting Key Mechanisms of Factual Recall in Transformer-Based Language Models

1 code implementation 28 Mar 2024 Ang Lv, Yuhan Chen, Kaiyi Zhang, Yulong Wang, Lifeng Liu, Ji-Rong Wen, Jian Xie, Rui Yan

In this paper, we delve into several mechanisms employed by Transformer-based language models (LLMs) for factual recall tasks.

EulerFormer: Sequential User Behavior Modeling with Complex Vector Attention

1 code implementation 26 Mar 2024 Zhen Tian, Wayne Xin Zhao, Changwang Zhang, Xin Zhao, Zhongrui Ma, Ji-Rong Wen

The core of transformer architecture lies in the self-attention mechanism, which computes the pairwise attention scores in a sequence.

Contrastive Learning Recommendation Systems

Selecting Query-bag as Pseudo Relevance Feedback for Information-seeking Conversations

no code implementations 22 Mar 2024 Xiaoqing Zhang, Xiuying Chen, Shen Gao, Shuqi Li, Xin Gao, Ji-Rong Wen, Rui Yan

Given the user query, the information-seeking dialogue systems first retrieve a subset of response candidates, then further select the best response from the candidate set through re-ranking.

Contrastive Learning Re-Ranking

ChainLM: Empowering Large Language Models with Improved Chain-of-Thought Prompting

1 code implementation 21 Mar 2024 Xiaoxue Cheng, Junyi Li, Wayne Xin Zhao, Ji-Rong Wen

In response to this challenge, we present an empirical investigation of CoT prompting and introduce CoTGenius, a novel framework designed for the automatic generation of superior CoT prompts.

A Large Language Model Enhanced Sequential Recommender for Joint Video and Comment Recommendation

1 code implementation 20 Mar 2024 Bowen Zheng, Zihan Lin, Enze Liu, Chen Yang, Enyang Bai, Cheng Ling, Wayne Xin Zhao, Ji-Rong Wen

Meanwhile, we leverage the LLM recommender as a supplemental component (discarded in deployment) to better capture underlying user preferences from heterogeneous interaction behaviors.

Language Modelling Large Language Model +1

An Analysis on Matching Mechanisms and Token Pruning for Late-interaction Models

no code implementations 20 Mar 2024 Qi Liu, Gang Guo, Jiaxin Mao, Zhicheng Dou, Ji-Rong Wen, Hao Jiang, Xinyu Zhang, Zhao Cao

Based on these findings, we then propose several simple document pruning methods to reduce the storage overhead and compare the effectiveness of different pruning methods on different late-interaction models.

Retrieval

Less is More: Data Value Estimation for Visual Instruction Tuning

no code implementations 14 Mar 2024 Zikang Liu, Kun Zhou, Wayne Xin Zhao, Dawei Gao, Yaliang Li, Ji-Rong Wen

To investigate this issue, we conduct a series of empirical studies, which reveal significant redundancy within visual instruction datasets, and show that greatly reducing the size of several instruction datasets does not even affect the performance.

StreamingDialogue: Prolonged Dialogue Learning via Long Context Compression with Minimal Losses

no code implementations 13 Mar 2024 Jia-Nan Li, Quan Tu, Cunli Mao, Zhengtao Yu, Ji-Rong Wen, Rui Yan

Accordingly, we introduce StreamingDialogue, which compresses long dialogue history into conv-attn sinks with minimal losses, and thus reduces computational complexity quadratically with the number of sinks (i.e., the number of utterances).

REAR: A Relevance-Aware Retrieval-Augmented Framework for Open-Domain Question Answering

1 code implementation 27 Feb 2024 Yuhao Wang, Ruiyang Ren, Junyi Li, Wayne Xin Zhao, Jing Liu, Ji-Rong Wen

By combining the improvements in both architecture and training, our proposed REAR can better utilize external knowledge by effectively perceiving the relevance of retrieved documents.

Open-Domain Question Answering RAG +1

BASES: Large-scale Web Search User Simulation with Large Language Model based Agents

no code implementations 27 Feb 2024 Ruiyang Ren, Peng Qiu, Yingqi Qu, Jing Liu, Wayne Xin Zhao, Hua Wu, Ji-Rong Wen, Haifeng Wang

Due to the excellent capacities of large language models (LLMs), it becomes feasible to develop LLM-based agents for reliable user simulation.

Information Retrieval Language Modelling +3

Unleashing the Potential of Large Language Models as Prompt Optimizers: An Analogical Analysis with Gradient-based Model Optimizers

1 code implementation 27 Feb 2024 Xinyu Tang, Xiaolei Wang, Wayne Xin Zhao, Siyuan Lu, Yaliang Li, Ji-Rong Wen

Focused on the two aspects, we borrow the theoretical framework and learning methods from gradient-based optimization to design improved strategies for LLM-based prompt optimizers.

MMLU

Language-Specific Neurons: The Key to Multilingual Capabilities in Large Language Models

no code implementations 26 Feb 2024 Tianyi Tang, Wenyang Luo, Haoyang Huang, Dongdong Zhang, Xiaolei Wang, Xin Zhao, Furu Wei, Ji-Rong Wen

Large language models (LLMs) demonstrate remarkable multilingual capabilities without being pre-trained on specially curated multilingual parallel corpora.

UFO: a Unified and Flexible Framework for Evaluating Factuality of Large Language Models

1 code implementation 22 Feb 2024 Zhaoheng Huang, Zhicheng Dou, Yutao Zhu, Ji-Rong Wen

To address these challenges, we categorize four available fact sources: human-written evidence, reference documents, search engine results, and LLM knowledge, along with five text generation tasks containing six representative datasets.

Hallucination Retrieval +1

Large Language Model-based Human-Agent Collaboration for Complex Task Solving

1 code implementation 20 Feb 2024 Xueyang Feng, Zhi-Yuan Chen, Yujia Qin, Yankai Lin, Xu Chen, Zhiyuan Liu, Ji-Rong Wen

We construct a human-agent collaboration dataset to train this policy model in an offline reinforcement learning environment.

Language Modelling Large Language Model +1

Small Models, Big Insights: Leveraging Slim Proxy Models To Decide When and What to Retrieve for LLMs

1 code implementation 19 Feb 2024 Jiejun Tan, Zhicheng Dou, Yutao Zhu, Peidong Guo, Kun Fang, Ji-Rong Wen

The integration of large language models (LLMs) and search engines represents a significant evolution in knowledge acquisition methodologies.

Question Answering

KG-Agent: An Efficient Autonomous Agent Framework for Complex Reasoning over Knowledge Graph

no code implementations 17 Feb 2024 Jinhao Jiang, Kun Zhou, Wayne Xin Zhao, Yang song, Chen Zhu, HengShu Zhu, Ji-Rong Wen

To guarantee effectiveness, we leverage a programming language to formulate the multi-hop reasoning process over the KG, and synthesize a code-based instruction dataset to fine-tune the base LLM.

Knowledge Graphs

INTERS: Unlocking the Power of Large Language Models in Search with Instruction Tuning

1 code implementation 12 Jan 2024 Yutao Zhu, Peitian Zhang, Chenghao Zhang, Yifei Chen, Binyu Xie, Zheng Liu, Ji-Rong Wen, Zhicheng Dou

Despite this, their application to information retrieval (IR) tasks is still challenging due to the infrequent occurrence of many IR-specific concepts in natural language.

Diversity document understanding +3

Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint

1 code implementation 11 Jan 2024 Zhipeng Chen, Kun Zhou, Wayne Xin Zhao, Junchen Wan, Fuzheng Zhang, Di Zhang, Ji-Rong Wen

To address it, we propose a new RL method named RLMEC that incorporates a generative model as the reward model, which is trained by the erroneous solution rewriting task under the minimum editing constraint, and can produce token-level rewards for RL training.

Question Answering Reinforcement Learning (RL)

Prompting Large Language Models for Recommender Systems: A Comprehensive Framework and Empirical Analysis

no code implementations 10 Jan 2024 Lanling Xu, Junjie Zhang, Bingqian Li, Jinpeng Wang, Mingchen Cai, Wayne Xin Zhao, Ji-Rong Wen

As for the use of LLMs as recommenders, we analyze the impact of public availability, tuning strategies, model architecture, parameter scale, and context length on recommendation results based on the classification of LLMs.

Prompt Engineering Recommendation Systems

The Dawn After the Dark: An Empirical Study on Factuality Hallucination in Large Language Models

1 code implementation 6 Jan 2024 Junyi Li, Jie Chen, Ruiyang Ren, Xiaoxue Cheng, Wayne Xin Zhao, Jian-Yun Nie, Ji-Rong Wen

To tackle LLM hallucination, three key questions should be well studied: how to detect hallucinations (detection), why LLMs hallucinate (source), and what can be done to mitigate them (mitigation).

Hallucination

ReasoningLM: Enabling Structural Subgraph Reasoning in Pre-trained Language Models for Question Answering over Knowledge Graph

no code implementations 30 Dec 2023 Jinhao Jiang, Kun Zhou, Wayne Xin Zhao, Yaliang Li, Ji-Rong Wen

To better perform reasoning on KG, recent work typically adopts a pre-trained language model (PLM) to model the question, and a graph neural network (GNN) based module to perform multi-hop reasoning on the KG.

Graph Neural Network Language Modelling +1

Scaling Law of Large Sequential Recommendation Models

no code implementations 19 Nov 2023 Gaowei Zhang, Yupeng Hou, Hongyu Lu, Yu Chen, Wayne Xin Zhao, Ji-Rong Wen

We find that scaling up the model size can greatly boost the performance on these challenging tasks, which again verifies the benefits of large recommendation models.

Sequential Recommendation

Adapting Large Language Models by Integrating Collaborative Semantics for Recommendation

1 code implementation 15 Nov 2023 Bowen Zheng, Yupeng Hou, Hongyu Lu, Yu Chen, Wayne Xin Zhao, Ming Chen, Ji-Rong Wen

To address this challenge, in this paper, we propose a new LLM-based recommendation model called LC-Rec, which can better integrate language and collaborative semantics for recommender systems.

Quantization Recommendation Systems

Are We Falling in a Middle-Intelligence Trap? An Analysis and Mitigation of the Reversal Curse

1 code implementation 13 Nov 2023 Ang Lv, Kaiyi Zhang, Shufang Xie, Quan Tu, Yuhan Chen, Ji-Rong Wen, Rui Yan

Recent studies have highlighted a phenomenon in large language models (LLMs) known as "the reversal curse," in which the order of knowledge entities in the training data biases the models' comprehension.

Denoising Language Modelling

AI-accelerated Discovery of Altermagnetic Materials

1 code implementation 8 Nov 2023 Ze-Feng Gao, Shuai Qu, Bocheng Zeng, Yang Liu, Ji-Rong Wen, Hao Sun, Peng-Jie Guo, Zhong-Yi Lu

Since each altermagnetic material has a unique crystal structure, we propose an automated discovery approach empowered by an AI search engine that employs a pre-trained graph neural network to learn the intrinsic features of the material crystal structure, followed by fine-tuning a classifier with limited positive samples to predict the altermagnetism probability of a given material candidate.

Graph Neural Network

Don't Make Your LLM an Evaluation Benchmark Cheater

no code implementations 3 Nov 2023 Kun Zhou, Yutao Zhu, Zhipeng Chen, Wentong Chen, Wayne Xin Zhao, Xu Chen, Yankai Lin, Ji-Rong Wen, Jiawei Han

Large language models (LLMs) have greatly advanced the frontiers of artificial intelligence, attaining remarkable improvement in model capacity.

What Makes for Good Visual Instructions? Synthesizing Complex Visual Reasoning Instructions for Visual Instruction Tuning

1 code implementation 2 Nov 2023 Yifan Du, Hangyu Guo, Kun Zhou, Wayne Xin Zhao, Jinpeng Wang, Chuyuan Wang, Mingchen Cai, Ruihua Song, Ji-Rong Wen

By conducting a comprehensive empirical study, we find that instructions focused on complex visual reasoning tasks are particularly effective in improving the performance of MLLMs on evaluation benchmarks.

Visual Reasoning Zero-shot Generalization

DetermLR: Augmenting LLM-based Logical Reasoning from Indeterminacy to Determinacy

1 code implementation 28 Oct 2023 Hongda Sun, Weikai Xu, Wei Liu, Jian Luan, Bin Wang, Shuo Shang, Ji-Rong Wen, Rui Yan

Recent advances in large language models (LLMs) have revolutionized the landscape of reasoning tasks.

Logical Reasoning

AgentCF: Collaborative Learning with Autonomous Language Agents for Recommender Systems

no code implementations 13 Oct 2023 Junjie Zhang, Yupeng Hou, Ruobing Xie, Wenqi Sun, Julian McAuley, Wayne Xin Zhao, Leyu Lin, Ji-Rong Wen

The optimized agents can also propagate their preferences to other agents in subsequent interactions, implicitly capturing the collaborative filtering idea.

Collaborative Filtering Decision Making +2

BAMBOO: A Comprehensive Benchmark for Evaluating Long Text Modeling Capacities of Large Language Models

1 code implementation 23 Sep 2023 Zican Dong, Tianyi Tang, Junyi Li, Wayne Xin Zhao, Ji-Rong Wen

Recently, multiple studies have committed to extending the context length and enhancing the long text modeling capabilities of LLMs.

Code Completion Hallucination +2

Optimizing Factual Accuracy in Text Generation through Dynamic Knowledge Selection

no code implementations 30 Aug 2023 Hongjin Qian, Zhicheng Dou, Jiejun Tan, Haonan Chen, Haoqi Gu, Ruofei Lai, Xinyu Zhang, Zhao Cao, Ji-Rong Wen

Previous methods use external knowledge as references for text generation to enhance factuality but often struggle with the knowledge mix-up (e.g., entity mismatch) of irrelevant references.

Decoder Text Generation

A Survey on Large Language Model based Autonomous Agents

2 code implementations 22 Aug 2023 Lei Wang, Chen Ma, Xueyang Feng, Zeyu Zhang, Hao Yang, Jingsen Zhang, ZhiYuan Chen, Jiakai Tang, Xu Chen, Yankai Lin, Wayne Xin Zhao, Zhewei Wei, Ji-Rong Wen

In this paper, we present a comprehensive survey of these studies, delivering a systematic review of the field of LLM-based autonomous agents from a holistic perspective.

Language Modelling Large Language Model

Uncovering User Interest from Biased and Noised Watch Time in Video Recommendation

1 code implementation 16 Aug 2023 Haiyuan Zhao, Lei Zhang, Jun Xu, Guohao Cai, Zhenhua Dong, Ji-Rong Wen

In video recommendation, watch time is commonly adopted as an indicator of user interest.

Large Language Models for Information Retrieval: A Survey

1 code implementation 14 Aug 2023 Yutao Zhu, Huaying Yuan, Shuting Wang, Jiongnan Liu, Wenhan Liu, Chenlong Deng, Haonan Chen, Zheng Liu, Zhicheng Dou, Ji-Rong Wen

This evolution requires a combination of both traditional methods (such as term-based sparse retrieval methods with rapid response) and modern neural architectures (such as language models with powerful language understanding capacity).

Information Retrieval Question Answering +2

Counterfactual Cross-modality Reasoning for Weakly Supervised Video Moment Localization

1 code implementation 10 Aug 2023 Zezhong Lv, Bing Su, Ji-Rong Wen

Finally, by suppressing the unimodal effect of masked query, we can rectify the reconstructions of video proposals to perform reasonable contrastive learning.

Contrastive Learning counterfactual

Synthesizing Long-Term Human Motions with Diffusion Models via Coherent Sampling

1 code implementation 3 Aug 2023 Zhao Yang, Bing Su, Ji-Rong Wen

Firstly, they cannot directly generate coherent motions and require additional operations such as interpolation to process the generated actions.

Sentence

Spatio-Temporal Branching for Motion Prediction using Motion Increments

1 code implementation 2 Aug 2023 Jiexin Wang, Yujie Zhou, Wenwen Qiang, Ying Ba, Bing Su, Ji-Rong Wen

Human motion prediction (HMP) has emerged as a popular research topic due to its diverse applications, but it remains a challenging task due to the stochastic and aperiodic nature of future poses.

Human motion prediction Knowledge Distillation +1

Alleviating the Long-Tail Problem in Conversational Recommender Systems

1 code implementation 21 Jul 2023 Zhipeng Zhao, Kun Zhou, Xiaolei Wang, Wayne Xin Zhao, Fan Pan, Zhao Cao, Ji-Rong Wen

Conversational recommender systems (CRS) aim to provide the recommendation service via natural language conversations.

Diversity Recommendation Systems +1

Investigating the Factual Knowledge Boundary of Large Language Models with Retrieval Augmentation

1 code implementation 20 Jul 2023 Ruiyang Ren, Yuhao Wang, Yingqi Qu, Wayne Xin Zhao, Jing Liu, Hao Tian, Hua Wu, Ji-Rong Wen, Haifeng Wang

In this study, we present an initial analysis of the factual knowledge boundaries of LLMs and how retrieval augmentation affects LLMs on open-domain QA.

Open-Domain Question Answering Retrieval +1

Do Emergent Abilities Exist in Quantized Large Language Models: An Empirical Study

1 code implementation 16 Jul 2023 Peiyu Liu, Zikang Liu, Ze-Feng Gao, Dawei Gao, Wayne Xin Zhao, Yaliang Li, Bolin Ding, Ji-Rong Wen

Different from previous studies focused on overall performance, this work aims to investigate the impact of quantization on emergent abilities, which are important characteristics that distinguish LLMs from small language models.

In-Context Learning Instruction Following +1

RETA-LLM: A Retrieval-Augmented Large Language Model Toolkit

1 code implementation 8 Jun 2023 Jiongnan Liu, Jiajie Jin, Zihan Wang, Jiehan Cheng, Zhicheng Dou, Ji-Rong Wen

To support research in this area and facilitate the development of retrieval-augmented LLM systems, we develop RETA-LLM, a RETrieval-Augmented LLM toolkit.

Answer Generation Fact Checking +5

Improving Conversational Recommendation Systems via Counterfactual Data Simulation

1 code implementation 5 Jun 2023 Xiaolei Wang, Kun Zhou, Xinyu Tang, Wayne Xin Zhao, Fan Pan, Zhao Cao, Ji-Rong Wen

To develop our approach, we characterize user preference and organize the conversation flow by the entities involved in the dialogue, and design a multi-stage recommendation dialogue simulator based on a conversation flow language model.

Conversational Recommendation counterfactual +3

User Behavior Simulation with Large Language Model based Agents

1 code implementation 5 Jun 2023 Lei Wang, Jingsen Zhang, Hao Yang, ZhiYuan Chen, Jiakai Tang, Zeyu Zhang, Xu Chen, Yankai Lin, Ruihua Song, Wayne Xin Zhao, Jun Xu, Zhicheng Dou, Jun Wang, Ji-Rong Wen

Simulating high-quality user behavior data has always been a fundamental problem in human-centered applications, where the major difficulty originates from the intricate mechanism of the human decision process.

Language Modelling Large Language Model +2

Evaluating and Improving Tool-Augmented Computation-Intensive Math Reasoning

1 code implementation NeurIPS 2023 Beichen Zhang, Kun Zhou, Xilin Wei, Wayne Xin Zhao, Jing Sha, Shijin Wang, Ji-Rong Wen

Based on this finding, we propose a new approach that can deliberate the reasoning steps with tool interfaces, namely DELI.

Math

Zero-shot Visual Question Answering with Language Model Feedback

1 code implementation 26 May 2023 Yifan Du, Junyi Li, Tianyi Tang, Wayne Xin Zhao, Ji-Rong Wen

In this paper, we propose a novel language model guided captioning approach, LAMOC, for knowledge-based visual question answering (VQA).

Language Modelling Question Answering +1

ChatCoT: Tool-Augmented Chain-of-Thought Reasoning on Chat-based Large Language Models

1 code implementation 23 May 2023 Zhipeng Chen, Kun Zhou, Beichen Zhang, Zheng Gong, Wayne Xin Zhao, Ji-Rong Wen

Although large language models (LLMs) have achieved excellent performance on a variety of evaluation benchmarks, they still struggle with complex reasoning tasks that require specific knowledge and multi-hop reasoning.

Math

Rethinking the Evaluation for Conversational Recommendation in the Era of Large Language Models

1 code implementation 22 May 2023 Xiaolei Wang, Xinyu Tang, Wayne Xin Zhao, Jingyuan Wang, Ji-Rong Wen

The recent success of large language models (LLMs) has shown great potential to develop more powerful conversational recommender systems (CRSs), which rely on natural language conversations to satisfy user needs.

Conversational Recommendation Explanation Generation +1

HaluEval: A Large-Scale Hallucination Evaluation Benchmark for Large Language Models

3 code implementations 19 May 2023 Junyi Li, Xiaoxue Cheng, Wayne Xin Zhao, Jian-Yun Nie, Ji-Rong Wen

Large language models (LLMs), such as ChatGPT, are prone to generating hallucinations, i.e., content that conflicts with the source or cannot be verified against factual knowledge.

Hallucination Hallucination Evaluation

When Search Meets Recommendation: Learning Disentangled Search Representation for Recommendation

1 code implementation 18 May 2023 Zihua Si, Zhongxiang Sun, Xiao Zhang, Jun Xu, Xiaoxue Zang, Yang Song, Kun Gai, Ji-Rong Wen

In our paper, we propose a Search-Enhanced framework for the Sequential Recommendation (SESRec) that leverages users' search interests for recommendation, by disentangling similar and dissimilar representations within S&R behaviors.

Contrastive Learning Disentanglement +1

TOME: A Two-stage Approach for Model-based Retrieval

no code implementations 18 May 2023 Ruiyang Ren, Wayne Xin Zhao, Jing Liu, Hua Wu, Ji-Rong Wen, Haifeng Wang

Recently, model-based retrieval has emerged as a new paradigm in text retrieval that discards the index in the traditional retrieval model and instead memorizes the candidate corpora using model parameters.

Natural Questions Text Retrieval

The Web Can Be Your Oyster for Improving Large Language Models

1 code implementation 18 May 2023 Junyi Li, Tianyi Tang, Wayne Xin Zhao, Jingyuan Wang, Jian-Yun Nie, Ji-Rong Wen

In order to further improve the capacity of LLMs for knowledge-intensive tasks, we consider augmenting LLMs with the large-scale web using a search engine.

Retrieval World Knowledge

Evaluating Object Hallucination in Large Vision-Language Models

3 code implementations 17 May 2023 YiFan Li, Yifan Du, Kun Zhou, Jinpeng Wang, Wayne Xin Zhao, Ji-Rong Wen

Despite the promising progress on LVLMs, we find that LVLMs suffer from the hallucination problem, i.e., they tend to generate objects that are inconsistent with the target images in the descriptions.

Hallucination Object

StructGPT: A General Framework for Large Language Model to Reason over Structured Data

1 code implementation 16 May 2023 Jinhao Jiang, Kun Zhou, Zican Dong, Keming Ye, Wayne Xin Zhao, Ji-Rong Wen

Specifically, we propose an invoking-linearization-generation procedure to support LLMs in reasoning over structured data with the help of external interfaces.

Language Modelling Large Language Model +1

Recommendation as Instruction Following: A Large Language Model Empowered Recommendation Approach

no code implementations 11 May 2023 Junjie Zhang, Ruobing Xie, Yupeng Hou, Wayne Xin Zhao, Leyu Lin, Ji-Rong Wen

Inspired by recent progress on large language models (LLMs), we take a different approach to developing recommendation models, treating recommendation as instruction following by LLMs.

Instruction Following Language Modelling +2

Diffusion-NAT: Self-Prompting Discrete Diffusion for Non-Autoregressive Text Generation

no code implementations 6 May 2023 Kun Zhou, YiFan Li, Wayne Xin Zhao, Ji-Rong Wen

To solve it, we propose Diffusion-NAT, which introduces discrete diffusion models (DDM) into NAR text-to-text generation and integrates BART to improve the performance.

Denoising Text Generation

GlyphDiffusion: Text Generation as Image Generation

no code implementations 25 Apr 2023 Junyi Li, Wayne Xin Zhao, Jian-Yun Nie, Ji-Rong Wen

In this way, conditional text generation can be cast as a glyph image generation task, and it is then natural to apply continuous diffusion models to discrete texts.

Conditional Text Generation Diversity +3

EulerNet: Adaptive Feature Interaction Learning via Euler's Formula for CTR Prediction

2 code implementations 21 Apr 2023 Zhen Tian, Ting Bai, Wayne Xin Zhao, Ji-Rong Wen, Zhao Cao

EulerNet converts the exponential powers of feature interactions into simple linear combinations of the modulus and phase of the complex features, making it possible to adaptively learn the high-order feature interactions in an efficient way.

Click-Through Rate Prediction

WebBrain: Learning to Generate Factually Correct Articles for Queries by Grounding on Large Web Corpus

1 code implementation 10 Apr 2023 Hongjing Qian, Yutao Zhu, Zhicheng Dou, Haoqi Gu, Xinyu Zhang, Zheng Liu, Ruofei Lai, Zhao Cao, Jian-Yun Nie, Ji-Rong Wen

In this paper, we introduce a new NLP task -- generating short factual articles with references for queries by mining supporting evidence from the Web.

Retrieval Text Generation

A Survey of Large Language Models

5 code implementations 31 Mar 2023 Wayne Xin Zhao, Kun Zhou, Junyi Li, Tianyi Tang, Xiaolei Wang, Yupeng Hou, Yingqian Min, Beichen Zhang, Junjie Zhang, Zican Dong, Yifan Du, Chen Yang, Yushuo Chen, Zhipeng Chen, Jinhao Jiang, Ruiyang Ren, YiFan Li, Xinyu Tang, Zikang Liu, Peiyu Liu, Jian-Yun Nie, Ji-Rong Wen

To distinguish models by parameter scale, the research community has coined the term large language models (LLMs) for PLMs of significant size.

Language Modelling

Scaling Pre-trained Language Models to Deeper via Parameter-efficient Architecture

no code implementations 27 Mar 2023 Peiyu Liu, Ze-Feng Gao, Yushuo Chen, Wayne Xin Zhao, Ji-Rong Wen

Based on such a decomposition, our architecture shares the central tensor across all layers for reducing the model size and meanwhile keeps layer-specific auxiliary tensors (also using adapters) for enhancing the adaptation flexibility.

Dually Enhanced Propensity Score Estimation in Sequential Recommendation

1 code implementation 15 Mar 2023 Chen Xu, Jun Xu, Xu Chen, Zhenghua Dong, Ji-Rong Wen

According to the graph, two complementary propensity scores are estimated from the views of item and user, respectively, based on the same set of user feedback data.

Sequential Recommendation

Diffusion Models for Non-autoregressive Text Generation: A Survey

1 code implementation 12 Mar 2023 YiFan Li, Kun Zhou, Wayne Xin Zhao, Ji-Rong Wen

In this survey, we review the recent progress in diffusion models for NAR text generation.

Text Generation

TextBox 2.0: A Text Generation Library with Pre-trained Language Models

1 code implementation 26 Dec 2022 Tianyi Tang, Junyi Li, Zhipeng Chen, Yiwen Hu, Zhuohao Yu, Wenxun Dai, Zican Dong, Xiaoxue Cheng, Yuhao Wang, Wayne Xin Zhao, Jian-Yun Nie, Ji-Rong Wen

To facilitate research on text generation, this paper presents a comprehensive and unified library, TextBox 2.0, focusing on the use of pre-trained language models (PLMs).

Abstractive Text Summarization Data-to-Text Generation +7

MASTER: Multi-task Pre-trained Bottlenecked Masked Autoencoders are Better Dense Retrievers

1 code implementation 15 Dec 2022 Kun Zhou, Xiao Liu, Yeyun Gong, Wayne Xin Zhao, Daxin Jiang, Nan Duan, Ji-Rong Wen

Pre-trained Transformers (e.g., BERT) have been commonly used in existing dense retrieval methods for parameter initialization, and recent studies are exploring more effective pre-training tasks for further improving the quality of dense vectors.

Decoder Passage Retrieval +1

Visually-augmented pretrained language models for NLP tasks without images

1 code implementation 15 Dec 2022 Hangyu Guo, Kun Zhou, Wayne Xin Zhao, Qinyu Zhang, Ji-Rong Wen

Although pre-trained language models (PLMs) have shown impressive performance with text-only self-supervised training, they have been found to lack visual semantics or commonsense.

Retrieval

UniKGQA: Unified Retrieval and Reasoning for Solving Multi-hop Question Answering Over Knowledge Graph

1 code implementation 2 Dec 2022 Jinhao Jiang, Kun Zhou, Wayne Xin Zhao, Ji-Rong Wen

Multi-hop Question Answering over Knowledge Graph (KGQA) aims to find the answer entities that are multiple hops away from the topic entities mentioned in a natural language question on a large-scale Knowledge Graph (KG).

Language Modelling Multi-hop Question Answering +2

CDSM: Cascaded Deep Semantic Matching on Textual Graphs Leveraging Ad-hoc Neighbor Selection

1 code implementation 30 Nov 2022 Jing Yao, Zheng Liu, Junhan Yang, Zhicheng Dou, Xing Xie, Ji-Rong Wen

In the first stage, a lightweight CNN-based ad-hoc neighbor selector is deployed to filter useful neighbors for the matching task with a small computation cost.

Recent Advances in RecBole: Extensions with more Practical Considerations

1 code implementation 28 Nov 2022 Lanling Xu, Zhen Tian, Gaowei Zhang, Lei Wang, Junjie Zhang, Bowen Zheng, YiFan Li, Yupeng Hou, Xingyu Pan, Yushuo Chen, Wayne Xin Zhao, Xu Chen, Ji-Rong Wen

To present the recent updates to RecBole, we write this technical report to introduce our latest improvements to the library.

Dense Text Retrieval based on Pretrained Language Models: A Survey

2 code implementations 27 Nov 2022 Wayne Xin Zhao, Jing Liu, Ruiyang Ren, Ji-Rong Wen

With powerful PLMs, we can effectively learn the representations of queries and texts in the latent representation space, and further construct the semantic matching function between the dense vectors for relevance modeling.

Text Retrieval

Directed Acyclic Graph Factorization Machines for CTR Prediction via Knowledge Distillation

1 code implementation 21 Nov 2022 Zhen Tian, Ting Bai, Zibin Zhang, Zhiyuan Xu, Kangyi Lin, Ji-Rong Wen, Wayne Xin Zhao

Some recent knowledge distillation based methods transfer knowledge from complex teacher models to shallow student models for accelerating the online model inference.

Click-Through Rate Prediction Knowledge Distillation +1

SimANS: Simple Ambiguous Negatives Sampling for Dense Text Retrieval

1 code implementation 21 Oct 2022 Kun Zhou, Yeyun Gong, Xiao Liu, Wayne Xin Zhao, Yelong Shen, Anlei Dong, Jingwen Lu, Rangan Majumder, Ji-Rong Wen, Nan Duan, Weizhu Chen

Thus, we propose a simple ambiguous negatives sampling method, SimANS, which incorporates a new sampling probability distribution to sample more ambiguous negatives.

Text Retrieval

Privacy-Preserved Neural Graph Similarity Learning

1 code implementation 21 Oct 2022 Yupeng Hou, Wayne Xin Zhao, Yaliang Li, Ji-Rong Wen

To develop effective and efficient graph similarity learning (GSL) models, a series of data-driven neural algorithms have been proposed in recent years.

Graph Matching Graph Similarity +1

Law Article-Enhanced Legal Case Matching: a Causal Learning Approach

1 code implementation 20 Oct 2022 Zhongxiang Sun, Jun Xu, Xiao Zhang, Zhenhua Dong, Ji-Rong Wen

We show that the framework is model-agnostic, and a number of legal case matching models can be applied as the underlying models.

Semantic Text Matching Text Matching

Modeling Multiple Views via Implicitly Preserving Global Consistency and Local Complementarity

2 code implementations 16 Sep 2022 Jiangmeng Li, Wenwen Qiang, Changwen Zheng, Bing Su, Farid Razzak, Ji-Rong Wen, Hui Xiong

To this end, we propose the consistency and complementarity network (CoCoNet), which uses strict global inter-view consistency and local cross-view complementarity-preserving regularization to comprehensively learn representations from multiple views.

Representation Learning Self-Supervised Learning

A Molecular Multimodal Foundation Model Associating Molecule Graphs with Natural Language

4 code implementations 12 Sep 2022 Bing Su, Dazhao Du, Zhao Yang, Yujie Zhou, Jiangmeng Li, Anyi Rao, Hao Sun, Zhiwu Lu, Ji-Rong Wen

Although artificial intelligence (AI) has made significant progress in understanding molecules in a wide range of fields, existing models generally acquire a single cognitive ability from a single molecular modality.

Contrastive Learning Cross-Modal Retrieval +4

Enhancing User Behavior Sequence Modeling by Generative Tasks for Session Search

1 code implementation 23 Aug 2022 Haonan Chen, Zhicheng Dou, Yutao Zhu, Zhao Cao, Xiaohua Cheng, Ji-Rong Wen

To help encode the current user behavior sequence, we propose using a decoder together with information from future sequences and a supplemental query.

Decoder Session Search

Ultron: An Ultimate Retriever on Corpus with a Model-based Indexer

no code implementations 19 Aug 2022 Yujia Zhou, Jing Yao, Zhicheng Dou, Ledell Wu, Peitian Zhang, Ji-Rong Wen

In order to unify these two stages, we explore a model-based indexer for document retrieval.

Retrieval

Modeling Two-Way Selection Preference for Person-Job Fit

1 code implementation 18 Aug 2022 Chen Yang, Yupeng Hou, Yang Song, Tao Zhang, Ji-Rong Wen, Wayne Xin Zhao

To model the two-way selection preference from the dual-perspective of job seekers and employers, we incorporate two different nodes for each candidate (or job) and characterize both successful matching and failed matching via a unified dual-perspective interaction graph.

Contrastive Learning Graph Representation Learning +1

Multimodal foundation models are better simulators of the human brain

1 code implementation 17 Aug 2022 Haoyu Lu, Qiongyi Zhou, Nanyi Fei, Zhiwu Lu, Mingyu Ding, Jingyuan Wen, Changde Du, Xin Zhao, Hao Sun, Huiguang He, Ji-Rong Wen

Further, from the perspective of neural encoding (based on our foundation model), we find that both visual and lingual encoders trained multimodally are more brain-like compared with unimodal ones.

STAR-GNN: Spatial-Temporal Video Representation for Content-based Retrieval

no code implementations 15 Aug 2022 Guoping Zhao, Bingqing Zhang, Mingyu Zhang, Yaxian Li, Jiajun Liu, Ji-Rong Wen

It models a video with a lattice feature graph in which the nodes represent regions of different granularity, with weighted edges that represent the spatial and temporal links.

Graph Neural Network Representation Learning +2

MVP: Multi-task Supervised Pre-training for Natural Language Generation

3 code implementations 24 Jun 2022 Tianyi Tang, Junyi Li, Wayne Xin Zhao, Ji-Rong Wen

Motivated by the success of supervised pre-training, we propose Multi-task superVised Pre-training (MVP) for natural language generation.

Text Generation

Towards Unified Conversational Recommender Systems via Knowledge-Enhanced Prompt Learning

1 code implementation 19 Jun 2022 Xiaolei Wang, Kun Zhou, Ji-Rong Wen, Wayne Xin Zhao

Our approach unifies the recommendation and conversation subtasks into the prompt learning paradigm, and utilizes knowledge-enhanced prompts based on a fixed pre-trained language model (PLM) to fulfill both subtasks in a unified approach.

Language Modelling Recommendation Systems +1

RecBole 2.0: Towards a More Up-to-Date Recommendation Library

2 code implementations 15 Jun 2022 Wayne Xin Zhao, Yupeng Hou, Xingyu Pan, Chen Yang, Zeyu Zhang, Zihan Lin, Jingsen Zhang, Shuqing Bian, Jiakai Tang, Wenqi Sun, Yushuo Chen, Lanling Xu, Gaowei Zhang, Zhen Tian, Changxin Tian, Shanlei Mu, Xinyan Fan, Xu Chen, Ji-Rong Wen

In order to support the study of recent advances in recommender systems, this paper presents an extended recommendation library consisting of eight packages for up-to-date topics and architectures.

Benchmarking Data Augmentation +3

JiuZhang: A Chinese Pre-trained Language Model for Mathematical Problem Understanding

1 code implementation 13 Jun 2022 Wayne Xin Zhao, Kun Zhou, Zheng Gong, Beichen Zhang, Yuanhang Zhou, Jing Sha, Zhigang Chen, Shijin Wang, Cong Liu, Ji-Rong Wen

Considering the complex nature of mathematical texts, we design a novel curriculum pre-training approach for improving the learning of mathematical PLMs, consisting of both basic and advanced courses.

Language Modelling Math

Towards Universal Sequence Representation Learning for Recommender Systems

1 code implementation 13 Jun 2022 Yupeng Hou, Shanlei Mu, Wayne Xin Zhao, Yaliang Li, Bolin Ding, Ji-Rong Wen

In order to develop effective sequential recommenders, a series of sequence representation learning (SRL) methods are proposed to model historical user behaviors.

Recommendation Systems Representation Learning

Feature-aware Diversified Re-ranking with Disentangled Representations for Relevant Recommendation

no code implementations 10 Jun 2022 Zihan Lin, Hui Wang, Jingshu Mao, Wayne Xin Zhao, Cheng Wang, Peng Jiang, Ji-Rong Wen

Relevant recommendation is a special recommendation scenario which provides relevant items when users express interest in one target item (e.g., click, like and purchase).

Disentanglement Diversity +1

Negative Sampling for Contrastive Representation Learning: A Review

no code implementations 1 Jun 2022 Lanling Xu, Jianxun Lian, Wayne Xin Zhao, Ming Gong, Linjun Shou, Daxin Jiang, Xing Xie, Ji-Rong Wen

The learn-to-compare paradigm of contrastive representation learning (CRL), which compares positive samples with negative ones for representation learning, has achieved great success in a wide range of domains, including natural language processing, computer vision, information retrieval and graph learning.

Graph Learning Information Retrieval +2

Learning to Transfer Prompts for Text Generation

1 code implementation NAACL 2022 Junyi Li, Tianyi Tang, Jian-Yun Nie, Ji-Rong Wen, Wayne Xin Zhao

First, PTG learns a set of source prompts for various source generation tasks and then transfers these prompts as target prompts to perform target generation tasks.

Text Generation

Debiased Contrastive Learning of Unsupervised Sentence Representations

1 code implementation ACL 2022 Kun Zhou, Beichen Zhang, Wayne Xin Zhao, Ji-Rong Wen

In DCLR, we design an instance weighting method to punish false negatives and generate noise-based negatives to guarantee the uniformity of the representation space.

Contrastive Learning Semantic Textual Similarity +1
