Search Results for author: Can Xu

Found 68 papers, 32 papers with code

A Survey on Knowledge Distillation of Large Language Models

1 code implementation · 20 Feb 2024 · Xiaohan Xu, Ming Li, Chongyang Tao, Tao Shen, Reynold Cheng, Jinyang Li, Can Xu, Dacheng Tao, Tianyi Zhou

In the era of Large Language Models (LLMs), Knowledge Distillation (KD) emerges as a pivotal methodology for transferring advanced capabilities from leading proprietary LLMs, such as GPT-4, to their open-source counterparts like LLaMA and Mistral.

Data Augmentation · Knowledge Distillation +1

Diffusion-based graph generative methods

1 code implementation · 28 Jan 2024 · Hongyang Chen, Can Xu, Lingyu Zheng, Qiang Zhang, Xuemin Lin

As the most cutting-edge class of generative methods, diffusion models have shown great advances across a wide range of generation tasks.

Denoising · Graph Generation

Leveraging Large Language Models for NLG Evaluation: A Survey

1 code implementation · 13 Jan 2024 · Zhen Li, Xiaohan Xu, Tao Shen, Can Xu, Jia-Chen Gu, Chongyang Tao

In the rapidly evolving domain of Natural Language Generation (NLG) evaluation, introducing Large Language Models (LLMs) has opened new avenues for assessing generated content quality, e.g., coherence, creativity, and context relevance.

NLG Evaluation · Specificity +1

Geometric-Facilitated Denoising Diffusion Model for 3D Molecule Generation

1 code implementation · 5 Jan 2024 · Can Xu, Haosen Wang, Weigang Wang, Pengfei Zheng, Hongyang Chen

The second challenge involves adapting molecule generation to the diffusion framework and accurately predicting the existence of bonds.

3D Molecule Generation · Denoising

WaveCoder: Widespread And Versatile Enhanced Instruction Tuning with Refined Data Generation

no code implementations · 20 Dec 2023 · Zhaojian Yu, Xin Zhang, Ning Shang, Yangyu Huang, Can Xu, Yishujie Zhao, Wenxiang Hu, Qiufeng Yin

This paper thus offers a significant contribution to the field of instruction data generation and fine-tuning models, providing new insights and tools for enhancing performance in code-related tasks.

Code Generation

Re-Reading Improves Reasoning in Large Language Models

1 code implementation · 12 Sep 2023 · Xiaohan Xu, Chongyang Tao, Tao Shen, Can Xu, Hongbo Xu, Guodong Long, Jian-Guang Lou

To enhance the reasoning capabilities of off-the-shelf Large Language Models (LLMs), we introduce a simple, yet general and effective prompting method, Re2, i.e., Re-Reading the question as input.
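
A minimal sketch of the Re2 idea above: the question is read twice in the prompt before the model answers. The template wording is an assumption for illustration, not the paper's verbatim prompt.

```python
# Re2 ("Re-Reading") prompting sketch: repeat the question before answering.
def build_re2_prompt(question: str) -> str:
    return (
        f"Q: {question}\n"
        f"Read the question again: {question}\n"
        "A: Let's think step by step."
    )

print(build_re2_prompt("If a train covers 60 km in 45 minutes, what is its average speed in km/h?"))
```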

WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct

1 code implementation · 18 Aug 2023 · Haipeng Luo, Qingfeng Sun, Can Xu, Pu Zhao, Jian-Guang Lou, Chongyang Tao, Xiubo Geng, Qingwei Lin, Shifeng Chen, Dongmei Zhang

Through extensive experiments on two mathematical reasoning benchmarks, namely GSM8k and MATH, we reveal the extraordinary capabilities of our model.

Ranked #49 on Arithmetic Reasoning on GSM8K (using extra training data)

Arithmetic Reasoning · GSM8K +2

Investigating the Learning Behaviour of In-context Learning: A Comparison with Supervised Learning

1 code implementation · 28 Jul 2023 · Xindi Wang, Yufei Wang, Can Xu, Xiubo Geng, Bowen Zhang, Chongyang Tao, Frank Rudzicz, Robert E. Mercer, Daxin Jiang

Large language models (LLMs) have shown remarkable capacity for in-context learning (ICL), where a new task can be learned from just a few training examples without the model being explicitly trained for it.

In-Context Learning

WizardCoder: Empowering Code Large Language Models with Evol-Instruct

2 code implementations · 14 Jun 2023 · Ziyang Luo, Can Xu, Pu Zhao, Qingfeng Sun, Xiubo Geng, Wenxiang Hu, Chongyang Tao, Jing Ma, Qingwei Lin, Daxin Jiang

Moreover, our model even outperforms the largest closed LLMs, Anthropic's Claude and Google's Bard, on HumanEval and HumanEval+.

Ranked #3 on Code Generation on CodeContests (Test Set pass@1 metric)

Code Generation

Synergistic Interplay between Search and Large Language Models for Information Retrieval

1 code implementation · 12 May 2023 · Jiazhan Feng, Chongyang Tao, Xiubo Geng, Tao Shen, Can Xu, Guodong Long, Dongyan Zhao, Daxin Jiang

Information retrieval (IR) plays a crucial role in locating relevant resources from vast amounts of data, and its applications have evolved from traditional knowledge bases to modern retrieval models (RMs).

Information Retrieval · Retrieval

Augmented Large Language Models with Parametric Knowledge Guiding

1 code implementation · 8 May 2023 · Ziyang Luo, Can Xu, Pu Zhao, Xiubo Geng, Chongyang Tao, Jing Ma, Qingwei Lin, Daxin Jiang

We demonstrate that our PKG framework can enhance the performance of "black-box" LLMs on a range of domain knowledge-intensive tasks that require factual (+7.9%), tabular (+11.9%), medical (+3.0%), and multimodal (+8.1%) knowledge.

Self-Supervised Multi-Modal Sequential Recommendation

1 code implementation · 26 Apr 2023 · Kunzhe Song, Qingfeng Sun, Can Xu, Kai Zheng, Yaming Yang

To address this issue, we propose a dual-tower retrieval architecture for sequence recommendation.

Contrastive Learning · Retrieval +1
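
As a rough illustration of the dual-tower retrieval architecture proposed above, here is a minimal PyTorch sketch: one tower encodes the user's interaction history, the other encodes candidate items, and a dot product yields retrieval scores. The layer choices and sizes are assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

class DualTower(nn.Module):
    """Sketch of a dual-tower model for sequential recommendation."""
    def __init__(self, num_items: int, dim: int = 64):
        super().__init__()
        self.item_emb = nn.Embedding(num_items, dim)
        self.user_tower = nn.GRU(dim, dim, batch_first=True)  # encodes history
        self.item_tower = nn.Linear(dim, dim)                 # encodes candidates

    def forward(self, history: torch.Tensor, candidates: torch.Tensor):
        # history: (B, T) item ids; candidates: (B, C) item ids
        _, h = self.user_tower(self.item_emb(history))          # h: (1, B, D)
        user_vec = h.squeeze(0)                                 # (B, D)
        cand_vec = self.item_tower(self.item_emb(candidates))   # (B, C, D)
        return torch.einsum("bd,bcd->bc", user_vec, cand_vec)   # scores (B, C)

scores = DualTower(1000)(torch.randint(0, 1000, (2, 10)),
                         torch.randint(0, 1000, (2, 5)))
```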

WizardLM: Empowering Large Language Models to Follow Complex Instructions

4 code implementations · 24 Apr 2023 · Can Xu, Qingfeng Sun, Kai Zheng, Xiubo Geng, Pu Zhao, Jiazhan Feng, Chongyang Tao, Daxin Jiang

In this paper, we show an avenue for creating large amounts of instruction data with varying levels of complexity using LLMs instead of humans.

Instruction Following
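
A minimal sketch of the Evol-Instruct loop described above: an LLM is repeatedly asked to rewrite seed instructions into more complex variants, growing the instruction pool. The template wording and the `call_llm` hook are placeholders; the full method also includes in-breadth evolving and an elimination step not shown here.

```python
EVOLVE_TEMPLATE = (
    "Rewrite the instruction below to make it more complex, e.g. by adding "
    "constraints or requiring deeper reasoning, while keeping it answerable:\n\n"
    "{instruction}"
)

def evolve(seed_instructions, call_llm, rounds: int = 2):
    """Grow an instruction pool by iteratively complicating each instruction."""
    pool = list(seed_instructions)
    frontier = pool
    for _ in range(rounds):
        frontier = [call_llm(EVOLVE_TEMPLATE.format(instruction=i)) for i in frontier]
        pool.extend(frontier)
    return pool

# Demo with a stand-in "LLM" that just tags the instruction it was given.
print(evolve(["Explain recursion."], lambda p: p.splitlines()[-1] + " (evolved)"))
```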

LexLIP: Lexicon-Bottlenecked Language-Image Pre-Training for Large-Scale Image-Text Retrieval

1 code implementation · 6 Feb 2023 · Ziyang Luo, Pu Zhao, Can Xu, Xiubo Geng, Tao Shen, Chongyang Tao, Jing Ma, Qingwei Lin, Daxin Jiang

The conventional dense retrieval paradigm relies on encoding images and texts into dense representations using dual-stream encoders; however, it faces challenges with low retrieval speed in large-scale retrieval scenarios.

Retrieval · Text Retrieval

Iterative Proposal Refinement for Weakly-Supervised Video Grounding

no code implementations · CVPR 2023 · Meng Cao, Fangyun Wei, Can Xu, Xiubo Geng, Long Chen, Can Zhang, Yuexian Zou, Tao Shen, Daxin Jiang

Weakly-Supervised Video Grounding (WSVG) aims to localize events of interest in untrimmed videos with only video-level annotations.

Sentence · Video Grounding

LexLIP: Lexicon-Bottlenecked Language-Image Pre-Training for Large-Scale Image-Text Sparse Retrieval

1 code implementation · ICCV 2023 · Ziyang Luo, Pu Zhao, Can Xu, Xiubo Geng, Tao Shen, Chongyang Tao, Jing Ma, Qingwei Lin, Daxin Jiang

To address this issue, we propose a novel sparse retrieval paradigm for ITR that exploits sparse representations in the vocabulary space for images and texts.

Image Classification · Retrieval +2

Adam: Dense Retrieval Distillation with Adaptive Dark Examples

no code implementations · 20 Dec 2022 · Chang Liu, Chongyang Tao, Xiubo Geng, Tao Shen, Dongyan Zhao, Can Xu, Binxing Jiao, Daxin Jiang

Different from previous works that rely on only one positive and hard negatives as candidate passages, we create dark examples that all have moderate relevance to the query through mixing-up and masking in discrete space.

Knowledge Distillation · Retrieval
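
A minimal sketch of how such dark examples might be built in discrete token space via the masking and mixing-up operations mentioned above; the ratios and the `[MASK]` convention are illustrative assumptions.

```python
import random

def mask_passage(tokens, mask_ratio: float = 0.3, mask_token: str = "[MASK]"):
    """Degrade a positive passage by randomly masking tokens."""
    return [mask_token if random.random() < mask_ratio else t for t in tokens]

def mixup_passages(pos_tokens, neg_tokens, pos_ratio: float = 0.5):
    """Blend a positive and a negative passage token-by-token."""
    length = min(len(pos_tokens), len(neg_tokens))
    return [pos_tokens[i] if random.random() < pos_ratio else neg_tokens[i]
            for i in range(length)]

print(mask_passage("the cat sat on the mat".split()))
print(mixup_passages("the cat sat on the mat".split(),
                     "dogs bark loudly at night too".split()))
```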

Fine-Grained Distillation for Long Document Retrieval

no code implementations · 20 Dec 2022 · Yucheng Zhou, Tao Shen, Xiubo Geng, Chongyang Tao, Guodong Long, Can Xu, Daxin Jiang

Long document retrieval aims to fetch query-relevant documents from a large-scale collection, where knowledge distillation has become the de facto approach to improving a retriever by mimicking a heterogeneous yet powerful cross-encoder.

Knowledge Distillation · Retrieval

Latent User Intent Modeling for Sequential Recommenders

no code implementations · 17 Nov 2022 · Bo Chang, Alexandros Karatzoglou, Yuyan Wang, Can Xu, Ed H. Chi, Minmin Chen

We demonstrate the effectiveness of the latent user intent modeling via offline analyses as well as live experiments on a large-scale industrial recommendation platform.

Recommendation Systems

Reward Shaping for User Satisfaction in a REINFORCE Recommender

no code implementations · 30 Sep 2022 · Konstantina Christakopoulou, Can Xu, Sai Zhang, Sriraj Badam, Trevor Potter, Daniel Li, Hao Wan, Xinyang Yi, Ya Le, Chris Berg, Eric Bencomo Dixon, Ed H. Chi, Minmin Chen

How might we design Reinforcement Learning (RL)-based recommenders that encourage aligning user trajectories with the underlying user satisfaction?

Imputation · Reinforcement Learning (RL)

LexMAE: Lexicon-Bottlenecked Pretraining for Large-Scale Retrieval

1 code implementation · 31 Aug 2022 · Tao Shen, Xiubo Geng, Chongyang Tao, Can Xu, Xiaolong Huang, Binxing Jiao, Linjun Yang, Daxin Jiang

In large-scale retrieval, the lexicon-weighting paradigm, learning weighted sparse representations in vocabulary space, has shown promising results with high quality and low latency.

Language Modelling · Passage Retrieval +1
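
For intuition, a minimal sketch of the lexicon-weighting paradigm mentioned above: a passage becomes a sparse weight vector over the vocabulary, here by max-pooling log-saturated MLM logits across token positions, in the style of SPLADE-like models. The activation and pooling choices are assumptions, not necessarily LexMAE's exact formulation.

```python
import torch

def lexicon_weights(mlm_logits: torch.Tensor, attention_mask: torch.Tensor):
    """mlm_logits: (B, T, V) token-level vocabulary logits;
    attention_mask: (B, T). Returns (B, V) sparse lexicon weights."""
    scores = torch.log1p(torch.relu(mlm_logits))   # saturate and zero negatives
    scores = scores.masked_fill(attention_mask.unsqueeze(-1) == 0, 0.0)
    return scores.max(dim=1).values                 # max-pool over positions

w = lexicon_weights(torch.randn(2, 8, 30522), torch.ones(2, 8))
```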

LED: Lexicon-Enlightened Dense Retriever for Large-Scale Retrieval

1 code implementation · 29 Aug 2022 · Kai Zhang, Chongyang Tao, Tao Shen, Can Xu, Xiubo Geng, Binxing Jiao, Daxin Jiang

The alignment is achieved by weakened knowledge distillation to enlighten the retriever via two aspects -- 1) a lexicon-augmented contrastive objective to challenge the dense encoder, and 2) a pair-wise rank-consistent regularization to make the dense model's behavior incline toward the other's.

Representation Learning · Retrieval
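
A minimal sketch of what the pair-wise rank-consistent regularization above could look like: for every pair of candidate passages, the dense model is penalized whenever its score ordering disagrees with the lexicon-aware teacher's. The exact margin formulation is an assumption for illustration.

```python
import torch
import torch.nn.functional as F

def rank_consistency_loss(dense_scores: torch.Tensor, lex_scores: torch.Tensor):
    """dense_scores, lex_scores: (B, C) scores over C candidates per query."""
    d_margin = dense_scores.unsqueeze(2) - dense_scores.unsqueeze(1)  # (B, C, C)
    t_margin = lex_scores.unsqueeze(2) - lex_scores.unsqueeze(1)
    # Penalize dense pairs whose ordering disagrees with the lexicon teacher.
    return F.relu(-d_margin * torch.sign(t_margin)).mean()

loss = rank_consistency_loss(torch.randn(4, 8), torch.randn(4, 8))
```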

LFGCF: Light Folksonomy Graph Collaborative Filtering for Tag-Aware Recommendation

no code implementations · 6 Aug 2022 · Yin Zhang, Can Xu, Xianjun Wu, Yan Zhang, Ligang Dong, Weigang Wang

Recently, many efforts have been devoted to improving tag-aware recommendation systems (TRS) with Graph Convolutional Networks (GCNs), which have become the new state of the art for general recommendation.

Collaborative Filtering · Recommendation Systems +1

Towards Robust Ranker for Text Retrieval

no code implementations · 16 Jun 2022 · Yucheng Zhou, Tao Shen, Xiubo Geng, Chongyang Tao, Can Xu, Guodong Long, Binxing Jiao, Daxin Jiang

A ranker plays an indispensable role in the de facto 'retrieval & rerank' pipeline, but its training still lags behind -- learning from moderate negatives and/or serving as an auxiliary module for a retriever.

Passage Retrieval · Retrieval +1

PCL: Peer-Contrastive Learning with Diverse Augmentations for Unsupervised Sentence Embeddings

1 code implementation · 28 Jan 2022 · Qiyu Wu, Chongyang Tao, Tao Shen, Can Xu, Xiubo Geng, Daxin Jiang

A straightforward solution is resorting to more diverse positives from a multi-augmenting strategy, while an open question remains about how to learn, without supervision, from diverse positives of uneven augmenting quality in the text field.

Contrastive Learning · Open-Ended Question Answering +3
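
For illustration, a minimal sketch of contrastive learning with several augmented positives per anchor sentence, averaging an InfoNCE-style loss over the views; PCL's actual peer-network setup is more involved, so treat this as a simplified assumption.

```python
import torch
import torch.nn.functional as F

def multi_positive_infonce(anchor, positives, temperature: float = 0.05):
    """anchor: (B, D) embeddings; positives: (K, B, D), K augmented views."""
    anchor = F.normalize(anchor, dim=-1)
    losses = []
    for pos in positives:                              # one view at a time
        pos = F.normalize(pos, dim=-1)
        logits = anchor @ pos.t() / temperature        # (B, B) similarities
        labels = torch.arange(anchor.size(0), device=anchor.device)
        losses.append(F.cross_entropy(logits, labels))
    return torch.stack(losses).mean()

loss = multi_positive_infonce(torch.randn(8, 32), torch.randn(3, 8, 32))
```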

Recency Dropout for Recurrent Recommender Systems

no code implementations · 26 Jan 2022 · Bo Chang, Can Xu, Matthieu Lê, Jingchen Feng, Ya Le, Sriraj Badam, Ed Chi, Minmin Chen

Recurrent recommender systems have been successful in capturing the temporal dynamics in users' activity trajectories.

Data Augmentation · Recommendation Systems
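
The snippet above does not spell out the mechanism, but going by the title, a plausible reading of recency dropout is a training-time augmentation that removes the most recent interactions from a user's history so the model does not over-rely on very recent activity; the sketch below is a guess under that assumption, with illustrative hyperparameters.

```python
import random

def recency_dropout(sequence, max_drop: int = 5, p: float = 0.5):
    """Hypothetical recency dropout: with probability p, drop up to
    max_drop of the most recent interactions from the sequence."""
    if len(sequence) > 1 and random.random() < p:
        k = random.randint(1, min(max_drop, len(sequence) - 1))
        return sequence[:-k]
    return sequence

print(recency_dropout(list(range(10))))
```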

Multimodal Dialogue Response Generation

no code implementations · ACL 2022 · Qingfeng Sun, Yujing Wang, Can Xu, Kai Zheng, Yaming Yang, Huang Hu, Fei Xu, Jessica Zhang, Xiubo Geng, Daxin Jiang

In such a low-resource setting, we devise a novel conversational agent, Divter, in order to isolate parameters that depend on multimodal dialogues from the entire generation model.

Dialogue Generation · Response Generation +1

RecInDial: A Unified Framework for Conversational Recommendation with Pretrained Language Models

no code implementations · 14 Oct 2021 · Lingzhi Wang, Huang Hu, Lei Sha, Can Xu, Kam-Fai Wong, Daxin Jiang

Furthermore, we propose to evaluate the CRS models in an end-to-end manner, which can reflect the overall performance of the entire system rather than the performance of individual modules, compared to the separate evaluations of the two modules used in previous work.

Dialogue Generation · Language Modelling +1

Learning to Ground Visual Objects for Visual Dialog

no code implementations · Findings (EMNLP) 2021 · Feilong Chen, Xiuyi Chen, Can Xu, Daxin Jiang

Specifically, a posterior distribution over visual objects is inferred from both context (history and questions) and answers, and it ensures the appropriate grounding of visual objects during the training process.

Visual Dialog

Neural Rule-Execution Tracking Machine For Transformer-Based Text Generation

no code implementations · NeurIPS 2021 · Yufei Wang, Can Xu, Huang Hu, Chongyang Tao, Stephen Wan, Mark Dras, Mark Johnson, Daxin Jiang

Sequence-to-Sequence (S2S) neural text generation models, especially the pre-trained ones (e.g., BART and T5), have exhibited compelling performance on various natural language generation tasks.

Text Generation

MPC-BERT: A Pre-Trained Language Model for Multi-Party Conversation Understanding

1 code implementation · ACL 2021 · Jia-Chen Gu, Chongyang Tao, Zhen-Hua Ling, Can Xu, Xiubo Geng, Daxin Jiang

Recently, various neural models for multi-party conversation (MPC) have achieved impressive improvements on a variety of tasks such as addressee recognition, speaker identification and response prediction.

Language Modelling · Speaker Identification

Maria: A Visual Experience Powered Conversational Agent

1 code implementation · ACL 2021 · Zujie Liang, Huang Hu, Can Xu, Chongyang Tao, Xiubo Geng, Yining Chen, Fan Liang, Daxin Jiang

The retriever aims to retrieve an image correlated with the dialog from an image index, while the visual concept detector extracts rich visual knowledge from the image.

Learning Matching Representations for Individualized Organ Transplantation Allocation

1 code implementation · 28 Jan 2021 · Can Xu, Ahmed M. Alaa, Ioana Bica, Brent D. Ershoff, Maxime Cannesson, Mihaela van der Schaar

Organ transplantation is often the last resort for treating end-stage illness, but the probability of a successful transplantation depends greatly on compatibility between donors and recipients.

Counterfactual · Representation Learning

Are Pre-trained Language Models Knowledgeable to Ground Open Domain Dialogues?

no code implementations · 19 Nov 2020 · Yufan Zhao, Wei Wu, Can Xu

We study knowledge-grounded dialogue generation with pre-trained language models.

Dialogue Generation

StyleDGPT: Stylized Response Generation with Pre-trained Language Models

1 code implementation · Findings of the Association for Computational Linguistics 2020 · Ze Yang, Wei Wu, Can Xu, Xinnian Liang, Jiaqi Bai, Liran Wang, Wei Wang, Zhoujun Li

Generating responses in a desired style has great potential to extend applications of open-domain dialogue systems, yet is held back by the lack of parallel data for training.

Response Generation · Sentence

Zero-Resource Knowledge-Grounded Dialogue Generation

1 code implementation · NeurIPS 2020 · Linxiao Li, Can Xu, Wei Wu, Yufan Zhao, Xueliang Zhao, Chongyang Tao

While neural conversation models have shown great potential for generating informative and engaging responses by introducing external knowledge, learning such a model often requires knowledge-grounded dialogues that are difficult to obtain.

Dialogue Generation

Open Domain Dialogue Generation with Latent Images

no code implementations · 4 Apr 2020 · Ze Yang, Wei Wu, Huang Hu, Can Xu, Wei Wang, Zhoujun Li

Thus, we propose learning a response generation model with both image-grounded dialogues and textual dialogues by assuming that the visual scene information at the time of a conversation can be represented by an image, and trying to recover the latent images of the textual dialogues through text-to-image generation techniques.

Dialogue Generation · Response Generation +1

Low-Resource Knowledge-Grounded Dialogue Generation

no code implementations · ICLR 2020 · Xueliang Zhao, Wei Wu, Chongyang Tao, Can Xu, Dongyan Zhao, Rui Yan

In such a low-resource setting, we devise a disentangled response decoder in order to isolate parameters that depend on knowledge-grounded dialogues from the entire generation model.

Dialogue Generation · Response Generation

THUEE system description for NIST 2019 SRE CTS Challenge

no code implementations · 25 Dec 2019 · Yi Liu, Tianyu Liang, Can Xu, Xianwei Zhang, Xianhong Chen, Wei-Qiang Zhang, Liang He, Dandan Song, Ruyun Li, Yangcheng Wu, Peng Ouyang, Shouyi Yin

This paper describes the systems submitted by the Department of Electronic Engineering and the Institute of Microelectronics of Tsinghua University and TsingMicro Co. Ltd. (THUEE) to the NIST 2019 speaker recognition evaluation (SRE) CTS challenge.

Speaker Recognition

Low-Resource Response Generation with Template Prior

1 code implementation · IJCNLP 2019 · Ze Yang, Wei Wu, Jian Yang, Can Xu, Zhoujun Li

Since paired data alone is no longer enough to train a neural generation model, we consider leveraging large-scale unpaired data, which is much easier to obtain, and propose response generation with both paired and unpaired data.

Response Generation

A Document-grounded Matching Network for Response Selection in Retrieval-based Chatbots

no code implementations · 11 Jun 2019 · Xueliang Zhao, Chongyang Tao, Wei Wu, Can Xu, Dongyan Zhao, Rui Yan

We present a document-grounded matching network (DGMN) for response selection that can power a knowledge-aware retrieval-based chatbot system.

Chatbot · Retrieval

Multiobjective Optimization Training of PLDA for Speaker Verification

2 code implementations · 25 Aug 2018 · Liang He, Xianhong Chen, Can Xu, Jia Liu

Most current state-of-the-art text-independent speaker verification systems take probabilistic linear discriminant analysis (PLDA) as their backend classifiers.

Multiobjective Optimization · Text-Independent Speaker Verification

Improving Matching Models with Hierarchical Contextualized Representations for Multi-turn Response Selection

no code implementations · 22 Aug 2018 · Chongyang Tao, Wei Wu, Can Xu, Yansong Feng, Dongyan Zhao, Rui Yan

In this paper, we study context-response matching with pre-trained contextualized representations for multi-turn response selection in retrieval-based chatbots.

Dialogue Generation · Retrieval +1

Towards Explainable and Controllable Open Domain Dialogue Generation with Dialogue Acts

no code implementations · 19 Jul 2018 · Can Xu, Wei Wu, Yu Wu

We study open domain dialogue generation with dialogue acts designed to explain how people engage in social chat.

Dialogue Generation · Reinforcement Learning +2

Towards Interpretable Chit-chat: Open Domain Dialogue Generation with Dialogue Acts

no code implementations · ICLR 2018 · Wei Wu, Can Xu, Yu Wu, Zhoujun Li

Conventional methods model open domain dialogue generation as a black box through end-to-end learning from large scale conversation data.

Dialogue Generation · Response Generation

A Sequential Matching Framework for Multi-turn Response Selection in Retrieval-based Chatbots

no code implementations · CL 2019 · Yu Wu, Wei Wu, Chen Xing, Can Xu, Zhoujun Li, Ming Zhou

The task requires matching a response candidate with a conversation context, whose challenges include how to recognize important parts of the context, and how to model the relationships among utterances in the context.

Retrieval

Large Margin Discriminant Dimensionality Reduction in Prediction Space

no code implementations · NeurIPS 2016 · Mohammad Saberian, Jose Costa Pereira, Can Xu, Jian Yang, Nuno Vasconcelos

We argue that the intermediate mapping, e.g., a boosting predictor, preserves the discriminant aspects of the data, and that by controlling the dimension of this mapping it is possible to achieve discriminant low-dimensional representations of the data.

Dimensionality Reduction · General Classification +1

Visual Sentiment Prediction with Deep Convolutional Neural Networks

no code implementations · 21 Nov 2014 · Can Xu, Suleyman Cetintas, Kuang-Chih Lee, Li-Jia Li

Images have become one of the most popular types of media through which users convey their emotions within online social networks.

Object Recognition · Sentiment Analysis +2
