Search Results for author: Junyi Li

Found 63 papers, 33 papers with code

Mix-CPT: A Domain Adaptation Framework via Decoupling Knowledge Learning and Format Alignment

no code implementations 15 Jul 2024 Jinhao Jiang, Junyi Li, Wayne Xin Zhao, Yang Song, Tao Zhang, Ji-Rong Wen

However, this method may result in inefficient knowledge memorization due to a lack of awareness of knowledge utilization and imposes substantial demands on LLMs to simultaneously learn knowledge utilization and format alignment with limited training samples.

Domain Adaptation Memorization

MKDTI: Predicting drug-target interactions via multiple kernel fusion on graph attention network

no code implementations 14 Jul 2024 Yuhuan Zhou, Yulin Wu, Weiwei Yuan, Xuan Wang, Junyi Li

In our work, we formulate a model called MKDTI by extracting kernel information from various layer embeddings of a graph attention network.

Graph Attention

LLMBox: A Comprehensive Library for Large Language Models

1 code implementation 8 Jul 2024 Tianyi Tang, Yiwen Hu, Bingqian Li, Wenyang Luo, Zijing Qin, Haoxiang Sun, Jiapeng Wang, Shiyi Xu, Xiaoxue Cheng, Geyang Guo, Han Peng, Bowen Zheng, Yiru Tang, Yingqian Min, Yushuo Chen, Jie Chen, Yuanqian Zhao, Luran Ding, Yuhao Wang, Zican Dong, Chunxuan Xia, Junyi Li, Kun Zhou, Wayne Xin Zhao, Ji-Rong Wen

To facilitate the research on large language models (LLMs), this paper presents a comprehensive and unified library, LLMBox, to ease the development, use, and evaluation of LLMs.

Small Agent Can Also Rock! Empowering Small Language Models as Hallucination Detector

1 code implementation 17 Jun 2024 Xiaoxue Cheng, Junyi Li, Wayne Xin Zhao, Hongzhi Zhang, Fuzheng Zhang, Di Zhang, Kun Gai, Ji-Rong Wen

Hallucination detection is a challenging task for large language models (LLMs), and existing studies heavily rely on powerful closed-source LLMs such as GPT-4.

2k Hallucination

Exploring Context Window of Large Language Models via Decomposed Positional Vectors

no code implementations 28 May 2024 Zican Dong, Junyi Li, Xin Men, Wayne Xin Zhao, Bingbing Wang, Zhen Tian, WeiPeng Chen, Ji-Rong Wen

Based on our findings, we design two training-free context window extension methods, positional vector replacement and attention window extension.

TBSN: Transformer-Based Blind-Spot Network for Self-Supervised Image Denoising

1 code implementation 11 Apr 2024 Junyi Li, Zhilu Zhang, WangMeng Zuo

For channel self-attention, we observe that it may leak the blind-spot information when the channel number is greater than spatial size in the deep layers of multi-scale architectures.

Computational Efficiency Image Denoising +2

ChainLM: Empowering Large Language Models with Improved Chain-of-Thought Prompting

1 code implementation 21 Mar 2024 Xiaoxue Cheng, Junyi Li, Wayne Xin Zhao, Ji-Rong Wen

In response to this challenge, we present an empirical investigation of CoT prompting and introduce CoTGenius, a novel framework designed for the automatic generation of superior CoT prompts.

REAR: A Relevance-Aware Retrieval-Augmented Framework for Open-Domain Question Answering

1 code implementation 27 Feb 2024 Yuhao Wang, Ruiyang Ren, Junyi Li, Wayne Xin Zhao, Jing Liu, Ji-Rong Wen

By combining the improvements in both architecture and training, our proposed REAR can better utilize external knowledge by effectively perceiving the relevance of retrieved documents.

Open-Domain Question Answering RAG +1

The Dawn After the Dark: An Empirical Study on Factuality Hallucination in Large Language Models

1 code implementation 6 Jan 2024 Junyi Li, Jie Chen, Ruiyang Ren, Xiaoxue Cheng, Wayne Xin Zhao, Jian-Yun Nie, Ji-Rong Wen

To tackle LLM hallucination, three key questions should be well studied: how to detect hallucinations (detection), why LLMs hallucinate (source), and what can be done to mitigate them (mitigation).

Hallucination

Device-Wise Federated Network Pruning

no code implementations CVPR 2024 Shangqian Gao, Junyi Li, Zeyu Zhang, Yanfu Zhang, Weidong Cai, Heng Huang

Neural network pruning, particularly channel pruning, is a widely used technique for compressing deep learning models to enable their deployment on edge devices with limited resources.

Edge-computing Federated Learning +1

On the steerability of large language models toward data-driven personas

no code implementations 8 Nov 2023 Junyi Li, Ninareh Mehrabi, Charith Peris, Palash Goyal, Kai-Wei Chang, Aram Galstyan, Richard Zemel, Rahul Gupta

Large language models (LLMs) are known to generate biased responses where the opinions of certain groups and populations are underrepresented.

Collaborative Filtering Language Modelling +1

BAMBOO: A Comprehensive Benchmark for Evaluating Long Text Modeling Capacities of Large Language Models

1 code implementation 23 Sep 2023 Zican Dong, Tianyi Tang, Junyi Li, Wayne Xin Zhao, Ji-Rong Wen

Recently, multiple studies have committed to extending the context length and enhancing the long text modeling capabilities of LLMs.

Code Completion Hallucination +2

Beyond Image Borders: Learning Feature Extrapolation for Unbounded Image Composition

1 code implementation ICCV 2023 Xiaoyu Liu, Ming Liu, Junyi Li, Shuai Liu, Xiaotao Wang, Lei Lei, WangMeng Zuo

In this paper, we circumvent this issue by presenting a joint framework for both unbounded recommendation of camera view and image composition (i.e., UNIC).

Image Cropping

Zero-shot Visual Question Answering with Language Model Feedback

1 code implementation 26 May 2023 Yifan Du, Junyi Li, Tianyi Tang, Wayne Xin Zhao, Ji-Rong Wen

In this paper, we propose a novel language model guided captioning approach, LAMOC, for knowledge-based visual question answering (VQA).

Language Modelling Question Answering +1

HaluEval: A Large-Scale Hallucination Evaluation Benchmark for Large Language Models

2 code implementations 19 May 2023 Junyi Li, Xiaoxue Cheng, Wayne Xin Zhao, Jian-Yun Nie, Ji-Rong Wen

Large language models (LLMs), such as ChatGPT, are prone to generate hallucinations, i.e., content that conflicts with the source or cannot be verified by the factual knowledge.

Hallucination Hallucination Evaluation

The Web Can Be Your Oyster for Improving Large Language Models

1 code implementation 18 May 2023 Junyi Li, Tianyi Tang, Wayne Xin Zhao, Jingyuan Wang, Jian-Yun Nie, Ji-Rong Wen

In order to further improve the capacity of LLMs for knowledge-intensive tasks, we consider augmenting LLMs with the large-scale web using a search engine.

Retrieval World Knowledge

GlyphDiffusion: Text Generation as Image Generation

no code implementations 25 Apr 2023 Junyi Li, Wayne Xin Zhao, Jian-Yun Nie, Ji-Rong Wen

In this way, conditional text generation can be cast as a glyph image generation task, and it is then natural to apply continuous diffusion models to discrete texts.

Conditional Text Generation Diversity +3

A Survey of Large Language Models

5 code implementations 31 Mar 2023 Wayne Xin Zhao, Kun Zhou, Junyi Li, Tianyi Tang, Xiaolei Wang, Yupeng Hou, Yingqian Min, Beichen Zhang, Junjie Zhang, Zican Dong, Yifan Du, Chen Yang, Yushuo Chen, Zhipeng Chen, Jinhao Jiang, Ruiyang Ren, YiFan Li, Xinyu Tang, Zikang Liu, Peiyu Liu, Jian-Yun Nie, Ji-Rong Wen

To discriminate the difference in parameter scale, the research community has coined the term large language models (LLM) for the PLMs of significant size.

Language Modelling

Communication-Efficient Federated Bilevel Optimization with Local and Global Lower Level Problems

no code implementations 13 Feb 2023 Junyi Li, Feihu Huang, Heng Huang

In this work, we investigate Federated Bilevel Optimization problems and propose a communication-efficient algorithm, named FedBiOAcc.

Bilevel Optimization Federated Learning +1

FedDA: Faster Framework of Local Adaptive Gradient Methods via Restarted Dual Averaging

no code implementations 13 Feb 2023 Junyi Li, Feihu Huang, Heng Huang

This matches the best known rate for first-order FL algorithms, and FedDA-MVR is the first adaptive FL algorithm that achieves this rate.

Federated Learning

TextBox 2.0: A Text Generation Library with Pre-trained Language Models

1 code implementation 26 Dec 2022 Tianyi Tang, Junyi Li, Zhipeng Chen, Yiwen Hu, Zhuohao Yu, Wenxun Dai, Zican Dong, Xiaoxue Cheng, Yuhao Wang, Wayne Xin Zhao, Jian-Yun Nie, Ji-Rong Wen

To facilitate research on text generation, this paper presents a comprehensive and unified library, TextBox 2.0, focusing on the use of pre-trained language models (PLMs).

Abstractive Text Summarization Data-to-Text Generation +7

Adaptive Federated Minimax Optimization with Lower Complexities

no code implementations 14 Nov 2022 Feihu Huang, Xinrui Wang, Junyi Li, Songcan Chen

To fill this gap, in the paper, we study a class of nonconvex minimax optimization, and propose an efficient adaptive federated minimax optimization algorithm (i.e., AdaFGDA) to solve these distributed minimax problems.

Federated Learning Privacy Preserving

FedGRec: Federated Graph Recommender System with Lazy Update of Latent Embeddings

no code implementations 25 Oct 2022 Junyi Li, Heng Huang

Therefore, Federated Recommender (FedRec) systems are proposed to mitigate privacy concerns to non-distributed recommender systems.

Federated Learning Recommendation Systems

MVP: Multi-task Supervised Pre-training for Natural Language Generation

2 code implementations 24 Jun 2022 Tianyi Tang, Junyi Li, Wayne Xin Zhao, Ji-Rong Wen

Motivated by the success of supervised pre-training, we propose Multi-task superVised Pre-training (MVP) for natural language generation.

Text Generation

Communication-Efficient Robust Federated Learning with Noisy Labels

no code implementations 11 Jun 2022 Junyi Li, Jian Pei, Heng Huang

A bilevel optimization problem is a type of optimization problem with two levels of entangled problems.

Bilevel Optimization Federated Learning +2

Learning to Transfer Prompts for Text Generation

1 code implementation NAACL 2022 Junyi Li, Tianyi Tang, Jian-Yun Nie, Ji-Rong Wen, Wayne Xin Zhao

First, PTG learns a set of source prompts for various source generation tasks and then transfers these prompts as target prompts to perform target generation tasks.

Text Generation

Local Stochastic Bilevel Optimization with Momentum-Based Variance Reduction

no code implementations 3 May 2022 Junyi Li, Feihu Huang, Heng Huang

Specifically, we first propose FedBiO, a deterministic gradient-based algorithm, and we show that it requires $O(\epsilon^{-2})$ iterations to reach an $\epsilon$-stationary point.

BIG-bench Machine Learning Bilevel Optimization +3

Correction of out-of-focus microscopic images by deep learning

1 code implementation Computational and Structural Biotechnology Journal 2022 Chi Zhang, Hao Jiang, Weihuang Liu, Junyi Li, Shiming Tang, Mario Juhas, Yang Zhang

To solve the out-of-focus issue in microscopy, we developed a Cycle Generative Adversarial Network (CycleGAN) based model and a multi-component weighted loss function.

Generative Adversarial Network Image Deblurring +1

Unidirectional Video Denoising by Mimicking Backward Recurrent Modules with Look-ahead Forward Ones

1 code implementation 12 Apr 2022 Junyi Li, Xiaohe Wu, Zhenxing Niu, WangMeng Zuo

However, BiRNN is intrinsically offline because it uses backward recurrent modules to propagate from the last to current frames, which causes high latency and large memory consumption.

Denoising Video Denoising +1

WuDaoMM: A large-scale Multi-Modal Dataset for Pre-training models

no code implementations 22 Mar 2022 Sha Yuan, Shuai Zhao, Jiahong Leng, Zhao Xue, Hanyu Zhao, Peiyu Liu, Zheng Gong, Wayne Xin Zhao, Junyi Li, Jie Tang

The results show that WuDaoMM can be applied as an efficient dataset for VLPMs, especially for the model in text-to-image generation task.

Image Captioning Question Answering +2

A Survey of Vision-Language Pre-Trained Models

no code implementations 18 Feb 2022 Yifan Du, Zikang Liu, Junyi Li, Wayne Xin Zhao

In this paper, we review the recent progress in Vision-Language Pre-Trained Models (VL-PTMs).

Context-Tuning: Learning Contextualized Prompts for Natural Language Generation

1 code implementation COLING 2022 Tianyi Tang, Junyi Li, Wayne Xin Zhao, Ji-Rong Wen

Secondly, we use continuous inverse prompting to improve the process of natural language generation by modeling an inverse generation process from output to input, making the generated text more relevant to the inputs.

Text Generation

Pretrained Language Models for Text Generation: A Survey

no code implementations 14 Jan 2022 Junyi Li, Tianyi Tang, Wayne Xin Zhao, Jian-Yun Nie, Ji-Rong Wen

We begin with introducing three key aspects of applying PLMs to text generation: 1) how to encode the input into representations preserving input semantics which can be fused into PLMs; 2) how to design an effective PLM to serve as the generation model; and 3) how to effectively optimize PLMs given the reference text and to ensure that the generated texts satisfy special text properties.

Text Generation

A Fully Single Loop Algorithm for Bilevel Optimization without Hessian Inverse

no code implementations 9 Dec 2021 Junyi Li, Bin Gu, Heng Huang

Combining our new formulation with the alternative update of the inner and outer variables, we propose an efficient fully single loop algorithm.

Bilevel Optimization

Enhanced Bilevel Optimization via Bregman Distance

no code implementations 26 Jul 2021 Feihu Huang, Junyi Li, Shangqian Gao, Heng Huang

Specifically, we propose a bilevel optimization method based on Bregman distance (BiO-BreD) to solve deterministic bilevel problems, which achieves a lower computational complexity than the best known results.

Bilevel Optimization Hyperparameter Optimization +2

BiAdam: Fast Adaptive Bilevel Optimization Methods

no code implementations 21 Jun 2021 Feihu Huang, Junyi Li, Shangqian Gao

To fill this gap, in the paper, we propose a novel fast adaptive bilevel framework to solve stochastic bilevel optimization problems in which the outer problem is possibly nonconvex and the inner problem is strongly convex.

Bilevel Optimization Meta-Learning +1

Compositional federated learning: Applications in distributionally robust averaging and meta learning

no code implementations 21 Jun 2021 Feihu Huang, Junyi Li

In the paper, we propose an effective and efficient Compositional Federated Learning (ComFedL) algorithm for solving a new compositional Federated Learning (FL) framework, which frequently appears in many data mining and machine learning problems with a hierarchical structure such as distributionally robust FL and model-agnostic meta learning (MAML).

BIG-bench Machine Learning Federated Learning +2

SUPER-ADAM: Faster and Universal Framework of Adaptive Gradients

1 code implementation NeurIPS 2021 Feihu Huang, Junyi Li, Heng Huang

To fill this gap, we propose a faster and universal framework of adaptive gradients (i.e., SUPER-ADAM) by introducing a universal adaptive matrix that includes most existing adaptive gradient forms.

Pretrained Language Models for Text Generation: A Survey

no code implementations 21 May 2021 Junyi Li, Tianyi Tang, Wayne Xin Zhao, Ji-Rong Wen

In this paper, we present an overview of the major advances achieved in the topic of PLMs for text generation.

Text Generation

Knowledge-based Review Generation by Coherence Enhanced Text Planning

no code implementations 9 May 2021 Junyi Li, Wayne Xin Zhao, Zhicheng Wei, Nicholas Jing Yuan, Ji-Rong Wen

For global coherence, we design a hierarchical self-attentive architecture with both subgraph- and node-level attention to enhance the correlations between subgraphs.

Informativeness Knowledge Graphs +3

TextBox: A Unified, Modularized, and Extensible Framework for Text Generation

1 code implementation ACL 2021 Junyi Li, Tianyi Tang, Gaole He, Jinhao Jiang, Xiaoxuan Hu, Puzhao Xie, Zhipeng Chen, Zhuohao Yu, Wayne Xin Zhao, Ji-Rong Wen

In this paper, we release an open-source library, called TextBox, to provide a unified, modularized, and extensible text generation framework.

Text Generation

Leveraging Class Hierarchy for Code Comprehension

no code implementations NeurIPS Workshop CAP 2020 Jiyang Zhang, Sheena Panthaplackel, Pengyu Nie, Junyi Li, Ray Mooney, Milos Gligoric

Object-oriented programming languages enable a hierarchical class structure, which provides rich contextual information to guide code comprehension and synthesis.

Improved Bilevel Model: Fast and Optimal Algorithm with Theoretical Guarantee

no code implementations 1 Sep 2020 Junyi Li, Bin Gu, Heng Huang

In this paper, we propose an improved bilevel model which converges faster and better compared to the current formulation.

Representation Learning

Faster Secure Data Mining via Distributed Homomorphic Encryption

no code implementations 17 Jun 2020 Junyi Li, Heng Huang

Due to the rising privacy demand in data mining, Homomorphic Encryption (HE) has recently been receiving more and more attention for its capability to do computations over the encrypted field.

Cloud Computing

Generating Realistic Stock Market Order Streams

no code implementations ICLR 2019 Junyi Li, Xitong Wang, Yaoyang Lin, Arunesh Sinha, Michael P. Wellman

We propose an approach to generate realistic and high-fidelity stock market data based on generative adversarial networks (GANs).

Generating Long and Informative Reviews with Aspect-Aware Coarse-to-Fine Decoding

1 code implementation ACL 2019 Junyi Li, Wayne Xin Zhao, Ji-Rong Wen, Yang Song

In this paper, we propose a novel review generation model by characterizing an elaborately designed aspect-aware coarse-to-fine generation process.

Decoder Review Generation +2

Lijunyi at SemEval-2019 Task 9: An attention-based LSTM and ensemble of different models for suggestion mining from online reviews and forums

no code implementations SEMEVAL 2019 Junyi Li

In this paper, we describe a suggestion mining system that participated in SemEval 2019 Task 9, SubTask A - Suggestion Mining from Online Reviews and Forums.

Suggestion mining
