Search Results for author: Hongjin Su

Found 7 papers, 6 papers with code

ARKS: Active Retrieval in Knowledge Soup for Code Generation

no code implementations • 19 Feb 2024 • Hongjin Su, Shuyang Jiang, Yuhang Lai, Haoyuan Wu, Boao Shi, Che Liu, Qian Liu, Tao Yu

Recently, the retrieval-augmented generation (RAG) paradigm has attracted much attention for its potential to incorporate external knowledge into large language models (LLMs) without further training.

Code Generation • Retrieval
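
To make the setting concrete, here is a minimal, dependency-free sketch of the retrieve-then-generate loop that RAG refers to: score documents in a small "knowledge soup" against the query, prepend the best match to the prompt, and call a generator. The knowledge entries, the bag-of-words scorer, and the generate() stub are illustrative placeholders; ARKS itself goes further, actively re-retrieving based on intermediate model output.

from collections import Counter

knowledge_soup = [
    "pandas.DataFrame.merge joins two frames on key columns.",
    "numpy.linalg.solve solves a linear system ax = b.",
    "re.findall returns all non-overlapping matches of a pattern.",
]

def score(query: str, doc: str) -> int:
    """Bag-of-words token overlap between query and document."""
    q, d = Counter(query.lower().split()), Counter(doc.lower().split())
    return sum((q & d).values())

def retrieve(query: str, k: int = 1) -> list[str]:
    """Return the k highest-scoring knowledge entries for the query."""
    return sorted(knowledge_soup, key=lambda doc: score(query, doc), reverse=True)[:k]

def generate(prompt: str) -> str:
    """Stand-in for an LLM call; a real system would query a model here."""
    return f"<model output conditioned on:\n{prompt}>"

query = "How do I join two pandas DataFrames on a column?"
context = "\n".join(retrieve(query))
print(generate(f"Context:\n{context}\n\nQuestion: {query}"))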

Generative Representational Instruction Tuning

2 code implementations • 15 Feb 2024 • Niklas Muennighoff, Hongjin Su, Liang Wang, Nan Yang, Furu Wei, Tao Yu, Amanpreet Singh, Douwe Kiela

Notably, we find that GRIT matches models trained on generative-only or embedding-only data, so the two objectives can be unified with no loss in performance.

Language Modelling • Large Language Model +1
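
As a rough illustration of the GRIT idea (one set of weights serving both representational and generative use), the toy PyTorch module below mean-pools hidden states when asked for an embedding and reuses the same backbone with a language-modelling head when asked for next-token logits. The tiny GRU backbone and all dimensions are hypothetical stand-ins, not GritLM's actual architecture (PyTorch assumed).

import torch
import torch.nn as nn

class ToyGrit(nn.Module):
    def __init__(self, vocab_size: int = 1000, hidden: int = 64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.backbone = nn.GRU(hidden, hidden, batch_first=True)  # shared weights
        self.lm_head = nn.Linear(hidden, vocab_size)              # generative mode

    def forward(self, tokens: torch.Tensor, mode: str = "generate"):
        states, _ = self.backbone(self.embed(tokens))
        if mode == "embed":
            return states.mean(dim=1)   # one pooled vector per sequence
        return self.lm_head(states)     # next-token logits per position

model = ToyGrit()
ids = torch.randint(0, 1000, (2, 8))    # batch of 2 toy token sequences
print(model(ids, mode="embed").shape)     # torch.Size([2, 64])
print(model(ids, mode="generate").shape)  # torch.Size([2, 8, 1000])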

OpenAgents: An Open Platform for Language Agents in the Wild

2 code implementations • 16 Oct 2023 • Tianbao Xie, Fan Zhou, Zhoujun Cheng, Peng Shi, Luoxuan Weng, Yitao Liu, Toh Jing Hua, Junning Zhao, Qian Liu, Che Liu, Leo Z. Liu, Yiheng Xu, Hongjin Su, Dongchan Shin, Caiming Xiong, Tao Yu

Language agents show promise in using natural language to handle varied and intricate tasks in diverse environments, particularly when built upon large language models (LLMs).

2D Object Detection

Lemur: Harmonizing Natural Language and Code for Language Agents

1 code implementation • 10 Oct 2023 • Yiheng Xu, Hongjin Su, Chen Xing, Boyu Mi, Qian Liu, Weijia Shi, Binyuan Hui, Fan Zhou, Yitao Liu, Tianbao Xie, Zhoujun Cheng, Siheng Zhao, Lingpeng Kong, Bailin Wang, Caiming Xiong, Tao Yu

We introduce Lemur and Lemur-Chat, openly accessible language models optimized for both natural language and coding capabilities to serve as the backbone of versatile language agents.

One Embedder, Any Task: Instruction-Finetuned Text Embeddings

3 code implementations • 19 Dec 2022 • Hongjin Su, Weijia Shi, Jungo Kasai, Yizhong Wang, Yushi Hu, Mari Ostendorf, Wen-tau Yih, Noah A. Smith, Luke Zettlemoyer, Tao Yu

Our analysis suggests that INSTRUCTOR is robust to changes in instructions, and that instruction finetuning mitigates the challenge of training a single model on diverse datasets.

Information Retrieval • Learning Word Embeddings +3
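
The call pattern below shows how an instruction-finetuned embedder like INSTRUCTOR is used: each input is an [instruction, text] pair, so the same sentence can be embedded differently depending on the task. It follows the usage documented in the instructor-embedding repository as best recalled; the model name, instruction phrasing, and output dimension are assumptions to verify against the project README.

from InstructorEmbedding import INSTRUCTOR  # pip install InstructorEmbedding

model = INSTRUCTOR("hkunlp/instructor-large")

# The instruction steers the embedding: same text, different task framings.
pairs = [
    ["Represent the Wikipedia document for retrieval:", "Lemur is a genus of primates."],
    ["Represent the Wikipedia document for clustering:", "Lemur is a genus of primates."],
]
embeddings = model.encode(pairs)
print(embeddings.shape)  # expected (2, 768) for instructor-large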

Selective Annotation Makes Language Models Better Few-Shot Learners

1 code implementation • 5 Sep 2022 • Hongjin Su, Jungo Kasai, Chen Henry Wu, Weijia Shi, Tianlu Wang, Jiayi Xin, Rui Zhang, Mari Ostendorf, Luke Zettlemoyer, Noah A. Smith, Tao Yu

Departing from recent in-context learning methods, we formulate an annotation-efficient, two-step framework: selective annotation, which chooses a pool of examples to annotate from unlabeled data in advance, followed by prompt retrieval, which retrieves task examples from the annotated pool at test time.

Code Generation • In-Context Learning +1
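
Below is a library-free sketch of that two-step framework (NumPy assumed): step one selects a small, diverse pool to annotate from unlabeled data, with greedy farthest-point selection standing in for the paper's vote-k method; step two retrieves the nearest annotated examples as in-context demonstrations at test time. The random 2-D vectors are placeholders for real sentence embeddings.

import numpy as np

rng = np.random.default_rng(0)
unlabeled = rng.normal(size=(100, 2))  # toy embeddings of the unlabeled data

def select_diverse(x: np.ndarray, budget: int) -> list[int]:
    """Step 1: greedy farthest-point selection to cover the pool."""
    chosen = [0]
    for _ in range(budget - 1):
        # Distance from every point to its nearest already-chosen point.
        dists = np.min(np.linalg.norm(x[:, None] - x[chosen], axis=-1), axis=1)
        chosen.append(int(dists.argmax()))
    return chosen

pool = select_diverse(unlabeled, budget=10)  # annotate only these 10 examples

def retrieve_prompts(test_emb: np.ndarray, k: int = 3) -> list[int]:
    """Step 2: nearest annotated examples become in-context demonstrations."""
    dists = np.linalg.norm(unlabeled[pool] - test_emb, axis=1)
    return [pool[i] for i in dists.argsort()[:k]]

print(retrieve_prompts(rng.normal(size=2)))  # indices of retrieved examples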
