no code implementations • 26 Feb 2024 • Mikayel Samvelyan, Sharath Chandra Raparthy, Andrei Lupu, Eric Hambro, Aram H. Markosyan, Manish Bhatt, Yuning Mao, Minqi Jiang, Jack Parker-Holder, Jakob Foerster, Tim Rocktäschel, Roberta Raileanu
As large language models (LLMs) become increasingly prevalent across many real-world applications, understanding and enhancing their robustness to user inputs is of paramount importance.
1 code implementation • 7 Dec 2023 • Hakan Inan, Kartikeya Upasani, Jianfeng Chi, Rashi Rungta, Krithika Iyer, Yuning Mao, Michael Tontchev, Qing Hu, Brian Fuller, Davide Testuggine, Madian Khabsa
We introduce Llama Guard, an LLM-based input-output safeguard model geared towards Human-AI conversation use cases.
1 code implementation • 7 Dec 2023 • Jaehyung Kim, Yuning Mao, Rui Hou, Hanchao Yu, Davis Liang, Pascale Fung, Qifan Wang, Fuli Feng, Lifu Huang, Madian Khabsa
Under a unified evaluation of fine-tuned LMs by incorporating four representative perspectives of model robustness, we demonstrate the effectiveness of RoAST compared to state-of-the-art fine-tuning methods on six different types of LMs, which indicates its usefulness in practice.
no code implementations • 13 Nov 2023 • Suyu Ge, Chunting Zhou, Rui Hou, Madian Khabsa, Yi-Chia Wang, Qifan Wang, Jiawei Han, Yuning Mao
Specifically, an adversarial LLM and a target LLM interplay with each other in an iterative manner, where the adversarial LLM aims to generate challenging prompts that elicit unsafe responses from the target LLM, while the target LLM is fine-tuned with safety aligned data on these adversarial prompts.
16 code implementations • 18 Jul 2023 • Hugo Touvron, Louis Martin, Kevin Stone, Peter Albert, Amjad Almahairi, Yasmine Babaei, Nikolay Bashlykov, Soumya Batra, Prajjwal Bhargava, Shruti Bhosale, Dan Bikel, Lukas Blecher, Cristian Canton Ferrer, Moya Chen, Guillem Cucurull, David Esiobu, Jude Fernandes, Jeremy Fu, Wenyin Fu, Brian Fuller, Cynthia Gao, Vedanuj Goswami, Naman Goyal, Anthony Hartshorn, Saghar Hosseini, Rui Hou, Hakan Inan, Marcin Kardas, Viktor Kerkez, Madian Khabsa, Isabel Kloumann, Artem Korenev, Punit Singh Koura, Marie-Anne Lachaux, Thibaut Lavril, Jenya Lee, Diana Liskovich, Yinghai Lu, Yuning Mao, Xavier Martinet, Todor Mihaylov, Pushkar Mishra, Igor Molybog, Yixin Nie, Andrew Poulton, Jeremy Reizenstein, Rashi Rungta, Kalyan Saladi, Alan Schelten, Ruan Silva, Eric Michael Smith, Ranjan Subramanian, Xiaoqing Ellen Tan, Binh Tang, Ross Taylor, Adina Williams, Jian Xiang Kuan, Puxin Xu, Zheng Yan, Iliyan Zarov, Yuchen Zhang, Angela Fan, Melanie Kambadur, Sharan Narang, Aurelien Rodriguez, Robert Stojnic, Sergey Edunov, Thomas Scialom
In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters.
Ranked #2 on Question Answering on PubChemQA
5 code implementations • NeurIPS 2023 • Chunting Zhou, PengFei Liu, Puxin Xu, Srini Iyer, Jiao Sun, Yuning Mao, Xuezhe Ma, Avia Efrat, Ping Yu, Lili Yu, Susan Zhang, Gargi Ghosh, Mike Lewis, Luke Zettlemoyer, Omer Levy
Large language models are trained in two stages: (1) unsupervised pretraining from raw text, to learn general-purpose representations, and (2) large scale instruction tuning and reinforcement learning, to better align to end tasks and user preferences.
1 code implementation • 6 May 2023 • Anastasia Razdaibiedina, Yuning Mao, Rui Hou, Madian Khabsa, Mike Lewis, Jimmy Ba, Amjad Almahairi
In this work, we introduce Residual Prompt Tuning - a simple and efficient method that significantly improves the performance and stability of prompt tuning.
1 code implementation • 4 Feb 2023 • Yu Meng, Jitin Krishnan, Sinong Wang, Qifan Wang, Yuning Mao, Han Fang, Marjan Ghazvininejad, Jiawei Han, Luke Zettlemoyer
In this work, we offer a new perspective on the consequence of such a discrepancy: We demonstrate empirically and theoretically that MLM pretraining allocates some model dimensions exclusively for representing $\texttt{[MASK]}$ tokens, resulting in a representation deficiency for real tokens and limiting the pretrained model's expressiveness when it is adapted to downstream data without $\texttt{[MASK]}$ tokens.
2 code implementations • 29 Jan 2023 • Anastasia Razdaibiedina, Yuning Mao, Rui Hou, Madian Khabsa, Mike Lewis, Amjad Almahairi
We introduce Progressive Prompts - a simple and efficient approach for continual learning in language models.
2 code implementations • 25 Jan 2023 • Davis Liang, Hila Gonen, Yuning Mao, Rui Hou, Naman Goyal, Marjan Ghazvininejad, Luke Zettlemoyer, Madian Khabsa
Large multilingual language models typically rely on a single vocabulary shared across 100+ languages.
2 code implementations • 13 Oct 2022 • Ming Zhong, Yang Liu, Da Yin, Yuning Mao, Yizhu Jiao, PengFei Liu, Chenguang Zhu, Heng Ji, Jiawei Han
We re-frame NLG evaluation as a Boolean Question Answering (QA) task, and by guiding the model with different questions, we can use one evaluator to evaluate from multiple dimensions.
1 code implementation • 12 May 2022 • Yuning Mao, Ming Zhong, Jiawei Han
Scientific extreme summarization (TLDR) aims to form ultra-short summaries of scientific papers.
Ranked #1 on Extreme Summarization on CiteSum
2 code implementations • 29 Jan 2022 • Ming Zhong, Yang Liu, Suyu Ge, Yuning Mao, Yizhu Jiao, Xingxing Zhang, Yichong Xu, Chenguang Zhu, Michael Zeng, Jiawei Han
In this paper, we propose the first unsupervised multi-granularity summarization framework, GranuSum.
1 code implementation • ACL 2022 • Yuning Mao, Lambert Mathias, Rui Hou, Amjad Almahairi, Hao Ma, Jiawei Han, Wen-tau Yih, Madian Khabsa
Recent parameter-efficient language model tuning (PELT) methods manage to match the performance of fine-tuning with much fewer trainable parameters and perform especially well when training data is limited.
1 code implementation • NAACL 2022 • Yuxin Xiao, Zecheng Zhang, Yuning Mao, Carl Yang, Jiawei Han
Consequently, it is more challenging to encode the key information sources--relevant contexts and entity types.
Ranked #1 on Relation Extraction on CDR
1 code implementation • Findings (ACL) 2022 • Yiqing Xie, Jiaming Shen, Sha Li, Yuning Mao, Jiawei Han
Typical DocRE methods blindly take the full document as input, while a subset of the sentences in the document, noted as the evidence, are often sufficient for humans to predict the relation of an entity pair.
Ranked #5 on Relation Extraction on DocRED
2 code implementations • EMNLP 2021 • Yuning Mao, Wenchang Ma, Deren Lei, Jiawei Han, Xiang Ren
In this paper, we present a systematic analysis that studies whether current seq2seq models, especially pre-trained language models, are good enough for preserving important input concepts and to what extent explicitly guiding generation with the concepts as lexical constraints is beneficial.
1 code implementation • 6 Jan 2021 • Jieyu Zhang, Xiangchen Song, Ying Zeng, Jiaze Chen, Jiaming Shen, Yuning Mao, Lei LI
Previous approaches focus on the taxonomy expansion, i. e. finding an appropriate hypernym concept from the taxonomy for a new query concept.
1 code implementation • 1 Jan 2021 • Yuning Mao, Pengcheng He, Xiaodong Liu, Yelong Shen, Jianfeng Gao, Jiawei Han, Weizhu Chen
Current open-domain question answering systems often follow a Retriever-Reader architecture, where the retriever first retrieves relevant passages and the reader then reads the retrieved passages to form an answer.
2 code implementations • 24 Oct 2020 • Yuning Mao, Xiang Ren, Heng Ji, Jiawei Han
Despite significant progress, state-of-the-art abstractive summarization methods are still prone to hallucinate content inconsistent with the source document.
1 code implementation • EMNLP 2020 • Yuning Mao, Yanru Qu, Yiqing Xie, Xiang Ren, Jiawei Han
Additionally, the explicit redundancy measure in MMR helps the neural representation of the summary to better capture redundancy.
1 code implementation • ACL 2021 • Yuning Mao, Pengcheng He, Xiaodong Liu, Yelong Shen, Jianfeng Gao, Jiawei Han, Weizhu Chen
We demonstrate that the generated contexts substantially enrich the semantics of the queries and GAR with sparse representations (BM25) achieves comparable or better performance than state-of-the-art dense retrieval methods such as DPR.
Ranked #9 on Passage Retrieval on Natural Questions
no code implementations • 24 Jun 2020 • Xin Luna Dong, Xiang He, Andrey Kan, Xi-An Li, Yan Liang, Jun Ma, Yifan Ethan Xu, Chenwei Zhang, Tong Zhao, Gabriel Blanco Saldana, Saurabh Deshpande, Alexandre Michetti Manduca, Jay Ren, Surender Pal Singh, Fan Xiao, Haw-Shiuan Chang, Giannis Karamanolakis, Yuning Mao, Yaqing Wang, Christos Faloutsos, Andrew McCallum, Jiawei Han
Can one build a knowledge graph (KG) for all products in the world?
no code implementations • 18 Jun 2020 • Yuning Mao, Tong Zhao, Andrey Kan, Chenwei Zhang, Xin Luna Dong, Christos Faloutsos, Jiawei Han
We propose to distantly train a sequence labeling model for term extraction and employ graph neural networks (GNNs) to capture the taxonomy structure as well as the query-item-taxonomy interactions for term attachment.
1 code implementation • EMNLP 2020 • Deren Lei, Gangrong Jiang, Xiaotao Gu, Kexuan Sun, Yuning Mao, Xiang Ren
Walk-based models have shown their advantages in knowledge graph (KG) reasoning by achieving decent performance while providing interpretable decisions.
2 code implementations • 26 Jan 2020 • Xiaotao Gu, Yuning Mao, Jiawei Han, Jialu Liu, Hongkun Yu, You Wu, Cong Yu, Daniel Finnie, Jiaqi Zhai, Nicholas Zukoski
In this work, we study the problem of generating representative headlines for news stories.
1 code implementation • IJCNLP 2019 • Yuning Mao, Jingjing Tian, Jiawei Han, Xiang Ren
While existing hierarchical text classification (HTC) methods attempt to capture label hierarchies for model training, they either make local decisions regarding each label or completely ignore the hierarchy information during inference.
Ranked #1 on Text Classification on RCV1 (Macro F1 metric)
1 code implementation • ACL 2020 • Yuning Mao, Liyuan Liu, Qi Zhu, Xiang Ren, Jiawei Han
In this paper, we present a facet-aware evaluation setup for better assessment of the information coverage in extracted summaries.
no code implementations • 27 Sep 2018 • Yuning Mao, Jingjing Tian, Jiawei Han, Xiang Ren
We present an end-to-end reinforcement learning approach to hierarchical text classification where documents are labeled by placing them at the right positions in a given hierarchy.
1 code implementation • ACL 2018 • Yuning Mao, Xiang Ren, Jiaming Shen, Xiaotao Gu, Jiawei Han
We present a novel end-to-end reinforcement learning approach to automatic taxonomy induction from a set of terms.