Search Results for author: Shengding Hu

Found 27 papers, 19 papers with code

LEGENT: Open Platform for Embodied Agents

no code implementations · 28 Apr 2024 · Zhili Cheng, Zhitong Wang, Jinyi Hu, Shengding Hu, An Liu, Yuge Tu, Pengkai Li, Lei Shi, Zhiyuan Liu, Maosong Sun

Despite advancements in Large Language Models (LLMs) and Large Multimodal Models (LMMs), their integration into language-grounded, human-like embodied agents remains incomplete, hindering complex real-life task performance in physical environments.

UltraEval: A Lightweight Platform for Flexible and Comprehensive Evaluation for LLMs

1 code implementation · 11 Apr 2024 · Chaoqun He, Renjie Luo, Shengding Hu, Yuanqian Zhao, Jie Zhou, Hanghao Wu, Jiajie Zhang, Xu Han, Zhiyuan Liu, Maosong Sun

The rapid development of LLMs calls for a lightweight and easy-to-use framework for swift evaluation deployment.

Unified View of Grokking, Double Descent and Emergent Abilities: A Perspective from Circuits Competition

no code implementations · 23 Feb 2024 · Yufei Huang, Shengding Hu, Xu Han, Zhiyuan Liu, Maosong Sun

Recent studies have uncovered intriguing phenomena in deep learning, such as grokking, double descent, and emergent abilities in large language models, which challenge human intuition and are crucial for a deeper understanding of neural models.

Memorization, Multi-Task Learning

$\infty$Bench: Extending Long Context Evaluation Beyond 100K Tokens

1 code implementation · 21 Feb 2024 · Xinrong Zhang, Yingfa Chen, Shengding Hu, Zihang Xu, JunHao Chen, Moo Khai Hao, Xu Han, Zhen Leng Thai, Shuo Wang, Zhiyuan Liu, Maosong Sun

Processing and reasoning over long contexts is crucial for many practical applications of Large Language Models (LLMs), such as document comprehension and agent construction.

ProSparse: Introducing and Enhancing Intrinsic Activation Sparsity within Large Language Models

1 code implementation · 21 Feb 2024 · Chenyang Song, Xu Han, Zhengyan Zhang, Shengding Hu, Xiyu Shi, Kuai Li, Chen Chen, Zhiyuan Liu, Guangli Li, Tao Yang, Maosong Sun

Some recent efforts have explored introducing ReLU or its variants as substitute activation functions to help LLMs achieve activation sparsity and inference acceleration, but few simultaneously obtain both high sparsity and comparable model performance.
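The intrinsic activation sparsity at stake here is easy to see in isolation: ReLU maps every negative pre-activation to an exact zero, while smooth activations such as GELU almost never produce exact zeros. A minimal sketch (illustrative plain Python, not the paper's code; the Gaussian inputs and function names are assumptions):

```python
import math
import random

def relu(x):
    return max(0.0, x)

def gelu(x):
    # tanh approximation of GELU
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))

def sparsity(activations):
    """Fraction of entries that are exactly zero."""
    return sum(1 for a in activations if a == 0.0) / len(activations)

random.seed(0)
pre_acts = [random.gauss(0.0, 1.0) for _ in range(10_000)]

relu_sparsity = sparsity([relu(x) for x in pre_acts])
gelu_sparsity = sparsity([gelu(x) for x in pre_acts])

print(f"ReLU sparsity: {relu_sparsity:.2f}")   # roughly 0.5 for zero-mean inputs
print(f"GELU sparsity: {gelu_sparsity:.2f}")   # essentially 0.0
```

Exact zeros are what make sparse inference kernels profitable, which is why the ReLU family is attractive despite GELU's smoother optimization behavior.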

OlympiadBench: A Challenging Benchmark for Promoting AGI with Olympiad-Level Bilingual Multimodal Scientific Problems

1 code implementation · 21 Feb 2024 · Chaoqun He, Renjie Luo, Yuzhuo Bai, Shengding Hu, Zhen Leng Thai, Junhao Shen, Jinyi Hu, Xu Han, Yujie Huang, Yuxiang Zhang, Jie Liu, Lei Qi, Zhiyuan Liu, Maosong Sun

Notably, the best-performing model, GPT-4V, attains an average score of 17.23% on OlympiadBench, with a mere 11.28% in physics, highlighting the benchmark's rigor and the intricacy of physical reasoning.

Logical Fallacies

OpenDelta: A Plug-and-play Library for Parameter-efficient Adaptation of Pre-trained Models

1 code implementation · 5 Jul 2023 · Shengding Hu, Ning Ding, Weilin Zhao, Xingtai Lv, Zhen Zhang, Zhiyuan Liu, Maosong Sun

The scale of large pre-trained models (PTMs) poses significant challenges in adapting to downstream tasks due to the high optimization overhead and storage costs associated with full-parameter fine-tuning.
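Parameter-efficient (delta) tuning of the kind OpenDelta supports can be sketched as a small trainable low-rank delta attached to a frozen linear layer, in the style of LoRA. This is an illustrative toy, not OpenDelta's actual API; `DeltaLinear`, `matvec`, and the initialization choices are assumptions:

```python
import random

random.seed(0)

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]

class DeltaLinear:
    """A frozen linear layer plus a trainable low-rank delta:
    y = W x + B (A x), where only A and B would be updated."""
    def __init__(self, W, rank=2):
        self.W = W                      # frozen pre-trained weights
        d_out, d_in = len(W), len(W[0])
        # B starts at zero so the delta is initially a no-op
        self.A = [[random.gauss(0, 0.02) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]

    def __call__(self, x):
        base = matvec(self.W, x)
        delta = matvec(self.B, matvec(self.A, x))
        return [b + d for b, d in zip(base, delta)]

W = [[1.0, 0.0], [0.0, 1.0]]            # toy 2x2 "pre-trained" weight
layer = DeltaLinear(W, rank=1)
print(layer([3.0, 4.0]))                # delta starts at zero, so output == W x
```

Because B starts at zero, the adapted layer initially reproduces the frozen model exactly; training would update only A and B, a tiny fraction of the full parameter count, which is the storage and optimization saving the abstract refers to.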

Won't Get Fooled Again: Answering Questions with False Premises

1 code implementation · 5 Jul 2023 · Shengding Hu, Yifan Luo, Huadong Wang, Xingyi Cheng, Zhiyuan Liu, Maosong Sun

In this paper, we find that the PLMs already possess the knowledge required to rebut such questions, and the key is how to activate the knowledge.

Question Answering

Exploring the Impact of Model Scaling on Parameter-Efficient Tuning

1 code implementation · 4 Jun 2023 · Yusheng Su, Chi-Min Chan, Jiali Cheng, Yujia Qin, Yankai Lin, Shengding Hu, Zonghan Yang, Ning Ding, Xingzhi Sun, Guotong Xie, Zhiyuan Liu, Maosong Sun

Our investigations reveal that model scaling (1) mitigates the effects of the positions of tunable parameters on performance, and (2) enables tuning methods to achieve performance comparable to full-parameter fine-tuning by optimizing fewer tunable parameters.

Exploring Lottery Prompts for Pre-trained Language Models

no code implementations · 31 May 2023 · Yulin Chen, Ning Ding, Xiaobin Wang, Shengding Hu, Hai-Tao Zheng, Zhiyuan Liu, Pengjun Xie

Consistently scaling pre-trained language models (PLMs) imposes substantial burdens on model adaptation, necessitating more efficient alternatives to conventional fine-tuning.

Enhancing Chat Language Models by Scaling High-quality Instructional Conversations

1 code implementation · 23 May 2023 · Ning Ding, Yulin Chen, Bokai Xu, Yujia Qin, Zhi Zheng, Shengding Hu, Zhiyuan Liu, Maosong Sun, Bowen Zhou

Fine-tuning on instruction data has been widely validated as an effective practice for implementing chat language models like ChatGPT.

Sparse Structure Search for Delta Tuning

1 code implementation · NeurIPS 2022 · Shengding Hu, Zhen Zhang, Ning Ding, Yadao Wang, Yasheng Wang, Zhiyuan Liu, Maosong Sun

Generally, delta-tuning (DT) methods carefully design delta modules (DT modules) that can be applied to arbitrary fine-grained positions inside PTMs.

Sparse Structure Search for Parameter-Efficient Tuning

no code implementations · 15 Jun 2022 · Shengding Hu, Zhen Zhang, Ning Ding, Yadao Wang, Yasheng Wang, Zhiyuan Liu, Maosong Sun

The searched structures preserve more than 99% of full fine-tuning performance with only 0.01% trainable parameters.

Prototypical Verbalizer for Prompt-based Few-shot Tuning

1 code implementation · ACL 2022 · Ganqu Cui, Shengding Hu, Ning Ding, Longtao Huang, Zhiyuan Liu

However, manual verbalizers depend heavily on domain-specific prior knowledge and human effort, while finding appropriate label words automatically remains challenging. In this work, we propose the prototypical verbalizer (ProtoVerb), which is built directly from training data.

Contrastive Learning, Entity Typing +2
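The prototype idea behind ProtoVerb can be sketched with plain vectors: average the embeddings of the training examples for each class, then classify a new embedding by cosine similarity to the nearest prototype. A simplified illustration (the paper learns prototypes with a contrastive objective rather than a plain mean; all names and vectors here are assumptions):

```python
import math
from collections import defaultdict

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def build_prototypes(embedded_examples):
    """One prototype per label: the mean of that label's example embeddings."""
    grouped = defaultdict(list)
    for label, vec in embedded_examples:
        grouped[label].append(vec)
    return {
        label: [sum(col) / len(vecs) for col in zip(*vecs)]
        for label, vecs in grouped.items()
    }

def classify(vec, prototypes):
    """Predict the label whose prototype is most similar to vec."""
    return max(prototypes, key=lambda lbl: cosine(vec, prototypes[lbl]))

# Toy 2-d "embeddings" of few-shot training examples
train = [("pos", [0.9, 0.1]), ("pos", [0.8, 0.2]),
         ("neg", [0.1, 0.9]), ("neg", [0.2, 0.8])]
protos = build_prototypes(train)
print(classify([0.7, 0.3], protos))   # → pos
```

The appeal in the few-shot setting is that this needs no hand-picked label words at all, only the handful of labeled examples already available.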

OpenPrompt: An Open-source Framework for Prompt-learning

2 code implementations · ACL 2022 · Ning Ding, Shengding Hu, Weilin Zhao, Yulin Chen, Zhiyuan Liu, Hai-Tao Zheng, Maosong Sun

Prompt-learning has become a new paradigm in modern natural language processing, which directly adapts pre-trained language models (PLMs) to cloze-style prediction, autoregressive modeling, or sequence-to-sequence generation, resulting in promising performance on various tasks.
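The cloze-style adaptation described above can be sketched as two pieces: a template that wraps the input around a mask slot, and a verbalizer that maps the word predicted at that slot back to a class label. An illustrative sketch, not OpenPrompt's actual API; the template, label words, and function names are assumptions:

```python
# A template wraps the input around a mask slot; a verbalizer maps the
# model's predicted word at that slot back to a task label.
TEMPLATE = "{text} Overall, it was a {mask} movie."
VERBALIZER = {"great": "positive", "terrible": "negative"}

def wrap(text, mask_token="[MASK]"):
    """Turn a raw input into a cloze prompt for a masked language model."""
    return TEMPLATE.format(text=text, mask=mask_token)

def verbalize(predicted_word):
    """Map the word predicted at the mask position to a task label."""
    return VERBALIZER.get(predicted_word, "unknown")

prompt = wrap("I laughed from start to finish.")
print(prompt)                 # → I laughed from start to finish. Overall, it was a [MASK] movie.
# Suppose a masked LM fills the slot with "great":
print(verbalize("great"))     # → positive
```

In a real prompt-learning pipeline the missing middle step, scoring candidate label words at the mask position, is done by the PLM itself; the template and verbalizer are the parts a practitioner designs.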

Graph Policy Network for Transferable Active Learning on Graphs

1 code implementation · NeurIPS 2020 · Shengding Hu, Zheng Xiong, Meng Qu, Xingdi Yuan, Marc-Alexandre Côté, Zhiyuan Liu, Jian Tang

Graph neural networks (GNNs) have attracted increasing attention due to their simplicity and effectiveness in a variety of fields.

Active Learning

KACC: A Multi-task Benchmark for Knowledge Abstraction, Concretization and Completion

1 code implementation · Findings (ACL) 2021 · Jie Zhou, Shengding Hu, Xin Lv, Cheng Yang, Zhiyuan Liu, Wei Xu, Jie Jiang, Juanzi Li, Maosong Sun

Based on the datasets, we propose novel tasks such as multi-hop knowledge abstraction (MKA), multi-hop knowledge concretization (MKC) and then design a comprehensive benchmark.

Knowledge Graphs, Transfer Learning

Transfer Active Learning For Graph Neural Networks

no code implementations · 25 Sep 2019 · Shengding Hu, Meng Qu, Zhiyuan Liu, Jian Tang

Moreover, we study how to learn a universal policy for labeling nodes on graphs from multiple training graphs and then transfer the learned policy to unseen graphs.

Active Learning, Node Classification +1

Graph Neural Networks: A Review of Methods and Applications

5 code implementations · 20 Dec 2018 · Jie Zhou, Ganqu Cui, Shengding Hu, Zhengyan Zhang, Cheng Yang, Zhiyuan Liu, LiFeng Wang, Changcheng Li, Maosong Sun

Lots of learning tasks require dealing with graph data which contains rich relation information among elements.

Graph Attention
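The neighborhood-aggregation step at the heart of most GNN variants surveyed in the review can be sketched in a few lines: each node updates its feature from its own and its neighbours' features. A minimal mean-aggregation illustration (the toy graph and names are assumptions; real GNNs add learned weights and nonlinearities):

```python
def message_passing_step(features, adjacency):
    """One round of mean-aggregation message passing: each node's new
    feature is the average of its own and its neighbours' features."""
    new_features = {}
    for node, feat in features.items():
        neighbourhood = [feat] + [features[n] for n in adjacency[node]]
        new_features[node] = [
            sum(col) / len(neighbourhood) for col in zip(*neighbourhood)
        ]
    return new_features

# Toy graph: a path 0 - 1 - 2, with a 1-d feature per node
adjacency = {0: [1], 1: [0, 2], 2: [1]}
features = {0: [1.0], 1: [0.0], 2: [1.0]}

print(message_passing_step(features, adjacency))
# → {0: [0.5], 1: [0.6666666666666666], 2: [0.5]}
```

Stacking several such rounds lets information propagate multiple hops, which is what gives GNNs their ability to exploit the relation information the abstract mentions.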
