Search Results for author: Su Zhu

Found 32 papers, 11 papers with code

SciDFM: A Large Language Model with Mixture-of-Experts for Science

no code implementations 27 Sep 2024 Liangtai Sun, Danyu Luo, Da Ma, Zihan Zhao, Baocai Chen, Zhennan Shen, Su Zhu, Lu Chen, Xin Chen, Kai Yu

We further analyze the expert layers and show that the results of expert selection vary with data from different disciplines.

Language Modelling Large Language Model +1

Evolving Subnetwork Training for Large Language Models

no code implementations 11 Jun 2024 Hanqi Li, Lu Chen, Da Ma, Zijian Wu, Su Zhu, Kai Yu

In this paper, inspired by the redundancy in the parameters of large language models, we propose a novel training paradigm: Evolving Subnetwork Training (EST).

Language Modelling Large Language Model

Sparsity-Accelerated Training for Large Language Models

no code implementations 3 Jun 2024 Da Ma, Lu Chen, Pengyu Wang, Hongshen Xu, Hanqi Li, Liangtai Sun, Su Zhu, Shuai Fan, Kai Yu

Large language models (LLMs) have demonstrated proficiency across various natural language processing (NLP) tasks but often require additional training, such as continual pre-training and supervised fine-tuning.

A BiRGAT Model for Multi-intent Spoken Language Understanding with Hierarchical Semantic Frames

1 code implementation 28 Feb 2024 Hongshen Xu, Ruisheng Cao, Su Zhu, Sheng Jiang, Hanchong Zhang, Lu Chen, Kai Yu

Previous work on spoken language understanding (SLU) mainly focuses on single-intent settings, where each input utterance merely contains one user intent.

Decoder Graph Attention +1

ChemDFM: A Large Language Foundation Model for Chemistry

1 code implementation 26 Jan 2024 Zihan Zhao, Da Ma, Lu Chen, Liangtai Sun, Zihao Li, Yi Xia, Bo Chen, Hongshen Xu, Zichen Zhu, Su Zhu, Shuai Fan, Guodong Shen, Kai Yu, Xin Chen

In its utmost form, such a generalist AI chemist could be referred to as Chemical General Intelligence.

On the Structural Generalization in Text-to-SQL

no code implementations 12 Jan 2023 Jieyu Li, Lu Chen, Ruisheng Cao, Su Zhu, Hongshen Xu, Zhi Chen, Hanchong Zhang, Kai Yu

Exploring the generalization of a text-to-SQL parser is essential for a system to automatically adapt to real-world databases.

Diversity Text-To-SQL

Few-Shot NLU with Vector Projection Distance and Abstract Triangular CRF

no code implementations 9 Dec 2021 Su Zhu, Lu Chen, Ruisheng Cao, Zhi Chen, Qingliang Miao, Kai Yu

In this paper, we propose to improve prototypical networks with vector projection distance and abstract triangular Conditional Random Field (CRF) for the few-shot NLU.

Intent Classification +5

ShadowGNN: Graph Projection Neural Network for Text-to-SQL Parser

no code implementations NAACL 2021 Zhi Chen, Lu Chen, Yanbin Zhao, Ruisheng Cao, Zihan Xu, Su Zhu, Kai Yu

Given a database schema, Text-to-SQL aims to translate a natural language question into the corresponding SQL query.

Decoder Text-To-SQL

LET: Linguistic Knowledge Enhanced Graph Transformer for Chinese Short Text Matching

1 code implementation 25 Feb 2021 Boer Lyu, Lu Chen, Su Zhu, Kai Yu

Additionally, we adopt the word lattice graph as input to maintain multi-granularity information.

Text Matching

CREDIT: Coarse-to-Fine Sequence Generation for Dialogue State Tracking

no code implementations 22 Sep 2020 Zhi Chen, Lu Chen, Zihan Xu, Yanbin Zhao, Su Zhu, Kai Yu

In dialogue systems, a dialogue state tracker aims to accurately find a compact representation of the current dialogue status, based on the entire dialogue history.

Dialogue State Tracking

Dual Learning for Dialogue State Tracking

no code implementations 22 Sep 2020 Zhi Chen, Lu Chen, Yanbin Zhao, Su Zhu, Kai Yu

In task-oriented multi-turn dialogue systems, dialogue state refers to a compact representation of the user goal in the context of dialogue history.

Dialogue State Tracking Sentence

Vector Projection Network for Few-shot Slot Tagging in Natural Language Understanding

1 code implementation 21 Sep 2020 Su Zhu, Ruisheng Cao, Lu Chen, Kai Yu

Few-shot slot tagging becomes appealing for rapid domain transfer and adaptation, motivated by the tremendous development of conversational dialogue systems.

Few-Shot Learning Natural Language Understanding +2

Jointly Encoding Word Confusion Network and Dialogue Context with BERT for Spoken Language Understanding

1 code implementation 24 May 2020 Chen Liu, Su Zhu, Zijian Zhao, Ruisheng Cao, Lu Chen, Kai Yu

In this paper, a novel BERT based SLU model (WCN-BERT SLU) is proposed to encode WCNs and the dialogue context jointly.

Spoken Language Understanding

Dual Learning for Semi-Supervised Natural Language Understanding

2 code implementations 26 Apr 2020 Su Zhu, Ruisheng Cao, Kai Yu

The framework is composed of dual pseudo-labeling and dual learning method, which enables an NLU model to make full use of data (labeled and unlabeled) through a closed-loop of the primal and dual tasks.

Natural Language Understanding Sentence

A Hierarchical Decoding Model For Spoken Language Understanding From Unaligned Data

1 code implementation 9 Apr 2019 Zijian Zhao, Su Zhu, Kai Yu

In the paper, we focus on spoken language understanding from unaligned data whose annotation is a set of act-slot-value triples.

Spoken Language Understanding

Concept Transfer Learning for Adaptive Language Understanding

no code implementations WS 2018 Su Zhu, Kai Yu

Concept definition is important in language understanding (LU) adaptation since literal definition difference can easily lead to data sparsity even if different data sets are actually semantically correlated.

Domain Adaptation Transfer Learning

Encoder-decoder with Focus-mechanism for Sequence Labelling Based Spoken Language Understanding

no code implementations 6 Aug 2016 Su Zhu, Kai Yu

This paper investigates the framework of encoder-decoder with attention for sequence labelling based spoken language understanding.

Decoder speech-recognition +2
