Search Results for author: Zhi-Hong Deng

Found 40 papers, 23 papers with code

Empowering Large Language Model Agents through Action Learning

1 code implementation 24 Feb 2024 Haiteng Zhao, Chang Ma, Guoyin Wang, Jing Su, Lingpeng Kong, Jingjing Xu, Zhi-Hong Deng, Hongxia Yang

Large Language Model (LLM) Agents have recently garnered increasing interest yet they are limited in their ability to learn from trial and error, a key element of intelligent behavior.

Language Modelling Large Language Model

Revisiting the Parameter Efficiency of Adapters from the Perspective of Precision Redundancy

1 code implementation ICCV 2023 Shibo Jie, Haoqing Wang, Zhi-Hong Deng

Current state-of-the-art results in computer vision depend in part on fine-tuning large pre-trained vision models.

Quantization

Dual-Alignment Pre-training for Cross-lingual Sentence Embedding

1 code implementation 16 May 2023 Ziheng Li, Shaohan Huang, Zihan Zhang, Zhi-Hong Deng, Qiang Lou, Haizhen Huang, Jian Jiao, Furu Wei, Weiwei Deng, Qi Zhang

Recent studies have shown that dual encoder models trained with the sentence-level translation ranking task are effective methods for cross-lingual sentence embedding.

Language Modelling Sentence +3

Masked Image Modeling with Local Multi-Scale Reconstruction

1 code implementation CVPR 2023 Haoqing Wang, Yehui Tang, Yunhe Wang, Jianyuan Guo, Zhi-Hong Deng, Kai Han

The lower layers are not explicitly guided and the interaction among their patches is only used for calculating new activations.

Representation Learning

Are More Layers Beneficial to Graph Transformers?

1 code implementation 1 Mar 2023 Haiteng Zhao, Shuming Ma, Dongdong Zhang, Zhi-Hong Deng, Furu Wei

Although going deep has proven successful in many neural architectures, existing graph transformers are relatively shallow.

Detachedly Learn a Classifier for Class-Incremental Learning

no code implementations 23 Feb 2023 Ziheng Li, Shibo Jie, Zhi-Hong Deng

In continual learning, a model needs to continually learn a feature extractor and classifier on a sequence of tasks.

Class Incremental Learning Incremental Learning

FacT: Factor-Tuning for Lightweight Adaptation on Vision Transformer

1 code implementation 6 Dec 2022 Shibo Jie, Zhi-Hong Deng

Recent work has explored the potential to adapt a pre-trained vision transformer (ViT) by updating only a few parameters so as to improve storage efficiency, called parameter-efficient transfer learning (PETL).

8k Transfer Learning

Contrastive Prototypical Network with Wasserstein Confidence Penalty

1 code implementation European Conference on Computer Vision 2022 Haoqing Wang, Zhi-Hong Deng

This results in our CPN (Contrastive Prototypical Network) model, which combines the prototypical loss with pairwise contrast and outperforms the existing models from this paradigm with a modestly large batch size.

Contrastive Learning Inductive Bias +2

Certified Robustness Against Natural Language Attacks by Causal Intervention

1 code implementation 24 May 2022 Haiteng Zhao, Chang Ma, Xinshuai Dong, Anh Tuan Luu, Zhi-Hong Deng, Hanwang Zhang

Deep learning models have achieved great success in many fields, yet they are vulnerable to adversarial examples.

Bypassing Logits Bias in Online Class-Incremental Learning with a Generative Framework

no code implementations 19 May 2022 Gehui Shen, Shibo Jie, Ziheng Li, Zhi-Hong Deng

In our framework, a generative classifier which utilizes replay memory is used for inference, and the training objective is a pair-based metric learning loss which is proven theoretically to optimize the feature space in a generative way.

Class Incremental Learning Incremental Learning +1

Alleviating Representational Shift for Continual Fine-tuning

1 code implementation 22 Apr 2022 Shibo Jie, Zhi-Hong Deng, Ziheng Li

We study a practical setting of continual learning: fine-tuning on a pre-trained model continually.

Continual Learning

Domain Adaptation via Maximizing Surrogate Mutual Information

1 code implementation 23 Oct 2021 Haiteng Zhao, Chang Ma, Qinyu Chen, Zhi-Hong Deng

In the framework, a surrogate joint distribution models the underlying joint distribution of the unlabeled target domain.

Transfer Learning Unsupervised Domain Adaptation

What Makes for Good Representations for Contrastive Learning

no code implementations 29 Sep 2021 Haoqing Wang, Xun Guo, Zhi-Hong Deng, Yan Lu

Therefore, we assume the task-relevant information that is not shared between views cannot be ignored and theoretically prove that the minimal sufficient representation in contrastive learning is not sufficient for the downstream tasks, which causes performance degradation.

Contrastive Learning Representation Learning

Cross-Domain Few-Shot Classification via Adversarial Task Augmentation

1 code implementation 29 Apr 2021 Haoqing Wang, Zhi-Hong Deng

However, when there is a domain shift between the training tasks and the test tasks, the obtained inductive bias fails to generalize across domains, which degrades the performance of the meta-learning models.

Classification Cross-Domain Few-Shot +4

Few-shot Learning with LSSVM Base Learner and Transductive Modules

1 code implementation 12 Sep 2020 Haoqing Wang, Zhi-Hong Deng

The performance of meta-learning approaches for few-shot learning generally depends on three aspects: features suitable for comparison, the classifier (base learner) suitable for low-data scenarios, and valuable information from the samples to classify.

Few-Shot Learning

Self-Supervised Learning Aided Class-Incremental Lifelong Learning

no code implementations 10 Jun 2020 Song Zhang, Gehui Shen, Jinsong Huang, Zhi-Hong Deng

Lifelong or continual learning remains a challenge for artificial neural networks, as they are required to be both stable for preservation of old knowledge and plastic for acquisition of new knowledge.

Class Incremental Learning Incremental Learning +1

Generative Feature Replay with Orthogonal Weight Modification for Continual Learning

no code implementations 7 May 2020 Gehui Shen, Song Zhang, Xiang Chen, Zhi-Hong Deng

For this scenario, generative replay is a promising strategy which generates and replays pseudo data for previous tasks to alleviate catastrophic forgetting.

Class Incremental Learning Incremental Learning

Fast Structured Decoding for Sequence Models

1 code implementation NeurIPS 2019 Zhiqing Sun, Zhuohan Li, Haoqing Wang, Zi Lin, Di He, Zhi-Hong Deng

However, these models assume that the decoding process of each token is conditionally independent of others.

Machine Translation Sentence +1

Neural Consciousness Flow

1 code implementation 30 May 2019 Xiaoran Xu, Wei Feng, Zhiqing Sun, Zhi-Hong Deng

Instead, inspired by the consciousness prior proposed by Yoshua Bengio, we explore reasoning with the notion of attentive awareness from a cognitive perspective, and formulate it in the form of attentive message passing on graphs, called neural consciousness flow (NeuCFlow).

Decision Making Knowledge Base Completion

Leap-LSTM: Enhancing Long Short-Term Memory for Text Categorization

1 code implementation 28 May 2019 Ting Huang, Gehui Shen, Zhi-Hong Deng

Compared to previous models which can also skip words, our model achieves better trade-offs between performance and efficiency.

General Classification Machine Translation +5

DivGraphPointer: A Graph Pointer Network for Extracting Diverse Keyphrases

no code implementations 19 May 2019 Zhiqing Sun, Jian Tang, Pan Du, Zhi-Hong Deng, Jian-Yun Nie

Furthermore, we propose a diversified point network to generate a set of diverse keyphrases out of the word graph in the decoding process.

Document Summarization Information Retrieval +2

DeepCF: A Unified Framework of Representation Learning and Matching Function Learning in Recommender System

2 code implementations 15 Jan 2019 Zhi-Hong Deng, Ling Huang, Chang-Dong Wang, Jian-Huang Lai, Philip S. Yu

To solve this problem, many methods have been studied, which can be generally categorized into two types, i.e., representation learning-based CF methods and matching function learning-based CF methods.

Collaborative Filtering Recommendation Systems +1

Unsupervised Neural Word Segmentation for Chinese via Segmental Language Modeling

1 code implementation EMNLP 2018 Zhiqing Sun, Zhi-Hong Deng

As far as we know, we are the first to propose a neural model for unsupervised CWS and achieve performance competitive with the state-of-the-art statistical models on four different datasets from the SIGHAN 2005 bakeoff.

Chinese Word Segmentation Language Modelling +1

Learning to Compose over Tree Structures via POS Tags

no code implementations 18 Aug 2018 Gehui Shen, Zhi-Hong Deng, Ting Huang, Xi Chen

Recursive Neural Network (RecNN), a type of model that composes words or phrases recursively over syntactic tree structures, has been proven to have a superior ability to obtain sentence representations for a variety of NLP tasks.

POS Semantic Composition +3

MEMD: A Diversity-Promoting Learning Framework for Short-Text Conversation

no code implementations COLING 2018 Meng Zou, Xihan Li, Haokun Liu, Zhi-Hong Deng

Neural encoder-decoder models have been widely applied to conversational response generation, which has been a research hot spot in recent years.

Conversational Response Generation Response Generation +1

A Novel Framework for Recurrent Neural Networks with Enhancing Information Processing and Transmission between Units

no code implementations 2 Jun 2018 Xi Chen, Zhi-Hong Deng, Gehui Shen, Ting Huang

This paper proposes a novel framework for recurrent neural networks (RNNs), inspired by the human memory models in the field of cognitive neuroscience, to enhance information processing and transmission between adjacent RNN units.

General Classification Image Classification +3

A Gap-Based Framework for Chinese Word Segmentation via Very Deep Convolutional Networks

no code implementations 27 Dec 2017 Zhiqing Sun, Gehui Shen, Zhi-Hong Deng

However, if we consider segmenting a given sentence, the most intuitive idea is to predict whether to segment for each gap between two consecutive characters, which in comparison makes previous approaches seem too complex.

Chinese Word Segmentation Sentence

An Unsupervised Multi-Document Summarization Framework Based on Neural Document Model

no code implementations COLING 2016 Shulei Ma, Zhi-Hong Deng, Yunlun Yang

In the age of information explosion, multi-document summarization is attracting particular attention for its ability to help people get the main ideas in a short time.

Clustering Document Summarization +3
