Search Results for author: Weiran Huang

Found 35 papers, 14 papers with code

Can I understand what I create? Self-Knowledge Evaluation of Large Language Models

no code implementations • 10 Jun 2024 • Zhiquan Tan, Lai Wei, Jindong Wang, Xing Xie, Weiran Huang

Large language models (LLMs) have achieved remarkable progress in linguistic tasks, necessitating robust evaluation frameworks to understand their capabilities and limitations.

Math

A Statistical Theory of Regularization-Based Continual Learning

no code implementations • 10 Jun 2024 • Xuyang Zhao, Huiyuan Wang, Weiran Huang, Wei Lin

Moreover, the estimation error of the optimal algorithm is derived explicitly, which is of the same order as that of the oracle estimator.

Continual Learning, Regression +1

Unveiling the Dynamics of Information Interplay in Supervised Learning

no code implementations • 6 Jun 2024 • Kun Song, Zhiquan Tan, Bochao Zou, Huimin Ma, Weiran Huang

In this paper, we use matrix information theory as an analytical tool to analyze the dynamics of the information interplay between data representations and classification head vectors in the supervised learning process.

Linear Mode Connectivity

Provable Contrastive Continual Learning

no code implementations • 29 May 2024 • Yichen Wen, Zhiquan Tan, Kaipeng Zheng, Chuanlong Xie, Weiran Huang

In this work, we fill this gap by establishing theoretical performance guarantees, which reveal how the performance of the model is bounded by training losses of previous tasks in the contrastive continual learning framework.

Continual Learning

BreakGPT: A Large Language Model with Multi-stage Structure for Financial Breakout Detection

1 code implementation • 12 Feb 2024 • Kang Zhang, Osamu Yoshie, Weiran Huang

To address these issues, we introduce BreakGPT, the first large language model for financial breakout detection.

Language Modelling, Large Language Model

The Information of Large Language Model Geometry

no code implementations • 1 Feb 2024 • Zhiquan Tan, Chenghai Li, Weiran Huang

This paper investigates the information encoded in the embeddings of large language models (LLMs).

Language Modelling, Large Language Model +1

Large Language Model Evaluation via Matrix Entropy

1 code implementation • 30 Jan 2024 • Lai Wei, Zhiquan Tan, Chenghai Li, Jindong Wang, Weiran Huang

Large language models (LLMs) have revolutionized the field of natural language processing, extending their strong capabilities into multi-modal domains.

Data Compression, Language Modelling +1
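The matrix-entropy idea named in this paper's title can be sketched numerically: build the unit-trace Gram/covariance matrix of a batch of embeddings and take the von Neumann entropy of its spectrum. The sketch below is illustrative (random vectors stand in for real LLM embeddings), not the paper's released implementation:

```python
import numpy as np

def matrix_entropy(embeddings: np.ndarray) -> float:
    """Von Neumann entropy of the unit-trace covariance of row-normalized embeddings."""
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    cov = z.T @ z                  # (d, d) Gram-style covariance
    rho = cov / np.trace(cov)      # scale to unit trace ("density matrix")
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-12]         # drop numerical zeros
    return float(-np.sum(lam * np.log(lam)))

rng = np.random.default_rng(0)
# Near-isotropic embeddings carry high entropy; a rank-1 (collapsed) batch near zero.
spread = matrix_entropy(rng.normal(size=(256, 64)))
collapsed = matrix_entropy(np.tile(rng.normal(size=(1, 64)), (256, 1)))
print(spread, collapsed)
```

Intuitively, higher entropy means the embeddings spread information across more directions; the entropy is capped at log d for d-dimensional embeddings.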

TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones

2 code implementations • 28 Dec 2023 • Zhengqing Yuan, Zhaoxu Li, Weiran Huang, Yanfang Ye, Lichao Sun

In recent years, multimodal large language models (MLLMs) such as GPT-4V have demonstrated remarkable advancements, excelling in a variety of vision-language tasks.

Computational Efficiency, Image Captioning +6

AutoEval-Video: An Automatic Benchmark for Assessing Large Vision Language Models in Open-Ended Video Question Answering

1 code implementation • 25 Nov 2023 • Xiuyuan Chen, Yuan Lin, Yuchen Zhang, Weiran Huang

By using instance-specific rules as prompts, GPT-4, as an automatic evaluator, can achieve a stable evaluation accuracy of around 97.0%, comparable to the 94.9%-97.5% accuracy of a human evaluator.

Question Answering, Video Question Answering

Understanding Grokking Through A Robustness Viewpoint

no code implementations • 11 Nov 2023 • Zhiquan Tan, Weiran Huang

Recently, an interesting phenomenon called grokking has gained much attention, where generalization occurs long after the models have initially overfitted the training data.

OTMatch: Improving Semi-Supervised Learning with Optimal Transport

no code implementations • 26 Oct 2023 • Zhiquan Tan, Kaipeng Zheng, Weiran Huang

Semi-supervised learning has made remarkable strides by effectively utilizing a limited amount of labeled data while capitalizing on the abundant information present in unlabeled data.

Information Flow in Self-Supervised Learning

2 code implementations • 29 Sep 2023 • Zhiquan Tan, Jingqin Yang, Weiran Huang, Yang Yuan, Yifan Zhang

In this paper, we conduct a comprehensive analysis of two dual-branch (Siamese architecture) self-supervised learning approaches, namely Barlow Twins and spectral contrastive learning, through the lens of matrix mutual information.

Contrastive Learning, Self-Supervised Learning

InstructionGPT-4: A 200-Instruction Paradigm for Fine-Tuning MiniGPT-4

3 code implementations • 23 Aug 2023 • Lai Wei, Zihao Jiang, Weiran Huang, Lichao Sun

To achieve this, we first propose several metrics to assess the quality of multimodal instruction data.

Instruction Following, Question Answering +1

Rethinking Weak Supervision in Helping Contrastive Learning

no code implementations • 7 Jun 2023 • Jingyi Cui, Weiran Huang, Yifei Wang, Yisen Wang

Therefore, to explore the mechanical differences between semi-supervised and noisy-labeled information in helping contrastive learning, we establish a unified theoretical framework of contrastive learning under weak supervision.

Contrastive Learning, Denoising +1

Matrix Information Theory for Self-Supervised Learning

3 code implementations • 27 May 2023 • Yifan Zhang, Zhiquan Tan, Jingqin Yang, Weiran Huang, Yang Yuan

Inspired by this framework, we introduce Matrix-SSL, a novel approach that leverages matrix information theory to interpret the maximum entropy encoding loss as matrix uniformity loss.

Contrastive Learning, GSM8K +5

ArCL: Enhancing Contrastive Learning with Augmentation-Robust Representations

no code implementations • 2 Mar 2023 • Xuyang Zhao, Tianqi Du, Yisen Wang, Jun Yao, Weiran Huang

Moreover, we show that contrastive learning fails to learn domain-invariant features, which limits its transferability.

Contrastive Learning, Data Augmentation +1

Your Contrastive Learning Is Secretly Doing Stochastic Neighbor Embedding

2 code implementations • 30 May 2022 • Tianyang Hu, Zhili Liu, Fengwei Zhou, Wenjia Wang, Weiran Huang

Contrastive learning, especially self-supervised contrastive learning (SSCL), has achieved great success in extracting powerful features from unlabeled data.

Contrastive Learning, Data Augmentation +2

Towards the Generalization of Contrastive Self-Supervised Learning

1 code implementation • 1 Nov 2021 • Weiran Huang, Mingyang Yi, Xuyang Zhao, Zihao Jiang

It reveals that the generalization ability of contrastive self-supervised learning is related to three key factors: alignment of positive samples, divergence of class centers, and concentration of augmented data.

Contrastive Learning, Data Augmentation +1
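The three factors this paper identifies can each be measured directly on learned representations; here is a minimal numeric sketch, with Gaussian clusters standing in for learned features (the function names and synthetic data are illustrative, not from the paper):

```python
import numpy as np

def alignment(z1: np.ndarray, z2: np.ndarray) -> float:
    """Mean squared distance between normalized positive-pair embeddings (lower = better aligned)."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    return float(np.mean(np.sum((z1 - z2) ** 2, axis=1)))

def center_divergence(z: np.ndarray, labels: np.ndarray) -> float:
    """Minimum pairwise distance between class centers (higher = better separated)."""
    centers = np.stack([z[labels == c].mean(axis=0) for c in np.unique(labels)])
    dists = np.linalg.norm(centers[:, None] - centers[None, :], axis=-1)
    return float(dists[~np.eye(len(centers), dtype=bool)].min())

rng = np.random.default_rng(0)
labels = rng.integers(0, 4, size=400)
centers = rng.normal(scale=5.0, size=(4, 32))
z = centers[labels] + rng.normal(scale=0.1, size=(400, 32))   # anchor views
z_aug = z + rng.normal(scale=0.05, size=z.shape)              # augmented views

a = alignment(z, z_aug)
d = center_divergence(z, labels)
print("alignment:", a, "divergence:", d)
```

The third factor, concentration of augmented data, is a property of the augmentation distribution itself: views of the same sample should stay close, which is what drives the alignment term down.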

Can Pretext-Based Self-Supervised Learning Be Boosted by Downstream Data? A Theoretical Analysis

no code implementations • 5 Mar 2021 • Jiaye Teng, Weiran Huang, Haowei He

Pretext-based self-supervised learning learns the semantic representation via a handcrafted pretext task over unlabeled data and then uses the learned representation for downstream tasks, which effectively reduces the sample complexity of downstream tasks under the Conditional Independence (CI) condition.

Self-Supervised Learning

Should Graph Convolution Trust Neighbors? A Simple Causal Inference Method

1 code implementation • 22 Oct 2020 • Fuli Feng, Weiran Huang, Xiangnan He, Xin Xin, Qifan Wang, Tat-Seng Chua

To this end, we analyze the working mechanism of GCN with a causal graph, estimating the causal effect of a node's local structure on the prediction.

Blocking, Causal Inference +4

New Interpretations of Normalization Methods in Deep Learning

no code implementations • 16 Jun 2020 • Jiacheng Sun, Xiangyong Cao, Hanwen Liang, Weiran Huang, Zewei Chen, Zhenguo Li

In recent years, a variety of normalization methods have been proposed to help train neural networks, such as batch normalization (BN), layer normalization (LN), weight normalization (WN), group normalization (GN), etc.

LEMMA

Boosting Few-Shot Learning With Adaptive Margin Loss

no code implementations • CVPR 2020 • Aoxue Li, Weiran Huang, Xu Lan, Jiashi Feng, Zhenguo Li, Li-Wei Wang

Few-shot learning (FSL) has attracted increasing attention in recent years but remains challenging, due to the intrinsic difficulty in learning to generalize from a few examples.

Few-Shot Image Classification, Few-Shot Learning +2

Meta-Learning PAC-Bayes Priors in Model Averaging

no code implementations • 24 Dec 2019 • Yimin Huang, Weiran Huang, Liang Li, Zhenguo Li

In this paper, we mainly consider the scenario in which a common model set is used for model averaging, instead of selecting a single final model via a model selection procedure, in order to account for model uncertainty and improve the reliability and accuracy of inferences.

Meta-Learning, Model Selection

GraphAIR: Graph Representation Learning with Neighborhood Aggregation and Interaction

1 code implementation • 5 Nov 2019 • Fenyu Hu, Yanqiao Zhu, Shu Wu, Weiran Huang, Liang Wang, Tieniu Tan

Then, in order to better capture the complicated non-linearity of graph data, we present a novel GraphAIR framework which models the neighborhood interaction in addition to neighborhood aggregation.

Community Detection, General Classification +3

DARTS+: Improved Differentiable Architecture Search with Early Stopping

no code implementations • 13 Sep 2019 • Hanwen Liang, Shifeng Zhang, Jiacheng Sun, Xingqiu He, Weiran Huang, Kechen Zhuang, Zhenguo Li

Therefore, we propose a simple and effective algorithm, named "DARTS+", to avoid the collapse and improve the original DARTS, by "early stopping" the search procedure when meeting a certain criterion.

Few-Shot Learning with Global Class Representations

2 code implementations • ICCV 2019 • Tiange Luo, Aoxue Li, Tao Xiang, Weiran Huang, Li-Wei Wang

In this paper, we propose to tackle the challenging few-shot learning (FSL) problem by learning global class representations using both base and novel class training samples.

Few-Shot Learning, Generalized Few-Shot Classification

Community Exploration: From Offline Optimization to Online Learning

no code implementations • NeurIPS 2018 • Xiaowei Chen, Weiran Huang, Wei Chen, John C. S. Lui

We introduce the community exploration problem that has many real-world applications such as online advertising.

Modeling Local Dependence in Natural Language with Multi-channel Recurrent Neural Networks

no code implementations • 13 Nov 2018 • Chang Xu, Weiran Huang, Hongwei Wang, Gang Wang, Tie-Yan Liu

In this paper, we propose an improved variant of RNN, Multi-Channel RNN (MC-RNN), to dynamically capture and leverage local semantic structure information.

Abstractive Text Summarization, Language Modelling +2

Multi-Round Influence Maximization (Extended Version)

1 code implementation • 12 Feb 2018 • Lichao Sun, Weiran Huang, Philip S. Yu, Wei Chen

In this paper, we study the Multi-Round Influence Maximization (MRIM) problem, where influence propagates in multiple rounds independently from possibly different seed sets, and the goal is to select seeds for each round to maximize the expected number of nodes that are activated in at least one round.

Social and Information Networks
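The single-round building block that MRIM extends is classic influence maximization: greedily pick seeds that maximize simulated expected spread under the independent-cascade model. The toy graph, edge probability, and function names below are illustrative, not the paper's multi-round algorithm:

```python
import random

def independent_cascade(graph, seeds, p, rng):
    """Simulate one independent-cascade diffusion; return the set of activated nodes."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        nxt = []
        for u in frontier:
            for v in graph.get(u, []):
                if v not in active and rng.random() < p:
                    active.add(v)
                    nxt.append(v)
        frontier = nxt
    return active

def greedy_seeds(graph, k, p=0.5, sims=200, seed=0):
    """Classic greedy: repeatedly add the node with the largest marginal gain
    in Monte-Carlo-estimated expected spread."""
    rng = random.Random(seed)
    nodes = set(graph) | {v for vs in graph.values() for v in vs}
    chosen = []
    for _ in range(k):
        best, best_spread = None, -1.0
        for cand in nodes - set(chosen):
            spread = sum(len(independent_cascade(graph, chosen + [cand], p, rng))
                         for _ in range(sims)) / sims
            if spread > best_spread:
                best, best_spread = cand, spread
        chosen.append(best)
    return chosen

# Toy graph: hub node 0 plus a sparse chain hanging off node 1
graph = {0: [1, 2, 3, 4, 5], 1: [6], 6: [7], 7: [8]}
seeds = greedy_seeds(graph, k=2)
print(seeds)
```

MRIM generalizes this by spreading influence over multiple independent rounds with possibly different seed sets per round, counting a node as covered if it is activated in at least one round.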
