Search Results for author: Fanxu Meng

Found 9 papers, 7 papers with code

PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models

1 code implementation • 3 Apr 2024 • Fanxu Meng, Zhaohui Wang, Muhan Zhang

However, LoRA approximates ΔW through the product of two matrices, A (initialized with Gaussian noise) and B (initialized with zeros), whereas PiSSA initializes A and B with the principal singular values and singular vectors of the original matrix W. PiSSA can better approximate the outcome of full-parameter fine-tuning from the very beginning by updating the essential parts while freezing the "noisy" parts.

Quantization
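The PiSSA snippet above can be illustrated with a minimal numpy sketch: factor the pretrained weight W with an SVD, build the adapters A and B from the top-r singular directions, and keep the remainder as a frozen residual. This is a hedged reading of the described initialization, not the paper's exact implementation; the function name `pissa_init` and the rank `r` are illustrative.

```python
import numpy as np

def pissa_init(W, r):
    """Sketch of PiSSA-style initialization (assumed reading of the abstract):
    the top-r singular directions of W seed the trainable adapters A and B,
    and the low-magnitude remainder W_res is kept frozen."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    sqrt_S = np.sqrt(S[:r])
    A = U[:, :r] * sqrt_S           # (out_dim, r): principal left vectors, scaled
    B = sqrt_S[:, None] * Vt[:r]    # (r, in_dim): principal right vectors, scaled
    W_res = W - A @ B               # frozen "noisy" residual; W == W_res + A @ B
    return A, B, W_res
```

By construction A @ B is the best rank-r approximation of W, so training starts from the directions that matter most, unlike LoRA's Gaussian/zero start where A @ B is initially zero.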

Chain of Images for Intuitively Reasoning

1 code implementation • 9 Nov 2023 • Fanxu Meng, Haotong Yang, Yiding Wang, Muhan Zhang

The human brain is naturally equipped to comprehend and interpret visual information rapidly.

Common Sense Reasoning • Language Modelling +2

Parrot Mind: Towards Explaining the Complex Task Reasoning of Pretrained Large Language Models with Template-Content Structure

no code implementations • 9 Oct 2023 • Haotong Yang, Fanxu Meng, Zhouchen Lin, Muhan Zhang

Furthermore, by generalizing this structure to the hierarchical case, we demonstrate that models can achieve task composition, further reducing the space needed for learning from linear to logarithmic and thereby effectively learning complex reasoning that involves multiple steps.

Answer Generation • Language Modelling

Large Language Models are In-Context Semantic Reasoners rather than Symbolic Reasoners

1 code implementation • 24 May 2023 • Xiaojuan Tang, Zilong Zheng, Jiaqi Li, Fanxu Meng, Song-Chun Zhu, Yitao Liang, Muhan Zhang

On the whole, our analysis provides a novel perspective on the role of semantics in developing and evaluating language models' reasoning abilities.

RMNet: Equivalently Removing Residual Connection from Networks

1 code implementation • 1 Nov 2021 • Fanxu Meng, Hao Cheng, Jiaxin Zhuang, Ke Li, Xing Sun

In this paper, we aim to remedy this problem and propose to remove the residual connection in a vanilla ResNet equivalently by a reserving and merging (RM) operation on ResBlock.

Network Pruning

An Empirical Study and Analysis on Open-Set Semi-Supervised Learning

no code implementations • 19 Jan 2021 • Huixiang Luo, Hao Cheng, Fanxu Meng, Yuting Gao, Ke Li, Mengdan Zhang, Xing Sun

Pseudo-labeling (PL) and Data Augmentation-based Consistency Training (DACT) are two approaches widely used in Semi-Supervised Learning (SSL) methods.

Data Augmentation
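The pseudo-labeling approach mentioned in the entry above can be sketched in a few lines: keep only the unlabeled examples on which the model is highly confident and treat its predictions as hard labels. This is a generic minimal sketch of PL, not this paper's specific open-set variant; the 0.95 threshold and the function name `pseudo_label` are assumptions.

```python
import numpy as np

def pseudo_label(probs, threshold=0.95):
    """Generic pseudo-labeling step (illustrative, not the paper's method):
    probs is an (n_unlabeled, n_classes) array of softmax outputs.
    Returns the indices of confidently predicted examples and their
    argmax predictions, to be used as hard training labels."""
    confidence = probs.max(axis=1)          # per-example max class probability
    keep = confidence >= threshold          # confidence filter
    return np.where(keep)[0], probs[keep].argmax(axis=1)
```

Open-set SSL complicates this picture because some unlabeled examples belong to classes outside the labeled set, so a plain confidence filter like this one can confidently mislabel out-of-distribution data.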

Pruning Filter in Filter

1 code implementation • NeurIPS 2020 • Fanxu Meng, Hao Cheng, Ke Li, Huixiang Luo, Xiaowei Guo, Guangming Lu, Xing Sun

Through extensive experiments, we demonstrate that SWP is more effective than previous FP-based methods and achieves a state-of-the-art pruning ratio on the CIFAR-10 and ImageNet datasets without an obvious accuracy drop.

Filter Grafting for Deep Neural Networks: Reason, Method, and Cultivation

1 code implementation • 26 Apr 2020 • Hao Cheng, Fanxu Meng, Ke Li, Yuting Gao, Guangming Lu, Xing Sun, Rongrong Ji

To gain a universal improvement on both valid and invalid filters, we compensate grafting with distillation (Cultivation) to overcome the drawback of grafting.


Filter Grafting for Deep Neural Networks

2 code implementations • CVPR 2020 • Fanxu Meng, Hao Cheng, Ke Li, Zhixin Xu, Rongrong Ji, Xing Sun, Guangming Lu

To better perform the grafting process, we develop an entropy-based criterion to measure the information of filters and an adaptive weighting strategy for balancing the grafted information among networks.
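The entropy-based criterion mentioned in the entry above can be sketched as follows: histogram each filter's weights and score the filter by the Shannon entropy of the bin frequencies, so near-constant (low-information) filters score low. This is a plausible reading of the abstract, not the paper's exact formulation; the bin count and the function name `filter_entropy` are assumptions.

```python
import numpy as np

def filter_entropy(filters, bins=10):
    """Illustrative entropy score per filter (assumed reading of the
    criterion): filters is an (n_filters, ...) weight array. Each
    filter's weights are histogrammed and scored by the Shannon
    entropy of the normalized bin counts."""
    scores = []
    for w in filters:
        hist, _ = np.histogram(w.ravel(), bins=bins)
        p = hist / hist.sum()            # empirical bin distribution
        p = p[p > 0]                     # drop empty bins (0 * log 0 := 0)
        scores.append(float(-(p * np.log(p)).sum()))
    return np.array(scores)
```

Under this scoring, a filter whose weights are all identical lands in a single bin and gets entropy 0, while a filter with widely spread weights gets a strictly positive score, matching the intuition that low-entropy filters carry little information and are candidates for grafting.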
