Search Results for author: Chengru Song

Found 10 papers, 5 papers with code

Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization

1 code implementation • 5 Feb 2024 • Yang Jin, Zhicheng Sun, Kun Xu, Liwei Chen, Hao Jiang, Quzhe Huang, Chengru Song, Yuliang Liu, Di Zhang, Yang Song, Kun Gai, Yadong Mu

In light of recent advances in multimodal Large Language Models (LLMs), there is increasing attention to scaling them from image-text data to more informative real-world videos.

Ranked #57 on Visual Question Answering on MM-Vet

Video Understanding • Visual Question Answering

ASP: Automatic Selection of Proxy dataset for efficient AutoML

no code implementations • 17 Oct 2023 • Peng Yao, Chao Liao, Jiyuan Jia, Jianchao Tan, Bin Chen, Chengru Song, Di Zhang

Deep neural networks have achieved great success thanks to increasing amounts of data and diverse, effective neural network designs.

Neural Architecture Search

USDC: Unified Static and Dynamic Compression for Visual Transformer

no code implementations • 17 Oct 2023 • Huan Yuan, Chao Liao, Jianchao Tan, Peng Yao, Jiyuan Jia, Bin Chen, Chengru Song, Di Zhang

To alleviate the disadvantages of both categories of methods, we propose to unify static and dynamic compression techniques to obtain an input-adaptive compressed model that better balances the total compression ratio and model performance.

Model Compression

KwaiYiiMath: Technical Report

no code implementations • 11 Oct 2023 • Jiayi Fu, Lei Lin, Xiaoyang Gao, Pengli Liu, Zhengzong Chen, Zhirui Yang, ShengNan Zhang, Xue Zheng, Yan Li, Yuliang Liu, Xucheng Ye, Yiqiao Liao, Chao Liao, Bin Chen, Chengru Song, Junchen Wan, Zijia Lin, Fuzheng Zhang, Zhongyuan Wang, Di Zhang, Kun Gai

Recent advancements in large language models (LLMs) have demonstrated remarkable abilities in handling a variety of natural language processing (NLP) downstream tasks, even on mathematical tasks requiring multi-step reasoning.

Ranked #87 on Arithmetic Reasoning on GSM8K (using extra training data)

Arithmetic Reasoning • GSM8K • +1

Unified Language-Vision Pretraining in LLM with Dynamic Discrete Visual Tokenization

1 code implementation • 9 Sep 2023 • Yang Jin, Kun Xu, Liwei Chen, Chao Liao, Jianchao Tan, Quzhe Huang, Bin Chen, Chenyi Lei, An Liu, Chengru Song, Xiaoqiang Lei, Di Zhang, Wenwu Ou, Kun Gai, Yadong Mu

Specifically, we introduce a well-designed visual tokenizer to translate the non-linguistic image into a sequence of discrete tokens, like a foreign language that an LLM can read.

Language Modelling • Large Language Model • +1

SHARK: A Lightweight Model Compression Approach for Large-scale Recommender Systems

no code implementations • 18 Aug 2023 • Beichuan Zhang, Chenggen Sun, Jianchao Tan, Xinjun Cai, Jun Zhao, Mengqi Miao, Kang Yin, Chengru Song, Na Mou, Yang Song

Increasing the size of embedding layers has been shown to be effective in improving the performance of recommendation models, yet it gradually causes their sizes to exceed terabytes in industrial recommender systems and hence increases computing and storage costs.

Model Compression • Quantization • +1

Resource Constrained Model Compression via Minimax Optimization for Spiking Neural Networks

1 code implementation • 9 Aug 2023 • Jue Chen, Huan Yuan, Jianchao Tan, Bin Chen, Chengru Song, Di Zhang

We propose an improved end-to-end Minimax optimization method for this sparse learning problem to better balance the model performance and the computation efficiency.

Model Compression • Sparse Learning

PA&DA: Jointly Sampling PAth and DAta for Consistent NAS

1 code implementation • CVPR 2023 • Shun Lu, Yu Hu, Longxing Yang, Zihao Sun, Jilin Mei, Jianchao Tan, Chengru Song

Our method only requires negligible computation cost for optimizing the sampling distributions of path and data, but achieves lower gradient variance during supernet training and better generalization performance for the supernet, resulting in a more consistent NAS.