Token Reduction
26 papers with code • 0 benchmarks • 0 datasets
Most implemented papers
TR-BERT: Dynamic Token Reduction for Accelerating BERT Inference
To address this issue, we propose TR-BERT, a dynamic token reduction approach that accelerates PLM inference by flexibly adapting the number of layers each token passes through, avoiding redundant computation.
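To make the idea of per-token depth concrete, here is a minimal PyTorch sketch in which only the highest-scoring tokens continue through the deeper layers while the rest keep their shallow representations. The norm-based scoring and fixed keep ratio are illustrative assumptions for the example, not TR-BERT's learned selection policy.

```python
# Sketch: route only the top-scoring tokens through the deeper layers,
# so different tokens effectively see a different number of layers.
import torch
import torch.nn as nn

class ReducedEncoder(nn.Module):
    def __init__(self, dim=64, heads=4, shallow=2, deep=4, keep_ratio=0.5):
        super().__init__()
        make = lambda: nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.shallow = nn.ModuleList(make() for _ in range(shallow))
        self.deep = nn.ModuleList(make() for _ in range(deep))
        self.keep_ratio = keep_ratio

    def forward(self, x):                      # x: (batch, tokens, dim)
        for blk in self.shallow:
            x = blk(x)
        # Score tokens by their L2 norm (a stand-in importance measure).
        scores = x.norm(dim=-1)                # (batch, tokens)
        k = max(1, int(x.size(1) * self.keep_ratio))
        idx = scores.topk(k, dim=1).indices    # tokens that keep being processed
        kept = torch.gather(x, 1, idx.unsqueeze(-1).expand(-1, -1, x.size(-1)))
        for blk in self.deep:
            kept = blk(kept)
        # Scatter the deeply processed tokens back into the full sequence.
        out = x.clone()
        out.scatter_(1, idx.unsqueeze(-1).expand(-1, -1, x.size(-1)), kept)
        return out

tokens = torch.randn(2, 16, 64)
print(ReducedEncoder()(tokens).shape)          # torch.Size([2, 16, 64])
```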
A-ViT: Adaptive Tokens for Efficient Vision Transformer
A-ViT adapts the inference cost of vision transformers to the complexity of the input image by automatically reducing the number of tokens that are processed in the network as inference proceeds.
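One generic way to realize such progressive reduction is a per-token halting score accumulated across layers. The sketch below assumes a simple sigmoid halting head and a cumulative threshold of 1.0; these are illustrative choices, not the paper's exact formulation, and halted tokens are merely frozen rather than removed.

```python
# Sketch: each layer emits a halting score per token; tokens whose
# cumulative score crosses the threshold stop being updated.
import torch
import torch.nn as nn

class HaltingBlock(nn.Module):
    def __init__(self, dim=64, heads=4):
        super().__init__()
        self.block = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.halt_head = nn.Linear(dim, 1)

    def forward(self, x, cum_halt, active):
        y = self.block(x)
        # Only still-active tokens receive the update.
        x = torch.where(active.unsqueeze(-1), y, x)
        halt = torch.sigmoid(self.halt_head(x)).squeeze(-1)   # (batch, tokens)
        cum_halt = cum_halt + halt * active.float()
        active = active & (cum_halt < 1.0)                     # freeze halted tokens
        return x, cum_halt, active

def run(x, depth=6):
    blocks = nn.ModuleList(HaltingBlock() for _ in range(depth))
    cum_halt = torch.zeros(x.shape[:2])
    active = torch.ones(x.shape[:2], dtype=torch.bool)
    for blk in blocks:
        x, cum_halt, active = blk(x, cum_halt, active)
    return x, active

x, active = run(torch.randn(2, 16, 64))
print(active.float().mean())   # fraction of tokens still active after 6 layers
```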
PuMer: Pruning and Merging Tokens for Efficient Vision Language Models
Large-scale vision language (VL) models use Transformers to perform cross-modal interactions between the input text and image.
Content-aware Token Sharing for Efficient Semantic Segmentation with Vision Transformers
This paper introduces Content-aware Token Sharing (CTS), a token reduction approach that improves the computational efficiency of semantic segmentation networks that use Vision Transformers (ViTs).
Which Tokens to Use? Investigating Token Reduction in Vision Transformers
While different methods have been explored to achieve this goal, we still lack understanding of the resulting reduction patterns and how those patterns differ across token reduction methods and datasets.
HaltingVT: Adaptive Token Halting Transformer for Efficient Video Recognition
Action recognition in videos poses a challenge due to its high computational cost, especially for Joint Space-Time video transformers (Joint VT).
LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models
In response, we propose PruMerge, a novel adaptive visual token reduction strategy that significantly reduces the number of visual tokens without compromising the performance of LMMs.
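As a hedged illustration of the generic prune-then-merge pattern for visual tokens (not PruMerge's actual adaptive selection), the sketch below keeps the top-k tokens by an assumed importance score and folds each pruned token into its most similar kept token, so information is compressed rather than discarded.

```python
# Sketch: keep the top-k visual tokens and merge every pruned token
# into its nearest kept token by cosine similarity.
import torch
import torch.nn.functional as F

def prune_and_merge(tokens, scores, keep):
    # tokens: (n, dim) visual tokens, scores: (n,) importance, keep: tokens retained
    idx = scores.topk(keep).indices
    kept = tokens[idx]                                   # (keep, dim)
    mask = torch.ones(tokens.size(0), dtype=torch.bool)
    mask[idx] = False
    pruned = tokens[mask]                                # (n - keep, dim)
    # Assign every pruned token to its most similar kept token.
    sim = F.normalize(pruned, dim=-1) @ F.normalize(kept, dim=-1).T
    assign = sim.argmax(dim=-1)                          # (n - keep,)
    merged = kept.clone()
    counts = torch.ones(keep)
    for i, j in enumerate(assign):                       # average pruned into kept
        merged[j] += pruned[i]
        counts[j] += 1
    return merged / counts.unsqueeze(-1)

tokens = torch.randn(576, 1024)                          # e.g. a 24x24 visual grid
out = prune_and_merge(tokens, tokens.norm(dim=-1), keep=144)
print(out.shape)                                         # torch.Size([144, 1024])
```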
Hierarchical Context Merging: Better Long Context Understanding for Pre-trained LLMs
Large language models (LLMs) have shown remarkable performance in various natural language processing tasks.
ALGM: Adaptive Local-then-Global Token Merging for Efficient Semantic Segmentation with Plain Vision Transformers
This work presents Adaptive Local-then-Global Merging (ALGM), a token reduction method for semantic segmentation networks that use plain Vision Transformers.
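A simplified sketch of a two-stage, local-then-global merging pass is shown below, assuming a duplicate-averaging local stage and a greedy cosine-similarity global stage; both are illustrative stand-ins rather than ALGM's exact algorithm.

```python
# Sketch: first average near-duplicate neighbouring tokens (local, cheap),
# then greedily merge the globally most similar token pairs.
import torch
import torch.nn.functional as F

def local_merge(tokens, tau=0.95):
    # Merge each token with its right neighbour when they are near-duplicates.
    out, i = [], 0
    while i < tokens.size(0):
        if i + 1 < tokens.size(0) and F.cosine_similarity(
                tokens[i], tokens[i + 1], dim=0) > tau:
            out.append((tokens[i] + tokens[i + 1]) / 2)
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return torch.stack(out)

def global_merge(tokens, num_merges):
    # Repeatedly merge the globally most similar pair of tokens.
    for _ in range(num_merges):
        sim = F.normalize(tokens, dim=-1) @ F.normalize(tokens, dim=-1).T
        sim.fill_diagonal_(-1.0)
        i, j = divmod(sim.argmax().item(), sim.size(1))
        merged = (tokens[i] + tokens[j]) / 2
        keep = [k for k in range(tokens.size(0)) if k not in (i, j)]
        tokens = torch.cat([tokens[keep], merged.unsqueeze(0)])
    return tokens

tokens = torch.randn(64, 32)
tokens = global_merge(local_merge(tokens), num_merges=16)
print(tokens.shape)                                      # roughly 48 tokens remain
```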
Bridging Local Details and Global Context in Text-Attributed Graphs
Representation learning on text-attributed graphs (TAGs) is vital for real-world applications, as they combine semantic textual and contextual structural information.