Compressed Memory is a secondary FIFO memory component proposed as part of the Compressive Transformer model. The Compressive Transformer keeps a fine-grained memory of past activations; as the oldest of these are evicted, they are compressed into coarser compressed memories rather than discarded.
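The two-level FIFO behaviour can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the class name `CompressiveMemory`, the parameters `mem_size`, `cmem_size` and `c`, and the use of mean pooling as the compression function are all assumptions made for the example.

```python
from collections import deque


def mean_pool(vectors, c):
    """Average each non-overlapping group of c equal-length vectors."""
    return [
        [sum(col) / c for col in zip(*vectors[i:i + c])]
        for i in range(0, len(vectors) - c + 1, c)
    ]


class CompressiveMemory:
    """Hypothetical sketch: a fine-grained FIFO feeding a compressed FIFO."""

    def __init__(self, mem_size, cmem_size, c):
        self.mem_size = mem_size
        self.c = c
        self.memory = deque()                     # fine-grained past activations
        self.compressed = deque(maxlen=cmem_size)  # oldest compressed entries fall off
        self._pending = []                        # evicted activations awaiting compression

    def append(self, activations):
        self.memory.extend(activations)
        # Evict the oldest activations once the fine-grained memory is full.
        while len(self.memory) > self.mem_size:
            self._pending.append(self.memory.popleft())
            # Every c evicted activations become one compressed memory.
            if len(self._pending) == self.c:
                self.compressed.extend(mean_pool(self._pending, self.c))
                self._pending = []


mem = CompressiveMemory(mem_size=4, cmem_size=2, c=2)
mem.append([[float(i)] for i in range(8)])
print(list(mem.memory))      # the four most recent activations
print(list(mem.compressed))  # pooled summaries of the four evicted ones
```

Buffering evicted activations until `c` of them are available is a simplification chosen here so that every compressed entry summarises exactly `c` inputs.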
For the compression function $f_{c}$, the authors consider (1) max/mean pooling, where the kernel size and stride are set to the compression rate $c$; (2) 1D convolution, also with kernel size and stride set to $c$; (3) dilated convolutions; and (4) most-used, where the memories are sorted by their average attention (usage) and the most-used are preserved.
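As a rough illustration of the parameter-free options, the sketch below implements max pooling, mean pooling and most-used selection over a list of memory vectors. The function names and the `usage` argument (one average attention score per memory) are hypothetical, and two details are assumptions of this sketch rather than statements from the source: incomplete trailing windows are dropped, and most-used keeps `len(memories) // c` entries in their original order. The convolutional variants need learned weights and are omitted.

```python
def pool(memories, c, reduce):
    """Pool non-overlapping windows of c memories (kernel size = stride = c)."""
    return [
        [reduce(col) for col in zip(*memories[i:i + c])]
        for i in range(0, len(memories) - c + 1, c)
    ]


def max_pool(memories, c):
    return pool(memories, c, max)


def mean_pool(memories, c):
    return pool(memories, c, lambda col: sum(col) / len(col))


def most_used(memories, usage, c):
    """Keep the len(memories) // c memories with the highest usage score."""
    keep = len(memories) // c
    ranked = sorted(range(len(memories)), key=lambda i: usage[i], reverse=True)
    # Assumption: surviving memories stay in their original temporal order.
    return [memories[i] for i in sorted(ranked[:keep])]


memories = [[1.0, 8.0], [3.0, 6.0], [5.0, 4.0], [7.0, 2.0]]
usage = [0.1, 0.4, 0.3, 0.2]  # hypothetical average attention per memory

print(max_pool(memories, 2))          # [[3.0, 8.0], [7.0, 4.0]]
print(mean_pool(memories, 2))         # [[2.0, 7.0], [6.0, 3.0]]
print(most_used(memories, usage, 2))  # [[3.0, 6.0], [5.0, 4.0]]
```

With $c = 2$, each function maps four memories to two, so the compressed memory covers the same span of the past with half as many slots.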
Source: Compressive Transformers for Long-Range Sequence Modelling
Task | Papers | Share |
---|---|---|
Semantic Segmentation | 1 | 20.00% |
Video Object Segmentation | 1 | 20.00% |
Video Semantic Segmentation | 1 | 20.00% |
Sentence | 1 | 20.00% |
Language Modelling | 1 | 20.00% |
Component | Type | |
---|---|---|
Average Pooling | Pooling Operations | (optional) |
Convolution | Convolutions | (optional) |
Max Pooling | Pooling Operations | (optional) |