An Algorithm for Routing Vectors in Sequences

20 Nov 2022 · Franz A. Heinsen

We propose a routing algorithm that takes a sequence of vectors and computes a new sequence with specified length and vector size. Each output vector maximizes "bang per bit," the difference between a net benefit to use and net cost to ignore data, by better predicting the input vectors. We describe output vectors as geometric objects, as latent variables that assign credit, as query states in a model of associative memory, and as agents in a model of a Society of Mind. We implement the algorithm with optimizations that reduce parameter count, computation, and memory use by orders of magnitude, enabling us to route sequences of greater length than previously possible. We evaluate our implementation on natural language and visual classification tasks, obtaining competitive or state-of-the-art accuracy and end-to-end credit assignments that are interpretable.
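The input/output contract stated in the abstract (a sequence of vectors in, a new sequence with a chosen length and vector size out) can be illustrated with a small sketch. This is not the routing algorithm from the paper: the function name `mix_sequence`, the random matrices standing in for learned parameters, and the single softmax used in place of routing are all assumptions made for illustration only.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def mix_sequence(x_inp, n_out, d_out, seed=0):
    """Map n_inp vectors of size d_inp to n_out vectors of size d_out.

    Hypothetical stand-in: random matrices play the role of learned
    parameters, and one softmax replaces the paper's routing procedure.
    """
    rng = random.Random(seed)
    d_inp = len(x_inp[0])
    w_score = [[rng.gauss(0, 1) for _ in range(n_out)] for _ in range(d_inp)]
    w_value = [[rng.gauss(0, 1) for _ in range(d_out)] for _ in range(d_inp)]

    # Each input vector spreads one unit of weight across the n_out outputs.
    weights = [softmax(row) for row in matmul(x_inp, w_score)]  # n_inp x n_out
    values = matmul(x_inp, w_value)                             # n_inp x d_out

    # Output j is the weighted sum of all projected input vectors.
    weights_t = [list(col) for col in zip(*weights)]            # n_out x n_inp
    x_out = matmul(weights_t, values)                           # n_out x d_out
    return x_out, weights


x_inp = [[0.1 * (i + j) for j in range(4)] for i in range(6)]  # 6 vectors of size 4
x_out, weights = mix_sequence(x_inp, n_out=3, d_out=5)
print(len(x_out), len(x_out[0]))  # prints "3 5"
```

In this sketch the `weights` matrix gives a crude view of how much each input vector contributes to each output vector, loosely analogous to the credit assignments the abstract mentions; the paper's own assignments come from its routing procedure, which is not reproduced here.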

Task                  Dataset                            Model                                Metric Name         Metric Value  Global Rank
Image Classification  CIFAR-10                           Heinsen Routing + BEiT-large 16 224  Percentage correct  99.2          # 9
                                                                                              PARAMS              309.5M        # 238
                                                                                              Top-1 Accuracy      99.2          # 4
Image Classification  CIFAR-100                          Heinsen Routing + BEiT-large 16 224  Percentage correct  93.8%         # 7
                                                                                              PARAMS              309.8M        # 199
Image Classification  ImageNet                           Heinsen Routing + BEiT-large 16 224  Top 1 Accuracy      86.7%         # 126
                                                                                              Number of params    312.8M        # 917
Sentiment Analysis    IMDb                               Heinsen Routing + RoBERTa Large      Accuracy            96.2          # 4
Sentiment Analysis    SST-2 Binary classification        Heinsen Routing + RoBERTa-large      Accuracy            96.0          # 21
Sentiment Analysis    SST-5 Fine-grained classification  Heinsen Routing + RoBERTa Large      Accuracy            59.8          # 1
