LLoCO: Learning Long Contexts Offline

jeffreysijuntan/lloco 11 Apr 2024

We introduce LLoCO, a technique that combines context compression, retrieval, and parameter-efficient finetuning using LoRA.

4k, In-Context Learning

71
0.52 stars / hour

ChangeMamba: Remote Sensing Change Detection with Spatio-Temporal State Space Model

chenhongruixuan/mambacd 4 Apr 2024

For the change decoder, which is available in all three architectures, we propose three spatio-temporal relationship modeling mechanisms that combine naturally with the Mamba architecture and fully exploit its characteristics to achieve spatio-temporal interaction among multi-temporal features, thereby obtaining accurate change information.

2D Semantic Segmentation, Attribute

152
0.51 stars / hour

DEADiff: An Efficient Stylization Diffusion Model with Disentangled Representations

Tianhao-Qi/DEADiff_code 11 Mar 2024

The Q-Formers are trained using paired images rather than the identical target: the reference image and the ground-truth image share the same style or semantics.

Disentanglement

137
0.46 stars / hour

A Survey of Large Language Models

rucaibox/llmsurvey 31 Mar 2023

To distinguish models by parameter scale, the research community has coined the term large language models (LLMs) for PLMs of significant size.

Language Modelling

8,626
0.45 stars / hour

MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies

openbmb/minicpm 9 Apr 2024

For data scaling, we introduce a Warmup-Stable-Decay (WSD) learning rate scheduler (LRS), conducive to continuous training and domain adaptation.

Domain Adaptation

3,690
0.44 stars / hour
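
As an illustration of the Warmup-Stable-Decay idea named in the entry above, here is a minimal sketch of a three-phase learning-rate schedule. The function name `wsd_lr`, its parameters, and the linear shapes of the warmup and decay phases are assumptions made for this example; the listing does not specify them, and the MiniCPM authors' scheduler may differ.

```python
def wsd_lr(step, peak_lr, warmup_steps, stable_steps, decay_steps, min_lr=0.0):
    """Hypothetical Warmup-Stable-Decay schedule (linear warmup and decay assumed)."""
    # Phase 1: warm up linearly from peak_lr / warmup_steps to peak_lr.
    if step < warmup_steps:
        return peak_lr * (step + 1) / warmup_steps
    # Phase 2: hold the learning rate constant at its peak.
    if step < warmup_steps + stable_steps:
        return peak_lr
    # Phase 3: decay linearly from peak_lr to min_lr, then stay at min_lr.
    decay_step = step - warmup_steps - stable_steps
    if decay_step >= decay_steps:
        return min_lr
    frac = decay_step / decay_steps
    return peak_lr + (min_lr - peak_lr) * frac


if __name__ == "__main__":
    # Print the rate at a few steps of a 10 / 100 / 50 step schedule.
    for s in (0, 9, 50, 110, 135, 160):
        print(s, wsd_lr(s, peak_lr=1.0, warmup_steps=10, stable_steps=100, decay_steps=50))
```

Because the stable phase holds a constant rate, a run could in principle be extended by lengthening `stable_steps` and applying the decay phase only at the end, which is one plausible reading of why such a schedule suits continuous training.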

Matching 2D Images in 3D: Metric Relative Pose from Metric Correspondences

nianticlabs/mickey 9 Apr 2024

Usually, correspondences are 2D-to-2D and the pose we estimate is defined only up to scale.

277
0.44 stars / hour

Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models

google-deepmind/recurrentgemma 29 Feb 2024

Recurrent neural networks (RNNs) have fast inference and scale efficiently on long sequences, but they are difficult to train and hard to scale.

Language Modelling

465
0.42 stars / hour

Patch n' Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution

pku-yuangroup/open-sora-plan 12 Jul 2023

The ubiquitous and demonstrably suboptimal choice of resizing images to a fixed resolution before processing them with computer vision models has not yet been successfully challenged.

Fairness, Image Classification

10,004
0.41 stars / hour

GoMVS: Geometrically Consistent Cost Aggregation for Multi-View Stereo

wuuu3511/gomvs 11 Apr 2024

More specifically, we correspond and propagate adjacent costs to the reference pixel by leveraging the local geometric smoothness in conjunction with surface normals.

61
0.40 stars / hour

JetMoE: Reaching Llama2 Performance with 0.1M Dollars

myshell-ai/jetmoe 11 Apr 2024

Large Language Models (LLMs) have achieved remarkable results, but their increasing resource demand has become a major obstacle to the development of powerful and accessible super-human intelligence.

890
0.39 stars / hour