We introduce LLoCO, a technique that combines context compression, retrieval, and parameter-efficient finetuning using LoRA.
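As a hedged illustration of the LoRA component only (not LLoCO's actual implementation; the rank `r` and scaling `alpha` below are generic hyperparameters I am assuming), here is a minimal PyTorch sketch of a LoRA-augmented linear layer:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen pretrained linear layer plus a trainable low-rank update:
    y = base(x) + (alpha / r) * x @ A^T @ B^T."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)   # freeze pretrained weights
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        # Low-rank factors: A maps down to rank r, B maps back up.
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: starts as a no-op
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling
```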
For the change decoder, which is shared by all three architectures, we propose three spatio-temporal relationship modeling mechanisms that combine naturally with the Mamba architecture and fully exploit its properties to achieve spatio-temporal interaction among multi-temporal features, thereby obtaining accurate change information.
The Q-Formers are trained on paired images rather than identical targets, where the reference image and the ground-truth image share the same style or semantics.
To distinguish models by parameter scale, the research community has coined the term large language models (LLMs) for PLMs of significant size.
For data scaling, we introduce a Warmup-Stable-Decay (WSD) learning rate scheduler (LRS) that is conducive to continuous training and domain adaptation.
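As a rough sketch of what such a schedule can look like as a function of training step (the phase lengths and the linear shape of the decay are my assumptions for illustration, not the paper's exact formulation):

```python
def wsd_lr(step: int, max_lr: float, warmup_steps: int,
           stable_steps: int, decay_steps: int, min_lr: float = 0.0) -> float:
    """Warmup-Stable-Decay: linear warmup, constant plateau, then decay."""
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 to max_lr.
        return max_lr * step / max(warmup_steps, 1)
    if step < warmup_steps + stable_steps:
        # Stable: hold max_lr, so training can be extended or branched
        # (e.g., for continuous training or domain adaptation) from here.
        return max_lr
    # Decay: anneal from max_lr down to min_lr over decay_steps.
    progress = (step - warmup_steps - stable_steps) / max(decay_steps, 1)
    return max_lr + (min_lr - max_lr) * min(progress, 1.0)
```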
Usually, correspondences are 2D-to-2D and the pose we estimate is defined only up to scale.
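For example, with known intrinsics, OpenCV recovers exactly this up-to-scale pose from 2D-to-2D correspondences via the essential matrix; the synthetic points below are placeholders for real feature matches:

```python
import cv2
import numpy as np

# Synthetic two-view setup with placeholder data: random 3D points in front
# of the cameras, a small rotation, and a unit-length baseline along x.
rng = np.random.default_rng(0)
K = np.array([[500.0, 0, 320], [0, 500.0, 240], [0, 0, 1]])
X = rng.uniform([-1.0, -1.0, 4.0], [1.0, 1.0, 8.0], size=(100, 3))
rvec = np.array([0.0, 0.1, 0.0])   # rotation of view 2 (Rodrigues vector)
tvec = np.array([1.0, 0.0, 0.0])   # true translation (norm 1 by construction)

pts1, _ = cv2.projectPoints(X, np.zeros(3), np.zeros(3), K, None)
pts2, _ = cv2.projectPoints(X, rvec, tvec, K, None)
pts1, pts2 = pts1.reshape(-1, 2), pts2.reshape(-1, 2)

# Estimate the essential matrix from the 2D-2D correspondences with RANSAC,
# then decompose it into a rotation R and a translation direction t.
E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC,
                               prob=0.999, threshold=1.0)
_, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=mask)

# t is returned as a unit vector: the baseline length is unobservable from
# image correspondences alone, so the pose is defined only up to scale.
print("||t|| =", np.linalg.norm(t))   # 1.0
```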
Recurrent neural networks (RNNs) have fast inference and scale efficiently on long sequences, but they are difficult to train and hard to scale.
The ubiquitous and demonstrably suboptimal choice of resizing images to a fixed resolution before processing them with computer vision models has not yet been successfully challenged.
More specifically, we establish correspondences and propagate adjacent costs to the reference pixel by leveraging local geometric smoothness in conjunction with surface normals.
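As a hedged sketch of the geometric step this kind of propagation relies on (the function name and setup are mine, for illustration, not the paper's code): a reference pixel's depth and surface normal define a local plane, and that plane induces a depth at each adjacent pixel in closed form, which is what lets adjacent costs be transferred under a local smoothness assumption.

```python
import numpy as np

def plane_induced_depth(K_inv, n, p, d_p, q):
    """Depth that the local plane (unit normal n, passing through pixel p
    at depth d_p) induces at a neighboring pixel q, all in the reference
    camera frame. p and q are homogeneous pixel coordinates (u, v, 1)."""
    X_p = d_p * (K_inv @ p)      # back-project p to its 3D point
    ray_q = K_inv @ q            # viewing ray through q
    # Intersect the ray d * ray_q with the plane n . X = n . X_p.
    return float(n @ X_p) / float(n @ ray_q)

# Hypothetical usage: a fronto-parallel plane at depth 5 induces the same
# depth at the adjacent pixel, so its cost can be propagated consistently.
K = np.array([[500.0, 0, 320], [0, 500.0, 240], [0, 0, 1]])
K_inv = np.linalg.inv(K)
n = np.array([0.0, 0.0, -1.0])            # surface normal at the reference pixel
p = np.array([320.0, 240.0, 1.0])         # reference pixel
q = np.array([321.0, 240.0, 1.0])         # adjacent pixel
print(plane_induced_depth(K_inv, n, p, 5.0, q))   # ~5.0 for this plane
```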
Large Language Models (LLMs) have achieved remarkable results, but their increasing resource demands have become a major obstacle to the development of powerful and accessible superhuman intelligence.