We demonstrate that continual pretraining of the full model on 1B-5B tokens of such data is an effective and affordable strategy for scaling the context length of language models to 128K.
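A minimal sketch of such a continual-pretraining step, assuming a LLaMA-style checkpoint whose RoPE base (`rope_theta`) can be raised at load time; the checkpoint name, `rope_theta` value, learning rate, and `long_context_loader` are all illustrative placeholders, not the paper's actual recipe:

```python
import torch
from transformers import AutoModelForCausalLM

# Hypothetical checkpoint; any LLaMA-style model exposing "rope_theta" works.
# Raising the RoPE base keeps positions up to 128K distinguishable.
model = AutoModelForCausalLM.from_pretrained(
    "base-llm-8k",                      # placeholder short-context checkpoint
    rope_theta=5_000_000,               # enlarged RoPE base for long contexts
    max_position_embeddings=131_072,
)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

# `long_context_loader` (assumed) yields long-document data packed into
# 128K-token sequences: {"input_ids": [B, 131072], "labels": [B, 131072]}.
for batch in long_context_loader:
    loss = model(**batch).loss          # standard next-token prediction loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```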
The results demonstrate that Vim can overcome the computation and memory constraints of performing Transformer-style understanding on high-resolution images, and that it has great potential to serve as the next-generation backbone for vision foundation models.
As such, web crawling is an essential research tool for both computational and non-computational scientists.
Stock movement forecasting, a branch of time series forecasting, remains a challenging problem for investors and researchers.
To help keep pace with the rapid advancements in computer vision, this paper aims to provide a comprehensive review of visual Mamba approaches.
PLAID, an efficient implementation of the ColBERT late interaction bi-encoder using pretrained language models for ranking, consistently achieves state-of-the-art performance in monolingual, cross-language, and multilingual retrieval.
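For context, ColBERT's late interaction scores a query by matching each query token embedding against its most similar document token embedding and summing those maxima (the MaxSim operator). A minimal PyTorch sketch of that scoring rule, with random tensors standing in for encoder outputs:

```python
import torch

def late_interaction_score(Q: torch.Tensor, D: torch.Tensor) -> torch.Tensor:
    """ColBERT-style MaxSim: Q is [num_q_tokens, dim], D is [num_d_tokens, dim],
    both L2-normalized so dot products are cosine similarities."""
    sim = Q @ D.T                        # [num_q_tokens, num_d_tokens] similarities
    return sim.max(dim=1).values.sum()   # best document match per query token, summed

# Toy usage: random embeddings standing in for the encoder's token outputs.
Q = torch.nn.functional.normalize(torch.randn(32, 128), dim=-1)
D = torch.nn.functional.normalize(torch.randn(180, 128), dim=-1)
print(late_interaction_score(Q, D))
```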
Together, for various on-device ML applications such as recommendation and language modeling, our system on a single V100 GPU can serve up to $100{,}000$ queries per second -- a $>100\times$ throughput improvement over a CPU-based baseline -- while maintaining model accuracy.
Applying Reinforcement Learning (RL) to sequence generation models enables the direct optimization of long-term rewards (\textit{e.g.,} BLEU and human feedback), but typically requires large-scale sampling over a space of action sequences.
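To make the sampling cost concrete, here is a minimal REINFORCE-style policy-gradient step, assuming a Hugging Face-style causal LM; `reward_fn` is a hypothetical, user-supplied function that scores a decoded string (e.g., BLEU against a reference). Full sequences must be sampled before any reward is observed, which is what makes RL training expensive:

```python
import torch

def reinforce_step(model, tokenizer, prompt, reward_fn, optimizer, max_new_tokens=64):
    """One REINFORCE update for a causal LM. `reward_fn` (assumed) maps a
    decoded string to a scalar sequence-level reward such as BLEU."""
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    # The costly part: a full sequence is sampled before any reward is seen.
    sample = model.generate(input_ids, do_sample=True, max_new_tokens=max_new_tokens)

    logits = model(sample).logits[:, :-1]                    # predict tokens 1..L-1
    logp = torch.log_softmax(logits, dim=-1)
    token_logp = logp.gather(-1, sample[:, 1:].unsqueeze(-1)).squeeze(-1)
    gen_logp = token_logp[:, input_ids.shape[1] - 1:].sum()  # log-prob of continuation

    text = tokenizer.decode(sample[0, input_ids.shape[1]:], skip_special_tokens=True)
    reward = reward_fn(text)                                 # e.g., BLEU vs. a reference
    loss = -reward * gen_logp                                # policy-gradient objective
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```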
In this work, we propose MagicPose, a diffusion-based model for 2D human pose and facial expression retargeting.
To investigate these aspects, we create and publish a novel TQA evaluation benchmark in English.