L-MAGIC: Language Model Assisted Generation of Images with Coherence

intellabs/mmpano 3 Jun 2024

However, the lack of global scene layout priors leads to subpar outputs with duplicated objects (e.g., multiple beds in a bedroom) or requires time-consuming human text inputs for each view.

Depth Estimation, Language Modelling, +2

23
0.63 stars / hour

MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series

multimodal-art-projection/map-neo 29 May 2024

To improve the transparency of LLMs, the research community has formed to open-source truly open LLMs (e.g., Pythia, Amber, OLMo), where more details (e.g., pre-training corpus and training code) are being provided.

Language Modelling, Large Language Model

655
0.62 stars / hour

vHeat: Building Vision Models upon Heat Conduction

MzeroMiko/vHeat 26 May 2024

A fundamental problem in learning robust and expressive visual representations lies in efficiently estimating the spatial relationships of visual semantics throughout the entire image.

Computational Efficiency

70
0.56 stars / hour

Executable Code Actions Elicit Better LLM Agents

xingyaoww/code-act 1 Feb 2024

LLM agents are typically prompted to produce actions by generating JSON or text in a pre-defined format, which is usually limited by a constrained action space (e.g., the scope of pre-defined tools) and restricted flexibility (e.g., inability to compose multiple tools).

Language Modelling, Large Language Model

334
0.53 stars / hour

Mora: Enabling Generalist Video Generation via A Multi-Agent Framework

lichao-sun/mora 20 Mar 2024

Sora is the first large-scale generalist video generation model that garnered significant attention across society.

Image to Video Generation, Text-to-Video Generation, +1

1,365
0.50 stars / hour

Neighborhood-Enhanced Supervised Contrastive Learning for Collaborative Filtering

PeiJieSun/NESCL 18 Feb 2024

Using the graph-based collaborative filtering model as our backbone and following the same data augmentation methods as the existing contrastive learning model SGL, we effectively enhance the performance of the recommendation model.

Collaborative Filtering, Contrastive Learning, +2

74
0.48 stars / hour

Looking Backward: Streaming Video-to-Video Translation with Feature Banks

Jeff-LiangF/streamv2v 24 May 2024

This paper introduces StreamV2V, a diffusion model that achieves real-time streaming video-to-video (V2V) translation with user prompts.

Translation

285
0.47 stars / hour

FlashRAG: A Modular Toolkit for Efficient Retrieval-Augmented Generation Research

ruc-nlpir/flashrag 22 May 2024

With the advent of Large Language Models (LLMs), the potential of Retrieval-Augmented Generation (RAG) techniques has garnered considerable research attention.

Retrieval

680
0.46 stars / hour

Pre-training Small Base LMs with Fewer Tokens

Lightning-AI/lit-gpt 12 Apr 2024

Here we show that smaller LMs trained utilizing some of the layers of GPT-2-medium (355M) and GPT-2-large (770M) can effectively match the validation loss of their bigger counterparts when trained from scratch for the same number of training steps on the OpenWebText dataset with 9B tokens.

Language Modelling

7,241
0.45 stars / hour

GraphAny: A Foundation Model for Node Classification on Any Graph

deepgraphlearning/graphany 30 May 2024

Traditional graph ML models such as graph neural networks (GNNs) trained on graphs cannot perform inference on a new graph with feature and label spaces different from the training ones.

Node Classification

44
0.43 stars / hour