Implementation Matters in Deep Policy Gradients: A Case Study on PPO and TRPO

OpenLLMAI/OpenLLaMA2 25 May 2020

We study the roots of algorithmic progress in deep policy gradient algorithms through a case study on two popular algorithms: Proximal Policy Optimization (PPO) and Trust Region Policy Optimization (TRPO).

reinforcement-learning Reinforcement Learning (RL)

1,221
0.26 stars / hour

4D Panoptic Scene Graph Generation

jingkang50/psg4d NeurIPS 2023

To facilitate research in this new area, we build a richly annotated PSG-4D dataset consisting of 3K RGB-D videos with a total of 1M frames, each of which is labeled with 4D panoptic segmentation masks as well as fine-grained, dynamic scene graphs.

4D Panoptic Segmentation Graph Generation +5

46
0.25 stars / hour

TensorIR: An Abstraction for Automatic Tensorized Program Optimization

mlc-ai/web-llm 9 Jul 2022

Finally, we build an end-to-end framework on top of our abstraction to automatically optimize deep learning models for given tensor computation primitives.

BIG-bench Machine Learning

10,116
0.25 stars / hour

Lumina-T2X: Transforming Text into Any Modality, Resolution, and Duration via Flow-based Large Diffusion Transformers

alpha-vllm/lumina-t2x 9 May 2024

Sora unveils the potential of scaling Diffusion Transformer for generating photorealistic images and videos at arbitrary resolutions, aspect ratios, and durations, yet it still lacks sufficient implementation details.

1,078
0.24 stars / hour

MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies

openbmb/minicpm 9 Apr 2024

For data scaling, we introduce a Warmup-Stable-Decay (WSD) learning rate scheduler (LRS), conducive to continuous training and domain adaptation.

Domain Adaptation

3,968
0.24 stars / hour

HMT: Hierarchical Memory Transformer for Long Context Language Processing

OswaldHe/HMT-pytorch 9 May 2024

With an additional 0.5% - 2% of parameters, HMT can easily plug in and augment future LLMs to handle long context effectively.

Language Modelling Memorization +1

33
0.23 stars / hour

emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation

alibaba-damo-academy/FunASR 23 Dec 2023

To the best of our knowledge, emotion2vec is the first universal representation model in various emotion-related tasks, filling a gap in the field.

Self-Supervised Learning Sentiment Analysis +1

3,806
0.23 stars / hour

Actions Speak Louder than Words: Trillion-Parameter Sequential Transducers for Generative Recommendations

facebookresearch/generative-recommenders 27 Feb 2024

Large-scale recommendation systems are characterized by their reliance on high cardinality, heterogeneous features and the need to handle tens of billions of user actions on a daily basis.

Ranked #1 on Recommendation Systems on Amazon-Book (HR@10 metric)

Recommendation Systems

329
0.22 stars / hour

PuLID: Pure and Lightning ID Customization via Contrastive Alignment

tothebeginning/pulid 24 Apr 2024

We propose Pure and Lightning ID customization (PuLID), a novel tuning-free ID customization method for text-to-image generation.

Text-to-Image Generation

767
0.22 stars / hour

Vidur: A Large-Scale Simulation Framework For LLM Inference

microsoft/vidur 8 May 2024

Vidur models the performance of LLM operators using a combination of experimental profiling and predictive modeling, and evaluates the end-to-end inference performance for different workloads by estimating several metrics of interest such as latency and throughput.

Scheduling

72
0.22 stars / hour