MorphMLP: An Efficient MLP-Like Backbone for Spatial-Temporal Representation Learning

MTLab/MorphMLP 24 Nov 2021

With such multi-dimension and multi-scale factorization, our MorphMLP block can achieve a great accuracy-computation balance.

Disentangling Random and Cyclic Effects in Time-Lapse Sequences

harskish/tlgan 4 Jul 2022

We introduce the problem of disentangling time-lapse sequences in a way that allows separate, after-the-fact control of overall trends, cyclic effects, and random effects in the images, and describe a technique based on data-driven generative models that achieves this goal.

Is ChatGPT A Good Translator? A Preliminary Study

wxjiao/is-chatgpt-a-good-translator 20 Jan 2023

By evaluating on a number of benchmark test sets, we find that ChatGPT performs competitively with commercial translation products (e.g., Google Translate) on high-resource European languages but lags behind significantly on low-resource or distant languages.

Fast-BEV: A Fast and Strong Bird's-Eye View Perception Baseline

sense-gvt/fast-bev 29 Jan 2023

Our Fast-BEV consists of five parts. We propose (1) a lightweight, deployment-friendly view transformation that quickly transfers 2D image features to 3D voxel space, (2) a multi-scale image encoder that leverages multi-scale information for better performance, (3) an efficient BEV encoder that is specifically designed to speed up on-vehicle inference.

A Length-Extrapolatable Transformer

microsoft/torchscale 20 Dec 2022

Position modeling plays a critical role in Transformers.

StyleGAN-T: Unlocking the Power of GANs for Fast Large-Scale Text-to-Image Synthesis

autonomousvision/stylegan-t 23 Jan 2023

Text-to-image synthesis has recently seen significant progress thanks to large pretrained language models, large-scale training data, and the introduction of scalable model families such as diffusion and autoregressive models.

Scaling Language-Image Pre-training via Masking

ofa-sys/chinese-clip 1 Dec 2022

We present Fast Language-Image Pre-training (FLIP), a simple and more efficient method for training CLIP.

From Semi-supervised to Omni-supervised Room Layout Estimation Using Point Clouds

air-discover/omni-pq 31 Jan 2023

But adapting this scheme to the state-of-the-art (SOTA) solution for PC-based layout estimation is not straightforward.

In-Context Retrieval-Augmented Language Models

ai21labs/in-context-ralm 31 Jan 2023

Retrieval-Augmented Language Modeling (RALM) methods, which condition a language model (LM) on relevant documents from a grounding corpus during generation, have been shown to significantly improve language modeling while also providing a natural source attribution mechanism.

The Flan Collection: Designing Data and Methods for Effective Instruction Tuning

google-research/flan 31 Jan 2023

We study the design decisions of publicly available instruction tuning methods, and break down the development of Flan 2022 (Chung et al., 2022).

