Learning the Beauty in Songs: Neural Singing Voice Beautifier

MoonInTheRiver/DiffSinger ACL 2022

Furthermore, we propose a latent-mapping algorithm in the latent space to convert the amateur vocal tone to the professional one.

Dynamic Time Warping

1,776
0.98 stars / hour

Parsel: A (De-)compositional Framework for Algorithmic Reasoning with Language Models

ezelikman/parsel 20 Dec 2022

Despite recent success in large language model (LLM) reasoning, LLMs struggle with hierarchical multi-step reasoning tasks like generating complex programs.

Automated Theorem Proving, Code Generation, +2

89
0.82 stars / hour

Towards Robust Blind Face Restoration with Codebook Lookup Transformer

sczhou/codeformer 22 Jun 2022

In this paper, we demonstrate that a learned discrete codebook prior in a small proxy space largely reduces the uncertainty and ambiguity of restoration mapping by casting blind face restoration as a code prediction task, while providing rich visual atoms for generating high-quality faces.

Blind Face Restoration

4,300
0.77 stars / hour

Image Super-Resolution using Efficient Striped Window Transformer

fried-rice-lab/friedricelab 24 Jan 2023

To further exploit the potential of the transformer, we propose a novel flexible window training strategy.

Image Super-Resolution, Single Image Super Resolution

88
0.72 stars / hour

LogAI: A Library for Log Analytics and Intelligence

salesforce/logai 31 Jan 2023

In order to enable users to perform multiple types of AI-based log analysis tasks in a uniform manner, we introduce LogAI (https://github.com/salesforce/logai), a one-stop open source library for log analytics and intelligence.

Anomaly Detection, LOG PARSING, +2

38
0.63 stars / hour

DAMO-YOLO : A Report on Real-Time Object Detection Design

tinyvision/damo-yolo 23 Nov 2022

In this report, we present a fast and accurate object detection method dubbed DAMO-YOLO, which achieves higher performance than the state-of-the-art YOLO series.

Neural Architecture Search, object-detection, +1

1,213
0.62 stars / hour

TencentPretrain: A Scalable and Flexible Toolkit for Pre-training Models of Different Modalities

tencent/tencentpretrain 13 Dec 2022

The proposed pre-training models of different modalities are showing a rising trend of homogeneity in their model structures, which brings the opportunity to implement different pre-training models within a uniform framework.

166
0.61 stars / hour

StyleGAN-T: Unlocking the Power of GANs for Fast Large-Scale Text-to-Image Synthesis

autonomousvision/stylegan-t 23 Jan 2023

Text-to-image synthesis has recently seen significant progress thanks to large pretrained language models, large-scale training data, and the introduction of scalable model families such as diffusion and autoregressive models.

Pretrained Language Models, Text-to-Image Generation

400
0.57 stars / hour

ThoughtSource: A central hub for large language model reasoning data

openbiolink/thoughtsource 27 Jan 2023

Large language models (LLMs) such as GPT-3 and ChatGPT have recently demonstrated impressive results across a wide range of tasks.

Language Modelling, Question Answering

251
0.56 stars / hour

MorphMLP: An Efficient MLP-Like Backbone for Spatial-Temporal Representation Learning

MTLab/MorphMLP 24 Nov 2021

With such multi-dimension and multi-scale factorization, our MorphMLP block can achieve a great accuracy-computation balance.

Ranked #18 on Action Recognition on Something-Something V2 (using extra training data)

Action Recognition, Image Classification, +2

130
0.47 stars / hour