Ankh: Optimized Protein Language Model Unlocks General-Purpose Modelling

agemagician/Ankh 16 Jan 2023

Rather than scaling up protein language models (PLMs), we seek to improve performance via protein-specific optimization.

Language Modelling Protein Function Prediction +1

0.22 stars / hour

Long-tail Detection with Effective Class-Margins

janghyuncho/ecm-loss 23 Jan 2023

Large-scale object detection and instance segmentation face a severe data imbalance.

Instance Segmentation object-detection +2

0.21 stars / hour

KERPLE: Kernelized Relative Positional Embedding for Length Extrapolation

eleutherai/gpt-neox 20 May 2022

Relative positional embeddings (RPE) have received considerable attention since RPEs effectively model the relative distance among tokens and enable length extrapolation.

Language Modelling

0.20 stars / hour

Robust Speech Recognition via Large-Scale Weak Supervision

openai/whisper Preprint 2022

We study the capabilities of speech processing systems trained simply to predict large amounts of transcripts of audio on the internet.

Robust Speech Recognition speech-recognition

0.19 stars / hour

Muse: Text-To-Image Generation via Masked Generative Transformers

lucidrains/muse-pytorch 2 Jan 2023

Compared to pixel-space diffusion models, such as Imagen and DALL-E 2, Muse is significantly more efficient because it uses discrete tokens and requires fewer sampling iterations; compared to autoregressive models, such as Parti, Muse is more efficient because it uses parallel decoding.

Language Modelling Text to image generation +1

0.18 stars / hour

Atlas: Few-shot Learning with Retrieval Augmented Language Models

facebookresearch/atlas 5 Aug 2022

Retrieval augmented models are known to excel at knowledge intensive tasks without the need for as many parameters, but it is unclear whether they work in few-shot settings.

Fact Checking Few-Shot Learning +6

0.18 stars / hour

SensorX2car: Sensors-to-car calibration for autonomous driving in road scenarios

opencalib/sensorx2car 18 Jan 2023

We present SensorX2car, a toolbox for the online calibration of sensor-to-car coordinate systems in road scenes.

Autonomous Driving

0.17 stars / hour

Word, Subword or Character? An Empirical Study of Granularity in Chinese-English NMT

ye-kyaw-thu/myword 13 Nov 2017

Our experiments show that a subword model performs best for Chinese-to-English translation when the vocabulary is not too large, while a hybrid word-character model is most suitable for English-to-Chinese translation.

Machine Translation NMT +1

0.17 stars / hour

LAION-5B: An open large-scale dataset for training next generation image-text models

mlfoundations/open_clip NeurIPS 2022 Datasets and Benchmarks 2022

We show successful replication and fine-tuning of foundational models like CLIP, GLIDE and Stable Diffusion using the dataset, and discuss further experiments enabled with an openly available dataset of this scale.

Image Generation Zero-Shot Learning

0.17 stars / hour

Mask3D for 3D Semantic Instance Segmentation

jonasschult/mask3d 6 Oct 2022

Modern 3D semantic instance segmentation approaches predominantly rely on specialized voting mechanisms followed by carefully designed geometric clustering techniques.

3D Instance Segmentation 3D Semantic Instance Segmentation

0.16 stars / hour