Pen and Paper Exercises in Machine Learning

michaelgutmann/ml-pen-and-paper-exercises 27 Jun 2022

This is a collection of (mostly) pen-and-paper exercises in machine learning.

1,085
3.77 stars / hour

Instant Neural Graphics Primitives with a Multiresolution Hash Encoding

kwea123/ngp_pl 16 Jan 2022

Neural graphics primitives, parameterized by fully connected neural networks, can be costly to train and evaluate.

3D Reconstruction 3D Shape Reconstruction +2

190
1.59 stars / hour

LViT: Language meets Vision Transformer in Medical Image Segmentation

huanglizi/lvit 29 Jun 2022

In our model, medical text annotation is introduced to compensate for the quality deficiency in image data.

Medical Image Segmentation Semantic Segmentation

104
0.81 stars / hour

BoT-SORT: Robust Associations Multi-Pedestrian Tracking

niraharon/bot-sort 29 Jun 2022

The goal of multi-object tracking (MOT) is detecting and tracking all the objects in a scene, while keeping a unique identifier for each object.

Ranked #1 on Multi-Object Tracking on MOT20 (using extra training data)

Multi-Object Tracking

57
0.59 stars / hour

Text2Human: Text-Driven Controllable Human Image Generation

yumingj/deepfashion-multimodal 31 May 2022

In this work, we present a text-driven controllable framework, Text2Human, for high-quality and diverse human generation.

Human Parsing Image Generation

136
0.56 stars / hour

Forecasting Future World Events with Neural Networks

andyzoujm/autocast 30 Jun 2022

We test language models on our forecasting task and find that performance is far below a human expert baseline.

Decision Making Language Modelling

54
0.50 stars / hour

BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation

towhee-io/towhee 28 Jan 2022

Furthermore, performance improvement has been largely achieved by scaling up the dataset with noisy image-text pairs collected from the web, which is a suboptimal source of supervision.

Image Captioning Visual Question Answering

689
0.44 stars / hour

TSM: Temporal Shift Module for Efficient Video Understanding

towhee-io/towhee ICCV 2019

The explosive growth in video streaming gives rise to challenges on performing video understanding at high accuracy and low computation cost.

Action Classification Action Recognition +4

689
0.43 stars / hour

Ivy: Templated Deep Learning for Inter-Framework Portability

ivy-dl/ivy 4 Feb 2021

We introduce Ivy, a templated Deep Learning (DL) framework which abstracts existing DL frameworks.

3,086
0.42 stars / hour

EdgeNeXt: Efficiently Amalgamated CNN-Transformer Architecture for Mobile Vision Applications

mmaaz60/EdgeNeXt 21 Jun 2022

Our EdgeNeXt model with 1.3M parameters achieves 71.2% top-1 accuracy on ImageNet-1K, outperforming MobileViT with an absolute gain of 2.2% with 28% reduction in FLOPs.

Image Classification +2

120
0.38 stars / hour