Pen and Paper Exercises in Machine Learning

michaelgutmann/ml-pen-and-paper-exercises 27 Jun 2022

This is a collection of (mostly) pen-and-paper exercises in machine learning.

LViT: Language meets Vision Transformer in Medical Image Segmentation

huanglizi/lvit 29 Jun 2022

In our model, medical text annotation is introduced to compensate for the quality deficiency in image data.

Medical Image Segmentation, Semantic Segmentation

Multi-Graph Fusion Networks for Urban Region Embedding

wushangbin/mgfn 24 Jan 2022

Human mobility data contains rich and abundant information, which yields comprehensive region embeddings for cross-domain tasks.

Crime Prediction

Forecasting Future World Events with Neural Networks

andyzoujm/autocast 30 Jun 2022

We test language models on our forecasting task and find that performance is far below a human expert baseline.

Decision Making, Language Modelling

Denoised MDPs: Learning World Models Better Than the World Itself

facebookresearch/denoised_mdp 30 Jun 2022

The ability to separate signal from noise, and reason with clean abstractions, is critical to intelligence.

Representation Learning

Ivy: Templated Deep Learning for Inter-Framework Portability

ivy-dl/ivy 4 Feb 2021

We introduce Ivy, a templated Deep Learning (DL) framework which abstracts existing DL frameworks.

BoT-SORT: Robust Associations Multi-Pedestrian Tracking

niraharon/bot-sort 29 Jun 2022

The goal of multi-object tracking (MOT) is detecting and tracking all the objects in a scene, while keeping a unique identifier for each object.

Multi-Object Tracking

ProGen2: Exploring the Boundaries of Protein Language Models

salesforce/progen 27 Jun 2022

Attention-based models trained on protein sequences have demonstrated incredible success at classification and generation tasks relevant for artificial intelligence-driven protein design.

BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation

towhee-io/towhee 28 Jan 2022

Furthermore, performance improvement has been largely achieved by scaling up the dataset with noisy image-text pairs collected from the web, which is a suboptimal source of supervision.

Image Captioning, Visual Question Answering

TSM: Temporal Shift Module for Efficient Video Understanding

towhee-io/towhee ICCV 2019

The explosive growth in video streaming gives rise to challenges in performing video understanding at high accuracy and low computation cost.

Action Classification, Action Recognition

