GPT4All: An Ecosystem of Open Source Compressed Language Models

nomic-ai/gpt4all • 6 Nov 2023

It is our hope that this paper acts as both a technical overview of the original GPT4All models as well as a case study on the subsequent growth of the GPT4All open source ecosystem.

65,128

Politeness Transfer: A Tag and Generate Approach

fighting41love/funNLP • ACL 2020

This paper introduces a new task of politeness transfer which involves converting non-polite sentences to polite sentences while preserving the meaning.

Sentence Style Transfer +1

64,583

CrossWOZ: A Large-Scale Chinese Cross-Domain Task-Oriented Dialogue Dataset

fighting41love/funNLP • TACL 2020

To advance multi-domain (cross-domain) dialogue modeling as well as alleviate the shortage of Chinese task-oriented datasets, we propose CrossWOZ, the first large-scale Chinese Cross-Domain Wizard-of-Oz task-oriented dataset.

Dialogue State Tracking, Task-Oriented Dialogue Systems +1

64,583

MaskNet: Introducing Feature-Wise Multiplication to CTR Ranking Models by Instance-Guided Mask

twitter/the-algorithm • 9 Feb 2021

We also turn the feed-forward layer in the DNN model into a mixture of additive and multiplicative feature interactions by proposing MaskBlock in this paper.

Ranked #9 on Click-Through Rate Prediction on Criteo

Click-Through Rate Prediction, Recommendation Systems

61,491

Adapting Large Language Model with Speech for Fully Formatted End-to-End Speech Recognition

openai/whisper • 17 Jul 2023

Most end-to-end (E2E) speech recognition models are composed of encoder and decoder blocks that perform acoustic and language modeling functions.

Decoder, Language Modelling +3

61,322

A ConvNet for the 2020s

keras-team/keras • CVPR 2022

The "Roaring 20s" of visual recognition began with the introduction of Vision Transformers (ViTs), which quickly superseded ConvNets as the state-of-the-art image classification model.

Ranked #1 on Classification on InDL

Classification, Domain Generalization +3

60,884

RegNet: Self-Regulated Network for Image Classification

keras-team/keras • 3 Jan 2021

The ResNet and its variants have achieved remarkable successes in various computer vision tasks.

Ranked #3 on Medical Image Classification on NCT-CRC-HE-100K

General Classification, Image Classification +1

60,884

Adapting the Tesseract Open Source OCR Engine for Multilingual OCR

tesseract-ocr/tesseract • ACM 2009

We describe efforts to adapt the Tesseract open source OCR engine for multiple scripts and languages.

Optical Character Recognition (OCR)

58,453

Improving the Cluster Structure Extracted from OPTICS Plots

scikit-learn/scikit-learn • Lernen, Wissen, Daten, Analysen 2018

Density-based clustering is closely associated with the two algorithms DBSCAN and OPTICS.

58,304

Scikit-learn: Machine Learning in Python

scikit-learn/scikit-learn • 2 Jan 2012

Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems.

BIG-bench Machine Learning, Clustering +3

58,304
