GPT4All: An Ecosystem of Open Source Compressed Language Models

nomic-ai/gpt4all 6 Nov 2023

It is our hope that this paper acts as both a technical overview of the original GPT4All models and a case study of the subsequent growth of the GPT4All open source ecosystem.

Politeness Transfer: A Tag and Generate Approach

fighting41love/funNLP ACL 2020

This paper introduces the new task of politeness transfer, which involves converting non-polite sentences into polite sentences while preserving their meaning.

Sentence, Style Transfer

CrossWOZ: A Large-Scale Chinese Cross-Domain Task-Oriented Dialogue Dataset

fighting41love/funNLP TACL 2020

To advance multi-domain (cross-domain) dialogue modeling and to alleviate the shortage of Chinese task-oriented datasets, we propose CrossWOZ, the first large-scale Chinese Cross-Domain Wizard-of-Oz task-oriented dataset.

Dialogue State Tracking, Task-Oriented Dialogue Systems

MaskNet: Introducing Feature-Wise Multiplication to CTR Ranking Models by Instance-Guided Mask

twitter/the-algorithm 9 Feb 2021

We also turn the feed-forward layer in the DNN model into a mixture of additive and multiplicative feature interactions by proposing MaskBlock in this paper.

Click-Through Rate Prediction, Recommendation Systems

Adapting Large Language Model with Speech for Fully Formatted End-to-End Speech Recognition

openai/whisper 17 Jul 2023

Most end-to-end (E2E) speech recognition models are composed of encoder and decoder blocks that perform acoustic and language modeling functions.

Decoder, Language Modelling

A ConvNet for the 2020s

keras-team/keras CVPR 2022

The "Roaring 20s" of visual recognition began with the introduction of Vision Transformers (ViTs), which quickly superseded ConvNets as the state-of-the-art image classification model.

Classification, Domain Generalization

RegNet: Self-Regulated Network for Image Classification

keras-team/keras 3 Jan 2021

ResNet and its variants have achieved remarkable success in various computer vision tasks.

General Classification, Image Classification

Adapting the Tesseract Open Source OCR Engine for Multilingual OCR

tesseract-ocr/tesseract ACM 2009

We describe efforts to adapt the Tesseract open source OCR engine for multiple scripts and languages.

Optical Character Recognition (OCR)

Improving the Cluster Structure Extracted from OPTICS Plots

scikit-learn/scikit-learn Lernen, Wissen, Daten, Analysen 2018

Density-based clustering is closely associated with the two algorithms DBSCAN and OPTICS.

Scikit-learn: Machine Learning in Python

scikit-learn/scikit-learn 2 Jan 2012

Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems.

BIG-bench Machine Learning, Clustering