How Close is ChatGPT to Human Experts? Comparison Corpus, Evaluation, and Detection

hello-simpleai/chatgpt-comparison-detection 18 Jan 2023

We call the collected dataset the Human ChatGPT Comparison Corpus (HC3).

Salesforce CausalAI Library: A Fast and Scalable Framework for Causal Analysis of Time Series and Tabular Data

salesforce/causalai 25 Jan 2023

We introduce the Salesforce CausalAI Library, an open-source library for causal analysis using observational data.

Causal Discovery Causal Inference

Diffusion Models for Causal Discovery via Topological Ordering

vios-s/diffan 12 Oct 2022

Topological ordering approaches for causal discovery exploit this by performing graph discovery in two steps, first sequentially identifying nodes in reverse order of depth (topological ordering), and secondly pruning the potential relations.

Causal Discovery
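
The two-step structure described in this entry (find an ordering, then prune) can be illustrated with a minimal sketch. It is not the DiffAN implementation: the function names, the toy ordering, and the pruning rule are all hypothetical, and the ordering is assumed to be already known and listed with causes before effects.

```python
from itertools import combinations

def candidate_edges(order):
    """Given a topological ordering (causes first), return every edge
    that is consistent with it: each earlier node may point to each later node."""
    return {(a, b) for a, b in combinations(order, 2)}

def prune(edges, keep_edge):
    """Second step: keep only the edges accepted by a pruning rule.
    `keep_edge` is a hypothetical stand-in for a statistical test."""
    return {(a, b) for (a, b) in edges if keep_edge(a, b)}

# Toy example with an assumed ordering and an assumed pruning rule.
order = ["x", "y", "z"]
candidates = candidate_edges(order)
graph = prune(candidates, lambda a, b: (a, b) != ("x", "z"))
```

Fixing the ordering first means the pruning step only has to decide which of the order-consistent edges to keep, so the result is acyclic by construction.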

Robust Speech Recognition via Large-Scale Weak Supervision

openai/whisper Preprint 2022

We study the capabilities of speech processing systems trained simply to predict large amounts of transcripts of audio on the internet.

Robust Speech Recognition speech-recognition

LION: Latent Point Diffusion Models for 3D Shape Generation

nv-tlabs/LION 12 Oct 2022

To advance 3D DDMs and make them useful for digital artists, we require (i) high generation quality, (ii) flexibility for manipulation and applications such as conditional synthesis and shape interpolation, and (iii) the ability to output smooth surfaces or meshes.

3D Shape Generation Denoising

Is ChatGPT A Good Translator? A Preliminary Study

wxjiao/is-chatgpt-a-good-translator 20 Jan 2023

This report provides a preliminary evaluation of ChatGPT for machine translation, including translation prompt, multilingual translation, and translation robustness.

Machine Translation Translation

Generate rather than Retrieve: Large Language Models are Strong Context Generators

wyu97/GenRead 21 Sep 2022

We call our method generate-then-read (GenRead), which first prompts a large language model to generate contextual documents based on a given question, and then reads the generated documents to produce the final answer.

Language Modelling Open-Domain Question Answering
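
The generate-then-read flow in this entry can be sketched in a few lines. This is only an illustration under stated assumptions, not the GenRead code: `llm` is a hypothetical callable that maps a prompt string to a completion string, and both prompt templates are invented for the example.

```python
from typing import Callable, List

def generate_then_read(question: str,
                       llm: Callable[[str], str],
                       num_docs: int = 1) -> str:
    """Two-stage sketch: generate context documents, then read them to answer.
    `llm` is a hypothetical prompt -> text function supplied by the caller."""
    # Stage 1: ask the model to write background documents for the question.
    gen_prompt = (
        "Generate a background document to answer the question.\n"
        f"Question: {question}\nDocument:"
    )
    docs: List[str] = [llm(gen_prompt) for _ in range(num_docs)]

    # Stage 2: answer the question using the generated documents as context.
    context = "\n\n".join(docs)
    read_prompt = (
        "Refer to the passage below and answer the question.\n"
        f"Passage: {context}\nQuestion: {question}\nAnswer:"
    )
    return llm(read_prompt)
```

No retriever appears anywhere in the function; the only external dependency is the language model itself, which is the point the title makes.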

LAION-5B: An open large-scale dataset for training next generation image-text models

mlfoundations/open_clip NeurIPS 2022 Datasets and Benchmarks 2022

We show successful replication and fine-tuning of foundational models like CLIP, GLIDE and Stable Diffusion using the dataset, and discuss further experiments enabled with an openly available dataset of this scale.

Image Generation Zero-Shot Learning

KERPLE: Kernelized Relative Positional Embedding for Length Extrapolation

eleutherai/gpt-neox 20 May 2022

Relative positional embeddings (RPE) have received considerable attention since RPEs effectively model the relative distance among tokens and enable length extrapolation.

Language Modelling

Synthcity: facilitating innovative use cases of synthetic data in different data modalities

vanderschaarlab/synthcity 18 Jan 2023

Synthcity is an open-source software package for innovative use cases of synthetic data in ML fairness, privacy and augmentation across diverse tabular data modalities, including static data, regular and irregular time series, data with censoring, multi-source data, composite data, and more.

Fairness Irregular Time Series

