ML Reproducibility Challenge 2022

Welcome to the ML Reproducibility Challenge 2022. This is the sixth edition of the event (previous editions: v1, v2, v3, v4, v5). We are accepting reproducibility reports on papers published at eleven top ML conferences (NeurIPS 2022, ICML 2022, ICLR 2022, ACL 2022, EMNLP 2022, CVPR 2022, ECCV 2022, AAAI 2022, IJCAI-ECAI 2022, ACM FAccT 2022, and SIGIR 2022), as well as on papers published in top ML journals in 2022, including JMLR, TACL and TMLR.

The primary goal of this event is to encourage the publishing and sharing of scientific results that are reliable and reproducible. In support of this, the objective of this challenge is to investigate the reproducibility of papers accepted for publication at top conferences by inviting members of the community at large to select a paper and verify its empirical results and claims by reproducing the computational experiments, either via a new implementation or using code, data, or other information provided by the authors.

Decisions announced publicly for MLRC 2022

We are happy to announce the decisions of MLRC 2022 publicly! We had already communicated the decisions to the authors, but we held off on releasing them until our Camera Ready process was complete. We received 74 submissions, and this year marks yet another iteration of exceptionally high quality reproducibility reports. While we did have to desk reject several reports due to double-blind anonymity violations, formatting issues, and incorrect submissions, the quality of the submissions again improved sharply from last year! After an extensive peer review and meta review process, we are delighted to accept 45 reports to the program, all of which raise the bar for the standard and process of reproducibility efforts in Machine Learning.

All papers, decisions and reviews can be viewed on our OpenReview platform.


Following the tradition set last iteration, we are presenting best paper awards to a few select reports to highlight the all-round excellent quality of their reproducibility work. The selection was based on votes from the Area Chairs, who considered the reproducibility motivation, experimental depth, results beyond the original paper, ablation studies, and discussion/recommendations. Since the quality of these top papers is exceptionally high, we decided to change the “Best paper” award nomenclature to “Outstanding Paper” and “Outstanding Paper (Honorable Mentions)” to more closely reflect the overall merits of the best performing papers. We believe the community will appreciate the strong reproducibility efforts in each of these papers, which will improve the understanding of the original publications and inspire authors to promote better science in their own work. Congratulations to all!

Outstanding Paper Awards

  • A Replication Study of Compositional Generalization Works on Semantic Parsing, Kaiser Sun, Adina Williams, Dieuwke Hupkes, Paper, OpenReview
  • [Re] Pure Noise to the Rescue of Insufficient Data, Seungjae Ryan Lee, Seungmin Lee, Paper, OpenReview

Outstanding Paper Awards (Honorable Mentions)

  • [Re] On Explainability of Graph Neural Networks via Subgraph Explorations, Yannik Mahlau, Lukas Berg, Leo Kayser, Paper, OpenReview
  • Towards Understanding Grokking, Alexander Shabalin, Ildus Sadrtdinov, Evgeniy Shabalin, Paper, OpenReview
  • [Re] Reproducibility Study of Behavior Transformers, Skander Moalla, Manuel Madeira, Lorenzo Riccio, Joonhyung Lee, Paper, OpenReview

Camera Ready Papers

All camera ready accepted papers will be available soon in our ReScience C Journal publication, Volume 9, Issue 2. Congratulations to all authors!

  • [Re] G-Mixup: Graph Data Augmentation for Graph Classification, Ermin Omeragić, Vuk Đuranović, Paper, OpenReview
  • [Re] Exact Feature Distribution Matching for Arbitrary Style Transfer and Domain Generalization, Mert Erkol, Furkan Kınlı, Barış Özcan, Furkan Kıraç, Paper, OpenReview
  • [Re] End-to-end Algorithm Synthesis with Recurrent Networks: Logical Extrapolation Without Overthinking, Sean McLeish, Long Tran-Thanh, Paper, OpenReview
  • [Re] Label-Free Explainability for Unsupervised Models, Eric Robin Langezaal, Jesse Belleman, Tim Veenboer, Joeri Noorthoek, Paper, OpenReview
  • [Re] Exploring the Representation of Word Meanings in Context, Matteo Brivio, Çağrı Çöltekin, Paper, OpenReview
  • [Re] Intriguing Properties of Contrastive Losses, Luca Marini, Mohamad Nabeel, Alexandre Loiko, Paper, OpenReview
  • [Re] Bandit Theory and Thompson Sampling-guided Directed Evolution for Sequence Optimization, Luka Žontar, Paper, OpenReview
  • [Re] Hypergraph-Induced Semantic Tuplet Loss for Deep Metric Learning, Jicheng Yuan, Danh Le-Phuoc, Paper, OpenReview
  • [Re] Easy Bayesian Transfer Learning with Informative Priors, Martin Špendl, Klementina Pirc, Paper, OpenReview
  • [Re] On the Reproducibility of CartoonX, Elias Dubbeldam, Aniek Eijpe, Jona Ruthardt, Robin Sasse, Paper, OpenReview
  • [Re] Reproducibility Study of “Label-Free Explainability for Unsupervised Models”, Valentinos Pariza, Avik Pal, Madhura Pawar, Quim Serra Faber, Paper, OpenReview
  • [Re] FOCUS: Flexible Optimizable Counterfactual Explanations for Tree Ensembles, Kyosuke Morita, Paper, OpenReview
  • [Re] Fairness Guarantees under Demographic Shift, Valentin Leonhard Buchner, Philip Onno Olivier Schutte, Yassin Ben Allal, Hamed Ahadi, Paper, OpenReview
  • [Re] DialSummEval - Evaluation of automatic summarization evaluation metrics, Patrick Camara, Mojca Kloos, Vasiliki Kyrmanidi, Agnieszka Kluska, Rorick Terlou, Lea Krause, Paper, OpenReview
  • [Re] On the Reproducibility of “FairCal: Fairness Calibration for Face Verification”, Marga Don, Satchit Chatterji, Milena Kapralova, Ryan Amaudruz, Paper, OpenReview
  • [Re] Reproducibility Study: Label-Free Explainability for Unsupervised Models, Sławomir Garcarz, Andreas Giorkatzi, Ana Ivășchescu, Theodora-Mara Pîslar, Paper, OpenReview
  • [Re] Numerical influence of ReLU’(0) on backpropagation, Tommaso Martorella, Héctor Manuel Ramirez Contreras, Daniel Cerezo García, Paper, OpenReview
  • [¬Re] A Reproducibility Case Study of “Fairness Guarantees under Demographic Shift”, Dennis Agafonov, Jelke Matthijsse, Noa Nonkes, Zjos van de Sande, Paper, OpenReview
  • [Re] Hierarchical Shrinkage: Improving the Accuracy and Interpretability of Tree-Based Methods, Domen Mohorčič, David Ocepek, Paper, OpenReview
  • [Re] Reproducibility study of Joint Multisided Exposure Fairness for Recommendation, Alessia Hu, Oline Ranum, Chrysoula Pozrikidou, Miranda Zhou, Paper, OpenReview
  • [Re] Exploring the Explainability of Bias in Image Captioning Models, Marten Türk, Luyang Busser, Daniël van Dijk, Max J.A. Bosch, Paper, OpenReview
  • [¬Re] Reproducibility study of ‘Proto2Proto: Can you recognize the car, the way I do?’, David Bikker, Gerson de Kleuver, Wenhua Hu, Bram Veenman, Paper, OpenReview
  • [Re] Reproducibility Study of “Focus On The Common Good: Group Distributional Robustness Follows”, Walter Simoncini, Ioanna Gogou, Marta Freixo Lopes, Ron Kremer, Paper, OpenReview
  • [Re] Reproducibility study of “Label-Free Explainability for Unsupervised Models”, Gergely Papp, Julius Wagenbach, Laurens Jans de Vries, Niklas Mather, Paper, OpenReview
  • [Re] Reproducibility study of “Explaining Deep Convolutional Neural Networks via Latent Visual-Semantic Filter Attention”, Erik Buis, Sebastiaan Dijkstra, Bram Heijermans, Paper, OpenReview
  • [Re] Reproducibility Study of “Quantifying Societal Bias Amplification in Image Captioning”, Farrukh Baratov, Göksenin Yüksel, Darie Petcu, Jan Bakker, Paper, OpenReview
  • [Re] On the reproducibility of “CrossWalk: Fairness-Enhanced Node Representation Learning”, Eric Zila, Jonathan Gerbscheid, Luc Sträter, Kieron Kretschmar, Paper, OpenReview
  • [Re] Reproducing FairCal: Fairness Calibration for Face Verification, Jip Greven, Simon Stallinga, Zirk Seljee, Paper, OpenReview
  • [Re] Reproducibility Study of ‘CartoonX: Cartoon Explanations of Image Classifiers’, Sina Taslimi, Luke Chin A Foeng, Pratik Kayal, Aditya Prakash Patra, Paper, OpenReview
  • [Re] Reproducibility Study of “Latent Space Smoothing for Individually Fair Representations”, Didier Merk, Denny Smit, Boaz Beukers, Tsatsral Mendsuren, Paper, OpenReview
  • [Re] Variational Neural Cellular Automata, Albert Aillet, Simon Sondén, Paper, OpenReview
  • [Re] If you like Shapley, then you’ll love the core, Anes Benmerzoug, Miguel de Benito Delgado, Paper, OpenReview
  • [Re] A Reproduction of Automatic Multi-Label Prompting: Simple and Interpretable Few-Shot Classification, Victor Livernoche, Vidya Sujaya, Paper, OpenReview
  • [¬Re] G-Mixup: Graph Data Augmentation for Graph Classification, Dylan Cordaro, Shelby Cox, Yiman Ren, Teresa Yu, Paper, OpenReview
  • [Re] Exploring the Role of Grammar and Word Choice in Bias Toward African American English (AAE) in Hate Speech Classification, Priyanka Bose, Chandra Shekhar Pandey, Fraida Fund, Paper, OpenReview
  • [Re] RELIC: Reproducibility and Extension on LIC metric in quantifying bias in captioning models, Paula Antequera, Egoitz Gonzalez, Marta Grasa, Martijn van Raaphorst, Paper, OpenReview
  • [Re] VAE Approximation Error: ELBO and Exponential Families, Volodymyr Kyrylov, Navdeep Singh Bedi, Qianbo Zang, Paper, OpenReview
  • [Re] CrossWalk Fairness-enhanced Node Representation Learning, Gijs Joppe Moens, Job de Witte, Tobias Pieter Göbel, Meggie van den Oever, Paper, OpenReview
  • [Re] CrossWalk: Fairness-enhanced Node Representation Learning, Luca Pantea, Andrei Blahovici, Paper, OpenReview
  • [Re] Masked Autoencoders Are Small Scale Vision Learners: A Reproduction Under Resource Constraints, Athanasios Charisoudis, Simon Ekman von Huth, Emil Jansson, Paper, OpenReview

Outstanding Reviewer Awards

Our program would not have been possible without the hard work and support of our reviewers. Thus, we would also like to honor them for their timely, high quality reviews which enabled us to curate high quality reproducibility reports.

  • Divyat Mahajan
  • Furkan Kınlı
  • Fan Feng
  • Tobias Uelwer
  • Siba Smarak Panigrahi
  • Prateek Garg
  • Maxwell D Collins
  • Harman Singh
  • Pascal Lamblin
  • Taniya Seth
  • Gabriel Bénédict
  • Olivier Delalleau
  • Philipp Hager
  • Saketh Bachu
  • Sunnie S. Y. Kim

Kaggle Awards

Kaggle deserves a special mention as they partnered with us in this iteration to provide awards to the best papers and reviewers. Kaggle has provided awards in the form of Google Cloud Platform (GCP) credits worth 500k USD, which are extremely beneficial for conducting exploratory research on Google's high-performance computing platform. Kaggle sponsors these awards for outstanding papers and reviewers, based on the final decision of the Kaggle awards committee. We thank Kaggle for providing such a generous award and enabling reproducible research in the Machine Learning community.


  • [23/01/2023] Call for reviewers for RC2022 is out. Please sign up to be a reviewer for the challenge in this form. Reviewers are also eligible for GCP credit awards thanks to our sponsor Kaggle.
  • [20/12/2022] Kaggle announces $500k worth of awards for the top publications at MLRC2022.
  • [29/11/2022] Accepted reports from MLRC 2021 were featured in in-person and virtual poster sessions at NeurIPS 2022, New Orleans, USA. Check out the announcement from the NeurIPS Journal Chairs for more information.

Key Dates

  • Announcement of the challenge: August 18th, 2022
  • Submission deadline: February 3rd, 2023 (11:59PM AOE), platform: OpenReview.
  • Author notification deadline for ReScience Journal special issue: April 21st, 2023 (originally April 14th, 2023)
  • Camera Ready OpenReview update deadline: May 19th, 2023
  • Deadline to submit Kaggle notebook for the Kaggle Awards: June 1st, 2023
  • Announcement of Best Paper and Kaggle Awards: June 15th, 2023

Invitation to Participate

The challenge is a great event for community members to participate in shaping scientific practices and findings in our field. We particularly encourage participation from:

  • Course instructors of advanced ML, NLP, and CV courses, who can use this challenge as a course assignment or project.
  • Organizers of hackathons.
  • Members of ML developer communities.
  • ML enthusiasts everywhere!

How to participate

  • Check the Registration page for details on the conferences we cover, and then start working on a published paper from the list of conferences.
  • Check the Task Description page for more details on the task.
  • Check the Resources page for available resources.
  • You can find answers to common questions in our Frequently Asked Questions section.
  • Keep an eye out for the Important dates and deadlines.
  • Submit your report in our OpenReview Portal.

Participating Courses

If you are an instructor participating in RC2022 with your course, we would love to hear from you and will be happy to list your course here! Please fill out the following form with your course details:

Contact Information

For general queries regarding the challenge, mail us at

Organizing Committee


  • Melisa Bok, Celeste Martinez Gomez, Mohit Uniyal, Parag Pachpute, Andrew McCallum (OpenReview / University of Massachusetts Amherst)
  • Nicolas Rougier, Konrad Hinsen (ReScience)