Search Results for author: Christian Szegedy

Found 29 papers, 17 papers with code

Don't Trust: Verify -- Grounding LLM Quantitative Reasoning with Autoformalization

1 code implementation • 26 Mar 2024 • Jin Peng Zhou, Charles Staats, Wenda Li, Christian Szegedy, Kilian Q. Weinberger, Yuhuai Wu

Large language models (LLMs), such as Google's Minerva and OpenAI's GPT families, are becoming increasingly capable of solving mathematical quantitative reasoning problems.

Automated Theorem Proving • GSM8K • +1

Magnushammer: A Transformer-Based Approach to Premise Selection

no code implementations • 8 Mar 2023 • Maciej Mikuła, Szymon Tworkowski, Szymon Antoniak, Bartosz Piotrowski, Albert Qiaochu Jiang, Jin Peng Zhou, Christian Szegedy, Łukasz Kuciński, Piotr Miłoś, Yuhuai Wu

By combining Magnushammer with a language-model-based automated theorem prover, we further improve the state-of-the-art proof success rate from $57.0\%$ to $71.0\%$ on the PISA benchmark using $4$x fewer parameters.

Automated Theorem Proving • Language Modelling • +1

Autoformalization with Large Language Models

no code implementations • 25 May 2022 • Yuhuai Wu, Albert Q. Jiang, Wenda Li, Markus N. Rabe, Charles Staats, Mateja Jamnik, Christian Szegedy

Autoformalization is the process of automatically translating from natural language mathematics to formal specifications and proofs.

Ranked #1 on Automated Theorem Proving on miniF2F-test (using extra training data)

Automated Theorem Proving • Program Synthesis

Memorizing Transformers

3 code implementations • ICLR 2022 • Yuhuai Wu, Markus N. Rabe, DeLesley Hutchins, Christian Szegedy

Language models typically need to be trained or finetuned in order to acquire new knowledge, which involves updating their weights.

Language Modelling • Math

Hierarchical Transformers Are More Efficient Language Models

3 code implementations • Findings (NAACL) 2022 • Piotr Nawrot, Szymon Tworkowski, Michał Tyrolski, Łukasz Kaiser, Yuhuai Wu, Christian Szegedy, Henryk Michalewski

Transformer models yield impressive results on many NLP and sequence modeling tasks.

Ranked #4 on Image Generation on ImageNet 32x32 (bpd metric)

Image Generation • Language Modelling

LIME: Learning Inductive Bias for Primitives of Mathematical Reasoning

1 code implementation • 15 Jan 2021 • Yuhuai Wu, Markus Rabe, Wenda Li, Jimmy Ba, Roger Grosse, Christian Szegedy

While designing inductive bias in neural architectures has been widely studied, we hypothesize that transformer networks are flexible enough to learn inductive bias from suitable generic tasks.

Inductive Bias • Mathematical Reasoning

Mathematical Reasoning via Self-supervised Skip-tree Training

no code implementations • ICLR 2021 • Markus N. Rabe, Dennis Lee, Kshitij Bansal, Christian Szegedy

We examine whether self-supervised language modeling applied to mathematical formulas enables logical reasoning.

Language Modelling • Logical Reasoning • +1

Mathematical Reasoning in Latent Space

no code implementations • ICLR 2020 • Dennis Lee, Christian Szegedy, Markus N. Rabe, Sarah M. Loos, Kshitij Bansal

We design and conduct a simple experiment to study whether neural networks can perform several steps of approximate reasoning in a fixed dimensional latent space.

Mathematical Reasoning

Learning to Reason in Large Theories without Imitation

no code implementations • 25 May 2019 • Kshitij Bansal, Christian Szegedy, Markus N. Rabe, Sarah M. Loos, Viktor Toman

Our experiments show that the theorem prover trained with this exploration mechanism outperforms provers that are trained only on human proofs.

Ranked #3 on Automated Theorem Proving on HOList benchmark

Automated Theorem Proving • Imitation Learning • +2

Graph Representations for Higher-Order Logic and Theorem Proving

no code implementations • 24 May 2019 • Aditya Paliwal, Sarah Loos, Markus Rabe, Kshitij Bansal, Christian Szegedy

This paper presents the first use of graph neural networks (GNNs) for higher-order proof search and demonstrates that GNNs can improve upon state-of-the-art results in this domain.

Ranked #1 on Automated Theorem Proving on HOList benchmark

Automated Theorem Proving

HOList: An Environment for Machine Learning of Higher-Order Theorem Proving

3 code implementations • 5 Apr 2019 • Kshitij Bansal, Sarah M. Loos, Markus N. Rabe, Christian Szegedy, Stewart Wilcox

We present an environment, benchmark, and deep learning driven automated theorem prover for higher-order logic.

Ranked #2 on Automated Theorem Proving on HOList benchmark

Automated Theorem Proving • BIG-bench Machine Learning • +2

Text Embeddings for Retrieval From a Large Knowledge Base

no code implementations • ICLR 2019 • Tolgahan Cakaloglu, Christian Szegedy, Xiaowei Xu

Text embedding representing natural language documents in a semantic vector space can be used for document retrieval using nearest neighbor lookup.

Open-Domain Question Answering • Retrieval

HolStep: A Machine Learning Dataset for Higher-order Logic Theorem Proving

1 code implementation • 1 Mar 2017 • Cezary Kaliszyk, François Chollet, Christian Szegedy

We propose various machine learning tasks that can be performed on this dataset, and discuss their significance for theorem proving.

Ranked #3 on Automated Theorem Proving on HolStep (Unconditional)

Automated Theorem Proving • BIG-bench Machine Learning

Deep Network Guided Proof Search

no code implementations • 24 Jan 2017 • Sarah Loos, Geoffrey Irving, Christian Szegedy, Cezary Kaliszyk

Here we suggest deep learning based guidance in the proof search of the theorem prover E. We train and compare several deep neural network models on the traces of existing ATP proofs of Mizar statements and use them to select processed clauses during proof search.

Game of Go • Image Captioning • +5

DeepMath - Deep Sequence Models for Premise Selection

2 code implementations • NeurIPS 2016 • Alex A. Alemi, Francois Chollet, Niklas Een, Geoffrey Irving, Christian Szegedy, Josef Urban

We study the effectiveness of neural sequence models for premise selection in automated theorem proving, one of the main bottlenecks in the formalization of mathematics.

Automated Theorem Proving

Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning

87 code implementations • 23 Feb 2016 • Christian Szegedy, Sergey Ioffe, Vincent Vanhoucke, Alex Alemi

Recently, the introduction of residual connections in conjunction with a more traditional architecture has yielded state-of-the-art performance in the 2015 ILSVRC challenge; its performance was similar to the latest generation Inception-v3 network.

Ranked #4 on Classification on InDL

Classification • General Classification • +1

Large Scale Business Discovery from Street Level Imagery

no code implementations • 17 Dec 2015 • Qian Yu, Christian Szegedy, Martin C. Stumpe, Liron Yatziv, Vinay Shet, Julian Ibarz, Sacha Arnoud

Precise business store front detection enables accurate geo-location of businesses, and further provides input for business categorization, listing generation, etc.

SSD: Single Shot MultiBox Detector

223 code implementations • 8 Dec 2015 • Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, Alexander C. Berg

Experimental results on the PASCAL VOC, MS COCO, and ILSVRC datasets confirm that SSD has comparable accuracy to methods that utilize an additional object proposal step and is much faster, while providing a unified framework for both training and inference.

Ranked #3 on Object Detection on PASCAL VOC 2012

LIDAR Semantic Segmentation • Low-Light Image Enhancement • +4

Rethinking the Inception Architecture for Computer Vision

112 code implementations • CVPR 2016 • Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jonathon Shlens, Zbigniew Wojna

Convolutional networks are at the core of most state-of-the-art computer vision solutions for a wide variety of tasks.

Ranked #8 on Retinal OCT Disease Classification on OCT2017

Computational Efficiency • Image Classification • +2

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

no code implementations • ICML 2015 • Sergey Ioffe, Christian Szegedy

Training Deep Neural Networks is complicated by the fact that the distribution of each layer’s inputs changes during training, as the parameters of the previous layers change. This slows down the training by requiring lower learning rates and careful parameter initialization, and makes it notoriously hard to train models with saturating nonlinearities.

General Classification • Image Classification

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

74 code implementations • 11 Feb 2015 • Sergey Ioffe, Christian Szegedy

Training Deep Neural Networks is complicated by the fact that the distribution of each layer's inputs changes during training, as the parameters of the previous layers change.

Ranked #487 on Image Classification on ImageNet (Number of params metric)

General Classification • Image Classification

Training Deep Neural Networks on Noisy Labels with Bootstrapping

3 code implementations • 20 Dec 2014 • Scott Reed, Honglak Lee, Dragomir Anguelov, Christian Szegedy, Dumitru Erhan, Andrew Rabinovich

On MNIST handwritten digits, we show that our model is robust to label corruption.

Emotion Recognition • Object Recognition

Explaining and Harnessing Adversarial Examples

59 code implementations • 20 Dec 2014 • Ian J. Goodfellow, Jonathon Shlens, Christian Szegedy

Several machine learning models, including neural networks, consistently misclassify adversarial examples: inputs formed by applying small but intentionally worst-case perturbations to examples from the dataset, such that the perturbed input results in the model outputting an incorrect answer with high confidence.

Ranked #57 on Image Classification on MNIST

Image Classification

Scalable, High-Quality Object Detection

no code implementations • 3 Dec 2014 • Christian Szegedy, Scott Reed, Dumitru Erhan, Dragomir Anguelov, Sergey Ioffe

Using the multi-scale convolutional MultiBox (MSC-MultiBox) approach, we substantially advance the state-of-the-art on the ILSVRC 2014 detection challenge data set, with $0.5$ mAP for a single model and $0.52$ mAP for an ensemble of two models.

Object • object-detection • +2

Going Deeper with Convolutions

79 code implementations • CVPR 2015 • Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich

We propose a deep convolutional neural network architecture codenamed "Inception", which was responsible for setting the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC 2014).

General Classification • Image Classification • +2

Intriguing properties of neural networks

12 code implementations • 21 Dec 2013 • Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, Rob Fergus

Deep neural networks are highly expressive models that have recently achieved state of the art performance on speech and visual recognition tasks.

DeepPose: Human Pose Estimation via Deep Neural Networks

7 code implementations • CVPR 2014 • Alexander Toshev, Christian Szegedy

We propose a method for human pose estimation based on Deep Neural Networks (DNNs).

Pose Estimation • regression

Scalable Object Detection using Deep Neural Networks

6 code implementations • CVPR 2014 • Dumitru Erhan, Christian Szegedy, Alexander Toshev, Dragomir Anguelov

Deep convolutional neural networks have recently achieved state-of-the-art performance on a number of image recognition benchmarks, including the ImageNet Large-Scale Visual Recognition Challenge (ILSVRC-2012).

Object • object-detection • +2

Deep Neural Networks for Object Detection

no code implementations • NeurIPS 2013 • Christian Szegedy, Alexander Toshev, Dumitru Erhan

Deep Neural Networks (DNNs) have recently shown outstanding performance on the task of whole image classification.

General Classification • Image Classification • +4
