Search Results for author: Izzeddin Gur

Found 24 papers, 5 papers with code

Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models

no code implementations • 11 Dec 2023 • Avi Singh, John D. Co-Reyes, Rishabh Agarwal, Ankesh Anand, Piyush Patil, Xavier Garcia, Peter J. Liu, James Harrison, Jaehoon Lee, Kelvin Xu, Aaron Parisi, Abhishek Kumar, Alex Alemi, Alex Rizkowsky, Azade Nova, Ben Adlam, Bernd Bohnet, Gamaleldin Elsayed, Hanie Sedghi, Igor Mordatch, Isabelle Simpson, Izzeddin Gur, Jasper Snoek, Jeffrey Pennington, Jiri Hron, Kathleen Kenealy, Kevin Swersky, Kshiteej Mahajan, Laura Culp, Lechao Xiao, Maxwell L. Bileschi, Noah Constant, Roman Novak, Rosanne Liu, Tris Warkentin, Yundi Qian, Yamini Bansal, Ethan Dyer, Behnam Neyshabur, Jascha Sohl-Dickstein, Noah Fiedel

To do so, we investigate a simple self-training method based on expectation-maximization, which we call ReST$^{EM}$, where we (1) generate samples from the model and filter them using binary feedback, (2) fine-tune the model on these samples, and (3) repeat this process a few times.

Math

Exposing Limitations of Language Model Agents in Sequential-Task Compositions on the Web

1 code implementation • 30 Nov 2023 • Hiroki Furuta, Yutaka Matsuo, Aleksandra Faust, Izzeddin Gur

We show that while existing prompted LMAs (gpt-3.5-turbo or gpt-4) achieve 94.0% average success rate on base tasks, their performance degrades to 24.9% success rate on compositional tasks.

Decision Making Language Modelling

Frontier Language Models are not Robust to Adversarial Arithmetic, or "What do I need to say so you agree 2+2=5?"

no code implementations • 8 Nov 2023 • C. Daniel Freeman, Laura Culp, Aaron Parisi, Maxwell L Bileschi, Gamaleldin F Elsayed, Alex Rizkowsky, Isabelle Simpson, Alex Alemi, Azade Nova, Ben Adlam, Bernd Bohnet, Gaurav Mishra, Hanie Sedghi, Igor Mordatch, Izzeddin Gur, Jaehoon Lee, JD Co-Reyes, Jeffrey Pennington, Kelvin Xu, Kevin Swersky, Kshiteej Mahajan, Lechao Xiao, Rosanne Liu, Simon Kornblith, Noah Constant, Peter J. Liu, Roman Novak, Yundi Qian, Noah Fiedel, Jascha Sohl-Dickstein

We introduce and study the problem of adversarial arithmetic, which provides a simple yet challenging testbed for language model alignment.

Language Modelling

Small-scale proxies for large-scale Transformer training instabilities

no code implementations • 25 Sep 2023 • Mitchell Wortsman, Peter J. Liu, Lechao Xiao, Katie Everett, Alex Alemi, Ben Adlam, John D. Co-Reyes, Izzeddin Gur, Abhishek Kumar, Roman Novak, Jeffrey Pennington, Jascha Sohl-Dickstein, Kelvin Xu, Jaehoon Lee, Justin Gilmer, Simon Kornblith

In this work, we seek ways to reproduce and study training stability and instability at smaller scales.

A Real-World WebAgent with Planning, Long Context Understanding, and Program Synthesis

no code implementations • 24 Jul 2023 • Izzeddin Gur, Hiroki Furuta, Austin Huang, Mustafa Safdari, Yutaka Matsuo, Douglas Eck, Aleksandra Faust

Pre-trained large language models (LLMs) have recently achieved better generalization and sample efficiency in autonomous web automation.

Ranked #1 on Mind2Web

Code Generation Denoising +3

Multimodal Web Navigation with Instruction-Finetuned Foundation Models

no code implementations • 19 May 2023 • Hiroki Furuta, Kuang-Huei Lee, Ofir Nachum, Yutaka Matsuo, Aleksandra Faust, Shixiang Shane Gu, Izzeddin Gur

The progress of autonomous web navigation has been hindered by the dependence on billions of exploratory interactions via online reinforcement learning, and domain-specific model designs that make it difficult to leverage generalization from rich out-of-domain data.

Autonomous Web Navigation Instruction Following +1

Multi-Agent Reinforcement Learning for Microprocessor Design Space Exploration

no code implementations • 29 Nov 2022 • Srivatsan Krishnan, Natasha Jaques, Shayegan Omidshafiei, Dan Zhang, Izzeddin Gur, Vijay Janapa Reddi, Aleksandra Faust

It is unclear how scalable single-agent formulations are as we increase the complexity of the design space (e.g., full stack System-on-Chip design).

Compiler Optimization Multi-agent Reinforcement Learning +2

CLUTR: Curriculum Learning via Unsupervised Task Representation Learning

1 code implementation • 19 Oct 2022 • Abdus Salam Azad, Izzeddin Gur, Jasper Emhoff, Nathaniel Alexis, Aleksandra Faust, Pieter Abbeel, Ion Stoica

Recently, Unsupervised Environment Design (UED) emerged as a new paradigm for zero-shot generalization by simultaneously learning a task distribution and agent policies on the generated tasks.

Reinforcement Learning (RL) Representation Learning +1

Understanding HTML with Large Language Models

no code implementations • 8 Oct 2022 • Izzeddin Gur, Ofir Nachum, Yingjie Miao, Mustafa Safdari, Austin Huang, Aakanksha Chowdhery, Sharan Narang, Noah Fiedel, Aleksandra Faust

We contribute HTML understanding models (fine-tuned LLMs) and an in-depth analysis of their capabilities under three tasks: (i) Semantic Classification of HTML elements, (ii) Description Generation for HTML inputs, and (iii) Autonomous Web Navigation of HTML pages.

Autonomous Web Navigation Retrieval

Fast Inference and Transfer of Compositional Task Structures for Few-shot Task Generalization

no code implementations • 25 May 2022 • Sungryull Sohn, Hyunjae Woo, Jongwook Choi, Lyubing Qiang, Izzeddin Gur, Aleksandra Faust, Honglak Lee

Unlike previous meta-RL methods that try to directly infer an unstructured task embedding, our multi-task subtask graph inferencer (MTSGI) first infers the common high-level task structure, in terms of the subtask graph, from the training tasks, and uses it as a prior to improve task inference at test time.

Hierarchical Reinforcement Learning Meta Reinforcement Learning +2

Environment Generation for Zero-Shot Compositional Reinforcement Learning

1 code implementation • NeurIPS 2021 • Izzeddin Gur, Natasha Jaques, Yingjie Miao, Jongwook Choi, Manoj Tiwari, Honglak Lee, Aleksandra Faust

We learn to generate environments composed of multiple pages or rooms, and train RL agents capable of completing a wide range of complex tasks in those environments.

Navigate reinforcement-learning +1

Less is More: Generating Grounded Navigation Instructions from Landmarks

no code implementations • CVPR 2022 • Su Wang, Ceslee Montgomery, Jordi Orbay, Vighnesh Birodkar, Aleksandra Faust, Izzeddin Gur, Natasha Jaques, Austin Waters, Jason Baldridge, Peter Anderson

We study the automatic generation of navigation instructions from 360-degree images captured on indoor routes.

Instruction Following Visual Grounding

Targeted Environment Design from Offline Data

no code implementations • 29 Sep 2021 • Izzeddin Gur, Ofir Nachum, Aleksandra Faust

We formalize our approach as offline targeted environment design (OTED), which automatically learns a distribution over simulator parameters to match a provided offline dataset, and then uses the learned simulator to train an RL agent in standard online fashion.

Offline RL Reinforcement Learning (RL)

SparseDice: Imitation Learning for Temporally Sparse Data via Regularization

no code implementations • ICML Workshop URL 2021 • Alberto Camacho, Izzeddin Gur, Marcin Lukasz Moczulski, Ofir Nachum, Aleksandra Faust

We are concerned with a setting where the demonstrations comprise only a subset of state-action pairs (as opposed to the whole trajectories).

Imitation Learning

Adversarial Environment Generation for Learning to Navigate the Web

1 code implementation • 2 Mar 2021 • Izzeddin Gur, Natasha Jaques, Kevin Malta, Manoj Tiwari, Honglak Lee, Aleksandra Faust

The regret objective trains the adversary to design a curriculum of environments that are "just-the-right-challenge" for the navigator agents; our results show that over time, the adversary learns to generate increasingly complex web navigation tasks.

Benchmarking Decision Making +2

Assessing Post-Disaster Damage from Satellite Imagery using Semi-Supervised Learning Techniques

no code implementations • 24 Nov 2020 • Jihyeon Lee, Joseph Z. Xu, Kihyuk Sohn, Wenhan Lu, David Berthelot, Izzeddin Gur, Pranav Khaitan, Ke-Wei Huang, Kyriacos Koupparis, Bernhard Kowatsch

To respond to disasters such as earthquakes, wildfires, and armed conflicts, humanitarian organizations require accurate and timely data in the form of damage assessments, which indicate what buildings and population centers have been most affected.

BIG-bench Machine Learning Disaster Response +1

Learning to Navigate the Web

no code implementations • ICLR 2019 • Izzeddin Gur, Ulrich Rueckert, Aleksandra Faust, Dilek Hakkani-Tur

Even though recent approaches improve the success rate on relatively simple environments with the help of human demonstrations to guide the exploration, they still fail in environments where the set of possible instructions can reach millions.

Instruction Following Meta-Learning +3

User Modeling for Task Oriented Dialogues

no code implementations • 11 Nov 2018 • Izzeddin Gur, Dilek Hakkani-Tur, Gokhan Tur, Pararth Shah

We further develop several variants by utilizing a latent variable model to inject random variations into user responses to promote diversity in simulated user responses and a novel goal regularization mechanism to penalize divergence of user responses from the initial user goal.

Dialogue State Tracking Task-Oriented Dialogue Systems +1

What It Takes to Achieve 100% Condition Accuracy on WikiSQL

no code implementations • EMNLP 2018 • Semih Yavuz, Izzeddin Gur, Yu Su, Xifeng Yan

The SQL queries in WikiSQL are simple: Each involves one relation and does not have any join operation.

Translation

DialSQL: Dialogue Based Structured Query Generation

no code implementations • ACL 2018 • Izzeddin Gur, Semih Yavuz, Yu Su, Xifeng Yan

The recent advance in deep learning and semantic parsing has significantly improved the translation accuracy of natural language questions to structured queries.

Semantic Parsing Translation

Recovering Question Answering Errors via Query Revision

no code implementations • EMNLP 2017 • Semih Yavuz, Izzeddin Gur, Yu Su, Xifeng Yan

The existing factoid QA systems often lack a post-inspection component that can help models recover from their own mistakes.

Question Answering Semantic Parsing

Accurate Supervised and Semi-Supervised Machine Reading for Long Documents

no code implementations • EMNLP 2017 • Daniel Hewlett, Llion Jones, Alexandre Lacoste, Izzeddin Gur

We also evaluate the model in a semi-supervised setting by downsampling the WikiReading training set to create increasingly smaller amounts of supervision, while leaving the full unlabeled document corpus to train a sequence autoencoder on document windows.

Question Answering Reading Comprehension

Global Relation Embedding for Relation Extraction

2 code implementations • NAACL 2018 • Yu Su, Honglei Liu, Semih Yavuz, Izzeddin Gur, Huan Sun, Xifeng Yan

We study the problem of textual relation embedding with distant supervision.

Relation Relation Extraction

Improving Semantic Parsing via Answer Type Inference

no code implementations • EMNLP 2016 • Semih Yavuz, Izzeddin Gur, Yu Su, Mudhakar Srivatsa, Xifeng Yan

Knowledge Base Population Question Answering +3
