Search Results for author: Uri Alon

Found 33 papers, 25 papers with code

In-Context Learning with Long-Context Models: An In-Depth Exploration

no code implementations 30 Apr 2024 Amanda Bertsch, Maor Ivgi, Uri Alon, Jonathan Berant, Matthew R. Gormley, Graham Neubig

As model context lengths continue to increase, the number of demonstrations that can be provided in-context approaches the size of entire training datasets.

In-Context Learning Retrieval

Transformers Can Achieve Length Generalization But Not Robustly

no code implementations 14 Feb 2024 Yongchao Zhou, Uri Alon, Xinyun Chen, Xuezhi Wang, Rishabh Agarwal, Denny Zhou

We show that the success of length generalization is intricately linked to the data format and the type of position encoding.


In-Context Principle Learning from Mistakes

no code implementations 8 Feb 2024 Tianjun Zhang, Aman Madaan, Luyu Gao, Steven Zheng, Swaroop Mishra, Yiming Yang, Niket Tandon, Uri Alon

We evaluate LEAP on a wide range of benchmarks, including multi-hop question answering (Hotpot QA), textual QA (DROP), Big-Bench Hard reasoning, and math problems (GSM8K and MATH); in all these benchmarks, LEAP improves the strongest available LLMs such as GPT-3.5-turbo, GPT-4, GPT-4 turbo and Claude-2.1.

GSM8K In-Context Learning +3

Universal Self-Consistency for Large Language Model Generation

no code implementations 29 Nov 2023 Xinyun Chen, Renat Aksitov, Uri Alon, Jie Ren, Kefan Xiao, Pengcheng Yin, Sushant Prakash, Charles Sutton, Xuezhi Wang, Denny Zhou

Self-consistency with chain-of-thought prompting (CoT) has demonstrated remarkable performance gains on various challenging tasks, by utilizing multiple reasoning paths sampled from large language models (LLMs).

Code Generation Language Modelling +3
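
The standard self-consistency idea mentioned in the abstract above can be sketched as a majority vote over the final answers of several sampled reasoning paths. This is an illustrative sketch only, not the paper's universal variant; the function name `self_consistent_answer` and the sample answers are hypothetical, and the sampling step itself is assumed to have happened elsewhere.

```python
from collections import Counter


def self_consistent_answer(final_answers):
    """Pick the most frequent final answer among sampled reasoning paths.

    `final_answers` is assumed to hold one extracted final answer per
    sampled chain-of-thought (a hypothetical input format).
    Returns the winning answer and the fraction of paths that agree with it.
    """
    counts = Counter(final_answers)
    answer, votes = counts.most_common(1)[0]
    return answer, votes / len(final_answers)


# Five hypothetical sampled paths, three of which agree.
sampled = ["18", "18", "20", "18", "17"]
best, agreement = self_consistent_answer(sampled)
```

The agreement fraction is returned alongside the answer because it is a cheap signal of how contested the vote was.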

CAT-LM: Training Language Models on Aligned Code And Tests

2 code implementations 2 Oct 2023 Nikitha Rao, Kush Jain, Uri Alon, Claire Le Goues, Vincent J. Hellendoorn

We also drastically increase the maximum sequence length of inputs to 8,192 tokens, 4x more than typical code generation models, to ensure that the code context is available to the model when generating test code.

Code Generation Language Modelling

WebArena: A Realistic Web Environment for Building Autonomous Agents

1 code implementation 25 Jul 2023 Shuyan Zhou, Frank F. Xu, Hao Zhu, Xuhui Zhou, Robert Lo, Abishek Sridhar, Xianyi Cheng, Tianyue Ou, Yonatan Bisk, Daniel Fried, Uri Alon, Graham Neubig

Building upon our environment, we release a set of benchmark tasks focusing on evaluating the functional correctness of task completions.

GPT-Calls: Enhancing Call Segmentation and Tagging by Generating Synthetic Conversations via Large Language Models

no code implementations 9 Jun 2023 Itzik Malkiel, Uri Alon, Yakir Yehuda, Shahar Keren, Oren Barkan, Royi Ronen, Noam Koenigstein

The online phase is applied to every call separately and scores the similarity between the transcribed conversation and the topic anchors found in the offline phase.

Segmentation TAG

On the Expressivity Role of LayerNorm in Transformers' Attention

1 code implementation 4 May 2023 Shaked Brody, Uri Alon, Eran Yahav

Layer Normalization (LayerNorm) is an inherent component in all Transformer-based models.

Language Modelling

Unlimiformer: Long-Range Transformers with Unlimited Length Input

1 code implementation NeurIPS 2023 Amanda Bertsch, Uri Alon, Graham Neubig, Matthew R. Gormley

This kNN index can be kept on either the GPU or CPU memory and queried in sub-linear time; this way, we can index practically unlimited input sequences, while every attention head in every decoder layer retrieves its top-k keys, instead of attending to every key.

Book summarization Decoder
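
The retrieval step described in the abstract above, where a head attends to its top-k keys rather than to every key, can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the function name `topk_attention` is hypothetical, and the exhaustive sort stands in for the kNN index, which would perform the lookup in sub-linear time.

```python
import math


def topk_attention(query, keys, values, k):
    """Attend only to the k keys with the highest dot-product score.

    Returns the indices of the retrieved keys and the attention output.
    """
    scores = [sum(q * x for q, x in zip(query, key)) for key in keys]
    # Exhaustive top-k search; a kNN index would replace this step.
    top = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)[:k]
    # Softmax over the retrieved scores only (shifted by the max for stability).
    peak = max(scores[i] for i in top)
    weights = [math.exp(scores[i] - peak) for i in top]
    total = sum(weights)
    dim = len(values[0])
    output = [
        sum(w / total * values[i][d] for w, i in zip(weights, top))
        for d in range(dim)
    ]
    return top, output


# Toy example: four keys, of which only the two best-scoring are attended to.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [5.0, 0.0], [-1.0, 0.0]]
values = [[1.0], [2.0], [3.0], [4.0]]
retrieved, out = topk_attention(query, keys, values, k=2)
```

With `k` equal to the number of keys, the sketch reduces to ordinary softmax attention over all keys, which makes it easy to compare the two.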

Learning Performance-Improving Code Edits

2 code implementations 15 Feb 2023 Alexander Shypula, Aman Madaan, Yimeng Zeng, Uri Alon, Jacob Gardner, Milad Hashemi, Graham Neubig, Parthasarathy Ranganathan, Osbert Bastani, Amir Yazdanbakhsh

Next, we propose a broad range of adaptation strategies for code optimization; for prompting, these include retrieval-based few-shot prompting and chain-of-thought, and for finetuning, these include performance-conditioned generation and synthetic data augmentation based on self-play.

Code Generation Code Repair +2

CodeBERTScore: Evaluating Code Generation with Pretrained Models of Code

1 code implementation 10 Feb 2023 Shuyan Zhou, Uri Alon, Sumit Agarwal, Graham Neubig

We release five language-specific pretrained models to use with our publicly available code.

Code Generation

Why do Nearest Neighbor Language Models Work?

1 code implementation 7 Jan 2023 Frank F. Xu, Uri Alon, Graham Neubig

Language models (LMs) compute the probability of a text by sequentially computing a representation of an already-seen context and using this representation to predict the next word.


PAL: Program-aided Language Models

3 code implementations 18 Nov 2022 Luyu Gao, Aman Madaan, Shuyan Zhou, Uri Alon, PengFei Liu, Yiming Yang, Jamie Callan, Graham Neubig

Much of this success can be attributed to prompting methods such as "chain-of-thought", which employ LLMs both for understanding the problem description by decomposing it into steps and for solving each step of the problem.

Arithmetic Reasoning GSM8K +2

Language Models of Code are Few-Shot Commonsense Learners

2 code implementations 13 Oct 2022 Aman Madaan, Shuyan Zhou, Uri Alon, Yiming Yang, Graham Neubig

In all these natural language tasks, we show that using our approach, a code generation LM (CODEX) outperforms natural-LMs that are fine-tuned on the target task (e.g., T5) and other strong LMs such as GPT-3 in the few-shot setting.

Code Generation

Oversquashing in GNNs through the lens of information contraction and graph expansion

1 code implementation 6 Aug 2022 Pradeep Kr. Banerjee, Kedar Karhadkar, Yu Guang Wang, Uri Alon, Guido Montúfar

We compare the spectral expansion properties of our algorithm with that of an existing curvature-based non-local rewiring strategy.

graph construction

A Systematic Evaluation of Large Language Models of Code

3 code implementations 26 Feb 2022 Frank F. Xu, Uri Alon, Graham Neubig, Vincent J. Hellendoorn

We aim to fill in some of these blanks through a systematic evaluation of the largest existing models: Codex, GPT-J, GPT-Neo, GPT-NeoX-20B, and CodeParrot, across various programming languages.

Language Modelling

Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval

2 code implementations 28 Jan 2022 Uri Alon, Frank F. Xu, Junxian He, Sudipta Sengupta, Dan Roth, Graham Neubig

Retrieval-based language models (R-LM) model the probability of natural language text by combining a standard language model (LM) with examples retrieved from an external datastore at test time.

Language Modelling Retrieval
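
One common way to combine a standard LM with retrieved examples, as the abstract above describes, is to linearly interpolate the two next-token distributions. The sketch below assumes that scheme; the abstract does not spell out the combination rule, and the function name `interpolate`, the weight `lam`, and the toy distributions are all hypothetical.

```python
def interpolate(p_lm, p_retrieved, lam):
    """Mix two next-token distributions given as {token: probability} dicts.

    `lam` is the weight on the retrieved distribution (an assumed
    hyperparameter); tokens missing from one side count as probability 0.
    """
    vocab = set(p_lm) | set(p_retrieved)
    return {
        token: lam * p_retrieved.get(token, 0.0) + (1.0 - lam) * p_lm.get(token, 0.0)
        for token in vocab
    }


# Toy distributions: the retrieved neighbours all continue with "dog".
p_lm = {"cat": 0.6, "dog": 0.4}
p_retrieved = {"dog": 1.0}
mixed = interpolate(p_lm, p_retrieved, lam=0.25)
```

Because both inputs sum to one and the weights sum to one, the mixed result is again a valid distribution.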

How Attentive are Graph Attention Networks?

8 code implementations ICLR 2022 Shaked Brody, Uri Alon, Eran Yahav

Because GATs use a static attention mechanism, there are simple graph problems that GAT cannot express: in a controlled problem, we show that static attention hinders GAT from even fitting the training data.

Graph Attention Graph Property Prediction +3

Single-Node Attacks for Fooling Graph Neural Networks

1 code implementation 6 Nov 2020 Ben Finkelshtein, Chaim Baskin, Evgenii Zheltonozhskii, Uri Alon

Graph neural networks (GNNs) have shown broad applicability in a variety of domains.

Adversarial Attack

On the Bottleneck of Graph Neural Networks and its Practical Implications

2 code implementations ICLR 2021 Uri Alon, Eran Yahav

Since the proposal of the graph neural network (GNN) by Gori et al. (2005) and Scarselli et al. (2008), one of the major problems in training GNNs was their struggle to propagate information between distant nodes in the graph.

A Structural Model for Contextual Code Changes

1 code implementation 27 May 2020 Shaked Brody, Uri Alon, Eran Yahav

We conduct a thorough evaluation, comparing our approach to a variety of representation and modeling approaches that are driven by multiple strong models such as LSTMs, Transformers, and neural CRFs.


Adversarial Examples for Models of Code

3 code implementations 15 Oct 2019 Noam Yefet, Uri Alon, Eran Yahav

Our evaluations demonstrate that DAMP has up to 89% success rate in changing a prediction to the adversary's choice (a targeted attack) and a success rate of up to 94% in changing a given prediction to any incorrect prediction (a non-targeted attack).

Structural Language Models of Code

2 code implementations ICML 2020 Uri Alon, Roy Sadaka, Omer Levy, Eran Yahav

We introduce a new approach to any-code completion that leverages the strict syntax of programming languages to model a code snippet as a tree - structural language modeling (SLM).

C++ code Code Completion +2

Structural Language Models for Any-Code Generation

no code implementations 25 Sep 2019 Uri Alon, Roy Sadaka, Omer Levy, Eran Yahav

We introduce a new approach to AnyGen that leverages the strict syntax of programming languages to model a code snippet as a tree: structural language modeling (SLM).

C++ code Code Generation +1

Neural Reverse Engineering of Stripped Binaries using Augmented Control Flow Graphs

1 code implementation 25 Feb 2019 Yaniv David, Uri Alon, Eran Yahav

This is a challenging problem because of the low amount of syntactic information available in stripped executables, and the diverse assembly code patterns arising from compiler optimizations.

Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling

2 code implementations 21 Feb 2019 Jonathan Shen, Patrick Nguyen, Yonghui Wu, Zhifeng Chen, Mia X. Chen, Ye Jia, Anjuli Kannan, Tara Sainath, Yuan Cao, Chung-Cheng Chiu, Yanzhang He, Jan Chorowski, Smit Hinsu, Stella Laurenzo, James Qin, Orhan Firat, Wolfgang Macherey, Suyog Gupta, Ankur Bapna, Shuyuan Zhang, Ruoming Pang, Ron J. Weiss, Rohit Prabhavalkar, Qiao Liang, Benoit Jacob, Bowen Liang, HyoukJoong Lee, Ciprian Chelba, Sébastien Jean, Bo Li, Melvin Johnson, Rohan Anil, Rajat Tibrewal, Xiaobing Liu, Akiko Eriguchi, Navdeep Jaitly, Naveen Ari, Colin Cherry, Parisa Haghani, Otavio Good, Youlong Cheng, Raziel Alvarez, Isaac Caswell, Wei-Ning Hsu, Zongheng Yang, Kuan-Chieh Wang, Ekaterina Gonina, Katrin Tomanek, Ben Vanik, Zelin Wu, Llion Jones, Mike Schuster, Yanping Huang, Dehao Chen, Kazuki Irie, George Foster, John Richardson, Klaus Macherey, Antoine Bruguier, Heiga Zen, Colin Raffel, Shankar Kumar, Kanishka Rao, David Rybach, Matthew Murray, Vijayaditya Peddinti, Maxim Krikun, Michiel A. U. Bacchiani, Thomas B. Jablin, Rob Suderman, Ian Williams, Benjamin Lee, Deepti Bhatia, Justin Carlson, Semih Yavuz, Yu Zhang, Ian McGraw, Max Galkin, Qi Ge, Golan Pundak, Chad Whipkey, Todd Wang, Uri Alon, Dmitry Lepikhin, Ye Tian, Sara Sabour, William Chan, Shubham Toshniwal, Baohua Liao, Michael Nirschl, Pat Rondon

Lingvo is a TensorFlow framework offering a complete solution for collaborative deep learning research, with a particular focus on sequence-to-sequence models.

Sequence-To-Sequence Speech Recognition

Contextual Speech Recognition with Difficult Negative Training Examples

no code implementations 29 Oct 2018 Uri Alon, Golan Pundak, Tara N. Sainath

Improving the representation of contextual information is key to unlocking the potential of end-to-end (E2E) automatic speech recognition (ASR).

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

code2seq: Generating Sequences from Structured Representations of Code

6 code implementations ICLR 2019 Uri Alon, Shaked Brody, Omer Levy, Eran Yahav

The ability to generate natural language sequences from source code snippets has a variety of applications such as code summarization, documentation, and retrieval.

Code Summarization NMT +3

A General Path-Based Representation for Predicting Program Properties

3 code implementations 26 Mar 2018 Uri Alon, Meital Zilberstein, Omer Levy, Eran Yahav

A major challenge when learning from programs is how to represent programs in a way that facilitates effective learning.

code2vec: Learning Distributed Representations of Code

9 code implementations 26 Mar 2018 Uri Alon, Meital Zilberstein, Omer Levy, Eran Yahav

We demonstrate the effectiveness of our approach by using it to predict a method's name from the vector representation of its body.

Coarse-Graining and Self-Dissimilarity of Complex Networks

no code implementations 14 May 2004 Shalev Itzkovitz, Reuven Levitt, Nadav Kashtan, Ron Milo, Michael Itzkovitz, Uri Alon

Can complex engineered and biological networks be coarse-grained into smaller and more understandable versions in which each node represents an entire pattern in the original network?
