Search Results for author: Or Sharir

Found 18 papers, 5 papers with code

ChatGPT Based Data Augmentation for Improved Parameter-Efficient Debiasing of LLMs

no code implementations · 19 Feb 2024 · Pengrui Han, Rafal Kocielnik, Adhithya Saravanan, Roy Jiang, Or Sharir, Anima Anandkumar

Our results reveal that: (1) ChatGPT can efficiently produce high-quality training data for debiasing other LLMs; (2) data produced via our approach surpasses existing datasets in debiasing performance while also preserving internal knowledge of a pre-trained LLM; and (3) synthetic data exhibits generalizability across categories, effectively mitigating various biases, including intersectional ones.

Data Augmentation · Fairness

Incrementally-Computable Neural Networks: Efficient Inference for Dynamic Inputs

no code implementations · 27 Jul 2023 · Or Sharir, Anima Anandkumar

Deep learning models often face the challenge of efficiently processing dynamic inputs, such as sensor data or user inputs.

Document Classification · Knowledge Distillation · +2

Towards Neural Variational Monte Carlo That Scales Linearly with System Size

no code implementations · 21 Dec 2022 · Or Sharir, Garnet Kin-Lic Chan, Anima Anandkumar

Quantum many-body problems are some of the most challenging problems in science and are central to demystifying some exotic quantum phenomena, e.g., high-temperature superconductors.

Quantization · Variational Monte Carlo

Neural tensor contractions and the expressive power of deep neural quantum states

no code implementations · 18 Mar 2021 · Or Sharir, Amnon Shashua, Giuseppe Carleo

We establish a direct connection between general tensor networks and deep feed-forward artificial neural networks.

Tensor Networks
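
The contraction-as-layer correspondence this paper builds on can be seen in a few lines. Below is a minimal sketch (ours, not the paper's code): evaluating a matrix product state (MPS) amplitude is a chain of tensor contractions, each of which is exactly the linear map a feed-forward layer applies.

```python
import numpy as np

# Minimal illustration (not code from the paper): an MPS amplitude is a
# chain of tensor contractions, and each contraction step is a linear map,
# the same primitive a feed-forward neural-network layer applies.
def mps_amplitude(tensors, config):
    """tensors[i] has shape (left_bond, d, right_bond); config lists local indices."""
    v = np.ones(1)                      # left boundary vector (bond dim 1)
    for A, s in zip(tensors, config):
        v = v @ A[:, s, :]              # contract the bond index: one "layer" per site
    return v.item()                     # right boundary (bond dim 1) closes the chain

# Example: a random 4-site MPS with bond dimension 3 and local dimension 2.
rng = np.random.default_rng(0)
bonds = [1, 3, 3, 3, 1]
tensors = [rng.normal(size=(bonds[i], 2, bonds[i + 1])) for i in range(4)]
print(mps_amplitude(tensors, [0, 1, 1, 0]))
```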

Technical Report: Auxiliary Tuning and its Application to Conditional Text Generation

no code implementations · 30 Jun 2020 · Yoel Zeldes, Dan Padnos, Or Sharir, Barak Peleg

We introduce a simple and efficient method, called Auxiliary Tuning, for adapting a pre-trained Language Model to a novel task; we demonstrate this approach on the task of conditional text generation.

Conditional Text Generation · Language Modelling
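
To illustrate the idea, here is a minimal sketch that assumes (the snippet does not specify it) that adaptation happens at the logits level: a frozen pre-trained LM supplies base next-token logits and a small trainable auxiliary network shifts them, so the original model is never updated. The class and the stand-in modules below are hypothetical.

```python
import torch

# Hedged sketch of logit-level adaptation (an assumption, not the authors'
# released code): the pre-trained LM is frozen and a small auxiliary network
# adds a trainable shift to its logits before the softmax.
class AuxiliaryTunedLM(torch.nn.Module):
    def __init__(self, frozen_lm, aux_model):
        super().__init__()
        self.frozen_lm = frozen_lm.eval()
        for p in self.frozen_lm.parameters():    # freeze the pre-trained LM
            p.requires_grad_(False)
        self.aux_model = aux_model               # only this part is trained

    def forward(self, hidden):
        return self.frozen_lm(hidden) + self.aux_model(hidden)

# Toy stand-ins (hypothetical): linear maps from a hidden state to vocab logits.
vocab, dim = 100, 16
model = AuxiliaryTunedLM(torch.nn.Linear(dim, vocab), torch.nn.Linear(dim, vocab))
logits = model(torch.randn(1, dim))
print(logits.shape)  # torch.Size([1, 100])
```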

The Depth-to-Width Interplay in Self-Attention

1 code implementation · NeurIPS 2020 · Yoav Levine, Noam Wies, Or Sharir, Hofit Bata, Amnon Shashua

Our guidelines elucidate the depth-to-width trade-off in self-attention networks of sizes up to the scale of GPT3 (which we project to be too deep for its size) and beyond, marking an unprecedented width of 30K as optimal for a 1-trillion-parameter network.

The Cost of Training NLP Models: A Concise Overview

no code implementations · 19 Apr 2020 · Or Sharir, Barak Peleg, Yoav Shoham

We review the cost of training large-scale language models and the drivers of these costs.

Deep autoregressive models for the efficient variational simulation of many-body quantum systems

2 code implementations · 11 Feb 2019 · Or Sharir, Yoav Levine, Noam Wies, Giuseppe Carleo, Amnon Shashua

Artificial Neural Networks were recently shown to be an efficient representation of highly entangled many-body quantum states.

Variational Monte Carlo
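
The property that makes autoregressive models attractive for variational Monte Carlo is that configurations can be sampled exactly, site by site, without a Markov chain. Below is a hedged sketch of that sampling loop, with a toy stand-in for the conditionals that the paper's neural network would provide.

```python
import numpy as np

# Hedged sketch (not the paper's code): for an autoregressive ansatz
# psi(s) ~ prod_i p(s_i | s_<i), configurations are drawn exactly, one
# site at a time, with no Markov chain. `conditional(prefix)` is a toy
# stand-in for the network's conditional distributions.
def sample_autoregressive(conditional, n_sites, rng):
    config = []
    for _ in range(n_sites):
        p = conditional(config)                # distribution over the next spin
        config.append(int(rng.choice(len(p), p=p)))
    return config

# Toy conditional: a slight bias toward repeating the previous spin.
def conditional(prefix):
    if not prefix:
        return np.array([0.5, 0.5])
    return np.array([0.7, 0.3]) if prefix[-1] == 0 else np.array([0.3, 0.7])

rng = np.random.default_rng(1)
print(sample_autoregressive(conditional, 8, rng))
```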

Benefits of Depth for Long-Term Memory of Recurrent Networks

no code implementations · ICLR 2018 · Yoav Levine, Or Sharir, Amnon Shashua

We prove that deep recurrent networks support Start-End separation ranks which are exponentially higher than those supported by their shallow counterparts.

Attribute · Time Series Analysis

On the Long-Term Memory of Deep Recurrent Networks

1 code implementation · 25 Oct 2017 · Yoav Levine, Or Sharir, Alon Ziv, Amnon Shashua

A key attribute driving the unprecedented success of modern Recurrent Neural Networks (RNNs) on learning tasks involving sequential data is their ability to model intricate long-term temporal dependencies.

Attribute · Tensor Networks

Sum-Product-Quotient Networks

no code implementations · 12 Oct 2017 · Or Sharir, Amnon Shashua

We present a novel tractable generative model that extends Sum-Product Networks (SPNs) and significantly boosts their power.

Analysis and Design of Convolutional Networks via Hierarchical Tensor Decompositions

no code implementations · 5 May 2017 · Nadav Cohen, Or Sharir, Yoav Levine, Ronen Tamari, David Yakira, Amnon Shashua

Expressive efficiency refers to the ability of a network architecture to realize functions that require an alternative architecture to be much larger.

Inductive Bias

On the Expressive Power of Overlapping Architectures of Deep Learning

1 code implementation · ICLR 2018 · Or Sharir, Amnon Shashua

Expressive efficiency refers to the relation between two architectures A and B, whereby any function realized by B could be replicated by A, but there exist functions realized by A that cannot be replicated by B unless its size grows significantly larger.

Attribute

Tensorial Mixture Models

2 code implementations · 13 Oct 2016 · Or Sharir, Ronen Tamari, Nadav Cohen, Amnon Shashua

Other methods, based on arithmetic circuits and sum-product networks, do allow tractable marginalization, but their performance is challenged by the need to learn the structure of a circuit.

On the Expressive Power of Deep Learning: A Tensor Analysis

no code implementations · 16 Sep 2015 · Nadav Cohen, Or Sharir, Amnon Shashua

In this work we derive a deep network architecture based on arithmetic circuits that inherently employs locality, sharing and pooling.

Deep SimNets

no code implementations · CVPR 2016 · Nadav Cohen, Or Sharir, Amnon Shashua

We present a deep layered architecture that generalizes convolutional neural networks (ConvNets).
