Search Results for author: Vladislav Lialin

Found 14 papers, 8 papers with code

Deconstructing In-Context Learning: Understanding Prompts via Corruption

1 code implementation • 2 Apr 2024 • Namrata Shivagunde, Vladislav Lialin, Sherin Muckatira, Anna Rumshisky

In contrast, the underlying pre-trained LLMs they use as a backbone are known to be brittle in this respect.

In-Context Learning

Let's Reinforce Step by Step

no code implementations • 10 Nov 2023 • Sarah Pan, Vladislav Lialin, Sherin Muckatira, Anna Rumshisky

While recent advances have boosted LM proficiency in linguistic benchmarks, LMs consistently struggle to reason correctly on complex tasks like mathematics.

GSM8K • Logical Reasoning • +2

ReLoRA: High-Rank Training Through Low-Rank Updates

3 code implementations • 11 Jul 2023 • Vladislav Lialin, Namrata Shivagunde, Sherin Muckatira, Anna Rumshisky

Despite the dominance and effectiveness of scaling, resulting in large networks with hundreds of billions of parameters, the necessity to train overparameterized models remains poorly understood, while training costs grow exponentially.
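The title summarizes the mechanism: a high-rank total weight update is accumulated from a sequence of low-rank updates that are periodically merged into the full weights and then restarted. Below is a minimal PyTorch sketch of that merge-and-restart idea, assuming a LoRA-style factorization; the module name, rank, reset interval, and placeholder objective are illustrative assumptions rather than the authors' released implementation, which also covers details such as optimizer-state resets and the learning-rate schedule.

import torch
import torch.nn as nn

class LowRankLinear(nn.Module):
    """Full weight matrix plus a trainable low-rank delta (B @ A)."""
    def __init__(self, in_dim, out_dim, rank=8):
        super().__init__()
        self.weight = nn.Parameter(torch.empty(out_dim, in_dim))
        nn.init.kaiming_uniform_(self.weight)
        self.weight.requires_grad = False  # full matrix is frozen between merges
        self.A = nn.Parameter(torch.randn(rank, in_dim) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_dim, rank))

    def forward(self, x):
        return x @ (self.weight + self.B @ self.A).T

    @torch.no_grad()
    def merge_and_reset(self):
        # Fold the trained low-rank update into the full weights and restart
        # the factors; repeated merges let the accumulated update exceed the
        # rank of any single low-rank step.
        self.weight += self.B @ self.A
        nn.init.normal_(self.A, std=0.01)
        nn.init.zeros_(self.B)

layer = LowRankLinear(512, 512, rank=8)
optimizer = torch.optim.AdamW([layer.A, layer.B], lr=1e-3)
for step in range(1, 3001):
    loss = layer(torch.randn(32, 512)).pow(2).mean()  # placeholder objective
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    if step % 1000 == 0:
        layer.merge_and_reset()  # periodic restart of the low-rank factors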

Honey, I Shrunk the Language: Language Model Behavior at Reduced Scale

1 code implementation • 26 May 2023 • Vijeta Deshpande, Dan Pechi, Shree Thatte, Vladislav Lialin, Anna Rumshisky

The majority of recent scaling-law studies have focused on high-compute, high-parameter-count settings, leaving the question of when these abilities begin to emerge largely unanswered.

Language Modelling • Masked Language Modeling

Scalable and Accurate Self-supervised Multimodal Representation Learning without Aligned Video and Text Data

no code implementations • 4 Apr 2023 • Vladislav Lialin, Stephen Rawls, David Chan, Shalini Ghosh, Anna Rumshisky, Wael Hamza

The currently popular approach of mining video-text data via automatic speech recognition (ASR), used in HowTo100M, provides low-quality captions that often do not refer to the video content.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) • +5

Larger Probes Tell a Different Story: Extending Psycholinguistic Datasets Via In-Context Learning

1 code implementation • 29 Mar 2023 • Namrata Shivagunde, Vladislav Lialin, Anna Rumshisky

Finally, we observe that although GPT-3 generated all the examples in ROLE-1500, it is only able to solve 24.6% of them during probing.

In-Context Learning • Language Modelling • +2

Scaling Down to Scale Up: A Guide to Parameter-Efficient Fine-Tuning

no code implementations • 28 Mar 2023 • Vladislav Lialin, Vijeta Deshpande, Anna Rumshisky

This paper presents a systematic overview and comparison of parameter-efficient fine-tuning methods covering over 40 papers published between February 2019 and February 2023.

Life after BERT: What do Other Muppets Understand about Language?

1 code implementation • ACL 2022 • Vladislav Lialin, Kevin Zhao, Namrata Shivagunde, Anna Rumshisky

Existing analyses of pre-trained transformers usually focus on only one or two model families at a time, overlooking the variability of architectures and pre-training objectives.

Update Frequently, Update Fast: Retraining Semantic Parsing Systems in a Fraction of Time

no code implementations • 15 Oct 2020 • Vladislav Lialin, Rahul Goel, Andrey Simanovsky, Anna Rumshisky, Rushin Shah

To reduce training time, one can fine-tune the previously trained model on each patch, but naive fine-tuning exhibits catastrophic forgetting: degradation of model performance on data not represented in the patch.

Continual Learning • Goal-Oriented Dialogue Systems • +1

Injecting Hierarchy with U-Net Transformers

2 code implementations • 16 Oct 2019 • David Donahue, Vladislav Lialin, Anna Rumshisky

The Transformer architecture has become increasingly popular over the past two years, owing to its impressive performance on a number of natural language processing (NLP) tasks.

Machine Translation

NarrativeTime: Dense Temporal Annotation on a Timeline

no code implementations • 29 Aug 2019 • Anna Rogers, Marzena Karpinska, Ankita Gupta, Vladislav Lialin, Gregory Smelkov, Anna Rumshisky

For the past decade, temporal annotation has been sparse: only a small portion of event pairs in a text was annotated.

Chunking
