6 papers with code • 1 benchmark • 1 dataset



Most implemented papers

From Dense to Sparse: Contrastive Pruning for Better Pre-trained Language Model Compression

alibaba/AliceMind 14 Dec 2021

Unified under contrastive learning, CAP enables the pruned model to learn task-agnostic knowledge from the pre-trained model and task-specific knowledge from the fine-tuned model.

On the Importance of Adaptive Data Collection for Extremely Imbalanced Pairwise Tasks

worksheets/0x39ba5559 Findings (EMNLP) 2020

Many pairwise classification tasks, such as paraphrase detection and open-domain question answering, naturally have extreme label imbalance (e.g., $99.99\%$ of examples are negatives).
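To see why uniform data collection struggles at this level of imbalance, a quick back-of-the-envelope sketch (the sampling setup and numbers follow the 99.99%-negative example above; it is an illustration, not the paper's collection protocol):

```python
# Uniform sampling under extreme label imbalance: with 99.99% negatives,
# a 10,000-pair random sample is expected to contain just one positive.
import random

random.seed(0)

NEG_RATE = 0.9999        # fraction of pairs that are negatives
SAMPLE_SIZE = 10_000     # pairs drawn uniformly at random

# Expected number of positives in a uniform sample.
expected_positives = SAMPLE_SIZE * (1 - NEG_RATE)
print(f"expected positives in {SAMPLE_SIZE} pairs: {expected_positives:.1f}")

# Simulate one uniform draw to see how few positives actually appear.
labels = [1 if random.random() > NEG_RATE else 0 for _ in range(SAMPLE_SIZE)]
print("positives drawn:", sum(labels))
```

With so few positives per sample, a model sees almost no positive signal, which is the motivation for adaptive collection strategies that seek out likely positives.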

Are Larger Pretrained Language Models Uniformly Better? Comparing Performance at the Instance Level

ruiqi-zhong/acl2021-instance-level Findings (ACL) 2021

We develop statistically rigorous methods to address this, and after accounting for pretraining and finetuning noise, we find that BERT-Large is worse than BERT-Mini on at least 1-4% of instances across MNLI, SST-2, and QQP, compared to an overall accuracy improvement of 2-10%.

LEAP: Learnable Pruning for Transformer-based Models

yaozhewei/mlpruning 30 May 2021

Moreover, to reduce hyperparameter tuning, a novel adaptive regularization coefficient is deployed to control the regularization penalty adaptively.
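A minimal sketch of the general idea of an adaptive regularization coefficient: rescale the penalty weight each step so the regularization term stays a fixed fraction of the task loss. The function name, target ratio, and update rule here are hypothetical illustrations, not LEAP's actual schedule (see the repo for that):

```python
# Hypothetical adaptive regularization coefficient: choose lambda so that
# lambda * penalty tracks a fixed fraction of the current task loss,
# removing the need to hand-tune a static coefficient.

def adaptive_coefficient(task_loss: float, penalty: float,
                         target_ratio: float = 0.1,
                         eps: float = 1e-8) -> float:
    """Return lambda such that lambda * penalty ~= target_ratio * task_loss."""
    return target_ratio * task_loss / (penalty + eps)

# As pruning shrinks the penalty term, the coefficient grows automatically,
# keeping steady pressure on the remaining weights.
lam = adaptive_coefficient(task_loss=2.0, penalty=50.0)
total_loss = 2.0 + lam * 50.0  # penalty contributes ~10% of the task loss
```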

Contrastive Representation Learning for Exemplar-Guided Paraphrase Generation

lhryang/crl_egpg Findings (EMNLP) 2021

Exemplar-Guided Paraphrase Generation (EGPG) aims to generate a target sentence which conforms to the style of the given exemplar while encapsulating the content information of the source sentence.

Linear Connectivity Reveals Generalization Strategies

anonwhymoos/connectivity 24 May 2022

It is widely accepted in the mode connectivity literature that when two neural networks are trained similarly on the same data, they are connected by a path through parameter space over which test set accuracy is maintained.