Transfer learning is a methodology where weights from a model trained on one task are used either (a) to construct a fixed feature extractor or (b) as weight initialization for fine-tuning on a new task.
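A minimal PyTorch sketch of the two options (torchvision's ImageNet-pretrained ResNet-18 and the 10-class head are placeholder choices, not tied to any particular paper):

```python
import torch
import torch.nn as nn
from torchvision import models

num_target_classes = 10  # placeholder for the target task

# (a) Fixed feature extractor: freeze all pretrained weights and
#     train only a new classification head on the target data.
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for param in backbone.parameters():
    param.requires_grad = False
backbone.fc = nn.Linear(backbone.fc.in_features, num_target_classes)
extractor_optimizer = torch.optim.Adam(backbone.fc.parameters(), lr=1e-3)

# (b) Fine-tuning: use the pretrained weights as initialization and
#     update the whole network, typically with a smaller learning rate.
finetune_model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
finetune_model.fc = nn.Linear(finetune_model.fc.in_features, num_target_classes)
finetune_optimizer = torch.optim.Adam(finetune_model.parameters(), lr=1e-4)
```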
Deep convolutional neural networks are now widely deployed in vision applications, but a limited amount of training data can bottleneck their performance.
We assume that a client, a target application with its own small labeled dataset, is interested in fetching only the subset of the server’s data that is most relevant to its own target domain.
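As a hypothetical illustration of that setup (not the paper's actual selection method), one could score each server example by feature-space similarity to the client's data and fetch only the top-k:

```python
import torch
import torch.nn.functional as F

def select_relevant_subset(server_feats, client_feats, k):
    """Rank server examples by cosine similarity to the mean client
    feature and return the indices of the k most relevant ones.
    Both inputs are (num_examples, dim) feature tensors, e.g. computed
    with a shared pretrained encoder."""
    client_centroid = client_feats.mean(dim=0, keepdim=True)
    sims = F.cosine_similarity(server_feats, client_centroid)  # (num_examples,)
    return sims.topk(k).indices
```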
While many sentiment classification solutions report high accuracy on product or movie review datasets, the performance of these methods in niche domains such as finance still lags well behind.
The Lottery Ticket Hypothesis of Frankle & Carbin (2019) conjectures that, for typically-sized neural networks, it is possible to find small sub-networks that train faster than, and perform better than, their original counterparts.
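The usual recipe for finding such sub-networks is iterative magnitude pruning with weight rewinding; a minimal sketch, where `train_fn`, the number of rounds, and the per-round pruning fraction are placeholders:

```python
import copy
import torch

def find_winning_ticket(model, train_fn, rounds=5, prune_frac=0.2):
    """Iterative magnitude pruning in the style of Frankle & Carbin (2019):
    train, prune the smallest surviving weights, rewind the survivors to
    their initial values, and repeat. `train_fn(model)` is assumed to
    train the model in place."""
    init_state = copy.deepcopy(model.state_dict())  # theta_0, for rewinding
    masks = {name: torch.ones_like(p)
             for name, p in model.named_parameters() if "weight" in name}

    for _ in range(rounds):
        train_fn(model)
        with torch.no_grad():
            # Prune prune_frac of the remaining weights with the smallest
            # magnitudes, per weight tensor.
            for name, param in model.named_parameters():
                if name not in masks:
                    continue
                alive = param[masks[name].bool()].abs()
                threshold = torch.quantile(alive, prune_frac)
                masks[name] *= (param.abs() > threshold).float()
            # Rewind surviving weights to their original initialization.
            model.load_state_dict(init_state)
            for name, param in model.named_parameters():
                if name in masks:
                    param *= masks[name]
    return model, masks
```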
Is it possible to compress these large-scale language representation models?
Experimental results show that the proposed method significantly improves over state-of-the-art methods: it enables knowledge transfer and prevents catastrophic forgetting, maintaining more than 85% accuracy for up to 100 stages, compared with less than 50% accuracy for the baselines.
Though word embeddings and topics are complementary representations, several past works have used pretrained word embeddings in (neural) topic modeling only to address the data sparsity problem in short texts or small document collections.
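One common way pretrained embeddings enter a neural topic model (a sketch in the spirit of embedding-based topic models such as ETM; the class and parameter names below are illustrative): the topic-word distribution is parameterized through frozen pretrained word vectors, so topics live in the same space as words.

```python
import torch
import torch.nn as nn

class EmbeddingTopicDecoder(nn.Module):
    """Topic-word distributions computed from pretrained word embeddings:
    beta = softmax(topic_embeddings @ word_embeddings^T). The pretrained
    vectors (e.g. word2vec/GloVe) stay frozen; only topics are learned."""
    def __init__(self, pretrained_emb, num_topics):
        super().__init__()
        vocab_size, dim = pretrained_emb.shape
        self.word_emb = nn.Parameter(pretrained_emb, requires_grad=False)
        self.topic_emb = nn.Parameter(torch.randn(num_topics, dim))

    def forward(self, theta):
        # theta: (batch, num_topics) document-topic proportions.
        beta = torch.softmax(self.topic_emb @ self.word_emb.T, dim=-1)
        return theta @ beta  # (batch, vocab) expected word distribution
```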
One way to compress these heavy models is knowledge transfer (KT), in which a lightweight student network is trained by absorbing knowledge from a powerful teacher network.
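The canonical instance of KT is knowledge distillation (Hinton et al., 2015), where the student is trained to match the teacher's temperature-softened outputs alongside the hard labels; a minimal sketch (temperature and mixing weight are typical but arbitrary defaults):

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Combine standard cross-entropy on hard labels with a KL term that
    pulls the student's softened predictions toward the teacher's.
    T is the softmax temperature; alpha balances the two terms."""
    hard = F.cross_entropy(student_logits, labels)
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # rescale so gradient magnitudes match the hard term
    return alpha * hard + (1 - alpha) * soft
```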
Many automated machine learning methods, such as those for hyperparameter and neural architecture optimization, are computationally expensive because they involve training many different model configurations.