1 code implementation • 2 Apr 2024 • Namrata Shivagunde, Vladislav Lialin, Sherin Muckatira, Anna Rumshisky
In contrast, the underlying pre-trained LLMs they use as a backbone are known to be brittle in this respect.
1 code implementation • 2 Apr 2024 • Sherin Muckatira, Vijeta Deshpande, Vladislav Lialin, Anna Rumshisky
Large language models can solve new tasks without task-specific fine-tuning.
no code implementations • 10 Nov 2023 • Sarah Pan, Vladislav Lialin, Sherin Muckatira, Anna Rumshisky
While recent advances have boosted LM proficiency on linguistic benchmarks, LMs consistently struggle to reason correctly on complex tasks like mathematics.
3 code implementations • 11 Jul 2023 • Vladislav Lialin, Namrata Shivagunde, Sherin Muckatira, Anna Rumshisky
Despite the dominance and effectiveness of scaling, which has produced networks with hundreds of billions of parameters, the necessity of training overparameterized models remains poorly understood, even as training costs grow exponentially.
no code implementations • 25 Aug 2020 • Sherin Muckatira
We find that iterative pruning of the network improves accuracy compared to the unpruned network, implying that the lottery ticket hypothesis can be applied to the problem of skin cancer detection and can yield a smaller network for inference.
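The iterative pruning procedure from the lottery ticket hypothesis can be sketched as a train-prune-rewind loop: train the network, zero out the smallest-magnitude surviving weights, reset the remaining weights to their original initialization, and repeat. A minimal NumPy sketch follows; `train_fn` is a hypothetical stand-in for the actual training routine, and the per-round prune fraction is an illustrative choice, not a value from the paper.

```python
import numpy as np

def magnitude_prune(weights, masks, prune_frac):
    """Zero out the smallest-magnitude surviving weights in each layer."""
    new_masks = []
    for w, m in zip(weights, masks):
        surviving = np.abs(w[m])          # magnitudes of still-active weights
        k = int(prune_frac * surviving.size)
        if k > 0:
            threshold = np.sort(surviving)[k]
            m = m & (np.abs(w) >= threshold)
        new_masks.append(m)
    return new_masks

def iterative_lottery_ticket(init_weights, train_fn, rounds=3, prune_frac=0.2):
    """Train -> prune -> rewind loop; returns the 'winning ticket' weights.

    init_weights: list of per-layer weight arrays at initialization.
    train_fn: hypothetical routine that trains masked weights and returns
              the trained per-layer arrays (assumption, not a real API).
    """
    masks = [np.ones_like(w, dtype=bool) for w in init_weights]
    for _ in range(rounds):
        trained = train_fn([w * m for w, m in zip(init_weights, masks)])
        masks = magnitude_prune(trained, masks, prune_frac)
    # Rewind: original initialization restricted to the surviving mask
    return [w * m for w, m in zip(init_weights, masks)]
```

Each round prunes a fraction of the *remaining* weights, so sparsity compounds across rounds, which is what makes the final subnetwork small enough to be attractive for inference.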