Search Results for author: Michele Merler

Found 8 papers, 0 papers with code

A Comparative Analysis of Task-Agnostic Distillation Methods for Compressing Transformer Language Models

no code implementations • 13 Oct 2023 • Takuma Udagawa, Aashka Trivedi, Michele Merler, Bishwaranjan Bhattacharjee

Our target of study includes Output Distribution (OD) transfer, Hidden State (HS) transfer with various layer mapping strategies, and Multi-Head Attention (MHA) transfer based on MiniLMv2.

Knowledge Distillation

CoSiNES: Contrastive Siamese Network for Entity Standardization

no code implementations • 5 Jun 2023 • Jiaqing Yuan, Michele Merler, Mihir Choudhury, Raju Pavuluri, Munindar P. Singh, Maja Vukovic

Entity standardization maps noisy mentions from free-form text to standard entities in a knowledge base.

Language Modelling, Management

Neural Architecture Search for Effective Teacher-Student Knowledge Transfer in Language Models

no code implementations • 16 Mar 2023 • Aashka Trivedi, Takuma Udagawa, Michele Merler, Rameswar Panda, Yousef El-Kurdi, Bishwaranjan Bhattacharjee

In each episode of the search process, a NAS controller predicts a reward based on the distillation loss and latency of inference.

CoLA, Knowledge Distillation +2

Large Scale Neural Architecture Search with Polyharmonic Splines

no code implementations • 20 Nov 2020 • Ulrich Finkler, Michele Merler, Rameswar Panda, Mayoore S. Jaiswal, Hui Wu, Kandan Ramakrishnan, Chun-Fu Chen, Minsik Cho, David Kung, Rogerio Feris, Bishwaranjan Bhattacharjee

Neural Architecture Search (NAS) is a powerful tool to automatically design deep neural networks for many tasks, including image classification.

Image Classification, Neural Architecture Search

NASTransfer: Analyzing Architecture Transferability in Large Scale Neural Architecture Search

no code implementations • 23 Jun 2020 • Rameswar Panda, Michele Merler, Mayoore Jaiswal, Hui Wu, Kandan Ramakrishnan, Ulrich Finkler, Chun-Fu Chen, Minsik Cho, David Kung, Rogerio Feris, Bishwaranjan Bhattacharjee

The typical way of conducting large scale NAS is to search for an architectural building block on a small dataset (either using a proxy set from the large dataset or a completely different small scale dataset) and then transfer the block to a larger dataset.

Neural Architecture Search

Covering the News with (AI) Style

no code implementations • 5 Jan 2020 • Michele Merler, Cicero Nogueira dos Santos, Mauro Martino, Alfio M. Gliozzo, John R. Smith

We introduce a multi-modal discriminative and generative framework capable of assisting humans in producing visual content related to a given theme, starting from a collection of documents (textual, visual, or both).

Diversity in Faces

no code implementations • 29 Jan 2019 • Michele Merler, Nalini Ratha, Rogerio S. Feris, John R. Smith

We expect face recognition to work equally accurately for every face.

Cultural Vocal Bursts Intensity Prediction, Face Recognition +1

Automatic Curation of Golf Highlights using Multimodal Excitement Features

no code implementations • 22 Jul 2017 • Michele Merler, Dhiraj Joshi, Quoc-Bao Nguyen, Stephen Hammer, John Kent, John R. Smith, Rogerio S. Feris

The production of sports highlight packages summarizing a game's most exciting moments is an essential task for broadcast media.

Action Recognition, Retrieval +2
