Search Results for author: Laks V. S. Lakshmanan

Found 19 papers, 11 papers with code

Autoregressive + Chain of Thought (CoT) $\simeq$ Recurrent: Recurrence's Role in Language Models and a Revist of Recurrent Transformer

no code implementations · 14 Sep 2024 · Xiang Zhang, Muhammad Abdul-Mageed, Laks V. S. Lakshmanan

Moreover, we revisit recent recurrent-based Transformer model designs, focusing on their computational abilities through our proposed concept of "recurrence-completeness" and identify key theoretical limitations in models like Linear Transformer and RWKV.

Benchmarking Spectral Graph Neural Networks: A Comprehensive Study on Effectiveness and Efficiency

1 code implementation · 14 Jun 2024 · Ningyi Liao, Haoyu Liu, Zulun Zhu, Siqiang Luo, Laks V. S. Lakshmanan

With the recent advancements in graph neural networks (GNNs), spectral GNNs have received increasing popularity by virtue of their specialty in capturing graph signals in the frequency domain, demonstrating promising capability in specific tasks.

Benchmarking

Predicting Cascading Failures with a Hyperparametric Diffusion Model

1 code implementation · 12 Jun 2024 · Bin Xiang, Bogdan Cautis, Xiaokui Xiao, Olga Mula, Dusit Niyato, Laks V. S. Lakshmanan

In this paper, we study cascading failures in power grids through the lens of information diffusion models.

OCCAM: Towards Cost-Efficient and Accuracy-Aware Image Classification Inference

no code implementations · 6 Jun 2024 · Dujian Ding, Bicheng Xu, Laks V. S. Lakshmanan

Image classification is a fundamental building block for a majority of computer vision applications.

Image Classification

Hybrid LLM: Cost-Efficient and Quality-Aware Query Routing

no code implementations · 22 Apr 2024 · Dujian Ding, Ankur Mallick, Chi Wang, Robert Sim, Subhabrata Mukherjee, Victor Ruhle, Laks V. S. Lakshmanan, Ahmed Hassan Awadallah

Large language models (LLMs) excel in most NLP tasks but also require expensive cloud servers for deployment due to their size, while smaller models that can be deployed on lower cost (e.g., edge) devices tend to lag behind in terms of response quality.

LLM Performance Predictors are good initializers for Architecture Search

1 code implementation · 25 Oct 2023 · Ganesh Jawahar, Muhammad Abdul-Mageed, Laks V. S. Lakshmanan, Dujian Ding

In this work, we utilize Large Language Models (LLMs) for a novel use case: constructing Performance Predictors (PP) that estimate the performance of specific deep neural network architectures on downstream tasks.

Machine Translation, Neural Architecture Search

KEST: Kernel Distance Based Efficient Self-Training for Improving Controllable Text Generation

1 code implementation · 17 Jun 2023 · Yuxi Feng, Xiaoyuan Yi, Laks V. S. Lakshmanan, Xing Xie

Self-training (ST) has come to fruition in language understanding tasks by producing pseudo labels, which reduces the labeling bottleneck of language model fine-tuning.

Diversity, Language Modelling (+1 more)

Mixture-of-Supernets: Improving Weight-Sharing Supernet Training with Architecture-Routed Mixture-of-Experts

1 code implementation · 8 Jun 2023 · Ganesh Jawahar, Haichuan Yang, Yunyang Xiong, Zechun Liu, Dilin Wang, Fei Sun, Meng Li, Aasish Pappu, Barlas Oguz, Muhammad Abdul-Mageed, Laks V. S. Lakshmanan, Raghuraman Krishnamoorthi, Vikas Chandra

In NLP tasks like machine translation and pre-trained language modeling, there is a significant performance gap between supernet and training from scratch for the same model architecture, necessitating retraining post optimal architecture identification.

Language Modelling, Machine Translation (+2 more)

DuNST: Dual Noisy Self Training for Semi-Supervised Controllable Text Generation

1 code implementation · 16 Dec 2022 · Yuxi Feng, Xiaoyuan Yi, Xiting Wang, Laks V. S. Lakshmanan, Xing Xie

Augmented by only self-generated pseudo text, generation models over-emphasize exploitation of the previously learned space, suffering from a constrained generalization boundary.

Attribute, Diversity (+1 more)

Cross-Platform and Cross-Domain Abusive Language Detection with Supervised Contrastive Learning

no code implementations · 11 Nov 2022 · Md Tawkat Islam Khondaker, Muhammad Abdul-Mageed, Laks V. S. Lakshmanan

The prevalence of abusive language on different online platforms has been a major concern that raises the need for automated cross-platform abusive language detection.

Abusive Language, Contrastive Learning (+2 more)

Small Character Models Match Large Word Models for Autocomplete Under Memory Constraints

no code implementations · 6 Oct 2022 · Ganesh Jawahar, Subhabrata Mukherjee, Debadeepta Dey, Muhammad Abdul-Mageed, Laks V. S. Lakshmanan, Caio Cesar Teodoro Mendes, Gustavo Henrique de Rosa, Shital Shah

In this work, we study the more challenging open-domain setting consisting of low frequency user prompt patterns (or broad prompts, e.g., prompt about 93rd academy awards) and demonstrate the effectiveness of character-based language models.

Inductive Bias

Automatic Detection of Entity-Manipulated Text using Factual Knowledge

1 code implementation · ACL 2022 · Ganesh Jawahar, Muhammad Abdul-Mageed, Laks V. S. Lakshmanan

We propose a neural network based detector that detects manipulated news articles by reasoning about the facts mentioned in the article.

Automatic Detection of Machine Generated Text: A Critical Survey

1 code implementation · COLING 2020 · Ganesh Jawahar, Muhammad Abdul-Mageed, Laks V. S. Lakshmanan

Detectors that can distinguish text generated by TGM from human written text play a vital role in mitigating such misuse of TGMs.

Refutations on "Debunking the Myths of Influence Maximization: An In-Depth Benchmarking Study"

2 code implementations · 15 May 2017 · Wei Lu, Xiaokui Xiao, Amit Goyal, Keke Huang, Laks V. S. Lakshmanan

In a recent SIGMOD paper titled "Debunking the Myths of Influence Maximization: An In-Depth Benchmarking Study", Arora et al. [1] undertake a performance benchmarking study of several well-known algorithms for influence maximization.

Social and Information Networks
