Search Results for author: Amrita Saha

Found 25 papers, 11 papers with code

Automatic Curriculum Expert Iteration for Reliable LLM Reasoning

1 code implementation • 10 Oct 2024 • Zirui Zhao, Hanze Dong, Amrita Saha, Caiming Xiong, Doyen Sahoo

To mitigate hallucination and laziness in reasoning tasks, we propose Automatic Curriculum Expert Iteration (Auto-CEI) to enhance LLM reasoning and align responses to the model's capabilities--assertively answering within its limits and declining when tasks exceed them.

Hallucination Logical Reasoning

MathHay: An Automated Benchmark for Long-Context Mathematical Reasoning in LLMs

no code implementations • 7 Oct 2024 • Lei Wang, Shan Dong, Yuhui Xu, Hanze Dong, Yalu Wang, Amrita Saha, Ee-Peng Lim, Caiming Xiong, Doyen Sahoo

Although some recent benchmarks have been developed to evaluate the long-context capabilities of LLMs, there is a lack of benchmarks evaluating the mathematical reasoning abilities of LLMs over long contexts, which is crucial for LLMs' application in real-world scenarios.

Information Retrieval Mathematical Reasoning

ThinK: Thinner Key Cache by Query-Driven Pruning

no code implementations • 30 Jul 2024 • Yuhui Xu, Zhanming Jie, Hanze Dong, Lei Wang, Xudong Lu, Aojun Zhou, Amrita Saha, Caiming Xiong, Doyen Sahoo

Large Language Models (LLMs) have revolutionized the field of natural language processing, achieving unprecedented performance across a variety of applications.

Quantization

Personalised Distillation: Empowering Open-Sourced LLMs with Adaptive Learning for Code Generation

1 code implementation • 28 Oct 2023 • Hailin Chen, Amrita Saha, Steven Hoi, Shafiq Joty

With the rise of powerful closed-sourced LLMs (ChatGPT, GPT-4), there is increasing interest in distilling the capabilities of closed-sourced LLMs to smaller open-sourced LLMs.

Code Generation HumanEval

CodeChain: Towards Modular Code Generation Through Chain of Self-revisions with Representative Sub-modules

1 code implementation • 13 Oct 2023 • Hung Le, Hailin Chen, Amrita Saha, Akash Gokul, Doyen Sahoo, Shafiq Joty

We find that by naturally encouraging the LLM to reuse the previously developed and verified sub-modules, CodeChain can significantly boost both modularity and correctness of the generated solutions, achieving relative pass@1 improvements of 35% on APPS and 76% on CodeContests.

Code Generation HumanEval +1

AI for IT Operations (AIOps) on Cloud Platforms: Reviews, Opportunities and Challenges

no code implementations • 10 Apr 2023 • Qian Cheng, Doyen Sahoo, Amrita Saha, Wenzhuo Yang, Chenghao Liu, Gerald Woo, Manpreet Singh, Silvio Savarese, Steven C. H. Hoi

There are a wide variety of problems to address, and multiple use-cases, where AI capabilities can be leveraged to enhance operational efficiency.

LogAI: A Library for Log Analytics and Intelligence

1 code implementation • 31 Jan 2023 • Qian Cheng, Amrita Saha, Wenzhuo Yang, Chenghao Liu, Doyen Sahoo, Steven Hoi

In order to enable users to perform multiple types of AI-based log analysis tasks in a uniform manner, we introduce LogAI (https://github.com/salesforce/logai), a one-stop open source library for log analytics and intelligence.

Anomaly Detection Log Parsing +2

Learning Label Modular Prompts for Text Classification in the Wild

1 code implementation • 30 Nov 2022 • Hailin Chen, Amrita Saha, Shafiq Joty, Steven C. H. Hoi

Machine learning models usually assume i.i.d. data during training and testing, but data and tasks in the real world often change over time.

text-classification Text Classification

Vector-Quantized Input-Contextualized Soft Prompts for Natural Language Understanding

1 code implementation • 23 May 2022 • Rishabh Bhardwaj, Amrita Saha, Steven C. H. Hoi, Soujanya Poria

VIP particularly focuses on two aspects -- contextual prompts that learn input-specific contextualization of the soft prompt tokens through a small-scale sentence encoder, and quantized prompts that map the contextualized prompts to a set of learnable codebook vectors through a vector quantization network.

Natural Language Understanding NER +3

Mining Root Cause Knowledge from Cloud Service Incident Investigations for AIOps

no code implementations • 21 Apr 2022 • Amrita Saha, Steven C. H. Hoi

ICA forms the backbone of a simple-yet-effective Retrieval based RCA for new incidents, through an Information Retrieval system to search and rank past incidents and detect likely root causes from them, given the incident symptom.

2k Information Retrieval +2

Weakly Supervised Neuro-Symbolic Module Networks for Numerical Reasoning

no code implementations • 28 Jan 2021 • Amrita Saha, Shafiq Joty, Steven C. H. Hoi

Neural Module Networks (NMNs) have been quite successful in incorporating explicit reasoning as learnable modules in various question answering tasks, including the most generic form of numerical reasoning over text in Machine Reading Comprehension (MRC).

Dependency Parsing Language Modeling +3

Scene Graph based Image Retrieval -- A case study on the CLEVR Dataset

no code implementations • 3 Nov 2019 • Sahana Ramnath, Amrita Saha, Soumen Chakrabarti, Mitesh M. Khapra

With the proliferation of multimodal interaction in various domains, recently there has been much interest in text based image retrieval in the computer vision community.

Graph Matching Image Retrieval +3

DuoRC: Towards Complex Language Understanding with Paraphrased Reading Comprehension

1 code implementation • ACL 2018 • Amrita Saha, Rahul Aralikatte, Mitesh M. Khapra, Karthik Sankaranarayanan

We propose DuoRC, a novel dataset for Reading Comprehension (RC) that motivates several new challenges for neural approaches in language understanding beyond those offered by existing RC datasets.

Descriptive Reading Comprehension

Complex Sequential Question Answering: Towards Learning to Converse Over Linked Question Answer Pairs with a Knowledge Graph

1 code implementation • 31 Jan 2018 • Amrita Saha, Vardaan Pahuja, Mitesh M. Khapra, Karthik Sankaranarayanan, Sarath Chandar

Further, unlike existing large scale QA datasets which contain simple questions that can be answered from a single tuple, the questions in our dialogs require a larger subgraph of the KG.

Knowledge Graphs Question Answering

Towards Building Large Scale Multimodal Domain-Aware Conversation Systems

1 code implementation • 1 Apr 2017 • Amrita Saha, Mitesh Khapra, Karthik Sankaranarayanan

With this dataset, we propose 5 new sub-tasks for multimodal conversations along with their evaluation methodology.

Response Generation

A Correlational Encoder Decoder Architecture for Pivot Based Sequence Generation

no code implementations • COLING 2016 • Amrita Saha, Mitesh M. Khapra, Sarath Chandar, Janarthanan Rajendran, Kyunghyun Cho

However, there is no parallel training data available between X and Y, but training data is available between X & Z and Z & Y (as is often the case in many real world applications).

Decoder Transliteration
