Search Results for author: Patrick Huber

Found 26 papers, 7 papers with code

Scaling Parameter-Constrained Language Models with Quality Data

no code implementations · 4 Oct 2024 · Ernie Chang, Matteo Paltenghi, Yang Li, Pin-Jie Lin, Changsheng Zhao, Patrick Huber, Zechun Liu, Rastislav Rabatin, Yangyang Shi, Vikas Chandra

Scaling laws in language modeling traditionally quantify training loss as a function of dataset size and model parameters, providing compute-optimal estimates but often neglecting the impact of data quality on model generalization.

Diversity · Language Modelling
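The abstract above refers to the classical compute-optimal scaling-law form. As background, here is a minimal sketch of the parametric loss popularized by Hoffmann et al. (the "Chinchilla" fit); the coefficients are that paper's published values, not results from this work:

```python
def scaling_loss(n_params, n_tokens, E=1.69, A=406.4, B=410.7,
                 alpha=0.34, beta=0.28):
    """Parametric scaling law L(N, D) = E + A/N^alpha + B/D^beta.

    E is the irreducible loss; the two power-law terms are the penalties
    for finite model size N and finite data D. Default coefficients
    follow the Hoffmann et al. (2022) fit, shown here for illustration.
    """
    return E + A / n_params ** alpha + B / n_tokens ** beta

# Larger models and more tokens both drive the loss toward the floor E.
loss_small = scaling_loss(1e8, 1e10)   # 100M params, 10B tokens
loss_large = scaling_loss(1e10, 1e11)  # 10B params, 100B tokens
```

The paper's point is precisely that this form conditions only on size and token count, ignoring data quality.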

CoDi: Conversational Distillation for Grounded Question Answering

no code implementations · 20 Aug 2024 · Patrick Huber, Arash Einolghozati, Rylan Conway, Kanika Narang, Matt Smith, Waqar Nayyar, Adithya Sagar, Ahmed Aly, Akshat Shrivastava

This is a typical on-device scenario for specialist SLMs, allowing for open-domain model responses, without requiring the model to "memorize" world knowledge in its limited weights.

Question Answering · World Knowledge

Small But Funny: A Feedback-Driven Approach to Humor Distillation

no code implementations · 28 Feb 2024 · Sahithya Ravi, Patrick Huber, Akshat Shrivastava, Aditya Sagar, Ahmed Aly, Vered Shwartz, Arash Einolghozati

The emergence of Large Language Models (LLMs) has brought to light promising language generation capabilities, particularly in performing tasks like complex reasoning and creative writing.

Text Generation

Large Language Models as Zero-shot Dialogue State Tracker through Function Calling

1 code implementation · 16 Feb 2024 · Zekun Li, Zhiyu Zoey Chen, Mike Ross, Patrick Huber, Seungwhan Moon, Zhaojiang Lin, Xin Luna Dong, Adithya Sagar, Xifeng Yan, Paul A. Crook

We also show that by fine-tuning on a small collection of diverse task-oriented dialogues, we can equip modestly sized models, specifically a 13B parameter LLaMA2-Chat model, with function-calling capabilities and DST performance comparable to ChatGPT while maintaining their chat capabilities.

Avg · Dialogue State Tracking +1
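The abstract frames dialogue state tracking (DST) as function calling. The sketch below illustrates that idea with a hypothetical `book_hotel` schema and a state-merging helper; the names and fields are illustrative, not taken from the paper, whose real schemas would be derived from a task-oriented dialogue ontology such as MultiWOZ:

```python
import json

# Hypothetical function schema the model would be prompted with.
BOOK_HOTEL = {
    "name": "book_hotel",
    "parameters": {"area": "string", "stars": "string", "book_day": "string"},
}

def update_state(state, model_output):
    """Merge a model-emitted function call (a JSON string) into the
    running dialogue state, keyed by function name."""
    call = json.loads(model_output)
    state.setdefault(call["name"], {}).update(call["arguments"])
    return state

# Each turn, the model emits a function call; the tracker accumulates slots.
state = {}
update_state(state, '{"name": "book_hotel", "arguments": {"area": "centre"}}')
update_state(state, '{"name": "book_hotel", "arguments": {"stars": "4"}}')
```

The appeal of this framing is that the dialogue state falls out of a capability (function calling) that chat models can acquire with modest fine-tuning.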

FLEE-GNN: A Federated Learning System for Edge-Enhanced Graph Neural Network in Analyzing Geospatial Resilience of Multicommodity Food Flows

1 code implementation · 20 Oct 2023 · Yuxiao Qu, Jinmeng Rao, Song Gao, Qianheng Zhang, Wei-Lun Chao, Yu Su, Michelle Miller, Alfonso Morales, Patrick Huber

This paper proposes FLEE-GNN, a novel Federated Learning System for Edge-Enhanced Graph Neural Networks, designed to overcome these challenges and enhance the analysis of the geospatial resilience of multicommodity food flow networks, which are one type of spatial network.

Federated Learning · Graph Neural Network

Discourse Structure Extraction from Pre-Trained and Fine-Tuned Language Models in Dialogues

no code implementations · 12 Feb 2023 · Chuyuan Li, Patrick Huber, Wen Xiao, Maxime Amblard, Chloé Braud, Giuseppe Carenini

As a result, we explore approaches to build discourse structures for dialogues, based on attention matrices from Pre-trained Language Models (PLMs).

Sentence · Sentence Ordering
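The abstract describes building discourse structures from the attention matrices of pre-trained language models. A minimal sketch of one way this can work is a greedy top-down split over an affinity matrix; this is an illustration under assumed simplifications, and the paper's actual algorithm (e.g. a CKY-style parse) may differ:

```python
def build_tree(attn, lo, hi):
    """Greedy top-down binary tree over units [lo, hi) from an affinity
    matrix `attn` (e.g. a PLM attention matrix averaged over heads and
    layers). Prefers splits with high within-span and low cross-span
    affinity. A sketch only, not the paper's exact method."""
    if hi - lo == 1:
        return lo
    best_k, best_score = lo + 1, float("-inf")
    for k in range(lo + 1, hi):
        within = (sum(attn[i][j] for i in range(lo, k) for j in range(lo, k))
                  + sum(attn[i][j] for i in range(k, hi) for j in range(k, hi)))
        cross = sum(attn[i][j] for i in range(lo, k) for j in range(k, hi))
        score = within - 2 * cross
        if score > best_score:
            best_k, best_score = k, score
    return (build_tree(attn, lo, best_k), build_tree(attn, best_k, hi))

# Two tightly linked pairs of units yield the expected balanced tree.
affinity = [[1, 1, 0, 0],
            [1, 1, 0, 0],
            [0, 0, 1, 1],
            [0, 0, 1, 1]]
tree = build_tree(affinity, 0, len(affinity))
```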

Large Discourse Treebanks from Scalable Distant Supervision

no code implementations · 18 Oct 2022 · Patrick Huber, Giuseppe Carenini

Discourse parsing is an essential upstream task in Natural Language Processing with strong implications for many real-world applications.

Discourse Parsing · Sentiment Analysis

Unsupervised Inference of Data-Driven Discourse Structures using a Tree Auto-Encoder

no code implementations · 18 Oct 2022 · Patrick Huber, Giuseppe Carenini

With a growing need for robust and general discourse structures in many downstream tasks and real-world applications, the current lack of high-quality, high-quantity discourse trees poses a severe shortcoming.

Discourse Parsing

Towards Domain-Independent Supervised Discourse Parsing Through Gradient Boosting

no code implementations · 18 Oct 2022 · Patrick Huber, Giuseppe Carenini

Discourse analysis and discourse parsing have shown great impact on many important problems in the field of Natural Language Processing (NLP).

Discourse Parsing · Domain Adaptation

Improving Topic Segmentation by Injecting Discourse Dependencies

no code implementations · COLING (CODI, CRAC) 2022 · Linzi Xing, Patrick Huber, Giuseppe Carenini

Recent neural supervised topic segmentation models achieve clearly superior effectiveness over unsupervised methods, thanks to the availability of large-scale training corpora sampled from Wikipedia.

Segmentation · Sentence

Towards Understanding Large-Scale Discourse Structures in Pre-Trained and Fine-Tuned Language Models

no code implementations · NAACL 2022 · Patrick Huber, Giuseppe Carenini

With a growing number of BERTology work analyzing different components of pre-trained language models, we extend this line of research through an in-depth analysis of discourse information in pre-trained and fine-tuned language models.

Predicting Above-Sentence Discourse Structure using Distant Supervision from Topic Segmentation

no code implementations · 12 Dec 2021 · Patrick Huber, Linzi Xing, Giuseppe Carenini

RST-style discourse parsing plays a vital role in many NLP tasks, revealing the underlying semantic/pragmatic structure of potentially complex and diverse documents.

Discourse Parsing · Sentence +1

W-RST: Towards a Weighted RST-style Discourse Framework

no code implementations · ACL 2021 · Patrick Huber, Wen Xiao, Giuseppe Carenini

Aiming for a better integration of data-driven and linguistically-inspired approaches, we explore whether RST Nuclearity, assigning a binary assessment of importance between text segments, can be replaced by automatically generated, real-valued scores, in what we call a Weighted-RST framework.
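The core representational change the abstract describes, replacing binary Nuclearity labels with real-valued importance scores, can be sketched as a small data structure; the class name and weight values below are illustrative, not from the paper:

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class WRSTNode:
    """A weighted-RST constituent: instead of a binary Nucleus/Satellite
    label, the left child carries a real-valued importance weight
    w_left in [0, 1]; the right child implicitly gets 1 - w_left.
    Leaves are elementary discourse units (EDUs), here plain strings."""
    left: Union["WRSTNode", str]
    right: Union["WRSTNode", str]
    w_left: float

# Classic RST would label these children Nucleus/Satellite; W-RST grades
# the relation with an automatically generated score instead.
tree = WRSTNode("EDU1", WRSTNode("EDU2", "EDU3", 0.35), 0.72)
```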

Dynamics of Water Confined in Mesopores with Variable Surface Interaction

no code implementations · 29 Dec 2020 · Aicha Jani, Mark Busch, J Benedikt Mietner, Jacques Ollivier, Markus Appel, Bernhard Frick, Jean-Marc Zanotti, Aziz Ghoufi, Patrick Huber, Michael Fröba, Denis Morineau

We have investigated the dynamics of liquid water confined in mesostructured porous silica (MCM-41) and periodic mesoporous organosilicas (PMOs) by incoherent quasielastic neutron scattering experiments.

Chemical Physics

Unsupervised Learning of Discourse Structures using a Tree Autoencoder

no code implementations · 17 Dec 2020 · Patrick Huber, Giuseppe Carenini

In this paper we are inferring general tree structures of natural text in multiple domains, showing promising results on a diverse set of tasks.

Discourse Parsing

Do We Really Need That Many Parameters In Transformer For Extractive Summarization? Discourse Can Help!

no code implementations · EMNLP (CODI) 2020 · Wen Xiao, Patrick Huber, Giuseppe Carenini

The multi-head self-attention of popular transformer models is widely used within Natural Language Processing (NLP), including for the task of extractive summarization.

Extractive Summarization · Sentence

Unleashing the Power of Neural Discourse Parsers - A Context and Structure Aware Approach Using Large Scale Pretraining

no code implementations · COLING 2020 · Grigorii Guz, Patrick Huber, Giuseppe Carenini

RST-based discourse parsing is an important NLP task with numerous downstream applications, such as summarization, machine translation and opinion mining.

Ranked #18 on Discourse Parsing on RST-DT (Standard Parseval (Span) metric)

Discourse Parsing · Machine Translation +2

MEGA RST Discourse Treebanks with Structure and Nuclearity from Scalable Distant Sentiment Supervision

1 code implementation · EMNLP 2020 · Patrick Huber, Giuseppe Carenini

The lack of large and diverse discourse treebanks hinders the application of data-driven approaches, such as deep-learning, to RST-style discourse parsing.

Discourse Parsing

From Sentiment Annotations to Sentiment Prediction through Discourse Augmentation

no code implementations · COLING 2020 · Patrick Huber, Giuseppe Carenini

Sentiment analysis, especially for long documents, plausibly requires methods capturing complex linguistic structures.

Sentiment Analysis

Predicting Discourse Structure using Distant Supervision from Sentiment

no code implementations · IJCNLP 2019 · Patrick Huber, Giuseppe Carenini

Results indicate that while our parser does not yet match the performance of a parser trained and tested on the same dataset (intra-domain), it does perform remarkably well on the much more difficult and arguably more useful task of inter-domain discourse structure prediction, where the parser is trained on one domain and tested/applied on another one.

Discourse Parsing · Multiple Instance Learning +2

A Hierarchical Approach to Neural Context-Aware Modeling

no code implementations · 27 Jul 2018 · Patrick Huber, Jan Niehues, Alex Waibel

Our approach overcomes recent limitations with extended narratives through a multi-layered computational approach to generate an abstract context representation.

Binary Classification · Language Modelling +1

Automated Evaluation of Out-of-Context Errors

1 code implementation · LREC 2018 · Patrick Huber, Jan Niehues, Alex Waibel

We present a new approach to evaluating computational models for the task of text understanding by means of out-of-context error detection.

Binary Classification · Language Modelling +2

On the determination of anti-neutrino spectra from nuclear reactors

1 code implementation · 3 Jun 2011 · Patrick Huber

We also perform a critical evaluation of the errors associated with our method to extract the anti-neutrino spectrum using synthetic beta spectra.

High Energy Physics - Phenomenology · High Energy Physics - Experiment · Nuclear Experiment · Nuclear Theory
