no code implementations • 4 Oct 2024 • Ernie Chang, Matteo Paltenghi, Yang Li, Pin-Jie Lin, Changsheng Zhao, Patrick Huber, Zechun Liu, Rastislav Rabatin, Yangyang Shi, Vikas Chandra
Scaling laws in language modeling traditionally quantify training loss as a function of dataset size and model parameters, providing compute-optimal estimates but often neglecting the impact of data quality on model generalization.
no code implementations • 20 Aug 2024 • Patrick Huber, Arash Einolghozati, Rylan Conway, Kanika Narang, Matt Smith, Waqar Nayyar, Adithya Sagar, Ahmed Aly, Akshat Shrivastava
This is a typical on-device scenario for specialist SLMs, allowing for open-domain model responses, without requiring the model to "memorize" world knowledge in its limited weights.
no code implementations • 28 Feb 2024 • Sahithya Ravi, Patrick Huber, Akshat Shrivastava, Aditya Sagar, Ahmed Aly, Vered Shwartz, Arash Einolghozati
The emergence of Large Language Models (LLMs) has brought to light promising language generation capabilities, particularly in performing tasks like complex reasoning and creative writing.
1 code implementation • 16 Feb 2024 • Zekun Li, Zhiyu Zoey Chen, Mike Ross, Patrick Huber, Seungwhan Moon, Zhaojiang Lin, Xin Luna Dong, Adithya Sagar, Xifeng Yan, Paul A. Crook
We also show that by fine-tuning on a small collection of diverse task-oriented dialogues, we can equip modestly sized models, specifically a 13B parameter LLaMA2-Chat model, with function-calling capabilities and DST performance comparable to ChatGPT while maintaining their chat capabilities.
1 code implementation • 20 Oct 2023 • Yuxiao Qu, Jinmeng Rao, Song Gao, Qianheng Zhang, Wei-Lun Chao, Yu Su, Michelle Miller, Alfonso Morales, Patrick Huber
This paper proposes FLEE-GNN, a novel Federated Learning System for Edge-Enhanced Graph Neural Networks, designed to overcome these challenges and enhance the analysis of the geospatial resilience of multicommodity food flow networks, a type of spatial network.
no code implementations • 12 Feb 2023 • Chuyuan Li, Patrick Huber, Wen Xiao, Maxime Amblard, Chloé Braud, Giuseppe Carenini
As a result, we explore approaches to build discourse structures for dialogues, based on attention matrices from Pre-trained Language Models (PLMs).
no code implementations • 18 Oct 2022 • Patrick Huber, Giuseppe Carenini
Discourse parsing is an essential upstream task in Natural Language Processing with strong implications for many real-world applications.
no code implementations • 18 Oct 2022 • Patrick Huber, Giuseppe Carenini
With a growing need for robust and general discourse structures in many downstream tasks and real-world applications, the current lack of high-quality, large-scale discourse trees is a severe shortcoming.
no code implementations • 18 Oct 2022 • Patrick Huber, Giuseppe Carenini
Discourse analysis and discourse parsing have shown great impact on many important problems in the field of Natural Language Processing (NLP).
no code implementations • COLING (CODI, CRAC) 2022 • Linzi Xing, Patrick Huber, Giuseppe Carenini
Recent neural supervised topic segmentation models achieve superior effectiveness over unsupervised methods, thanks to the availability of large-scale training corpora sampled from Wikipedia.
no code implementations • NAACL 2022 • Patrick Huber, Giuseppe Carenini
With a growing number of BERTology work analyzing different components of pre-trained language models, we extend this line of research through an in-depth analysis of discourse information in pre-trained and fine-tuned language models.
no code implementations • 12 Dec 2021 • Patrick Huber, Linzi Xing, Giuseppe Carenini
RST-style discourse parsing plays a vital role in many NLP tasks, revealing the underlying semantic/pragmatic structure of potentially complex and diverse documents.
1 code implementation • Findings (NAACL) 2022 • Patrick Huber, Armen Aghajanyan, Barlas Oğuz, Dmytro Okhonko, Wen-tau Yih, Sonal Gupta, Xilun Chen
Consequently, we propose a novel QA dataset based on the Common Crawl project in this paper.
no code implementations • ACL 2021 • Patrick Huber, Wen Xiao, Giuseppe Carenini
Aiming for a better integration of data-driven and linguistically-inspired approaches, we explore whether RST Nuclearity, assigning a binary assessment of importance between text segments, can be replaced by automatically generated, real-valued scores, in what we call a Weighted-RST framework.
1 code implementation • NAACL 2021 • Wen Xiao, Patrick Huber, Giuseppe Carenini
Previous work indicates that discourse information benefits summarization.
no code implementations • 29 Dec 2020 • Aicha Jani, Mark Busch, J Benedikt Mietner, Jacques Ollivier, Markus Appel, Bernhard Frick, Jean-Marc Zanotti, Aziz Ghoufi, Patrick Huber, Michael Fröba, Denis Morineau
We have investigated the dynamics of liquid water confined in mesostructured porous silica (MCM-41) and periodic mesoporous organosilicas (PMOs) by incoherent quasielastic neutron scattering experiments.
Chemical Physics
no code implementations • 17 Dec 2020 • Patrick Huber, Giuseppe Carenini
In this paper, we infer general tree structures of natural text across multiple domains, showing promising results on a diverse set of tasks.
no code implementations • EMNLP (CODI) 2020 • Wen Xiao, Patrick Huber, Giuseppe Carenini
The multi-head self-attention of popular transformer models is widely used within Natural Language Processing (NLP), including for the task of extractive summarization.
no code implementations • COLING 2020 • Grigorii Guz, Patrick Huber, Giuseppe Carenini
RST-based discourse parsing is an important NLP task with numerous downstream applications, such as summarization, machine translation and opinion mining.
Ranked #18 on Discourse Parsing on RST-DT (Standard Parseval (Span) metric)
no code implementations • 6 Nov 2020 • Grigorii Guz, Patrick Huber, Giuseppe Carenini
RST-based discourse parsing is an important NLP task with numerous downstream applications, such as summarization, machine translation and opinion mining.
Ranked #9 on Discourse Parsing on Instructional-DT (Instr-DT)
1 code implementation • EMNLP 2020 • Patrick Huber, Giuseppe Carenini
The lack of large and diverse discourse treebanks hinders the application of data-driven approaches, such as deep-learning, to RST-style discourse parsing.
no code implementations • COLING 2020 • Patrick Huber, Giuseppe Carenini
Sentiment analysis, especially for long documents, plausibly requires methods capturing complex linguistic structures.
no code implementations • IJCNLP 2019 • Patrick Huber, Giuseppe Carenini
Results indicate that while our parser does not yet match the performance of a parser trained and tested on the same dataset (intra-domain), it does perform remarkably well on the much more difficult and arguably more useful task of inter-domain discourse structure prediction, where the parser is trained on one domain and tested/applied on another one.
no code implementations • 27 Jul 2018 • Patrick Huber, Jan Niehues, Alex Waibel
Our approach overcomes recent limitations with extended narratives through a multi-layered computational approach to generate an abstract context representation.
1 code implementation • LREC 2018 • Patrick Huber, Jan Niehues, Alex Waibel
We present a new approach to evaluate computational models for the task of text understanding by the means of out-of-context error detection.
1 code implementation • 3 Jun 2011 • Patrick Huber
We also perform a critical evaluation of the errors associated with our method to extract the anti-neutrino spectrum using synthetic beta spectra.
High Energy Physics - Phenomenology • High Energy Physics - Experiment • Nuclear Experiment • Nuclear Theory