Search Results for author: Frank Rudzicz

Found 89 papers, 27 papers with code

MeSHup: Corpus for Full Text Biomedical Document Indexing

1 code implementation LREC 2022 Xindi Wang, Robert E. Mercer, Frank Rudzicz

Medical Subject Heading (MeSH) indexing refers to the problem of assigning a given biomedical document with the most relevant labels from an extremely large set of MeSH terms.

Explainable Clinical Decision Support from Text

no code implementations EMNLP 2020 Jinyue Feng, Chantal Shaib, Frank Rudzicz

Clinical prediction models often use structured variables and provide outcomes that are not readily interpretable by clinicians.

Language Modelling Mortality Prediction

NIFTY Financial News Headlines Dataset

no code implementations 16 May 2024 Raeid Saqur, Ken Kato, Nicholas Vinden, Frank Rudzicz

We introduce and make publicly available the NIFTY Financial News Headlines dataset, designed to facilitate and advance research in financial market forecasting using large language models (LLMs).

Causal Language Modeling Language Modelling

LLM-Generated Black-box Explanations Can Be Adversarially Helpful

no code implementations 10 May 2024 Rohan Ajwani, Shashidhar Reddy Javaji, Frank Rudzicz, Zining Zhu

Some LLMs are not able to find alternative paths along simple graphs, indicating that their misleading explanations aren't produced by only logical deductions using complex knowledge.


Plug and Play with Prompts: A Prompt Tuning Approach for Controlling Text Generation

no code implementations 8 Apr 2024 Rohan Deepak Ajwani, Zining Zhu, Jonathan Rose, Frank Rudzicz

Transformer-based Large Language Models (LLMs) have shown exceptional language generation capabilities in response to text-based prompts.

Language Modelling Sentiment Analysis +1

Immunization against harmful fine-tuning attacks

no code implementations 26 Feb 2024 Domenic Rosati, Jan Wehner, Kai Williams, Łukasz Bartoszcze, Jan Batzner, Hassan Sajjad, Frank Rudzicz

Approaches to aligning large language models (LLMs) with human values have focused on correcting misalignment that emerges from pretraining.

A State-Vector Framework for Dataset Effects

1 code implementation 17 Oct 2023 Esmat Sahak, Zining Zhu, Frank Rudzicz

The impressive success of recent deep neural network (DNN)-based systems is significantly influenced by the high-quality datasets used in training.

Measuring Information in Text Explanations

no code implementations 6 Oct 2023 Zining Zhu, Frank Rudzicz

Text-based explanation is a particularly promising approach in explainable AI, but the evaluation of text explanations is method-dependent.


Situated Natural Language Explanations

no code implementations 27 Aug 2023 Zining Zhu, Haoming Jiang, Jingfeng Yang, Sreyashi Nag, Chao Zhang, Jie Huang, Yifan Gao, Frank Rudzicz, Bing Yin

Situated NLE provides a perspective and facilitates further research on the generation and evaluation of explanations.

Prompt Engineering

SurGNN: Explainable visual scene understanding and assessment of surgical skill using graph neural networks

no code implementations 24 Aug 2023 Shuja Khalid, Frank Rudzicz

By using GNNs to analyze the complex visual data of surgical procedures represented as graph structures, relevant features can be extracted and surgical skill can be predicted.

Scene Understanding

Investigating the Learning Behaviour of In-context Learning: A Comparison with Supervised Learning

1 code implementation 28 Jul 2023 Xindi Wang, YuFei Wang, Can Xu, Xiubo Geng, BoWen Zhang, Chongyang Tao, Frank Rudzicz, Robert E. Mercer, Daxin Jiang

Large language models (LLMs) have shown remarkable capacity for in-context learning (ICL), where learning a new task from just a few training examples is done without being explicitly pre-trained.

In-Context Learning

Improving Automatic Quotation Attribution in Literary Novels

no code implementations 7 Jul 2023 Krishnapriya Vishnubhotla, Frank Rudzicz, Graeme Hirst, Adam Hammond

Current models for quotation attribution in literary novels assume varying levels of available information in their training and test data, which poses a challenge for in-the-wild inference.


MLHOps: Machine Learning for Healthcare Operations

no code implementations 4 May 2023 Faiza Khan Khattak, Vallijah Subasri, Amrit Krishnan, Elham Dolatabadi, Deval Pandya, Laleh Seyyed-Kalantari, Frank Rudzicz

We cover the foundational concepts of general machine learning operations and describe the initial setup of MLHOps pipelines (including data sources, preparation, engineering, and tools).


RefiNeRF: Modelling dynamic neural radiance fields with inconsistent or missing camera parameters

no code implementations 15 Mar 2023 Shuja Khalid, Frank Rudzicz

We demonstrate the effectiveness of our method on a variety of static and dynamic scenes and show that it outperforms traditional SfM and MVS approaches.

Novel View Synthesis

Predicting Fine-Tuning Performance with Probing

1 code implementation 13 Oct 2022 Zining Zhu, Soroosh Shahtalebi, Frank Rudzicz

Large NLP models have recently shown impressive performance in language understanding tasks, typically evaluated by their fine-tuned performance.

wildNeRF: Complete view synthesis of in-the-wild dynamic scenes captured using sparse monocular data

no code implementations 20 Sep 2022 Shuja Khalid, Frank Rudzicz

We present a novel neural radiance model that is trainable in a self-supervised manner for novel-view synthesis of dynamic unstructured scenes.

Novel View Synthesis

Relevance in Dialogue: Is Less More? An Empirical Comparison of Existing Metrics, and a Novel Simple Metric

1 code implementation NLP4ConvAI (ACL) 2022 Ian Berlot-Attwell, Frank Rudzicz

Our proposed metric achieves state-of-the-art performance on the HUMOD dataset while reducing measured sensitivity to dataset by 37%-66%.

Language Modelling

The Road to Explainability is Paved with Bias: Measuring the Fairness of Explanations

no code implementations 6 May 2022 Aparna Balagopalan, Haoran Zhang, Kimia Hamidieh, Thomas Hartvigsen, Frank Rudzicz, Marzyeh Ghassemi

Across two different blackbox model architectures and four popular explainability methods, we find that the approximation quality of explanation models, also known as the fidelity, differs significantly between subgroups.

BIG-bench Machine Learning Fairness

Detoxifying Language Models with a Toxic Corpus

no code implementations LTEDI (ACL) 2022 Yoon A Park, Frank Rudzicz

Existing studies have investigated the tendency of autoregressive language models to generate contexts that exhibit undesired biases and toxicity.

Text Generation

MeSHup: A Corpus for Full Text Biomedical Document Indexing

no code implementations 28 Apr 2022 Xindi Wang, Robert E. Mercer, Frank Rudzicz

Medical Subject Heading (MeSH) indexing refers to the problem of assigning a given biomedical document with the most relevant labels from an extremely large set of MeSH terms.

Doctor XAvIer: Explainable Diagnosis on Physician-Patient Dialogues and XAI Evaluation

1 code implementation BioNLP (ACL) 2022 Hillary Ngai, Frank Rudzicz

We introduce Doctor XAvIer, a BERT-based diagnostic system that extracts relevant clinical data from transcribed patient-doctor dialogues and explains predictions using feature attribution methods.

Classification Explainable Artificial Intelligence (XAI) +6

KenMeSH: Knowledge-enhanced End-to-end Biomedical Text Labelling

1 code implementation ACL 2022 Xindi Wang, Robert E. Mercer, Frank Rudzicz

Currently, Medical Subject Headings (MeSH) are manually assigned to every biomedical article published and subsequently recorded in the PubMed database to facilitate retrieving relevant information.

On the data requirements of probing

1 code implementation Findings (ACL) 2022 Zining Zhu, Jixuan Wang, Bai Li, Frank Rudzicz

As large and powerful neural language models are developed, researchers have been increasingly interested in developing diagnostic tools to probe them.

Neural reality of argument structure constructions

1 code implementation ACL 2022 Bai Li, Zining Zhu, Guillaume Thomas, Frank Rudzicz, Yang Xu

Second, in a "Jabberwocky" priming-based experiment, we find that LMs associate ASCs with meaning, even in semantically nonsensical sentences.


Improving greedy core-set configurations for active learning with uncertainty-scaled distances

no code implementations 9 Feb 2022 Yuchen Li, Frank Rudzicz

We scale perceived distances of the core-set algorithm by a factor of uncertainty and search for low-confidence configurations, finding significant improvements in sample efficiency across CIFAR10/100 and SVHN image classification, especially in larger acquisition sizes.

Active Learning Image Classification

Quantifying the Task-Specific Information in Text-Based Classifications

no code implementations 17 Oct 2021 Zining Zhu, Aparna Balagopalan, Marzyeh Ghassemi, Frank Rudzicz

This framework allows us to compare across datasets, saying that, apart from a set of "shortcut features", classifying each sample in the Multi-NLI task involves around 0.4 nats more TSI than in the Quora Question Pair.

Language Modelling via Learning to Rank

no code implementations 13 Oct 2021 Arvid Frydenlund, Gagandeep Singh, Frank Rudzicz

We also develop a method using N-grams to create a non-probabilistic teacher which generates the ranks without the need of a pre-trained LM.

Knowledge Distillation Language Modelling +2

An unsupervised framework for tracing textual sources of moral change

1 code implementation Findings (EMNLP) 2021 Aida Ramezani, Zining Zhu, Frank Rudzicz, Yang Xu

Morality plays an important role in social well-being, but people's moral perception is not stable and changes over time.

What do writing features tell us about AI papers?

1 code implementation 13 Jul 2021 Zining Zhu, Bai Li, Yang Xu, Frank Rudzicz

As the numbers of submissions to conferences grow quickly, the task of assessing the quality of academic papers automatically, convincingly, and with high accuracy attracts increasing attention.

How is BERT surprised? Layerwise detection of linguistic anomalies

1 code implementation ACL 2021 Bai Li, Zining Zhu, Guillaume Thomas, Yang Xu, Frank Rudzicz

Transformer language models have shown remarkable ability in detecting when a word is anomalous in context, but likelihood scores offer no information about the cause of the anomaly.

Density Estimation

Challenges for Reinforcement Learning in Healthcare

no code implementations 9 Mar 2021 Elsa Riachi, Muhammad Mamdani, Michael Fralick, Frank Rudzicz

Many healthcare decisions involve navigating through a multitude of treatment options in a sequential and iterative manner to find an optimal treatment pathway with the goal of an optimal patient outcome.

reinforcement-learning Reinforcement Learning (RL)

Speaker attribution with voice profiles by graph-based semi-supervised learning

no code implementations 6 Feb 2021 Jixuan Wang, Xiong Xiao, Jian Wu, Ranjani Ramamurthy, Frank Rudzicz, Michael Brudno

Speaker attribution is required in many real-world applications, such as meeting transcription, where speaker identity is assigned to each utterance according to speaker voice profiles.

Speaker Identification

BENDR: using transformers and a contrastive self-supervised learning task to learn from massive amounts of EEG data

1 code implementation 28 Jan 2021 Demetres Kostas, Stephane Aroca-Ouellette, Frank Rudzicz

Deep neural networks (DNNs) used for brain-computer-interface (BCI) classification are commonly expected to learn general features when trained across a variety of contexts, such that these features could be fine-tuned to specific contexts.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +6

Understanding Adversarial Attacks on Autoencoders

no code implementations 1 Jan 2021 Elsa Riachi, Frank Rudzicz

We show that training an autoencoder on adversarial input-target pairs leads to low reconstruction error on the standard test set, suggesting that adversarial attacks on autoencoders are predictive.

Compressive Sensing Knowledge Distillation

Exploring Text Specific and Blackbox Fairness Algorithms in Multimodal Clinical NLP

1 code implementation EMNLP (ClinicalNLP) 2020 John Chen, Ian Berlot-Attwell, Safwan Hossain, Xindi Wang, Frank Rudzicz

Clinical machine learning is increasingly multimodal, collected in both structured tabular formats and unstructured forms such as freetext.

Fairness Word Embeddings

Semantic coordinates analysis reveals language changes in the AI field

no code implementations 1 Nov 2020 Zining Zhu, Yang Xu, Frank Rudzicz

Semantic shifts can reflect changes in beliefs across hundreds of years, but it is less clear whether trends in fast-changing communities across a short time can be detected.

On Losses for Modern Language Models

1 code implementation EMNLP 2020 Stephane Aroca-Ouellette, Frank Rudzicz

BERT set many state-of-the-art results over varied NLU benchmarks by pre-training over two tasks: masked language modelling (MLM) and next sentence prediction (NSP), the latter of which has been highly criticized.

Language Modelling Sentence +1

Word class flexibility: A deep contextualized approach

2 code implementations EMNLP 2020 Bai Li, Guillaume Thomas, Yang Xu, Frank Rudzicz

Word class flexibility refers to the phenomenon whereby a single word form is used across different grammatical categories.

Word Embeddings

An information theoretic view on selecting linguistic probes

no code implementations EMNLP 2020 Zining Zhu, Frank Rudzicz

Hewitt and Liang (2019) showed that a high performance on diagnostic classification itself is insufficient, because it can be attributed to either "the representation being rich in knowledge", or "the probe learning the task", which Pimentel et al. (2020) challenged.

General Classification valid

Ethics of Artificial Intelligence in Surgery

no code implementations 28 Jul 2020 Frank Rudzicz, Raeid Saqur

Here we discuss the four key principles of biomedical ethics from a surgical context.

Ethics Fairness

To BERT or Not To BERT: Comparing Speech and Language-based Approaches for Alzheimer's Disease Detection

no code implementations 26 Jul 2020 Aparna Balagopalan, Benjamin Eyre, Frank Rudzicz, Jekaterina Novikova

Research related to automatically detecting Alzheimer's disease (AD) is important, given the high prevalence of AD and the high cost of traditional methods.

Alzheimer's Disease Detection

Sequential Explanations with Mental Model-Based Policies

no code implementations 17 Jul 2020 Arnold YS Yeung, Shalmali Joshi, Joseph Jay Williams, Frank Rudzicz

The act of explaining across two parties is a feedback loop, where one provides information on what needs to be explained and the other provides an explanation relevant to this information.

Speaker diarization with session-level speaker embedding refinement using graph neural networks

no code implementations 22 May 2020 Jixuan Wang, Xiong Xiao, Jian Wu, Ranjani Ramamurthy, Frank Rudzicz, Michael Brudno

Deep speaker embedding models have been commonly used as a building block for speaker diarization systems; however, the speaker embedding model is usually trained according to a global loss defined on the training data, which could be sub-optimal for distinguishing speakers locally in a specific meeting session.

Clustering speaker-diarization +1

Identification of primary and collateral tracks in stuttered speech

no code implementations LREC 2020 Rachid Riad, Anne-Catherine Bachoud-Lévi, Frank Rudzicz, Emmanuel Dupoux

Here, we introduce a new evaluation framework for disfluency detection inspired by the clinical and NLP perspective together with the theory of performance from Clark (1996), which distinguishes between primary and collateral tracks.

Representation Learning for Discovering Phonemic Tone Contours

no code implementations WS 2020 Bai Li, Jing Yi Xie, Frank Rudzicz

Tone is a prosodic feature used to distinguish words in many languages, some of which are endangered and scarcely documented.

Decoder Representation Learning

Lexical Features Are More Vulnerable, Syntactic Features Have More Predictive Power

no code implementations WS 2019 Jekaterina Novikova, Aparna Balagopalan, Ksenia Shkaruta, Frank Rudzicz

Understanding the vulnerability of linguistic features extracted from noisy text is important for both developing better health text classification models and for interpreting vulnerabilities of natural language models.

General Classification text-classification +1

Variations on the Chebyshev-Lagrange Activation Function

2 code implementations 24 Jun 2019 Yuchen Li, Frank Rudzicz, Jekaterina Novikova

We seek to improve the data efficiency of neural networks and present novel implementations of parameterized piece-wise polynomial activation functions.

Predicting ICU transfers using text messages between nurses and doctors

no code implementations WS 2019 Faiza Khan Khattak, Chloé Pou-Prom, Robert Wu, Frank Rudzicz

We explore the use of real-time clinical information, i.e., text messages sent between nurses and doctors regarding patient conditions, in order to predict transfer to the intensive care unit (ICU).

Multilingual prediction of Alzheimer's disease through domain adaptation and concept-based language modelling

no code implementations NAACL 2019 Kathleen C. Fraser, Nicklas Linz, Bai Li, Kristina Lundholm Fors, Frank Rudzicz, Alexandra König, Jan Alexandersson, Philippe Robert, Dimitrios Kokkinakis

There is growing evidence that changes in speech and language may be early markers of dementia, but much of the previous NLP work in this area has been limited by the size of the available datasets.

Domain Adaptation Language Modelling

Generative Adversarial Networks for text using word2vec intermediaries

1 code implementation WS 2019 Akshay Budhkar, Krishnapriya Vishnubhotla, Safwan Hossain, Frank Rudzicz

Generative adversarial networks (GANs) have shown considerable success, especially in the realistic generation of images.

Word Embeddings

Detecting dementia in Mandarin Chinese using transfer learning from a parallel corpus

no code implementations NAACL 2019 Bai Li, Yi-Te Hsu, Frank Rudzicz

Machine learning has shown promise for automatic detection of Alzheimer's disease (AD) through speech; however, efforts are hampered by a scarcity of data, especially in languages other than English.

BIG-bench Machine Learning Machine Translation +2

Centroid-based deep metric learning for speaker recognition

no code implementations 6 Feb 2019 Jixuan Wang, Kuan-Chieh Wang, Marc Law, Frank Rudzicz, Michael Brudno

Speaker embedding models that utilize neural networks to map utterances to a space where distances reflect similarity between speakers have driven recent progress in the speaker recognition task.

Few-Shot Image Classification Few-Shot Learning +4

Robustness against the channel effect in pathological voice detection

no code implementations 26 Nov 2018 Yi-Te Hsu, Zining Zhu, Chi-Te Wang, Shih-Hau Fang, Frank Rudzicz, Yu Tsao

In this study, we propose a detection system for pathological voice, which is robust against the channel effect.

Unsupervised Domain Adaptation

ChainGAN: A sequential approach to GANs

1 code implementation ICLR 2019 Safwan Hossain, Kiarash Jamali, Yuchen Li, Frank Rudzicz

Current approaches attempt to learn the transformation from a noise sample to a generated data sample in one shot.

Augmenting word2vec with latent Dirichlet allocation within a clinical application

no code implementations NAACL 2019 Akshay Budhkar, Frank Rudzicz

This paper presents three hybrid models that directly combine latent Dirichlet allocation and word embedding for distinguishing between speakers with and without Alzheimer's disease from transcripts of picture descriptions.

Dropout during inference as a model for neurological degeneration in an image captioning network

no code implementations 11 Aug 2018 Bai Li, Ran Zhang, Frank Rudzicz

We replicate a variation of the image captioning architecture by Vinyals et al. (2015), then introduce dropout during inference mode to simulate the effects of neurodegenerative diseases like Alzheimer's disease (AD) and Wernicke's aphasia (WA).

Image Captioning

Deconfounding age effects with fair representation learning when assessing dementia

no code implementations 19 Jul 2018 Zining Zhu, Jekaterina Novikova, Frank Rudzicz

One of the most prevalent symptoms among the elderly population, dementia, can be detected by classifiers trained on linguistic features extracted from narrative transcripts.

Representation Learning

Semi-supervised classification by reaching consensus among modalities

no code implementations 23 May 2018 Zining Zhu, Jekaterina Novikova, Frank Rudzicz

Deep learning has demonstrated abilities to learn complex structures, but it can be restricted by available data.

Classification General Classification +1

On the importance of normative data in speech-based assessment

no code implementations 30 Nov 2017 Zeinab Noorian, Chloé Pou-Prom, Frank Rudzicz

Data sets for identifying Alzheimer's disease (AD) are often relatively sparse, which limits their ability to train generalizable models.

Binary Classification General Classification

Detecting Anxiety through Reddit

1 code implementation WS 2017 Judy Hanwen Shen, Frank Rudzicz

Previous investigations into detecting mental illnesses through social media have predominately focused on detecting depression through Twitter corpora.

Language Modelling Word Embeddings

Identifying and Avoiding Confusion in Dialogue with People with Alzheimer's Disease

no code implementations CL 2017 Hamidreza Chinaei, Leila Chan Currie, Andrew Danks, Hubert Lin, Tejas Mehta, Frank Rudzicz

Alzheimer's disease (AD) is an increasingly prevalent cognitive disorder in which memory, language, and executive function deteriorate, usually in that order.

An Adaptive Psychoacoustic Model for Automatic Speech Recognition

no code implementations 14 Sep 2016 Peng Dai, Xue Teng, Frank Rudzicz, Ing Yann Soon

Experiments are carried out on the AURORA2 database and show that the word recognition rate using our proposed feature extraction method is significantly increased over the baseline.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Sex, drugs, and violence

no code implementations 11 Aug 2016 Stefania Raimondo, Frank Rudzicz

Automatically detecting inappropriate content can be a difficult NLP task, requiring understanding context and innuendo, not just identifying specific keywords.


Predicting health inspection results from online restaurant reviews

no code implementations 17 Mar 2016 Samantha Wong, Hamidreza Chinaei, Frank Rudzicz

Informatics around public health are increasingly shifting from the professional to the public spheres.

General Classification

Learning measures of semi-additive behaviour

no code implementations 9 Dec 2015 Hamidreza Chinaei, Mohsen Rais-Ghasem, Frank Rudzicz

In business analytics, measure values, such as sales numbers or volumes of cargo transported, are often summed along values of one or more corresponding categories, such as time or shipping container.
