Search Results for author: Lizhen Qu

Found 70 papers, 30 papers with code

Personal Information Leakage Detection in Conversations

1 code implementation EMNLP 2020 Qiongkai Xu, Lizhen Qu, Zeyu Gao, Gholamreza Haffari

In this work, we propose to protect personal information by warning users of detected suspicious sentences generated by conversational assistants.

Language Modeling Language Modelling

Unbiased Sliced Wasserstein Kernels for High-Quality Audio Captioning

no code implementations 8 Feb 2025 Manh Luong, Khai Nguyen, Dinh Phung, Gholamreza Haffari, Lizhen Qu

However, the contrastive method ignores the temporal information when measuring similarity across acoustic and linguistic modalities, leading to inferior performance.

AudioCaps Audio captioning

Audio Is the Achilles' Heel: Red Teaming Audio Large Multimodal Models

1 code implementation 31 Oct 2024 Hao Yang, Lizhen Qu, Ehsan Shareghi, Gholamreza Haffari

Our results under these settings demonstrate that open-source audio LMMs suffer an average attack success rate of 69.14% on harmful audio questions, and exhibit safety vulnerabilities when distracted with non-speech audio noise.

Red Teaming Safety Alignment

The Best of Both Worlds: Bridging Quality and Diversity in Data Selection with Bipartite Graph

no code implementations 16 Oct 2024 Minghao Wu, Thuy-Trang Vu, Lizhen Qu, Gholamreza Haffari

In this paper, we introduce GraphFilter, a novel method that represents the dataset as a bipartite graph, linking sentences to their constituent n-grams.

Computational Efficiency Diversity

Jigsaw Puzzles: Splitting Harmful Questions to Jailbreak Large Language Models

1 code implementation 15 Oct 2024 Hao Yang, Lizhen Qu, Ehsan Shareghi, Gholamreza Haffari

Moreover, JSP achieves a state-of-the-art attack success rate of 92% on GPT-4 on the harmful query benchmark, and exhibits strong resistance to defence strategies.

Scalable Frame-based Construction of Sociocultural NormBases for Socially-Aware Dialogues

no code implementations 4 Oct 2024 Shilin Qu, Weiqing Wang, Xin Zhou, Haolan Zhan, Zhuang Li, Lizhen Qu, Linhao Luo, Yuan-Fang Li, Gholamreza Haffari

Our empirical results show: (i) the quality of the SCNs derived from synthetic data is comparable to that from real dialogues annotated with gold frames, and (ii) the quality of the SCNs extracted from real data, annotated with either silver (predicted) or gold frames, surpasses that without the frame annotations.

Information Retrieval RAG +1

Learning in Order! A Sequential Strategy to Learn Invariant Features for Multimodal Sentiment Analysis

no code implementations 5 Sep 2024 Xianbing Zhao, Lizhen Qu, Tao Feng, Jianfei Cai, Buzhou Tang

This work proposes a novel and simple sequential learning strategy to train models on videos and texts for multimodal sentiment analysis.

feature selection Multimodal Sentiment Analysis

From Pre-training Corpora to Large Language Models: What Factors Influence LLM Performance in Causal Discovery Tasks?

no code implementations 29 Jul 2024 Tao Feng, Lizhen Qu, Niket Tandon, Zhuang Li, Xiaoxi Kang, Gholamreza Haffari

Recent advances in artificial intelligence have seen Large Language Models (LLMs) demonstrate notable proficiency in causal discovery tasks.

Causal Discovery

Towards Probing Speech-Specific Risks in Large Multimodal Models: A Taxonomy, Benchmark, and Insights

1 code implementation 25 Jun 2024 Hao Yang, Lizhen Qu, Ehsan Shareghi, Gholamreza Haffari

Based on the taxonomy, we create a small-scale dataset for evaluating current LMMs' capability in detecting these categories of risk.

CausalScore: An Automatic Reference-Free Metric for Assessing Response Relevance in Open-Domain Dialogue Systems

no code implementations 25 Jun 2024 Tao Feng, Lizhen Qu, Xiaoxi Kang, Gholamreza Haffari

Automatically evaluating the quality of responses in open-domain dialogue systems is a challenging but crucial task.

Causal Discovery Inspired Unsupervised Domain Adaptation for Emotion-Cause Pair Extraction

no code implementations 18 Jun 2024 Yuncheng Hua, Yujin Huang, Shuo Huang, Tao Feng, Lizhen Qu, Chris Bain, Richard Bassed, Gholamreza Haffari

Inspired by causal discovery, we propose a novel deep latent model in the variational autoencoder (VAE) framework, which not only captures the underlying latent structures of data but also utilizes the easily transferable knowledge of emotions as the bridge to link the distributions of events in different domains.

Causal Discovery Emotion-Cause Pair Extraction +2

SCAR: Efficient Instruction-Tuning for Large Language Models via Style Consistency-Aware Response Ranking

1 code implementation 16 Jun 2024 Zhuang Li, Yuncheng Hua, Thuy-Trang Vu, Haolan Zhan, Lizhen Qu, Gholamreza Haffari

Recent studies emphasize that manually ensuring a consistent response style and maintaining high data quality in training sets can significantly improve the performance of fine-tuned Large Language Models (LLMs) while reducing the number of training examples needed.

Open-Ended Question Answering

Mixture-of-Skills: Learning to Optimize Data Usage for Fine-Tuning Large Language Models

no code implementations 13 Jun 2024 Minghao Wu, Thuy-Trang Vu, Lizhen Qu, Gholamreza Haffari

In this work, we propose a general, model-agnostic, reinforcement learning framework, Mixture-of-Skills (MoS), that learns to optimize data usage automatically during the fine-tuning process.

NAP^2: A Benchmark for Naturalness and Privacy-Preserving Text Rewriting by Learning from Human

no code implementations 6 Jun 2024 Shuo Huang, William MacLean, Xiaoxi Kang, Anqi Wu, Lizhen Qu, Qiongkai Xu, Zhuang Li, Xingliang Yuan, Gholamreza Haffari

Increasing concerns about privacy leakage issues in academia and industry arise when employing NLP models from third-party providers to process sensitive texts.

Privacy Preserving

Revisiting Deep Audio-Text Retrieval Through the Lens of Transportation

1 code implementation 16 May 2024 Manh Luong, Khai Nguyen, Nhat Ho, Reza Haf, Dinh Phung, Lizhen Qu

The Learning-to-match (LTM) framework proves to be an effective inverse optimal transport approach for learning the underlying ground metric between two sources of data, facilitating subsequent matching.

AudioCaps Event Detection +4

Generative Region-Language Pretraining for Open-Ended Object Detection

1 code implementation CVPR 2024 Chuang Lin, Yi Jiang, Lizhen Qu, Zehuan Yuan, Jianfei Cai

To address it, we formulate object detection as a generative problem and propose a simple framework named GenerateU, which can detect dense objects and generate their names in a free-form way.

Language Modeling Language Modelling +5

FewFedPIT: Towards Privacy-preserving and Few-shot Federated Instruction Tuning

no code implementations 10 Mar 2024 Zhuo Zhang, Jingyuan Zhang, Jintao Huang, Lizhen Qu, Hongzhi Zhang, Qifan Wang, Xun Zhou, Zenglin Xu

Federated instruction tuning (FedIT) has emerged as a promising solution, by consolidating collaborative training across multiple data owners, thereby resulting in a privacy-preserving learning model.

Federated Learning Few-Shot Learning +3

RENOVI: A Benchmark Towards Remediating Norm Violations in Socio-Cultural Conversations

no code implementations 17 Feb 2024 Haolan Zhan, Zhuang Li, Xiaoxi Kang, Tao Feng, Yuncheng Hua, Lizhen Qu, Yi Ying, Mei Rianto Chandra, Kelly Rosalin, Jureynolds Jureynolds, Suraj Sharma, Shilin Qu, Linhao Luo, Lay-Ki Soon, Zhaleh Semnani Azad, Ingrid Zukerman, Gholamreza Haffari

While collecting sufficient human-authored data is costly, synthetic conversations provide suitable amounts of data to help mitigate the scarcity of training data, as well as the chance to assess the alignment between LLMs and humans in the awareness of social norms.

Assistive Large Language Model Agents for Socially-Aware Negotiation Dialogues

no code implementations 29 Jan 2024 Yuncheng Hua, Lizhen Qu, Gholamreza Haffari

We introduce a simple tuning-free and label-free In-Context Learning (ICL) method to identify high-quality ICL exemplars for the remediator, where we propose a novel selection criterion, called value impact, to measure the quality of the negotiation outcomes.

In-Context Learning Language Modeling +2

Importance-Aware Data Augmentation for Document-Level Neural Machine Translation

no code implementations 27 Jan 2024 Minghao Wu, YuFei Wang, George Foster, Lizhen Qu, Gholamreza Haffari

Document-level neural machine translation (DocNMT) aims to generate translations that are both coherent and cohesive, in contrast to its sentence-level counterpart.

Data Augmentation Machine Translation +2

Adapting Large Language Models for Document-Level Machine Translation

no code implementations 12 Jan 2024 Minghao Wu, Thuy-Trang Vu, Lizhen Qu, George Foster, Gholamreza Haffari

We provide an in-depth analysis of these LLMs tailored for DocMT, examining translation errors, discourse phenomena, strategies for training and inference, the data efficiency of parallel documents, recent test set evaluations, and zero-shot crosslingual transfer.

Document Level Machine Translation Domain Generalization +2

TMID: A Comprehensive Real-world Dataset for Trademark Infringement Detection in E-Commerce

1 code implementation 8 Dec 2023 Tongxin Hu, Zhuang Li, Xin Jin, Lizhen Qu, Xin Zhang

Annually, e-commerce platforms incur substantial financial losses due to trademark infringements, making it crucial to identify and mitigate potential legal risks tied to merchant information registered to the platforms.

Legal Reasoning

Can ChatGPT Perform Reasoning Using the IRAC Method in Analyzing Legal Scenarios Like a Lawyer?

1 code implementation 23 Oct 2023 Xiaoxi Kang, Lizhen Qu, Lay-Ki Soon, Adnan Trakic, Terry Yue Zhuo, Patrick Charles Emerton, Genevieve Grant

Each scenario in the corpus is annotated with a complete IRAC analysis in a semi-structured format so that both machines and legal professionals are able to interpret and understand the annotations.

Legal Reasoning

FACTUAL: A Benchmark for Faithful and Consistent Textual Scene Graph Parsing

1 code implementation 27 May 2023 Zhuang Li, Yuyang Chai, Terry Yue Zhuo, Lizhen Qu, Gholamreza Haffari, Fei Li, Donghong Ji, Quan Hung Tran

Textual scene graph parsing has become increasingly important in various vision-language applications, including image caption evaluation and image retrieval.

Graph Similarity Human Judgment Correlation +4

The Best of Both Worlds: Combining Human and Machine Translations for Multilingual Semantic Parsing with Active Learning

no code implementations 22 May 2023 Zhuang Li, Lizhen Qu, Philip R. Cohen, Raj V. Tumuluri, Gholamreza Haffari

Multilingual semantic parsing aims to leverage the knowledge from the high-resource languages to improve low-resource semantic parsing, yet commonly suffers from the data imbalance problem.

Active Learning Semantic Parsing

Language Independent Neuro-Symbolic Semantic Parsing for Form Understanding

1 code implementation 8 May 2023 Bhanu Prakash Voutharoja, Lizhen Qu, Fatemeh Shiri

Our model parses a form into a word-relation graph in order to identify entities and relations jointly and reduce the time complexity of inference.

Form Graph Neural Network +2

Turning Flowchart into Dialog: Augmenting Flowchart-grounded Troubleshooting Dialogs via Synthetic Data Generation

1 code implementation 2 May 2023 Haolan Zhan, Sameen Maruf, Lizhen Qu, YuFei Wang, Ingrid Zukerman, Gholamreza Haffari

Flowchart-grounded troubleshooting dialogue (FTD) systems, which follow the instructions of a flowchart to diagnose users' problems in specific domains (e.g., vehicle, laptop), have been gaining research interest in recent years.

Data Augmentation Response Generation +2

Less is More: Mitigate Spurious Correlations for Open-Domain Dialogue Response Generation Models by Causal Discovery

1 code implementation 2 Mar 2023 Tao Feng, Lizhen Qu, Gholamreza Haffari

In this paper, we conduct the first study on spurious correlations for open-domain response generation models based on a corpus CGDIALOG curated in our work.

Causal Discovery Informativeness +1

Document Flattening: Beyond Concatenating Context for Document-Level Neural Machine Translation

no code implementations 16 Feb 2023 Minghao Wu, George Foster, Lizhen Qu, Gholamreza Haffari

Existing work in document-level neural machine translation commonly concatenates several consecutive sentences as a pseudo-document, and then learns inter-sentential dependencies.

Machine Translation Translation

When Federated Learning Meets Pre-trained Language Models' Parameter-Efficient Tuning Methods

1 code implementation 20 Dec 2022 Zhuo Zhang, Yuanhang Yang, Yong Dai, Lizhen Qu, Zenglin Xu

To facilitate the research of PETuning in FL, we also develop a federated tuning framework FedPETuning, which allows practitioners to exploit different PETuning methods under the FL training paradigm conveniently.

Federated Learning

Let's Negotiate! A Survey of Negotiation Dialogue Systems

no code implementations 18 Dec 2022 Haolan Zhan, YuFei Wang, Tao Feng, Yuncheng Hua, Suraj Sharma, Zhuang Li, Lizhen Qu, Gholamreza Haffari

Negotiation is one of the crucial abilities in human communication, and there has been a resurgent research interest in negotiation dialogue systems recently, whose goal is to empower intelligent agents with this ability so that they can efficiently help humans resolve conflicts or reach beneficial agreements.

Survey

Learning Object-Language Alignments for Open-Vocabulary Object Detection

1 code implementation 27 Nov 2022 Chuang Lin, Peize Sun, Yi Jiang, Ping Luo, Lizhen Qu, Gholamreza Haffari, Zehuan Yuan, Jianfei Cai

In this paper, we propose a novel open-vocabulary object detection framework directly learning from image-text pair data.

Object object-detection +4

Variational Autoencoder with Disentanglement Priors for Low-Resource Task-Specific Natural Language Generation

1 code implementation 27 Feb 2022 Zhuang Li, Lizhen Qu, Qiongkai Xu, Tongtong Wu, Tianyang Zhan, Gholamreza Haffari

In this paper, we propose a variational autoencoder with disentanglement priors, VAE-DPRIOR, for task-specific natural language generation with none or a handful of task-specific labeled examples.

Data Augmentation Disentanglement +3

Multimodal Transformer with Variable-length Memory for Vision-and-Language Navigation

1 code implementation 10 Nov 2021 Chuang Lin, Yi Jiang, Jianfei Cai, Lizhen Qu, Gholamreza Haffari, Zehuan Yuan

Vision-and-Language Navigation (VLN) is a task in which an agent is required to follow a language instruction to navigate to the goal position, which relies on ongoing interactions with the environment while moving.

Decoder Navigate +1

Total Recall: a Customized Continual Learning Method for Neural Semantic Parsers

1 code implementation EMNLP 2021 Zhuang Li, Lizhen Qu, Gholamreza Haffari

We conduct extensive experiments to study the research problems involved in continual semantic parsing and demonstrate that a neural semantic parser trained with TotalRecall achieves better performance than one trained directly with the SOTA continual learning algorithms and achieves a 3-6 times speedup compared to re-training from scratch.

Continual Learning Semantic Parsing

On Robustness of Neural Semantic Parsers

no code implementations EACL 2021 Shuo Huang, Zhuang Li, Lizhen Qu, Lei Pan

In this paper, we provide the empirical study on the robustness of semantic parsers in the presence of adversarial attacks.

Data Augmentation Semantic Parsing

COSMO: Conditional SEQ2SEQ-based Mixture Model for Zero-Shot Commonsense Question Answering

1 code implementation COLING 2020 Farhad Moghimifar, Lizhen Qu, Yue Zhuo, Mahsa Baktashmotlagh, Gholamreza Haffari

However, current approaches in this realm lack the ability to perform commonsense reasoning upon facing an unseen situation, mostly due to incapability of identifying a diverse range of implicit social relations.

Question Answering

Context Dependent Semantic Parsing: A Survey

1 code implementation COLING 2020 Zhuang Li, Lizhen Qu, Gholamreza Haffari

Semantic parsing is the task of translating natural language utterances into machine-readable meaning representations.

Semantic Parsing Survey

Privacy-Aware Text Rewriting

no code implementations WS 2019 Qiongkai Xu, Lizhen Qu, Chenchen Xu, Ran Cui

Biased decisions made by automatic systems have led to growing concerns in research communities.

Fairness Translation

ALTER: Auxiliary Text Rewriting Tool for Natural Language Generation

1 code implementation IJCNLP 2019 Qiongkai Xu, Chenchen Xu, Lizhen Qu

In this paper, we describe ALTER, an auxiliary text rewriting tool that facilitates the rewriting process for natural language generation tasks, such as paraphrasing, text simplification, fairness-aware text rewriting, and text style transfer.

Fairness Style Transfer +2

Maximal Divergence Sequential Autoencoder for Binary Software Vulnerability Detection

no code implementations ICLR 2019 Tue Le, Tuan Nguyen, Trung Le, Dinh Phung, Paul Montague, Olivier De Vel, Lizhen Qu

Due to the sharp increase in the severity of the threat imposed by software vulnerabilities, the detection of vulnerabilities in binary code has become an important concern in the software industry, such as the embedded systems industry, and in the field of computer security.

Computer Security Vulnerability Detection

f-GANs in an Information Geometric Nutshell

1 code implementation NeurIPS 2017 Richard Nock, Zac Cranko, Aditya Krishna Menon, Lizhen Qu, Robert C. Williamson

In this paper, we unveil a broad class of distributions for which such convergence happens --- namely, deformed exponential families, a wide superset of exponential families --- and show tight connections with the three other key GAN parameters: loss, game and architecture.

Demographic Inference on Twitter using Recursive Neural Networks

no code implementations ACL 2017 Sunghwan Mac Kim, Qiongkai Xu, Lizhen Qu, Stephen Wan, Cécile Paris

In social media, demographic inference is a critical task in order to gain a better understanding of a cohort and to facilitate interacting with one's audience.

Network Embedding

Collective Vertex Classification Using Recursive Neural Network

no code implementations 24 Jan 2017 Qiongkai Xu, Qing Wang, Chenchen Xu, Lizhen Qu

In this paper, we propose a graph-based recursive neural network framework for collective vertex classification.

Classification General Classification

Automatic Generation of Grounded Visual Questions

no code implementations 20 Dec 2016 Shijie Zhang, Lizhen Qu, ShaoDi You, Zhenglu Yang, Jiawan Zhang

In this paper, we propose the first model to be able to generate visually grounded questions with diverse types for a single image.

Diversity Question Generation +1

Named Entity Recognition for Novel Types by Transfer Learning

no code implementations EMNLP 2016 Lizhen Qu, Gabriela Ferraro, Liyuan Zhou, Weiwei Hou, Timothy Baldwin

In named entity recognition, we often don't have a large in-domain training corpus or a knowledge base with adequate coverage to train a model directly.

named-entity-recognition Named Entity Recognition +2

Making Deep Neural Networks Robust to Label Noise: a Loss Correction Approach

2 code implementations CVPR 2017 Giorgio Patrini, Alessandro Rozza, Aditya Menon, Richard Nock, Lizhen Qu

We present a theoretically grounded approach to train deep neural networks, including recurrent networks, subject to class-dependent label noise.

Ranked #2 on Image Classification on Clothing1M (using clean data) (using extra training data)

Diversity Learning with noisy labels +1

STransE: a novel embedding model of entities and relationships in knowledge bases

1 code implementation NAACL 2016 Dat Quoc Nguyen, Kairit Sirts, Lizhen Qu, Mark Johnson

Knowledge bases of real-world facts about entities and their relationships are useful resources for a variety of natural language processing tasks.

Knowledge Base Completion Link Prediction +2

Big Data Small Data, In Domain Out-of Domain, Known Word Unknown Word: The Impact of Word Representation on Sequence Labelling Tasks

no code implementations 21 Apr 2015 Lizhen Qu, Gabriela Ferraro, Liyuan Zhou, Weiwei Hou, Nathan Schneider, Timothy Baldwin

Word embeddings -- distributed word representations that can be learned from unlabelled data -- have been shown to have high utility in many natural language processing applications.

Chunking NER +4

Estimating Maximally Probable Constrained Relations by Mathematical Programming

no code implementations 4 Aug 2014 Lizhen Qu, Bjoern Andres

Estimating (learning) a maximally probable measure, given (a training set of) related and unrelated pairs, is a convex optimization problem.

Clustering General Classification +3
