Search Results for author: Michael R. Lyu

Found 93 papers, 47 papers with code

Tools and Benchmarks for Automated Log Parsing

7 code implementations 8 Nov 2018 Jieming Zhu, Shilin He, Jinyang Liu, Pinjia He, Qi Xie, Zibin Zheng, Michael R. Lyu

Logs are imperative in the development and maintenance process of many software systems.

Software Engineering

Loghub: A Large Collection of System Log Datasets for AI-driven Log Analytics

8 code implementations 14 Aug 2020 Shilin He, Jieming Zhu, Pinjia He, Michael R. Lyu

To fill this significant gap and facilitate more research on AI-driven log analytics, we have collected and released loghub, a large collection of system log datasets.

Software Engineering

Heterogeneous Anomaly Detection for Software Systems via Semi-supervised Cross-modal Attention

2 code implementations 14 Feb 2023 Cheryl Lee, Tianyi Yang, Zhuangbin Chen, Yuxin Su, Yongqiang Yang, Michael R. Lyu

Our study demonstrates that logs and metrics can manifest system anomalies collaboratively and complementarily, and that neither of them alone is sufficient.

Anomaly Detection

Experience Report: Deep Learning-based System Log Analysis for Anomaly Detection

1 code implementation 13 Jul 2021 Zhuangbin Chen, Jinyang Liu, Wenwei Gu, Yuxin Su, Michael R. Lyu

To better understand the characteristics of different anomaly detectors, in this paper, we provide a comprehensive review and evaluation of five popular neural networks used by six state-of-the-art methods.

Anomaly Detection

Topic-Aware Neural Keyphrase Generation for Social Media Language

2 code implementations ACL 2019 Yue Wang, Jing Li, Hou Pong Chan, Irwin King, Michael R. Lyu, Shuming Shi

Further discussions show that our model learns meaningful topics, which interprets its superiority in social media keyphrase generation.

Keyphrase Generation

DDFlow: Learning Optical Flow with Unlabeled Data Distillation

1 code implementation 25 Feb 2019 Pengpeng Liu, Irwin King, Michael R. Lyu, Jia Xu

We present DDFlow, a data distillation approach to learning optical flow estimation from unlabeled data.

Optical Flow Estimation

Improving the Transferability of Adversarial Samples With Adversarial Transformations

1 code implementation CVPR 2021 Weibin Wu, Yuxin Su, Michael R. Lyu, Irwin King

Although deep neural networks (DNNs) have achieved tremendous performance in diverse vision challenges, they are surprisingly susceptible to adversarial examples, which are born of intentionally perturbing benign samples in a human-imperceptible fashion.

Transferable Adversarial Attacks on Vision Transformers with Token Gradient Regularization

2 code implementations CVPR 2023 Jianping Zhang, Yizhan Huang, Weibin Wu, Michael R. Lyu

However, the variance of the back-propagated gradients in intermediate blocks of ViTs may still be large, which may make the generated adversarial samples focus on some model-specific features and get stuck in poor local optima.

Interconnected Question Generation with Coreference Alignment and Conversation Flow Modeling

1 code implementation ACL 2019 Yifan Gao, Piji Li, Irwin King, Michael R. Lyu

The coreference alignment modeling explicitly aligns coreferent mentions in conversation history with corresponding pronominal references in generated questions, which makes generated questions interconnected to conversation history.

Question Answering Question Generation +2

HiGRU: Hierarchical Gated Recurrent Units for Utterance-level Emotion Recognition

1 code implementation NAACL 2019 Wenxiang Jiao, Haiqin Yang, Irwin King, Michael R. Lyu

In this paper, we address three challenges in utterance-level emotion recognition in dialogue systems: (1) the same word can deliver different emotions in different contexts; (2) some emotions are rarely seen in general dialogues; (3) long-range contextual information is hard to capture effectively.

Emotion Recognition

Generating Distractors for Reading Comprehension Questions from Real Examinations

2 code implementations8 Sep 2018 Yifan Gao, Lidong Bing, Piji Li, Irwin King, Michael R. Lyu

We investigate the task of distractor generation for multiple choice reading comprehension questions from examinations.

Distractor Generation Multiple-choice +2

Emotionally Numb or Empathetic? Evaluating How LLMs Feel Using EmotionBench

1 code implementation 7 Aug 2023 Jen-tse Huang, Man Ho Lam, Eric John Li, Shujie Ren, Wenxuan Wang, Wenxiang Jiao, Zhaopeng Tu, Michael R. Lyu

Evaluating Large Language Models' (LLMs) anthropomorphic capabilities has become increasingly important in contemporary discourse.

Logzip: Extracting Hidden Structures via Iterative Clustering for Log Compression

1 code implementation 24 Sep 2019 Jinyang Liu, Jieming Zhu, Shilin He, Pinjia He, Zibin Zheng, Michael R. Lyu

Data compression is essential to reduce the cost of log storage.

Databases Software Engineering

CLEVA: Chinese Language Models EVAluation Platform

1 code implementation 9 Aug 2023 Yanyang Li, Jianqiao Zhao, Duo Zheng, Zi-Yuan Hu, Zhi Chen, Xiaohui Su, Yongfeng Huang, Shijia Huang, Dahua Lin, Michael R. Lyu, LiWei Wang

With the continuous emergence of Chinese Large Language Models (LLMs), how to evaluate a model's capabilities has become an increasingly significant issue.

No More Fine-Tuning? An Experimental Evaluation of Prompt Tuning in Code Intelligence

1 code implementation 24 Jul 2022 Chaozheng Wang, Yuanhang Yang, Cuiyun Gao, Yun Peng, Hongyu Zhang, Michael R. Lyu

Moreover, the performance of fine-tuning strongly relies on the amount of downstream data, while in practice, scenarios with scarce data are common.

Code Summarization Code Translation

VL-PET: Vision-and-Language Parameter-Efficient Tuning via Granularity Control

1 code implementation ICCV 2023 Zi-Yuan Hu, Yanyang Li, Michael R. Lyu, LiWei Wang

In particular, our VL-PET-large with lightweight PET module designs significantly outperforms VL-Adapter by 2.92% (3.41%) and LoRA by 3.37% (7.03%) with BART-base (T5-base) on image-text tasks.

Image Captioning Text Generation +4

VD-BERT: A Unified Vision and Dialog Transformer with BERT

1 code implementation EMNLP 2020 Yue Wang, Shafiq Joty, Michael R. Lyu, Irwin King, Caiming Xiong, Steven C. H. Hoi

By contrast, in this work, we propose VD-BERT, a simple yet effective framework of unified vision-dialog Transformer that leverages the pretrained BERT language models for Visual Dialog tasks.

Answer Generation Visual Dialog

EMT: Explicit Memory Tracker with Coarse-to-Fine Reasoning for Conversational Machine Reading

1 code implementation 26 May 2020 Yifan Gao, Chien-Sheng Wu, Shafiq Joty, Caiming Xiong, Richard Socher, Irwin King, Michael R. Lyu, Steven C. H. Hoi

The goal of conversational machine reading is to answer user questions given a knowledge base text which may require asking clarification questions.

Decision Making Reading Comprehension +1

Discern: Discourse-Aware Entailment Reasoning Network for Conversational Machine Reading

1 code implementation EMNLP 2020 Yifan Gao, Chien-Sheng Wu, Jingjing Li, Shafiq Joty, Steven C. H. Hoi, Caiming Xiong, Irwin King, Michael R. Lyu

Based on the learned EDU and entailment representations, we either reply to the user with our final decision ("yes", "no", or "irrelevant") to the initial question, or generate a follow-up question to request more information.

Decision Making Discourse Segmentation +3

Self-Training Sampling with Monolingual Data Uncertainty for Neural Machine Translation

1 code implementation ACL 2021 Wenxiang Jiao, Xing Wang, Zhaopeng Tu, Shuming Shi, Michael R. Lyu, Irwin King

In this work, we propose to improve the sampling procedure by selecting the most informative monolingual sentences to complement the parallel data.

Machine Translation NMT +1

Who is ChatGPT? Benchmarking LLMs' Psychological Portrayal Using PsychoBench

1 code implementation 2 Oct 2023 Jen-tse Huang, Wenxuan Wang, Eric John Li, Man Ho Lam, Shujie Ren, Youliang Yuan, Wenxiang Jiao, Zhaopeng Tu, Michael R. Lyu

Large Language Models (LLMs) have recently showcased their remarkable capacities, not only in natural language processing tasks but also across diverse domains such as clinical medicine, legal consultation, and education.

Benchmarking

Microblog Hashtag Generation via Encoding Conversation Contexts

1 code implementation NAACL 2019 Yue Wang, Jing Li, Irwin King, Michael R. Lyu, Shuming Shi

Automatic hashtag annotation plays an important role in content understanding for microblog posts.

Topic Models

Real-Time Emotion Recognition via Attention Gated Hierarchical Memory Network

1 code implementation 20 Nov 2019 Wenxiang Jiao, Michael R. Lyu, Irwin King

We propose an Attention Gated Hierarchical Memory Network (AGHMN) to address the problems of prior work: (1) Commonly used convolutional neural networks (CNNs) for utterance feature extraction are less compatible with the memory modules; (2) Unidirectional gated recurrent units (GRUs) only allow each historical utterance to have context before it, preventing information propagation in the opposite direction; (3) The soft attention used for summarizing loses the positional and ordering information of memories, regardless of how the memory bank is built.

Emotion Recognition in Conversation
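
To make limitation (2) above concrete, here is a minimal sketch (illustrative layer sizes and names, not the released AGHMN code) of a bidirectional GRU run over a bank of utterance features, so each memory position sees context from both directions, with a simple attention layer producing the summary.

```python
import torch
import torch.nn as nn

class BiGRUMemory(nn.Module):
    """Bidirectional memory over utterance features with attentive summarization
    (a generic illustration, assumed dimensions)."""
    def __init__(self, feat_dim=100, hidden_dim=100):
        super().__init__()
        self.bigru = nn.GRU(feat_dim, hidden_dim, batch_first=True,
                            bidirectional=True)
        self.attn = nn.Linear(2 * hidden_dim, 1)

    def forward(self, utterance_feats):
        # utterance_feats: (batch, num_utterances, feat_dim)
        memories, _ = self.bigru(utterance_feats)        # context in both directions
        scores = self.attn(memories).squeeze(-1)          # (batch, num_utterances)
        weights = torch.softmax(scores, dim=-1).unsqueeze(-1)
        summary = (weights * memories).sum(dim=1)          # attention-weighted summary
        return summary, weights
```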

Data Rejuvenation: Exploiting Inactive Training Examples for Neural Machine Translation

1 code implementation EMNLP 2020 Wenxiang Jiao, Xing Wang, Shilin He, Irwin King, Michael R. Lyu, Zhaopeng Tu

First, we train an identification model on the original training data, and use it to distinguish inactive examples and active examples by their sentence-level output probabilities.

Machine Translation NMT +2
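
As a rough illustration of the identification step described for Data Rejuvenation above, the sketch below (the `model.score` call is a hypothetical API and the 10% cutoff is arbitrary; this is not the authors' implementation) ranks training pairs by a length-normalized sentence-level probability and labels the lowest-scoring fraction as inactive.

```python
def sentence_log_prob(model, src_tokens, tgt_tokens):
    # Length-normalized log-probability of the target given the source
    # (model.score is an assumed placeholder API).
    return model.score(src_tokens, tgt_tokens) / max(len(tgt_tokens), 1)

def split_by_activity(model, corpus, inactive_ratio=0.1):
    # corpus: list of {"src": [...], "tgt": [...]} training examples.
    scored = sorted(corpus,
                    key=lambda ex: sentence_log_prob(model, ex["src"], ex["tgt"]))
    cutoff = int(len(scored) * inactive_ratio)
    inactive, active = scored[:cutoff], scored[cutoff:]
    return active, inactive
```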

Cross-Media Keyphrase Prediction: A Unified Framework with Multi-Modality Multi-Head Attention and Image Wordings

1 code implementation EMNLP 2020 Yue Wang, Jing Li, Michael R. Lyu, Irwin King

Further analyses show that our multi-head attention is able to attend information from various aspects and boost classification or generation in diverse scenarios.

How Far Are We on the Decision-Making of LLMs? Evaluating LLMs' Gaming Ability in Multi-Agent Environments

1 code implementation 18 Mar 2024 Jen-tse Huang, Eric John Li, Man Ho Lam, Tian Liang, Wenxuan Wang, Youliang Yuan, Wenxiang Jiao, Xing Wang, Zhaopeng Tu, Michael R. Lyu

Additionally, we conduct evaluations across various LLMs and find that GPT-4 outperforms other models on GAMA-Bench, achieving a score of 72.5.

Decision Making

Why an Android App is Classified as Malware? Towards Malware Classification Interpretation

1 code implementation 24 Apr 2020 Bozhi Wu, Sen Chen, Cuiyun Gao, Lingling Fan, Yang Liu, Weiping Wen, Michael R. Lyu

In this paper, to fill this gap, we propose a novel and interpretable ML-based approach (named XMal) to classify malware with high accuracy while also explaining the classification results.

Android Malware Detection Classification +2

Revisiting the Reliability of Psychological Scales on Large Language Models

1 code implementation 31 May 2023 Jen-tse Huang, Wenxuan Wang, Man Ho Lam, Eric John Li, Wenxiang Jiao, Michael R. Lyu

Recent research has extended beyond assessing the performance of Large Language Models (LLMs) to examining their characteristics from a psychological standpoint, acknowledging the necessity of understanding their behavioral characteristics.

All Languages Matter: On the Multilingual Safety of Large Language Models

1 code implementation 2 Oct 2023 Wenxuan Wang, Zhaopeng Tu, Chang Chen, Youliang Yuan, Jen-tse Huang, Wenxiang Jiao, Michael R. Lyu

In this work, we build the first multilingual safety benchmark for LLMs, XSafety, in response to the global deployment of LLMs in practice.

Open-Retrieval Conversational Machine Reading

1 code implementation 17 Feb 2021 Yifan Gao, Jingjing Li, Chien-Sheng Wu, Michael R. Lyu, Irwin King

On our created OR-ShARC dataset, MUDERN achieves state-of-the-art performance, outperforming existing single-passage conversational machine reading models as well as a new multi-passage conversational machine reading baseline by a large margin.

Discourse Segmentation Reading Comprehension +1

Code Completion with Neural Attention and Pointer Networks

1 code implementation 27 Nov 2017 Jian Li, Yue Wang, Michael R. Lyu, Irwin King

Intelligent code completion has become an essential research task to accelerate modern software development.

Code Completion

PT-CoDE: Pre-trained Context-Dependent Encoder for Utterance-level Emotion Recognition

1 code implementation 20 Oct 2019 Wenxiang Jiao, Michael R. Lyu, Irwin King

Witnessing the success of transfer learning in natural language processing (NLP), we propose to pre-train a context-dependent encoder (CoDE) for ULER by learning from unlabeled conversation data.

Emotion Recognition Sentence +3

Automating App Review Response Generation

1 code implementation 10 Feb 2020 Cuiyun Gao, Jichuan Zeng, Xin Xia, David Lo, Michael R. Lyu, Irwin King

Previous studies showed that replying to a user review usually has a positive effect on the rating that is given by the user to the app.

Response Generation

Improving the Transferability of Adversarial Samples by Path-Augmented Method

1 code implementation CVPR 2023 Jianping Zhang, Jen-tse Huang, Wenxuan Wang, Yichen Li, Weibin Wu, Xiaosen Wang, Yuxin Su, Michael R. Lyu

However, such methods selected the image augmentation path heuristically and may augment images that are semantics-inconsistent with the target images, which harms the transferability of the generated adversarial samples.

Image Augmentation

On the Robustness of Latent Diffusion Models

1 code implementation 14 Jun 2023 Jianping Zhang, Zhuoer Xu, Shiwen Cui, Changhua Meng, Weibin Wu, Michael R. Lyu

Therefore, in this paper, we aim to analyze the robustness of latent diffusion models more thoroughly.

Denoising Image Generation

AEON: A Method for Automatic Evaluation of NLP Test Cases

1 code implementation 13 May 2022 Jen-tse Huang, Jianping Zhang, Wenxuan Wang, Pinjia He, Yuxin Su, Michael R. Lyu

However, in practice, many of the generated test cases fail to preserve similar semantic meaning and are unnatural (e.g., grammar errors), which leads to a high false alarm rate and unnatural test cases.

Semantic Similarity Semantic Textual Similarity +1

Graph-based Incident Aggregation for Large-Scale Online Service Systems

1 code implementation 27 Aug 2021 Zhuangbin Chen, Jinyang Liu, Yuxin Su, Hongyu Zhang, Xuemin Wen, Xiao Ling, Yongqiang Yang, Michael R. Lyu

The proposed framework is evaluated with real-world incident data collected from a large-scale online service system of Huawei Cloud.

Graph Representation Learning Management

Text Revision by On-the-Fly Representation Optimization

1 code implementation In2Writing (ACL) 2022 Jingjing Li, Zichao Li, Tao Ge, Irwin King, Michael R. Lyu

In this approach, we simply fine-tune a pre-trained Transformer with masked language modeling and attribute classification.

Attribute Language Modelling +3
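
The excerpt above describes joint fine-tuning with masked language modeling and attribute classification. A minimal sketch of such a setup is shown below; the model name, the [CLS]-based head, and the unweighted loss sum are assumptions for illustration, not the paper's released code.

```python
import torch.nn as nn
from transformers import AutoModelForMaskedLM

class MLMWithAttributeHead(nn.Module):
    """Masked language model with an auxiliary attribute-classification head
    (a sketch with assumed model name and loss weighting)."""
    def __init__(self, model_name="bert-base-uncased", num_attributes=2):
        super().__init__()
        self.mlm = AutoModelForMaskedLM.from_pretrained(model_name)
        hidden = self.mlm.config.hidden_size
        self.attr_head = nn.Linear(hidden, num_attributes)

    def forward(self, input_ids, attention_mask, mlm_labels, attr_labels):
        out = self.mlm(input_ids=input_ids, attention_mask=attention_mask,
                       labels=mlm_labels, output_hidden_states=True)
        cls_state = out.hidden_states[-1][:, 0]          # [CLS] representation
        attr_logits = self.attr_head(cls_state)
        attr_loss = nn.functional.cross_entropy(attr_logits, attr_labels)
        return out.loss + attr_loss                      # joint training objective
```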

High-Resolution Deep Convolutional Generative Adversarial Networks

1 code implementation 17 Nov 2017 Joachim D. Curtó, Irene C. Zarza, Fernando de la Torre, Irwin King, Michael R. Lyu

Convergence of Generative Adversarial Networks (GANs) in a high-resolution setting, under a computational constraint of GPU memory capacity (from 12 GB to 24 GB), has been beset with difficulty due to the known lack of convergence-rate stability.

Ranked #1 on Image Generation on CelebA 128x128 (MS-SSIM metric)

Image Generation MS-SSIM +2

Doctor of Crosswise: Reducing Over-parametrization in Neural Networks

1 code implementation 24 May 2019 J. D. Curtó, I. C. Zarza, Kris Kitani, Irwin King, Michael R. Lyu

Dr. of Crosswise proposes a new architecture to reduce over-parametrization in Neural Networks.

Beyond Embeddings: The Promise of Visual Table in Multi-Modal Models

1 code implementation 27 Mar 2024 Yiwu Zhong, Zi-Yuan Hu, Michael R. Lyu, LiWei Wang

When visual tables serve as standalone visual representations, our model can closely match or even beat the SOTA MLLMs that are built on CLIP visual embeddings.

Semantically Consistent Image Completion with Fine-grained Details

no code implementations 26 Nov 2017 Pengpeng Liu, Xiaojuan Qi, Pinjia He, Yikang Li, Michael R. Lyu, Irwin King

Image completion has achieved significant progress due to advances in generative adversarial networks (GANs).

Image Inpainting

Difficulty Controllable Generation of Reading Comprehension Questions

no code implementations 10 Jul 2018 Yifan Gao, Lidong Bing, Wang Chen, Michael R. Lyu, Irwin King

We investigate the difficulty levels of questions in reading comprehension datasets such as SQuAD, and propose a new question generation setting, named Difficulty-controllable Question Generation (DQG).

Question Generation Question-Generation +2

Title-Guided Encoding for Keyphrase Generation

no code implementations 26 Aug 2018 Wang Chen, Yifan Gao, Jiani Zhang, Irwin King, Michael R. Lyu

Keyphrase generation (KG) aims to generate a set of keyphrases given a document, which is a fundamental task in natural language processing (NLP).

Keyphrase Generation

Multi-Head Attention with Disagreement Regularization

no code implementations EMNLP 2018 Jian Li, Zhaopeng Tu, Baosong Yang, Michael R. Lyu, Tong Zhang

Multi-head attention is appealing for the ability to jointly attend to information from different representation subspaces at different positions.

Translation

Almost Optimal Algorithms for Linear Stochastic Bandits with Heavy-Tailed Payoffs

no code implementations NeurIPS 2018 Han Shao, Xiaotian Yu, Irwin King, Michael R. Lyu

In this paper, under a weaker assumption on noises, we study the problem of linear stochastic bandits with heavy-tailed payoffs (LinBET), where the distributions have finite moments of order $1+\epsilon$, for some $\epsilon \in (0, 1]$.
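
For readers unfamiliar with the heavy-tailed setting, the moment condition quoted above is commonly formalized as a bound of the following form; the symbols $y_t$ and $b$ are illustrative placeholders rather than the paper's notation.

```latex
% Heavy-tailed payoff assumption (illustrative formalization): the payoff y_t
% need not be sub-Gaussian; only a finite (1+\epsilon)-th moment is required.
\mathbb{E}\!\left[\, |y_t|^{1+\epsilon} \,\right] \le b,
\qquad \text{for some } \epsilon \in (0, 1] \text{ and a constant } b < \infty .
```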

Toward Efficient and Accurate Covariance Matrix Estimation on Compressed Data

no code implementations ICML 2017 Xixian Chen, Michael R. Lyu, Irwin King

Estimating covariance matrices is a fundamental technique in various domains, most notably in machine learning and signal processing.

Data Compression

Information Aggregation for Multi-Head Attention with Routing-by-Agreement

no code implementations NAACL 2019 Jian Li, Baosong Yang, Zi-Yi Dou, Xing Wang, Michael R. Lyu, Zhaopeng Tu

Multi-head attention is appealing for its ability to jointly extract different types of information from multiple representation subspaces.

Machine Translation Translation

An Online Topic Modeling Framework with Topics Automatically Labeled

no code implementations WS 2019 Fenglei Jin, Cuiyun Gao, Michael R. Lyu

In this paper, we propose a novel online topic tracking framework, named IEDL, for tracking the topic changes related to deep learning techniques on Stack Exchange and automatically interpreting each identified topic.

Towards Understanding Neural Machine Translation with Word Importance

no code implementations IJCNLP 2019 Shilin He, Zhaopeng Tu, Xing Wang, Long-Yue Wang, Michael R. Lyu, Shuming Shi

Although neural machine translation (NMT) has advanced the state-of-the-art on various language pairs, the interpretability of NMT remains unsatisfactory.

Machine Translation NMT +1

Improving Question Generation With to the Point Context

no code implementations IJCNLP 2019 Jingjing Li, Yifan Gao, Lidong Bing, Irwin King, Michael R. Lyu

Question generation (QG) is the task of generating a question from a reference sentence and a specified answer within the sentence.

Question Generation Question-Generation +1

Improving Word Representations: A Sub-sampled Unigram Distribution for Negative Sampling

no code implementations 21 Oct 2019 Wenxiang Jiao, Irwin King, Michael R. Lyu

Word2Vec is the most popular model for word representation and has been widely investigated in the literature.

Sentence Sentence Completion

Neuron Interaction Based Representation Composition for Neural Machine Translation

no code implementations 22 Nov 2019 Jian Li, Xing Wang, Baosong Yang, Shuming Shi, Michael R. Lyu, Zhaopeng Tu

Starting from this intuition, we propose a novel approach to compose representations learned by different components in neural machine translation (e.g., multi-layer networks or multi-head attention), based on modeling strong interactions among neurons in the representation vectors.

Machine Translation Translation

What Changed Your Mind: The Roles of Dynamic Topics and Discourse in Argumentation Process

no code implementations 10 Feb 2020 Jichuan Zeng, Jing Li, Yulan He, Cuiyun Gao, Michael R. Lyu, Irwin King

In a world full of uncertainty, debates and argumentation contribute to the progress of science and society.

Persuasiveness

Assessing the Bilingual Knowledge Learned by Neural Machine Translation Models

no code implementations 28 Apr 2020 Shilin He, Xing Wang, Shuming Shi, Michael R. Lyu, Zhaopeng Tu

In this paper, we bridge the gap by assessing the bilingual knowledge learned by NMT models with a phrase table, an interpretable table of bilingual lexicons.

Machine Translation NMT +1

DeepObfuscation: Securing the Structure of Convolutional Neural Networks via Knowledge Distillation

no code implementations 27 Jun 2018 Hui Xu, Yuxin Su, Zirui Zhao, Yangfan Zhou, Michael R. Lyu, Irwin King

Our obfuscation approach is very effective to protect the critical structure of a deep learning model from being exposed to attackers.

Cryptography and Security

On Secure and Usable Program Obfuscation: A Survey

no code implementations 3 Oct 2017 Hui Xu, Yangfan Zhou, Yu Kang, Michael R. Lyu

On the other hand, the performance requirement for model-oriented obfuscation approaches is too weak to develop practical program obfuscation solutions.

Cryptography and Security Software Engineering

N-Version Obfuscation: Impeding Software Tampering Replication with Program Diversity

no code implementations 8 Jun 2015 Hui Xu, Yangfan Zhou, Michael R. Lyu

Our idea is to impede the replication of tampering via program diversification, thus increasing the complexity of breaking the whole software system.

Cryptography and Security Programming Languages

Emerging App Issue Identification via Online Joint Sentiment-Topic Tracing

no code implementations 23 Aug 2020 Cuiyun Gao, Jichuan Zeng, Zhiyuan Wen, David Lo, Xin Xia, Irwin King, Michael R. Lyu

Experiments on popular apps from Google Play and Apple's App Store demonstrate the effectiveness of MERIT in identifying emerging app issues, improving the state-of-the-art method by 22.3% in terms of F1-score.

Clustering

A Directed Acyclic Graph Approach to Online Log Parsing

no code implementations 12 Jun 2018 Pinjia He, Jieming Zhu, Pengcheng Xu, Zibin Zheng, Michael R. Lyu

A typical log-based system reliability management procedure is to first parse the unstructured log messages, and then apply data mining techniques on the parsed logs to obtain critical system behavior information.

Software Engineering
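
To illustrate what the parsing step produces before any data mining is applied, here is a deliberately naive sketch (a generic regex substitution, not the paper's DAG-based algorithm) that splits a raw log message into a constant template and its variable parameters.

```python
import re

# Replace obvious variable fields (IPs, hex values, numbers) with <*> placeholders.
_VAR_PATTERN = r"\d+\.\d+\.\d+\.\d+|0x[0-9a-fA-F]+|\d+"

def naive_parse(log_line):
    params = re.findall(_VAR_PATTERN, log_line)
    template = re.sub(_VAR_PATTERN, "<*>", log_line)
    return template, params

template, params = naive_parse("Receiving block blk_3587 src: /10.251.42.84:57069")
# template -> "Receiving block blk_<*> src: /<*>:<*>"
# params   -> ["3587", "10.251.42.84", "57069"]
```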

Effective Data-aware Covariance Estimator from Compressed Data

no code implementations 10 Oct 2020 Xixian Chen, Haiqin Yang, Shenglin Zhao, Michael R. Lyu, Irwin King

Estimating covariance matrix from massive high-dimensional and distributed data is significant for various real-world applications.

Making Online Sketching Hashing Even Faster

no code implementations 10 Oct 2020 Xixian Chen, Haiqin Yang, Shenglin Zhao, Michael R. Lyu, Irwin King

Data-dependent hashing methods have demonstrated good performance in various machine learning applications to learn a low-dimensional representation from the original data.

A Survey of Point-of-interest Recommendation in Location-based Social Networks

no code implementations 3 Jul 2016 Shenglin Zhao, Irwin King, Michael R. Lyu

Then, we present a comprehensive review in three aspects: influential factors for POI recommendation, methodologies employed for POI recommendation, and different tasks in POI recommendation.

Movie Recommendation Recommendation Systems

Code Structure Guided Transformer for Source Code Summarization

no code implementations 19 Apr 2021 Shuzheng Gao, Cuiyun Gao, Yulan He, Jichuan Zeng, Lun Yiu Nie, Xin Xia, Michael R. Lyu

Code summaries help developers comprehend programs and reduce their time to infer the program functionalities during software maintenance.

Code Summarization Inductive Bias +1

Learning by Distillation: A Self-Supervised Learning Framework for Optical Flow Estimation

no code implementations 8 Jun 2021 Pengpeng Liu, Michael R. Lyu, Irwin King, Jia Xu

Then, a self-supervised learning framework is constructed: confident predictions from teacher models serve as annotations to guide the student model to learn optical flow for those less confident predictions.

Knowledge Distillation Optical Flow Estimation +1
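
A minimal sketch of the confidence-gated distillation idea described above is shown below; the confidence map, threshold, and loss normalization are assumed inputs for illustration, not the authors' implementation.

```python
import torch

def confidence_masked_distillation_loss(student_flow, teacher_flow,
                                        teacher_confidence, threshold=0.8):
    """
    student_flow, teacher_flow: (B, 2, H, W) optical flow fields.
    teacher_confidence: (B, 1, H, W) in [0, 1], e.g. from forward-backward checks
    (an assumed input in this sketch).
    """
    mask = (teacher_confidence > threshold).float()        # keep confident pixels only
    diff = torch.abs(student_flow - teacher_flow.detach())  # teacher acts as pseudo-label
    loss = (diff * mask).sum() / (mask.sum() * 2 + 1e-8)    # average over masked values
    return loss
```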

Towards Efficient Post-training Quantization of Pre-trained Language Models

no code implementations 30 Sep 2021 Haoli Bai, Lu Hou, Lifeng Shang, Xin Jiang, Irwin King, Michael R. Lyu

Experiments on GLUE and SQuAD benchmarks show that our proposed PTQ solution not only performs close to QAT, but also enjoys significant reductions in training time, memory overhead, and data consumption.

Quantization

FlowEval: A Consensus-Based Dialogue Evaluation Framework Using Segment Act Flows

no code implementations 14 Feb 2022 Jianqiao Zhao, Yanyang Li, Wanyu Du, Yangfeng Ji, Dong Yu, Michael R. Lyu, LiWei Wang

Hence, we propose segment act, an extension of dialog act from utterance level to segment level, and crowdsource a large-scale dataset for it.

Dialogue Evaluation

Understanding and Mitigating the Uncertainty in Zero-Shot Translation

no code implementations 20 May 2022 Wenxuan Wang, Wenxiang Jiao, Shuo Wang, Zhaopeng Tu, Michael R. Lyu

Zero-shot translation is a promising direction for building a comprehensive multilingual neural machine translation (MNMT) system.

Machine Translation Translation

An Image is Worth a Thousand Toxic Words: A Metamorphic Testing Framework for Content Moderation Software

no code implementations 18 Aug 2023 Wenxuan Wang, Jingyuan Huang, Jen-tse Huang, Chang Chen, Jiazhen Gu, Pinjia He, Michael R. Lyu

Moreover, through retraining the models with the test cases generated by OASIS, the robustness of the moderation model can be improved without performance degradation.

Practical Anomaly Detection over Multivariate Monitoring Metrics for Online Services

no code implementations 19 Aug 2023 Jinyang Liu, Tianyi Yang, Zhuangbin Chen, Yuxin Su, Cong Feng, Zengyin Yang, Michael R. Lyu

As modern software systems continue to grow in terms of complexity and volume, anomaly detection on multivariate monitoring metrics, which profile systems' health status, becomes more and more critical and challenging.

Anomaly Detection

Not All Countries Celebrate Thanksgiving: On the Cultural Dominance in Large Language Models

no code implementations 19 Oct 2023 Wenxuan Wang, Wenxiang Jiao, Jingyuan Huang, Ruyi Dai, Jen-tse Huang, Zhaopeng Tu, Michael R. Lyu

This paper identifies a cultural dominance issue within large language models (LLMs) due to the predominant use of English data in model training (e.g., ChatGPT).

New Job, New Gender? Measuring the Social Bias in Image Generation Models

no code implementations 1 Jan 2024 Wenxuan Wang, Haonan Bai, Jen-tse Huang, Yuxuan Wan, Youliang Yuan, Haoyi Qiu, Nanyun Peng, Michael R. Lyu

BiasPainter uses a diverse range of seed images of individuals and prompts the image generation models to edit these images using gender, race, and age-neutral queries.

Fairness Image Generation

The Earth is Flat? Unveiling Factual Errors in Large Language Models

no code implementations 1 Jan 2024 Wenxuan Wang, Juluan Shi, Zhaopeng Tu, Youliang Yuan, Jen-tse Huang, Wenxiang Jiao, Michael R. Lyu

Current methods for evaluating LLMs' veracity are limited by test data leakage or the need for extensive human labor, hindering efficient and accurate error detection.

In-Context Learning Multiple-choice

A & B == B & A: Triggering Logical Reasoning Failures in Large Language Models

no code implementations 1 Jan 2024 Yuxuan Wan, Wenxuan Wang, Yiliu Yang, Youliang Yuan, Jen-tse Huang, Pinjia He, Wenxiang Jiao, Michael R. Lyu

In addition, the test cases of LogicAsker can be further used to design demonstration examples for in-context learning, which effectively improves the logical reasoning ability of LLMs, e.g., 10% for GPT-4.

Code Generation In-Context Learning +2

Enhancing LLM-Based Coding Tools through Native Integration of IDE-Derived Static Context

no code implementations 6 Feb 2024 Yichen Li, Yun Peng, Yintong Huo, Michael R. Lyu

We conducted preliminary experiments to validate the performance of IDECoder and observed that this synergy represents a promising trend for future exploration.

Code Completion

Asclepius: A Spectrum Evaluation Benchmark for Medical Multi-Modal Large Language Models

no code implementations 17 Feb 2024 Wenxuan Wang, Yihang Su, Jingyuan Huan, Jie Liu, WenTing Chen, Yudi Zhang, Cheng-Yi Li, Kao-Jung Chang, Xiaohan Xin, Linlin Shen, Michael R. Lyu

However, these models are often evaluated on benchmarks that are unsuitable for the Med-MLLMs due to the intricate nature of the real-world diagnostic frameworks, which encompass diverse medical specialties and involve complex clinical decisions.

FaultProfIT: Hierarchical Fault Profiling of Incident Tickets in Large-scale Cloud Systems

no code implementations 27 Feb 2024 JunJie Huang, Jinyang Liu, Zhuangbin Chen, Zhihan Jiang, Yichen Li, Jiazhen Gu, Cong Feng, Zengyin Yang, Yongqiang Yang, Michael R. Lyu

To date, FaultProfIT has analyzed 10,000+ incidents from 30+ cloud services, successfully revealing several fault trends that have informed system improvements.

Contrastive Learning
