Search Results for author: Hai Hu

Found 18 papers, 11 papers with code

Do Large Language Models Understand Conversational Implicature -- A case study with a Chinese sitcom

1 code implementation 30 Apr 2024 Shisen Yue, Siyuan Song, Xinyuan Cheng, Hai Hu

While all models generate largely fluent and self-consistent text, their explanations score low on reasonability except for GPT-4, suggesting that most LLMs cannot produce satisfactory explanations of the implicatures in the conversation.

Implicatures Multiple-choice

SH2: Self-Highlighted Hesitation Helps You Decode More Truthfully

1 code implementation 11 Jan 2024 Jushi Kai, Hai Hu, Zhouhan Lin

Therefore, we propose to "highlight" the factual information by selecting the tokens with the lowest probabilities and concatenating them to the original context, thus forcing the model to repeatedly read and hesitate on these tokens before generation.

Hallucination Text Generation

MELA: Multilingual Evaluation of Linguistic Acceptability

1 code implementation 15 Nov 2023 Ziyin Zhang, Yikang Liu, Weifang Huang, Junyu Mao, Rui Wang, Hai Hu

In this work, we present the largest benchmark to date on linguistic acceptability: Multilingual Evaluation of Linguistic Acceptability -- MELA, with 46K samples covering 10 languages from a diverse set of language families.

Code Generation Cross-Lingual Transfer +3

Revisiting Acceptability Judgements

1 code implementation 23 May 2023 Hai Hu, Ziyin Zhang, Weifang Huang, Jackie Yan-Ki Lai, Aini Li, Yina Patterson, Jiahui Huang, Peng Zhang, Chien-Jer Charles Lin, Rui Wang

We introduce CoLAC - Corpus of Linguistic Acceptability in Chinese, the first large-scale acceptability dataset for a non-Indo-European language.

Cross-Lingual Transfer Linguistic Acceptability

ArguGPT: evaluating, understanding and identifying argumentative essays generated by GPT models

2 code implementations 16 Apr 2023 Yikang Liu, Ziyin Zhang, Wanyang Zhang, Shisen Yue, Xiaojing Zhao, Xinyuan Cheng, Yiwen Zhang, Hai Hu

To address these challenges in English language teaching, we first present ArguGPT, a balanced corpus of 4,038 argumentative essays generated by 7 GPT models in response to essay prompts from three sources: (1) in-class or homework exercises, (2) TOEFL and (3) GRE writing tasks.

Sentence

Investigating Transfer Learning in Multilingual Pre-trained Language Models through Chinese Natural Language Inference

1 code implementation Findings (ACL) 2021 Hai Hu, He Zhou, Zuoyu Tian, Yiwen Zhang, Yina Ma, Yanting Li, Yixin Nie, Kyle Richardson

These results, however, come with important caveats: cross-lingual models often perform best when trained on a mixture of English and high-quality monolingual NLI data (OCNLI), and are often hindered by automatically translated resources (XNLI-zh).

Cross-Lingual Transfer Natural Language Inference +2

OCNLI: Original Chinese Natural Language Inference

1 code implementation Findings of the Association for Computational Linguistics 2020 Hai Hu, Kyle Richardson, Liang Xu, Lu Li, Sandra Kuebler, Lawrence S. Moss

In this paper, we present the first large-scale NLI dataset (consisting of ~56,000 annotated sentence pairs) for Chinese called the Original Chinese Natural Language Inference dataset (OCNLI).

Natural Language Inference Sentence +1

MonaLog: a Lightweight System for Natural Language Inference Based on Monotonicity

1 code implementation SCiL 2020 Hai Hu, Qi Chen, Kyle Richardson, Atreyee Mukherjee, Lawrence S. Moss, Sandra Kuebler

We present a new logic-based inference engine for natural language inference (NLI) called MonaLog, which is based on natural logic and the monotonicity calculus.

Data Augmentation Natural Language Inference

Probing Natural Language Inference Models through Semantic Fragments

3 code implementations 16 Sep 2019 Kyle Richardson, Hai Hu, Lawrence S. Moss, Ashish Sabharwal

Our experiments, using a library of 8 such semantic fragments, reveal two remarkable findings: (a) State-of-the-art models, including BERT, that are pre-trained on existing NLI benchmark datasets perform poorly on these new fragments, even though the phenomena probed here are central to the NLI task.

Natural Language Inference

Ensemble Methods to Distinguish Mainland and Taiwan Chinese

no code implementations WS 2019 Hai Hu, Wen Li, He Zhou, Zuoyu Tian, Yiwen Zhang, Liang Zou

This paper describes the IUCL system at the VarDial 2019 evaluation campaign for the task of discriminating between Mainland and Taiwan variation of Mandarin Chinese.

Word Embeddings

Natural Language Inference with Monotonicity

no code implementations WS 2019 Hai Hu, Qi Chen, Larry Moss

This paper describes a working system which performs natural language inference using polarity-marked parse trees.

Natural Language Inference

Polarity Computations in Flexible Categorial Grammar

1 code implementation SEMEVAL 2018 Hai Hu, Larry Moss

This paper shows how to take parse trees in CCG and algorithmically find the polarities of all the constituents.

Detecting Syntactic Features of Translated Chinese

no code implementations WS 2018 Hai Hu, Wen Li, Sandra Kübler

We present a machine learning approach to distinguish texts translated to Chinese (by humans) from texts originally written in Chinese, with a focus on a wide range of syntactic features.

Translation

Path of Vowel Raising in Chengdu Dialect of Mandarin

no code implementations 11 Mar 2018 Hai Hu, Yiwen Zhang

He and Rao (2013) reported a raising phenomenon of /a/ in /Xan/ (X being a consonant or a vowel) in Chengdu dialect of Mandarin, i.e. /a/ is realized as [ɛ] for young speakers but [æ] for older speakers, but they offered no acoustic analysis.

Is China Entering WTO or shijie maoyi zuzhi--a Corpus Study of English Acronyms in Chinese Newspapers

no code implementations 18 Nov 2017 Hai Hu

This is one of the first studies that quantitatively examine the usage of English acronyms (e.g. WTO) in Chinese texts.

Translation

Non-Deterministic Segmentation for Chinese Lattice Parsing

no code implementations RANLP 2017 Hai Hu, Daniel Dakota, Sandra Kübler

Parsing Chinese critically depends on correct word segmentation for the parser since incorrect segmentation inevitably causes incorrect parses.

Morphological Analysis Segmentation +1
