1 code implementation • 24 Aug 2024 • Hangfeng He, Weijie J. Su
Large language models (LLMs) have been widely employed across various application domains, yet their black-box nature poses significant challenges to understanding how these models process input data internally to make predictions.
no code implementations • 28 May 2024 • Xinyi Liu, Pinxin Liu, Hangfeng He
In this study, we investigate the capabilities and inherent biases of advanced large language models (LLMs) such as GPT-3.5 and GPT-4 in the context of debate evaluation.
1 code implementation • 1 Apr 2024 • Sindhu Kishore, Hangfeng He
Unraveling the intricate details of events in natural language necessitates a nuanced understanding of temporal dynamics.
1 code implementation • 29 Sep 2023 • Hangfeng He, Hongming Zhang, Dan Roth
Existing reference-free reasoning evaluation metrics eliminate the need for human-crafted reasoning chains as references, but they often require fine-tuning on human-derived chains before evaluation, which complicates the process and raises questions about their adaptability to other datasets.
no code implementations • 8 Jul 2023 • Kaifu Wang, Hangfeng He, Tin D. Nguyen, Piyush Kumar, Dan Roth
Prior knowledge and symbolic rules in machine learning are often expressed in the form of label constraints, especially in structured prediction problems.
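As a quick illustration of the kind of label constraint this refers to (BIO sequence tagging is a standard case; the paper treats constraints in general form), an I-X tag may only follow a B-X or I-X tag of the same type, and such a constraint can be checked or enforced at prediction time:

```python
def violates_bio(tags):
    """Return True if a BIO tag sequence breaks the I-after-B/I constraint."""
    prev = "O"
    for tag in tags:
        # An "I-X" tag is only valid right after "B-X" or another "I-X".
        if tag.startswith("I-") and prev not in (f"B-{tag[2:]}", tag):
            return True
        prev = tag
    return False

print(violates_bio(["B-PER", "I-PER", "O"]))  # False: well-formed
print(violates_bio(["O", "I-LOC"]))           # True: I-LOC without B-LOC
```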
1 code implementation • 31 Dec 2022 • Hangfeng He, Hongming Zhang, Dan Roth
To address this issue, we propose a novel post-processing approach, rethinking with retrieval (RR), which retrieves relevant external knowledge based on the decomposed reasoning steps obtained from the chain-of-thought (CoT) prompting.
Ranked #2 on Question Answering on StrategyQA
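The snippet names a concrete pipeline: decompose the chain-of-thought output into reasoning steps, then retrieve external knowledge for each step. Below is a minimal toy sketch of that idea; the knowledge base, the overlap-based retriever, and the support check are invented stand-ins for illustration, not the paper's components.

```python
from typing import List

KNOWLEDGE_BASE = [
    "Paris is the capital of France.",
    "The Eiffel Tower is located in Paris.",
    "France is a country in Western Europe.",
]

def decompose_cot(cot: str) -> List[str]:
    """Split a chain-of-thought answer into steps (toy: sentence split)."""
    return [s.strip() for s in cot.split(".") if s.strip()]

def retrieve(step: str, kb: List[str], k: int = 1) -> List[str]:
    """Toy retriever: rank knowledge-base facts by word overlap with the step."""
    def overlap(a: str, b: str) -> int:
        return len(set(a.lower().split()) & set(b.lower().split()))
    return sorted(kb, key=lambda fact: -overlap(step, fact))[:k]

def supported(step: str, evidence: List[str], threshold: int = 3) -> bool:
    """Toy faithfulness check: sufficient word overlap with retrieved evidence."""
    return any(
        len(set(step.lower().split()) & set(e.lower().split())) >= threshold
        for e in evidence
    )

cot = "The Eiffel Tower is in Paris. Paris is the capital of France."
for step in decompose_cot(cot):
    evidence = retrieve(step, KNOWLEDGE_BASE)
    verdict = "supported" if supported(step, evidence) else "unsupported"
    print(f"{step!r} -> {verdict} (evidence: {evidence[0]!r})")
```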
2 code implementations • 31 Oct 2022 • Hangfeng He, Weijie J. Su
While deep learning has enabled significant advances in many areas of science, its black-box nature hinders architecture design for future artificial intelligence applications and interpretation in high-stakes decision making.
1 code implementation • ICLR 2022 • Shuxiao Chen, Koby Crammer, Hangfeng He, Dan Roth, Weijie J. Su
In this paper, we introduce Target-Aware Weighted Training (TAWT), a weighted training algorithm for cross-task learning based on minimizing a representation-based task distance between the source and target tasks.
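The abstract describes the core loop: weighted training in which the task weights are chosen to shrink a representation-based distance between source and target tasks. The following self-contained sketch illustrates that alternating scheme on synthetic linear tasks; the gradient-alignment surrogate for the task distance, the step sizes, and the data are all invented for illustration and are not the paper's estimator.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 5, 200

target_w = rng.normal(size=d)
source_ws = [target_w + 0.1 * rng.normal(size=d),  # source task similar to target
             rng.normal(size=d)]                   # unrelated source task

def make_task(w):
    X = rng.normal(size=(n, d))
    return X, X @ w + 0.1 * rng.normal(size=n)

sources = [make_task(w) for w in source_ws]
X_t, y_t = make_task(target_w)

theta = np.zeros(d)                             # shared parameters
weights = np.ones(len(sources)) / len(sources)  # task weights on the simplex

for _ in range(300):
    # (1) Model step: gradient descent on the weighted source risk.
    grad = sum(w * (X @ theta - y) @ X / n for w, (X, y) in zip(weights, sources))
    theta -= 0.05 * grad
    # (2) Weight step: exponentiated-gradient update favoring source tasks
    # whose gradient aligns with the target gradient (a crude surrogate for
    # the paper's representation-based task distance).
    g_t = (X_t @ theta - y_t) @ X_t / n
    aligns = np.array([((X @ theta - y) @ X / n) @ g_t for X, y in sources])
    weights *= np.exp(aligns / (np.abs(aligns).max() + 1e-12))
    weights /= weights.sum()

print("learned task weights:", np.round(weights, 3))  # similar task should dominate
```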
1 code implementation • 29 Jan 2021 • Cong Fang, Hangfeng He, Qi Long, Weijie J. Su
More importantly, when moving to the imbalanced case, our analysis of the Layer-Peeled Model reveals a hitherto unknown phenomenon that we term Minority Collapse, which fundamentally limits the performance of deep learning models on the minority classes.
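For context, the Layer-Peeled Model referenced here optimizes the last-layer features and classifier directly under average norm constraints; a sketch of the formulation (notation approximate to the paper's) is:

```latex
\min_{W,\,H}\ \frac{1}{N}\sum_{k=1}^{K}\sum_{i=1}^{n_k}
  \mathcal{L}\bigl(W h_{k,i},\, y_k\bigr)
\quad\text{s.t.}\quad
\frac{1}{K}\sum_{k=1}^{K}\lVert w_k\rVert_2^2 \le E_W,
\qquad
\frac{1}{K}\sum_{k=1}^{K}\frac{1}{n_k}\sum_{i=1}^{n_k}\lVert h_{k,i}\rVert_2^2 \le E_H,
```

where W collects the K classifier rows w_k, h_{k,i} is the last-layer feature of the i-th example in class k, and E_W, E_H bound the average squared norms; Minority Collapse arises when the class sizes n_k are imbalanced.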
no code implementations • 1 Jan 2021 • Hangfeng He, Mingyuan Zhang, Qiang Ning, Dan Roth
Real-world applications often require making use of a range of incidental supervision signals.
1 code implementation • COLING 2020 • Ayal Klein, Jonathan Mamou, Valentina Pyatkin, Daniela Stepanov, Hangfeng He, Dan Roth, Luke Zettlemoyer, Ido Dagan
We propose a new semantic scheme for capturing predicate-argument relations for nominalizations, termed QANom.
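As a quick illustration of the scheme (this example is invented, not drawn from the QANom corpus), a deverbal noun such as "acquisition" is treated like its source verb "acquire", and its arguments are captured as question-answer pairs:

```python
# Hypothetical QANom-style annotation for a nominalization.
qanom_example = {
    "sentence": "The company's acquisition of the startup surprised analysts.",
    "predicate": "acquisition",   # nominalization with the verbal sense "acquire"
    "qa_pairs": [
        ("Who acquired something?", "The company"),
        ("What was acquired?", "the startup"),
    ],
}
```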
no code implementations • 27 Oct 2020 • Zhun Deng, Hangfeng He, Weijie J. Su
Given that, we propose locally elastic stability as a weaker and distribution-dependent stability notion, which still yields exponential generalization bounds.
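Roughly (notation approximate to the paper's), an algorithm A is locally elastically stable if deleting any one training example z_i changes the loss at any point z by an amount that depends on the pair (z_i, z), rather than by a single uniform constant as in uniform stability:

```latex
\bigl|\,\ell\bigl(A(S), z\bigr) - \ell\bigl(A(S^{\setminus i}), z\bigr)\,\bigr|
\;\le\; \beta_m(z_i, z)
\qquad \text{for all } S,\ i \in [m],\ \text{and } z,
```

where m is the sample size and S^{\setminus i} denotes S with z_i removed.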
1 code implementation • NeurIPS 2020 • Shuxiao Chen, Hangfeng He, Weijie J. Su
As a popular approach to modeling the dynamics of training overparametrized neural networks (NNs), the neural tangent kernel (NTK) is known to fall behind real-world NNs in generalization ability.
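For reference, the neural tangent kernel of a network f(x; θ) is the gradient inner product

```latex
\Theta(x, x') \;=\; \bigl\langle \nabla_\theta f(x;\theta),\ \nabla_\theta f(x';\theta) \bigr\rangle,
```

which in the infinite-width limit stays essentially fixed during training, so learning in the NTK regime behaves like kernel regression with Θ.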
no code implementations • ICML 2020 • Zhun Deng, Hangfeng He, Jiaoyang Huang, Weijie J. Su
An acknowledged weakness of neural networks is their vulnerability to adversarial perturbations to the inputs.
no code implementations • LREC 2020 • Soham Dan, Hangfeng He, Dan Roth
Recognizing spatial relations and reasoning about them is essential in multiple applications including navigation, direction giving and human-computer interaction in general.
2 code implementations • EMNLP 2021 • Hangfeng He, Mingyuan Zhang, Qiang Ning, Dan Roth
Real-world applications often require improving models by leveraging a range of cheap incidental supervision signals.
no code implementations • 18 Oct 2019 • Matteo Sordello, Niccolò Dalmasso, Hangfeng He, Weijie Su
This paper proposes SplitSGD, a new dynamic learning rate schedule for stochastic optimization.
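A minimal sketch of the SplitSGD diagnostic, under simplifying assumptions: run two SGD threads from the same iterate, average each thread's stochastic gradients over a window, and halve the learning rate when the two averages mostly disagree in sign (suggesting the iterates are bouncing around a stationary point). The window length, agreement fraction, decay factor, and toy objective below are placeholders, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)

def stoch_grad(x):
    """Noisy gradient of the toy objective f(x) = 0.5 * ||x||^2."""
    return x + 0.5 * rng.normal(size=x.shape)

x = np.full(10, 5.0)
lr, window = 0.5, 20

for epoch in range(10):
    g_avgs = []
    for _ in range(2):                      # two "threads" from the same start
        xi, acc = x.copy(), np.zeros_like(x)
        for _ in range(window):
            g = stoch_grad(xi)
            xi -= lr * g
            acc += g
        g_avgs.append(acc / window)
    x = xi                                  # continue from one thread (toy choice)
    agree = np.mean(np.sign(g_avgs[0]) == np.sign(g_avgs[1]))
    if agree < 0.5:                         # stationarity declared: decay the rate
        lr *= 0.5
    print(f"epoch {epoch}: lr={lr:.3f}, sign agreement={agree:.2f}")
```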
1 code implementation • ICLR 2020 • Hangfeng He, Weijie J. Su
This phenomenon, termed local elasticity, is shown to persist for neural networks with nonlinear activation functions through extensive simulations on real-world and synthetic datasets, whereas it is not observed in linear classifiers.
1 code implementation • ACL 2020 • Hangfeng He, Qiang Ning, Dan Roth
Question-answering (QA) data often encodes essential information in many facets.
no code implementations • NAACL 2019 • Qiang Ning, Hangfeng He, Chuchu Fan, Dan Roth
For many structured learning tasks, the data annotation process is complex and costly.
no code implementations • WS 2017 • Hangfeng He, Federico Fancellu, Bonnie Webber
In particular, the use of a character-based model allows us to capture characteristics of negation cues in Chinese using word-embedding information only.
no code implementations • EACL 2017 • Federico Fancellu, Adam Lopez, Bonnie Webber, Hangfeng He
Several corpora have been annotated with negation scope (the set of words whose meaning is negated by a cue like the word "not"), leading to the development of classifiers that detect negation scope with high accuracy.
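A made-up example of the annotation this describes (exact span conventions vary across corpora):

```python
# Hypothetical negation-scope annotation: the cue word and the words it negates.
annotation = {
    "sentence": "She did not attend the meeting.",
    "cue": "not",
    "scope": ["She", "did", "attend", "the", "meeting"],  # cue itself excluded
}
```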
no code implementations • EACL 2017 • Hangfeng He, Xu Sun
We focus on named entity recognition (NER) for Chinese social media.