Search Results for author: Hangfeng He

Found 23 papers, 12 papers with code

A Law of Next-Token Prediction in Large Language Models

1 code implementation 24 Aug 2024 Hangfeng He, Weijie J. Su

Large language models (LLMs) have been widely employed across various application domains, yet their black-box nature poses significant challenges to understanding how these models process input data internally to make predictions.

An Empirical Analysis on Large Language Models in Debate Evaluation

no code implementations 28 May 2024 Xinyi Liu, Pinxin Liu, Hangfeng He

In this study, we investigate the capabilities and inherent biases of advanced large language models (LLMs) such as GPT-3.5 and GPT-4 in the context of debate evaluation.

Unveiling Divergent Inductive Biases of LLMs on Temporal Data

1 code implementation 1 Apr 2024 Sindhu Kishore, Hangfeng He

Unraveling the intricate details of events in natural language necessitates a subtle understanding of temporal dynamics.

Inductive Bias, Natural Language Inference +1

SocREval: Large Language Models with the Socratic Method for Reference-Free Reasoning Evaluation

1 code implementation 29 Sep 2023 Hangfeng He, Hongming Zhang, Dan Roth

Existing reference-free reasoning evaluation metrics, while eliminating the need for human-crafted reasoning chains as references, often require fine-tuning with human-derived chains before evaluation, complicating the process and questioning their adaptability to other datasets.

On Regularization and Inference with Label Constraints

no code implementations 8 Jul 2023 Kaifu Wang, Hangfeng He, Tin D. Nguyen, Piyush Kumar, Dan Roth

Prior knowledge and symbolic rules in machine learning are often expressed in the form of label constraints, especially in structured prediction problems.

Structured Prediction

Rethinking with Retrieval: Faithful Large Language Model Inference

1 code implementation 31 Dec 2022 Hangfeng He, Hongming Zhang, Dan Roth

To address this issue, we propose a novel post-processing approach, rethinking with retrieval (RR), which retrieves relevant external knowledge based on the decomposed reasoning steps obtained from the chain-of-thought (CoT) prompting.

Language Modelling, Large Language Model +2

A Law of Data Separation in Deep Learning

2 code implementations 31 Oct 2022 Hangfeng He, Weijie J. Su

While deep learning has enabled significant advances in many areas of science, its black-box nature hinders architecture design for future artificial intelligence applications and interpretation for high-stakes decision making.

Weighted Training for Cross-Task Learning

1 code implementation ICLR 2022 Shuxiao Chen, Koby Crammer, Hangfeng He, Dan Roth, Weijie J. Su

In this paper, we introduce Target-Aware Weighted Training (TAWT), a weighted training algorithm for cross-task learning based on minimizing a representation-based task distance between the source and target tasks.

Chunking, Named Entity Recognition +6

Exploring Deep Neural Networks via Layer-Peeled Model: Minority Collapse in Imbalanced Training

1 code implementation 29 Jan 2021 Cong Fang, Hangfeng He, Qi Long, Weijie J. Su

More importantly, when moving to the imbalanced case, our analysis of the Layer-Peeled Model reveals a hitherto unknown phenomenon that we term Minority Collapse, which fundamentally limits the performance of deep learning models on the minority classes.

QANom: Question-Answer driven SRL for Nominalizations

1 code implementation COLING 2020 Ayal Klein, Jonathan Mamou, Valentina Pyatkin, Daniela Stepanov, Hangfeng He, Dan Roth, Luke Zettlemoyer, Ido Dagan

We propose a new semantic scheme for capturing predicate-argument relations for nominalizations, termed QANom.

Toward Better Generalization Bounds with Locally Elastic Stability

no code implementations 27 Oct 2020 Zhun Deng, Hangfeng He, Weijie J. Su

Given that, we propose locally elastic stability as a weaker and distribution-dependent stability notion, which still yields exponential generalization bounds.

Generalization Bounds, Learning Theory

Label-Aware Neural Tangent Kernel: Toward Better Generalization and Local Elasticity

1 code implementation NeurIPS 2020 Shuxiao Chen, Hangfeng He, Weijie J. Su

As a popular approach to modeling the dynamics of training overparametrized neural networks (NNs), the neural tangent kernels (NTK) are known to fall behind real-world NNs in generalization ability.

Towards Understanding the Dynamics of the First-Order Adversaries

no code implementations ICML 2020 Zhun Deng, Hangfeng He, Jiaoyang Huang, Weijie J. Su

An acknowledged weakness of neural networks is their vulnerability to adversarial perturbations to the inputs.

Understanding Spatial Relations through Multiple Modalities

no code implementations LREC 2020 Soham Dan, Hangfeng He, Dan Roth

Recognizing spatial relations and reasoning about them is essential in multiple applications including navigation, direction giving and human-computer interaction in general.

Common Sense Reasoning, Implicit Relations

Foreseeing the Benefits of Incidental Supervision

2 code implementations EMNLP 2021 Hangfeng He, Mingyuan Zhang, Qiang Ning, Dan Roth

Real-world applications often require improved models by leveraging a range of cheap incidental supervision signals.

Informativeness, Learning Theory +4

The Local Elasticity of Neural Networks

1 code implementation ICLR 2020 Hangfeng He, Weijie J. Su

This phenomenon is shown to persist for neural networks with nonlinear activation functions through extensive simulations on real-life and synthetic datasets, whereas this is not observed in linear classifiers.

Clustering

Partial Or Complete, That's The Question

no code implementations NAACL 2019 Qiang Ning, Hangfeng He, Chuchu Fan, Dan Roth

For many structured learning tasks, the data annotation process is complex and costly.

Neural Networks for Negation Cue Detection in Chinese

no code implementations WS 2017 Hangfeng He, Federico Fancellu, Bonnie Webber

In particular, the use of a character-based model allows us to capture characteristics of negation cues in Chinese using word-embedding information only.

Feature Engineering, Negation +2

Detecting negation scope is easy, except when it isn't

no code implementations EACL 2017 Federico Fancellu, Adam Lopez, Bonnie Webber, Hangfeng He

Several corpora have been annotated with negation scope (the set of words whose meaning is negated by a cue like the word "not"), leading to the development of classifiers that detect negation scope with high accuracy.

Negation
