Search Results for author: Peifeng Wang

Found 13 papers, 11 papers with code

Evaluating Judges as Evaluators: The JETTS Benchmark of LLM-as-Judges as Test-Time Scaling Evaluators

1 code implementation • 21 Apr 2025 • Yilun Zhou, Austin Xu, Peifeng Wang, Caiming Xiong, Shafiq Joty

Scaling test-time computation, or affording a generator large language model (LLM) extra compute during inference, typically employs the help of external non-generative evaluators (i.e., reward models).

Code Generation, Instruction Following

ReIFE: Re-evaluating Instruction-Following Evaluation

1 code implementation • 9 Oct 2024 • Yixin Liu, Kejian Shi, Alexander R. Fabbri, Yilun Zhao, Peifeng Wang, Chien-Sheng Wu, Shafiq Joty, Arman Cohan

The automatic evaluation of instruction following typically involves using large language models (LLMs) to assess response quality.

Instruction Following

Direct Judgement Preference Optimization

no code implementations • 23 Sep 2024 • Peifeng Wang, Austin Xu, Yilun Zhou, Caiming Xiong, Shafiq Joty

Auto-evaluation is crucial for assessing response quality and offering feedback for model development.

SCOTT: Self-Consistent Chain-of-Thought Distillation

1 code implementation • 3 May 2023 • Peifeng Wang, Zhengyang Wang, Zheng Li, Yifan Gao, Bing Yin, Xiang Ren

While CoT can yield dramatically improved performance, such gains are only observed for sufficiently large LMs.

Counterfactual, Counterfactual Reasoning

PINTO: Faithful Language Reasoning Using Prompt-Generated Rationales

1 code implementation • 3 Nov 2022 • Peifeng Wang, Aaron Chan, Filip Ilievski, Muhao Chen, Xiang Ren

Neural language models (LMs) have achieved impressive results on various language-based reasoning tasks by utilizing latent knowledge encoded in their own pretrained parameters.

Counterfactual, Decision Making

Contextualized Scene Imagination for Generative Commonsense Reasoning

1 code implementation • ICLR 2022 • Peifeng Wang, Jonathan Zamora, Junfeng Liu, Filip Ilievski, Muhao Chen, Xiang Ren

In this paper, we propose an Imagine-and-Verbalize (I&V) method, which learns to imagine a relational scene knowledge graph (SKG) with relations between the input concepts, and leverage the SKG as a constraint when generating a plausible scene description.

Common Sense Reasoning, Descriptive

Do Language Models Perform Generalizable Commonsense Inference?

1 code implementation • Findings (ACL) 2021 • Peifeng Wang, Filip Ilievski, Muhao Chen, Xiang Ren

Inspired by evidence that pretrained language models (LMs) encode commonsense knowledge, recent work has applied LMs to automatically populate commonsense knowledge graphs (CKGs).

Knowledge Graphs

Scalable Multi-Hop Relational Reasoning for Knowledge-Aware Question Answering

2 code implementations • EMNLP 2020 • Yanlin Feng, Xinyue Chen, Bill Yuchen Lin, Peifeng Wang, Jun Yan, Xiang Ren

Existing work on augmenting question answering (QA) models with external knowledge (e.g., knowledge graphs) either struggles to model multi-hop relations efficiently or lacks transparency into the model's prediction rationale.

Knowledge Graphs, Question Answering

Logic Attention Based Neighborhood Aggregation for Inductive Knowledge Graph Embedding

1 code implementation • 4 Nov 2018 • Peifeng Wang, Jialong Han, Chenliang Li, Rong Pan

Recent efforts on this issue suggest training a neighborhood aggregator in conjunction with the conventional entity and relation embeddings, which may help embed new entities inductively via their existing neighbors.

Knowledge Graph Embedding, World Knowledge

Incorporating GAN for Negative Sampling in Knowledge Representation Learning

no code implementations • 23 Sep 2018 • Peifeng Wang, Shuangyin Li, Rong Pan

In this GAN-based framework, we take advantage of a generator to obtain high-quality negative samples.

Link Prediction, Representation Learning
