Search Results for author: Xuezhi Cao

Found 22 papers, 10 papers with code

Table Fact Verification with Structure-Aware Transformer

no code implementations EMNLP 2020 Hongzhi Zhang, Yingyao Wang, Sirui Wang, Xuezhi Cao, Fuzheng Zhang, Zhongyuan Wang

Verifying facts on semi-structured evidence such as tables requires the ability to encode structural information and perform symbolic reasoning.

Fact Verification

OIBench: Benchmarking Strong Reasoning Models with Olympiad in Informatics

no code implementations 12 Jun 2025 Yaoming Zhu, junxin Wang, Yiyang Li, Lin Qiu, ZongYu Wang, Jun Xu, Xuezhi Cao, Yuhuai Wei, Mingshi Wang, Xunliang Cai, Rong Ma

As models become increasingly sophisticated, conventional algorithm benchmarks are increasingly saturated, underscoring the need for more challenging benchmarks to guide future improvements in algorithmic reasoning.

NTIRE 2025 challenge on Text to Image Generation Model Quality Assessment

no code implementations 22 May 2025 Shuhao Han, Haotian Fan, Fangyuan Kong, Wenjie Liao, Chunle Guo, Chongyi Li, Radu Timofte, Liang Li, Tao Li, Junhui Cui, Yunqiu Wang, Yang Tai, Jingwei Sun, Jianhui Sun, Xinli Yue, Tianyi Wang, Huan Hou, Junda Lu, Xinyang Huang, Zitang Zhou, Zijian Zhang, Xuhui Zheng, Xuecheng Wu, Chong Peng, Xuezhi Cao, Trong-Hieu Nguyen-Mau, Minh-Hoang Le, Minh-Khoa Le-Phan, Duy-Nam Ly, Hai-Dang Nguyen, Minh-Triet Tran, Yukang Lin, Yan Hong, Chuanbiao Song, Siyuan Li, Jun Lan, Zhichao Zhang, Xinyue Li, Wei Sun, ZiCheng Zhang, Yunhao Li, Xiaohong Liu, Guangtao Zhai, Zitong Xu, Huiyu Duan, Jiarui Wang, Guangji Ma, Liu Yang, Lu Liu, Qiang Hu, Xiongkuo Min, Zichuan Wang, Zhenchen Tang, Bo Peng, Jing Dong, Fengbin Guan, Zihao Yu, Yiting Lu, Wei Luo, Xin Li, Minhao Lin, Haofeng Chen, Xuanxuan He, Kele Xu, Qisheng Xu, Zijian Gao, Tianjiao Wan, Bo-Cheng Qiu, Chih-Chung Hsu, Chia-Ming Lee, Yu-Fan Lin, Bo Yu, Zehao Wang, Da Mu, Mingxiu Chen, Junkang Fang, Huamei Sun, Wending Zhao, Zhiyu Wang, Wang Liu, Weikang Yu, Puhong Duan, Bin Sun, Xudong Kang, Shutao Li, Shuai He, Lingzhi Fu, Heng Cong, Rongyu Zhang, Jiarong He, Zhishan Qiao, Yongqing Huang, Zewen Chen, Zhe Pang, Juan Wang, Jian Guo, Zhizhuo Shao, Ziyu Feng, Bing Li, Weiming Hu, Hesong Li, Dehua Liu, Zeming Liu, Qingsong Xie, Ruichen Wang, Zhihao LI, Yuqi Liang, Jianqi Bi, Jun Luo, Junfeng Yang, Can Li, Jing Fu, Hongwei Xu, Mingrui Long, Lulin Tang

A total of 211 participants have registered in the structure track.

Image Restoration Text to Image Generation +1

ViC-Bench: Benchmarking Visual-Interleaved Chain-of-Thought Capability in MLLMs with Free-Style Intermediate State Representations

no code implementations 20 May 2025 Xuecheng Wu, Jiaxing Liu, Danlei Huang, Xiaoyu Li, Yifan Wang, Chen Chen, Liya Ma, Xuezhi Cao, Junxiao Xue

Visual-Interleaved Chain-of-Thought (VI-CoT) enables MLLMs to continually update their understanding and decisions based on step-wise intermediate visual states (IVS), much as a human would. It has shown impressive success across various tasks, which in turn has driven advances in related benchmarks.

Benchmarking

Why Not Act on What You Know? Unleashing Safety Potential of LLMs via Self-Aware Guard Enhancement

1 code implementation 17 May 2025 Peng Ding, Jun Kuang, ZongYu Wang, Xuezhi Cao, Xunliang Cai, Jiajun Chen, ShuJian Huang

Large Language Models (LLMs) have shown impressive capabilities across various tasks but remain vulnerable to meticulously crafted jailbreak attacks.

Audio Turing Test: Benchmarking the Human-likeness of Large Language Model-based Text-to-Speech Systems in Chinese

no code implementations 16 May 2025 Xihuai Wang, Ziyi Zhao, Siyu Ren, Shao Zhang, Song Li, Xiaoyu Li, Ziwen Wang, Lin Qiu, Guanglu Wan, Xuezhi Cao, Xunliang Cai, Weinan Zhang

Recent advances in large language models (LLMs) have significantly improved text-to-speech (TTS) systems, enhancing control over speech style, naturalness, and emotional expression, which brings TTS systems closer to human-level performance.

Benchmarking Language Modeling +4

Ask, Fail, Repeat: Meeseeks, an Iterative Feedback Benchmark for LLMs' Multi-turn Instruction-Following Ability

no code implementations 30 Apr 2025 JiaMing Wang, Yunke Zhao, Peng Ding, Jun Kuang, ZongYu Wang, Xuezhi Cao, Xunliang Cai

The ability to follow instructions accurately is fundamental for Large Language Models (LLMs) to serve as reliable agents in real-world applications.

Instruction Following Intent Recognition

TokenFocus-VQA: Enhancing Text-to-Image Alignment with Position-Aware Focus and Multi-Perspective Aggregations on LVLMs

no code implementations 10 Apr 2025 Zijian Zhang, Xuhui Zheng, Xuecheng Wu, Chong Peng, Xuezhi Cao

While text-to-image (T2I) generation models have achieved remarkable progress in recent years, existing evaluation methodologies for vision-language alignment still struggle with fine-grained semantic matching.

Ensemble Learning Position +2

Q-Eval-100K: Evaluating Visual Quality and Alignment Level for Text-to-Vision Content

1 code implementation CVPR 2025 ZiCheng Zhang, Tengchuan Kou, Shushi Wang, Chunyi Li, Wei Sun, Wei Wang, Xiaoyu Li, ZongYu Wang, Xuezhi Cao, Xiongkuo Min, Xiaohong Liu, Guangtao Zhai

While significant progress has been made in developing objective models to assess these dimensions, the performance of such models heavily relies on the scale and quality of human annotations.

Leveraging Dual Process Theory in Language Agent Framework for Real-time Simultaneous Human-AI Collaboration

1 code implementation 17 Feb 2025 Shao Zhang, Xihuai Wang, WenHao Zhang, Chaoran Li, Junru Song, Tingyu Li, Lin Qiu, Xuezhi Cao, Xunliang Cai, Wen Yao, Weinan Zhang, Xinbing Wang, Ying Wen

We propose DPT-Agent, a novel language agent framework that integrates System 1 and System 2 for efficient real-time simultaneous human-AI collaboration.

Who's the MVP? A Game-Theoretic Evaluation Benchmark for Modular Attribution in LLM Agents

no code implementations 1 Feb 2025 Yingxuan Yang, Bo Huang, Siyuan Qi, Chao Feng, Haoyi Hu, Yuxuan Zhu, Jinbo Hu, Haoran Zhao, Ziyi He, Xiao Liu, ZongYu Wang, Lin Qiu, Xuezhi Cao, Xunliang Cai, Yong Yu, Weinan Zhang

Large Language Model (LLM) agent frameworks often employ modular architectures, incorporating components such as planning, reasoning, action execution, and reflection to tackle complex tasks.

Large Language Model

Length Desensitization in Direct Preference Optimization

no code implementations 10 Sep 2024 Wei Liu, Yang Bai, Chengcheng Han, Rongxiang Weng, Jun Xu, Xuezhi Cao, Jingang Wang, Xunliang Cai

Direct Preference Optimization (DPO) is widely utilized in the Reinforcement Learning from Human Feedback (RLHF) phase to align Large Language Models (LLMs) with human preferences, thereby enhancing both their harmlessness and efficacy.

Hallu-PI: Evaluating Hallucination in Multi-modal Large Language Models within Perturbed Inputs

1 code implementation 2 Aug 2024 Peng Ding, Jingyu Wu, Jun Kuang, Dan Ma, Xuezhi Cao, Xunliang Cai, Shi Chen, Jiajun Chen, ShuJian Huang

Extensive experiments on 12 mainstream MLLMs, such as GPT-4V and Gemini-Pro Vision, demonstrate that these models exhibit significant hallucinations on Hallu-PI that are not observed in unperturbed scenarios.

Attribute Hallucination +1

Entity-Aspect-Opinion-Sentiment Quadruple Extraction for Fine-grained Sentiment Analysis

no code implementations 28 Nov 2023 Dan Ma, Jun Xu, ZongYu Wang, Xuezhi Cao, Yunsen Xian

To facilitate research in this new task, we have constructed four datasets (Res14-EASQE, Res15-EASQE, Res16-EASQE, and Lap14-EASQE) based on the SemEval Restaurant and Laptop datasets.

Aspect-Based Sentiment Analysis (ABSA) +1

A Wolf in Sheep's Clothing: Generalized Nested Jailbreak Prompts can Fool Large Language Models Easily

1 code implementation 14 Nov 2023 Peng Ding, Jun Kuang, Dan Ma, Xuezhi Cao, Yunsen Xian, Jiajun Chen, ShuJian Huang

Finally, we analyze the failure of LLM defenses from the perspective of prompt execution priority, and propose corresponding defense strategies.

Exchanging-based Multimodal Fusion with Transformer

1 code implementation 5 Sep 2023 Renyu Zhu, Chengcheng Han, Yong Qian, Qiushi Sun, Xiang Li, Ming Gao, Xuezhi Cao, Yunsen Xian

To solve these issues, in this paper, we propose a novel exchanging-based multimodal fusion model MuSE for text-vision fusion based on Transformer.

Image Captioning Multimodal Sentiment Analysis +4

Meta-Learning Triplet Network with Adaptive Margins for Few-Shot Named Entity Recognition

1 code implementation 14 Feb 2023 Chengcheng Han, Renyu Zhu, Jun Kuang, FengJiao Chen, Xiang Li, Ming Gao, Xuezhi Cao, Wei Wu

We design an improved triplet network to map samples and prototype vectors into a low-dimensional space in which they are easier to classify, and propose an adaptive margin for each entity type.

few-shot-ner Few-shot NER +6

Adap-$τ$: Adaptively Modulating Embedding Magnitude for Recommendation

2 code implementations 9 Feb 2023 Jiawei Chen, Junkang Wu, Jiancan Wu, Sheng Zhou, Xuezhi Cao, Xiangnan He

Recent years have witnessed the great successes of embedding-based methods in recommender systems.

Recommendation Systems

FFHR: Fully and Flexible Hyperbolic Representation for Knowledge Graph Completion

no code implementations 7 Feb 2023 Wentao Shi, Junkang Wu, Xuezhi Cao, Jiawei Chen, Wenqiang Lei, Wei Wu, Xiangnan He

Specifically, they suffer from two main limitations: 1) existing Graph Convolutional Network (GCN) methods in hyperbolic space rely on tangent space approximation, which incurs approximation error in representation learning, and 2) because hyperbolic space lacks a defined inner product operation, existing methods can only measure the plausibility of facts (links) with hyperbolic distance, which makes it difficult to capture complex data patterns.

Knowledge Graph Completion Representation Learning

Popularity Bias Is Not Always Evil: Disentangling Benign and Harmful Bias for Recommendation

no code implementations 16 Sep 2021 Zihao Zhao, Jiawei Chen, Sheng Zhou, Xiangnan He, Xuezhi Cao, Fuzheng Zhang, Wei Wu

To sufficiently exploit such important information for recommendation, it is essential to disentangle the benign popularity bias caused by item quality from the harmful popularity bias caused by conformity.

Recommendation Systems

DisenKGAT: Knowledge Graph Embedding with Disentangled Graph Attention Network

2 code implementations 22 Aug 2021 Junkang Wu, Wentao Shi, Xuezhi Cao, Jiawei Chen, Wenqiang Lei, Fuzheng Zhang, Wei Wu, Xiangnan He

Knowledge graph completion (KGC) has become a focus of attention across the deep learning community owing to its excellent contribution to numerous downstream tasks.

Disentanglement Graph Attention +1

Revealing Multiple Layers of Hidden Community Structure in Networks

2 code implementations 23 Jan 2015 Kun He, Sucheta Soundarajan, Xuezhi Cao, John Hopcroft, Menglong Huang

Additionally, on both real and synthetic networks containing a hidden ground-truth community structure, HICODE uncovers this structure better than any of the baseline algorithms that we compared against.

Social and Information Networks Physics and Society
