Search Results for author: Yue Zhang

Found 457 papers, 225 papers with code

Speeding up Transformer Decoding via an Attention Refinement Network

1 code implementation COLING 2022 Kaixin Wu, Yue Zhang, Bojie Hu, Tong Zhang

Extensive experiments on ten WMT machine translation tasks show that the proposed model yields an average 1.35x speedup (with almost no decrease in BLEU) over the state-of-the-art inference implementation.

Machine Translation NMT +1

Contrastive Data and Learning for Natural Language Processing

no code implementations NAACL (ACL) 2022 Rui Zhang, Yangfeng Ji, Yue Zhang, Rebecca J. Passonneau

We then survey the benefits and best practices of contrastive learning for various downstream NLP applications, including Text Classification, Question Answering, Summarization, Text Generation, Interpretability and Explainability, Commonsense Knowledge and Reasoning, and Vision-and-Language. This tutorial aims to help researchers in the NLP and computational linguistics community understand this emerging topic and to promote future research directions in using contrastive learning for NLP applications.

Contrastive Learning Question Answering +5

Investigating Rich Feature Sources for Conceptual Representation Encoding

no code implementations COLING (CogALex) 2020 Lu Cao, Yulong Chen, Dandan Huang, Yue Zhang

Functional Magnetic Resonance Imaging (fMRI) provides a means to investigate human conceptual representation in cognitive and neuroscience studies, where researchers predict fMRI activations from the features of eliciting stimuli.
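
As context for the encoding setup described above, a minimal sketch of how stimulus-to-fMRI prediction is commonly implemented, here as ridge regression on synthetic data (the paper's actual features and recordings are not reproduced):

```python
# Minimal fMRI encoding-model sketch: predict voxel activations from
# stimulus feature vectors with ridge regression. All data is synthetic.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(180, 300))           # 180 stimuli x 300-dim features
W = 0.1 * rng.normal(size=(300, 1000))    # unknown "true" mapping
Y = X @ W + rng.normal(size=(180, 1000))  # 180 stimuli x 1000 voxels

X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, test_size=0.2, random_state=0)
pred = Ridge(alpha=10.0).fit(X_tr, Y_tr).predict(X_te)

# Per-voxel Pearson correlation between predicted and observed activations,
# the usual evaluation metric for encoding models.
r = [np.corrcoef(pred[:, v], Y_te[:, v])[0, 1] for v in range(Y.shape[1])]
print(f"mean voxel correlation: {np.mean(r):.3f}")
```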

Cross-Lingual Dependency Parsing via Self-Training

no code implementations CCL 2020 Meishan Zhang, Yue Zhang

Recent advances in multilingual word representations weaken the input divergences across languages, making cross-lingual transfer similar to the monolingual cross-domain and semi-supervised settings (a generic self-training loop is sketched below).

Cross-Lingual POS Tagging Cross-Lingual Transfer +3
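
A generic self-training loop of the kind applied here, sketched with a small scikit-learn classifier standing in for the parser; the features, threshold, and data are illustrative assumptions, not the paper's setup:

```python
# Self-training sketch: train on labeled source-language data, pseudo-label
# unlabeled target-language data, keep confident predictions, retrain.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

src_texts = ["the cat sleeps", "a dog barks", "birds fly high", "the sun shines"]
src_labels = [0, 0, 1, 1]
tgt_texts = ["el gato duerme", "los pájaros vuelan alto"]  # unlabeled target

vec = TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4))  # shared char features
model = LogisticRegression().fit(vec.fit_transform(src_texts), src_labels)

for _ in range(3):  # a few self-training rounds
    proba = model.predict_proba(vec.transform(tgt_texts))
    keep = proba.max(axis=1) >= 0.6          # confidence filter
    if not keep.any():
        break
    pseudo = proba.argmax(axis=1)[keep]
    texts = src_texts + [t for t, k in zip(tgt_texts, keep) if k]
    labels = np.concatenate([src_labels, pseudo])
    model = LogisticRegression().fit(vec.transform(texts), labels)
```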

Exploring COVID-19-related Twitter Topic Dynamics across Countries

no code implementations CCL 2020 Shuailong Liang, Derek F. Wong, Yue Zhang

Based on 500,000 tweets posted from different countries and regions on Twitter between January 22, 2020 and April 30, 2020, we study COVID-19-related topics and public opinion. We find both commonalities and differences in the concerns and views of Twitter users across countries, as well as differing sentiment toward different topics. Most tweets carry strong emotion, with expressions of love and support being especially common. Overall, sentiment grew more positive over time.

Entity Enhanced BERT Pre-training for Chinese NER

no code implementations EMNLP 2020 Chen Jia, Yuefeng Shi, Qinrong Yang, Yue Zhang

We then integrate the entity information into BERT using the Char-Entity-Transformer, which augments the self-attention using a combination of character and entity representations (a minimal sketch follows below).

NER
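
A minimal sketch of the fusion idea named above, assuming the simplest combination of character and entity embeddings ahead of vanilla self-attention (the actual Char-Entity-Transformer is more involved):

```python
# Sketch: fuse per-character embeddings with the embedding of the entity
# each character belongs to (index 0 = no entity), then self-attend.
import torch
import torch.nn as nn

class CharEntityLayer(nn.Module):
    def __init__(self, vocab=100, n_entities=50, d=64, heads=4):
        super().__init__()
        self.char_emb = nn.Embedding(vocab, d)
        self.ent_emb = nn.Embedding(n_entities, d)
        self.fuse = nn.Linear(2 * d, d)  # combine the two views
        self.attn = nn.MultiheadAttention(d, heads, batch_first=True)

    def forward(self, char_ids, entity_ids):
        h = self.fuse(torch.cat([self.char_emb(char_ids),
                                 self.ent_emb(entity_ids)], dim=-1))
        out, _ = self.attn(h, h, h)  # standard self-attention over fused states
        return out

layer = CharEntityLayer()
chars = torch.randint(0, 100, (2, 16))    # batch of 2 sequences, 16 chars each
entities = torch.randint(0, 50, (2, 16))  # entity id per character
print(layer(chars, entities).shape)       # torch.Size([2, 16, 64])
```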

PromptGen: Automatically Generate Prompts using Generative Models

no code implementations Findings (NAACL) 2022 Yue Zhang, Hongliang Fei, Dingcheng Li, Ping Li

Recently, prompt learning has received significant attention, where downstream tasks are reformulated as a mask-filling task with the help of a textual prompt (a minimal example follows below).

Knowledge Probing Sentence
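
For readers unfamiliar with the mask-filling reformulation mentioned above, a minimal example using the public bert-base-uncased checkpoint (PromptGen itself generates such prompts rather than hand-writing them):

```python
# A cloze-style prompt scored by a masked language model: the [MASK]
# slot is filled with the model's top candidate tokens.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")
for cand in fill("The capital of France is [MASK]."):
    print(f"{cand['token_str']:>10s}  {cand['score']:.3f}")
```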

Natural Language Processing Meets Quantum Physics: A Survey and Categorization

no code implementations EMNLP 2021 Sixuan Wu, Jian Li, Peng Zhang, Yue Zhang

Recent research has investigated quantum NLP, designing algorithms that process natural language in quantum computers, and also quantum-inspired algorithms that improve NLP performance on classical computers.

Survey

DialogSum Challenge: Summarizing Real-Life Scenario Dialogues

no code implementations INLG (ACL) 2021 Yulong Chen, Yang Liu, Yue Zhang

We propose a shared task on summarizing real-life scenario dialogues, DialogSum Challenge, to encourage researchers to address challenges in dialogue summarization, which has been less studied by the summarization community.

Common Sense Reasoning Representation Learning

Prompt-Driven Neural Machine Translation

1 code implementation Findings (ACL) 2022 Yafu Li, Yongjing Yin, Jing Li, Yue Zhang

Neural machine translation (NMT) has achieved significant performance improvements in recent years.

Machine Translation NMT +1

Logical Reasoning in Large Language Models: A Survey

no code implementations13 Feb 2025 Hanmeng Liu, Zhizhang Fu, Mengru Ding, Ruoxi Ning, Chaoli Zhang, Xiaozhang Liu, Yue Zhang

With the emergence of advanced reasoning models like OpenAI o3 and DeepSeek-R1, large language models (LLMs) have demonstrated remarkable reasoning capabilities.

Logical Reasoning Survey

Bi-directional Curriculum Learning for Graph Anomaly Detection: Dual Focus on Homogeneity and Heterogeneity

no code implementations24 Jan 2025 Yitong Hao, Enbo He, Yue Zhang, Guisheng Yin

To address this problem, we propose an innovative Bi-directional Curriculum Learning strategy (BCL), which treats nodes with higher similarity to their neighbors as simple in the homogeneity-focused direction and nodes with lower similarity as simple in the heterogeneity-focused direction, prioritizing their training accordingly (see the sketch below).

Graph Anomaly Detection
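
A hedged sketch of the difficulty ordering this description suggests, scoring each node by cosine similarity to the mean of its neighbors and sorting from both ends; an illustration, not the authors' code:

```python
import numpy as np

def neighbor_similarity(X, adj):
    """X: (n, d) node features; adj: (n, n) 0/1 adjacency matrix."""
    deg = adj.sum(1, keepdims=True).clip(min=1)
    nbr_mean = adj @ X / deg                     # mean neighbor features
    num = (X * nbr_mean).sum(1)
    den = np.linalg.norm(X, axis=1) * np.linalg.norm(nbr_mean, axis=1) + 1e-9
    return num / den                             # cosine similarity per node

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 8))
adj = (rng.random((6, 6)) < 0.4).astype(float)
adj = np.maximum(adj, adj.T)
np.fill_diagonal(adj, 0)

sim = neighbor_similarity(X, adj)
homo_order = np.argsort(-sim)   # simple-to-hard for the homogeneity focus
hetero_order = np.argsort(sim)  # simple-to-hard for the heterogeneity focus
print(homo_order, hetero_order)
```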

Defeasible Visual Entailment: Benchmark, Evaluator, and Reward-Driven Optimization

1 code implementation19 Dec 2024 Yue Zhang, Liqiang Jing, Vibhav Gogate

Additionally, we introduce a reward-driven update optimization method to further enhance the quality of updates generated by multimodal models.

Contrastive Learning Decision Making +3

DocFusion: A Unified Framework for Document Parsing Tasks

1 code implementation17 Dec 2024 Mingxu Chai, Ziyu Shen, Chong Zhang, Yue Zhang, Xiao Wang, Shihan Dou, Jihua Kang, Jiazheng Zhang, Qi Zhang

Document parsing is essential for analyzing complex document structures and extracting fine-grained information, supporting numerous downstream applications.

PerSphere: A Comprehensive Framework for Multi-Faceted Perspective Retrieval and Summarization

no code implementations17 Dec 2024 Yun Luo, Yingjie Li, Xiangkun Hu, Qinglin Qi, Fang Guo, Qipeng Guo, Zheng Zhang, Yue Zhang

As online platforms and recommendation algorithms evolve, people are increasingly trapped in echo chambers, leading to biased understandings of various issues.

Retrieval

Glimpse: Enabling White-Box Methods to Use Proprietary Models for Zero-Shot LLM-Generated Text Detection

1 code implementation16 Dec 2024 Guangsheng Bao, Yanbin Zhao, Juncai He, Yue Zhang

Advanced large language models (LLMs) can generate text almost indistinguishable from human-written text, highlighting the importance of LLM-generated text detection.

LLM-generated Text Detection Text Detection

Deep Partially Linear Transformation Model for Right-Censored Survival Data

no code implementations10 Dec 2024 Junkai Yin, Yue Zhang, Zhangsheng Yu

Although the Cox proportional hazards model is well established and extensively used in the analysis of survival data, the proportional hazards (PH) assumption may not always hold in practical scenarios.

Mind the Gap: Towards Generalizable Autonomous Penetration Testing via Domain Randomization and Meta-Reinforcement Learning

1 code implementation5 Dec 2024 Shicheng Zhou, Jingju Liu, Yuliang Lu, Jiahai Yang, Yue Zhang, Jie Chen

GAP introduces a Real-to-Sim-to-Real pipeline that (a) enables end-to-end policy learning in unknown real environments while constructing realistic simulations and (b) improves agents' generalization ability by leveraging domain randomization and meta-RL. Specifically, we are among the first to apply domain randomization in autonomous pentesting, and we propose a large language model-powered domain randomization method for synthetic environment generation (a generic sketch follows below).

Large Language Model Meta Reinforcement Learning +1
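
A generic domain-randomization loop in the spirit of the description above; `make_pentest_env` and `rollout` are hypothetical placeholders, and the randomized parameters are illustrative assumptions:

```python
# Domain randomization sketch: resample the simulated environment each
# episode so the policy cannot overfit to a single topology.
import random

def sample_env_config():
    n_hosts = random.randint(3, 12)
    return {
        "n_hosts": n_hosts,
        "os": random.choices(["linux", "windows"], k=n_hosts),
        "vuln_density": random.uniform(0.1, 0.5),   # fraction of vulnerable hosts
        "firewall_enabled": random.random() < 0.5,
    }

for episode in range(3):
    cfg = sample_env_config()
    # env = make_pentest_env(cfg)   # hypothetical environment constructor
    # rollout(policy, env)          # meta-RL adaptation/training would go here
    print(f"episode {episode}: {cfg}")
```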

Blockchain Meets LLMs: A Living Survey on Bidirectional Integration

no code implementations25 Nov 2024 Jianghao Gong, Peiqi Yan, Yue Zhang, Hongli An, Logan Liu

In the domain of large language models, multimodal large language models and explainability research have advanced considerably, propelled by continuous technological progress and innovation.

Survey

Benchmarking GPT-4 against Human Translators: A Comprehensive Evaluation Across Languages, Domains, and Expertise Levels

1 code implementation21 Nov 2024 Jianhao Yan, Pingchuan Yan, Yulong Chen, Jing Li, Xianchao Zhu, Yue Zhang

This study presents a comprehensive evaluation of GPT-4's translation capabilities compared to human translators of varying expertise levels.

Benchmarking Machine Translation +1

Can Language Models Learn to Skip Steps?

1 code implementation4 Nov 2024 Tengxiao Liu, Qipeng Guo, Xiangkun Hu, Cheng Jiayang, Yue Zhang, Xipeng Qiu, Zheng Zhang

In this work, we study the ability to skip steps in reasoning - a hallmark of human expertise developed through practice.

Lingma SWE-GPT: An Open Development-Process-Centric Language Model for Automated Software Improvement

1 code implementation1 Nov 2024 Yingwei Ma, Rongyu Cao, Yongchang Cao, Yue Zhang, Jue Chen, Yibo Liu, Yuchen Liu, Binhua Li, Fei Huang, Yongbin Li

The results demonstrate that Lingma SWE-GPT 72B successfully resolves 30.20% of the GitHub issues, marking a significant improvement in automatic issue resolution (a 22.76% relative improvement over Llama 3.1 405B) and approaching the performance of closed-source models (GPT-4o resolves 31.80% of issues).

Language Modeling Language Modelling

SVIP: Towards Verifiable Inference of Open-source Large Language Models

no code implementations29 Oct 2024 Yifan Sun, Yuhang Li, Yue Zhang, Yuchen Jin, Huan Zhang

Open-source Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language understanding and generation, leading to widespread adoption across various domains.

Natural Language Understanding

CycleResearcher: Improving Automated Research via Automated Review

no code implementations28 Oct 2024 Yixuan Weng, Minjun Zhu, Guangsheng Bao, Hongbo Zhang, Jindong Wang, Yue Zhang, Linyi Yang

In research, the papers generated by the CycleResearcher model achieved a score of 5.36 in simulated peer reviews, surpassing the preprint level of 5.24 from human experts and approaching the accepted-paper level of 5.69.

scientific discovery

Building Dialogue Understanding Models for Low-resource Language Indonesian from Scratch

no code implementations24 Oct 2024 Donglin Di, Weinan Zhang, Yue Zhang, Fanglin Wang

Making use of off-the-shelf resources from resource-rich languages to transfer knowledge to low-resource languages has attracted much attention recently.

Cross-Lingual Transfer Decoder +6

Task Calibration: Calibrating Large Language Models on Inference Tasks

no code implementations24 Oct 2024 Yingjie Li, Yun Luo, Xiaotian Xie, Yue Zhang

TC encourages LLMs to reason based on both the premise and the hypothesis, while mitigating the models' over-reliance on the individual premise or hypothesis for inference (one possible calibration is sketched below).

Natural Language Understanding
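
One plausible way to implement this style of calibration, penalizing probability mass attributable to the premise alone or the hypothesis alone; the exact formula is an assumption for illustration, not necessarily the paper's TC:

```python
# Calibrate a label distribution by discounting what the model predicts
# from only the premise or only the hypothesis.
import numpy as np

def calibrate(p_full, p_premise_only, p_hypothesis_only, eps=1e-9):
    """Each input is a label distribution from the same LLM."""
    scores = np.log(p_full + eps) - 0.5 * (np.log(p_premise_only + eps)
                                           + np.log(p_hypothesis_only + eps))
    z = np.exp(scores - scores.max())  # renormalize with a softmax
    return z / z.sum()

p_full = np.array([0.70, 0.20, 0.10])  # entail / neutral / contradict
p_prem = np.array([0.60, 0.25, 0.15])  # premise-only bias
p_hyp = np.array([0.55, 0.30, 0.15])   # hypothesis-only bias
print(calibrate(p_full, p_prem, p_hyp))
```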

Locking Down the Finetuned LLMs Safety

1 code implementation14 Oct 2024 Minjun Zhu, Linyi Yang, Yifan Wei, Ningyu Zhang, Yue Zhang

SafetyLock leverages our discovery that fine-tuned models retain similar safety-related activation representations to their base models.

Safety Alignment

MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting

1 code implementation14 Oct 2024 Yue Zhang, Minhao Liu, Zhaokang Chen, Bin Wu, Yubin Zeng, Chao Zhan, Yingjie He, Junxin Huang, Wenjiang Zhou

We propose MuseTalk, which generates lip-sync targets in a latent space encoded by a Variational Autoencoder, enabling high-fidelity talking face video generation with efficient inference.

Video Generation

EchoApex: A General-Purpose Vision Foundation Model for Echocardiography

no code implementations14 Oct 2024 Abdoul Aziz Amadou, Yue Zhang, Sebastien Piat, Paul Klein, Ingo Schmuecking, Tiziano Passerini, Puneet Sharma

Quantitative evaluation of echocardiography is essential for precise assessment of cardiac condition, monitoring disease progression, and guiding treatment decisions.

Self-Supervised Learning

Towards Homogeneous Lexical Tone Decoding from Heterogeneous Intracranial Recordings

no code implementations13 Oct 2024 Di Wu, Siyuan Li, Chen Feng, Lu Cao, Yue Zhang, Jie Yang, Mohamad Sawan

To address these limitations, we introduce Homogeneity-Heterogeneity Disentangled Learning for neural Representations (H2DiLR), a novel framework that disentangles and learns both the homogeneity and heterogeneity from intracranial recordings across multiple subjects.

Representation Learning

ELICIT: LLM Augmentation via External In-Context Capability

1 code implementation12 Oct 2024 Futing Wang, Jianhao Yan, Yue Zhang, Tao Lin

By externally storing and reusing vectors that represent in-context learned capabilities, ELICIT not only demonstrates the potential to operate modular capabilities but also significantly enhances the performance, versatility, adaptability, and scalability of large language models (a toy illustration follows below).

In-Context Learning
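
A toy illustration of storing and reusing activation vectors, the general mechanism ELICIT builds on (not the authors' implementation): a vector captured from one forward pass is later added back into the same layer via a forward hook:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))
layer = model[0]

# 1) Capture: run "demonstration" inputs and store the mean activation.
stored = {}
handle = layer.register_forward_hook(
    lambda mod, inp, out: stored.update(vec=out.mean(0).detach()))
model(torch.randn(32, 8))  # stand-in for in-context demonstrations
handle.remove()

# 2) Reuse: steer a later forward pass by adding the stored vector back.
handle = layer.register_forward_hook(lambda mod, inp, out: out + stored["vec"])
print(model(torch.randn(1, 8)))
handle.remove()
```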

Keys to Robust Edits: from Theoretical Insights to Practical Advances

no code implementations12 Oct 2024 Jianhao Yan, Futing Wang, Yun Luo, Yafu Li, Yue Zhang

Large language models (LLMs) have revolutionized knowledge storage and retrieval, but face challenges with conflicting and outdated information.

knowledge editing Specificity

ECon: On the Detection and Resolution of Evidence Conflicts

1 code implementation5 Oct 2024 Cheng Jiayang, Chunkit Chan, Qianqian Zhuang, Lin Qiu, Tianhang Zhang, Tengxiao Liu, Yangqiu Song, Yue Zhang, PengFei Liu, Zheng Zhang

The rise of large language models (LLMs) has significantly influenced the quality of information in decision-making systems, leading to the prevalence of AI-generated content and challenges in detecting misinformation and managing conflicting information, or "inter-evidence conflicts."

Decision Making Misinformation +1

SPARTUN3D: Situated Spatial Understanding of 3D World in Large Language Models

no code implementations4 Oct 2024 Yue Zhang, Zhiyang Xu, Ying Shen, Parisa Kordjamshidi, Lifu Huang

Second, the architectures of existing 3D-based LLMs lack explicit alignment between the spatial representations of 3D scenes and natural language, limiting their performance on tasks requiring precise spatial reasoning.

Scene Understanding Spatial Reasoning

Modeling Layout Reading Order as Ordering Relations for Visually-rich Document Understanding

1 code implementation29 Sep 2024 Chong Zhang, Yi Tu, Yixi Zhao, Chenshu Yuan, Huan Chen, Yue Zhang, Mingxu Chai, Ya Guo, Huijia Zhu, Qi Zhang, Tao Gui

However, we argue that this formulation does not adequately convey the complete reading order information in the layout, which may potentially lead to performance decline in downstream VrD tasks.

document understanding Entity Linking +4

A Unified Hallucination Mitigation Framework for Large Vision-Language Models

1 code implementation24 Sep 2024 Yue Chang, Liqiang Jing, Xiaopeng Zhang, Yue Zhang

To mitigate hallucination, current studies focus either on the model's inference process or on its generated outputs, but their solutions sometimes fail to deal appropriately with the various types of queries and the hallucinations these queries induce.

Hallucination Question Answering +1

Learning Task Planning from Multi-Modal Demonstration for Multi-Stage Contact-Rich Manipulation

no code implementations18 Sep 2024 KeJia Chen, Zheng Shen, Yue Zhang, Lingyun Chen, Fan Wu, Zhenshan Bing, Sami Haddadin, Alois Knoll

To enhance the validity of LLM-generated plans, visual demonstrations and online videos have been widely employed to guide the planning process.

In-Context Learning Task Planning

Semformer: Transformer Language Models with Semantic Planning

no code implementations17 Sep 2024 Yongjing Yin, Junran Ding, Kai Song, Yue Zhang

In this paper, we introduce Semformer, a novel method of training a Transformer language model that explicitly models the semantic planning of the response.

In-Context Learning Language Modeling +1

Gated Slot Attention for Efficient Linear-Time Sequence Modeling

2 code implementations11 Sep 2024 Yu Zhang, Songlin Yang, Ruijie Zhu, Yue Zhang, Leyang Cui, Yiqiao Wang, Bolun Wang, Freda Shi, Bailin Wang, Wei Bi, Peng Zhou, Guohong Fu

Linear attention Transformers and their gated variants, celebrated for enabling parallel training and efficient recurrent inference, still fall short in recall-intensive tasks compared to traditional Transformers and demand significant resources for training from scratch.
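
For context, the gated linear-attention recurrence underlying this model family, in a didactic single-head form (the paper's gated slot attention is a more refined, hardware-efficient formulation):

```python
# Gated linear attention as a recurrence: the state S_t = g_t * S_{t-1}
# + k_t v_t^T is updated in constant memory and read out with the query,
# giving linear time in sequence length.
import torch

def gated_linear_attention(q, k, v, g):
    """q, k, v: (T, d); g: (T,) gates in (0, 1). Returns (T, d)."""
    T, d = q.shape
    S = torch.zeros(d, d)
    outs = []
    for t in range(T):
        S = g[t] * S + torch.outer(k[t], v[t])  # recurrent state update
        outs.append(q[t] @ S)                   # O(d^2) read-out per step
    return torch.stack(outs)

T, d = 5, 8
out = gated_linear_attention(torch.randn(T, d), torch.randn(T, d),
                             torch.randn(T, d), torch.sigmoid(torch.randn(T)))
print(out.shape)  # torch.Size([5, 8])
```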

Personality Alignment of Large Language Models

1 code implementation21 Aug 2024 Minjun Zhu, Linyi Yang, Yue Zhang

This dataset allows us to quantitatively evaluate the extent to which LLMs can align with each subject's behavioral patterns.

Personality Alignment

LAKD-Activation Mapping Distillation Based on Local Learning

no code implementations21 Aug 2024 Yaoze Zhang, Yuming Zhang, Yu Zhao, Yue Zhang, Feiyu Zhu

Existing knowledge distillation methods focus on designing different distillation targets to acquire knowledge from teacher models.

Knowledge Distillation

Narrowing the Gap between Vision and Action in Navigation

1 code implementation19 Aug 2024 Yue Zhang, Parisa Kordjamshidi

First, VLN-CE agents that discretize the visual environment are primarily trained with high-level view selection, which causes them to ignore crucial spatial reasoning within the low-level action movements.

Decoder Spatial Reasoning +1

See What LLMs Cannot Answer: A Self-Challenge Framework for Uncovering LLM Weaknesses

1 code implementation16 Aug 2024 Yulong Chen, Yang Liu, Jianhao Yan, Xuefeng Bai, Ming Zhong, Yinghao Yang, ZiYi Yang, Chenguang Zhu, Yue Zhang

We then build a benchmark, SC-G4, consisting of 1,835 instances generated by GPT-4 using these patterns, with human-annotated gold responses.

RAGChecker: A Fine-grained Framework for Diagnosing Retrieval-Augmented Generation

1 code implementation15 Aug 2024 Dongyu Ru, Lin Qiu, Xiangkun Hu, Tianhang Zhang, Peng Shi, Shuaichen Chang, Cheng Jiayang, Cunxiang Wang, Shichao Sun, Huanyu Li, Zizhao Zhang, Binjie Wang, Jiarong Jiang, Tong He, Zhiguo Wang, PengFei Liu, Yue Zhang, Zheng Zhang

Despite Retrieval-Augmented Generation (RAG) showing promising capability in leveraging external knowledge, a comprehensive evaluation of RAG systems is still challenging due to the modular nature of RAG, evaluation of long-form responses and reliability of measurements.

RAG Retrieval

LLM-based MOFs Synthesis Condition Extraction using Few-Shot Demonstrations

no code implementations6 Aug 2024 Lei Shi, Zhimeng Liu, Yi Yang, Weize Wu, Yuyang Zhang, Hongbo Zhang, Jing Lin, Siyu Wu, Zihan Chen, Ruiming Li, Nan Wang, Zipeng Liu, Huobin Tan, Hongyi Gao, Yue Zhang, Ge Wang

The extraction of Metal-Organic Frameworks (MOFs) synthesis conditions from literature text has been challenging but crucial for the logical design of new MOFs with desirable functionality.

Few-Shot Learning In-Context Learning +2

DNTextSpotter: Arbitrary-Shaped Scene Text Spotting via Improved Denoising Training

1 code implementation1 Aug 2024 Yu Xie, Qian Qiao, Jun Gao, Tianxiang Wu, Jiaqing Fan, Yue Zhang, Jielei Zhang, Huyang Sun

Unfortunately, this denoising training method cannot be directly applied to text spotting, which requires irregular shape detection and text recognition tasks more complex than classification.

Denoising Graph Matching +4

Dynamic neural network with memristive CIM and CAM for 2D and 3D vision

no code implementations12 Jul 2024 Yue Zhang, Woyu Zhang, Shaocong Wang, Ning Lin, Yifei Yu, Yangu He, Bo Wang, Hao Jiang, Peng Lin, Xiaoxin Xu, Xiaojuan Qi, Zhongrui Wang, Xumeng Zhang, Dashan Shang, Qi Liu, Kwang-Ting Cheng, Ming Liu

In contrast, AI models are static, unable to associate inputs with past experiences, and run on digital computers with physically separated memory and processing.

Vision-and-Language Navigation Today and Tomorrow: A Survey in the Era of Foundation Models

1 code implementation9 Jul 2024 Yue Zhang, Ziqiao Ma, Jialu Li, Yanyuan Qiao, Zun Wang, Joyce Chai, Qi Wu, Mohit Bansal, Parisa Kordjamshidi

Vision-and-Language Navigation (VLN) has gained increasing attention over recent years, and many approaches have emerged to advance its development.

Vision and Language Navigation

GPT-4 vs. Human Translators: A Comprehensive Evaluation of Translation Quality Across Languages, Domains, and Expertise Levels

no code implementations4 Jul 2024 Jianhao Yan, Pingchuan Yan, Yulong Chen, Judy Li, Xianchao Zhu, Yue Zhang

This study comprehensively evaluates the translation quality of Large Language Models (LLMs), specifically GPT-4, against human translators of varying expertise levels across multiple language pairs and domains.

Translation

The USTC-NERCSLIP Systems for The ICMC-ASR Challenge

no code implementations2 Jul 2024 Minghui Wu, Luzhen Xu, Jie Zhang, Haitao Tang, Yanyan Yue, Ruizhi Liao, Jintao Zhao, Zhengzhe Zhang, Yichi Wang, Haoyin Yan, Hongliang Yu, Tongle Ma, Jiachen Liu, Chongliang Wu, Yongchao Li, Yanyong Zhang, Xin Fang, Yue Zhang

This report describes the submitted system to the In-Car Multi-Channel Automatic Speech Recognition (ICMC-ASR) challenge, which considers the ASR task with multi-speaker overlapping and Mandarin accent dynamics in the ICMC case.

Automatic Speech Recognition Pseudo Label +5

ConU: Conformal Uncertainty in Large Language Models with Correctness Coverage Guarantees

1 code implementation29 Jun 2024 Zhiyuan Wang, Jinhao Duan, Lu Cheng, Yue Zhang, Qingni Wang, Xiaoshuang Shi, Kaidi Xu, HengTao Shen, Xiaofeng Zhu

Uncertainty quantification (UQ) in natural language generation (NLG) tasks remains an open challenge, exacerbated by the closed-source nature of the latest large language models (LLMs).

Conformal Prediction Prediction +2

Nash CoT: Multi-Path Inference with Preference Equilibrium

1 code implementation18 Jun 2024 Ziqi Zhang, Cunxiang Wang, Xiong Xiao, Yue Zhang, Donglin Wang

However, placing LLMs into specific roles may reduce their reasoning diversity and performance on a few tasks where role dependence is low.

Diversity Question Answering

AutoSurvey: Large Language Models Can Automatically Write Surveys

1 code implementation10 Jun 2024 Yidong Wang, Qi Guo, Wenjin Yao, Hongbo Zhang, Xin Zhang, Zhen Wu, Meishan Zhang, Xinyu Dai, Min Zhang, Qingsong Wen, Wei Ye, Shikun Zhang, Yue Zhang

This paper introduces AutoSurvey, a speedy and well-organized methodology for automating the creation of comprehensive literature surveys in rapidly evolving fields like artificial intelligence.

Retrieval Survey

Synergetic Event Understanding: A Collaborative Approach to Cross-Document Event Coreference Resolution with Large Language Models

1 code implementation4 Jun 2024 Qingkai Min, Qipeng Guo, Xiangkun Hu, Songfang Huang, Zheng Zhang, Yue Zhang

Experimental results demonstrate that our approach surpasses the performance of both the large and small language models individually, forming a complementary advantage.

coreference-resolution Diversity +1

Break the Chain: Large Language Models Can be Shortcut Reasoners

no code implementations4 Jun 2024 Mengru Ding, Hanmeng Liu, Zhizhang Fu, Jian Song, WenBo Xie, Yue Zhang

We propose the integration of human-like heuristics and shortcuts into language models (LMs) through "break the chain" strategies.

LexMatcher: Dictionary-centric Data Collection for LLM-based Machine Translation

1 code implementation3 Jun 2024 Yongjing Yin, Jiali Zeng, Yafu Li, Fandong Meng, Yue Zhang

The fine-tuning of open-source large language models (LLMs) for machine translation has recently received considerable attention, marking a shift towards data-centric research from traditional neural machine translation.

Data Augmentation Machine Translation +2

Prompt-Aware Adapter: Towards Learning Adaptive Visual Tokens for Multimodal Large Language Models

no code implementations24 May 2024 Yue Zhang, Hehe Fan, Yi Yang

To bridge the gap between vision and language modalities, Multimodal Large Language Models (MLLMs) usually learn an adapter that converts visual inputs to understandable tokens for Large Language Models (LLMs).

Question Answering Visual Question Answering

Image-of-Thought Prompting for Visual Reasoning Refinement in Multimodal Large Language Models

no code implementations22 May 2024 Qiji Zhou, Ruochen Zhou, Zike Hu, Panzhong Lu, Siyang Gao, Yue Zhang

Recent advancements in Chain-of-Thought (CoT) and related rationale-based works have significantly improved the performance of Large Language Models (LLMs) in complex reasoning tasks.

Multimodal Reasoning Visual Question Answering +1

Spotting AI's Touch: Identifying LLM-Paraphrased Spans in Text

1 code implementation21 May 2024 Yafu Li, Zhilin Wang, Leyang Cui, Wei Bi, Shuming Shi, Yue Zhang

To this end, we propose a novel detection framework, paraphrased text span detection (PTD), aiming to identify paraphrased text spans within a text (see the sequence-labeling sketch below).

Diversity Text Detection
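
A minimal sketch of PTD framed as sequence labeling, which the description above implies; the encoder and labels are toy stand-ins for a pretrained model and annotated data:

```python
# Per-token binary tagging: label 1 marks tokens inside an
# LLM-paraphrased span, label 0 marks original text.
import torch
import torch.nn as nn

tokens = ["The", "report", "was", "completely", "rewritten", "by", "an", "LLM", "."]
labels = torch.tensor([0, 0, 0, 1, 1, 1, 1, 1, 0])

d = 32
encoder = nn.Embedding(1000, d)  # stand-in for a pretrained text encoder
tagger = nn.Linear(d, 2)         # per-token span classifier
ids = torch.tensor([hash(t) % 1000 for t in tokens])

logits = tagger(encoder(ids))    # (seq_len, 2)
loss = nn.functional.cross_entropy(logits, labels)
loss.backward()                  # trainable end to end
print(logits.argmax(-1))         # predicted span mask
```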

What Have We Achieved on Non-autoregressive Translation?

1 code implementation21 May 2024 Yafu Li, Huajian Zhang, Jianhao Yan, Yongjing Yin, Yue Zhang

Recent advances have made non-autoregressive (NAT) translation comparable to autoregressive methods (AT).

Translation

The high dimensional psychological profile and cultural bias of ChatGPT

no code implementations6 May 2024 Hang Yuan, Zhongyue Che, Shao Li, Yue Zhang, Xiaomeng Hu, Siyang Luo

However, to ensure that artificial intelligence models benefit human society, we must first fully understand how the human-like characteristics exhibited by these models compare with those of real humans, as well as the cultural stereotypes and biases the models may exhibit when interacting with humans.

Decision Making

Logic Agent: Enhancing Validity with Logic Rule Invocation

no code implementations28 Apr 2024 Hanmeng Liu, Zhiyang Teng, Chaoli Zhang, Yue Zhang

Chain-of-Thought (CoT) prompting has emerged as a pivotal technique for augmenting the inferential capabilities of language models during reasoning tasks.

Informativeness Navigate

Prefix Text as a Yarn: Eliciting Non-English Alignment in Foundation Language Model

no code implementations25 Apr 2024 Runzhe Zhan, Xinyi Yang, Derek F. Wong, Lidia S. Chao, Yue Zhang

While supervised fine-tuning (SFT) has been a straightforward approach for tailoring the output of foundation large language model (LLM) to specific preferences, concerns have been raised about the depth of this alignment, with some critiques suggesting it is merely "superficial".

Language Modeling Language Modelling +3

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

no code implementations22 Apr 2024 Marah Abdin, Jyoti Aneja, Hany Awadalla, Ahmed Awadallah, Ammar Ahmad Awan, Nguyen Bach, Amit Bahree, Arash Bakhtiari, Jianmin Bao, Harkirat Behl, Alon Benhaim, Misha Bilenko, Johan Bjorck, Sébastien Bubeck, Martin Cai, Qin Cai, Vishrav Chaudhary, Dong Chen, Dongdong Chen, Weizhu Chen, Yen-Chun Chen, Yi-Ling Chen, Hao Cheng, Parul Chopra, Xiyang Dai, Matthew Dixon, Ronen Eldan, Victor Fragoso, Jianfeng Gao, Mei Gao, Min Gao, Amit Garg, Allie Del Giorno, Abhishek Goswami, Suriya Gunasekar, Emman Haider, Junheng Hao, Russell J. Hewett, Wenxiang Hu, Jamie Huynh, Dan Iter, Sam Ade Jacobs, Mojan Javaheripi, Xin Jin, Nikos Karampatziakis, Piero Kauffmann, Mahoud Khademi, Dongwoo Kim, Young Jin Kim, Lev Kurilenko, James R. Lee, Yin Tat Lee, Yuanzhi Li, Yunsheng Li, Chen Liang, Lars Liden, Xihui Lin, Zeqi Lin, Ce Liu, Liyuan Liu, Mengchen Liu, Weishung Liu, Xiaodong Liu, Chong Luo, Piyush Madan, Ali Mahmoudzadeh, David Majercak, Matt Mazzola, Caio César Teodoro Mendes, Arindam Mitra, Hardik Modi, Anh Nguyen, Brandon Norick, Barun Patra, Daniel Perez-Becker, Thomas Portet, Reid Pryzant, Heyang Qin, Marko Radmilac, Liliang Ren, Gustavo de Rosa, Corby Rosset, Sambudha Roy, Olatunji Ruwase, Olli Saarikivi, Amin Saied, Adil Salim, Michael Santacroce, Shital Shah, Ning Shang, Hiteshi Sharma, Yelong Shen, Swadheen Shukla, Xia Song, Masahiro Tanaka, Andrea Tupini, Praneetha Vaddamanu, Chunyu Wang, Guanhua Wang, Lijuan Wang, Shuohang Wang, Xin Wang, Yu Wang, Rachel Ward, Wen Wen, Philipp Witte, Haiping Wu, Xiaoxia Wu, Michael Wyatt, Bin Xiao, Can Xu, Jiahang Xu, Weijian Xu, Jilong Xue, Sonali Yadav, Fan Yang, Jianwei Yang, Yifan Yang, ZiYi Yang, Donghan Yu, Lu Yuan, Chenruidong Zhang, Cyril Zhang, Jianwen Zhang, Li Lyna Zhang, Yi Zhang, Yue Zhang, Yunan Zhang, Xiren Zhou

We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens, whose overall performance, as measured by both academic benchmarks and internal testing, rivals that of models such as Mixtral 8x7B and GPT-3.5 (e.g., phi-3-mini achieves 69% on MMLU and 8.38 on MT-bench), despite being small enough to be deployed on a phone.

Ranked #5 on MMR total on MRR-Benchmark (using extra training data)

Language Modeling Language Modelling +3

Efficient and accurate neural field reconstruction using resistive memory

no code implementations15 Apr 2024 Yifei Yu, Shaocong Wang, Woyu Zhang, Xinyuan Zhang, Xiuzhe Wu, Yangu He, Jichang Yang, Yue Zhang, Ning Lin, Bo Wang, Xi Chen, Songqi Wang, Xumeng Zhang, Xiaojuan Qi, Zhongrui Wang, Dashan Shang, Qi Liu, Kwang-Ting Cheng, Ming Liu

The GE harnesses the intrinsic stochasticity of resistive memory for efficient input encoding, while the PE achieves precise weight mapping through a Hardware-Aware Quantization (HAQ) circuit.

Novel View Synthesis Quantization

From Model-centered to Human-Centered: Revision Distance as a Metric for Text Evaluation in LLMs-based Applications

no code implementations10 Apr 2024 Yongqiang Ma, Lizhi Qing, Jiawei Liu, Yangyang Kang, Yue Zhang, Wei Lu, Xiaozhong Liu, Qikai Cheng

Therefore, our study shifts the focus from model-centered to human-centered evaluation in the context of AI-powered writing assistance applications.

FreeEval: A Modular Framework for Trustworthy and Efficient Evaluation of Large Language Models

1 code implementation9 Apr 2024 Zhuohao Yu, Chang Gao, Wenjin Yao, Yidong Wang, Zhengran Zeng, Wei Ye, Jindong Wang, Yue Zhang, Shikun Zhang

The rapid development of large language model (LLM) evaluation methodologies and datasets has led to a profound challenge: integrating state-of-the-art evaluation techniques cost-effectively while ensuring reliability, reproducibility, and efficiency.

Fairness Language Modelling +1

A Rationale-centric Counterfactual Data Augmentation Method for Cross-Document Event Coreference Resolution

1 code implementation2 Apr 2024 Bowen Ding, Qingkai Min, Shengkun Ma, Yingjie Li, Linyi Yang, Yue Zhang

Based on Pre-trained Language Models (PLMs), event coreference resolution (ECR) systems have demonstrated outstanding performance in clustering coreferential events across documents.

coreference-resolution counterfactual +3

HeteroMILE: a Multi-Level Graph Representation Learning Framework for Heterogeneous Graphs

no code implementations31 Mar 2024 Yue Zhang, Yuntian He, Saket Gurukar, Srinivasan Parthasarathy

To address this issue, we propose a Multi-Level Embedding framework of nodes on a heterogeneous graph (HeteroMILE) - a generic methodology that allows contemporary graph embedding methods to scale to large graphs.

Graph Embedding Graph Representation Learning +2

NovelQA: Benchmarking Question Answering on Documents Exceeding 200K Tokens

1 code implementation18 Mar 2024 Cunxiang Wang, Ruoxi Ning, Boqi Pan, Tonghui Wu, Qipeng Guo, Cheng Deng, Guangsheng Bao, Xiangkun Hu, Zheng Zhang, Qian Wang, Yue Zhang

The rapid advancement of Large Language Models (LLMs) has introduced a new frontier in natural language processing, particularly in understanding and processing long-context information.

Benchmarking Question Answering

Knowledge Conflicts for LLMs: A Survey

1 code implementation13 Mar 2024 Rongwu Xu, Zehan Qi, Zhijiang Guo, Cunxiang Wang, Hongru Wang, Yue Zhang, Wei Xu

This survey provides an in-depth analysis of knowledge conflicts for large language models (LLMs), highlighting the complex challenges they encounter when blending contextual and parametric knowledge.

Misinformation Survey

DuDoUniNeXt: Dual-domain unified hybrid model for single and multi-contrast undersampled MRI reconstruction

no code implementations8 Mar 2024 Ziqi Gao, Yue Zhang, Xinwen Liu, Kaiyan Li, S. Kevin Zhou

Multi-contrast (MC) Magnetic Resonance Imaging (MRI) reconstruction aims to incorporate a reference image of auxiliary modality to guide the reconstruction process of the target modality.

MRI Reconstruction

Improving Cross-lingual Representation for Semantic Retrieval with Code-switching

no code implementations3 Mar 2024 Mieradilijiang Maimaiti, Yuanhang Zheng, Ji Zhang, Fei Huang, Yue Zhang, Wenpei Luo, Kaiyu Huang

Semantic Retrieval (SR) has become an indispensable part of the FAQ system in the task-oriented question-answering (QA) dialogue scenario.

Question Answering Retrieval +3

How Likely Do LLMs with CoT Mimic Human Reasoning?

1 code implementation25 Feb 2024 Guangsheng Bao, Hongbo Zhang, Cunxiang Wang, Linyi Yang, Yue Zhang

Chain-of-thought has emerged as a promising technique for eliciting reasoning capabilities from Large Language Models (LLMs).

In-Context Learning

KIEval: A Knowledge-grounded Interactive Evaluation Framework for Large Language Models

2 code implementations23 Feb 2024 Zhuohao Yu, Chang Gao, Wenjin Yao, Yidong Wang, Wei Ye, Jindong Wang, Xing Xie, Yue Zhang, Shikun Zhang

Automatic evaluation methods for large language models (LLMs) are hindered by data contamination, leading to inflated assessments of their effectiveness.

Word-Sequence Entropy: Towards Uncertainty Estimation in Free-Form Medical Question Answering Applications and Beyond

no code implementations22 Feb 2024 Zhiyuan Wang, Jinhao Duan, Chenxi Yuan, Qingyu Chen, Tianlong Chen, Yue Zhang, Ren Wang, Xiaoshuang Shi, Kaidi Xu

Uncertainty estimation is crucial for the reliability of safety-critical human and artificial intelligence (AI) interaction systems, particularly in the domain of healthcare engineering.

MedQA Question Answering +1

Potential and Challenges of Model Editing for Social Debiasing

no code implementations21 Feb 2024 Jianhao Yan, Futing Wang, Yafu Li, Yue Zhang

Large language models (LLMs) trained on vast corpora suffer from inevitable stereotype biases.

Model Editing

Tables as Texts or Images: Evaluating the Table Reasoning Ability of LLMs and MLLMs

no code implementations19 Feb 2024 Naihao Deng, Zhenjie Sun, Ruiqi He, Aman Sikka, Yulong Chen, Lin Ma, Yue Zhang, Rada Mihalcea

In this paper, we investigate the effectiveness of various LLMs in interpreting tabular data through different prompting strategies and data formats.

Fact Checking Question Answering

Cofca: A Step-Wise Counterfactual Multi-hop QA benchmark

no code implementations19 Feb 2024 Jian Wu, Linyi Yang, Zhen Wang, Manabu Okumura, Yue Zhang

Although previous counterfactual QA benchmarks can separate the internal memory of LLMs, they focus solely on final QA performance, which is insufficient for reporting LLMs' real reasoning abilities.

counterfactual Multi-hop Question Answering +2

Fine-grained and Explainable Factuality Evaluation for Multimodal Summarization

no code implementations18 Feb 2024 Yue Zhang, Jingxuan Zuo, Liqiang Jing

To evaluate the factuality of multimodal summarization models, we propose two fine-grained and explainable evaluation frameworks (FALLACIOUS) for different application scenarios, i.e., a reference-based factuality evaluation framework and a reference-free factuality evaluation framework.

NavHint: Vision and Language Navigation Agent with a Hint Generator

1 code implementation4 Feb 2024 Yue Zhang, Quan Guo, Parisa Kordjamshidi

The hint generator assists the navigation agent in developing a global understanding of the visual environment.

Vision and Language Navigation

Common Sense Reasoning for Deepfake Detection

1 code implementation31 Jan 2024 Yue Zhang, Ben Colman, Xiao Guo, Ali Shahriyari, Gaurav Bharaj

To address these challenges, we frame deepfake detection as a Deepfake Detection VQA (DD-VQA) task and model human intuition by providing textual explanations that describe common sense reasons for labeling an image as real or fake.

Binary Classification Common Sense Reasoning +4

Detecting Multimedia Generated by Large AI Models: A Survey

1 code implementation22 Jan 2024 Li Lin, Neeraj Gupta, Yue Zhang, Hainan Ren, Chun-Hao Liu, Feng Ding, Xin Wang, Xin Li, Luisa Verdoliva, Shu Hu

The rapid advancement of Large AI Models (LAIMs), particularly diffusion models and large language models, has marked a new era where AI-generated multimedia is increasingly integrated into various aspects of daily life.

Survey

SCALA: Sparsification-based Contrastive Learning for Anomaly Detection on Attributed Networks

no code implementations3 Jan 2024 Enbo He, Yitong Hao, Yue Zhang, Guisheng Yin, Lina Yao

Besides, the node representations of normal entities can easily be perturbed by the noisy relationships introduced by anomalous nodes.

Anomaly Detection Contrastive Learning

Alleviating Hallucinations of Large Language Models through Induced Hallucinations

2 code implementations25 Dec 2023 Yue Zhang, Leyang Cui, Wei Bi, Shuming Shi

Experimental results on both discrimination-based and generation-based hallucination evaluation benchmarks, such as TruthfulQA and FActScore, demonstrate that our proposed ICD methods can effectively enhance the factuality of LLMs across various model sizes and families (see the decoding sketch below).

Hallucination Hallucination Evaluation +1
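
A sketch of contrastive decoding against an induced-hallucination model, the mechanism the ICD name suggests; the mixing formula and weight are illustrative assumptions:

```python
# Next-token logits from a factual base model are pushed away from those
# of a model deliberately induced to hallucinate.
import torch

def icd_logits(base_logits, hallu_logits, alpha=1.0):
    """Both inputs: (vocab,) next-token logits from the two models."""
    return (1 + alpha) * base_logits - alpha * hallu_logits

vocab = 10
base = torch.randn(vocab)
hallu = base.clone()
hallu[3] += 4.0  # the induced model strongly prefers a hallucinated token

print("base argmax:", base.argmax().item())
print("contrastive argmax:", icd_logits(base, hallu).argmax().item())
```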

Random resistive memory-based deep extreme point learning machine for unified visual processing

no code implementations14 Dec 2023 Shaocong Wang, Yizhao Gao, Yi Li, Woyu Zhang, Yifei Yu, Bo Wang, Ning Lin, Hegan Chen, Yue Zhang, Yang Jiang, Dingchen Wang, Jia Chen, Peng Dai, Hao Jiang, Peng Lin, Xumeng Zhang, Xiaojuan Qi, Xiaoxin Xu, Hayden So, Zhongrui Wang, Dashan Shang, Qi Liu, Kwang-Ting Cheng, Ming Liu

Our random resistive memory-based deep extreme point learning machine may pave the way for energy-efficient and training-friendly edge AI across various data modalities and tasks.

A Survey on Large Language Model (LLM) Security and Privacy: The Good, the Bad, and the Ugly

no code implementations4 Dec 2023 Yifan Yao, Jinhao Duan, Kaidi Xu, Yuanfang Cai, Zhibo Sun, Yue Zhang

In the meantime, LLMs have also gained traction in the security community, revealing security vulnerabilities and showcasing their potential in security-related tasks.

Language Modeling Language Modelling +4

Enhancing Uncertainty-Based Hallucination Detection with Stronger Focus

1 code implementation22 Nov 2023 Tianhang Zhang, Lin Qiu, Qipeng Guo, Cheng Deng, Yue Zhang, Zheng Zhang, Chenghu Zhou, Xinbing Wang, Luoyi Fu

Large Language Models (LLMs) have gained significant popularity for their impressive performance across diverse fields.

Hallucination Retrieval

End-to-end Task-oriented Dialogue: A Survey of Tasks, Methods, and Future Directions

no code implementations15 Nov 2023 Libo Qin, Wenbo Pan, Qiguang Chen, Lizi Liao, Zhou Yu, Yue Zhang, Wanxiang Che, Min Li

End-to-end task-oriented dialogue (EToD) can directly generate responses in an end-to-end fashion without modular training, which attracts escalating popularity.

Survey

Few-Shot Recognition and Classification Framework for Jamming Signal: A CGAN-Based Fusion CNN Approach

no code implementations9 Nov 2023 Xuhui Ding, Yue Zhang, Gaoyang Li, Xiaozheng Gao, Neng Ye, Dusit Niyato, Kai Yang

Under intricate environmental conditions, precisely classifying jamming signals is of paramount importance for effectively implementing anti-jamming strategies in communication systems.

Generative Adversarial Network

LLM-enhanced Self-training for Cross-domain Constituency Parsing

1 code implementation5 Nov 2023 Jianling Li, Meishan Zhang, Peiming Guo, Min Zhang, Yue Zhang

Our experimental results demonstrate that self-training for constituency parsing, equipped with an LLM, outperforms traditional methods regardless of the LLM's performance.

Constituency Parsing Language Modeling +2

Constituency Parsing using LLMs

no code implementations30 Oct 2023 Xuefeng Bai, Jialong Wu, Yulong Chen, Zhongqing Wang, Yue Zhang

Constituency parsing is a fundamental yet unsolved natural language processing task.

Constituency Parsing

TRAMS: Training-free Memory Selection for Long-range Language Modeling

1 code implementation24 Oct 2023 Haofei Yu, Cunxiang Wang, Yue Zhang, Wei Bi

The Transformer architecture is crucial for numerous AI models, but it still faces challenges in long-range language modeling.

Language Modeling Language Modelling

Plan, Verify and Switch: Integrated Reasoning with Diverse X-of-Thoughts

1 code implementation23 Oct 2023 Tengxiao Liu, Qipeng Guo, Yuqing Yang, Xiangkun Hu, Yue Zhang, Xipeng Qiu, Zheng Zhang

As large language models (LLMs) have shown effectiveness with different prompting methods, such as Chain of Thought and Program of Thought, we find that these methods are highly complementary to one another on math reasoning tasks (see the control-flow sketch below).

Logical Reasoning Math
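
A sketch of the plan-verify-switch control flow this integration implies; `solve_with` and `verify` are hypothetical stand-ins for prompting an LLM and checking its answer (e.g., by executing a Program-of-Thought's code):

```python
METHODS = ["chain_of_thought", "program_of_thought", "equation_of_thought"]

def solve_with(method: str, problem: str) -> str:
    return f"<answer from {method}>"   # an LLM call would go here

def verify(answer: str, problem: str) -> bool:
    return answer.endswith(">")        # e.g., execute code / check consistency

def solve(problem: str) -> str:
    answer = ""
    for method in METHODS:             # plan a method, then switch on failure
        answer = solve_with(method, problem)
        if verify(answer, problem):    # verify: accept or fall through
            return answer
    return answer                      # last attempt as a fallback

print(solve("If 3x + 2 = 11, what is x?"))
```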

GLoRE: Evaluating Logical Reasoning of Large Language Models

1 code implementation13 Oct 2023 Hanmeng Liu, Zhiyang Teng, Ruoxi Ning, Jian Liu, Qiji Zhou, Yue Zhang

Recently, large language models (LLMs), including notable models such as GPT-4 and burgeoning community models, have showcased significant general language understanding abilities.

Logical Reasoning Natural Language Understanding

RobustGEC: Robust Grammatical Error Correction Against Subtle Context Perturbation

1 code implementation11 Oct 2023 Yue Zhang, Leyang Cui, Enbo Zhao, Wei Bi, Shuming Shi

In this paper, we introduce RobustGEC, a benchmark designed to evaluate the context robustness of GEC systems.

Grammatical Error Correction Sentence