no code implementations • EMNLP (sustainlp) 2021 • Yue Zhang, ChengCheng Hu, Yuqi Liu, Hui Fang, Jimmy Lin
It is well known that rerankers built on pretrained transformer models such as BERT have dramatically improved retrieval effectiveness in many tasks.
1 code implementation • COLING 2022 • Kaixin Wu, Yue Zhang, Bojie Hu, Tong Zhang
Extensive experiments on ten WMT machine translation tasks show that the proposed model yields an average speedup of 1.35x (with almost no decrease in BLEU) over the state-of-the-art inference implementation.
no code implementations • NAACL (ACL) 2022 • Rui Zhang, Yangfeng Ji, Yue Zhang, Rebecca J. Passonneau
We then survey the benefits and best practices of contrastive learning for various downstream NLP applications, including Text Classification, Question Answering, Summarization, Text Generation, Interpretability and Explainability, Commonsense Knowledge and Reasoning, and Vision-and-Language. This tutorial intends to help researchers in the NLP and computational linguistics community understand this emerging topic and promote future research directions of using contrastive learning for NLP applications.
no code implementations • COLING (CogALex) 2020 • Lu Cao, Yulong Chen, Dandan Huang, Yue Zhang
Functional Magnetic Resonance Imaging (fMRI) provides a means to investigate human conceptual representation in cognitive and neuroscience studies, where researchers predict the fMRI activations with elicited stimuli inputs.
no code implementations • CCL 2020 • Meishan Zhang, Yue Zhang
Recent advances of multilingual word representations weaken the input divergences across languages, making cross-lingual transfer similar to the monolingual cross-domain and semi-supervised settings.
no code implementations • CCL 2020 • Shuailong Liang, Derek F. Wong, Yue Zhang
Based on 500,000 tweets crawled from the Twitter platform, posted in different countries and regions between January 22, 2020 and April 30, 2020, we study topics and public opinions related to COVID-19. We find both commonalities and differences in the concerns and views of Twitter users across countries, and sentiment also differs across topics. We find that most tweets contain strong emotions, with tweets expressing love and support being especially common. Overall, people's sentiment gradually became more positive over time.
no code implementations • EMNLP 2020 • Chenhua Chen, Zhiyang Teng, Yue Zhang
Aspect-level sentiment analysis aims to recognize the sentiment polarity of an aspect or a target in a comment.
no code implementations • EMNLP 2020 • Chen Jia, Yuefeng Shi, Qinrong Yang, Yue Zhang
We then integrate the entity information into BERT using Char-Entity-Transformer, which augments the self-attention using a combination of character and entity representations.
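A minimal sketch of that idea, with an assumed fusion-by-projection design (the entity vocabulary, fusion layer, and dimensions are illustrative, not the paper's exact Char-Entity-Transformer):

```python
import torch
import torch.nn as nn

class CharEntityEncoderLayer(nn.Module):
    """Sketch: fuse character and entity embeddings, then apply self-attention.

    The fusion-by-projection and the entity vocabulary size are illustrative
    assumptions, not the paper's exact architecture.
    """
    def __init__(self, hidden=768, n_entities=10000, n_heads=12):
        super().__init__()
        self.entity_emb = nn.Embedding(n_entities, hidden)  # index 0 = "no entity"
        self.fuse = nn.Linear(2 * hidden, hidden)
        self.attn = nn.MultiheadAttention(hidden, n_heads, batch_first=True)

    def forward(self, char_states, entity_ids):
        # char_states: (batch, seq, hidden) from a BERT-style character encoder
        ent = self.entity_emb(entity_ids)                 # (batch, seq, hidden)
        h = self.fuse(torch.cat([char_states, ent], -1))  # combined representation
        out, _ = self.attn(h, h, h)                       # entity-aware self-attention
        return out

layer = CharEntityEncoderLayer()
x = torch.randn(2, 16, 768)
eids = torch.randint(0, 10000, (2, 16))
print(layer(x, eids).shape)  # torch.Size([2, 16, 768])
```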
no code implementations • Findings (NAACL) 2022 • Yue Zhang, Hongliang Fei, Dingcheng Li, Ping Li
Recently, prompt learning has received significant attention, where the downstream tasks are reformulated to the mask-filling task with the help of a textual prompt.
no code implementations • EMNLP 2021 • Sixuan Wu, Jian Li, Peng Zhang, Yue Zhang
Recent research has investigated quantum NLP, designing algorithms that process natural language in quantum computers, and also quantum-inspired algorithms that improve NLP performance on classical computers.
2 code implementations • Findings (ACL) 2022 • Sen yang, Leyang Cui, Ruoxi Ning, Di wu, Yue Zhang
Neural constituency parsers have reached practical performance on news-domain benchmarks.
no code implementations • INLG (ACL) 2021 • Yulong Chen, Yang Liu, Yue Zhang
We propose a shared task on summarizing real-life scenario dialogues, DialogSum Challenge, to encourage researchers to address challenges in dialogue summarization, which has been less studied by the summarization community.
1 code implementation • ACL 2022 • Yue Zhang, Parisa Kordjamshidi
In this paper, we investigate the problem of vision and language navigation.
1 code implementation • ACL 2022 • Chenhua Chen, Zhiyang Teng, Zhongqing Wang, Yue Zhang
Dependency trees have been intensively used with graph neural networks for aspect-based sentiment classification.
1 code implementation • Findings (ACL) 2022 • Yafu Li, Yongjing Yin, Jing Li, Yue Zhang
Neural machine translation (NMT) has achieved significant performance improvements over recent years.
no code implementations • 13 Feb 2025 • Hanmeng Liu, Zhizhang Fu, Mengru Ding, Ruoxi Ning, Chaoli Zhang, Xiaozhang Liu, Yue Zhang
With the emergence of advanced reasoning models like OpenAI o3 and DeepSeek-R1, large language models (LLMs) have demonstrated remarkable reasoning capabilities.
no code implementations • 24 Jan 2025 • Yitong Hao, Enbo He, Yue Zhang, Guisheng Yin
To address this problem, we propose an innovative Bi-directional Curriculum Learning strategy (BCL): in the homogeneity-focused direction, nodes with higher similarity to their neighbors are treated as simple nodes and trained first, while in the heterogeneity-focused direction, nodes with lower similarity to their neighbors are prioritized.
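A minimal sketch of such a bi-directional ordering, assuming cosine similarity between a node's features and the mean of its neighbors' features (the similarity measure and pacing schedule are illustrative assumptions, not the paper's exact BCL):

```python
import numpy as np

def neighbor_similarity(features, adj):
    """Cosine similarity between each node and the mean of its neighbors."""
    deg = adj.sum(1, keepdims=True).clip(min=1)
    nbr_mean = adj @ features / deg
    num = (features * nbr_mean).sum(1)
    den = np.linalg.norm(features, axis=1) * np.linalg.norm(nbr_mean, axis=1) + 1e-12
    return num / den

def bidirectional_curriculum(features, adj):
    """Two easy-to-hard orderings: the homogeneity-focused direction starts
    from nodes most similar to their neighbors; the heterogeneity-focused
    direction starts from the least similar."""
    sim = neighbor_similarity(features, adj)
    homo_order = np.argsort(-sim)   # high similarity first
    hetero_order = np.argsort(sim)  # low similarity first
    return homo_order, hetero_order

adj = np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]], dtype=float)
feats = np.random.randn(3, 8)
print(bidirectional_curriculum(feats, adj))
```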
1 code implementation • 19 Dec 2024 • Yue Zhang, Liqiang Jing, Vibhav Gogate
Additionally, we introduce a reward-driven update optimization method to further enhance the quality of updates generated by multimodal models.
1 code implementation • 17 Dec 2024 • Mingxu Chai, Ziyu Shen, Chong Zhang, Yue Zhang, Xiao Wang, Shihan Dou, Jihua Kang, Jiazheng Zhang, Qi Zhang
Document parsing is essential for analyzing complex document structures and extracting fine-grained information, supporting numerous downstream applications.
no code implementations • 17 Dec 2024 • Yun Luo, Yingjie Li, Xiangkun Hu, Qinglin Qi, Fang Guo, Qipeng Guo, Zheng Zhang, Yue Zhang
As online platforms and recommendation algorithms evolve, people are increasingly trapped in echo chambers, leading to biased understandings of various issues.
1 code implementation • 16 Dec 2024 • Guangsheng Bao, Yanbin Zhao, Juncai He, Yue Zhang
Advanced large language models (LLMs) can generate text almost indistinguishable from human-written text, highlighting the importance of LLM-generated text detection.
1 code implementation • 13 Dec 2024 • Yudong Jiang, Baohan Xu, Siqian Yang, Mingyu Yin, Jing Liu, Chao Xu, Siqi Wang, Yidi Wu, Bingwen Zhu, Xinwen Zhang, Xingyu Zheng, Jixuan Xu, Yue Zhang, Jinlong Hou, Huyang Sun
Animation has gained significant interest in the film and TV industry in recent years.
no code implementations • 10 Dec 2024 • Junkai Yin, Yue Zhang, Zhangsheng Yu
Although the Cox proportional hazards model is well established and extensively used in the analysis of survival data, the proportional hazards (PH) assumption may not always hold in practical scenarios.
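For context, the PH assumption is commonly checked with Schoenfeld-residual tests; a minimal example using the lifelines package (standard background tooling, not the authors' proposed method):

```python
# Standard check of the proportional hazards assumption with lifelines.
from lifelines import CoxPHFitter
from lifelines.datasets import load_rossi
from lifelines.statistics import proportional_hazard_test

df = load_rossi()
cph = CoxPHFitter()
cph.fit(df, duration_col="week", event_col="arrest")

# Schoenfeld-residual-based test: small p-values flag covariates whose
# effect appears to vary over time, i.e., the PH assumption is violated.
results = proportional_hazard_test(cph, df, time_transform="rank")
print(results.summary)
```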
1 code implementation • 5 Dec 2024 • Shicheng Zhou, Jingju Liu, Yuliang Lu, Jiahai Yang, Yue Zhang, Jie Chen
GAP introduces a Real-to-Sim-to-Real pipeline that (a) enables end-to-end policy learning in unknown real environments while constructing realistic simulations; (b) improves agents' generalization ability by leveraging domain randomization and meta-RL. Specifically, we are among the first to apply domain randomization in autonomous pentesting, and we propose a large language model-powered domain randomization method for synthetic environment generation.
no code implementations • 25 Nov 2024 • Jianghao Gong, Peiqi Yan, Yue Zhang, Hongli An, Logan Liu
In the domain of large language models, considerable advancements have been attained in multimodal large language models and explainability research, propelled by continuous technological progress and innovation.
1 code implementation • 21 Nov 2024 • Jianhao Yan, Pingchuan Yan, Yulong Chen, Jing Li, Xianchao Zhu, Yue Zhang
This study presents a comprehensive evaluation of GPT-4's translation capabilities compared to human translators of varying expertise levels.
no code implementations • 12 Nov 2024 • Qingyu Yin, Chak Tou Leong, Hongbo Zhang, Minjun Zhu, Hanqi Yan, Qiang Zhang, Yulan He, Wenjie Li, Jun Wang, Yue Zhang, Linyi Yang
The alignment of large language models (LLMs) with human preferences remains a key challenge.
1 code implementation • 4 Nov 2024 • Tengxiao Liu, Qipeng Guo, Xiangkun Hu, Cheng Jiayang, Yue Zhang, Xipeng Qiu, Zheng Zhang
In this work, we study the ability to skip steps in reasoning - a hallmark of human expertise developed through practice.
1 code implementation • 1 Nov 2024 • Yingwei Ma, Rongyu Cao, Yongchang Cao, Yue Zhang, Jue Chen, Yibo Liu, Yuchen Liu, Binhua Li, Fei Huang, Yongbin Li
The results demonstrate that Lingma SWE-GPT 72B successfully resolves 30.20% of the GitHub issues, marking a significant improvement in automatic issue resolution (22.76% relative improvement compared to Llama 3.1 405B), approaching the performance of closed-source models (31.80% of issues resolved by GPT-4o).
no code implementations • 29 Oct 2024 • Yifan Sun, Yuhang Li, Yue Zhang, Yuchen Jin, huan zhang
Open-source Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language understanding and generation, leading to widespread adoption across various domains.
no code implementations • 28 Oct 2024 • Yixuan Weng, Minjun Zhu, Guangsheng Bao, Hongbo Zhang, Jindong Wang, Yue Zhang, Linyi Yang
In research, the papers generated by the CycleResearcher model achieved a score of 5.36 in simulated peer reviews, surpassing the preprint level of 5.24 from human experts and approaching the accepted paper level of 5.69.
no code implementations • 24 Oct 2024 • Donglin Di, Weinan Zhang, Yue Zhang, Fanglin Wang
Making use of off-the-shelf resources of resource-rich languages to transfer knowledge to low-resource languages has attracted much attention recently.
no code implementations • 24 Oct 2024 • Yingjie Li, Yun Luo, Xiaotian Xie, Yue Zhang
TC encourages LLMs to reason based on both premise and hypothesis, while mitigating the models' over-reliance on individual premise or hypothesis for inference.
1 code implementation • 14 Oct 2024 • Minjun Zhu, Linyi Yang, Yifan Wei, Ningyu Zhang, Yue Zhang
SafetyLock leverages our discovery that fine-tuned models retain similar safety-related activation representations to their base models.
1 code implementation • 14 Oct 2024 • Yue Zhang, Minhao Liu, Zhaokang Chen, Bin Wu, Yubin Zeng, Chao Zhan, Yingjie He, Junxin Huang, Wenjiang Zhou
We propose MuseTalk, which generates lip-sync targets in a latent space encoded by a Variational Autoencoder, enabling high-fidelity talking face video generation with efficient inference.
no code implementations • 14 Oct 2024 • Abdoul Aziz Amadou, Yue Zhang, Sebastien Piat, Paul Klein, Ingo Schmuecking, Tiziano Passerini, Puneet Sharma
Quantitative evaluation of echocardiography is essential for precise assessment of cardiac condition, monitoring disease progression, and guiding treatment decisions.
no code implementations • 13 Oct 2024 • Di wu, Siyuan Li, Chen Feng, Lu Cao, Yue Zhang, Jie Yang, Mohamad Sawan
To address these limitations, we introduce Homogeneity-Heterogeneity Disentangled Learning for neural Representations (H2DiLR), a novel framework that disentangles and learns both the homogeneity and heterogeneity from intracranial recordings across multiple subjects.
1 code implementation • 12 Oct 2024 • Futing Wang, Jianhao Yan, Yue Zhang, Tao Lin
By externally storing and reusing vectors that represent in-context learned capabilities, our method not only demonstrates the potential to operate modular capabilities but also significantly enhances the performance, versatility, adaptability, and scalability of large language models.
no code implementations • 12 Oct 2024 • Jianhao Yan, Futing Wang, Yun Luo, Yafu Li, Yue Zhang
Large language models (LLMs) have revolutionized knowledge storage and retrieval, but face challenges with conflicting and outdated information.
1 code implementation • 5 Oct 2024 • Cheng Jiayang, Chunkit Chan, Qianqian Zhuang, Lin Qiu, Tianhang Zhang, Tengxiao Liu, Yangqiu Song, Yue Zhang, PengFei Liu, Zheng Zhang
The rise of large language models (LLMs) has significantly influenced the quality of information in decision-making systems, leading to the prevalence of AI-generated content and challenges in detecting misinformation and managing conflicting information, or "inter-evidence conflicts."
no code implementations • 4 Oct 2024 • Yue Zhang, Zhiyang Xu, Ying Shen, Parisa Kordjamshidi, Lifu Huang
2) the architectures of existing 3D-based LLMs lack explicit alignment between the spatial representations of 3D scenes and natural language, limiting their performance in tasks requiring precise spatial reasoning.
1 code implementation • 29 Sep 2024 • Chong Zhang, Yi Tu, Yixi Zhao, Chenshu Yuan, Huan Chen, Yue Zhang, Mingxu Chai, Ya Guo, Huijia Zhu, Qi Zhang, Tao Gui
However, we argue that this formulation does not adequately convey the complete reading order information in the layout, which may potentially lead to performance decline in downstream VrD tasks.
Ranked #1 on Key Information Extraction on CORD
1 code implementation • 24 Sep 2024 • Yue Chang, Liqiang Jing, Xiaopeng Zhang, Yue Zhang
To mitigate hallucination, current studies focus either on the process of model inference or on the results of model generation, but the solutions they design sometimes fail to deal appropriately with different types of queries and with the hallucinations that generations about these queries exhibit.
no code implementations • 18 Sep 2024 • KeJia Chen, Zheng Shen, Yue Zhang, Lingyun Chen, Fan Wu, Zhenshan Bing, Sami Haddadin, Alois Knoll
To enhance the validity of LLM-generated plans, visual demonstrations and online videos have been widely employed to guide the planning process.
no code implementations • 17 Sep 2024 • Yongjing Yin, Junran Ding, Kai Song, Yue Zhang
In this paper, we introduce Semformer, a novel method of training a Transformer language model that explicitly models the semantic planning of responses.
2 code implementations • 11 Sep 2024 • Yu Zhang, Songlin Yang, Ruijie Zhu, Yue Zhang, Leyang Cui, Yiqiao Wang, Bolun Wang, Freda Shi, Bailin Wang, Wei Bi, Peng Zhou, Guohong Fu
Linear attention Transformers and their gated variants, celebrated for enabling parallel training and efficient recurrent inference, still fall short in recall-intensive tasks compared to traditional Transformers and demand significant resources for training from scratch.
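For background, gated variants of linear attention maintain a matrix-valued state that is decayed by a learned gate, which is what enables O(1)-per-token recurrent inference; a minimal recurrent sketch (illustrative of the family, not this paper's exact gated slot attention):

```python
import torch

def gated_linear_attention(q, k, v, g):
    """Recurrent view of gated linear attention: the state S is decayed by a
    per-dimension gate and updated with an outer product each step.
    Shapes: q, k: (T, d_k); v: (T, d_v); g: (T, d_k) with entries in (0, 1)."""
    T, d_k = q.shape
    d_v = v.shape[1]
    S = torch.zeros(d_k, d_v)
    outs = []
    for t in range(T):
        S = g[t].unsqueeze(1) * S + torch.outer(k[t], v[t])  # gated state update
        outs.append(q[t] @ S)                                # read-out for token t
    return torch.stack(outs)

T, d_k, d_v = 8, 16, 32
q, k = torch.randn(T, d_k), torch.randn(T, d_k)
v = torch.randn(T, d_v)
g = torch.sigmoid(torch.randn(T, d_k))
print(gated_linear_attention(q, k, v, g).shape)  # torch.Size([8, 32])
```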
no code implementations • 8 Sep 2024 • Jiahua Dong, Yue Zhang, Qiuli Wang, Ruofeng Tong, Shihong Ying, Shaolin Gong, Xuanpu Zhang, Lanfen Lin, Yen-Wei Chen, S. Kevin Zhou
To achieve this, we devise a gaussian mixture model-based label filtering module that distinguishes noisy labels from clean labels.
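The filtering idea can be sketched with a two-component Gaussian mixture over per-sample losses, a common recipe for separating clean from noisy labels (the paper's exact module may differ):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def split_clean_noisy(per_sample_loss):
    """Fit a 2-component GMM to per-sample losses and treat samples assigned
    to the lower-mean component as clean (low loss ~ clean label)."""
    losses = np.asarray(per_sample_loss).reshape(-1, 1)
    gmm = GaussianMixture(n_components=2, random_state=0).fit(losses)
    clean_comp = int(np.argmin(gmm.means_.ravel()))
    prob_clean = gmm.predict_proba(losses)[:, clean_comp]
    return prob_clean > 0.5

losses = np.concatenate([np.random.normal(0.2, 0.05, 90),   # mostly clean
                         np.random.normal(1.5, 0.3, 10)])   # noisy outliers
print(split_clean_noisy(losses).sum(), "samples kept as clean")
```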
1 code implementation • 21 Aug 2024 • Minjun Zhu, Linyi Yang, Yue Zhang
This dataset allows us to quantitatively evaluate the extent to which LLMs can align with each subject's behavioral patterns.
no code implementations • 21 Aug 2024 • Yaoze Zhang, Yuming Zhang, Yu Zhao, Yue Zhang, Feiyu Zhu
Existing knowledge distillation methods focus on designing different distillation targets to acquire knowledge from teacher models.
1 code implementation • 21 Aug 2024 • Xuanwang Zhang, Yunze Song, Yidong Wang, Shuyun Tang, Xinfeng Li, Zhengran Zeng, Zhen Wu, Wei Ye, Wenyuan Xu, Yue Zhang, Xinyu Dai, Shikun Zhang, Qingsong Wen
Leveraging RAGLAB, we conduct a fair comparison of 6 RAG algorithms across 10 benchmarks.
1 code implementation • 19 Aug 2024 • Yue Zhang, Parisa Kordjamshidi
First, VLN-CE agents that discretize the visual environment are primarily trained with high-level view selection, which causes them to ignore crucial spatial reasoning within the low-level action movements.
1 code implementation • 16 Aug 2024 • Yulong Chen, Yang Liu, Jianhao Yan, Xuefeng Bai, Ming Zhong, Yinghao Yang, ZiYi Yang, Chenguang Zhu, Yue Zhang
We then build a benchmark, SC-G4, consisting of 1,835 instances generated by GPT-4 using these patterns, with human-annotated gold responses.
1 code implementation • 15 Aug 2024 • Dongyu Ru, Lin Qiu, Xiangkun Hu, Tianhang Zhang, Peng Shi, Shuaichen Chang, Cheng Jiayang, Cunxiang Wang, Shichao Sun, Huanyu Li, Zizhao Zhang, Binjie Wang, Jiarong Jiang, Tong He, Zhiguo Wang, PengFei Liu, Yue Zhang, Zheng Zhang
Despite Retrieval-Augmented Generation (RAG) showing promising capability in leveraging external knowledge, a comprehensive evaluation of RAG systems is still challenging due to the modular nature of RAG, evaluation of long-form responses and reliability of measurements.
no code implementations • 6 Aug 2024 • Lei Shi, Zhimeng Liu, Yi Yang, Weize Wu, Yuyang Zhang, Hongbo Zhang, Jing Lin, Siyu Wu, Zihan Chen, Ruiming Li, Nan Wang, Zipeng Liu, Huobin Tan, Hongyi Gao, Yue Zhang, Ge Wang
The extraction of Metal-Organic Frameworks (MOFs) synthesis conditions from literature text has been challenging but crucial for the logical design of new MOFs with desirable functionality.
1 code implementation • 1 Aug 2024 • Yu Xie, Qian Qiao, Jun Gao, Tianxiang Wu, Jiaqing Fan, Yue Zhang, Jielei Zhang, Huyang Sun
Unfortunately, this denoising training method cannot be directly applied to text spotting tasks, since text spotting requires irregular-shape detection and text recognition tasks that are more complex than classification.
1 code implementation • 26 Jul 2024 • Bo wang, Shaocong Wang, Ning Lin, Yi Li, Yifei Yu, Yue Zhang, Jichang Yang, Xiaoshan Wu, Yangu He, Songqi Wang, Rui Chen, Guoqi Li, Xiaojuan Qi, Zhongrui Wang, Dashan Shang
To address these fundamental challenges, we introduce pruning optimization for input-aware dynamic memristive spiking neural network (PRIME).
no code implementations • 12 Jul 2024 • Yue Zhang, Woyu Zhang, Shaocong Wang, Ning Lin, Yifei Yu, Yangu He, Bo wang, Hao Jiang, Peng Lin, Xiaoxin Xu, Xiaojuan Qi, Zhongrui Wang, Xumeng Zhang, Dashan Shang, Qi Liu, Kwang-Ting Cheng, Ming Liu
In contrast, AI models are static, unable to associate inputs with past experiences, and run on digital computers with physically separated memory and processing.
1 code implementation • 9 Jul 2024 • Yue Zhang, Ziqiao Ma, Jialu Li, Yanyuan Qiao, Zun Wang, Joyce Chai, Qi Wu, Mohit Bansal, Parisa Kordjamshidi
Vision-and-Language Navigation (VLN) has gained increasing attention over recent years, and many approaches have emerged to advance its development.
no code implementations • 4 Jul 2024 • Jianhao Yan, Pingchuan Yan, Yulong Chen, Judy Li, Xianchao Zhu, Yue Zhang
This study comprehensively evaluates the translation quality of Large Language Models (LLMs), specifically GPT-4, against human translators of varying expertise levels across multiple language pairs and domains.
no code implementations • 4 Jul 2024 • Litton Jose Kurisinkel, Pruthwik Mishra, Yue Zhang
Specifically, we leverage the intuition of large language models about future changes to update real-valued time series predictions.
no code implementations • 2 Jul 2024 • Minghui Wu, Luzhen Xu, Jie Zhang, Haitao Tang, Yanyan Yue, Ruizhi Liao, Jintao Zhao, Zhengzhe Zhang, Yichi Wang, Haoyin Yan, Hongliang Yu, Tongle Ma, Jiachen Liu, Chongliang Wu, Yongchao Li, Yanyong Zhang, Xin Fang, Yue Zhang
This report describes the submitted system to the In-Car Multi-Channel Automatic Speech Recognition (ICMC-ASR) challenge, which considers the ASR task with multi-speaker overlapping and Mandarin accent dynamics in the ICMC case.
1 code implementation • 29 Jun 2024 • Zhiyuan Wang, Jinhao Duan, Lu Cheng, Yue Zhang, Qingni Wang, Xiaoshuang Shi, Kaidi Xu, HengTao Shen, Xiaofeng Zhu
Uncertainty quantification (UQ) in natural language generation (NLG) tasks remains an open challenge, exacerbated by the closed-source nature of the latest large language models (LLMs).
1 code implementation • 18 Jun 2024 • Ziqi Zhang, Cunxiang Wang, Xiong Xiao, Yue Zhang, Donglin Wang
However, placing LLMs into specific roles may reduce their reasoning diversity and performance on a few tasks where role dependence is low.
1 code implementation • 10 Jun 2024 • Yidong Wang, Qi Guo, Wenjin Yao, Hongbo Zhang, Xin Zhang, Zhen Wu, Meishan Zhang, Xinyu Dai, Min Zhang, Qingsong Wen, Wei Ye, Shikun Zhang, Yue Zhang
This paper introduces AutoSurvey, a speedy and well-organized methodology for automating the creation of comprehensive literature surveys in rapidly evolving fields like artificial intelligence.
1 code implementation • 4 Jun 2024 • Qingkai Min, Qipeng Guo, Xiangkun Hu, Songfang Huang, Zheng Zhang, Yue Zhang
Experimental results demonstrate that our approach surpasses the performance of both the large and small language models individually, forming a complementary advantage.
no code implementations • 4 Jun 2024 • Mengru Ding, Hanmeng Liu, Zhizhang Fu, Jian Song, WenBo Xie, Yue Zhang
We propose the integration of human-like heuristics and shortcuts into language models (LMs) through "break the chain" strategies.
1 code implementation • 3 Jun 2024 • Yongjing Yin, Jiali Zeng, Yafu Li, Fandong Meng, Yue Zhang
The fine-tuning of open-source large language models (LLMs) for machine translation has recently received considerable attention, marking a shift towards data-centric research from traditional neural machine translation.
no code implementations • 24 May 2024 • Yue Zhang, Hehe Fan, Yi Yang
To bridge the gap between vision and language modalities, Multimodal Large Language Models (MLLMs) usually learn an adapter that converts visual inputs to understandable tokens for Large Language Models (LLMs).
2 code implementations • 23 May 2024 • Xiangkun Hu, Dongyu Ru, Lin Qiu, Qipeng Guo, Tianhang Zhang, Yang Xu, Yun Luo, PengFei Liu, Yue Zhang, Zheng Zhang
In RefChecker, an extractor generates claim-triplets from a response, which are then evaluated by a checker against a reference.
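The extract-then-check pattern can be sketched as follows; `llm` is a hypothetical text-completion callable, and the prompts and aggregation rule are illustrative assumptions rather than RefChecker's exact ones:

```python
from typing import Callable, List, Tuple

def extract_triplets(llm: Callable[[str], str], response: str) -> List[Tuple[str, str, str]]:
    raw = llm(f"List the factual claims in the text below as "
              f"(subject, relation, object) triplets, one per line:\n{response}")
    triplets = []
    for line in raw.splitlines():
        parts = [p.strip(" ()") for p in line.split(",")]
        if len(parts) == 3:
            triplets.append(tuple(parts))
    return triplets

def check_triplet(llm, triplet, reference: str) -> str:
    verdict = llm(f"Reference:\n{reference}\n\nClaim: {triplet}\n"
                  f"Answer one word, Entailment, Neutral, or Contradiction:")
    return verdict.strip().split()[0]

def hallucination_rate(llm, response: str, reference: str) -> float:
    # fraction of extracted claim-triplets the checker contradicts
    triplets = extract_triplets(llm, response)
    if not triplets:
        return 0.0
    bad = sum(check_triplet(llm, t, reference) == "Contradiction" for t in triplets)
    return bad / len(triplets)
```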
no code implementations • 22 May 2024 • Qiji Zhou, Ruochen Zhou, Zike Hu, Panzhong Lu, Siyang Gao, Yue Zhang
Recent advancements in Chain-of-Thought (CoT) and related rationale-based works have significantly improved the performance of Large Language Models (LLMs) in complex reasoning tasks.
Ranked #8 on Visual Question Answering on MM-Vet
1 code implementation • 21 May 2024 • Yafu Li, Zhilin Wang, Leyang Cui, Wei Bi, Shuming Shi, Yue Zhang
To this end, we propose a novel detection framework, paraphrased text span detection (PTD), aiming to identify paraphrased text spans within a text.
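Framed as token-level tagging, a PTD baseline might look like the following sketch (an assumption about the setup, not the paper's exact model; the classification head below is untrained and shown only for shape and API):

```python
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

tok = AutoTokenizer.from_pretrained("roberta-base")
# num_labels=2: label 1 marks tokens inside a paraphrased span.
# The head is randomly initialized; predictions are meaningless until fine-tuned.
model = AutoModelForTokenClassification.from_pretrained("roberta-base", num_labels=2)

text = "The quick brown fox. It leapt over a sleepy canine."
enc = tok(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**enc).logits            # (1, seq_len, 2)
pred = logits.argmax(-1)[0]
spans = [tok.decode(enc.input_ids[0][i]) for i in range(len(pred)) if pred[i] == 1]
print(spans)
```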
1 code implementation • 21 May 2024 • Yafu Li, Huajian Zhang, Jianhao Yan, Yongjing Yin, Yue Zhang
Recent advances have made non-autoregressive (NAT) translation comparable to autoregressive methods (AT).
no code implementations • 8 May 2024 • Irene Alisjahbana, Jiawei Li, Ben Strong, Yue Zhang
Satellite imagery has played an increasingly important role in post-disaster building damage assessment.
no code implementations • 6 May 2024 • Hang Yuan, Zhongyue Che, Shao Li, Yue Zhang, Xiaomeng Hu, Siyang Luo
However, to ensure that artificial intelligence models benefit human society, we must first fully understand both the similarities and differences between the human-like characteristics exhibited by AI models and those of real humans, and the cultural stereotypes and biases that AI models may exhibit when interacting with humans.
no code implementations • 28 Apr 2024 • Hanmeng Liu, Zhiyang Teng, Chaoli Zhang, Yue Zhang
Chain-of-Thought (CoT) prompting has emerged as a pivotal technique for augmenting the inferential capabilities of language models during reasoning tasks.
no code implementations • 25 Apr 2024 • Runzhe Zhan, Xinyi Yang, Derek F. Wong, Lidia S. Chao, Yue Zhang
While supervised fine-tuning (SFT) has been a straightforward approach for tailoring the output of foundation large language model (LLM) to specific preferences, concerns have been raised about the depth of this alignment, with some critiques suggesting it is merely "superficial".
no code implementations • 22 Apr 2024 • Marah Abdin, Jyoti Aneja, Hany Awadalla, Ahmed Awadallah, Ammar Ahmad Awan, Nguyen Bach, Amit Bahree, Arash Bakhtiari, Jianmin Bao, Harkirat Behl, Alon Benhaim, Misha Bilenko, Johan Bjorck, Sébastien Bubeck, Martin Cai, Qin Cai, Vishrav Chaudhary, Dong Chen, Dongdong Chen, Weizhu Chen, Yen-Chun Chen, Yi-Ling Chen, Hao Cheng, Parul Chopra, Xiyang Dai, Matthew Dixon, Ronen Eldan, Victor Fragoso, Jianfeng Gao, Mei Gao, Min Gao, Amit Garg, Allie Del Giorno, Abhishek Goswami, Suriya Gunasekar, Emman Haider, Junheng Hao, Russell J. Hewett, Wenxiang Hu, Jamie Huynh, Dan Iter, Sam Ade Jacobs, Mojan Javaheripi, Xin Jin, Nikos Karampatziakis, Piero Kauffmann, Mahoud Khademi, Dongwoo Kim, Young Jin Kim, Lev Kurilenko, James R. Lee, Yin Tat Lee, Yuanzhi Li, Yunsheng Li, Chen Liang, Lars Liden, Xihui Lin, Zeqi Lin, Ce Liu, Liyuan Liu, Mengchen Liu, Weishung Liu, Xiaodong Liu, Chong Luo, Piyush Madan, Ali Mahmoudzadeh, David Majercak, Matt Mazzola, Caio César Teodoro Mendes, Arindam Mitra, Hardik Modi, Anh Nguyen, Brandon Norick, Barun Patra, Daniel Perez-Becker, Thomas Portet, Reid Pryzant, Heyang Qin, Marko Radmilac, Liliang Ren, Gustavo de Rosa, Corby Rosset, Sambudha Roy, Olatunji Ruwase, Olli Saarikivi, Amin Saied, Adil Salim, Michael Santacroce, Shital Shah, Ning Shang, Hiteshi Sharma, Yelong Shen, Swadheen Shukla, Xia Song, Masahiro Tanaka, Andrea Tupini, Praneetha Vaddamanu, Chunyu Wang, Guanhua Wang, Lijuan Wang, Shuohang Wang, Xin Wang, Yu Wang, Rachel Ward, Wen Wen, Philipp Witte, Haiping Wu, Xiaoxia Wu, Michael Wyatt, Bin Xiao, Can Xu, Jiahang Xu, Weijian Xu, Jilong Xue, Sonali Yadav, Fan Yang, Jianwei Yang, Yifan Yang, ZiYi Yang, Donghan Yu, Lu Yuan, Chenruidong Zhang, Cyril Zhang, Jianwen Zhang, Li Lyna Zhang, Yi Zhang, Yue Zhang, Yunan Zhang, Xiren Zhou
We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens, whose overall performance, as measured by both academic benchmarks and internal testing, rivals that of models such as Mixtral 8x7B and GPT-3.5 (e.g., phi-3-mini achieves 69% on MMLU and 8.38 on MT-bench), despite being small enough to be deployed on a phone.
Ranked #5 on MMR total on MRR-Benchmark (using extra training data)
1 code implementation • 18 Apr 2024 • Fang Guo, Wenyu Li, Honglei Zhuang, Yun Luo, Yafu Li, Qi Zhu, Le Yan, Yue Zhang
The most recent pointwise Large Language Model (LLM) rankers have achieved remarkable ranking results.
no code implementations • 15 Apr 2024 • Yifei Yu, Shaocong Wang, Woyu Zhang, Xinyuan Zhang, Xiuzhe Wu, Yangu He, Jichang Yang, Yue Zhang, Ning Lin, Bo wang, Xi Chen, Songqi Wang, Xumeng Zhang, Xiaojuan Qi, Zhongrui Wang, Dashan Shang, Qi Liu, Kwang-Ting Cheng, Ming Liu
The GE harnesses the intrinsic stochasticity of resistive memory for efficient input encoding, while the PE achieves precise weight mapping through a Hardware-Aware Quantization (HAQ) circuit.
no code implementations • 10 Apr 2024 • Yongqiang Ma, Lizhi Qing, Jiawei Liu, Yangyang Kang, Yue Zhang, Wei Lu, Xiaozhong Liu, Qikai Cheng
Therefore, our study shifts the focus from model-centered to human-centered evaluation in the context of AI-powered writing assistance applications.
1 code implementation • 9 Apr 2024 • Zhuohao Yu, Chang Gao, Wenjin Yao, Yidong Wang, Zhengran Zeng, Wei Ye, Jindong Wang, Yue Zhang, Shikun Zhang
The rapid development of large language model (LLM) evaluation methodologies and datasets has led to a profound challenge: integrating state-of-the-art evaluation techniques cost-effectively while ensuring reliability, reproducibility, and efficiency.
no code implementations • 4 Apr 2024 • Jiawei Li, Yue Zhang
We conclude that the BERT base model can be improved by incorporating these features.
1 code implementation • 2 Apr 2024 • Bowen Ding, Qingkai Min, Shengkun Ma, Yingjie Li, Linyi Yang, Yue Zhang
Based on Pre-trained Language Models (PLMs), event coreference resolution (ECR) systems have demonstrated outstanding performance in clustering coreferential events across documents.
no code implementations • 31 Mar 2024 • Yue Zhang, Yuntian He, Saket Gurukar, Srinivasan Parthasarathy
To address this issue, we propose a Multi-Level Embedding framework for nodes on a heterogeneous graph (HeteroMILE), a generic methodology that allows contemporary graph embedding methods to scale to large graphs.
1 code implementation • 18 Mar 2024 • Cunxiang Wang, Ruoxi Ning, Boqi Pan, Tonghui Wu, Qipeng Guo, Cheng Deng, Guangsheng Bao, Xiangkun Hu, Zheng Zhang, Qian Wang, Yue Zhang
The rapid advancement of Large Language Models (LLMs) has introduced a new frontier in natural language processing, particularly in understanding and processing long-context information.
1 code implementation • 13 Mar 2024 • Rongwu Xu, Zehan Qi, Zhijiang Guo, Cunxiang Wang, Hongru Wang, Yue Zhang, Wei Xu
This survey provides an in-depth analysis of knowledge conflicts for large language models (LLMs), highlighting the complex challenges they encounter when blending contextual and parametric knowledge.
no code implementations • 8 Mar 2024 • Ziqi Gao, Yue Zhang, Xinwen Liu, Kaiyan Li, S. Kevin Zhou
Multi-contrast (MC) Magnetic Resonance Imaging (MRI) reconstruction aims to incorporate a reference image of auxiliary modality to guide the reconstruction process of the target modality.
no code implementations • 3 Mar 2024 • Mieradilijiang Maimaiti, Yuanhang Zheng, Ji Zhang, Fei Huang, Yue Zhang, Wenpei Luo, Kaiyu Huang
Semantic Retrieval (SR) has become an indispensable part of the FAQ system in the task-oriented question-answering (QA) dialogue scenario.
1 code implementation • 25 Feb 2024 • Guangsheng Bao, Hongbo Zhang, Cunxiang Wang, Linyi Yang, Yue Zhang
Chain-of-thought emerges as a promising technique for eliciting reasoning capabilities from Large Language Models (LLMs).
2 code implementations • 23 Feb 2024 • Zhuohao Yu, Chang Gao, Wenjin Yao, Yidong Wang, Wei Ye, Jindong Wang, Xing Xie, Yue Zhang, Shikun Zhang
Automatic evaluation methods for large language models (LLMs) are hindered by data contamination, leading to inflated assessments of their effectiveness.
no code implementations • 22 Feb 2024 • Zhiyuan Wang, Jinhao Duan, Chenxi Yuan, Qingyu Chen, Tianlong Chen, Yue Zhang, Ren Wang, Xiaoshuang Shi, Kaidi Xu
Uncertainty estimation is crucial for the reliability of safety-critical human and artificial intelligence (AI) interaction systems, particularly in the domain of healthcare engineering.
no code implementations • 21 Feb 2024 • Jianhao Yan, Futing Wang, Yafu Li, Yue Zhang
Large language models (LLMs) trained on vast corpora suffer from inevitable stereotype biases.
1 code implementation • 21 Feb 2024 • Jianhao Yan, Yun Luo, Yue Zhang
The application scope of large language models (LLMs) is increasingly expanding.
no code implementations • 20 Feb 2024 • Hanchen Xia, Feng Jiang, Naihao Deng, Cunxiang Wang, Guojiang Zhao, Rada Mihalcea, Yue Zhang
Large Language Models (LLMs) have demonstrated strong performance on various tasks.
no code implementations • 19 Feb 2024 • Naihao Deng, Zhenjie Sun, Ruiqi He, Aman Sikka, Yulong Chen, Lin Ma, Yue Zhang, Rada Mihalcea
In this paper, we investigate the effectiveness of various LLMs in interpreting tabular data through different prompting strategies and data formats.
no code implementations • 19 Feb 2024 • Jian Wu, Linyi Yang, Zhen Wang, Manabu Okumura, Yue Zhang
Although previous counterfactual QA benchmarks can separate the internal memory of LLMs, they focus solely on final QA performance, which is insufficient for reporting LLMs' real reasoning abilities.
no code implementations • 18 Feb 2024 • Yue Zhang, Jingxuan Zuo, Liqiang Jing
To evaluate the factuality of multimodal summarization models, we propose two fine-grained and explainable evaluation frameworks (FALLACIOUS) for different application scenarios, i.e., a reference-based factuality evaluation framework and a reference-free factuality evaluation framework.
1 code implementation • 4 Feb 2024 • Yue Zhang, Quan Guo, Parisa Kordjamshidi
The hint generator assists the navigation agent in developing a global understanding of the visual environment.
1 code implementation • 31 Jan 2024 • Yue Zhang, Ben Colman, Xiao Guo, Ali Shahriyari, Gaurav Bharaj
To address these challenges, we frame deepfake detection as a Deepfake Detection VQA (DD-VQA) task and model human intuition by providing textual explanations that describe common sense reasons for labeling an image as real or fake.
1 code implementation • 22 Jan 2024 • Li Lin, Neeraj Gupta, Yue Zhang, Hainan Ren, Chun-Hao Liu, Feng Ding, Xin Wang, Xin Li, Luisa Verdoliva, Shu Hu
The rapid advancement of Large AI Models (LAIMs), particularly diffusion models and large language models, has marked a new era where AI-generated multimedia is increasingly integrated into various aspects of daily life.
no code implementations • 6 Jan 2024 • Kaiyan Li, Jingyuan Yang, Wenxuan Liang, Xingde Li, Chenxi Zhang, Lulu Chen, Chan Wu, Xiao Zhang, Zhiyan Xu, Yuelin Wang, Lihui Meng, Yue Zhang, Youxin Chen, S. Kevin Zhou
Optical coherence tomography (OCT) is a noninvasive technology that enables real-time imaging of tissue microanatomies.
no code implementations • 3 Jan 2024 • Enbo He, Yitong Hao, Yue Zhang, Guisheng Yin, Lina Yao
Moreover, the representations of normal entities can easily be perturbed by the noisy relationships introduced by anomalous nodes.
3 code implementations • 27 Dec 2023 • Zijie Yang, Yongjing Yin, Chaojun Kong, Tiange Chi, Wufan Tao, Yue Zhang, Tian Xu
Natural Medicinal Materials (NMMs) have a long history of global clinical applications and a wealth of records and knowledge.
1 code implementation • 26 Dec 2023 • Linyi Yang, Shuibai Zhang, Zhuohao Yu, Guangsheng Bao, Yidong Wang, Jindong Wang, Ruochen Xu, Wei Ye, Xing Xie, Weizhu Chen, Yue Zhang
Large Language Models (LLMs) exhibit emerging in-context learning abilities through prompt engineering.
2 code implementations • 25 Dec 2023 • Yue Zhang, Leyang Cui, Wei Bi, Shuming Shi
Experimental results on both discrimination-based and generation-based hallucination evaluation benchmarks, such as TruthfulQA and FActScore, demonstrate that our proposed ICD methods can effectively enhance the factuality of LLMs across various model sizes and families.
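The core contrast operation of induce-then-contrast decoding can be sketched in a single decoding step (the weighting scheme and alpha value are illustrative assumptions, not the paper's exact formulation):

```python
import torch

def contrastive_decode_step(base_logits, weak_logits, alpha=1.0):
    """Penalize tokens that a deliberately hallucination-prone model prefers,
    amplifying the factual signal of the base model."""
    return (1 + alpha) * base_logits - alpha * weak_logits

base = torch.tensor([2.0, 1.0, 0.5])   # base LLM next-token logits
weak = torch.tensor([2.5, 0.2, 0.1])   # logits of the induced "hallucinator"
print(torch.softmax(contrastive_decode_step(base, weak), dim=-1))
```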
no code implementations • 14 Dec 2023 • Shaocong Wang, Yizhao Gao, Yi Li, Woyu Zhang, Yifei Yu, Bo wang, Ning Lin, Hegan Chen, Yue Zhang, Yang Jiang, Dingchen Wang, Jia Chen, Peng Dai, Hao Jiang, Peng Lin, Xumeng Zhang, Xiaojuan Qi, Xiaoxin Xu, Hayden So, Zhongrui Wang, Dashan Shang, Qi Liu, Kwang-Ting Cheng, Ming Liu
Our random resistive memory-based deep extreme point learning machine may pave the way for energy-efficient and training-friendly edge AI across various data modalities and tasks.
no code implementations • 12 Dec 2023 • Yue Zhang, Ming Zhang, Haipeng Yuan, Shichun Liu, Yongyao Shi, Tao Gui, Qi Zhang, Xuanjing Huang
The three crucial questions for LLM evaluation are "what, where, and how to evaluate".
no code implementations • 4 Dec 2023 • Yifan Yao, Jinhao Duan, Kaidi Xu, Yuanfang Cai, Zhibo Sun, Yue Zhang
In the meantime, LLMs have also gained traction in the security community, revealing security vulnerabilities and showcasing their potential in security-related tasks.
1 code implementation • 22 Nov 2023 • Tianhang Zhang, Lin Qiu, Qipeng Guo, Cheng Deng, Yue Zhang, Zheng Zhang, Chenghu Zhou, Xinbing Wang, Luoyi Fu
Large Language Models (LLMs) have gained significant popularity for their impressive performance across diverse fields.
no code implementations • 15 Nov 2023 • Libo Qin, Wenbo Pan, Qiguang Chen, Lizi Liao, Zhou Yu, Yue Zhang, Wanxiang Che, Min Li
End-to-end task-oriented dialogue (EToD) can directly generate responses in an end-to-end fashion without modular training, and has attracted growing interest.
no code implementations • 11 Nov 2023 • Yue Zhang
Developmental bias plays a major role in phenotypic evolution.
no code implementations • 9 Nov 2023 • Xuhui Ding, Yue Zhang, Gaoyang Li, Xiaozheng Gao, Neng Ye, Dusit Niyato, Kai Yang
Subject to intricate environmental variables, the precise classification of jamming signals holds paramount significance in the effective implementation of anti-jamming strategies within communication systems.
1 code implementation • 5 Nov 2023 • Jianling Li, Meishan Zhang, Peiming Guo, Min Zhang, Yue Zhang
Our experimental results demonstrate that self-training for constituency parsing, equipped with an LLM, outperforms traditional methods regardless of the LLM's performance.
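A generic version of such a self-training loop with LLM-based filtering (the names `parser` and `llm_score` and the threshold are hypothetical, not the paper's exact procedure):

```python
def self_train(parser, llm_score, labeled, unlabeled, rounds=3, threshold=0.8):
    """Iteratively retrain a parser on gold data plus LLM-vetted pseudo-parses."""
    data = list(labeled)
    for _ in range(rounds):
        parser.train(data)
        pseudo = []
        for sent in unlabeled:
            tree = parser.parse(sent)
            # keep only pseudo-parses the LLM judges to be high quality
            if llm_score(sent, tree) >= threshold:
                pseudo.append((sent, tree))
        data = list(labeled) + pseudo
    return parser
```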
1 code implementation • 30 Oct 2023 • Chiyu Song, Zhanchao Zhou, Jianhao Yan, Yuejiao Fei, Zhenzhong Lan, Yue Zhang
Instruction tuning is a burgeoning method to elicit the general intelligence of Large Language Models (LLMs).
no code implementations • 30 Oct 2023 • Xuefeng Bai, Jialong Wu, Yulong Chen, Zhongqing Wang, Yue Zhang
Constituency parsing is a fundamental yet unsolved natural language processing task.
1 code implementation • 24 Oct 2023 • Haofei Yu, Cunxiang Wang, Yue Zhang, Wei Bi
The Transformer architecture is crucial for numerous AI models, but it still faces challenges in long-range language modeling.
1 code implementation • 23 Oct 2023 • Tengxiao Liu, Qipeng Guo, Yuqing Yang, Xiangkun Hu, Yue Zhang, Xipeng Qiu, Zheng Zhang
As large language models (LLMs) have shown effectiveness with different prompting methods, such as Chain of Thought and Program of Thought, we find that these methods complement each other well on math reasoning tasks.
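One simple way to exploit that complementarity is to fall back from one prompting method to another on failure (a sketch of the general idea, not the paper's exact switching strategy; `llm` is a hypothetical completion function):

```python
def solve(llm, question):
    """Try Program-of-Thought first; fall back to Chain-of-Thought on failure."""
    program = llm(f"Write Python that stores the numeric answer in a "
                  f"variable named answer:\n{question}")
    try:
        scope = {}
        exec(program, scope)          # caution: executes LLM output; sandbox in practice
        return scope.get("answer")
    except Exception:
        cot = llm(f"Let's think step by step.\n{question}")
        return cot.strip().splitlines()[-1]   # last line as the CoT answer
```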
1 code implementation • 19 Oct 2023 • Cheng Jiayang, Lin Qiu, Tsz Ho Chan, Tianqing Fang, Weiqi Wang, Chunkit Chan, Dongyu Ru, Qipeng Guo, Hongming Zhang, Yangqiu Song, Yue Zhang, Zheng Zhang
Analogy-making between narratives is crucial for human reasoning.
1 code implementation • 13 Oct 2023 • Hanmeng Liu, Zhiyang Teng, Ruoxi Ning, Jian Liu, Qiji Zhou, Yue Zhang
Recently, large language models (LLMs), including notable models such as GPT-4 and burgeoning community models, have showcased significant general language understanding abilities.
1 code implementation • 11 Oct 2023 • Yue Zhang, Leyang Cui, Enbo Zhao, Wei Bi, Shuming Shi
In this paper, we introduce RobustGEC, a benchmark designed to evaluate the context robustness of GEC systems.