1 code implementation • ECCV 2020 • Xuefeng Hu, Zhihan Zhang, Zhenye Jiang, Syomantak Chaudhuri, Zhenheng Yang, Ram Nevatia
Tehchniques for manipulating images are advancing rapidly; while these are helpful for many useful tasks, they also pose a threat to society with their ability to create believable misinformation.
Ranked #7 on
Image Manipulation Detection
on CocoGlide
1 code implementation • 9 Jun 2025 • Ken Gu, Zhihan Zhang, Kate Lin, Yuwei Zhang, Akshay Paruchuri, Hong Yu, Mehran Kazemi, Kumar Ayush, A. Ali Heydari, Maxwell A. Xu, Girish Narayanswamy, Yun Liu, Ming-Zher Poh, Yuzhe Yang, Mark Malhotra, Shwetak Patel, Hamid Palangi, Xuhai Xu, Daniel McDuff, Tim Althoff, Xin Liu
Language models (LMs) are increasingly being deployed to perform autonomous data analyses.
no code implementations • 4 Jun 2025 • Zhaoxuan Tan, Zheng Li, Tianyi Liu, Haodong Wang, Hyokun Yun, Ming Zeng, Pei Chen, Zhihan Zhang, Yifan Gao, Ruijie Wang, Priyanka Nigam, Bing Yin, Meng Jiang
Learning from preference feedback is essential for aligning large language models (LLMs) with human values and improving the quality of generated responses.
1 code implementation • 3 Apr 2025 • Zhihan Zhang, Yixin Cao, Lizi Liao
Chart-to-code generation, the process of converting chart images into executable plotting scripts, provides a lossless representation of chart information, requiring models to accurately capture and summarize all visual and structural elements.
no code implementations • 1 Apr 2025 • Enjun Du, Xunkai Li, Tian Jin, Zhihan Zhang, Rong-Hua Li, Guoren Wang
To address these issues, we introduce GraphMaster, the first multi-agent framework specifically designed for graph data synthesis in data-limited environments.
no code implementations • 27 Mar 2025 • Zhihan Zhang, Xunkai Li, Zhu Lei, Guang Zeng, RongHua Li, Guoren Wang
For Question 2, we propose decoupled and training-free model design principles for LLM integration, shifting the focus from computation-intensive fine-tuning to more efficient inference.
no code implementations • 18 Mar 2025 • Jiankang Wang, Zhihan Zhang, Zhihang Liu, Yang Li, Jiannan Ge, Hongtao Xie, Yongdong Zhang
Moreover, we propose a Query-Guided Space Decoder to establish a corresponding connection between the queries and spatial coordinates.
no code implementations • 19 Feb 2025 • Huiying Shi, Zhihong Tan, Zhihan Zhang, Hongchen Wei, Yaosi Hu, Yingxue Zhang, Zhenzhong Chen
This makes the evaluation of semantic segmentation quality in such scenarios an issue to be resolved.
1 code implementation • 12 Feb 2025 • Zhihan Zhang, Shiyang Li, Zixuan Zhang, Xin Liu, Haoming Jiang, Xianfeng Tang, Yifan Gao, Zheng Li, Haodong Wang, Zhaoxuan Tan, Yichuan Li, Qingyu Yin, Bing Yin, Meng Jiang
The instruction hierarchy, which establishes a priority order from system messages to user messages, conversation history, and tool outputs, is essential for ensuring consistent and safe behavior in language models (LMs).
no code implementations • 26 Dec 2024 • Sitan Chen, Weiyuan Gong, Zhihan Zhang
In recent years there has been significant interest in understanding the statistical complexity of learning from quantum data under the constraint that one can only make unentangled measurements.
1 code implementation • 18 Oct 2024 • Zifeng Zhu, Mengzhao Jia, Zhihan Zhang, Lang Li, Meng Jiang
Multimodal Large Language Models (MLLMs) have demonstrated impressive abilities across various tasks, including visual question answering and chart comprehension, yet existing benchmarks for chart-related tasks fall short in capturing the complexity of real-world multi-chart scenarios.
no code implementations • 16 Oct 2024 • Weiyuan Gong, Jonas Haferkamp, Qi Ye, Zhihan Zhang
Our protocol can be modified to estimate the purity of an unknown quantum state $\rho$ using $k$-qubit quantum memory with the same complexity.
no code implementations • 16 Oct 2024 • Zhenyu Wu, Qingkai Zeng, Zhihan Zhang, Zhaoxuan Tan, Chao Shen, Meng Jiang
Best-of-N decoding methods instruct large language models (LLMs) to generate multiple solutions, score each using a scoring function, and select the highest scored as the final answer to mathematical reasoning problems.
no code implementations • 8 Oct 2024 • Noah Ziems, Zhihan Zhang, Meng Jiang
Evaluating the ability of large language models (LLMs) to follow complex human-written instructions is essential for their deployment in real-world applications.
1 code implementation • 3 Oct 2024 • Siru Ouyang, Wenhao Yu, Kaixin Ma, Zilin Xiao, Zhihan Zhang, Mengzhao Jia, Jiawei Han, Hongming Zhang, Dong Yu
Unlike traditional function-level or file-level coding tasks, AI software engineering requires not only basic coding proficiency but also advanced skills in managing and interacting with code repositories.
1 code implementation • 2 Oct 2024 • Mengzhao Jia, Wenhao Yu, Kaixin Ma, Tianqing Fang, Zhihan Zhang, Siru Ouyang, Hongming Zhang, Meng Jiang, Dong Yu
Tasks involving multiple text-rich images are especially challenging, as they require not only understanding the content of individual images but reasoning about inter-relationships and logical flows across multiple visual inputs.
no code implementations • 13 Aug 2024 • Sitan Chen, Weiyuan Gong, Qi Ye, Zhihan Zhang
We study the task of agnostic tomography: given copies of an unknown $n$-qubit state $\rho$ which has fidelity $\tau$ with some state in a given class $C$, find a state which has fidelity $\ge \tau - \epsilon$ with $\rho$.
4 code implementations • 22 Jun 2024 • Terry Yue Zhuo, Minh Chien Vu, Jenny Chim, Han Hu, Wenhao Yu, Ratnadira Widyasari, Imam Nur Bani Yusuf, Haolan Zhan, Junda He, Indraneil Paul, Simon Brunner, Chen Gong, Thong Hoang, Armel Randy Zebaze, Xiaoheng Hong, Wen-Ding Li, Jean Kaddour, Ming Xu, Zhihan Zhang, Prateek Yadav, Naman jain, Alex Gu, Zhoujun Cheng, Jiawei Liu, Qian Liu, Zijian Wang, Binyuan Hui, Niklas Muennighoff, David Lo, Daniel Fried, Xiaoning Du, Harm de Vries, Leandro von Werra
In addition, using multiple tools to solve a task needs compositional reasoning by accurately understanding complex instructions.
Ranked #1 on
Code Generation
on BigCodeBench-Instruct
no code implementations • 17 Jun 2024 • Yiwei Sun, Zhihang Liu, Chuanbin Liu, Bowei Pu, Zhihan Zhang, Hongtao Xie
Although existing methods achieve a balance between memory and information by aggregating frames, they inevitably introduce the severe hallucination issue.
1 code implementation • 17 Jun 2024 • Zhihan Zhang, Tao Ge, Zhenwen Liang, Wenhao Yu, Dian Yu, Mengzhao Jia, Dong Yu, Meng Jiang
Supervised fine-tuning enhances the problem-solving abilities of language models across various mathematical reasoning tasks.
1 code implementation • 4 Jun 2024 • Zhihan Zhang, Yixin Cao, Chenchen Ye, Yunshan Ma, Lizi Liao, Tat-Seng Chua
We refer to the complex events composed of many news articles over an extended period as Temporal Complex Event (TCE).
1 code implementation • 29 May 2024 • Zhenwen Liang, Dian Yu, Wenhao Yu, Wenlin Yao, Zhihan Zhang, Xiangliang Zhang, Dong Yu
We evaluate the performance of various SOTA LLMs on the MathChat benchmark, and we observe that while these models excel in single turn question answering, they significantly underperform in more complex scenarios that require sustained reasoning and dialogue understanding.
no code implementations • 23 May 2024 • Zhenyu Wu, Qingkai Zeng, Zhihan Zhang, Zhaoxuan Tan, Chao Shen, Meng Jiang
The condition can be an entity in an open-domain question or a numeric value in a math question, which requires minimal effort (via prompting) to identify.
no code implementations • 1 May 2024 • Zhihan Zhang, Weiyuan Gong, Weikang Li, Dong-Ling Deng
In addition, for quantum devices with constant noise strength, we prove that no super-polynomial classical-quantum separation exists for any classification task defined by shallow Clifford circuits, independent of the structures of the circuits that specify the learning models.
no code implementations • 22 Apr 2024 • Mengzhao Jia, Zhihan Zhang, Wenhao Yu, Fangkai Jiao, Meng Jiang
Open-source multimodal large language models (MLLMs) excel in various tasks involving textual and visual inputs but still struggle with complex multimodal mathematical reasoning, lagging behind proprietary models like GPT-4V(ision) and Gemini-Pro.
1 code implementation • 14 Mar 2024 • Chu Li, Zhihan Zhang, Michael Saugstad, Esteban Safranchik, Minchu Kulkarni, Xiaoyu Huang, Shwetak Patel, Vikram Iyer, Tim Althoff, Jon E. Froehlich
Crowdsourcing platforms have transformed distributed problem-solving, yet quality control remains a persistent challenge.
1 code implementation • 12 Feb 2024 • Qingkai Zeng, Yuyang Bai, Zhaoxuan Tan, Shangbin Feng, Zhenwen Liang, Zhihan Zhang, Meng Jiang
Automatic taxonomy induction is crucial for web search, recommendation systems, and question answering.
1 code implementation • 15 Nov 2023 • Zhihan Zhang, Dong-Ho Lee, Yuwei Fang, Wenhao Yu, Mengzhao Jia, Meng Jiang, Francesco Barbieri
Instruction tuning has remarkably advanced large language models (LLMs) in understanding and responding to diverse human instructions.
no code implementations • 19 Oct 2023 • Zhihan Zhang, Shuohang Wang, Wenhao Yu, Yichong Xu, Dan Iter, Qingkai Zeng, Yang Liu, Chenguang Zhu, Meng Jiang
Large language models (LLMs) can perform a wide range of tasks by following natural language instructions, without the necessity of task-specific fine-tuning.
no code implementations • 7 Jul 2023 • Zachary Englhardt, Richard Li, Dilini Nissanka, Zhihan Zhang, Girish Narayanswamy, Joseph Breda, Xin Liu, Shwetak Patel, Vikram Iyer
We leverage this finding to study how human programmers interact with these tools, and develop an human-AI based software engineering workflow for building embedded systems.
1 code implementation • 23 May 2023 • Zhihan Zhang, Wenhao Yu, Zheng Ning, Mingxuan Ju, Meng Jiang
Contrast consistency, the ability of a model to make consistently correct predictions in the presence of perturbations, is an essential aspect in NLP.
no code implementations • 23 May 2023 • Mengxia Yu, Zhihan Zhang, Wenhao Yu, Meng Jiang
Comparative reasoning is a process of comparing objects, concepts, or entities to draw conclusions, which constitutes a fundamental cognitive ability.
no code implementations • 23 May 2023 • Wenhao Yu, Zhihan Zhang, Zhenwen Liang, Meng Jiang, Ashish Sabharwal
ReFeed first generates initial outputs, then utilizes a retrieval model to acquire relevant information from large document collections, and finally incorporates the retrieved information into the in-context demonstration for output refinement, thereby addressing the limitations of LLMs in a more efficient and cost-effective manner.
1 code implementation • 16 May 2023 • Noah Ziems, Wenhao Yu, Zhihan Zhang, Meng Jiang
To overcome this limitation, recent autoregressive search engines replace the dual-encoder architecture by directly generating identifiers for relevant documents in the candidate pool.
4 code implementations • 9 May 2023 • Raymond Li, Loubna Ben allal, Yangtian Zi, Niklas Muennighoff, Denis Kocetkov, Chenghao Mou, Marc Marone, Christopher Akiki, Jia Li, Jenny Chim, Qian Liu, Evgenii Zheltonozhskii, Terry Yue Zhuo, Thomas Wang, Olivier Dehaene, Mishig Davaadorj, Joel Lamy-Poirier, João Monteiro, Oleh Shliazhko, Nicolas Gontier, Nicholas Meade, Armel Zebaze, Ming-Ho Yee, Logesh Kumar Umapathi, Jian Zhu, Benjamin Lipkin, Muhtasham Oblokulov, Zhiruo Wang, Rudra Murthy, Jason Stillerman, Siva Sankalp Patel, Dmitry Abulkhanov, Marco Zocca, Manan Dey, Zhihan Zhang, Nour Fahmy, Urvashi Bhattacharyya, Wenhao Yu, Swayam Singh, Sasha Luccioni, Paulo Villegas, Maxim Kunakov, Fedor Zhdanov, Manuel Romero, Tony Lee, Nadav Timor, Jennifer Ding, Claire Schlesinger, Hailey Schoelkopf, Jan Ebert, Tri Dao, Mayank Mishra, Alex Gu, Jennifer Robinson, Carolyn Jane Anderson, Brendan Dolan-Gavitt, Danish Contractor, Siva Reddy, Daniel Fried, Dzmitry Bahdanau, Yacine Jernite, Carlos Muñoz Ferrandis, Sean Hughes, Thomas Wolf, Arjun Guha, Leandro von Werra, Harm de Vries
The BigCode community, an open-scientific collaboration working on the responsible development of Large Language Models for Code (Code LLMs), introduces StarCoder and StarCoderBase: 15. 5B parameter models with 8K context length, infilling capabilities and fast large-batch inference enabled by multi-query attention.
Ranked #56 on
Code Generation
on MBPP
1 code implementation • 23 Oct 2022 • Wenhao Yu, Chenguang Zhu, Zhihan Zhang, Shuohang Wang, Zhuosheng Zhang, Yuwei Fang, Meng Jiang
However, applying such methods to commonsense reasoning tasks faces two unique challenges, i. e., the lack of a general large-scale corpus for retrieval and a corresponding effective commonsense retriever.
1 code implementation • 7 Oct 2022 • Zhihan Zhang, Wenhao Yu, Chenguang Zhu, Meng Jiang
The entity knowledge is stored in the memory as latent representations, and the memory is pre-trained on Wikipedia along with encoder-decoder parameters.
no code implementations • 9 Jul 2022 • Gang Liu, Zhihan Zhang, Zheng Ning, Meng Jiang
To enable explainability, recent techniques such as ACCENT and FIA are looking for counterfactual explanations that are specific historical actions of a user, the removal of which leads to a change to the recommendation result.
no code implementations • 7 Apr 2022 • Zhihan Zhang, Wenhao Yu, Mengxia Yu, Zhichun Guo, Meng Jiang
Multi-task learning (MTL) has become increasingly popular in natural language processing (NLP) because it improves the performance of related tasks by exploiting their commonalities and differences.
1 code implementation • NAACL (DLG4NLP) 2022 • Wenhao Yu, Chenguang Zhu, Lianhui Qin, Zhihan Zhang, Tong Zhao, Meng Jiang
A set of knowledge experts seek diverse reasoning on KG to encourage various generation outputs.
no code implementations • 28 Sep 2020 • Zhihan Zhang, Xiubo Geng, Tao Qin, Yunfang Wu, Daxin Jiang
In this work, we focus on the task of procedural text understanding, which aims to comprehend such documents and track entities' states and locations during a process.
no code implementations • 1 Sep 2020 • Xuefeng Hu, Zhihan Zhang, Zhenye Jiang, Syomantak Chaudhuri, Zhenheng Yang, Ram Nevatia
We present a novel framework, Spatial Pyramid Attention Network (SPAN) for detection and localization of multiple types of image manipulations.
1 code implementation • 25 Jul 2020 • Daheng Wang, Zhihan Zhang, Yihong Ma, Tong Zhao, Tianwen Jiang, Nitesh V. Chawla, Meng Jiang
In this work, we present a novel framework called CoEvoGNN for modeling dynamic attributed graph sequence.
no code implementations • 7 Nov 2019 • Zhihan Zhang, Zhiyi Yin, Shuhuai Ren, Xinhang Li, Shicheng Li
In this paper, we aim to collect diversified information from video and text for informative comment generation.
1 code implementation • ACL 2019 • Pengcheng Yang, Zhihan Zhang, Fuli Luo, Lei LI, Chengyang Huang, Xu sun
Automatic commenting of online articles can provide additional opinions and facts to the reader, which improves user experience and engagement on social media platforms.