1 code implementation • 3 Jan 2025 • Jiaming Li, Jiacheng Zhang, Zequn Jie, Lin Ma, Guanbin Li
In this method, we design a Cross-Modal Value-Enhanced Decoding (CMVED) module to alleviate hallucination by a novel contrastive decoding mechanism.
1 code implementation • 19 Dec 2024 • Yatai Ji, Jiacheng Zhang, Jie Wu, Shilong Zhang, Shoufa Chen, Chongjian Ge, Peize Sun, Weifeng Chen, Wenqi Shao, Xuefeng Xiao, Weilin Huang, Ping Luo
Text-to-video models have made remarkable advancements through optimization on high-quality text-video pairs, where the textual prompts play a pivotal role in determining the quality of output videos.
no code implementations • 19 Dec 2024 • Jiacheng Zhang, Jie Wu, Weifeng Chen, Yatai Ji, Xuefeng Xiao, Weilin Huang, Kai Han
To tackle these issues, we introduce OnlineVPO, a more efficient preference learning approach tailored specifically for video diffusion models.
no code implementations • 18 Dec 2024 • Haiming Zhang, Ying Xue, Xu Yan, Jiacheng Zhang, Weichao Qiu, Dongfeng Bai, Bingbing Liu, Shuguang Cui, Zhen Li
Experiments demonstrate the effectiveness of our approach, showcasing its state-of-the-art performance on the nuScenes and OpenScene benchmarks for 4D occupancy forecasting, end-to-end motion planning and point cloud forecasting.
no code implementations • 4 Oct 2024 • Changxiao Cai, Jiacheng Zhang
Multi-armed bandit (MAB) algorithms have achieved significant success in sequential decision-making applications, under the premise that humans perfectly implement the recommended policy.
no code implementations • 25 Sep 2024 • Jiacheng Zhang, Yang Jiao, Shaoxiang Chen, Jingjing Chen, Yu-Gang Jiang
To effectively instruct an MLLM, in addition to conventional language expressions, the practice of referring to objects by painting with brushes on images has emerged as a prevalent tool (referred to as "referring visual prompts") due to its efficacy in aligning the user's intention with specific image regions.
1 code implementation • 25 Sep 2024 • Jiacheng Zhang, Yang Jiao, Shaoxiang Chen, Na Zhao, Jingjing Chen
To mitigate this gap, we propose EventHallusion, a novel benchmark that focuses on assessing VideoLLMs' hallucination about events, the crux of video analysis.
1 code implementation • 2 Jun 2024 • Jiacheng Zhang, Feng Liu, Dawei Zhou, Jingfeng Zhang, Tongliang Liu
However, in this paper, we discover that not all pixels contribute equally to the accuracy on AEs (i.e., robustness) and accuracy on natural images (i.e., accuracy).
no code implementations • CVPR 2024 • Jiaming Li, Jiacheng Zhang, Jichang Li, Ge Li, Si Liu, Liang Lin, Guanbin Li
Specifically, we devise three modules: Background Category-specific Prompt, Background Object Discovery, and Inference Probability Rectification, to empower the detector to discover, represent, and leverage implicit object knowledge explored from background proposals.
no code implementations • 23 Apr 2024 • Weifeng Chen, Jiacheng Zhang, Jie Wu, Hefeng Wu, Xuefeng Xiao, Liang Lin
The rapid development of diffusion models has triggered diverse applications.
no code implementations • 21 Apr 2024 • Yuxi Ren, Xin Xia, Yanzuo Lu, Jiacheng Zhang, Jie Wu, Pan Xie, Xing Wang, Xuefeng Xiao
Current distillation techniques often dichotomize into two distinct aspects: i) ODE Trajectory Preservation; and ii) ODE Trajectory Reformulation.
no code implementations • 8 Apr 2024 • Jiacheng Zhang, Jie Wu, Yuxi Ren, Xin Xia, Huafeng Kuang, Pan Xie, Jiashi Li, Xuefeng Xiao, Weilin Huang, Shilei Wen, Lean Fu, Guanbin Li
Latent diffusion models (LDM) have revolutionized text-to-image generation, leading to the proliferation of various advanced models and diverse downstream applications.
no code implementations • CVPR 2024 • Jiacheng Zhang, Jiaming Li, Xiangru Lin, Wei Zhang, Xiao Tan, Junyu Han, Errui Ding, Jingdong Wang, Guanbin Li
Additionally, we present a DepthGradient Projection (DGP) module to mitigate optimization conflicts caused by noisy depth supervision of pseudo-labels, effectively decoupling the depth gradient and removing conflicting gradients.
no code implementations • ICCV 2023 • Fengyu Yang, Jiacheng Zhang, Andrew Owens
An emerging line of work has sought to generate plausible imagery from touch.
no code implementations • 1 Aug 2023 • Zhenyu Zhong, Qiliang Fan, Jiacheng Zhang, Minghua Ma, Shenglin Zhang, Yongqian Sun, Qingwei Lin, Yuzhi Zhang, Dan Pei
Internet-based services have seen remarkable success, generating vast amounts of monitored key performance indicators (KPIs) as univariate or multivariate time series.
3 code implementations • CVPR 2023 • Jiacheng Zhang, Xiangru Lin, Wei Zhang, Kuo Wang, Xiao Tan, Junyu Han, Errui Ding, Jingdong Wang, Guanbin Li
Specifically, we propose a Stage-wise Hybrid Matching strategy that combines the one-to-many assignment and one-to-one assignment strategies to improve the training efficiency of the first stage and thus provide high-quality pseudo labels for the training of the second stage.
no code implementations • 15 Feb 2023 • Shuoqing Deng, Xiang Yu, Jiacheng Zhang
When the sufficient condition of the attitude function is violated, we can illustrate by various examples that the characterization of the optimal equilibrium may differ significantly from some existing results for an individual agent.
no code implementations • 24 Nov 2022 • Jiacheng Zhang, Wenyi Yan, Ye Zhang
In this paper, a new speech feature fusion method is proposed for speaker recognition on the basis of the cross gate parallel convolutional neural network (CG-PCNN).
no code implementations • 22 Nov 2022 • Fengyu Yang, Chenyang Ma, Jiacheng Zhang, Jing Zhu, Wenzhen Yuan, Andrew Owens
The ability to associate touch with sight is essential for tasks that require physically interacting with objects in the world.
1 code implementation • ACL 2022 • Xueqing Wu, Jiacheng Zhang, Hang Li
We first employ a seq2seq model fine-tuned from a pre-trained language model to perform the task.
1 code implementation • 14 Jul 2020 • Xuancheng Huang, Jiacheng Zhang, Zhixing Tan, Derek F. Wong, Huanbo Luan, Jingfang Xu, Maosong Sun, Yang Liu
System combination is an important technique for combining the hypotheses of different machine translation systems to improve translation performance.
no code implementations • 26 Nov 2019 • Jiacheng Zhang, Huanbo Luan, Maosong Sun, FeiFei Zhai, Jingfang Xu, Yang Liu
The lack of alignment in NMT models leads to three problems: it is hard to (1) interpret the translation process, (2) impose lexical constraints, and (3) impose structural constraints.
1 code implementation • ACL 2017 • Jiacheng Zhang, Yang Liu, Huanbo Luan, Jingfang Xu, Maosong Sun
Although neural machine translation has made significant progress recently, how to integrate multiple overlapping, arbitrary prior knowledge sources remains a challenge.
3 code implementations • EMNLP 2018 • Jiacheng Zhang, Huanbo Luan, Maosong Sun, FeiFei Zhai, Jingfang Xu, Min Zhang, Yang Liu
Although the Transformer translation model (Vaswani et al., 2017) has achieved state-of-the-art performance in a variety of translation tasks, how to use document-level context to deal with discourse phenomena that are problematic for the Transformer remains a challenge.
6 code implementations • 20 Jun 2017 • Jiacheng Zhang, Yanzhuo Ding, Shiqi Shen, Yong Cheng, Maosong Sun, Huanbo Luan, Yang Liu
This paper introduces THUMT, an open-source toolkit for neural machine translation (NMT) developed by the Natural Language Processing Group at Tsinghua University.