1 code implementation • COLING 2022 • Kaixin Wu, Yue Zhang, Bojie Hu, Tong Zhang
Extensive experiments on ten WMT machine translation tasks show that the proposed model is on average 1.35x faster (with almost no decrease in BLEU) than the state-of-the-art inference implementation.
1 code implementation • ACL 2022 • Ying Su, Hongming Zhang, Yangqiu Song, Tong Zhang
However, the imbalanced training dataset leads to poor performance on rare senses and zero-shot senses.
no code implementations • NLP4ConvAI (ACL) 2022 • Tong Zhang, Yong liu, Boyang Li, Peixiang Zhong, Chen Zhang, Hao Wang, Chunyan Miao
Conversational Recommendation Systems recommend items through language-based interactions with users. In order to generate naturalistic conversations and effectively utilize knowledge graphs (KGs) containing background information, we propose a novel Bag-of-Entities loss, which encourages the generated utterances to mention concepts related to the item being recommended, such as the genre or director of a movie.
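The snippet does not give the exact form of the Bag-of-Entities loss, so the following is a hypothetical sketch only: a multi-label cross-entropy over an entity vocabulary that rewards utterances for scoring the recommended item's related KG entities highly. The sigmoid formulation and all names are illustrative assumptions, not the paper's equations.

```python
import numpy as np

def bag_of_entities_loss(entity_logits, entity_bag):
    """Multi-label cross-entropy over an entity vocabulary: the
    utterance is rewarded for scoring entities related to the
    recommended item (e.g. its genre or director) highly."""
    probs = 1.0 / (1.0 + np.exp(-np.asarray(entity_logits, dtype=float)))
    targets = np.zeros_like(probs)
    targets[list(entity_bag)] = 1.0   # entities from the item's KG neighborhood
    eps = 1e-9
    return float(-np.mean(targets * np.log(probs + eps)
                          + (1.0 - targets) * np.log(1.0 - probs + eps)))

# Entity 0 is related to the item; the utterance scores it highest.
loss = bag_of_entities_loss([2.0, -2.0, 0.0], entity_bag={0})
```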
1 code implementation • 13 May 2025 • Yangyi Chen, Hao Peng, Tong Zhang, Heng Ji
PRIOR introduces a reference model, a text-only large language model (LLM) trained on the captions without image inputs, to weight each token based on its probability for LVLM training.
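The direction of the weighting is not stated in the snippet, so here is one plausible instantiation, clearly an assumption: caption tokens the text-only reference already predicts well are taken to carry little visual information and are downweighted, with a mean-one renormalization to keep the loss scale unchanged.

```python
import numpy as np

def prior_token_weights(ref_probs):
    """Hypothetical weighting: weight each caption token by
    1 - p_ref(token), renormalized so the mean weight is 1."""
    p = np.asarray(ref_probs, dtype=float)
    w = 1.0 - p                    # high reference probability -> low weight
    return w * len(w) / w.sum()    # keep overall loss scale comparable

# A token the text-only model predicts with p=0.9 gets a small weight.
w = prior_token_weights([0.9, 0.1, 0.5])
```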
no code implementations • 12 May 2025 • Tong Zhang, Boyuan Zheng, Ruiqian Nai, Yingdong Hu, Yen-Jen Wang, Geng Chen, Fanqi Lin, Jiongye Li, Chuye Hong, Koushil Sreenath, Yang Gao
The human body demonstrates exceptional motor capabilities, such as standing steadily on one foot or performing a high kick with the leg raised over 1.5 meters, both requiring precise balance control.
no code implementations • 11 May 2025 • Tong Zhang, Fenghua Shao, Runsheng Zhang, Yifan Zhuang, Liuqingqing Yang
It can accurately capture and track the user's gesture trajectory and outperforms traditional tracking methods in both real-time performance and accuracy.
1 code implementation • 5 May 2025 • Xiusi Chen, Gaotang Li, Ziqi Wang, Bowen Jin, Cheng Qian, Yu Wang, Hongru Wang, Yu Zhang, Denghui Zhang, Tong Zhang, Hanghang Tong, Heng Ji
The training of M-R1 consists of two key stages: (1) distillation of high-quality reasoning chains and (2) reinforcement learning with verifiable rewards.
1 code implementation • 5 May 2025 • Jiarui Yao, Yifan Hao, Hanning Zhang, Hanze Dong, Wei Xiong, Nan Jiang, Tong Zhang
Chain-of-thought (CoT) reasoning in large language models (LLMs) can be formalized as a latent variable problem, where the model needs to generate intermediate reasoning steps.
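Concretely, this latent-variable view treats the chain of thought $z$ as unobserved and marginalizes over it (the specific training objective is not given in the snippet):

```latex
p_\theta(y \mid x) \;=\; \sum_{z} p_\theta(z \mid x)\, p_\theta(y \mid x, z)
```

where $x$ is the question, $z$ the intermediate reasoning steps, and $y$ the final answer; training then targets the log of this marginal, e.g. via EM-style or variational bounds.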
no code implementations • 2 May 2025 • Nishant Jain, Xunpeng Huang, Yian Ma, Tong Zhang
Additionally, under minimal assumptions on the data distribution (an increasingly common setting in recent diffusion model analyses), we show that a similar KL convergence guarantee can be obtained, with the number of steps scaling as $O\left(d \log\left(\frac{d}{\varepsilon}\right)\right)$.
no code implementations • 30 Apr 2025 • Xunpeng Huang, Yujin Han, Difan Zou, Yian Ma, Tong Zhang
On the other hand, when there is no obvious conditional dependence across patches of the data, AR diffusion does not outperform DDPM.
1 code implementation • 24 Apr 2025 • Fengchun Liu, Tong Zhang, Chunying Zhang
To address the poor quality of steganographic images and the slow network convergence of deep-learning-based image steganography, this paper proposes a Steganography Curriculum Learning training strategy (STCL) for deep learning image steganography models.
1 code implementation • 23 Apr 2025 • Xu Guo, Tong Zhang, Fuyun Wang, Xudong Wang, Xiaoya Zhang, Xin Liu, Zhen Cui
To comprehensively explore user-product relations, we construct two hypergraphs, i.e., a user-to-user (u2u) hypergraph and an item-to-item (i2i) hypergraph, to mine shared preferences among users and intricate multimodal semantic resemblance among items, respectively.
1 code implementation • 23 Apr 2025 • Fengchun Liu, Tong Zhang, Chunying Zhang
In recent years, a large number of works have introduced Convolutional Neural Networks (CNNs) into image steganography, replacing traditional methods built on hand-crafted features and prior-knowledge design with methods in which neural networks learn the information embedding autonomously.
no code implementations • 17 Apr 2025 • Wei zhang, Miaoxin Cai, Yaqian Ning, Tong Zhang, Yin Zhuang, He Chen, Jun Li, Xuerui Mao
Recent advances in the visual-language area have developed natural multi-modal large language models (MLLMs) for spatial reasoning through visual prompting.
1 code implementation • 15 Apr 2025 • Wei Xiong, Jiarui Yao, Yuhui Xu, Bo Pang, Lei Wang, Doyen Sahoo, Junnan Li, Nan Jiang, Tong Zhang, Caiming Xiong, Hanze Dong
In this work, we revisit GRPO from a reinforce-like algorithm perspective and analyze its core components.
no code implementations • 13 Apr 2025 • Xu Guo, Tong Zhang, Yuanzhi Wang, Chenxu Wang, Fuyun Wang, Xudong Wang, Xiaoya Zhang, Xin Liu, Zhen Cui
To this end, we propose a novel framework, Hypergraph Enhanced LLM Learning for multimodal Recommendation (HeLLM), designed to equip LLMs with the capability to capture intricate higher-order semantic correlations by fusing graph-level contextual signals with sequence-level behavioral patterns.
no code implementations • 6 Apr 2025 • Tong Zhang
AdaptRec employs a two-phase user selection mechanism -- User Similarity Retrieval and Self-Adaptive User Selection -- to efficiently identify relevant user sequences in large-scale datasets based on multi-metric evaluation.
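As a hypothetical illustration of the retrieval phase (the snippet does not specify the metrics), the sketch below does cosine-similarity top-k retrieval followed by an adaptive similarity cutoff; the function name, threshold rule, and toy embeddings are invented for illustration.

```python
import numpy as np

def retrieve_similar_users(query_emb, user_embs, k=2, threshold=0.5):
    """Sketch of a two-phase selection: (1) cosine-similarity top-k
    retrieval of candidate users, (2) an adaptive cut that keeps only
    candidates above a similarity threshold."""
    q = query_emb / np.linalg.norm(query_emb)
    U = user_embs / np.linalg.norm(user_embs, axis=1, keepdims=True)
    sims = U @ q                       # cosine similarity to the query user
    top = np.argsort(-sims)[:k]        # phase 1: top-k retrieval
    return [int(i) for i in top if sims[i] >= threshold]  # phase 2: filter

idx = retrieve_similar_users(np.array([1.0, 0.0]),
                             np.array([[1.0, 0.1],
                                       [0.0, 1.0],
                                       [0.9, 0.9]]))
```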
no code implementations • 3 Apr 2025 • Congpei Qiu, Yanhao Wu, Wei Ke, Xiuxiu Bai, Tong Zhang
Contrastive Language-Image Pre-training (CLIP) excels in global alignment with language but exhibits limited sensitivity to spatial information, leading to strong performance in zero-shot classification tasks but underperformance in tasks requiring precise spatial understanding.
no code implementations • 29 Mar 2025 • Yufan Ren, Konstantinos Tertikas, Shalini Maiti, Junlin Han, Tong Zhang, Sabine Süsstrunk, Filippos Kokkinos
Our results reveal that even the state-of-the-art LVLMs struggle with these puzzles, highlighting fundamental limitations in their puzzle-solving capabilities.
no code implementations • 26 Mar 2025 • Kang An, Yuxing Liu, Rui Pan, Shiqian Ma, Donald Goldfarb, Tong Zhang
Training deep neural networks (DNNs) is a structured optimization problem, because the parameters are naturally represented by matrices and tensors rather than simple vectors.
no code implementations • 24 Mar 2025 • Haiqi Liu, C. L. Philip Chen, Tong Zhang
This article introduces the few-shot adapter with a cross-view fusion method called FACE for cross-subject EEG emotion recognition, which leverages dynamic multi-view fusion and effective subject-specific adaptation.
no code implementations • 23 Mar 2025 • Yang Yang, Tong Zhang, Jian Wu, Lijie Su
In Stage 2, a convex optimization algorithm refines the dynamic topic structure using the convex NMF (cNMF) model, further enhancing topic integration and stability.
no code implementations • 19 Mar 2025 • Yanhao Wu, Haoyang Zhang, Tianwei Lin, Lichao Huang, Shujie Luo, Rui Wu, Congpei Qiu, Wei Ke, Tong Zhang
Generative models in Autonomous Driving (AD) enable diverse scene creation, yet existing methods fall short by only capturing a limited range of modalities, restricting the capability of generating controllable scenes for comprehensive evaluation of AD systems.
no code implementations • 12 Mar 2025 • Jiale Wang, Chen Zhao, Wei Ke, Tong Zhang
Random Sample Consensus (RANSAC) is a fundamental approach for robustly estimating parametric models from noisy data.
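For context, the standard RANSAC loop the sentence refers to can be sketched for 2D line fitting: fit a model to a random minimal sample, count inliers, and keep the hypothesis with the largest consensus set. The parameter values are arbitrary toy choices.

```python
import random

def ransac_line(points, iters=200, inlier_tol=0.1, seed=0):
    """Minimal RANSAC for a 2D line y = a*x + b: fit each random
    minimal sample (2 points), count points within inlier_tol of the
    line, and keep the hypothesis with the most inliers."""
    rng = random.Random(seed)
    best_model, best_inliers = None, -1
    for _ in range(iters):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:                       # degenerate sample, skip
            continue
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        n = sum(1 for (x, y) in points if abs(y - (a * x + b)) < inlier_tol)
        if n > best_inliers:
            best_model, best_inliers = (a, b), n
    return best_model, best_inliers

pts = [(x, 2 * x + 1) for x in range(10)] + [(3, 40), (7, -5)]  # line + 2 outliers
model, n_in = ransac_line(pts)
```

The two gross outliers never enter the winning consensus set, so the recovered slope and intercept match the underlying line exactly.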
no code implementations • 8 Mar 2025 • Shivanshu Shekhar, Tong Zhang
Diffusion models have revolutionized generative modeling in continuous domains like image, audio, and video synthesis.
no code implementations • 6 Mar 2025 • Lihao Xiao, Tingyu Zhang, Yun Liu, Chayanis Sutcharitchan, Qingyuan Liu, Xiaoxue Fan, Jian Feng, Huifang Gao, Tong Zhang, Shao Li
Differential metabolites in plasma were determined by untargeted metabolomics, and gut microbiota diversity/composition in fecal and cecal samples was assessed via 16S rRNA sequencing.
no code implementations • 5 Mar 2025 • Yi-Fan Lu, Xian-Ling Mao, Tian Lan, Tong Zhang, Yu-Shi Zhu, Heyan Huang
To address these two problems above, we propose a scalable and reliable Semantic-level Evaluation framework for Open domain Event detection (SEOE) by constructing a more representative evaluation benchmark and introducing a semantic evaluation metric.
1 code implementation • 5 Mar 2025 • Ruida Wang, Rui Pan, Yuxin Li, Jipeng Zhang, Yizhen Jia, Shizhe Diao, Renjie Pi, Junjie Hu, Tong Zhang
To solve these issues, we propose MA-LoT, a Multi-Agent Lean-based Long Chain-of-Thought framework, which is, to the best of our knowledge, the first multi-agent framework for Lean4 theorem proving that balances high-level NL reasoning and FL verification in Long CoT.
no code implementations • 5 Mar 2025 • Jiarui Yao, Ruida Wang, Tong Zhang
To the best of our knowledge, it is the first framework that utilizes Lean4 to enhance LLMs' NL math reasoning ability.
no code implementations • 4 Mar 2025 • Songming Zhang, Xue Zhang, Tong Zhang, Bojie Hu, Yufeng Chen, Jinan Xu
However, in most existing methods for LLM alignment, all tokens in the response are optimized using a sparse, response-level reward or preference annotation.
no code implementations • 27 Feb 2025 • Tong Zhang, Shu Shen, C. L. Philip Chen
MICINet achieves the reliable removal of both types of noise by unifying them into the concept of Inter-class Confusing Information (ICI) and eliminating it at both global and individual levels.
1 code implementation • 26 Feb 2025 • Wei Xiong, Hanning Zhang, Chenlu Ye, Lichang Chen, Nan Jiang, Tong Zhang
We study self-rewarding reasoning large language models (LLMs), which can simultaneously generate step-by-step reasoning and evaluate the correctness of their outputs at inference time, without external feedback.
no code implementations • 19 Feb 2025 • Xinwei Shen, Nicolai Meinshausen, Tong Zhang
We propose a framework that defines a general forward process transitioning from the target distribution to a known distribution (e.g., Gaussian) and then learns a reverse Markov process using multiple engression models.
no code implementations • 13 Feb 2025 • Rui Yang, Hanyang Chen, Junyu Zhang, Mark Zhao, Cheng Qian, Kangrui Wang, Qineng Wang, Teja Venkat Koripella, Marziyeh Movahedi, Manling Li, Heng Ji, huan zhang, Tong Zhang
Leveraging Multi-modal Large Language Models (MLLMs) to create embodied agents offers a promising avenue for tackling real-world tasks.
no code implementations • 11 Feb 2025 • Li Mao, Wei Du, Shuo Wen, Qi Li, Tong Zhang, Wei Zhong
We conducted a comparative analysis to assess the effects of various data sequences, including raw and binned data, on the prediction errors of four deep learning forecasting models.
no code implementations • 11 Feb 2025 • Heyang Zhao, Chenlu Ye, Wei Xiong, Quanquan Gu, Tong Zhang
Recent advances in Reinforcement Learning from Human Feedback (RLHF) have shown that KL-regularization plays a pivotal role in improving the efficiency of RL fine-tuning for large language models (LLMs).
no code implementations • 10 Feb 2025 • Haoqi Wang, Tong Zhang, Mathieu Salzmann
Large transformer models are known to produce high-norm tokens.
no code implementations • 9 Feb 2025 • Qingyue Zhao, Kaixuan Ji, Heyang Zhao, Tong Zhang, Quanquan Gu
KL-regularized policy optimization has become a workhorse in learning-based decision making, while its theoretical understanding is still very limited.
no code implementations • 5 Feb 2025 • Boyao Wang, Rui Pan, Shizhe Diao, Xingyuan Pan, Jipeng Zhang, Renjie Pi, Tong Zhang
Small language models (SLMs) have attracted considerable attention from both academia and industry due to their broad range of applications in edge devices.
no code implementations • 5 Feb 2025 • Dongqing Wang, Ehsan Pajouheshgar, Yitao Xu, Tong Zhang, Sabine Süsstrunk
Artistic stylization of 3D volumetric smoke data remains a challenge in computer graphics due to the difficulty of ensuring spatiotemporal consistency with a given reference style image, and of doing so within reasonable time and computational resources.
no code implementations • 4 Feb 2025 • Chenlu Ye, Yujia Jin, Alekh Agarwal, Tong Zhang
When the variance of the reward at each round is known, we use a variance-weighted regression approach and establish a regret bound that depends only on the cumulative reward variance and logarithmically on the reward range $R$ as well as the number of rounds $T$.
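The mechanism behind such bounds is variance-weighted (weighted ridge) regression: each observation is scaled by the inverse of its known variance, so noisy rounds contribute little to the estimate. A minimal sketch with toy data, not the paper's estimator:

```python
import numpy as np

def variance_weighted_ridge(X, y, sigma2, lam=1.0):
    """Weighted ridge regression: round t is weighted by 1/sigma2_t,
    so the estimator is driven by low-noise rounds, which is what
    makes the regret depend on the cumulative reward variance."""
    W = np.diag(1.0 / np.asarray(sigma2, dtype=float))
    A = X.T @ W @ X + lam * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ W @ y)

X = np.array([[1.0], [1.0]])
y = np.array([0.0, 10.0])
# The second observation is far noisier, so the estimate stays near 0.
theta = variance_weighted_ridge(X, y, sigma2=[0.01, 100.0], lam=1e-6)
```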
no code implementations • 2 Feb 2025 • Yi Jiang, Oubo Ma, Yong Yang, Tong Zhang, Shouling Ji
Human language encompasses a wide range of intricate and diverse implicit features, which attackers can exploit to launch adversarial or backdoor attacks, compromising DNN models for NLP tasks.
no code implementations • 28 Jan 2025 • Chunyu Lei, Guang-Ze Chen, C. L. Philip Chen, Tong Zhang
Different from employing existing incremental broad learning algorithms for online learning tasks, which tend to incur degraded accuracy and expensive online update overhead, we design an effective weight estimation algorithm and an efficient online updating strategy to remedy the above two deficiencies, respectively.
1 code implementation • NeurIPS 2019 • Qing Wang, Yingru Li, Jiechao Xiong, Tong Zhang
In deep reinforcement learning, policy optimization methods need to deal with issues such as function approximation and the reuse of off-policy data.
no code implementations • 22 Jan 2025 • Hanning Zhang, Juntong Song, Juno Zhu, Yuanhao Wu, Tong Zhang, Cheng Niu
Using RAG-Reward, we train reward models and apply reinforcement learning with human feedback (RLHF) to improve LLMs' effectiveness in RAG.
no code implementations • 24 Dec 2024 • Fenghua Shao, Tong Zhang, Shang Gao, Qi Sun, Liuqingqing Yang
This paper proposes a gesture recognition method based on a three-dimensional hand skeleton model.
no code implementations • 19 Dec 2024 • Shu Shen, C. L. Philip Chen, Tong Zhang
This paper finds that existing reliable multimodal classification methods not only fail to provide robust estimation of data quality, but also lack dynamic networks for sample-specific depth and parameters to achieve reliable inference.
1 code implementation • 15 Dec 2024 • Hanning Zhang, Pengcheng Wang, Shizhe Diao, Yong Lin, Rui Pan, Hanze Dong, Dylan Zhang, Pavlo Molchanov, Tong Zhang
Our theoretical analysis shows that we could derive the optimal reward model from the initial policy sampling.
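A standard identity consistent with this claim (from KL-regularized RLHF theory, not quoted from the paper): the reward implicitly defined by the optimal policy $\pi^{*}$ relative to the initial policy $\pi_0$ is

```latex
r(x, y) \;=\; \beta \log \frac{\pi^{*}(y \mid x)}{\pi_{0}(y \mid x)} \;+\; \beta \log Z(x)
```

so, up to the prompt-dependent constant $\beta \log Z(x)$, samples from the initial policy suffice to estimate the reward model.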
no code implementations • 7 Dec 2024 • Pengyu Li, Zhijie Zhong, Tong Zhang, Zhiwen Yu, C. L. Philip Chen, Kaixiang Yang
Time series anomaly detection (TSAD) has been a research hotspot in both academia and industry in recent years.
no code implementations • 3 Dec 2024 • Zhaozhi Wang, Conghu Li, Qixiang Ye, Tong Zhang
Most parameter-efficient fine-tuning (PEFT) methods rely on low-rank representations to adapt models.
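A minimal sketch of the low-rank idea these methods share (LoRA-style; the dimensions are toy values): the frozen weight is augmented by a trainable product of two thin matrices, so far fewer parameters are tuned.

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=1.0):
    """Forward pass with a low-rank update: the frozen weight W
    (d_out x d_in) is augmented by trainable B @ A of rank r, so only
    r * (d_in + d_out) parameters are fine-tuned instead of d_in * d_out."""
    return x @ (W + alpha * (B @ A)).T

d_in, d_out, r = 4, 3, 1
W = np.zeros((d_out, d_in))   # frozen pretrained weight (toy: zeros)
A = np.ones((r, d_in))        # trainable down-projection
B = np.ones((d_out, r))       # trainable up-projection
y = lora_forward(np.ones((1, d_in)), W, A, B)
```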
1 code implementation • 27 Nov 2024 • Alejandro Pardo, Fabio Pizzati, Tong Zhang, Alexander Pondaven, Philip Torr, Juan Camilo Perez, Bernard Ghanem
Match-cuts are powerful cinematic tools that create seamless transitions between scenes, delivering strong visual and metaphorical connections.
1 code implementation • 11 Nov 2024 • Haohan Weng, Zibo Zhao, Biwen Lei, Xianghui Yang, Jian Liu, Zeqiang Lai, Zhuo Chen, Yuhong Liu, Jie Jiang, Chunchao Guo, Tong Zhang, Shenghua Gao, C. L. Philip Chen
We propose a compressive yet effective mesh representation, Blocked and Patchified Tokenization (BPT), facilitating the generation of meshes exceeding 8k faces.
no code implementations • 8 Nov 2024 • Zijian Hu, Jipeng Zhang, Rui Pan, Zhaozhuo Xu, Shanshan Han, Han Jin, Alay Dilipbhai Shah, Dimitris Stripelis, Yuhang Yao, Salman Avestimehr, Chaoyang He, Tong Zhang
Aiming to improve pre-training efficiency, the Fox-1-1.6B model introduces a novel 3-stage data curriculum across all the training data with 2K-8K sequence length.
no code implementations • 7 Nov 2024 • Yide Ran, Zhaozhuo Xu, Yuhang Yao, Zijian Hu, Shanshan Han, Han Jin, Alay Dilipbhai Shah, Jipeng Zhang, Dimitris Stripelis, Tong Zhang, Salman Avestimehr, Chaoyang He
The rapid advancement of Large Language Models (LLMs) has led to their increased integration into mobile devices for personalized assistance, which enables LLMs to call external API functions to enhance their performance.
no code implementations • 7 Nov 2024 • Heyang Zhao, Chenlu Ye, Quanquan Gu, Tong Zhang
To understand the fundamental distinction between policy learning objectives with KL-regularization and ones without KL-regularization, we are the first to theoretically demonstrate the power of KL-regularization by providing a sharp analysis for KL-regularized contextual bandits and RLHF, revealing an $\mathcal{O}(1 / \epsilon)$ sample complexity when $\epsilon$ is sufficiently small.
no code implementations • 6 Nov 2024 • Shivanshu Shekhar, Shreyas Singh, Tong Zhang
Direct Preference Optimization (DPO) has been successfully used to align large language models (LLMs) according to human preferences, and more recently it has also been applied to improving the quality of text-to-image diffusion models.
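For reference, the DPO objective for a single preference pair is the negative log-sigmoid of the $\beta$-scaled margin of policy log-ratios (chosen minus rejected) over the reference model; a minimal NumPy version with toy log-probabilities:

```python
import numpy as np

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """DPO loss for one pair: -log(sigmoid(beta * margin)), where the
    margin compares the policy's log-ratio on the chosen response
    against its log-ratio on the rejected one."""
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    return float(np.log1p(np.exp(-margin)))  # -log(sigmoid(margin))

# The policy already prefers the chosen response, so the loss is below log(2).
loss = dpo_loss(logp_w=-1.0, logp_l=-5.0, ref_logp_w=-2.0, ref_logp_l=-2.0)
```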
1 code implementation • 31 Oct 2024 • Chen Zhao, Xuan Wang, Tong Zhang, Saqib Javed, Mathieu Salzmann
In this paper, we address this overfitting issue by introducing Self-Ensembling Gaussian Splatting (SE-GS).
1 code implementation • 27 Oct 2024 • Peter Grönquist, Deblina Bhattacharjee, Bahar Aydemir, Baran Ozaydin, Tong Zhang, Mathieu Salzmann, Sabine Süsstrunk
This dataset is a crucial component of the AI4VA Workshop Challenges (https://sites.google.com/view/ai4vaeccv2024), where we specifically explore depth and saliency.
no code implementations • 25 Oct 2024 • Xiaoyu Wang, Xuxing Chen, Shiqian Ma, Tong Zhang
This paper focuses on decentralized stochastic bilevel optimization (DSBO) where agents only communicate with their neighbors.
no code implementations • 24 Oct 2024 • Jipeng Zhang, Jianshu Zhang, Yuanzhe Li, Renjie Pi, Rui Pan, Runtao Liu, Ziqiang Zheng, Tong Zhang
The underlying cause of this issue is the gap between natural language and programming language (NL-PL Gap), which is especially pronounced in LRPLs due to limited aligned data.
no code implementations • 22 Oct 2024 • Meng Xu, Tong Zhang, Fuyun Wang, Yi Lei, Xin Liu, Zhen Cui
Dedicated to posters, MPDS is, to our knowledge, the first such image-text pair dataset, comprising 373k+ image-text pairs and 8k+ actor images (covering 4k+ actors).
no code implementations • 12 Oct 2024 • Connah G. M. Johnson, Zachary Johnson, Liam S. Mackey, Xiaolu Li, Natalie C. Sadler, Tong Zhang, Wei-Jun Qian, Pavlo Bohutskyi, Song Feng, Margaret S. Cheung
We develop a systems approach based on an energy-landscape concept to differentiate interactions involving redox activities and conformational changes of proteins and nucleic acids in multi-layered protein-DNA regulatory networks under light disturbance.
1 code implementation • 9 Oct 2024 • Renjie Pi, Jianshu Zhang, Tianyang Han, Jipeng Zhang, Rui Pan, Tong Zhang
In this paper, we introduce Personalized Visual Instruction Tuning (PVIT), a novel data curation and training framework designed to enable MLLMs to identify target individuals within an image and engage in personalized and coherent dialogues.
no code implementations • 30 Sep 2024 • Ke Yi, Zengke Liu, Jianwei Zhang, Chengyuan Li, Tong Zhang, Junyang Lin, Jingren Zhou
Based on observations of activations in large language models, outliers can be classified into channel-wise and spike outliers.
no code implementations • 18 Sep 2024 • Xuanchang Zhang, Wei Xiong, Lichang Chen, Tianyi Zhou, Heng Huang, Tong Zhang
In this work, we extend the study of biases in preference learning beyond the commonly recognized length bias, offering a comprehensive analysis of a wider range of format biases.
1 code implementation • 11 Sep 2024 • Bahar Aydemir, Deblina Bhattacharjee, Tong Zhang, Mathieu Salzmann, Sabine Süsstrunk
We propose a novel data augmentation method for deep saliency prediction that edits natural images while preserving the complexity and variability of real-world scenes.
no code implementations • 5 Sep 2024 • Yong Lin, Skyler Seto, Maartje ter Hoeve, Katherine Metcalf, Barry-John Theobald, Xuan Wang, Yizhe Zhang, Chen Huang, Tong Zhang
These findings highlight that DPORM has limited generalization ability and substantiates the integration of an explicit reward model in iterative DPO approaches.
no code implementations • 4 Sep 2024 • Wei Xiong, Chengshuai Shi, Jiaming Shen, Aviv Rosenberg, Zhen Qin, Daniele Calandriello, Misha Khalman, Rishabh Joshi, Bilal Piot, Mohammad Saleh, Chi Jin, Tong Zhang, Tianqi Liu
Recent studies have shown that large language models' (LLMs) mathematical problem-solving capabilities can be enhanced by integrating external tools, such as code interpreters, and employing multi-turn Chain-of-Thought (CoT) reasoning.
no code implementations • 30 Aug 2024 • Mingjun Sun, Shaochuan Wu, Haojie Wang, Yuanwei Liu, Guoyu Li, Tong Zhang
Lastly, using ICGNN as the core algorithm, we tailor the neural network's input and output for specific problem requirements and validate its performance in two scenarios: 1) in cellular networks, we develop a matrix-inverse-free multi-user multi-input multi-output (MU-MIMO) precoding scheme using the conjugate gradient (CG) method, adaptable to varying user and antenna numbers; 2) in a cell-free network, facing dynamic variations in the number of users served by APs, the number of APs serving each user, and the number of antennas per AP, we propose a universal power allocation scheme.
no code implementations • 27 Aug 2024 • Fangjinhua Wang, Qingtian Zhu, Di Chang, Quankai Gao, Junlin Han, Tong Zhang, Richard Hartley, Marc Pollefeys
3D reconstruction aims to recover the dense 3D structure of a scene.
2 code implementations • 24 Aug 2024 • Yifei He, Yuzheng Hu, Yong Lin, Tong Zhang, Han Zhao
Our algorithm works in two steps: i) Localization: identify tiny ($1\%$ of the total parameters) localized regions in the finetuned models containing essential skills for the downstream tasks, and ii) Stitching: reintegrate only these essential regions back into the pretrained model for task synergy.
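A toy sketch of the two steps, assuming the magnitude of the finetuning delta as the localization criterion (the paper's actual criterion may differ, e.g. a learned mask):

```python
import numpy as np

def localize_and_stitch(pretrained, finetuned, keep_frac=0.01):
    """Sketch of the two steps: (i) Localization keeps the keep_frac
    of parameters with the largest |finetuning delta|; (ii) Stitching
    adds only that masked delta back onto the pretrained weights."""
    delta = finetuned - pretrained
    k = max(1, int(keep_frac * delta.size))
    thresh = np.sort(np.abs(delta).ravel())[-k]   # k-th largest |delta|
    mask = np.abs(delta) >= thresh                # localized region
    return pretrained + mask * delta              # stitched model

pre = np.zeros(100)
fin = pre.copy()
fin[7] = 5.0      # one parameter encodes the downstream skill
fin += 0.001      # plus diffuse drift everywhere else
stitched = localize_and_stitch(pre, fin)
```

Only the single large-delta parameter survives stitching; the diffuse drift is discarded with the other 99% of coordinates.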
1 code implementation • 29 Jul 2024 • Yuheng Shi, Tong Zhang, Xiaojie Guo
In principle, the detection in a certain frame of a video can benefit from information in other frames.
Ranked #1 on Video Object Detection on ImageNet VID (using extra training data)
no code implementations • 24 Jul 2024 • Shuang Qiu, Dake Zhang, Rui Yang, Boxiang Lyu, Tong Zhang
This paper investigates multi-objective reinforcement learning (MORL), which focuses on learning Pareto optimal policies in the presence of multiple reward functions.
1 code implementation • 23 Jul 2024 • Haoqi Wang, Tong Zhang, Mathieu Salzmann
Vision Transformer models trained on large-scale datasets, although effective, often exhibit artifacts in the patch tokens they extract.
1 code implementation • 21 Jul 2024 • Jipeng Zhang, Yaxuan Qin, Renjie Pi, Weizhong Zhang, Rui Pan, Tong Zhang
Achieving this goal poses non-trivial challenges: 1) obtaining accurate data representations that reflect the training samples' quality, 2) accounting for the diverse nature of instruction datasets, and 3) ensuring the efficiency of the coreset selection algorithm for large models.
1 code implementation • 18 Jul 2024 • Wei zhang, Miaoxin Cai, Tong Zhang, Jun Li, Yin Zhuang, Xuerui Mao
Specifically, a shared visual encoding method is developed to establish the spatial pattern interpretation relationships between the multi-scale representations of input images and various visual prompts.
no code implementations • 17 Jul 2024 • Rui Xie, Asad Ul Haq, Linsen Ma, Krystal Sun, Sanchari Sen, Swagath Venkataramani, Liu Liu, Tong Zhang
Recent studies have revealed that, during the inference on generative AI models such as transformer, the importance of different weights exhibits substantial context-dependent variations.
no code implementations • 15 Jul 2024 • Tong Zhang, Chris Junchi Li
In this paper, we revisit ROOT-SGD, an innovative method for stochastic optimization that bridges the gap between stochastic optimization and statistical efficiency.
no code implementations • 11 Jul 2024 • Shuangqi Li, Chen Liu, Tong Zhang, Hieu Le, Sabine Süsstrunk, Mathieu Salzmann
We introduce an approach to bias deep generative models, such as GANs and diffusion models, towards generating data with either enhanced fidelity or increased diversity.
no code implementations • 10 Jul 2024 • Dake Zhang, Boxiang Lyu, Shuang Qiu, Mladen Kolar, Tong Zhang
We study risk-sensitive reinforcement learning (RL), a crucial field due to its ability to enhance decision-making in scenarios where it is essential to manage uncertainty and minimize potential adverse outcomes.
1 code implementation • 10 Jul 2024 • Lingzhi Pan, Tong Zhang, Bingyuan Chen, Qi Zhou, Wei Ke, Sabine Süsstrunk, Mathieu Salzmann
Our method searches latent spaces capable of generating inpainted regions that exhibit high fidelity to user-provided prompts while maintaining coherence with the background.
1 code implementation • 3 Jul 2024 • Ruida Wang, Jipeng Zhang, Yizhen Jia, Rui Pan, Shizhe Diao, Renjie Pi, Tong Zhang
However, due to the scarcity of aligned NL and Formal Language (FL) theorem-proving data, most modern LLMs exhibit suboptimal performance. This scarcity results in a paucity of methodologies for training LLMs and techniques to fully utilize their capabilities in composing formal proofs.
no code implementations • 28 Jun 2024 • Rui Pan, Jipeng Zhang, Xingyuan Pan, Renjie Pi, Xiaoyu Wang, Tong Zhang
Bilevel optimization has shown its utility across various machine learning settings, yet most algorithms in practice require second-order information, making it challenging to scale them up.
no code implementations • 21 Jun 2024 • Yuxing Liu, Rui Pan, Tong Zhang
Despite the huge success in practice, their theoretical advantages over classical gradient methods with uniform step sizes across all coordinates (e.g., SGD) have not been fully understood, especially in the large batch-size setting commonly used in practice.
2 code implementations • 18 Jun 2024 • Haoxiang Wang, Wei Xiong, Tengyang Xie, Han Zhao, Tong Zhang
The trained RM serves as a proxy for human preferences.
1 code implementation • 15 Jun 2024 • Tong Zhang, Yingdong Hu, Jiacheng You, Yang Gao
SGRv2 excels in RLBench tasks with keyframe control using merely 5 demonstrations and surpasses the RVT baseline in 23 of 26 tasks.
2 code implementations • 14 Jun 2024 • Rui Yang, Ruomeng Ding, Yong Lin, huan zhang, Tong Zhang
Reward models trained on human preference data have been proven to effectively align Large Language Models (LLMs) with human intent within the framework of reinforcement learning from human feedback (RLHF).
no code implementations • 12 Jun 2024 • Cheng Niu, Yang Guan, Yuanhao Wu, Juno Zhu, Juntong Song, Randy Zhong, Kaihua Zhu, Siliang Xu, Shizhe Diao, Tong Zhang
In response to this challenge, we introduce VeraCT Scan, a novel retrieval-augmented system for fake news detection.
no code implementations • 12 Jun 2024 • Yitao Xu, Tong Zhang, Sabine Süsstrunk
In this paper, we propose Adaptor Neural Cellular Automata (AdaNCA) for Vision Transformers that uses NCA as plug-and-play adaptors between ViT layers, thus enhancing ViT's performance and robustness against adversarial samples as well as out-of-distribution inputs.
1 code implementation • 11 Jun 2024 • Renjie Pi, Jianshu Zhang, Jipeng Zhang, Rui Pan, Zhekai Chen, Tong Zhang
Image description datasets play a crucial role in the advancement of various applications such as image understanding, text-to-image generation, and text-image retrieval.
no code implementations • 30 May 2024 • Ke Yi, Yuhui Xu, Heng Chang, Chen Tang, Yuan Meng, Tong Zhang, Jia Li
Large Language Models (LLMs) have advanced rapidly but face significant memory demands.
no code implementations • 27 May 2024 • Haohan Weng, Yikai Wang, Tong Zhang, C. L. Philip Chen, Jun Zhu
Generating compact and sharply detailed 3D meshes poses a significant challenge for current 3D generative models.
no code implementations • 27 May 2024 • Xunpeng Huang, Difan Zou, Yi-An Ma, Hanze Dong, Tong Zhang
Stochastic gradients have been widely integrated into Langevin-based methods to improve their scalability and efficiency in solving large-scale sampling problems.
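For context, the Langevin update that these stochastic-gradient variants build on adds Gaussian noise to a gradient step; replacing the exact gradient of log p with a minibatch estimate gives SGLD. A toy full-gradient sampler targeting N(0, 1), written as a sketch rather than any paper's algorithm:

```python
import numpy as np

def langevin_step(x, grad_log_p, step, rng):
    """One Langevin update: a half-step along (a possibly stochastic
    estimate of) the gradient of log p, plus Gaussian noise of matching
    scale, so the chain samples from p instead of collapsing to a mode."""
    return x + 0.5 * step * grad_log_p(x) + np.sqrt(step) * rng.normal()

# Toy target: standard normal, so grad log p(x) = -x.
rng = np.random.default_rng(0)
x, samples = 5.0, []
for _ in range(20000):
    x = langevin_step(x, lambda t: -t, step=0.1, rng=rng)
    samples.append(x)
mean_est = float(np.mean(samples[2000:]))   # burn-in discarded
```

After burn-in, the empirical mean is near 0 and the variance near 1, up to the bias introduced by the finite step size.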
no code implementations • 26 May 2024 • Xunpeng Huang, Difan Zou, Hanze Dong, Yi Zhang, Yi-An Ma, Tong Zhang
To generate data from trained diffusion models, most inference algorithms, such as DDPM, DDIM, and other variants, rely on discretizing the reverse SDEs or their equivalent ODEs.
no code implementations • 22 May 2024 • Licheng Shen, Ho Ngai Chow, Lingyun Wang, Tong Zhang, Mengqiu Wang, Yuxing Han
In this paper, we present Gaussian Time Machine (GTM), which models the time-dependent attributes of Gaussian primitives with discrete time embedding vectors decoded by a lightweight Multi-Layer Perceptron (MLP).
1 code implementation • 20 May 2024 • Tong Zhang, Peixin Qin, Yang Deng, Chen Huang, Wenqiang Lei, Junhong Liu, dingnan jin, Hongru Liang, Tat-Seng Chua
To this end, we introduce CLAMBER, a benchmark for evaluating LLMs using a well-organized taxonomy.
no code implementations • 17 May 2024 • Cheng Niu, Xingguang Wang, Xuxin Cheng, Juntong Song, Tong Zhang
Then a two-stage fine-tuning on LLaMA 2 is performed on the generated data and the real data for the DST prediction.
3 code implementations • 13 May 2024 • Hanze Dong, Wei Xiong, Bo Pang, Haoxiang Wang, Han Zhao, Yingbo Zhou, Nan Jiang, Doyen Sahoo, Caiming Xiong, Tong Zhang
We present the workflow of Online Iterative Reinforcement Learning from Human Feedback (RLHF) in this technical report, which is widely reported to outperform its offline counterpart by a large margin in the recent large language model (LLM) literature.
no code implementations • 23 Apr 2024 • Tong Zhang, Wenxue Cui, Shaohui Liu, Feng Jiang
Convolutional Neural Network (CNN) and Transformer have attracted much attention recently for video post-processing (VPP).
1 code implementation • CVPR 2024 • Yanhao Wu, Tong Zhang, Wei Ke, Congpei Qiu, Sabine Susstrunk, Mathieu Salzmann
Subsequently, we introduce a context-aware feature learning strategy, which encodes object patterns without relying on their specific context by aggregating object features across various scenes.
no code implementations • 4 Apr 2024 • Miao Lu, Han Zhong, Tong Zhang, Jose Blanchet
Unlike previous work, which relies on a generative model or a pre-collected offline dataset enjoying good coverage of the deployment environment, we tackle robust RL via interactive data collection, where the learner interacts with the training environment only and refines the policy through trial and error.
no code implementations • 2 Apr 2024 • Zixuan Zhang, Revanth Gangi Reddy, Kevin Small, Tong Zhang, Heng Ji
In addition, it is still unclear how well an OpenQA model can transfer to completely new knowledge domains.
no code implementations • 26 Mar 2024 • Yifan Hao, Yong Lin, Difan Zou, Tong Zhang
We demonstrate that in this scenario, further increasing the model's parameterization can significantly reduce the OOD loss.
1 code implementation • 26 Mar 2024 • Rui Pan, Xiang Liu, Shizhe Diao, Renjie Pi, Jipeng Zhang, Chi Han, Tong Zhang
Attempting to complement this deficiency, we investigate the layerwise properties of LoRA on fine-tuning tasks and observe an unexpected but consistent skewness of weight norms across different layers.
1 code implementation • CVPR 2024 • Chen Zhao, Tong Zhang, Zheng Dang, Mathieu Salzmann
Determining the relative pose of a previously unseen object between two images is pivotal to the success of generalizable object pose estimation.
no code implementations • 18 Mar 2024 • Qizhou Wang, Yong Lin, Yongqiang Chen, Ludwig Schmidt, Bo Han, Tong Zhang
Large vision language models, such as CLIP, demonstrate greater robustness to spurious features than single-modal models trained on ImageNet.
no code implementations • 15 Mar 2024 • Yongjie Wang, Tong Zhang, Xu Guo, Zhiqi Shen
Due to the lack of a rigorous definition of explainable AI (XAI), a plethora of research related to explainability, interpretability, and transparency has been developed to explain and analyze the model from various perspectives.
1 code implementation • CVPR 2024 • Haohan Weng, Danqing Huang, Yu Qiao, Zheng Hu, Chin-Yew Lin, Tong Zhang, C. L. Philip Chen
In this paper, we present Desigen, an automatic template creation pipeline which generates background images as well as harmonious layout elements over the background.
no code implementations • 13 Mar 2024 • Renjie Pi, Tianyang Han, Wei Xiong, Jipeng Zhang, Runtao Liu, Rui Pan, Tong Zhang
To mitigate this issue, we propose Bootstrapped Preference Optimization (BPO), which conducts preference learning with datasets containing negative responses bootstrapped from the model itself.
Ranked #102 on Visual Question Answering on MM-Vet
no code implementations • 13 Mar 2024 • ZiCheng Zhang, Tong Zhang, Yi Zhu, Jianzhuang Liu, Xiaodan Liang, Qixiang Ye, Wei Ke
To mitigate these issues, we propose a Language-Driven Visual Consensus (LDVC) approach, fostering improved alignment of semantic and visual information. Specifically, we leverage class embeddings as anchors due to their discrete and abstract nature, steering vision features toward class embeddings.
no code implementations • 11 Mar 2024 • Tong Zhang, Chen Huang, Yang Deng, Hongru Liang, Jia Liu, Zujie Wen, Wenqiang Lei, Tat-Seng Chua
We investigate non-collaborative dialogue agents, which are expected to engage in strategic conversations with diverse users, for securing a mutual agreement that leans favorably towards the system's objectives.
no code implementations • 11 Mar 2024 • Baran Ozaydin, Tong Zhang, Deblina Bhattacharjee, Sabine Süsstrunk, Mathieu Salzmann
Our OMH yields better unsupervised segmentation performance compared to existing USS methods.
no code implementations • 10 Mar 2024 • Xunpeng Huang, Hanze Dong, Difan Zou, Tong Zhang
Along this line, Freund et al. (2022) suggest that the modified Langevin algorithm with prior diffusion is able to converge dimension independently for strongly log-concave target distributions.
no code implementations • 6 Mar 2024 • Wei Zhang, Miaoxin Cai, Tong Zhang, Guoqiang Lei, Yin Zhuang, Xuerui Mao
Ship detection needs to identify ship locations from remote sensing (RS) scenes.
no code implementations • 29 Feb 2024 • Xiang Chen, Wenjie Zhu, Jiayuan Chen, Tong Zhang, Changyan Yi, Jun Cai
This paper proposes a novel edge computing enabled real-time video analysis system for intelligent visual devices.
1 code implementation • 28 Feb 2024 • Haoxiang Wang, Yong Lin, Wei Xiong, Rui Yang, Shizhe Diao, Shuang Qiu, Han Zhao, Tong Zhang
Additionally, DPA models user preferences as directions (i.e., unit vectors) in the reward space to achieve user-dependent preference control.
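To make the "preferences as directions" idea concrete, here is a toy sketch (not the paper's implementation): a response carries a multi-dimensional reward, a user's preference is a unit vector over those attributes, and the scalar reward is their dot product. The attribute names and numbers below are invented for illustration.

```python
import math

def directional_reward(reward_vec, preference_deg):
    """Project a 2-D reward (helpfulness, verbosity) onto a unit preference direction."""
    theta = math.radians(preference_deg)
    u = (math.cos(theta), math.sin(theta))  # unit vector in reward space
    return u[0] * reward_vec[0] + u[1] * reward_vec[1]

concise = (0.9, 0.1)   # helpful, terse
verbose = (0.7, 0.9)   # helpful, long-winded

# A user who only cares about helpfulness (0 degrees) prefers `concise`;
# one who weights verbosity heavily (75 degrees) prefers `verbose`.
print(directional_reward(concise, 0) > directional_reward(verbose, 0))    # True
print(directional_reward(concise, 75) < directional_reward(verbose, 75))  # True
```

Changing the angle reweights the same fixed reward vectors, which is the sense in which a single multi-dimensional reward model can serve many users.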
no code implementations • 15 Feb 2024 • Ying Su, Tianqing Fang, Huiru Xiao, Weiqi Wang, Yangqiu Song, Tong Zhang, Lei Chen
In this paper, we propose to adopt textual entailment to find implicit entailment relations between CSKG nodes, to effectively densify the subgraph connecting nodes within the same conceptual class, which indicates a similar level of plausibility.
no code implementations • 14 Feb 2024 • Chenlu Ye, Jiafan He, Quanquan Gu, Tong Zhang
We also prove a lower bound to show that the additive dependence on $C$ is optimal.
1 code implementation • 11 Feb 2024 • Chenlu Ye, Wei Xiong, Yuheng Zhang, Hanze Dong, Nan Jiang, Tong Zhang
We investigate Reinforcement Learning from Human Feedback (RLHF) in the context of a general preference oracle.
1 code implementation • 6 Feb 2024 • Tianyang Han, Qing Lian, Rui Pan, Renjie Pi, Jipeng Zhang, Shizhe Diao, Yong Lin, Tong Zhang
In this paper, we identify a typical class of inputs that baffles MLLMs, which consist of images that are highly relevant but inconsistent with answers, causing MLLMs to suffer from visual illusion.
1 code implementation • 31 Jan 2024 • Ying Su, Jipeng Zhang, Yangqiu Song, Tong Zhang
To facilitate the evaluation of pruned subgraphs, we also propose a graph attention network (GAT) based module to reason with the subgraph data.
1 code implementation • 30 Jan 2024 • Wei Zhang, Miaoxin Cai, Tong Zhang, Yin Zhuang, Xuerui Mao
Multi-modal large language models (MLLMs) have demonstrated remarkable success in vision and visual-language tasks within the natural image domain.
1 code implementation • 21 Jan 2024 • Chengbo Yuan, Chuan Wen, Tong Zhang, Yang Gao
Therefore, we propose to utilize 3D flow, which represents the future trajectories of 3D points on objects of interest, as an ideal prediction target.
no code implementations • 19 Jan 2024 • Yifan Hao, Tong Zhang
Recent empirical and theoretical studies have established the generalization capabilities of large machine learning models that are trained to (approximately or exactly) fit noisy data.
no code implementations • 12 Jan 2024 • Xunpeng Huang, Difan Zou, Hanze Dong, Yian Ma, Tong Zhang
Specifically, DMC follows the reverse SDE of a diffusion process that transforms the target distribution to the standard Gaussian, utilizing a non-parametric score estimation.
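The reverse SDE mentioned here is standard diffusion-model background (Anderson's time reversal), not a construction specific to this paper: for a forward process $\mathrm{d}X_t = f(X_t, t)\,\mathrm{d}t + g(t)\,\mathrm{d}W_t$ with marginal densities $p_t$, the reverse-time dynamics are

```latex
\mathrm{d}X_t = \left[ f(X_t, t) - g(t)^2 \nabla_x \log p_t(X_t) \right] \mathrm{d}t + g(t)\, \mathrm{d}\bar{W}_t,
```

where $\bar{W}_t$ is a reverse-time Brownian motion. In this reading, DMC's non-parametric score estimation stands in for the score term $\nabla_x \log p_t$.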
1 code implementation • 5 Jan 2024 • Renjie Pi, Tianyang Han, Jianshu Zhang, Yueqi Xie, Rui Pan, Qing Lian, Hanze Dong, Jipeng Zhang, Tong Zhang
The deployment of multimodal large language models (MLLMs) has brought forth a unique vulnerability: susceptibility to malicious attacks through visual inputs.
no code implementations • 3 Jan 2024 • Ernest Perkowski, Rui Pan, Tuan Dung Nguyen, Yuan-Sen Ting, Sandor Kruk, Tong Zhang, Charlie O'Neill, Maja Jablonska, Zechang Sun, Michael J. Smith, Huiling Liu, Kevin Schawinski, Kartheik Iyer, Ioana Ciucă for UniverseTBD
We explore the potential of enhancing LLM performance in astronomy-focused question-answering through targeted, continual pre-training.
3 code implementations • 31 Dec 2023 • Cheng Niu, Yuanhao Wu, Juno Zhu, Siliang Xu, Kashun Shum, Randy Zhong, Juntong Song, Tong Zhang
Retrieval-augmented generation (RAG) has become a main technique for alleviating hallucinations in large language models (LLMs).
no code implementations • 22 Dec 2023 • Rui Pan, Yuxing Liu, Xiaoyu Wang, Tong Zhang
This means SGD with heavy-ball momentum is useful in the large-batch settings such as distributed machine learning or federated learning, where a smaller number of iterations can significantly reduce the number of communication rounds, leading to acceleration in practice.
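The iteration being analyzed is the classical heavy-ball (Polyak momentum) update, $v_{t+1} = \beta v_t + \nabla f(x_t)$, $x_{t+1} = x_t - \eta\, v_{t+1}$. A minimal sketch on a 1-D quadratic, purely for illustration (the hyperparameters are arbitrary, not from the paper):

```python
def heavy_ball(grad, x0, lr=0.1, beta=0.9, steps=200):
    """Heavy-ball momentum on a scalar objective given its gradient."""
    x, v = x0, 0.0
    for _ in range(steps):
        v = beta * v + grad(x)  # momentum buffer accumulates gradients
        x = x - lr * v          # parameter step along the buffer
    return x

# Minimize f(x) = x^2 (gradient 2x) starting from x0 = 5.
x_star = heavy_ball(grad=lambda x: 2 * x, x0=5.0)
print(abs(x_star) < 1e-3)  # True: converges near the minimum at 0
```

The buffer lets each step reuse past gradients, which is why, in the large-batch regimes the snippet describes, fewer (and thus fewer communicated) iterations can suffice.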
3 code implementations • 18 Dec 2023 • Wei Xiong, Hanze Dong, Chenlu Ye, Ziqi Wang, Han Zhong, Heng Ji, Nan Jiang, Tong Zhang
We investigate its behavior in three distinct settings -- offline, online, and hybrid -- and propose efficient algorithms with finite-sample theoretical guarantees.
1 code implementation • 14 Dec 2023 • Jiaqi Tang, Hao Lu, Xiaogang Xu, Ruizheng Wu, Sixing Hu, Tong Zhang, Tsz Wa Cheng, Ming Ge, Ying-Cong Chen, Fugee Tsung
Artificial Intelligence (AI)-driven defect inspection is pivotal in industrial manufacturing.
1 code implementation • 7 Dec 2023 • Shibin Wu, Bang Yang, Zhiyu Ye, Haoqian Wang, Hairong Zheng, Tong Zhang
Medical report generation demands automatic creation of coherent and precise descriptions for medical images.
1 code implementation • 5 Dec 2023 • Zhi Chen, Yufan Ren, Tong Zhang, Zheng Dang, Wenbing Tao, Sabine Süsstrunk, Mathieu Salzmann
To achieve this, we introduce a training procedure and a refinement network.
no code implementations • 29 Nov 2023 • Yingdong Hu, Fanqi Lin, Tong Zhang, Li Yi, Yang Gao
In this study, we are interested in imbuing robots with the capability of physically-grounded task planning.
no code implementations • 27 Nov 2023 • Tong Zhang, Haoyang Liu, Peiyan Zhang, Yuxuan Cheng, Haohan Wang
Our method focuses on producing SVGs that are both accurate and simple, aligning with human readability and understanding.
1 code implementation • 16 Nov 2023 • Hanning Zhang, Shizhe Diao, Yong Lin, Yi R. Fung, Qing Lian, Xingyao Wang, Yangyi Chen, Heng Ji, Tong Zhang
This approach is formalized by first identifying the disparity in knowledge encompassed by pre-trained parameters compared to that of instruction tuning data.
1 code implementation • 14 Nov 2023 • Rui Pan, Shuo Xing, Shizhe Diao, Wenhe Sun, Xiang Liu, Kashun Shum, Renjie Pi, Jipeng Zhang, Tong Zhang
Since the emergence of large language models, prompt learning has become a popular method for optimizing and customizing these models.
no code implementations • CVPR 2024 • Renjie Pi, Lewei Yao, Jiahui Gao, Jipeng Zhang, Tong Zhang
In this paper, we present a novel end-to-end framework named PerceptionGPT, which efficiently and effectively equips the VLLMs with visual perception abilities by leveraging the representation power of LLMs' token embedding.
1 code implementation • 11 Nov 2023 • Haoyu Ma, Tong Zhang, Shanlin Sun, Xiangyi Yan, Kun Han, Xiaohui Xie
Reconstructing personalized animatable head avatars has significant implications in the fields of AR/VR.
no code implementations • 6 Nov 2023 • Ehsan Pajouheshgar, Yitao Xu, Alexander Mordvintsev, Eyvind Niklasson, Tong Zhang, Sabine Süsstrunk
We propose Mesh Neural Cellular Automata (MeshNCA), a method that directly synthesizes dynamic textures on 3D meshes without requiring any UV maps.
1 code implementation • NeurIPS 2023 • Chenlu Ye, Rui Yang, Quanquan Gu, Tong Zhang
Notably, under the assumption of single policy coverage and the knowledge of $\zeta$, our proposed algorithm achieves a suboptimality bound that is worsened by an additive factor of $\mathcal{O}(\zeta (C(\widehat{\mathcal{F}},\mu)n)^{-1})$ due to the corruption.
2 code implementations • 19 Oct 2023 • Rui Yang, Han Zhong, Jiawei Xu, Amy Zhang, Chongjie Zhang, Lei Han, Tong Zhang
Offline reinforcement learning (RL) presents a promising approach for learning reinforced policies from offline datasets without the need for costly or unsafe interactions with the environment.
no code implementations • 12 Oct 2023 • Haohan Weng, Tianyu Yang, Jianan Wang, Yu Li, Tong Zhang, C. L. Philip Chen, Lei Zhang
Large image diffusion models enable novel view synthesis with high quality and excellent zero-shot capability.
no code implementations • 5 Oct 2023 • Chen Zhao, Tong Zhang, Mathieu Salzmann
Our goal then is to estimate the relative object pose between this reference view and a query image that depicts the object in a different pose.
no code implementations • 4 Oct 2023 • Weirui Ye, Yunsheng Zhang, Haoyang Weng, Xianfan Gu, Shengjie Wang, Tong Zhang, Mengchen Wang, Pieter Abbeel, Yang Gao
We propose Reinforcement Learning with Foundation Priors (RLFP) to utilize guidance and feedback from policy, value, and success-reward foundation models.
no code implementations • 29 Sep 2023 • Yong Lin, Lu Tan, Yifan Hao, Honam Wong, Hanze Dong, Weizhong Zhang, Yujiu Yang, Tong Zhang
Contrary to the conventional wisdom that focuses on learning invariant features for better OOD performance, our findings suggest that incorporating a large number of diverse spurious features weakens their individual contributions, leading to improved overall OOD generalization performance.
no code implementations • 25 Sep 2023 • Tong Zhang, X. Jessie Yang, Boyang Li
With this paper, we investigate if free-form conversations can enhance users' comprehension of static explanations, improve acceptance and trust in the explanation methods, and facilitate human-AI collaboration.
1 code implementation • 18 Sep 2023 • Helbert Paat, Qing Lian, Weilong Yao, Tong Zhang
In this paper, we present the first approach that addresses the inherent ambiguities present in pseudo labels by introducing an Evidential Deep Learning (EDL) based uncertainty estimation framework.
1 code implementation • 12 Sep 2023 • Yong Lin, Hangyu Lin, Wei Xiong, Shizhe Diao, Jianmeng Liu, Jipeng Zhang, Rui Pan, Haoxiang Wang, Wenbin Hu, Hanning Zhang, Hanze Dong, Renjie Pi, Han Zhao, Nan Jiang, Heng Ji, Yuan YAO, Tong Zhang
Building on the analysis and the observation that averaging different layers of the transformer leads to significantly different alignment-forgetting trade-offs, we propose Heterogeneous Model Averaging (HMA) to heterogeneously find various combination ratios of model layers.
1 code implementation • 11 Sep 2023 • Yide Qiu, Shaoxiang Ling, Tong Zhang, Bo Huang, Zhen Cui
To perform effective learning on the large-scale UniKG, two key measures are taken: (i) a semantic alignment strategy for multi-attribute entities, which projects the feature descriptions of multi-attribute nodes into a common embedding space to facilitate node aggregation in a large receptive field; and (ii) a novel plug-and-play anisotropy propagation module (APM) that learns effective multi-hop anisotropy propagation kernels, extending methods for large-scale homogeneous graphs to heterogeneous graphs.
no code implementations • 9 Sep 2023 • Menghao Hu, Tong Zhang, Shuai Wang, Guoliang Li, Yingyang Chen, Qiang Li, Gaojie Chen
Terrestrial robots, i.e., unmanned ground vehicles (UGVs), and aerial robots, i.e., unmanned aerial vehicles (UAVs), operate in separate spaces.
no code implementations • 5 Sep 2023 • Yong Lin, Chen Liu, Chenlu Ye, Qing Lian, Yuan YAO, Tong Zhang
Our proposed method, COPS (unCertainty based OPtimal Sub-sampling), is designed to minimize the expected loss of a model trained on subsampled data.
1 code implementation • 16 Aug 2023 • Jianyu Wen, Chenhao Wu, Tong Zhang, Yixuan Yu, Piotr Swierczynski
In this paper, we propose a 2-stage low-light image enhancement method called Self-Reference Deep Adaptive Curve Estimation (Self-DACE).
no code implementations • 10 Jul 2023 • Aixuan Li, Jing Zhang, Yunqiu Lv, Tong Zhang, Yiran Zhong, Mingyi He, Yuchao Dai
In this case, salient objects are typically non-camouflaged, and camouflaged objects are usually not salient.
no code implementations • 5 Jul 2023 • Xunpeng Huang, Hanze Dong, Yifan Hao, Yi-An Ma, Tong Zhang
We propose a Monte Carlo sampler from the reverse diffusion process.
1 code implementation • 21 Jun 2023 • Shizhe Diao, Rui Pan, Hanze Dong, Ka Shun Shum, Jipeng Zhang, Wei Xiong, Tong Zhang
As the number of available foundation models and specialized tasks keeps growing, the job of training scientific language models becomes highly nontrivial.
1 code implementation • 18 Jun 2023 • Tong Zhang, Yingdong Hu, Hanchen Cui, Hang Zhao, Yang Gao
To this end, we present Semantic-Geometric Representation (SGR), a universal perception module for robotics that leverages the rich semantic information of large-scale pre-trained 2D models and inherits the merits of 3D spatial reasoning.
no code implementations • 18 Jun 2023 • Guangbu Liu, Tong Zhang, Xudong Wang, Wenting Zhao, Chuanwei Zhou, Zhen Cui
Instead of a plain use of a base graph dictionary, we propose the variational graph dictionary adaptation (VGDA) to generate a personalized dictionary (named adapted graph dictionary) for catering to each input graph.
1 code implementation • 18 Jun 2023 • Yifan Zhao, Tong Zhang, Jia Li, Yonghong Tian
Recent progress in this setting assumes that the base knowledge and novel query samples are distributed in the same domains, which are usually infeasible for realistic applications.
no code implementations • 9 Jun 2023 • Bang Yang, Asif Raza, Yuexian Zou, Tong Zhang
In this work, we propose customizing off-the-shelf general-purpose large-scale pre-trained models, i.e., foundation models (FMs), in computer vision and natural language processing with a specific focus on medical report generation.
1 code implementation • 8 Jun 2023 • Shizhe Diao, Tianyang Xu, Ruijia Xu, Jiawei Wang, Tong Zhang
Pre-trained language models (PLMs) demonstrate excellent abilities to understand texts in the generic domain while struggling in a specific domain.
1 code implementation • 30 May 2023 • Rui Yang, Yong Lin, Xiaoteng Ma, Hao Hu, Chongjie Zhang, Tong Zhang
In this paper, we study out-of-distribution (OOD) generalization of offline GCRL both theoretically and empirically to identify factors that are important.
no code implementations • CVPR 2024 • Dongqing Wang, Tong Zhang, Alaa Abboud, Sabine Süsstrunk
We propose InNeRF360, an automatic system that accurately removes text-specified objects from 360-degree Neural Radiance Fields (NeRF).
1 code implementation • 23 May 2023 • Renjie Pi, Jiahui Gao, Shizhe Diao, Rui Pan, Hanze Dong, Jipeng Zhang, Lewei Yao, Jianhua Han, Hang Xu, Lingpeng Kong, Tong Zhang
Overall, our proposed paradigm and DetGPT demonstrate the potential for more sophisticated and intuitive interactions between humans and machines.
no code implementations • 12 May 2023 • Haiqi Liu, C. L. Philip Chen, Xinrong Gong, Tong Zhang
Recognizing novel sub-categories with scarce samples is an essential and challenging research topic in computer vision.
1 code implementation • 18 Apr 2023 • Wentao Zhang, Yujun Huang, Tong Zhang, Qingsong Zou, Wei-Shi Zheng, Ruixuan Wang
In particular, updating an intelligent diagnosis system with training data of new diseases would cause catastrophic forgetting of old disease knowledge.
no code implementations • 15 Apr 2023 • Tong Zhang, Wenxue Cui, Chen Hui, Feng Jiang
Deep network-based image and video Compressive Sensing (CS) has attracted increasing attention in recent years.
1 code implementation • 13 Apr 2023 • Hanze Dong, Wei Xiong, Deepanshu Goyal, Yihan Zhang, Winnie Chow, Rui Pan, Shizhe Diao, Jipeng Zhang, Kashun Shum, Tong Zhang
Utilizing a reward model and a sufficient number of samples, our approach selects the high-quality samples, discarding those that exhibit undesired behavior, and subsequently enhancing the model by fine-tuning on these filtered samples.
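The selection step this snippet describes — generate candidates, score them with a reward model, keep only the best for fine-tuning — can be sketched as follows. This is a hedged toy sketch, not the paper's code: `generate` and `reward_model` are placeholder callables.

```python
def reward_ranked_filter(prompts, generate, reward_model, keep_ratio=0.25):
    """Return the highest-reward (prompt, response) pairs for fine-tuning."""
    scored = []
    for prompt in prompts:
        for response in generate(prompt):          # sample several candidates
            scored.append((reward_model(prompt, response), prompt, response))
    scored.sort(key=lambda t: t[0], reverse=True)  # rank by reward
    kept = scored[: max(1, int(len(scored) * keep_ratio))]
    return [(p, r) for _, p, r in kept]            # fine-tune on these pairs

# Toy example: a reward model that simply favors longer responses.
pairs = reward_ranked_filter(
    prompts=["hi"],
    generate=lambda p: ["ok", "ok ok", "ok ok ok", "ok ok ok ok"],
    reward_model=lambda p, r: len(r),
)
print(pairs)  # with keep_ratio=0.25, only the single best pair survives
```

In this reading, the fine-tuning loop never sees the discarded low-reward samples, which is how undesired behavior is filtered out.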
no code implementations • 12 Apr 2023 • Shiwei Zhang, Zhengzheng Wang, Qing Liu, Fei Wang, Wei Ke, Tong Zhang
This paper presents a new annotation method called Sparse Annotation (SA) for crowd counting, which reduces human labeling efforts by sparsely labeling individuals in an image.
no code implementations • 1 Apr 2023 • Chunyu Lei, C. L. Philip Chen, Jifeng Guo, Tong Zhang
Third, the TSMS feature fusion layer is proposed to extract more effective multi-scale features through the integration of CF layers and CE layers.
1 code implementation • 29 Mar 2023 • Congpei Qiu, Tong Zhang, Wei Ke, Mathieu Salzmann, Sabine Süsstrunk
Dense Self-Supervised Learning (SSL) methods address the limitations of using image-level feature representations when handling images with multiple objects.
1 code implementation • CVPR 2023 • Yanhao Wu, Tong Zhang, Wei Ke, Sabine Süsstrunk, Mathieu Salzmann
In this paper, we introduce an SSL strategy that leverages positive pairs in both the spatial and temporal domain.
no code implementations • ICCV 2023 • Dongqing Wang, Tong Zhang, Sabine Süsstrunk
We propose NEMTO, the first end-to-end neural rendering pipeline to model 3D transparent objects with complex geometry and unknown indices of refraction.
1 code implementation • 6 Mar 2023 • Jianqing Fan, Cong Fang, Yihong Gu, Tong Zhang
To the best of our knowledge, this paper is the first to realize statistically efficient invariance learning in the general linear model.
no code implementations • 2 Mar 2023 • Shihong Ding, Hanze Dong, Cong Fang, Zhouchen Lin, Tong Zhang
To circumvent this difficulty, we examine the problem of identifying a mixed Nash equilibrium, where strategies are randomized and characterized by probability distributions over continuous domains.
2 code implementations • 24 Feb 2023 • Kashun Shum, Shizhe Diao, Tong Zhang
However, most CoT studies rely on carefully designed human-annotated rationale chains to prompt LLMs, posing challenges for real-world applications where labeled data is available without rationale chains.
2 code implementations • 23 Feb 2023 • Shizhe Diao, Pengcheng Wang, Yong Lin, Rui Pan, Xiang Liu, Tong Zhang
For this purpose, we propose a solution to the key problem of determining which questions are the most important and helpful ones to annotate from a pool of task-specific queries.
no code implementations • 21 Feb 2023 • Heyang Zhao, Jiafan He, Dongruo Zhou, Tong Zhang, Quanquan Gu
We propose a variance-adaptive algorithm for linear mixture MDPs, which achieves a problem-dependent horizon-free regret bound that can gracefully reduce to a nearly constant regret for deterministic MDPs.
1 code implementation • 20 Feb 2023 • Shizhe Diao, Sedrick Scott Keh, Liangming Pan, Zhiliang Tian, Yan Song, Tong Zhang
Social media classification tasks (e. g., tweet sentiment analysis, tweet stance detection) are challenging because social media posts are typically short, informal, and ambiguous.
no code implementations • 6 Feb 2023 • Yae Jee Cho, Pranay Sharma, Gauri Joshi, Zheng Xu, Satyen Kale, Tong Zhang
Federated Averaging (FedAvg) and its variants are the most popular optimization algorithms in federated learning (FL).
no code implementations • 2 Feb 2023 • Tong Zhang, Yong Liu, Boyang Li, Zhiwei Zeng, Pengwei Wang, Yuan You, Chunyan Miao, Lizhen Cui
HAHT maintains a long-term memory of history conversations and utilizes history information to understand current conversation context and generate well-informed and context-relevant responses.
1 code implementation • 1 Feb 2023 • Bu Jin, Xinyu Liu, Yupeng Zheng, Pengfei Li, Hao Zhao, Tong Zhang, Yuhang Zheng, Guyue Zhou, Jingjing Liu
To bridge the gap, we propose an end-to-end transformer-based architecture, ADAPT (Action-aware Driving cAPtion Transformer), which provides user-friendly natural language narrations and reasoning for each decision making step of autonomous vehicular control and action.
no code implementations • 31 Jan 2023 • Jonathan N. Lee, Alekh Agarwal, Christoph Dann, Tong Zhang
POMDPs capture a broad class of decision making problems, but hardness results suggest that learning is intractable even in simple settings due to the inherent partial observability.
no code implementations • 24 Jan 2023 • Xiao Zhou, Renjie Pi, Weizhong Zhang, Yong Lin, Tong Zhang
The goal of coreset selection in supervised learning is to produce a weighted subset of data, so that training only on the subset achieves similar performance as training on the entire dataset.
1 code implementation • 24 Jan 2023 • Xiao Zhou, Yong Lin, Renjie Pi, Weizhong Zhang, Renzhe Xu, Peng Cui, Tong Zhang
The overfitting issue is addressed by considering a bilevel formulation to search for the sample reweighting, in which the generalization complexity depends on the search space of sample weights instead of the model size.
1 code implementation • 5 Jan 2023 • Bahar Aydemir, Ludo Hoffstetter, Tong Zhang, Mathieu Salzmann, Sabine Süsstrunk
Although deep saliency prediction algorithms complement object recognition features, they typically rely on additional information, such as scene context, semantic relationships, gaze direction, and object dissimilarity.
1 code implementation • CVPR 2023 • Bahar Aydemir, Ludo Hoffstetter, Tong Zhang, Mathieu Salzmann, Sabine Süsstrunk
Although deep saliency prediction algorithms complement object recognition features, they typically rely on additional information, such as scene context, semantic relationships, gaze direction, and object dissimilarity.
Ranked #3 on Saliency Prediction on SALICON
1 code implementation • 26 Dec 2022 • Baran Ozaydin, Tong Zhang, Sabine Süsstrunk, Mathieu Salzmann
To stylize the source content with the exemplar style, we extract unsupervised cross-domain semantic correspondences and warp the exemplar style to the source content.
1 code implementation • CVPR 2023 • Yufan Ren, Fangjinhua Wang, Tong Zhang, Marc Pollefeys, Sabine Süsstrunk
The success of the Neural Radiance Fields (NeRF) in novel view synthesis has inspired researchers to propose neural implicit scene reconstruction.
no code implementations • 12 Dec 2022 • Alekh Agarwal, Yujia Jin, Tong Zhang
We study time-inhomogeneous episodic reinforcement learning (RL) under general function approximation and sparse rewards.
no code implementations • 12 Dec 2022 • Chenlu Ye, Wei Xiong, Quanquan Gu, Tong Zhang
In this paper, we consider the contextual bandit with general function approximation and propose a computationally efficient algorithm to achieve a regret of $\tilde{O}(\sqrt{T}+\zeta)$.
1 code implementation • 30 Nov 2022 • Rui Pan, Shizhe Diao, Jianlin Chen, Tong Zhang
In this paper, we present ExtremeBERT, a toolkit for accelerating and customizing BERT pretraining.
no code implementations • 29 Nov 2022 • Tong Zhang, Ying Tan, Xiang Chen, Zike Lei
The key design idea for this observer is to estimate the visible set and identify the mis-identified features from the measurements.
no code implementations • 25 Nov 2022 • Hanze Dong, Xi Wang, Yong Lin, Tong Zhang
With the popularity of Stein variational gradient descent (SVGD), the focus of particle-based VI algorithms has been on the properties of functions in Reproducing Kernel Hilbert Space (RKHS) to approximate the gradient flow.
no code implementations • CVPR 2023 • Ehsan Pajouheshgar, Yitao Xu, Tong Zhang, Sabine Süsstrunk
Current Dynamic Texture Synthesis (DyTS) models can synthesize realistic videos.
1 code implementation • 21 Nov 2022 • Hanze Dong, Shizhe Diao, Weizhong Zhang, Tong Zhang
The resulting method is significantly more powerful than the standard normalization flow approach for generating data distributions with multiple modes.
no code implementations • 20 Nov 2022 • Zhongyu Fang, Aoyun He, Qihui Yu, Baopeng Gao, Weiping Ding, Tong Zhang, Lei Ma
In this paper, we developed a large multimodal emotion dataset, named "HED" dataset, to facilitate the emotion recognition task, and accordingly propose a multimodal emotion recognition method.
no code implementations • 11 Nov 2022 • Kilean Hwang, Tomofumi Maruta, Alexander Plastun, Kei Fukushima, Tong Zhang, Qiang Zhao, Peter Ostroumov, Yue Hao
Bayesian optimization (BO) is often used for accelerator tuning due to its high sample efficiency.
no code implementations • 3 Nov 2022 • Han Zhong, Wei Xiong, Sirui Zheng, LiWei Wang, Zhaoran Wang, Zhuoran Yang, Tong Zhang
The proposed algorithm modifies the standard posterior sampling algorithm in two aspects: (i) we use an optimistic prior distribution that biases towards hypotheses with higher values, and (ii) the log-likelihood function is set to be the empirical loss evaluated on the historical data, where the choice of loss function supports both model-free and model-based learning.