Search Results for author: Tong Zhang

Found 365 papers, 110 papers with code

Speeding up Transformer Decoding via an Attention Refinement Network

1 code implementation COLING 2022 Kaixin Wu, Yue Zhang, Bojie Hu, Tong Zhang

Extensive experiments on ten WMT machine translation tasks show that the proposed model is on average 1.35x faster (with almost no decrease in BLEU) than the state-of-the-art inference implementation.

Machine Translation NMT +1

Toward Knowledge-Enriched Conversational Recommendation Systems

no code implementations NLP4ConvAI (ACL) 2022 Tong Zhang, Yong liu, Boyang Li, Peixiang Zhong, Chen Zhang, Hao Wang, Chunyan Miao

Conversational Recommendation Systems recommend items through language-based interactions with users. In order to generate naturalistic conversations and effectively utilize knowledge graphs (KGs) containing background information, we propose a novel Bag-of-Entities loss, which encourages the generated utterances to mention concepts related to the item being recommended, such as the genre or director of a movie.

Knowledge Graphs Recommendation Systems +1

Mitigating Object Dependencies: Improving Point Cloud Self-Supervised Learning through Object Exchange

no code implementations 11 Apr 2024 Yanhao Wu, Tong Zhang, Wei Ke, Congpei Qiu, Sabine Susstrunk, Mathieu Salzmann

Subsequently, we introduce a context-aware feature learning strategy, which encodes object patterns without relying on their specific context by aggregating object features across various scenes.

Object Scene Understanding +1

Distributionally Robust Reinforcement Learning with Interactive Data Collection: Fundamental Hardness and Near-Optimal Algorithm

no code implementations 4 Apr 2024 Miao Lu, Han Zhong, Tong Zhang, Jose Blanchet

Unlike previous work, which relies on a generative model or a pre-collected offline dataset enjoying good coverage of the deployment environment, we tackle robust RL via interactive data collection, where the learner interacts with the training environment only and refines the policy through trial and error.

Reinforcement Learning (RL)

On the Benefits of Over-parameterization for Out-of-Distribution Generalization

no code implementations 26 Mar 2024 Yifan Hao, Yong Lin, Difan Zou, Tong Zhang

We demonstrate that in this scenario, further increasing the model's parameterization can significantly reduce the OOD loss.

Out-of-Distribution Generalization

LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning

1 code implementation 26 Mar 2024 Rui Pan, Xiang Liu, Shizhe Diao, Renjie Pi, Jipeng Zhang, Chi Han, Tong Zhang

Attempting to complement this deficiency, we investigate layerwise properties of LoRA on fine-tuning tasks and observe an uncommon skewness of weight norms across different layers.

GSM8K Language Modelling +1

DVMNet: Computing Relative Pose for Unseen Objects Beyond Hypotheses

1 code implementation 20 Mar 2024 Chen Zhao, Tong Zhang, Zheng Dang, Mathieu Salzmann

Determining the relative pose of an object between two images is pivotal to the success of generalizable object pose estimation.

Object Pose Estimation

Do CLIPs Always Generalize Better than ImageNet Models?

no code implementations 18 Mar 2024 Qizhou Wang, Yong Lin, Yongqiang Chen, Ludwig Schmidt, Bo Han, Tong Zhang

The performance drops from the common to counter groups quantify the reliance of models on spurious features (i.e., backgrounds) to predict the animals.

Gradient based Feature Attribution in Explainable AI: A Technical Review

no code implementations 15 Mar 2024 Yongjie Wang, Tong Zhang, Xu Guo, Zhiqi Shen

Due to the lack of a rigorous definition of explainable AI (XAI), a plethora of research related to explainability, interpretability, and transparency has been developed to explain and analyze the model from various perspectives.

Autonomous Driving

Desigen: A Pipeline for Controllable Design Template Generation

no code implementations 14 Mar 2024 Haohan Weng, Danqing Huang, Yu Qiao, Zheng Hu, Chin-Yew Lin, Tong Zhang, C. L. Philip Chen

In this paper, we present Desigen, an automatic template creation pipeline which generates background images as well as harmonious layout elements over the background.

Language-Driven Visual Consensus for Zero-Shot Semantic Segmentation

no code implementations 13 Mar 2024 ZiCheng Zhang, Tong Zhang, Yi Zhu, Jianzhuang Liu, Xiaodan Liang, Qixiang Ye, Wei Ke

To mitigate these issues, we propose a Language-Driven Visual Consensus (LDVC) approach, fostering improved alignment of semantic and visual information. Specifically, we leverage class embeddings as anchors due to their discrete and abstract nature, steering vision features toward class embeddings.

Language Modelling Semantic Segmentation +1

Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization

no code implementations 13 Mar 2024 Renjie Pi, Tianyang Han, Wei Xiong, Jipeng Zhang, Runtao Liu, Rui Pan, Tong Zhang

To mitigate this issue, we propose Bootstrapped Preference Optimization (BPO), which conducts preference learning with datasets containing negative responses bootstrapped from the model itself.

Language Modelling Large Language Model +1

Strength Lies in Differences! Towards Effective Non-collaborative Dialogues via Tailored Strategy Planning

no code implementations 11 Mar 2024 Tong Zhang, Chen Huang, Yang Deng, Hongru Liang, Jia Liu, Zujie Wen, Wenqiang Lei, Tat-Seng Chua

We investigate non-collaborative dialogue agents that must engage in tailored strategic planning for diverse users to secure a favorable agreement.

An Improved Analysis of Langevin Algorithms with Prior Diffusion for Non-Log-Concave Sampling

no code implementations 10 Mar 2024 Xunpeng Huang, Hanze Dong, Difan Zou, Tong Zhang

Along this line, Freund et al. (2022) suggest that the modified Langevin algorithm with prior diffusion is able to converge dimension independently for strongly log-concave target distributions.

Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards

1 code implementation 28 Feb 2024 Haoxiang Wang, Yong Lin, Wei Xiong, Rui Yang, Shizhe Diao, Shuang Qiu, Han Zhao, Tong Zhang

Additionally, DPA models user preferences as directions (i.e., unit vectors) in the reward space to achieve user-dependent preference control.
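
As a rough illustration of the idea in this snippet, the sketch below represents a user preference as a unit vector in a two-dimensional reward space and scores a response by projecting its reward vector onto that direction. The function names, the two-objective setup, and the dot-product scoring rule are assumptions made for illustration; the snippet does not state the exact form DPA uses.

```python
import math


def unit_direction(angle_deg):
    """Hypothetical helper: a unit vector in a 2-D reward space.

    The two axes stand for two illustrative reward objectives;
    the angle sets how much weight each objective receives.
    """
    theta = math.radians(angle_deg)
    return (math.cos(theta), math.sin(theta))


def preference_score(rewards, direction):
    """Project a multi-objective reward vector onto a preference direction.

    Assumption: the scalar score is a plain dot product.
    """
    return sum(r * d for r, d in zip(rewards, direction))
```

In this toy setup, changing the angle shifts the weight between the two objectives while the reward vector itself stays fixed.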

EntailE: Introducing Textual Entailment in Commonsense Knowledge Graph Completion

no code implementations 15 Feb 2024 Ying Su, Tianqing Fang, Huiru Xiao, Weiqi Wang, Yangqiu Song, Tong Zhang, Lei Chen

In this paper, we propose to adopt textual entailment to find implicit entailment relations between CSKG nodes, to effectively densify the subgraph connecting nodes within the same conceptual class, which indicates a similar level of plausibility.

graph construction Knowledge Graph Embedding +1

A Theoretical Analysis of Nash Learning from Human Feedback under General KL-Regularized Preference

no code implementations 11 Feb 2024 Chenlu Ye, Wei Xiong, Yuheng Zhang, Nan Jiang, Tong Zhang

In this work, we provide theoretical insights for a recently proposed learning paradigm, Nash learning from human feedback (NLHF), which considered a general preference model and formulated the alignment process as a game between two competitive LLMs.

The Instinctive Bias: Spurious Images lead to Hallucination in MLLMs

1 code implementation 6 Feb 2024 Tianyang Han, Qing Lian, Rui Pan, Renjie Pi, Jipeng Zhang, Shizhe Diao, Yong Lin, Tong Zhang

In this paper, we identify a typical class of inputs that baffles MLLMs, which consist of images that are highly relevant but inconsistent with answers, causing MLLMs to suffer from hallucination.


PipeNet: Question Answering with Semantic Pruning over Knowledge Graphs

no code implementations 31 Jan 2024 Ying Su, Jipeng Zhang, Yangqiu Song, Tong Zhang

To facilitate the evaluation of pruned subgraphs, we also propose a graph attention network (GAT) based module to reason with the subgraph data.

Graph Attention Knowledge Graphs +1

EarthGPT: A Universal Multi-modal Large Language Model for Multi-sensor Image Comprehension in Remote Sensing Domain

1 code implementation 30 Jan 2024 Wei zhang, Miaoxin Cai, Tong Zhang, Yin Zhuang, Xuerui Mao

Multi-modal large language models (MLLMs) have demonstrated remarkable success in vision and visual-language tasks within the natural image domain.

Image Comprehension Instruction Following +2

General Flow as Foundation Affordance for Scalable Robot Learning

no code implementations 21 Jan 2024 Chengbo Yuan, Chuan Wen, Tong Zhang, Yang Gao

Our predicted flow offers actionable geometric and physics guidance, thus facilitating stable zero-shot skill transfer in real-world scenarios. We deploy our method with a policy based on closed-loop flow prediction.

The Surprising Harmfulness of Benign Overfitting for Adversarial Robustness

no code implementations 19 Jan 2024 Yifan Hao, Tong Zhang

Recent empirical and theoretical studies have established the generalization capabilities of large machine learning models that are trained to (approximately or exactly) fit noisy data.

Adversarial Robustness

Faster Sampling without Isoperimetry via Diffusion-based Monte Carlo

no code implementations 12 Jan 2024 Xunpeng Huang, Difan Zou, Hanze Dong, Yian Ma, Tong Zhang

Specifically, DMC follows the reverse SDE of a diffusion process that transforms the target distribution to the standard Gaussian, utilizing a non-parametric score estimation.

MLLM-Protector: Ensuring MLLM's Safety without Hurting Performance

1 code implementation 5 Jan 2024 Renjie Pi, Tianyang Han, Yueqi Xie, Rui Pan, Qing Lian, Hanze Dong, Jipeng Zhang, Tong Zhang

The deployment of multimodal large language models (MLLMs) has brought forth a unique vulnerability: susceptibility to malicious attacks through visual inputs.

RAGTruth: A Hallucination Corpus for Developing Trustworthy Retrieval-Augmented Language Models

1 code implementation 31 Dec 2023 Yuanhao Wu, Juno Zhu, Siliang Xu, Kashun Shum, Cheng Niu, Randy Zhong, Juntong Song, Tong Zhang

Retrieval-augmented generation (RAG) has become a main technique for alleviating hallucinations in large language models (LLMs).

Hallucination Retrieval

Accelerated Convergence of Stochastic Heavy Ball Method under Anisotropic Gradient Noise

no code implementations 22 Dec 2023 Rui Pan, Yuxing Liu, Xiaoyu Wang, Tong Zhang

This means SGD with heavy-ball momentum is useful in the large-batch settings such as distributed machine learning or federated learning, where a smaller number of iterations can significantly reduce the number of communication rounds, leading to acceleration in practice.

Federated Learning
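
For readers unfamiliar with the method named in this entry, the following minimal sketch shows the heavy-ball momentum update on a one-dimensional objective. It is a toy, noise-free version: in SGD the `grad` callable would return a minibatch gradient estimate instead of the exact gradient. The function name, step size, and momentum value are illustrative assumptions, not the paper's settings.

```python
def heavy_ball(grad, x0, lr=0.1, beta=0.9, steps=200):
    """Run heavy-ball iterations on a scalar variable.

    Update rule: x_next = x - lr * grad(x) + beta * (x - x_prev),
    i.e. a gradient step plus a fraction of the previous displacement.
    """
    x_prev, x = x0, x0
    for _ in range(steps):
        x_next = x - lr * grad(x) + beta * (x - x_prev)
        x_prev, x = x, x_next
    return x


# Illustrative use: minimize f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
x_min = heavy_ball(lambda x: 2.0 * (x - 3.0), x0=0.0, steps=500)
```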

Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-Constraint

no code implementations 18 Dec 2023 Wei Xiong, Hanze Dong, Chenlu Ye, Ziqi Wang, Han Zhong, Heng Ji, Nan Jiang, Tong Zhang

This includes an iterative version of the Direct Preference Optimization (DPO) algorithm for online settings, and a multi-step rejection sampling strategy for offline scenarios.

Language Modelling Large Language Model

DiffusionPCR: Diffusion Models for Robust Multi-Step Point Cloud Registration

no code implementations 5 Dec 2023 Zhi Chen, Yufan Ren, Tong Zhang, Zheng Dang, Wenbing Tao, Sabine Süsstrunk, Mathieu Salzmann

We propose formulating PCR as a denoising diffusion probabilistic process, mapping noisy transformations to the ground truth.

Denoising Point Cloud Registration

Look Before You Leap: Unveiling the Power of GPT-4V in Robotic Vision-Language Planning

no code implementations 29 Nov 2023 Yingdong Hu, Fanqi Lin, Tong Zhang, Li Yi, Yang Gao

In this study, we are interested in imbuing robots with the capability of physically-grounded task planning.

Beyond Pixels: Exploring Human-Readable SVG Generation for Simple Images with Vision Language Models

no code implementations 27 Nov 2023 Tong Zhang, Haoyang Liu, Peiyan Zhang, Yuxuan Cheng, Haohan Wang

Our method focuses on producing SVGs that are both accurate and simple, aligning with human readability and understanding.

Vector Graphics

R-Tuning: Teaching Large Language Models to Refuse Unknown Questions

1 code implementation 16 Nov 2023 Hanning Zhang, Shizhe Diao, Yong Lin, Yi R. Fung, Qing Lian, Xingyao Wang, Yangyi Chen, Heng Ji, Tong Zhang

This approach is formalized by first identifying the knowledge gap between parametric knowledge and the instruction tuning data.

Hallucination Sentence

Plum: Prompt Learning using Metaheuristic

1 code implementation 14 Nov 2023 Rui Pan, Shuo Xing, Shizhe Diao, Wenhe Sun, Xiang Liu, Kashun Shum, Renjie Pi, Jipeng Zhang, Tong Zhang

Since the emergence of large language models, prompt learning has become a popular method for optimizing and customizing these models.

Image Generation

CVTHead: One-shot Controllable Head Avatar with Vertex-feature Transformer

1 code implementation 11 Nov 2023 Haoyu Ma, Tong Zhang, Shanlin Sun, Xiangyi Yan, Kun Han, Xiaohui Xie

Reconstructing personalized animatable head avatars has significant implications in the fields of AR/VR.

Neural Rendering

PerceptionGPT: Effectively Fusing Visual Perception into LLM

no code implementations 11 Nov 2023 Renjie Pi, Lewei Yao, Jiahui Gao, Jipeng Zhang, Tong Zhang

In this paper, we present a novel end-to-end framework named PerceptionGPT, which efficiently and effectively equips the VLLMs with visual perception abilities by leveraging the representation power of LLMs' token embedding.

Mesh Neural Cellular Automata

no code implementations 6 Nov 2023 Ehsan Pajouheshgar, Yitao Xu, Alexander Mordvintsev, Eyvind Niklasson, Tong Zhang, Sabine Süsstrunk

We propose Mesh Neural Cellular Automata (MeshNCA), a method for directly synthesizing dynamic textures on 3D meshes without requiring any UV maps.

Texture Synthesis

Corruption-Robust Offline Reinforcement Learning with General Function Approximation

1 code implementation NeurIPS 2023 Chenlu Ye, Rui Yang, Quanquan Gu, Tong Zhang

Notably, under the assumption of single policy coverage and the knowledge of $\zeta$, our proposed algorithm achieves a suboptimality bound that is worsened by an additive factor of $\mathcal{O}(\zeta (C(\widehat{\mathcal{F}},\mu)n)^{-1})$ due to the corruption.

Offline RL reinforcement-learning +1

Towards Robust Offline Reinforcement Learning under Diverse Data Corruption

2 code implementations 19 Oct 2023 Rui Yang, Han Zhong, Jiawei Xu, Amy Zhang, Chongjie Zhang, Lei Han, Tong Zhang

Offline reinforcement learning (RL) presents a promising approach for learning reinforced policies from offline datasets without the need for costly or unsafe interactions with the environment.

Offline RL Q-Learning +2

3D-Aware Hypothesis & Verification for Generalizable Relative Object Pose Estimation

no code implementations 5 Oct 2023 Chen Zhao, Tong Zhang, Mathieu Salzmann

Our goal then is to estimate the relative object pose between this reference view and a query image that depicts the object in a different pose.

Object Pose Estimation

Spurious Feature Diversification Improves Out-of-distribution Generalization

no code implementations 29 Sep 2023 Yong Lin, Lu Tan, Yifan Hao, Honam Wong, Hanze Dong, Weizhong Zhang, Yujiu Yang, Tong Zhang

Contrary to the conventional wisdom that focuses on learning invariant features for better OOD performance, our findings suggest that incorporating a large number of diverse spurious features weakens their individual contributions, leading to improved overall OOD generalization performance.

Out-of-Distribution Generalization

May I Ask a Follow-up Question? Understanding the Benefits of Conversations in Neural Network Explainability

no code implementations 25 Sep 2023 Tong Zhang, X. Jessie Yang, Boyang Li

With this paper, we investigate if free-form conversations can enhance users' comprehension of static explanations, improve acceptance and trust in the explanation methods, and facilitate human-AI collaboration.

Decision Making

MEDL-U: Uncertainty-aware 3D Automatic Annotation based on Evidential Deep Learning

1 code implementation 18 Sep 2023 Helbert Paat, Qing Lian, Weilong Yao, Tong Zhang

In this paper, we present the first approach that addresses the inherent ambiguities present in pseudo labels by introducing an Evidential Deep Learning (EDL) based uncertainty estimation framework.

3D Object Detection object-detection

Mitigating the Alignment Tax of RLHF

no code implementations 12 Sep 2023 Yong Lin, Hangyu Lin, Wei Xiong, Shizhe Diao, Jianmeng Liu, Jipeng Zhang, Rui Pan, Haoxiang Wang, Wenbin Hu, Hanning Zhang, Hanze Dong, Renjie Pi, Han Zhao, Nan Jiang, Heng Ji, Yuan YAO, Tong Zhang

Building on the analysis and the observation that averaging different layers of the transformer leads to significantly different reward-tax trade-offs, we propose Adaptive Model Averaging (AMA) to adaptively find various combination ratios of model layers.

Common Sense Reasoning Continual Learning

UniKG: A Benchmark and Universal Embedding for Large-Scale Knowledge Graphs

1 code implementation 11 Sep 2023 Yide Qiu, Shaoxiang Ling, Tong Zhang, Bo Huang, Zhen Cui

To perform effective learning on the large-scale UniKG, two key measures are taken: (i) a semantic alignment strategy for multi-attribute entities, which projects the feature descriptions of multi-attribute nodes into a common embedding space to facilitate node aggregation in a large receptive field; and (ii) a novel plug-and-play anisotropy propagation module (APM) that learns effective multi-hop anisotropy propagation kernels, extending methods for large-scale homogeneous graphs to heterogeneous graphs.

Attribute Graph Learning +3

Integrated Robotics Networks with Co-optimization of Drone Placement and Air-Ground Communications

no code implementations 9 Sep 2023 Menghao Hu, Tong Zhang, Shuai Wang, Guoliang Li, Yingyang Chen, Qiang Li, Gaojie Chen

Terrestrial robots, i.e., unmanned ground vehicles (UGVs), and aerial robots, i.e., unmanned aerial vehicles (UAVs), operate in separate spaces.

Optimal Sample Selection Through Uncertainty Estimation and Its Application in Deep Learning

no code implementations 5 Sep 2023 Yong Lin, Chen Liu, Chenlu Ye, Qing Lian, Yuan YAO, Tong Zhang

Our proposed method, COPS (unCertainty based OPtimal Sub-sampling), is designed to minimize the expected loss of a model trained on subsampled data.

Active Learning

Self-Reference Deep Adaptive Curve Estimation for Low-Light Image Enhancement

1 code implementation 16 Aug 2023 Jianyu Wen, Chenhao Wu, Tong Zhang, Yixuan Yu, Piotr Swierczynski

In this paper, we propose a 2-stage low-light image enhancement method called Self-Reference Deep Adaptive Curve Estimation (Self-DACE).

Denoising Low-Light Image Enhancement

Reverse Diffusion Monte Carlo

no code implementations 5 Jul 2023 Xunpeng Huang, Hanze Dong, Yifan Hao, Yi-An Ma, Tong Zhang

We propose a Monte Carlo sampler from the reverse diffusion process.

LMFlow: An Extensible Toolkit for Finetuning and Inference of Large Foundation Models

1 code implementation 21 Jun 2023 Shizhe Diao, Rui Pan, Hanze Dong, Ka Shun Shum, Jipeng Zhang, Wei Xiong, Tong Zhang

As the number of available models and specialized tasks keeps growing, the job of general finetuning becomes highly nontrivial.

A Universal Semantic-Geometric Representation for Robotic Manipulation

no code implementations 18 Jun 2023 Tong Zhang, Yingdong Hu, Hanchen Cui, Hang Zhao, Yang Gao

To this end, we present Semantic-Geometric Representation (SGR), a universal perception module for robotics that leverages the rich semantic information of large-scale pre-trained 2D models and inherits the merits of 3D spatial reasoning.

Dual Adaptive Representation Alignment for Cross-domain Few-shot Learning

1 code implementation 18 Jun 2023 Yifan Zhao, Tong Zhang, Jia Li, Yonghong Tian

Recent progress in this setting assumes that the base knowledge and novel query samples are distributed in the same domains, which is usually infeasible for realistic applications.

cross-domain few-shot learning

Structure-Sensitive Graph Dictionary Embedding for Graph Classification

no code implementations 18 Jun 2023 Guangbu Liu, Tong Zhang, Xudong Wang, Wenting Zhao, Chuanwei Zhou, Zhen Cui

Instead of a plain use of a base graph dictionary, we propose the variational graph dictionary adaptation (VGDA) to generate a personalized dictionary (named adapted graph dictionary) for catering to each input graph.

Graph Classification Variational Inference

Customizing General-Purpose Foundation Models for Medical Report Generation

no code implementations 9 Jun 2023 Bang Yang, Asif Raza, Yuexian Zou, Tong Zhang

In this work, we propose customizing off-the-shelf general-purpose large-scale pre-trained models, i.e., foundation models (FMs), in computer vision and natural language processing with a specific focus on medical report generation.

Medical Report Generation Transfer Learning

Mixture-of-Domain-Adapters: Decoupling and Injecting Domain Knowledge to Pre-trained Language Models Memories

1 code implementation 8 Jun 2023 Shizhe Diao, Tianyang Xu, Ruijia Xu, Jiawei Wang, Tong Zhang

Pre-trained language models (PLMs) demonstrate excellent abilities to understand texts in the generic domain while struggling in a specific domain.

Domain Adaptation

What is Essential for Unseen Goal Generalization of Offline Goal-conditioned RL?

1 code implementation 30 May 2023 Rui Yang, Yong Lin, Xiaoteng Ma, Hao Hu, Chongjie Zhang, Tong Zhang

In this paper, we study out-of-distribution (OOD) generalization of offline GCRL both theoretically and empirically to identify factors that are important.

Imitation Learning Offline RL

InNeRF360: Text-Guided 3D-Consistent Object Inpainting on 360-degree Neural Radiance Fields

no code implementations 24 May 2023 Dongqing Wang, Tong Zhang, Alaa Abboud, Sabine Süsstrunk

We propose InNeRF360, an automatic system that accurately removes text-specified objects from 360-degree Neural Radiance Fields (NeRF).

3D Inpainting Segmentation

DetGPT: Detect What You Need via Reasoning

1 code implementation 23 May 2023 Renjie Pi, Jiahui Gao, Shizhe Diao, Rui Pan, Hanze Dong, Jipeng Zhang, Lewei Yao, Jianhua Han, Hang Xu, Lingpeng Kong, Tong Zhang

Overall, our proposed paradigm and DetGPT demonstrate the potential for more sophisticated and intuitive interactions between humans and machines.

Autonomous Driving Object +2

Effective Bilevel Optimization via Minimax Reformulation

no code implementations 22 May 2023 Xiaoyu Wang, Rui Pan, Renjie Pi, Tong Zhang

To address this issue, we propose a reformulation of bilevel optimization as a minimax problem, effectively decoupling the outer-inner dependency.

Bilevel Optimization Meta-Learning

Adapter Learning in Pretrained Feature Extractor for Continual Learning of Diseases

1 code implementation 18 Apr 2023 Wentao Zhang, Yujun Huang, Tong Zhang, Qingsong Zou, Wei-Shi Zheng, Ruixuan Wang

In particular, updating an intelligent diagnosis system with training data of new diseases would cause catastrophic forgetting of old disease knowledge.

Continual Learning

Hierarchical Interactive Reconstruction Network For Video Compressive Sensing

no code implementations 15 Apr 2023 Tong Zhang, Wenxue Cui, Chen Hui, Feng Jiang

Deep network-based image and video Compressive Sensing (CS) has attracted increasing attention in recent years.

Compressive Sensing Video Compressive Sensing

RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment

1 code implementation 13 Apr 2023 Hanze Dong, Wei Xiong, Deepanshu Goyal, Yihan Zhang, Winnie Chow, Rui Pan, Shizhe Diao, Jipeng Zhang, Kashun Shum, Tong Zhang

Utilizing a reward model and a sufficient number of samples, our approach selects the high-quality samples, discarding those that exhibit undesired behavior, and subsequently enhancing the model by fine-tuning on these filtered samples.
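
The selection step described in this snippet can be sketched as follows: score each candidate with a reward function, keep the highest-scoring fraction, and pass the survivors on to fine-tuning. The names `select_top_samples`, `reward_fn`, and `keep_fraction` are hypothetical, and the fine-tuning step itself is omitted; this is not the authors' implementation.

```python
def select_top_samples(samples, reward_fn, keep_fraction=0.25):
    """Return the highest-reward samples, best first.

    samples: list of candidate outputs (any type reward_fn accepts).
    reward_fn: stand-in for a reward model; maps a sample to a number.
    keep_fraction: share of samples to keep (an assumed knob).
    """
    if not 0 < keep_fraction <= 1:
        raise ValueError("keep_fraction must be in (0, 1]")
    # Rank candidates from highest to lowest reward.
    ranked = sorted(samples, key=reward_fn, reverse=True)
    # Keep at least one sample whenever any candidates exist.
    keep = max(1, int(len(ranked) * keep_fraction))
    return ranked[:keep]
```

The returned list would then serve as the filtered training set for the next fine-tuning round.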


Crowd Counting with Sparse Annotation

no code implementations 12 Apr 2023 Shiwei Zhang, Zhengzheng Wang, Qing Liu, Fei Wang, Wei Ke, Tong Zhang

This paper presents a new annotation method called Sparse Annotation (SA) for crowd counting, which reduces human labeling efforts by sparsely labeling individuals in an image.

Crowd Counting

ConvBLS: An Effective and Efficient Incremental Convolutional Broad Learning System for Image Classification

no code implementations 1 Apr 2023 Chunyu Lei, C. L. Philip Chen, Jifeng Guo, Tong Zhang

Third, the TSMS feature fusion layer is proposed to extract more effective multi-scale features through the integration of CF layers and CE layers.

Image Classification Incremental Learning

De-coupling and De-positioning Dense Self-supervised Learning

no code implementations 29 Mar 2023 Congpei Qiu, Tong Zhang, Wei Ke, Mathieu Salzmann, Sabine Süsstrunk

Dense Self-Supervised Learning (SSL) methods address the limitations of using image-level feature representations when handling images with multiple objects.

Data Augmentation Object +5

NEMTO: Neural Environment Matting for Novel View and Relighting Synthesis of Transparent Objects

no code implementations ICCV 2023 Dongqing Wang, Tong Zhang, Sabine Süsstrunk

We propose NEMTO, the first end-to-end neural rendering pipeline to model 3D transparent objects with complex geometry and unknown indices of refraction.

Image Matting Neural Rendering +2

Environment Invariant Linear Least Squares

no code implementations 6 Mar 2023 Jianqing Fan, Cong Fang, Yihong Gu, Tong Zhang

To the best of our knowledge, this paper is the first to realize statistically efficient invariance learning in the general linear model.

Causal Inference regression +2

PAPAL: A Provable PArticle-based Primal-Dual ALgorithm for Mixed Nash Equilibrium

no code implementations 2 Mar 2023 Shihong Ding, Hanze Dong, Cong Fang, Zhouchen Lin, Tong Zhang

To circumvent this difficulty, we examine the problem of identifying a mixed Nash equilibrium, where strategies are randomized and characterized by probability distributions over continuous domains. To this end, we propose PArticle-based Primal-dual ALgorithm (PAPAL) tailored for a weakly entropy-regularized min-max optimization over probability distributions.

Automatic Prompt Augmentation and Selection with Chain-of-Thought from Labeled Data

2 code implementations 24 Feb 2023 Kashun Shum, Shizhe Diao, Tong Zhang

However, most CoT studies rely on carefully designed human-annotated rational chains to prompt LLMs, posing challenges for real-world applications where labeled data is available without rational chains.

Arithmetic Reasoning Language Modelling

Active Prompting with Chain-of-Thought for Large Language Models

2 code implementations 23 Feb 2023 Shizhe Diao, Pengcheng Wang, Yong Lin, Tong Zhang

For this purpose, we propose a solution to the key problem of determining which questions are the most important and helpful ones to annotate from a pool of task-specific queries.

Active Learning Zero-Shot Learning

Variance-Dependent Regret Bounds for Linear Bandits and Reinforcement Learning: Adaptivity and Computational Efficiency

no code implementations 21 Feb 2023 Heyang Zhao, Jiafan He, Dongruo Zhou, Tong Zhang, Quanquan Gu

We propose a variance-adaptive algorithm for linear mixture MDPs, which achieves a problem-dependent horizon-free regret bound that can gracefully reduce to a nearly constant regret for deterministic MDPs.

Computational Efficiency Decision Making +1

Hashtag-Guided Low-Resource Tweet Classification

1 code implementation 20 Feb 2023 Shizhe Diao, Sedrick Scott Keh, Liangming Pan, Zhiliang Tian, Yan Song, Tong Zhang

Social media classification tasks (e.g., tweet sentiment analysis, tweet stance detection) are challenging because social media posts are typically short, informal, and ambiguous.

Classification Sentiment Analysis +1

On the Convergence of Federated Averaging with Cyclic Client Participation

no code implementations 6 Feb 2023 Yae Jee Cho, Pranay Sharma, Gauri Joshi, Zheng Xu, Satyen Kale, Tong Zhang

Federated Averaging (FedAvg) and its variants are the most popular optimization algorithms in federated learning (FL).

Federated Learning

History-Aware Hierarchical Transformer for Multi-session Open-domain Dialogue System

no code implementations 2 Feb 2023 Tong Zhang, Yong liu, Boyang Li, Zhiwei Zeng, Pengwei Wang, Yuan You, Chunyan Miao, Lizhen Cui

HAHT maintains a long-term memory of history conversations and utilizes history information to understand current conversation context and generate well-informed and context-relevant responses.

ADAPT: Action-aware Driving Caption Transformer

1 code implementation 1 Feb 2023 Bu Jin, Xinyu Liu, Yupeng Zheng, Pengfei Li, Hao Zhao, Tong Zhang, Yuhang Zheng, Guyue Zhou, Jingjing Liu

To bridge the gap, we propose an end-to-end transformer-based architecture, ADAPT (Action-aware Driving cAPtion Transformer), which provides user-friendly natural language narrations and reasoning for each decision making step of autonomous vehicular control and action.

Autonomous Driving Decision Making

Learning in POMDPs is Sample-Efficient with Hindsight Observability

no code implementations 31 Jan 2023 Jonathan N. Lee, Alekh Agarwal, Christoph Dann, Tong Zhang

POMDPs capture a broad class of decision making problems, but hardness results suggest that learning is intractable even in simple settings due to the inherent partial observability.

Decision Making Scheduling

Probabilistic Bilevel Coreset Selection

no code implementations 24 Jan 2023 Xiao Zhou, Renjie Pi, Weizhong Zhang, Yong Lin, Tong Zhang

The goal of coreset selection in supervised learning is to produce a weighted subset of data, so that training only on the subset achieves similar performance as training on the entire dataset.

Bilevel Optimization Continual Learning

Model Agnostic Sample Reweighting for Out-of-Distribution Learning

1 code implementation 24 Jan 2023 Xiao Zhou, Yong Lin, Renjie Pi, Weizhong Zhang, Renzhe Xu, Peng Cui, Tong Zhang

The overfitting issue is addressed by considering a bilevel formulation to search for the sample reweighting, in which the generalization complexity depends on the search space of sample weights instead of the model size.

TempSAL -- Uncovering Temporal Information for Deep Saliency Prediction

no code implementations 5 Jan 2023 Bahar Aydemir, Ludo Hoffstetter, Tong Zhang, Mathieu Salzmann, Sabine Süsstrunk

Deep saliency prediction algorithms complement the object recognition features; they typically rely on additional information, such as scene context, semantic relationships, gaze direction, and object dissimilarity.

Object Object Recognition +1

TempSAL - Uncovering Temporal Information for Deep Saliency Prediction

no code implementations CVPR 2023 Bahar Aydemir, Ludo Hoffstetter, Tong Zhang, Mathieu Salzmann, Sabine Süsstrunk

Deep saliency prediction algorithms complement the object recognition features; they typically rely on additional information such as scene context, semantic relationships, gaze direction, and object dissimilarity.

Object Object Recognition +1

DSI2I: Dense Style for Unpaired Image-to-Image Translation

no code implementations 26 Dec 2022 Baran Ozaydin, Tong Zhang, Sabine Süsstrunk, Mathieu Salzmann

Unpaired exemplar-based image-to-image (UEI2I) translation aims to translate a source image to a target image domain with the style of a target image exemplar, without ground-truth input-translation pairs.

Image-to-Image Translation Translation

VolRecon: Volume Rendering of Signed Ray Distance Functions for Generalizable Multi-View Reconstruction

1 code implementation CVPR 2023 Yufan Ren, Fangjinhua Wang, Tong Zhang, Marc Pollefeys, Sabine Süsstrunk

The success of the Neural Radiance Fields (NeRF) in novel view synthesis has inspired researchers to propose neural implicit scene reconstruction.

Novel View Synthesis

VO$Q$L: Towards Optimal Regret in Model-free RL with Nonlinear Function Approximation

no code implementations 12 Dec 2022 Alekh Agarwal, Yujia Jin, Tong Zhang

We study time-inhomogeneous episodic reinforcement learning (RL) under general function approximation and sparse rewards.

Q-Learning regression +1

Corruption-Robust Algorithms with Uncertainty Weighting for Nonlinear Contextual Bandits and Markov Decision Processes

no code implementations 12 Dec 2022 Chenlu Ye, Wei Xiong, Quanquan Gu, Tong Zhang

In this paper, we consider the contextual bandit with general function approximation and propose a computationally efficient algorithm to achieve a regret of $\tilde{O}(\sqrt{T}+\zeta)$.

Multi-Armed Bandits Reinforcement Learning (RL)

On Robust Observer Design for System Motion on SE(3) Using Onboard Visual Sensors

no code implementations 29 Nov 2022 Tong Zhang, Ying Tan, Xiang Chen, Zike Lei

The key design idea for this observer is to estimate the visible set and identify the mis-identified features from the measurements.

Particle-based Variational Inference with Preconditioned Functional Gradient Flow

no code implementations 25 Nov 2022 Hanze Dong, Xi Wang, Yong Lin, Tong Zhang

With the popularity of Stein variational gradient descent (SVGD), the focus of particle-based VI algorithms has been on the properties of functions in Reproducing Kernel Hilbert Space (RKHS) to approximate the gradient flow.

Variational Inference

Normalizing Flow with Variational Latent Representation

1 code implementation 21 Nov 2022 Hanze Dong, Shizhe Diao, Weizhong Zhang, Tong Zhang

The resulting method is significantly more powerful than the standard normalizing flow approach for generating data distributions with multiple modes.

FAF: A novel multimodal emotion recognition approach integrating face, body and text

no code implementations 20 Nov 2022 Zhongyu Fang, Aoyun He, Qihui Yu, Baopeng Gao, Weiping Ding, Tong Zhang, Lei Ma

In this paper, we develop a large multimodal emotion dataset, named the "HED" dataset, to facilitate the emotion recognition task, and accordingly propose a multimodal emotion recognition method.

Multimodal Emotion Recognition

GEC: A Unified Framework for Interactive Decision Making in MDP, POMDP, and Beyond

no code implementations 3 Nov 2022 Han Zhong, Wei Xiong, Sirui Zheng, LiWei Wang, Zhaoran Wang, Zhuoran Yang, Tong Zhang

The proposed algorithm modifies the standard posterior sampling algorithm in two aspects: (i) we use an optimistic prior distribution that biases towards hypotheses with higher values and (ii) we set the log-likelihood function to be the empirical loss evaluated on the historical data, where the choice of loss function supports both model-free and model-based learning.

Decision Making Reinforcement Learning (RL)

Large-Scale Bandwidth and Power Optimization for Multi-Modal Edge Intelligence Autonomous Driving

no code implementations 18 Oct 2022 Xinrao Li, Tong Zhang, Shuai Wang, Guangxu Zhu, Rui Wang, Tsung-Hui Chang

However, wireless channels between the edge server and the autonomous vehicles are time-varying due to the high mobility of vehicles.

Autonomous Driving

Increasing Visual Awareness in Multimodal Neural Machine Translation from an Information Theoretic Perspective

no code implementations 16 Oct 2022 Baijun Ji, Tong Zhang, Yicheng Zou, Bojie Hu, Si Shen

Multimodal machine translation (MMT) aims to improve translation quality by equipping the source sentence with its corresponding image.

Multimodal Machine Translation Sentence +1

MICO: A Multi-alternative Contrastive Learning Framework for Commonsense Knowledge Representation

1 code implementation 14 Oct 2022 Ying Su, ZiHao Wang, Tianqing Fang, Hongming Zhang, Yangqiu Song, Tong Zhang

Commonsense reasoning tasks such as commonsense knowledge graph completion and commonsense question answering require powerful representation learning.

Contrastive Learning Question Answering +2

Multilingual Word Sense Disambiguation with Unified Sense Representation

1 code implementation COLING 2022 Ying Su, Hongming Zhang, Yangqiu Song, Tong Zhang

As a key natural language processing (NLP) task, word sense disambiguation (WSD) evaluates how well NLP models can understand the lexical semantics of words under specific contexts.

Word Sense Disambiguation

A Self-Play Posterior Sampling Algorithm for Zero-Sum Markov Games

no code implementations 4 Oct 2022 Wei Xiong, Han Zhong, Chengshuai Shi, Cong Shen, Tong Zhang

Existing studies on provably efficient algorithms for Markov games (MGs) almost exclusively build on the "optimism in the face of uncertainty" (OFU) principle.

Optimal Operation of a Tidal Lagoon as a Flexible Source of Electricity

no code implementations 27 Sep 2022 Tong Zhang, Christopher Williams, Reza Ahmadian, Meysam Qadrdan

It was demonstrated that, by exploiting its flexibility, the tidal lagoon can achieve a higher revenue in the day-ahead market, although its total electricity generation is reduced.

Asymptotic Statistical Analysis of $f$-divergence GAN

no code implementations 14 Sep 2022 Xinwei Shen, Kani Chen, Tong Zhang

We show that for parametric generative models that are correctly specified, all $f$-divergence GANs with the same discriminator classes are asymptotically equivalent under suitable regularity conditions.

Exploiting Hybrid Semantics of Relation Paths for Multi-hop Question Answering Over Knowledge Graphs

no code implementations COLING 2022 Zile Qiao, Wei Ye, Tong Zhang, Tong Mo, Weiping Li, Shikun Zhang

Answering natural language questions on knowledge graphs (KGQA) remains a great challenge in terms of understanding complex questions via multi-hop reasoning.

Answer Selection Knowledge Graphs +3

Two-person Graph Convolutional Network for Skeleton-based Human Interaction Recognition

1 code implementation 12 Aug 2022 Zhengcen Li, Yueran Li, Linlin Tang, Tong Zhang, Jingyong Su

To overcome the above shortcoming, we introduce a novel unified two-person graph to represent inter-body and intra-body correlations between joints.

Action Classification Action Recognition +3

OpenMedIA: Open-Source Medical Image Analysis Toolbox and Benchmark under Heterogeneous AI Computing Platforms

no code implementations 11 Aug 2022 Jia-Xin Zhuang, Xiansong Huang, Yang Yang, Jiancong Chen, Yue Yu, Wei Gao, Ge Li, Jie Chen, Tong Zhang

In this paper, we present OpenMedIA, an open-source toolbox library containing a rich set of deep learning methods for medical image analysis under heterogeneous Artificial Intelligence (AI) computing platforms.

Image Classification Medical Image Classification +2

Consecutive Pretraining: A Knowledge Transfer Learning Strategy with Relevant Unlabeled Data for Remote Sensing Domain

1 code implementation 8 Jul 2022 Tong Zhang, Peng Gao, Hao Dong, Yin Zhuang, Guanqun Wang, Wei Zhang, He Chen

Currently, under supervised learning, pretraining a model on a large-scale natural scene dataset and then fine-tuning it on a small amount of task-specific labeled data is the paradigm that dominates knowledge transfer learning.

Land Cover Classification object-detection +3

Beyond Uniform Lipschitz Condition in Differentially Private Optimization

no code implementations 21 Jun 2022 Rudrajit Das, Satyen Kale, Zheng Xu, Tong Zhang, Sujay Sanghavi

Most prior results on differentially private stochastic gradient descent (DP-SGD) are derived under the simplistic assumption of uniform Lipschitzness, i.e., the per-sample gradients are uniformly bounded.

Benchmarking regression

Model-based RL with Optimistic Posterior Sampling: Structural Conditions and Sample Complexity

no code implementations 15 Jun 2022 Alekh Agarwal, Tong Zhang

We propose a general framework to design posterior sampling methods for model-based RL.

On the Unreasonable Effectiveness of Federated Averaging with Heterogeneous Data

no code implementations 9 Jun 2022 Jianyu Wang, Rudrajit Das, Gauri Joshi, Satyen Kale, Zheng Xu, Tong Zhang

Motivated by this observation, we propose a new quantity, average drift at optimum, to measure the effects of data heterogeneity, and explicitly use it to present a new theoretical analysis of FedAvg.

Federated Learning

Benefits of Overparameterized Convolutional Residual Networks: Function Approximation under Smoothness Constraint

no code implementations 9 Jun 2022 Hao Liu, Minshuo Chen, Siawpeng Er, Wenjing Liao, Tong Zhang, Tuo Zhao

Overparameterized neural networks enjoy great representation power on complex data, and more importantly yield sufficiently smooth output, which is crucial to their generalization and robustness.

Image Classification

An Indoor Environment Sensing and Localization System via mmWave Phased Array

no code implementations 7 Jun 2022 Yifei Sun, Jie Li, Tong Zhang, Rui Wang, Xiaohui Peng, Tony Xiao Han, Haisheng Tan

Finally, we show that the reconstructed room layout can be utilized to locate a mobile device according to its AoA spectrum, even with a single access point.

Nearly Minimax Optimal Offline Reinforcement Learning with Linear Function Approximation: Single-Agent MDP and Markov Game

no code implementations 31 May 2022 Wei Xiong, Han Zhong, Chengshuai Shi, Cong Shen, LiWei Wang, Tong Zhang

We also extend our techniques to the two-player zero-sum Markov games (MGs), and establish a new performance lower bound for MGs, which tightens the existing result, and verifies the nearly minimax optimality of the proposed algorithm.

Offline RL Reinforcement Learning (RL)

Nearly Optimal Algorithms for Linear Contextual Bandits with Adversarial Corruptions

no code implementations 13 May 2022 Jiafan He, Dongruo Zhou, Tong Zhang, Quanquan Gu

We show that for both the known-$C$ and unknown-$C$ cases, our algorithm with a proper choice of hyperparameter achieves a regret that nearly matches the lower bounds.

Multi-Armed Bandits

Deep Non-rigid Structure-from-Motion: A Sequence-to-Sequence Translation Perspective

no code implementations 10 Apr 2022 Hui Deng, Tong Zhang, Yuchao Dai, Jiawei Shi, Yiran Zhong, Hongdong Li

In this paper, we propose to model deep NRSfM from a sequence-to-sequence translation perspective, where the input 2D frame sequence is taken as a whole to reconstruct the deforming 3D non-rigid shape sequence.

3D Reconstruction Translation

Leverage Your Local and Global Representations: A New Self-Supervised Learning Strategy

no code implementations CVPR 2022 Tong Zhang, Congpei Qiu, Wei Ke, Sabine Süsstrunk, Mathieu Salzmann

In essence, this strategy ignores the fact that two crops may truly contain different image information, e.g., background and small objects, and thus tends to restrain the diversity of the learned representations.

Self-Supervised Learning Transfer Learning

Non-Linear Reinforcement Learning in Large Action Spaces: Structural Conditions and Sample-efficiency of Posterior Sampling

no code implementations 15 Mar 2022 Alekh Agarwal, Tong Zhang

Provably sample-efficient Reinforcement Learning (RL) with rich observations and function approximation has witnessed tremendous recent progress, particularly when the underlying function approximators are linear.

Reinforcement Learning (RL)

RC-MVSNet: Unsupervised Multi-View Stereo with Neural Rendering

1 code implementation 8 Mar 2022 Di Chang, Aljaž Božič, Tong Zhang, Qingsong Yan, Yingcong Chen, Sabine Süsstrunk, Matthias Nießner

Finding accurate correspondences among different views is the Achilles' heel of unsupervised Multi-View Stereo (MVS).

Neural Rendering

Pessimistic Minimax Value Iteration: Provably Efficient Equilibrium Learning from Offline Datasets

no code implementations 15 Feb 2022 Han Zhong, Wei Xiong, Jiyuan Tan, LiWei Wang, Tong Zhang, Zhaoran Wang, Zhuoran Yang

When the dataset does not have uniform coverage over all policy pairs, finding an approximate NE involves challenges in three aspects: (i) distributional shift between the behavior policy and the optimal policy, (ii) function approximation to handle large state space, and (iii) minimax optimization for equilibrium solving.

A GAN-Based Short-Term Link Traffic Prediction Approach for Urban Road Networks Under a Parallel Learning Framework

no code implementations IEEE Transactions on Intelligent Transportation Systems 2022 Junchen Jin, Dingding Rong, Tong Zhang, Qingyuan Ji, Haifeng Guo, Yisheng Lv, Xiaoliang Ma, Fei-Yue Wang

This paper proposes a short-term traffic speed prediction approach, called PL-WGAN, for urban road networks, which is considered an important part of a novel parallel learning framework for traffic control and operation.

Traffic Prediction

Minimax Regret Optimization for Robust Machine Learning under Distribution Shift

no code implementations 11 Feb 2022 Alekh Agarwal, Tong Zhang

We instead propose an alternative method called Minimax Regret Optimization (MRO), and show that under suitable conditions this method achieves uniformly low regret across all test distributions.

BIG-bench Machine Learning Learning Theory

Fast Rates in Pool-Based Batch Active Learning

no code implementations 11 Feb 2022 Claudio Gentile, Zhilei Wang, Tong Zhang

We consider a batch active learning scenario where the learner adaptively issues batches of points to a labeling oracle.

Active Learning Informativeness

Black-box Prompt Learning for Pre-trained Language Models

1 code implementation 21 Jan 2022 Shizhe Diao, Zhichao Huang, Ruijia Xu, Xuechun Li, Yong Lin, Xiao Zhou, Tong Zhang

Particularly, instead of fine-tuning the model in the cloud, we adapt PLMs by prompt learning, which efficiently optimizes only a few parameters of the discrete prompts.

text-classification Text Classification

OneDConv: Generalized Convolution For Transform-Invariant Representation

no code implementations 15 Jan 2022 Tong Zhang, Haohan Weng, Ke Yi, C. L. Philip Chen

Convolutional Neural Networks (CNNs) have exhibited their great power in a variety of vision tasks.

A Novel Multi-Task Learning Method for Symbolic Music Emotion Recognition

no code implementations 15 Jan 2022 Jibao Qiu, C. L. Philip Chen, Tong Zhang

In this paper, we present a simple multi-task framework for SMER, which incorporates the emotion recognition task with other emotion-related auxiliary tasks derived from the intrinsic structure of the music.

Emotion Recognition Language Modelling +2

Time Series Generation with Masked Autoencoder

1 code implementation 14 Jan 2022 Mengyue Zha, SiuTim Wong, Mengqi Liu, Tong Zhang, Kani Chen

This paper shows that a masked autoencoder with an extrapolator (ExtraMAE) is a scalable self-supervised model for time series generation.

Data Augmentation Imputation +6

IDEA: Interpretable Dynamic Ensemble Architecture for Time Series Prediction

no code implementations 14 Jan 2022 Mengyue Zha, Kani Chen, Tong Zhang

We enhance the accuracy and generalization of univariate time series point prediction by an explainable ensemble on the fly.

Time Series Time Series Prediction

Bayesian Invariant Risk Minimization

no code implementations CVPR 2022 Yong Lin, Hanze Dong, Hao Wang, Tong Zhang

Generalization under distributional shift is an open challenge for machine learning.

Bayesian Inference

Frequency-Aware Contrastive Learning for Neural Machine Translation

no code implementations 29 Dec 2021 Tong Zhang, Wei Ye, Baosong Yang, Long Zhang, Xingzhang Ren, Dayiheng Liu, Jinan Sun, Shikun Zhang, Haibo Zhang, Wen Zhao

Inspired by the observation that low-frequency words form a more compact embedding space, we tackle this challenge from a representation learning perspective.

Contrastive Learning Machine Translation +3

On the Impact of Hard Adversarial Instances on Overfitting in Adversarial Training

no code implementations 14 Dec 2021 Chen Liu, Zhichao Huang, Mathieu Salzmann, Tong Zhang, Sabine Süsstrunk

This lets us show that the decay in generalization performance of adversarial training is a result of the model's attempt to fit hard adversarial instances.

CLIP Meets Video Captioning: Concept-Aware Representation Learning Does Matter

1 code implementation 30 Nov 2021 Bang Yang, Tong Zhang, Yuexian Zou

DCD is an auxiliary task that requires a caption model to learn the correspondence between video content and concepts and the co-occurrence relations between concepts.

Caption Generation Representation Learning +1

Optimizing Latent Space Directions For GAN-based Local Image Editing

1 code implementation 24 Nov 2021 Ehsan Pajouheshgar, Tong Zhang, Sabine Süsstrunk

Generative Adversarial Network (GAN) based localized image editing can suffer from ambiguity between semantic attributes.

Disentanglement Generative Adversarial Network

Efficient Neural Network Training via Forward and Backward Propagation Sparsification

1 code implementation NeurIPS 2021 Xiao Zhou, Weizhong Zhang, Zonghao Chen, Shizhe Diao, Tong Zhang

For the latter step, instead of using the chain rule based gradient estimators as in existing methods, we propose a variance reduced policy gradient estimator, which only requires two forward passes without backward propagation, thus achieving completely sparse training.

Efficient Neural Network

A Theoretical Analysis on Independence-driven Importance Weighting for Covariate-shift Generalization

1 code implementation 3 Nov 2021 Renzhe Xu, Xingxuan Zhang, Zheyan Shen, Tong Zhang, Peng Cui

Afterward, we prove that under ideal conditions, independence-driven importance weighting algorithms could identify the variables in this set.

feature selection

Eigencurve: Optimal Learning Rate Schedule for SGD on Quadratic Objectives with Skewed Hessian Spectrums

1 code implementation ICLR 2022 Rui Pan, Haishan Ye, Tong Zhang

In this paper, we propose Eigencurve, the first family of learning rate schedules that can achieve minimax optimal convergence rates (up to a constant) for SGD on quadratic objectives when the eigenvalue distribution of the underlying Hessian matrix is skewed.

Image Classification

When is the Convergence Time of Langevin Algorithms Dimension Independent? A Composite Optimization Viewpoint

no code implementations 5 Oct 2021 Yoav Freund, Yi-An Ma, Tong Zhang

There has been a surge of works bridging MCMC sampling and optimization, with a specific focus on translating non-asymptotic convergence guarantees for optimization problems into the analysis of Langevin algorithms in MCMC sampling.

Feel-Good Thompson Sampling for Contextual Bandits and Reinforcement Learning

no code implementations 2 Oct 2021 Tong Zhang

In this setting, we show that the standard Thompson Sampling is not aggressive enough in exploring new actions, leading to suboptimality in some pessimistic situations.

Multi-Armed Bandits regression +3

HyperDQN: A Randomized Exploration Method for Deep Reinforcement Learning

1 code implementation ICLR 2022 Ziniu Li, Yingru Li, Yushun Zhang, Tong Zhang, Zhi-Quan Luo

However, it is limited to the case where 1) a good feature is known in advance and 2) this feature is fixed during training; otherwise, RLSVI suffers an unbearable computational burden to obtain the posterior samples of the parameter in the $Q$-value function.

Efficient Exploration reinforcement-learning +1

Interest-based Item Representation Framework for Recommendation with Multi-Interests Capsule Network

no code implementations 29 Sep 2021 Yanpeng Xie, Tong Zhang, Heng Zhang, Zhendong Qu

To make the framework model-agnostic, a user Multi-Interests Capsule Network is designed as an auxiliary task to jointly learn item-based item representations and interest-based item representations.

Recommendation Systems Representation Learning +1

Improving Adversarial Defense with Self-supervised Test-time Fine-tuning

no code implementations 29 Sep 2021 Zhichao Huang, Chen Liu, Mathieu Salzmann, Sabine Süsstrunk, Tong Zhang

Although adversarial training and its variants currently constitute the most effective way to achieve robustness against adversarial attacks, their poor generalization limits their performance on the test samples.

Adversarial Defense

EllipseNet: Anchor-Free Ellipse Detection for Automatic Cardiac Biometrics in Fetal Echocardiography

1 code implementation 26 Sep 2021 Jiancong Chen, Yingying Zhang, Jingyi Wang, Xiaoxue Zhou, Yihua He, Tong Zhang

In this paper, we present an anchor-free ellipse detection network, namely EllipseNet, which detects the cardiac and thoracic regions as ellipses and automatically calculates the cardiothoracic ratio (CTR) and cardiac axis for fetal cardiac biometrics in the 4-chamber view.

Feature Correlation Aggregation: on the Path to Better Graph Neural Networks

no code implementations 20 Sep 2021 Jieming Zhou, Tong Zhang, Pengfei Fang, Lars Petersson, Mehrtash Harandi

The core concept of GNNs is to find a representation by recursively aggregating the representations of a central node and those of its neighbors.

Feature Correlation

G-DetKD: Towards General Distillation Framework for Object Detectors via Contrastive and Semantic-guided Feature Imitation

no code implementations ICCV 2021 Lewei Yao, Renjie Pi, Hang Xu, Wei Zhang, Zhenguo Li, Tong Zhang

In this paper, we investigate the knowledge distillation (KD) strategy for object detection and propose an effective framework applicable to both homogeneous and heterogeneous student-teacher pairs.

Knowledge Distillation object-detection +1

Accelerating Edge Intelligence via Integrated Sensing and Communication

no code implementations 20 Jul 2021 Tong Zhang, Shuai Wang, Guoliang Li, Fan Liu, Guangxu Zhu, Rui Wang

Conventionally, the sensing and communication stages are executed sequentially, which results in an excessive amount of dataset generation and uploading time.

Near Optimal Stochastic Algorithms for Finite-Sum Unbalanced Convex-Concave Minimax Optimization

no code implementations 3 Jun 2021 Luo Luo, Guangzeng Xie, Tong Zhang, Zhihua Zhang

This paper considers stochastic first-order algorithms for convex-concave minimax problems of the form $\min_{\bf x}\max_{\bf y}f({\bf x}, {\bf y})$, where $f$ can be expressed as the average of $n$ individual components that are $L$-average smooth.

Multi-Hop Transformer for Document-Level Machine Translation

no code implementations NAACL 2021 Long Zhang, Tong Zhang, Haibo Zhang, Baosong Yang, Wei Ye, Shikun Zhang

Document-level neural machine translation (NMT) has proven to be of profound value for its effectiveness in capturing contextual information.

Document Level Machine Translation Document Translation +4

Universal Adder Neural Networks

no code implementations 29 May 2021 Hanting Chen, Yunhe Wang, Chang Xu, Chao Xu, Chunjing Xu, Tong Zhang

The widely used convolutions in deep neural networks are exactly cross-correlations that measure the similarity between input features and convolution filters, which involves massive multiplications between float values.

Joint-DetNAS: Upgrade Your Detector with NAS, Pruning and Dynamic Distillation

no code implementations CVPR 2021 Lewei Yao, Renjie Pi, Hang Xu, Wei Zhang, Zhenguo Li, Tong Zhang

For student morphism, a weight inheritance strategy is adopted, allowing the student to flexibly update its architecture while fully utilizing the predecessor's weights, which considerably accelerates the search. To facilitate dynamic distillation, an elastic teacher pool is trained via an integrated progressive shrinking strategy, from which teacher detectors can be sampled without additional cost in subsequent searches.

Knowledge Distillation Neural Architecture Search +2

TransNAS-Bench-101: Improving Transferability and Generalizability of Cross-Task Neural Architecture Search

2 code implementations CVPR 2021 Yawen Duan, Xin Chen, Hang Xu, Zewei Chen, Xiaodan Liang, Tong Zhang, Zhenguo Li

While existing NAS methods mostly design architectures on a single task, algorithms that look beyond single-task search are surging to pursue a more efficient and universal solution across various tasks.

Neural Architecture Search Transfer Learning

KECRS: Towards Knowledge-Enriched Conversational Recommendation System

no code implementations 18 May 2021 Tong Zhang, Yong Liu, Peixiang Zhong, Chen Zhang, Hao Wang, Chunyan Miao

The chit-chat-based conversational recommendation systems (CRS) provide item recommendations to users through natural language interactions.

Entity Embeddings Knowledge Graphs +3

ZEN 2.0: Continue Training and Adaption for N-gram Enhanced Text Encoders

1 code implementation 4 May 2021 Yan Song, Tong Zhang, Yonggang Wang, Kai-Fu Lee

Pre-trained text encoders have drawn sustained attention in natural language processing (NLP) and shown their capability in obtaining promising results in different tasks.

Effective Sparsification of Neural Networks with Global Sparsity Constraint

1 code implementation CVPR 2021 Xiao Zhou, Weizhong Zhang, Hang Xu, Tong Zhang

Weight pruning is an effective technique to reduce the model size and inference time for deep neural networks in real-world deployments.

Network Pruning

Exploring Geometric Consistency for Monocular 3D Object Detection

no code implementations CVPR 2022 Qing Lian, Botao Ye, Ruijia Xu, Weilong Yao, Tong Zhang

In addition, we demonstrate that the augmentation methods are well suited for semi-supervised training and cross-dataset generalization.

Autonomous Driving Data Augmentation +4

Reinforced Attention for Few-Shot Learning and Beyond

no code implementations CVPR 2021 Jie Hong, Pengfei Fang, Weihao Li, Tong Zhang, Christian Simon, Mehrtash Harandi, Lars Petersson

Few-shot learning aims to correctly recognize query samples from unseen classes given a limited number of support samples, often by relying on global embeddings of images.

Few-Shot Learning Image Classification

Modeling Object Dissimilarity for Deep Saliency Prediction

1 code implementation 8 Apr 2021 Bahar Aydemir, Deblina Bhattacharjee, Tong Zhang, Seungryong Kim, Mathieu Salzmann, Sabine Süsstrunk

Saliency prediction has made great strides over the past two decades, with current techniques modeling low-level information, such as color, intensity and size contrasts, and high-level ones, such as attention and gaze direction for entire objects.

Object Saliency Prediction

Uncertainty-aware Joint Salient Object and Camouflaged Object Detection

2 code implementations CVPR 2021 Aixuan Li, Jing Zhang, Yunqiu Lv, Bowen Liu, Tong Zhang, Yuchao Dai

Visual salient object detection (SOD) aims at finding the salient object(s) that attract human attention, while camouflaged object detection (COD), on the contrary, intends to discover the camouflaged object(s) that are hidden in the surroundings.

Object object-detection +2