Search Results for author: Yong liu

Found 342 papers, 133 papers with code

Toward Knowledge-Enriched Conversational Recommendation Systems

no code implementations NLP4ConvAI (ACL) 2022 Tong Zhang, Yong liu, Boyang Li, Peixiang Zhong, Chen Zhang, Hao Wang, Chunyan Miao

Conversational Recommendation Systems recommend items through language-based interactions with users. To generate naturalistic conversations and effectively utilize knowledge graphs (KGs) containing background information, we propose a novel Bag-of-Entities loss, which encourages the generated utterances to mention concepts related to the item being recommended, such as the genre or director of a movie.

Knowledge Graphs Recommendation Systems +1

Learning Feature Inversion for Multi-class Anomaly Detection under General-purpose COCO-AD Benchmark

1 code implementation 16 Apr 2024 Jiangning Zhang, Chengjie Wang, Xiangtai Li, Guanzhong Tian, Zhucun Xue, Yong liu, Guansong Pang, DaCheng Tao

Moreover, current metrics such as AU-ROC have nearly reached saturation on simple datasets, which prevents a comprehensive evaluation of different methods.

Anomaly Detection object-detection +2

Adapting LLaMA Decoder to Vision Transformer

no code implementations 10 Apr 2024 Jiahao Wang, Wenqi Shao, Mengzhao Chen, Chengyue Wu, Yong liu, Kaipeng Zhang, Songyang Zhang, Kai Chen, Ping Luo

We first "LLaMAfy" a standard ViT step-by-step to align with LLaMA's architecture, and find that directly applying a causal mask to the self-attention causes an attention collapse issue, which makes the network training fail.

Computational Efficiency Quantization +1
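
For readers unfamiliar with the term, a causal mask restricts each token to attend only to itself and earlier positions. The following is a minimal sketch of that idea in plain Python, not the paper's implementation; the function name causal_attention_weights and the toy score matrix are assumptions made for illustration.

```python
import math

def causal_attention_weights(scores):
    """Row-wise softmax over a square score matrix with a causal mask.

    Position i may only attend to positions j <= i; later positions
    are masked with -inf so they receive zero weight.
    (Hypothetical helper for illustration only.)
    """
    n = len(scores)
    weights = []
    for i in range(n):
        row = [scores[i][j] if j <= i else float("-inf") for j in range(n)]
        m = max(row)  # finite, because the diagonal entry is never masked
        exps = [math.exp(s - m) for s in row]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights

# Toy example: uniform scores over three tokens.
w = causal_attention_weights([[0.0, 0.0, 0.0] for _ in range(3)])
```

With uniform scores, the first token attends only to itself, while the last token spreads its attention evenly over all three positions.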

TryOn-Adapter: Efficient Fine-Grained Clothing Identity Adaptation for High-Fidelity Virtual Try-On

1 code implementation 1 Apr 2024 Jiazheng Xing, Chao Xu, Yijie Qian, Yang Liu, Guang Dai, Baigui Sun, Yong liu, Jingdong Wang

However, the uncontrollable clothing identity and training inefficiency of existing diffusion-based methods, which struggle to maintain the identity even with full parameter training, are significant limitations that hinder widespread application.

Virtual Try-on

DreamSalon: A Staged Diffusion Framework for Preserving Identity-Context in Editable Face Generation

no code implementations 28 Mar 2024 Haonan Lin, Mengmeng Wang, Yan Chen, Wenbin An, Yuzhe Yao, Guang Dai, Qianying Wang, Yong liu, Jingdong Wang

While large-scale pre-trained text-to-image models can synthesize diverse and high-quality human-centered images, novel challenges arise with a nuanced task of "identity fine editing": precisely modifying specific features of a subject while maintaining its inherent identity and context.

Denoising Face Generation

END4Rec: Efficient Noise-Decoupling for Multi-Behavior Sequential Recommendation

no code implementations 26 Mar 2024 Yongqiang Han, Hao Wang, Kefan Wang, Likang Wu, Zhi Li, Wei Guo, Yong liu, Defu Lian, Enhong Chen

In recommendation systems, users frequently engage in multiple types of behaviors, such as clicking, adding to a cart, and purchasing.

Denoising Sequential Recommendation +1

Toward Multi-class Anomaly Detection: Exploring Class-aware Unified Model against Inter-class Interference

no code implementations 21 Mar 2024 Xi Jiang, Ying Chen, Qiang Nie, Jianlin Liu, Yong liu, Chengjie Wang, Feng Zheng

To address this issue, we introduce a Multi-class Implicit Neural representation Transformer for unified Anomaly Detection (MINT-AD), which leverages the fine-grained category information in the training stage.

Anomaly Detection

SoftPatch: Unsupervised Anomaly Detection with Noisy Data

1 code implementation NeurIPS 2022 Xi Jiang, Ying Chen, Qiang Nie, Yong liu, Jianlin Liu, Bin-Bin Gao, Jun Liu, Chengjie Wang, Feng Zheng

Noise discriminators are utilized to generate outlier scores for patch-level noise elimination before coreset construction.

Unsupervised Anomaly Detection

Exploring Task Unification in Graph Representation Learning via Generative Approach

no code implementations 21 Mar 2024 Yulan Hu, Sheng Ouyang, Zhirui Yang, Ge Chen, Junchen Wan, Xiao Wang, Yong liu

Specifically, GA^2E proposes to use the subgraph as the meta-structure, which remains consistent across all graph tasks (ranging from node-, edge-, and graph-level to transfer learning) and all stages (both during training and inference).

Graph Representation Learning Transfer Learning

Tuning-Free Image Customization with Image and Text Guidance

no code implementations 19 Mar 2024 Pengzhi Li, Qiang Nie, Ying Chen, Xi Jiang, Kai Wu, Yuhuan Lin, Yong liu, Jinlong Peng, Chengjie Wang, Feng Zheng

To our knowledge, this is the first tuning-free method that concurrently utilizes text and image guidance for image customization in specific regions.

Denoising Image Generation

HCPM: Hierarchical Candidates Pruning for Efficient Detector-Free Matching

no code implementations 19 Mar 2024 Ying Chen, Yong liu, Kai Wu, Qiang Nie, Shang Xu, Huifang Ma, Bing Wang, Chengjie Wang

Deep learning-based image matching methods play a crucial role in computer vision, yet they often suffer from substantial computational demands.

Interactive 360° Video Streaming Using FoV-Adaptive Coding with Temporal Prediction

no code implementations 17 Mar 2024 Yixiang Mao, Liyang Sun, Yong liu, Yao Wang

We develop a low-latency FoV-adaptive coding and streaming system for interactive applications that is robust to bandwidth variations and FoV prediction errors.

AutoDFP: Automatic Data-Free Pruning via Channel Similarity Reconstruction

no code implementations 13 Mar 2024 Siqi Li, Jun Chen, Jingyang Xiang, Chengrui Zhu, Yong liu

AutoDFP assesses the similarity of channels for each layer and provides this information to the reinforcement learning agent, guiding the pruning and reconstruction process of the network.

DO3D: Self-supervised Learning of Decomposed Object-aware 3D Motion and Depth from Monocular Videos

no code implementations 9 Mar 2024 Xiuzhe Wu, Xiaoyang Lyu, Qihao Huang, Yong liu, Yang Wu, Ying Shan, Xiaojuan Qi

Our system contains a depth estimation module to predict depth, and a new decomposed object-wise 3D motion (DO3D) estimation module to predict ego-motion and 3D object motion.

Depth Estimation Disentanglement +5

RLPeri: Accelerating Visual Perimetry Test with Reinforcement Learning and Convolutional Feature Extraction

no code implementations 8 Mar 2024 Tanvi Verma, Linh Le Dinh, Nicholas Tan, Xinxing Xu, ChingYu Cheng, Yong liu

During the test, a patient's gaze is fixed at a specific location while light stimuli of varying intensities are presented in central and peripheral vision.

LORS: Low-rank Residual Structure for Parameter-Efficient Network Stacking

no code implementations 7 Mar 2024 Jialin Li, Qiang Nie, WeiFu Fu, Yuhuan Lin, Guangpin Tao, Yong liu, Chengjie Wang

Deep learning models, particularly those based on transformers, often employ numerous stacked structures, which possess identical architectures and perform similar functions.

The 2nd Workshop on Recommendation with Generative Models

no code implementations 7 Mar 2024 Wenjie Wang, Yang Zhang, Xinyu Lin, Fuli Feng, Weiwen Liu, Yong liu, Xiangyu Zhao, Wayne Xin Zhao, Yang song, Xiangnan He

The rise of generative models has driven significant advancements in recommender systems, leaving unique opportunities for enhancing users' personalized recommendations.

Recommendation Systems

TimeXer: Empowering Transformers for Time Series Forecasting with Exogenous Variables

no code implementations 29 Feb 2024 Yuxuan Wang, Haixu Wu, Jiaxiang Dong, Yong liu, Yunzhong Qiu, Haoran Zhang, Jianmin Wang, Mingsheng Long

Experimentally, TimeXer significantly improves time series forecasting with exogenous variables and achieves consistent state-of-the-art performance in twelve real-world forecasting benchmarks.

Time Series Time Series Forecasting

Towards Tracing Trustworthiness Dynamics: Revisiting Pre-training Period of Large Language Models

1 code implementation 29 Feb 2024 Chen Qian, Jie Zhang, Wei Yao, Dongrui Liu, Zhenfei Yin, Yu Qiao, Yong liu, Jing Shao

This research provides an initial exploration of trustworthiness modeling during LLM pre-training, seeking to unveil new insights and spur further developments in the field.

Fairness Mutual Information Estimation

Sparse MeZO: Less Parameters for Better Performance in Zeroth-Order LLM Fine-Tuning

no code implementations 24 Feb 2024 Yong liu, Zirui Zhu, Chaoyu Gong, Minhao Cheng, Cho-Jui Hsieh, Yang You

While fine-tuning large language models (LLMs) for specific tasks often yields impressive results, it comes at the cost of memory inefficiency due to back-propagation in gradient-based training.

RTE

Helen: Optimizing CTR Prediction Models with Frequency-wise Hessian Eigenvalue Regularization

1 code implementation 23 Feb 2024 Zirui Zhu, Yong liu, Zangwei Zheng, Huifeng Guo, Yang You

We explore the typical data characteristics and optimization statistics of CTR prediction, revealing a strong positive correlation between the top Hessian eigenvalue and feature frequency.

Click-Through Rate Prediction

Training-free image style alignment for self-adapting domain shift on handheld ultrasound devices

no code implementations 17 Feb 2024 Hongye Zeng, Ke Zou, Zhihao Chen, Yuchong Gao, Hongbo Chen, Haibin Zhang, Kang Zhou, Meng Wang, Rick Siow Mong Goh, Yong liu, Chang Jiang, Rui Zheng, Huazhu Fu

Moreover, the models trained on standard ultrasound device data are constrained by training data distribution and perform poorly when directly applied to handheld device data.

Aligning Crowd Feedback via Distributional Preference Reward Modeling

no code implementations 15 Feb 2024 Dexun Li, Cong Zhang, Kuicai Dong, Derrick Goh Xin Deik, Ruiming Tang, Yong liu

In this paper, we introduce the Distributional Preference Reward Model (DPRM), a simple yet effective framework to align large language models with a diverse set of human preferences.

AutoTimes: Autoregressive Time Series Forecasters via Large Language Models

1 code implementation 4 Feb 2024 Yong liu, Guo Qin, Xiangdong Huang, Jianmin Wang, Mingsheng Long

Foundation models of time series have not been fully developed due to the limited availability of large-scale time series and the underexploration of scalable pre-training.

In-Context Learning Language Modelling +1

Timer: Transformers for Time Series Analysis at Scale

1 code implementation 4 Feb 2024 Yong liu, Haoran Zhang, Chenyu Li, Xiangdong Huang, Jianmin Wang, Mingsheng Long

Continuous progress has been achieved with the emergence of large language models, which exhibit unprecedented ability in few-shot generalization, scalability, and task generality; this ability, however, is absent in time series models.

Anomaly Detection Imputation +2

RIDERS: Radar-Infrared Depth Estimation for Robust Sensing

1 code implementation 3 Feb 2024 Han Li, Yukai Ma, Yuehao Huang, Yaqing Gu, Weihua Xu, Yong liu, Xingxing Zuo

Dense depth recovery is crucial in autonomous driving, serving as a foundational element for obstacle avoidance, 3D object detection, and local path planning.

3D Object Detection Autonomous Driving +3

Parameter-Efficient Conversational Recommender System as a Language Processing Task

1 code implementation 25 Jan 2024 Mathieu Ravaut, Hao Zhang, Lu Xu, Aixin Sun, Yong liu

Conversational recommender systems (CRS) aim to recommend relevant items to users by eliciting user preference through natural language conversation.

Dialogue Generation Knowledge Graphs +2

Distilling Mathematical Reasoning Capabilities into Small Language Models

no code implementations 22 Jan 2024 Xunyu Zhu, Jian Li, Yong liu, Can Ma, Weiping Wang

This work addresses the challenge of democratizing advanced Large Language Models (LLMs) by compressing their mathematical reasoning capabilities into sub-billion parameter Small Language Models (SLMs) without compromising performance.

Mathematical Reasoning

M2-CLIP: A Multimodal, Multi-task Adapting Framework for Video Action Recognition

no code implementations 22 Jan 2024 Mengmeng Wang, Jiazheng Xing, Boyuan Jiang, Jun Chen, Jianbiao Mei, Xingxing Zuo, Guang Dai, Jingdong Wang, Yong liu

In this paper, we introduce a novel Multimodal, Multi-task CLIP adapting framework named M2-CLIP to address these challenges, preserving both high supervised performance and robust transferability.

Action Recognition Temporal Action Localization

Self-supervised Event-based Monocular Depth Estimation using Cross-modal Consistency

no code implementations 14 Jan 2024 Junyu Zhu, Lina Liu, Bofeng Jiang, Feng Wen, Hongbo Zhang, Wanlong Li, Yong liu

In this paper, to lower the annotation cost, we propose a self-supervised event-based monocular depth estimation framework named EMoDepth.

Depth Prediction Monocular Depth Estimation

RadarCam-Depth: Radar-Camera Fusion for Depth Estimation with Learned Metric Scale

1 code implementation 9 Jan 2024 Han Li, Yukai Ma, Yaqing Gu, Kewei Hu, Yong liu, Xingxing Zuo

To circumvent this issue, we learn to augment versatile and robust monocular depth prediction with the dense metric scale induced from sparse and noisy Radar data.

Depth Estimation Depth Prediction

FedNS: A Fast Sketching Newton-Type Algorithm for Federated Learning

1 code implementation 5 Jan 2024 Jian Li, Yong liu, Wei Wang, Haoran Wu, Weiping Wang

We provide convergence analysis based on statistical learning for the federated Newton sketch approaches.

Federated Learning

Learning Prompt with Distribution-Based Feature Replay for Few-Shot Class-Incremental Learning

1 code implementation 3 Jan 2024 Zitong Huang, Ze Chen, Zhixing Chen, Erjin Zhou, Xinxing Xu, Rick Siow Mong Goh, Yong liu, WangMeng Zuo, ChunMei Feng

When progressing to a new session, pseudo-features are sampled from old-class distributions combined with training images of the current session to optimize the prompt, thus enabling the model to learn new knowledge while retaining old knowledge.

Few-Shot Class-Incremental Learning Incremental Learning +1

Unsupervised Continual Anomaly Detection with Contrastively-learned Prompt

1 code implementation 2 Jan 2024 Jiaqi Liu, Kai Wu, Qiang Nie, Ying Chen, Bin-Bin Gao, Yong liu, Jinbao Wang, Chengjie Wang, Feng Zheng

Unsupervised Anomaly Detection (UAD) with incremental training is crucial in industrial manufacturing, as unpredictable defects make obtaining sufficient labeled data infeasible.

continual anomaly detection Continual Learning +2

1st Place Solution for 5th LSVOS Challenge: Referring Video Object Segmentation

1 code implementation 1 Jan 2024 Zhuoyan Luo, Yicheng Xiao, Yong liu, Yitong Wang, Yansong Tang, Xiu Li, Yujiu Yang

Recent transformer-based models have dominated the Referring Video Object Segmentation (RVOS) task due to their superior performance.

Object Referring Video Object Segmentation +3

A Generalist FaceX via Learning Unified Facial Representation

1 code implementation 31 Dec 2023 Yue Han, Jiangning Zhang, Junwei Zhu, Xiangtai Li, Yanhao Ge, Wei Li, Chengjie Wang, Yong liu, Xiaoming Liu, Ying Tai

This work presents the FaceX framework, a novel facial generalist model capable of handling diverse facial tasks simultaneously.

Facial Editing

Learnable Chamfer Distance for Point Cloud Reconstruction

1 code implementation 27 Dec 2023 Tianxin Huang, Qingyao Liu, Xiangrui Zhao, Jun Chen, Yong liu

As point clouds are 3D signals with permutation invariance, most existing works train their reconstruction networks by measuring shape differences with the average point-to-point distance between point clouds matched with predefined rules.

Point cloud reconstruction
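
As background for the entry above, the standard (non-learnable) Chamfer distance is one example of a predefined matching rule: every point is matched to its nearest neighbor in the other cloud. The sketch below shows that baseline in plain Python under the assumption of squared Euclidean distances and mean aggregation; it is not the learnable variant the paper proposes, and the function name is chosen here for illustration.

```python
def chamfer_distance(cloud_a, cloud_b):
    """Symmetric Chamfer distance between two non-empty point clouds.

    Each cloud is a sequence of equal-length coordinate tuples. For every
    point, the squared distance to its nearest neighbor in the other cloud
    is taken, and the two directional averages are summed.
    """
    def sq_dist(p, q):
        return sum((x - y) ** 2 for x, y in zip(p, q))

    def one_way(src, dst):
        return sum(min(sq_dist(p, q) for q in dst) for p in src) / len(src)

    return one_way(cloud_a, cloud_b) + one_way(cloud_b, cloud_a)

# Reordering the points does not change the result (permutation invariance).
a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
b = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
same_shape = chamfer_distance(a, b)
shifted = chamfer_distance([(0.0, 0.0, 0.0)], [(1.0, 0.0, 0.0)])
```

This brute-force version costs O(|A|·|B|) distance evaluations, which is fine for a toy example but not for large clouds.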

Paint3D: Paint Anything 3D with Lighting-Less Texture Diffusion Models

1 code implementation 21 Dec 2023 Xianfang Zeng, Xin Chen, Zhongqi Qi, Wen Liu, Zibo Zhao, Zhibin Wang, Bin Fu, Yong liu, Gang Yu

This paper presents Paint3D, a novel coarse-to-fine generative framework that is capable of producing high-resolution, lighting-less, and diverse 2K UV texture maps for untextured 3D meshes conditioned on text or image inputs.

2k

Beyond Prototypes: Semantic Anchor Regularization for Better Representation Learning

1 code implementation 19 Dec 2023 Yanqi Ge, Qiang Nie, Ye Huang, Yong liu, Chengjie Wang, Feng Zheng, Wen Li, Lixin Duan

By pulling the learned features to these semantic anchors, several advantages can be attained: 1) intra-class compactness and natural inter-class separability, 2) avoidance of induced bias or errors from feature learning, and 3) robustness to the long-tailed problem.

Disentanglement

VQA4CIR: Boosting Composed Image Retrieval with Visual Question Answering

1 code implementation 19 Dec 2023 Chun-Mei Feng, Yang Bai, Tao Luo, Zhen Li, Salman Khan, WangMeng Zuo, Xinxing Xu, Rick Siow Mong Goh, Yong liu

By feeding the retrieved image and question to the VQA model, one can find the images inconsistent with the relative caption when the answer given by the VQA model is inconsistent with the answer in the QA pair.

Image Retrieval Question Answering +2

CR-SFP: Learning Consistent Representation for Soft Filter Pruning

no code implementations 17 Dec 2023 Jingyang Xiang, Zhuangzhi Chen, Jianbiao Mei, Siqi Li, Jun Chen, Yong liu

In this paper, we propose to mitigate this gap by learning consistent representation for soft filter pruning, dubbed CR-SFP.

MaxQ: Multi-Axis Query for N:M Sparsity Network

1 code implementation 12 Dec 2023 Jingyang Xiang, Siqi Li, JunHao Chen, Zhuangzhi Chen, Tianxin Huang, Linpeng Peng, Yong liu

Meanwhile, a sparsity strategy that gradually increases the percentage of N:M weight blocks is applied, which allows the network to heal from the pruning-induced damage progressively.

Image Classification Instance Segmentation +3
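
To make the N:M pattern mentioned above concrete: in every block of M consecutive weights, at most N stay non-zero. The sketch below applies a plain magnitude-based N:M pattern to a flat weight list. It illustrates only the sparsity pattern, not MaxQ's multi-axis query or its gradual schedule; the function name nm_prune and the sample weights are assumptions for illustration.

```python
def nm_prune(weights, n=2, m=4):
    """Keep the n largest-magnitude weights in each block of m; zero the rest.

    Hypothetical helper for illustration only. The number of weights
    must be a multiple of m.
    """
    if len(weights) % m != 0:
        raise ValueError("number of weights must be a multiple of m")
    pruned = []
    for start in range(0, len(weights), m):
        block = weights[start:start + m]
        # Indices of the n entries with the largest absolute value.
        keep = sorted(range(m), key=lambda k: abs(block[k]), reverse=True)[:n]
        pruned.extend(w if k in keep else 0.0 for k, w in enumerate(block))
    return pruned

sparse = nm_prune([0.1, -0.9, 0.5, 0.2, 1.0, 0.0, -0.3, 0.4], n=2, m=4)
```

Each block of four in the result contains exactly two non-zero entries, which is the 2:4 layout.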

Camera-based 3D Semantic Scene Completion with Sparse Guidance Network

1 code implementation 10 Dec 2023 Jianbiao Mei, Yu Yang, Mengmeng Wang, Junyu Zhu, Xiangrui Zhao, Jongwon Ra, Laijian Li, Yong liu

Semantic scene completion (SSC) aims to predict the semantic occupancy of each voxel in the entire 3D scene from limited observations, which is an emerging and critical task for autonomous driving.

3D Semantic Scene Completion Autonomous Driving

ASWT-SGNN: Adaptive Spectral Wavelet Transform-based Self-Supervised Graph Neural Network

no code implementations 10 Dec 2023 Ruyue Liu, Rong Yin, Yong liu, Weiping Wang

Graph Comparative Learning (GCL) is a self-supervised method that combines the advantages of Graph Convolutional Networks (GCNs) and comparative learning, making it promising for learning node representations.

Node Classification

Information-Theoretic Generalization Bounds for Transductive Learning and its Applications

no code implementations 8 Nov 2023 Huayi Tang, Yong liu

In this paper, we develop data-dependent and algorithm-dependent generalization bounds for transductive learning algorithms in the context of information theory for the first time.

Generalization Bounds Graph Learning +1

APGL4SR: A Generic Framework with Adaptive and Personalized Global Collaborative Information in Sequential Recommendation

1 code implementation 6 Nov 2023 Mingjia Yin, Hao Wang, Xiang Xu, Likang Wu, Sirui Zhao, Wei Guo, Yong liu, Ruiming Tang, Defu Lian, Enhong Chen

To this end, we propose a graph-driven framework, named Adaptive and Personalized Graph Learning for Sequential Recommendation (APGL4SR), that incorporates adaptive and personalized global collaborative information into sequential recommendation systems.

Graph Learning Multi-Task Learning +1

GPT-4V-AD: Exploring Grounding Potential of VQA-oriented GPT-4V for Zero-shot Anomaly Detection

1 code implementation 5 Nov 2023 Jiangning Zhang, Haoyang He, Xuhai Chen, Zhucun Xue, Yabiao Wang, Chengjie Wang, Lei Xie, Yong liu

Large Multimodal Model (LMM) GPT-4V(ision) endows GPT-4 with visual grounding capabilities, making it possible to handle certain tasks through the Visual Question Answering (VQA) paradigm.

Anomaly Detection Question Answering +3

VIGraph: Generative Self-supervised Learning for Class-Imbalanced Node Classification

no code implementations 2 Nov 2023 Yulan Hu, Sheng Ouyang, Zhirui Yang, Yong liu

VIGraph strictly adheres to the concept of imbalance when constructing imbalanced graphs and innovatively leverages the variational inference (VI) ability of Variational GAE to generate nodes for minority classes.

Contrastive Learning Node Classification +2

ConDefects: A New Dataset to Address the Data Leakage Concern for LLM-based Fault Localization and Program Repair

no code implementations 25 Oct 2023 Yonghao Wu, Zheng Li, Jie M. Zhang, Yong liu

With the growing interest in Large Language Models (LLMs) for fault localization and program repair, ensuring the integrity and generalizability of LLM-based methods becomes paramount.

Benchmarking Fault localization

Nighttime Thermal Infrared Image Colorization with Feedback-based Object Appearance Learning

1 code implementation 24 Oct 2023 Fu-Ya Luo, Shu-Lin Liu, Yi-Jun Cao, Kai-Fu Yang, Chang-Yong Xie, Yong liu, Yong-Jie Li

Extensive experiments illustrate that the proposed FoalGAN is not only effective for appearance learning of small objects, but also outperforms other image translation methods in terms of semantic preservation and edge consistency for the NTIR2DC task.

Colorization Generative Adversarial Network +2

Graph Ranking Contrastive Learning: A Extremely Simple yet Efficient Method

no code implementations 23 Oct 2023 Yulan Hu, Sheng Ouyang, Jingyu Liu, Ge Chen, Zhirui Yang, Junchen Wan, Fuzheng Zhang, Zhongyuan Wang, Yong liu

Thus, we propose GraphRank, a simple yet efficient graph contrastive learning method that avoids the problem of false negative samples by redefining the concept of negative samples to a certain extent.

Contrastive Learning Graph Learning +1

In-context Learning with Transformer Is Really Equivalent to a Contrastive Learning Pattern

no code implementations 20 Oct 2023 Ruifeng Ren, Yong liu

To the best of our knowledge, our work is the first to provide the understanding of ICL from the perspective of contrastive learning and has the potential to facilitate future model design by referring to related works on contrastive learning.

Contrastive Learning In-Context Learning

Understanding Fairness Surrogate Functions in Algorithmic Fairness

1 code implementation 17 Oct 2023 Wei Yao, Zhanke Zhou, Zhicong Li, Bo Han, Yong liu

To mitigate such bias while achieving comparable accuracy, a promising approach is to introduce surrogate functions of the concerned fairness definition and solve a constrained optimization problem.

Fairness

SUBP: Soft Uniform Block Pruning for 1xN Sparse CNNs Multithreading Acceleration

1 code implementation 10 Oct 2023 Jingyang Xiang, Siqi Li, Jun Chen, Shipeng Bai, Yukai Ma, Guang Dai, Yong liu

To overcome them, this paper proposes a novel Soft Uniform Block Pruning (SUBP) approach to train a uniform 1×N sparse structured network from scratch.

iTransformer: Inverted Transformers Are Effective for Time Series Forecasting

4 code implementations 10 Oct 2023 Yong liu, Tengge Hu, Haoran Zhang, Haixu Wu, Shiyu Wang, Lintao Ma, Mingsheng Long

These forecasters leverage Transformers to model the global dependencies over temporal tokens of time series, with each token formed by multiple variates of the same timestamp.

Time Series Time Series Forecasting

Sentence-level Prompts Benefit Composed Image Retrieval

1 code implementation 9 Oct 2023 Yang Bai, Xinxing Xu, Yong liu, Salman Khan, Fahad Khan, WangMeng Zuo, Rick Siow Mong Goh, Chun-Mei Feng

Composed image retrieval (CIR) is the task of retrieving specific images by using a query that involves both a reference image and a relative caption.

Attribute Composed Image Retrieval (CoIR) +2

Perfect Alignment May be Poisonous to Graph Contrastive Learning

no code implementations 6 Oct 2023 Jingyu Liu, Huayi Tang, Yong liu

Graph Contrastive Learning (GCL) aims to learn node representations by aligning positive pairs and separating negative ones.

Contrastive Learning

Quantum generative adversarial learning in photonics

no code implementations 1 Oct 2023 Yizhi Wang, Shichuan Xue, Yaxuan Wang, Yong liu, Jiangfang Ding, Weixu Shi, Dongyang Wang, Yingwen Liu, Xiang Fu, Guangyao Huang, Anqi Huang, Mingtang Deng, Junjie Wu

Quantum Generative Adversarial Networks (QGANs), an intersection of quantum computing and machine learning, have attracted widespread attention due to their potential advantages over classical analogs.

Can the Query-based Object Detector Be Designed with Fewer Stages?

no code implementations 28 Sep 2023 Jialin Li, WeiFu Fu, Yuhuan Lin, Qiang Nie, Yong liu

Query-based object detectors have made significant advancements since the publication of DETR.

Real3D-AD: A Dataset of Point Cloud Anomaly Detection

1 code implementation NeurIPS 2023 Jiaqi Liu, Guoyang Xie, Ruitao Chen, Xinpeng Li, Jinbao Wang, Yong liu, Chengjie Wang, Feng Zheng

High-precision point cloud anomaly detection is the gold standard for identifying the defects of advancing machining and precision manufacturing.

3D Anomaly Detection

Predicting Fatigue Crack Growth via Path Slicing and Re-Weighting

1 code implementation 13 Sep 2023 Yingjie Zhao, Yong liu, Zhiping Xu

Predicting potential risks associated with the fatigue of key structural components is crucial in engineering design.

Decision Making Dimensionality Reduction +2

A Comprehensive Survey on Deep Learning Techniques in Educational Data Mining

no code implementations 9 Sep 2023 Yuanguo Lin, Hong Chen, Wei Xia, Fan Lin, Zongyue Wang, Yong liu

With the increasing complexity and diversity of educational data, Deep Learning techniques have shown significant advantages in addressing the challenges associated with analyzing and modeling this data.

Knowledge Tracing

Adaptive Multi-Modalities Fusion in Sequential Recommendation Systems

1 code implementation 30 Aug 2023 Hengchang Hu, Wei Guo, Yong liu, Min-Yen Kan

We propose a graph-based approach (named MMSR) to fuse modality features in an adaptive order, enabling each modality to prioritize either its inherent sequential nature or its interplay with other modalities.

Sequential Recommendation

Semi-Supervised Learning for Visual Bird's Eye View Semantic Segmentation

1 code implementation 28 Aug 2023 Junyu Zhu, Lina Liu, Yu Tang, Feng Wen, Wanlong Li, Yong liu

In this paper, we present a novel semi-supervised framework for visual BEV semantic segmentation to boost performance by exploiting unlabeled images during training.

Autonomous Vehicles Bird's-Eye View Semantic Segmentation +2

Synchronize Feature Extracting and Matching: A Single Branch Framework for 3D Object Tracking

no code implementations ICCV 2023 Teli Ma, Mengmeng Wang, Jimin Xiao, Huifeng Wu, Yong liu

In this paper, we forsake the conventional Siamese paradigm and propose a novel single-branch framework, SyncTrack, which synchronizes feature extraction and matching to avoid forwarding the encoder twice for the template and search region, and to avoid introducing the extra parameters of a matching network.

3D Object Tracking Object Tracking

Decentralized Riemannian Conjugate Gradient Method on the Stiefel Manifold

no code implementations 21 Aug 2023 Jun Chen, Haishan Ye, Mengmeng Wang, Tianxin Huang, Guang Dai, Ivor W. Tsang, Yong liu

This paper proposes a decentralized Riemannian conjugate gradient descent (DRCGD) method that aims at minimizing a global function over the Stiefel manifold.

Second-order methods

End-to-End Beam Retrieval for Multi-Hop Question Answering

2 code implementations 17 Aug 2023 Jiahao Zhang, Haiyang Zhang, Dongmei Zhang, Yong liu, Shen Huang

This approach models the multi-hop retrieval process in an end-to-end manner by jointly optimizing an encoder and two classification heads across all hops.

Language Modelling Large Language Model +3

A Survey on Model Compression for Large Language Models

no code implementations 15 Aug 2023 Xunyu Zhu, Jian Li, Yong liu, Can Ma, Weiping Wang

As these challenges become increasingly pertinent, the field of model compression has emerged as a pivotal research area to alleviate these limitations.

Benchmarking Knowledge Distillation +2

Unified Data-Free Compression: Pruning and Quantization without Fine-Tuning

no code implementations ICCV 2023 Shipeng Bai, Jun Chen, Xintian Shen, Yixuan Qian, Yong liu

Therefore, a few data-free methods are proposed to address this problem, but they perform data-free pruning and quantization separately, which does not explore the complementarity of pruning and quantization.

Image Classification Quantization

Diverse Data Augmentation with Diffusions for Effective Test-time Prompt Tuning

1 code implementation ICCV 2023 Chun-Mei Feng, Kai Yu, Yong liu, Salman Khan, WangMeng Zuo

In this paper, we focus on a particular setting of learning adaptive prompts on the fly for each test sample from an unseen new domain, which is known as test-time prompt tuning (TPT).

Data Augmentation

Unfolding Once is Enough: A Deployment-Friendly Transformer Unit for Super-Resolution

1 code implementation 5 Aug 2023 Yong liu, Hang Dong, Boyang Liang, Songwei Liu, Qingji Dong, Kai Chen, Fangmin Chen, Lean Fu, Fei Wang

Since the high resolution of intermediate features in SISR models increases memory and computational requirements, efficient SISR transformers are more favored.

Image Super-Resolution

Unsupervised Representation Learning for Time Series: A Review

1 code implementation 3 Aug 2023 Qianwen Meng, Hangwei Qian, Yong liu, Yonghui Xu, Zhiqi Shen, Lizhen Cui

However, there is a lack of systematic analysis of unsupervised representation learning approaches for time series.

Contrastive Learning Representation Learning +1

Multimodal Adaptation of CLIP for Few-Shot Action Recognition

no code implementations 3 Aug 2023 Jiazheng Xing, Mengmeng Wang, Xiaojun Hou, Guang Dai, Jingdong Wang, Yong liu

The adapters we design can combine information from video-text multimodal sources for task-oriented spatiotemporal modeling, which is fast, efficient, and has low training costs.

Few-Shot action recognition Few Shot Action Recognition

High Probability Analysis for Non-Convex Stochastic Optimization with Clipping

no code implementations 25 Jul 2023 Shaojie Li, Yong liu

Gradient clipping is a commonly used technique to stabilize the training process of neural networks.

Stochastic Optimization
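
Gradient clipping, as mentioned in the entry above, is commonly implemented by rescaling a gradient whose norm exceeds a threshold. The following is a minimal sketch of norm-based clipping in plain Python; it is a generic illustration rather than the specific setting analyzed in the paper, and the names clip_by_norm and max_norm are assumptions.

```python
import math

def clip_by_norm(grad, max_norm):
    """Rescale grad so that its Euclidean norm is at most max_norm.

    Gradients already within the threshold are returned unchanged
    (as a new list). Hypothetical helper for illustration only.
    """
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= max_norm:
        return list(grad)
    scale = max_norm / norm
    return [g * scale for g in grad]

# A gradient of norm 5 is scaled down to norm 1; its direction is preserved.
clipped = clip_by_norm([3.0, 4.0], max_norm=1.0)
untouched = clip_by_norm([0.3, 0.4], max_norm=1.0)
```

Clipping by norm keeps the update direction and only bounds the step size, which is why it is often preferred over clipping each coordinate independently.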

Can Large Language Models Empower Molecular Property Prediction?

1 code implementation 14 Jul 2023 Chen Qian, Huayi Tang, Zhirui Yang, Hong Liang, Yong liu

Molecular property prediction has gained significant attention due to its transformative potential in multiple scientific disciplines.

Molecular Property Prediction Property Prediction

YOLIC: An Efficient Method for Object Localization and Classification on Edge Devices

no code implementations 13 Jul 2023 Kai Su, Yoichi Tomioka, Qiangfu Zhao, Yong liu

In the realm of Tiny AI, we introduce "You Only Look at Interested Cells" (YOLIC), an efficient method for object localization and classification on edge devices.

Classification Computational Efficiency +6

Data-Free Quantization via Mixed-Precision Compensation without Fine-Tuning

no code implementations 2 Jul 2023 Jun Chen, Shipeng Bai, Tianxin Huang, Mengmeng Wang, Guanzhong Tian, Yong liu

In this paper, we propose a data-free mixed-precision compensation (DF-MPC) method to recover the performance of an ultra-low precision quantized model without any data or fine-tuning process.

Data Free Quantization Model Compression

PANet: LiDAR Panoptic Segmentation with Sparse Instance Proposal and Aggregation

1 code implementation 27 Jun 2023 Jianbiao Mei, Yu Yang, Mengmeng Wang, Xiaojun Hou, Laijian Li, Yong liu

Firstly, we propose a non-learning Sparse Instance Proposal (SIP) module with the "sampling-shifting-grouping" scheme to directly group thing points into instances from the raw point cloud efficiently.

Autonomous Driving Instance Segmentation +2

SSC-RS: Elevate LiDAR Semantic Scene Completion with Representation Separation and BEV Fusion

1 code implementation 27 Jun 2023 Jianbiao Mei, Yu Yang, Mengmeng Wang, Tianxin Huang, Xuemeng Yang, Yong liu

However, how to effectively exploit the relationships between the semantic context in semantic segmentation and geometric structure in scene completion remains under exploration.

Autonomous Driving Scene Understanding +1

How Can Recommender Systems Benefit from Large Language Models: A Survey

1 code implementation 9 Jun 2023 Jianghao Lin, Xinyi Dai, Yunjia Xi, Weiwen Liu, Bo Chen, Hao Zhang, Yong liu, Chuhan Wu, Xiangyang Li, Chenxu Zhu, Huifeng Guo, Yong Yu, Ruiming Tang, Weinan Zhang

In this paper, we conduct a comprehensive survey on this research direction from the perspective of the whole pipeline in real-world recommender systems.

Ethics Feature Engineering +5

ViG-UNet: Vision Graph Neural Networks for Medical Image Segmentation

1 code implementation 8 Jun 2023 Juntao Jiang, Xiyu Chen, Guanzhong Tian, Yong liu

Deep neural networks have been widely used in medical image analysis and medical image segmentation is one of the most important tasks.

Image Segmentation Medical Image Segmentation +2

Koopa: Learning Non-stationary Time Series Dynamics with Koopman Predictors

1 code implementation NeurIPS 2023 Yong liu, Chenyu Li, Jianmin Wang, Mingsheng Long

While previous models suffer from complicated series variations induced by changing temporal distribution, we tackle non-stationary time series with modern Koopman theory that fundamentally considers the underlying time-variant dynamics.

Time Series

SOC: Semantic-Assisted Object Cluster for Referring Video Object Segmentation

1 code implementation NeurIPS 2023 Zhuoyan Luo, Yicheng Xiao, Yong liu, Shuyan Li, Yitong Wang, Yansong Tang, Xiu Li, Yujiu Yang

To address this issue, we propose Semantic-assisted Object Cluster (SOC), which aggregates video content and textual guidance for unified temporal modeling and cross-modal alignment.

Ranked #2 on Referring Expression Segmentation on A2D Sentences (using extra training data)

Object Referring Expression Segmentation +4

Learning Global-aware Kernel for Image Harmonization

no code implementations ICCV 2023 Xintian Shen, Jiangning Zhang, Jun Chen, Shipeng Bai, Yue Han, Yabiao Wang, Chengjie Wang, Yong liu

To address this issue, we propose a novel Global-aware Kernel Network (GKNet) to harmonize local regions with comprehensive consideration of long-distance background references.

Image Harmonization

Correlation Pyramid Network for 3D Single Object Tracking

no code implementations 16 May 2023 Mengmeng Wang, Teli Ma, Xingxing Zuo, Jiajun Lv, Yong liu

Additionally, considering the sparsity characteristics of the point clouds, we design a lateral correlation pyramid structure for the encoder to keep as many points as possible by integrating hierarchical correlated features.

3D Single Object Tracking Autonomous Driving +2

FusionDepth: Complement Self-Supervised Monocular Depth Estimation with Cost Volume

no code implementations 10 May 2023 Zhuofei Huang, Jianlin Liu, Shang Xu, Ying Chen, Yong liu

Multi-view stereo depth estimation based on cost volume usually works better than self-supervised monocular depth estimation except for moving objects and low-textured surfaces.

Monocular Depth Estimation Stereo Depth Estimation

Multimodal-driven Talking Face Generation via a Unified Diffusion-based Generator

no code implementations 4 May 2023 Chao Xu, Shaoting Zhu, Junwei Zhu, Tianxin Huang, Jiangning Zhang, Ying Tai, Yong liu

More specifically, given a textured face as the source and the rendered face projected from the desired 3DMM coefficients as the target, our proposed Texture-Geometry-aware Diffusion Model decomposes the complex transfer problem into a multi-conditional denoising process, where a Texture Attention-based module accurately models the correspondences between appearance and geometry cues contained in the source and target conditions, and incorporates extra implicit information for high-fidelity talking face generation.

Denoising Face Swapping +1

High-fidelity Generalized Emotional Talking Face Generation with Multi-modal Emotion Space Learning

no code implementations CVPR 2023 Chao Xu, Junwei Zhu, Jiangning Zhang, Yue Han, Wenqing Chu, Ying Tai, Chengjie Wang, Zhifeng Xie, Yong liu

Specifically, we supplement the emotion style in text prompts and use an Aligned Multi-modal Emotion encoder to embed the text, image, and audio emotion modalities into a unified space, which inherits a rich semantic prior from CLIP.

Talking Face Generation

NeRF-Loc: Visual Localization with Conditional Neural Radiance Field

1 code implementation 17 Apr 2023 Jianlin Liu, Qiang Nie, Yong liu, Chengjie Wang

We propose a novel visual re-localization method based on direct matching between the implicit 3D descriptors and the 2D image with a transformer.

Neural Rendering Visual Localization

RIFormer: Keep Your Vision Backbone Effective While Removing Token Mixer

2 code implementations 12 Apr 2023 Jiahao Wang, Songyang Zhang, Yong liu, Taiqiang Wu, Yujiu Yang, Xihui Liu, Kai Chen, Ping Luo, Dahua Lin

Extensive experiments and ablative analysis also demonstrate that the inductive bias of a network architecture can be incorporated into a simple network structure with an appropriate optimization strategy.

Inductive Bias

Robust Neural Architecture Search

no code implementations 6 Apr 2023 Xunyu Zhu, Jian Li, Yong liu, Weiping Wang

Neural Architecture Search (NAS) has become increasingly popular in recent years.

Image Classification Neural Architecture Search

Learning Federated Visual Prompt in Null Space for MRI Reconstruction

1 code implementation CVPR 2023 Chun-Mei Feng, Bangjun Li, Xinxing Xu, Yong liu, Huazhu Fu, WangMeng Zuo

Federated Magnetic Resonance Imaging (MRI) reconstruction enables multiple hospitals to collaborate in a distributed manner without aggregating local data, thereby protecting patient privacy.

MRI Reconstruction

Federated Uncertainty-Aware Aggregation for Fundus Diabetic Retinopathy Staging

no code implementations 23 Mar 2023 Meng Wang, Lianyu Wang, Xinxing Xu, Ke Zou, Yiming Qian, Rick Siow Mong Goh, Yong liu, Huazhu Fu

Our TWEU employs an evidential deep layer to produce the uncertainty score with the DR staging results for client reliability evaluation.

Federated Learning

Global Knowledge Calibration for Fast Open-Vocabulary Segmentation

1 code implementation ICCV 2023 Kunyang Han, Yong liu, Jun Hao Liew, Henghui Ding, Yunchao Wei, Jiajun Liu, Yitong Wang, Yansong Tang, Yujiu Yang, Jiashi Feng, Yao Zhao

Recent advancements in pre-trained vision-language models, such as CLIP, have enabled the segmentation of arbitrary concepts solely from textual inputs, a process commonly referred to as open-vocabulary semantic segmentation (OVS).

Knowledge Distillation Open Vocabulary Semantic Segmentation +4

Medical Phrase Grounding with Region-Phrase Context Contrastive Alignment

no code implementations 14 Mar 2023 Zhihao Chen, Yang Zhou, Anh Tran, Junting Zhao, Liang Wan, Gideon Ooi, Lionel Cheng, Choon Hua Thng, Xinxing Xu, Yong liu, Huazhu Fu

To enable MedRPG to locate nuanced medical findings with better region-phrase correspondences, we further propose Tri-attention Context contrastive alignment (TaCo).

Phrase Grounding Visual Grounding

A Unified BEV Model for Joint Learning of 3D Local Features and Overlap Estimation

1 code implementation 28 Feb 2023 Lin Li, Wendong Ding, Yongkun Wen, Yufei Liang, Yong liu, Guowei Wan

For overlap detection, a cross-attention module is applied to exchange contextual information between the input point clouds, followed by a classification head to estimate the overlapping region.

Point Cloud Registration

TcGAN: Semantic-Aware and Structure-Preserved GANs with Individual Vision Transformer for Fast Arbitrary One-Shot Image Generation

no code implementations 16 Feb 2023 Yunliang Jiang, Lili Yan, Xiongtao Zhang, Yong liu, Danfeng Sun

One-shot image generation (OSG) with generative adversarial networks that learn from the internal patches of a given image has attracted worldwide attention.

Image Harmonization Image Super-Resolution

Fuzzy Knowledge Distillation from High-Order TSK to Low-Order TSK

no code implementations 16 Feb 2023 Xiongtao Zhang, Zezong Yin, Yunliang Jiang, Yizhang Jiang, Danfeng Sun, Yong liu

High-order Takagi-Sugeno-Kang (TSK) fuzzy classifiers offer strong classification performance with fewer fuzzy rules, but they suffer from exponentially growing training time and poorer interpretability owing to the high-order polynomials used in the consequent part of each fuzzy rule. Low-order TSK fuzzy classifiers run quickly and are highly interpretable, but they usually require more fuzzy rules and perform relatively poorly.

Benchmarking Knowledge Distillation +1

Adaptive Value Decomposition with Greedy Marginal Contribution Computation for Cooperative Multi-Agent Reinforcement Learning

1 code implementation 14 Feb 2023 Shanqi Liu, Yujing Hu, Runze Wu, Dong Xing, Yu Xiong, Changjie Fan, Kun Kuang, Yong liu

We first illustrate that the proposed value decomposition can consider the complicated interactions among agents and is feasible to learn in large-scale scenarios.

Multi-agent Reinforcement Learning

Operation-level Progressive Differentiable Architecture Search

1 code implementation 11 Feb 2023 Xunyu Zhu, Jian Li, Yong liu, Weiping Wang

It can effectively alleviate the unfair competition between operations during the search phase of DARTS by offsetting the inherent unfair advantage of the skip connection over other operations.

Neural Architecture Search

Improving Differentiable Architecture Search via Self-Distillation

no code implementations 11 Feb 2023 Xunyu Zhu, Jian Li, Yong liu, Weiping Wang

Differentiable Architecture Search (DARTS) is a simple yet efficient Neural Architecture Search (NAS) method.

Neural Architecture Search

Learning Discretized Neural Networks under Ricci Flow

no code implementations 7 Feb 2023 Jun Chen, Hanwen Chen, Mengmeng Wang, Guang Dai, Ivor W. Tsang, Yong liu

By introducing a partial differential equation on metrics, i.e., the Ricci flow, we establish the dynamical stability and convergence of the LNE metric with the $L^2$-norm perturbation.

History-Aware Hierarchical Transformer for Multi-session Open-domain Dialogue System

no code implementations 2 Feb 2023 Tong Zhang, Yong liu, Boyang Li, Zhiwei Zeng, Pengwei Wang, Yuan You, Chunyan Miao, Lizhen Cui

HAHT maintains a long-term memory of past conversations and uses this historical information to understand the current conversation context and generate well-informed, context-relevant responses.

IM-IAD: Industrial Image Anomaly Detection Benchmark in Manufacturing

2 code implementations 31 Jan 2023 Guoyang Xie, Jinbao Wang, Jiaqi Liu, Jiayi Lyu, Yong liu, Chengjie Wang, Feng Zheng, Yaochu Jin

We realize that the lack of a uniform IM benchmark is hindering the development and usage of IAD methods in real-world applications.

Anomaly Detection Continual Learning +1

Reliable Federated Disentangling Network for Non-IID Domain Feature

no code implementations 30 Jan 2023 Meng Wang, Kai Yu, Chun-Mei Feng, Yiming Qian, Ke Zou, Lianyu Wang, Rick Siow Mong Goh, Yong liu, Huazhu Fu

To the best of our knowledge, our proposed RFedDis is the first work to develop an FL approach based on evidential uncertainty combined with feature disentangling, which enhances the performance and reliability of FL in non-IID domain features.

Federated Learning

FG-Depth: Flow-Guided Unsupervised Monocular Depth Estimation

no code implementations 20 Jan 2023 Junyu Zhu, Lina Liu, Yong liu, Wanlong Li, Feng Wen, Hongbo Zhang

The great potential of unsupervised monocular depth estimation has been demonstrated by many works due to low annotation cost and impressive accuracy comparable to supervised methods.

Image Reconstruction Monocular Depth Estimation +2

Revisiting the Spatial and Temporal Modeling for Few-shot Action Recognition

no code implementations 19 Jan 2023 Jiazheng Xing, Mengmeng Wang, Yong liu, Boyu Mu

In this paper, we propose SloshNet, a new framework that revisits the spatial and temporal modeling for few-shot action recognition in a finer manner.

Few-Shot action recognition Few Shot Action Recognition

BSNet: Lane Detection via Draw B-spline Curves Nearby

no code implementations 17 Jan 2023 Haoxin Chen, Mengmeng Wang, Yong liu

The locality of a lane representation is the ability to modify lanes locally, which can simplify parameter optimization.

Lane Detection

Reference Twice: A Simple and Unified Baseline for Few-Shot Instance Segmentation

1 code implementation 3 Jan 2023 Yue Han, Jiangning Zhang, Zhucun Xue, Chao Xu, Xintian Shen, Yabiao Wang, Chengjie Wang, Yong liu, Xiangtai Li

In this work, we explore a simple yet unified solution for FSIS as well as its incremental variants, and introduce a new framework named Reference Twice (RefT) to fully explore the relationship between support/query features based on a Transformer-like framework.

Benchmarking Few-Shot Object Detection +3

RIFormer: Keep Your Vision Backbone Effective but Removing Token Mixer

no code implementations CVPR 2023 Jiahao Wang, Songyang Zhang, Yong liu, Taiqiang Wu, Yujiu Yang, Xihui Liu, Kai Chen, Ping Luo, Dahua Lin

Extensive experiments and ablative analysis also demonstrate that the inductive bias of a network architecture can be incorporated into a simple network structure with an appropriate optimization strategy.

Inductive Bias

Learning To Measure the Point Cloud Reconstruction Loss in a Representation Space

no code implementations CVPR 2023 Tianxin Huang, Zhonggan Ding, Jiangning Zhang, Ying Tai, Zhenyu Zhang, Mingang Chen, Chengjie Wang, Yong liu

Specifically, we use the contrastive constraint to help CALoss learn a representation space with shape similarity, while we introduce the adversarial strategy to help CALoss mine differences between reconstructed results and ground truths.

Point cloud reconstruction

HOTNAS: Hierarchical Optimal Transport for Neural Architecture Search

no code implementations CVPR 2023 Jiechao Yang, Yong liu, Hongteng Xu

To address these issues, we propose a hierarchical optimal transport metric called HOTNN for measuring the similarity of different networks.

Bayesian Optimization Neural Architecture Search

Fair Scratch Tickets: Finding Fair Sparse Networks Without Weight Training

1 code implementation CVPR 2023 Pengwei Tang, Wei Yao, Zhicong Li, Yong liu

We randomly initialize a dense neural network and find appropriate binary masks for the weights to obtain fair sparse subnetworks without any weight training.

Fairness

Towards Reliable Medical Image Segmentation by utilizing Evidential Calibrated Uncertainty

3 code implementations 1 Jan 2023 Ke Zou, Yidi Chen, Ling Huang, Xuedong Yuan, Xiaojing Shen, Meng Wang, Rick Siow Mong Goh, Yong liu, Huazhu Fu

DEviS not only enhances the calibration and robustness of baseline segmentation accuracy but also provides high-efficiency uncertainty estimation for reliable predictions.

Computational Efficiency Image Segmentation +3

Multimodal Prototype-Enhanced Network for Few-Shot Action Recognition

no code implementations 9 Dec 2022 Xinzhe Ni, Yong liu, Hao Wen, Yatai Ji, Jing Xiao, Yujiu Yang

Then, in the visual flow, visual prototypes are computed by, for example, a Temporal-Relational CrossTransformer (TRX) module.

Few-Shot action recognition Few Shot Action Recognition +1

AdaCM: Adaptive ColorMLP for Real-Time Universal Photo-realistic Style Transfer

no code implementations 3 Dec 2022 Tianwei Lin, Honglin Lin, Fu Li, Dongliang He, Wenhao Wu, Meiling Wang, Xin Li, Yong liu

Then, in AdaCM, we adopt a CNN encoder to adaptively predict all parameters for the ColorMLP conditioned on each input content and style image pair.

4k Style Transfer

MHCCL: Masked Hierarchical Cluster-Wise Contrastive Learning for Multivariate Time Series

1 code implementation 2 Dec 2022 Qianwen Meng, Hangwei Qian, Yong liu, Lizhen Cui, Yonghui Xu, Zhiqi Shen

Learning semantic-rich representations from raw unlabeled time series data is critical for downstream tasks such as classification and forecasting.

Clustering Contrastive Learning +3

Reliable Joint Segmentation of Retinal Edema Lesions in OCT Images

no code implementations 1 Dec 2022 Meng Wang, Kai Yu, Chun-Mei Feng, Ke Zou, Yanyu Xu, Qingquan Meng, Rick Siow Mong Goh, Yong liu, Huazhu Fu

Specifically, aiming to improve the model's ability to learn the complex pathological features of retinal edema lesions in OCT images, we develop a novel segmentation backbone that integrates a wavelet-enhanced feature extractor network and a newly designed multi-scale transformer module.

Segmentation

Inductive Graph Transformer for Delivery Time Estimation

1 code implementation 5 Nov 2022 Xin Zhou, Jinglong Wang, Yong liu, Xingyu Wu, Zhiqi Shen, Cyril Leung

Providing an accurate estimated time of package delivery on users' purchasing pages for e-commerce platforms is of great importance to their purchasing decisions and post-purchase experiences.

Global Spectral Filter Memory Network for Video Object Segmentation

1 code implementation 11 Oct 2022 Yong liu, Ran Yu, Jiahao Wang, Xinyuan Zhao, Yitong Wang, Yansong Tang, Yujiu Yang

Besides, we empirically find that low-frequency features should be enhanced in the encoder (backbone) and high-frequency features in the decoder (segmentation head).

Attribute Object +4

Predictive Edge Caching through Deep Mining of Sequential Patterns in User Content Retrievals

no code implementations 6 Oct 2022 Chen Li, Xiaoyu Wang, Tongyu Zong, Houwei Cao, Yong liu

Edge caching plays an increasingly important role in boosting user content retrieval performance while reducing redundant network traffic.

Retrieval

TimesNet: Temporal 2D-Variation Modeling for General Time Series Analysis

3 code implementations 5 Oct 2022 Haixu Wu, Tengge Hu, Yong liu, Hang Zhou, Jianmin Wang, Mingsheng Long

TimesBlock can discover the multi-periodicity adaptively and extract the complex temporal variations from transformed 2D tensors by a parameter-efficient inception block.

Action Recognition Anomaly Detection +4

Generative Model Watermarking Based on Human Visual System

no code implementations 30 Sep 2022 Li Zhang, Yong liu, Shaoteng Liu, Tianshu Yang, Yexin Wang, Xinpeng Zhang, Hanzhou Wu

Intellectual property protection of deep neural networks is receiving attention from more and more researchers, and the latest research applies model watermarking to generative models for image processing.

Mask-Guided Image Person Removal with Data Synthesis

no code implementations 29 Sep 2022 Yunliang Jiang, Chenyang Gu, Zhenfeng Xue, Xiongtao Zhang, Yong liu

As a special case of common object removal, image person removal is playing an increasingly important role in social media and criminal investigation domains.

Rethinking Dimensionality Reduction in Grid-based 3D Object Detection

no code implementations 20 Sep 2022 Dihe Huang, Ying Chen, Yikang Ding, Jinli Liao, Jianlin Liu, Kai Wu, Qiang Nie, Yong liu, Chengjie Wang, Zhiheng Li

In MDRNet, the Spatial-aware Dimensionality Reduction (SDR) is designed to dynamically focus on the valuable parts of the object during voxel-to-BEV feature transformation.

3D Object Detection Cloud Detection +3

Exemplar-Based Image Colorization with A Learning Framework

no code implementations 13 Sep 2022 Zhenfeng Xue, Jiandang Yang, Jie Ren, Yong liu

This method can be viewed as a hybrid of exemplar-based and learning-based methods, and it decouples the colorization process from the learning process so as to generate various color styles for the same gray image.

Colorization Image Colorization

Joint Learning Content and Degradation Aware Feature for Blind Super-Resolution

1 code implementation 29 Aug 2022 Yifeng Zhou, Chuming Lin, Donghao Luo, Yong liu, Ying Tai, Chengjie Wang, Mingang Chen

Although some Unsupervised Degradation Prediction (UDP) methods are proposed to bypass this problem, the inconsistency between degradation embedding and SR feature is still challenging.

Blind Super-Resolution Image Super-Resolution +1

ATPL: Mutually enhanced adversarial training and pseudo labeling for unsupervised domain adaptation

no code implementations Knowledge-Based Systems 2022 Changan Yi, Haotian Chen, Yonghui Xu, Yong liu, Lei Jiang, Haishu Tan

Accordingly, ATPL will use the pseudo-labeled information to improve the adversarial training process, which can guarantee the feature transferability by generating adversarial data to fill in the domain gap.

Unsupervised Domain Adaptation

SuperLine3D: Self-supervised Line Segmentation and Description for LiDAR Point Cloud

1 code implementation 3 Aug 2022 Xiangrui Zhao, Sheng Yang, Tianxin Huang, Jun Chen, Teng Ma, Mingyang Li, Yong liu

To repetitively extract them as features and perform association between discrete LiDAR frames for registration, we propose the first learning-based feature segmentation and description model for 3D lines in LiDAR point cloud.

Point Cloud Registration Segmentation

DA$^2$ Dataset: Toward Dexterity-Aware Dual-Arm Grasping

no code implementations 31 Jul 2022 Guangyao Zhai, Yu Zheng, Ziwei Xu, Xin Kong, Yong liu, Benjamin Busam, Yi Ren, Nassir Navab, Zhengyou Zhang

In this paper, we introduce DA$^2$, the first large-scale dual-arm dexterity-aware dataset for the generation of optimal bimanual grasping pairs for arbitrary large objects.

Layer-refined Graph Convolutional Networks for Recommendation

1 code implementation 22 Jul 2022 Xin Zhou, Donghui Lin, Yong liu, Chunyan Miao

Specifically, these models usually aggregate all layer embeddings for node updating and achieve their best recommendation performance within a few layers because of over-smoothing.

Adaptive Assignment for Geometry Aware Local Feature Matching

1 code implementation CVPR 2023 Dihe Huang, Ying Chen, Shang Xu, Yong liu, Wenlong Wu, Yikang Ding, Chengjie Wang, Fan Tang

The detector-free feature matching approaches are currently attracting great attention thanks to their excellent performance.

Feature Correlation

E-NeRV: Expedite Neural Video Representation with Disentangled Spatial-Temporal Context

1 code implementation 17 Jul 2022 Zizhang Li, Mengmeng Wang, Huaijin Pi, Kechun Xu, Jianbiao Mei, Yong liu

However, the redundant parameters within the network structure can cause a large model size when scaling up for desirable performance.

Video Reconstruction

Learning Quality-aware Dynamic Memory for Video Object Segmentation

1 code implementation 16 Jul 2022 Yong liu, Ran Yu, Fei Yin, Xinyuan Zhao, Wei Zhao, Weihao Xia, Yujiu Yang

However, they mainly focus on better matching between the current frame and the memory frames without explicitly paying attention to the quality of the memory.

Ranked #11 on Semi-Supervised Video Object Segmentation on DAVIS 2016 (using extra training data)

Segmentation Semantic Segmentation +2

Bootstrap Latent Representations for Multi-modal Recommendation

2 code implementations 13 Jul 2022 Xin Zhou, HongYu Zhou, Yong liu, Zhiwei Zeng, Chunyan Miao, Pengwei Wang, Yuan You, Feijun Jiang

Besides the user-item interaction graph, existing state-of-the-art methods usually use auxiliary graphs (e.g., user-user or item-item relation graph) to augment the learned representations of users and/or items.

Minimalist and High-performance Conversational Recommendation with Uncertainty Estimation for User Preference

no code implementations 29 Jun 2022 Yinan Zhang, Boyang Li, Yong liu, You Yuan, Chunyan Miao

Multi-shot CRS is designed to make recommendations multiple times until the user either accepts the recommendation or leaves at the end of their patience.

Attribute Reinforcement Learning (RL)

EATFormer: Improving Vision Transformer Inspired by Evolutionary Algorithm

1 code implementation 19 Jun 2022 Jiangning Zhang, Xiangtai Li, Yabiao Wang, Chengjie Wang, Yibo Yang, Yong liu, DaCheng Tao

Motivated by biological evolution, this paper explains the rationality of Vision Transformer by analogy with the proven practical Evolutionary Algorithm (EA) and derives that both have a consistent mathematical formulation.

Image Classification

Towards Practical Differential Privacy in Data Analysis: Understanding the Effect of Epsilon on Utility in Private ERM

no code implementations 6 Jun 2022 Yuzhe Li, Yong liu, Bo Li, Weiping Wang, Nan Liu

In this paper, we focus our attention on private Empirical Risk Minimization (ERM), which is one of the most commonly used data analysis methods.

Enhancing Sequential Recommendation with Graph Contrastive Learning

no code implementations 30 May 2022 Yixin Zhang, Yong liu, Yonghui Xu, Hao Xiong, Chenyi Lei, wei he, Lizhen Cui, Chunyan Miao

Specifically, GCL4SR employs a Weighted Item Transition Graph (WITG), built based on interaction sequences of all users, to provide global context information for each interaction and weaken the noise information in the sequence data.

Auxiliary Learning Contrastive Learning +1

Non-stationary Transformers: Exploring the Stationarity in Time Series Forecasting

1 code implementation 28 May 2022 Yong liu, Haixu Wu, Jianmin Wang, Mingsheng Long

However, their performance can degenerate terribly on non-stationary real-world data in which the joint distribution changes over time.

Time Series Time Series Forecasting

UniInst: Unique Representation for End-to-End Instance Segmentation

1 code implementation 25 May 2022 Yimin Ou, Rui Yang, Lufan Ma, Yong liu, Jiangpeng Yan, Shang Xu, Chengjie Wang, Xiu Li

Existing instance segmentation methods have achieved impressive performance but still suffer from a common dilemma: redundant representations (e.g., multiple boxes, grids, and anchor points) are inferred for one instance, which leads to multiple duplicated predictions.

Instance Segmentation Re-Ranking +2

Ridgeless Regression with Random Features

1 code implementation 1 May 2022 Jian Li, Yong liu, Yingying Zhang

Recent theoretical studies illustrated that kernel ridgeless regression can guarantee good generalization ability without explicit regularization.

regression

Understanding the Generalization Performance of Spectral Clustering Algorithms

no code implementations 30 Apr 2022 Shaojie Li, Sheng Ouyang, Yong liu

The theoretical analysis of spectral clustering mainly focuses on consistency, while there is relatively little research on its generalization performance.

Clustering

Sharper Utility Bounds for Differentially Private Models

no code implementations 22 Apr 2022 Yilin Kang, Yong liu, Jian Li, Weiping Wang

In this paper, by introducing the Generalized Bernstein condition, we propose the first $\mathcal{O}\big(\frac{\sqrt{p}}{n\epsilon}\big)$ high-probability excess population risk bound for differentially private algorithms under the $G$-Lipschitz, $L$-smooth, and Polyak-Łojasiewicz assumptions, based on the gradient perturbation method.

Stability and Generalization of Differentially Private Minimax Problems

no code implementations 11 Apr 2022 Yilin Kang, Yong liu, Jian Li, Weiping Wang

To the best of our knowledge, this is the first work to analyze the generalization performance of the general minimax paradigm while taking differential privacy into account.

CRAFT: Cross-Attentional Flow Transformer for Robust Optical Flow

1 code implementation CVPR 2022 Xiuchao Sui, Shaohua Li, Xue Geng, Yan Wu, Xinxing Xu, Yong liu, Rick Goh, Hongyuan Zhu

This is mainly because the correlation volume, the basis of pixel matching, is computed as the dot product of the convolutional features of the two images.

Optical Flow Estimation
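
The dot-product correlation volume that this abstract refers to can be sketched in a few lines. This is only the basic construction the sentence describes, not the CRAFT architecture; the function name `correlation_volume` and the toy feature vectors are assumptions made for illustration, with each image flattened to a list of per-pixel feature vectors.

```python
def correlation_volume(feat1, feat2):
    """Return corr[i][j] = dot(feat1[i], feat2[j]).

    feat1 and feat2 are lists of per-pixel feature vectors (one image each,
    pixels flattened). Names and layout are illustrative assumptions.
    """
    return [
        [sum(a * b for a, b in zip(f1, f2)) for f2 in feat2]
        for f1 in feat1
    ]

# Two toy "images" with two pixels each and 2-dimensional features.
feat1 = [[1.0, 0.0], [0.0, 1.0]]
feat2 = [[0.0, 2.0], [3.0, 0.0]]
corr = correlation_volume(feat1, feat2)

# For each pixel of the first image, the best-matching pixel of the second.
best_match = [row.index(max(row)) for row in corr]
```

Each entry scores how similar one pixel of the first image is to one pixel of the second, so the row-wise maximum gives a crude match that flow estimators then refine.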

Region-Aware Face Swapping

no code implementations CVPR 2022 Chao Xu, Jiangning Zhang, Miao Hua, Qian He, Zili Yi, Yong liu

This paper presents a novel Region-Aware Face Swapping (RAFSwap) network to achieve identity-consistent harmonious high-resolution face generation in a local-global manner: 1) The Local Facial Region-Aware (FRA) branch augments local identity-relevant features by introducing the Transformer to effectively model misaligned cross-scale semantic interaction.

Face Generation Face Swapping +1

Towards Efficient and Scalable Sharpness-Aware Minimization

2 code implementations CVPR 2022 Yong liu, Siqi Mai, Xiangning Chen, Cho-Jui Hsieh, Yang You

Recently, Sharpness-Aware Minimization (SAM), which connects the geometry of the loss landscape and generalization, has demonstrated significant performance boosts on training large-scale models such as vision transformers.

Omni-frequency Channel-selection Representations for Unsupervised Anomaly Detection

1 code implementation 1 Mar 2022 Yufei Liang, Jiangning Zhang, Shiwei Zhao, Runze Wu, Yong liu, Shuwen Pan

Density-based and classification-based methods have dominated unsupervised anomaly detection in recent years, while reconstruction-based methods are rarely mentioned because of their poor reconstruction ability and low performance.

Unsupervised Anomaly Detection

Guide Local Feature Matching by Overlap Estimation

1 code implementation 18 Feb 2022 Ying Chen, Dihe Huang, Shang Xu, Jianlin Liu, Yong liu

Local image feature matching under large appearance, viewpoint, and distance changes is challenging yet important.

Feature Correlation

A Survey of Visual Sensory Anomaly Detection

1 code implementation 14 Feb 2022 Xi Jiang, Guoyang Xie, Jinbao Wang, Yong liu, Chengjie Wang, Feng Zheng, Yaochu Jin

In this survey, we are the first to provide a comprehensive review of visual sensory AD and categorize it into three levels according to the form of anomalies.

Anomaly Detection

SCSNet: An Efficient Paradigm for Learning Simultaneously Image Colorization and Super-Resolution

no code implementations 12 Jan 2022 Jiangning Zhang, Chao Xu, Jian Li, Yue Han, Yabiao Wang, Ying Tai, Yong liu

In the practical application of restoring low-resolution gray-scale images, we generally need to run three separate processes of image colorization, super-resolution, and down-sampling operation for the target device.

Colorization Image Colorization +1

Deep Domain Adversarial Adaptation for Photon-efficient Imaging

2 code implementations 7 Jan 2022 YiWei Chen, Gongxin Yao, Yong liu, Hongye Su, Xiaomin Hu, Yu Pan

Photon-efficient imaging with the single-photon light detection and ranging (LiDAR) captures the three-dimensional (3D) structure of a scene by only a few detected signal photons per pixel.

Domain Adaptation

Robust photon-efficient imaging using a pixel-wise residual shrinkage network

2 code implementations 5 Jan 2022 Gongxin Yao, YiWei Chen, Yong liu, Xiaomin Hu, Yu Pan

Single-photon light detection and ranging (LiDAR) has been widely applied to 3D imaging in challenging scenarios.

Depth Estimation

Deep Safe Multi-View Clustering: Reducing the Risk of Clustering Performance Degradation Caused by View Increase

no code implementations CVPR 2022 Huayi Tang, Yong liu

However, we observe that learning from data with more views is not guaranteed to achieve better clustering performance than from data with fewer views.

Clustering

Dynamically Stable Poincaré Embeddings for Neural Manifolds

no code implementations 21 Dec 2021 Jun Chen, Yuang Liu, Xiangrui Zhao, Mengmeng Wang, Yong liu

As a result, we prove that, if initial metrics have an $L^2$-norm perturbation which deviates from the Hyperbolic metric on the Poincaré ball, the scaled Ricci-DeTurck flow of such metrics smoothly and exponentially converges to the Hyperbolic metric.

Image Classification

SelFSR: Self-Conditioned Face Super-Resolution in the Wild via Flow Field Degradation Network

no code implementations 20 Dec 2021 Xianfang Zeng, Jiangning Zhang, Liang Liu, Guangzhong Tian, Yong liu

To tackle this problem, we propose a novel domain-adaptive degradation network for face super-resolution in the wild.

Super-Resolution

Searching Parameterized AP Loss for Object Detection

1 code implementation NeurIPS 2021 Chenxin Tao, Zizhang Li, Xizhou Zhu, Gao Huang, Yong liu, Jifeng Dai

In this paper, we propose Parameterized AP Loss, where parameterized functions are introduced to substitute the non-differentiable components in the AP calculation.

Object object-detection +1

MSP : Refine Boundary Segmentation via Multiscale Superpixel

no code implementations 3 Dec 2021 Jie Zhu, Huabin Huang, Banghuai Li, Yong liu, Leye Wang

Inspired by the sharp edges of generated superpixel blocks, we employ superpixels to guide information passing within the feature map.

Scene Parsing Segmentation +1

Refined Learning Bounds for Kernel and Approximate $k$-Means

no code implementations NeurIPS 2021 Yong liu

In this paper, we study the statistical properties of kernel $k$-means and Nyström-based kernel $k$-means, and obtain optimal clustering risk bounds, which improve the existing risk bounds.

Clustering

Towards Sharper Generalization Bounds for Structured Prediction

no code implementations NeurIPS 2021 Shaojie Li, Yong liu

In the smoothness scenario, we provide generalization bounds that not only have a logarithmic dependency on the label set cardinality but also attain a faster convergence rate of order $\mathcal{O}(\frac{1}{n})$ in the sample size $n$.

Generalization Bounds Structured Prediction

Improved Learning Rates of a Functional Lasso-type SVM with Sparse Multi-Kernel Representation

no code implementations NeurIPS 2021 Shaogao Lv, Junhui Wang, Jiankun Liu, Yong liu

In this paper, we provide theoretical results of estimation bounds and excess risk upper bounds for support vector machine (SVM) with sparse multi-kernel representation.

Morphological feature visualization of Alzheimer's disease via Multidirectional Perception GAN

no code implementations 25 Nov 2021 Wen Yu, Baiying Lei, Yanyan Shen, Shuqiang Wang, Yong liu, Zhiguang Feng, Yong Hu, Michael K. Ng

In this work, a novel Multidirectional Perception Generative Adversarial Network (MP-GAN) is proposed to visualize the morphological features indicating the severity of AD for patients of different stages.

Generative Adversarial Network

MaIL: A Unified Mask-Image-Language Trimodal Network for Referring Image Segmentation

no code implementations 21 Nov 2021 Zizhang Li, Mengmeng Wang, Jianbiao Mei, Yong liu

Referring image segmentation is a typical multi-modal task, which aims at generating a binary mask for the referent described in a given language expression.

Image Segmentation Referring Expression Segmentation +2

Green CWS: Extreme Distillation and Efficient Decode Method Towards Industrial Application

no code implementations 17 Nov 2021 Yulan Hu, Yong liu

Benefiting from the strong ability of the pre-trained model, the research on Chinese Word Segmentation (CWS) has made great progress in recent years.

Chinese Word Segmentation Language Modelling

Thoughts on the Consistency between Ricci Flow and Neural Network Behavior

no code implementations 16 Nov 2021 Jun Chen, Tianxin Huang, Wenzhou Chen, Yong liu

During the training process of the neural network, we observe that its metric will also regularly converge to the linearly nearly Euclidean metric, which is consistent with the convergent behavior of linearly nearly Euclidean metrics under the Ricci-DeTurck flow.

A layer-stress learning framework universally augments deep neural network tasks

no code implementations 14 Nov 2021 Shihao Shao, Yong liu, Qinghua Cui

Here we present a layer-stress deep learning framework (x-NN) which automatically decides whether to rely on shallow or deep feature maps in a deep network, by first designing a sufficient number of layers and then trading them off with a Multi-Head Attention Block.

Learning Rates for Nonconvex Pairwise Learning

no code implementations 9 Nov 2021 Shaojie Li, Yong liu

We first establish learning rates for these algorithms in a general nonconvex setting, where the analysis sheds light on the trade-off between optimization and generalization and the role of early stopping.

Metric Learning

Explicitly Modeling the Discriminability for Instance-Aware Visual Object Tracking

no code implementations 28 Oct 2021 Mengmeng Wang, Xiaoqian Yang, Yong liu

Visual object tracking performance has improved dramatically in recent years, but some severe challenges, such as distractors and occlusions, remain open.

Contrastive Learning Visual Object Tracking +1
