Search Results for author: Xiangyang Ji

Found 128 papers, 55 papers with code

Hokoff: Real Game Dataset from Honor of Kings and its Offline Reinforcement Learning Benchmarks

1 code implementation NeurIPS 2023 Yun Qu, Boyuan Wang, Jianzhun Shao, Yuhang Jiang, Chen Chen, Zhenbin Ye, Lin Liu, Junfeng Yang, Lin Lai, Hongyang Qin, Minwen Deng, Juchao Zhuo, Deheng Ye, Qiang Fu, Wei Yang, Guang Yang, Lanxiao Huang, Xiangyang Ji

The advancement of Offline Reinforcement Learning (RL) and Offline Multi-Agent Reinforcement Learning (MARL) critically depends on the availability of high-quality, pre-collected offline datasets that represent real-world complexities and practical applications.

Multi-agent Reinforcement Learning Multi-Task Learning +3

CAS-ViT: Convolutional Additive Self-attention Vision Transformers for Efficient Mobile Applications

1 code implementation 7 Aug 2024 Tianfang Zhang, Lei LI, Yang Zhou, Wentao Liu, Chen Qian, Xiangyang Ji

In this paper, we introduce CAS-ViT: Convolutional Additive Self-attention Vision Transformers, to achieve a balance between efficiency and performance in mobile applications.

Image Classification Instance Segmentation +3

Robust Fast Adaptation from Adversarially Explicit Task Distribution Generation

no code implementations 28 Jul 2024 Cheems Wang, Yiqin Lv, Yixiu Mao, Yun Qu, Yi Xu, Xiangyang Ji

This work has practical implications, particularly in dealing with task distribution shifts in meta-learning, and contributes to theoretical insights in the field.

Meta-Learning

LLM-Empowered State Representation for Reinforcement Learning

1 code implementation 18 Jul 2024 Boyuan Wang, Yun Qu, Yuhang Jiang, Jianzhun Shao, Chang Liu, Wenming Yang, Xiangyang Ji

Conventional state representations in reinforcement learning often omit critical task-related details, presenting a significant challenge for value networks in establishing accurate mappings from states to task rewards.

reinforcement-learning

Local Action-Guided Motion Diffusion Model for Text-to-Motion Generation

no code implementations 15 Jul 2024 Peng Jin, Hao Li, Zesen Cheng, Kehan Li, Runyi Yu, Chang Liu, Xiangyang Ji, Li Yuan, Jie Chen

Specifically, we provide an automated method for reference local action sampling and leverage graph attention networks to assess the guiding weight of each local action in the overall motion synthesis.

Graph Attention Motion Synthesis

Efficient Personalized Text-to-image Generation by Leveraging Textual Subspace

1 code implementation 30 Jun 2024 Shian Du, Xiaotian Cheng, Qi Qian, Henglu Wei, Yi Xu, Xiangyang Ji

Personalized text-to-image generation has attracted unprecedented attention in recent years due to its unique capability of generating highly personalized images from an input concept dataset and a novel textual prompt.

Representation Learning Text-to-Image Generation

Spatial Annealing Smoothing for Efficient Few-shot Neural Rendering

1 code implementation 12 Jun 2024 Yuru Xiao, Xianming Liu, Deming Zhai, Kui Jiang, Junjun Jiang, Xiangyang Ji

In this paper, we introduce an accurate and efficient few-shot neural rendering method named Spatial Annealing smoothing regularized NeRF (SANeRF), which is specifically designed for a pre-filtering-driven hybrid representation architecture.

Neural Rendering

CompetEvo: Towards Morphological Evolution from Competition

1 code implementation 28 May 2024 Kangyao Huang, Di Guo, Xinyu Zhang, Xiangyang Ji, Huaping Liu

Training an agent to adapt to specific tasks through co-optimization of morphology and control has widely attracted attention.

MORPH

The Pitfalls and Promise of Conformal Inference Under Adversarial Attacks

1 code implementation 14 May 2024 Ziquan Liu, Yufei Cui, Yan Yan, Yi Xu, Xiangyang Ji, Xue Liu, Antoni B. Chan

In safety-critical applications such as medical imaging and autonomous driving, where decisions have profound implications for patient health and road safety, it is imperative to maintain both high adversarial robustness to protect against potential adversarial attacks and reliable uncertainty quantification in decision-making.

Adversarial Defense Adversarial Robustness +4

Mesh Denoising Transformer

no code implementations 10 May 2024 Wenbo Zhao, Xianming Liu, Deming Zhai, Junjun Jiang, Xiangyang Ji

Next, we propose a dual-stream structure consisting of a Geometric Encoder branch and a Spatial Encoder branch, which jointly encode local geometry details and spatial information to fully explore multimodal information for mesh denoising.

Denoising

RaSim: A Range-aware High-fidelity RGB-D Data Simulation Pipeline for Real-world Applications

no code implementations 5 Apr 2024 Xingyu Liu, Chenyangguang Zhang, Gu Wang, Ruida Zhang, Xiangyang Ji

In robotic vision, a de-facto paradigm is to learn in simulated environments and then transfer to real-world applications, which poses an essential challenge in bridging the sim-to-real domain gap.

Diversity

ParCo: Part-Coordinating Text-to-Motion Synthesis

1 code implementation 27 Mar 2024 Qiran Zou, Shangyuan Yuan, Shian Du, Yu Wang, Chang Liu, Yi Xu, Jie Chen, Xiangyang Ji

However, these methods encounter challenges such as the lack of coordination between different part motions and difficulties for networks to understand part concepts.

Motion Synthesis

Solution for Point Tracking Task of ICCV 1st Perception Test Challenge 2023

no code implementations 26 Mar 2024 Hongpeng Pan, Yang Yang, Zhongtian Fu, Yuxuan Zhang, Shian Du, Yi Xu, Xiangyang Ji

To address this issue, we propose a simple yet effective approach called TAP with confident static points (TAPIR+), which focuses on rectifying the tracking of the static point in the videos shot by a static camera.

Motion Detection Point Tracking +2

Stimulate the Potential of Robots via Competition

no code implementations 15 Mar 2024 Kangyao Huang, Di Guo, Xinyu Zhang, Xiangyang Ji, Huaping Liu

It is common for us to feel pressure in a competitive environment, which arises from the desire to succeed compared with other individuals or opponents.

KP-RED: Exploiting Semantic Keypoints for Joint 3D Shape Retrieval and Deformation

1 code implementation CVPR 2024 Ruida Zhang, Chenyangguang Zhang, Yan Di, Fabian Manhardt, Xingyu Liu, Federico Tombari, Xiangyang Ji

In this paper, we present KP-RED, a unified KeyPoint-driven REtrieval and Deformation framework that takes object scans as input and jointly retrieves and deforms the most geometrically similar CAD models from a pre-processed database to tightly match the target.

3D Shape Retrieval Retrieval

Pursuit Winning Strategies for Reach-Avoid Games with Polygonal Obstacles

no code implementations 10 Mar 2024 Rui Yan, Shuai Mi, Xiaoming Duan, Jintao Chen, Xiangyang Ji

The pursuers cooperate to protect a convex region from the evaders who try to reach the region.

MamMIL: Multiple Instance Learning for Whole Slide Images with State Space Models

no code implementations 8 Mar 2024 Zijie Fang, Yifeng Wang, Zhi Wang, Jian Zhang, Xiangyang Ji, Yongbing Zhang

To tackle this challenge, we propose a MamMIL framework for WSI classification that, for the first time, combines the selective structured state space model (i.e., Mamba) with MIL, enabling the modeling of instance dependencies while maintaining linear complexity.

Multiple Instance Learning whole slide images

VideoElevator: Elevating Video Generation Quality with Versatile Text-to-Image Diffusion Models

1 code implementation 8 Mar 2024 Yabo Zhang, Yuxiang Wei, Xianhui Lin, Zheng Hui, Peiran Ren, Xuansong Xie, Xiangyang Ji, WangMeng Zuo

Different from conventional T2V sampling (i.e., temporal and spatial modeling), VideoElevator explicitly decomposes each sampling step into temporal motion refining and spatial quality elevating.

Video Generation

Enhancing Consistency and Mitigating Bias: A Data Replay Approach for Incremental Learning

no code implementations 12 Jan 2024 Chenyang Wang, Junjun Jiang, Xingyu Hu, Xianming Liu, Xiangyang Ji

Using the measurement, we analyze existing techniques for inverting samples and obtain insights that inspire a novel loss function to reduce the inconsistency.

Class Incremental Learning Incremental Learning

ShapeMatcher: Self-Supervised Joint Shape Canonicalization Segmentation Retrieval and Deformation

1 code implementation CVPR 2024 Yan Di, Chenyangguang Zhang, Chaowei Wang, Ruida Zhang, Guangyao Zhai, Yanyan Li, Bowen Fu, Xiangyang Ji, Shan Gao

Finally we deform the retrieved shape in the deformation module to tightly fit the input object by harnessing part center guided neural cage deformation.

Object Retrieval +2

On the Dynamics Under the Unhinged Loss and Beyond

no code implementations 13 Dec 2023 Xiong Zhou, Xianming Liu, Hanzhang Wang, Deming Zhai, Junjun Jiang, Xiangyang Ji

In this paper, we introduce the unhinged loss, a concise loss function, that offers more mathematical opportunities to analyze the closed-form dynamics while requiring as few simplifications or assumptions as possible.

ShapeMatcher: Self-Supervised Joint Shape Canonicalization, Segmentation, Retrieval and Deformation

1 code implementation 18 Nov 2023 Yan Di, Chenyangguang Zhang, Chaowei Wang, Ruida Zhang, Guangyao Zhai, Yanyan Li, Bowen Fu, Xiangyang Ji, Shan Gao

In this paper, we present ShapeMatcher, a unified self-supervised learning framework for joint shape canonicalization, segmentation, retrieval and deformation.

Object Retrieval +2

Supported Trust Region Optimization for Offline Reinforcement Learning

no code implementations 15 Nov 2023 Yixiu Mao, Hongchang Zhang, Chen Chen, Yi Xu, Xiangyang Ji

Offline reinforcement learning suffers from the out-of-distribution issue and extrapolation error.

reinforcement-learning

Enhancing Few-shot CLIP with Semantic-Aware Fine-Tuning

no code implementations 8 Nov 2023 Yao Zhu, Yuefeng Chen, Wei Wang, Xiaofeng Mao, Xiu Yan, Yue Wang, Zhigang Li, Wang Lu, Jindong Wang, Xiangyang Ji

Hence, we propose fine-tuning the parameters of the attention pooling layer during the training process to encourage the model to focus on task-specific semantics.

ZooPFL: Exploring Black-box Foundation Models for Personalized Federated Learning

1 code implementation 8 Oct 2023 Wang Lu, Hao Yu, Jindong Wang, Damien Teney, Haohan Wang, Yiqiang Chen, Qiang Yang, Xing Xie, Xiangyang Ji

When personalized federated learning (FL) meets large foundation models, new challenges arise from various limitations in resources.

Personalized Federated Learning

Counterfactual Conservative Q Learning for Offline Multi-agent Reinforcement Learning

1 code implementation NeurIPS 2023 Jianzhun Shao, Yun Qu, Chen Chen, Hongchang Zhang, Xiangyang Ji

Offline multi-agent reinforcement learning is challenging due to the coupling of the distribution shift issue common in the offline setting and the high-dimensionality issue common in the multi-agent setting, which makes the out-of-distribution (OOD) action and value overestimation phenomena excessively severe.

counterfactual Multi-agent Reinforcement Learning +3

Towards Real-World Burst Image Super-Resolution: Benchmark and Method

1 code implementation ICCV 2023 Pengxu Wei, Yujing Sun, Xingbei Guo, Chang Liu, Jie Chen, Xiangyang Ji, Liang Lin

Despite substantial advances, single-image super-resolution (SISR) is always in a dilemma to reconstruct high-quality images with limited information from one input image, especially in realistic scenarios.

Burst Image Super-Resolution

CCD-3DR: Consistent Conditioning in Diffusion for Single-Image 3D Reconstruction

no code implementations 15 Aug 2023 Yan Di, Chenyangguang Zhang, Pengyuan Wang, Guangyao Zhai, Ruida Zhang, Fabian Manhardt, Benjamin Busam, Xiangyang Ji, Federico Tombari

However, such strategies fail to consistently align the denoised point cloud with the given image, leading to unstable conditioning and inferior performance.

3D Reconstruction

U-RED: Unsupervised 3D Shape Retrieval and Deformation for Partial Point Clouds

1 code implementation ICCV 2023 Yan Di, Chenyangguang Zhang, Ruida Zhang, Fabian Manhardt, Yongzhi Su, Jason Rambach, Didier Stricker, Xiangyang Ji, Federico Tombari

In this paper, we propose U-RED, an Unsupervised shape REtrieval and Deformation pipeline that takes an arbitrary object observation as input, typically captured by RGB images or scans, and jointly retrieves and deforms the geometrically similar CAD models from a pre-established database to tightly match the target.

3D Shape Retrieval Retrieval

Backdoor Attacks Against Incremental Learners: An Empirical Evaluation Study

no code implementations 28 May 2023 Yiqi Zhong, Xianming Liu, Deming Zhai, Junjun Jiang, Xiangyang Ji

A large number of incremental learning algorithms have been proposed to alleviate the catastrophic forgetting issue that arises when dealing with sequential data on a time series.

Adversarial Robustness Backdoor Attack +3

Video-Text as Game Players: Hierarchical Banzhaf Interaction for Cross-Modal Representation Learning

4 code implementations CVPR 2023 Peng Jin, Jinfa Huang, Pengfei Xiong, Shangxuan Tian, Chang Liu, Xiangyang Ji, Li Yuan, Jie Chen

Contrastive learning-based video-language representation learning approaches, e.g., CLIP, have achieved outstanding performance, which pursue semantic interaction upon pre-defined video-text pairs.

Contrastive Learning Question Answering +5

Multi-granularity Interaction Simulation for Unsupervised Interactive Segmentation

no code implementations ICCV 2023 Kehan Li, Yian Zhao, Zhennan Wang, Zesen Cheng, Peng Jin, Xiangyang Ji, Li Yuan, Chang Liu, Jie Chen

Interactive segmentation enables users to segment as needed by providing cues of objects, which introduces human-computer interaction for many fields, such as image editing and medical image analysis.

Interactive Segmentation

TWINS: A Fine-Tuning Framework for Improved Transferability of Adversarial Robustness and Generalization

1 code implementation CVPR 2023 Ziquan Liu, Yi Xu, Xiangyang Ji, Antoni B. Chan

To better exploit the potential of pre-trained models in adversarial robustness, this paper focuses on the fine-tuning of an adversarially pre-trained model in various classification tasks.

Adversarial Robustness Image Classification

DiffusionRet: Generative Text-Video Retrieval with Diffusion Model

4 code implementations ICCV 2023 Peng Jin, Hao Li, Zesen Cheng, Kehan Li, Xiangyang Ji, Chang Liu, Li Yuan, Jie Chen

Existing text-video retrieval solutions are, in essence, discriminant models focused on maximizing the conditional likelihood, i.e., p(candidates|query).

Retrieval Video Retrieval

Parallel Vertex Diffusion for Unified Visual Grounding

no code implementations 13 Mar 2023 Zesen Cheng, Kehan Li, Peng Jin, Xiangyang Ji, Li Yuan, Chang Liu, Jie Chen

An intuitive materialization of our paradigm is Parallel Vertex Diffusion (PVD) to directly set vertex coordinates as the generation target and use a diffusion model to train and infer.

Visual Grounding

Guided Depth Map Super-resolution: A Survey

1 code implementation 19 Feb 2023 Zhiwei Zhong, Xianming Liu, Junjun Jiang, Debin Zhao, Xiangyang Ji

Guided depth map super-resolution (GDSR), which aims to reconstruct a high-resolution (HR) depth map from a low-resolution (LR) observation with the help of a paired HR color image, is a longstanding and fundamental problem that has attracted considerable attention from the computer vision and image processing communities.

Depth Image Upsampling Depth Map Super-Resolution +1

UATVR: Uncertainty-Adaptive Text-Video Retrieval

1 code implementation ICCV 2023 Bo Fang, Wenhao Wu, Chang Liu, Yu Zhou, Yuxin Song, Weiping Wang, Xiangbo Shu, Xiangyang Ji, Jingdong Wang

In the refined embedding space, we represent text-video pairs as probabilistic distributions where prototypes are sampled for matching evaluation.

Retrieval Semantic correspondence +1

TopoSeg: Topology-Aware Nuclear Instance Segmentation

no code implementations ICCV 2023 Hongliang He, Jun Wang, Pengxu Wei, Fan Xu, Xiangyang Ji, Chang Liu, Jie Chen

Experiments on three nuclear instance segmentation datasets justify the superiority of TopoSeg, which achieves state-of-the-art performance.

Instance Segmentation Segmentation +1

Disjoint Masking with Joint Distillation for Efficient Masked Image Modeling

1 code implementation 31 Dec 2022 Xin Ma, Chang Liu, Chunyu Xie, Long Ye, Yafeng Deng, Xiangyang Ji

Masked image modeling (MIM) has shown great promise for self-supervised learning (SSL), yet it has been criticized for learning inefficiency.

object-detection Object Detection +2

Human Health Indicator Prediction from Gait Video

no code implementations 25 Dec 2022 Ziqing Li, Xuexin Yu, Xiaocong Lian, Yifeng Wang, Xiangyang Ji

To address this issue, we analyse the similarity and relationship between pose estimation and health indicator prediction tasks, and then propose a paradigm enabling deep learning for small health indicator datasets by pre-training on the pose estimation task.

Pose Estimation

Proposal Distribution Calibration for Few-Shot Object Detection

1 code implementation 15 Dec 2022 Bohao Li, Chang Liu, Mengnan Shi, Xiaozhong Chen, Xiangyang Ji, Qixiang Ye

Adapting object detectors learned with sufficient supervision to novel classes under low data regimes is charming yet challenging.

Few-Shot Object Detection Object +1

ILSGAN: Independent Layer Synthesis for Unsupervised Foreground-Background Segmentation

1 code implementation 25 Nov 2022 Qiran Zou, Yu Yang, Wing Yin Cheung, Chang Liu, Xiangyang Ji

Unsupervised foreground-background segmentation aims at extracting salient objects from cluttered backgrounds, where Generative Adversarial Network (GAN) approaches, especially layered GANs, show great promise.

Generative Adversarial Network Image Generation +4

Learning to Annotate Part Segmentation with Gradient Matching

1 code implementation ICLR 2022 Yu Yang, Xiaotian Cheng, Hakan Bilen, Xiangyang Ji

The success of state-of-the-art deep neural networks heavily relies on the presence of large-scale labelled datasets, which are extremely expensive and time-consuming to annotate.

Segmentation

Distilling Representations from GAN Generator via Squeeze and Span

1 code implementation 6 Nov 2022 Yu Yang, Xiaotian Cheng, Chang Liu, Hakan Bilen, Xiangyang Ji

In recent years, generative adversarial networks (GANs) have been an actively studied topic and shown to successfully produce high-quality realistic images in various domains.

Representation Learning

Local Manifold Augmentation for Multiview Semantic Consistency

no code implementations 5 Nov 2022 Yu Yang, Wing Yin Cheung, Chang Liu, Xiangyang Ji

Multiview self-supervised representation learning is rooted in exploring semantic consistency across data with complex intra-class variation.

Representation Learning Self-Supervised Learning

Beyond Instance Discrimination: Relation-aware Contrastive Self-supervised Learning

no code implementations 2 Nov 2022 Yifei Zhang, Chang Liu, Yu Zhou, Weiping Wang, Qixiang Ye, Xiangyang Ji

In this paper, we present relation-aware contrastive self-supervised learning (ReCo) to integrate instance relations, i.e., global distribution relation and local interpolation relation, into the CSL framework in a plug-and-play fashion.

Relation Self-Supervised Learning

Near-Optimal Regret Bounds for Multi-batch Reinforcement Learning

no code implementations 15 Oct 2022 Zihan Zhang, Yuhang Jiang, Yuan Zhou, Xiangyang Ji

Meanwhile, we show that to achieve $\tilde{O}(\mathrm{poly}(S, A, H)\sqrt{K})$ regret, the number of batches is at least $\Omega\left(H/\log_A(K)+ \log_2\log_2(K) \right)$, which matches our upper bound up to logarithmic terms.

reinforcement-learning Reinforcement Learning (RL)

Multi-Camera Collaborative Depth Prediction via Consistent Structure Estimation

no code implementations 5 Oct 2022 Jialei Xu, Xianming Liu, Yuanchao Bai, Junjun Jiang, Kaixuan Wang, Xiaozhi Chen, Xiangyang Ji

During the iterative update, the results of depth estimation are compared across cameras and the information of overlapping areas is propagated to the whole depth maps with the help of basis formulation.

Depth Prediction Monocular Depth Estimation

Deep Lossy Plus Residual Coding for Lossless and Near-lossless Image Compression

1 code implementation 11 Sep 2022 Yuanchao Bai, Xianming Liu, Kai Wang, Xiangyang Ji, Xiaolin Wu, Wen Gao

In the lossless mode, the DLPR coding system first performs lossy compression and then lossless coding of residuals.

Image Compression
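
The lossless mode described in this entry (lossy compression first, then lossless coding of the residual) can be illustrated with a toy sketch. This is not the DLPR coder: a uniform quantizer stands in for the learned lossy compressor, no entropy coding is performed, and the names `lossy_encode`, `residual`, and `reconstruct` are hypothetical. It only shows why storing the residual exactly makes the overall scheme lossless.

```python
def lossy_encode(pixels, step=8):
    """Toy lossy stage (an assumption): round each pixel down to a multiple of `step`."""
    return [(p // step) * step for p in pixels]


def residual(pixels, lossy):
    """Difference between the original and the lossy reconstruction."""
    return [p - q for p, q in zip(pixels, lossy)]


def reconstruct(lossy, res):
    """Adding the exactly stored residual back recovers the original."""
    return [q + r for q, r in zip(lossy, res)]


pixels = [0, 7, 8, 100, 255]
lossy = lossy_encode(pixels)       # coarse approximation of the input
res = residual(pixels, lossy)      # small values, cheap to code losslessly
restored = reconstruct(lossy, res)
```

In a real system the residual would be entropy-coded; here it is simply kept as a list to keep the sketch short.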

6D Robotic Assembly Based on RGB-only Object Pose Estimation

no code implementations 27 Aug 2022 Bowen Fu, Sek Kun Leong, Xiaocong Lian, Xiangyang Ji

Vision-based robotic assembly is a crucial yet challenging task as the interaction with multiple objects requires high levels of precision.

6D Pose Estimation 6D Pose Estimation using RGB

SSP-Pose: Symmetry-Aware Shape Prior Deformation for Direct Category-Level Object Pose Estimation

no code implementations 13 Aug 2022 Ruida Zhang, Yan Di, Fabian Manhardt, Federico Tombari, Xiangyang Ji

In this paper, to handle these shortcomings, we propose an end-to-end trainable network SSP-Pose for category-level pose estimation, which integrates shape priors into a direct pose regression network.

Pose Estimation regression

RBP-Pose: Residual Bounding Box Projection for Category-Level Pose Estimation

1 code implementation 30 Jul 2022 Ruida Zhang, Yan Di, Zhiqiang Lou, Fabian Manhardt, Federico Tombari, Xiangyang Ji

Category-level object pose estimation aims to predict the 6D pose as well as the 3D metric size of arbitrary objects from a known set of categories.

Object Pose Estimation

CATRE: Iterative Point Clouds Alignment for Category-level Object Pose Refinement

1 code implementation 17 Jul 2022 Xingyu Liu, Gu Wang, Yi Li, Xiangyang Ji

While category-level 9DoF object pose estimation has emerged recently, previous correspondence-based or direct regression methods are both limited in accuracy due to the huge intra-category variances in object shape and color, etc.

Object Pose Estimation

$L_2$BN: Enhancing Batch Normalization by Equalizing the $L_2$ Norms of Features

no code implementations 6 Jul 2022 Zhennan Wang, Kehan Li, Runyi Yu, Yian Zhao, Pengchong Qiao, Chang Liu, Fan Xu, Xiangyang Ji, Guoli Song, Jie Chen

In this paper, we analyze batch normalization from the perspective of discriminability and find the disadvantages ignored by previous studies: the difference in $l_2$ norms of sample features can hinder batch normalization from obtaining more distinguished inter-class features and more compact intra-class features.

Acoustic Scene Classification Image Classification +1

Learning Towards the Largest Margins

no code implementations ICLR 2022 Xiong Zhou, Xianming Liu, Deming Zhai, Junjun Jiang, Xin Gao, Xiangyang Ji

One of the main challenges for feature representation in deep learning-based classification is the design of appropriate loss functions that exhibit strong discriminative power.

Face Verification imbalanced classification +1

Prototype-Anchored Learning for Learning with Imperfect Annotations

no code implementations 23 Jun 2022 Xiong Zhou, Xianming Liu, Deming Zhai, Junjun Jiang, Xin Gao, Xiangyang Ji

We verify the effectiveness of PAL on class-imbalanced learning and noise-tolerant learning by extensive experiments on synthetic and real-world datasets.

An Empirical Study on Distribution Shift Robustness From the Perspective of Pre-Training and Data Augmentation

no code implementations 25 May 2022 Ziquan Liu, Yi Xu, Yuanhong Xu, Qi Qian, Hao Li, Rong Jin, Xiangyang Ji, Antoni B. Chan

With our empirical results obtained from 1,330 models, we provide the following main observations: 1) ERM combined with data augmentation can achieve state-of-the-art performance if we choose a proper pre-trained model respecting the data property; 2) specialized algorithms further improve the robustness on top of ERM when handling a specific type of distribution shift, e.g., GroupDRO for spurious correlation and CORAL for large-scale out-of-distribution data; 3) comparing different pre-training modes, architectures and data sizes, we provide novel observations about pre-training on distribution shift, which shed light on designing or selecting a pre-training strategy for different kinds of distribution shifts.

Data Augmentation

Integrally Migrating Pre-trained Transformer Encoder-decoders for Visual Object Detection

3 code implementations ICCV 2023 Feng Liu, Xiaosong Zhang, Zhiliang Peng, Zonghao Guo, Fang Wan, Xiangyang Ji, Qixiang Ye

Except for the backbone networks, however, other components such as the detector head and the feature pyramid network (FPN) remain trained from scratch, which hinders fully tapping the potential of representation models.

Decoder Few-Shot Object Detection +3

CCMB: A Large-scale Chinese Cross-modal Benchmark

1 code implementation 8 May 2022 Chunyu Xie, Heng Cai, Jincheng Li, Fanjing Kong, Xiaoyu Wu, Jianfei Song, Henrique Morimitsu, Lin Yao, Dexin Wang, Xiangzheng Zhang, Dawei Leng, Baochang Zhang, Xiangyang Ji, Yafeng Deng

In this work, we build a large-scale high-quality Chinese Cross-Modal Benchmark named CCMB for the research community, which contains the currently largest public pre-training dataset Zero and five human-annotated fine-tuning datasets for downstream tasks.

Image Classification Image Retrieval +7

PUERT: Probabilistic Under-sampling and Explicable Reconstruction Network for CS-MRI

1 code implementation 24 Apr 2022 Jingfen Xie, Jian Zhang, Yongbing Zhang, Xiangyang Ji

Compressed Sensing MRI (CS-MRI) aims at reconstructing de-aliased images from sub-Nyquist sampling k-space data to accelerate MR Imaging, thus presenting two basic issues, i.e., where to sample and how to reconstruct.

Binarization

Self-Supervised Arbitrary-Scale Point Clouds Upsampling via Implicit Neural Representation

1 code implementation CVPR 2022 Wenbo Zhao, Xianming Liu, Zhiwei Zhong, Junjun Jiang, Wei Gao, Ge Li, Xiangyang Ji

Most existing methods either follow an end-to-end supervised learning approach, in which large numbers of sparse-input and dense ground-truth pairs are exploited as supervision, or treat upsampling at different scale factors as independent tasks and have to build multiple networks to handle varying factors.

Self-Supervised Learning

Horizon-Free Reinforcement Learning in Polynomial Time: the Power of Stationary Policies

no code implementations 24 Mar 2022 Zihan Zhang, Xiangyang Ji, Simon S. Du

This paper gives the first polynomial-time algorithm for tabular Markov Decision Processes (MDP) that enjoys a regret bound independent of the planning horizon.

reinforcement-learning Reinforcement Learning (RL)

GPV-Pose: Category-level Object Pose Estimation via Geometry-guided Point-wise Voting

3 code implementations CVPR 2022 Yan Di, Ruida Zhang, Zhiqiang Lou, Fabian Manhardt, Xiangyang Ji, Nassir Navab, Federico Tombari

While 6D object pose estimation has recently made a huge leap forward, most methods can still only handle a single or a handful of different objects, which limits their applications.

Ranked #1 on 6D Pose Estimation on LineMOD (Mean ADD-S metric)

6D Pose Estimation 6D Pose Estimation using RGB +3

Shadows can be Dangerous: Stealthy and Effective Physical-world Adversarial Attack by Natural Phenomenon

1 code implementation CVPR 2022 Yiqi Zhong, Xianming Liu, Deming Zhai, Junjun Jiang, Xiangyang Ji

A new type of non-invasive attack has emerged recently, which attempts to cast perturbations onto the target using optics-based tools such as laser beams and projectors.

Adversarial Attack Traffic Sign Recognition +1

Which Style Makes Me Attractive? Interpretable Control Discovery and Counterfactual Explanation on StyleGAN

1 code implementation 24 Jan 2022 Bo Li, Qiulin Wang, JiQuan Pei, Yu Yang, Xiangyang Ji

First, we propose a novel approach to disentangle latent subspace semantics by exploiting existing face analysis models, e.g., face parsers and face landmark detectors.

counterfactual Counterfactual Explanation +3

Towards End-to-End Image Compression and Analysis with Transformers

1 code implementation 17 Dec 2021 Yuanchao Bai, Xu Yang, Xianming Liu, Junjun Jiang, YaoWei Wang, Xiangyang Ji, Wen Gao

Meanwhile, we propose a feature aggregation module to fuse the compressed features with the selected intermediate features of the Transformer, and feed the aggregated features to a deconvolutional neural network for image reconstruction.

Classification Image Classification +3

Deep Attentional Guided Image Filtering

1 code implementation 13 Dec 2021 Zhiwei Zhong, Xianming Liu, Junjun Jiang, Debin Zhao, Xiangyang Ji

Specifically, we propose an attentional kernel learning module to generate dual sets of filter kernels from the guidance and the target, respectively, and then adaptively combine them by modeling the pixel-wise dependency between the two images.

Collaborative Filtering Depth Image Upsampling +1

Improved Fine-Tuning by Better Leveraging Pre-Training Data

no code implementations 24 Nov 2021 Ziquan Liu, Yi Xu, Yuanhong Xu, Qi Qian, Hao Li, Xiangyang Ji, Antoni Chan, Rong Jin

The generalization result of using pre-training data shows that the excess risk bound on a target task can be improved when the appropriate pre-training data is included in fine-tuning.

Image Classification Learning Theory

Wasserstein Unsupervised Reinforcement Learning

no code implementations 15 Oct 2021 Shuncheng He, Yuhang Jiang, Hongchang Zhang, Jianzhun Shao, Xiangyang Ji

These pre-trained policies can accelerate learning when endowed with external reward, and can also be used as primitive options in hierarchical reinforcement learning.

Hierarchical Reinforcement Learning reinforcement-learning +2

Weakly-Supervised Monocular Depth Estimation with Resolution-Mismatched Data

no code implementations 23 Sep 2021 Jialei Xu, Yuanchao Bai, Xianming Liu, Junjun Jiang, Xiangyang Ji

In this paper, we propose a novel weakly-supervised framework to train a monocular depth estimation network to generate HR depth maps with resolution-mismatched supervision, i.e., the inputs are HR color images and the ground truths are low-resolution (LR) depth maps.

Monocular Depth Estimation

SO-Pose: Exploiting Self-Occlusion for Direct 6D Pose Estimation

2 code implementations ICCV 2021 Yan Di, Fabian Manhardt, Gu Wang, Xiangyang Ji, Nassir Navab, Federico Tombari

Directly regressing all 6 degrees-of-freedom (6DoF) for the object pose (e.g., the 3D rotation and translation) in a cluttered environment from a single RGB image is a challenging problem.

6D Pose Estimation 6D Pose Estimation using RGB +1

Learning with Noisy Labels via Sparse Regularization

1 code implementation ICCV 2021 Xiong Zhou, Xianming Liu, Chenyang Wang, Deming Zhai, Junjun Jiang, Xiangyang Ji

In this paper, we theoretically prove that any loss can be made robust to noisy labels by restricting the network output to the set of permutations over a fixed vector.

Learning with noisy labels

Physics-Based Iterative Projection Complex Neural Network for Phase Retrieval in Lensless Microscopy Imaging

no code implementations CVPR 2021 Feilong Zhang, Xianming Liu, Cheng Guo, Shiyi Lin, Junjun Jiang, Xiangyang Ji

Specifically, we unfold the iterative process of the alternative projection phase retrieval into a feed-forward neural network, whose layers mimic the processing flow.

Retrieval

Beyond Bounding-Box: Convex-Hull Feature Adaptation for Oriented and Densely Packed Object Detection

2 code implementations CVPR 2021 Zonghao Guo, Chang Liu, Xiaosong Zhang, Jianbin Jiao, Xiangyang Ji, Qixiang Ye

Detecting oriented and densely packed objects remains challenging because of spatial feature aliasing caused by the intersection of receptive fields between objects.

Ranked #34 on Object Detection In Aerial Images on DOTA (using extra training data)

object-detection Object Detection In Aerial Images

Learning Scalable $\ell_\infty$-Constrained Near-Lossless Image Compression via Joint Lossy Image and Residual Compression

no code implementations CVPR 2021 Yuanchao Bai, Xianming Liu, WangMeng Zuo, YaoWei Wang, Xiangyang Ji

To achieve scalable compression with the error bound larger than zero, we derive the probability model of the quantized residual by quantizing the learned probability model of the original residual, instead of training multiple networks.

Image Compression

Anti-aliasing Semantic Reconstruction for Few-Shot Semantic Segmentation

1 code implementation CVPR 2021 Binghao Liu, Yao Ding, Jianbin Jiao, Xiangyang Ji, Qixiang Ye

Encouraging progress in few-shot semantic segmentation has been made by leveraging features learned upon base classes with sufficient training data to represent novel classes with few-shot examples.

Few-Shot Semantic Segmentation Segmentation +1

Multiple instance active learning for object detection

1 code implementation CVPR 2021 Tianning Yuan, Fang Wan, Mengying Fu, Jianzhuang Liu, Songcen Xu, Xiangyang Ji, Qixiang Ye

Despite the substantial progress of active learning for image recognition, an instance-level active learning method designed specifically for object detection is still lacking.

Active Object Detection Multiple Instance Learning +3

High-resolution Depth Maps Imaging via Attention-based Hierarchical Multi-modal Fusion

1 code implementation 4 Apr 2021 Zhiwei Zhong, Xianming Liu, Junjun Jiang, Debin Zhao, Zhiwen Chen, Xiangyang Ji

Specifically, to effectively extract and combine relevant information from LR depth and HR guidance, we propose a multi-modal attention based fusion (MMAF) strategy for hierarchical convolutional layers, including a feature enhance block to select valuable features and a feature recalibration block to unify the similarity metrics of modalities with different appearance characteristics.

Depth Map Super-Resolution

Learning Foreground-Background Segmentation from Improved Layered GANs

no code implementations 1 Apr 2021 Yu Yang, Hakan Bilen, Qiran Zou, Wing Yin Cheung, Xiangyang Ji

Deep learning approaches rely heavily on high-quality human supervision, which is nonetheless expensive, time-consuming, and error-prone, especially for image segmentation tasks.

Generative Adversarial Network Image Segmentation +3

Learning Scalable $\ell_\infty$-constrained Near-lossless Image Compression via Joint Lossy Image and Residual Compression

no code implementations 31 Mar 2021 Yuanchao Bai, Xianming Liu, WangMeng Zuo, YaoWei Wang, Xiangyang Ji

To achieve scalable compression with the error bound larger than zero, we derive the probability model of the quantized residual by quantizing the learned probability model of the original residual, instead of training multiple networks.

Image Compression

Reducing Conservativeness Oriented Offline Reinforcement Learning

no code implementations 27 Feb 2021 Hongchang Zhang, Jianzhun Shao, Yuhang Jiang, Shuncheng He, Xiangyang Ji

In offline reinforcement learning, a policy learns to maximize cumulative rewards with a fixed collection of data.

D4RL reinforcement-learning +1

Credit Assignment with Meta-Policy Gradient for Multi-Agent Reinforcement Learning

no code implementations 24 Feb 2021 Jianzhun Shao, Hongchang Zhang, Yuhang Jiang, Shuncheng He, Xiangyang Ji

Reward decomposition is a critical problem in the centralized training with decentralized execution (CTDE) paradigm for multi-agent reinforcement learning.

Meta-Learning Multi-agent Reinforcement Learning +4

GDR-Net: Geometry-Guided Direct Regression Network for Monocular 6D Object Pose Estimation

1 code implementation CVPR 2021 Gu Wang, Fabian Manhardt, Federico Tombari, Xiangyang Ji

In this work, we perform an in-depth investigation on both direct and indirect methods, and propose a simple yet effective Geometry-guided Direct Regression Network (GDR-Net) to learn the 6D pose in an end-to-end manner from dense correspondence-based intermediate geometric representations.

6D Pose Estimation 6D Pose Estimation using RGB +1

Improved Variance-Aware Confidence Sets for Linear Bandits and Linear Mixture MDP

no code implementations NeurIPS 2021 Zihan Zhang, Jiaqi Yang, Xiangyang Ji, Simon S. Du

With the new confidence sets, we obtain the following regret bounds: For linear bandits, we obtain an $\tilde{O}(\mathrm{poly}(d)\sqrt{1 + \sum_{k=1}^{K}\sigma_k^2})$ data-dependent regret bound, where $d$ is the feature dimension, $K$ is the number of rounds, and $\sigma_k^2$ is the unknown variance of the reward at the $k$-th round.

LEMMA

Almost Optimal Model-Free Reinforcement Learning via Reference-Advantage Decomposition

no code implementations NeurIPS 2020 Zihan Zhang, Yuan Zhou, Xiangyang Ji

We study the reinforcement learning problem in the setting of finite-horizon episodic Markov Decision Processes (MDPs) with $S$ states, $A$ actions, and episode length $H$. We propose a model-free algorithm UCB-ADVANTAGE and prove that it achieves $\tilde{O}(\sqrt{H^2 SAT})$ regret where $T = KH$ and $K$ is the number of episodes to play.

reinforcement-learning Reinforcement Learning (RL)

Nearly Minimax Optimal Reward-free Reinforcement Learning

no code implementations 12 Oct 2020 Zihan Zhang, Simon S. Du, Xiangyang Ji

In the planning phase, the agent needs to return a near-optimal policy for arbitrary reward functions.

reinforcement-learning Reinforcement Learning (RL)

Is Reinforcement Learning More Difficult Than Bandits? A Near-optimal Algorithm Escaping the Curse of Horizon

no code implementations 28 Sep 2020 Zihan Zhang, Xiangyang Ji, Simon S. Du

Episodic reinforcement learning generalizes contextual bandits and is often perceived to be more difficult due to long planning horizon and unknown state-dependent transitions.

Decision Making Multi-Armed Bandits +2

Robust RGB-based 6-DoF Pose Estimation without Real Pose Annotations

no code implementations 19 Aug 2020 Zhigang Li, Yinlin Hu, Mathieu Salzmann, Xiangyang Ji

We achieve state-of-the-art performance on LINEMOD and Occluded-LINEMOD in the setting without real pose annotations, even outperforming methods that rely on real annotations during training on Occluded-LINEMOD.

Pose Estimation

Depth image denoising using nuclear norm and learning graph model

no code implementations 9 Aug 2020 Chenggang Yan, Zhisheng Li, Yongbing Zhang, Yutao Liu, Xiangyang Ji, Yongdong Zhang

Depth image denoising is increasingly becoming a hot research topic because depth images reflect the three-dimensional (3D) scene and can be applied in various fields of computer vision.

Image Denoising Image Restoration

Domain Contrast for Domain Adaptive Object Detection

no code implementations 26 Jun 2020 Feng Liu, Xiaosong Zhang, Fang Wan, Xiangyang Ji, Qixiang Ye

We present Domain Contrast (DC), a simple yet effective approach inspired by contrastive learning for training domain adaptive detectors.

Contrastive Learning Object +2

Model-Free Reinforcement Learning: from Clipped Pseudo-Regret to Sample Complexity

no code implementations 6 Jun 2020 Zihan Zhang, Yuan Zhou, Xiangyang Ji

In this paper we consider the problem of learning an $\epsilon$-optimal policy for a discounted Markov Decision Process (MDP).

reinforcement-learning Reinforcement Learning (RL)

Almost Optimal Model-Free Reinforcement Learning via Reference-Advantage Decomposition

no code implementations 21 Apr 2020 Zihan Zhang, Yuan Zhou, Xiangyang Ji

We study the reinforcement learning problem in the setting of finite-horizon episodic Markov Decision Processes (MDPs) with $S$ states, $A$ actions, and episode length $H$.

reinforcement-learning Reinforcement Learning (RL)

PgNN: Physics-guided Neural Network for Fourier Ptychographic Microscopy

no code implementations 19 Sep 2019 Yongbing Zhang, Yangzhe Liu, Xiu Li, Shaowei Jiang, Krishna Dixit, Xinfeng Zhang, Xiangyang Ji

Since the optimal parameters of the PgNN can be derived by minimizing the difference between the model-generated images and real captured angle-varied images corresponding to the same scene, the proposed PgNN can get rid of the problem of massive training data as in traditional supervised methods.

Regret Minimization for Reinforcement Learning by Evaluating the Optimal Bias Function

no code implementations NeurIPS 2019 Zihan Zhang, Xiangyang Ji

We present an algorithm based on the Optimism in the Face of Uncertainty (OFU) principle which is able to learn Reinforcement Learning (RL) modeled by Markov decision process (MDP) with finite state-action space efficiently.

reinforcement-learning Reinforcement Learning (RL)

C-MIL: Continuation Multiple Instance Learning for Weakly Supervised Object Detection

1 code implementation CVPR 2019 Fang Wan, Chang Liu, Wei Ke, Xiangyang Ji, Jianbin Jiao, Qixiang Ye

Weakly supervised object detection (WSOD) is a challenging task when provided with image category supervision but required to simultaneously learn object locations and object detectors.

Multiple Instance Learning Object +3

Bi-stream Pose Guided Region Ensemble Network for Fingertip Localization from Stereo Images

no code implementations 26 Feb 2019 Guijin Wang, Cairong Zhang, Xinghao Chen, Xiangyang Ji, Jing-Hao Xue, Hang Wang

To mitigate these limitations and promote further research on hand pose estimation from stereo images, we propose a new large-scale binocular hand pose dataset called THU-Bi-Hand, offering a new perspective for fingertip localization.

3D Hand Pose Estimation Missing Values

DeepIM: Deep Iterative Matching for 6D Pose Estimation

2 code implementations ECCV 2018 Yi Li, Gu Wang, Xiangyang Ji, Yu Xiang, Dieter Fox

Estimating the 6D pose of objects from images is an important problem in various applications such as robot manipulation and virtual reality.

6D Pose Estimation 6D Pose Estimation using RGB +1

Dynamic Filtering with Large Sampling Field for ConvNets

no code implementations ECCV 2018 Jialin Wu, Dai Li, Yu Yang, Chandrajit Bajaj, Xiangyang Ji

We propose a dynamic filtering strategy with large sampling field for ConvNets (LS-DFN), where the position-specific kernels learn from not only the identical position but also multiple sampled neighbor regions.

object-detection Object Detection +3

A Graphical Social Topology Model for Multi-Object Tracking

no code implementations 14 Feb 2017 Shan Gao, Xiaogang Chen, Qixiang Ye, Junliang Xing, Arjan Kuijper, Xiangyang Ji

Inspired by the social affinity property of moving objects, we propose a Graphical Social Topology (GST) model, which estimates the group dynamics by jointly modeling the group structure and the states of objects using a topological representation.

Multi-Object Tracking Object

Action Recognition with Joint Attention on Multi-Level Deep Features

no code implementations 9 Jul 2016 Jialin Wu, Gu Wang, Wukui Yang, Xiangyang Ji

We propose a novel deep supervised neural network for the task of action recognition in videos, which implicitly takes advantage of visual tracking and shares the robustness of both deep Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN).

Action Recognition In Videos Temporal Action Localization +1

Fast and High Quality Highlight Removal from A Single Image

no code implementations 1 Dec 2015 Dongsheng An, Jinli Suo, Xiangyang Ji, Haoqian Wang, Qionghai Dai

Specifically, this paper derives a normalized dichromatic model for the pixels with identical diffuse color: a unit circle equation of projection coefficients in two subspaces that are orthogonal to and parallel with the illumination, respectively.

Clustering Diversity +2

Sliding-Window Optimization on an Ambiguity-Clearness Graph for Multi-object Tracking

no code implementations 28 Nov 2015 Qi Guo, Le Dan, Dong Yin, Xiangyang Ji

Multi-object tracking remains challenging due to frequent occurrence of occlusions and outliers.

Multi-Object Tracking

Efficient Divide-And-Conquer Classification Based on Feature-Space Decomposition

no code implementations 29 Jan 2015 Qi Guo, Bo-Wei Chen, Feng Jiang, Xiangyang Ji, Sun-Yuan Kung

Firstly, we divide the feature space into several subspaces using the decomposition method proposed in this paper.

Classification General Classification
