1 code implementation • NeurIPS 2023 • Yun Qu, Boyuan Wang, Jianzhun Shao, Yuhang Jiang, Chen Chen, Zhenbin Ye, Lin Liu, Junfeng Yang, Lin Lai, Hongyang Qin, Minwen Deng, Juchao Zhuo, Deheng Ye, Qiang Fu, Wei Yang, Guang Yang, Lanxiao Huang, Xiangyang Ji
The advancement of Offline Reinforcement Learning (RL) and Offline Multi-Agent Reinforcement Learning (MARL) critically depends on the availability of high-quality, pre-collected offline datasets that represent real-world complexities and practical applications.
1 code implementation • 7 Aug 2024 • Tianfang Zhang, Lei LI, Yang Zhou, Wentao Liu, Chen Qian, Xiangyang Ji
In this paper, we introduce CAS-ViT: Convolutional Additive Self-attention Vision Transformers, to achieve a balance between efficiency and performance in mobile applications.
Ranked #510 on Image Classification on ImageNet
no code implementations • 28 Jul 2024 • Cheems Wang, Yiqin Lv, Yixiu Mao, Yun Qu, Yi Xu, Xiangyang Ji
This work has practical implications, particularly in dealing with task distribution shifts in meta-learning, and contributes to theoretical insights in the field.
1 code implementation • 18 Jul 2024 • Boyuan Wang, Yun Qu, Yuhang Jiang, Jianzhun Shao, Chang Liu, Wenming Yang, Xiangyang Ji
Conventional state representations in reinforcement learning often omit critical task-related details, presenting a significant challenge for value networks in establishing accurate mappings from states to task rewards.
no code implementations • 15 Jul 2024 • Peng Jin, Hao Li, Zesen Cheng, Kehan Li, Runyi Yu, Chang Liu, Xiangyang Ji, Li Yuan, Jie Chen
Specifically, we provide an automated method for reference local action sampling and leverage graph attention networks to assess the guiding weight of each local action in the overall motion synthesis.
1 code implementation • 30 Jun 2024 • Shian Du, Xiaotian Cheng, Qi Qian, Henglu Wei, Yi Xu, Xiangyang Ji
Personalized text-to-image generation has attracted unprecedented attention in recent years due to its unique capability of generating highly personalized images from an input concept dataset and a novel textual prompt.
no code implementations • 15 Jun 2024 • Ying Fu, Yu Li, ShaoDi You, Boxin Shi, Linwei Chen, Yunhao Zou, Zichun Wang, Yichen Li, Yuze Han, Yingkai Zhang, Jianan Wang, Qinglin Liu, Wei Yu, Xiaoqian Lv, Jianing Li, Shengping Zhang, Xiangyang Ji, Yuanpei Chen, Yuhan Zhang, Weihang Peng, Liwen Zhang, Zhe Xu, Dingyong Gou, Cong Li, Senyan Xu, Yunkang Zhang, Siyuan Jiang, Xiaoqiang Lu, Licheng Jiao, Fang Liu, Xu Liu, Lingling Li, Wenping Ma, Shuyuan Yang, Haiyang Xie, Jian Zhao, Shihua Huang, Peng Cheng, Xi Shen, Zheng Wang, Shuai An, Caizhi Zhu, Xuelong Li, Tao Zhang, Liang Li, Yu Liu, Chenggang Yan, Gengchen Zhang, Linyan Jiang, Bingyi Song, Zhuoyu An, Haibo Lei, Qing Luo, Jie Song, YuAn Liu, Haoyuan Zhang, Lingfeng Wang, Wei Chen, Aling Luo, Cheng Li, Jun Cao, Shu Chen, Zifei Dou, Xinyu Liu, Jing Zhang, Kexin Zhang, Yuting Yang, Xuejian Gou, Qinliang Wang, Yang Liu, Shizhan Zhao, Yanzhao Zhang, Libo Yan, Yuwei Guo, Guoxin Li, Qiong Gao, Chenyue Che, Long Sun, Xiang Chen, Hao Li, Jinshan Pan, Chuanlong Xie, Hongming Chen, Mingrui Li, Tianchen Deng, Jingwei Huang, Yufeng Li, Fei Wan, Bingxin Xu, Jian Cheng, Hongzhe Liu, Cheng Xu, Yuxiang Zou, Weiguo Pan, Songyin Dai, Sen Jia, Junpei Zhang, Puhua Chen, Qihang Li
The intersection of physics-based vision and deep learning presents an exciting frontier for advancing computer vision technologies.
1 code implementation • 12 Jun 2024 • Yuru Xiao, Xianming Liu, Deming Zhai, Kui Jiang, Junjun Jiang, Xiangyang Ji
In this paper, we introduce an accurate and efficient few-shot neural rendering method named Spatial Annealing smoothing regularized NeRF (SANeRF), which is specifically designed for a pre-filtering-driven hybrid representation architecture.
1 code implementation • 28 May 2024 • Kangyao Huang, Di Guo, Xinyu Zhang, Xiangyang Ji, Huaping Liu
Training an agent to adapt to specific tasks through co-optimization of morphology and control has widely attracted attention.
1 code implementation • 14 May 2024 • Ziquan Liu, Yufei Cui, Yan Yan, Yi Xu, Xiangyang Ji, Xue Liu, Antoni B. Chan
In safety-critical applications such as medical imaging and autonomous driving, where decisions have profound implications for patient health and road safety, it is imperative to maintain both high adversarial robustness to protect against potential adversarial attacks and reliable uncertainty quantification in decision-making.
no code implementations • 10 May 2024 • Wenbo Zhao, Xianming Liu, Deming Zhai, Junjun Jiang, Xiangyang Ji
Next, we propose a dual-stream structure consisting of a Geometric Encoder branch and a Spatial Encoder branch, which jointly encode local geometry details and spatial information to fully explore multimodal information for mesh denoising.
1 code implementation • IEEE International Conference on Robotics and Automation (ICRA) 2024 • Henrique Morimitsu, Xiaobin Zhu, Roberto M. Cesar-Jr., Xiangyang Ji, Xu-Cheng Yin
Extracting motion information from videos with optical flow estimation is vital in multiple practical robot applications.
Ranked #6 on Optical Flow Estimation on KITTI 2015
no code implementations • 5 Apr 2024 • Xingyu Liu, Chenyangguang Zhang, Gu Wang, Ruida Zhang, Xiangyang Ji
In robotic vision, a de-facto paradigm is to learn in simulated environments and then transfer to real-world applications, which poses an essential challenge in bridging the sim-to-real domain gap.
no code implementations • 1 Apr 2024 • Yuru Xiao, Xianming Liu, Deming Zhai, Kui Jiang, Junjun Jiang, Xiangyang Ji
Neural Radiance Field (NeRF) technology has made significant strides in creating novel viewpoints.
1 code implementation • 27 Mar 2024 • Qiran Zou, Shangyuan Yuan, Shian Du, Yu Wang, Chang Liu, Yi Xu, Jie Chen, Xiangyang Ji
However, these methods encounter challenges such as the lack of coordination between different part motions and difficulties for networks to understand part concepts.
Ranked #9 on Motion Synthesis on HumanML3D
no code implementations • 26 Mar 2024 • Hongpeng Pan, Yang Yang, Zhongtian Fu, Yuxuan Zhang, Shian Du, Yi Xu, Xiangyang Ji
To address this issue, we propose a simple yet effective approach called TAP with confident static points (TAPIR+), which focuses on rectifying the tracking of the static point in the videos shot by a static camera.
no code implementations • CVPR 2024 • Yiming Xie, Henglu Wei, Zhenyi Liu, Xiaoyu Wang, Xiangyang Ji
To advance research in learning-based defogging algorithms, various synthetic fog datasets have been developed.
no code implementations • 15 Mar 2024 • Kangyao Huang, Di Guo, Xinyu Zhang, Xiangyang Ji, Huaping Liu
It is common for us to feel pressure in a competitive environment, which arises from the desire to succeed in comparison with other individuals or opponents.
1 code implementation • CVPR 2024 • Ruida Zhang, Chenyangguang Zhang, Yan Di, Fabian Manhardt, Xingyu Liu, Federico Tombari, Xiangyang Ji
In this paper, we present KP-RED, a unified KeyPoint-driven REtrieval and Deformation framework that takes object scans as input and jointly retrieves and deforms the most geometrically similar CAD models from a pre-processed database to tightly match the target.
1 code implementation • CVPR 2024 • Pengchong Qiao, Lei Shang, Chang Liu, Baigui Sun, Xiangyang Ji, Jie Chen
In this paper, motivated by object-oriented programming, we model the subject as a derived class whose base class is its semantic category.
no code implementations • 10 Mar 2024 • Rui Yan, Shuai Mi, Xiaoming Duan, Jintao Chen, Xiangyang Ji
The pursuers cooperate to protect a convex region from the evaders who try to reach the region.
no code implementations • 8 Mar 2024 • Zijie Fang, Yifeng Wang, Zhi Wang, Jian Zhang, Xiangyang Ji, Yongbing Zhang
To tackle this challenge, we propose a MamMIL framework for WSI classification by integrating the selective structured state space model (i.e., Mamba) with MIL for the first time, enabling the modeling of instance dependencies while maintaining linear complexity.
1 code implementation • 8 Mar 2024 • Yabo Zhang, Yuxiang Wei, Xianhui Lin, Zheng Hui, Peiran Ren, Xuansong Xie, Xiangyang Ji, WangMeng Zuo
Different from conventional T2V sampling (i.e., temporal and spatial modeling), VideoElevator explicitly decomposes each sampling step into temporal motion refining and spatial quality elevating.
no code implementations • 19 Feb 2024 • Jialei Xu, Xianming Liu, Junjun Jiang, Kui Jiang, Rui Li, Kai Cheng, Xiangyang Ji
Monocular depth estimation from RGB images plays a pivotal role in 3D vision.
no code implementations • 12 Jan 2024 • Chenyang Wang, Junjun Jiang, Xingyu Hu, Xianming Liu, Xiangyang Ji
Using the measurement, we analyze existing techniques for inverting samples and get some insightful information that inspires a novel loss function to reduce the inconsistency.
1 code implementation • CVPR 2024 • Yan Di, Chenyangguang Zhang, Chaowei Wang, Ruida Zhang, Guangyao Zhai, Yanyan Li, Bowen Fu, Xiangyang Ji, Shan Gao
Finally we deform the retrieved shape in the deformation module to tightly fit the input object by harnessing part center guided neural cage deformation.
no code implementations • 13 Dec 2023 • Xiong Zhou, Xianming Liu, Hanzhang Wang, Deming Zhai, Junjun Jiang, Xiangyang Ji
In this paper, we introduce the unhinged loss, a concise loss function, that offers more mathematical opportunities to analyze the closed-form dynamics while requiring as few simplifications or assumptions as possible.
no code implementations • 23 Nov 2023 • Bowen Fu, Gu Wang, Chenyangguang Zhang, Yan Di, Ziqin Huang, Zhiying Leng, Fabian Manhardt, Xiangyang Ji, Federico Tombari
Reconstructing hand-held objects from a single RGB image is a challenging task in computer vision.
1 code implementation • 18 Nov 2023 • Yan Di, Chenyangguang Zhang, Chaowei Wang, Ruida Zhang, Guangyao Zhai, Yanyan Li, Bowen Fu, Xiangyang Ji, Shan Gao
In this paper, we present ShapeMatcher, a unified self-supervised learning framework for joint shape canonicalization, segmentation, retrieval and deformation.
no code implementations • 15 Nov 2023 • Yixiu Mao, Hongchang Zhang, Chen Chen, Yi Xu, Xiangyang Ji
Offline reinforcement learning suffers from the out-of-distribution issue and extrapolation error.
no code implementations • 8 Nov 2023 • Yao Zhu, Yuefeng Chen, Wei Wang, Xiaofeng Mao, Xiu Yan, Yue Wang, Zhigang Li, Wang Lu, Jindong Wang, Xiangyang Ji
Hence, we propose fine-tuning the parameters of the attention pooling layer during the training process to encourage the model to focus on task-specific semantics.
no code implementations • CVPR 2024 • Chenyangguang Zhang, Guanlong Jiao, Yan Di, Gu Wang, Ziqin Huang, Ruida Zhang, Fabian Manhardt, Bowen Fu, Federico Tombari, Xiangyang Ji
Previous works concerning single-view hand-held object reconstruction typically rely on supervision from 3D ground-truth models, which are hard to collect in the real world.
1 code implementation • 8 Oct 2023 • Wang Lu, Hao Yu, Jindong Wang, Damien Teney, Haohan Wang, Yiqiang Chen, Qiang Yang, Xing Xie, Xiangyang Ji
When personalized federated learning (FL) meets large foundation models, new challenges arise from various limitations in resources.
1 code implementation • NeurIPS 2023 • Jianzhun Shao, Yun Qu, Chen Chen, Hongchang Zhang, Xiangyang Ji
Offline multi-agent reinforcement learning is challenging due to the coupling of the distribution shift issue common in the offline setting and the high-dimensionality issue common in the multi-agent setting, which makes the action out-of-distribution (OOD) and value overestimation phenomena excessively severe.
1 code implementation • ICCV 2023 • Pengxu Wei, Yujing Sun, Xingbei Guo, Chang Liu, Jie Chen, Xiangyang Ji, Liang Lin
Despite substantial advances, single-image super-resolution (SISR) is always in a dilemma to reconstruct high-quality images with limited information from one input image, especially in realistic scenarios.
no code implementations • 15 Aug 2023 • Yan Di, Chenyangguang Zhang, Pengyuan Wang, Guangyao Zhai, Ruida Zhang, Fabian Manhardt, Benjamin Busam, Xiangyang Ji, Federico Tombari
However, such strategies fail to consistently align the denoised point cloud with the given image, leading to unstable conditioning and inferior performance.
1 code implementation • ICCV 2023 • Yan Di, Chenyangguang Zhang, Ruida Zhang, Fabian Manhardt, Yongzhi Su, Jason Rambach, Didier Stricker, Xiangyang Ji, Federico Tombari
In this paper, we propose U-RED, an Unsupervised shape REtrieval and Deformation pipeline that takes an arbitrary object observation as input, typically captured by RGB images or scans, and jointly retrieves and deforms the geometrically similar CAD models from a pre-established database to tightly match the target.
no code implementations • 4 Aug 2023 • Wang Lu, Jindong Wang, Xinwei Sun, Yiqiang Chen, Xiangyang Ji, Qiang Yang, Xing Xie
We propose DIVERSIFY, a general framework, for OOD detection and generalization on dynamic distributions of time series.
no code implementations • 19 Jun 2023 • Zesen Cheng, Peng Jin, Hao Li, Kehan Li, Siheng Li, Xiangyang Ji, Chang Liu, Jie Chen
Bottom-up methods are mainly perturbed by Inferior Positive (IP) errors due to the lack of prior object information.
no code implementations • 28 May 2023 • Yiqi Zhong, Xianming Liu, Deming Zhai, Junjun Jiang, Xiangyang Ji
A large number of incremental learning algorithms have been proposed to alleviate the catastrophic forgetting issue that arises when dealing with sequential data on a time series.
4 code implementations • CVPR 2023 • Peng Jin, Jinfa Huang, Pengfei Xiong, Shangxuan Tian, Chang Liu, Xiangyang Ji, Li Yuan, Jie Chen
Contrastive learning-based video-language representation learning approaches, e.g., CLIP, have achieved outstanding performance, which pursue semantic interaction upon pre-defined video-text pairs.
Ranked #8 on Video Question Answering on MSRVTT-QA
no code implementations • ICCV 2023 • Kehan Li, Yian Zhao, Zhennan Wang, Zesen Cheng, Peng Jin, Xiangyang Ji, Li Yuan, Chang Liu, Jie Chen
Interactive segmentation enables users to segment as needed by providing cues of objects, which introduces human-computer interaction for many fields, such as image editing and medical image analysis.
1 code implementation • CVPR 2023 • Ziquan Liu, Yi Xu, Xiangyang Ji, Antoni B. Chan
To better exploit the potential of pre-trained models in adversarial robustness, this paper focuses on the fine-tuning of an adversarially pre-trained model in various classification tasks.
4 code implementations • ICCV 2023 • Peng Jin, Hao Li, Zesen Cheng, Kehan Li, Xiangyang Ji, Chang Liu, Li Yuan, Jie Chen
Existing text-video retrieval solutions are, in essence, discriminant models focused on maximizing the conditional likelihood, i.e., p(candidates|query).
Ranked #15 on Video Retrieval on MSVD
no code implementations • 13 Mar 2023 • Zesen Cheng, Kehan Li, Peng Jin, Xiangyang Ji, Li Yuan, Chang Liu, Jie Chen
An intuitive materialization of our paradigm is Parallel Vertex Diffusion (PVD) to directly set vertex coordinates as the generation target and use a diffusion model to train and infer.
1 code implementation • 19 Feb 2023 • Zhiwei Zhong, Xianming Liu, Junjun Jiang, Debin Zhao, Xiangyang Ji
Guided depth map super-resolution (GDSR), which aims to reconstruct a high-resolution (HR) depth map from a low-resolution (LR) observation with the help of a paired HR color image, is a longstanding and fundamental problem that has attracted considerable attention from the computer vision and image processing communities.
1 code implementation • ICCV 2023 • Bo Fang, Wenhao Wu, Chang Liu, Yu Zhou, Yuxin Song, Weiping Wang, Xiangbo Shu, Xiangyang Ji, Jingdong Wang
In the refined embedding space, we represent text-video pairs as probabilistic distributions where prototypes are sampled for matching evaluation.
no code implementations • ICCV 2023 • Hongliang He, Jun Wang, Pengxu Wei, Fan Xu, Xiangyang Ji, Chang Liu, Jie Chen
Experiments on three nuclear instance segmentation datasets justify the superiority of TopoSeg, which achieves state-of-the-art performance.
1 code implementation • ICCV 2023 • Runyi Yu, Zhennan Wang, Yinhuai Wang, Kehan Li, Chang Liu, Haoyi Duan, Xiangyang Ji, Jie Chen
A typical way to introduce position information is adding the absolute Position Embedding (PE) to patch embedding before entering VTs.
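The operation described in the entry above, adding an absolute position embedding to each patch embedding, can be sketched in a few lines. This is only an illustration in pure Python: the fixed sinusoidal table, the function names, and the toy dimensions are assumptions made for the example, not the method of the listed paper (learned embedding tables are an equally common choice).

```python
import math

def sinusoidal_position_embedding(num_patches, dim):
    """Build a fixed sinusoidal absolute PE table (assumed variant; dim should be even)."""
    table = []
    for pos in range(num_patches):
        row = []
        for i in range(0, dim, 2):
            # Each (sin, cos) pair shares one frequency that decreases with i.
            angle = pos / (10000 ** (i / dim))
            row.append(math.sin(angle))
            row.append(math.cos(angle))
        table.append(row[:dim])
    return table

def add_position_embedding(patch_embeddings, pe):
    """Element-wise sum of each patch embedding with the PE row for its position."""
    return [[x + p for x, p in zip(patch, pos_row)]
            for patch, pos_row in zip(patch_embeddings, pe)]

# Three identical toy patch embeddings: without PE they are indistinguishable.
patches = [[0.5, 0.5, 0.5, 0.5] for _ in range(3)]
pe = sinusoidal_position_embedding(num_patches=3, dim=4)
tokens = add_position_embedding(patches, pe)
```

After the sum, the three initially identical patches carry different vectors, which is what lets a permutation-invariant attention layer tell positions apart.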
1 code implementation • 31 Dec 2022 • Xin Ma, Chang Liu, Chunyu Xie, Long Ye, Yafeng Deng, Xiangyang Ji
Masked image modeling (MIM) has shown great promise for self-supervised learning (SSL) yet has been criticized for learning inefficiency.
no code implementations • 25 Dec 2022 • Ziqing Li, Xuexin Yu, Xiaocong Lian, Yifeng Wang, Xiangyang Ji
To address this issue, we analyse the similarity and relationship between pose estimation and health indicator prediction tasks, and then propose a paradigm enabling deep learning for small health indicator datasets by pre-training on the pose estimation task.
1 code implementation • 15 Dec 2022 • Bohao Li, Chang Liu, Mengnan Shi, Xiaozhong Chen, Xiangyang Ji, Qixiang Ye
Adapting object detectors learned with sufficient supervision to novel classes under low data regimes is charming yet challenging.
no code implementations • 13 Dec 2022 • Chenyangguang Zhang, Zhiqiang Lou, Yan Di, Federico Tombari, Xiangyang Ji
Real-time monocular 3D reconstruction is a challenging problem that remains unsolved.
1 code implementation • 25 Nov 2022 • Qiran Zou, Yu Yang, Wing Yin Cheung, Chang Liu, Xiangyang Ji
Unsupervised foreground-background segmentation aims at extracting salient objects from cluttered backgrounds, where Generative Adversarial Network (GAN) approaches, especially layered GANs, show great promise.
no code implementations • CVPR 2023 • Zesen Cheng, Pengchong Qiao, Kehan Li, Siheng Li, Pengxu Wei, Xiangyang Ji, Li Yuan, Chang Liu, Jie Chen
Weakly supervised semantic segmentation is typically inspired by class activation maps, which serve as pseudo masks with class-discriminative regions highlighted.
1 code implementation • ICLR 2022 • Yu Yang, Xiaotian Cheng, Hakan Bilen, Xiangyang Ji
The success of state-of-the-art deep neural networks heavily relies on the presence of large-scale labelled datasets, which are extremely expensive and time-consuming to annotate.
1 code implementation • 6 Nov 2022 • Yu Yang, Xiaotian Cheng, Chang Liu, Hakan Bilen, Xiangyang Ji
In recent years, generative adversarial networks (GANs) have been an actively studied topic and shown to successfully produce high-quality realistic images in various domains.
no code implementations • 5 Nov 2022 • Yu Yang, Wing Yin Cheung, Chang Liu, Xiangyang Ji
Multiview self-supervised representation learning is rooted in exploring semantic consistency across data of complex intra-class variation.
no code implementations • 2 Nov 2022 • Yifei Zhang, Chang Liu, Yu Zhou, Weiping Wang, Qixiang Ye, Xiangyang Ji
In this paper, we present relation-aware contrastive self-supervised learning (ReCo) to integrate instance relations, i.e., global distribution relation and local interpolation relation, into the CSL framework in a plug-and-play fashion.
no code implementations • CVPR 2023 • Pengchong Qiao, Zhidan Wei, Yu Wang, Zhennan Wang, Guoli Song, Fan Xu, Xiangyang Ji, Chang Liu, Jie Chen
Semi-supervised learning (SSL) essentially pursues class boundary exploration with less dependence on human annotations.
no code implementations • 15 Oct 2022 • Zihan Zhang, Yuhang Jiang, Yuan Zhou, Xiangyang Ji
Meanwhile, we show that to achieve $\tilde{O}(\mathrm{poly}(S, A, H)\sqrt{K})$ regret, the number of batches is at least $\Omega\left(H/\log_A(K)+ \log_2\log_2(K) \right)$, which matches our upper bound up to logarithmic terms.
no code implementations • 5 Oct 2022 • Jialei Xu, Xianming Liu, Yuanchao Bai, Junjun Jiang, Kaixuan Wang, Xiaozhi Chen, Xiangyang Ji
During the iterative update, the results of depth estimation are compared across cameras and the information of overlapping areas is propagated to the whole depth maps with the help of basis formulation.
1 code implementation • 11 Sep 2022 • Yuanchao Bai, Xianming Liu, Kai Wang, Xiangyang Ji, Xiaolin Wu, Wen Gao
In the lossless mode, the DLPR coding system first performs lossy compression and then lossless coding of residuals.
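The two-stage structure mentioned in the entry above (lossy compression first, then lossless coding of the residuals) can be illustrated with a toy round trip. This is a hedged sketch only: plain uniform quantization stands in for the learned lossy codec, the residual is kept as a raw list rather than entropy coded, and all function names and the step size are hypothetical.

```python
def lossy_encode(pixels, step):
    """Uniform quantization, a stand-in for the lossy codec (assumption)."""
    return [round(p / step) for p in pixels]

def lossy_decode(codes, step):
    """Reconstruct an approximation of the pixels from the quantized codes."""
    return [c * step for c in codes]

def encode_lossless(pixels, step=8):
    """Stage 1: lossy codes. Stage 2: residual between input and lossy reconstruction."""
    codes = lossy_encode(pixels, step)
    reconstruction = lossy_decode(codes, step)
    residual = [p - r for p, r in zip(pixels, reconstruction)]
    return codes, residual

def decode_lossless(codes, residual, step=8):
    """Adding the residual back to the lossy reconstruction recovers the input exactly."""
    reconstruction = lossy_decode(codes, step)
    return [r + e for r, e in zip(reconstruction, residual)]

pixels = [0, 13, 127, 200, 255]
codes, residual = encode_lossless(pixels)
restored = decode_lossless(codes, residual)
```

The residuals stay within half a quantization step of zero, so they occupy a much narrower range than the raw pixels. That narrow range is what makes a separate lossless coding pass over the residuals worthwhile in a scheme of this shape.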
no code implementations • 27 Aug 2022 • Bowen Fu, Sek Kun Leong, Xiaocong Lian, Xiangyang Ji
Vision-based robotic assembly is a crucial yet challenging task as the interaction with multiple objects requires high levels of precision.
no code implementations • 13 Aug 2022 • Ruida Zhang, Yan Di, Fabian Manhardt, Federico Tombari, Xiangyang Ji
In this paper, to handle these shortcomings, we propose an end-to-end trainable network SSP-Pose for category-level pose estimation, which integrates shape priors into a direct pose regression network.
1 code implementation • 30 Jul 2022 • Ruida Zhang, Yan Di, Zhiqiang Lou, Fabian Manhardt, Federico Tombari, Xiangyang Ji
Category-level object pose estimation aims to predict the 6D pose as well as the 3D metric size of arbitrary objects from a known set of categories.
1 code implementation • 17 Jul 2022 • Xingyu Liu, Gu Wang, Yi Li, Xiangyang Ji
While category-level 9DoF object pose estimation has emerged recently, previous correspondence-based or direct regression methods are both limited in accuracy due to the huge intra-category variances in object shape and color, etc.
no code implementations • 6 Jul 2022 • Zhennan Wang, Kehan Li, Runyi Yu, Yian Zhao, Pengchong Qiao, Chang Liu, Fan Xu, Xiangyang Ji, Guoli Song, Jie Chen
In this paper, we analyze batch normalization from the perspective of discriminability and find the disadvantages ignored by previous studies: the difference in $l_2$ norms of sample features can hinder batch normalization from obtaining more distinguished inter-class features and more compact intra-class features.
no code implementations • ICLR 2022 • Xiong Zhou, Xianming Liu, Deming Zhai, Junjun Jiang, Xin Gao, Xiangyang Ji
One of the main challenges for feature representation in deep learning-based classification is the design of appropriate loss functions that exhibit strong discriminative power.
no code implementations • 23 Jun 2022 • Xiong Zhou, Xianming Liu, Deming Zhai, Junjun Jiang, Xin Gao, Xiangyang Ji
We verify the effectiveness of PAL on class-imbalanced learning and noise-tolerant learning by extensive experiments on synthetic and real-world datasets.
no code implementations • 25 May 2022 • Ziquan Liu, Yi Xu, Yuanhong Xu, Qi Qian, Hao Li, Rong Jin, Xiangyang Ji, Antoni B. Chan
With our empirical results obtained from 1,330 models, we provide the following main observations: 1) ERM combined with data augmentation can achieve state-of-the-art performance if we choose a proper pre-trained model respecting the data property; 2) specialized algorithms further improve the robustness on top of ERM when handling a specific type of distribution shift, e.g., GroupDRO for spurious correlation and CORAL for large-scale out-of-distribution data; 3) comparing different pre-training modes, architectures and data sizes, we provide novel observations about pre-training on distribution shift, which shed light on designing or selecting pre-training strategies for different kinds of distribution shifts.
3 code implementations • ICCV 2023 • Feng Liu, Xiaosong Zhang, Zhiliang Peng, Zonghao Guo, Fang Wan, Xiangyang Ji, Qixiang Ye
Except for the backbone networks, however, other components such as the detector head and the feature pyramid network (FPN) remain trained from scratch, which hinders fully tapping the potential of representation models.
Ranked #5 on Few-Shot Object Detection on MS-COCO (30-shot)
1 code implementation • 8 May 2022 • Chunyu Xie, Heng Cai, Jincheng Li, Fanjing Kong, Xiaoyu Wu, Jianfei Song, Henrique Morimitsu, Lin Yao, Dexin Wang, Xiangzheng Zhang, Dawei Leng, Baochang Zhang, Xiangyang Ji, Yafeng Deng
In this work, we build a large-scale high-quality Chinese Cross-Modal Benchmark named CCMB for the research community, which contains the currently largest public pre-training dataset Zero and five human-annotated fine-tuning datasets for downstream tasks.
Ranked #3 on Image Retrieval on Flickr30k-CN
1 code implementation • 24 Apr 2022 • Jingfen Xie, Jian Zhang, Yongbing Zhang, Xiangyang Ji
Compressed Sensing MRI (CS-MRI) aims at reconstructing de-aliased images from sub-Nyquist sampling k-space data to accelerate MR Imaging, thus presenting two basic issues, i.e., where to sample and how to reconstruct.
1 code implementation • CVPR 2022 • Wenbo Zhao, Xianming Liu, Zhiwei Zhong, Junjun Jiang, Wei Gao, Ge Li, Xiangyang Ji
Most existing methods either take the end-to-end supervised learning based manner, where large amounts of pairs of sparse input and dense ground-truth are exploited as supervision information; or treat up-scaling of different scale factors as independent tasks, and have to build multiple networks to handle upsampling with varying factors.
no code implementations • 24 Mar 2022 • Zihan Zhang, Xiangyang Ji, Simon S. Du
This paper gives the first polynomial-time algorithm for tabular Markov Decision Processes (MDP) that enjoys a regret bound independent of the planning horizon.
1 code implementation • 19 Mar 2022 • Gu Wang, Fabian Manhardt, Xingyu Liu, Xiangyang Ji, Federico Tombari
6D object pose estimation is a fundamental yet challenging problem in computer vision.
3 code implementations • CVPR 2022 • Yan Di, Ruida Zhang, Zhiqiang Lou, Fabian Manhardt, Xiangyang Ji, Nassir Navab, Federico Tombari
While 6D object pose estimation has recently made a huge leap forward, most methods can still only handle a single or a handful of different objects, which limits their applications.
Ranked #1 on 6D Pose Estimation on LineMOD (Mean ADD-S metric)
1 code implementation • CVPR 2022 • Yiqi Zhong, Xianming Liu, Deming Zhai, Junjun Jiang, Xiangyang Ji
A new type of non-invasive attack has emerged recently, which attempts to cast perturbation onto the target by optics-based tools, such as laser beams and projectors.
1 code implementation • 24 Jan 2022 • Bo Li, Qiulin Wang, JiQuan Pei, Yu Yang, Xiangyang Ji
First, we propose a novel approach to disentangle latent subspace semantics by exploiting existing face analysis models, e.g., face parsers and face landmark detectors.
1 code implementation • 17 Dec 2021 • Yuanchao Bai, Xu Yang, Xianming Liu, Junjun Jiang, YaoWei Wang, Xiangyang Ji, Wen Gao
Meanwhile, we propose a feature aggregation module to fuse the compressed features with the selected intermediate features of the Transformer, and feed the aggregated features to a deconvolutional neural network for image reconstruction.
1 code implementation • 13 Dec 2021 • Zhiwei Zhong, Xianming Liu, Junjun Jiang, Debin Zhao, Xiangyang Ji
Specifically, we propose an attentional kernel learning module to generate dual sets of filter kernels from the guidance and the target, respectively, and then adaptively combine them by modeling the pixel-wise dependency between the two images.
no code implementations • 24 Nov 2021 • Ziquan Liu, Yi Xu, Yuanhong Xu, Qi Qian, Hao Li, Xiangyang Ji, Antoni Chan, Rong Jin
The generalization result of using pre-training data shows that the excess risk bound on a target task can be improved when the appropriate pre-training data is included in fine-tuning.
no code implementations • 15 Oct 2021 • Zihan Zhang, Xiangyang Ji, Yuan Zhou
We study the optimal batch-regret tradeoff for batch linear contextual bandits.
no code implementations • 15 Oct 2021 • Shuncheng He, Yuhang Jiang, Hongchang Zhang, Jianzhun Shao, Xiangyang Ji
These pre-trained policies can accelerate learning when endowed with external reward, and can also be used as primitive options in hierarchical reinforcement learning.
no code implementations • 23 Sep 2021 • Jialei Xu, Yuanchao Bai, Xianming Liu, Junjun Jiang, Xiangyang Ji
In this paper, we propose a novel weakly-supervised framework to train a monocular depth estimation network to generate HR depth maps with resolution-mismatched supervision, i.e., the inputs are HR color images and the ground truth consists of low-resolution (LR) depth maps.
2 code implementations • ICCV 2021 • Yan Di, Fabian Manhardt, Gu Wang, Xiangyang Ji, Nassir Navab, Federico Tombari
Directly regressing all 6 degrees-of-freedom (6DoF) for the object pose (e.g., the 3D rotation and translation) in a cluttered environment from a single RGB image is a challenging problem.
Ranked #1 on 6D Pose Estimation using RGB on Occlusion LineMOD
1 code implementation • ICCV 2021 • Xiong Zhou, Xianming Liu, Chenyang Wang, Deming Zhai, Junjun Jiang, Xiangyang Ji
In this paper, we theoretically prove that any loss can be made robust to noisy labels by restricting the network output to the set of permutations over a fixed vector.
no code implementations • CVPR 2021 • Feilong Zhang, Xianming Liu, Cheng Guo, Shiyi Lin, Junjun Jiang, Xiangyang Ji
Specifically, we unfold the iterative process of the alternative projection phase retrieval into a feed-forward neural network, whose layers mimic the processing flow.
2 code implementations • CVPR 2021 • Zonghao Guo, Chang Liu, Xiaosong Zhang, Jianbin Jiao, Xiangyang Ji, Qixiang Ye
Detecting oriented and densely packed objects remains challenging due to spatial feature aliasing caused by the intersection of receptive fields between objects.
Ranked #34 on Object Detection In Aerial Images on DOTA (using extra training data)
no code implementations • CVPR 2021 • Yuanchao Bai, Xianming Liu, WangMeng Zuo, YaoWei Wang, Xiangyang Ji
To achieve scalable compression with the error bound larger than zero, we derive the probability model of the quantized residual by quantizing the learned probability model of the original residual, instead of training multiple networks.
1 code implementation • 6 Jun 2021 • Xiong Zhou, Xianming Liu, Junjun Jiang, Xin Gao, Xiangyang Ji
Symmetric loss functions are confirmed to be robust to label noise.
3 code implementations • NeurIPS 2021 • Zhuchen Shao, Hao Bian, Yang Chen, Yifeng Wang, Jian Zhang, Xiangyang Ji, Yongbing Zhang
Multiple instance learning (MIL) is a powerful tool to solve the weakly supervised classification in whole slide image (WSI) based pathology diagnosis.
Ranked #6 on Multiple Instance Learning on TCGA
1 code implementation • CVPR 2021 • Binghao Liu, Yao Ding, Jianbin Jiao, Xiangyang Ji, Qixiang Ye
Encouraging progress in few-shot semantic segmentation has been made by leveraging features learned upon base classes with sufficient training data to represent novel classes with few-shot examples.
Ranked #69 on Few-Shot Semantic Segmentation on COCO-20i (1-shot)
1 code implementation • CVPR 2021 • Tianning Yuan, Fang Wan, Mengying Fu, Jianzhuang Liu, Songcen Xu, Xiangyang Ji, Qixiang Ye
Despite the substantial progress of active learning for image recognition, an instance-level active learning method designed specifically for object detection is still lacking.
Ranked #1 on Active Object Detection on MS COCO
1 code implementation • 4 Apr 2021 • Zhiwei Zhong, Xianming Liu, Junjun Jiang, Debin Zhao, Zhiwen Chen, Xiangyang Ji
Specifically, to effectively extract and combine relevant information from LR depth and HR guidance, we propose a multi-modal attention based fusion (MMAF) strategy for hierarchical convolutional layers, including a feature enhance block to select valuable features and a feature recalibration block to unify the similarity metrics of modalities with different appearance characteristics.
no code implementations • 1 Apr 2021 • Yu Yang, Hakan Bilen, Qiran Zou, Wing Yin Cheung, Xiangyang Ji
Deep learning approaches heavily rely on high-quality human supervision, which is nonetheless expensive, time-consuming, and error-prone, especially for image segmentation tasks.
no code implementations • 31 Mar 2021 • Yuanchao Bai, Xianming Liu, WangMeng Zuo, YaoWei Wang, Xiangyang Ji
To achieve scalable compression with the error bound larger than zero, we derive the probability model of the quantized residual by quantizing the learned probability model of the original residual, instead of training multiple networks.
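The idea of deriving the quantized residual's probability model from the original one can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes integer residuals, an explicit probability table, and the uniform quantizer with bin size 2τ + 1 that is common in near-lossless coding. The names `quantize_residual` and `quantized_pmf` and the example table are hypothetical.

```python
from collections import defaultdict

def quantize_residual(r, tau):
    """Map an integer residual to its bin index for error bound tau (assumed uniform quantizer)."""
    sign = 1 if r >= 0 else -1
    return sign * ((abs(r) + tau) // (2 * tau + 1))

def quantized_pmf(pmf, tau):
    """Derive the PMF of the quantized residual by summing the original PMF over each bin."""
    out = defaultdict(float)
    for r, p in pmf.items():
        out[quantize_residual(r, tau)] += p
    return dict(out)

# Hypothetical probability model of the original residual.
pmf = {-2: 0.05, -1: 0.2, 0: 0.5, 1: 0.2, 2: 0.05}

# With tau = 1, residuals -1, 0, and 1 share bin 0, so their probabilities add up.
q = quantized_pmf(pmf, tau=1)
print(q)
```

Because the quantized model is computed from the single learned model, changing `tau` only changes the binning step; no additional model is trained for each error bound.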
no code implementations • 27 Feb 2021 • Hongchang Zhang, Jianzhun Shao, Yuhang Jiang, Shuncheng He, Xiangyang Ji
In offline reinforcement learning, a policy learns to maximize cumulative rewards with a fixed collection of data.
no code implementations • 24 Feb 2021 • Jianzhun Shao, Hongchang Zhang, Yuhang Jiang, Shuncheng He, Xiangyang Ji
Reward decomposition is a critical problem in the centralized training with decentralized execution (CTDE) paradigm for multi-agent reinforcement learning.
1 code implementation • CVPR 2021 • Gu Wang, Fabian Manhardt, Federico Tombari, Xiangyang Ji
In this work, we perform an in-depth investigation on both direct and indirect methods, and propose a simple yet effective Geometry-guided Direct Regression Network (GDR-Net) to learn the 6D pose in an end-to-end manner from dense correspondence-based intermediate geometric representations.
Ranked #3 on 6D Pose Estimation using RGB on Occlusion LineMOD
no code implementations • CVPR 2020 • Jianzhun Shao, Yuhang Jiang, Gu Wang, Zhigang Li, Xiangyang Ji
6D pose estimation from a single RGB image is a challenging and vital task in computer vision.
no code implementations • NeurIPS 2021 • Zihan Zhang, Jiaqi Yang, Xiangyang Ji, Simon S. Du
With the new confidence sets, we obtain the following regret bounds: For linear bandits, we obtain an $\tilde{O}(\mathrm{poly}(d)\sqrt{1 + \sum_{k=1}^{K}\sigma_k^2})$ data-dependent regret bound, where $d$ is the feature dimension, $K$ is the number of rounds, and $\sigma_k^2$ is the unknown variance of the reward at the $k$-th round.
no code implementations • NeurIPS 2020 • Zihan Zhang, Yuan Zhou, Xiangyang Ji
We study the reinforcement learning problem in the setting of finite-horizon episodic Markov Decision Processes (MDPs) with $S$ states, $A$ actions, and episode length $H$. We propose a model-free algorithm UCB-ADVANTAGE and prove that it achieves $\tilde{O}(\sqrt{H^2 SAT})$ regret where $T = KH$ and $K$ is the number of episodes to play.
no code implementations • 12 Oct 2020 • Zihan Zhang, Simon S. Du, Xiangyang Ji
In the planning phase, the agent needs to return a near-optimal policy for arbitrary reward functions.
no code implementations • 28 Sep 2020 • Zihan Zhang, Xiangyang Ji, Simon S. Du
Episodic reinforcement learning generalizes contextual bandits and is often perceived to be more difficult due to long planning horizon and unknown state-dependent transitions.
no code implementations • 19 Aug 2020 • Zhigang Li, Yinlin Hu, Mathieu Salzmann, Xiangyang Ji
We achieve state-of-the-art performance on LINEMOD and Occluded-LINEMOD in the without-real-pose setting, even outperforming methods that rely on real annotations during training on Occluded-LINEMOD.
no code implementations • 9 Aug 2020 • Chenggang Yan, Zhisheng Li, Yongbing Zhang, Yutao Liu, Xiangyang Ji, Yongdong Zhang
Depth image denoising is increasingly becoming a hot research topic because depth images reflect the three-dimensional (3D) scene and can be applied in various fields of computer vision.
no code implementations • 26 Jun 2020 • Feng Liu, Xiaoxong Zhang, Fang Wan, Xiangyang Ji, Qixiang Ye
We present Domain Contrast (DC), a simple yet effective approach inspired by contrastive learning for training domain adaptive detectors.
no code implementations • 7 Jun 2020 • Shuncheng He, Jianzhun Shao, Xiangyang Ji
Meanwhile, it suppresses the empowerment of Z on the state of any single agent by adversarial training.
no code implementations • 6 Jun 2020 • Zihan Zhang, Yuan Zhou, Xiangyang Ji
In this paper we consider the problem of learning an $\epsilon$-optimal policy for a discounted Markov Decision Process (MDP).
no code implementations • 21 Apr 2020 • Zihan Zhang, Yuan Zhou, Xiangyang Ji
We study the reinforcement learning problem in the setting of finite-horizon episodic Markov Decision Processes (MDPs) with $S$ states, $A$ actions, and episode length $H$.
1 code implementation • ECCV 2020 • Gu Wang, Fabian Manhardt, Jianzhun Shao, Xiangyang Ji, Nassir Navab, Federico Tombari
6D object pose estimation is a fundamental problem in computer vision.
no code implementations • 14 Mar 2020 • Qiang Li, Xian-Ming Liu, Kaige Han, Cheng Guo, Xiangyang Ji, Xiaolin Wu
Whole slide imaging (WSI) is an emerging technology for digital pathology.
no code implementations • 12 Mar 2020 • Fabian Manhardt, Gu Wang, Benjamin Busam, Manuel Nickel, Sven Meier, Luca Minciullo, Xiangyang Ji, Nassir Navab
Contemporary monocular 6D pose estimation methods can only cope with a handful of object instances.
no code implementations • 19 Sep 2019 • Yongbing Zhang, Yangzhe Liu, Xiu Li, Shaowei Jiang, Krishna Dixit, Xinfeng Zhang, Xiangyang Ji
Since the optimal parameters of the PgNN can be derived by minimizing the difference between the model-generated images and real captured angle-varied images corresponding to the same scene, the proposed PgNN does not require the massive training data that traditional supervised methods depend on.
no code implementations • NeurIPS 2019 • Zihan Zhang, Xiangyang Ji
We present an algorithm based on the Optimism in the Face of Uncertainty (OFU) principle that efficiently learns reinforcement learning (RL) problems modeled by a Markov decision process (MDP) with a finite state-action space.
1 code implementation • CVPR 2019 • Fang Wan, Chang Liu, Wei Ke, Xiangyang Ji, Jianbin Jiao, Qixiang Ye
Weakly supervised object detection (WSOD) is a challenging task when provided with image category supervision but required to simultaneously learn object locations and object detectors.
Ranked #15 on Weakly Supervised Object Detection on PASCAL VOC 2007
no code implementations • 26 Feb 2019 • Guijin Wang, Cairong Zhang, Xinghao Chen, Xiangyang Ji, Jing-Hao Xue, Hang Wang
To mitigate these limitations and promote further research on hand pose estimation from stereo images, we propose a new large-scale binocular hand pose dataset called THU-Bi-Hand, offering a new perspective for fingertip localization.
no code implementations • IEEE Access 2018 • Xinghao Chen, Guijin Wang, Cairong Zhang, Tae-Kyun Kim, Xiangyang Ji
The semantic segmentation network assigns semantic labels for each point in the point set.
Ranked #7 on Hand Pose Estimation on MSRA Hands
2 code implementations • ECCV 2018 • Yi Li, Gu Wang, Xiangyang Ji, Yu Xiang, Dieter Fox
Estimating the 6D pose of objects from images is an important problem in various applications such as robot manipulation and virtual reality.
Ranked #1 on 6D Pose Estimation using RGB on YCB-Video
no code implementations • ECCV 2018 • Jialin Wu, Dai Li, Yu Yang, Chandrajit Bajaj, Xiangyang Ji
We propose a dynamic filtering strategy with a large sampling field for ConvNets (LS-DFN), where the position-specific kernels learn from not only the identical position but also multiple sampled neighbor regions.
no code implementations • 14 Feb 2017 • Shan Gao, Xiaogang Chen, Qixiang Ye, Junliang Xing, Arjan Kuijper, Xiangyang Ji
Inspired by the social affinity property of moving objects, we propose a Graphical Social Topology (GST) model, which estimates the group dynamics by jointly modeling the group structure and the states of objects using a topological representation.
3 code implementations • CVPR 2017 • Yi Li, Haozhi Qi, Jifeng Dai, Xiangyang Ji, Yichen Wei
It inherits all the merits of FCNs for semantic segmentation and instance mask proposal.
Ranked #98 on Instance Segmentation on COCO test-dev
no code implementations • 9 Jul 2016 • Jialin Wu, Gu Wang, Wukui Yang, Xiangyang Ji
We propose a novel deep supervised neural network for the task of action recognition in videos, which implicitly takes advantage of visual tracking and shares the robustness of both deep Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN).
no code implementations • 1 Dec 2015 • Dongsheng An, Jinli Suo, Xiangyang Ji, Haoqian Wang, Qionghai Dai
Specifically, this paper derives a normalized dichromatic model for the pixels with identical diffuse color: a unit circle equation of projection coefficients in two subspaces that are orthogonal to and parallel with the illumination, respectively.
no code implementations • 28 Nov 2015 • Qi Guo, Le Dan, Dong Yin, Xiangyang Ji
Multi-object tracking remains challenging due to the frequent occurrence of occlusions and outliers.
no code implementations • 29 Jan 2015 • Qi Guo, Bo-Wei Chen, Feng Jiang, Xiangyang Ji, Sun-Yuan Kung
Firstly, we divide the feature space into several subspaces using the decomposition method proposed in this paper.