no code implementations • 13 Jun 2024 • Yue Xu, Kaizhi Yang, Jiebo Luo, Xuejin Chen
3D visual grounding is an emerging research area dedicated to making connections between the 3D physical world and natural language, which is crucial for achieving embodied intelligence.
no code implementations • 22 Feb 2024 • Kai Cheng, Xiaoxiao Long, Kaizhi Yang, Yao Yao, Wei Yin, Yuexin Ma, Wenping Wang, Xuejin Chen
The advent of 3D Gaussian Splatting (3DGS) has recently brought about a revolution in the field of neural rendering, facilitating high-quality renderings at real-time speed.
no code implementations • 20 Feb 2024 • Jiaqi Xu, Cuiling Lan, Wenxuan Xie, Xuejin Chen, Yan Lu
A pivotal challenge is the development of an efficient method to encapsulate video content into a set of representative tokens to align with LLMs.
1 code implementation • 5 Jan 2024 • Qihua Chen, Xuejin Chen, Chenxuan Wang, Yixiong Liu, Zhiwei Xiong, Feng Wu
In this work, we aim to reduce human workload by predicting connectivity between over-segmented neuron pieces, taking both microscopy image and 3D morphology features into account, similar to the human proofreading workflow.
no code implementations • CVPR 2024 • Xin Kang, Lei Chu, Jiahao Li, Xuejin Chen, Yan Lu
Recent methods for label-free 3D semantic segmentation aim to assist 3D model training by leveraging the open-world recognition ability of pre-trained vision language models.
1 code implementation • CVPR 2024 • Xiaoyu Liu, Miaomiao Cai, Yinda Chen, Yueyi Zhang, Te Shi, Ruobing Zhang, Xuejin Chen, Zhiwei Xiong
Recent advancements utilize 3D CNNs to predict a 3D affinity map with improved accuracy but suffer from two challenges: high computational cost and limited input size, especially in practical deployment on large-scale EM volumes.
no code implementations • 8 Dec 2023 • Jiaqi Xu, Cuiling Lan, Wenxuan Xie, Xuejin Chen, Yan Lu
To address these issues, we introduce a simple yet effective retrieval-based video language model (R-VLM) for efficient and interpretable long video QA.
no code implementations • 28 Nov 2023 • Kai Cheng, Xiaoxiao Long, Wei Yin, Jin Wang, Zhiqiang Wu, Yuexin Ma, Kaixuan Wang, Xiaozhi Chen, Xuejin Chen
Multi-camera setups find widespread use across various applications, such as autonomous driving, as they greatly expand sensing capabilities.
no code implementations • ICCV 2023 • Qingyao Shuai, Chi Zhang, Kaizhi Yang, Xuejin Chen
Unsupervised methods for reconstructing structures face significant challenges in capturing the geometric details with consistent structures among diverse shapes of the same category.
no code implementations • 19 Jun 2023 • Jiaqi Xu, Yuwang Wang, Xuejin Chen
In this work, assuming that the gradients of samples from a specific domain under the classification task also reflect the properties of that domain, we propose a Shape Guided Gradient Voting (SGGV) method for domain generalization.
no code implementations • 15 Apr 2023 • Xin Kang, Chaoqun Wang, Xuejin Chen
We design a region-based feature enhancement (RFE) module, which consists of a Semantic-Spatial Region Extraction stage and a Region Dependency Modeling stage.
no code implementations • 10 Mar 2023 • Kaizhi Yang, Xiaoshuai Zhang, Zhiao Huang, Xuejin Chen, Zexiang Xu, Hao Su
Under the Lagrangian view, we parameterize the scene motion by tracking the trajectory of particles on objects.
1 code implementation • ICCV 2023 • Xiaoyu Liu, Wei Huang, Zhiwei Xiong, Shenglong Zhou, Yueyi Zhang, Xuejin Chen, Zheng-Jun Zha, Feng Wu
Sparse instance-level supervision has recently been explored to address insufficient annotation in biomedical instance segmentation; compared to common semi-supervision, it makes crowded instances easier to annotate and better preserves instance completeness for 3D volumetric datasets. In this paper, we propose a sparsely supervised biomedical instance segmentation framework via cross-representation affinity consistency regularization.
2 code implementations • CVPR 2023 • Binxin Yang, Shuyang Gu, Bo Zhang, Ting Zhang, Xuejin Chen, Xiaoyan Sun, Dong Chen, Fang Wen
Language-guided image editing has achieved great success recently.
no code implementations • 23 Nov 2022 • Binxin Yang, Xuejin Chen, Chaoqun Wang, Chi Zhang, Zihan Chen, Xiaoyan Sun
With a semantic feature matching loss for effective semantic supervision, our sketch embedding precisely conveys the semantics in the input sketches to the synthesized images.
no code implementations • 17 Jul 2022 • Zhihua Cheng, Xuejin Chen
Sketching is an intuitive and effective way for content creation.
no code implementations • 5 May 2022 • Kai Cheng, Hao Chen, Wei Yin, Guangkai Xu, Xuejin Chen
However, multi-view depth estimation is fundamentally a correspondence-based optimization problem, whereas previous learning-based methods mainly rely on predefined depth hypotheses to build correspondence as the cost volume and implicitly regularize it to fit depth prediction, deviating from the essence of iterative optimization based on stereo correspondence.
1 code implementation • 14 Mar 2022 • Qihua Chen, Xuejin Chen, Hyun-Myung Woo, Byung-Jun Yoon
In this work, we propose a novel scheme to reduce the computational cost for objective-UQ via MOCU based on a data-driven approach.
no code implementations • NeurIPS 2021 • Chaoqun Wang, Shaobo Min, Xuejin Chen, Xiaoyan Sun, Houqiang Li
This enables DPPN to produce visual representations with accurate attribute localization ability, which benefits the semantic-visual alignment and representation transferability.
no code implementations • 25 Sep 2021 • Zhili Li, Xuejin Chen, Jie Zhao, Zhiwei Xiong
However, due to the image degradation during the imaging process, the large variety of mitochondrial structures, as well as the presence of noise, artifacts and other sub-cellular structures, mitochondria segmentation is very challenging.
no code implementations • 26 Aug 2021 • Hao Wang, Zheng-Jun Zha, Liang Li, Xuejin Chen, Jiebo Luo
We propose a novel MultiModulation Network (M2N) to learn the above correlation and leverage it as semantic guidance to modulate the related auditory, visual, and fused features.
1 code implementation • 7 Jun 2021 • Kaizhi Yang, Xuejin Chen
In this paper, we propose an unsupervised shape abstraction method to map a point cloud into a compact cuboid representation.
no code implementations • 5 Apr 2021 • Chaoqun Wang, Xuejin Chen, Shaobo Min, Xiaoyan Sun, Houqiang Li
First, DCEN leverages task labels to cluster representations of the same semantic category by cross-modal contrastive learning and exploring semantic-visual complementarity.
4 code implementations • CVPR 2021 • Xiaotian Chen, Yuwang Wang, Xuejin Chen, Wenjun Zeng
S2R-DepthNet consists of: a) a Structure Extraction (STE) module, which extracts a domain-invariant structural representation from an image by disentangling the image into domain-invariant structure and domain-specific style components, b) a Depth-specific Attention (DSA) module, which learns task-specific knowledge to suppress depth-irrelevant structures for better depth estimation and generalization, and c) a depth prediction (DP) module to predict depth from the depth-specific representation.
1 code implementation • 4 Dec 2020 • Songfang Han, Jiayuan Gu, Kaichun Mo, Li Yi, Siyu Hu, Xuejin Chen, Hao Su
However, there remains a much more difficult and under-explored issue on how to generalize the learned skills over unseen object categories that have very different shape geometry distributions.
1 code implementation • 31 Aug 2020 • Yuhang Li, Xuejin Chen, Binxin Yang, Zihan Chen, Zhihua Cheng, Zheng-Jun Zha
In this paper, we explore the task of generating photo-realistic face images from hand-drawn sketches.
no code implementations • 20 Oct 2019 • Yuhang Li, Xuejin Chen, Feng Wu, Zheng-Jun Zha
The large-scale discriminator enforces the completeness of global structures and the small-scale discriminator encourages fine details, thereby enhancing the realism of generated face images.
1 code implementation • 13 Jul 2019 • Xiaotian Chen, Xuejin Chen, Zheng-Jun Zha
We propose a Residual Pyramid Decoder (RPD) which expresses global scene structure in upper levels to represent layouts, and local structure in lower levels to present shape details.
Ranked #58 on Monocular Depth Estimation on NYU-Depth V2 (RMSE metric)
no code implementations • 31 Jul 2018 • Shaobo Min, Xuejin Chen, Zheng-Jun Zha, Feng Wu, Yongdong Zhang
Learning-based methods suffer from a deficiency of clean annotations, especially in biomedical segmentation.