1 code implementation • 13 Feb 2025 • Lingting Zhu, Guying Lin, Jinnan Chen, Xinjie Zhang, Zhenchao Jin, Zhao Wang, Lequan Yu
While Implicit Neural Representations (INRs) have demonstrated significant success in image representation, they are often hindered by large training memory and slow decoding speed.
1 code implementation • 8 Oct 2024 • Zhenchao Jin, Mengchen Liu, Dongdong Chen, Lingting Zhu, Yunsheng Li, Lequan Yu
Through the integration of external tools, large language models (LLMs) such as GPT-4o and Llama 3.1 significantly expand their functional capabilities, evolving from elementary conversational agents to general-purpose assistants.
no code implementations • 5 Aug 2024 • Changtao Miao, Qi Chu, Tao Gong, Zhentao Tan, Zhenchao Jin, Wanyi Zhuang, Man Luo, Honggang Hu, Nenghai Yu
The FUP integrates detection and localization tasks using a token learning strategy and multiple forgery-aware transformers, which facilitates the use of classification information to enhance localization capability.
2 code implementations • 19 Mar 2024 • Lingting Zhu, Noel Codella, Dongdong Chen, Zhenchao Jin, Lu Yuan, Lequan Yu
Our method begins with a 2D slice, denoted the informed slice, which serves as the patient prior, and propagates the generation process using a 3D segmentation mask.
1 code implementation • 21 Jan 2024 • Lingting Zhu, Zhao Wang, Jiahao Cui, Zhenchao Jin, Guying Lin, Lequan Yu
Specifically, our approach incorporates deformation fields to handle dynamic scenes, depth-guided supervision with spatial-temporal weight masks to optimize 3D targets with tool occlusion from a single viewpoint, and surface-aligned regularization terms to capture much better geometry.
1 code implementation • NeurIPS 2023 • Zhenchao Jin, Xiaowei Hu, Lingting Zhu, Luchuan Song, Li Yuan, Lequan Yu
Next, a deletion diagnostics procedure is conducted to model the relations among these semantic-level representations by perceiving the network outputs, and the extracted relations are used to guide the semantic-level representations to interact with each other.
no code implementations • ICCV 2023 • Luchuan Song, Guojun Yin, Zhenchao Jin, Xiaoyi Dong, Chenliang Xu
Listener head generation centers on generating non-verbal behaviors (e.g., smile) of a listener in reference to the information delivered by a speaker.
no code implementations • 19 Jul 2023 • Lingting Zhu, Zeyue Xue, Zhenchao Jin, Xian Liu, Jingzhen He, Ziwei Liu, Lequan Yu
This paradigm extends the 2D image diffusion model to a volumetric version with only a slight increase in parameters and computation, offering a principled solution for generic cross-modality 3D medical image synthesis.
1 code implementation • 26 May 2023 • Zhenchao Jin
This paper presents SSSegmentation, an open-source supervised semantic image segmentation toolbox based on PyTorch.
1 code implementation • 18 May 2023 • Changtao Miao, Qi Chu, Zhentao Tan, Zhenchao Jin, Tao Gong, Wanyi Zhuang, Yue Wu, Bin Liu, Honggang Hu, Nenghai Yu
To this end, a novel Multi-Spectral Class Center Network (MSCCNet) is proposed for face manipulation detection and localization.
2 code implementations • 9 Sep 2022 • Zhenchao Jin, Dongdong Yu, Zehuan Yuan, Lequan Yu
To this end, we propose a novel paradigm for soft mining of contextual information beyond the image, named MCIBI++, to further boost the pixel-level representations.
1 code implementation • 16 Jul 2022 • Zhenchao Jin, Dongdong Yu, Luchuan Song, Zehuan Yuan, Lequan Yu
Feature pyramid network (FPN) is one of the key components for object detectors.
no code implementations • 1 Sep 2021 • Zhenchao Jin, Dongdong Yu, Kai Su, Zehuan Yuan, Changhu Wang
Video scene parsing is a long-standing challenging task in computer vision, aiming to assign pre-defined semantic labels to pixels of all frames in a given video.
1 code implementation • ICCV 2021 • Zhenchao Jin, Bin Liu, Qi Chu, Nenghai Yu
Third, we compute the similarities between each pixel representation and the image-level and semantic-level contextual information, respectively.
1 code implementation • ICCV 2021 • Zhenchao Jin, Tao Gong, Dongdong Yu, Qi Chu, Jian Wang, Changhu Wang, Jie Shao
To address this, this paper proposes to mine the contextual information beyond individual images to further augment the pixel representations.