1 code implementation • 23 Dec 2024 • Zixi Liang, Guowei Xu, Haifeng Wu, Ye Huang, Wen Li, Lixin Duan
It disentangles the multimodal relationships into scene layout relationships and detailed object relationships, fusing them later through implicit neural fields (INFs).
no code implementations • 4 Jul 2024 • Linlong Fan, Ye Huang, Yanqi Ge, Wen Li, Lixin Duan
It has properties such as viewpoint invariance and rotation robustness, which give it an advantage in addressing the 3D object recognition problem under arbitrary views.
no code implementations • 10 Apr 2024 • Yanqi Ge, Jiaqi Liu, Qingnan Fan, Xi Jiang, Ye Huang, Shuai Qin, Hong Gu, Wen Li, Lixin Duan
In this work, we propose a novel solution to the text-driven style transfer task, namely, Adaptive Style Incorporation (ASI), to achieve fine-grained feature-level style incorporation.
1 code implementation • 20 Feb 2024 • Yu Xiong, Zhipeng Hu, Ye Huang, Runze Wu, Kai Guan, Xingchen Fang, Ji Jiang, Tianze Zhou, Yujing Hu, Haoyu Liu, Tangjie Lyu, Changjie Fan
To address this, we introduce XRL-Bench, a unified standardized benchmark tailored for the evaluation and comparison of XRL methods, encompassing three main modules: standard RL environments, explainers based on state importance, and standard evaluators.
no code implementations • 26 Jan 2024 • Yanqi Ge, Ye Huang, Wen Li, Lixin Duan
We introduce SSR, which uses SAM (Segment Anything Model) as a strong regularizer during training to greatly enhance the robustness of the image encoder across various domains.
1 code implementation • 19 Dec 2023 • Yanqi Ge, Qiang Nie, Ye Huang, Yong Liu, Chengjie Wang, Feng Zheng, Wen Li, Lixin Duan
By pulling the learned features to these semantic anchors, several advantages can be attained: 1) intra-class compactness and natural inter-class separability, 2) avoidance of induced bias or errors from feature learning, and 3) robustness to the long-tailed problem.
no code implementations • 15 Mar 2023 • Ye Huang, Di Kang, Shenghua Gao, Wen Li, Lixin Duan
One crucial design of the HFG is to protect the high-level features from being contaminated by using proper stop-gradient operations so that the backbone does not update according to the noisy gradient from the upsampler.
1 code implementation • 11 Jan 2023 • Ye Huang, Di Kang, Liang Chen, Wenjing Jia, Xiangjian He, Lixin Duan, Xuefei Zhe, Linchao Bao
Extensive experiments and ablation studies conducted on multiple benchmark datasets demonstrate that the proposed CAR can boost the accuracy of all baseline models by up to 2.23% mIoU with superior generalization ability.
1 code implementation • arXiv:2203.07160 2022 • Ye Huang, Di Kang, Liang Chen, Xuefei Zhe, Wenjing Jia, Xiangjian He, Linchao Bao
Recent segmentation methods, such as OCR and CPNet, which utilize "class level" information in addition to pixel features, have achieved notable success in boosting the accuracy of existing network modules.
Ranked #8 on Semantic Segmentation on PASCAL Context
1 code implementation • 19 Jan 2021 • Ye Huang, Di Kang, Wenjing Jia, Xiangjian He, Liu Liu
Spatial and channel attentions, modelling the semantic interdependencies in spatial and channel dimensions respectively, have recently been widely used for semantic segmentation.
Ranked #6 on Semantic Segmentation on COCO-Stuff test
no code implementations • 26 Aug 2019 • Ye Huang, Qingqing Wang, Wenjing Jia, Xiangjian He
Experiments conducted on the benchmark PASCAL VOC 2012 dataset show that the proposed sharing strategy can not only boost a network's generalization and representation abilities but also reduce the model complexity significantly.
no code implementations • 20 Apr 2019 • Qingqing Wang, Wenjing Jia, Xiangjian He, Yue Lu, Michael Blumenstein, Ye Huang
Scene text recognition has recently been widely treated as a sequence-to-sequence prediction problem, where the traditional fully-connected LSTM (FC-LSTM) has played a critical role.