no code implementations • 25 Jun 2025 • Jiaxing Huang, Heng Guo, Le Lu, Fan Yang, Minfeng Xu, Ge Yang, Wei Luo
Opportunistic computed tomography (CT) analysis has emerged as a promising alternative for osteoporosis diagnosis using existing imaging data.
1 code implementation • 23 Jun 2025 • Zijie Yang, Qiji Zhou, Fang Guo, Sijie Zhang, Yexun Xi, Jinglei Nie, Yudian Zhu, Liping Huang, Chou Wu, Yonghe Xia, Xiaoyu Ma, Yingming Pu, Panzhong Lu, Junshu Pan, Mingtao Chen, Tiannan Guo, Yanmei Dou, Hongyu Chen, Anping Zeng, Jiaxing Huang, Tian Xu, Yue Zhang
In this study, we address these challenges by developing Airalogy (https://airalogy.com), the world's first AI- and community-driven platform that balances universality and standardization for digitizing research data across multiple disciplines.
1 code implementation • 9 Jun 2025 • Michael K. Chen, Xikun Zhang, Jiaxing Huang, DaCheng Tao
Large language models (LLMs) have become the cornerstone of modern AI.
no code implementations • 26 May 2025 • Rong-Cheng Tu, Wenhao Sun, Hanzhe You, Yingjie Wang, Jiaxing Huang, Li Shen, DaCheng Tao
Zero-Shot Composed Image Retrieval (ZS-CIR) aims to retrieve target images given a compositional query, consisting of a reference image and a modifying text, without relying on annotated training data.
1 code implementation • 22 May 2025 • Yibo Wang, Li Shen, Huanjin Yao, Tiansheng Huang, Rui Liu, Naiqiang Tan, Jiaxing Huang, Kai Zhang, DaCheng Tao
Chain-of-Thought (CoT) reasoning enhances large language models (LLMs) by enabling step-by-step problem-solving, yet its extension to Long-CoT introduces substantial computational overhead due to increased token length.
2 code implementations • 22 May 2025 • Huanjin Yao, Qixiang Yin, Jingyi Zhang, Min Yang, Yibo Wang, Wenhao Wu, Fei Su, Li Shen, Minghui Qiu, DaCheng Tao, Jiaxing Huang
To this end, we propose Share-GRPO, a novel RL approach that tackles these issues by exploring and sharing diverse reasoning trajectories over an expanded question space.
no code implementations • 3 Apr 2025 • Laibin Chang, Yunke Wang, Jiaxing Huang, Longxiang Deng, Bo Du, Chang Xu
Marine Saliency Segmentation (MSS) plays a pivotal role in various vision-based marine exploration tasks.
1 code implementation • 31 Mar 2025 • YuFei Wang, Lanqing Guo, Zhihao LI, Jiaxing Huang, Pichao Wang, Bihan Wen, Jian Wang
Text-guided image editing is an essential task that enables users to modify images through natural language descriptions.
1 code implementation • 17 Mar 2025 • Jingyi Zhang, Jiaxing Huang, Huanjin Yao, Shunyu Liu, Xikun Zhang, Shijian Lu, DaCheng Tao
Recent studies generally enhance MLLMs' reasoning capabilities via supervised fine-tuning on high-quality chain-of-thought reasoning data, which often leads models to merely imitate successful reasoning paths without understanding what the wrong reasoning paths are.
no code implementations • 21 Feb 2025 • Yingying Sun, Jun A, Zhiwei Liu, Rui Sun, Liujia Qian, Samuel H. Payne, Wout Bittremieux, Markus Ralser, Chen Li, Yi Chen, Zhen Dong, Yasset Perez-Riverol, Asif Khan, Chris Sander, Ruedi Aebersold, Juan Antonio Vizcaíno, Jonathan R Krieger, Jianhua Yao, Han Wen, Linfeng Zhang, Yunping Zhu, Yue Xuan, Benjamin Boyang Sun, Liang Qiao, Henning Hermjakob, Haixu Tang, Huanhuan Gao, Yamin Deng, Qing Zhong, Cheng Chang, Nuno Bandeira, Ming Li, Weinan E, Siqi Sun, Yuedong Yang, Gilbert S. Omenn, Yue Zhang, Ping Xu, Yan Fu, Xiaowen Liu, Christopher M. Overall, Yu Wang, Eric W. Deutsch, Luonan Chen, Jürgen Cox, Vadim Demichev, Fuchu He, Jiaxing Huang, Huilin Jin, Chao Liu, Nan Li, Zhongzhi Luan, Jiangning Song, Kaicheng Yu, Wanggen Wan, Tai Wang, Kang Zhang, Le Zhang, Peter A. Bell, Matthias Mann, Bing Zhang, Tiannan Guo
Artificial intelligence (AI) is transforming scientific research, including proteomics.
1 code implementation • 19 Feb 2025 • Kongcheng Zhang, Qi Yao, Baisheng Lai, Jiaxing Huang, Wenkai Fang, DaCheng Tao, Mingli Song, Shunyu Liu
Specifically, RFTT comprises two phases: (1) supervised fine-tuning performs prompt-driven tree search to obtain self-generated training data annotated with functional tokens, which warms up the model to learn these tokens for reasoning; and (2) online reinforcement learning further allows the model to explore different reasoning pathways through functional token sampling without relying on prompts, thereby facilitating effective self-improvement for functional reasoning.
no code implementations • 17 Feb 2025 • Xinyu Zhang, Yuxuan Dong, Yanrui Wu, Jiaxing Huang, Chengyou Jia, Basura Fernando, Mike Zheng Shou, Lingling Zhang, Jun Liu
These findings position PhysReason as a novel and comprehensive benchmark for evaluating physics-based reasoning capabilities in large language models.
1 code implementation • 30 Jan 2025 • Yibo Wang, Tiansheng Huang, Li Shen, Huanjin Yao, Haotian Luo, Rui Liu, Naiqiang Tan, Jiaxing Huang, DaCheng Tao
Mainstream defenses aim to vaccinate the model such that the later harmful fine-tuning attack is less effective.
2 code implementations • 24 Dec 2024 • Huanjin Yao, Jiaxing Huang, Wenhao Wu, Jingyi Zhang, Yibo Wang, Shunyu Liu, Yingjie Wang, Yuxin Song, Haocheng Feng, Li Shen, DaCheng Tao
Using CoMCTS, we construct Mulberry-260k, a multimodal dataset with a tree of rich, explicit and well-defined reasoning nodes for each question.
no code implementations • 28 Nov 2024 • Rong-Cheng Tu, Wenhao Sun, Zhao Jin, Jingyi Liao, Jiaxing Huang, DaCheng Tao
While open-source video generation and editing models have made significant progress, individual models are typically limited to specific tasks, failing to meet the diverse needs of users.
no code implementations • 13 Nov 2024 • Kai Jiang, Jiaxing Huang
Autoregressive models have demonstrated great performance in natural language processing (NLP) with impressive scalability, adaptability and generalizability.
no code implementations • 27 Oct 2024 • Jiaxing Huang, Jingyi Zhang, Kai Jiang, Shijian Lu
LHST expands the image-level labels with language hierarchy and enables co-regularization between the expanded labels and self-training.
no code implementations • 27 Oct 2024 • Jingyi Zhang, Jiaxing Huang, Xiaoqin Zhang, Ling Shao, Shijian Lu
Test-time prompt tuning, which learns prompts online with unlabelled test samples during the inference stage, has demonstrated great potential by learning effective prompts on-the-fly without requiring any task-specific annotations.
1 code implementation • 22 Oct 2024 • Aoran Xiao, Weihao Xuan, Junjue Wang, Jiaxing Huang, DaCheng Tao, Shijian Lu, Naoto Yokoya
This survey systematically reviews the emerging field of RSFMs.
1 code implementation • 13 Oct 2024 • Han Qiu, Jiaxing Huang, Peng Gao, Qin Qi, Xiaoqin Zhang, Ling Shao, Shijian Lu
Several benchmarks have been created to gauge the hallucination levels of MLLMs, by either raising discriminative questions about the existence of objects or introducing LLM evaluators to score the generated text from MLLMs.
no code implementations • 28 Aug 2024 • Jiaxing Huang, Jingyi Zhang
Multimodal Large Language Models (MLLMs) mimic the human perception and reasoning system by integrating powerful Large Language Models (LLMs) with various modality encoders (e.g., vision, audio), positioning LLMs as the "brain" and the modality encoders as sensory organs.
1 code implementation • 20 Jul 2024 • Jiaxing Huang, Yanfeng Zhou, Yaoru Luo, Guole Liu, Heng Guo, Ge Yang
A fundamental property of such structures is their topological self-similarity, which can be quantified by fractal features such as fractal dimension (FD).
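As a rough illustration of how a fractal dimension can be estimated, the sketch below uses the standard box-counting estimate on a 2D binary mask. It is not taken from the listed paper: the function name `box_counting_dimension`, the default box sizes, and the nested-list mask format are all assumptions made for this example.

```python
import math


def box_counting_dimension(mask, box_sizes=(1, 2, 4, 8, 16)):
    """Estimate the box-counting dimension of a non-empty 2D binary mask.

    mask is a list of equal-length rows; truthy entries are foreground.
    (Hypothetical helper for illustration only.)
    """
    xs, ys = [], []
    for s in box_sizes:
        # Collect the grid cells of side s that contain foreground pixels.
        occupied = set()
        for r, row in enumerate(mask):
            for c, value in enumerate(row):
                if value:
                    occupied.add((r // s, c // s))
        xs.append(math.log(1.0 / s))
        ys.append(math.log(len(occupied)))  # fails on an empty mask

    # N(s) scales like s^(-D), so D is the least-squares slope of
    # log N(s) against log(1/s).
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den
```

On a completely filled square the estimate is 2, and on a single straight row of pixels it is 1, which matches the intuition that FD measures how a structure fills space across scales.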
1 code implementation • 22 Mar 2024 • Heng Guo, Jianfeng Zhang, Jiaxing Huang, Tony C. W. Mok, Dazhou Guo, Ke Yan, Le Lu, Dakai Jin, Minfeng Xu
Therefore, we propose two key technical developments: 1) a progressively and spatially aligned prompt encoding method to effectively encode click prompts in local 3D space; and 2) a cross-patch prompt scheme to capture more 3D spatial context, which is beneficial for reducing the editing workloads when interactively prompting on large organs.
1 code implementation • CVPR 2024 • Han Qiu, Jiaxing Huang, Peng Gao, Lewei Lu, Xiaoqin Zhang, Shijian Lu
Inspired by the success of general-purpose models in NLP, recent studies attempt to unify different vision tasks in the same sequence format and employ autoregressive Transformers for sequence prediction.
no code implementations • 7 Feb 2024 • Sheng Jin, Xueying Jiang, Jiaxing Huang, Lewei Lu, Shijian Lu
This paper presents DVDet, a Descriptor-Enhanced Open Vocabulary Detector that introduces conditional context prompts and hierarchical textual descriptors that enable precise region-text alignment as well as open-vocabulary detection training in general.
no code implementations • 13 Jan 2024 • Kai Jiang, Jiaxing Huang, Weiying Xie, Jie Lei, Yunsong Li, Ling Shao, Shijian Lu
Large-vocabulary object detectors (LVDs) aim to detect objects of many categories; they learn super objectness features and can locate objects accurately when applied to various downstream data.
no code implementations • 13 Jan 2024 • Kai Jiang, Jiaxing Huang, Weiying Xie, Yunsong Li, Ling Shao, Shijian Lu
Camera-only Bird's Eye View (BEV) has demonstrated great potential in environment perception in a 3D space.
no code implementations • 9 Jan 2024 • Jiaxing Huang, Kai Jiang, Jingyi Zhang, Han Qiu, Lewei Lu, Shijian Lu, Eric Xing
SAMs work with two types of prompts including spatial prompts (e.g., points) and semantic prompts (e.g., texts), which work together to prompt SAMs to segment anything on downstream datasets.
no code implementations • 27 Dec 2023 • Jiaxing Huang, Jingyi Zhang, Kai Jiang, Han Qiu, Shijian Lu
Traditional computer vision generally solves each single task independently by a dedicated model with the task instruction implicitly designed in the model architecture, giving rise to two limitations: (1) it leads to task-specific models, which require multiple models for different tasks and restrict the potential synergies from diverse tasks; (2) it leads to a pre-defined and fixed model interface that has limited interactivity and adaptability in following users' task instructions.
no code implementations • ICCV 2023 • Xueying Jiang, Jiaxing Huang, Sheng Jin, Shijian Lu
Despite its recent progress, most existing work suffers from the misalignment between the difficulty level of training samples and the capability of contemporarily trained models, leading to over-fitting or under-fitting in the trained generalization model.
1 code implementation • ICCV 2023 • Jingyi Zhang, Jiaxing Huang, Xueying Jiang, Shijian Lu
However, the source predictions of target data are often noisy and training with them is prone to learning collapses.
no code implementations • 29 Jun 2023 • Jiaxing Huang, Jingyi Zhang, Han Qiu, Sheng Jin, Shijian Lu
Traditional domain adaptation assumes the same vocabulary across source and target domains, which often struggles with limited transfer flexibility and efficiency while handling target domains with different vocabularies.
1 code implementation • 3 Apr 2023 • Jingyi Zhang, Jiaxing Huang, Sheng Jin, Shijian Lu
Most visual recognition studies rely heavily on crowd-labelled data in deep neural network (DNN) training, and they usually train a DNN for each single visual recognition task, leading to a laborious and time-consuming visual recognition paradigm.
1 code implementation • CVPR 2023 • Aoran Xiao, Jiaxing Huang, Weihao Xuan, Ruijie Ren, Kangcheng Liu, Dayan Guan, Abdulmotaleb El Saddik, Shijian Lu, Eric Xing
In addition, we design a domain randomization technique that alternately randomizes the geometry styles of point clouds and aggregates their embeddings, ultimately leading to a generalizable model that can effectively improve 3DSS under various adverse weather conditions.
1 code implementation • ICCV 2023 • Yanfeng Zhou, Jiaxing Huang, Chenlong Wang, Le Song, Ge Yang
Perturbations in consistency-based semi-supervised models are often artificially designed.
2 code implementations • 30 Jul 2022 • Aoran Xiao, Jiaxing Huang, Dayan Guan, Kaiwen Cui, Shijian Lu, Ling Shao
The first is scene-level swapping which exchanges point cloud sectors of two LiDAR scans that are cut along the azimuth axis.
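A minimal sketch of what swapping an azimuth sector between two scans could look like. Everything beyond the sentence above is an assumption: points are `(x, y, z)` tuples with the sensor at the origin, azimuth is computed with `atan2(y, x)`, the sector does not wrap around ±π, and the helper name `swap_azimuth_sector` is hypothetical. It is not the authors' implementation.

```python
import math


def swap_azimuth_sector(scan_a, scan_b, start, end):
    """Exchange the points whose azimuth lies in [start, end) radians.

    Assumes start < end and that both lie within (-pi, pi], so the
    sector does not cross the +/-pi boundary. (Illustrative helper.)
    """
    def in_sector(point):
        azimuth = math.atan2(point[1], point[0])
        return start <= azimuth < end

    a_in = [p for p in scan_a if in_sector(p)]
    a_out = [p for p in scan_a if not in_sector(p)]
    b_in = [p for p in scan_b if in_sector(p)]
    b_out = [p for p in scan_b if not in_sector(p)]

    # Each scan keeps its own points outside the sector and receives
    # the other scan's points inside it.
    return a_out + b_in, b_out + a_in
```

For example, `swap_azimuth_sector(scan_a, scan_b, 0.0, 1.0)` moves every point with azimuth between 0 and 1 radian from each scan into the other, so the total number of points across the two scans is unchanged.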
1 code implementation • 28 Jul 2022 • Gongjie Zhang, Zhipeng Luo, Jiaxing Huang, Shijian Lu, Eric P. Xing
The recently proposed DEtection TRansformer (DETR) has established a fully end-to-end paradigm for object detection.
no code implementations • 26 Jul 2022 • Chuhui Xue, Jiaxing Huang, Shijian Lu, Changhu Wang, Song Bai
We formulate the new setup by a dual detection task which first detects integral text units and then groups them into a CTB.
1 code implementation • 6 Jul 2022 • Yun Xing, Dayan Guan, Jiaxing Huang, Shijian Lu
Specifically, we design cross-frame pseudo labelling to provide pseudo supervision from previous video frames while learning from the augmented current video frames.
no code implementations • CVPR 2023 • Jingyi Zhang, Jiaxing Huang, Xiaoqin Zhang, Shijian Lu
Domain adaptive panoptic segmentation aims to mitigate the data annotation challenge by leveraging off-the-shelf annotated data in one or multiple related source domains.
Ranked #3 on Domain Adaptation on Panoptic SYNTHIA-to-Cityscapes
1 code implementation • CVPR 2022 • Dayan Guan, Jiaxing Huang, Aoran Xiao, Shijian Lu
We build the balanced subclass distributions by clustering pixels of each original class into multiple subclasses of similar sizes, which provide class-balanced pseudo supervision to regularize the class-biased segmentation.
1 code implementation • 28 Feb 2022 • Aoran Xiao, Jiaxing Huang, Dayan Guan, Xiaoqin Zhang, Shijian Lu, Ling Shao
The convergence of point cloud and DNNs has led to many deep point cloud models, largely trained under the supervision of large-scale and densely-labelled point cloud data.
1 code implementation • NeurIPS 2021 • Jiaxing Huang, Dayan Guan, Aoran Xiao, Shijian Lu
To this end, we design an innovative historical contrastive learning (HCL) technique that exploits historical source hypothesis to make up for the absence of source data in UMA.
1 code implementation • 4 Oct 2021 • Kaiwen Cui, Jiaxing Huang, Zhipeng Luo, Gongjie Zhang, Fangneng Zhan, Shijian Lu
Specifically, we design GenCo, a Generative Co-training network that mitigates the discriminator over-fitting issue by introducing multiple complementary discriminators that provide diverse supervision from multiple distinctive views in training.
no code implementations • 29 Sep 2021 • Chuhui Xue, Jiaxing Huang, Wenqing Zhang, Shijian Lu, Song Bai, Changhu Wang
This paper presents Contextual Text Detection, a new setup that detects contextual text blocks for better understanding of texts in scenes.
1 code implementation • ICCV 2021 • Dayan Guan, Jiaxing Huang, Aoran Xiao, Shijian Lu
This paper presents DA-VSN, a domain adaptive video segmentation network that addresses domain gaps in videos by temporal consistency regularization (TCR) for consecutive frames of target-domain videos.
1 code implementation • 12 Jul 2021 • Aoran Xiao, Jiaxing Huang, Dayan Guan, Fangneng Zhan, Shijian Lu
Extensive experiments show that SynLiDAR provides a high-quality data source for studying 3D transfer and the proposed PCT achieves superior point cloud translation consistently across the three setups.
no code implementations • 7 Jul 2021 • Kaiwen Cui, Gongjie Zhang, Fangneng Zhan, Jiaxing Huang, Shijian Lu
Generative Adversarial Networks (GANs) have become the de-facto standard in image synthesis.
no code implementations • CVPR 2022 • Jingyi Zhang, Jiaxing Huang, Zichen Tian, Shijian Lu
Second, it introduces multi-view spectral learning that learns useful unsupervised representations by maximizing mutual information among multiple ST-generated spectral views of each target sample.
1 code implementation • ICCV 2021 • Jiaxing Huang, Dayan Guan, Aoran Xiao, Shijian Lu
With FAA-generated samples, the training can continue the 'random walk' and drift into an area with a flat loss landscape, leading to more robust domain adaptation.
no code implementations • 5 Jun 2021 • Jiaxing Huang, Dayan Guan, Aoran Xiao, Shijian Lu
We position the few labeled target samples as references that gauge the similarity between source and target features and guide adaptive inter-domain alignment for learning more similar source features.
1 code implementation • CVPR 2022 • Jiaxing Huang, Dayan Guan, Aoran Xiao, Shijian Lu, Ling Shao
In this work, we explore the idea of instance contrastive learning in unsupervised domain adaptation (UDA) and propose a novel Category Contrast technique (CaCo) that introduces semantic priors on top of instance discrimination for visual UDA tasks.
no code implementations • 18 May 2021 • Chuhui Xue, Jiaxing Huang, Wenqing Zhang, Shijian Lu, Changhu Wang, Song Bai
The first task focuses on image-to-character (I2C) mapping which detects a set of character candidates from images based on different alignments of visual features in a non-sequential way.
no code implementations • CVPR 2023 • Jingyi Zhang, Jiaxing Huang, Zhipeng Luo, Gongjie Zhang, Xiaoqin Zhang, Shijian Lu
DA-DETR introduces a novel CNN-Transformer Blender (CTBlender) that fuses the CNN features and Transformer features ingeniously for effective feature alignment and knowledge transfer across domains.
no code implementations • 24 Mar 2021 • Jiaxing Huang, Dayan Guan, Shijian Lu, Aoran Xiao
Recent progress in domain adaptive semantic segmentation demonstrates the effectiveness of adversarial learning (AL) in unsupervised domain adaptation.
1 code implementation • CVPR 2021 • Jiaxing Huang, Dayan Guan, Aoran Xiao, Shijian Lu
It has been studied widely by domain randomization that transfers source images to different styles in spatial space for learning domain-agnostic features.
1 code implementation • CVPR 2021 • Jiaxing Huang, Dayan Guan, Aoran Xiao, Shijian Lu
The inter-task regularization exploits the complementary nature of instance segmentation and semantic segmentation and uses it as a constraint for better feature alignment across domains.
Ranked #3 on Domain Adaptation on Panoptic SYNTHIA-to-Mapillary
1 code implementation • 1 Mar 2021 • Aoran Xiao, Xiaofei Yang, Shijian Lu, Dayan Guan, Jiaxing Huang
Specifically, we design a residual dense block with multiple receptive fields as a building block in the encoder which preserves detailed information in each modality and learns hierarchical modality-specific and fused features effectively.
Ranked #25 on 3D Semantic Segmentation on SemanticKITTI
3 code implementations • 27 Feb 2021 • Dayan Guan, Jiaxing Huang, Aoran Xiao, Shijian Lu, Yanpeng Cao
Specifically, we design an uncertainty metric that assesses the alignment of each sample and adjusts the strength of adversarial learning for well-aligned and poorly-aligned samples adaptively.
1 code implementation • ECCV 2020 • Jiaxing Huang, Shijian Lu, Dayan Guan, Xiaobing Zhang
Recent advances in unsupervised domain adaptation for semantic segmentation have shown great potential to relieve the demand for expensive per-pixel annotations.
no code implementations • 12 May 2019 • Fangneng Zhan, Jiaxing Huang, Shijian Lu
Despite the rapid progress of generative adversarial networks (GANs) in image synthesis in recent years, existing image synthesis approaches work in either the geometry domain or the appearance domain alone, which often introduces various synthesis artifacts.