no code implementations • EMNLP 2020 • Li Kong, Chuanyi Li, Jidong Ge, Bin Luo, Vincent Ng
While hyperbole is one of the most prevalent rhetorical devices, it is arguably one of the least studied devices in the figurative language processing community.
no code implementations • Findings (EMNLP) 2021 • Yi Feng, Ting Wang, Chuanyi Li, Vincent Ng, Jidong Ge, Bin Luo, Yucheng Hu, Xiaopeng Zhang
User targeting is an essential task in the modern advertising industry: given a package of ads for a particular category of products (e.g., green tea), identify the online users to whom the ad package should be targeted.
1 code implementation • 17 Apr 2025 • Wentao Wu, Xiao Wang, Chenglong Li, Bo Jiang, Jin Tang, Bin Luo, Qi Liu
Event cameras have attracted increasing attention in recent years due to their advantages in high dynamic range, high temporal resolution, low power consumption, and low latency.
1 code implementation • 23 Mar 2025 • Zeng-Hui Zhu, Wei Lu, Si-Bao Chen, Chris H. Q. Ding, Jin Tang, Bin Luo
To address this, we introduce Real-World Remote Sensing Hazy Image Dataset (RRSHID), the first large-scale dataset featuring real-world hazy and dehazed image pairs across diverse atmospheric conditions.
1 code implementation • 18 Mar 2025 • Wei Lu, Si-Bao Chen, Hui-Dong Li, Qing-Ling Shu, Chris H. Q. Ding, Jin Tang, Bin Luo
Remote sensing object detection (RSOD) faces formidable challenges in complex visual environments.
Ranked #2 on
Object Detection
on VisDrone-DET2019
no code implementations • 14 Mar 2025 • Andong Lu, Mai Wen, Jinhu Wang, Yuanzhi Guo, Chenglong Li, Jin Tang, Bin Luo
Although quad-modal data provides richer information, the differences in information quantity among modalities and the computational burden of processing four modalities are two challenging issues in fusing them.
no code implementations • 14 Mar 2025 • Andong Lu, Yuanzhi Guo, Wanyu Wang, Chenglong Li, Jin Tang, Bin Luo
To break shallow limits, we propose a novel Task-driven Pixel-level Fusion network, named TPF, which unveils the power of pixel-level fusion in RGBT tracking through a progressive learning framework.
Ranked #2 on
Rgb-T Tracking
on GTOT
no code implementations • 10 Mar 2025 • Wentao Wu, Chenglong Li, Xiao Wang, Bin Luo, Qi Liu
To address this problem, we propose a Large Language Model (LLM) guided Progressive feature Alignment Network called LPANet, which leverages the semantic features extracted from a large language model to guide the progressive semantic and spatial alignment between modalities for multimodal UAV object detection.
1 code implementation • 9 Mar 2025 • Xiao Wang, Yuehang Li, Fuling Wang, Bo Jiang, YaoWei Wang, Yonghong Tian, Jin Tang, Bin Luo
Accurate sign language understanding serves as a crucial communication channel for individuals with disabilities.
1 code implementation • 25 Feb 2025 • Han Nie, Bin Luo, Jun Liu, Zhitao Fu, Huan Zhou, Shuo Zhang, Weixing Liu
Therefore, effectively leveraging foundation models to improve the generalization of optical-SAR image matching remains a challenge.
1 code implementation • 17 Jan 2025 • Wei Lu, Si-Bao Chen, Chris H. Q. Ding, Jin Tang, Bin Luo
This article introduces LWGANet, a specialized lightweight backbone network tailored for RS visual tasks, incorporating a novel lightweight group attention (LWGA) module designed to address these specific challenges.
Ranked #1 on
Change Detection
on WHU-CD
1 code implementation • 19 Dec 2024 • Kunpeng Wang, Keke Chen, Chenglong Li, Zhengzheng Tu, Bin Luo
Alignment-free RGB-Thermal (RGB-T) salient object detection (SOD) aims to achieve robust performance in complex scenes by directly leveraging the complementary information from unaligned visible-thermal image pairs, without requiring manual alignment.
no code implementations • 18 Nov 2024 • Junwen He, Yifan Wang, Lijun Wang, Huchuan Lu, Jun-Yan He, Chenyang Li, Hanyuan Chen, Jin-Peng Lan, Bin Luo, Yifeng Geng
Text logo design heavily relies on the creativity and expertise of professional designers, in which arranging element layouts is one of the most important procedures.
no code implementations • 29 Oct 2024 • Bo Jiang, Hao Wu, Beibei Wang, Jin Tang, Bin Luo
To address this issue, we propose exploiting sparse representation theory for graph prompting and present Graph Sparse Prompting (GSP).
1 code implementation • 15 Oct 2024 • Andong Lu, Jiacong Zhao, Chenglong Li, Yun Xiao, Bin Luo
To handle this issue, we take original RGB and TIR networks as the teachers, and distill their content knowledge into two student networks respectively by the style-content orthogonal feature decoupling scheme.
Ranked #3 on
Rgb-T Tracking
on RGBT210
no code implementations • 12 Oct 2024 • Yi Xiao, Bin Luo, Jun Liu, Xin Su, Wei Wang
However, existing CD methods still struggle to address pseudo changes resulting from domain information differences in multi-temporal images and instances of detail errors caused by the loss and contamination of detail features during the upsampling process in the network.
1 code implementation • 27 Aug 2024 • Kunpeng Wang, Danying Lin, Chenglong Li, Zhengzheng Tu, Bin Luo
Then, we feed the extracted multi-modal semantic features into both the SAM image encoder and mask decoder for fine-tuning and prompting, respectively.
1 code implementation • IEEE Transactions on Image Processing 2024 • Jiahui Wang, Qin Xu, Bo Jiang, Bin Luo, Jinhui Tang
In this paper, we propose a novel Multi-Granularity Part Sampling Attention (MPSA) network for fine-grained visual classification.
Ranked #2 on
Fine-Grained Image Classification
on Stanford Dogs
no code implementations • 16 Aug 2024 • Andong Lu, Wanyu Wang, Chenglong Li, Jin Tang, Bin Luo
Existing RGBT tracking methods often design various interaction models to perform cross-modal fusion at each layer, but cannot perform feature interactions among all layers, which play a critical role in robust multimodal representation, due to the large computational burden.
Ranked #6 on
Rgb-T Tracking
on LasHeR
1 code implementation • 16 Jul 2024 • Han Nie, Bin Luo, Jun Liu, Zhitao Fu, Weixing Liu, Xin Su
In this paper, we demonstrate that our REMM, which includes a multimodal feature learning module and a cyclic shift module, is very useful for multimodal image matching.
no code implementations • 1 Jul 2024 • Chunrong Fang, Weisong Sun, Yuchen Chen, Xiao Chen, Zhao Wei, Quanjun Zhang, Yudu You, Bin Luo, Yang Liu, Zhenyu Chen
Recently, large-scale pre-trained models for source code are equipped with encoders capable of producing general context vectors and have achieved substantial improvements on code summarization.
no code implementations • 28 Jun 2024 • Jun-Yan He, Zhi-Qi Cheng, Chenyang Li, Jingdong Sun, Qi He, Wangmeng Xiang, Hanyuan Chen, Jin-Peng Lan, Xianhui Lin, Kang Zhu, Bin Luo, Yifeng Geng, Xuansong Xie, Alexander G. Hauptmann
MetaDesigner introduces a transformative framework for artistic typography synthesis, powered by Large Language Models (LLMs) and grounded in a user-centric design paradigm.
no code implementations • 21 Jun 2024 • Bo Jiang, Sheng Ge, Ziyan Zhang, Beibei Wang, Jin Tang, Bin Luo
However, existing Graph Convolution (GC) operators are mainly defined on the adjacency matrix and node features and generally focus on obtaining effective node embeddings, so they cannot be used to handle graphs with (high-dimensional) edge features.
no code implementations • 7 Jun 2024 • Jianbo Dong, Bin Luo, Jun Zhang, Pengcheng Zhang, Fei Feng, Yikai Zhu, Ang Liu, Zian Chen, Yi Shi, Hairong Jiao, Gang Lu, Yu Guan, Ennan Zhai, Wencong Xiao, Hanyu Zhao, Man Yuan, Siran Yang, Xiang Li, Jiamang Wang, Rui Men, Jianwei Zhang, Huang Zhong, Dennis Cai, Yuan Xie, Binzhang Fu
By leveraging this feature, C4 can rapidly identify the faulty components, swiftly isolate the anomaly, and restart the task, thereby avoiding resource wastage caused by delays in anomaly detection.
1 code implementation • 3 Jun 2024 • Kunpeng Wang, Zhengzheng Tu, Chenglong Li, Cheng Zhang, Bin Luo
To adaptively select the appropriate fusion scheme for multi-modal input, we introduce an adaptive ensemble module that forms the adaptive fusion bank, which is embedded into hierarchical layers for sufficient fusion of different source data.
1 code implementation • 3 Jun 2024 • Kunpeng Wang, Danying Lin, Chenglong Li, Zhengzheng Tu, Bin Luo
In this paper, we make the first attempt to address RGBT SOD for initially captured RGB and thermal image pairs without manual alignment.
no code implementations • 30 May 2024 • Haitao Cao, Baoping Cheng, Qiran Pu, Haocheng Zhang, Bin Luo, Yixiang Zhuang, Juncong Lin, Liyan Chen, Xuan Cheng
Parametric 3D models have enabled a wide variety of computer vision and graphics tasks, such as modeling human faces, bodies and hands.
no code implementations • 28 May 2024 • Lianlei Shan, Weiqiang Wang, Ke Lv, Bin Luo
Due to the gap between aerial and natural images, the previous AL methods are not ideal, mainly caused by unreasonable labeling units and the neglect of class imbalance.
1 code implementation • 4 May 2024 • Andong Lu, Wanyu Wang, Chenglong Li, Jin Tang, Bin Luo
In particular, we design a fusion structure space based on the hierarchical attention network, each attention-based fusion unit corresponding to a fusion operation and a combination of these attention units corresponding to a fusion structure.
Ranked #7 on
Rgb-T Tracking
on RGBT210
1 code implementation • CVPR 2024 • Runmin Dong, Shuai Yuan, Bin Luo, Mengxuan Chen, Jinxiao Zhang, Lixian Zhang, Weijia Li, Juepeng Zheng, Haohuan Fu
Specifically, we inject the priors into the denoising model to improve the utilization of reference information in unchanged areas and regulate the reconstruction of semantically relevant content in changed areas.
no code implementations • 26 Mar 2024 • Jiawen Zhu, Xin Chen, Haiwen Diao, Shuai Li, Jun-Yan He, Chenyang Li, Bin Luo, Dong Wang, Huchuan Lu
For instance, DyTrack obtains 64.9% AUC on LaSOT with a speed of 256 fps.
4 code implementations • 9 Mar 2024 • Xiao Wang, Ju Huang, Shiao Wang, Chuanming Tang, Bo Jiang, Yonghong Tian, Jin Tang, Bin Luo
Current event-/frame-event based trackers are evaluated on short-term tracking datasets; however, real-world scenarios involve long-term tracking, and the performance of existing tracking algorithms in these scenarios remains unclear.
1 code implementation • CVPR 2024 • Junwen He, Yifan Wang, Lijun Wang, Huchuan Lu, Jun-Yan He, Jin-Peng Lan, Bin Luo, Xuansong Xie
Multimodal Large Language Models (MLLMs) leverage Large Language Models as a cognitive framework for diverse visual-language tasks.
no code implementations • 31 Jan 2024 • Weixing Liu, Jun Liu, Xin Su, Han Nie, Bin Luo
To address this challenge, we propose a practical source-free object detection (SFOD) setting for RS images, which aims to perform target domain adaptation using only the source pre-trained model.
no code implementations • 8 Jan 2024 • Ziyan Zhang, Bo Jiang, Jin Tang, Bin Luo
Based on the proposed GMA, we then propose a unified graph contrastive learning, termed Graph Message Contrastive Learning (GMCL), that employs attribution-guided universal GMA for graph contrastive learning.
no code implementations • 3 Jan 2024 • Dengdi Sun, Yajie Pan, Andong Lu, Chenglong Li, Bin Luo
We introduce independent dynamic template tokens to interact with the search region, embedding temporal information to address appearance changes, while also retaining the involvement of the initial static template tokens in the joint feature extraction process to ensure the preservation of the original reliable target appearance information, which prevents deviations from the target appearance caused by traditional temporal updates.
Ranked #15 on
Rgb-T Tracking
on RGBT210
no code implementations • 3 Jan 2024 • Jun-Yan He, Zhi-Qi Cheng, Chenyang Li, Jingdong Sun, Wangmeng Xiang, Yusen Hu, Xianhui Lin, Xiaoyang Kang, Zengke Jin, Bin Luo, Yifeng Geng, Xuansong Xie, Jingren Zhou
This paper introduces the WordArt Designer API, a novel framework for user-driven artistic typography synthesis utilizing Large Language Models (LLMs) on ModelScope.
1 code implementation • 29 Dec 2023 • Jiawen Zhu, Zhi-Qi Cheng, Jun-Yan He, Chenyang Li, Bin Luo, Huchuan Lu, Yifeng Geng, Xuansong Xie
The perception component then generates the tracking results based on the embeddings.
Ranked #5 on
Referring Video Object Segmentation
on ReVOS
1 code implementation • 25 Dec 2023 • Andong Lu, Jiacong Zhao, Chenglong Li, Jin Tang, Bin Luo
To address this challenge, we propose a novel invertible prompt learning approach, which integrates the content-preserving prompts into a well-trained tracking model to adapt to various modality-missing scenarios, for robust RGBT tracking.
1 code implementation • 25 Dec 2023 • Andong Lu, Chenglong Li, Tianrui Zha, Jin Tang, XiaoFeng Wang, Bin Luo
Prevalent nighttime person re-identification (ReID) methods typically combine image relighting and ReID networks in a sequential manner.
1 code implementation • 25 Dec 2023 • Wentao Zou, Qi Li, Jidong Ge, Chuanyi Li, Xiaoyu Shen, LiGuo Huang, Bin Luo
We hope that our findings can provide a deeper understanding of PEFT methods on various PTMs and SE downstream tasks.
1 code implementation • 30 Nov 2023 • Dong Li, Jiandong Jin, Yuhao Zhang, Yanlin Zhong, Yaoyang Wu, Lan Chen, Xiao Wang, Bin Luo
Current methods typically employ backbone networks to individually extract the features of RGB frames and event streams, and subsequently fuse these features for pattern recognition.
1 code implementation • 28 Nov 2023 • Kunpeng Wang, Chenglong Li, Zhengzheng Tu, Zhengyi Liu, Bin Luo
Existing single-modal and multi-modal salient object detection (SOD) methods focus on designing specific architectures tailored for their respective tasks.
no code implementations • 28 Nov 2023 • Jiahui Wang, Qin Xu, Bo Jiang, Bin Luo
Label propagation methods try to propagate the labels of support samples on the constructed graph encoding the relationships between both support and query samples.
no code implementations • 20 Oct 2023 • Jun-Yan He, Zhi-Qi Cheng, Chenyang Li, Jingdong Sun, Wangmeng Xiang, Xianhui Lin, Xiaoyang Kang, Zengke Jin, Yusen Hu, Bin Luo, Yifeng Geng, Xuansong Xie, Jingren Zhou
This paper introduces WordArt Designer, a user-driven framework for artistic typography synthesis, relying on the Large Language Model (LLM).
1 code implementation • 17 Oct 2023 • Bo Jiang, Zitian Wang, Xixi Wang, Ziyan Zhang, Lan Chen, Xiao Wang, Bin Luo
Then, each pixel of the feature map is regarded as a graph node, and a graph neural network is proposed to model the structured information for coarse change map prediction.
1 code implementation • 19 Sep 2023 • Jiawen Zhu, Huayi Tang, Zhi-Qi Cheng, Jun-Yan He, Bin Luo, Shihao Qiu, Shengming Li, Huchuan Lu
To address this, we propose a novel architecture called Darkness Clue-Prompted Tracking (DCPT) that achieves robust UAV tracking at night by efficiently learning to generate darkness clue prompts.
1 code implementation • 4 Sep 2023 • Hanbing Liu, Wangmeng Xiang, Jun-Yan He, Zhi-Qi Cheng, Bin Luo, Yifeng Geng, Xuansong Xie
Accurately estimating the 3D pose of humans in video sequences requires both accuracy and a well-structured architecture.
Ranked #10 on
3D Human Pose Estimation
on HumanEva-I
1 code implementation • 18 Aug 2023 • Hanbing Liu, Jun-Yan He, Zhi-Qi Cheng, Wangmeng Xiang, Qize Yang, Wenhao Chai, Gaoang Wang, Xu Bao, Bin Luo, Yifeng Geng, Xuansong Xie
Typically, PoSynDA uses a diffusion-inspired structure to simulate 3D pose distribution in the target domain.
1 code implementation • ICCV 2023 • Junwen He, Yifan Wang, Lijun Wang, Huchuan Lu, Jun-Yan He, Jin-Peng Lan, Bin Luo, Yifeng Geng, Xuansong Xie
Our method sets the new state of the art for depth-aware panoptic segmentation on both Cityscapes-DVPS and SemKITTI-DVPS datasets.
1 code implementation • 26 Jul 2023 • Jiawen Zhu, Zhenyu Chen, Zeqi Hao, Shijie Chang, Lu Zhang, Dong Wang, Huchuan Lu, Bin Luo, Jun-Yan He, Jin-Peng Lan, Hanyuan Chen, Chenyang Li
To further improve the quality of tracking masks, a pretrained MR model is employed to refine the tracking results.
Ranked #5 on
Semi-Supervised Video Object Segmentation
on YouTube-VOS 2019
(using extra training data)
1 code implementation • 19 Jul 2023 • Leilei Ma, Dengdi Sun, Lei Wang, Haifeng Zhao, Bin Luo
Specifically, we leverage semantic-aware representation learning to extract category-related local discriminative features and construct category prototypes.
Ranked #1 on
Multi-Label Learning
on COCO 2014
1 code implementation • 1 Jul 2023 • Bin Luo, Susan Halabi
To overcome this limitation, we propose a framework of sparse-input neural networks using group concave regularization for feature selection in both low-dimensional and high-dimensional settings.
1 code implementation • 28 Jun 2023 • Wei-Bin Kou, Shuai Wang, Guangxu Zhu, Bin Luo, Yingxian Chen, Derrick Wing Kwan Ng, Yik-Chung Wu
While federated learning (FL) improves the generalization of end-to-end autonomous driving by model aggregation, the conventional single-hop FL (SFL) suffers from a slow convergence rate due to long-range communications among vehicles and the cloud server.
1 code implementation • 27 May 2023 • Weisong Sun, Yuchen Chen, Guanhong Tao, Chunrong Fang, Xiangyu Zhang, Quanjun Zhang, Bin Luo
Neural code search models are hence behind many such engines.
1 code implementation • 25 May 2023 • Xu Bao, Zhi-Qi Cheng, Jun-Yan He, Chenyang Li, Wangmeng Xiang, Jingdong Sun, Hanbing Liu, Wei Liu, Bin Luo, Yifeng Geng, Xuansong Xie
By spearheading the integration of Multilateration with facial analysis, KeyPosS marks a paradigm shift in facial landmark detection.
1 code implementation • 19 May 2023 • Yuxuan Zhou, Zhi-Qi Cheng, Jun-Yan He, Bin Luo, Yifeng Geng, Xuansong Xie
As a remedy, we propose a threefold strategy: (1) We forge an innovative pathway that encodes bone connectivity by harnessing the power of graph distances.
no code implementations • 8 May 2023 • Zhaoxia Yin, Shaowei Zhu, Hang Su, Jianteng Peng, Wanli Lyu, Bin Luo
However, numerous studies have shown that previous methods build detection or defense mechanisms against specific attacks, which renders them ineffective against the latest unknown attack methods.
1 code implementation • 30 Mar 2023 • Jun-Yan He, Zhi-Qi Cheng, Chenyang Li, Wangmeng Xiang, Binghui Chen, Bin Luo, Yifeng Geng, Xuansong Xie
Real-time perception, or streaming perception, is a crucial aspect of autonomous driving that has yet to be thoroughly explored in existing research.
no code implementations • 23 Mar 2023 • Yuan Zhang, Chuanyi Li, Yu Sheng, Jidong Ge, Bin Luo
It is a difficult but necessary task to extract the key information for the cases from these textual materials and to clarify the dispute focus of related parties.
1 code implementation • CVPR 2023 • Hui Lv, Zhongqi Yue, Qianru Sun, Bin Luo, Zhen Cui, Hanwang Zhang
At each MIL training iteration, we use the current detector to divide the samples into two groups with different context biases: the most confident abnormal/normal snippets and the rest ambiguous ones.
1 code implementation • IEEE Transactions on Multimedia 2023 • Qin Xu, Jiahui Wang, Bo Jiang, Bin Luo
The proposed IELT involves three main modules: multi-head voting (MHV) module, cross-layer refinement (CLR) module, and dynamic selection (DS) module.
no code implementations • 8 Feb 2023 • Changan Niu, Chuanyi Li, Vincent Ng, Bin Luo
Despite the recent advances showing that a model pre-trained on large-scale source code data is able to gain appreciable generalization capability, it still requires a sizeable amount of data on the target task for fine-tuning.
1 code implementation • 3 Feb 2023 • Hanyuan Chen, Jun-Yan He, Wangmeng Xiang, Zhi-Qi Cheng, Wei Liu, Hanbing Liu, Bin Luo, Yifeng Geng, Xuansong Xie
Human pose estimation is a challenging task due to its structured data sequence nature.
Ranked #89 on
3D Human Pose Estimation
on Human3.6M
1 code implementation • 8 Jan 2023 • Jidong Ge, Yuxiang Liu, Jie Gui, Lanting Fang, Ming Lin, James Tin-Yau Kwok, LiGuo Huang, Bin Luo
However, the relation between these two losses is not clear.
no code implementations • 16 Dec 2022 • Shaowei Zhu, Wanli Lyu, Bin Li, Zhaoxia Yin, Bin Luo
In addition, the proposed method does not modify any task model, so it can be used as a preprocessing module, which significantly reduces the deployment cost in practical applications.
no code implementations • 19 Nov 2022 • Xixi Wang, Bo Jiang, Xiao Wang, Bin Luo
(1) It employs a flexible graph model, termed Batch Graph, to jointly encode the visual and semantic relationships of samples within each mini-batch.
no code implementations • 17 Nov 2022 • Wisal Khan, Muhammad Turab, Waqas Ahmad, Syed Hasnat Ahmad, Kelash Kumar, Bin Luo
Similarly, in AE-based DDR, we compare the accuracy and running time of unsupervised learning algorithms before and after AE representation learning.
4 code implementations • 27 Oct 2022 • Jin-Peng Lan, Zhi-Qi Cheng, Jun-Yan He, Chenyang Li, Bin Luo, Xu Bao, Wangmeng Xiang, Yifeng Geng, Xuansong Xie
Existing Visual Object Tracking (VOT) only takes the target area in the first frame as a template.
Ranked #1 on
Video Object Tracking
on NT-VOT211
2 code implementations • 27 Oct 2022 • Chenyang Li, Zhi-Qi Cheng, Jun-Yan He, Pengyu Li, Bin Luo, Hanyuan Chen, Yifeng Geng, Jin-Peng Lan, Xuansong Xie
Streaming perception is a critical task in autonomous driving that requires balancing the latency and accuracy of the autopilot system.
2 code implementations • 12 Oct 2022 • Qiming Peng, Yinxu Pan, Wenjin Wang, Bin Luo, Zhenyu Zhang, Zhengjie Huang, Teng Hu, Weichong Yin, Yongfeng Chen, Yin Zhang, Shikun Feng, Yu Sun, Hao Tian, Hua Wu, Haifeng Wang
Recent years have witnessed the rise and success of pre-training techniques in visually-rich document understanding.
Ranked #2 on
Semantic entity labeling
on FUNSD
no code implementations • 9 Oct 2022 • Wenlin Bai, Peixuan Li, Xihua Zou, Ningyuan Zhong, Wei Pan, Lianshan Yan, Bin Luo
Then the self-coherent detection, as a simple and low-cost means, is accordingly facilitated for both de-chirping of MMW radar and frequency down-conversion reception of MMW communication, which circumvents the costly high-speed mixers along with MMW local oscillators and more significantly achieves the real-time decomposition of radar and communication information.
no code implementations • 18 Sep 2022 • Wenjin Wang, Zhengjie Huang, Bin Luo, Qianglong Chen, Qiming Peng, Yinxu Pan, Weichong Yin, Shikun Feng, Yu Sun, Dianhai Yu, Yin Zhang
At first, a document graph is proposed to model complex relationships among multi-grained multimodal elements, in which salient visual regions are detected by a cluster-based method.
no code implementations • 14 Sep 2022 • Wisal Khan, Teerath Kumar, Zhang Cheng, Kislay Raj, Arunabha M Roy, Bin Luo
Also, the mixed model is followed by each category of NoSQL Databases.
no code implementations • 26 Aug 2022 • Xixi Wang, Xiao Wang, Bo Jiang, Bin Luo
sampleFormer aims to capture the dependence of samples in support and query sets for image representation.
1 code implementation • 15 Jun 2022 • Weisong Sun, Chunrong Fang, Yuchen Chen, Quanjun Zhang, Guanhong Tao, Tingxu Han, Yifei Ge, Yudu You, Bin Luo
The extractive module in the framework performs a task of extractive code summarization, which takes in the code snippet and predicts important statements containing key factual details.
no code implementations • 24 May 2022 • Changan Niu, Chuanyi Li, Bin Luo, Vincent Ng
In particular, the development and use of pre-trained models of source code has enabled state-of-the-art results to be achieved on a wide variety of SE tasks.
1 code implementation • 19 May 2022 • Xiao Wang, Zhe Chen, Bo Jiang, Jin Tang, Bin Luo, DaCheng Tao
To track the target in a video, current visual trackers usually adopt greedy search for target object localization in each frame, that is, the candidate region with the maximum response score will be selected as the tracking result of each frame.
no code implementations • 10 May 2022 • Jingxiao Liu, Siyuan Yuan, Bin Luo, Biondo Biondi, Hae Young Noh
Bridge Health Monitoring (BHM) enables early damage detection of bridges and is thus critical for avoiding more severe damages that might result in major financial and human losses.
no code implementations • 26 Apr 2022 • Ziyan Zhang, Bo Jiang, Bin Luo
Graph Convolutional Networks (GCNs) have widely demonstrated their powerful ability in graph data representation and learning.
no code implementations • 22 Feb 2022 • Qingyu Wang, Guorui Feng, Zhaoxia Yin, Bin Luo
Firstly, the former is used to generate the UAP, which can learn the distribution of perturbations better, and then the latter is used to find the sensitive regions concerned by the RSI classification model.
no code implementations • 22 Feb 2022 • Wenkang Zhong, Chuanyi Li, Jidong Ge, Bin Luo
Automated Program Repair (APR) aims to automatically fix bugs in the source code.
1 code implementation • 11 Feb 2022 • Yabin Zhu, Chenglong Li, Yao Liu, Xiao Wang, Jin Tang, Bin Luo, Zhixiang Huang
Tiny objects, frequently appearing in practical applications, have weak appearance and features, and receive increasing interest in many vision tasks, such as object detection and segmentation.
1 code implementation • 13 Dec 2021 • Yunyun Huang, Xiaoyu Shen, Chuanyi Li, Jidong Ge, Bin Luo
Given the facts of a case, Legal Judgment Prediction (LJP) involves a series of sub-tasks such as predicting violated law articles, charges, and terms of penalty.
no code implementations • 2 Dec 2021 • Ze Tang, Chuanyi Li, Jidong Ge, Xiaoyu Shen, Zheling Zhu, Bin Luo
Code summarization aims to generate brief natural language descriptions for source code.
1 code implementation • 2 Dec 2021 • Xixi Wang, Xiao Wang, Bo Jiang, Jin Tang, Bin Luo
In this work, we re-think Transformer and extend it to MutualFormer for multi-modality data representation.
no code implementations • MM 2021 • Bo Jiang, Pengfei Sun, Ziyan Zhang, Jin Tang, Bin Luo
Also, GAMnet exploits sparse GM optimization as a correspondence solver, which is differentiable and can also naturally incorporate discrete one-to-one matching constraints approximately in the final matching prediction.
Ranked #8 on
Graph Matching
on PASCAL VOC
(matching accuracy metric)
no code implementations • 29 Sep 2021 • Bo Jiang, Ziyan Zhang, Bin Luo
Given an input graph $\textbf{A}$, LatGCR aims to generate a flexible latent graph $\tilde{\textbf{A}}$ for graph convolutional representation which obviously enhances the representation capacity and also performs robustly w.r.t. graph structural attacks and noises.
no code implementations • 4 Sep 2021 • Chenjie Wang, Chengyuan Li, Bin Luo, Wei Wang, Jun Liu
Then we extend SOLOV2 to capture temporal information in video to learn motion information, and propose a moving object instance segmentation network with RiWFPN called RiWNet.
1 code implementation • 9 Jun 2021 • Xiao Wang, Jin Tang, Bin Luo, YaoWei Wang, Yonghong Tian, Feng Wu
In this paper, we propose a novel and general target-aware attention mechanism (termed TANet) and integrate it with tracking-by-detection framework to conduct joint local and global search for robust tracking.
1 code implementation • 27 Apr 2021 • Chenglong Li, Wanlin Xue, Yaqing Jia, Zhichen Qu, Bin Luo, Jin Tang, Dengdi Sun
RGBT tracking receives a surge of interest in the computer vision community, but this research field lacks a large-scale and high-diversity benchmark dataset, which is essential for both the training of deep RGBT trackers and the comprehensive evaluation of RGBT tracking methods.
1 code implementation • 30 Mar 2021 • Xiao Wang, Zhe Chen, Jin Tang, Bin Luo, YaoWei Wang, Yonghong Tian, Feng Wu
In this paper, we propose to introduce more dynamics by devising a dynamic attention-guided multi-trajectory tracking strategy.
no code implementations • 19 Jan 2021 • Jie Wang, Zhaoxia Yin, Jin Tang, Jing Jiang, Bin Luo
The studies on black-box adversarial attacks have become increasingly prevalent due to the intractable acquisition of the structural knowledge of deep neural networks (DNNs).
no code implementations • 18 Dec 2020 • Chengyuan Li, Jun Liu, Hailong Hong, Wenju Mao, Chenjie Wang, Chudi Hu, Xin Su, Bin Luo
On the basis of this, a novel octave convolution-based semantic attention feature pyramid network (OcSaFPN) is proposed to get higher accuracy in object detection with noise.
no code implementations • 14 Nov 2020 • Andong Lu, Chenglong Li, Yuqing Yan, Jin Tang, Bin Luo
Specifically, we use the modified VGG-M as the generality adapter to extract the modality-shared target representations. To extract the modality-specific features while reducing the computational complexity, we design a modality adapter, which adds a small block to the generality adapter in each layer and each modality in a parallel manner.
Ranked #14 on
Rgb-T Tracking
on GTOT
no code implementations • 26 Jul 2020 • Chenjie Wang, Chengyuan Li, Bin Luo
Most scenes in practical applications are dynamic scenes containing moving objects, so accurately segmenting moving objects is crucial for many computer vision applications.
no code implementations • 9 Jun 2020 • Weixing Liu, Jun Liu, Bin Luo
Deep learning approaches require enough training samples to perform well, but it is a challenge to collect enough real training data and label them manually.
no code implementations • 17 Mar 2020 • Zhengzheng Tu, Chun Lin, Chenglong Li, Jin Tang, Bin Luo
Classifying the confusing samples in the course of RGBT tracking is a quite challenging problem, which has not yet been satisfactorily solved.
no code implementations • 10 Mar 2020 • Chenjie Wang, Bin Luo, Yun Zhang, Qing Zhao, Lu Yin, Wei Wang, Xin Su, Yajun Wang, Chengyuan Li
The only input of DymSLAM is stereo video, and its output includes a dense map of the static environment, 3D models of the moving objects, and the trajectories of the camera and the moving objects.
1 code implementation • 21 Dec 2019 • Bo Jiang, Zitai Zhou, Xiao Wang, Jin Tang, Bin Luo
Fusing complementary information of RGB and depth has been demonstrated to be effective for image salient object detection, which is known as the RGB-D salient object detection problem.
no code implementations • 18 Nov 2019 • Bo Jiang, Pengfei Sun, Jin Tang, Bin Luo
However, the matching graphs we feed to existing graph convolutional matching networks are generally fixed and independent of graph matching, which thus are not guaranteed to be optimal for the graph matching task.
Ranked #16 on
Graph Matching
on Willow Object Class
no code implementations • 15 Oct 2019 • Qing Zhao, Bin Luo, Yun Zhang
In this letter, a stereo-based multi-motion visual odometry method is proposed to acquire the poses of the robot and other moving objects.
no code implementations • 4 Sep 2019 • Bo Jiang, Beibei Wang, Jin Tang, Bin Luo
Graph Convolutional Networks (GCNs) have been shown to be very powerful for graph data representation and learning tasks.
no code implementations • 4 Sep 2019 • Bo Jiang, Leiling Wang, Jin Tang, Bin Luo
In particular, CaGAT conducts context-aware learning on both node feature representation and edge (weight) representation simultaneously and cooperatively in a unified manner which can boost their respective performance in network training.
no code implementations • 24 Aug 2019 • Joya Chen, Dong Liu, Bin Luo, Xuezheng Peng, Tong Xu, Enhong Chen
For a long time, object detectors have suffered from extreme imbalance between foregrounds and backgrounds.
no code implementations • 14 Aug 2019 • Bo Jiang, Leiling Wang, Jin Tang, Bin Luo
In this paper, we first re-interpret graph convolution operation in GCNs as a composition of feature propagation and (non-linear) transformation.
no code implementations • 7 Aug 2019 • Zhengzheng Tu, Yan Ma, Chenglong Li, Jin Tang, Bin Luo
To maintain the clear edge structure of salient objects, we propose a novel Edge-guided Non-local FCN (ENFNet) to perform edge guided feature learning for accurate salient object detection.
1 code implementation • 24 Jul 2019 • Chenglong Li, Wei Xia, Yan Yan, Bin Luo, Jin Tang
These advantages of thermal infrared cameras make it possible to segment semantic objects in both day and night.
no code implementations • 24 Jul 2019 • Yabin Zhu, Chenglong Li, Bin Luo, Jin Tang, Xiao Wang
In different modalities, we propose to prune the densely aggregated features of all modalities in a collaborative way.
no code implementations • 20 Jul 2019 • Bo Jiang, Xixi Wang, Bin Luo
Given a person image, PH-GCN first constructs a hierarchical graph to represent the pairwise relationships among different parts.
no code implementations • 22 May 2019 • Hongchao Li, Xianmin Lin, Aihua Zheng, Chenglong Li, Bin Luo, Ran He, Amir Hussain
In particular, our network is end-to-end trained and contains three subnetworks of deep features embedded by the corresponding attributes (i.e., camera view, vehicle type, and vehicle color).
no code implementations • 6 May 2019 • Xiao Wang, Ziliang Chen, Rui Yang, Bin Luo, Jin Tang
In this paper, we propose Hard Person Identity Mining (HPIM), which attempts to refine hard example mining to improve the exploration efficacy in person re-identification.
no code implementations • 26 Apr 2019 • Bo Jiang, Ziyan Zhang, Bin Luo
Given an input graph $\textbf{A}$, LatGCR aims to generate a flexible latent graph $\widetilde{\textbf{A}}$ for graph convolutional representation, which enhances the representation capacity and also performs robustly w.r.t. graph structural attacks and noise.
Ranked #28 on Node Classification on Cora
no code implementations • 22 Jan 2019 • Bo Jiang, Ziyan Zhang, Jin Tang, Bin Luo
In this paper, we propose a novel Multiple Graph Adversarial Learning (MGAL) framework for multi-graph representation and learning.
1 code implementation • 22 Jan 2019 • Xiao Wang, Shaofei Zheng, Rui Yang, Aihua Zheng, Zhe Chen, Jin Tang, Bin Luo
We also review some popular network architectures which have been widely applied in the deep learning community.
no code implementations • 27 Nov 2018 • Xiao Wang, Tao Sun, Rui Yang, Chenglong Li, Bin Luo, Jin Tang
In this paper, we propose an efficient quality-aware deep neural network to model the weight of data from each domain using deep reinforcement learning (DRL).
no code implementations • 25 Nov 2018 • Xiao Wang, Chenglong Li, Rui Yang, Tianzhu Zhang, Jin Tang, Bin Luo
To refine the states of the target and re-track the target when it returns to view after heavy occlusion or after moving out of view, we design a novel subnetwork to learn target-driven visual attention from the guidance of both visual and natural language cues.
no code implementations • 24 Nov 2018 • Yabin Zhu, Chenglong Li, Bin Luo, Jin Tang
This paper investigates how to perform robust visual tracking in adverse and challenging conditions using complementary visual and thermal infrared data (RGBT tracking).
no code implementations • CVPR 2018 • Xiao Wang, Chenglong Li, Bin Luo, Jin Tang
Based on the generated hard positive samples, we train a Siamese network for visual tracking and our experiments validate the effectiveness of the introduced algorithm.
no code implementations • 17 Apr 2018 • Bo Jiang, Doudou Lin, Bin Luo, Jin Tang
To address this problem, we propose a novel unified temporal coherence and graph optimized ranking model for weighted patch representation in visual tracking.
no code implementations • NeurIPS 2017 • Bo Jiang, Jin Tang, Chris Ding, Yihong Gong, Bin Luo
As a fundamental problem in computer vision, the graph matching problem can usually be formulated as a Quadratic Programming (QP) problem with doubly stochastic and discrete (integer) constraints.
no code implementations • CVPR 2017 • Bo Jiang, Jin Tang, Chris Ding, Bin Luo
There are three main contributions of the proposed method: (1) we propose a new graph matching relaxation model, called Binary Constraint Preserving Graph Matching (BPGM), which aims to better incorporate the discrete binary mapping constraints into the graph matching relaxation.
no code implementations • 23 May 2017 • Bo Jiang, Chris Ding, Bin Luo
One approach to dealing with noisy image data is to use data recovery techniques, which aim to recover the true uncorrupted signals from the observed noisy images.
1 code implementation • 11 Jan 2017 • Chenglong Li, Guizhao Wang, Yunpeng Ma, Aihua Zheng, Bin Luo, Jin Tang
In particular, we introduce a weight for each modality to describe its reliability, and integrate these weights into the graph-based manifold ranking algorithm to achieve adaptive fusion of different source data.