no code implementations • SemEval (NAACL) 2022 • Ze Chen, Kangxu Wang, Jiewen Zheng, Zijian Cai, Jiarong He, Jin Gao
This article describes the OPDAI submission to SemEval-2022 Task 11 on Chinese complex NER.
no code implementations • 2 Aug 2024 • Jin Gao, Lei Gan, Yuankai Li, Yixin Ye, Dequan Wang
Large multimodal models (LMMs) excel in adhering to human instructions.
1 code implementation • 19 Jul 2024 • Yunfei Zhang, Chao Liang, Jin Gao, Zhipeng Zhang, Weiming Hu, Stephen Maybank, Xue Zhou, Liang Li
Joint Detection and Embedding (JDE) trackers have demonstrated excellent performance in Multi-Object Tracking (MOT) tasks by embedding the Re-Identification (ReID) task into the detector, so that appearance features are extracted as an auxiliary task, achieving a balance between inference speed and tracking performance.
no code implementations • 16 Jul 2024 • Yanqin Jiang, Chaohui Yu, Chenjie Cao, Fan Wang, Weiming Hu, Jin Gao
The core idea is two-fold: 1) We propose a novel multi-view video diffusion model (MV-VDM) conditioned on multi-view renderings of the static 3D object, which is trained on our presented large-scale multi-view video dataset (MV-Video).
2 code implementations • ICCV 2021 • Chenxi Wang, Hao-Shu Fang, Minghao Gou, Hongjie Fang, Jin Gao, Cewu Lu
Experiments on a large-scale benchmark, GraspNet-1Billion, show that our method outperforms previous arts by a large margin (30+ AP) and achieves a high inference speed.
Ranked #3 on Robotic Grasping on GraspNet-1Billion
1 code implementation • 18 Apr 2024 • Jin Gao, Shubo Lin, Shaoru Wang, Yutong Kou, Zeming Li, Liang Li, Congxuan Zhang, Xiaoqin Zhang, Yizheng Wang, Weiming Hu
In this paper, we question whether the fine-tuning performance of extremely simple lightweight ViTs can also benefit from this pre-training paradigm, which has so far been studied considerably less than the well-established lightweight architecture design methodology.
1 code implementation • 11 Mar 2024 • Fudong Ge, Yiwei Zhang, Shuhan Shen, Yue Wang, Weiming Hu, Jin Gao
2) The lower layers of the pre-trained backbone from BEV generation are shared for visual and structural streams in VPR, facilitating the learning of fine-grained local features in the visual stream.
no code implementations • 17 Feb 2024 • Jin Gao, Hanyong Xu, Luc Dao
In this study, we develop a multiple-generative agent system to simulate community decision-making for the redevelopment of Kendall Square's Volpe building.
1 code implementation • 4 Jan 2024 • Yunkun Zhang, Jin Gao, Zheling Tan, Lingfeng Zhou, Kexin Ding, Mu Zhou, Shaoting Zhang, Dequan Wang
The advent of foundation models (FMs) as an emerging suite of AI techniques has struck a wave of opportunities in computational healthcare.
no code implementations • CVPR 2024 • Hanshi Wang, Zhipeng Zhang, Jin Gao, Weiming Hu
Our motivation stems from the observation that 1) existing symmetric teacher-student methods for semi-supervised 3D object detection are simple, but they impede distillation between teacher and student because they require an identical model structure and input data format.
1 code implementation • 18 Dec 2023 • Shihao Feng, Pengpeng Liang, Jin Gao, Erkang Cheng
Instead of performing correlation of the two branches at just one point in the network, in this paper, we present a multi-correlation Siamese Transformer network that has multiple stages and carries out feature correlation at the end of each stage based on sparse pillars.
no code implementations • 6 Nov 2023 • Yanqin Jiang, Li Zhang, Jin Gao, Weimin Hu, Yao Yao
This is achieved by leveraging the object-level 3D-aware image diffusion model as the primary supervision signal for training Dynamic Neural Radiance Fields (DyNeRF).
no code implementations • 25 Oct 2023 • Kejiang Qian, Lingjun Mao, Xin Liang, Yimin Ding, Jin Gao, Xinran Wei, Ziyi Guo, Jiajie Li
By integrating Multi-Agent Reinforcement Learning, our framework ensures that participatory urban planning decisions are more dynamic and adaptive to evolving community needs and provides a robust platform for automating complex real-world urban planning processes.
1 code implementation • NeurIPS 2023 • Yutong Kou, Jin Gao, Bing Li, Gang Wang, Weiming Hu, Yizheng Wang, Liang Li
To this end, we non-uniformly resize the cropped image to have a smaller input size while the resolution of the area where the target is more likely to appear is higher and vice versa.
2 code implementations • 27 Jul 2023 • Yunkun Zhang, Jin Gao, Mu Zhou, Xiaosong Wang, Yu Qiao, Shaoting Zhang, Dequan Wang
In this paper, we propose to Connect Image and Text Embeddings (CITE) to enhance pathological image classification.
no code implementations • 14 Feb 2023 • Qi Zhang, Zijian Yang, Yilun Huang, Ze Chen, Zijian Cai, Kangxu Wang, Jiewen Zheng, Jiarong He, Jin Gao
In this paper, we present our solution to the Multilingual Information Retrieval Across a Continuum of Languages (MIRACL) challenge of WSDM CUP 2023 (https://project-miracl.github.io/).
no code implementations • CVPR 2023 • Jin Gao, Jialing Zhang, Xihui Liu, Trevor Darrell, Evan Shelhamer, Dequan Wang
We update the target data instead, and project all test inputs toward the source domain with a generative diffusion model.
no code implementations • 8 Aug 2022 • Heng Cong, Rongyu Zhang, Jiarong He, Jin Gao
Face anti-spoofing is widely used in face recognition and has received increasing attention from industry and academia.
no code implementations • 8 Aug 2022 • Heng Cong, Lingzhi Fu, Rongyu Zhang, Yusheng Zhang, Hao Wang, Jiarong He, Jin Gao
In this work, we introduce Gradient Siamese Network (GSN) for image quality assessment.
no code implementations • 5 Aug 2022 • Qi Zhang, Zijian Yang, Yilun Huang, Ze Chen, Zijian Cai, Kangxu Wang, Jiewen Zheng, Jiarong He, Jin Gao
Our models are all first trained with cross-entropy loss to classify query-product pairs into the four ESCI categories; we then take a weighted sum of the four class probabilities to obtain the score used for ranking.
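The scoring step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the class order (exact, substitute, complement, irrelevant), the weight values in `ESCI_WEIGHTS`, and the helper names `softmax`, `ranking_score`, and `rank_products` are all assumptions made for the example.

```python
import math

# Hypothetical per-class weights in the order
# (exact, substitute, complement, irrelevant). These values are an
# assumption for illustration; the source does not state the weights.
ESCI_WEIGHTS = (1.0, 0.1, 0.01, 0.0)


def softmax(logits):
    """Turn 4-class logits into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ranking_score(logits, weights=ESCI_WEIGHTS):
    """Weighted sum of the class probabilities, used as a ranking score."""
    probs = softmax(logits)
    return sum(w * p for w, p in zip(weights, probs))


def rank_products(candidates):
    """Sort (product_id, logits) pairs by descending ranking score."""
    return sorted(candidates, key=lambda c: ranking_score(c[1]), reverse=True)


if __name__ == "__main__":
    # Toy logits for three candidate products of one query.
    candidates = [
        ("a", [0.0, 0.0, 0.0, 5.0]),  # classifier favours "irrelevant"
        ("b", [5.0, 0.0, 0.0, 0.0]),  # classifier favours "exact"
        ("c", [0.0, 5.0, 0.0, 0.0]),  # classifier favours "substitute"
    ]
    for product_id, logits in rank_products(candidates):
        print(product_id, round(ranking_score(logits), 4))
```

Collapsing the four probabilities into one scalar lets a classifier trained with cross-entropy be reused directly as a ranker, with the weights controlling how much partial matches count.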
no code implementations • 13 Jul 2022 • Shaoru Wang, Zeming Li, Jin Gao, Liang Li, Weiming Hu
However, when facing various resource budgets in real-world applications, pretraining multiple networks of various sizes one by one incurs a huge computational burden.
1 code implementation • 7 Jul 2022 • Jin Gao, Jialing Zhang, Xihui Liu, Trevor Darrell, Evan Shelhamer, Dequan Wang
We instead update the target data, by projecting all test inputs toward the source domain with a generative diffusion model.
1 code implementation • 30 Jun 2022 • Yanqin Jiang, Li Zhang, Zhenwei Miao, Xiatian Zhu, Jin Gao, Weiming Hu, Yu-Gang Jiang
3D object detection in autonomous driving aims to reason about "what" and "where" the objects of interest are in a 3D world.
Ranked #2 on Robust Camera Only 3D Object Detection on nuScenes-C
1 code implementation • 12 Jun 2022 • Shaoru Wang, Jin Gao, Bing Li, Weiming Hu
Experiments for both synthesized and real-world scenarios consistently demonstrate the effectiveness of our approach, e.g., our method increases the degraded performance of the FCOS detector from 33.6% AP to 35.6% AP on COCO.
2 code implementations • 28 May 2022 • Shaoru Wang, Jin Gao, Zeming Li, Xiaoqin Zhang, Weiming Hu
We also point out some defects of such pre-training, e.g., failing to benefit from large-scale pre-training data and showing inferior performance on data-insufficient downstream tasks.
1 code implementation • CVPR 2022 • Zongyang Ma, Guan Luo, Jin Gao, Liang Li, Yuxin Chen, Shaoru Wang, Congxuan Zhang, Weiming Hu
Open-vocabulary object detection aims to detect novel object categories beyond the training set.
Ranked #28 on Open Vocabulary Object Detection on MSCOCO
2 code implementations • CVPR 2020 • Jin Gao, Yan Lu, Xiaojuan Qi, Yutong Kou, Bing Li, Liang Li, Shan Yu, Weiming Hu
In this paper, we propose a simple yet effective recursive least-squares estimator-aided online learning approach for few-shot online adaptation without requiring offline training.
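For readers unfamiliar with the underlying estimator, the textbook recursive least-squares (RLS) update for a linear model can be sketched as below. This is only the classic algorithm under stated assumptions, not the paper's tracker-specific formulation: the function names `rls_update` and `rls_fit`, the `forgetting` factor, and the initial covariance scale `delta` are hypothetical choices for illustration.

```python
def rls_update(w, P, x, y, forgetting=1.0):
    """One textbook RLS step for a linear model y ~ w . x.

    w: current weights, P: current inverse-covariance estimate (n x n),
    x: new input vector, y: new scalar target.
    """
    n = len(x)
    # P @ x (P stays symmetric, so this also equals (x^T P)^T).
    Px = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]
    denom = forgetting + sum(x[i] * Px[i] for i in range(n))
    gain = [v / denom for v in Px]
    # Prediction error on the new sample, using the old weights.
    error = y - sum(w[i] * x[i] for i in range(n))
    w_new = [w[i] + gain[i] * error for i in range(n)]
    P_new = [
        [(P[i][j] - gain[i] * Px[j]) / forgetting for j in range(n)]
        for i in range(n)
    ]
    return w_new, P_new


def rls_fit(samples, dim, delta=1e6, forgetting=1.0):
    """Process (x, y) samples one at a time, never revisiting old data."""
    w = [0.0] * dim
    # A large initial P corresponds to a weak prior on the weights.
    P = [[delta if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    for x, y in samples:
        w, P = rls_update(w, P, x, y, forgetting)
    return w


if __name__ == "__main__":
    # Samples generated from y = 2*x1 + 3*x2.
    data = [([1.0, 0.0], 2.0), ([0.0, 1.0], 3.0),
            ([1.0, 1.0], 5.0), ([2.0, 1.0], 7.0)]
    print(rls_fit(data, dim=2))
```

Each step refines the estimate from a single new sample while retaining a summary of past samples in `P`, which is the property that makes RLS attractive for online adaptation.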
no code implementations • 6 May 2021 • Zhenbang Li, Yaya Shi, Jin Gao, Shaoru Wang, Bing Li, Pengpeng Liang, Weiming Hu
In this paper, we show the existence of universal perturbations that can enable the targeted attack, e.g., forcing a tracker to follow the ground-truth trajectory with specified offsets, to be video-agnostic and free from inference in a network.
no code implementations • ECCV 2018 • Mengdan Zhang, Qiang Wang, Junliang Xing, Jin Gao, Peixi Peng, Weiming Hu, Steve Maybank
Correlation filter-based trackers rely on a periodic assumption of the search sample to efficiently distinguish the target from the background.
2 code implementations • CVPR 2018 • Qiang Wang, Zhu Teng, Junliang Xing, Jin Gao, Weiming Hu, Stephen Maybank
The RASNet model reformulates the correlation filter within a Siamese tracking framework and introduces different kinds of attention mechanisms to adapt the model without updating it online.
Ranked #3 on Visual Object Tracking on OTB-2013
5 code implementations • 13 Apr 2017 • Qiang Wang, Jin Gao, Junliang Xing, Mengdan Zhang, Weiming Hu
In this work, we present an end-to-end lightweight network architecture, namely DCFNet, to learn the convolutional features and perform the correlation tracking process simultaneously.