no code implementations • EMNLP 2020 • Yijun Wang, Changzhi Sun, Yuanbin Wu, Junchi Yan, Peng Gao, Guotong Xie
In particular, a span encoder is trained to recover a random shuffling of the tokens in a span, and a span pair encoder is trained, with a contrastive loss, to distinguish positive pairs drawn from the same sentence from negative pairs drawn from different sentences.
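As a concrete illustration of the span-pair contrastive objective described above, here is a minimal sketch (not the authors' code) in which spans from the same sentence are treated as positives and spans from different sentences as negatives; `span_emb`, `sentence_id`, and the temperature are assumed names and settings.

```python
import torch
import torch.nn.functional as F

def span_contrastive_loss(span_emb, sentence_id, temperature=0.1):
    """span_emb: (N, d) span embeddings; sentence_id: (N,) id of the source sentence."""
    z = F.normalize(span_emb, dim=-1)                       # unit-norm span embeddings
    sim = z @ z.t() / temperature                           # pairwise similarities
    eye = torch.eye(len(z), dtype=torch.bool)
    sim = sim.masked_fill(eye, float("-inf"))               # a span is never its own positive
    pos = (sentence_id.unsqueeze(0) == sentence_id.unsqueeze(1)) & ~eye
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    per_span = -log_prob.masked_fill(~pos, 0.0).sum(1) / pos.sum(1).clamp(min=1)
    return per_span[pos.any(dim=1)].mean()

# Example: four spans, the first two from sentence 0, the last two from sentence 1.
print(span_contrastive_loss(torch.randn(4, 128), torch.tensor([0, 0, 1, 1])))
```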
1 code implementation • 25 May 2023 • Shilin Yan, Renrui Zhang, Ziyu Guo, Wenchao Chen, Wei zhang, Hongyang Li, Yu Qiao, Zhongjiang He, Peng Gao
In this paper, we propose MUTR, a Multi-modal Unified Temporal transformer for Referring video object segmentation.
Referring Video Object Segmentation
Semantic Segmentation
1 code implementation • 18 May 2023 • Siyuan Huang, Zhengkai Jiang, Hao Dong, Yu Qiao, Peng Gao, Hongsheng Li
This paper presents Instruct2Act, a framework that utilizes Large Language Models to map multi-modal instructions to sequential actions for robotic manipulation tasks.
2 code implementations • 16 May 2023 • Siyuan Huang, Bo Zhang, Botian Shi, Peng Gao, Yikang Li, Hongsheng Li
In this paper, different from previous 2D DG works, we focus on the 3D DG problem and propose a Single-dataset Unified Generalization (SUG) framework that only leverages a single source dataset to alleviate the unforeseen domain differences faced by a well-trained source model.
1 code implementation • 4 May 2023 • Renrui Zhang, Zhengkai Jiang, Ziyu Guo, Shilin Yan, Junting Pan, Hao Dong, Peng Gao, Hongsheng Li
Driven by large-data pre-training, Segment Anything Model (SAM) has been demonstrated as a powerful and promptable framework, revolutionizing the segmentation models.
Ranked #1 on Personalized Segmentation on PerSeg
2 code implementations • 28 Apr 2023 • Peng Gao, Jiaming Han, Renrui Zhang, Ziyi Lin, Shijie Geng, Aojun Zhou, Wei zhang, Pan Lu, Conghui He, Xiangyu Yue, Hongsheng Li, Yu Qiao
This strategy effectively alleviates the interference between the two tasks of image-text alignment and instruction following and achieves strong multi-modal reasoning with only a small-scale image-text and instruction dataset.
no code implementations • 26 Apr 2023 • Xiaorui Wang, Jun Wang, Xin Tang, Peng Gao, Rui Fang, Guotong Xie
Filter pruning is widely adopted to compress and accelerate the Convolutional Neural Networks (CNNs), but most previous works ignore the relationship between filters and channels in different layers.
no code implementations • 24 Apr 2023 • Xuexue Li, Wenhui Diao, Yongqiang Mao, Peng Gao, Xiuhua Mao, Xinming Li, Xian Sun
One guiding interaction takes place between the two task decoders to address the feature confusion problem, and an occlusion decoupling head (ODH) is proposed to replace the general detection head.
1 code implementation • 3 Apr 2023 • Xiangyang Zhu, Renrui Zhang, Bowei He, Aojun Zhou, Dong Wang, Bin Zhao, Peng Gao
The popularity of Contrastive Language-Image Pre-training (CLIP) has propelled its application to diverse downstream vision tasks.
no code implementations • CVPR 2023 • Sheng Xu, Yanjing Li, Mingbao Lin, Peng Gao, Guodong Guo, Jinhu Lu, Baochang Zhang
At the upper level, we introduce a new foreground-aware query matching scheme to effectively transfer the teacher information to distillation-desired features to minimize the conditional information entropy.
2 code implementations • 28 Mar 2023 • Renrui Zhang, Jiaming Han, Aojun Zhou, Xiangfei Hu, Shilin Yan, Pan Lu, Hongsheng Li, Peng Gao, Yu Qiao
We present LLaMA-Adapter, a lightweight adaption method to efficiently fine-tune LLaMA into an instruction-following model.
no code implementations • 20 Mar 2023 • Fan Cui, Liyong Guo, Quandong Wang, Peng Gao, Yujun Wang
To address those challenges, we explore representation learning for KWS via self-supervised contrastive learning and self-training with a pretrained model.
2 code implementations • 14 Mar 2023 • Renrui Zhang, Liuhui Wang, Ziyu Guo, Yali Wang, Peng Gao, Hongsheng Li, Jianbo Shi
We present a Non-parametric Network for 3D point cloud analysis, Point-NN, which consists of purely non-learnable components: farthest point sampling (FPS), k-nearest neighbors (k-NN), and pooling operations, with trigonometric functions.
Ranked #1 on Training-free 3D Part Segmentation on ShapeNet-Part
3D Point Cloud Classification
Training-free 3D Part Segmentation
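For the Point-NN description above, here is a rough, non-authoritative sketch of a fully non-learnable pipeline of that kind: trigonometric (sin/cos) embeddings of raw xyz coordinates, k-nearest-neighbour grouping, and parameter-free pooling; the frequency schedule, k, and the max-pooling choice are illustrative assumptions.

```python
import torch

def trig_embed(xyz: torch.Tensor, num_freqs: int = 6) -> torch.Tensor:
    """xyz: (N, 3) -> (N, 3 * 2 * num_freqs) sin/cos features, no learnable weights."""
    freqs = 2.0 ** torch.arange(num_freqs, dtype=xyz.dtype)         # geometric frequencies
    angles = xyz.unsqueeze(-1) * freqs                              # (N, 3, F)
    return torch.cat([angles.sin(), angles.cos()], dim=-1).flatten(1)

def knn_pool(xyz: torch.Tensor, feats: torch.Tensor, k: int = 16) -> torch.Tensor:
    """Aggregate each point's k nearest neighbours with max pooling (non-parametric)."""
    dist = torch.cdist(xyz, xyz)                                    # (N, N) pairwise distances
    idx = dist.topk(k, largest=False).indices                       # (N, k) neighbour indices
    return feats[idx].max(dim=1).values                             # (N, D) pooled features

pts = torch.rand(1024, 3)
feat = knn_pool(pts, trig_embed(pts))        # per-point non-learnable features
global_feat = feat.max(dim=0).values         # global descriptor via max pooling
print(global_feat.shape)
```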
1 code implementation • 9 Mar 2023 • Peng Gao, Renrui Zhang, Rongyao Fang, Ziyi Lin, Hongyang Li, Hongsheng Li, Qiao Yu
To alleviate this, previous methods simply replace the pixel reconstruction targets of the 75% masked tokens with features encoded by pre-trained image-image (DINO) or image-language (CLIP) contrastive models.
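A hedged illustration of that target replacement (not the paper's code): the student is trained to regress a frozen teacher's token features only at the masked positions; `student_tokens`, `teacher_tokens`, the cosine-distance loss, and the 75% ratio in the example are assumptions.

```python
import torch
import torch.nn.functional as F

def masked_feature_loss(student_tokens, teacher_tokens, mask):
    """student_tokens, teacher_tokens: (B, L, D); mask: (B, L) bool, True = masked."""
    pred = F.normalize(student_tokens[mask], dim=-1)     # student predictions at masked slots
    target = F.normalize(teacher_tokens[mask], dim=-1)   # frozen CLIP/DINO features as targets
    return (1 - (pred * target).sum(dim=-1)).mean()      # cosine-distance regression

B, L, D = 2, 196, 768
mask = torch.rand(B, L) < 0.75                           # mask ~75% of tokens
loss = masked_feature_loss(torch.randn(B, L, D), torch.randn(B, L, D), mask)
print(loss)
```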
2 code implementations • CVPR 2023 • Renrui Zhang, Xiangfei Hu, Bohao Li, Siyuan Huang, Hanqiu Deng, Hongsheng Li, Yu Qiao, Peng Gao
Our CaFo incorporates CLIP's language-contrastive knowledge, DINO's vision-contrastive knowledge, DALL-E's vision-generative knowledge, and GPT-3's language-generative knowledge.
1 code implementation • 2 Mar 2023 • Rongyao Fang, Peng Gao, Aojun Zhou, Yingjie Cai, Si Liu, Jifeng Dai, Hongsheng Li
The first method is One-to-many Matching via Data Augmentation (denoted as DataAug-DETR).
1 code implementation • 2 Feb 2023 • Sheng Xu, Yanjing Li, Teli Ma, Mingbao Lin, Hao Dong, Baochang Zhang, Peng Gao, Jinhu Lv
In this paper, we introduce a Resilient Binary Neural Network (ReBNN) to mitigate the frequent oscillation for better BNNs' training.
1 code implementation • CVPR 2023 • Renrui Zhang, Liuhui Wang, Yali Wang, Peng Gao, Hongsheng Li, Jianbo Shi
We present a Non-parametric Network for 3D point cloud analysis, Point-NN, which consists of purely non-learnable components: farthest point sampling (FPS), k-nearest neighbors (k-NN), and pooling operations, with trigonometric functions.
no code implementations • 19 Dec 2022 • Peng Gao, Feng Gao, Jian-Cheng Ni
Drug-Drug Interaction (DDI) prediction is an essential task in the molecular field.
2 code implementations • CVPR 2023 • Renrui Zhang, Liuhui Wang, Yu Qiao, Peng Gao, Hongsheng Li
Pre-training on massive image data has become the de-facto approach to learning robust 2D representations.
Ranked #1 on 3D Point Cloud Linear Classification on ModelNet40 (using extra training data)
3D Point Cloud Linear Classification
Few-Shot 3D Point Cloud Classification
2 code implementations • 21 Nov 2022 • Xiangyang Zhu, Renrui Zhang, Bowei He, Ziyao Zeng, Shanghang Zhang, Peng Gao
Contrastive Language-Image Pre-training (CLIP) has shown promising open-world performance on 2D image tasks, while its transferred capacity on 3D point clouds, i.e., PointCLIP, is still far from satisfactory.
Ranked #1 on Zero-Shot Transfer 3D Point Cloud Classification on ScanObjectNN (using extra training data)
1 code implementation • CVPR 2023 • Hongwei Xue, Peng Gao, Hongyang Li, Yu Qiao, Hao Sun, Houqiang Li, Jiebo Luo
However, unlike low-level features such as pixel values, we argue that the features extracted by powerful teacher models already encode rich semantic correlations across regions of an intact image. This raises a question: is reconstruction necessary in Masked Image Modeling (MIM) with a teacher model?
no code implementations • 10 Nov 2022 • Xinyu Yang, Haoyuan Liu, Ziyu Wang, Peng Gao
System auditing has emerged as a key approach for monitoring system call events and investigating sophisticated attacks.
no code implementations • 21 Oct 2022 • Jun Wang, Weixun Li, Changyu Hou, Xin Tang, Yixuan Qiao, Rui Fang, Pengyong Li, Peng Gao, Guotong Xie
Contrastive learning has emerged as a powerful tool for graph representation learning.
1 code implementation • 13 Oct 2022 • Yanjing Li, Sheng Xu, Baochang Zhang, Xianbin Cao, Peng Gao, Guodong Guo
Large pre-trained vision transformers (ViTs) have demonstrated remarkable performance on various visual tasks, but incur prohibitive computational and memory costs when deployed on resource-constrained devices.
1 code implementation • 7 Oct 2022 • Sheng Xu, Yanjing Li, Bohan Zeng, Teli Ma, Baochang Zhang, Xianbin Cao, Peng Gao, Jinhu Lv
This explains why existing KD methods are less effective for 1-bit detectors: there is a significant information discrepancy between the real-valued teacher and the 1-bit student.
no code implementations • 25 Sep 2022 • Renrui Zhang, Bohao Li, Wei zhang, Hao Dong, Hongsheng Li, Peng Gao, Yu Qiao
In this paper, we propose CoMo, a Collaboration of pre-trained Models that incorporates diverse prior knowledge from various pre-training paradigms for better few-shot learning.
2 code implementations • 4 Sep 2022 • Sheng Xu, Yanjing Li, Tiancheng Wang, Teli Ma, Baochang Zhang, Peng Gao, Yu Qiao, Jinhu Lv, Guodong Guo
To address this issue, Recurrent Bilinear Optimization is proposed to improve the learning process of BNNs (RBONNs) by associating the intrinsic bilinear variables in the backpropagation process.
2 code implementations • 6 Aug 2022 • Ziyi Lin, Shijie Geng, Renrui Zhang, Peng Gao, Gerard de Melo, Xiaogang Wang, Jifeng Dai, Yu Qiao, Hongsheng Li
Video recognition has been dominated by the end-to-end learning paradigm -- first initializing a video recognition model with weights of a pretrained image model and then conducting end-to-end training on videos.
Ranked #17 on Action Classification on Kinetics-400 (using extra training data)
2 code implementations • 19 Jul 2022 • Renrui Zhang, Zhang Wei, Rongyao Fang, Peng Gao, Kunchang Li, Jifeng Dai, Yu Qiao, Hongsheng Li
On top of that, the performance of Tip-Adapter can be further boosted to be state-of-the-art on ImageNet by fine-tuning the cache model for 10× fewer epochs than existing methods, which is both effective and efficient.
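As a sketch of the cache-model idea referenced above (hyper-parameters `alpha` and `beta`, feature sizes, and the blending rule are assumptions, not the paper's exact configuration): few-shot image features act as cache keys, their one-hot labels as cache values, and the cache logits are blended with CLIP's zero-shot logits. Making `keys` a trainable parameter corresponds to fine-tuning the cache model.

```python
import torch
import torch.nn.functional as F

def cache_model_logits(test_feat, keys, values, zero_shot_logits, alpha=1.0, beta=5.5):
    """test_feat: (B, D); keys: (NK, D) few-shot features; values: (NK, C) one-hot labels."""
    test_feat = F.normalize(test_feat, dim=-1)
    keys = F.normalize(keys, dim=-1)
    affinity = test_feat @ keys.t()                        # cosine similarity to every cached shot
    cache_logits = torch.exp(-beta * (1.0 - affinity)) @ values
    return zero_shot_logits + alpha * cache_logits

B, D, C, shots = 4, 512, 10, 16
keys = torch.randn(C * shots, D)                           # frozen CLIP features of the few shots
values = F.one_hot(torch.arange(C).repeat_interleave(shots), C).float()
print(cache_model_logits(torch.randn(B, D), keys, values, torch.randn(B, C)).shape)
```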
1 code implementation • 14 Jul 2022 • Zhengkai Jiang, Yuxi Li, Ceyuan Yang, Peng Gao, Yabiao Wang, Ying Tai, Chengjie Wang
Unsupervised Domain Adaptation (UDA) aims to adapt the model trained on the labeled source domain to an unlabeled target domain.
Ranked #11 on Unsupervised Domain Adaptation on SYNTHIA-to-Cityscapes
1 code implementation • 8 Jul 2022 • Tong Zhang, Peng Gao, Hao Dong, Yin Zhuang, Guanqun Wang, Wei zhang, He Chen
Under supervised learning, the dominant paradigm for knowledge transfer is to pretrain a model on a large-scale natural-scene dataset and then fine-tune it on a small amount of task-specific labeled data.
1 code implementation • 30 May 2022 • Ziteng Cui, Kunchang Li, Lin Gu, Shenghan Su, Peng Gao, Zhengkai Jiang, Yu Qiao, Tatsuya Harada
Challenging illumination conditions (low light, under-exposure, and over-exposure) in the real world not only degrade visual appearance but also impair downstream computer vision tasks.
Ranked #2 on Image Enhancement on Exposure-Errors
no code implementations • SemEval (NAACL) 2022 • Changyu Hou, Jun Wang, Yixuan Qiao, Peng Jiang, Peng Gao, Guotong Xie, Qizhi Lin, Xiaopeng Wang, Xiandi Jiang, Benqi Wang, Qifeng Xiao
By assigning different weights to each model for different inputs, we adopted the Transformer layer to integrate the advantages of diverse models effectively.
Low Resource Named Entity Recognition
Named Entity Recognition
2 code implementations • 28 May 2022 • Renrui Zhang, Ziyu Guo, Rongyao Fang, Bin Zhao, Dong Wang, Yu Qiao, Hongsheng Li, Peng Gao
By fine-tuning on downstream tasks, Point-M2AE achieves 86.43% accuracy on ScanObjectNN, +3.36% over the second-best, and largely benefits few-shot classification, part segmentation, and 3D object detection with the hierarchical pre-training scheme.
Ranked #4 on Few-Shot 3D Point Cloud Classification on ModelNet40 5-way (20-shot) (using extra training data)
no code implementations • 18 May 2022 • Yixuan Qiao, Hao Chen, Jun Wang, Yongquan Lai, Tuozhen Liu, Xianbin Ye, Xin Tang, Rui Fang, Peng Gao, Wenfeng Xie, Guotong Xie
This paper describes the PASH participation in TREC 2021 Deep Learning Track.
3 code implementations • 8 May 2022 • Peng Gao, Teli Ma, Hongsheng Li, Ziyi Lin, Jifeng Dai, Yu Qiao
Masked auto-encoding for feature pretraining and multi-scale hybrid convolution-transformer architectures can further unleash the potential of ViT, leading to state-of-the-art performance on image classification, detection, and semantic segmentation.
1 code implementation • 3 Apr 2022 • Kexue Fu, Peng Gao, Shaolei Liu, Renrui Zhang, Yu Qiao, Manning Wang
We propose to use a dynamically updated momentum encoder as the tokenizer, so that the supervision signal it produces evolves along with the training process.
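A minimal sketch of a momentum-encoder tokenizer, under the assumption of a standard exponential-moving-average (EMA) update: the tokenizer's weights track the online encoder, and its detached outputs serve as the dynamic supervision signal; `online`, `momentum`, and the MSE objective are illustrative.

```python
import copy
import torch

@torch.no_grad()
def momentum_update(online: torch.nn.Module, momentum: torch.nn.Module, m: float = 0.999):
    for p_o, p_m in zip(online.parameters(), momentum.parameters()):
        p_m.mul_(m).add_(p_o.detach(), alpha=1.0 - m)    # EMA of the online weights

online = torch.nn.Linear(128, 64)
momentum = copy.deepcopy(online)
for p in momentum.parameters():
    p.requires_grad_(False)                              # tokenizer is never trained directly

x = torch.randn(32, 128)
targets = momentum(x).detach()                           # dynamic supervision signal
loss = torch.nn.functional.mse_loss(online(x), targets)
loss.backward()
momentum_update(online, momentum)                        # tokenizer drifts with training
```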
no code implementations • 31 Mar 2022 • Li Wang, Rongzhi Gu, Weiji Zhuang, Peng Gao, Yujun Wang, Yuexian Zou
Bearing this in mind, a two-branch deep network (a KWS branch and an SV branch) with the same structure is developed, and a novel decoupled feature learning method is proposed to boost KWS and SV simultaneously by learning speaker-invariant keyword representations and keyword-invariant speaker representations, respectively.
1 code implementation • 24 Mar 2022 • Renrui Zhang, Han Qiu, Tai Wang, Ziyu Guo, Xuanzhuo Xu, Yu Qiao, Peng Gao, Hongsheng Li
In this paper, we introduce a novel framework for Monocular DEtection with a depth-guided TRansformer, named MonoDETR.
no code implementations • 2 Mar 2022 • Xianbin Ye, Ziliang Li, Fei Ma, Zongbi Yi, Pengyong Li, Jun Wang, Peng Gao, Yixuan Qiao, Guotong Xie
Anti-cancer drug discovery has largely been serendipitous, so we present the Open Molecular Graph Learning Benchmark, named CandidateDrug4Cancer, a challenging and realistic benchmark dataset to facilitate scalable, robust, and reproducible graph machine learning research for anti-cancer drug discovery.
no code implementations • 9 Feb 2022 • Kexue Fu, Peng Gao, Renrui Zhang, Hongsheng Li, Yu Qiao, Manning Wang
In particular, we develop a variant of ViT for 3D point cloud feature extraction, which achieves results comparable to existing backbones when combined with our framework. Visualizations of the attention maps show that our model understands the point cloud by combining global shape information with multiple pieces of local structural information, consistent with the motivation of our representation learning method.
6 code implementations • 24 Jan 2022 • Kunchang Li, Yali Wang, Junhao Zhang, Peng Gao, Guanglu Song, Yu Liu, Hongsheng Li, Yu Qiao
Different from typical transformer blocks, the relation aggregators in our UniFormer block are equipped with local and global token affinity in shallow and deep layers respectively, allowing it to tackle both redundancy and dependency for efficient and effective representation learning.
Ranked #143 on Image Classification on ImageNet
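A rough schematic of the local-vs-global aggregation idea described above (not the released UniFormer code): shallow blocks aggregate each token with a small spatial neighbourhood via a depthwise convolution, while deep blocks use global self-attention over all tokens; channel sizes, kernel size, and head count are assumptions.

```python
import torch
import torch.nn as nn

class LocalAggregator(nn.Module):           # shallow layers: local token affinity
    def __init__(self, dim, k=5):
        super().__init__()
        self.dwconv = nn.Conv2d(dim, dim, k, padding=k // 2, groups=dim)
    def forward(self, x):                   # x: (B, C, H, W)
        return x + self.dwconv(x)

class GlobalAggregator(nn.Module):          # deep layers: global token affinity
    def __init__(self, dim, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
    def forward(self, x):                   # x: (B, C, H, W)
        B, C, H, W = x.shape
        t = x.flatten(2).transpose(1, 2)    # (B, H*W, C) token sequence
        t = t + self.attn(t, t, t)[0]
        return t.transpose(1, 2).reshape(B, C, H, W)

feat = torch.randn(2, 64, 14, 14)
print(GlobalAggregator(64)(LocalAggregator(64)(feat)).shape)
```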
no code implementations • 20 Jan 2022 • Sheng Xu, Yanjing Li, Teli Ma, Bohan Zeng, Baochang Zhang, Peng Gao, Jinhu Lv
Vision transformers (ViTs) have demonstrated great potential in various visual tasks, but incur prohibitive computational and memory costs when deployed on resource-constrained devices.
2 code implementations • 12 Jan 2022 • Kunchang Li, Yali Wang, Peng Gao, Guanglu Song, Yu Liu, Hongsheng Li, Yu Qiao
For Something-Something V1 and V2, our UniFormer achieves new state-of-the-art performances of 60.9% and 71.2% top-1 accuracy, respectively.
no code implementations • 7 Jan 2022 • Ziteng Cui, Yingying Zhu, Lin Gu, Guo-Jun Qi, Xiaoxiao Li, Peng Gao, Zenghui Zhang, Tatsuya Harada
Image restoration algorithms such as super resolution (SR) are indispensable pre-processing modules for object detection in degraded images.
2 code implementations • 22 Dec 2021 • Liang Pan, Tong Wu, Zhongang Cai, Ziwei Liu, Xumin Yu, Yongming Rao, Jiwen Lu, Jie zhou, Mingye Xu, Xiaoyuan Luo, Kexue Fu, Peng Gao, Manning Wang, Yali Wang, Yu Qiao, Junsheng Zhou, Xin Wen, Peng Xiang, Yu-Shen Liu, Zhizhong Han, Yuanjie Yan, Junyi An, Lifa Zhu, Changwei Lin, Dongrui Liu, Xin Li, Francisco Gómez-Fernández, Qinlong Wang, Yang Yang
Based on the MVP dataset, this paper reports methods and results in the Multi-View Partial Point Cloud Challenge 2021 on Completion and Registration.
no code implementations • 9 Dec 2021 • Jun Wang, Zhoujing Li, Yixuan Qiao, Qiming Qin, Peng Gao, Guotong Xie
This paper presents a novel superpixel based approach combining DNN and a modified segmentation method, to detect damaged buildings from VHR imagery.
2 code implementations • CVPR 2022 • Renrui Zhang, Ziyu Guo, Wei zhang, Kunchang Li, Xupeng Miao, Bin Cui, Yu Qiao, Peng Gao, Hongsheng Li
On top of that, we design an inter-view adapter to better extract the global feature and adaptively fuse the few-shot knowledge learned from 3D into CLIP pre-trained in 2D.
Ranked #3 on Training-free 3D Part Segmentation on ShapeNet-Part (using extra training data)
1 code implementation • NeurIPS 2021 • Peng Gao, Jiasen Lu, Hongsheng Li, Roozbeh Mottaghi, Aniruddha Kembhavi
Convolutional neural networks (CNNs) are ubiquitous in computer vision, with a myriad of effective and efficient variations.
1 code implementation • 29 Nov 2021 • Teli Ma, Shijie Geng, Mengmeng Wang, Jing Shao, Jiasen Lu, Hongsheng Li, Peng Gao, Yu Qiao
Recent advances in large-scale contrastive visual-language pretraining shed light on a new pathway for visual recognition.
Ranked #3 on Long-tail Learning on Places-LT (using extra training data)
1 code implementation • 6 Nov 2021 • Renrui Zhang, Rongyao Fang, Wei zhang, Peng Gao, Kunchang Li, Jifeng Dai, Yu Qiao, Hongsheng Li
To further enhance CLIP's few-shot capability, CLIP-Adapter fine-tunes a lightweight residual feature adapter and significantly improves few-shot classification performance.
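A minimal sketch of a "lightweight residual feature adapter" in the spirit of the description above; the bottleneck size, residual ratio, and cosine-similarity classification head are assumptions rather than the paper's exact configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualAdapter(nn.Module):
    def __init__(self, dim=512, bottleneck=128, ratio=0.2):
        super().__init__()
        self.ratio = ratio
        self.mlp = nn.Sequential(nn.Linear(dim, bottleneck), nn.ReLU(),
                                 nn.Linear(bottleneck, dim), nn.ReLU())
    def forward(self, clip_feat):                        # clip_feat: frozen (B, dim) features
        adapted = self.mlp(clip_feat)
        return self.ratio * adapted + (1 - self.ratio) * clip_feat   # residual blending

adapter = ResidualAdapter()
img_feat = F.normalize(adapter(torch.randn(8, 512)), dim=-1)
text_feat = F.normalize(torch.randn(10, 512), dim=-1)    # class prompt embeddings
logits = 100.0 * img_feat @ text_feat.t()                # cosine-similarity classification
print(logits.shape)
```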
no code implementations • 5 Nov 2021 • Peng Gao, Brian Reily, Rui Guo, HongSheng Lu, Qingzhao Zhu, Hao Zhang
In this paper, we introduce a novel approach that integrates uncertainty-aware spatiotemporal graph learning and model-based state estimation for a team of robots to collaboratively localize objects.
no code implementations • 26 Oct 2021 • Pengyong Li, Jun Wang, Ziliang Li, Yixuan Qiao, Xianggen Liu, Fei Ma, Peng Gao, Seng Song, Guotong Xie
Self-supervised learning has gradually emerged as a powerful technique for graph representation learning.
no code implementations • 13 Oct 2021 • Ankit P. Shah, Shijie Geng, Peng Gao, Anoop Cherian, Takaaki Hori, Tim K. Marks, Jonathan Le Roux, Chiori Hori
In previous work, we have proposed the Audio-Visual Scene-Aware Dialog (AVSD) task, collected an AVSD dataset, developed AVSD technologies, and hosted an AVSD challenge track at both the 7th and 8th Dialog System Technology Challenges (DSTC7, DSTC8).
1 code implementation • 9 Oct 2021 • Peng Gao, Shijie Geng, Renrui Zhang, Teli Ma, Rongyao Fang, Yongfeng Zhang, Hongsheng Li, Yu Qiao
Large-scale contrastive vision-language pre-training has shown significant progress in visual representation learning.
no code implementations • 24 Sep 2021 • Lei Shi, Kai Shuang, Shijie Geng, Peng Gao, Zuohui Fu, Gerard de Melo, Yunpeng Chen, Sen Su
To overcome these issues, we propose unbiased Dense Contrastive Visual-Linguistic Pretraining (DCVLP), which replaces the region regression and classification with cross-modality region contrastive learning that requires no annotations.
1 code implementation • ICCV 2021 • Peng Gao, Minghang Zheng, Xiaogang Wang, Jifeng Dai, Hongsheng Li
However, DETR suffers from its slow convergence.
no code implementations • 2 Jul 2021 • Peng Gao, Feng Gao, Jian-Cheng Ni, Hamido Fujita
Multi-hop machine reading comprehension is a challenging task in natural language processing, which requires stronger reasoning ability across multiple documents.
no code implementations • 24 Jun 2021 • Yixuan Qiao, Hao Chen, Jun Wang, Yihao Chen, Xianbin Ye, Ziliang Li, Xianbiao Qi, Peng Gao, Guotong Xie
TextVQA requires models to read and reason about text in images to answer questions about them.
no code implementations • 6 Jun 2021 • Teli Ma, Mingyuan Mao, Honghui Zheng, Peng Gao, Xiaodi Wang, Shumin Han, Errui Ding, Baochang Zhang, David Doermann
Object detection with Transformers (DETR) has achieved a competitive performance over traditional detectors, such as Faster R-CNN.
no code implementations • 4 Jun 2021 • Peng Gao, Shijie Geng, Yu Qiao, Xiaogang Wang, Jifeng Dai, Hongsheng Li
In this paper, we propose novel Scalable Transformers, which naturally contain sub-Transformers of different scales with shared parameters.
2 code implementations • 2 Jun 2021 • Peng Gao, Jiasen Lu, Hongsheng Li, Roozbeh Mottaghi, Aniruddha Kembhavi
Convolutional neural networks (CNNs) are ubiquitous in computer vision, with a myriad of effective and efficient variations.
Ranked #417 on Image Classification on ImageNet
no code implementations • NeurIPS 2021 • Mingyuan Mao, Renrui Zhang, Honghui Zheng, Peng Gao, Teli Ma, Yan Peng, Errui Ding, Baochang Zhang, Shumin Han
Transformers with remarkable global representation capacities achieve competitive results for visual tasks, but fail to consider high-level local pattern information in input images.
2 code implementations • 5 May 2021 • Jiaquan Ye, Xianbiao Qi, Yelin He, Yihao Chen, Dengyi Gu, Peng Gao, Rong Xiao
In our method, we divide the table content recognition task into four sub-tasks: table structure recognition, text line detection, text line recognition, and box assignment. Our table structure recognition algorithm is customized based on MASTER [1], a robust image text recognition algorithm.
Ranked #1 on Table Recognition on PubTabNet
no code implementations • 5 May 2021 • Yelin He, Xianbiao Qi, Jiaquan Ye, Peng Gao, Yihao Chen, Bingcong Li, Xin Tang, Rong Xiao
This paper presents our solution for the ICDAR 2021 Competition on Scientific Table Image Recognition to LaTeX.
1 code implementation • Briefings in Bioinformatics 2021 • Pengyong Li, Jun Wang, Yixuan Qiao, Hao Chen, Yihuan Yu, Xiaojun Yao, Peng Gao, Guotong Xie, Sen Song
In MPG, we proposed a powerful GNN for modelling molecular graphs, named MolGNet, and designed an effective self-supervised strategy for pre-training the model at both the node and graph levels.
no code implementations • 10 Feb 2021 • Peiyi Zhang, Xiaodong Jiang, Ginger M Holt, Nikolay Pavlovich Laptev, Caner Komurlu, Peng Gao, Yang Yu
Hyper-parameters of time series models play an important role in time series analysis.
1 code implementation • 24 Jan 2021 • Shijie Geng, Peng Gao, Zuohui Fu, Yongfeng Zhang
In this paper, we leverage gradient regularized self-distillation for RObust training of Multi-Exit BERT (RomeBERT), which can effectively solve the performance imbalance problem between early and late exits.
no code implementations • 21 Jan 2021 • Peng Gao
We establish unconditional sharp upper bounds of the $k$-th moments of the family of quadratic Dirichlet $L$-functions at the central point for $0 \leq k \leq 2$.
Number Theory
no code implementations • 19 Jan 2021 • Peng Gao, Xiaoyuan Liu, Edward Choi, Bhavna Soman, Chinmaya Mishra, Kate Farris, Dawn Song
SecurityKG collects OSCTI reports from various sources, uses a combination of AI and NLP techniques to extract high-fidelity knowledge about threat behaviors, and constructs a security knowledge graph.
2 code implementations • 19 Jan 2021 • Peng Gao, Minghang Zheng, Xiaogang Wang, Jifeng Dai, Hongsheng Li
The recently proposed Detection Transformer (DETR) model successfully applies the Transformer to object detection and achieves performance comparable to two-stage object detection frameworks such as Faster R-CNN.
no code implementations • 17 Jan 2021 • Peng Gao, Fei Shao, Xiaoyuan Liu, Xusheng Xiao, Haoyuan Liu, Zheng Qin, Fengyuan Xu, Prateek Mittal, Sanjeev R. Kulkarni, Dawn Song
Log-based cyber threat hunting has emerged as an important solution to counter sophisticated cyber attacks.
no code implementations • 21 Dec 2020 • Pengyong Li, Jun Wang, Yixuan Qiao, Hao Chen, Yihuan Yu, Xiaojun Yao, Peng Gao, Guotong Xie, Sen Song
Here, we proposed a novel Molecular Pre-training Graph-based deep learning framework, named MPG, that learns molecular representations from large-scale unlabeled molecules.
no code implementations • 9 Dec 2020 • Jun Wang, Shaoguo Wen, Kaixing Chen, Jianghua Yu, Xin Zhou, Peng Gao, Changsheng Li, Guotong Xie
Active learning generally involves querying the most representative samples for human labeling, which has been widely studied in many fields such as image classification and object detection.
1 code implementation • 18 Nov 2020 • Minghang Zheng, Peng Gao, Renrui Zhang, Kunchang Li, Xiaogang Wang, Hongsheng Li, Hao Dong
In this paper, a novel variant of transformer named Adaptive Clustering Transformer (ACT) has been proposed to reduce the computation cost for high-resolution input.
no code implementations • 16 Nov 2020 • Peng Gao, Rui Guo, HongSheng Lu, Hao Zhang
Collaborative object localization aims to collaboratively estimate locations of objects observed from multiple views or perspectives, which is a critical ability for multi-agent systems such as connected vehicles.
1 code implementation • 26 Oct 2020 • Peng Gao, Fei Shao, Xiaoyuan Liu, Xusheng Xiao, Zheng Qin, Fengyuan Xu, Prateek Mittal, Sanjeev R. Kulkarni, Dawn Song
Log-based cyber threat hunting has emerged as an important solution to counter sophisticated attacks.
no code implementations • 10 Oct 2020 • Peng Gao
Large batch jobs such as Deep Learning, HPC, and Spark require far more computational resources and incur higher cost than conventional online services.
no code implementations • 23 Sep 2020 • Peng Gao, Chiori Hori, Shijie Geng, Takaaki Hori, Jonathan Le Roux
In contrast with previous approaches where information flows only towards deeper layers of a stack, we consider a multi-pass transformer (MPT) architecture in which earlier layers are allowed to process information in light of the output of later layers.
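One hedged way to sketch the multi-pass idea is to run the same encoder stack several times, starting each new pass from a mix of the original input and the previous pass's final output so that early layers can react to late-layer information; the mixing rule and sizes below are illustrative assumptions, not the MPT wiring itself.

```python
import torch
import torch.nn as nn

class MultiPassEncoder(nn.Module):
    def __init__(self, dim=256, depth=4, passes=2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True)
        self.stack = nn.TransformerEncoder(layer, num_layers=depth)
        self.passes = passes
    def forward(self, x):                    # x: (B, T, dim)
        h = x
        for _ in range(self.passes):
            h = self.stack(x + h)            # early layers now see the previous pass's output
        return h

print(MultiPassEncoder()(torch.randn(2, 50, 256)).shape)
```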
no code implementations • 27 Jul 2020 • Changsheng Li, Chong Liu, Lixin Duan, Peng Gao, Kai Zheng
In this paper, we present a novel deep metric learning method to tackle the multi-label image classification problem.
no code implementations • 26 Jul 2020 • Lei Shi, Kai Shuang, Shijie Geng, Peng Su, Zhengkai Jiang, Peng Gao, Zuohui Fu, Gerard de Melo, Sen Su
We evaluate CVLP on several down-stream tasks, including VQA, GQA and NLVR2 to validate the superiority of contrastive learning on multi-modality representation learning.
no code implementations • 25 Jul 2020 • Peng Su, Shixiang Tang, Peng Gao, Di Qiu, Ni Zhao, Xiaogang Wang
At the core of our method, gradient regularization plays two key roles: (1) enforces the gradient of contrastive loss not to increase the supervised training loss on the source domain, which maintains the discriminative power of learned features; (2) regularizes the gradient update on the new domain not to increase the classification loss on the old target domains, which enables the model to adapt to an in-coming target domain while preserving the performance of previously observed domains.
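One standard way to realize a "do not increase loss B while stepping on loss A" constraint is to project out the conflicting gradient component, as in GEM/PCGrad-style methods; the sketch below is an illustrative stand-in for such a constraint, not necessarily the paper's exact regularizer.

```python
import torch

def constrained_grad(g_update: torch.Tensor, g_keep: torch.Tensor) -> torch.Tensor:
    """g_update: flattened gradient of the loss being optimized;
    g_keep: flattened gradient of the loss that must not increase."""
    dot = torch.dot(g_update, g_keep)
    if dot < 0:                                            # the step would increase the kept loss
        g_update = g_update - dot / g_keep.dot(g_keep) * g_keep
    return g_update

g_contrastive = torch.randn(1000)      # e.g. gradient of the contrastive loss
g_supervised = torch.randn(1000)       # e.g. gradient of the source-domain supervised loss
step = constrained_grad(g_contrastive, g_supervised)
print(torch.dot(step, g_supervised) >= -1e-6)              # the step no longer conflicts
```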
no code implementations • 8 Jul 2020 • Shijie Geng, Peng Gao, Moitreya Chatterjee, Chiori Hori, Jonathan Le Roux, Yongfeng Zhang, Hongsheng Li, Anoop Cherian
Given an input video, its associated audio, and a brief caption, the audio-visual scene-aware dialog (AVSD) task requires an agent to engage in a question-answer dialog with a human about the audio-visual content.
no code implementations • 16 May 2020 • Keqi Wang, Peng Gao, Steven Hoi, Qian Guo, Yuhua Qian
Low-light imaging is challenging since images may appear dark and noisy due to the low signal-to-noise ratio, complex image content, and the variety of shooting scenes under extreme low-light conditions.
no code implementations • 9 May 2020 • Shijie Geng, Ji Zhang, Zuohui Fu, Peng Gao, Hang Zhang, Gerard de Melo
Without identifying the connection between appearing people and character names, a model is not able to obtain a genuine understanding of the plots.
no code implementations • 3 Jan 2020 • Lei Shi, Shijie Geng, Kai Shuang, Chiori Hori, Songxiang Liu, Peng Gao, Sen Su
To solve the issue for the intermediate layers, we propose an efficient Quaternion Block Network (QBN) to learn interaction not only for the last layer but also for all intermediate layers simultaneously.
1 code implementation • ECCV 2020 • Zhengkai Jiang, Yu Liu, Ceyuan Yang, Jihao Liu, Peng Gao, Qian Zhang, Shiming Xiang, Chunhong Pan
Transferring existing image-based detectors to the video is non-trivial since the quality of frames is always deteriorated by part occlusion, rare pose, and motion blur.
Ranked #17 on Video Object Detection on ImageNet VID
no code implementations • WS 2019 • Xiepeng Li, Zhexi Zhang, Wei Zhu, Zheng Li, Yuan Ni, Peng Gao, Junchi Yan, Guotong Xie
We have experimented with both (a) improving the fine-tuning of pre-trained language models on a task with a small dataset size by leveraging datasets of similar tasks; and (b) incorporating the distributional representations of a KG into the representations of pre-trained language models, via simple concatenation or multi-head attention.
no code implementations • 27 Aug 2019 • Peng Gao, Qiquan Zhang, Fei Wang, Liyi Xiao, Hamido Fujita, Yan Zhang
Although numerous recent tracking approaches have made tremendous advances in the last decade, achieving high-performance visual tracking remains a challenge.
no code implementations • 27 Aug 2019 • Zhencai Hu, Peng Gao, Fei Wang
To solve the problem of dimensional explosion in the air combat, the proposed method is implemented through feature selection, trajectory sampling, function approximation and Bellman backup operation in the air combat simulation environment.
no code implementations • ICCV 2019 • Peng Gao, Haoxuan You, Zhanpeng Zhang, Xiaogang Wang, Hongsheng Li
The proposed module learns the cross-modality relationships between latent visual and language summarizations, which summarize visual regions and question into a small number of latent representations to avoid modeling uninformative individual region-word relations.
no code implementations • 26 Jun 2019 • Zibin Zhou, Fei Wang, Wenjuan Xi, Huaying Chen, Peng Gao, Chengkang He
Another UNet is utilized to acquire primary segmentation.
no code implementations • CVPR 2019 • Peng Gao, Zhengkai Jiang, Haoxuan You, Pan Lu, Steven C. H. Hoi, Xiaogang Wang, Hongsheng Li
It can robustly capture the high-level interactions between the language and vision domains, thereby significantly improving the performance of visual question answering.
no code implementations • 13 May 2019 • Qi Ni, Fei Wang, Ziwei Zhao, Peng Gao
Image feature extraction and matching is a fundamental but computation-intensive task in machine vision.
no code implementations • 8 May 2019 • Peng Gao, Yipeng Ma, Ruyue Yuan, Liyi Xiao, Fei Wang
In order to achieve high performance visual tracking in various negative scenarios, a novel cascaded Siamese network is proposed and developed based on two different deep learning networks: a matching subnetwork and a classification subnetwork.
no code implementations • 23 Apr 2019 • Peng Gao, Ruyue Yuan, Fei Wang, Liyi Xiao, Hamido Fujita, Yan Zhang
In this paper, we investigate the impacts of three main aspects of visual tracking, i.e., the backbone network, the attentional mechanism, and the detection component, and propose a Siamese Attentional Keypoint Network, dubbed SATIN, for efficient tracking and accurate localization.
no code implementations • 13 Oct 2018 • Yipeng Ma, Chun Yuan, Peng Gao, Fei Wang
Correlation filter (CF) based tracking algorithms have demonstrated favorable performance recently.
no code implementations • 12 Oct 2018 • Ke Song, Chun Yuan, Peng Gao, Yunxu Sun
In order to improve the tracking speed and reduce the overall power consumption of visual tracking, this paper proposes a real-time visual tracking algorithm based on the DSST (Discriminative Scale Space Tracking) approach.
no code implementations • ECCV 2018 • Peng Gao, Pan Lu, Hongsheng Li, Shuang Li, Yikang Li, Steven Hoi, Xiaogang Wang
Most state-of-the-art VQA methods fuse high-level textual and visual features from the neural network and discard visual spatial information when learning multi-modal features. To address these problems, question-guided kernels generated from the input question are designed to be convolved with visual features, capturing the textual-visual relationship at an early stage.
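A hedged sketch of the question-guided convolution idea (shapes and the kernel-prediction head are assumptions): a small linear head turns the question embedding into per-channel convolution kernels that are slid over the visual feature map, so the visual features are filtered in a question-dependent way while their spatial layout is preserved.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class QuestionGuidedConv(nn.Module):
    def __init__(self, q_dim=512, channels=256, k=3):
        super().__init__()
        self.channels, self.k = channels, k
        self.kernel_head = nn.Linear(q_dim, channels * k * k)   # question -> depthwise kernels

    def forward(self, visual, question):       # visual: (B, C, H, W); question: (B, q_dim)
        B, C, H, W = visual.shape
        kernels = self.kernel_head(question).view(B * C, 1, self.k, self.k)
        out = F.conv2d(visual.reshape(1, B * C, H, W), kernels,
                       padding=self.k // 2, groups=B * C)       # per-sample depthwise conv
        return out.view(B, C, H, W)

vis, q = torch.randn(2, 256, 14, 14), torch.randn(2, 512)
print(QuestionGuidedConv()(vis, q).shape)
```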
1 code implementation • 25 Jun 2018 • Peng Gao, Xusheng Xiao, Ding Li, Zhichun Li, Kangkook Jee, Zhen-Yu Wu, Chung Hwan Kim, Sanjeev R. Kulkarni, Prateek Mittal
To facilitate the task of expressing anomalies based on expert knowledge, our system provides a domain-specific query language, SAQL, which allows analysts to express models for (1) rule-based anomalies, (2) time-series anomalies, (3) invariant-based anomalies, and (4) outlier-based anomalies.
Cryptography and Security
Databases
no code implementations • 23 Apr 2018 • Peng Gao, Yipeng Ma, Ke Song, Chao Li, Fei Wang, Liyi Xiao, Yan Zhang
Based on the proposed circular and structural operators, a set of primal confidence score maps can be obtained by circular correlating feature maps with their corresponding structural correlation filters.
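The circular-correlation step itself can be sketched in a few lines of NumPy (the structural filters and their learning rule are outside this snippet): correlating a multi-channel feature map with a same-sized filter in the Fourier domain and summing over channels yields a confidence/response map whose peak indicates the target location.

```python
import numpy as np

def circular_correlation_response(feat: np.ndarray, filt: np.ndarray) -> np.ndarray:
    """feat, filt: (C, H, W) -> (H, W) response map via FFT-based circular correlation."""
    F_feat = np.fft.fft2(feat, axes=(-2, -1))
    F_filt = np.fft.fft2(filt, axes=(-2, -1))
    response = np.fft.ifft2(np.conj(F_filt) * F_feat, axes=(-2, -1)).real
    return response.sum(axis=0)                       # aggregate channel-wise responses

feat = np.random.rand(32, 50, 50)
filt = np.random.rand(32, 50, 50)
resp = circular_correlation_response(feat, filt)
print(np.unravel_index(resp.argmax(), resp.shape))    # peak location = predicted target shift
```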
no code implementations • 20 Apr 2018 • Peng Gao, Yipeng Ma, Chao Li, Ke Song, Fei Wang, Liyi Xiao
Discriminative Correlation Filters based tracking algorithms exploiting conventional handcrafted features have achieved impressive results both in terms of accuracy and robustness.
no code implementations • 19 Apr 2018 • Peng Gao, Yipeng Ma, Ke Song, Chao Li, Fei Wang, Liyi Xiao
To the best of our knowledge, we are the first to incorporate the advantages of DCF and SOSVM for TIR object tracking.
no code implementations • 16 Apr 2018 • Yan Zhang, Peng Gao, Xiao-Qing Li
The Ray-Casting algorithm is an important method for fast real-time surface display from 3D medical images.
no code implementations • 16 Apr 2018 • Peng Gao, Ruyue Yuan, Zhicong Lin, Linsheng Zhang, Yan Zhang
Current CPU- or GPU-based visual object tracking systems have high computational cost and consume a prohibitive amount of power.
1 code implementation • 30 Jan 2016 • Shen-Yi Zhao, Ru Xiang, Ying-Hao Shi, Peng Gao, Wu-Jun Li
Recently, many distributed stochastic optimization (DSO) methods have been proposed to solve large-scale composite optimization problems; these methods have shown better performance than traditional batch methods.