1 code implementation • 31 Mar 2025 • Sudong Wang, Yunjian Zhang, Yao Zhu, Jianing Li, Zizhe Wang, Yanwei Liu, Xiangyang Ji
Large Vision-Language Models (LVLMs) are gradually becoming the foundation for many artificial intelligence applications.
1 code implementation • 28 Jan 2025 • Jianing Li, Ming Lu, Hao Wang, Chenyang Gu, Wenzhao Zheng, Li Du, Shanghang Zhang
To utilize these slice features, we propose SliceOcc, an RGB camera-based model specifically tailored for indoor 3D semantic occupancy prediction.
no code implementations • 1 Dec 2024 • Shaoyu Liu, Jianing Li, Guanghui Zhao, Yunjian Zhang, Xin Meng, Fei Richard Yu, Xiangyang Ji, Ming Li
Our EventGPT comprises an event encoder, followed by a spatio-temporal aggregator, a linear projector, an event-language adapter, and an LLM.
no code implementations • 27 Nov 2024 • Dianze Li, Jianing Li, Xu Liu, Zhaokun Zhou, Xiaopeng Fan, Yonghong Tian
To address these challenges, we propose HDI-Former, a Hybrid Dynamic Interaction ANN-SNN Transformer, the first attempt to design a directly trained hybrid ANN-SNN architecture for high-accuracy and energy-efficient object detection using frames and events.
1 code implementation • 20 Aug 2024 • Xiao Wang, Yao Rong, Fuling Wang, Jianing Li, Lin Zhu, Bo Jiang, YaoWei Wang
Based on this dataset and several other large-scale datasets, we propose a novel baseline method that fully leverages the Mamba model's ability to integrate temporal information of CNN features, resulting in improved sign language translation outcomes.
no code implementations • 12 Aug 2024 • Mingkun Zhang, Jianing Li, Wei Chen, Jiafeng Guo, Xueqi Cheng
Adversarial purification is one of the promising approaches to defend neural networks against adversarial attacks.
no code implementations • 23 Jul 2024 • Jie Zhao, Jianing Li, Weihan Chen, Wentong Wang, Pengfei Yuan, Xu Zhang, Deshu Peng
Human pose estimation remains a multifaceted challenge in computer vision, pivotal across diverse domains such as behavior recognition, human-computer interaction, and pedestrian tracking.
1 code implementation • 22 Jul 2024 • Luyu Qiu, Jianing Li, Chi Su, Chen Jason Zhang, Lei Chen
This work underscores the importance of explainable AI, helping to build trust in large language models and promoting their adoption in critical applications.
no code implementations • 15 Jun 2024 • Ying Fu, Yu Li, ShaoDi You, Boxin Shi, Linwei Chen, Yunhao Zou, Zichun Wang, Yichen Li, Yuze Han, Yingkai Zhang, Jianan Wang, Qinglin Liu, Wei Yu, Xiaoqian Lv, Jianing Li, Shengping Zhang, Xiangyang Ji, Yuanpei Chen, Yuhan Zhang, Weihang Peng, Liwen Zhang, Zhe Xu, Dingyong Gou, Cong Li, Senyan Xu, Yunkang Zhang, Siyuan Jiang, Xiaoqiang Lu, Licheng Jiao, Fang Liu, Xu Liu, Lingling Li, Wenping Ma, Shuyuan Yang, Haiyang Xie, Jian Zhao, Shihua Huang, Peng Cheng, Xi Shen, Zheng Wang, Shuai An, Caizhi Zhu, Xuelong Li, Tao Zhang, Liang Li, Yu Liu, Chenggang Yan, Gengchen Zhang, Linyan Jiang, Bingyi Song, Zhuoyu An, Haibo Lei, Qing Luo, Jie Song, YuAn Liu, Haoyuan Zhang, Lingfeng Wang, Wei Chen, Aling Luo, Cheng Li, Jun Cao, Shu Chen, Zifei Dou, Xinyu Liu, Jing Zhang, Kexin Zhang, Yuting Yang, Xuejian Gou, Qinliang Wang, Yang Liu, Shizhan Zhao, Yanzhao Zhang, Libo Yan, Yuwei Guo, Guoxin Li, Qiong Gao, Chenyue Che, Long Sun, Xiang Chen, Hao Li, Jinshan Pan, Chuanlong Xie, Hongming Chen, Mingrui Li, Tianchen Deng, Jingwei Huang, Yufeng Li, Fei Wan, Bingxin Xu, Jian Cheng, Hongzhe Liu, Cheng Xu, Yuxiang Zou, Weiguo Pan, Songyin Dai, Sen Jia, Junpei Zhang, Puhua Chen, Qihang Li
The intersection of physics-based vision and deep learning presents an exciting frontier for advancing computer vision technologies.
no code implementations • 19 Mar 2024 • Luyu Qiu, Jianing Li, Lei Wen, Chi Su, Fei Hao, Chen Jason Zhang, Lei Chen
In this paper, we propose XPose, a novel framework that incorporates Explainable AI (XAI) principles into pose estimation.
1 code implementation • 31 Jan 2024 • Jianing Li, Xi Nan, Ming Lu, Li Du, Shanghang Zhang
To overcome this limitation in MLLMs, we introduce Proximity Question Answering (Proximity QA), a novel framework designed to enable MLLMs to infer the proximity relationship between objects in images.
no code implementations • NeurIPS 2023 • Jianing Li, Vardan Papyan
Our measurements reveal a process called Residual Alignment (RA) characterized by four properties: (RA1) intermediate representations of a given input are equispaced on a line, embedded in high dimensional space, as observed by Gai and Zhang [2021]; (RA2) top left and right singular vectors of Residual Jacobians align with each other and across different depths; (RA3) Residual Jacobians are at most rank C for fully-connected ResNets, where C is the number of classes; and (RA4) top singular values of Residual Jacobians scale inversely with depth.
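Property (RA1) can be checked numerically on any sequence of intermediate representations. The following is a minimal sketch, not the authors' measurement code: the function name `equispacing_deviation` and the relative-deviation score are assumptions made for illustration. It relies on the fact that points are equispaced on a line exactly when every consecutive step vector is identical.

```python
import math


def equispacing_deviation(points):
    """Return how far a point sequence is from being equispaced on a line.

    The score is the largest distance between any consecutive step and the
    mean step, divided by the length of the mean step. It is 0.0 exactly
    when all steps are identical (hypothetical metric, for illustration).
    """
    if len(points) < 2:
        raise ValueError("need at least two points")
    # Step vectors between consecutive representations.
    steps = [[b - a for a, b in zip(p, q)] for p, q in zip(points, points[1:])]
    dim = len(steps[0])
    mean_step = [sum(s[k] for s in steps) / len(steps) for k in range(dim)]
    mean_norm = math.sqrt(sum(x * x for x in mean_step))
    if mean_norm == 0.0:
        raise ValueError("net displacement is zero; score is undefined")
    worst = max(
        math.sqrt(sum((s[k] - mean_step[k]) ** 2 for k in range(dim)))
        for s in steps
    )
    return worst / mean_norm


# Toy data standing in for per-layer representations of one input.
on_a_line = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
bent_path = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]]
line_score = equispacing_deviation(on_a_line)
bent_score = equispacing_deviation(bent_path)
```

A score near zero is consistent with (RA1); the toy vectors here are made up and carry no information about real ResNet activations.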
1 code implementation • 8 Aug 2023 • Dianze Li, Jianing Li, Yonghong Tian
Then, we design a spatiotemporal Transformer architecture to detect objects via an end-to-end sequence prediction problem, where the novel temporal Transformer module leverages rich temporal cues from two visual streams to improve the detection performance.
1 code implementation • ICCV 2023 • Qiaoyi Su, Yuhong Chou, Yifan Hu, Jianing Li, Shijie Mei, Ziyang Zhang, Guoqi Li
Spiking neural networks (SNNs) are brain-inspired energy-efficient models that encode information in spatiotemporal dynamics.
no code implementations • 23 May 2023 • Jianing Li, Bowen Chen, Zhiyong Wang, Honghai Liu
Given an untrimmed video, repetitive action counting aims to estimate the number of repetitions of class-agnostic actions.
no code implementations • 22 May 2023 • Luzhe Huang, Jianing Li, Xiaofu Ding, Yijie Zhang, Hanlong Chen, Aydogan Ozcan
Uncertainty estimation is critical for numerous applications of deep neural networks and draws growing attention from researchers.
no code implementations • 6 Dec 2022 • Xu Liu, Jianing Li, Xiaopeng Fan, Yonghong Tian
Event cameras, offering high temporal resolutions and high dynamic ranges, have brought a new perspective to address common challenges (e.g., motion blur and low light) in monocular depth estimation.
1 code implementation • 1 Dec 2022 • Jianing Li, Ming Lu, Jiaming Liu, Yandong Guo, Li Du, Shanghang Zhang
In this paper, we propose a unified framework named BEV-LGKD to transfer knowledge in a teacher-student manner.
no code implementations • 26 Aug 2022 • Jianing Li, Jiaming Liu, Xiaobao Wei, Jiyuan Zhang, Ming Lu, Lei Ma, Li Du, Tiejun Huang, Shanghang Zhang
In this paper, we propose a novel Uncertainty-Guided Depth Fusion (UGDF) framework to fuse the predictions of monocular and stereo depth estimation networks for spike cameras.
1 code implementation • 26 Aug 2022 • Jiaming Liu, Qizhe Zhang, Xiaoqi Li, Jianing Li, Guanqun Wang, Ming Lu, Tiejun Huang, Shanghang Zhang
Neuromorphic spike data, an upcoming modality with high temporal resolution, has shown promising potential in autonomous driving by mitigating the challenges posed by high-velocity motion blur.
1 code implementation • CVPR 2022 • Lin Zhu, Xiao Wang, Yi Chang, Jianing Li, Tiejun Huang, Yonghong Tian
We propose a novel Event-based Video reconstruction framework based on a fully Spiking Neural Network (EVSNN), which utilizes Leaky-Integrate-and-Fire (LIF) neurons and Membrane Potential (MP) neurons.
Computational Efficiency, Event-Based Video Reconstruction
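For readers unfamiliar with the LIF neuron named in the EVSNN entry, the sketch below simulates a textbook discrete-time leaky integrate-and-fire unit. It is an illustration only: the update rule, the hard reset, and the parameter names `tau`, `v_threshold`, and `v_reset` are common conventions assumed here, not details taken from the paper, and the MP neuron is not covered.

```python
def lif_simulate(inputs, tau=2.0, v_threshold=1.0, v_reset=0.0):
    """Simulate one discrete-time LIF neuron.

    Returns (spikes, trace): a 0/1 spike per time step and the membrane
    potential recorded after each step (after any reset).
    """
    v = v_reset
    spikes, trace = [], []
    for current in inputs:
        # Leaky integration: the potential decays toward v_reset
        # while being driven by the input current.
        v = v + (current - (v - v_reset)) / tau
        if v >= v_threshold:
            spikes.append(1)
            v = v_reset  # hard reset after firing
        else:
            spikes.append(0)
        trace.append(v)
    return spikes, trace


# A constant supra-threshold input makes the neuron fire periodically.
spikes, trace = lif_simulate([1.5] * 6)
print(spikes)  # [0, 1, 0, 1, 0, 1]
```

With a constant input of 0.9 the potential saturates below the threshold of 1.0, so the neuron never fires; information is carried by when spikes occur rather than by a continuous activation.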
no code implementations • 23 Jan 2022 • Tiejun Huang, Yajing Zheng, Zhaofei Yu, Rui Chen, Yuan Li, Ruiqin Xiong, Lei Ma, Junwei Zhao, Siwei Dong, Lin Zhu, Jianing Li, Shanshan Jia, Yihua Fu, Boxin Shi, Si Wu, Yonghong Tian
By treating vidar as spike trains in biological vision, we have further developed a spiking neural network-based machine vision system that combines the speed of the machine and the mechanism of biological vision, achieving high-speed object detection and tracking 1,000x faster than human vision.
Ranked #1 on Deblurring
no code implementations • 3 Nov 2021 • Jacob M. Remington, Jonathon B. Ferrell, Jianing Li
Short peptides with antimicrobial activity have therapeutic potential for treating bacterial infections.
2 code implementations • 11 Aug 2021 • Xiao Wang, Jianing Li, Lin Zhu, Zhipeng Zhang, Zhe Chen, Xin Li, YaoWei Wang, Yonghong Tian, Feng Wu
Different from visible cameras which record intensity images frame by frame, the biologically inspired event camera produces a stream of asynchronous and sparse events with much lower latency.
Ranked #1 on Object Tracking on VisEvent
no code implementations • NeurIPS 2021 • Guodong Zhang, Kyle Hsu, Jianing Li, Chelsea Finn, Roger Grosse
To this end, we propose Differentiable AIS (DAIS), a variant of AIS which ensures differentiability by abandoning the Metropolis-Hastings corrections.
no code implementations • ICCV 2021 • Lin Zhu, Jianing Li, Xiao Wang, Tiejun Huang, Yonghong Tian
In this paper, we propose NeuSpike-Net to learn both the high dynamic range and high motion sensitivity of DVS and the full texture sampling of the spike camera to achieve high-speed and high dynamic image reconstruction.
no code implementations • ECCV 2020 • Jianing Li, Shiliang Zhang
This paper tackles this challenge through jointly enforcing visual and temporal consistency in the combination of a local one-hot classification and a global multi-class classification.
no code implementations • ICML 2020 • Jianing Li, Yanyan Lan, Jiafeng Guo, Xue-Qi Cheng
We prove that under certain conditions, a linear combination of quality and diversity constitutes a divergence metric between the generated distribution and the real distribution.
1 code implementation • 28 Jan 2020 • Bo Ni, Zhichun Guo, Jianing Li, Meng Jiang
Recently, due to the booming influence of online social networks, detecting fake news is drawing significant attention from both academic communities and the general public.
no code implementations • ICCV 2019 • Jianing Li, Jingdong Wang, Qi Tian, Wen Gao, Shiliang Zhang
The long-term relations are captured by a temporal self-attention model to alleviate occlusion and noise in video sequences.
no code implementations • 19 Nov 2018 • Jianing Li, Shiliang Zhang, Tiejun Huang
A temporal stream in this network is constructed by inserting several Multi-scale 3D (M3D) convolution layers into a 2D CNN network.
no code implementations • 20 Dec 2017 • Jianing Li, Shiliang Zhang, Jingdong Wang, Wen Gao, Qi Tian
This paper mainly establishes a large-scale Long sequence Video database for person re-IDentification (LVreID).
no code implementations • ICCV 2017 • Chi Su, Jianing Li, Shiliang Zhang, Junliang Xing, Wen Gao, Qi Tian
Our deep architecture explicitly leverages the human part cues to alleviate the pose variations and learn robust feature representations from both the global image and different local parts.
Ranked #107 on Person Re-Identification on Market-1501