no code implementations • 16 Aug 2024 • Wei Sun, Xiaosong Zhang, Fang Wan, Yanzhao Zhou, Yuan Li, Qixiang Ye, Jianbin Jiao
In SfM-free methods, inaccurate initial poses lead to a misalignment issue, which, under the constraints of per-pixel image loss functions, results in excessive gradients, causing unstable optimization and poor convergence for NVS.
no code implementations • 17 Jul 2024 • Kaixin Bai, Lei Zhang, Zhaopeng Chen, Fang Wan, Jianwei Zhang
Despite the substantial progress in deep learning, its adoption in industrial robotics projects remains limited, primarily due to challenges in data acquisition and labeling.
1 code implementation • 1 Jul 2024 • Mingxiang Liao, Hannan Lu, Xinyu Zhang, Fang Wan, Tianyu Wang, Yuzhong Zhao, WangMeng Zuo, Qixiang Ye, Jingdong Wang
For this purpose, we establish a new benchmark comprising text prompts that fully reflect multiple dynamics grades, and define a set of dynamics scores corresponding to various temporal granularities to comprehensively evaluate the dynamics of each generated video.
no code implementations • 1 Jul 2024 • Yenan Chen, Chuye Zhang, Pengxi Gu, Jianuo Qiu, Jiayi Yin, Nuofan Qiu, Guojing Huang, Bangchao Huang, Zishang Zhang, Hui Deng, Wei Zhang, Fang Wan, Chaoyang Song
Then, we implemented a large-scale, multi-terrain deep reinforcement learning framework to train these reconfigurable limbs for a comparative analysis of overconstrained locomotion in energy efficiency.
1 code implementation • 25 May 2024 • Yuzhong Zhao, Feng Liu, Yue Liu, Mingxiang Liao, Chen Gong, Qixiang Ye, Fang Wan
Unfortunately, most existing methods, which use fixed visual inputs, still lack the resolution adaptability needed to find precise language descriptions.
2 code implementations • 6 Feb 2024 • Feng Liu, Tengteng Huang, Qianjing Zhang, Haotian Yao, Chi Zhang, Fang Wan, Qixiang Ye, Yanzhao Zhou
Multi-view 3D object detection systems often struggle to generate precise predictions because of the challenges in estimating depth from images, which increases redundant and incorrect detections.
Ranked #2 on 3D Object Detection on nuScenes Camera Only
1 code implementation • 31 Jan 2024 • Yuzhong Zhao, Yue Liu, Zonghao Guo, Weijia Wu, Chen Gong, Fang Wan, Qixiang Ye
The multimodal model is constrained to generate captions within a few sub-spaces containing the control words, which increases the opportunity of hitting less frequent captions, alleviating the caption degeneration issue.
Ranked #1 on Dense Captioning on Visual Genome
1 code implementation • 16 Aug 2023 • Ning Guo, Xudong Han, Xiaobo Liu, Shuqiao Zhong, Zhiyuan Zhou, Jian Lin, Jiansheng Dai, Fang Wan, Chaoyang Song
Robots play a critical role as the physical agent of human operators in exploring the ocean.
no code implementations • 16 Aug 2023 • Xiaobo Liu, Xudong Han, Wei Hong, Fang Wan, Chaoyang Song
Proprioception is the "sixth sense" that detects limb postures with motor neurons.
1 code implementation • ICCV 2023 • Yuzhong Zhao, Qixiang Ye, Weijia Wu, Chunhua Shen, Fang Wan
During training, GenPromp converts image category labels to learnable prompt embeddings which are fed to a generative model to conditionally recover the input image with noise and learn representative embeddings.
Ranked #1 on Weakly-Supervised Object Localization on CUB-200-2011 (Top-1 Localization Accuracy metric, using extra training data)
no code implementations • CVPR 2023 • Mingxiang Liao, Zonghao Guo, Yuze Wang, Peng Yuan, Bailan Feng, Fang Wan
Pointly supervised instance segmentation (PSIS) learns to segment objects using a single point within the object extent as supervision.
3 code implementations • ICCV 2023 • Feng Liu, Xiaosong Zhang, Zhiliang Peng, Zonghao Guo, Fang Wan, Xiangyang Ji, Qixiang Ye
Except for the backbone networks, however, other components such as the detector head and the feature pyramid network (FPN) remain trained from scratch, which hinders fully tapping the potential of representation models.
Ranked #5 on Few-Shot Object Detection on MS-COCO (30-shot)
no code implementations • 29 Sep 2021 • Xingzhong Hou, Boxiao Liu, Fang Wan, Haihang You
The existing pipeline first pretrains a source model (which contains a generator and a discriminator) on a large-scale dataset and then finetunes it on a target domain with limited samples.
1 code implementation • CVPR 2021 • Guangyu Guo, Junwei Han, Fang Wan, Dingwen Zhang
Weakly supervised object localization (WSOL) aims at learning to localize objects of interest by only using the image-level labels as the supervision.
1 code implementation • CVPR 2021 • Tianning Yuan, Fang Wan, Mengying Fu, Jianzhuang Liu, Songcen Xu, Xiangyang Ji, Qixiang Ye
Despite the substantial progress of active learning for image recognition, an instance-level active learning method designed specifically for object detection is still lacking.
Ranked #1 on Active Object Detection on MS COCO
2 code implementations • ICCV 2021 • Wei Gao, Fang Wan, Xingjia Pan, Zhiliang Peng, Qi Tian, Zhenjun Han, Bolei Zhou, Qixiang Ye
TS-CAM finally couples the patch tokens with the semantic-agnostic attention map to achieve semantic-aware localization.
no code implementations • 29 Jan 2021 • Linhan Yang, Xudong Han, Weijie Guo, Fang Wan, Jia Pan, Chaoyang Song
This paper presents a novel design of a soft tactile finger with omni-directional adaptation using multi-channel optical fibers for rigid-soft interactive grasping.
Robotics
no code implementations • 26 Jun 2020 • Feng Liu, Xiaosong Zhang, Fang Wan, Xiangyang Ji, Qixiang Ye
We present Domain Contrast (DC), a simple yet effective approach inspired by contrastive learning for training domain adaptive detectors.
2 code implementations • 6 May 2020 • Fang Wan, Haokun Wang, Xiaobo Liu, Linhan Yang, Chaoyang Song
We present benchmarking results of the DeepClaw system for a baseline Tic-Tac-Toe task, a bin-clearing task, and a jigsaw puzzle task using three sets of standard robotic hardware.
Robotics
1 code implementation • ECCV 2020 • Zhekun Luo, Devin Guillory, Baifeng Shi, Wei Ke, Fang Wan, Trevor Darrell, Huijuan Xu
Weakly-supervised action localization requires training a model to localize the action segments in a video given only video-level action labels.
Ranked #9 on Weakly Supervised Action Localization on THUMOS’14
no code implementations • 1 Mar 2020 • Yao-Hui Chen, Sing Le, Qiao Chu Tan, Oscar Lau, Fang Wan, Chaoyang Song
This paper presents preliminary results of the design, development, and evaluation of a hand rehabilitation glove fabricated using a lobster-inspired hybrid design with rigid and soft components for actuation.
no code implementations • 29 Feb 2020 • Xia Wu, Haiyuan Liu, Ziqi Liu, Mingdong Chen, Fang Wan, Chenglong Fu, Harry Asada, Zheng Wang, Chaoyang Song
Many researchers have identified robotics as a potential solution to the aging population faced by many developed and developing countries.
2 code implementations • 29 Feb 2020 • Fang Wan, Haokun Wang, Jiyuan Wu, Yujia Liu, Sheng Ge, Chaoyang Song
Such a reconfigurable design with these omni-adaptive fingers enables us to systematically investigate the optimal arrangement of the fingers towards robust grasping.
2 code implementations • 29 Feb 2020 • Zeyi Yang, Sheng Ge, Fang Wan, Yujia Liu, Chaoyang Song
Robotic fingers made of soft material and compliant structures usually lead to superior adaptation when interacting with the unstructured physical environment.
2 code implementations • 29 Feb 2020 • Linhan Yang, Fang Wan, Haokun Wang, Xiaobo Liu, Yujia Liu, Jia Pan, Chaoyang Song
We use soft, stuffed toys for training, instead of everyday objects, to reduce the integration complexity and computational burden and exploit such rigid-soft interaction by changing the gripper fingers to the soft ones when dealing with rigid, daily-life items such as the Yale-CMU-Berkeley (YCB) objects.
4 code implementations • NeurIPS 2019 • Xiaosong Zhang, Fang Wan, Chang Liu, Rongrong Ji, Qixiang Ye
In this study, we propose a learning-to-match approach to break IoU restriction, allowing objects to match anchors in a flexible manner.
Ranked #136 on Object Detection on COCO test-dev
no code implementations • 14 Jun 2019 • Yan Gao, Boxiao Liu, Nan Guo, Xiaochun Ye, Fang Wan, Haihang You, Dongrui Fan
Weakly supervised object detection (WSOD) focuses on training object detectors with only image-level annotations, and is challenging due to the gap between the supervision and the objective.
1 code implementation • CVPR 2019 • Fang Wan, Chang Liu, Wei Ke, Xiangyang Ji, Jianbin Jiao, Qixiang Ye
Weakly supervised object detection (WSOD) is a challenging task when provided with image category supervision but required to simultaneously learn object locations and object detectors.
Ranked #15 on Weakly Supervised Object Detection on PASCAL VOC 2007
1 code implementation • CVPR 2018 • Fang Wan, Pengxu Wei, Zhenjun Han, Jianbin Jiao, Qixiang Ye
Weakly supervised object detection is a challenging task when provided with image category supervision but required to learn, at the same time, object locations and object detectors.
1 code implementation • 2 Jan 2019 • Caijing Miao, Lingxi Xie, Fang Wan, Chi Su, Hongye Liu, Jianbin Jiao, Qixiang Ye
In particular, the advantage of CHR is more significant in the scenarios with fewer positive training samples, which demonstrates its potential application in real-world security inspection.
1 code implementation • 23 May 2017 • Fang Wan, Chaoyang Song
In this paper, we describe the design of a hybrid neural network for logical learning that resembles human reasoning through the introduction of an auxiliary input, namely the indicators, which act as hints to suggest logical outcomes.