no code implementations • ACL 2022 • Xi Ai, Bin Fang
In sequence modeling, certain tokens are usually less ambiguous than others, and representations of these tokens require fewer refinements for disambiguation.
no code implementations • COLING 2022 • Xi Ai, Bin Fang
A multilingual model relies on language encodings to identify input languages because it has to distinguish between the input and output languages, or among all the languages, for cross-lingual tasks.
no code implementations • 12 Mar 2025 • Yuhao Sun, ShiXin Zhang, Wenzhuang Li, Jie Zhao, Jianhua Shan, Zirong Shen, Zixi Chen, Fuchun Sun, Di Guo, Bin Fang
To address these issues, we integrated a pinhole camera model into Tacchi, a low-computational-cost vision-based tactile simulator that uses the Material Point Method (MPM) as its simulation method, completing the simulation of marker motion images.
no code implementations • 15 Feb 2025 • Ruoxuan Feng, Jiangyu Hu, Wenke Xia, Tianci Gao, Ao Shen, Yuhao Sun, Bin Fang, Di Hu
However, the distinct data characteristics of these low-standardized visuo-tactile sensors hinder the establishment of a powerful tactile perception system.
no code implementations • 11 Feb 2025 • Fujiao Ju, Yuxuan Wang, Shuo Wang, Chengyin Wang, Yinbo Chen, Jianfeng Li, Mingjie Dong, Bin Fang, Qianyu Zhuang
Next, we align the real spine model reconstructed from CT images with the standard skeletal model.
1 code implementation • 11 Feb 2025 • Sicheng Wang, Sheng Liu, Weiheng Wang, Jianhua Shan, Bin Fang
Embodied intelligence seamlessly integrates vision, language, and action. However, most multimodal robotic models rely on massive fine-tuning, incurring high time and hardware costs. To address this, we introduce RoboBERT, an end-to-end multimodal manipulation model built around a novel two-stage training paradigm. In the first stage, we freeze most of the vision encoder and train with a single "standard" instruction phrasing, allowing the model to focus on stable policy learning via a CNN-based diffusion policy. In the second stage, we unfreeze all modules and inject diverse natural language variants, rapidly aligning varied instructions to the already-learned policy without destabilizing performance. We further employ systematic data augmentations to enhance robustness against visual perturbations. Without relying on auxiliary datasets, RoboBERT achieves new state-of-the-art (SOTA) mean episode lengths of 4.52 on the CALVIN ABCD-D benchmark and 3.79 on the ABC-D benchmark using only language-labeled expert demonstrations and a comparatively lightweight architecture. Real-robot trials on a 6-DOF manipulator confirm higher success rates than comparable methods trained on identical data. These results demonstrate that our data-augmentation-enhanced two-stage training paradigm delivers efficient, scalable, and broadly applicable performance for multimodal robotic systems.
no code implementations • 20 Dec 2024 • Weizhi Xian, Mingliang Zhou, Leong Hou U, Lang Shujun, Bin Fang, Tao Xiang, Zhaowei Shang, Weijia Jia
This module effectively captures subtle features in images, thereby enhancing the adaptive perception of distortions on the basis of local information.
no code implementations • 21 May 2024 • Jing Gao, Ning Cheng, Bin Fang, Wenjuan Han
The Transformer model, initially achieving significant success in the field of natural language processing, has recently shown great potential in the application of tactile perception.
no code implementations • 14 Mar 2024 • Ning Cheng, You Li, Jing Gao, Bin Fang, Jinan Xu, Wenjuan Han
Tactility provides crucial support and enhancement for the perception and interaction capabilities of both humans and robots.
no code implementations • 19 Feb 2024 • Qunyue Huang, Bin Fang
The batch-level quality comparison task is formulated to enhance the training data and thus improve the robustness of the latent representations.
no code implementations • CVPR 2024 • Bin Fang, Bo Li, Shuang Wu, Shouhong Ding, Ran Yi, Lizhuang Ma
In this paper we re-examine the existing availability attack methods and propose a novel two-stage min-max-min optimization paradigm to generate robust unlearnable noise.
no code implementations • 18 May 2023 • Bin Fang, Bo Li, Shuang Wu, Tianyi Zheng, Shouhong Ding, Ran Yi, Lizhuang Ma
One of the crucial factors contributing to this success has been the access to an abundance of high-quality data for constructing machine learning models.
no code implementations • 18 May 2023 • Bin Fang, Bo Li, Shuang Wu, Ran Yi, Shouhong Ding, Lizhuang Ma
The unauthorized use of personal data for commercial purposes and the clandestine acquisition of private data for training machine learning models continue to raise concerns.
no code implementations • 21 Feb 2023 • Yuhong Deng, Xiaofeng Guo, Yixuan Wei, Kai Lu, Bin Fang, Di Guo, Huaping Liu, Fuchun Sun
A composite robotic hand composed of a suction cup and a gripper is designed for grasping the object stably.
no code implementations • 15 Oct 2022 • Ruisheng Ran, Tianyu Gao, Bin Fang
Recently, the Transformer has become very popular and plays an important role in Machine Learning (ML), Natural Language Processing (NLP), Computer Vision (CV), and other fields.
no code implementations • 4 Oct 2022 • Anjun Chen, Xiangyu Wang, Kun Shi, Shaohao Zhu, Bin Fang, Yingfeng Chen, Jiming Chen, Yuchi Huo, Qi Ye
However, combining RGB and mmWave signals for robust all-weather 3D human reconstruction is still an open challenge, given the sparse nature of mmWave and the vulnerability of RGB images.
2 code implementations • 11 Aug 2021 • Yikai Wang, Wenbing Huang, Bin Fang, Fuchun Sun, Chang Li
By contrast, EIP models the tactile sensor as a group of coordinated particles, and the elastic property is applied to regulate the deformation of particles during contact.
no code implementations • NAACL 2021 • Xi Ai, Bin Fang
In this work, to inject global information while also saving cost, we present an efficient method that samples a semantic draft from semantic space and uses it as global information for decoding at almost no cost.
no code implementations • 23 Nov 2020 • Yikai Wang, Wenbing Huang, Bin Fang, Fuchun Sun
At its core, EIP models the tactile sensor as a group of coordinated particles, and the elastic theory is applied to regulate the deformation of particles during the contact process.
2 code implementations • CVPR 2020 • Runfa Chen, Wenbing Huang, Binghui Huang, Fuchun Sun, Bin Fang
The proposed architecture, termed NICE-GAN, exhibits two advantages over previous approaches: first, it is more compact since no independent encoding component is required; second, this plug-in encoder is directly trained by the adversary loss, making it more informative and more effectively trained if a multi-scale discriminator is applied.
no code implementations • 16 Nov 2019 • Mingxuan Jing, Xiaojian Ma, Wenbing Huang, Fuchun Sun, Chao Yang, Bin Fang, Huaping Liu
In this paper, we study Reinforcement Learning from Demonstrations (RLfD) that improves the exploration efficiency of Reinforcement Learning (RL) by providing expert demonstrations.
no code implementations • 25 Apr 2019 • Chuanqi Tan, Fuchun Sun, Tao Kong, Bin Fang, Wenchang Zhang
Different functional areas of the human brain play different roles in brain activity, a fact that has not received sufficient research attention in the brain-computer interface (BCI) field.
4 code implementations • 17 Sep 2018 • Hongzhuo Liang, Xiaojian Ma, Shuang Li, Michael Görner, Song Tang, Bin Fang, Fuchun Sun, Jianwei Zhang
In this paper, we propose an end-to-end grasp evaluation model to address the challenging problem of localizing robot grasp configurations directly from the point cloud.
Robotics