no code implementations • 4 Feb 2025 • Senmao Li, Kai Wang, Joost Van de Weijer, Fahad Shahbaz Khan, Chun-Le Guo, Shiqi Yang, Yaxing Wang, Jian Yang, Ming-Ming Cheng
Diffusion priors have been used for blind face restoration (BFR) by fine-tuning diffusion models (DMs) on restoration datasets to recover low-quality images.
1 code implementation • 23 Jan 2025 • Tao Liu, Kai Wang, Senmao Li, Joost Van de Weijer, Fahad Shahbaz Khan, Shiqi Yang, Yaxing Wang, Jian Yang, Ming-Ming Cheng
Drawing inspiration from the inherent context consistency, we propose a novel training-free method for consistent text-to-image (T2I) generation, termed "One-Prompt-One-Story" (1Prompt1Story).
no code implementations • 10 Dec 2024 • Chenhao Lu, Xuxin Cheng, Jialong Li, Shiqi Yang, Mazeyu Ji, Chengjing Yuan, Ge Yang, Sha Yi, Xiaolong Wang
The locomotion policy is trained conditioned on this upper-body motion representation, ensuring that the system remains robust with both manipulation and locomotion.
2 code implementations • 21 Oct 2024 • Mengjie Zhao, Zhi Zhong, Zhuoyuan Mao, Shiqi Yang, Wei-Hsiang Liao, Shusuke Takahashi, Hiromi Wakaki, Yuki Mitsufuji
We present OpenMU-Bench, a large-scale benchmark suite for addressing the data scarcity issue in training multimodal language models to understand music.
1 code implementation • 8 Oct 2024 • M. Jehanzeb Mirza, Mengjie Zhao, Zhuoyuan Mao, Sivan Doveh, Wei Lin, Paul Gavrikov, Michael Dorkenwald, Shiqi Yang, Saurav Jha, Hiromi Wakaki, Yuki Mitsufuji, Horst Possegger, Rogerio Feris, Leonid Karlinsky, James Glass
In each respective optimization step, the ranked prompts are fed as in-context examples (with their accuracies) to equip the LLM with the knowledge of the type of text prompts preferred by the downstream VLM.
no code implementations • 1 Oct 2024 • Saurav Jha, Shiqi Yang, Masato Ishii, Mengjie Zhao, Christian Simon, Muhammad Jehanzeb Mirza, Dong Gong, Lina Yao, Shusuke Takahashi, Yuki Mitsufuji
Personalized text-to-image diffusion models have grown popular for their ability to efficiently acquire a new concept from user-defined text descriptions and a few images.
no code implementations • 21 Aug 2024 • Shiqi Yang, Minghuan Liu, Yuzhe Qin, Runyu Ding, Jialong Li, Xuxin Cheng, Ruihan Yang, Sha Yi, Xiaolong Wang
Compared to previous systems, which often require hardware customization according to different robots, our single system can generalize to humanoid hands, arm-hands, arm-gripper, and quadruped-gripper systems with high-precision teleoperation.
no code implementations • 3 Jul 2024 • Runyu Ding, Yuzhe Qin, Jiyue Zhu, Chengzhe Jia, Shiqi Yang, Ruihan Yang, Xiaojuan Qi, Xiaolong Wang
Our system's ability to handle bimanual manipulations while prioritizing safety and real-time performance makes it a powerful tool for advancing dexterous manipulation and imitation learning.
no code implementations • 1 Jul 2024 • Xuxin Cheng, Jialong Li, Shiqi Yang, Ge Yang, Xiaolong Wang
Teleoperation serves as a powerful method for collecting on-robot data essential for robot learning from demonstrations.
no code implementations • 23 May 2024 • Shiqi Yang, Zhi Zhong, Mengjie Zhao, Shusuke Takahashi, Masato Ishii, Takashi Shibuya, Yuki Mitsufuji
Recent audio-visual generation methods usually resort to huge large language models or composable diffusion models.
no code implementations • 14 Feb 2024 • Shiqi Yang, Hanlin Qin, Shuai Yuan, Xiang Yan, Hossein Rahmani
However, when applied to the infrared destriping task, it becomes challenging for the vanilla auxiliary generator to consistently produce vertical noise under unsupervised constraints.
1 code implementation • 28 Jan 2024 • Shuai Yuan, Hanlin Qin, Xiang Yan, Shiqi Yang, Shuowen Yang, Naveed Akhtar, Huixin Zhou
In a real-world infrared imaging system, effectively learning a consistent stripe noise removal model is essential.
1 code implementation • 15 Dec 2023 • Senmao Li, Taihang Hu, Joost Van de Weijer, Fahad Shahbaz Khan, Tao Liu, Linxuan Li, Shiqi Yang, Yaxing Wang, Ming-Ming Cheng, Jian Yang
This insight motivates us to omit encoder computation at certain adjacent time-steps and reuse encoder features of previous time-steps as input to the decoder in multiple time-steps.
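As a rough illustration of the feature-reuse idea in this abstract, the sketch below runs the encoder only at selected "key" time-steps and feeds the cached features to the decoder at the steps in between. It is a toy under stated assumptions, not the authors' implementation: `sample_with_encoder_reuse`, `key_steps`, and the scalar `toy_encoder` / `toy_decoder` are hypothetical names, and the excerpt does not say how the paper chooses which time-steps to skip.

```python
def sample_with_encoder_reuse(x, num_steps, key_steps, encoder, decoder):
    """Run a denoising-style loop, recomputing encoder features only at key steps.

    At every other step the decoder receives the most recently cached features.
    Returns the final sample and the number of encoder calls that were made.
    """
    cached = None
    encoder_calls = 0
    for t in range(num_steps):
        # Recompute features at key steps (and at the very first step,
        # when nothing has been cached yet).
        if cached is None or t in key_steps:
            cached = encoder(x, t)
            encoder_calls += 1
        # The decoder always runs, using fresh or reused encoder features.
        x = decoder(x, cached, t)
    return x, encoder_calls


# Hypothetical scalar stand-ins for the real network halves.
def toy_encoder(x, t):
    return 0.5 * x


def toy_decoder(x, feats, t):
    return x - 0.1 * feats


final_x, calls = sample_with_encoder_reuse(
    1.0, num_steps=10, key_steps={0, 5},
    encoder=toy_encoder, decoder=toy_decoder,
)
```

With two key steps out of ten, the encoder runs twice while the decoder still runs at every step, which is where the savings in this kind of scheme would come from.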
1 code implementation • 12 Dec 2023 • Kangneng Zhou, Daiheng Gao, Xuan Wang, Jie Zhang, Peng Zhang, Xusen Sun, Longhao Zhang, Shiqi Yang, Bang Zhang, Liefeng Bo, Yaxing Wang, Ming-Ming Cheng
This enhances mask-based editing in local areas; second, we present a novel distillation strategy: Conditional Distillation on Geometry and Texture (CDGT).
no code implementations • 12 Nov 2023 • Yijie Zhang, Yuanchen Bei, Shiqi Yang, Hao Chen, Zhiqing Li, Lijia Chen, Feiran Huang
To this end, we propose IMGCF, a simple but effective model to alleviate behavior data imbalance for multi-behavior graph collaborative filtering.
1 code implementation • NeurIPS 2023 • Kai Wang, Fei Yang, Shiqi Yang, Muhammad Atif Butt, Joost Van de Weijer
Large-scale text-to-image generative models have been a ground-breaking development in generative AI, with diffusion models showing their astounding ability to synthesize convincing images following an input text prompt.
no code implementations • 1 Sep 2023 • Shiqi Yang, Yaxing Wang, Joost Van de Weijer, Luis Herranz, Shangling Jui, Jian Yang
We capture this intrinsic structure by defining local affinity of the target data, and encourage label consistency among data with high local affinity.
no code implementations • 6 Jul 2023 • Shiqi Yang, Atsushi Hashimoto, Yoshitaka Ushiku
In recent years, large models trained on huge amounts of cross-modality data, usually termed foundation models, have achieved conspicuous accomplishments in many fields, such as image recognition and generation.
1 code implementation • 4 Oct 2022 • Kai Wang, Chenshen Wu, Andy Bagdanov, Xialei Liu, Shiqi Yang, Shangling Jui, Joost Van de Weijer
Lifelong object re-identification incrementally learns from a stream of re-identification tasks.
1 code implementation • 7 Jun 2022 • Shiqi Yang, Yaxing Wang, Kai Wang, Shangling Jui, Joost Van de Weijer
In this paper, we investigate Source-free Open-partial Domain Adaptation (SF-OPDA), which addresses the situation where there exist both domain and category shifts between source and target domains.
1 code implementation • 9 May 2022 • Shiqi Yang, Yaxing Wang, Kai Wang, Shangling Jui, Joost Van de Weijer
Treating SFDA as an unsupervised clustering problem and following the intuition that local neighbors in feature space should have more similar predictions than other features, we propose to optimize an objective of prediction consistency.
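The intuition in this abstract (nearby features should receive similar predictions) can be made concrete with a small score: for each sample, find its nearest neighbors in feature space and average the dot product between its class probabilities and theirs. This is only a sketch of the intuition, not the paper's objective; the function names, the cosine-similarity neighbor search, and the choice of `k` are assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def k_nearest(features, i, k):
    """Indices of the k features most similar to features[i], excluding i."""
    others = [j for j in range(len(features)) if j != i]
    others.sort(key=lambda j: cosine(features[i], features[j]), reverse=True)
    return others[:k]


def neighbor_consistency(features, probs, k=1):
    """Mean dot product between each prediction and its neighbors' predictions.

    Higher values mean that samples close together in feature space
    are assigned similar class probabilities.
    """
    total, count = 0.0, 0
    for i in range(len(features)):
        for j in k_nearest(features, i, k):
            total += sum(p * q for p, q in zip(probs[i], probs[j]))
            count += 1
    return total / count if count else 0.0


# Two tight clusters of toy target features.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
# Predictions that agree within each cluster ...
consistent = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
# ... versus predictions that disagree between neighbors.
inconsistent = [[0.9, 0.1], [0.1, 0.9], [0.9, 0.1], [0.1, 0.9]]

score_consistent = neighbor_consistency(features, consistent)
score_inconsistent = neighbor_consistency(features, inconsistent)
```

A training objective built on this idea would push the score up for true neighbors; how the paper treats the remaining, non-neighboring features is not described in this excerpt.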
2 code implementations • NeurIPS 2021 • Shiqi Yang, Yaxing Wang, Joost Van de Weijer, Luis Herranz, Shangling Jui
In this paper, we address the challenging source-free domain adaptation (SFDA) problem, where the source pretrained model is adapted to the target domain in the absence of source data.
Ranked #7 on Source-Free Domain Adaptation on VisDA-2017
1 code implementation • ICCV 2021 • Shiqi Yang, Yaxing Wang, Joost Van de Weijer, Luis Herranz, Shangling Jui
In this paper, we propose a new domain adaptation paradigm called Generalized Source-free Domain Adaptation (G-SFDA), where the learned model needs to perform well on both the target and source domains, with only access to current unlabeled target data during adaptation.
Ranked #8 on Source-Free Domain Adaptation on VisDA-2017
no code implementations • 8 Mar 2021 • Shiqi Yang, Kai Wang, Luis Herranz, Joost Van de Weijer
Zero-shot learning (ZSL) aims to discriminate images from unseen classes by exploiting relations to seen classes via their attribute-based descriptions.
2 code implementations • 23 Oct 2020 • Shiqi Yang, Yaxing Wang, Joost Van de Weijer, Luis Herranz, Shangling Jui
When adapting to the target domain, the additional classifier, initialized from the source classifier, is expected to find misclassified features.
Source-Free Domain Adaptation
Unsupervised Domain Adaptation
no code implementations • 12 Jun 2020 • Shiqi Yang, Xiaolong Xu, Yaozheng Zhu, Ruirui Niu, Chunqiang Xu, Yuxuan Peng, Xing Cheng, Xionghui Jia, Xiaofeng Xu, Jianming Lu, Yu Ye
However, the layer-dependent magnetism of MnBi2Te4, which is fundamental and crucial for further exploration of quantum phenomena in this system, remains elusive.
Materials Science
no code implementations • 10 Jun 2020 • Shiqi Yang, Kai Wang, Luis Herranz, Joost Van de Weijer
Zero-shot learning (ZSL) aims to discriminate images from unseen classes by exploiting relations to seen classes via their semantic descriptions.
no code implementations • 9 Jul 2018 • Shiqi Yang, Gang Peng
This paper proposes a novel attention model for semantic segmentation, which aggregates multi-scale and context features to refine prediction.
no code implementations • 6 Jul 2018 • Shiqi Yang, Gang Peng
The discriminator is the core component: it drives the parallel networks to focus on different regions and learn different representations.
no code implementations • 12 Nov 2017 • Shiqi Yang, Gang Peng
The discriminator is the core component: it drives the parallel networks to focus on different regions and learn complementary representations.