1 code implementation • 26 Jun 2025 • Caoshuo Li, Zengmao Ding, Xiaobin Hu, Bang Li, Donghao Luo, AndyPian Wu, Chaoyang Wang, Chengjie Wang, Taisong Jin, SevenShu, Yunsheng Wu, Yongge Liu, Rongrong Ji
As one of the earliest ancient languages, Oracle Bone Script (OBS) encapsulates the cultural records and intellectual expressions of ancient civilizations.
no code implementations • 18 Jun 2025 • Chaoyang Wang, Ashkan Mirzaei, Vidit Goel, Willi Menapace, Aliaksandr Siarohin, Avalon Vinella, Michael Vasilkovsky, Ivan Skorokhodov, Vladislav Shakhrai, Sergey Korolev, Sergey Tulyakov, Peter Wonka
We propose the first framework capable of computing a 4D spatio-temporal grid of video frames and 3D Gaussian particles for each time step using a feed-forward architecture.
no code implementations • 22 May 2025 • Francesco Dalla Serra, Patrick Schrempf, Chaoyang Wang, Zaiqiao Meng, Fani Deligianni, Alison Q. O'Neil
Taking inspiration from 'Chain-of-Thought reasoning', we demonstrate that performance on the CXR VQA task can be improved by grounding the answer generator module with a radiology report predicted for the same CXR.
no code implementations • 22 May 2025 • Chaoyang Wang, Xiangtai Li, Lu Qi, Xiaofan Lin, Jinbin Bai, Qianyu Zhou, Yunhai Tong
Recent progress in panoramic image generation has underscored two critical limitations in existing approaches.
no code implementations • 15 Apr 2025 • Chaoyang Wang, Zeyu Zhang, Long Teng, Zijun Li, Shichao Kan
This mechanism dynamically balances visual and textual representations within the contrastive learning pipeline, optimizing the composed feature for retrieval.
Ranked #1 on Image Retrieval on CIRR
no code implementations • 21 Jan 2025 • Yu-Chu Yu, Chieh Hubert Lin, Hsin-Ying Lee, Chaoyang Wang, Yu-Chiang Frank Wang, Ming-Hsuan Yang
However, articulating the rigs into realistic affordance-aware postures (e.g., following the context, respecting the physics and the personalities of the object) remains time-consuming and heavily relies on human labor from experienced artists.
no code implementations • CVPR 2025 • Ziya Erkoç, Can Gümeli, Chaoyang Wang, Matthias Nießner, Angela Dai, Peter Wonka, Hsin-Ying Lee, Peiye Zhuang
The edited 3D mesh aligns well with the prompts, and remains identical for regions that are not intended to be altered.
no code implementations • CVPR 2025 • Chaoyang Wang, Peiye Zhuang, Tuan Duc Ngo, Willi Menapace, Aliaksandr Siarohin, Michael Vasilkovsky, Ivan Skorokhodov, Sergey Tulyakov, Peter Wonka, Hsin-Ying Lee
We propose a novel two-stream architecture.
no code implementations • 31 Oct 2024 • Tuan Duc Ngo, Peiye Zhuang, Chuang Gan, Evangelos Kalogerakis, Sergey Tulyakov, Hsin-Ying Lee, Chaoyang Wang
Our method provides a robust solution for applications requiring fine-grained, long-term motion tracking in 3D space.
no code implementations • 26 Aug 2024 • Zhenggang Tang, Peiye Zhuang, Chaoyang Wang, Aliaksandr Siarohin, Yash Kant, Alexander Schwing, Sergey Tulyakov, Hsin-Ying Lee
During inference, we employ a rapid multi-view to 3D reconstruction approach, NeuS, to obtain coarse depth for the depth-truncated epipolar attention.
no code implementations • 17 Jul 2024 • Sherwin Bahmani, Ivan Skorokhodov, Aliaksandr Siarohin, Willi Menapace, Guocheng Qian, Michael Vasilkovsky, Hsin-Ying Lee, Chaoyang Wang, Jiaxu Zou, Andrea Tagliasacchi, David B. Lindell, Sergey Tulyakov
Recently, new methods have demonstrated the ability to generate videos with controllable camera poses. These techniques leverage pre-trained U-Net-based diffusion models that explicitly disentangle spatial and temporal generation.
no code implementations • 4 Jul 2024 • Bang Li, Donghao Luo, Yujie Liang, Jing Yang, Zengmao Ding, Xu Peng, Boyuan Jiang, Shengwei Han, Dan Sui, Peichao Qin, Pian Wu, Chaoyang Wang, Yun Qi, Taisong Jin, Chengjie Wang, Xiaoming Huang, Zhan Shu, Rongrong Ji, Yongge Liu, Yunsheng Wu
Oracle bone inscriptions (OBI) are the earliest developed writing system in China, bearing invaluable written exemplifications of early Shang history and paleography.
1 code implementation • 27 Jun 2024 • Junli Cao, Vidit Goel, Chaoyang Wang, Anil Kag, Ju Hu, Sergei Korolev, Chenfanfu Jiang, Sergey Tulyakov, Jian Ren
Our key observation is that nearby points in the scene can share similar representations.
no code implementations • 11 Jun 2024 • Heng Yu, Chaoyang Wang, Peiye Zhuang, Willi Menapace, Aliaksandr Siarohin, Junli Cao, Laszlo A Jeni, Sergey Tulyakov, Hsin-Ying Lee
We then learn the canonical 3D representation of the video using a freeze-time video, delicately generated from the reference video.
no code implementations • 9 Jun 2024 • Peiye Zhuang, Songfang Han, Chaoyang Wang, Aliaksandr Siarohin, Jiaxu Zou, Michael Vasilkovsky, Vladislav Shakhrai, Sergey Korolev, Sergey Tulyakov, Hsin-Ying Lee
Our method takes inspiration from large reconstruction models like LRM that use a transformer-based triplane generator and a Neural Radiance Field (NeRF) model trained on multi-view images.
1 code implementation • 30 May 2024 • Chaoyang Wang, Xiangtai Li, Lu Qi, Henghui Ding, Yunhai Tong, Ming-Hsuan Yang
For image synthesis, we propose a finite perturbation approach to enhance the diversity of generated results without changing the semantic categories.
no code implementations • 28 May 2024 • Qihang Zhang, Yinghao Xu, Chaoyang Wang, Hsin-Ying Lee, Gordon Wetzstein, Bolei Zhou, Ceyuan Yang
This results in a lack of a unified approach to effectively control and manipulate scenes at the 3D level with different levels of granularity.
no code implementations • 22 May 2024 • Chaoyang Wang, Pengzhi Cheng, Jingze Li, Weiwei Sun
We demonstrate that Leader Reward greatly improves the quality of the optimal solutions generated by the model.
no code implementations • 14 Mar 2024 • Chaoyang Wang, Xiangtai Li, Henghui Ding, Lu Qi, Jiangning Zhang, Yunhai Tong, Chen Change Loy, Shuicheng Yan
In-context segmentation has drawn more attention with the introduction of vision foundation models.
no code implementations • 1 Feb 2024 • Guocheng Qian, Junli Cao, Aliaksandr Siarohin, Yash Kant, Chaoyang Wang, Michael Vasilkovsky, Hsin-Ying Lee, Yuwei Fang, Ivan Skorokhodov, Peiye Zhuang, Igor Gilitschenski, Jian Ren, Bernard Ghanem, Kfir Aberman, Sergey Tulyakov
We introduce Amortized Text-to-Mesh (AToM), a feed-forward text-to-mesh framework optimized across multiple text prompts simultaneously.
no code implementations • 10 Jan 2024 • Chaoyang Wang, Peiye Zhuang, Aliaksandr Siarohin, Junli Cao, Guocheng Qian, Hsin-Ying Lee, Sergey Tulyakov
Dynamic novel view synthesis aims to capture the temporal evolution of visual content within videos.
1 code implementation • 10 Jan 2024 • Zhiqiang Guo, GuoHui Li, Jianjun Li, Chaoyang Wang, Si Shi
To address this problem, we propose a Dual Disentangled Variational AutoEncoder (DualVAE) for collaborative recommendation, which combines disentangled representation learning with variational inference to facilitate the generation of implicit interaction data.
no code implementations • CVPR 2024 • Qihang Zhang, Chaoyang Wang, Aliaksandr Siarohin, Peiye Zhuang, Yinghao Xu, Ceyuan Yang, Dahua Lin, Bolei Zhou, Sergey Tulyakov, Hsin-Ying Lee
We marry the locality of objects with the globality of scenes by introducing a hybrid 3D representation: explicit for objects and implicit for scenes.
1 code implementation • 27 Dec 2023 • Zhiqiang Guo, Jianjun Li, GuoHui Li, Chaoyang Wang, Si Shi, Bin Ruan
Multimodal recommendation has gradually become the infrastructure of online media platforms, enabling them to provide personalized service to users through joint modeling of users' historical behaviors (e.g., purchases, clicks) and items' various modalities (e.g., visual and textual).
no code implementations • 21 Dec 2023 • Yen-Chi Cheng, Chieh Hubert Lin, Chaoyang Wang, Yash Kant, Sergey Tulyakov, Alexander Schwing, LiangYan Gui, Hsin-Ying Lee
Toward unlocking the potential of generative models in immersive 4D experiences, we introduce Virtual Pet, a novel pipeline to model realistic and diverse motions for target animal species within a 3D environment.
no code implementations • 13 Dec 2023 • Qihang Zhang, Chaoyang Wang, Aliaksandr Siarohin, Peiye Zhuang, Yinghao Xu, Ceyuan Yang, Dahua Lin, Bolei Zhou, Sergey Tulyakov, Hsin-Ying Lee
We are witnessing significant breakthroughs in the technology for generating 3D objects from text.
no code implementations • 9 Oct 2023 • Francesco Dalla Serra, Chaoyang Wang, Fani Deligianni, Jeffrey Dalton, Alison Q O'Neil
Previous approaches to automated radiology reporting generally do not provide the prior study as input, precluding comparison which is required for clinical accuracy in some types of scans, and offer only unreliable methods of interpretability.
no code implementations • 30 Aug 2023 • Francesco Dalla Serra, Chaoyang Wang, Fani Deligianni, Jeffrey Dalton, Alison Q. O'Neil
Automated approaches to radiology reporting require the image to be encoded into a suitable token representation for input to the language model.
1 code implementation • NeurIPS 2023 • Evangelos Ntavelis, Aliaksandr Siarohin, Kyle Olszewski, Chaoyang Wang, Luc van Gool, Sergey Tulyakov
We present a novel approach to the generation of static and articulated 3D assets that has a 3D autodecoder at its core.
2 code implementations • CVPR 2023 • Gengshan Yang, Chaoyang Wang, N Dinesh Reddy, Deva Ramanan
Building animatable 3D models is challenging due to the need for 3D scans, laborious registration, and manual rigging, which are difficult to scale to arbitrary categories.
3D Shape Reconstruction from Videos
Dynamic Reconstruction
no code implementations • CVPR 2023 • Chaoyang Wang, Lachlan Ewen MacDonald, Laszlo A. Jeni, Simon Lucey
In this paper we present a new method for deformable NeRF that can directly use optical flow as supervision.
1 code implementation • 19 Jan 2023 • Antanas Kascenas, Pedro Sanchez, Patrick Schrempf, Chaoyang Wang, William Clackett, Shadia S. Mikhael, Jeremy P. Voisey, Keith Goatman, Alexander Weir, Nicolas Pugeault, Sotirios A. Tsaftaris, Alison Q. O'Neil
Denoising methods, for instance classical denoising autoencoders (DAEs) and more recently emerging diffusion models, are a promising approach; however, naive application of pixelwise noise leads to poor anomaly detection performance.
1 code implementation • CIKM 2022 • GuoHui Li, Zhiqiang Guo, Jianjun Li, Chaoyang Wang
Specifically, for neighborhood-level dependencies, we explicitly consider both popularity score and preference correlation by designing a joint neighborhood-level dependency weight, based on which we construct a neighborhood-level dependencies graph to capture higher-order interaction features.
2 code implementations • 4 Oct 2022 • Mosam Dabhi, Chaoyang Wang, Tim Clifford, Laszlo Attila Jeni, Ian R. Fasel, Simon Lucey
Our Multi-view Bootstrapping in the Wild (MBW) approach demonstrates impressive results on standard human datasets, as well as tigers, cheetahs, fish, colobus monkeys, chimpanzees, and flamingos from videos captured casually in a zoo.
3D Reconstruction
Semi-supervised 2D and 3D landmark labeling
no code implementations • CVPR 2022 • Chaoyang Wang, Xueqian Li, Jhony Kaesemodel Pontes, Simon Lucey
Here, we propose a neural trajectory prior to capture continuous spatio-temporal information without the need for offline data.
no code implementations • 22 Oct 2021 • Mosam Dabhi, Chaoyang Wang, Kunal Saluja, Laszlo Jeni, Ian Fasel, Simon Lucey
Multi-view triangulation is the gold standard for 3D reconstruction from 2D correspondences given known calibration and sufficient views.
no code implementations • 12 May 2021 • Chaoyang Wang, Ben Eckart, Simon Lucey, Orazio Gallo
Recent approaches to render photorealistic views from a limited set of photographs have pushed the boundaries of our interactions with pictures of static scenes.
no code implementations • CVPR 2021 • Chaoyang Wang, Simon Lucey
Recent success in casting Non-rigid Structure from Motion (NRSfM) as an unsupervised deep learning problem has raised fundamental questions about what novelty in NRSfM priors deep learning could offer.
1 code implementation • NeurIPS 2020 • Chen-Hsuan Lin, Chaoyang Wang, Simon Lucey
Dense 3D object reconstruction from a single image has recently witnessed remarkable advances, but supervising neural networks with ground-truth 3D shapes is impractical due to the laborious process of creating paired image-shape datasets.
3D Object Reconstruction From A Single Image
3D Reconstruction
1 code implementation • 4 Oct 2020 • Chaoyang Wang, Zhiqiang Guo, GuoHui Li, Jianjun Li, Peng Pan, Ke Liu
Afterward, by performing a simplified RGCN-based node information propagation on the constructed heterogeneous graph, the embeddings of users and items can be adjusted with textual knowledge, which effectively alleviates the negative effects of data sparsity.
1 code implementation • 14 Apr 2020 • Chaoyang Wang, Zhiqiang Guo, Jianjun Li, Peng Pan, Guo-Hui Li
IRSs usually face the large discrete action space problem, which makes most of the existing RL-based recommendation methods inefficient.
1 code implementation • 27 Jan 2020 • Chaoyang Wang, Chen-Hsuan Lin, Simon Lucey
The recovery of 3D shape and pose from 2D landmarks stemming from a large ensemble of images can be viewed as a non-rigid structure from motion (NRSfM) problem.
no code implementations • ICCV 2019 • Chaoyang Wang, Chen Kong, Simon Lucey
This alleviates the data bottleneck, which is one of the major concerns for supervised methods.
Ranked #21 on Weakly-supervised 3D Human Pose Estimation on Human3.6M
no code implementations • 25 Apr 2019 • Chaoyang Wang, Simon Lucey, Federico Perazzi, Oliver Wang
We present a fully data-driven method to compute depth from diverse monocular video sequences that contain large amounts of non-rigid objects, e.g., people.
no code implementations • 23 Mar 2018 • Nathaniel Chodosh, Chaoyang Wang, Simon Lucey
In this paper we consider the problem of estimating a dense depth map from a set of sparse LiDAR points.
1 code implementation • CVPR 2018 • Chaoyang Wang, Jose Miguel Buenaposada, Rui Zhu, Simon Lucey
The ability to predict depth from a single image, using recent advances in CNNs, is of increasing interest to the vision community.
no code implementations • 30 Nov 2017 • Rui Zhu, Chaoyang Wang, Chen-Hsuan Lin, Ziyan Wang, Simon Lucey
More recently, excellent results have been attained through the application of photometric bundle adjustment (PBA) methods, which directly minimize the photometric error across frames.
no code implementations • 4 Nov 2017 • Rui Zhu, Chaoyang Wang, Chen-Hsuan Lin, Ziyan Wang, Simon Lucey
Reconstructing 3D shapes from a sequence of images has long been a problem of interest in computer vision.
no code implementations • ICCV 2017 • Rui Zhu, Hamed Kiani Galoogahi, Chaoyang Wang, Simon Lucey
An emerging problem in computer vision is the reconstruction of 3D shape and pose of an object from a single image.
no code implementations • 15 Jul 2017 • Rui Zhu, Hamed Kiani Galoogahi, Chaoyang Wang, Simon Lucey
An emerging problem in computer vision is the reconstruction of 3D shape and pose of an object from a single image.
no code implementations • 19 May 2017 • Chaoyang Wang, Hamed Kiani Galoogahi, Chen-Hsuan Lin, Simon Lucey
In this paper we present a new approach for efficient regression based object tracking which we refer to as Deep-LK.
no code implementations • CVPR 2015 • Chaoyang Wang, Long Zhao, Shuang Liang, Liqing Zhang, Jinyuan Jia, Yichen Wei
Hierarchical segmentation based object proposal methods have become an important step in the modern object detection paradigm.
no code implementations • IEEE International Conference on Consumer Electronics (ICCE) 2013 • Molin Jia, Chaoyang Wang, Kui-Ting Chen, Takaaki Baba
The conventional approach cannot meet the requirement of physiological signal analysis to extract the main component of the acquired signal.