ID-Animator: Zero-Shot Identity-Preserving Human Video Generation

1 code implementation 23 Apr 2024 Xuanhua He, Quande Liu, Shengju Qian, Xin Wang, Tao Hu, Ke Cao, Keyu Yan, Jie Zhang

Based on this pipeline, a random face reference training method is further devised to precisely capture the ID-relevant embeddings from reference images, thus improving the fidelity and generalization capacity of our model for ID-specific video generation.

Attribute Video Generation

X-Ray: A Sequential 3D Representation for Generation

1 code implementation 22 Apr 2024 Tao Hu, Wenhang Ge, Yuyang Zhao, Gim Hee Lee

In this paper, we introduce X-Ray, an innovative approach to 3D generation that employs a new sequential representation, drawing inspiration from the depth-revealing capabilities of X-Ray scans to meticulously capture both the external and internal features of objects.

3D Generation

CRNet: A Detail-Preserving Network for Unified Image Restoration and Enhancement Task

1 code implementation 22 Apr 2024 Kangzhen Yang, Tao Hu, Kexin Dai, Genggeng Chen, Yu Cao, Wei Dong, Peng Wu, Yanning Zhang, Qingsen Yan

In real-world scenarios, images captured often suffer from blurring, noise, and other forms of image degradation, and due to sensor limitations, people usually can only obtain low dynamic range images.

Deblurring Denoising +2

FashionEngine: Interactive 3D Human Generation and Editing via Multimodal Controls

no code implementations 2 Apr 2024 Tao Hu, Fangzhou Hong, Zhaoxi Chen, Ziwei Liu

FashionEngine automates 3D human production with three key components: 1) A pre-trained 3D human diffusion model that learns to model 3D humans in a semantic UV latent space from 2D image training data, which provides strong priors for diverse generation and editing tasks.

Virtual Try-on

SurMo: Surface-based 4D Motion Modeling for Dynamic Human Rendering

no code implementations 1 Apr 2024 Tao Hu, Fangzhou Hong, Ziwei Liu

2) Physical motion decoding that is designed to encourage physical motion learning by decoding the motion triplane features at timestep t to predict both spatial derivatives and temporal derivatives at the next timestep t+1 in the training stage.

Generalizable Novel View Synthesis Novel View Synthesis

StructLDM: Structured Latent Diffusion for 3D Human Generation

no code implementations 1 Apr 2024 Tao Hu, Fangzhou Hong, Ziwei Liu

2) A structured 3D-aware auto-decoder that factorizes the global latent space into several semantic body parts parameterized by a set of conditional structured local NeRFs anchored to the body template, which embeds the properties learned from the 2D training data and can be decoded to render view-consistent humans under different poses and clothing styles.

Virtual Try-on

Generating Content for HDR Deghosting from Frequency View

no code implementations 1 Apr 2024 Tao Hu, Qingsen Yan, Yuankai Qi, Yanning Zhang

To address this challenge, we propose the Low-Frequency aware Diffusion (LF-Diff) model for ghost-free HDR imaging.

HDR Reconstruction regression

Boosting Image Restoration via Priors from Pre-trained Models

no code implementations 11 Mar 2024 Xiaogang Xu, Shu Kong, Tao Hu, Zhe Liu, Hujun Bao

Pre-trained models with large-scale training data, such as CLIP and Stable Diffusion, have demonstrated remarkable performance in various high-level computer vision tasks such as image understanding and generation from language descriptions.

Deblurring Denoising +2

Coronary CTA and Quantitative Cardiac CT Perfusion (CCTP) in Coronary Artery Disease

no code implementations 30 Jan 2024 Hao Wu, Yingnan Song, Ammar Hoori, Ananya Subramaniam, Juhwan Lee, Justin Kim, Tao Hu, Sadeer Al-Kindi, Wei-Ming Huang, Chun-Ho Yun, Chung-Lieh Hung, Sanjay Rajagopalan, David L. Wilson

CCTA in conjunction with a new automated quantitative CCTP approach can augment the interpretation of CAD, enabling the distinction of ischemia due to obstructive lesions and MVD.

AI prediction of cardiovascular events using opportunistic epicardial adipose tissue assessments from CT calcium score

no code implementations 29 Jan 2024 Tao Hu, Joshua Freeze, Prerna Singh, Justin Kim, Yingnan Song, Hao Wu, Juhwan Lee, Sadeer Al-Kindi, Sanjay Rajagopalan, David L. Wilson, Ammar Hoori

Background: Recent studies have used basic epicardial adipose tissue (EAT) assessments (e.g., volume and mean HU) to predict risk of atherosclerosis-related, major adverse cardiovascular events (MACE).

Pericoronary adipose tissue feature analysis in CT calcium score images with comparison to coronary CTA

no code implementations 28 Jan 2024 Yingnan Song, Hao Wu, Juhwan Lee, Justin Kim, Ammar Hoori, Tao Hu, Vladislav Zimin, Mohamed Makhlouf, Sadeer Al-Kindi, Sanjay Rajagopalan, Chun-Ho Yun, Chung-Lieh Hung, David L. Wilson

Preliminarily, PCAT features can be assessed from three main coronary arteries in non-contrast CTCS images with performance characteristics that are at the very least comparable to CCTA.

Enhancing RAW-to-sRGB with Decoupled Style Structure in Fourier Domain

1 code implementation 4 Jan 2024 Xuanhua He, Tao Hu, Guoli Wang, Zejin Wang, Run Wang, Qian Zhang, Keyu Yan, Ziyi Chen, Rui Li, Chenjun Xie, Jie Zhang, Man Zhou

However, current methods often ignore the difference between cell phone RAW images and DSLR camera RGB images, a difference that goes beyond the color matrix and extends to spatial structure due to resolution variations.

Image Restoration

Temporal-Spatial Entropy Balancing for Causal Continuous Treatment-Effect Estimation

no code implementations 14 Dec 2023 Tao Hu, Honglong Zhang, Fan Zeng, Min Du, XiangKun Du, Yue Zheng, Quanqi Li, Mengran Zhang, Dan Yang, Jihao Wu

However, temporal and spatial dimensions are extremely critical in the logistics field, and this limitation may directly affect the precision of subsidy and pricing strategies.

Query by Activity Video in the Wild

1 code implementation 23 Nov 2023 Tao Hu, William Thong, Pascal Mettes, Cees G. M. Snoek

In this paper, we propose a visual-semantic embedding network that explicitly deals with the imbalanced scenario for activity retrieval.


Towards High-quality HDR Deghosting with Conditional Diffusion Models

no code implementations 2 Nov 2023 Qingsen Yan, Tao Hu, Yuan Sun, Hao Tang, Yu Zhu, Wei Dong, Luc van Gool, Yanning Zhang

To address this challenge, we formulate the HDR deghosting problem as an image generation task that leverages LDR features as the diffusion model's condition, consisting of the feature condition generator and the noise predictor.

Denoising Image Generation

Timely Fusion of Surround Radar/Lidar for Object Detection in Autonomous Driving Systems

no code implementations 9 Sep 2023 Wenjing Xie, Tao Hu, Neiwen Ling, Guoliang Xing, Shaoshan Liu, Nan Guan

Surround Radar/Lidar can provide 360-degree view sampling at minimal cost, making them promising sensing hardware solutions for autonomous driving systems.

Autonomous Driving object-detection +1

Enhancing cardiovascular risk prediction through AI-enabled calcium-omics

no code implementations 23 Aug 2023 Ammar Hoori, Sadeer Al-Kindi, Tao Hu, Yingnan Song, Hao Wu, Juhwan Lee, Nour Tashtish, Pingfu Fu, Robert Gilkeson, Sanjay Rajagopalan, David L. Wilson

We used a Cox model with elastic-net regularization on 2457 CT calcium score (CTCS) enriched for MACE events obtained from a large no-cost CLARIFY program (ClinicalTrials.gov Identifier: NCT04075162).

HumanLiff: Layer-wise 3D Human Generation with Diffusion Model

no code implementations 18 Aug 2023 Shoukang Hu, Fangzhou Hong, Tao Hu, Liang Pan, Haiyi Mei, Weiye Xiao, Lei Yang, Ziwei Liu

In this work, we propose HumanLiff, the first layer-wise 3D human generative model with a unified diffusion process.

3D Generation Neural Rendering

Self-supervised Learning by View Synthesis

no code implementations 22 Apr 2023 Shaoteng Liu, Xiangyu Zhang, Tao Hu, Jiaya Jia

In each iteration, the input to VSA is one view (or multiple views) of a 3D object and the output is a synthesized image in another target pose.

3D Classification Decoder +1

Point2Pix: Photo-Realistic Point Cloud Rendering via Neural Radiance Fields

no code implementations CVPR 2023 Tao Hu, Xiaogang Xu, Shu Liu, Jiaya Jia

Also, we present Point Encoding to build Multi-scale Radiance Fields that provide discriminative 3D point features.


TriVol: Point Cloud Rendering via Triple Volumes

1 code implementation CVPR 2023 Tao Hu, Xiaogang Xu, Ruihang Chu, Jiaya Jia

However, artifacts still appear in rendered images, due to the challenges in extracting continuous and discriminative 3D features from point clouds.


Ref-NeuS: Ambiguity-Reduced Neural Implicit Surface Learning for Multi-View Reconstruction with Reflection

1 code implementation ICCV 2023 Wenhang Ge, Tao Hu, Haoyu Zhao, Shu Liu, Ying-Cong Chen

We show that together with a reflection direction-dependent radiance, our model achieves high-quality surface reconstruction on reflective surfaces and outperforms state-of-the-art methods by a large margin.

3D Reconstruction Multi-View 3D Reconstruction +1

EfficientNeRF: Efficient Neural Radiance Fields

1 code implementation 2 Jun 2022 Tao Hu, Shu Liu, Yilun Chen, Tiancheng Shen, Jiaya Jia

Neural Radiance Fields (NeRF) has been widely applied to various tasks for its high-quality representation of 3D scenes.


Human-Object Interaction Detection via Disentangled Transformer

no code implementations CVPR 2022 Desen Zhou, Zhichao Liu, Jian Wang, Leshan Wang, Tao Hu, Errui Ding, Jingdong Wang

To associate the predictions of disentangled decoders, we first generate a unified representation for HOI triplets with a base decoder, and then utilize it as input feature of each disentangled decoder.

Decoder Human-Object Interaction Detection +1

EfficientNeRF: Efficient Neural Radiance Fields

no code implementations CVPR 2022 Tao Hu, Shu Liu, Yilun Chen, Tiancheng Shen, Jiaya Jia

Neural Radiance Fields (NeRF) has been widely applied to various tasks for its high-quality representation of 3D scenes.


HVTR: Hybrid Volumetric-Textural Rendering for Human Avatars

no code implementations 19 Dec 2021 Tao Hu, Tao Yu, Zerong Zheng, He Zhang, Yebin Liu, Matthias Zwicker

To handle complicated motions (e.g., self-occlusions), we then leverage the encoded information on the UV manifold to construct a 3D volumetric representation based on a dynamic pose-conditioned neural radiance field.

Neural Rendering

GaTector: A Unified Framework for Gaze Object Prediction

1 code implementation CVPR 2022 Binglu Wang, Tao Hu, Baoshan Li, Xiaojuan Chen, Zhijie Zhang

In this paper, we build a novel framework named GaTector to tackle the gaze object prediction problem in a unified way.

Gaze Estimation Gaze Prediction +4

EgoRenderer: Rendering Human Avatars from Egocentric Camera Images

no code implementations ICCV 2021 Tao Hu, Kripasindhu Sarkar, Lingjie Liu, Matthias Zwicker, Christian Theobalt

We next combine the target pose image and the textures into a combined feature image, which is transformed into the output color image using a neural image translation network.

Texture Synthesis Translation

CRD-CGAN: Category-Consistent and Relativistic Constraints for Diverse Text-to-Image Generation

no code implementations 28 Jul 2021 Tao Hu, Chengjiang Long, Chunxia Xiao

Based on those constraints, a category-consistent and relativistic diverse conditional GAN (CRD-CGAN) is proposed to synthesize $K$ photo-realistic images simultaneously.

Text-to-Image Generation

Self-Supervised 3D Mesh Reconstruction From Single Images

no code implementations CVPR 2021 Tao Hu, LiWei Wang, Xiaogang Xu, Shu Liu, Jiaya Jia

Recent single-view 3D reconstruction methods reconstruct an object's shape and texture from a single image with only 2D image-level annotation.

3D Reconstruction Attribute +2

Exploring the impact of under-reported cases on the COVID-19 spatiotemporal distribution using healthcare worker infection data

no code implementations 10 Nov 2020 Peixiao Wang, Tao Hu, Hongqiang Liu, Xinyan Zhu

Therefore, in this paper, a novel framework was proposed to explore the impact of under-reporting on COVID-19 spatiotemporal distributions, and empirical analysis was carried out using infection data of healthcare workers in Wuhan and Hubei (excluding Wuhan).

Hierarchical Modes Exploring in Generative Adversarial Networks

no code implementations 5 Mar 2020 Mengxiao Hu, Jinlong Li, Maolin Hu, Tao Hu

In conditional Generative Adversarial Networks (cGANs), when two different initial noises are concatenated with the same conditional information, the distance between their outputs is relatively smaller, which makes minor modes likely to collapse into large modes.

Text-to-Image Generation Translation

Multi-object Tracking via End-to-end Tracklet Searching and Ranking

no code implementations 4 Mar 2020 Tao Hu, Lichao Huang, Han Shen

Recent works in multiple object tracking use sequence models to calculate the similarity score between the detections and the previous tracklets.

Multi-Object Tracking Multiple Object Tracking

Learning to Generate Dense Point Clouds with Textures on Multiple Categories

1 code implementation 22 Dec 2019 Tao Hu, Geng Lin, Zhizhong Han, Matthias Zwicker

In this paper, we propose a novel approach for reconstructing point clouds from RGB images.

3D Reconstruction

3D Shape Completion with Multi-view Consistent Inference

1 code implementation 28 Nov 2019 Tao Hu, Zhizhong Han, Matthias Zwicker

We formulate the regularization term as a consistency loss that encourages geometric consistency among multiple views, while the data term guarantees that the optimized views do not drift away too much from a learned shape descriptor.

SILCO: Show a Few Images, Localize the Common Object

no code implementations ICCV 2019 Tao Hu, Pascal Mettes, Jia-Hong Huang, Cees G. M. Snoek

To that end, we introduce a spatial similarity module that searches the spatial commonality among the given images.

Few-Shot Learning

A Radio Signal Modulation Recognition Algorithm Based on Residual Networks and Attention Mechanisms

no code implementations 27 Sep 2019 Ruisen Luo, Tao Hu, Zuodong Tang, Chen Wang, Xiaofeng Gong, Haiyan Tu

To solve the problem of inaccurate recognition of communication signal modulation types, an RNN recognition algorithm that combines a residual block network with an attention mechanism is proposed.

Real Time Visual Tracking using Spatial-Aware Temporal Aggregation Network

1 code implementation 2 Aug 2019 Tao Hu, Lichao Huang, Xian-Ming Liu, Han Shen

Our tracker achieves leading performance on OTB2013, OTB2015, VOT2015, VOT2016 and LaSOT, and operates at a real-time speed of 26 FPS, which indicates that our method is effective and practical.

Motion Estimation Real-Time Visual Tracking

VITAL: A Visual Interpretation on Text with Adversarial Learning for Image Labeling

no code implementations 26 Jul 2019 Tao Hu, Chengjiang Long, Leheng Zhang, Chunxia Xiao

In this paper, we propose a novel way to interpret text information by extracting visual feature presentation from multiple high-resolution and photo-realistic synthetic images generated by Text-to-image Generative Adversarial Network (GAN) to improve the performance of image labeling.

Generative Adversarial Network

Render4Completion: Synthesizing Multi-View Depth Maps for 3D Shape Completion

no code implementations 17 Apr 2019 Tao Hu, Zhizhong Han, Abhinav Shrivastava, Matthias Zwicker

Unlike image-to-image translation networks that complete each view separately, our novel network, multi-view completion net (MVCN), leverages information from all views of a 3D shape to help the completion of each single view.

Image-to-Image Translation Translation

Weakly Supervised Bilinear Attention Network for Fine-Grained Visual Classification

no code implementations 6 Aug 2018 Tao Hu, Jizheng Xu, Cong Huang, Honggang Qi, Qingming Huang, Yan Lu

Besides, we propose attention regularization and attention dropout to weakly supervise the generating process of attention maps.

Classification Fine-Grained Image Classification +1

Facial Landmarks Detection by Self-Iterative Regression based Landmarks-Attention Network

no code implementations 18 Mar 2018 Tao Hu, Honggang Qi, Jizheng Xu, Qingming Huang

Only one self-iterative regressor is trained to learn the descent directions for samples from coarse stages to fine stages, and parameters are iteratively updated by the same regressor.

Ranked #16 on Face Alignment on 300W (NME_inter-pupil (%, Common) metric)

Face Alignment regression

A Hebbian/Anti-Hebbian Network for Online Sparse Dictionary Learning Derived from Symmetric Matrix Factorization

no code implementations 2 Mar 2015 Tao Hu, Cengiz Pehlevan, Dmitri B. Chklovskii

Here, to overcome this problem, we derive sparse dictionary learning from a novel cost function: a regularized error of the symmetric factorization of the input's similarity matrix.

Dictionary Learning

A Hebbian/Anti-Hebbian Neural Network for Linear Subspace Learning: A Derivation from Multidimensional Scaling of Streaming Data

no code implementations 2 Mar 2015 Cengiz Pehlevan, Tao Hu, Dmitri B. Chklovskii

Such networks learn the principal subspace, in the sense of principal component analysis (PCA), by adjusting synaptic weights according to activity-dependent learning rules.

Task-group Relatedness and Generalization Bounds for Regularized Multi-task Learning

no code implementations 28 Aug 2014 Chao Zhang, DaCheng Tao, Tao Hu, Xiang Li

We are mainly concerned with two theoretical questions: 1) under what conditions does RMTL perform better with a smaller task sample size than STL?

Generalization Bounds Multi-Task Learning

A Neuron as a Signal Processing Device

no code implementations 12 May 2014 Tao Hu, Zaid J. Towfic, Cengiz Pehlevan, Alex Genkin, Dmitri B. Chklovskii

Here we propose to view a neuron as a signal processing device that represents the incoming streaming data matrix as a sparse vector of synaptic weights scaled by an outgoing sparse activity vector.

A mechanistic model of early sensory processing based on subtracting sparse representations

no code implementations NeurIPS 2012 Shaul Druckmann, Tao Hu, Dmitri B. Chklovskii

However, feedback inhibitory circuits are common in early sensory circuits and furthermore their dynamics may be nonlinear.

