Search Results for author: Yuxuan Zhang

Found 22 papers, 8 papers with code

Solution for Point Tracking Task of ICCV 1st Perception Test Challenge 2023

no code implementations • 26 Mar 2024 • Hongpeng Pan, Yang Yang, Zhongtian Fu, Yuxuan Zhang, Shian Du, Yi Xu, Xiangyang Ji

To address this issue, we propose a simple yet effective approach called TAP with confident static points (TAPIR+), which focuses on rectifying the tracking of the static point in the videos shot by a static camera.

Motion Detection, Point Tracking +2

Fast Personalized Text-to-Image Syntheses With Attention Injection

no code implementations • 17 Mar 2024 • Yuxuan Zhang, Yiren Song, Jinpeng Yu, Han Pan, Zhongliang Jing

Currently, personalized image generation methods mostly require considerable time to finetune and often overfit the concept, resulting in generated images that are similar to custom concepts but difficult to edit by prompts.

Text-to-Image Generation

Stable-Makeup: When Real-World Makeup Transfer Meets Diffusion Model

no code implementations • 12 Mar 2024 • Yuxuan Zhang, Lifu Wei, Qing Zhang, Yiren Song, Jiaming Liu, Huaxia Li, Xu Tang, Yao Hu, Haibo Zhao

Current makeup transfer methods are limited to simple makeup styles, making them difficult to apply in real-world scenarios.

Text-to-Image Generation

SSR-Encoder: Encoding Selective Subject Representation for Subject-Driven Generation

1 code implementation • 26 Dec 2023 • Yuxuan Zhang, Yiren Song, Jiaming Liu, Rui Wang, Jinpeng Yu, Hao Tang, Huaxia Li, Xu Tang, Yao Hu, Han Pan, Zhongliang Jing

Recent advancements in subject-driven image generation have led to zero-shot generation, yet precise selection and focus on crucial subject representations remain challenging.

Image Generation

CogAgent: A Visual Language Model for GUI Agents

1 code implementation • 14 Dec 2023 • Wenyi Hong, Weihan Wang, Qingsong Lv, Jiazheng Xu, Wenmeng Yu, Junhui Ji, Yan Wang, Zihan Wang, Yuxuan Zhang, Juanzi Li, Bin Xu, Yuxiao Dong, Ming Ding, Jie Tang

People are spending an enormous amount of time on digital devices through graphical user interfaces (GUIs), e.g., computer or smartphone screens.

Language Modelling, Visual Question Answering

DPP-based Client Selection for Federated Learning with Non-IID Data

no code implementations • 30 Mar 2023 • Yuxuan Zhang, Chao Xu, Howard H. Yang, Xijun Wang, Tony Q. S. Quek

This paper proposes a client selection (CS) method to tackle the communication bottleneck of federated learning (FL) while concurrently coping with FL's data heterogeneity issue.

Federated Learning

Video4MRI: An Empirical Study on Brain Magnetic Resonance Image Analytics with CNN-based Video Classification Frameworks

no code implementations • 24 Feb 2023 • Yuxuan Zhang, Qingzhong Wang, Jiang Bian, Yi Liu, Yanwu Xu, Dejing Dou, Haoyi Xiong

Due to the high similarity between MRI data and videos, we conduct extensive empirical studies on video recognition techniques for MRI classification to answer the questions: (1) can we directly use video recognition models for MRI classification, (2) which model is more appropriate for MRI, (3) are the common tricks like data augmentation in video recognition still useful for MRI classification?

Classification, Data Augmentation +3

Shakes on a Plane: Unsupervised Depth Estimation from Unstabilized Photography

no code implementations • CVPR 2023 • Ilya Chugunov, Yuxuan Zhang, Felix Heide

Modern mobile burst photography pipelines capture and merge a short sequence of frames to recover an enhanced image, but often disregard the 3D nature of the scene they capture, treating pixel motion between images as a 2D aggregation problem.

Depth And Camera Motion, Pose Estimation

Neural Volume Super-Resolution

no code implementations • 9 Dec 2022 • Yuval Bahat, Yuxuan Zhang, Hendrik Sommerhoff, Andreas Kolb, Felix Heide

This allows us to super-resolve the 3D scene representation by applying 2D convolutional networks on the 2D feature planes.

Super-Resolution

An Attention-based Multi-Scale Feature Learning Network for Multimodal Medical Image Fusion

1 code implementation • 9 Dec 2022 • Meng Zhou, Xiaolan Xu, Yuxuan Zhang

Furthermore, we propose a novel fixed fusion strategy termed Softmax-based weighted strategy based on the Softmax weights and matrix nuclear norm.

An Edge Alignment-based Orientation Selection Method for Neutron Tomography

no code implementations • 1 Dec 2022 • Diyu Yang, Shimin Tang, Singanallur V. Venkatakrishnan, Mohammad S. N. Chowdhury, Yuxuan Zhang, Hassina Z. Bilheux, Gregery T. Buzzard, Charles A. Bouman

Neutron computed tomography (nCT) is a 3D characterization technique used to image the internal morphology or chemical composition of samples in biology and materials sciences.

All You Need is RAW: Defending Against Adversarial Attacks with Camera Image Pipelines

1 code implementation • 16 Dec 2021 • Yuxuan Zhang, Bo Dong, Felix Heide

Various defense methods have proposed image-to-image mapping methods, either including these perturbations in the training process or removing them in a preprocessing denoising step.

Adversarial Defense, Denoising +3

The Implicit Values of A Good Hand Shake: Handheld Multi-Frame Neural Depth Refinement

1 code implementation • CVPR 2022 • Ilya Chugunov, Yuxuan Zhang, Zhihao Xia, Xuaner Zhang, Jiawen Chen, Felix Heide

Modern smartphones can continuously stream multi-megapixel RGB images at 60Hz, synchronized with high-quality 3D pose information and low-resolution LiDAR-driven depth estimates.

CelebHair: A New Large-Scale Dataset for Hairstyle Recommendation based on CelebA

no code implementations • 14 Apr 2021 • Yutao Chen, Yuxuan Zhang, Zhongrui Huang, Zhenyao Luo, Jinpeng Chen

In this paper, we present a new large-scale dataset for hairstyle recommendation, CelebHair, based on the celebrity facial attributes dataset, CelebA.

Facial Landmark Detection

DatasetGAN: Efficient Labeled Data Factory with Minimal Human Effort

2 code implementations • CVPR 2021 • Yuxuan Zhang, Huan Ling, Jun Gao, Kangxue Yin, Jean-Francois Lafleche, Adela Barriuso, Antonio Torralba, Sanja Fidler

To showcase the power of our approach, we generated datasets for 7 image segmentation tasks which include pixel-level labels for 34 human face parts, and 32 car parts.

Image Segmentation, Semantic Segmentation

Adaptive Radar Detection and Classification Algorithms for Multiple Coherent Signals

no code implementations • 23 Dec 2020 • Sudan Han, Linjie Yan, Yuxuan Zhang, Pia Addabbo, Chengpeng Hao, Danilo Orlando

In this paper, we address the problem of target detection in the presence of coherent (or fully correlated) signals, which can be due to multipath propagation effects or electronic attacks by smart jammers.

General Classification

A prognostic dynamic model applicable to infectious diseases providing easily visualized guides -- A case study of COVID-19 in the UK

1 code implementation • 14 Dec 2020 • Yuxuan Zhang, Chen Gong, Dawei Li, Zhi-Wei Wang, Shengda D Pu, Alex W Robertson, Hong Yu, John Parrington

A reasonable prediction of the infectious disease transmission process under different disease control strategies is an important reference point for policy makers.

Image GANs meet Differentiable Rendering for Inverse Graphics and Interpretable 3D Neural Rendering

no code implementations • ICLR 2021 • Yuxuan Zhang, Wenzheng Chen, Huan Ling, Jun Gao, Yinan Zhang, Antonio Torralba, Sanja Fidler

Key to our approach is to exploit GANs as a multi-view data generator to train an inverse graphics network using an off-the-shelf differentiable renderer, and the trained inverse graphics network as a teacher to disentangle the GAN's latent code into interpretable 3D properties.

Neural Rendering

A Combined Data-driven and Physics-driven Method for Steady Heat Conduction Prediction using Deep Convolutional Neural Networks

no code implementations • 16 May 2020 • Hao Ma, Xiangyu Hu, Yuxuan Zhang, Nils Thuerey, Oskar J. Haidn

For the data-driven method, introducing the physical equation not only speeds up convergence but also produces more physically consistent solutions.

Deep Neural Network Fingerprinting by Conferrable Adversarial Examples

1 code implementation • ICLR 2021 • Nils Lukas, Yuxuan Zhang, Florian Kerschbaum

We propose a fingerprinting method for deep neural network classifiers that extracts a set of inputs from the source model so that only surrogates agree with the source model on the classification of such inputs.

Model extraction, Transfer Learning
